Data Scientist - Disease Genomics

Calico Labs
South San Francisco, California
Sep 14, 2018
Position Type: Full time

Who we are:

Calico is a research and development company whose mission is to understand the biology of aging and to help people lead longer and healthier lives. We aim to combine the best biomedical science with cutting-edge technology and computing. We are building a new and unique culture of true partnership between the best biomedical and computational scientists, within a company that is a nimble startup yet has a firm financial footing.

Position description:

Biology is rapidly becoming a data science due to the exponential growth in high-quality biological and medical data. These data are transforming our understanding of biology and disease, but deriving valuable scientific insights requires careful analysis and interpretation, done in close collaboration between computational and life scientists. Calico is seeking to expand our group of Data Scientists, who explore diverse questions from a computational perspective and develop methods that facilitate a deeper understanding of data.

As a data scientist, you will work cross-functionally with other individual contributors within computing, as well as with researchers at Calico, on collaborative research projects that may center on biological questions, analytical methods, or engineering.

We are seeking an exceptional data scientist with interest and experience in computationally elucidating disease biology to join the team.

Position responsibilities:

  • Use multiomic data, including bulk and single-cell RNAseq, ATACseq, metabolomics, and proteomics, to identify biomarkers of disease and dysregulated processes
  • Translate insights across disease datasets, including meta-analyses of public data, cell culture models, mouse models, and early-stage human research
  • Convey results to computational and experimental scientists verbally and through impactful visualizations
  • Develop methods that enable and, when appropriate, automate the above

Position requirements:

  • Ph.D. in Bioinformatics, Computational Biology, Statistics, Biological Sciences, Genetics, Genomics, Computer Science, or equivalent preparation and experience
  • 6+ years of experience in data analysis, including at least 3 years of hands-on work experience with real data (either in an academic environment or in industry)
  • 3+ years of experience studying the biology of disease using computational approaches
  • Solid understanding of cellular and organismal physiology and pathophysiology
  • Experience with analyses of at least three of the following data types: bulk RNAseq, single-cell RNAseq, ribosome profiling, ATACseq/ChIPseq, metabolomics, proteomics
  • Experience in high-quality software development, including proficiency with R (tidyverse) and/or Python
  • Significant experience with the application of statistical and machine learning methods, such as regularized GLMs, Bayesian modeling, time series analysis, and modern neural network architectures
  • Track record of effective cross-functional collaboration on complex projects involving people with very diverse backgrounds

Nice to haves:

  • Understanding of cancer biology and the biology of neurodegenerative diseases