Senior Data Scientist - Genomics (REMOTE)

Location: San Diego, California, United States
Posted: Jun 22, 2021
Ref: oDaNffwz
Hotbed: Biotech Beach
Required Education: Bachelors Degree
Position Type: Full time
Prometheus is an IPO biotechnology company pioneering a precision medicine approach to the discovery, development, and commercialization of novel therapeutic and companion diagnostic products for the treatment and diagnosis of inflammatory bowel disease (IBD). The company's precision medicine platform, Prometheus 360, combines proprietary bioinformatics discovery methods with one of the world's largest gastrointestinal bioinformatics databases to identify novel therapeutic targets and develop therapeutic candidates to engage those targets. Prometheus is a spin-out of Cedars-Sinai Medical Center and partners with the hospital system for its biospecimens, clinical data, and bioinformatics.

FAQ about Data Science and Engineering at Prometheus:
https://prmb.io/dse-faq

OVERVIEW:

Are you a skilled, ambitious, and enthusiastic data scientist who wants to impact human health at the forefront of precision medicine and drug discovery? Do you have experience in computational biology, functional genomics, multi-omic analyses (e.g. RNAseq), and machine learning?

AS A DATA SCIENTIST IN THE DSE, YOU WILL:
  • Develop statistical and machine learning-based algorithms to identify drug targets and biomarkers using genomic, multi-omic, and clinical datasets.
  • Inform Prometheus' discovery pipeline strategy, prioritization, and execution.
  • Provide scientific direction to senior management and research teams.
  • Independently design, execute, and interpret data analysis strategies to answer scientific questions.
  • Maintain, clean, and contribute to genomic, molecular, and clinical databases.

ABOUT YOU:
  • 5-10 years of post-graduate experience in data science, computational biology, bioinformatics, or a related field, in an academic or industry (preferred) environment.
  • MS or PhD degree in Computational Biology, Bioinformatics, Biostatistics, Genetics, Computer Science, Electrical Engineering, Mathematics, Physics, or a related discipline.
  • Fluent in Python, R, and SQL, and feel at home in a remote Linux server session.
  • Familiar with packages such as scikit-learn, TensorFlow, pandas, PyTorch, and H2O.
  • Familiar with AWS, GCP, or Azure.
  • Passionate about high code quality, automated testing, and clear documentation.
  • Comfortable with asynchronous, collaborative workflows (agile development, Kanban, GitHub, code review).
  • Share our vision for using data, compute, and science to make a substantially positive impact in the world.

ANY OF THE FOLLOWING IS A PLUS:
  • Fluent in the use of R/Bioconductor for the analysis of gene expression and gene regulation using internal and public array, RNAseq, and scRNA-seq transcriptomics datasets.
  • Experience with genomics and genetics methods (GWAS, rare variants, eQTL, etc.) and statistical genetics packages (PLINK, ADMIXTURE, GATK, BOLT-LMM, etc.).
  • Familiarity with human genetics and genomics databases: GWAS Catalog, OpenTargets, UK Biobank, GEO, GTEx, etc.
  • Experience integrating multi-platform omics data and applying machine learning techniques for target and biomarker discovery.
  • Familiarity with molecular / cell biology assays and immunobiology.


WORK ENVIRONMENT

The work environment characteristics described here are representative of those an employee encounters while performing the essential functions of this job. Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

The noise level in the work environment is usually quiet.

Travel up to 25% may be required.

All qualified applicants are considered for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status or other protected class.