Data Scientist

Working from home
Jun 08, 2021
Required Education
Bachelor's Degree
Position Type
Full time

Mammoth Biosciences is harnessing the diversity of nature to power the next generation of CRISPR products. Through the discovery and development of novel CRISPR systems, the company is enabling the full potential of its platform to read and write the code of life. By leveraging its internal research and development and exclusive licensing of patents related to Cas12, Cas13, Cas14 and CasPhi, Mammoth Biosciences can provide enhanced diagnostics and genome editing for life science research, healthcare, agriculture, biodefense and more. Mammoth is democratizing disease detection with easy and affordable point-of-care tests that allow real-time and simultaneous detection of multiple conditions, along with high-throughput tests that allow for unprecedented testing volume. Further, the company is transforming disease treatment with its proprietary micro-sized CRISPR proteins that enable new editing and delivery options. Based in Brisbane, CA, Mammoth Biosciences was co-founded by CRISPR pioneer Jennifer Doudna and principal founders Trevor Martin, Janice Chen, and Lucas Harrington. The firm is backed by top institutional investors including Decheng, Mayfield, NFX, and 8VC, and leading individual investors including Brook Byers, Tim Cook, Bob Nelsen, and Jeff Huber.


Mammoth is seeking a Data Scientist to join its growing Gene Engineering - Protein Discovery (GEPD) team. The GEPD team is responsible for developing novel CRISPR-associated (Cas) nucleases for genome editing and diagnostic applications.

The successful candidate will analyze large genomic databases, molecular assay data and NGS analysis results, and build models to help make sense of them. The successful candidate will also work collaboratively with a diverse group of scientists and engineers to build new tools to beat COVID-19 and cure genetic disease.

Responsibilities
    • Build ETLs across a variety of structured and unstructured datasets
    • Build visualizations and dashboards
    • Define features and train machine learning models to identify key features of novel CRISPR systems
    • Work directly with teams of scientists and engineers
    • Document and regularly present data at internal meetings

Required Qualifications
    • Advanced degree in a quantitative discipline (e.g., Statistics, Computer Science, Bioinformatics, Applied Mathematics, or similar) or equivalent practical experience
    • Proficiency in Python (including pandas, NumPy and scikit-learn)
    • Experience working with unstructured sequence data
    • Experience working with AWS Services (EC2, S3)
    • Familiarity with SQL database technologies
    • Experience solving biological problems
    • Strong understanding of machine learning algorithms and concepts

Preferred Qualifications
    • Experience working with metagenomic data
    • Proficiency in Apache Spark (preferably using AWS EMR)
    • Record of working collaboratively with scientists and engineers
    • Ability to work independently and manage multiple projects simultaneously

Benefits
    • Company-paid health/vision/dental benefits
    • Unlimited vacation and generous sick time
    • Company-sponsored meals and snacks
    • Wellness, caregiver and ergonomics benefits
    • 401(k) with company matching

It is our policy and intent to provide equal opportunity to all persons without regard to race, color, religion, political affiliation, sex/gender (including gender expression/identity, pregnancy, childbirth and related medical conditions), marital status, registered domestic partner status, sexual orientation, age, ancestry, national origin, veteran status, disability, medical condition, genetic characteristics, and/or any other basis protected by law. This policy covers all facets of employment including, but not limited to: recruitment, selection, placement, promotions, transfers, demotions, terminations, training, and compensation.