Data Engineer

Pasadena, CA
Nov 07, 2021
Biotech Beach
Required Education: Bachelor's Degree
Position Type: Full time

Company Overview:

Terray Therapeutics is a venture-backed biotechnology company leveraging our vast, high-quality data to optimize the path from discovery to transformative therapeutics. Our approach delivers on the promise of computation to revolutionize drug discovery. Through our closed-loop wet lab discovery platform and data-rich AI, we overcome existing constraints on chemical data by systematically mapping the biochemical interactions of massive and diverse small molecule libraries to identify novel therapeutic compounds. Access to this quantitative data at scale allows us to intelligently navigate an infinitely extensible and highly diverse chemical space to efficiently design, discover, and optimize small molecule therapeutics. Our internal development programs are focused on immunology. In addition to these programs, we also work with leading pharmaceutical companies.

Our integrated computational platform, tNova, harnesses our novel affinity binding technology, tArray, unique resynthesis capabilities, and broad biology infrastructure. tArray is the foundation of our discovery engine. It enables us to screen hundreds of millions of compounds in minutes and return quantitative data on each compound. Vast, high-purity, and high-diversity compound libraries are rapidly iterated using best-in-class chemistry to unlock chemical insights at the scale necessary to power AI-driven drug discovery.

Position Summary: Terray Therapeutics is seeking a motivated, creative, and experienced Data Engineer. As an integral member of our data team, the candidate will be responsible for building a reliable, distributed data pipeline to handle millions of raw fluorescence microscopy images and their extracted features, allowing our machine learning engineers and data scientists to fully leverage our data to accelerate internal drug discovery efforts. The position will report to the Head of Computational and Data Sciences.

The core responsibilities of this job will be:

  1. Manage and improve our data lake of millions of fluorescence microscopy images
  2. Work with our data scientists to incorporate our image processing workflow into the data pipeline
  3. Build and manage our databases of billions to trillions of chemical structures, intensities, affinities, and data from other biological assays
  4. Design and architect a data warehouse to support downstream analytics

Experience and Qualifications: Given the company’s size, anticipated growth, and fast-paced environment, the organization requires a data engineer who is thoughtful and high-energy and who can partner with the broader organization to further enhance our next-generation drug discovery capabilities.

Part of Terray Therapeutics’ success is nurtured by a hands-on work environment where everyone is accountable, everyone is vested in a vision of excellence, and everyone actively takes part in the success of the business. Terray Therapeutics supports a positive work environment composed of engaged employees who feel appreciated, recognized, and free to be creative.

Qualifications include:

  • Expert in engineering big data pipelines using modern technologies and cloud infrastructures
  • Expert in building and managing scalable relational databases, preferably in the life sciences space
  • Experience with cloud computing services, preferably AWS (EMR, Redshift)
  • Experience with high-end distributed data processing environments (Spark, Hadoop, etc.)
  • Proficiency in a Linux environment, and experience with database languages (e.g., SQL) and with version control practices and tools (Git)
  • Experience with pipeline/workflow managers (Luigi, Airflow, Nextflow, etc.)
  • Highly proficient in Python and the PyData stack (numpy, pandas, scipy, dask, etc.)

The successful candidate will demonstrate the ability to work well under pressure and deliver results on short timelines. The company is looking for a highly responsive, goal-oriented individual who will bring significant energy and drive to solve complex technical problems and help us achieve our mission to advance human health.