Principal Data Engineer

Redwood City, California
Jun 01, 2021
Required Education
Bachelor's Degree
Position Type
Full time

At Bristol Myers Squibb, we are inspired by a single vision – transforming patients’ lives through science. In oncology, hematology, immunology and cardiovascular disease – and one of the most diverse and promising pipelines in the industry – each of our passionate colleagues contributes to innovations that drive meaningful change. We bring a human touch to every treatment we pioneer. Join us and make a difference.

The Research Solutions group of Bristol Myers Squibb seeks a resilient, results-oriented data engineer to join our motivated and diverse team focusing on informatics and data enablement initiatives in Research and Early Development (R/ED). The individual will play a leadership role in data engineering and be responsible for defining data engineering practices for the group. Additionally, the individual will design and implement ETL pipelines that focus on data enablement of primary and secondary high-dimensional data in pre-clinical and clinical settings, where data is used to identify molecular drug targets, characterize mechanism of action (MOA), prioritize disease indications and generate patient selection hypotheses. This hands-on role interfaces closely with data and computational scientists in Informatics and Predictive Sciences and with business partners in IT, and supports programs spanning both discovery and translational sciences. We are seeking an individual with extensive experience integrating data and building data solutions to make data accessible and meaningful for the R/ED community.


•    Define and lead data engineering practices for the group, including establishing templates and frameworks, determining best usage of specific cloud services and tools, and working with vendors to provision cutting edge tools and technologies
•    Innovate and advise on the latest technologies and standard methodologies in data engineering, and identify software solutions that can address hurdles in data enablement
•    Design, implement and manage ETL data pipelines that ingest vast amounts of genomic, phenotypic and screening data from public, internal and partner sources
•    Collaborate with data scientist leads to determine the data enablement methods best suited to helping downstream scientists access and interpret the data
•    Apply value-balanced approaches to the development of data ecosystem and pipeline initiatives
•    Proactively communicate data ecosystem and pipeline value propositions to partnering scientific collaborators
•    Collaborate with colleagues across Informatics and Predictive Sciences to make data, including raw/interim data, available to R/ED department personnel as the need arises 

Education and experience

•    Bachelor’s degree with 7 years’ experience, or a Master’s degree with at least 5 years’ experience, in an engineering field
•    Excellent skills and deep knowledge in Python and object-oriented programming are a must, including common Python libraries such as pandas
•    Excellent skills and deep knowledge of databases such as Postgres, Elasticsearch, Redshift, and Aurora, including distributed database design, SQL vs. NoSQL, and database optimizations
•    Solid understanding of ETL pipeline and workflow management tools such as Airflow, Luigi, AWS Glue, Amazon Kinesis or AWS Step Functions
•    Solid understanding of AWS cloud computing services such as Lambda functions, ECS, Batch and Elastic Load Balancer and other compute frameworks such as Spark, EMR and Dask
•    Solid understanding of container strategies using Docker, Fargate, and ECR
•    Proficiency with modern software development methodologies such as Agile, source control, CI/CD, project management and issue tracking with JIRA
•    Experience in a life sciences research environment is a plus

Around the world, we are passionate about making an impact on the lives of patients with serious diseases. Empowered to apply our individual talents and diverse perspectives in an inclusive culture, our shared values of passion, innovation, urgency, accountability, inclusion and integrity bring out the highest potential of each of our colleagues.

Bristol Myers Squibb recognizes the importance of balance and flexibility in our work environment. We offer a wide variety of competitive benefits, services and programs that provide our employees with the resources to pursue their goals, both at work and in their personal lives. 

Our company is committed to ensuring that people with disabilities can excel through a transparent recruitment process, reasonable workplace adjustments and ongoing support in their roles. Applicants can request approval of an accommodation prior to accepting a job offer. If you require reasonable accommodation in completing this application, or any part of the recruitment process, direct your inquiries to Visit to access our complete Equal Employment Opportunity statement.