Principal Big-Data Architect (Molecular Profiling and Data Science)

Tarrytown, New York, United States of America
May 27, 2021
Required Education
Master's Degree/MBA
Position Type
Full time
As the Molecular Profiling and Data Science Principal Big-Data Architect, you will architect and build a world-class drug discovery data platform. Your hands-on engineering strengths will be needed to build a scalable platform that enables scientific discovery on petabytes of data. As a lead in this effort, you will collaborate with data scientists and biological scientists to develop machine learning models. Your experience will help you keep the macro point of view in sight as you provide guidance and direction in moving this organization toward a complete data science component.

Your typical day may include:
• Develop and maintain data pipelines to manage large datasets from internal and external sources (such as Bulk and Single Cell RNA-Seq Data, T-cell Receptor and B-cell Receptor data).
• Design data management systems to enable innovative machine learning technologies.
• Work collaboratively with research scientists and technical counterparts to design and implement a data architecture, data pipelines, and data visualizations.
• Take part in daily stand-ups in an Agile Scrum process.
• Work both as a data architect and a hands-on data engineer.
• Architect and develop data pipelines using Spark and Python.
• Contribute as both a peer and a mentor in code reviews.

This job might be for you if you:
• Enjoy working in a fast-paced environment, supporting innovative scientists doing groundbreaking data science and biological science to discover lifesaving medicines.
• Have the ability to embrace the failure that comes with the adventure of discovery.
• Enjoy working in a team environment.
• Enjoy building data solutions that have not yet been accomplished.

To be considered for this role, you must have a minimum of an MS/BS and at least 4 years of relevant experience. You should have deep knowledge of current data management technologies and database design, and strong experience with at least some of the technologies or platforms we use: Python, PySpark, Apache Airflow, Spark and Hive, Jupyter and Zeppelin notebooks, AWS infrastructure, and open source and commercial data management tools and components from Cloudera and Databricks. An understanding of biological data, such as gene expression analysis, is a plus.

Does this sound like you? Apply now to take your first steps toward living the Regeneron Way! We have an inclusive and diverse culture that provides amazing benefits including health and wellness programs, fitness centers and stock for employees at all levels!

Regeneron is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion or belief (or lack thereof), sex, nationality, national or ethnic origin, civil status, age, citizenship status, membership of the Traveler community, sexual orientation, disability, genetic information, familial status, marital or registered civil partnership status, pregnancy or maternity status, gender identity, gender reassignment, military or veteran status, or any other protected characteristic in accordance with applicable laws and regulations. We will ensure that individuals with disabilities are provided reasonable accommodations to participate in the job application process. Please contact us to discuss any accommodations you think you may need.