Spark R&D Developer (Data Engineer)
Known for its scientific and operational excellence, Regeneron is a leading science-based biopharmaceutical company that discovers, invents, develops, manufactures, and commercializes medicines for the treatment of serious medical conditions. Regeneron commercializes medicines for eye diseases, high LDL-cholesterol, atopic dermatitis and a rare inflammatory condition and has product candidates in development in other areas of high unmet medical need, including rheumatoid arthritis, asthma, pain, cancer and infectious diseases.
We are looking for a Spark R&D Developer to join the Genome Informatics team to expand the big data infrastructure of the Regeneron Genetics Center (RGC) and develop new algorithms and tools to support workflows and analyses throughout the RGC and Regeneron. Specifically, the candidate will collaborate closely with team members at the RGC to (i) establish efficient data representations for genotypes, phenotypes, and association results, (ii) implement scalable production workflows, and (iii) develop novel machine learning approaches to uncover new relationships between genotypes and phenotypes.
The ideal candidate will have a strong background in computer science, specializing in distributed systems and/or machine learning; experience in analyzing large datasets; and strong communication skills, as this job requires collaboration among multiple cross-functional teams.
This position will provide exciting opportunities to work on the bleeding edge of genome informatics and genomic medicine. The RGC hosts a vast amount of data encompassing thousands of phenotypes derived from electronic medical records, integrated with genomic data. Together, these represent a landmark collection of information that will move precision medicine and novel therapeutic discovery forward as a new data-driven paradigm in healthcare.
• Build out a big data distributed architecture capable of efficiently processing terabytes of genomic and clinical data
• Develop algorithms and tools to analyze large datasets consisting of billions of rows
• Develop and deploy machine learning algorithms
• Develop new web applications used by Regeneron scientists to analyze genomic and clinical datasets
• Build automation around various components of the system
• Interact and collaborate with other scientists to clearly define and iterate on requirements
• Keep abreast of new state-of-the-art software technologies and best practices, including Spark, Hadoop, various NoSQL databases, AWS, React, and functional programming
This position requires an MS (Ph.D. preferred) with 3 or more years of experience in computer science, specializing in distributed systems and/or machine learning.
Additional requirements include:
• Expertise in large distributed systems, such as Spark, Hadoop, or related frameworks/databases, is essential
• 3+ years of software engineering experience in a modern object-oriented or functional language (e.g., Scala)
• Experience in developing and applying machine learning algorithms
• Excellent communication and presentation skills required
• Working knowledge of SQL
• Experience with cloud computing (AWS preferred)
• Familiarity with genomics and bioinformatics is preferred, but not required
This is an opportunity to join our select team that is already leading the way in the Pharmaceutical/Biotech industry. Apply today and learn more about Regeneron's unwavering commitment to combining good science & good business.
To all agencies: Please, no phone calls or emails to any employee of Regeneron about this opening. All resumes submitted by search firms/employment agencies to any employee at Regeneron via email, the internet, or in any form and/or method will be deemed the sole property of Regeneron, unless such search firms/employment agencies were engaged by Regeneron for this position and a valid agreement with Regeneron is in place. In the event a candidate who was submitted outside of the Regeneron agency engagement process is hired, no fee or payment of any kind will be paid.
Regeneron is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability status, protected veteran status, or any other characteristic protected by law.
Requisition Number: 12660BR