Lead R&D Informatics Developer

Tarrytown, NY, United States
May 13, 2019
Required Education
Master's Degree/MBA
Position Type
Full time
Regeneron Genetics Center, LLC is seeking a Lead Research & Development Informatics Developer for our Tarrytown, New York location. The position will be part of the Genome Informatics Team and will build and scale out the Regeneron Genetics Center's (RGC's) big data infrastructure to support various workflows and analyses throughout RGC and Regeneron. Use leading-edge data technologies to advance science and healthcare. Build out a big data distributed systems architecture capable of efficiently processing terabytes of clinical and genomic data. Develop and maintain web applications used by scientists to query and mine the data sets produced by the RGC. Design and develop parallelized algorithms and tools to analyze large graph models consisting of millions of nodes and billions of edges. Collaborate with other team members to develop novel and scalable machine learning approaches for mining genotypic and phenotypic data. Build automation around various components of the system. Interact with scientists and staff to clearly define and iterate on requirements. Keep abreast of the latest advances in software technologies (e.g., Apache Spark, Play Framework, Scala, AWS, React).

Qualified applicants will possess a Master's degree in Computer Science, or a closely related field, and 1 year of experience with genomics, bioinformatics and clinical informatics. Must possess 1 year of experience: programming in a modern object-oriented language, including Scala and Python; with client-side software development, including HTML, JavaScript, jQuery, CSS, D3, Node.js and React; with SQL and NoSQL databases, including Hive, MongoDB and MySQL; with Python and Scala libraries used for Data Science and Machine Learning, including Pandas, Scikit-learn, NumPy, SciPy, spaCy, Matplotlib and NLTK; with the Linux command line and bash scripting; with the Software Development Life Cycle (SDLC) and various software development models, including Agile Development; and analyzing the time complexities of key data structures and algorithms. Must also possess 1 year of experience: with distributed computing frameworks, including Apache Spark and Hadoop; with the web development frameworks Play, Django and React to build web applications in Scala and Python; developing and applying Machine Learning and Natural Language Processing algorithms on large-scale data; and with cloud computing, including AWS. Must possess 1 year of experience: using the Spark ecosystem, including Spark Core, Spark Streaming, Spark MLlib and GraphX; managing and deploying web-based applications on a cloud platform; and with testing and mocking frameworks, including ScalaTest and ScalaMock.

To apply for this position, please submit your resume, indicating Requisition Code 16545BR, to:
George Peterson
Regeneron Pharmaceuticals, Inc.
777 Old Saw Mill River Road
Tarrytown, New York 10591