ETL Developer, Clinical Informatics

Tarrytown, New York, US
Nov 15, 2018
Required Education
Bachelor's Degree
Position Type
Full time
The Regeneron Genetics Center (RGC) is a wholly owned subsidiary of the Company, whose goals are to apply large-scale human genetics to identify new drug targets and to guide the development of therapeutics programs and precision medicine. Building upon Regeneron's strengths in mouse genetics and genetics-driven drug discovery and development, the RGC specializes in ultra-high-throughput exome sequencing, large-scale informatics and data analysis encompassing genomics and electronic health records (EHRs), and translating genetic discoveries into new biology and drug discovery opportunities. The RGC leverages multiple approaches, including large population-based studies, Mendelian genetics and family-based studies, founder population genetics, and large-scale disease-focused projects, and has developed a network of over 50 collaborations with research organizations around the world. Through some of the largest sequencing studies in the world, such as the DiscovEHR study in collaboration with Geisinger Health System and an initiative to sequence 500,000 participants with the UK Biobank, the RGC has built one of the largest human genetics databases, which includes sequence data from several hundred thousand participants and is growing rapidly. Our interests span all therapeutic areas, and the RGC is highly integrated into all facets of research and development at Regeneron. Program goals include target discovery, indication discovery, and patient-disease stratification. Objectives include advancing basic science around the world through public sharing of discoveries, providing clinically valuable insights to physicians and providers of collaborating health-care systems, improving patient outcomes, and identifying novel targets for drug development.
The position will be responsible for creating and maintaining databases for the Clinical Informatics group.
Responsibilities will include designing and maintaining databases, optimizing the performance of databases, developing and executing ETL processes, integrating datasets from outside collaborators into the overall RGC data ecosystem, and potentially developing data models. Data sources will generally be EHRs, both domestic and international, but may include research registries as well. The ETL developer will also be responsible for creating, executing, and optimizing SQL queries on both local and shared data, which includes creating stored procedures, scripts, and other potentially complex functions. The developer may also perform tasks related to database administration. This role will interact with a team of database administrators, biostatisticians, clinical scientists, and programmers to structure and mine clinical and phenotypic data sets and support genomic studies and association analyses. The position will require coordination and collaboration with the Genome Informatics group as well as corporate IT, in addition to scientists within the department, research and clinical scientists at Regeneron, and external collaborators. The position will report to the Head of Clinical Informatics.
Responsibilities
- Help translate research questions into executable SQL queries, and execute those queries
- Develop automated processes for data ingestion and cleaning
- Work to extract and load data from multiple internal and external sources
- Accept raw data files from outside institutions and develop ETL processes to efficiently integrate the data into the RGC data ecosystem
- Work with other developers to create optimized queries
- Ensure data quality
- Collaborate and coordinate closely with external health system collaborators and informatics teams, working with them to structure data and develop algorithms, rules engines, and querying tools to access and curate the phenotypic datasets
- Work with the business users and the technical teams to understand business requirements and translate requirements into data models, refactoring existing models where needed
- Function as a super user of Data Reporting/Management tools
Requirements
This position requires a minimum of 3 years of experience as a database developer. Experience with healthcare data is strongly preferred. Additional requirements include:
- Expert knowledge of SQL
- Experience in a healthcare setting, and familiarity with EHR data. Familiarity with data from clinical trials and registries would also be helpful
- Strong communication skills
- Familiarity with clinical data standards such as ICD, SNOMED, or OMOP is preferred
- Experience working with investigators
- Demonstrated proficiency with Data Reporting/Management tools
- Bachelor's degree required
This is an opportunity to join our select team that is already leading the way in the Pharmaceutical/Biotech industry. Apply today and learn more about Regeneron's unwavering commitment to combining good science & good business.
To all agencies: Please, no phone calls or emails to any employee of Regeneron about this opening. All resumes submitted by search firms/employment agencies to any employee at Regeneron via email, the internet, or in any form and/or method will be deemed the sole property of Regeneron, unless such search firms/employment agencies were engaged by Regeneron for this position and a valid agreement with Regeneron is in place. In the event a candidate who was submitted outside of the Regeneron agency engagement process is hired, no fee or payment of any kind will be paid.
Regeneron is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability status, protected veteran status, or any other characteristic protected by law.