OmniTier Debuts CompStor™ Novos Platform for De Novo Assembly-based Variant Calling
With a distributed, scalable compute architecture implemented on commodity servers, CompStor Novos enables the accurate variant calling needed for higher-throughput genomic informatics pipelines at lower cost. More robust and affordable informatics solutions such as CompStor Novos help more institutions leverage next-generation sequencing (NGS) techniques and improve patient well-being through precision medicine.
“Variant calling is an important problem in personalized medicine that is usually addressed by aligning reads to the reference genome, an approach that often misses large structural variations,” said Dr. Pavel Pevzner, Ronald R. Taylor professor of computer science and director of the NIH Center for Computational Mass Spectrometry at the University of California, San Diego, who has published numerous foundational papers on de novo assembly algorithms. “Although an alternative assembly-based approach is better suited for finding such variations, it is still rarely used due to its high computational cost. OmniTier’s CompStor Novos represents an important advancement in enabling de novo assembly-based pipelines.”
New results from a joint study by researchers at Mayo Clinic’s Center for Individualized Medicine and OmniTier show that the CompStor Novos WGS bioinformatics platform effectively addresses both major categories of variants available from a de novo assembly-based approach:
- It achieves short-variant statistics comparable to the most reliable alignment-based pipeline, the Broad Institute’s BWA-MEM aligner paired with the GATK HaplotypeCaller;
- It reveals longer variants not found with the alignment method.
The joint-study results have been submitted to Nature Methods for publication. A preprint of that article is available here.
Beyond speed and accuracy: easy to use, scalable, cost-efficient, multi-purpose, and versatile
With its powerful memory-centric distributed compute architecture, the CompStor Novos platform can support complete, integrated informatics pipelines for a wide variety of genomics applications. It enables:
- Simplified use via a user-friendly web-based interface, with a one-button workflow from a FASTQ input dataset from Illumina sequencers to an output variant call format (VCF) file. Data ingress, egress, and placement across the server nodes are optimized automatically.
- A scale-out solution that lets customers expand beyond 2- or 4-node configurations as compute requirements grow for larger problems and greater throughput.
- A cost-efficient solution built from commodity x86 servers, with tiered DRAM and low-cost NVMe SSDs that expand the computational memory beyond the common DRAM-only approaches.
- A multi-purpose platform that supports both alignment-based and de novo assembly-based variant calling using the highly trusted GATK variant caller from the Broad Institute. Beyond short variants, information generated internally by the de novo assembler is made available to users to identify complex indels and structural variants.
- A versatile platform that lets customers expand functionality beyond variant calling by adding software modules from OmniTier, such as big data analytics, artificial intelligence support, and other “omics” assembly solutions on its roadmap.
As a benchmark, CompStor Novos completed FASTQ-to-VCF processing of a 36x coverage WGS dataset in 3 hours based on a 4-node configuration—a time comparable to many alignment-based pipelines. Additional nodes can further reduce the processing time. The platform is designed to handle FASTQ files of up to 800x coverage depth.
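To put the benchmark figures in perspective, the arithmetic can be sketched as below. This is only a rough illustration, not an OmniTier tool: the function names are hypothetical, the genome size of about 3.1 gigabases assumes a human sample (the benchmark does not state the organism), and the throughput figure assumes runs execute back to back on one 4-node cluster with no idle time.

```python
"""Back-of-the-envelope figures derived from the quoted benchmark."""

# Values taken from the benchmark statement.
COVERAGE = 36      # sequencing depth (36x)
RUN_HOURS = 3.0    # FASTQ-to-VCF wall-clock time
NODES = 4          # servers in the benchmarked configuration

# Assumption (not from the source): approximate human genome size in gigabases.
HUMAN_GENOME_GBASES = 3.1


def sequenced_gigabases(coverage: float, genome_gbases: float = HUMAN_GENOME_GBASES) -> float:
    """Approximate number of sequenced bases (in gigabases) in the input."""
    return coverage * genome_gbases


def genomes_per_day(run_hours: float) -> float:
    """Genomes processed per 24 hours if runs execute back to back."""
    return 24.0 / run_hours


def node_hours_per_genome(run_hours: float, nodes: int) -> float:
    """Total server time consumed by one genome."""
    return run_hours * nodes


if __name__ == "__main__":
    print(f"Input size: ~{sequenced_gigabases(COVERAGE):.0f} gigabases")
    print(f"Throughput: {genomes_per_day(RUN_HOURS):.0f} genomes/day per {NODES}-node cluster")
    print(f"Cost proxy: {node_hours_per_genome(RUN_HOURS, NODES):.0f} node-hours per genome")
```

Under these assumptions, a 3-hour run corresponds to 8 genomes per day and 12 node-hours per genome on the 4-node configuration. How these figures change with additional nodes is not stated in the benchmark, so no scaling model is included.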
UCSD’s Dr. Pavel Pevzner explains that “de novo assemblers are generally available for use on a single compute platform: either a single server, which is slow but low-cost, or a supercomputer, which is fast but expensive. By introducing parallelized de novo assembly and efficiently aggregating the compute and memory resources on commodity servers, the CompStor Novos tiered memory solution offers the opportunity to speed up assembly-based workflows and extend such solutions to other time-extensive applications in genomics, transcriptomics, and proteomics.”
Mayo Clinic’s Dr. Alexej Abyzov, computational genomicist and biologist, senior associate consultant and assistant professor of biomedical informatics, relates that “discovering and analyzing structural variants accurately is an important area in personal genomics, yet it is underserved by the mainstream alignment-based methods. OmniTier’s platform provides precise, fast, and efficient means for researchers and clinicians to pursue such discoveries.”
“Inefficient informatics pipelines are hampering the genomic industry’s ability to use higher accuracy methods while handling the projected exponential growth in the number of genomes sequenced annually,” said Dr. Hemant Thapar, founder and CEO of OmniTier. “The substantial throughput and cost improvements made available by our flagship CompStor Novos platform will enable researchers to adopt de novo assembly techniques for improved diagnostics, and to speed up the pace of precision medicine.”
About OmniTier Inc.
OmniTier develops software for bioinformatics, scientific computing, and web services applications. Its integrated software solutions accelerate data-intensive infrastructure applications, including genomic workflows, high-performance object (K-V) caching, and scientific analysis for machine learning and AI. Founded in February 2015, the company has R&D operations in Milpitas, CA and Rochester, MN. The OmniTier R&D team has a proven track record of delivering “industry-first” technology and products.
CompStor™ is a trademark of OmniTier, Inc.
Source: OmniTier, Inc.