June 17, 2013 -- Scientists at A*STAR’s Genome Institute of Singapore (GIS) have developed a revolutionary method to quickly cut through noise and generate a unified and simplified analysis of high-throughput biological data from, for example, patient samples. The technique, known as a pre-whitening matched filter, is well known in electrical engineering and widely used in cell phones and radar. This is the first time, however, that computational scientists have adapted it to the analysis of high-throughput DNA sequencing data, and the results were surprisingly accurate. The team was led by Dr Shyam Prabhakar, Associate Director, Integrated Genomics, GIS. The development was recently published in the prestigious journal Nature Biotechnology.
High-throughput DNA sequencing has revolutionized the study of molecular biology and human disease. The technology has yielded major insights into cancer, infectious diseases, Parkinson’s disease and many developmental disorders.
The main difficulty facing this technology is the massive amount of data it generates. In addition, it was generally believed that a different method of analysis was required for each type of sequence data. Hence, each new data type was treated as a completely new analysis problem, resulting in a tremendous number of different analytical methods.
Dr Prabhakar and his team at the GIS, however, discovered that the pre-whitening matched filter technique gave results that were uniformly better than those of existing algorithms across a whole range of analysis tasks. In essence, the technique was applied to accurately detect segments of the genome that stood out from the rest of the sequence data. This was possible because, as lead author Dr Vibhor Kumar quickly realized, the mathematics underlying the solution to all of these analysis problems was the same.
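For readers curious about the general idea, the following is a minimal Python sketch of a pre-whitening matched filter. It is not the team's DFilter code. The expected signal shape (`template`), the noise covariance (`noise_cov`), the toy read-depth values and all function names are assumptions chosen purely for illustration. The filter weights are obtained by solving C w = s, where C is the noise covariance and s is the expected signal shape, and the weights are then slid along the data to score each position.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def matched_filter_weights(template, noise_cov):
    """Pre-whitening matched filter: weights w satisfy noise_cov * w = template."""
    return solve(noise_cov, template)


def scan(signal, weights):
    """Slide the filter along the signal and return one score per window start."""
    n = len(weights)
    return [sum(w * x for w, x in zip(weights, signal[i:i + n]))
            for i in range(len(signal) - n + 1)]


# Hypothetical inputs: a peak-shaped template and neighbour-correlated noise.
template = [1.0, 3.0, 1.0]
noise_cov = [[0.5 ** abs(i - j) for j in range(3)] for i in range(3)]

# Toy background with the template added at position 5 (illustrative values only).
signal = [0.2, -0.1, 0.3, 0.0, -0.2, 0.1, 0.2, -0.3, 0.1, 0.0, -0.1, 0.2]
for k, value in enumerate(template):
    signal[5 + k] += value

weights = matched_filter_weights(template, noise_cov)
scores = scan(signal, weights)
best = max(range(len(scores)), key=lambda i: scores[i])
print(best)  # window start with the highest score; 5 for this toy input
```

In this sketch the highest score falls on the window where the template was embedded. Because the noise is assumed to be correlated between neighbouring positions, the solved weights differ from the raw template: the flanking weights come out negative, which discounts fluctuations shared across neighbours.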
The team was also able to use a variant of the technique to accurately predict gene expression from epigenomic data. In other words, they could predict the activity levels of genes from data on chemical changes in the genetic material. This is especially significant in clinical settings, since gene expression is difficult to measure by conventional methods in old and degraded tissue samples.
“Our work fits into the pattern of applying engineering solutions to data analytics problems, and we are excited about using our approach to uncover important features of human disease,” said Dr Prabhakar. “This discovery will make it a lot easier for scientists to make biological inferences from high-throughput DNA data, particularly in the context of clinical samples from patients.”
GIS Executive Director Prof Ng Huck Hui said, “This is a classic work of high performance computational biology that provides an analytical solution for a complex big data era. With this development, Dr Prabhakar’s team brings us one big leap further and faster in scientific high-throughput sequencing work.”
Dr Rob Mitra, Alvin Goldfarb Distinguished Professor of Computational Biology and Associate Professor, Department of Genetics at the Washington University School of Medicine said, “This work provides an elegant solution to a ubiquitous problem: separating the signal from the noise in deep-sequencing datasets. The DFilter algorithm represents a significant advance because it is widely applicable and because it is more accurate than existing algorithms. DFilter can be used to analyze virtually any sequence-tag analysis of DNA binding (e.g. ChIP-Seq, DNASE-Seq, or FAIRE-Seq), and since it uses the mathematically optimal linear discriminant, it was able to outperform all of the existing tools that were developed specifically for each type of assay.”
Dr Olli Yli-Harja, Professor in the Department of Signal Processing in Tampere University of Technology, Finland, added, “This is very inspiring for signal processing researchers, as this study demonstrates the great benefit of systematic benchmarking in signal estimation and heterogeneous sample analysis for data generated by next-generation sequencing.”
Research publication:
The research findings described in this press release were published in the 16 June 2013 advance online issue of Nature Biotechnology under the title “Uniform, optimal signal processing of mapped deep-sequencing data”.
Authors:
Vibhor Kumar¹, Masafumi Muratani², Nirmala Arul Rayan¹, Petra Kraus³,⁴, Thomas Lufkin³,⁴, Huck Hui Ng²,³, Shyam Prabhakar¹#
1. Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672
2. Gene Regulation Laboratory, Genome Institute of Singapore, Singapore 138672
3. Stem Cell and Developmental Biology, Genome Institute of Singapore, Singapore 138672
4. Department of Biology, Clarkson University, Potsdam, NY 13699 USA
Corresponding author: Dr Shyam Prabhakar
Email: prabhakars@gis.a-star.edu.sg
Genome Institute of Singapore, 60 Biopolis Street #02-01, Singapore 138672
Telephone: (+65) 6808 8046 Fax: (+65) 6478 9004
Contact
Winnie Lim
Genome Institute of Singapore
Office of Corporate Communications
Tel: (65) 6808 8013
Email: limcp2@gis.a-star.edu.sg
About the Genome Institute of Singapore (GIS)
The Genome Institute of Singapore (GIS) is an institute of the Agency for Science, Technology and Research (A*STAR). It has a global vision that seeks to use genomic sciences to improve public health and public prosperity. Established in 2001 as a centre for genomic discovery, the GIS will pursue the integration of technology, genetics and biology towards the goal of individualized medicine.
The key research areas at the GIS include Systems Biology, Stem Cell & Developmental Biology, Cancer Biology & Pharmacology, Human Genetics, Infectious Diseases, Genomic Technologies, and Computational & Mathematical Biology. The genomics infrastructure at the GIS is utilized to train new scientific talent, to function as a bridge for academic and industrial research, and to explore scientific questions of high impact. www.gis.a-star.edu.sg
About the Agency for Science, Technology and Research (A*STAR)
The Agency for Science, Technology and Research (A*STAR) is Singapore’s lead public sector agency that fosters world-class scientific research and talent to drive economic growth and transform Singapore into a vibrant knowledge-based and innovation driven economy.
In line with its mission-oriented mandate, A*STAR spearheads research and development in fields that are essential to growing Singapore’s manufacturing sector and catalysing new growth industries. A*STAR supports these economic clusters by providing intellectual, human and industrial capital to its partners in industry.
A*STAR oversees 20 biomedical sciences and physical sciences and engineering research entities, located in Biopolis and Fusionopolis as well as their vicinity. These two R&D hubs house a bustling and diverse community of local and international research scientists and engineers from A*STAR’s research entities as well as a growing number of corporate laboratories.
www.a-star.edu.sg