Computer Science - Science Policy

Connecting the Dots between PubMed Abstracts
Published: Tuesday, January 03, 2012
Author: M. Shahriar Hossain et al.

by M. Shahriar Hossain, Joseph Gresock, Yvette Edmonds, Richard Helm, Malcolm Potts, Naren Ramakrishnan

Background

A multitude of articles, published across a wide range of journals, now provide information about genes, proteins, pathways, and diseases. Each article investigates subsets of a biological process, but to gain insight into the functioning of a system as a whole, we must integrate information from multiple publications. In particular, unraveling relationships between extracellular inputs and downstream molecular response mechanisms requires integrating conclusions from diverse publications.

Methodology

We present an automated approach to biological knowledge discovery from PubMed abstracts, suitable for “connecting the dots” across the literature. We describe a storytelling algorithm that, given a start and end publication, typically with little or no overlap in content, identifies a chain of intermediate publications from one to the other, such that neighboring publications have significant content similarity. The quality of discovered stories is measured using local criteria such as the size of supporting neighborhoods for each link and the strength of individual links connecting publications, as well as global metrics of dispersion. To ensure that the story stays coherent as it meanders from one publication to another, we demonstrate the design of novel coherence and overlap filters for use as post-processing steps.
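To make the idea of chaining publications concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each abstract has already been reduced to a set of terms, uses Jaccard similarity as a stand-in for "significant content similarity", and uses breadth-first search to find a shortest chain. The names `jaccard`, `find_story`, and `min_sim`, and the toy abstracts, are all hypothetical; the neighborhood-size, dispersion, coherence, and overlap criteria described above are not modeled.

```python
from collections import deque


def jaccard(a, b):
    """Jaccard similarity between two sets of terms (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_story(docs, start, end, min_sim=0.3):
    """Return a shortest chain of document ids from start to end in which
    every pair of neighbors has similarity >= min_sim, or None if no
    such chain exists.

    docs maps a document id to its set of terms (hypothetical input format).
    """
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == end:
            # Walk the parent links back to the start to recover the chain.
            story = []
            while current is not None:
                story.append(current)
                current = parents[current]
            return story[::-1]
        for other in docs:
            if other not in parents and jaccard(docs[current], docs[other]) >= min_sim:
                parents[other] = current
                queue.append(other)
    return None


# Toy abstracts: "A" and "D" share no terms, yet are linked through "B" and "C".
abstracts = {
    "A": {"insulin", "receptor", "signaling"},
    "B": {"receptor", "signaling", "kinase"},
    "C": {"signaling", "kinase", "apoptosis"},
    "D": {"kinase", "apoptosis", "caspase"},
    "E": {"yeast", "fermentation"},
}

story = find_story(abstracts, "A", "D", min_sim=0.5)
print(story)  # ['A', 'B', 'C', 'D']
```

In this toy example the start and end documents have no terms in common, but each neighboring pair in the chain shares half of its combined vocabulary. Raising `min_sim` demands stronger individual links at the cost of finding fewer chains.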

Conclusions

We demonstrate the application of our storytelling algorithm to three case studies: i) a many-to-one study exploring relationships between multiple cellular inputs and a molecule responsible for cell-fate decisions, ii) a many-to-many study exploring the relationships between multiple cytokines and multiple downstream transcription factors, and iii) a one-to-one study to showcase the ability to recover a cancer-related association, viz. the Warburg effect, from past literature. The storytelling pipeline helps narrow down a scientist's focus from several hundred thousand relevant documents to only around a hundred stories. We argue that our approach can serve as a valuable discovery aid for hypothesis generation and connection exploration in large unstructured biological knowledge bases.
