BioSpace.com

PLoS By Category | Recent PLoS Articles
Computer Science

Mining GO Annotations for Improving Annotation Consistency
Published: Wednesday, July 25, 2012
Author: Daniel Faria et al.

by Daniel Faria, Andreas Schlicker, Catia Pesquita, Hugo Bastos, António E. N. Ferreira, Mario Albrecht, André O. Falcão

Despite the structure and objectivity provided by the Gene Ontology (GO), the annotation of proteins is a complex task that is subject to errors and inconsistencies. Electronically inferred annotations in particular are widely considered unreliable. However, given that manual curation of all GO annotations is unfeasible, it is imperative to improve the quality of electronically inferred annotations. In this work, we analyze the full set of GO molecular function annotations of UniProtKB proteins and discuss some of the issues that affect their quality, focusing particularly on the lack of annotation consistency. Based on our analysis, we estimate that 64% of the UniProtKB proteins are incompletely annotated, and that inconsistent annotations affect 83% of the protein functions and at least 23% of the proteins. Additionally, we present and evaluate a data mining algorithm, based on the association rule learning methodology, for identifying implicit relationships between molecular function terms. The goal of this algorithm is to assist GO curators in updating GO and correcting and preventing inconsistent annotations. Our algorithm predicted 501 relationships with an estimated precision of 94%, whereas the basic association rule learning methodology predicted 12,352 relationships with a precision below 9%.
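
As a rough illustration of the basic association rule learning idea the abstract refers to, the sketch below counts how often pairs of function terms are annotated to the same protein and keeps rules "A implies B" whose support and confidence pass a threshold. It is not the authors' refined algorithm, whose details are not given here. The protein IDs (P1 to P5), term labels (T1 to T3), the function name mine_rules, and the thresholds are all made up for illustration.

```python
from collections import Counter
from itertools import permutations


def mine_rules(annotations, min_support=2, min_confidence=0.8):
    """Return sorted (antecedent, consequent, support, confidence) tuples.

    annotations maps a protein ID to the set of terms annotated to it.
    support    = number of proteins annotated with both terms
    confidence = support / number of proteins annotated with the antecedent
    """
    term_counts = Counter()
    pair_counts = Counter()
    for terms in annotations.values():
        unique = set(terms)
        term_counts.update(unique)
        # Ordered pairs, so A -> B and B -> A are scored separately.
        pair_counts.update(permutations(sorted(unique), 2))

    rules = []
    for (a, b), support in pair_counts.items():
        confidence = support / term_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


# Toy data (hypothetical): every protein with T1 also has T2,
# but one protein has T2 without T1.
annotations = {
    "P1": {"T1", "T2"},
    "P2": {"T1", "T2"},
    "P3": {"T1", "T2", "T3"},
    "P4": {"T2"},
    "P5": {"T3"},
}

rules = mine_rules(annotations)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}  support={support}  confidence={confidence:.2f}")
```

On this toy input only the rule T1 -> T2 survives, since the reverse direction has a confidence of 3/4. A plain miner of this kind returns every co-occurrence that clears the thresholds, which is consistent with the abstract's report that the basic methodology yields many candidate relationships at low precision and that additional filtering is needed.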