Biochemistry - Chemistry - Mathematics - Neurological Disorders

Interpretation and Visualization of Non-Linear Data Fusion in Kernel Space: Study on Metabolomic Characterization of Progression of Multiple Sclerosis
Published: Friday, June 08, 2012
Author: Agnieszka Smolinska et al.

by Agnieszka Smolinska, Lionel Blanchet, Leon Coulier, Kirsten A. M. Ampt, Theo Luider, Rogier Q. Hintzen, Sybren S. Wijmenga, Lutgarde M. C. Buydens


In the last decade, data fusion has become widespread in the field of metabolomics. Linear data fusion is the most common approach. However, many datasets display non-linear parameter dependences, and linear methods are bound to fail in such situations. We used proton Nuclear Magnetic Resonance and Gas Chromatography-Mass Spectrometry, two well-established techniques, to generate metabolic profiles of cerebrospinal fluid from individuals with Multiple Sclerosis (MScl). These datasets represent non-linearly separable groups. Thus, extracting the relevant information and combining the datasets requires a dedicated data fusion framework.


The main aim is to demonstrate a novel approach to data fusion for classification; the approach is applied to metabolomics datasets from patients suffering from MScl at different stages of the disease. The approach involves data fusion in kernel space and consists of four main steps. The first step is to extract the significant information per data source using Support Vector Machine Recursive Feature Elimination, which selects a set of relevant variables. In the second step, the optimized kernel matrices are merged by linear combination. In the third step, the merged datasets are analyzed with a classification technique, namely Kernel Partial Least Squares Discriminant Analysis. In the final step, the variables in kernel space are visualized and their significance is established.
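The second and third steps can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes an RBF kernel per data block, a fixed fusion weight `mu`, and a single-response kernel PLS model with one latent variable; the variable selection step (SVM-RFE) and the visualization step are omitted. The synthetic blocks `X_nmr` and `X_gcms`, all function names, and the values of `sigma` and `mu` are hypothetical choices made for illustration only.

```python
import numpy as np

def make_block(rng, labels):
    # Synthetic 2-D block: class +1 is a central cluster, class -1 a surrounding ring,
    # so the classes are not linearly separable.
    n = labels.size
    radius = np.where(labels > 0, rng.uniform(0.0, 1.0, n), rng.uniform(2.5, 3.5, n))
    angle = rng.uniform(0.0, 2.0 * np.pi, n)
    return np.column_stack([radius * np.cos(angle), radius * np.sin(angle)])

def rbf_kernel(A, B, sigma):
    # Gaussian kernel between the rows of A and the rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fuse(K1, K2, mu):
    # Linear combination of two kernel matrices (mu is an assumed fixed weight).
    return mu * K1 + (1.0 - mu) * K2

def center_train(K):
    # Center the training kernel in feature space.
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def center_test(K_test, K_train):
    # Center a test-vs-train kernel using training statistics only.
    n = K_train.shape[0]
    m = K_test.shape[0]
    one_n = np.ones((n, n)) / n
    one_mn = np.ones((m, n)) / n
    return K_test - one_mn @ K_train - K_test @ one_n + one_mn @ K_train @ one_n

def fit_kpls_one_lv(Kc, y):
    # Kernel PLS with a single response and a single latent variable:
    # score t = Kc y / ||Kc y||, y-loading c = y^T t, dual coefficients alpha.
    yc = y - y.mean()
    w = Kc @ yc
    norm = np.linalg.norm(w)
    c = yc @ (w / norm)
    return yc * c / norm, y.mean()

def predict(Kc_test, alpha, y_mean):
    # Discriminant step: threshold the predicted response at zero.
    return np.where(Kc_test @ alpha + y_mean >= 0.0, 1, -1)

rng = np.random.default_rng(0)
y_train = np.repeat([1, -1], 20)
y_test = np.repeat([1, -1], 5)

# Two hypothetical data blocks measured on the same samples.
X_nmr, X_gcms = make_block(rng, y_train), make_block(rng, y_train)
T_nmr, T_gcms = make_block(rng, y_test), make_block(rng, y_test)

sigma, mu = 1.0, 0.5  # assumed values, not tuned
K_fused = fuse(rbf_kernel(X_nmr, X_nmr, sigma), rbf_kernel(X_gcms, X_gcms, sigma), mu)
K_fused_test = fuse(rbf_kernel(T_nmr, X_nmr, sigma), rbf_kernel(T_gcms, X_gcms, sigma), mu)

alpha, y_mean = fit_kpls_one_lv(center_train(K_fused), y_train.astype(float))
y_pred = predict(center_test(K_fused_test, K_fused), alpha, y_mean)
accuracy = float((y_pred == y_test).mean())
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

In practice the kernel width per block and the fusion weight would be chosen by cross-validation rather than fixed in advance, and each block would first be reduced to its selected variables.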


We find that fusion in kernel space allows for efficient and reliable discrimination of classes (MScl and early stage). This data fusion approach achieves better class prediction accuracy than analysis of the individual datasets and the commonly used mid-level fusion. The prediction accuracy on an independent test set (8 samples) reaches 100%. Additionally, the classification model obtained on fused kernels is simpler: a single latent variable was sufficient. Finally, visualization of variable importance in kernel space was achieved.