Publications of Giovanni Felici

This page shows all publications that appeared in the IASI annual research reports. Authors currently affiliated with the Institute are always listed with their full names.

Publications of the year 2013, with author Felici G., in the category IASI Research Reports:


IASI Research Report n. 13-16

Weitschek E., Giulia Fiscon, Giovanni Felici

Supervised Learning Meets DNA Barcoding Species Classification

ABSTRACT
Specific gene regions known as DNA Barcodes are the core of DNA Barcoding, a technique for identifying species in most kingdoms of life. Reliable methods and algorithms support the solution of the DNA Barcode sequence classification problem, whose aim is to assign an unknown specimen to a known species through the analysis of its Barcode. In this work we propose supervised machine learning methods as an effective approach to species classification from DNA Barcode sequences. Classifiers from the Weka machine learning toolkit are selected to carry out the DNA Barcode analysis. Specifically, the tree-based J48, the rule-based RIPPER (JRIP), the Bayesian Naïve Bayes, and the function-based support vector machine (SMO) are evaluated on simulated and experimental datasets, i.e., publicly available datasets from the animal, fungal and plant kingdoms. Then, well-established DNA Barcode classification methods (e.g., the phylogenetic-tree-based NJ and PAR, the similarity-based BLAST, and the character-based DNA-BAR and BLOG) are compared with the Weka classification results. The classification analysis on synthetic and real datasets shows that supervised machine learning methods are promising candidates for successfully handling the DNA Barcoding species classification task. Indeed, the results show that Support Vector Machines and Naïve Bayes outperform, on average, the other analyzed classifiers. However, they do not provide a clear, human-interpretable classification model. The rule-based methods, in contrast, report the diagnostic positions and nucleotide assignments, although their classification performance is slightly lower.
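As a rough illustration of the supervised setting described in the abstract, the sketch below treats each position of an aligned barcode as a categorical feature and trains a small Naïve Bayes classifier. It is not the report's Weka pipeline: it is a minimal pure-Python stand-in, and the toy sequences, the species names, the gap symbol, and the function names `train_naive_bayes` and `classify` are all assumptions made for this example. Sequences are assumed to be aligned and of equal length.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT-"  # nucleotides plus a gap symbol (an assumption of this sketch)


def train_naive_bayes(sequences, labels):
    """Count per-species nucleotide frequencies at every aligned position."""
    class_counts = Counter(labels)
    # site_counts[species][position] -> Counter of nucleotides seen there
    site_counts = defaultdict(lambda: defaultdict(Counter))
    for seq, species in zip(sequences, labels):
        for pos, base in enumerate(seq):
            site_counts[species][pos][base] += 1
    return class_counts, site_counts


def classify(seq, class_counts, site_counts):
    """Return the species with the highest log-posterior for one barcode."""
    total = sum(class_counts.values())
    best_species, best_score = None, -math.inf
    for species, n in class_counts.items():
        score = math.log(n / total)  # log prior of the species
        for pos, base in enumerate(seq):
            # Laplace smoothing keeps unseen nucleotides from zeroing the score
            count = site_counts[species][pos][base]
            score += math.log((count + 1) / (n + len(ALPHABET)))
        if score > best_score:
            best_species, best_score = species, score
    return best_species


# Toy, made-up aligned barcodes; the species names are hypothetical.
train_seqs = ["ACGTAC", "ACGTAT", "TCGAAC", "TCGAAT"]
train_labels = ["species_A", "species_A", "species_B", "species_B"]

model = train_naive_bayes(train_seqs, train_labels)
prediction = classify("ACGTAA", *model)
print(prediction)
```

In this toy data, positions 1 and 4 (counting from 1) are the ones that separate the two species. A rule-based learner would report such diagnostic positions explicitly, whereas the Naïve Bayes scores above fold them into a single probability.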