DMB (Data Mining Big) is a collection of data analysis tools.
The tools perform knowledge extraction from data. The methods are based on several models and algorithms developed by a team of researchers, most of them members of the computational and systems biology research group. The methods are described in the papers listed on the publication page, while the code can be executed remotely from this website on the servers of IASI-CNR, a research institute of the Italian National Research Council. Results are sent by email or displayed in the web interface.

Focus home page

EBRI (European Brain Research Institute) is a private non-profit institute whose objective is to investigate fundamental questions about the functional organization of the brain and to translate basic brain science into possible cures for diseases affecting the nervous system. IASI is actively collaborating with the Neurotrophic Factors and Neurodegenerative Diseases Laboratory of EBRI, using the logic mining system DMB (developed at IASI) to characterize the gene expression profile of AD11 mice in different brain areas over time, in order to explain the onset of Alzheimer's disease and identify early biomarkers of the pathology.

The structural alignment of two or more proteins has been one of the most studied problems in Bioinformatics and Computational Biology over the last decade. The problem consists in establishing equivalences between two or more polymer structures (proteins or RNAs) based on their sequence of residues, shape and three-dimensional conformation. Different types of alignment arise depending on the information one chooses to consider. Algorithms that consider only the primary sequence of the proteins are usually referred to as sequence alignment algorithms. In contrast to simple sequence alignment, three-dimensional structural alignment fully exploits the knowledge of the tertiary structure of the proteins, that is, the complete information on the coordinates of the atoms composing them. Structural alignment is therefore a valuable tool, especially for comparing proteins with low sequence similarity, where evolutionary relationships cannot be easily detected by standard sequence alignment techniques. Tools for structural alignment are also of paramount importance in assessing the reliability of structure prediction methods: evaluating a prediction often requires a structural alignment between the model and the known true structure. Structural alignments are especially useful in analyzing data from structural genomics and proteomics efforts, and they can serve as reference points for evaluating alignments produced by purely sequence-based bioinformatics methods.
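
To make the idea of comparing atom coordinates concrete, here is a minimal sketch of one simple structural dissimilarity measure, the distance RMSD (dRMSD). It is not the method developed at IASI. It assumes the residue equivalences are already known (point i of one structure is matched to point i of the other), and the coordinates below are invented for illustration. Because it compares intra-structure distances, the score does not change when one structure is rotated or translated.

```python
import math
from itertools import combinations


def drmsd(coords_a, coords_b):
    """Distance RMSD between two equally long lists of (x, y, z) points.

    Point i of coords_a is assumed to be already matched to point i of coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of matched residues")
    if len(coords_a) < 2:
        raise ValueError("need at least two matched residues")
    # Compare every pairwise distance inside A with the same pair inside B.
    squared = [
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in combinations(range(len(coords_a)), 2)
    ]
    return math.sqrt(sum(squared) / len(squared))


def rotate_z_and_shift(points, angle, shift):
    """Rigidly move a structure: rotate about the z axis, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Hypothetical coordinates of four matched residues (illustrative values only).
backbone = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.1, 4.2, 2.0)]
moved = rotate_z_and_shift(backbone, math.pi / 3, (10.0, -2.0, 5.0))
distorted = backbone[:-1] + [(8.1, 4.2, 6.0)]

print(drmsd(backbone, moved))      # rigid motion only: numerically close to zero
print(drmsd(backbone, distorted))  # one residue displaced: clearly above zero
```

A full structural alignment method must also find the equivalences themselves, which is the hard part of the problem described above.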

Among the codes and methods specifically designed for protein structural alignment, those concerned with the binding and active sites of proteins are of particular interest. Indeed, the identification, classification and analysis of protein binding sites is of great interest for drug design and the treatment of diseases. Binding site recognition is generally based on geometry, often combined with the physico-chemical properties of the site, since the conformation, size and chemical composition of the protein surface are all relevant for the interaction with a specific ligand. At IASI we developed a new structural alignment method (CO) in collaboration with the University of Padova. The method is based on a reformulation of the structural alignment problem as a continuous global optimization problem. The method compares favorably with well-known tools for structural alignment such as Multibind and MolLoc.

Single Nucleotide Polymorphisms (SNPs) are single loci of the DNA where mutations occur. At these loci, at least 5% of the individuals of a given population have the same nucleotide value, which differs from the nucleotide value of the remaining individuals. The most frequent nucleotide value is called phase. SNPs capture the relevant variation of the DNA between individuals. The analysis of SNPs in DNA sequences is the main purpose of the HapMap Project of the Division of Extramural Research of the National Human Genome Research Institute, where two problems are studied:

  1. Tag SNP selection: to find the specific nucleotides (called tag SNPs) that identify each haplotype, thereby reducing the number of SNPs required to examine the entire genome for association with a phenotype from the 10 million SNPs that exist to roughly 500,000 tag SNPs.
  2. Haplotype inference: to produce information for studying the genetic factors contributing to variation in response to environmental factors, in susceptibility to infection, and in the effectiveness of and adverse responses to drugs and vaccines.
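
As an illustration of the first problem, the sketch below picks a small set of SNP positions that still tells all haplotypes apart. It is a simple greedy heuristic written for this page, not the optimization approach used at IASI, and the function name `greedy_tag_snps` and the haplotype strings are hypothetical. Haplotypes are assumed to be equal-length strings over `0` and `1`.

```python
from itertools import combinations


def greedy_tag_snps(haplotypes):
    """Greedily choose SNP positions that distinguish every pair of distinct haplotypes."""
    distinct = sorted(set(haplotypes))
    n_sites = len(distinct[0])
    if any(len(h) != n_sites for h in distinct):
        raise ValueError("all haplotypes must have the same length")

    # Every pair of distinct haplotypes still has to be told apart.
    unresolved = set(combinations(range(len(distinct)), 2))
    tags = []
    while unresolved:
        def separated(site):
            return {(i, j) for i, j in unresolved if distinct[i][site] != distinct[j][site]}

        # Pick the site separating the most unresolved pairs; distinct
        # haplotypes differ somewhere, so this always makes progress.
        best = max(range(n_sites), key=lambda site: len(separated(site)))
        unresolved -= separated(best)
        tags.append(best)
    return sorted(tags)


# Hypothetical haplotypes over five SNP sites (illustrative data only).
haplotypes = ["00000", "01011", "10010", "11001"]
tags = greedy_tag_snps(haplotypes)
print(tags)
print(["".join(h[t] for t in tags) for h in haplotypes])
```

The greedy choice is not guaranteed to return the smallest possible set in general; exact formulations are needed for that guarantee.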

At IASI, optimization techniques have been applied to both problems. In particular, various integer programming formulations of the Haplotype Inference Problem with Parsimony have been studied. The research produced the haplotyping software CollHaps (revision 2.0), which is the subject of the focus home page.
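
The parsimony criterion can be illustrated on a toy instance. The sketch below is not CollHaps and does not use integer programming: it simply enumerates every way of resolving each genotype and keeps the solution that uses the fewest distinct haplotypes, which is only feasible for very small inputs. The encoding is an assumption made for this example: each genotype is a string where `0` and `1` are homozygous sites and `2` is a heterozygous site.

```python
from itertools import product


def resolutions(genotype):
    """All unordered haplotype pairs that explain one genotype string over {0, 1, 2}."""
    het = [i for i, g in enumerate(genotype) if g == "2"]
    pairs = set()
    for bits in product("01", repeat=len(het)):
        h1, h2 = list(genotype), list(genotype)
        for pos, bit in zip(het, bits):
            # At a heterozygous site the two haplotypes carry opposite values.
            h1[pos] = bit
            h2[pos] = "1" if bit == "0" else "0"
        pairs.add(tuple(sorted(("".join(h1), "".join(h2)))))
    return sorted(pairs)


def parsimony_haplotyping(genotypes):
    """Exhaustive search for a resolution using the fewest distinct haplotypes."""
    best_used, best_choice = None, None
    for choice in product(*(resolutions(g) for g in genotypes)):
        used = {h for pair in choice for h in pair}
        if best_used is None or len(used) < len(best_used):
            best_used, best_choice = used, choice
    return sorted(best_used), list(best_choice)


# Hypothetical genotypes over two SNP sites (illustrative data only).
haps, pairs = parsimony_haplotyping(["22", "20", "02"])
print(haps)   # the distinct haplotypes used
print(pairs)  # one explaining pair per genotype, in input order
```

The number of candidate resolutions grows exponentially with the number of heterozygous sites, which is why exact approaches rely on formulations such as integer programming rather than enumeration.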

Focus home page

The Consortium for the Barcode of Life (CBOL) is an international initiative devoted to developing DNA barcoding as a global standard for the identification of biological species. DNA barcoding is a new technique that uses a short DNA sequence from a standardized and agreed-upon position in the genome as a molecular diagnostic for species-level identification. IASI takes part in the Data Analysis Working Group of the Consortium, developing ad hoc logic classification algorithms that are able to identify the species from the barcode with high precision.
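
The flavor of a logic rule on barcodes can be sketched as follows. This is a deliberately simplified illustration, not the classification algorithm developed at IASI: it looks only for single "diagnostic" positions, meaning a nucleotide shared by every training barcode of one species and absent from all other species at that position. The species names, sequences and function names are hypothetical, and barcodes are assumed to be aligned and of equal length.

```python
from collections import defaultdict


def diagnostic_rules(training):
    """Map each species to a list of (position, nucleotide) rules learned from training pairs."""
    by_species = defaultdict(list)
    for species, barcode in training:
        by_species[species].append(barcode)
    length = len(training[0][1])

    rules = {}
    for species, seqs in by_species.items():
        others = [s for sp, group in by_species.items() if sp != species for s in group]
        rules[species] = [
            (pos, seqs[0][pos])
            for pos in range(length)
            # Constant within the species and never seen in any other species.
            if all(s[pos] == seqs[0][pos] for s in seqs)
            and all(o[pos] != seqs[0][pos] for o in others)
        ]
    return rules


def classify(barcode, rules):
    """Return the single species whose rules all hold, or None if zero or several match."""
    matches = [
        species
        for species, conds in rules.items()
        if conds and all(barcode[pos] == nuc for pos, nuc in conds)
    ]
    return matches[0] if len(matches) == 1 else None


# Hypothetical aligned barcodes (illustrative data only).
training = [
    ("sp_a", "ACGTAC"), ("sp_a", "ACGTTC"),
    ("sp_b", "ATGTAA"), ("sp_b", "ATGCAA"),
    ("sp_c", "AGGTAG"), ("sp_c", "AGGTTG"),
]
rules = diagnostic_rules(training)
print(rules["sp_a"])
print(classify("ACGCTC", rules))
print(classify("AAGTAT", rules))
```

Real barcode data usually requires rules that combine several positions, since a single diagnostic position per species may not exist.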

Over the last two decades, the biological sciences have undergone a radical transformation through the development of new research technologies that have produced a real explosion in the amount of data available. Consider the modern genomic sequencing techniques that have made sequencing the human genome, various animal and plant organisms, and many microorganisms simpler, less expensive and more reliable, with enormous benefits for the diagnosis and treatment of diseases. In the wake of the genome, many other objects representing biological entities analyzed in their entirety have been defined and studied: from the transcriptome (the complete set of RNA expressed by the cell) to the proteome (the complete set of proteins) to more exotic objects such as the interactome and the metabolome. This huge amount of data is an immense resource for research, but quantity alone is not enough. While in the past the difficulty lay in collecting genetic data, today the challenge is to give them meaning, and it is therefore essential to use effective informatics solutions capable of managing, analyzing and integrating these biological "Big Data".

An excellent solution for integrating data to support research was designed by Dr. Paci, who developed SWIM (SWitchMiner), a freely downloadable open-source software released under the GNU GPL license. The software comes with a wizard-like GUI that greatly simplifies the otherwise complicated procedure and guides the user through a series of successive steps. The software, capable of detecting the genes responsible for major changes in the phenotype of a cell, has so far been successfully applied in two very different fields: viticulture and oncology.

On the one hand, viticulture is undoubtedly a field of great economic and strategic importance and of great cultural value for Italy. In terms of numbers, the grapevine sector generates a turnover of over 100 billion euros a year. The potential impact justifies the choice of applying SWIM to the genome of the grapevine, a project that has led to the identification of key genes in the ripening process of grapes. Thanks to SWIM it is now possible to decipher plant responses to particular conditions or stages of development and to control the quality of wine in response to climate change. The results of this study were published in the prestigious scientific journal The Plant Cell (The Plant Cell 2014, 26, pp. 4617-4635) and were covered by many national newspapers (La Stampa, Il Gazzettino, Gambero Rosso, Corriere del Veneto, Vinoso, Bere il Vino, Agrinews, VQ-Vite, Vino&Qualità, Trebicchieri). For this publication, Dr. Paci received the SysBio 2014 Award for the best publication of the year from the SYSBIO Center for Systems Biology.

On the other hand, oncology is undoubtedly a field of high healthcare, social and economic impact. Cancer is still the second leading cause of death in Italy (30% of all deaths) after cardiovascular disease, with a growing number of cancer patients. It is estimated that in Italy there are 365,000 new cancer diagnoses per year (excluding carcinomas), over 189,000 (52%) among men and over 176,000 (48%) among women. To these far from reassuring numbers must be added the ever-increasing costs that the National Health Service has to bear for anti-cancer therapies. In Italy, the costs are between 50 and 150 thousand euros per year of care, with an increase estimated at 17% in 2018. The main goal of research in this area is the innovation of therapy through the discovery and development of new drugs that can provide incremental benefit both to the patient and to the National Health Service in terms of health and costs. The potential impact justifies the choice of applying SWIM to about twenty different types of tumor, a project that has led to the identification of genes with a key role in neoplastic transformation. Thanks to SWIM it is now possible to identify new potential therapeutic targets for the treatment of different types of cancer. The findings of this study were published in the prestigious journal Scientific Reports, published by Nature (Scientific Reports 2017, 7, Article number: 44797). For this work, Dr. Paci also received the Best Poster Award 2016 from the IEEE Technical Committee on Computational Life Science Society (TCCLS) at the Lipari Computational Microbiology and Microbiome-Based Medicine School.
