TY - JOUR
T1 - Computational prediction of the human-microbial oral interactome
AU - Coelho, Edgar D.
AU - Arrais, Joel P.
AU - Matos, Sérgio
AU - Pereira, Carlos
AU - Rosa, Nuno
AU - Correia, Maria J.
AU - Barros, Marlene
AU - Oliveira, José L.
N1 - Funding Information:
This work has received support from the RD-CONNECT project (EC contract number 305444). Edgar D. Coelho is funded by Fundação para a Ciência e Tecnologia, FCT, under Grant SFRH/BD/86343/2012.
PY - 2014/2/27
Y1 - 2014/2/27
N2 - Background: The oral cavity is a complex ecosystem where human chemical compounds coexist with a particular microbiota. However, shifts in the normal composition of this microbiota may result in the onset of oral ailments, such as periodontitis and dental caries. In addition, it is known that the microbial colonization of the oral cavity is mediated by protein-protein interactions (PPIs) between the host and microorganisms. Nevertheless, this kind of PPIs is still largely undisclosed. To elucidate these interactions, we have created a computational prediction method that allows us to obtain a first model of the Human-Microbial oral interactome.Results: We collected high-quality experimental PPIs from five major human databases. The obtained PPIs were used to create our positive dataset and, indirectly, our negative dataset. The positive and negative datasets were merged and used for training and validation of a naïve Bayes classifier. For the final prediction model, we used an ensemble methodology combining five distinct PPI prediction techniques, namely: literature mining, primary protein sequences, orthologous profiles, biological process similarity, and domain interactions. Performance evaluation of our method revealed an area under the ROC-curve (AUC) value greater than 0.926, supporting our primary hypothesis, as no single set of features reached an AUC greater than 0.877. After subjecting our dataset to the prediction model, the classified result was filtered for very high confidence PPIs (probability ≥ 1-10-7), leading to a set of 46,579 PPIs to be further explored.Conclusions: We believe this dataset holds not only important pathways involved in the onset of infectious oral diseases, but also potential drug-targets and biomarkers. The dataset used for training and validation, the predictions obtained and the network final network are available at http://bioinformatics.ua.pt/software/oralint.
AB - Background: The oral cavity is a complex ecosystem where human chemical compounds coexist with a particular microbiota. However, shifts in the normal composition of this microbiota may result in the onset of oral ailments, such as periodontitis and dental caries. In addition, it is known that the microbial colonization of the oral cavity is mediated by protein-protein interactions (PPIs) between the host and microorganisms. Nevertheless, this kind of PPIs is still largely undisclosed. To elucidate these interactions, we have created a computational prediction method that allows us to obtain a first model of the Human-Microbial oral interactome.Results: We collected high-quality experimental PPIs from five major human databases. The obtained PPIs were used to create our positive dataset and, indirectly, our negative dataset. The positive and negative datasets were merged and used for training and validation of a naïve Bayes classifier. For the final prediction model, we used an ensemble methodology combining five distinct PPI prediction techniques, namely: literature mining, primary protein sequences, orthologous profiles, biological process similarity, and domain interactions. Performance evaluation of our method revealed an area under the ROC-curve (AUC) value greater than 0.926, supporting our primary hypothesis, as no single set of features reached an AUC greater than 0.877. After subjecting our dataset to the prediction model, the classified result was filtered for very high confidence PPIs (probability ≥ 1-10-7), leading to a set of 46,579 PPIs to be further explored.Conclusions: We believe this dataset holds not only important pathways involved in the onset of infectious oral diseases, but also potential drug-targets and biomarkers. The dataset used for training and validation, the predictions obtained and the network final network are available at http://bioinformatics.ua.pt/software/oralint.
KW - Bayesian classification
KW - Oral interactome
KW - Protein-protein interactions
UR - http://www.scopus.com/inward/record.url?scp=84897570043&partnerID=8YFLogxK
U2 - 10.1186/1752-0509-8-24
DO - 10.1186/1752-0509-8-24
M3 - Article
C2 - 24576332
AN - SCOPUS:84897570043
SN - 1752-0509
VL - 8
JO - BMC Systems Biology
JF - BMC Systems Biology
IS - 1
M1 - 24
ER -