Semi-Supervised Learning Using Hierarchical Mixture Models: Gene Essentiality Case Study

Daniels, Michael W. and Dvorkin, Daniel and Powers, Rani K. and Kechris, Katerina (2021) Semi-Supervised Learning Using Hierarchical Mixture Models: Gene Essentiality Case Study. Mathematical and Computational Applications, 26 (2). p. 40. ISSN 2297-8747

Abstract

Integrating gene-level data is useful for predicting the role of genes in biological processes. This problem has typically been approached as supervised classification, which requires large training sets of positive and negative examples. However, training data sets that are too small for supervised approaches can still provide valuable information. We describe a hierarchical mixture model that uses a limited set of positively labeled genes as training data for semi-supervised learning. We focus on the problem of predicting essential genes, that is, genes required for the survival of an organism under particular conditions. Using cross-validation, we found that including positively labeled samples in a semi-supervised learning framework with the hierarchical mixture model improves the detection of essential genes compared to unsupervised, supervised, and other semi-supervised approaches. Prediction performance was also improved in settings where genes are incorrectly assumed to be non-essential. Our comparisons indicate that incorporating even small amounts of existing knowledge improves the accuracy of prediction and decreases variability in predictions. Although we focused on gene essentiality, the hierarchical mixture model and semi-supervised framework apply to other problems that involve predicting genes or other features, where multiple data types characterize each feature and only a small set of positive labels is available.
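
The idea of using a few positive labels inside a mixture model can be illustrated with a small sketch. The code below is not the authors' hierarchical model: it is a simplified, non-hierarchical two-component Gaussian mixture on a single synthetic score, fitted by EM, in which genes known to be essential are clamped to the "essential" component. The function name `fit_semisupervised_gmm`, the synthetic data, and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def normal_pdf(x, mu, sd):
    """Density of a univariate normal distribution."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fit_semisupervised_gmm(x, known_positive, n_iter=200, tol=1e-8):
    """EM for a two-component 1-D Gaussian mixture (hypothetical helper).

    Component 1 plays the role of "essential". Points flagged in
    known_positive always receive responsibility 1 for component 1;
    all other points are treated as unlabeled.
    """
    x = np.asarray(x, dtype=float)
    known_positive = np.asarray(known_positive, dtype=bool)

    # Initialise component 1 from the labeled positives, component 0 from the rest.
    mu = np.array([x[~known_positive].mean(), x[known_positive].mean()])
    sd = np.array([x.std(), x.std()]) + 1e-6
    pi = 0.5  # prior probability of component 1
    prev_ll = -np.inf

    for _ in range(n_iter):
        # E-step: posterior probability of component 1 for every point.
        dens0 = (1.0 - pi) * normal_pdf(x, mu[0], sd[0])
        dens1 = pi * normal_pdf(x, mu[1], sd[1])
        total = dens0 + dens1 + 1e-300  # guard against division by zero
        resp = dens1 / total
        resp[known_positive] = 1.0  # clamp the labeled positives

        # Log-likelihood under the current parameters: labeled points
        # contribute the joint density with component 1, unlabeled points
        # contribute the marginal mixture density.
        ll = np.sum(np.log(total[~known_positive])) + np.sum(
            np.log(dens1[known_positive] + 1e-300)
        )

        # M-step: weighted updates of means, standard deviations, and prior.
        w1, w0 = resp, 1.0 - resp
        mu = np.array([np.sum(w0 * x) / w0.sum(), np.sum(w1 * x) / w1.sum()])
        sd = np.sqrt(
            np.array(
                [
                    np.sum(w0 * (x - mu[0]) ** 2) / w0.sum(),
                    np.sum(w1 * (x - mu[1]) ** 2) / w1.sum(),
                ]
            )
        ) + 1e-6
        pi = resp.mean()

        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return resp, mu, sd, pi


# Synthetic example: 200 "non-essential" and 50 "essential" scores,
# with only 5 of the essential genes labeled.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 1.0, 50)])
labeled = np.zeros(scores.size, dtype=bool)
labeled[200:205] = True

resp, mu, sd, pi = fit_semisupervised_gmm(scores, labeled)
```

Here `resp` holds, for each gene, the estimated probability of belonging to the "essential" component. Clamping the labeled positives anchors that component to known essential genes, which is one simple way to inject a small amount of prior knowledge into an otherwise unsupervised mixture fit.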

Item Type: Article
Uncontrolled Keywords: semi-supervised; hierarchical mixture models; essential genes; genomic; integration
Subjects: SCI Archives > Mathematical Science
Depositing User: Managing Editor
Date Deposited: 10 Nov 2022 05:19
Last Modified: 06 Aug 2024 06:17
URI: http://science.classicopenlibrary.com/id/eprint/120
