Workshop: AI for Science: Mind the Gaps

$\textit{Ab Initio}$ Discovery of Biological Knowledge from scRNA-Seq Data Using Machine Learning

Jiaqi Li · Fanhong Li · Sijie Chen

Abstract: Expectations are high that machine learning (ML) can discover new patterns in high-throughput biological data, but most such efforts rely on existing knowledge to design experiments. The power and limitations of ML in revealing complex patterns from data without the guidance of existing knowledge have rarely been investigated. In this study, we conducted systematic experiments on such $\textit{ab initio}$ knowledge discovery, applying ML methods to single-cell RNA-sequencing data of early embryonic development. The results showed that a strategy combining unsupervised and supervised ML can reveal major cell lineages with minimal involvement of prior knowledge or manual intervention, and that the $\textit{ab initio}$ mining enabled a new discovery about human early embryonic cell differentiation. The study illustrates the feasibility, significance, and limitations of $\textit{ab initio}$ ML knowledge discovery on complex biological problems.