Decision Tree Models and Cell Fate Choice
Single-cell transcriptomics has laid bare the heterogeneity of apparently identical cells at the level of gene expression. For many cell types we now know that the abundance of many transcripts varies from cell to cell, and that average transcript abundance or average gene expression can be an unhelpful concept. A range of clustering and other classification methods have been proposed that use the signal in single-cell data to classify cells, that is, to assign cell types to them based on their transcriptomic states. In many cases, however, we would like to have not just a classifier but also a set of interpretable rules by which this classification occurs. Here we develop one such approach, which sets out to establish a biologically interpretable classification scheme, and demonstrate its interpretive power. In particular, we are interested in capturing the chain of regulatory events that drive cell-fate decision making across a lineage tree or lineage sequence. We find that suitably defined decision trees can help to resolve the gene regulatory programs involved in shaping lineage trees. Our approach combines predictive power with interpretability and can extract logical rules from single-cell data.