
Machine learning based lineage tree reconstruction improved with knowledge of higher level relationships between cells and genomic barcodes.


Prusokiene, Alisa 
Prusokas, Augustinas 


Tracking cells as they divide and progress through differentiation is a fundamental step in understanding many biological processes, such as the development of organisms and the progression of diseases. In this study, we investigate a machine learning approach to reconstructing lineage trees in experimental systems based on mutating synthetic genomic barcodes. We refine a previously proposed methodology by embedding information about higher-level relationships between cells and single-cell barcode values into a feature space. We test the performance of the algorithm on shallow trees (up to 100 cells) and deep trees (up to 10 000 cells). Our proposed algorithm can improve tree reconstruction accuracy compared with reconstructions based on a maximum parsimony method, but at the cost of longer computation time.


Acknowledgements: We thank the anonymous reviewers for valuable comments that improved the quality of the paper.


31 Biological Sciences, 3102 Bioinformatics and Computational Biology, Human Genome, Genetics, Networking and Information Technology R&D (NITRD), Bioengineering

Journal Title

NAR Genom Bioinform

Oxford University Press (OUP)