Journal title: Proceedings of the National Academy of Sciences
Print ISSN: 0027-8424
Electronic ISSN: 1091-6490
Publication year: 2022
Volume: 119
Issue: 14
DOI: 10.1073/pnas.2111786119
Language: English
Publisher: The National Academy of Sciences of the United States of America
Abstract: Significance
Current state-of-the-art cell-type maps fall short in resolving fine subtypes of neural cells, especially γ-aminobutyric acidergic and glutamatergic subtypes. Most such maps compromise on either the number or the specificity of the cell types quantified in each study. Others rely only on qualitative validation and do not address whether gene subset selection is necessary for optimal maps. The Matrix Inversion and Subset Selection pipeline uses publicly available in situ hybridization and single-cell RNA sequencing gene expression data to infer cell-type distributions and thereby map diverse cell types across the murine brain. Most importantly, we demonstrate that data-driven feature selection is necessary to arrive at quantitatively optimal cell-type maps using inversion-, deconvolution-, and correlation-based mapping approaches.
The advent of increasingly sophisticated imaging platforms has allowed for the visualization of the murine nervous system at single-cell resolution. However, current experimental approaches have not yet produced whole-brain maps of a comprehensive set of neuronal and nonneuronal types that approaches the cellular diversity of the mammalian cortex. Here, we aim to fill in this gap in knowledge with an open-source computational pipeline, Matrix Inversion and Subset Selection (MISS), that can infer quantitatively validated distributions of diverse collections of neural cell types at 200-μm resolution using a combination of single-cell RNA sequencing (RNAseq) and in situ hybridization datasets. We rigorously demonstrate the accuracy of MISS against literature expectations. Importantly, we show that gene subset selection, a procedure by which we filter out low-information genes prior to performing deconvolution, is a critical preprocessing step that distinguishes MISS from its predecessors and facilitates the production of cell-type maps with significantly higher accuracy. We also show that MISS is generalizable by generating high-quality cell-type maps from a second independently curated single-cell RNAseq dataset. Together, our results illustrate the viability of computational approaches for determining the spatial distributions of a wide variety of cell types from genetic data alone.
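The two steps the abstract names, gene subset selection followed by inversion, can be illustrated with a minimal sketch. It is not the MISS implementation. It assumes a linear model E ≈ C · D, where E (genes × voxels) stands in for in situ hybridization expression, C (genes × cell types) for per-type expression from single-cell RNAseq, and D (cell types × voxels) for the unknown cell-type densities. The function names `select_genes` and `infer_densities`, the coefficient-of-variation score, and the use of unconstrained least squares followed by clipping are all assumptions made for illustration; the data are synthetic.

```python
import numpy as np


def select_genes(C, n_keep):
    """Keep the n_keep genes whose expression varies most across cell types.

    C is a (genes x cell_types) matrix. The coefficient of variation used
    here is a hypothetical informativeness score, not the MISS criterion.
    """
    mean = C.mean(axis=1)
    std = C.std(axis=1)
    score = std / (mean + 1e-12)  # small constant avoids division by zero
    keep = np.argsort(score)[::-1][:n_keep]  # indices of the highest scores
    return np.sort(keep)


def infer_densities(E, C, gene_idx):
    """Solve E[gene_idx] ~= C[gene_idx] @ D for D by least squares.

    Negative entries are clipped to zero afterwards. This is a
    simplification, not a true non-negative least-squares solver.
    """
    D, *_ = np.linalg.lstsq(C[gene_idx], E[gene_idx], rcond=None)
    return np.clip(D, 0.0, None)


# Synthetic, noiseless example data (purely illustrative).
rng = np.random.default_rng(0)
n_genes, n_types, n_voxels = 60, 4, 10
C = rng.gamma(2.0, 1.0, size=(n_genes, n_types))
C[:20] = 1.0  # twenty uninformative genes: same expression in every type
D_true = rng.uniform(0.0, 5.0, size=(n_types, n_voxels))
E = C @ D_true  # simulated voxel-level expression

gene_idx = select_genes(C, n_keep=30)
D_hat = infer_densities(E, C, gene_idx)
print("genes kept:", gene_idx.size)
print("max abs error:", np.abs(D_hat - D_true).max())
```

In this toy setting the uninformative genes score zero and are dropped before inversion. With real, noisy data the choice of score and of the number of genes retained would need to be validated quantitatively, which is the point the abstract makes about data-driven selection.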