Simultaneous Dimensionality Reduction and Cell-Type Annotation of Single-Cell RNA-seq Data using Marker Enriched Uniform Manifold Approximation and Projection

Aparajita Khan (Speaker)
Stanford University
Wednesday, Aug 9: 9:15 AM - 9:35 AM
Topic-Contributed Paper Session 
Metro Toronto Convention Centre 
Cell type identification is a key step in analyzing single-cell RNA-seq (scRNA-seq) data. Existing methods typically follow a cluster-then-annotate approach: cells are first clustered using principal components that capture the reduced dimensionality of highly variable genes, and each cell cluster is then annotated separately using external marker gene information. This separation can lead to poor annotation because the clustering and cell-type annotation subspaces may disagree. We propose a novel two-graph fusion technique that integrates complementary information from both marker and variable gene sets to perform simultaneous dimensionality reduction and cell type annotation. We directly incorporate marker gene information into uniform manifold approximation and projection (UMAP) to improve cell-type predictions. Through comprehensive evaluations on several real scRNA-seq datasets spanning cancerous tissues (melanoma, colorectal carcinoma, and brain metastasis) as well as normal tissues from human and mouse, we demonstrate the efficacy of the proposed method over state-of-the-art cell-type annotation approaches.
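
The abstract does not give the details of the fusion step, so the following is only an illustrative sketch of the general idea of fusing a variable-gene graph with a marker-gene graph, not the authors' algorithm. It assumes a simple convex combination of two k-nearest-neighbour graphs, swaps in a plain spectral embedding in place of UMAP, and scores cell types by mean marker expression smoothed over the fused graph. All function names, the weight `alpha`, and the synthetic data are hypothetical.

```python
import numpy as np

def knn_affinity(X, k):
    """Symmetric binary k-nearest-neighbour affinity matrix over the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                     # a cell is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]              # k closest cells per row
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                        # symmetrise

def fuse_graphs(A_var, A_marker, alpha=0.5):
    """Assumed fusion rule: convex combination of the two affinity matrices."""
    return alpha * A_marker + (1.0 - alpha) * A_var

def spectral_embedding(A, n_components=2):
    """Low-dimensional embedding from the normalised graph Laplacian (stand-in for UMAP)."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(A.sum(axis=1), 1e-12))
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                      # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]               # skip the first (trivial) eigenvector

def annotate(X_marker, marker_sets, A):
    """Score each type by mean marker expression, then average scores with graph neighbours."""
    S = np.column_stack([X_marker[:, cols].mean(axis=1) for cols in marker_sets])
    S_smooth = 0.5 * S + 0.5 * (A @ S) / A.sum(axis=1, keepdims=True)
    return np.argmax(S_smooth, axis=1)

# Synthetic data (hypothetical): 60 cells, 3 cell types, 2 marker genes per type.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 20)
X_var = rng.normal(size=(60, 50))                    # "highly variable genes" block
X_marker = rng.normal(size=(60, 6))                  # "marker genes" block
marker_sets = [[0, 1], [2, 3], [4, 5]]
for t, cols in enumerate(marker_sets):
    X_marker[np.ix_(labels == t, cols)] += 3.0       # markers are elevated in their own type

A_fused = fuse_graphs(knn_affinity(X_var, 10), knn_affinity(X_marker, 10), alpha=0.7)
embedding = spectral_embedding(A_fused)
predicted = annotate(X_marker, marker_sets, A_fused)
accuracy = float(np.mean(predicted == labels))
```

In this sketch the same fused graph drives both the embedding and the label smoothing, which is one simple way to keep the dimensionality-reduction and annotation subspaces consistent. Raising `alpha` gives the marker-gene graph more influence over both outputs.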