Single Cell Data Analysis: Computational methods for characterizing cell types

Speaker: Mark Gerstein
 
Wednesday, Aug 9: 9:35 AM - 9:55 AM
Topic-Contributed Paper Session 
Metro Toronto Convention Centre 
I will describe two techniques for the analysis of single-cell sequencing data. (1) Forest Fire Clustering. This is an efficient and interpretable method for cell-type discovery from single-cell data. It makes minimal prior assumptions and, unlike current approaches, computes for each cell a non-parametric posterior probability of each cell-type label. These posterior distributions make it possible to evaluate label confidence for each cell and to compute "label entropies," which highlight transitions along developmental trajectories. (2) SCAN-ATAC-Sim. It is difficult to benchmark scATAC-seq analysis techniques (such as clustering and deconvolution) without a known set of gold-standard cell types. To simulate scATAC-seq experiments with known cell-type labels, we introduce an efficient and scalable scATAC-seq simulation method that down-samples bulk ATAC-seq data (e.g., from representative cell lines or tissues). Our protocol uses a consistent but tunable signal-to-noise ratio across cell types in a scATAC-seq simulation.
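The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the Forest Fire Clustering or SCAN-ATAC-Sim implementation. The function names (label_entropy, simulate_cell), the use of Shannon entropy for the "label entropy," and the modeling of noise as fragments spread uniformly over peaks are all assumptions made here for illustration. The first function scores how uncertain a cell's label posterior is; the second down-samples a bulk peak-count profile into one simulated cell, mixing signal and noise at a tunable fraction.

```python
import math
import random


def label_entropy(posterior):
    """Shannon entropy (in nats) of one cell's label posterior.

    posterior: probabilities over cell-type labels, summing to 1.
    Zero-probability labels contribute nothing (0 * log 0 is taken as 0).
    Assumption: "label entropy" is read here as plain Shannon entropy.
    """
    return -sum(p * math.log(p) for p in posterior if p > 0)


def simulate_cell(bulk_counts, n_fragments, signal_fraction, rng):
    """Down-sample a bulk peak-count profile into one simulated cell.

    Each fragment is drawn from the bulk profile with probability
    signal_fraction, and uniformly over all peaks otherwise.
    Assumption: uniform background noise is a stand-in chosen for this
    sketch, not the noise model of the published method.
    """
    n_peaks = len(bulk_counts)
    total = sum(bulk_counts)
    weights = [
        signal_fraction * count / total + (1.0 - signal_fraction) / n_peaks
        for count in bulk_counts
    ]
    counts = [0] * n_peaks
    for peak in rng.choices(range(n_peaks), weights=weights, k=n_fragments):
        counts[peak] += 1
    return counts


# Hypothetical inputs: a confident cell, an ambiguous cell, and a toy
# four-peak bulk profile for a single cell type.
confident = label_entropy([1.0, 0.0, 0.0])
ambiguous = label_entropy([0.5, 0.5, 0.0])
cell = simulate_cell([90, 10, 0, 0], n_fragments=200,
                     signal_fraction=0.8, rng=random.Random(0))
```

A cell whose posterior is concentrated on one label has entropy near zero, while a cell split between labels has higher entropy, which is the sense in which entropies can flag transitional cells. Because the simulated cell is drawn from a bulk profile of known origin, its cell-type label is known by construction, and using the same signal_fraction for every cell type keeps the signal-to-noise ratio consistent across them.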