COPSS Distinguished Achievement Award and Lectureship

Amita Manatunga, Chair
Emory University
 
Wednesday, Aug 9: 4:00 PM - 5:50 PM
1323 
Invited Paper Session 
Metro Toronto Convention Centre 
Room: CC-Hall FG 

Keywords

Plenary 

Applied

No

Main Sponsor

Committee of Presidents of Statistical Societies
JSM Partner Societies

Presentations

Veridical Data Science Toward Trustworthy AI

Data science is central to AI and has driven most of the recent advances in biomedicine and beyond. Human judgment calls are ubiquitous at every step of a data science life cycle (DSLC): problem formulation, data cleaning, exploratory data analysis (EDA), modeling, and reporting. Such judgment calls are often responsible for the "dangers" of AI because they create a universe of hidden uncertainties well beyond sample-to-sample uncertainty.
To mitigate these dangers, veridical (truthful) data science is introduced, based on three principles: Predictability, Computability, and Stability (PCS). The PCS framework and documentation unify, streamline, and expand on the ideas and best practices of statistics and machine learning. In every step of a DSLC, PCS emphasizes reality checks through predictability, considers computability up front, and takes into account expanded sources of uncertainty, including those from data curation/cleaning and algorithm choice, to build more trust in data results. PCS will be showcased through collaborative research on finding genetic drivers of a heart disease, stress-testing a clinical decision rule, and identifying a microbiome-related metabolite signature for possible early cancer detection.

Keywords

DSLC

data science life cycle 

Speaker

Bin Yu, University of California at Berkeley