Monday, Aug 7: 10:30 AM - 12:20 PM
Invited Paper Session
Metro Toronto Convention Centre
Thresholding or Boolean behaviors of biomolecules underlie many biological processes. Decision trees capture such behaviors, and tree-based methods such as random forests have been shown to succeed in predictive tasks in genomics and medicine. In this talk, we use UK Biobank data and a stable version of random forests, iterative random forests (iRF), to recommend genes and gene-gene interactions for which the data give predictive and stable evidence of possibly driving a heart disease, hypertrophic cardiomyopathy (HCM). Gene-silencing experiments show significant causal evidence in 4 out of the 5 experiments based on iRF-based recommendations and domain knowledge. This and other empirical successes of iRF motivate a theoretical investigation of a tractable version of iRF under a new local sparse and spiky (LSS) model, in which the regression function is a linear combination of Boolean interactions of features. The tractable version of iRF is shown to be model selection consistent under this new model, given feature independence and non-overlap of interactions.
University of California at Berkeley
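
The abstract describes the LSS regression function as a linear combination of Boolean interactions of features. The sketch below is one way to read that description, not the speaker's formulation: each interaction is taken to be a product of thresholded features, and the interactions use disjoint feature sets to mimic the non-overlap condition. The function name `lss_regression_function`, the thresholds, the coefficients, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def lss_regression_function(X, interactions, coefs, intercept=0.0):
    """Evaluate a linear combination of Boolean interactions (illustrative).

    Each interaction is a list of (feature_index, threshold, direction)
    triples, with direction ">" or "<=". An interaction equals 1 only when
    every one of its thresholded features is "on", and 0 otherwise.
    """
    X = np.asarray(X, dtype=float)
    y = np.full(X.shape[0], intercept, dtype=float)
    for beta, terms in zip(coefs, interactions):
        active = np.ones(X.shape[0], dtype=bool)
        for j, t, direction in terms:
            # AND together the thresholded features of this interaction.
            active &= (X[:, j] > t) if direction == ">" else (X[:, j] <= t)
        y += beta * active
    return y

# Hypothetical setup: independent uniform features, two non-overlapping
# interactions on features {0, 1} and {2, 3}; features 4 and 5 are noise.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 6))
interactions = [
    [(0, 0.5, ">"), (1, 0.5, ">")],
    [(2, 0.3, "<="), (3, 0.7, ">")],
]
coefs = [2.0, -1.0]

# Noisy responses drawn around the Boolean-interaction regression function.
y = lss_regression_function(X, interactions, coefs) + rng.normal(scale=0.1, size=1000)
```

Under this reading, model selection consistency would mean recovering the feature sets {0, 1} and {2, 3} from `(X, y)` as the sample size grows; the sketch only generates data and does not implement iRF or its tractable version.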