Inference for sequence data and applications

Ning Hao, Chair
University of Arizona
Ning Hao, Organizer
University of Arizona
Sunday, Aug 6: 4:00 PM - 5:50 PM
Topic-Contributed Paper Session 
Metro Toronto Convention Centre 
Room: CC-802A, CC-802B



Main Sponsor


Co Sponsors

Biometrics Section
Business and Economic Statistics Section


Inference for Gaussian Multiple Change-point Model via Bayesian Information Criterion

For a change-point model with a piecewise constant mean structure and additive Gaussian noise, a fundamental inference problem is to determine whether any change points exist. Early works usually assume that there is at most one change point. Many recent works can handle multiple changes but focus mainly on identifying individual change points. In particular, the weakest condition that guarantees the existence of an asymptotically powerful test is not clear. In this talk, we answer this question via a Bayesian information criterion approach.


Yue Niu, University of Arizona
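
As an illustration of the kind of question posed in the abstract above, the following is a minimal sketch of a BIC comparison between a no-change model and a single-change model for a piecewise constant Gaussian mean. It is not the speaker's procedure. The function names, the parameter counts used in the penalty (one mean versus two means plus one location), and the toy data are all assumptions made for this example.

```python
import numpy as np

def best_single_split(y):
    """Return (k, rss) for the split y[:k] | y[k:] with the smallest residual sum of squares."""
    y = np.asarray(y, dtype=float)
    n = y.size
    csum = np.cumsum(y)
    total_sq = np.sum(y ** 2)
    k = np.arange(1, n)            # candidate sizes of the left segment
    left = csum[:-1]               # sum of y[:k] for each k
    right = csum[-1] - left        # sum of y[k:] for each k
    rss = total_sq - left ** 2 / k - right ** 2 / (n - k)
    j = int(np.argmin(rss))
    return int(k[j]), float(rss[j])

def bic(rss, n, n_params):
    """Gaussian BIC with unknown variance (assumes rss > 0)."""
    return n * np.log(rss / n) + n_params * np.log(n)

def detect_change(y):
    """Compare BIC of 'no change' against 'one change'; the penalty counts are an assumption."""
    y = np.asarray(y, dtype=float)
    n = y.size
    rss0 = float(np.sum((y - y.mean()) ** 2))
    k, rss1 = best_single_split(y)
    return bool(bic(rss1, n, 3) < bic(rss0, n, 1)), k

# Deterministic toy data: a small alternating perturbation, with and without a jump.
n = 100
noise = 0.1 * (-1.0) ** np.arange(n)
flat = noise
jump = noise + np.where(np.arange(n) >= 50, 5.0, 0.0)

print(detect_change(flat)[0])   # whether a change is preferred for the flat series
print(detect_change(jump))      # decision and estimated location for the jump series
```

The sketch handles only one candidate change point; extending the comparison to multiple change points requires a search over segmentations, which is outside the scope of this example.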

Equivariant Variance Estimation for Multiple Change-point Model

The noise variance plays an important role in many change-point detection procedures and the associated inferences. The most commonly used variance estimators require strong assumptions on the true mean structure or normality of the error distribution, which may not hold in applications. More importantly, the quality of these estimators has not been discussed systematically in the literature. In this paper, we introduce a framework of equivariant variance estimation for multiple change-point models. In particular, we characterize the set of all equivariant unbiased quadratic variance estimators for a family of change-point model classes, and develop a minimax theory for such estimators. We also consider autocovariance estimation and tests for serial dependence in change-point models.


Han Xiao, Rutgers University
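
To make concrete why the mean structure matters for variance estimation, here is a minimal sketch comparing the ordinary sample variance with a classical first-order difference-based quadratic estimator on a series with change points. This is a standard textbook-style estimator chosen for illustration, not the estimator class developed in the talk. The function name `diff_variance`, the simulated mean pattern, and the seed are assumptions.

```python
import numpy as np

def diff_variance(y):
    """First-order difference-based variance estimate: sum((y[i+1] - y[i])^2) / (2 (n - 1))."""
    y = np.asarray(y, dtype=float)
    d = np.diff(y)
    return float(np.sum(d ** 2) / (2 * (y.size - 1)))

# Simulated series (assumption): unit-variance Gaussian noise around a mean with three jumps.
rng = np.random.default_rng(0)
n = 2000
mean = np.repeat([0.0, 5.0, 0.0, 5.0], n // 4)
y = mean + rng.normal(0.0, 1.0, size=n)

naive = float(np.var(y, ddof=1))   # inflated because it ignores the mean changes
robust = diff_variance(y)          # only the few differences that span a jump are affected

print(naive, robust)

# The difference-based estimate is unchanged by a constant shift and scales quadratically.
print(np.isclose(diff_variance(y + 10.0), robust))
print(np.isclose(diff_variance(3.0 * y), 9.0 * robust))
```

The difference-based estimate is still biased upward by each jump, by roughly the squared jump size divided by 2(n - 1), and it assumes uncorrelated errors.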

Efficient line search optimization of penalty functions in supervised changepoint detection

Receiver Operating Characteristic (ROC) curves are commonly used in binary classification, and can also be used to evaluate learned penalty functions in the context of supervised changepoint detection. Since the Area Under the Curve (AUC) is a piecewise constant function of the predicted values, it cannot be directly optimized by gradient descent. Recently, we showed that minimizing a piecewise linear surrogate loss, AUM (Area Under Min of false positives and false negatives), results in maximizing AUC. In this talk, we propose a new algorithm for AUM minimization, which exploits the piecewise linear structure to efficiently compute an exact line search for every step of gradient descent. Because the exact line search is quadratic time in the worst case, we additionally propose an approximate line search which is log-linear time in the worst case (asymptotically the same as a constant step size). Our empirical results show that the proposed algorithm is more computationally efficient than other variants of gradient descent (constant step size, line search using grid search, etc.).


Toby Hocking, Northern Arizona University
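
The contrast between a piecewise constant AUC and a piecewise linear surrogate, described in the abstract above, can be sketched for plain binary classification. The `aum` function below integrates min(false positives, false negatives) over the decision threshold, which is one reading of the "Area Under Min" definition given in the abstract. It is an assumption for illustration, not the speaker's implementation, and it does not cover the changepoint setting or the line search itself. All names and data are hypothetical.

```python
import numpy as np

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 1/2."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels)
    diff = s[y == 1][:, None] - s[y == 0][None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def aum(scores, labels):
    """Integral over the threshold t of min(FP(t), FN(t)), predicting positive when score > t."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels)
    order = np.argsort(s)
    s, y = s[order], y[order]
    # For a threshold between s[j] and s[j + 1], items 0..j are predicted negative.
    fn = np.cumsum(y == 1)[:-1]                       # positives at or below the threshold
    fp = np.sum(y == 0) - np.cumsum(y == 0)[:-1]      # negatives above the threshold
    return float(np.sum(np.minimum(fp, fn) * np.diff(s)))

labels = [0, 0, 1, 1]
before = [0.10, 0.40, 0.35, 0.80]
after = [0.10, 0.40, 0.37, 0.80]   # one positive score nudged upward, ranking unchanged

print(auc(before, labels), auc(after, labels))   # identical: the ranking did not change
print(aum(before, labels), aum(after, labels))   # the surrogate responds to the small move
```

Because the ranking is unchanged by the small move, AUC gives no signal about whether the move helped, while the surrogate shrinks in proportion to the size of the move.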

Learning topological statistics for Bayesian changepoint analysis

We consider the dual objectives of inference on a changepoint and the best univariate representation of an image series through Bayesian techniques, providing estimates of the uncertainty of both the changepoint and the time series representation. The representations we choose are derived from topological data analysis and so depend on neither the mean nor the variance of a given series. We provide a flexible model for performing this inference and offer theoretical guarantees of convergence to the correct changepoint and representation, given that they exist. We apply our method to noisy nanoparticle and solar flare data and demonstrate that it also yields correct results on simulated data.


Andrew Thomas, Cornell University

WITHDRAWN: Bayesian change point detection with spike and slab priors

We study the use of spike and slab priors for consistent estimation of the number of change points and their locations. Leveraging recent results in the variable selection literature, we show that an estimator based on spike and slab priors achieves the optimal localization rate in the multiple offline change point detection problem. Based on this estimator, we propose a Bayesian change point detection method that is one of the fastest Bayesian methodologies and is more robust to misspecification of the error terms than competing methods. We demonstrate through empirical work the good performance of our approach relative to some state-of-the-art benchmarks.


Oscar Hernan Madrid Padilla, University of California, Los Angeles