Astronomers Speak Statistics: Statistical Challenges in Detecting Exoplanets

Speaker: Eric Feigelson
Pennsylvania State University
Monday, Aug 7: 9:25 AM - 9:50 AM
Introductory Overview Lectures 
Metro Toronto Convention Centre 
A revolutionary discovery has emerged in recent decades: most stars in the Galaxy have planetary systems, and many exoplanets have conditions that could harbor living organisms. Yet only a few thousand exoplanets have been definitively identified, because tiny transit signals are difficult to detect amid stronger variations from the host star and the instrument. Most identifications come from transits, in which the planet repeatedly passes in front of its star. Typical analysis involves detrending aperiodic variability by local regression, a least-squares search for periodic dips in brightness, machine learning classification, and expert vetting to reduce false positives from uninteresting periodic variable stars. The classes are highly imbalanced: millions of stellar photometric time series are analyzed to obtain hundreds of likely candidate planets.
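
The least-squares search for periodic dips can be illustrated with a minimal sketch. It assumes an evenly sampled, already detrended light curve and a box-shaped transit. The function name `box_search`, the simulated data, and all parameter values are hypothetical choices for illustration, not the software used in practice. For each trial period the light curve is folded, and a two-level (in-transit and out-of-transit) model is fit by least squares. The score is the reduction in the residual sum of squares relative to a constant model.

```python
import random


def box_search(flux, periods, max_duration):
    """Grid search for a periodic box-shaped dip (illustrative only).

    Returns (score, period, phase, duration), where score is the drop in
    residual sum of squares of a two-level model versus a constant model.
    Periods, phases and durations are in units of samples.
    """
    n = len(flux)
    total = sum(flux)
    best = (0.0, None, None, None)
    for period in periods:
        # Fold the series: accumulate flux sum and count per phase bin.
        sums = [0.0] * period
        counts = [0] * period
        for i, f in enumerate(flux):
            sums[i % period] += f
            counts[i % period] += 1
        for phase in range(period):
            s_in, n_in = 0.0, 0
            # Grow the in-transit window one phase bin at a time.
            for duration in range(1, max_duration + 1):
                k = (phase + duration - 1) % period
                s_in += sums[k]
                n_in += counts[k]
                n_out = n - n_in
                if n_in == 0 or n_out == 0:
                    continue
                diff = s_in / n_in - (total - s_in) / n_out
                if diff >= 0:
                    continue  # keep dips only, not brightenings
                score = n_in * n_out / n * diff * diff
                if score > best[0]:
                    best = (score, period, phase, duration)
    return best


# Simulated light curve (assumed values): white noise plus a 1% dip
# lasting 3 samples, repeating every 50 samples.
rng = random.Random(0)
flux = [1.0 + rng.gauss(0.0, 0.002) for _ in range(600)]
for i in range(600):
    if 10 <= i % 50 < 13:
        flux[i] -= 0.01

score, period, phase, duration = box_search(flux, range(20, 81), 5)
print(period, phase, duration)
```

With the signal this strong relative to the noise, the search should recover the injected period of 50 samples. Real light curves have far weaker transits and correlated stellar variability, which is why detrending precedes this step.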

We focus on the AutoRegressive Planet Search procedure, which has exhibited considerable success for both regular and irregular cadence time series. ARIMA models first remove aperiodic variations, a new Transit Comb Filter (TCF) periodogram is then applied, and Random Forest or XGBoost classification follows.
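
The first two stages can be sketched in simplified form. This is an illustration under stated assumptions, not the published procedure. First differencing (the "I" in ARIMA) stands in for a fitted ARIMA model, the stellar variability is simulated as a random walk, and the transit is box-shaped. Differencing turns each box-shaped dip into a downward spike at ingress and an upward spike at egress, so a comb of evenly spaced spike pairs can be matched against the differenced series. The function name `comb_search`, its scoring, and all parameter values are hypothetical.

```python
import math
import random


def comb_search(diff, periods, max_duration):
    """Match a periodic comb of down/up spike pairs (illustrative only).

    diff is the first-differenced light curve. For each trial period,
    phase and duration, sum (egress spike - ingress spike) over all teeth
    and normalize by sqrt(2 * teeth), so combs with different numbers of
    teeth are roughly comparable under white noise.
    """
    m = len(diff)
    best = (0.0, None, None, None)
    for period in periods:
        for phase in range(period):
            for duration in range(1, max_duration + 1):
                total, teeth = 0.0, 0
                ingress = phase
                while ingress + duration < m:
                    total += diff[ingress + duration] - diff[ingress]
                    teeth += 1
                    ingress += period
                if teeth == 0:
                    continue
                score = total / math.sqrt(2 * teeth)
                if score > best[0]:
                    best = (score, period, phase, duration)
    return best


# Simulated light curve (assumed values): random-walk stellar variability,
# measurement noise, and a 1% dip of 3 samples every 50 samples.
rng = random.Random(1)
level, flux = 1.0, []
for i in range(600):
    level += rng.gauss(0.0, 0.001)
    dip = 0.01 if 10 <= i % 50 < 13 else 0.0
    flux.append(level + rng.gauss(0.0, 0.0005) - dip)

# Differencing removes the random-walk trend and leaves spike pairs.
diff = [b - a for a, b in zip(flux, flux[1:])]

score, period, phase, duration = comb_search(diff, range(20, 81), 5)
print(period, duration)
```

In this toy setting the comb should peak at the injected period of 50 samples and duration of 3 samples. The ranked candidates from such a periodogram, together with other features, would then go to a classifier such as Random Forest or XGBoost.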