IOL: Astronomers Speak Statistics

Jogesh Babu Chair
Pennsylvania State University
David Hunter Organizer
Penn State University
Hyungsuk Tak Organizer
Pennsylvania State University
Monday, Aug 7: 8:30 AM - 10:20 AM
Introductory Overview Lectures 
Metro Toronto Convention Centre 
Room: CC-718A 
Modern telescopes, such as the recently launched James Webb Space Telescope and the upcoming Rubin Observatory Legacy Survey of Space and Time, are generating massive and complex data sets. For the past few months the James Webb Space Telescope has been astounding the public with spectacular early-release pictures.

The continually expanding size of astronomical datasets and the complexity of the astrophysical models used to interpret them lead to a wide array of mathematical, computational, and statistical challenges, and tackling these challenges requires close collaboration between astrophysicists and statisticians. These collaborations benefit everyone involved, as domain scientists equip themselves with novel analysis approaches while statisticians gain insights from testing their methodologies on compelling problems drawn from large open-access astronomical datasets. Advanced statistical methodology is becoming essential for scientific advances in this era of astronomical large-scale surveys.

The Center for Astrostatistics at Penn State organizes a series of sessions called "Astronomers Speak Statistics", inviting experienced astronomers who work at the interface between astronomy and statistics to introduce and showcase their astronomical science and data-analytic challenges to statisticians. This year, four astronomers tell the story of galaxy formation and cosmology, which is full of unprecedented questions and opportunities for statisticians in the era of Big Data Astronomy. This session should therefore appeal to a broad audience of statisticians who may or may not have background knowledge in astronomy but are interested in addressing scientific questions with advanced statistical methodology.

Main Sponsor

JSM Partner Societies

Co Sponsors

Section on Bayesian Statistical Science
Section on Physical and Engineering Sciences
Section on Statistical Learning and Data Science


Astronomers Speak Statistics: Statistical Challenges in the Deep Universe

The James Webb Space Telescope is the culmination of thirty years of planning, twenty years of construction, and eleven billion dollars of funding, and for the past twelve months it has been dazzling astronomers and the public alike with spectacular early-release science images. I will provide an overview of this flagship telescope and discuss some of the stunning early, and sometimes tentative, discoveries in the field of galaxy formation, such as tantalizing evidence for massive galaxies that are forming "impossibly early" in the Universe, puzzling brand-new obscured galaxy populations unveiled by JWST, and multiple new records for the most distant object observed by astronomers. I finish with key outstanding questions in the field of galaxies and their statistical underpinnings: when, where, and how did these vast cosmic ecosystems assemble? Armed with ultra-deep data from JWST, we are in a better position to answer these questions than ever before. Yet these new data breathe new life into long-standing statistical challenges in understanding extragalactic observations: how do we coherently model all of these processes at once? 


Joel Leja, Pennsylvania State University

Astronomers Speak Statistics: Statistical Challenges in the Formation of the Milky Way

The Milky Way is one of the best laboratories for understanding the formation and evolution of galaxies over cosmic time, because we can obtain detailed information for large samples of individual stars. Much progress has been made in understanding the spatial, dynamical, and chemical structure of the Milky Way using large survey data sets. Many statistical challenges exist in this field, including how to deal with significant selection biases in various quantities, how to handle heteroskedastic uncertainties that are not always well understood, and how to efficiently fit complex models to data sets consisting of millions to billions of stars while treating the data properly. I will describe some of these challenges and illustrate them with examples from our work on the chemo-dynamical structure of the Milky Way.


Jo Bovy, University of Toronto

Astronomers Speak Statistics: Statistical Challenges in Detecting Exoplanets

A revolutionary discovery has emerged in recent decades: most stars in the Galaxy have planetary systems, and many exoplanets have conditions that could harbor living organisms. But only a few thousand exoplanets have been definitively identified, mostly from transits in which the planet repeatedly passes in front of the star, because tiny transit signals are difficult to detect amid stronger variations from the host star and instrument. Typical analysis involves detrending aperiodic variability by local regression, a least-squares search for periodic dips in brightness, machine learning classification, and expert vetting to reduce false positives from uninteresting periodic variable stars. The classes are highly imbalanced: millions of stellar photometric time series are analyzed to obtain hundreds of likely candidate planets.

We focus on the AutoRegressive Planet Search procedure, which has exhibited considerable success for both regular and irregular cadence time series. ARIMA models first remove aperiodic variations, then a new Transit Comb Filter (TCF) periodogram is applied, and Random Forest or XGBoost classification follows.
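
For readers new to this kind of pipeline, the first two steps can be illustrated with a minimal Python sketch. It is a simplified stand-in under stated assumptions, not the AutoRegressive Planet Search implementation: the light curve is simulated and evenly sampled, first differencing (the residuals of an ARIMA(0,1,0) random-walk model) takes the place of a fitted ARIMA model, the transit duration is assumed known, and the function `comb_power` is a hypothetical toy statistic rather than the published TCF definition. The idea it shows is that differencing turns a box-shaped transit into a down-spike at ingress and an up-spike at egress, so a periodic comb of such spike pairs can be searched for by folding at trial periods. The classification step is omitted.

```python
import numpy as np


def simulate_light_curve(n=2000, period=100, duration=4, depth=1.0, seed=0):
    """Toy evenly sampled light curve: random-walk trend + noise + box transits."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    trend = np.cumsum(rng.normal(0.0, 0.05, n))  # aperiodic stellar variability
    noise = rng.normal(0.0, 0.1, n)              # measurement noise
    in_transit = ((t - 30) % period) < duration  # first transit starts at t = 30
    return trend + noise - depth * in_transit


def comb_power(resid, period, duration):
    """Hypothetical comb statistic on differenced data (not the published TCF).

    For an ingress at index i, the differenced series has a down-spike at i
    and an up-spike at i + duration, so resid[i + duration] - resid[i] is
    large and positive. These contrasts are summed over all cycles for each
    phase of the trial period, scaled by the square root of the number of
    terms, and the best phase is returned. Assumes period <= len(resid) - duration.
    """
    d = resid[duration:] - resid[:-duration]      # egress-minus-ingress contrast
    n_cycles = -(-d.size // period)               # ceiling division
    values = np.zeros(n_cycles * period)
    values[:d.size] = d
    counts = np.zeros(n_cycles * period)
    counts[:d.size] = 1.0
    folded = values.reshape(n_cycles, period).sum(axis=0)  # sum per phase
    hits = counts.reshape(n_cycles, period).sum(axis=0)    # terms per phase
    return float(np.max(folded / np.sqrt(2.0 * hits)))


flux = simulate_light_curve()
resid = np.diff(flux)                             # ARIMA(0,1,0) residuals
trial_periods = np.arange(20, 301)
power = np.array([comb_power(resid, p, duration=4) for p in trial_periods])
best_period = int(trial_periods[np.argmax(power)])
print("best trial period:", best_period)
```

In this toy setting the statistic peaks at the injected period because only there does every tooth of the comb land on a transit; harmonics and sub-harmonics collect the same or less signal with a less favorable scaling. A real search would also scan over transit duration and would have to handle irregular cadence, which this sketch does not attempt.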


Eric Feigelson, Pennsylvania State University

Astronomers Speak Statistics: Statistical Challenges of Supernova Cosmology


Kaisey Mandel, University of Cambridge