Pulling Signal out of Noise for Data-Driven Discoveries in Astronomy

Sara Algeri, Chair
University of Minnesota
Aneta Siemiginowska, Discussant
Harvard-Smithsonian Center for Astrophysics
Hyungsuk Tak, Organizer
Pennsylvania State University
Tuesday, Aug 8: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session 
Metro Toronto Convention Centre 
Room: CC-205C 



Main Sponsor

Section on Statistical Learning and Data Science

Co Sponsors

Section on Bayesian Statistical Science
Section on Physical and Engineering Sciences


Improving Power Spectrum Estimation using Multitapering: Prospects for understanding stars, the Milky Way, and beyond

Stars oscillate in much the same way as musical instruments, but at millions of closely spaced frequencies. Time-series data of stars carry imprints of these oscillations, whose detection and characterization allow us to probe the physics of stellar interiors. This is usually done by computing the Lomb-Scargle (LS) periodogram, a power spectrum estimator for unevenly sampled time series. However, the LS periodogram suffers from two statistical problems: (1) inconsistency (or noise) and (2) bias due to high spectral leakage. Here, I develop a multitaper spectral estimation method using the Non-Uniform Fast Fourier Transform (mtNUFFT) that tackles the inconsistency and bias problems of the LS periodogram. My method allows precise estimation of the frequencies of stellar oscillations and exoplanet transits, thereby providing precise age estimates of stars. For example, I obtained a 3.96 +/- 0.48 Gyr age estimate of the Kepler-91 star with 36% better precision than the state of the art. In this talk, I will discuss these results to illustrate that mtNUFFT has promising implications for understanding stars, exoplanets, and galaxies. I will also highlight the Python package I have developed for this work.
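As a rough illustration of the multitapering idea only (not the mtNUFFT method or the author's package), the following Python sketch averages periodograms computed with several orthogonal Slepian (DPSS) tapers on an evenly sampled series. The function name `multitaper_psd` and its parameters are assumptions made for this example; handling uneven sampling, as described in the abstract, would require a non-uniform FFT in place of `np.fft.rfft`.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_psd(x, dt=1.0, nw=4.0, k=None):
    """Multitaper power spectrum of an evenly sampled series (illustrative).

    x  : 1-D array of samples
    dt : sampling interval
    nw : time-bandwidth product of the DPSS tapers
    k  : number of tapers (defaults to 2*nw - 1)
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if k is None:
        k = int(2 * nw) - 1
    # Rows are orthogonal tapers, each normalized to unit energy.
    tapers = dpss(n, nw, Kmax=k)
    x = x - x.mean()
    # One tapered periodogram per taper, shape (k, n // 2 + 1).
    spectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2
    # Averaging over tapers reduces the variance of the estimate.
    psd = dt * spectra.mean(axis=0)
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, psd
```

Averaging over roughly 2*nw - 1 tapers trades a small loss of frequency resolution (the estimate is smoothed over a bandwidth of about nw / (n * dt) on either side of each frequency) for lower variance and reduced leakage compared with a single untapered periodogram.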


Gwendolyn Eadie, University of Toronto
Joshua Speagle, University of Toronto
David Thomson


Aarya Patil

Searching for Continuous Gravitational Waves: Methods and Results

Since the start of the first Advanced LIGO observing run in 2015, nearly 100 gravitational wave events have been observed using data from the LIGO and Virgo detectors. In addition to these short-lived transient signals, the LIGO-Virgo-KAGRA collaboration conducts searches for a variety of other signals, including weaker but long-lived continuous gravitational waves, mostly from rapidly spinning neutron stars. To search for these signals, which are nearly periodic, but Doppler-modulated by the motion of the Earth and the source, one must analyze long stretches of data. A variety of techniques are employed, depending on the amount of information (sky position, spin frequency and evolution, binary orbit) available about potential sources. I will describe these methods, which typically need to balance sensitivity, computing cost and robustness, and summarize the results through the first three observing runs of advanced gravitational wave detectors.
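To give a sense of the Doppler modulation mentioned above, here is a deliberately simplified Python sketch. It assumes a circular Earth orbit, a source lying in the ecliptic plane, and no Earth rotation or binary motion of the source; the function `observed_frequency` and its parameters are hypothetical and are not taken from any LIGO-Virgo-KAGRA search code.

```python
import numpy as np

C = 299_792_458.0          # speed of light, m/s
AU = 1.495978707e11        # astronomical unit, m
YEAR = 365.25 * 86400.0    # Julian year, s


def observed_frequency(t, f0, fdot=0.0, phase0=0.0):
    """Observed frequency of a nearly periodic signal at times t (seconds).

    f0     : intrinsic signal frequency at t = 0 (Hz)
    fdot   : intrinsic frequency drift (Hz/s), e.g. spin-down
    phase0 : orbital phase of the Earth at t = 0 (rad)
    """
    t = np.asarray(t, dtype=float)
    v_orb = 2.0 * np.pi * AU / YEAR      # orbital speed, about 29.8 km/s
    f_src = f0 + fdot * t                # slowly evolving source frequency
    # First-order Doppler shift from the line-of-sight orbital velocity.
    return f_src * (1.0 + (v_orb / C) * np.cos(2.0 * np.pi * t / YEAR + phase0))
```

Under these assumptions the fractional frequency shift is about 1e-4 over a year, which is large compared with the frequency resolution of a long observation and illustrates why the modulation has to be modeled when analyzing long stretches of data.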


John Whelan, Rochester Institute of Technology

Model-based clustering in the presence of measurement error

Observations on measurement error are commonly available in many physical and engineering applications, including astronomy. We develop methodology for model-based clustering in the presence of measurement error, and use it to classify gamma-ray bursts (GRBs) from the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) mission. Our statistical model incorporates the observations on measurement error along with a mixture of Gaussian factor analyzers that can explain the variability in each cluster via a small set of latent factors, and is equipped with a matrix-free computational framework that enables efficient parameter estimation. The proposed method allows for the characterization of different kinds of GRBs in terms of a few underlying variables, and provides a more comprehensive understanding of the spectral characteristics governing the different kinds of GRBs.
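One common way to combine a Gaussian mixture with known measurement error is to add each observation's error covariance to the cluster covariance when evaluating its likelihood. The Python sketch below computes cluster membership probabilities under that assumption, with each cluster covariance written in factor-analyzer form (loadings times loadings transposed, plus a diagonal). This is an illustrative assumption about the model structure, not the authors' matrix-free algorithm, and the function `responsibilities` and its argument names are hypothetical.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal


def responsibilities(y, meas_covs, weights, means, loadings, uniquenesses):
    """Posterior cluster probabilities for noisy observations (illustrative).

    y            : (n, p) observed data
    meas_covs    : (n, p, p) known measurement-error covariance per observation
    weights      : (k,) mixing proportions
    means        : (k, p) cluster means
    loadings     : list of k arrays of shape (p, q), factor loadings
    uniquenesses : list of k arrays of shape (p,), diagonal variances
    """
    n, k = y.shape[0], len(weights)
    log_r = np.empty((n, k))
    for j in range(k):
        # Factor-analyzer covariance of cluster j.
        sigma_j = loadings[j] @ loadings[j].T + np.diag(uniquenesses[j])
        for i in range(n):
            # Observation i is blurred by its own measurement error.
            log_r[i, j] = np.log(weights[j]) + multivariate_normal.logpdf(
                y[i], mean=means[j], cov=sigma_j + meas_covs[i]
            )
    # Normalize in log space for numerical stability.
    return np.exp(log_r - logsumexp(log_r, axis=1, keepdims=True))
```

The explicit double loop forms a full covariance matrix for every observation and cluster, which is exactly the cost a matrix-free framework would aim to avoid; it is used here only for clarity.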


Fan Dai


Ranjan Maitra, Iowa State University

WITHDRAWN Marginalized Analytic Data-space Gaussian Inference for Component Separation: Obtaining a Clean DIB Catalog by Marginalizing over Stellar Types

Diffuse interstellar bands (DIBs) are broad absorption features associated with interstellar dust and can serve as chemical and kinematic tracers. Conventional measurements of DIBs in stellar spectra are complicated by residuals between observations and best-fit stellar models. To overcome this, we simultaneously model the spectrum as a combination of stellar, dust, and residual components, with full posteriors on the joint distribution of the components. This decomposition is obtained by modeling each component as a draw from a high-dimensional Gaussian distribution in the data-space (the observed spectrum) - a method we call "Marginalized Analytic Data-space Gaussian Inference for Component Separation" (MADGICS). We use a data-driven prior for the stellar component, which avoids missing stellar features not included in synthetic line lists. This technique provides statistically rigorous uncertainties and detection thresholds, which are required to work in the low signal-to-noise regime that is commonplace for dusty lines of sight. We reprocess all public Gaia DR3 RVS spectra and present an improved 8621 Å DIB catalog, free of detectable stellar line contamination. 
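The core linear-algebra step behind this kind of decomposition can be sketched with standard Gaussian conditioning: if the observed vector is a sum of independent zero-mean Gaussian components with covariances C_k, the posterior mean of component k is C_k C_tot^{-1} x, with covariance C_k - C_k C_tot^{-1} C_k, where C_tot is the sum of the C_k. The Python sketch below applies that identity; the function `separate_components` is a hypothetical name, and the zero-mean, dense-matrix setup is an assumption for illustration rather than the MADGICS implementation.

```python
import numpy as np


def separate_components(x, covs):
    """Split x into Gaussian components by conditioning (illustrative).

    x    : (p,) observed data vector, modeled as a sum of components
    covs : list of (p, p) prior covariances, one per zero-mean component
    Returns the posterior means and posterior covariances of the components.
    """
    c_tot = sum(covs)
    # Solve C_tot w = x once and reuse it for every component.
    w = np.linalg.solve(c_tot, x)
    post_means = [c @ w for c in covs]
    post_covs = [c - c @ np.linalg.solve(c_tot, c) for c in covs]
    return post_means, post_covs
```

By construction the posterior means add up exactly to the observed vector, so no flux is lost or double-counted between, for example, a stellar, a dust, and a residual component.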


Douglas Finkbeiner, Harvard-Smithsonian Center for Astrophysics
Andrew Saydjari, Harvard Physics