Best Student-led Astrostatistics Papers of 2022

Derek Bingham, Chair
Simon Fraser University
Hyungsuk Tak, Organizer
Pennsylvania State University
Wednesday, Aug 9: 10:30 AM - 12:20 PM
Topic-Contributed Paper Session 
Metro Toronto Convention Centre 
Room: CC-205D 



Main Sponsor

Astrostatistics Interest Group

Co-Sponsors

Section on Physical and Engineering Sciences
Section on Statistical Learning and Data Science


A geometric census of giants in the Local Universe: Bayesian inference of a sparse Paretian population

The radio sky reveals that black hole jets carry electrons and magnetic fields far into the intergalactic medium. In exceptional cases, these outflows give individual galaxies an Mpc-scale sphere of influence. Such giant outflows, or simply giants, embody the most extreme known way in which galaxies affect the Cosmic Web. Despite this, the triggers of giant growth remain unknown. Here we use the LOFAR Two-metre Sky Survey (LoTSS) to measure giant growth's central geometric quantity: total length. We first formulate a statistical framework that is both rigorous and practical, and then search the LoTSS for giants, discovering 2060 previously unknown specimina. Spectacular findings include the longest giant overall, the longest hosted by a spiral galaxy, and 13 giants that appear larger in the sky than the Moon. By combining theory and data, we infer that giant lengths are Pareto distributed with tail index -3.5 ± 0.5. We also deduce the comoving number density of giants and their volume-filling fraction, both for the first time. We conclude that giants are truly rare: at any moment in time, most clusters and filaments — the building blocks of the Cosmic Web — do not harbour giants. 


Martijn Oei, Leiden University
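
As a rough illustration of the Pareto inference mentioned in the abstract above, the sketch below draws synthetic lengths from a density proportional to l**xi above a cutoff and recovers xi by maximum likelihood. It is not the authors' framework, which also has to handle projection and selection effects. The function names, the 0.7 cutoff, the sample size, and the reading of "tail index" as the exponent of the length density are all assumptions made for this example.

```python
import math
import random

def sample_pareto_lengths(n, l_min, xi, rng):
    """Draw n lengths with density proportional to l**xi for l >= l_min (needs xi < -1)."""
    alpha = -(xi + 1.0)  # exponent of the survival function (l / l_min)**(-alpha)
    # Inverse-CDF sampling; 1 - u lies in (0, 1], so the power is always finite.
    return [l_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

def fit_tail_index(lengths, l_min):
    """Maximum-likelihood estimate of xi and its asymptotic standard error."""
    n = len(lengths)
    alpha_hat = n / sum(math.log(l / l_min) for l in lengths)
    return -(alpha_hat + 1.0), alpha_hat / math.sqrt(n)

# Synthetic data only: hypothetical cutoff and sample size, true xi set to -3.5.
rng = random.Random(0)
lengths = sample_pareto_lengths(2000, l_min=0.7, xi=-3.5, rng=rng)
xi_hat, xi_err = fit_tail_index(lengths, l_min=0.7)
print(f"xi_hat = {xi_hat:.2f} +/- {xi_err:.2f}")
```

With complete, unprojected samples the estimator has a closed form; a real survey analysis would need a likelihood that models which giants are detectable.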

Constraining Galactic Accelerations with Stellar Streams

We present a data-driven method for estimating galactic accelerations from phase-space measurements of stellar streams. Using a differentiable neural network to parameterize the track of the stream in phase-space, our approach enables a direct estimate of the acceleration field in the neighborhood of the stream. A model for the galactic gravitational potential does not need to be specified beforehand. By treating each stream as a collection of proximate orbits, our method utilizes the chain rule to convert derivatives of phase-space coordinates along the stream-track to estimated galactic accelerations. Once acceleration vectors are sampled along the stream, standard analytic models for the Galactic potential can then be constrained. Alternatively, we demonstrate that the potential can be represented with a neural network to enable full model flexibility while minimizing non-physical artifacts through Poisson's equation. On mock data, our approach recovers the true potential with sub-percent level fractional errors across a range of scales, providing a new avenue to map the Milky Way with stellar streams and constrain dark matter on galactic scales.  


Jacob Nibauer, Princeton University
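
The chain-rule step described in the abstract above can be sketched in a few lines. This toy version replaces the differentiable neural network with central finite differences along a track parameter phi and treats the track as a single orbit, so that v = (dx/dphi)(dphi/dt) and a = (dv/dphi)(dphi/dt). The function name, the two-dimensional setup, and the circular-orbit test values are assumptions for illustration, not the authors' implementation.

```python
import math

def accelerations_along_track(phi, pos, vel):
    """Estimate accelerations at interior track points via the chain rule.

    phi: increasing track parameter values; pos, vel: lists of (x, y) tuples.
    Assumes the track is traversed like a single orbit.
    """
    acc = []
    for i in range(1, len(phi) - 1):
        dphi = phi[i + 1] - phi[i - 1]
        # Central differences for dx/dphi and dv/dphi.
        dx = [(pos[i + 1][k] - pos[i - 1][k]) / dphi for k in range(2)]
        dv = [(vel[i + 1][k] - vel[i - 1][k]) / dphi for k in range(2)]
        # Project v onto dx/dphi to obtain the rate dphi/dt.
        phi_dot = sum(vel[i][k] * dx[k] for k in range(2)) / sum(d * d for d in dx)
        acc.append(tuple(phi_dot * d for d in dv))
    return acc

# Toy check on a circular orbit (hypothetical radius and speed, arbitrary units).
R, v_c = 8.0, 220.0
phi = [0.01 * j for j in range(200)]
pos = [(R * math.cos(p), R * math.sin(p)) for p in phi]
vel = [(-v_c * math.sin(p), v_c * math.cos(p)) for p in phi]
acc = accelerations_along_track(phi, pos, vel)
a_mag = math.hypot(*acc[50])
print(a_mag, v_c ** 2 / R)  # estimated vs. centripetal magnitude
```

No potential model enters the calculation: the acceleration comes only from derivatives of the track, which is the property the abstract emphasises.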

Detecting Ultra-Diffuse Galaxies through their Globular Clusters by Point Process Model

We introduce a new method for detecting ultra-diffuse galaxies (UDGs) by searching for over-densities in intergalactic globular cluster (GC) populations. Our method is based on an application of the log-Gaussian Cox process. This method is applied to the GC data obtained from the PIPER survey, a Hubble Space Telescope imaging program targeting the Perseus cluster. We successfully detect all confirmed UDGs with known globular cluster populations in the survey. We also identify a potential galaxy that has no detected diffuse stellar content. Preliminary analysis shows that it is unlikely to be merely an accidental clump of globular clusters or other objects. If confirmed, this system would be the first of its kind. Simulations are used to assess how the physical parameters of the globular cluster systems within UDGs affect their detectability using our method. We quantify the correlation of the detection probability with the total number of GCs in the galaxy and the anti-correlation with increasing half-number radius of the GC system. The Sérsic index of the GC distribution has little impact on detectability.


Dayi Li, University of Toronto

Hierarchically modelling NGC 3147's trio of Type Ia supernova siblings: SNe 2021hpr, 1997bq and 2008fv

We introduce the 'relative intrinsic scatter' parameter, σRel, to model Type Ia supernovae that exploded in the same host galaxy: 'SN siblings'. We define σRel as the dispersion of individual siblings' distance estimates relative to one another, and show that marginalising over σRel is a robust and inexpensive way of combining these distances. We proceed to fit a newly trained BayeSN model to new Young Supernova Experiment grizy photometry of SN 2021hpr, together with photometry of its siblings in NGC 3147: SNe 1997bq and 2008fv. By hierarchically fitting these light curves simultaneously, we improve the estimates of distance and dust parameters, as compared to individual fits to each SN. Moreover, just as σRel affects the distance uncertainty, we find the dust parameter posteriors are also affected (in the opposite sense, with larger σRel values leading to larger dust parameter uncertainties). Applying our methods, we constrain a common dust law shape parameter, R_V = 2.62 ± 0.67, and the Hubble constant, H_0 = 78.4 ± 6.5 km/s/Mpc. We conclude that σRel-marginalisation is important to robustly combine siblings' distances for cosmology, and for investigating siblings-host correlations.


Sam Ward, University of Cambridge
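
To make the idea of marginalising over σRel concrete, here is a minimal sketch under simplifying assumptions that are not taken from the abstract: each sibling's distance estimate is modelled as Normal(mu, err_i**2 + sigma_rel**2), mu has a flat prior, and sigma_rel has a uniform prior up to a hypothetical upper bound. The function name and the input numbers are invented for illustration and are not the NGC 3147 measurements.

```python
import math

def combine_siblings(mus, errs, sigma_rel_max, n_grid=200):
    """Posterior mean and sd of a common distance, marginalised over sigma_rel.

    Assumes mu_i ~ Normal(mu, errs_i**2 + sigma_rel**2), a flat prior on mu,
    and a uniform prior on sigma_rel over [0, sigma_rel_max].
    """
    log_ws, means, variances = [], [], []
    for k in range(n_grid):
        s_rel = sigma_rel_max * (k + 0.5) / n_grid  # midpoint grid
        var_i = [e * e + s_rel * s_rel for e in errs]
        prec = sum(1.0 / v for v in var_i)
        mean = sum(m / v for m, v in zip(mus, var_i)) / prec
        chi2 = sum((m - mean) ** 2 / v for m, v in zip(mus, var_i))
        # Log marginal likelihood of sigma_rel after integrating out mu.
        log_ws.append(-0.5 * (chi2 + sum(math.log(v) for v in var_i) + math.log(prec)))
        means.append(mean)
        variances.append(1.0 / prec)
    top = max(log_ws)
    ws = [math.exp(lw - top) for lw in log_ws]
    norm = sum(ws)
    # The posterior of mu is a mixture of Gaussians, one per grid value of sigma_rel.
    post_mean = sum(w * m for w, m in zip(ws, means)) / norm
    second = sum(w * (v + m * m) for w, m, v in zip(ws, means, variances)) / norm
    return post_mean, math.sqrt(second - post_mean ** 2)

# Made-up distance moduli and errors for three siblings (illustrative only).
mu_hat, mu_sd = combine_siblings([33.10, 33.20, 33.15], [0.06, 0.08, 0.10], sigma_rel_max=0.1)
print(f"combined distance modulus = {mu_hat:.3f} +/- {mu_sd:.3f}")
```

Because sigma_rel adds to every sibling's variance, the combined uncertainty is wider than a plain inverse-variance average would give, which reflects the robustness argument in the abstract.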

TD-CARMA: Painless, accurate, and scalable estimates of gravitational-lens time delays with flexible CARMA processes

Cosmological parameters encoding our understanding of the expansion history of the Universe can be constrained by the accurate estimation of time delays arising in gravitationally lensed systems. We propose TD-CARMA, a Bayesian method to estimate cosmological time delays by modelling the observed and irregularly sampled light curves as realizations of a CARMA process. Our model accounts for heteroskedastic measurement errors and microlensing, an additional source of independent extrinsic long-term variability in source brightness. The semi-separable structure of the CARMA covariance matrix allows for fast and scalable likelihood computation using Gaussian Process modeling. We obtain a sample from the joint posterior distribution of the model parameters using a nested sampling approach. This allows for "painless" Bayesian Computation, dealing with the expected multi-modality of the posterior distribution and not requiring the specification of starting values or an initial guess for the time delay, unlike existing methods. Our time delay estimates for six doubly lensed quasars are consistent with those derived in the literature, but are typically two to four times more precise.


Antoine Meyer, Imperial College London