Sunday, Aug 6: 4:00 PM - 5:50 PM
Metro Toronto Convention Centre
This work introduces a flexible and adaptive nonparametric method for estimating the association between multiple covariates and power spectra of multiple time series. The approach uses a Bayesian sum of trees model to capture complex dependencies and interactions between covariates and the power spectrum, which are often observed in biomedical studies. Local power spectra corresponding to terminal nodes within trees are estimated nonparametrically using Bayesian penalized linear splines. Trees are random and fit using a Bayesian backfitting MCMC algorithm that sequentially considers tree modifications via reversible-jump techniques. For high-dimensional covariates, a sparsity-inducing Dirichlet hyperprior on tree splitting proportions provides sparse estimation of covariate effects and efficient variable selection. By averaging over the posterior distribution of trees, the proposed method can recover both smooth and abrupt changes in the power spectrum across multiple covariates. The proposed methodology is used to study gait maturation in young children by evaluating age-related changes in power spectra of stride interval time series in the presence of other covariates.
multiple time series
reversible jump Markov Chain Monte Carlo
Interval estimation for the difference between effect measures is common in many applications. The method of variance estimates recovery (MOVER) is a useful way to construct a confidence interval for such a difference. Interval estimation with missing data has been widely studied in recent years, as missing values can occur during data collection. In this study, we propose two proper multiple imputation procedures for the MOVER to estimate confidence intervals for the difference between binomial proportions, both when data are missing at random and when they are missing not at random. A simulation study shows that the coverage probabilities of the proposed intervals are closer to the nominal level than those of the existing intervals in most cases. These multiple imputation confidence intervals are illustrated with a real data example.
Missing not at random
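As background for the abstract above, the complete-data MOVER interval for a difference between two binomial proportions can be sketched as follows. This is only an illustration of the basic MOVER construction, assuming Wilson score limits for each proportion (the combination often called the Newcombe hybrid score interval); it is not the multiple imputation procedure the authors propose, and the function names `wilson_interval` and `mover_difference` are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wilson_interval(successes, n, z=Z_95):
    """Wilson score confidence limits for a single binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def mover_difference(x1, n1, x2, n2, z=Z_95):
    """MOVER interval for p1 - p2, recovering variances from the Wilson limits.

    Assumption: complete data, no imputation step.
    """
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_interval(x1, n1, z)
    l2, u2 = wilson_interval(x2, n2, z)
    diff = p1 - p2
    # The lower limit uses the lower limit of p1 and the upper limit of p2;
    # the upper limit uses the opposite pair.
    lower = diff - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


if __name__ == "__main__":
    lower, upper = mover_difference(56, 70, 48, 80)
    print(f"95% MOVER interval for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

For 56 successes out of 70 versus 48 out of 80, the sketch gives an interval of roughly 0.052 to 0.334 around the observed difference of 0.2. Unlike a Wald interval, the limits are not symmetric about the point estimate, because each Wilson limit reflects the variance at that limit rather than at the estimate.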
Graphical and sparse covariance models have found widespread use due to their immediate appeal in modern sample-starved high dimensional applications. A part of their wide appeal stems from the significantly low sample sizes required for the existence of maximum and pseudo-likelihood estimators, especially in comparison with the classical full covariance model. For undirected Gaussian graphical models, the minimum sample size required for the existence of MLEs had been an open question since their introduction in the late '70s, and has been recently settled. The very same question for pseudo-likelihood estimators has remained unsolved ever since their introduction in the '70s. Pseudo-likelihood estimators have recently received renewed attention as they impose fewer restrictive assumptions and have better computational tractability, improved statistical performance, and appropriateness in modern high dimensional applications, thus renewing interest in this longstanding problem. In this work, we undertake a comprehensive study of this open problem within the context of pseudo-likelihood methods proposed in the literature.
Understanding the impact of environmental exposures on health outcomes in women is complicated by the timing of measurement in the menstrual cycle: how exposures are metabolized and measured depends on hormone levels which vary over time. Knowing the cycle time at which measurements are obtained would allow this to be incorporated in subsequent analyses, but the self-reported start of last menstrual cycle has proven unreliable. Moreover, repeated biomarker assessment during a single cycle -- which could establish cycle times -- is burdensome for participants. We therefore propose a method which allows the estimation of cycle phase given hormone levels in a single urine sample. We leverage repeated assessments in a reference sample to model within-subject variability in hormone levels over the course of a menstrual cycle, and then estimate current cycle time given a single value for a new participant. Relying on this novel, data-driven estimation of cycle time allows researchers to have a more accurate benchmark when evaluating environmental factors.
Maximum likelihood estimation in mixed effects logistic regression often results in estimates on the boundary of the parameter space. Such estimates, including infinite values for fixed effects or singular variance components matrices, can wreak havoc on numerical estimation procedures and inference. We add a scaled penalty to the log-likelihood function, which penalizes the fixed effects by the Jeffreys invariant prior for the model with no random effects and the variance components by a composition of negative Huber loss functions. The maximum penalized likelihood estimates are shown to lie in the interior of the parameter space. Appropriate scaling of the penalty preserves the optimal asymptotic properties expected of the maximum likelihood estimator, namely consistency, asymptotic normality, and Cramér-Rao efficiency. Our choice of penalties and scaling factor preserves equivariance of the fixed effects estimates under linear transformations of the model parameters, such as contrasts. The method's superior finite sample performance over other prevalent approaches is shown on real-data examples and comprehensive simulation studies.
singular variance components
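The general shape of such a penalized log-likelihood can be written out as a rough sketch. The symbols below are assumptions introduced for illustration and are not taken from the abstract: \beta denotes the fixed effects, \psi the variance components, X the fixed effects design matrix with rows x_i, c > 0 a scaling factor whose exact form is not given here, and P(\psi) stands in for the Huber-based variance components penalty, which the abstract does not specify. The fixed effects term is the standard log Jeffreys prior for logistic regression without random effects.

```latex
\ell^{*}(\beta, \psi)
  = \ell(\beta, \psi)
  + c \left\{ \tfrac{1}{2} \log \det\!\left( X^{\top} W(\beta) X \right) + P(\psi) \right\},
\qquad
W(\beta) = \operatorname{diag}\left\{ \pi_i(\beta) \bigl( 1 - \pi_i(\beta) \bigr) \right\},
\quad
\pi_i(\beta) = \frac{\exp(x_i^{\top} \beta)}{1 + \exp(x_i^{\top} \beta)}
```

Under these assumptions, the determinant term shrinks toward minus infinity as the fitted probabilities \pi_i(\beta) all approach 0 or 1, which is the mechanism that discourages infinite fixed effects estimates.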
In fetoscopy for a fetus with spina bifida, a polymer patch is spread over the gap in the skin covering the vertebra. The researcher wants to measure the roughness of the patch at 0, 4, 8, 12, and 16 weeks. Measuring roughness at any chosen time destroys the patch, so each patch contributes a measurement at only one time point. The basic question is how to obtain a profile of roughness over time using only the marginal data at each time. We propose models so that the joint distribution can be estimated from the marginal data. The next objective is to estimate the parameters of the joint distribution. We use the Newton-Raphson method and a version of the EM algorithm and compare their performance.
Double-zero-event studies pose a challenge for accurately estimating effect sizes in meta-analysis due to the absence of events in both control and treatment groups. To address this issue, we introduce a Zero-Inflated Bivariate Generalized Linear Mixed Model (ZIBGLMM) and develop both frequentist and Bayesian versions of it. This model is a two-component finite mixture model that includes a subpopulation with extremely low risk. Through extensive simulation studies and real-world meta-analysis case studies, we demonstrate that the ZIBGLMM outperforms traditional meta-analysis methods and the standard BGLMM in estimating the true effect size, with substantially less bias and comparable coverage probability. In addition, we illustrate our method with a real-world meta-analysis case study in which the existence of extremely low-risk subpopulations is clinically justifiable, and find that the ZIBGLMM performs better in terms of AIC and DIC than the standard BGLMM. Our findings suggest that the ZIBGLMM can properly account for extremely low-risk subpopulations and accurately estimate the effect size in meta-analyses with a substantial number of double-zero studies.
Bivariate Generalized Linear Mixed Models
Generalized Linear Mixed Models