A multivariate discrete failure time model for the analysis of infant motor development
We develop a multivariate discrete failure time model for the analysis of infant motor development. We use the model to jointly evaluate the time (in months) to achievement of three well‐established motor milestones: sitting up, crawling, and walking. The model includes a subject‐specific latent factor that reflects underlying heterogeneity in the population and accounts for within‐subject dependence across the milestones. The factor loadings and covariate effects are allowed to vary flexibly across milestones, and the milestones are permitted to have unique at‐risk intervals corresponding to different development...
Source: Statistics in Medicine - November 28, 2018 Category: Statistics Authors: Brian Neelon, Azza Shoaibi, Sara E. Benjamin‐Neelon Tags: RESEARCH ARTICLE Source Type: research

Multilevel model with random effects for clustered survival data with multiple failure outcomes
We present a multilevel frailty model for handling serial dependence and simultaneous heterogeneity in survival data with a multilevel structure attributed to clustering of subjects and the presence of multiple failure outcomes. One commonly observes such data, for example, in multi‐institutional, randomized placebo‐controlled trials in which patients suffer repeated episodes (eg, recurrent migraines) of the disease outcome being measured. The model extends the proportional hazards model by incorporating a random covariate and unobservable random institution effect to respectively account for treatment‐by‐institu...
Source: Statistics in Medicine - November 25, 2018 Category: Statistics Authors: Richard Tawiah, Kelvin K.W. Yau, Geoffrey J. McLachlan, Suzanne K. Chambers, Shu‐Kay Ng Tags: RESEARCH ARTICLE Source Type: research

Machine learning methods for leveraging baseline covariate information to improve the efficiency of clinical trials
Clinical trials are widely considered the gold standard for treatment evaluation, and they can be highly expensive in terms of time and money. The efficiency of clinical trials can be improved by incorporating information from baseline covariates that are related to clinical outcomes. This can be done by modifying an unadjusted treatment effect estimator with an augmentation term that involves a function of covariates. The optimal augmentation is well characterized in theory but must be estimated in practice. In this article, we investigate the use of machine learning methods to estimate the optimal augmentation. We consid...
Source: Statistics in Medicine - November 25, 2018 Category: Statistics Authors: Zhiwei Zhang, Shujie Ma Tags: RESEARCH ARTICLE Source Type: research
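
The augmentation idea in the abstract above can be illustrated with a minimal sketch. It assumes a two-arm trial with a single baseline covariate and uses an arm-specific least-squares line as a simple stand-in for the machine learning methods the article investigates. The function names, the simulated data, and the particular augmentation form (arm-specific outcome predictions weighted by the randomization proportion) are illustrative assumptions, not the authors' implementation.

```python
import random
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return lambda v: a + b * v


def unadjusted_effect(arm, y):
    """Difference in mean outcome between treated (1) and control (0)."""
    treated = [yi for ai, yi in zip(arm, y) if ai == 1]
    control = [yi for ai, yi in zip(arm, y) if ai == 0]
    return mean(treated) - mean(control)


def augmented_effect(arm, x, y):
    """Unadjusted estimate minus an augmentation term built from covariates.

    Assumed form: (1/n) * sum((A_i - pi) * (m1(X_i)/pi + m0(X_i)/(1 - pi))),
    where m1 and m0 are outcome predictions fitted within each arm.
    """
    n = len(y)
    pi = sum(arm) / n  # observed proportion randomized to treatment
    m1 = fit_line([xi for ai, xi in zip(arm, x) if ai == 1],
                  [yi for ai, yi in zip(arm, y) if ai == 1])
    m0 = fit_line([xi for ai, xi in zip(arm, x) if ai == 0],
                  [yi for ai, yi in zip(arm, y) if ai == 0])
    augmentation = sum((ai - pi) * (m1(xi) / pi + m0(xi) / (1 - pi))
                       for ai, xi in zip(arm, x)) / n
    return unadjusted_effect(arm, y) - augmentation


def simulate_trial(n, effect, noise_sd, seed):
    """Hypothetical data: outcome depends on treatment and one covariate."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    arm = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    y = [effect * ai + 2.0 * xi + rng.gauss(0.0, noise_sd)
         for ai, xi in zip(arm, x)]
    return arm, x, y


arm, x, y = simulate_trial(n=400, effect=1.0, noise_sd=0.5, seed=0)
print("unadjusted:", round(unadjusted_effect(arm, y), 3))
print("augmented: ", round(augmented_effect(arm, x, y), 3))
```

In this sketch the augmentation term removes the part of the unadjusted estimate that is due to chance covariate imbalance between arms, which is where the efficiency gain comes from when the covariate is related to the outcome.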

Estimating the receiver operating characteristic curve in matched case control studies
The matched case‐control design is frequently used in the study of complex disorders and can result in significant gains in efficiency, especially in the context of measuring biomarkers; however, risk prediction in this setting is not straightforward. We propose an inverse‐probability weighting approach to estimate the predictive ability associated with a set of covariates. In particular, we propose an algorithm for estimating the summary index, area under the curve corresponding to the Receiver Operating Characteristic curve associated with a set of pre‐defined covariates for predicting a binary outcome. By combi...
Source: Statistics in Medicine - November 22, 2018 Category: Statistics Authors: Hui Xu, Jing Qian, Nina P. Paynter, Xuehong Zhang, Brian W. Whitcomb, Shelley S. Tworoger, Kathryn M. Rexrode, Susan E. Hankinson, Raji Balasubramanian Tags: RESEARCH ARTICLE Source Type: research
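
As a rough illustration of a weighted area under the ROC curve, the sketch below computes a weighted concordance between case and control risk scores. It assumes the weights are already available (for example, inverse sampling probabilities); how those weights are derived for a matched design is the substance of the article and is not reproduced here. The function name and the toy numbers are hypothetical.

```python
def weighted_auc(case_scores, control_scores,
                 case_weights=None, control_weights=None):
    """Weighted probability that a case scores higher than a control.

    Ties count as one half. With all weights equal to 1 this reduces to the
    usual Mann-Whitney estimate of the area under the ROC curve.
    """
    if case_weights is None:
        case_weights = [1.0] * len(case_scores)
    if control_weights is None:
        control_weights = [1.0] * len(control_scores)

    numerator = 0.0
    for s1, w1 in zip(case_scores, case_weights):
        for s0, w0 in zip(control_scores, control_weights):
            if s1 > s0:
                numerator += w1 * w0
            elif s1 == s0:
                numerator += 0.5 * w1 * w0
    denominator = sum(case_weights) * sum(control_weights)
    return numerator / denominator


# Toy example: two cases, two controls.
cases = [0.9, 0.8]
controls = [0.1, 0.85]
print(weighted_auc(cases, controls))                    # unweighted
print(weighted_auc(cases, controls, [1, 1], [1, 3]))    # second control upweighted
```

Upweighting the control that outscores one of the cases lowers the estimate, which shows how sampling weights can change the apparent discrimination relative to the unweighted sample.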

A full Bayesian model to handle structural ones and missingness in economic evaluations from individual‐level data
We present a general Bayesian framework that can handle the complexity. We show the benefits of using our approach with a motivating example, the MenSS trial, for which there are spikes at one in the effectiveness and missingness in both outcomes. We contrast a set of increasingly complex models and perform sensitivity analysis to assess the robustness of the conclusions to a range of plausible missingness assumptions. We demonstrate the flexibility of our approach with a second example, the PBS trial, and extend the framework to accommodate the characteristics of the data in this study. This paper highlights the import...
Source: Statistics in Medicine - November 22, 2018 Category: Statistics Authors: Andrea Gabrio, Alexina J. Mason, Gianluca Baio Tags: RESEARCH ARTICLE Source Type: research

Using a monotone single‐index model to stabilize the propensity score in missing data problems and causal inference
The augmented inverse weighting method is one of the most popular methods for estimating the mean of the response in causal inference and missing data problems. An important component of this method is the propensity score. Popular parametric models for the propensity score include the logistic, probit, and complementary log‐log models. A common feature of these models is that the propensity score is a monotonic function of a linear combination of the explanatory variables. To avoid the need to choose a model, we model the propensity score via a semiparametric single‐index model, in which the score is an unknown mono...
Source: Statistics in Medicine - November 22, 2018 Category: Statistics Authors: Jing Qin, Tao Yu, Pengfei Li, Hao Liu, Baojiang Chen Tags: RESEARCH ARTICLE Source Type: research
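
The augmented inverse weighting estimator of a mean with missing responses can be sketched as follows. The propensity score and the outcome model are passed in as plain functions; the logistic propensity used in the demonstration is one of the parametric choices named in the abstract and is an assumption here, with made-up coefficients. The semiparametric monotone single‐index fit proposed in the article is not shown.

```python
import math
import random
from statistics import mean


def aipw_mean(y, observed, x, propensity, outcome_model):
    """Augmented inverse probability weighted estimate of E[Y].

    y[i] may be None when observed[i] == 0. Each unit contributes
    R*Y/p(X) - (R - p(X))/p(X) * m(X), with p the propensity score
    and m the outcome regression.
    """
    total = 0.0
    for yi, ri, xi in zip(y, observed, x):
        p = propensity(xi)
        m = outcome_model(xi)
        ipw_term = yi / p if ri else 0.0
        total += ipw_term - (ri - p) / p * m
    return total / len(x)


# Hypothetical propensity: logistic in a linear index of x (a monotone link).
def logistic_propensity(v):
    return 1.0 / (1.0 + math.exp(-(0.5 + v)))


# Hypothetical outcome regression.
def linear_outcome(v):
    return 1.0 + 2.0 * v


rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
observed = [1 if rng.random() < logistic_propensity(xi) else 0 for xi in x]
y = [linear_outcome(xi) + rng.gauss(0.0, 1.0) if ri else None
     for xi, ri in zip(x, observed)]

print("complete-case mean:", round(mean(v for v in y if v is not None), 3))
print("AIPW estimate:     ", round(aipw_mean(y, observed, x,
                                             logistic_propensity,
                                             linear_outcome), 3))
```

Because the weights are 1/p(X), very small propensity values can make the estimate unstable, which is the kind of behavior the article's title refers to when it speaks of stabilizing the propensity score.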

Modeling a bivariate residential‐workplace neighborhood effect when estimating the effect of proximity to fast‐food establishments on body mass index
Hierarchical modeling is the preferred approach of modeling neighborhood effects. When both residential and workplace neighborhoods are known, a bivariate (residential‐workplace) neighborhood random effect that quantifies the extent that a neighborhood's residential and workplace effects are correlated may be modeled. However, standard statistical software for hierarchical models does not easily allow correlations between the random effects of distinct clustering variables to be incorporated. To overcome this challenge, we develop a Bayesian model and an accompanying estimation procedure that allows for correlated biva...
Source: Statistics in Medicine - November 20, 2018 Category: Statistics Authors: A. James O'Malley, Peter James, Todd A. MacKenzie, Jinyoung Byun, S. V. Subramanian, Jason P. Block Tags: RESEARCH ARTICLE Source Type: research

A powerful and data‐adaptive test for rare‐variant–based gene‐environment interaction analysis
As whole‐exome/genome sequencing data become increasingly available in genetic epidemiology research consortia, there is emerging interest in testing the interactions between rare genetic variants and environmental exposures that modify the risk of complex diseases. However, testing rare‐variant–based gene‐by‐environment interactions (GxE) is more challenging than testing the genetic main effects due to the difficulty in correctly estimating the latter under the null hypothesis of no GxE effects and the presence of neutral variants. In response, we have developed a family of powerful and data‐adaptive GxE tes...
Source: Statistics in Medicine - November 20, 2018 Category: Statistics Authors: Tianzhong Yang, Han Chen, Hongwei Tang, Donghui Li, Peng Wei Tags: RESEARCH ARTICLE Source Type: research

A new semiparametric transformation approach to disease diagnosis with multiple biomarkers
When multiple biomarkers are available for disease diagnosis, it is desirable to efficiently combine them to form a single index. Making use of the Neyman‐Pearson paradigm, we propose a new combination/transformation approach to disease diagnosis that efficiently combines multiple biomarkers. The proposed method does not require that the biomarkers be jointly normally distributed or the covariance matrices for the diseased and the nondiseased are nondifferential. An R package is developed to implement the proposed method. Simulations and two real data examples demonstrate advantages of the new method over existing ones...
Source: Statistics in Medicine - November 20, 2018 Category: Statistics Authors: Ting Lyu, Zhiliang Ying, Hong Zhang Tags: RESEARCH ARTICLE Source Type: research

One‐sample aggregate data meta‐analysis of medians
An aggregate data meta‐analysis is a statistical method that pools the summary statistics of several selected studies to estimate the outcome of interest. When considering a continuous outcome, typically each study must report the same measure of the outcome variable and its spread (eg, the sample mean and its standard error). However, some studies may instead report the median along with various measures of spread. Recently, the task of incorporating medians in meta‐analysis has been achieved by estimating the sample mean and its standard error from each study that reports a median in order to meta‐analyze the mea...
Source: Statistics in Medicine - November 20, 2018 Category: Statistics Authors: Sean McGrath, XiaoFei Zhao, Zhi Zhen Qin, Russell Steele, Andrea Benedetti Tags: RESEARCH ARTICLE Source Type: research

On the efficiency of adaptive sample size design
Adaptive sample size designs, including group sequential designs, have been used as alternatives to fixed sample size designs to achieve more robust statistical power and better trial efficiency. This work investigates the efficiency of adaptive sample size designs as compared to group sequential designs. We show that given a group sequential design, a uniformly more efficient adaptive sample size design based on the same maximum sample size and rejection boundary can be constructed. While maintaining stable statistical power at the required level, the expected sample size of the obtained adaptive sample size design is uni...
Source: Statistics in Medicine - November 18, 2018 Category: Statistics Authors: Lu Cui, Lanju Zhang Tags: RESEARCH ARTICLE Source Type: research

A note on compatibility for inference with missing data in the presence of auxiliary covariates
Imputation and inference (or analysis) models that cannot be true simultaneously are frequently used in practice when missing outcomes are present. In these situations, the conclusions can be misleading depending on how “different” the implicit inference model, induced by the imputation model, is from the inference model actually used. We introduce model‐based compatibility (MBC) and compare two MBC approaches to a non‐MBC approach and explore the inferential validity of the latter in a simple case. In addition, we evaluate more complex cases through a series of simulation studies. Overall, we recommend caution wh...
Source: Statistics in Medicine - November 18, 2018 Category: Statistics Authors: Michael J. Daniels, Xuan Luo Tags: RESEARCH ARTICLE Source Type: research

Issue Information
No abstract is available for this article.
Source: Statistics in Medicine - November 14, 2018 Category: Statistics Tags: ISSUE INFORMATION Source Type: research

Semiparametric additive rates model for recurrent events data with intermittent gaps
In this study, we build an additive rates model for recurrent event data considering intermittent gaps. We provide the asymptotic theories behind the proposed model, as well as the goodness of fit between observed and modeled values. Simulation studies reveal that the estimations perform well if intermittent gaps are taken into account. In addition, we utilized the longitudinal cohort of elderly patients who have type 2 diabetes and at least one record of a severe recurrent complication, hypoglycemia, from the National Health Insurance Research Database in Taiwan to demonstrate the proposed method.
Source: Statistics in Medicine - November 14, 2018 Category: Statistics Authors: Pei‐Fang Su, Junjiang Zhong, Huang‐Tz Ou Tags: RESEARCH ARTICLE Source Type: research

Prediction intervals for penalized longitudinal models with multisource summary measures: An application to childhood malnutrition
In many global health analyses, it is of interest to examine countries' progress using indicators of socio‐economic conditions based on national surveys from varying sources. This results in longitudinal data where heteroscedastic summary measures, rather than individual level data, are available. Administration of national surveys can be sporadic, resulting in sparse data measurements for some countries. Furthermore, the trend of the indicators over time is usually nonlinear and varies by country. It is of interest to track the current level of indicators to determine if countries are meeting certain thresholds, such ...
Source: Statistics in Medicine - November 14, 2018 Category: Statistics Authors: Alexander C. McLain, Edward A. Frongillo, Juan Feng, Elaine Borghi Tags: RESEARCH ARTICLE Source Type: research