Biostatistics
This page shows you the latest items in this publication.
241 records returned
Index
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Tags: Index Source Type: journals
Letter to the editor
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Rucker, G., Schumacher, M. Tags: Letter to the editor Source Type: journals
Modeling between-trial variance structure in mixed treatment comparisons
In mixed treatment comparison (MTC) meta-analysis, modeling the heterogeneity in between-trial variances across studies is a difficult problem because of the constraints on the variances inherited from the MTC structure. Starting from a consistent Bayesian hierarchical model for the mean treatment effects, we represent the variance configuration by a set of triangle inequalities on the standard deviations. We take the separation strategy (Barnard and others, 2000) to specify prior distributions for standard deviations and correlations separately. The covariance matrix of the latent treatment arm effects can be employed as ...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Lu, G., Ades, A. Tags: Articles Source Type: journals
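The triangle inequalities mentioned in the abstract above can be illustrated for three treatments A, B, and C. Under consistency, the B-versus-C trial-specific effect is the difference between the A-versus-C and A-versus-B effects, so its standard deviation is bounded by the other two. The Python sketch below checks that constraint and recovers the implied correlation; the function names and the three-treatment setup are illustrative assumptions, not the authors' implementation.

```python
def satisfies_triangle(sd_ab, sd_ac, sd_bc):
    """True if |sd_ab - sd_ac| <= sd_bc <= sd_ab + sd_ac."""
    return abs(sd_ab - sd_ac) <= sd_bc <= sd_ab + sd_ac


def implied_correlation(sd_ab, sd_ac, sd_bc):
    """Correlation between the A-vs-B and A-vs-C effects implied by the three SDs.

    Derived from Var(d_AC - d_AB) = sd_ab^2 + sd_ac^2 - 2*rho*sd_ab*sd_ac.
    The result lies in [-1, 1] exactly when the triangle inequalities hold.
    """
    return (sd_ab**2 + sd_ac**2 - sd_bc**2) / (2 * sd_ab * sd_ac)


# Hypothetical between-trial standard deviations, chosen only for illustration.
print(satisfies_triangle(0.3, 0.4, 0.5))
print(implied_correlation(0.3, 0.4, 0.5))
```

A configuration that violates the inequalities would imply a correlation outside [-1, 1], which is why the standard deviations cannot be given unconstrained priors.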
Bayesian inference for stochastic multitype epidemics in structured populations using sample data
The objective is to make inference for the infection rate parameters in the underlying model of disease transmission. The principal challenge is that the required likelihood of the data is intractable in all but the simplest cases. Demiris and O'Neill (2005b) used data augmentation methods involving a certain random graph in a Markov chain Monte Carlo setting to address this situation in the special case where the sample is the same as the entire population. Here, we take an approach relying on broadly similar principles, but for which the implementation details are markedly different. Specifically, to cover the general ca...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: O'Neill, P. D. Tags: Articles Source Type: journals
A continuous-index hidden Markov jump process for modeling DNA copy number data
The number of copies of DNA in human cells can be measured using array comparative genomic hybridization (aCGH), which provides intensity ratios of sample to reference DNA at genomic locations corresponding to probes on a microarray. In the present paper, we devise a statistical model, based on a latent continuous-index Markov jump process, that is aimed to capture certain features of aCGH data, including probes that are unevenly long, unevenly spaced, and overlapping. The model has a continuous state space, with 1 state representing a normal copy number of 2, and the rest of the states being either amplifications or delet...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Stjernqvist, S., Ryden, T. Tags: Articles Source Type: journals
Second-order estimating equations for the analysis of clustered current status data
We present methods of estimating the baseline marginal distributions, covariate effects, and association parameters for clustered current status data based on second-order generalized estimating equations. We examine the efficiency gains realized from using second-order estimating equations compared with first-order equations and issues of copula misspecification, and we apply the methods to motivating studies including one on the incidence of joint damage in patients with psoriatic arthritis.
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Cook, R. J., Tolusso, D. Tags: Articles Source Type: journals
A mixed model framework for teratology studies
A mixed model framework is presented to model the characteristic multivariate binary anomaly data as provided in some teratology studies. The key features of the model are the incorporation of covariate effects, a flexible random effects distribution by means of a finite mixture, and the application of copula functions to better account for the relation structure of the anomalies. The framework is motivated by data of the Boston Anticonvulsant Teratogenesis study and offers an integrated approach to investigate substantive questions, concerning general and anomaly-specific exposure effects of covariates, interrelations bet...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Braeken, J., Tuerlinckx, F. Tags: Articles Source Type: journals
Estimating dementia-free life expectancy for Parkinson's patients using Bayesian inference and microsimulation
Interval-censored longitudinal data taken from a Norwegian study of individuals with Parkinson's disease are investigated with respect to the onset of dementia. Of interest are risk factors for dementia and the subdivision of total life expectancy (LE) into LE with and without dementia. To estimate LEs using extrapolation, a parametric continuous-time 3-state illness–death Markov model is presented in a Bayesian framework. The framework is well suited to allow for heterogeneity via random effects and to investigate additional computation using model parameters. In the estimation of LEs, microsimulation is used to tak...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: van den Hout, A., Matthews, F. E. Tags: Articles Source Type: journals
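The microsimulation idea in the abstract above can be sketched with a much simpler model than the one the authors describe. The Python example below simulates a 3-state illness-death process (healthy, dementia, dead) and averages the time spent in each living state. It assumes constant transition hazards with no covariates, random effects, or age dependence; the hazard values and function name are hypothetical.

```python
import random


def simulate_life_expectancy(q_hd, q_hx, q_dx, n=20000, seed=1):
    """Estimate mean years healthy and mean years with dementia by simulation.

    q_hd: hazard healthy -> dementia
    q_hx: hazard healthy -> dead
    q_dx: hazard dementia -> dead
    All hazards are assumed constant over time (a strong simplification).
    """
    rng = random.Random(seed)
    healthy_total = 0.0
    dementia_total = 0.0
    for _ in range(n):
        # Time to leaving the healthy state is exponential with the summed hazard.
        healthy_total += rng.expovariate(q_hd + q_hx)
        # The exit is to dementia with probability q_hd / (q_hd + q_hx).
        if rng.random() < q_hd / (q_hd + q_hx):
            dementia_total += rng.expovariate(q_dx)
    return healthy_total / n, dementia_total / n


le_healthy, le_dementia = simulate_life_expectancy(0.05, 0.05, 0.2)
print(le_healthy, le_dementia)
```

With constant hazards the answer is available in closed form (1 / (q_hd + q_hx) years healthy, and q_hd / (q_hd + q_hx) / q_dx years with dementia), which gives a convenient check on the simulation. Microsimulation becomes useful when the hazards depend on age or covariates and no closed form exists.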
Bayesian inference for within-herd prevalence of Leptospira interrogans serovar Hardjo using bulk milk antibody testing
Leptospirosis is the most widespread zoonosis throughout the world and human mortality from severe disease forms is high even when optimal treatment is provided. Leptospirosis is also one of the most common causes of reproductive losses in cattle worldwide and is associated with significant economic costs to the dairy farming industry. Herds are tested for exposure to the causal organism either through serum testing of individual animals or through testing bulk milk samples. Using serum results from a commonly used enzyme-linked immunosorbent assay (ELISA) test for Leptospira interrogans serovar Hardjo (L. hardjo) on sampl...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Lewis, F. I., Gunn, G. J., McKendrick, I. J., Murray, F. M. Tags: Articles Source Type: journals
An efficient method for identifying statistical interactors in gene association networks
Network reconstruction is a main goal of many biological endeavors. Graphical Gaussian models (GGMs) are often used since the underlying assumptions are well understood, the graph is readily estimated by calculating the partial correlation (paCor) matrix, and its interpretation is straightforward. In spite of these advantages, GGMs are limited in that interactions are not accommodated as the underlying multivariate normality assumption allows for linear dependencies only. As we show, when applied in the presence of interactions, the GGM framework can lead to incorrect inference regarding dependence. Identifying the exact d...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Andrei, A., Kendziorski, C. Tags: Articles Source Type: journals
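As a small illustration of the partial correlation used in graphical Gaussian models (see the abstract above), the Python sketch below computes the first-order partial correlation of two variables given a third from their pairwise correlations. This is the standard three-variable formula rather than the authors' method, and the function name and example values are hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y given z, from pairwise correlations."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# x and y are each correlated 0.8 with z, and their marginal correlation
# (0.64) is fully explained by z, so the partial correlation is about zero
# and a GGM would draw no edge between x and y.
print(partial_corr(0.64, 0.8, 0.8))
```

Because this quantity captures only linear dependence, a near-zero value does not rule out the kind of interaction effects the abstract is concerned with.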
Sample size calculations for controlling the distribution of false discovery proportion in microarray experiments
The false discovery proportion (FDP), the proportion of false rejections among all rejections, provides useful criteria for controlling false positives in multiple testing to detect differential genes in microarray experiments. Owing to a substantial variability in FDP for correlated genes, some authors considered controlling actual FDP, instead of its expectation, that is false discovery rate, in multiple testing. However, there has been no attempt to do this in the design of microarray experiments. In this article, we develop a procedure for sample size calculation to control the distributions of FDP and true positives s...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Oura, T., Matsui, S., Kawakami, K. Tags: Articles Source Type: journals
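The false discovery proportion defined in the abstract above is straightforward to compute when the truth is known, for example in a simulation study. The Python sketch below is a minimal illustration; the function name is hypothetical, and returning 0 when nothing is rejected is a common convention assumed here, not something stated in the abstract.

```python
def false_discovery_proportion(rejected, is_true_null):
    """FDP = (number of rejected true nulls) / (number of rejections).

    rejected:     list of bools, True where the hypothesis was rejected
    is_true_null: list of bools, True where the null hypothesis actually holds
    Returns 0.0 when there are no rejections (assumed convention).
    """
    n_rejected = sum(rejected)
    if n_rejected == 0:
        return 0.0
    n_false = sum(1 for r, h0 in zip(rejected, is_true_null) if r and h0)
    return n_false / n_rejected


# Three genes are declared differential; one of them is a true null.
print(false_discovery_proportion([True, True, False, True],
                                 [True, False, True, False]))
```

The false discovery rate is the expectation of this quantity over repeated experiments, so a single experiment's FDP can differ substantially from the nominal rate, particularly when genes are correlated.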
SHARE: an adaptive algorithm to select the most informative set of SNPs for candidate genetic association
Association studies have been widely used to identify genetic liability variants for complex diseases. While scanning the chromosomal region 1 single nucleotide polymorphism (SNP) at a time may not fully explore linkage disequilibrium, haplotype analyses tend to require a fairly large number of parameters, thus potentially losing power. Clustering algorithms, such as the cladistic approach, have been proposed to reduce the dimensionality, yet they have important limitations. We propose a SNP-Haplotype Adaptive REgression (SHARE) algorithm that seeks the most informative set of SNPs for genetic association in a targeted can...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Dai, J. Y., LeBlanc, M., Smith, N. L., Psaty, B., Kooperberg, C. Tags: Articles Source Type: journals
Identifying temporally differentially expressed genes through functional principal components analysis
Time course gene microarray is an important tool to identify genes with differential expressions over time. Traditional analysis of variance (ANOVA) type of longitudinal investigation may not be applicable because of irregular time intervals and possible missingness due to contamination in microarray experiments. Functional principal components analysis is proposed to test hypotheses in the change of the mean curves. A permutation test under a mild assumption is used to make the method more robust. The proposed method outperforms the recently developed extraction of differential gene expression and a 2-way mixed effects AN...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Liu, X., Yang, M. C. K. Tags: Articles Source Type: journals
Rank-based estimation in the ℓ1-regularized partly linear model for censored outcomes with application to integrated analyses of clinical predictors and gene expression data
We consider estimation and variable selection in the partial linear model for censored data. The partial linear model for censored data is a direct extension of the accelerated failure time model, the latter of which is a very important alternative model to the proportional hazards model. We extend rank-based lasso-type estimators to a model that may contain nonlinear effects. Variable selection in such a partial linear model has direct application to high-dimensional survival analyses that attempt to adjust for clinical predictors. In the microarray setting, previous methods can adjust for other clinical predictors by assum...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Johnson, B. A. Tags: Articles Source Type: journals
A semiparametric 2-part mixed-effects heteroscedastic transformation model for correlated right-skewed semicontinuous data
In longitudinal or hierarchical structure studies, we often encounter a semicontinuous variable that has a certain proportion of a single value and a continuous and skewed distribution among the rest of values. In this paper, we propose a new semiparametric 2-part mixed-effects transformation model to fit correlated skewed semicontinuous data. In our model, we allow the transformation to be nonparametric. Fitting the proposed model faces computational challenges due to intractable numerical integrations. We derive the estimates for the parameter and the transformation function based on an approximate likelihood, which has ...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Lin, H., Zhou, X.-H. Tags: Articles Source Type: journals
Variable selection and dependency networks for genomewide data
We describe a new stochastic search algorithm for linear regression models called the bounded mode stochastic search (BMSS). We make use of BMSS to perform variable selection and classification as well as to construct sparse dependency networks. Furthermore, we show how to determine genetic networks from genomewide data that involve any combination of continuous and discrete variables. We illustrate our methodology with several real-world data sets.
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Dobra, A. Tags: Articles Source Type: journals
A novel approach to cancer staging: application to esophageal cancer
A novel 3-step random forests methodology involving survival data (survival forests), ordinal data (multiclass forests), and continuous data (regression forests) is introduced for cancer staging. The methodology is illustrated for esophageal cancer using worldwide esophageal cancer collaboration data involving 4627 patients.
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Ishwaran, H., Blackstone, E. H., Apperson-Hansen, C., Rice, T. W. Tags: Articles Source Type: journals
Estimation and inference for case-control studies with multiple non-gold standard exposure assessments: with an occupational health application
In occupational case–control studies, work-related exposure assessments are often fallible measures of the true underlying exposure. In lieu of a gold standard, often more than 2 imperfect measurements (e.g. triads) are used to assess exposure. While methods exist to assess the diagnostic accuracy in the absence of a gold standard, these methods are infrequently used to correct for measurement error in exposure–disease associations in occupational case–control studies. Here, we present a likelihood-based approach that (a) provides evidence regarding whether the misclassification of tests is differential o...
Source: Biostatistics - September 10, 2009 Category: Bioinformatics Authors: Chu, H., Cole, S. R., Wei, Y., Ibrahim, J. G. Tags: Articles Source Type: journals
Biostatistics - Referees of Manuscripts Submitted Mid-2007 to Mid-2008
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Tags: Referees Source Type: journals
Joint analysis of prevalence and incidence data using conditional likelihood
Disease prevalence is the combined result of duration, disease incidence, case fatality, and other mortality. If information is available on all these factors, and on fixed covariates such as genotypes, prevalence information can be utilized in the estimation of the effects of the covariates on disease incidence. Study cohorts that are recruited as cross-sectional samples and subsequently followed up for disease events of interest produce both prevalence and incidence information. In this paper, we make use of both types of information using a likelihood, which is conditioned on survival until the cross section. In a simul...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Saarela, O., Kulathinal, S., Karvanen, J. Tags: Articles Source Type: journals
Optimal designs for 2-color microarray experiments
Statisticians can play a crucial role in the design of gene expression studies to ensure the most effective allocation of available resources. This paper considers Pareto optimal designs for gene expression studies involving 2-color microarrays. Pareto optimality enables the recommendation of designs that are particularly efficient for the effects of most interest to biologists. This is relevant in the microarray context where analysis is typically carried out separately for those effects. Our approach will allow for effects of interest that correspond to contrasts rather than solely considering parameters of the linear mo...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Sanchez, P. S., Glonek, G. F. V. Tags: Articles Source Type: journals
Testing the prediction error difference between 2 predictors
We develop an inference framework for the difference in errors between 2 prediction procedures. The 2 procedures may differ in any aspect and possibly utilize different sets of covariates. We apply training and testing on the same data set, which is accommodated by sample splitting. For each split, both procedures predict the response of the same samples, which results in paired residuals to which a signed-rank test is applied. Multiple splits result in multiple p-values. The median p-value and the mean inverse normal transformed p-value are proposed as summary (test) statistics, for which bounds on the overall type I erro...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: van de Wiel, M. A., Berkhof, J., van Wieringen, W. N. Tags: Articles Source Type: journals
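The split-and-test procedure outlined in the abstract above can be sketched in a few lines of Python. For each random split, two predictors are fitted on the training part, their absolute residuals on the test part are paired, and a signed-rank test is applied; the median p-value over splits is returned. Several choices here are assumptions for illustration only: the two toy predictors (`fit_mean`, `fit_line`), the use of absolute residuals, the two-sided normal approximation to the signed-rank test, and all function names. The error-rate bounds the authors derive for the summary statistic are not reproduced.

```python
import math
import random
import statistics


def signed_rank_pvalue(diffs):
    """Two-sided Wilcoxon signed-rank p-value (normal approximation)."""
    d = [x for x in diffs if x != 0]
    n = len(d)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank within a tie block
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return math.erfc(abs(w_plus - mean) / sd / math.sqrt(2))


def fit_mean(xs, ys):
    """Toy predictor A: always predict the training mean."""
    m = statistics.fmean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy predictor B: simple least-squares line."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def median_split_pvalue(xs, ys, fit_a, fit_b, n_splits=25, test_frac=0.3, seed=0):
    """Median signed-rank p-value over repeated random train/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(1, int(test_frac * len(xs)))
    pvals = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        pred_a, pred_b = fit_a(tx, ty), fit_b(tx, ty)
        # Paired differences in absolute prediction error on the same test samples.
        diffs = [abs(ys[i] - pred_a(xs[i])) - abs(ys[i] - pred_b(xs[i])) for i in test]
        pvals.append(signed_rank_pvalue(diffs))
    return statistics.median(pvals)
```

Because the splits reuse the same data, the per-split p-values are dependent, so the median should not be compared with the usual significance level without the kind of adjustment the abstract alludes to.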
Development and validation of a dynamic prognostic tool for prostate cancer recurrence using repeated measures of posttreatment PSA: a joint modeling approach
Prostate-specific antigen (PSA) is a biomarker routinely and repeatedly measured on prostate cancer patients treated by radiation therapy (RT). It was shown recently that its whole pattern over time rather than just its current level was strongly associated with prostate cancer recurrence. To more accurately guide clinical decision making, monitoring of PSA after RT would be aided by dynamic powerful prognostic tools that incorporate the complete posttreatment PSA evolution. In this work, we propose a dynamic prognostic tool derived from a joint latent class model and provide a measure of variability obtained from the para...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Proust-Lima, C., Taylor, J. M. G. Tags: Articles Source Type: journals
A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis
We present a penalized matrix decomposition (PMD), a new framework for computing a rank-K approximation for a matrix. We approximate the matrix X as $\hat{X} = \sum_{k=1}^{K} d_k u_k v_k^T$, where $d_k$, $u_k$, and $v_k$ minimize the squared Frobenius norm of $X - \hat{X}$, subject to penalties on $u_k$ and $v_k$. This results in a regularized version of the singular value decomposition. Of particular interest is the use of L1-penalties on $u_k$ and $v_k$, which yields a decomposition of X using sparse vectors. We show that when the PMD is applied using an L1-penalty on $v_k$ but not on $u_k$, a method for sparse principal components results. In fact, this yields an efficient algorithm for the "SCo...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Witten, D. M., Tibshirani, R., Hastie, T. Tags: Articles Source Type: journals
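A rank-1 version of the decomposition described in the abstract above, with an L1-type penalty on v only, can be sketched as an alternating update with soft-thresholding. The Python example below is a simplified illustration: it applies a fixed threshold `lam` rather than searching for the threshold that meets an L1-norm bound, and the function names and example matrix are hypothetical. It should not be read as the authors' algorithm.

```python
import math


def soft_threshold(a, lam):
    """Shrink a toward zero by lam; values within lam of zero become exactly 0."""
    return math.copysign(max(abs(a) - lam, 0.0), a)


def normalize(vec):
    """Scale vec to unit Euclidean norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def rank1_sparse_decomposition(X, lam, n_iter=100):
    """Return (d, u, v) with X approximated by d * u * v^T and v sparse."""
    n, p = len(X), len(X[0])
    v = normalize([1.0] * p)
    u = [0.0] * n
    for _ in range(n_iter):
        # u update: unpenalized, so just X v rescaled to unit length.
        u = normalize([sum(X[i][j] * v[j] for j in range(p)) for i in range(n)])
        # v update: soft-threshold X^T u, then rescale to unit length.
        v = normalize([soft_threshold(sum(X[i][j] * u[i] for i in range(n)), lam)
                       for j in range(p)])
    d = sum(u[i] * X[i][j] * v[j] for i in range(n) for j in range(p))
    return d, u, v


# The third column carries almost no signal, so its loading is set to zero.
X = [[3.0, 3.0, 0.1],
     [3.0, 3.0, -0.1],
     [3.0, 3.0, 0.1]]
d, u, v = rank1_sparse_decomposition(X, lam=0.5)
print(d, v)
```

With `lam = 0` the updates reduce to power iteration for the leading singular vectors, which shows how the penalty turns an ordinary singular value decomposition into a sparse one.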
Frailty modeling of bimodal age-incidence curves of nasopharyngeal carcinoma in low-risk populations
The incidence of nasopharyngeal carcinoma (NPC) varies widely according to age at diagnosis, geographic location, and ethnic background. On a global scale, NPC incidence is common among specific populations primarily living in southern and eastern Asia and northern Africa, but in most areas, including almost all western countries, it remains a relatively uncommon malignancy. Specific to these low-risk populations is a general observation of possible bimodality in the observed age-incidence curves. We have developed a multiplicative frailty model that allows for the demonstrated points of inflection at ages 15–24 and ...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Haugen, M., Bray, F., Grotmol, T., Tretli, S., Aalen, O. O., Moger, T. A. Tags: Articles Source Type: journals
An insight into high-resolution mass-spectrometry data
Mass spectrometry is a powerful tool with much promise in global proteomic studies. The discipline of statistics offers robust methodologies to extract and interpret high-dimensional mass-spectrometry data and will be a valuable contributor to the field. Here, we describe the process by which data are produced, characteristics of the data, and the analytical preprocessing steps that are taken in order to interpret the data and use it in downstream statistical analyses. Because of the complexity of data acquisition, statistical methods developed for gene expression microarray data are not directly applicable to proteomic da...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Eckel-Passow, J. E., Oberg, A. L., Therneau, T. M., Bergen, H. R. Tags: Articles Source Type: journals
Estimating equation-based causality analysis with application to microarray time series data
Microarray time-course data can be used to explore interactions among genes and infer gene networks. The crucial step in constructing a gene network is to develop an appropriate causality test. In this regard, the expression profile of each gene can be treated as a time series. A typical existing method establishes Granger causality based on a Wald-type test, which relies on the homoscedastic normality assumption of the data distribution. However, this assumption can be seriously violated in real microarray experiments and thus may lead to inconsistent test results and false scientific conclusions. To overcome the drawba...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Hu, J., Hu, F. Tags: Articles Source Type: journals
Conditional GEE for recurrent event gap times
This paper deals with the analysis of recurrent event data subject to censored observation. Using a suitable adaptation of generalized estimating equations for longitudinal data, we propose a straightforward methodology for estimating the parameters indexing the conditional means and variances of the process interevent (i.e. gap) times. The proposed methodology permits the use of both time-fixed and time-varying covariates, as well as transformations of the gap times, creating a flexible and useful class of methods for analyzing gap-time data. Censoring is dealt with by imposing a parametric assumption on the censored gap ...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Clement, D. Y., Strawderman, R. L. Tags: Articles Source Type: journals
A note on oligonucleotide expression values not being normally distributed
Novel techniques for analyzing microarray data are constantly being developed. Though many of the methods contribute to biological discoveries, inability to properly evaluate the novel techniques limits their ability to advance science. Because the underlying distribution of microarray data is unknown, novel methods are typically tested against the assumed normal distribution. However, microarray data are not, in fact, normally distributed, and assuming so can have misleading consequences. Using an Affymetrix technical replicate spike-in data set, we show that oligonucleotide expression values are not normally distributed ...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Hardin, J., Wilson, J. Tags: Articles Source Type: journals
Efficient parameter estimation in longitudinal data analysis using a hybrid GEE method
This study addresses this problem by proposing a hybrid method that combines multiple GEEs based on different working correlation models, using the empirical likelihood method (Qin and Lawless, 1994). Analyses show that this hybrid method is more efficient than a GEE using a misspecified working correlation model. Furthermore, if one of the working correlation structures correctly models the within-subject correlations, then this hybrid method provides the most efficient parameter estimates. In simulations, the hybrid method's finite-sample performance is superior to a GEE under any of the commonly used working correlation...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Leung, D. H. Y., Wang, Y.-G., Zhu, M. Tags: Articles Source Type: journals
A simulation-approximation approach to sample size planning for high-dimensional classification studies
Classification studies with high-dimensional measurements and relatively small sample sizes are increasingly common. Prospective analysis of the role of sample sizes in the performance of such studies is important for study design and interpretation of results, but the complexity of typical pattern discovery methods makes this problem challenging. The approach developed here combines Monte Carlo methods and new approximations for linear discriminant analysis, assuming multivariate normal distributions. Monte Carlo methods are used to sample the distribution of which features are selected for a classifier and the mean and v...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: de Valpine, P., Bitter, H.-M., Brown, M. P. S., Heller, J. Tags: Articles Source Type: journals
Air pollution and health in Scotland: a multicity study
This paper presents an epidemiological study investigating the effects of long-term air pollution exposure on public health in Scotland, focusing on the 4 major urban areas, Aberdeen, Dundee, Edinburgh, and Glasgow. In particular, the associations between respiratory hospital admissions in 2005 and exposure to both PM10 and NO2 between 2002 and 2004 are estimated using a small-area ecological design. The implementation of such studies requires careful consideration of a number of statistical issues, including how to model spatial correlation, identifiability of the model parameters, and the possible effects of ecological b...
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Lee, D., Ferguson, C., Mitchell, R. Tags: Articles Source Type: journals
Reproducible research and Biostatistics
Source: Biostatistics - June 15, 2009 Category: Bioinformatics Authors: Peng, R. D. Tags: Editorial Source Type: journals
A Bayesian model for evaluating influenza antiviral efficacy in household studies with asymptomatic infections
Antiviral agents are an important component in mitigation/containment strategies for pandemic influenza. However, most research for mitigation/containment strategies relies on the antiviral efficacies evaluated from limited data of clinical trials. Which efficacy measures can be reliably estimated from these studies depends on the trial design, the size of the epidemics, and the statistical methods. We propose a Bayesian framework for modeling the influenza transmission dynamics within households. This Bayesian framework takes into account asymptomatic infections and is able to estimate efficacies with respect to protectin...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Yang, Y., Halloran, M. E., Longini, I. M. Tags: Articles Source Type: journals
Bias in 2-part mixed models for longitudinal semicontinuous data
Semicontinuous data in the form of a mixture of zeros and continuously distributed positive values frequently arise in biomedical research. Two-part mixed models with correlated random effects are an attractive approach to characterize the complex structure of longitudinal semicontinuous data. In practice, however, an independence assumption about random effects in these models may often be made for convenience and computational feasibility. In this article, we show that bias can be induced for regression coefficients when random effects are truly correlated but misspecified as independent in a 2-part mixed model. Parallel...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Su, L., Tom, B. D. M., Farewell, V. T. Tags: Articles Source Type: journals
A robust method for finely stratified familial studies with proband-based sampling
This paper presents a robust method to conduct inference in finely stratified familial studies under proband-based sampling. We assume that the interest is in both the marginal effects of subject-specific covariates on a binary response and the familial aggregation of the response, as quantified by intrafamilial pairwise odds ratios. We adopt an estimating function for proband-based family studies originally developed by Zhao and others (1998) in the context of an unstratified design and treat the stratification effects as fixed nuisance parameters. Our method requires modeling only the first 2 joint moments of the observa...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Wang, M., Hanfelt, J. J. Tags: Articles Source Type: journals
Microarray background correction: maximum likelihood estimation for the normal-exponential convolution
This article develops the normexp method further by improving the estimation of the parameters. A complete mathematical development is given of the normexp model and the associated saddle-point approximation. Some subtle numerical programming issues are solved which caused the original normexp method to fail occasionally when applied to unusual data sets. A practical and reliable algorithm is developed for exact maximum likelihood estimation (MLE) using high-quality optimization software and using the saddle-point estimates as starting values. "MLE" is shown to outperform heuristic estimators proposed by other authors, bot...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Silver, J. D., Ritchie, M. E., Smyth, G. K. Tags: Articles Source Type: journals
Bayesian graphical models for regression on multiple data sets with different variables
Routinely collected administrative data sets, such as national registers, aim to collect information on a limited number of variables for the whole population. In contrast, survey and cohort studies contain more detailed data from a sample of the population. This paper describes Bayesian graphical models for fitting a common regression model to a combination of data sets with different sets of covariates. The methods are applied to a study of low birth weight and air pollution in England and Wales using a combination of register, survey, and small-area aggregate data. We discuss issues such as multiple imputation of confou...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Jackson, C. H., Best, N. G., Richardson, S. Tags: Articles Source Type: journals
Statistical independence of the colocalized association signals for type 1 diabetes and RPS26 gene expression on chromosome 12q13
Following the recent success of genome-wide association studies in uncovering disease-associated genetic variants, the next challenge is to understand how these variants affect downstream pathways. The most proximal trait to a disease-associated variant, most commonly a single nucleotide polymorphism (SNP), is differential gene expression due to the cis effect of SNP alleles on transcription, translation, and/or splicing: gene expression quantitative trait loci (eQTL). Several genome-wide SNP–gene expression association studies have already provided convincing evidence of widespread association of eQTLs. As a conseque...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Plagnol, V., Smyth, D. J., Todd, J. A., Clayton, D. G. Tags: Articles Source Type: journals
Optimal 2-stage design with given power in association studies
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Wang, J., Liang, H., Zou, G. Tags: Articles Source Type: journals
Statistical monitoring of clinical trials with multivariate response and/or multiple arms: a flexible approach
Randomized clinical trials with a multivariate response and/or multiple treatment arms are increasingly common, in part because of their efficiency and a greater concern about balancing risks with benefits. In some trials, the specific types and magnitudes of treatment group differences that would warrant early termination cannot easily be specified prior to the onset of the trial and/or could change as the trial progresses. This underscores the need for more flexible monitoring methods than traditional approaches. This paper extends the repeated confidence bands approach for interim monitoring to more general settings whe...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Zhao, L., Hu, X. J., Lagakos, S. W. Tags: Articles Source Type: journals
Optimal multistage designs--a general framework for efficient genome-wide association studies
Genome-wide association studies (GWAS) have become increasingly affordable, but they are still costly. Therefore, cost-saving 2-stage designs were proposed in the literature. The restriction to 2 stages, however, seems artificial and does not exploit the full potential of the underlying methods. We extend the 2-stage approach to the general framework of any number of stages. Based on the theory of group sequential methods, we derive optimal multistage designs. With current genotyping cost structures, our results suggest that up to 4 stages are sufficient in order to get feasible and efficient designs. Furthermore, we consid...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Pahl, R., Schafer, H., Muller, H.-H. Tags: Articles Source Type: journals
A method for constructing a confidence bound for the actual error rate of a prediction rule in high dimensions
Constructing a confidence interval for the actual, conditional error rate of a prediction rule from multivariate data is problematic because this error rate is not a population parameter in the traditional sense—it is a functional of the training set. When the training set changes, so does this "parameter." A valid method for constructing confidence intervals for the actual error rate had been previously developed by McLachlan. However, McLachlan's method cannot be applied in many cancer research settings because it requires the number of samples to be much larger than the number of dimensions (n >> p), and it ...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Dobbin, K. K. Tags: Articles Source Type: journals
Exact and efficient inference procedure for meta-analysis and its application to the analysis of independent 2 x 2 tables with all available data but without artificial continuity correction
Recently, meta-analysis has been widely utilized to combine information across comparative clinical studies for evaluating drug efficacy or safety profile. When dealing with rather rare events, a substantial proportion of studies may not have any events of interest. Conventional methods either exclude such studies or add an arbitrary positive value to each cell of the corresponding 2 x 2 tables in the analysis. In this article, we present a simple, effective procedure to make valid inferences about the parameter of interest with all available data without artificial continuity corrections. We then use the procedure to analyz...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Tian, L., Cai, T., Pfeffer, M. A., Piankov, N., Cremieux, P.-Y., Wei, L. J. Tags: Articles Source Type: journals
Measurement error caused by spatial misalignment in environmental epidemiology
In many environmental epidemiology studies, the locations and/or times of exposure measurements and health assessments do not match. In such settings, health effects analyses often use the predictions from an exposure model as a covariate in a regression model. Such exposure predictions contain some measurement error as the predicted values do not equal the true exposures. We provide a framework for spatial measurement error modeling, showing that smoothing induces a Berkson-type measurement error with nondiagonal error structure. From this viewpoint, we review the existing approaches to estimation in a linear regression h...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Gryparis, A., Paciorek, C. J., Zeka, A., Schwartz, J., Coull, B. A. Tags: Articles Source Type: journals
A new serially correlated gamma-frailty process for longitudinal count data
We describe a new multivariate gamma distribution and discuss its implication in a Poisson-correlated gamma-frailty model. This model is introduced to account for between-subjects correlation occurring in longitudinal count data. For likelihood-based inference involving distributions in which high-dimensional dependencies are present, it may be useful to approximate likelihoods based on the univariate or bivariate marginal distributions. The merit of composite likelihood is to reduce the computational complexity of the full likelihood. A 2-stage composite-likelihood procedure is developed for estimating the model parameter...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Fiocco, M., Putter, H., Van Houwelingen, J.C. Tags: Articles Source Type: journals
Biomarker evaluation and comparison using the controls as a reference population
The classification accuracy of a continuous marker is typically evaluated with the receiver operating characteristic (ROC) curve. In this paper, we study an alternative conceptual framework, the "percentile value." In this framework, the controls only provide a reference distribution to standardize the marker. The analysis proceeds by analyzing the standardized marker in cases. The approach is shown to be equivalent to ROC analysis. Advantages are that it provides a framework familiar to a broad spectrum of biostatisticians and it opens up avenues for new statistical techniques in biomarker evaluation. We develop several n...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Huang, Y., Pepe, M. S. Tags: Articles Source Type: journals
Modified test statistics by inter-voxel variance shrinkage with an application to fMRI
Functional magnetic resonance imaging (fMRI) is a noninvasive technique which is commonly used to quantify changes in blood oxygenation and flow coupled to neuronal activation. One of the primary goals of fMRI studies is to identify localized brain regions where neuronal activation levels vary between groups. Single voxel t-tests have been commonly used to determine whether activation related to the protocol differs across groups. Due to the generally limited number of subjects within each study, accurate estimation of variance at each voxel is difficult. Thus, combining information across voxels is desirable in order to...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Su, S.-C., Caffo, B., Garrett-Mayer, E., Bassett, S. S. Tags: Articles Source Type: journals
Generalized linear models with unspecified reference distribution
We propose a new class of semiparametric generalized linear models. As with existing models, these models are specified via a linear predictor and a link function for the mean of response Y as a function of predictors X. Here, however, the "baseline" distribution of Y at a given reference mean µ0 is left unspecified and is estimated from the data. The response distribution when the mean differs from µ0 is then generated via exponential tilting of the baseline distribution, yielding a response model that is a natural exponential family, with corresponding canonical link and variance functions. The resulting mode...
Source: Biostatistics - February 27, 2009 Category: Bioinformatics Authors: Rathouz, P. J., Gao, L. Tags: Articles Source Type: journals
Letter to the editor
Source: Biostatistics - December 12, 2008 Category: Bioinformatics Authors: Chu, H., Guo, H. Tags: Letter Source Type: journals
