Computationally efficient methods for fitting mixed models to electronic health records data
Authors
Rhodes, KM
Turner, R
Payne, R
White, I
Publication Date
2018-12-20
Journal Title
Statistics in Medicine
ISSN
1097-0258
Publisher
Wiley-Blackwell
Volume
37
Issue
29
Pages
4557-4570
Type
Article
This Version
VoR
Citation
Rhodes, K., Turner, R., Payne, R., & White, I. (2018). Computationally efficient methods for fitting mixed models to electronic health records data. Statistics in Medicine, 37(29), 4557-4570. https://doi.org/10.1002/sim.7944
Abstract
Motivated by two case studies using primary care records from the Clinical Practice Research Datalink, we describe statistical methods that facilitate the analysis of tall data, with very large numbers of observations. Our focus is on investigating the association between patient characteristics and an outcome of interest, while allowing for variation among general practices. We explore ways to fit mixed effects models to tall data, including predictors of interest and confounding factors as covariates, and including random intercepts to allow for heterogeneity in outcome among practices. We introduce: (1) weighted regression and (2) meta-analysis of estimated regression coefficients from each practice. Both methods reduce the size of the dataset, thus decreasing the time required for statistical analysis. We compare the methods to an existing subsampling approach. All methods give similar point estimates, and weighted regression and meta-analysis give similar standard errors for point estimates to analysis of the entire dataset, but the subsampling method gives larger standard errors. Where all data are discrete, weighted regression is equivalent to fitting the mixed model to the entire dataset. In the presence of a continuous covariate, meta-analysis is useful. Both methods are easy to implement in standard statistical software.
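The two data-reduction ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain least-squares line with no random intercepts, fixed-effect inverse-variance pooling as an assumed choice of meta-analysis, and entirely hypothetical data and function names (`weighted_ols`, `pool_fixed_effect`). It only shows why collapsing discrete data to one weighted row per distinct pattern leaves point estimates unchanged, and how per-practice coefficients might be combined.

```python
from collections import Counter


def weighted_ols(rows):
    """Fit y = a + b*x by least squares; rows are (x, y, weight) tuples."""
    sw = sum(w for _, _, w in rows)
    mx = sum(w * x for x, _, w in rows) / sw
    my = sum(w * y for _, y, w in rows) / sw
    sxx = sum(w * (x - mx) ** 2 for x, _, w in rows)
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in rows)
    slope = sxy / sxx
    return my - slope * mx, slope


def pool_fixed_effect(estimates):
    """Inverse-variance pooling of (estimate, standard_error) pairs.

    Assumption: a fixed-effect model; the paper may use another approach.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return pooled, (1 / sum(weights)) ** 0.5


# Hypothetical tall dataset: binary covariate x, binary outcome y.
full = [(0, 0)] * 400 + [(0, 1)] * 100 + [(1, 0)] * 250 + [(1, 1)] * 250

# Weighted regression idea: one row per distinct (x, y) pattern,
# with the pattern's frequency as its weight.
collapsed = [(x, y, n) for (x, y), n in Counter(full).items()]

full_fit = weighted_ols([(x, y, 1) for x, y in full])
collapsed_fit = weighted_ols(collapsed)

# Meta-analysis idea: combine hypothetical per-practice slope estimates.
pooled_slope, pooled_se = pool_fixed_effect([(0.30, 0.10), (0.20, 0.10)])
```

Here the 1000 original rows shrink to 4 weighted rows and both fits return the same intercept and slope. In real software the weights would need to be declared as frequency weights so that standard errors reflect the original number of observations.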
Keywords
health records; meta-analysis; mixed-effects regression model; subsampling; tall data; Data Interpretation, Statistical; Datasets as Topic; Electronic Health Records; General Practice; Humans; Meta-Analysis as Topic; Models, Statistical; Regression Analysis
Sponsorship
The authors are grateful to the CPRD team at the University of Cambridge. In particular, we thank Carol Wilson and Anna Cassel for providing access to the case study datasets that they spent much time preparing for analysis. Kirsty Rhodes was funded by Medical Research Council Unit Programmes U105260558 and MC_UU_00002/5. Rebecca Turner and Ian White were funded by Medical Research Council Unit Programmes U105260558 and MC_UU_12023/21.
Funder references
MRC (unknown)
MRC (unknown)
MRC (1182928)
Embargo Lift Date
2100-01-01
Identifiers
External DOI: https://doi.org/10.1002/sim.7944
This record's URL: https://www.repository.cam.ac.uk/handle/1810/287144