Pre-eclampsia is a leading cause of maternal and perinatal mortality and morbidity. Early identification of women at risk during pregnancy is required to plan management. Although there are many published prediction models for pre-eclampsia, few have been validated in external data. Our objective was to externally validate published prediction models for pre-eclampsia using individual participant data (IPD) from UK studies, to evaluate whether any of the models can accurately predict the condition when used within the UK healthcare setting.
IPD from 11 UK cohort studies (217,415 pregnant women) within the International Prediction of Pregnancy Complications (IPPIC) pre-eclampsia network contributed to external validation of published prediction models, identified by systematic review. Cohorts that measured all predictor variables in at least one of the identified models and reported pre-eclampsia as an outcome were included for validation. We reported the model predictive performance as discrimination (
Of 131 published models, 67 provided the full model equation and 24 could be validated in 11 UK cohorts. Most of the models showed modest discrimination with summary
The evaluated models had modest predictive performance, with key limitations such as poor calibration (likely due to overfitting in the original development datasets), substantial heterogeneity, and small net benefit across settings. The evidence to support the use of these prediction models for pre-eclampsia in clinical decision-making is limited. Any models that we could not validate should be examined in terms of their predictive performance, net benefit, and heterogeneity across multiple UK settings before consideration for use in practice.
PROSPERO ID:
Kym IE Snell and John Allotey are joint first authors (both contributed equally).
Pre-eclampsia, a pregnancy-specific condition with hypertension and multi-organ dysfunction, is a leading contributor to maternal and offspring mortality and morbidity. Early identification of women at risk of pre-eclampsia is key to planning effective antenatal care, including closer monitoring or commencement of prophylactic aspirin in early pregnancy to reduce the risk of developing pre-eclampsia and associated adverse outcomes. Accurate prediction of pre-eclampsia continues to be a clinical and research priority [
Any recommendation to use a prediction model in clinical practice must be underpinned by robust evidence on the reproducibility of the models, their predictive performance across various settings, and their clinical utility. An individual participant data (IPD) meta-analysis that combines multiple datasets has great potential to externally validate existing models [
We undertook an IPD meta-analysis to externally validate the predictive performance of existing multivariable models to predict the risk of pre-eclampsia in pregnant women managed within the National Health Service (NHS) in the UK and assessed the clinical utility of the models using decision curve analysis.
We undertook a systematic review of reviews by searching Medline, Embase, and the Cochrane Library including DARE (Database of Abstracts of Reviews of Effects) databases, from database inception to March 2017, to identify relevant systematic reviews on clinical characteristics, biochemical, and ultrasound markers for predicting pre-eclampsia [
We updated our previous literature search of prediction models for pre-eclampsia [
We externally validated the models in IPPIC IPD cohorts that contained participants from the UK (IPPIC-UK subset) to determine their performance within the context of the UK healthcare system and to reduce the heterogeneity in the outcome definitions [
We obtained data from cohorts in prospective and retrospective observational studies (including cohorts nested within randomised trials, birth cohorts, and registry-based cohorts). Collaborators sent their pseudo-anonymised IPD in the most convenient format for them, and we then formatted, harmonised, and cleaned the data. Full details on the eligibility criteria, selection of the studies and datasets, and data preparation have previously been reported in our published protocol [
Two independent reviewers assessed the quality of each IPD cohort using a modified version of the PROBAST (Prediction study Risk of Bias Assessment) tool [
We summarised the total number of participants and number of events in each dataset, and the overall numbers available for validating each model.
We could validate the predictive performance of a model only when the values of all its predictors were available for participants in at least one IPD dataset, i.e. in datasets where none of the predictors was systematically missing (unavailable for all participants). In such datasets, when data were missing for predictors or outcomes in some participants (‘partially missing data’), we used a three-stage approach. First, where possible, we filled in the actual value that was missing using knowledge of the study’s eligibility criteria or other available data in the same dataset. For example, we set nulliparous = 1 for all individuals in a dataset if only nulliparous women were eligible for inclusion. Second, after a preliminary comparison in other datasets that recorded the information, we used second trimester information in place of missing first trimester information. For example, early second trimester values of body mass index (BMI) or mean arterial pressure (MAP) were used if the first trimester values were missing. Where required, we reclassified ethnicity into broader categories: women of either Afro-Caribbean or African-American origin were classified as Black, and those of Indian or Pakistani origin as Asian. Third, for any remaining missing values, we imputed all partially missing predictor and outcome values using multiple imputation by chained equations (MICE) [
We conducted the imputations in each IPD dataset separately. This approach acknowledges the clustering of individuals within a dataset and retains potential heterogeneity across datasets. We generated 100 imputed datasets for each IPD dataset with any missing predictor or outcome values. In the multiple imputation models, continuous variables with missing values were imputed using linear regression (or predictive mean matching if skewed), binary variables were imputed using logistic regression, and categorical variables were imputed using multinomial logistic regression. Complete predictors were also included in the imputation models as auxiliary variables. To retain congeniality between the imputation models and predictive models [
For each model that we could validate, we applied the model equation to each individual
We examined the predictive performance of each model separately, using measures of discrimination and calibration, firstly in the IPD for each available dataset and then at the meta-analysis level. We assessed model discrimination using the
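As a rough illustration of the discrimination and calibration measures reported in this paper (C-statistic, calibration slope, and calibration-in-the-large), the sketch below computes each one in plain Python. It is not the analysis code used in the study, which was run in Stata. The function names and the toy data are hypothetical, and the logistic fits use a basic Newton-Raphson loop chosen only to keep the example dependency-free.

```python
import math


def c_statistic(y, score):
    """Share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [s for yi, s in zip(y, score) if yi == 1]
    non_events = [s for yi, s in zip(y, score) if yi == 0]
    total = 0.0
    for e in events:
        for n in non_events:
            total += 1.0 if e > n else 0.5 if e == n else 0.0
    return total / (len(events) * len(non_events))


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def calibration_slope(y, lp, max_iter=100, tol=1e-10):
    """Fit logit P(y=1) = a + b * lp by Newton-Raphson and return b."""
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, lp):
            mu = _sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu              # score for the intercept
            g1 += (yi - mu) * xi       # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return b


def calibration_in_the_large(y, lp, max_iter=100, tol=1e-10):
    """Fit logit P(y=1) = a + lp (lp as an offset) and return a."""
    a = 0.0
    for _ in range(max_iter):
        g = h = 0.0
        for yi, xi in zip(y, lp):
            mu = _sigmoid(a + xi)
            g += yi - mu
            h += mu * (1.0 - mu)
        step = g / h
        a += step
        if abs(step) < tol:
            break
    return a


# Hypothetical toy data: observed event rates match the predictions exactly.
L3 = math.log(3)                       # log-odds for a predicted risk of 0.75
lp = [0.0, 0.0] + [L3] * 4 + [-L3] * 4
y = [1, 0] + [1, 1, 1, 0] + [1, 0, 0, 0]

c = c_statistic(y, lp)
slope = calibration_slope(y, lp)
citl = calibration_in_the_large(y, lp)
# Doubling every linear predictor mimics predictions that are too extreme.
slope_too_extreme = calibration_slope(y, [2 * x for x in lp])
```

In this toy example the second slope is below 1 because the doubled linear predictors are too extreme for the observed outcomes, the same pattern that an overfitted model would be expected to show in new data.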
Where data had been imputed in a particular IPD dataset, the predictive performance measures were calculated in each of the imputed datasets, and then Rubin’s rules were applied to combine statistics (and corresponding standard errors) across imputations [
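The pooling step described above can be sketched as follows. This is a minimal illustration of Rubin's rules for a single performance statistic, assuming one estimate and one standard error per imputed dataset; the function name and the example numbers are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def rubin_pool(estimates, std_errors):
    """Combine per-imputation estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                        # average point estimate
    within = mean(se ** 2 for se in std_errors)     # average within-imputation variance
    between = variance(estimates)                   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical calibration slopes from three imputed copies of one dataset.
pooled_slope, pooled_se = rubin_pool([0.70, 0.72, 0.74], [0.02, 0.02, 0.02])
```

The pooled standard error is larger than any single-imputation standard error because it also carries the uncertainty caused by the missing data.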
When it was possible to validate a model in multiple cohorts, we summarised the performance measures across cohorts using a random-effects meta-analysis estimated using restricted maximum likelihood (for each performance measure separately) [
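A minimal sketch of a random-effects summary across cohorts is given below. For brevity it swaps in the DerSimonian-Laird moment estimator of the between-cohort variance rather than the restricted maximum likelihood estimator used in the study, so it will not reproduce the published summaries. The logit transformation for the C-statistic follows the note under the summary table; the delta-method standard error, the function names, and the example inputs are assumptions made for illustration.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_c_with_se(c, se_c):
    """Move a C-statistic to the logit scale; SE via the delta method."""
    return logit(c), se_c / (c * (1.0 - c))


def random_effects_dl(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling (not REML)."""
    v = [se ** 2 for se in std_errors]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    k = len(estimates)
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / scale)          # between-cohort variance
    w_star = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical cohort-level C-statistics with standard errors.
cohort_c = [(0.65, 0.03), (0.72, 0.02), (0.69, 0.04)]
on_logit = [logit_c_with_se(c, se) for c, se in cohort_c]
pooled_logit, pooled_se, tau2 = random_effects_dl(
    [est for est, _ in on_logit], [se for _, se in on_logit]
)
summary_c = inv_logit(pooled_logit)                 # back-transform to the 0-1 scale
```

Pooling on the logit scale keeps the back-transformed summary and its interval inside the 0 to 1 range, which matters for C-statistics close to 0.5 or 1.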
Predicting pre-eclampsia in nulliparous women is a particular challenge because they have no history from prior pregnancies, which otherwise provides strong predictors. We therefore also conducted a subgroup analysis in which we assessed the performance of the models in only the nulliparous women from each study.
For each pre-eclampsia outcome (any, early, or late onset), we compared prediction models using decision curve analysis [
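Decision curve analysis compares strategies by their net benefit at a chosen risk threshold. The sketch below uses the standard net benefit formula (true positives minus false positives weighted by the odds of the threshold, both per person). The function names, the toy outcomes, and the 10% threshold are hypothetical and are not taken from the study.

```python
def net_benefit(y, predicted_risk, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y)
    treated = [(yi, pi) for yi, pi in zip(y, predicted_risk) if pi >= threshold]
    true_pos = sum(1 for yi, _ in treated if yi == 1)
    false_pos = sum(1 for yi, _ in treated if yi == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y, threshold):
    """Net benefit of the reference strategy of treating every woman."""
    prevalence = sum(y) / len(y)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical cohort of ten women, two of whom develop pre-eclampsia.
y = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
risk = [0.40, 0.30, 0.05, 0.05, 0.20, 0.05, 0.05, 0.05, 0.05, 0.05]

nb_model = net_benefit(y, risk, threshold=0.10)
nb_all = net_benefit_treat_all(y, threshold=0.10)
nb_none = 0.0  # treating no one has zero net benefit by definition
```

A model is only useful at a given threshold if its net benefit exceeds both reference strategies (treat all and treat none); a decision curve plots this comparison over a range of thresholds.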
All statistical analyses were performed using Stata MP Version 15. TRIPOD guidelines were followed for transparent reporting of risk prediction model validation studies [
Of the 131 models published on prediction of pre-eclampsia, only 67 reported the full model equation needed for validation (67/131, 51%) (Supplementary Table S3, Additional file

Pre-eclampsia prediction model equations externally validated in the IPPIC-UK cohorts
Model no. | Author (year) | Predictor category | Prediction model equation for linear predictor (LP)
1 | Plasencia 2007a | Clinical characteristics | LP = −6.253 + 1.432(if Afro-Caribbean ethnicity) + 1.465(if mixed ethnicity) + 0.084(BMI) + 0.81(if woman’s mother had PE) − 1.539(if parous without previous PE) + 1.049(if parous with previous PE)
2 | Poon 2008 | Clinical characteristics | LP = −6.311 + 1.299(if Afro-Caribbean ethnicity) + 0.092(BMI) + 0.855(if woman’s mother had PE) − 1.481(if parous without previous PE) + 0.933(if parous with previous PE)
3 | Wright 2015a* | Clinical characteristics | Mean gestational age at delivery with PE = 54.3637 − 0.0206886(age, years − 35, if age ≥ 35) + 0.11711(height, cm − 164) − 2.6786(if Afro-Caribbean ethnicity) − 1.129(if South Asian ethnicity) − 7.2897(if chronic hypertension) − 3.0519(if systemic lupus erythematosus or antiphospholipid syndrome) − 1.6327(if conception by in vitro fertilisation) − 8.1667(if parous with previous PE) + 0.0271988(if parous with previous PE, previous gestation in weeks − 24)^2 − 4.335(if parous with no previous PE) − 4.15137651(if parous with no previous PE, interval between pregnancies in years)^−1 + 9.21473572(if parous with no previous PE, interval between pregnancies in years)^−0.5 − 0.0694096(if no chronic hypertension, weight in kg − 69) − 1.7154(if no chronic hypertension and family history of PE) − 3.3899(if no chronic hypertension and diabetes mellitus type 1 or 2)
4 | Baschat 2014a | Clinical characteristics and biochemical markers | LP = −8.72 + 0.157(if nulliparous) + 0.341(if history of hypertension) + 0.635(if prior PE) + 0.064(MAP) − 0.186(PAPP-A, Ln MoM)
5 | Goetzinger 2010 | Clinical characteristics and biochemical markers | LP = −3.25 + 0.51(if PAPP-A < 10th percentile) + 0.93(if BMI > 25) + 0.94(if chronic hypertension) + 0.97(if diabetes) + 0.61(if African American ethnicity)
6 | Odibo 2011a | Clinical characteristics and biochemical markers | LP = −3.389 − 0.716(PAPP-A, MoM) + 0.05(BMI) + 0.319(if black ethnicity) + 1.57(if history of chronic hypertension)
7 | Odibo 2011b | Clinical characteristics and ultrasound markers | LP = −3.895 − 0.593(mean uterine PI) + 0.944(if pre-gestational diabetes) + 0.059(BMI) + 1.532(if history of chronic hypertension)
8 | Yu 2005a | Clinical characteristics and ultrasound markers | LP = 1.8552 + 5.9228(mean uterine PI)^−2 − 14.4474(mean uterine PI)^−1 − 0.5478(if smoker) + 0.6719(bilateral notch) + 0.0372(age) + 0.4949(if black ethnicity) + 1.5033(if history of PE) − 1.2217(if previous term live birth) + 0.0367(T2 BMI)
9 | Baschat 2014b | Clinical characteristics | LP = −5.803 + 0.302(if history of diabetes) + 0.767(if history of hypertension) + 0.00948(MAP)
10 | Crovetto 2015a | Clinical characteristics | LP = −5.177 + 2.383(if black ethnicity) − 1.105(if nulliparous) + 3.543(if parous with previous PE) + 2.229(if chronic hypertension) + 2.201(if renal disease)
11 | Kuc 2013a | Clinical characteristics | LP = −6.790 − 0.119(maternal height, cm) + 4.8565(maternal weight, Ln kg) + 1.845(if nulliparous) + 0.086(maternal age, years) + 1.353(if smoker)
12 | Plasencia 2007b | Clinical characteristics | LP = −6.431 + 1.680(if Afro-Caribbean ethnicity) + 1.889(if mixed ethnicity) + 2.822(if parous with previous PE)
13 | Poon 2010a | Clinical characteristics | LP = −5.674 + 1.267(if black ethnicity) + 2.193(if history of chronic hypertension) − 1.184(if parous without previous PE) + 1.362(if parous with previous PE) + 1.537(if conceived with ovulation induction)
14 | Scazzocchio 2013a | Clinical characteristics | LP = −7.703 + 0.086(BMI) + 1.708(if chronic hypertension) + 4.033(if renal disease) + 1.931(if parous with previous PE) + 0.005(if parous with no previous PE)
15 | Wright 2015b* | Clinical characteristics | Same as model 3
16 | Poon 2009a | Clinical characteristics and biochemical markers | LP = −6.413 − 3.612(PAPP-A, Ln MoM) + 1.803(if history of chronic hypertension) + 1.564(if black ethnicity) − 1.005(if parous without previous PE) + 1.491(if parous with previous PE)
17 | Yu 2005b | Clinical characteristics and ultrasound markers | LP = −9.81223 + 2.10910(mean uterine PI)^3 − 1.79921(mean uterine PI)^3 + 1.059463(if bilateral notch)
18 | Crovetto 2015b | Clinical characteristics | LP = −5.873 − 0.462(if white ethnicity) + 0.109(BMI) − 0.825(if nulliparous) + 2.726(if parous with previous PE) + 1.956(if chronic hypertension) − 0.575(if smoker)
19 | Kuc 2013b | Clinical characteristics | LP = −14.374 + 2.300(maternal weight, Ln kg) + 1.303(if nulliparous) + 0.068(maternal age, years)
20 | Plasencia 2007c | Clinical characteristics | LP = −6.585 + 1.368(if Afro-Caribbean ethnicity) + 1.311(if mixed ethnicity) + 0.091(BMI) + 0.960(if woman’s mother had PE) − 1.663(if parous without previous PE)
21 | Poon 2010b | Clinical characteristics | LP = −7.860 + 0.034(maternal age, years) + 0.096(BMI) + 1.089(if black ethnicity) + 0.980(if Indian or Pakistani ethnicity) + 1.196(if mixed ethnicity) + 1.070(if woman’s mother had PE) − 1.413(if parous without previous PE) + 0.780(if parous with previous PE)
22 | Scazzocchio 2013b | Clinical characteristics | LP = 6.135 + 2.124(if previous PE) + 1.571(if chronic hypertension) + 0.958(if diabetes) + 1.416(if thrombophilic condition) − 0.487(if multiparous) + 0.093(BMI)
23 | Poon 2009b | Clinical characteristics and biochemical markers | LP = −6.652 − 0.884(PAPP-A, Ln MoM) + 1.127(if family history of PE) + 1.222(if black ethnicity) + 0.936(if Indian or Pakistani ethnicity) + 1.335(if mixed ethnicity) + 0.084(BMI) − 1.255(if parous without previous PE) + 0.818(if parous with previous PE)
24 | Yu 2005c | Clinical characteristics and ultrasound markers | LP = 0.7901 + 5.1473(mean uterine PI)^−2 − 12.5152(mean uterine PI)^−1 − 0.5575(if smoker) + 0.5333(if bilateral notch) + 0.0328(age) + 0.4958(if black ethnicity) + 1.5109(if history of PE) + 1.1556(if previous term live birth) + 0.0378(BMI)
* The model for ‘mean gestational age at delivery with PE’ assumes a normal distribution with the predicted mean gestational age and SD = 6.8833. The risk of delivery with PE is then calculated as the area under the normal curve between 24 weeks and either 42 weeks for any-onset PE (model 3) or 34 weeks for early-onset PE (model 15). For more detail see Wright et al., 2015.

[Figure: Identification of prediction models for validation in IPPIC-UK cohorts]
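To make the tabulated equations concrete, the sketch below shows how a linear predictor (LP) can be turned into a predicted risk for one woman. It assumes the LP is on the log-odds scale, as is usual for logistic regression models, and uses the Poon 2008 coefficients from the table. The second function follows the table footnote for the Wright 2015 models, taking the risk as the area under a normal curve with SD 6.8833. The function names and the example woman are hypothetical.

```python
import math
from statistics import NormalDist


def inv_logit(lp):
    """Convert a log-odds linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-lp))


def poon_2008_lp(bmi, afro_caribbean=0, mother_had_pe=0,
                 parous_no_prev_pe=0, parous_prev_pe=0):
    """Linear predictor for model 2 (Poon 2008); indicators are 0 or 1."""
    return (-6.311
            + 1.299 * afro_caribbean
            + 0.092 * bmi
            + 0.855 * mother_had_pe
            - 1.481 * parous_no_prev_pe
            + 0.933 * parous_prev_pe)


def wright_2015_risk(mean_ga_weeks, upper_weeks, lower_weeks=24.0, sd=6.8833):
    """Risk of delivery with PE between lower_weeks and upper_weeks.

    mean_ga_weeks is the predicted mean gestational age at delivery with PE
    (the model 3 equation); 42 weeks gives any-onset and 34 weeks early-onset risk.
    """
    dist = NormalDist(mu=mean_ga_weeks, sigma=sd)
    return dist.cdf(upper_weeks) - dist.cdf(lower_weeks)


# Hypothetical nulliparous woman, BMI 25, no other risk factors in the model.
lp = poon_2008_lp(bmi=25.0)
risk_poon = inv_logit(lp)

# Hypothetical woman at the reference values of model 3 (mean = intercept).
risk_any = wright_2015_risk(54.3637, upper_weeks=42.0)
risk_early = wright_2015_risk(54.3637, upper_weeks=34.0)
```

Applying an equation in this way to every woman in a validation cohort gives the predicted risks that are then compared with the observed outcomes.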
IPD from 11 cohorts contained within the IPPIC network contained relevant predictors and outcomes that could be used to validate at least one of the 24 prediction models. Four of the 11 validation cohorts were prospective observational studies (Allen 2017, POP, SCOPE, and Velauthar 2012) [
Two of the 11 validation cohorts (2/11, 18%) were classed as having an overall low risk of bias for all three PROBAST domains of participant selection, predictor evaluation, and outcome assessment. Seven (7/11, 64%) had low risk of bias for the participant selection domain, and ten (10/11, 91%) had low risk of bias for predictor assessment, while one had an unclear risk of bias for that domain. For outcome assessment, just under half of the cohorts had low risk of bias (5/11, 45%) and the risk was unclear in the rest (6/11, 55%) (Supplementary Table S7, Additional file
All of the models we validated were developed in unselected populations of high- and low-risk women. About two thirds of the models (63%, 15/24) included only clinical characteristics as predictors [
We validated the predictive performance of each of the 24 included models in at least one and up to eight validation cohorts. The distributions of the linear predictor and the predicted probability are shown for each model and validation cohort in Supplementary Table S8 (Additional file
Two clinical characteristics models (Plasencia 2007a; Poon 2008) with predictors such as ethnicity, family history of pre-eclampsia, and previous history of pre-eclampsia showed reasonable discrimination in validation cohorts with summary

Summary estimates of predictive performance for each model across validation cohorts
Model no. | Type of predictors | Author (year) | No. of validation cohorts | Total no. of women | Total events | C-statistic (95% CI) | Calibration slope (95% CI) | Calibration-in-the-large (95% CI)
1 | Clinical | Plasencia 2007a | 3 | 3257 | 102 | 0.69 (0.53, 0.81) | 0.69 (−0.03, 1.41) | 0.14 (−1.47, 1.76)
2 | Clinical | Poon 2008 | 3 | 3257 | 102 | 0.69 (0.53, 0.81) | 0.72 (−0.03, 1.46) | 0.002 (−1.65, 1.66)
3 | Clinical | Wright 2015a | 3 | 1916 | 76 | 0.62 (0.48, 0.75) | 0.64 (−0.18, 1.47) | 0.95 (−1.13, 3.03)
4 | Clinical and biochemical markers | Baschat 2014a | 2 | 5257 | 287 | 0.71 (0.47, 0.87) | 1.24 (0.00, 2.48) | −0.43 (−14.4, 13.55)
5 | Clinical and biochemical markers | Goetzinger 2010 | 3 | 6811 | 343 | 0.66 (0.30, 0.90) | 1.124 (−0.60, 2.84) | −0.97 (−3.04, 1.11)
6 | Clinical and biochemical markers | Odibo 2011a | 3 | 59,892 | 1774 | 0.72 (0.51, 0.86) | 1.16 (0.24, 2.08) | −0.79 (−2.62, 1.04)
7 | Clinical and ultrasound markers | Odibo 2011b | 1 | 1145 | 28 | 0.53 (0.39, 0.66) | 0.28 (−0.64, 1.20) | −0.52 (−0.91, −0.13)
8 | Clinical and ultrasound markers | Yu 2005a | 1 | 4212 | 273 | 0.61 (0.57, 0.65) | 0.08 (0.01, 0.14) | Not estimable
9 | Clinical | Baschat 2014b | 5 | 22,781 | 204 | 0.68 (0.62, 0.73) | 2.04 (0.56, 3.52) | −0.10 (−1.70, 1.49)
10 | Clinical | Crovetto 2015a | 3# | 6424 | 21 | 0.58 (0.21, 0.88) | 0.64 (−4.01, 5.29) | −0.58 (−4.97, 3.81)
11 | Clinical | Kuc 2013a | 6 | 212,038 | 1449 | 0.66 (0.61, 0.71) | 0.42 (0.29, 0.55) | −4.33 (−5.41, −3.25)
12 | Clinical | Plasencia 2007b | 4# | 6740 | 27 | 0.49 (0.43, 0.55) | 0.51 (−2.05, 3.08) | 0.47 (−0.80, 1.74)
13 | Clinical | Poon 2010a | 3 | 6424 | 21 | 0.64 (0.31, 0.87) | 0.99 (0.02, 1.96) | −1.09 (−4.89, 2.70)
14 | Clinical | Scazzocchio 2013a | 3 | 6424 | 21 | 0.74 (0.37, 0.93) | 0.75 (0.14, 1.36) | −0.70 (−3.89, 2.49)
15 | Clinical | Wright 2015b | 2 | 1332 | 9 | 0.74 (0.04, 1.00) | 0.92 (−4.38, 6.22) | 0.28 (−14.34, 14.90)
16 | Clinical and biochemical markers | Poon 2009a | 1 | 4212 | 10 | 0.74 (0.51, 0.89) | 0.45 (0.21, 0.69) | −2.67 (−3.35, −1.99)
17 | Clinical and ultrasound markers | Yu 2005b | 1 | 4212 | 10 | 0.91 (0.83, 0.95) | 0.56 (0.29, 0.82) | 2.47 (1.72, 3.23)
18 | Clinical | Crovetto 2015b | 5 | 7785 | 384 | 0.63 (0.46, 0.78) | 0.56 (−0.01, 1.13) | −0.05 (−1.65, 1.55)
19 | Clinical | Kuc 2013b | 8 | 213,532 | 5716 | 0.62 (0.57, 0.67) | 0.66 (0.50, 0.82) | −1.91 (−2.24, −1.59)
20 | Clinical | Plasencia 2007c | 3 | 3257 | 90 | 0.67 (0.54, 0.78) | 0.61 (0.04, 1.18) | 0.20 (−1.11, 1.52)
21 | Clinical | Poon 2010b | 3 | 3257 | 90 | 0.65 (0.48, 0.79) | 0.57 (0.08, 1.05) | 0.12 (−1.59, 1.84)
22 | Clinical | Scazzocchio 2013b | 1 | 658 | 26 | 0.60 (0.48, 0.71) | 0.56 (−0.17, 1.29) | 0.52 (0.13, 0.92)
23 | Clinical and biochemical markers | Poon 2009b | 1 | 1045 | 13 | 0.68 (0.55, 0.79) | 0.80 (0.26, 1.34) | −0.35 (−0.90, 0.21)
24 | Clinical and ultrasound markers | Yu 2005c | 1 | 4212 | 263 | 0.61 (0.57, 0.64) | 0.08 (0.05, 0.15) | Not estimable
# Number of validation cohorts is 2 for the calibration slope as it could not be estimated reliably in SCOPE (for models 10 and 12) or POP (for model 12), and was therefore excluded from the meta-analysis.
+ The C-statistic was pooled on the logit scale, therefore
The three models with clinical and biochemical predictors (Baschat 2014a; Goetzinger 2010; Odibo 2011a) showed moderate discrimination (summary

[Figure: Calibration plots for clinical characteristic and biomarker models predicting any-onset pre-eclampsia (cohorts with ≥ 100 events)]
When validated in individual cohorts, the Odibo 2011a model demonstrated better discrimination in the POP cohort of any risk nulliparous women (

Predictive performance statistics for models in the individual IPPIC-UK cohorts with over 100 events
Cohort columns: Sovio 2015 (4212 women) | Stirrup 2015 (54,635 women) | Ayorinde 2016 (136,635 women) | Poston 2006 (2422 women) | Fraser 2013 (14,344 women)
Statistics reported per cohort: C-statistic (95% CI); calibration slope (95% CI); CITL (95% CI)
Each row lists one set of three statistics for every cohort in which the model was validated; blank cells are omitted.
Model no. | Author (year) | Predictor | Statistics
4 | Baschat 2014a | Clinical and biochemical | 0.71 (0.67, 0.74); 1.24 (1.03, 1.44); 0.66 (0.53, 0.78)
5 | Goetzinger 2010 | Clinical and biochemical | 0.76 (0.73, 0.80); 1.71 (1.50, 1.91); −0.07 (−0.20, 0.05)
6 | Odibo 2011a | Clinical and biochemical | 0.78 (0.74, 0.81); 1.49 (1.33, 1.65); −0.03 (−0.16, 0.09) | 0.67 (0.65, 0.69); 0.96 (0.89, 1.04); −0.90 (−0.95, −0.85)
8 | Yu 2005a | Clinical and ultrasound | 0.61 (0.57, 0.65); 0.08 (0.01, 0.14); Not estimable
9 | Baschat 2014b | Clinical | 0.67 (0.63, 0.72); 1.28 (0.90, 1.66); 1.80 (1.63, 1.97)
11 | Kuc 2013a | Clinical | 0.64 (0.59, 0.68); 0.34 (0.23, 0.46); −4.51 (−4.67, −4.35) | 0.68 (0.67, 0.70); 0.47 (0.43, 0.51); −3.39 (−3.45, −3.33)
18 | Crovetto 2015b | Clinical | 0.78 (0.75, 0.81); 1.25 (1.12, 1.38); 1.31 (1.18, 1.44)
19 | Kuc 2013b | Clinical | 0.60 (0.56, 0.64); 0.67 (0.45, 0.89); −1.49 (−1.61, −1.36) | 0.64 (0.62, 0.65); 0.63 (0.56, 0.70); −1.97 (−2.03, −1.92) | 0.84 (0.64, 0.94); 0.75 (0.45, 1.04); −1.44 (−2.09, −0.79) | 0.66 (0.62, 0.70); 0.76 (0.55, 0.97); −1.57 (−1.70, −1.45)
24 | Yu 2005c | Clinical and ultrasound | 0.61 (0.57, 0.64); 0.08 (0.01, 0.15); Not estimable
CITL = Calibration-in-the-large
We then considered the prediction of early-onset pre-eclampsia. The two clinical characteristics models, Baschat 2014b with predictors such as history of diabetes, hypertension, and mean arterial pressure [
The other six models were validated with a combined total of less than 50 events between the cohorts [
Of the five clinical characteristics models, four (Crovetto 2015b, Kuc 2013b, Plasencia 2007c, Poon 2010b) were validated across cohorts. The models showed reasonable discrimination with summary
When validated in the POP cohort of nulliparous women, the Crovetto 2015b model with predictors such as maternal ethnicity, parity, chronic hypertension, smoking status, and previous history of pre-eclampsia showed good discrimination (
Supplementary Table S10 (Additional file
Where it was possible to estimate it, heterogeneity across studies varied from small (e.g. Plasencia 2007a and Poon 2008 models had
We compared the clinical utility of models for any-onset pre-eclampsia in SCOPE (3 models), Allen 2017 (6 models), UPBEAT (4 models), and POP cohorts (3 models) as they allowed us to compare more than one model. Of the three models validated in the POP cohort [

[Figure: Decision curves for models of any-onset pre-eclampsia]
Of the 131 prediction models developed for predicting the risk of pre-eclampsia, only half published the model equation that is necessary for others to externally validate these models, and of those remaining, only 25 included predictors available to us in the datasets of the validation cohorts. One model could not be validated because of too few events in the validation cohorts. In general, models moderately discriminated between women who did and did not develop any-, early-, or late-onset pre-eclampsia. The performance did not appear to vary noticeably according to the type of predictors (clinical characteristics only; additional biochemical or ultrasound markers) or the trimester. Overall calibration of predicted risks was generally suboptimal. In particular, the summary calibration slope was often much less than 1, suggesting that the developed models were overfitted to their development dataset and thus do not transport well to new populations. Even for those with promising summary calibration performance (e.g. summary calibration slopes close to 1 from the meta-analysis), we found large heterogeneity across datasets, indicating that the calibration performance of the models is unlikely to be reliable across all UK settings represented by the validation cohorts. Some models showed promising performance in nulliparous women, but this was not observed in other populations.
To our knowledge, this is the first IPD meta-analysis to externally validate existing prediction models for pre-eclampsia. Our comprehensive search identified over 130 published models, illustrating the desire for risk prediction in this field, but also the confusion about which models are reliable. The global IPPIC Network brought together key researchers involved in this field, and their cohorts provided access to the largest IPD on prediction of pregnancy complications. We evaluated whether any of the identified models demonstrated good predictive performance in the UK health system, both on average and within individual cohorts. Access to raw data meant that we could exclude ineligible women, account for timing of predictor measurement and outcome, and increase the sample size for rare outcomes such as early-onset pre-eclampsia.
We could validate only 24 of the 131 published pre-eclampsia prediction models; we were restricted by poor reporting of published models and by the unavailability, within our IPD, of predictors used in some reported models. It is possible that a better-performing model exists that we were unable to validate. However, the issue of missing predictors may also reflect the limited availability of these predictors in routine clinical practice and the inconvenience of measuring them, highlighting the need for a practical prediction model with easy-to-measure and commonly reported variables [
We limited our validation to UK datasets to reduce the heterogeneity arising from outcome definitions and variations in management. Despite this, considerable heterogeneity in predictive performance often remained. Direct comparison of the prediction models is difficult because different datasets contributed to the validation of each model.
Currently, none of the published models on pre-eclampsia has been recommended for clinical practice. We consider the following issues to contribute to this phenomenon. Firstly, most of the models have never been externally validated, and their performance in other populations is unknown [
Fourthly, many models have been developed by considering them as a ‘screening test’ for pre-eclampsia, similar to the approach used in Down syndrome screening with biomarkers. In addition to the lack of information on multiple of the median (MoM) values in validating cohorts, such an approach has inherent limitations. The models’ performances are reported in terms of detection rate (sensitivity) for a specific false positive rate of 10% [
In the recent ASPRE (Combined Multimarker Screening and Randomized Patient Treatment with Aspirin for Evidence-Based Preeclampsia Prevention) trial [
A clinically useful prediction model should be able to accurately identify women who are at risk of pre-eclampsia in all healthcare settings in which the model will be used. There is no evidence from this IPD meta-analysis that, for the subset of published models we could evaluate, any model is applicable for use across all populations within the UK healthcare setting. In particular, the poor observed calibration and the large heterogeneity across different datasets suggest that these models are not robust enough for widespread use. It is likely that the predictive performance of the models would be improved by recalibration to particular settings and populations, for which local data are needed. Such recalibration may not be feasible in practice.
A major issue is that, based on the subset of models evaluated, existing prediction models in the pre-eclampsia field appear to suffer from calibration slopes < 1 in new data, which is likely to reflect overfitting when developing the model. This is known to be a general problem for the prediction model field in other disease areas [
A pre-eclampsia prediction model with good predictive performance would be beneficial to the UK NHS, but the evidence here suggests that the predictive performance of the 24 models we could validate is generally moderate, with miscalibration and heterogeneity across the UK settings represented by the datasets available. Thus, there is not enough evidence to warrant recommendation for their routine use in clinical practice. Other models exist that we could not validate, which should also be examined in terms of their predictive performance, net benefit, and any heterogeneity across multiple UK settings before consideration for use in practice.
This project was funded by the National Institute for Health Research Health Technology Assessment Programme (ref no: 14/158/02). Kym Snell is funded by the National Institute for Health Research School for Primary Care Research (NIHR SPCR). The UK Medical Research Council and Wellcome (grant ref.: 102215/2/13/2) and the University of Bristol provide core support for ALSPAC. This publication is the work of the authors, and ST, RR, KS, and JA will serve as guarantors for the contents of this paper. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, or the Department of Health and Social Care.
Alex Kwong—University of Bristol; Ary I. Savitri—University Medical Center Utrecht; Kjell Åsmund Salvesen—Norwegian University of Science and Technology; Sohinee Bhattacharya—University of Aberdeen; Cuno S.P.M. Uiterwaal—University Medical Center Utrecht; Annetine C. Staff—University of Oslo; Louise Bjoerkholt Andersen—University of Southern Denmark; Elisa Llurba Olive—Hospital Universitari Vall d’Hebron; Christopher Redman—University of Oxford; George Daskalakis—University of Athens; Maureen Macleod—University of Dundee; Baskaran Thilaganathan—St George’s University of London; Javier Arenas Ramírez—University Hospital de Cabueñes; Jacques Massé—Laval University; Asma Khalil—St George’s University of London; Francois Audibert—Université de Montréal; Per Minor Magnus—Norwegian Institute of Public Health; Anne Karen Jenum—University of Oslo; Ahmet Baschat—Johns Hopkins University School of Medicine; Akihide Ohkuchi—University School of Medicine, Shimotsuke-shi; Fionnuala M. McAuliffe—University College Dublin; Jane West—University of Bristol; Lisa M. Askie—University of Sydney; Fionnuala Mone—University College Dublin; Diane Farrar—Bradford Teaching Hospitals; Peter A. Zimmerman—Päijät-Häme Central Hospital; Luc J.M. Smits—Maastricht University Medical Centre; Catherine Riddell—Better Outcomes Registry & Network (BORN); John C. Kingdom—University of Toronto; Joris van de Post—Academisch Medisch Centrum; Sebastián E. Illanes—University of the Andes; Claudia Holzman—Michigan State University; Sander M.J. van Kuijk—Maastricht University Medical Centre; Lionel Carbillon—Assistance Publique-Hôpitaux de Paris Université; Pia M. 
Villa—University of Helsinki and Helsinki University Hospital; Anne Eskild—University of Oslo; Lucy Chappell—King’s College London; Federico Prefumo—University of Brescia; Luxmi Velauthar—Queen Mary University of London; Paul Seed—King’s College London; Miriam van Oostwaard—IJsselland Hospital; Stefan Verlohren—Charité University Medicine; Lucilla Poston—King’s College London; Enrico Ferrazzi—University of Milan; Christina A. Vinter—University of Southern Denmark; Chie Nagata—National Center for Child Health and Development; Mark Brown—University of New South Wales; Karlijn C. Vollebregt—Academisch Medisch Centrum; Satoru Takeda—Juntendo University; Josje Langenveld—Atrium Medisch Centrum Parkstad; Mariana Widmer—World Health Organization; Shigeru Saito—Osaka University Medical School; Camilla Haavaldsen—Akershus University Hospital; Guillermo Carroli—Centro Rosarino De Estudios Perinatales; Jørn Olsen—Aarhus University; Hans Wolf—Academisch Medisch Centrum; Nelly Zavaleta—Instituto Nacional De Salud; Inge Eisensee—Aarhus University; Patrizia Vergani—University of Milano-Bicocca; Pisake Lumbiganon—Khon Kaen University; Maria Makrides—South Australian Health and Medical Research Institute; Fabio Facchinetti—Università degli Studi di Modena e Reggio Emilia; Evan Sequeira—Aga Khan University; Robert Gibson—University of Adelaide; Sergio Ferrazzani—Università Cattolica del Sacro Cuore; Tiziana Frusca—Università degli Studi di Parma; Jane E. Norman—University of Edinburgh; Ernesto A. Figueiró-Filho—Mount Sinai Hospital; Olav Lapaire—Universitätsspital Basel; Hannele Laivuori—University of Helsinki and Helsinki University Hospital; Jacob A. 
Lykke—Rigshospitalet; Agustin Conde-Agudelo—Eunice Kennedy Shriver National Institute of Child Health and Human Development; Alberto Galindo—Universidad Complutense de Madrid; Alfred Mbah—University of South Florida; Ana Pilar Betran—World Health Organization; Ignacio Herraiz—Universidad Complutense de Madrid; Lill Trogstad—Norwegian Institute of Public Health; Gordon G.S. Smith—Cambridge University; Eric A.P. Steegers—University Hospital Nijmegen; Read Salim—HaEmek Medical Center; Tianhua Huang—North York General Hospital; Annemarijne Adank—Erasmus Medical Centre; Jun Zhang—National Institute of Child Health and Human Development; Wendy S. Meschino—North York General Hospital; Joyce L Browne—University Medical Centre Utrecht; Rebecca E. Allen—Queen Mary University of London; Fabricio Da Silva Costa—University of São Paulo; Kerstin Klipstein-Grobusch —University Medical Centre Utrecht; Caroline A. Crowther—University of Adelaide; Jan Stener Jørgensen—Syddansk Universitet; Jean-Claude Forest—Centre hospitalier universitaire de Québec; Alice R. Rumbold—University of Adelaide; Ben W. Mol—Monash University; Yves Giguère—Laval University; Louise C. Kenny—University of Liverpool; Wessel Ganzevoort—Academisch Medisch Centrum; Anthony O. Odibo—University of South Florida; Jenny Myers—University of Manchester; SeonAe Yeo—University of North Carolina at Chapel Hill; Francois Goffinet—Assistance publique – Hôpitaux de Paris; Lesley McCowan—University of Auckland; Eva Pajkrt—Academisch Medisch Centrum; Bassam G. Haddad—Portland State University; Gustaaf Dekker—University of Adelaide; Emily C. Kleinrouweler—Academisch Medisch Centrum; Édouard LeCarpentier—Centre Hospitalier Intercommunal Creteil; Claire T. Roberts—University of Adelaide; Henk Groen—University Medical Center Groningen; Ragnhild Bergene Skråstad—St Olavs Hospital; Seppo Heinonen—University of Helsinki and Helsinki University Hospital; Kajantie Eero—University of Helsinki and Helsinki University Hospital.
We would like to acknowledge all researchers who contributed data to this IPD meta-analysis, including the original teams involved in the collection of the data, and participants who took part in the research studies. We are extremely grateful to all the families who took part in this study, the midwives for their help in recruiting them, and the whole ALSPAC team, which includes interviewers, computer and laboratory technicians, clerical workers, research scientists, volunteers, managers, receptionists, and nurses.
We are thankful to the members of the Independent Steering Committee, which included Prof. Arri Coomarasamy (Chairperson, University of Birmingham), Dr. Aris Papageorghiou (St George’s University Hospital), Mrs. Ngawai Moss (Katies Team), Prof. Sarosh Rana (University of Chicago), and Dr. Thomas Debray (University Medical Center Utrecht), for their guidance and support throughout the project.
ST, RR, KSK, KGMM, RH, BT, and AK developed the protocol. KS wrote the statistical analysis plan, performed the analysis, produced the first draft of the article, and revised the article. RR oversaw the statistical analyses and analysis plan. MS and CC formatted, harmonised, and cleaned all of the UK datasets in preparation for analysis. JA and MS mapped the variables in the available datasets, and cleaned and quality checked the data. AK contributed to the systematic review and development of the IPPIC Network. JA, ST, and MS undertook the literature searches and study selection, acquired the individual participant data, contributed to the development of all versions of the manuscript, and led the project. BT, AK, LK, LCC, MG, JM, ACS, GCS, WG, HL, AOO, AAB, PTS, FP, FdS, HG, FA, CN, ARR, SH, LMA, LS, CAV, BWM, LP, JAR, JK, GD, DF, PTS, JM, RBS, and CH contributed data to the project and provided input at all stages of the project. LCC, MG, JM, ACS, BWM, GCS, WG, HL, AOO, AAB, PTS, FP, FdSC, HG, FA, CH, CN, ARR, SH, LMA, LJMS, CAV, PMM, PMV, AKJ, LBA, JEN, AO, AE, SB, FMM, AG, IH, LC, KK, SY, and JB provided input into the protocol development and the drafting of the initial manuscript. All authors helped revise the manuscript. All authors read and approved the final manuscript.
The data that support the findings of this study are available from the IPPIC data sharing committee, but restrictions apply to their availability: the data were used under licence for the current study and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the contributing collaborators.
Not applicable. The study involved secondary analysis of existing anonymised data.
Not applicable.
The authors disclose support from NIHR HTA for the submitted work. LCC reports being Chair of the HTA CET Committee from January 2019. AK reports being a member of the NIHR HTA board. BWM reports grants from Merck; personal fees from ObsEva, Merck, and Guerbet; and other support from NHMRC, Guerbet, and Merck, outside the submitted work. GS reports grants and personal fees from GlaxoSmithKline Research and Development Limited, grants from Sera Prognostics Inc., non-financial support from Illumina Inc., and personal fees and non-financial support from Roche Diagnostics Ltd., outside the submitted work. JK reports personal fees from Roche Canada, outside the submitted work. JM reports grants from National Health Research and Development Program, Health and Welfare Canada, during the conduct of the study. JEN reports grants from Chief Scientist Office Scotland and other support from GlaxoSmithKline and Dilafor, outside the submitted work. AG reports personal fees from Roche Diagnostics, outside the submitted work. IH reports personal fees from Roche Diagnostics and Thermo Fisher, outside the submitted work. RR reports personal fees from the BMJ, Roche, and Universities of Leeds, Edinburgh, and Exeter, outside the submitted work.
IPD: Individual participant data
IPPIC: International Prediction of Pregnancy Complications
BMI: Body mass index
MAP: Mean arterial pressure
PAPP-A: Pregnancy-associated plasma protein A
LP: Linear predictor
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.