Demographic factors associated with within-individual variability of lung function for adults with cystic fibrosis: A UK registry study

Highlights
• We quantify the association between demographic factors and lung function mean and within-individual variability in adults with cystic fibrosis.
• There is additional heterogeneity between individuals which is not explained by the main demographic factors considered.
• The year of birth and the age at annual review have a nonlinear association with lung function variability.
• Mixed-effects location-scale models provide a flexible alternative to standard linear mixed models in biomedical applications.

Supplementary material for the article "Demographic factors associated with within-individual variability of lung function for adults with cystic fibrosis: a UK registry study"

A Statistical model
The final specification of the model comprises a mean submodel, a variability submodel (with $\varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon_{ij}})$), and a distribution for the random effects. The standard mixed model with random intercept is a particular case of MELSM in which $\sigma^2_{\varepsilon_{ij}}$ is constant for all individuals at all measurements. In that case, $\upsilon_i$ is the only random effect in the model, accounting for the individual departure from the population average.
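As a reading aid, the three components can be written in a generic MELSM form. This is a sketch, not the authors' full specification: the covariate vectors $\mathbf{x}_{ij}$ and $\mathbf{z}_{ij}$, the coefficient vectors $\boldsymbol{\beta}$ and $\boldsymbol{\tau}$, the covariance matrix $\boldsymbol{\Sigma}$ and the symbol $\sigma_{\upsilon}$ are assumed notation, the smooth terms are omitted, and the log link on the standard deviation is inferred from the statement that the exponential of the scale random effect rescales the FEV1 standard deviation.

```latex
\begin{align*}
y_{ij} &= \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + \upsilon_i + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N\!\left(0, \sigma^2_{\varepsilon_{ij}}\right), \\
\log \sigma_{\varepsilon_{ij}} &= \mathbf{z}_{ij}^{\top}\boldsymbol{\tau} + \omega_i, \\
(\upsilon_i, \omega_i)^{\top} &\sim N\!\left(\mathbf{0}, \boldsymbol{\Sigma}\right),
  \qquad \boldsymbol{\Sigma} =
  \operatorname{diag}(\sigma_{\upsilon}, \sigma_{\omega})\, R\,
  \operatorname{diag}(\sigma_{\upsilon}, \sigma_{\omega}).
\end{align*}
```

In this notation, fixing $\omega_i = 0$ for every individual and reducing $\mathbf{z}_{ij}^{\top}\boldsymbol{\tau}$ to a single intercept gives a constant residual variance, which is the standard random-intercept mixed model.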
The use of random effects makes the interpretation of MELSM similar to that of standard mixed models. In the mean submodel, the estimated coefficients quantify the association between the covariates and lung function, keeping the other variables constant. The random effect in the mean submodel captures an additional shift in the individual's FEV1 relative to other individuals with the same covariate values. For example, if two hypothetical individuals share the same covariate values but the location random effect for one is 5 units higher than for the other, then the mean FEV1 for the first will be 5 units higher than for the second.
Similarly, the exponential of the predicted scale random effect captures a subject-specific scaling factor (in terms of "inflation/deflation") of the FEV1 standard deviation in a population with the same covariate values. For example, if the same two hypothetical individuals share the same covariate values but the scale random effect for one is 0.7 higher than for the other, then the FEV1 standard deviation for the first would be $e^{0.7} \approx 2$ times the standard deviation of the second. The predicted scale random effect value provides the individual component (in addition to the common component determined by the covariate values) of the within-individual variability.
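The scaling interpretation can be checked with a small simulation. The sketch below assumes a log link on the residual standard deviation; the shared mean, the shared log standard deviation and the function name `simulate_fev1` are hypothetical values and names chosen only for illustration, not estimates from the registry data.

```python
import math
import random
import statistics


def simulate_fev1(mean, log_sd_common, scale_effect, n, rng):
    """Draw n measurements with SD = exp(log_sd_common + scale_effect)."""
    sd = math.exp(log_sd_common + scale_effect)
    return [rng.gauss(mean, sd) for _ in range(n)]


rng = random.Random(1)

# Hypothetical shared (fixed-effect) components for two individuals.
common_mean = 2.0
log_sd_common = math.log(0.2)

# Same covariates; the scale random effects differ by 0.7.
person_a = simulate_fev1(common_mean, log_sd_common, 0.0, 20000, rng)
person_b = simulate_fev1(common_mean, log_sd_common, 0.7, 20000, rng)

sd_ratio_theory = math.exp(0.7)
sd_ratio_sim = statistics.stdev(person_b) / statistics.stdev(person_a)

print(f"theoretical SD ratio: {sd_ratio_theory:.3f}")
print(f"simulated SD ratio:   {sd_ratio_sim:.3f}")
```

Both individuals have the same mean, so the difference in scale random effect changes only the spread of their measurements.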
Prior predictive checks [20] led to the following generic informative priors:
• normal prior for linear fixed effects (all centered at 0 except the one for the intercept of the mean submodel, centered at 2)
• Exp(10) for variances of random effects and smoothing terms (which are specified in the model as random effects)
• Lewandowski-Kurowicka-Joe LKJ(2) prior for the random effect correlation matrix R.
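To illustrate what a prior predictive check does with priors of this kind, the sketch below draws individual-level mean FEV1 values from the priors alone. It is a simplified illustration under stated assumptions: the prior scale of the intercept is not given in the text, so `INTERCEPT_PRIOR_SD = 1.0` is a placeholder; Exp(10) is read as an exponential distribution with rate 10; and covariates, smooth terms and the scale submodel are left out.

```python
import math
import random
import statistics

rng = random.Random(2024)

# Assumption: the prior is centered at 2, but its scale is not stated.
INTERCEPT_PRIOR_SD = 1.0
# Assumption: Exp(10) means an exponential with rate 10 (mean 0.1).
EXP_RATE = 10.0


def draw_individual_mean():
    """One prior draw of an individual's mean FEV1 (intercept + random intercept)."""
    intercept = rng.gauss(2.0, INTERCEPT_PRIOR_SD)
    re_variance = rng.expovariate(EXP_RATE)
    random_intercept = rng.gauss(0.0, math.sqrt(re_variance))
    return intercept + random_intercept


draws = [draw_individual_mean() for _ in range(10000)]
prior_mean = statistics.fmean(draws)
prior_sd = statistics.stdev(draws)

print(f"prior predictive mean: {prior_mean:.2f}")
print(f"prior predictive SD:   {prior_sd:.2f}")
```

If the simulated values fall far outside the plausible range of the outcome, the priors would be tightened or recentered before fitting.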
The specification of the age-sex interaction in the package brms [16] is based on the package mgcv [21]. The factor-smooth interaction in mgcv specifies a different smoothing function for each sex with a centering constraint with respect to the model intercept. Therefore, to account for sex mean differences, a main parametric term for sex is also included. More details are available at https://stat.ethz.ch/R-manual/Rdevel/library/mgcv/html/gam.models.html.

B.1 Results about interactions, predictions and random effects
The parametric term for sex (accounting for the average difference between male and female smooth functions) is equal to 0.10, with 95% credible interval (0.08; 0.12), in the mean submodel. The parameters of the distribution of the random effect matrix are also estimated (Table 2). For the mean submodel, the standard deviation of the random intercept is equal to 1.02. This means that if we consider a normal distribution with mean zero and this standard deviation, 95% of the individuals will get a predicted random effect in the range of (plus or minus […] random effects remain closer to zero. The model picks up more information about the subject-specific deviation from common (fixed-effects) variability when the number of reviews increases.
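The 95% range implied by a zero-mean normal distribution follows directly from the random-intercept standard deviation of 1.02 reported in this subsection. The short calculation below is an arithmetic illustration only; the variable names are hypothetical.

```python
from statistics import NormalDist

sd_location = 1.02  # reported SD of the random intercept (mean submodel)

# Two-sided 95% range of a zero-mean normal: +/- z * SD.
z = NormalDist().inv_cdf(0.975)
half_width = z * sd_location

print(f"z = {z:.3f}, 95% range = +/- {half_width:.2f}")
```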

B.2 Model diagnostics
The Gelman-Rubin ratio convergence diagnostic for all the parameters ranged between 1 and 1.05, indicating that convergence has been reached. As a graphical assessment of quality of fit, the simulated posterior densities are plotted against the observed FEV1 distribution (Figure S.8). This procedure is an example of a posterior predictive check [20], which aims at evaluating in a qualitative way whether the model can generate data that resemble the observed data. The overall shape is mostly captured, although some discrepancy occurs in the lower part of the domain. A model with an additional parameter for skewness (i.e. with a skew-normal distribution with the same specification for mean and variance) did not improve the fit (results not reported here).
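For readers unfamiliar with the Gelman-Rubin ratio, the sketch below computes its classic form (the potential scale reduction factor) for one parameter from several equal-length chains. It is an illustration on simulated chains, not the computation used for the reported values; current Bayesian software typically reports refined variants (for example split-chain versions), and the function name `gelman_rubin` is hypothetical.

```python
import math
import random
import statistics


def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


rng = random.Random(7)

# Four chains sampling the same distribution: ratio close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains centered at different values: ratio clearly above 1.05.
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)]
         for shift in (0.0, 0.5, 1.0, 1.5)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)

print(f"well-mixed chains:  {rhat_mixed:.3f}")
print(f"non-mixing chains:  {rhat_stuck:.3f}")
```

The ratio compares the pooled variance estimate with the average within-chain variance, so it approaches 1 when the chains agree with each other.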
Figure S.1: Diagram of inclusion criteria with number of individuals N and number of observations n.

Figure S.2: Histogram of individuals by number of annual reviews.

Figure S.5: Posterior density of the standard deviation of the scale random effect $\sigma_{\omega}$.
Figure S.8: Posterior predictive checks. The dark blue line represents the density of the outcome (FEV1) in the dataset, while the lighter lines are 10 simulated posterior distributions generated from the model.