Early-life and health behaviour influences on lung function in early adulthood

Rationale: Early-life exposures may influence lung function at different stages of the life course. However, the relative importance of characteristics at different stages of infancy and childhood is unclear.
Objectives: To examine the associations and relative importance of early-life events on lung function at age 24 years.
Methods: We followed 7545 children from the Avon Longitudinal Study of Parents and Children from birth to 24 years. Using previous knowledge, we classified an extensive list of putative risk factors for low lung function, covering sociodemographic, environmental, lifestyle and physiological characteristics, according to timing of exposure: 1) demographic, maternal and child; 2) perinatal; 3) postnatal; 4) early childhood; and 5) adolescence characteristics. Lung function measurements (forced vital capacity (FVC), forced expiratory volume in 1 s (FEV1), FEV1/FVC and forced expiratory flow at 25–75% of FVC) were standardised for sex, age and height. The proportion of the remaining variance explained by each characteristic was calculated. The association and relative importance (RI) of each characteristic for each lung function measure were estimated using linear regression, adjusted for other characteristics in the same and previous categories.
Results: Lower maternal perinatal body mass index (BMI), lower birthweight, lower lean mass and higher fat mass in childhood had the largest RI (0.5–7.7%) for decreased FVC. Having no siblings, lower birthweight, lower lean mass and higher fat mass were associated with decreased FEV1 (RI 0.5–4.6%). Higher lean mass and childhood asthma were associated with decreased FEV1/FVC (RI 0.6–0.8%).
Conclusions: Maternal perinatal BMI, birthweight, childhood lean and fat mass and early-onset asthma are the factors in infancy and childhood that have the greatest influence on early-adult lung function.

We restricted our study (N = 7,545) to the core sample participants who had lung function measured at least once, after excluding quadruplets, triplets and one randomly selected child from each twin pair. The study was approved by the ALSPAC Ethics and Law Committee and local research ethics committees. Informed consent for the use of data collected via questionnaires and clinics was obtained from participants following the recommendations of the ALSPAC Ethics and Law Committee at the time.
Data were collected from several sources: self-administered questionnaires sent to mothers at approximately annual intervals from 6 to 198 months (16½ years), and physical examinations carried out during research clinics annually from age 7 to 13 years and at 15, 17 and 24 years. Study data were collected and managed using REDCap electronic data capture tools (2,3) hosted at the University of Bristol. The ALSPAC study website contains details of all data through a fully searchable data dictionary, available at: http://www.bris.ac.uk/alspac/researchers/data-access/data-dictionary/

Investigated characteristics
Factors were categorized according to their nature and timing: (1) demographic, maternal and child characteristics that do not change over time and/or were measured before birth, e.g. gas cooking, maternal asthma or allergy, family financial difficulties; (2) perinatal characteristics, e.g. birthweight, maternal smoking during pregnancy; (3) postnatal characteristics, e.g. maternal smoking during the first year of life, air pollution exposure during the first year of life; (4) early-childhood characteristics, e.g. exposure to second-hand smoke at age 1-8 years, air pollution exposure at age 1-7 years, lean and fat mass at age 9 years, current asthma at age 7.5 years; and (5) adolescence characteristics, e.g. smoking status at age 14 years, pubertal age. Figure 1 shows an overview of the investigated characteristics, and a detailed description is provided in Table 1.

Lean mass and fat mass residuals
To adjust for differences in fat mass between females and males, and to adjust for height, the measures of fat mass and lean mass included in the analyses were calculated as the residuals from a linear regression of each on gender, height, and height squared. The standard deviation of the fat mass residuals was 4.35 kg, approximately double that of the lean mass residuals (1.71 kg). We therefore divided the fat mass residuals by two in subsequent analyses, so that regression coefficients of similar size for fat mass and lean mass reflected associations of similar strength (4).
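The residual calculation described above can be illustrated with a minimal sketch. It assumes NumPy, a 0/1 coding of gender (named `sex` in the code) and synthetic data; the function name `body_mass_residuals` and all numbers are hypothetical and are not taken from the ALSPAC analysis.

```python
import numpy as np

def body_mass_residuals(mass_kg, sex, height_cm):
    """Residuals from an OLS regression of mass on sex, height and height squared."""
    mass_kg = np.asarray(mass_kg, dtype=float)
    sex = np.asarray(sex, dtype=float)
    height_cm = np.asarray(height_cm, dtype=float)
    # Centre height to reduce collinearity between height and height^2;
    # the fitted values and residuals are the same as with uncentred height.
    h = height_cm - height_cm.mean()
    X = np.column_stack([np.ones_like(h), sex, h, h ** 2])
    beta, *_ = np.linalg.lstsq(X, mass_kg, rcond=None)
    return mass_kg - X @ beta

# Synthetic, illustrative data only (assumed values, not ALSPAC measurements).
rng = np.random.default_rng(0)
n = 500
sex = rng.integers(0, 2, size=n)
height_cm = rng.normal(140.0, 6.0, size=n)
fat_kg = 8.0 + 1.5 * sex + 0.15 * (height_cm - 140.0) + rng.normal(0.0, 4.0, size=n)
lean_kg = 25.0 - 1.0 * sex + 0.35 * (height_cm - 140.0) + rng.normal(0.0, 1.7, size=n)

fat_resid = body_mass_residuals(fat_kg, sex, height_cm)
lean_resid = body_mass_residuals(lean_kg, sex, height_cm)

# Halve the fat mass residuals, as in the text, to bring both onto a similar scale.
fat_resid_scaled = fat_resid / 2.0
```

By construction the residuals have zero mean and are uncorrelated with gender, height and height squared, so they carry only the part of fat or lean mass not explained by those variables.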

Statistical analysis
Dealing with missing data: To assess whether missing values of lung function at age 24 years (outcomes) could plausibly be imputed using information from earlier lung function measurements, linear correlation coefficients between SD-scores of lung function measured at different ages were examined (Table 2). The layout of missing data for the investigated characteristics and the lung function outcomes at ages 8, 15 and 24 years is shown in Figure S1.
Among the study population (N = 7,545), there were small amounts of missing data for all stages of factors (Figure S1). The proportion missing varied from none, e.g. for pre-term delivery and maternal age at delivery, to 38% for smoking status at age 14 years. The skin prick test (SPT) was performed only in a randomly selected sub-sample, so its proportion of missing data was relatively high (27%).
To increase power and minimize selection bias, multiple imputation by chained equations was performed to impute missing data among our study population (5). In our imputation models, we included all lung function measures at ages 8, 15 and 24 years, exposures, potential confounders, and additional variables that might be predictive of missingness or of the missing values themselves. These included all characteristics of interest as presented in Table 1, smoking status at ages 16, 18, 20, 22 and 23 years, current asthma at ages 9, 11, 13, 14 and 15, immunoglobulin-E blood test and maternal ever caesarean section delivery.
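As a rough illustration of imputation by chained equations, the sketch below cycles through the variables and re-imputes each one from a linear regression on all the others. It is a simplified stand-in and not the procedure used in the study: it assumes NumPy, treats every variable as continuous, and does not draw the regression parameters from their posterior distribution as a full implementation would. The function name `chained_imputation` and the synthetic data are hypothetical.

```python
import numpy as np

def chained_imputation(data, n_cycles=10, rng=None):
    """Return one imputed copy of `data` (2-D array, NaN = missing)."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    filled = data.copy()
    # Initialise missing cells with the observed mean of their column.
    col_means = np.nanmean(data, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    n_rows, n_cols = data.shape
    for _ in range(n_cycles):
        for j in range(n_cols):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            obs_j = ~miss_j
            # Regress variable j on all other (currently filled-in) variables.
            others = np.delete(filled, j, axis=1)
            X = np.column_stack([np.ones(n_rows), others])
            beta, *_ = np.linalg.lstsq(X[obs_j], filled[obs_j, j], rcond=None)
            resid = filled[obs_j, j] - X[obs_j] @ beta
            dof = max(int(obs_j.sum()) - X.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Prediction plus residual noise, so imputed values keep a realistic spread.
            noise = rng.normal(0.0, sigma, int(miss_j.sum()))
            filled[miss_j, j] = X[miss_j] @ beta + noise
    return filled

# Synthetic example (assumed values): three correlated variables, about 20% missing.
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
full = np.column_stack([
    x,
    0.8 * x + rng.normal(scale=0.5, size=n),
    -0.5 * x + rng.normal(scale=0.7, size=n),
])
with_gaps = full.copy()
with_gaps[rng.random(full.shape) < 0.2] = np.nan

# Several imputed datasets, each produced by a fixed number of cycles.
imputed_sets = [chained_imputation(with_gaps, n_cycles=10, rng=rng) for _ in range(20)]
```

Observed values are never altered; only the missing cells differ between the imputed copies, and that between-copy variation is what carries the uncertainty due to missingness.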
We generated 20 imputed datasets using 10 cycles of regression switching (5). These datasets were then used for the main analyses. Since the estimates of association given by the regression models are assumed to be normally distributed, we aggregated the findings across the imputed datasets using Rubin's rules (6) and obtained 95% confidence intervals for each characteristic's association from the pooled means and pooled standard errors of the estimated coefficients. Rubin's rules give the average of the individual coefficients as the combined estimate of the size of an association, and a total variance that combines the average within-imputation variance with the between-imputation variance as the variance of that estimate.
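The pooling step can be sketched in a few lines. This example uses the standard form of Rubin's rules, in which the total variance is W + (1 + 1/m)B for m imputed datasets, with W the mean within-imputation variance and B the between-imputation variance. The function name `pool_rubin`, the normal quantile 1.96 for the confidence interval and the input numbers are assumptions made for illustration.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)                       # average of the m coefficients
    within = mean([se ** 2 for se in std_errors])  # mean within-imputation variance (W)
    between = variance(estimates)                  # between-imputation variance (B), divisor m - 1
    total = within + (1 + 1 / m) * between         # total variance
    pooled_se = math.sqrt(total)
    # Normal-approximation 95% confidence interval (an assumption of this sketch).
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci

# Hypothetical coefficients and standard errors from five imputed datasets.
estimates = [0.10, 0.12, 0.11, 0.09, 0.13]
std_errors = [0.02, 0.02, 0.02, 0.02, 0.02]
pooled, pooled_se, ci = pool_rubin(estimates, std_errors)
```

The pooled standard error is always at least as large as the average within-imputation standard error, because the between-imputation term adds the uncertainty arising from the missing data.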
The relative importance (RI) of influences on each lung function outcome was assessed using the Lindeman, Merenda, and Gold (LMG) method (7). The LMG method decomposes the explained variance, R², of a given model and estimates the incremental R², defined as the partial contribution to the total R², attributed to the characteristic of interest. The incremental R² for a characteristic can depend on the order in which its variable is entered into the model, particularly when the variables are correlated: it is typically larger when the variable is entered first and smaller when it is entered last. The LMG method derives the RI of each characteristic by averaging its incremental R² over all possible orderings of the characteristics in the same stage as the characteristic of interest. Thus, the RI values of the factors included in a model sum to its R². The procedures for calculating RI were implemented using the 'relaimpo' R package (8).
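The order-averaging idea behind the LMG method can be shown with a brute-force sketch. It is not the 'relaimpo' implementation: it assumes NumPy, enumerates every ordering explicitly (feasible only for a handful of predictors) and ignores the adjustment for variables retained from earlier stages. The function names `r_squared` and `lmg_importance` and the synthetic data are hypothetical.

```python
import numpy as np
from itertools import permutations

def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the listed columns of X."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    centred = y - y.mean()
    return 1.0 - (resid @ resid) / (centred @ centred)

def lmg_importance(X, y):
    """Average incremental R^2 of each predictor over all entry orderings."""
    p = X.shape[1]
    shares = np.zeros(p)
    orderings = list(permutations(range(p)))
    for order in orderings:
        entered = []
        r2_prev = 0.0
        for var in order:
            entered.append(var)
            r2_now = r_squared(X, y, entered)
            shares[var] += r2_now - r2_prev  # incremental R^2 for this ordering
            r2_prev = r2_now
    return shares / len(orderings)

# Synthetic example (assumed values): x1 and x2 are correlated, x3 is independent.
rng = np.random.default_rng(42)
n = 400
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 0.5 * x1 + 0.3 * x2 + 0.1 * x3 + rng.normal(size=n)

ri = lmg_importance(X, y)
full_r2 = r_squared(X, y, [0, 1, 2])
```

Within any single ordering the incremental R² values add up to the full-model R², so their averages over orderings do too; this is why the RI values of a model sum to its R².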

Results
Table S1 reports characteristics of the study population using the observed and multiply imputed datasets. The summary statistics for most characteristics were similar in imputed and observed data because of their large proportions of observed data (≥ 90%) and the large size of the study population (N = 7,545). For characteristics with lower proportions of observed data, including smoking status at age 14 years (62%), current asthma at age 7.5 years (73%), skin prick test at age 7.5 years (73%), air pollution during 1-7 years of age (81%) and second-hand smoke exposure during age 1-8 years, the differences in summary statistics between observed and imputed data were small: −0.6%, −2.3%, −0.01%, 0.06 μg/m³ and 1.7%, respectively. Lung function, age and height distributions at the 24-year clinic were similar in observed and imputed data (Table S2). The medians and interquartile ranges (IQRs) of height and age were identical in the observed and imputed datasets.
Table S1. Characteristics of study population using the observed and imputed datasets.
Abbreviations: Adoles. = adolescence characteristics; FEV1 = forced expiratory volume in one second; Inc. R² = incremental R² for variables in the corresponding stage; RI = relative importance (proportion of explained variation in lung function attributed to each variable, averaging over all its possible orderings among characteristics in the same stage); Retained R² = total R² for retained variables (with P-value ≤ 0.10) from previous stages and the corresponding stage; kg = kilogram; m = metre; μg = microgram; cm = centimetre.
* Adjusted for all variables in the same stage in addition to characteristics from previous stages that yield P-value ≤ 0.10.
Table S8. Adjusted association and relative importance of early-life characteristics with SD scores (adjusted for sex, age and height) of FEV1/FVC measurements (non-imputed) at age 24 years (N=2800).
Table S10. Crude associations of early-life characteristics with SD scores of FVC (scores adjusted for sex, age and height) at age 24 years (N=7545).
Table S11. Crude associations of early-life characteristics with SD scores of FEV1 (scores adjusted for sex, age and height) at age 24 years (N=7545).
* Educated to the General Certificate of Education level (school-leaving certificate) or lower.
Figure S1. Layout of missing data among the study population (N=7,545), with the percentage of observed data shown above each characteristic's column.
Figure S2. Circular plot of characteristics' associations (point estimates and 95% confidence intervals) with measured (non-imputed) lung function parameters at age 24 years (N=2800). The raw data used for generating this plot are reported in Tables S6-S9.
Figure S3. Circular plot of characteristics' relative importance (RI) for measured (non-imputed) lung function parameters at age 24 years (N=2800). Associations with higher and lower lung function are highlighted in black and grey, respectively. Bar height represents the level of RI, expressed in %, except for characteristics with RI > 1%, for which the exact RI values are displayed on the corresponding bars.