Summary
Chronic obstructive pulmonary disease (COPD) is the third most common cause of mortality worldwide and it is important to discover whether risk factors can be identified from studies undertaken in childhood.
Numerous longitudinal cohort studies have been developed in many parts of the world to better understand the long-term outcomes of chronic respiratory diseases. Using the data they have generated, it should be possible to identify specific risk factors in children and to develop a model that ranks their importance, with a view to reducing the prevalence and/or severity of disease in adults. However, this requires data sharing within the field, as already happens in related fields through resources such as the Virtual International Stroke Trial Archive (www.vista.gla.ac.uk). Pooling of the raw data could be very informative, and an organisation such as the European Respiratory Society could play an important role in ensuring this happens.
Unfortunately, cohort studies vary widely in their inclusion criteria, their methodology and the format in which lung function data are presented. The raw data required to develop a model to assess the impact of childhood risk factors on future lung function have not been made available from many of the published articles.
Our initial belief that recognised risk factors are independent variables was naïve and a different approach is required to better understand their interdependence.
The work of Barker and co-workers [1, 2] in Southampton, UK, linked death from chronic obstructive pulmonary disease (COPD) in adulthood to low birth weight and childhood pneumonias (see fig. 1). They suggested that promotion of fetal lung growth and reduction of infantile respiratory infections may help to reduce the incidence of COPD in the next generation. This led to a large body of research investigating risk factors in childhood that have a negative effect on future respiratory health. For example, mothers who smoke during pregnancy are more likely to deliver babies whose lung function is lower than that of babies born to mothers who do not smoke [3]. These differences, however, diminish with time. Many other factors have now been identified as possible causes of damage to the developing human lung. The main environmental factors that have been studied are very early birth [4], reduced placental blood supply resulting in “small for date” babies [5], poor antenatal and postnatal nutrition [6], atmospheric pollution [7], severe childhood asthma [8], infantile respiratory infections [9], obesity [10], bronchial hyperreactivity [11], and multiple socioeconomic factors [12]. Genetic factors have also been implicated [13].
We wanted to find out whether our knowledge and understanding have progressed since Barker and co-workers [1, 2] published their seminal findings, and whether it is now possible to identify targetable factors that might improve the respiratory health of future generations, at least in terms of prevalence and severity of COPD. In an attempt to answer this question, we undertook a systematic review of the medical literature in relation to this topic. We concentrated on longitudinal cohort studies that have been established in many parts of the world over the last half century. We deliberately excluded cross-sectional studies because single-point values of lung function are unhelpful in assessing whether a specific risk factor has any long-term causative effect.
Having identified relevant longitudinal cohort studies (some commencing at birth, others commencing in early childhood), we hoped to develop a model based on anonymised data that would predict future lung function related to the early-life exposure to specific, recognised risk factors. The full details of our study methodology and findings will be published elsewhere. The purpose of this article is to offer educational insight and ideas to others wishing to systematically investigate longitudinal risk factors for COPD or, indeed, for any other chronic respiratory disease found in the adult population.
Getting started
Our methodology was very similar to that in the recently published Breathe article entitled “How to make sense of a Cochrane systematic review” [14]. For the reasons stated earlier, we decided to concentrate solely on longitudinal studies commencing at birth or in the first few years of life and to exclude cross-sectional studies.
Our outline search revealed over 27 000 citations. After removal of duplicates and screening for relevance, 185 were chosen for full review. 32 met the inclusion criteria of:
- documentation of a specific risk factor in early life;
- the recording of lung function on more than one occasion; and
- the presence of a control/comparator group that was not exposed to the identified risk factor.
Lung function measurement: the challenges
“When you can measure what you are speaking about, and express it in numbers, you know something about it”.
Lord Kelvin [15]
On analysing the aforementioned 32 papers, forced expiratory volume in 1 s (FEV1) was the most frequently recorded spirometric parameter. 18 studies recorded lung function as FEV1 % predicted and, as such, enabled comparison between the papers in relation to specific risk factors. Of the other 14 papers, six recorded data as raw FEV1 values [16], two used logistic regression to derive FEV1 z-scores [17], five used indices other than FEV1 and one recorded observed to expected lung function [18]. We therefore could not include the lung function data from these 14 studies in our analysis in the form in which they were published.
This highlights one important difference in study design between different cohort studies, specifically the difference in the format in which lung function data are presented. Without the raw data being made available, meaningful comparison between these studies is not possible. This point has already been made in a review in a previous issue of Breathe [19].
We wanted to use the data from all 32 studies, so we wrote to the authors of the aforementioned 14 articles asking if they would consider sending us their raw data to enable us to calculate all lung function measurements as FEV1 % predicted. Unfortunately, only one of the 14 research groups was able to provide us with the data we required in a form that was helpful. This meant that it was not possible to merge the data from the other 13 disparate studies, thereby preventing the possible development of a single, unified model for the identification of risk factors for COPD. To overcome this limitation, it would be extremely helpful if all researchers agreed on the same lung function index to be used in all future studies and that all journal editors adhered to that same parameter as well.
A further problem with many of the aforementioned studies is a lack of a standardised time-point around which lung function measurements are taken. If the appropriate raw data were available, then it might be possible to use regression methods to estimate a harmonised data set. In the absence of individual data sets being publicly available, however, even in a secondary anonymised form, it is unlikely that anyone will be in any position to use all of the data in an effective way to inform the development of an explanatory model of COPD.
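As a rough illustration of what such a regression-based harmonisation might look like, the sketch below fits a straight line to one child's repeated measurements and estimates the value at a common target age. It assumes that lung function (here FEV1 % predicted) changes approximately linearly with age over the observed range, which is a strong simplification; a real analysis would probably need a more flexible model. The function names and the example values are hypothetical and are not taken from any of the cohort studies.

```python
def fit_line(ages, values):
    """Ordinary least-squares fit of values on ages; returns (slope, intercept)."""
    n = len(ages)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(ages) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    if sxx == 0:
        raise ValueError("all measurements were taken at the same age")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_at_age(ages, values, target_age):
    """Estimate the measurement at target_age from the fitted line."""
    slope, intercept = fit_line(ages, values)
    return intercept + slope * target_age


# Invented example: one child measured at 6, 9 and 12 years,
# harmonised to a common time-point of 10 years.
ages = [6.0, 9.0, 12.0]
fev1_pct_pred = [90.0, 88.0, 86.0]
print(round(estimate_at_age(ages, fev1_pct_pred, 10.0), 2))
```

Applied to every participant in every cohort, an approach of this kind would give each study an estimate at the same age, but only if the individual-level measurements and the ages at which they were taken are available.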
Another interesting challenge is to determine the relationship between the FEV1 measurements at school age and older and the early-life measurements of lung capacity (such as FEV0.5, FEV0.75 or maximal flow at functional residual capacity). This is crucial for model development. One would normally expect a correlation between the early-life and the later-life measurements of lung capacity if the number of alveoli remains constant. However, recent evidence suggests that the number of alveoli continues to increase beyond childhood and into adolescence [20]. This means that any measure of lung capacity will need to be adjusted for the number of alveoli present.
Given that the majority of cohort studies to date have presented data as FEV1, this could continue to be the preferred measurement. When presenting it, it is important to use global multiethnic reference equations that span all ages, such as those of the Global Lung Initiative [21]. Although the most widely used and understood way of presenting FEV1 data is % predicted, it may be more appropriate to use z-scores [22].
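As a minimal sketch of how the two presentations relate, the example below expresses the same FEV1 measurement as % predicted and as a z-score. It assumes reference equations of the LMS (lambda-mu-sigma) form, in which M is the predicted median, S the coefficient of variation and L the skewness for a given age, height, sex and ethnic group. The L, M and S values used here are invented for illustration and are not taken from any published reference equation; the function names are likewise hypothetical.

```python
import math


def percent_predicted(measured, m):
    """Measured value as a percentage of the predicted median M."""
    return 100.0 * measured / m


def lms_z_score(measured, l, m, s):
    """z-score under the LMS method; uses the log form when L is zero."""
    if l == 0:
        return math.log(measured / m) / s
    return ((measured / m) ** l - 1.0) / (l * s)


# Invented values for illustration only (not real reference coefficients).
fev1_litres = 2.40
L_skew, M_median, S_cv = 1.0, 3.00, 0.12

print(round(percent_predicted(fev1_litres, M_median), 1))
print(round(lms_z_score(fev1_litres, L_skew, M_median, S_cv), 2))
```

Because the spread S is not the same at every age, the same % predicted value can correspond to different z-scores in a young child and in an adult, which is one reason z-scores may be the more appropriate presentation.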
Other spirometry issues
A spirometer quantifies volume and flow, exhaled and/or inhaled, by transduction of the flow rate. The most commonly used measure is FEV1, which can be adjusted for sex, age and height, and reported relative to a reference value as FEV1 % predicted or as a z-score. The reference values can be informed by different national or international standards. The ratio of FEV1 to forced vital capacity (FVC) is sometimes used to support a diagnosis of COPD. It has been suggested the diagnosis requires a ratio ≤0.7 [21]. These adjustments and variations can all lead to inhomogeneity. There are now standard operating procedures for the measurement of lung function in children, which attempt to limit variation when measurements are taken at different centres [23]. These procedures were not available when many of the cohort studies started collecting data, providing a potential source of error in the results of these studies.
Factors causing errors in FEV1 measurement are:
- poor spirometric technique;
- the testing position (standing or sitting);
- variability in formulas used to convert raw FEV1 data into FEV1 % predicted or z-scores; and
- disagreement that 0.7 is the cut-off value of FEV1/FVC for COPD diagnosis [21].
Study designs
Cohort studies
The risk factors identified from the 18 cohort studies could be broadly grouped into early childhood respiratory infection, bronchial hyperresponsiveness (BHR)/airway lability, wheeze, family history of atopy or asthma, a childhood diagnosis of asthma, respiratory symptoms, prematurity/low birth weight, atopy and prenatal/postnatal exposure to tobacco smoke. We were eventually able to extract data on 54 combinations of risk factors.
We originally expected that birth cohort and other longitudinal study designs would provide all the data we required to develop explanatory models of reduced lung function as a marker for COPD. Our attempt to develop such a model has failed. One reason for this was that none of the existing birth cohort/longitudinal studies had fully captured the complexity of the independent variables that could affect lung volume in later life. An example is early childhood infection. On its own, bronchiolitis is associated with future respiratory disease but its prevalence and severity can be affected by other factors such as prematurity, social deprivation, atmospheric pollution and secondary tobacco smoke exposure. In turn, prematurity can be affected by maternal diet and smoking, socioeconomic status and genetic factors. These interactions mean it is difficult to quantify the effect of a single childhood risk factor on future respiratory health.
We have been unable to confirm whether there is a clear cause/effect relationship between one specific risk factor and reduced lung function as a marker for COPD. Of particular importance is the observation that those with reduced lung capacity in later childhood and in adult life are also those most likely to have had reduced lung capacity in earlier life (the so-called tracking effect). In a Norwegian cohort study, reduced lung function at 10 years of age was associated with respiratory infection in infancy, but when lung function prior to the respiratory infection was taken into account, it could be seen that those with infections in infancy had always exhibited lower lung function (K-H. Carlsen, Oslo University Hospital, Oslo, Norway; personal communication). Similarly, in our in-depth analysis of risk factors in the 18 cohort studies using FEV1 % predicted, we identified two statistically significant factors: respiratory infections in infancy and BHR measured in infancy. As we have discussed earlier, however, were these truly independent variables?
Any model that is developed is only as good as the variables that have been recorded to inform the modelling process. Most cohort studies have tended to limit the numbers of independent variables measured for obvious reasons. Factors such as costs, convenience, burden on patients and availability of technology are all likely to inform which and how many variables to select. Therefore, it is highly unlikely that any one individual study will provide all the information needed to develop an appropriate model. Conducting a meta-analysis of the data from multiple birth cohort studies is likely to give greater insight but as authors are unable or unwilling to share full data sets, this is not likely to happen in the near future. The European Respiratory Society (ERS) has an excellent record of developing task forces to investigate specific issues using an in-depth analytical process. Perhaps ERS could consider assimilating all the raw data from longitudinal respiratory cohort studies, as this could be a useful way of providing more comprehensive data on childhood risk factors in the development of COPD in adults.
Acknowledgements
We thank GlaxoSmithKline for a monetary grant (CRT116809) to support the study and the North Staffordshire Breath of Life Charity for a further financial donation.
Footnotes
Statement of Interest
GlaxoSmithKline funded the original study discussed in this article through an educational grant, but had no input into the process or writing of this review.
©ERS 2014
Breathe articles are open access and distributed under the terms of the Creative Commons Attribution Non-Commercial Licence 4.0.