Abstract
The coordinated regulation of gene expression is crucial for survival, especially in multi-cellular organisms. Gene regulation can occur through a number of different mechanisms, which include the binding of transcription factors to gene promoters and enhancers and to gene repressors.
Overlaid upon this is the epigenetic (or “above” genetics) regulation of gene expression. Epigenetics has been implicated in the determination of cell differentiation and the control of gene expression by each cell under different external stimuli. Epigenetic mechanisms include DNA methylation, which is principally involved in gene silencing and plays a key role in maintaining cellular differentiation. Another layer of epigenetic regulation is DNA packaging into chromatin, which can alter the availability of the DNA and is controlled by histone modifications. Finally, non-coding RNAs can also affect the stability of coding mRNA and its ability to interact with ribosomes and be translated into protein.
These epigenetic mechanisms are heritable and maintained through multiple cell divisions, helping to control cell fate, and can even be passed on to germ cells and future generations. In addition to being inherited, epigenetic marks can be altered by the environment, and factors such as pollution and cigarette smoking have been shown to alter the epigenetic profile of cells.
The role of epigenetics in controlling gene expression in complex organs, such as the lungs, is a promising area of research and may help to explain complex inheritance patterns and environmental interactions of many lung diseases including asthma, COPD and lung cancer.
Introduction
The last decade saw the genetic age come to fruition with very large genome-wide association studies of many complex diseases including some respiratory diseases [1]. Although these studies provided new unforeseen targets which implicate novel pathways or proteins in asthma, lung cancer and chronic obstructive pulmonary disease (COPD) for example, they did not account for all the heritability of disease. This has led to the search for factors that may be involved in this “missing heritability”. Confounding this is the fact that all of the myriad cell types within the lung have the same DNA sequence yet they express distinct patterns of genes and proteins and perform distinct functions. Again, it is clear that some factors control this differential expression and response to external cell stimulation. One process that has been implicated in both these effects is epigenetics, from the Greek for “above genetics” [2, 3]. There has been an explosion of papers reporting epigenetic changes which occur in a number of diseases including chronic inflammatory diseases and cancer; indeed the next decade may become the epigenetic age.
Deoxyribonucleic acid (DNA) is commonly viewed as the blueprint of a cell, encoding all the information required for life within its genes. The information encoded in the DNA is transcribed by the ribonucleic acid (RNA) polymerase enzymes into messenger (m)RNA in the nucleus. The mRNA is then transported to the endoplasmic reticulum, where the information is translated into chains of amino acids by the ribosomes (fig. 1).
Owing to the complexity of living organisms the expression of genes is tightly controlled, and whilst the genome of each cell contains the information for all proteins in the human body, only a select subset are expressed at any time in any particular cell type. Therefore the processes of transcription and translation are both tightly regulated to ensure that only the correct proteins are expressed, by a number of mechanisms including transcription factors, transcriptional repressor and transcriptional activator proteins.
The ability of the transcription factors to bind to DNA can be controlled by limiting their access to the DNA. Changes to the way in which DNA is packaged into the cell's nucleus, or alteration of markers on the DNA, such as the addition of a methyl group to a cytosine residue (5-methyl C), can drastically alter the ability of the transcription factors to bind to the DNA or the recruitment of transcriptional repressor complexes.
Even after transcription the process of mRNA translocation and translation into protein is tightly controlled by a number of processes, including mRNA stability and binding of other non-coding RNAs to the mRNA.
These mechanisms of transcriptional control are termed epigenetics and, although they do not change the DNA sequence, they can be inherited, both by daughter cells after mitosis and even by gametes after meiosis.
DNA methylation
DNA methylation was the first epigenetic mechanism described. DNA methylation, although originally thought to be irreversible, is now known to be a reversible modification of DNA resulting from the enzymatic addition of a methyl group, predominantly to cytosine bases. In most cases, these C residues are followed by a guanine, forming a CpG motif. These methylated CpG motifs often exist in groups, or CpG islands [4], and generally result in gene silencing. However, there are exceptions to this general rule [5]. These DNA methylation patterns are characteristic of cell types, and specific patterns of DNA methylation may remain relatively static throughout a cell's lifespan, following differentiation.
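As a rough illustration of what a CpG island looks like at the sequence level, the sketch below computes the GC fraction and the observed/expected CpG ratio of a DNA string. The thresholds used (length of at least 200 bp, GC fraction above 0.5, observed/expected ratio above 0.6) are the widely used Gardiner-Garden and Frommer criteria; they are an assumption brought in for illustration and are not taken from this article. The function names and toy sequences are likewise hypothetical.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # a C immediately followed by a G on the same strand
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently: c * g / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the (assumed) length, GC and observed/expected thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp


# Toy sequences, invented for illustration only
cpg_rich = "CCGG" * 50   # 200 bp, every C or G, one CpG per repeat
at_rich = "AT" * 100     # 200 bp with no C or G at all

print(cpg_stats(cpg_rich))
print(looks_like_cpg_island(cpg_rich), looks_like_cpg_island(at_rich))
```

Real island-finding tools scan a genome with a sliding window rather than scoring one fixed string, and none of this says whether the CpGs are actually methylated; that requires experimental data.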
DNA methylation plays a key role in lung development
The lungs begin to develop four weeks after conception and continue to develop after birth. The pluripotent embryonic cells, which are capable of developing into all the cell types in the body, differentiate into specific cell lineages. As the cells develop into more specialised cell types, they pick up epigenetic markers on DNA and on histones that lock the cells into their particular lineage [6]. Thus, DNA methylation is a key part of cellular differentiation and development, and animals that lose the ability to methylate the DNA fail to develop into adulthood [4].
DNA methylation is important for the immune response
The role of DNA methylation in the immune response is especially important in the regulation of the major histocompatibility complex (MHC) genes, which are involved in antigen presentation [7] and regulation of the innate and adaptive immune systems [8]. The region encoding the MHC on chromosome 6 has been implicated in asthma in several genome-wide searches [9] and epigenetic regulation of the region has been found, highlighting the importance of epigenetic mechanisms in the development of inflammatory diseases [8, 10]. Studies are now being undertaken to examine the levels of DNA methylation in stored blood cells in order to gain greater insight into gene–environment regulation of gene expression in asthma and COPD.
DNA methylation plays a key role in lung cancer
Owing to its role in gene silencing and determining cell fate, DNA methylation is an important factor in the development of cancer. Hypomethylation (the loss of methylation) of the genome is found in cancer [11] and is thought to have significant implications for gene activation, loss of heterozygosity, and global chromosomal stability [12–14].
Aberrant DNA methylation is also associated with cancer development [11]. Increased DNA methylation of a gene promoter region causes gene silencing and hypermethylation of tumour suppressor gene promoters is thought to contribute to the complex processes leading to increased cell proliferation in lung cancer [15, 16]. Studies have shown that the majority of lung cancers have hypermethylated gene promoters and more than 80 genes have been reported to be hypermethylated in lung cancer [17]. In contrast to gene mutation, promoter hypermethylation is a reversible process, making it a very attractive target for cancer therapy; and it is hoped that further study into DNA methylation patterns in cancer may yield potential biomarkers for the early detection of the disease [18].
Alterations to DNA methylation are also associated with the development of COPD, including hypomethylation of immune-modulatory genes [19]. For example, the CpGs in the vicinity of the SERPINA1 gene, which codes for α1-antitrypsin (a serine protease inhibitor, the loss of which is associated with early onset emphysema), have been shown to be hypomethylated in COPD patients [2]. In this case, the effect of DNA hypomethylation at these CpGs on SERPINA1 expression is unclear.
DNA packaging
The double helix structure of DNA in humans contains nearly 3 billion base pairs (bp) [20] and if the DNA in a single human cell were stretched out it would reach a length of nearly 2 metres [20]. If all the DNA in an average adult human were stretched out end to end it would reach 100 trillion metres, approximately 300 times the return journey to the Sun or 2.5 million times around the Earth's equator [20].
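These figures can be checked with a few lines of arithmetic. The sketch below takes the 3 billion bp and 100 trillion metre values from the text; the rise of 0.34 nm per base pair (B-form DNA), the two genome copies per diploid cell, the mean Earth to Sun distance and the length of the equator are standard reference values added here as assumptions.

```python
# Values quoted in the text
BP_PER_GENOME_COPY = 3e9        # ~3 billion base pairs
TOTAL_BODY_DNA_M = 100e12       # ~100 trillion metres of DNA per adult

# Assumptions (standard reference values, not taken from the article)
RISE_PER_BP_M = 0.34e-9         # length of one base pair in B-form DNA
COPIES_PER_CELL = 2             # diploid cell: two copies of the genome
EARTH_SUN_DISTANCE_M = 1.496e11 # mean Earth to Sun distance
EARTH_EQUATOR_M = 4.0075e7      # equatorial circumference of the Earth

# Length of DNA in one diploid cell
dna_per_cell_m = BP_PER_GENOME_COPY * COPIES_PER_CELL * RISE_PER_BP_M

# Number of cells implied by the quoted whole-body total
implied_cell_count = TOTAL_BODY_DNA_M / dna_per_cell_m

# The two comparisons made in the text
sun_return_journeys = TOTAL_BODY_DNA_M / (2 * EARTH_SUN_DISTANCE_M)
equator_laps = TOTAL_BODY_DNA_M / EARTH_EQUATOR_M

print(f"DNA per cell: {dna_per_cell_m:.2f} m")
print(f"Implied number of cells: {implied_cell_count:.1e}")
print(f"Return journeys to the Sun: {sun_return_journeys:.0f}")
print(f"Laps of the equator: {equator_laps:.2e}")
```

Under these assumptions the per-cell length comes out at about 2 m, and the whole-body total corresponds to a little over 300 return journeys to the Sun and roughly 2.5 million laps of the equator, in line with the values quoted.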
In order to contain this amount of DNA within the nucleus of a cell it is tightly packaged, most notably into 23 pairs of chromosomes. These condensed chromosomes are only visible during cell division; in the normal, resting cell state the DNA is wound into a less condensed structure, known as chromatin (fig. 2).
Chromatin is itself made of a complex mix of DNA and scaffolding proteins, in repeating structures termed “nucleosomes”. These nucleosomes are formed of ∼150 bp of DNA, tightly wrapped around a protein core, which is made of eight histone proteins [21–23]. The nucleosomes are joined together by a short linker section of DNA, 20 bp in length, producing a structure which, viewed under an electron microscope, appears as “beads on a string”.
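Taking the figures above at face value (about 150 bp wrapped per nucleosome plus a 20 bp linker), a back-of-the-envelope estimate of how many nucleosomes a single cell needs can be sketched as follows. The diploid genome size of 6 billion bp is an assumption derived from the roughly 3 billion bp per genome copy quoted earlier, and real linker lengths vary between cell types.

```python
# Values quoted in the text
WRAPPED_BP = 150          # DNA wrapped around one histone octamer
LINKER_BP = 20            # linker DNA between neighbouring nucleosomes
HISTONES_PER_CORE = 8     # histone proteins in one nucleosome core

# Assumption: diploid cell with two copies of a ~3 billion bp genome
DIPLOID_GENOME_BP = 2 * 3e9

repeat_bp = WRAPPED_BP + LINKER_BP                # one "bead" plus its "string"
nucleosomes_per_cell = DIPLOID_GENOME_BP / repeat_bp
core_histones_per_cell = nucleosomes_per_cell * HISTONES_PER_CORE
fraction_wrapped = WRAPPED_BP / repeat_bp         # share of DNA held on histones

print(f"Nucleosomes per cell: {nucleosomes_per_cell:.2e}")
print(f"Core histone molecules per cell: {core_histones_per_cell:.2e}")
print(f"Fraction of DNA wrapped: {fraction_wrapped:.0%}")
```

On these numbers a cell carries tens of millions of nucleosomes, with close to 90% of its DNA wrapped around histone cores at any one time.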
Chromatin is further condensed to help its packaging into the nucleus. The nucleosomes are coiled into more tightly wound structures, such as the 30 nm fibre (named owing to its width), which are further condensed into more compact structures (fig. 2), until the DNA ultimately is condensed into the chromosomes during mitosis.
In order for transcription to occur, proteins such as RNA polymerase need to bind to the DNA and the DNA strands need to be separated. While these higher-order structures in chromatin are very good for packaging the DNA, they prevent the transcriptional machinery from accessing it. Therefore, the packaging of DNA needs to be removed and the DNA unwound in order for transcription to occur. The accessibility of the DNA is controlled by the histone cores, which can be dislocated by DNA remodelling proteins; alternatively, the histones themselves may be modified, such as by acetylation, methylation or phosphorylation. Importantly, these processes of histone modification are fully reversible as they are controlled by distinct sets of enzymes that can either add the relevant tag or remove it. As such, chromatin is in a dynamic state, with the DNA being remodelled between the compact and relaxed states over time as required by the cellular environment.
Histone modifications
Histones are small, positively charged molecules, which helps them to interact with the negatively charged DNA backbone [20]. There are five distinct, highly evolutionarily conserved histone molecules: histone H1, H2A, H2B, H3 and H4 [21]. Histone proteins H2A, H2B, H3 and H4 (two of each) form the octamer histone core of the nucleosome around which DNA is coiled. Importantly, the N-terminal tails of these histones protrude through and beyond the DNA backbone, which makes them accessible for post-translational modifications.
The histone H1 molecule helps form higher-order structures of chromatin and has been shown to be crucial for the formation of the 30-nm fibre structures. The modifications to histones that affect transcription are illustrated in figure 3.
The histones can be post-translationally modified by families of enzymes which are generally selective for adding, or removing, these modifications. The addition of acetyl, methyl and phosphate groups can alter the affinity of the histones for DNA. The addition of these modifications or tags alters the charge on the histone tails and thereby alters the binding of the histones to the DNA. Furthermore, these histone tags may also serve to recruit co-factors to the DNA. The ability of these co-factors to bind ultimately affects the transcriptional state of the chromatin, and the pattern of tags that directs this binding is termed the histone code or language [24, 25].
Histone acetylation
Histone acetylation is the most widely studied modification. Acetylation occurs on lysine residues and a number of these are present on the N-terminal tails of histones H3 and H4 that extend beyond the DNA loops. Acetylated lysine residues lose their positive charge, so acetylated histones interact less strongly with DNA [26]. This results in a more relaxed or open chromatin state, termed euchromatin [26], which is associated with transcription. The acetylation of the histones is carried out by enzymes called histone acetyltransferases (HATs). There are ∼25 enzymes with HAT activity that are grouped on the basis of their catalytic domains [27]. Conversely, these acetyl marks can be removed by another group of 18 enzymes termed histone deacetylases (HDACs), which leads to the chromatin returning to the more condensed heterochromatin state [26, 28]. Acetylation tags are recognised by proteins that contain a particular sequence known as a bromodomain [29, 30]. Recently, drugs selective for groups of bromodomain-containing proteins have been developed. Whether delivered prophylactically or therapeutically, these have been reported to be very effective in suppressing cytokine release from macrophages and preventing death in animal models of sepsis [31], and in preventing the development of cancer [32].
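The charge argument can be made concrete with a simple count. The sketch below uses the first 20 residues of the histone H4 N-terminal tail and treats each lysine (K) or arginine (R) as +1, each aspartate (D) or glutamate (E) as -1, and an acetylated lysine as neutral. This is a deliberately crude model: histidine is counted as neutral, the N-terminus is ignored, and the choice of lysines 5, 8, 12 and 16 as the acetylated positions is an illustrative assumption rather than something stated in this article.

```python
# First 20 residues of the histone H4 N-terminal tail (numbering starts at Ser1)
H4_TAIL = "SGRGKGGKGLGKGGAKRHRK"

# Simplified side-chain charges at physiological pH (assumption: His is neutral)
SIDE_CHAIN_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(seq: str, acetylated: frozenset = frozenset()) -> int:
    """Net side-chain charge of a peptide; acetylated lysines count as 0."""
    total = 0
    for position, residue in enumerate(seq, start=1):
        if position in acetylated:
            if residue != "K":
                raise ValueError(f"position {position} is {residue}, not lysine")
            continue  # acetylation neutralises the lysine side chain
        total += SIDE_CHAIN_CHARGE.get(residue, 0)
    return total


unmodified = net_charge(H4_TAIL)
acetylated = net_charge(H4_TAIL, frozenset({5, 8, 12, 16}))

print(f"Unmodified tail:          {unmodified:+d}")
print(f"K5/K8/K12/K16 acetylated: {acetylated:+d}")
```

In this toy model, acetylating four lysines halves the positive charge of the tail, which is the kind of change that weakens its grip on the negatively charged DNA backbone.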
Other histone modifications
Other histone modifications, including methylation, phosphorylation and ubiquitination of the histones, are less well understood [33]. These alterations to the histone code are also important for the regulation of transcription, either through altering the association between histones and DNA or through the recruitment of other transcriptional co-factors. Whilst less is known about these histone modifications and their role in disease, there is evidence of high turnover of these marks on the histones, indicating that their addition and removal is a highly active process. At present it is not known precisely how these different histone modifications act together over time to co-ordinate gene expression [24]. However, as new tools become available, it is likely that this aspect of the histone code will be better understood and the role of these modifications in disease will be elucidated.
Role of histone modifications in lung diseases
Within the lungs, the dynamic nature of the chromatin plays an important role in regulating inflammation in asthma and COPD [34]. Histone acetylation is associated with the expression of numerous pro-inflammatory genes in a variety of cell types, including inflammatory cells, the lung epithelium and the airway smooth muscle (ASM) [34]. For example, ASM stimulated with tumour necrosis factor (TNF)-α shows histone H4 acetylation of the eotaxin gene promoter, along with binding of the transcription factor nuclear factor (NF)-κB and eotaxin production [35].
In contrast, the activity of corticosteroids, used to treat a number of ailments including asthma and COPD, is closely associated with HDAC2 activity. Glucocorticoids are recognised by the glucocorticoid receptor which, when activated, translocates to the nucleus. Within the nucleus, activated glucocorticoid receptor, among other actions, binds to the promoters of specific sets of inflammatory genes, recruits repressor complexes that contain HDAC2 and represses their transcription [36]. The presence of HDAC2 at the site of inflammatory gene transcription removes local acetyl tags and results in a more closed or condensed heterochromatin state preventing further transcription.
Defects in this mechanism may help to explain the glucocorticoid insensitivity seen in severe asthmatics and patients with COPD. Several studies have linked reduced HDAC2 activity to chronic inflammation in the lungs [37, 38], which makes restoring HDAC function in these patients of potential value as a future treatment [39].
In addition to the control of lung inflammation, histone modifications are also associated with the correct expression of genes involved in cellular replication [40, 41] and DNA repair [42]. Importantly, errors in DNA replication or inappropriate cell division are important factors in driving cancers, including lung cancer.
Histone acetylation appears to play a dual role in lung cancer, both suppressing and driving cancers depending on the genes targeted [43]. Inappropriate histone acetylation patterns, leading to the silencing of tumour suppressor genes, have been found in lung cancer [44]. In contrast, inhibition of HDAC activity inhibits tumour growth, reactivates tumour suppressor genes, and leads to genomic instability by a variety of mechanisms [45]. Although the mechanism of action of HDAC inhibitors in this context is not yet fully understood, they may act through effects on non-histone proteins [46].
Histone methylation is also considered to be an essential step in many cell fate determination [45], developmental and differentiation processes [47–49], in pluripotency [50] and in maintaining genome integrity [51]. It is no surprise, therefore, that histone methylation under the control of several histone methyltransferase (HMT) families has been linked to cell proliferation in models of lung cancer [52]. Indeed, almost 50% of HMTs are associated with tumorigenesis [45]. In the same way that histone acetylation is reversed by groups of deacetylases, histone methylation is reversed by families of histone demethylases (HDMs) [53]. The effects of histone methylation are more complex than those of acetylation, as some methylation tags (H3K4, for example) are associated with active gene expression whilst others, such as H3K27, are associated with gene repression. Methylation can occur on the same lysine residues as acetylation, and competition can occur between HATs and HMTs for targeting specific residues [54].
Non-coding RNAs
The DNA encoding proteins is only a tiny fraction of the total length of the human genome [55]. Human genes contain coding regions termed exons (normally short and containing an average of 50 codons (150 bp) [55]) separated by long stretches of non-coding introns (up to 10 kbp in length) [55]. In addition, the human genome contains thousands of non-coding genes that, when transcribed into RNA, do not result in the production of a protein product. When the human genome was first sequenced it was estimated to contain ∼50 000 protein encoding genes [55]; however, this number has now been revised down to only 20 000–25 000 [56]. Many genes are not protein-encoding but are transcribed into non-coding RNAs (ncRNA). These can perform a number of important functions, including acting as transfer and ribosomal RNAs, which help translate the mRNA sequence into polypeptides, and small nuclear RNAs, which are crucial for splicing and the removal of the introns from the mRNA sequence [55].
More recently, ncRNAs have been shown to play a role in the regulation of cells. Non-coding RNAs include microRNAs (miRNA), which are small non-coding, single stranded RNA molecules of 19–25 nucleotides in length [57], as well as long non-coding RNAs (lncRNAs), which are longer than 200 nucleotides and of which about 11 000 exist (www.lncipedia.org). MicroRNAs are capable of binding to full-length mRNA sequences and altering their stability and translation. As miRNAs are capable of binding to multiple mRNAs, the effect of inducing or repressing miRNA expression can influence most biological processes, including cell fate specification, cell proliferation, DNA repair, DNA methylation and apoptosis, and can provide pro-inflammatory or anti-inflammatory stimuli. Importantly for the development of both lung cancer and COPD, miRNAs play an essential role in the development of both the adaptive and innate immune systems [58]. In contrast, lncRNAs can modulate gene expression by guiding transcription factors to binding sites on DNA or by controlling the compaction of local chromatin [59].
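How a single miRNA can bind many different mRNAs is often illustrated with the "seed" model, in which a short stretch near the 5′ end of the miRNA (nucleotides 2–8) pairs with a complementary site in the target mRNA. The sketch below is a minimal version of that idea and is not taken from this article. It uses the let-7a sequence as an example miRNA; the target sequence is an invented toy string, and real target prediction tools also weigh site context, conservation and pairing outside the seed.

```python
# Example miRNA: let-7a (22 nucleotides, written 5' to 3')
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"

# Hypothetical mRNA fragment, invented for illustration only
TOY_MRNA = "AAUGCUACCUCAGGCUACCUCUU"

# Watson-Crick pairing for RNA
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 2, end: int = 8) -> str:
    """Sequence an mRNA must contain to pair with the miRNA seed (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    # The two strands pair antiparallel, so take the reverse complement
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, mrna: str) -> list[int]:
    """Return 0-based start positions of every seed-complementary site in the mRNA."""
    site = seed_site(mirna)
    positions = []
    index = mrna.find(site)
    while index != -1:
        positions.append(index)
        index = mrna.find(site, index + 1)
    return positions


print("Seed-complementary site:", seed_site(LET7A))
print("Positions in toy mRNA:", find_seed_sites(LET7A, TOY_MRNA))
```

Because the matching site is only seven nucleotides long, it occurs by chance in many transcripts, which is one reason a single miRNA can influence so many genes.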
The interactions of miRNAs, lncRNAs and mRNAs and their role in disease are not yet fully understood; however, recent work indicates that they are important in both COPD and lung cancer and are potential biomarkers of disease [60]. Several miRNAs are linked to inflammation, proliferation, heart disease and cancer [57]. Some are also down-regulated in the skeletal muscle of patients with COPD compared with non-smoking controls, and their expression correlates with clinical features [61].
Non-coding RNAs including miRNAs have been shown to play important roles in lung development [62] and, owing to their role in cell fate and development, miRNAs and lncRNAs probably also play a causal role in cancer [63].
Environmental factors alter the epigenetic profile
As epigenetic processes are dynamic they can be changed throughout a person's lifetime. Environmental factors, such as pollution, diet and cigarette smoking have all been linked to epigenetic modifications [34]. The ability of the environment to alter gene expression through epigenetics may help to explain the complex genetic and environmental interactions in disease, which is especially important in the lung owing to its function as the air–body interface.
Diet can have a major impact on the epigenetic profile, which is best illustrated by the heritable phenotype of transgenic agouti mice, named after their mottled yellow coat colour. Genetically identical mice whose mothers were fed different diets were born with different coat colours. Offspring of mothers fed a folic acid supplemented diet were more likely to have a normal (darker) coat colour, a marker for improved health. The coat colour effect was mediated via increased methylation of the CpG island upstream of the agouti gene (fig. 4) [64]. In a similar manner, supplementation of the maternal diet with folic acid altered the methylation status and gene expression levels in an allergic model of airway hyperresponsiveness in mice [65].
Exposure to air pollution has long been associated with lung diseases [66, 67]. Air pollution, for instance from traffic or industry, can be inhaled deeply into the airways, and pollution has been consistently associated with a variety of adverse health outcomes [68]. As well as causing direct damage to the airways, air pollution has been linked to epigenetic changes in the lung. For example, cigarette smoking and air pollution can result in oxidative stress, leading to DNA lesions and hypomethylation [69–71]. An example of how epigenetic changes may affect autoimmune disease susceptibility is seen in twin studies. The risk of lupus increases in monozygotic twins, and this has been associated with greater changes in DNA methylation status at particular sites linked to disease-associated genes, presumably as a result of exposure to greater environmental stresses [72].
Cigarette smoke is associated with both lung cancer and COPD. As previously discussed, reduction in HDAC activity is associated with the chronic inflammation seen in COPD. It has been shown that exposure to cigarette smoke reduces the expression and activity of HDAC2 at both the protein and mRNA levels [73, 74].
There is a large body of evidence that prenatal exposure to environmental tobacco smoke (ETS) is associated with impaired respiratory function and increased risk of transient wheeze or asthma [75–77]. Maternal smoking in the last trimester is correlated with asthma by 1 year of age and is possibly associated with changes in global and gene-specific DNA methylation patterns [78]. Furthermore, ETS has also been linked to the development of adult asthma [79]. Smoking is thought to alter DNA methylation through a process of oxidative stress [69].
Conclusion
Changes to DNA accessibility, through modification of the histones or DNA methylation, or the involvement of non-coding RNAs can dramatically alter the gene expression and translation profile of a cell. The changes to protein expression can have large impacts on cellular function, leading to inflammation in chronic diseases such as asthma and COPD or inappropriate cell division as in lung cancer. The epigenetic regulation of a cell is therefore of great importance as a field of research.
The epigenetic programming of a cell is a dynamic process, under the control of distinct sets of enzymes, which can be modified by environmental factors. New drugs that target these modifications are under development for many cancers. Future studies will also determine whether the beneficial effects seen in acute models of inflammation will be translated to chronic inflammatory diseases. Of interest, several drugs, particularly nuclear hormone receptor ligands such as corticosteroids, utilise epigenetic changes to obtain their full clinical benefit. Understanding how epigenetics affects lung disease will be of interest as it also provides the potential to cure chronic diseases by resetting the epigenome that has been perturbed by disease or by the environment.
Footnotes
Statement of Interest
A.L. Durham’s salary is supported by a grant from Pfizer. I.M. Adcock is an advisory board member for Almirall, Chiesi, GSK and Novartis. He has received research grants from UBIOPRED and the Medical Research Council and has received payment for lectures, meetings and educational activities from MSD, Boehringer Ingelheim and GSK.
Support statement
This manuscript was supported by a Wellcome Trust programme grant on epigenetics in COPD.
©ERS 2013