Epigenetics of induced pluripotency, the seven-headed dragon

Induction of pluripotency from somatic cells by exogenous transcription factors is made possible by a variety of epigenetic changes that take place during the reprogramming process. The derivation of fully reprogrammed induced pluripotent stem (iPS) cells is achieved through establishment of embryonic stem cell (ESC)-like epigenetic architecture permitting the reactivation of key endogenous pluripotency-related genes, establishment of appropriate bivalent chromatin domains and DNA hypomethylation of genomic heterochromatic regions. Restructuring of the epigenetic landscape, however, is a very inefficient process and the vast majority of the induced cells fail to complete the reprogramming process. Optimal ESC-like epigenetic reorganization is necessary for all reliable downstream uses of iPS cells, including in vitro modeling of disease and clinical applications. Here, we discuss the key advancements in the understanding of dynamic epigenetic changes taking place over the course of the reprogramming process and how aberrant epigenetic remodeling may impact downstream applications of iPS cell technology.


Introduction
Somatic cell reprogramming is carried out by exogenous expression of four pluripotency-associated transcription factors, c-Myc, Oct-4, Klf4 and Sox2, known as the 'Yamanaka' factors. The groundbreaking experiment that identified the minimal requirement for these four transcription factors screened 24 essential embryonic stem cell (ESC)-expressed genes by retroviral introduction into mouse embryonic fibroblasts [1]. This pioneering study utilized the neomycin resistance gene knocked into the endogenous Fbx15 locus as a reporter of reprogramming. Upon isolation of ESC-like G418-resistant colonies, detailed characterization demonstrated a transition of differentiated cells to pluripotency, although attempts to generate adult chimeric mice were unsuccessful. This initial report stressed the importance of epigenetic remodeling during the acquisition of pluripotency and demonstrated that key endogenous pluripotency genes such as Nanog and Oct-4 acquired a transcriptionally permissive chromatin structure at their promoters, characterized by DNA hypomethylation, histone H3K9 demethylation and acetylation of histone H3. However, it was immediately clear that inherent heterogeneity exists among cells undergoing the reprogramming process, and significant differences between established induced pluripotent stem (iPS) cell and ESC transcriptomes were evident. Two subsequent studies utilized an improved selection protocol by selecting for the expression of Nanog or Oct-4 rather than Fbx15 [2-4]. Using the Nanog selection strategy, Mikkelsen and colleagues [2] identified two distinct reprogrammed iPS cell populations, with Nanog selection at later reprogramming stages (that is, day 30 post-infection) enriching for more ESC-like, fully reprogrammed iPS cells.
Appropriate downstream applications of iPS cells for developing in vitro disease models or therapeutic purposes depend absolutely on artificially derived iPS cells behaving like ESCs under the same conditions. To this end, the finding that Nanog-selected iPS cells are different from, and of higher quality than, Fbx15-selected iPS cells underlined early on the importance of selecting the right reporting strategy to isolate and characterize iPS cells of the highest quality. Indeed, genome-wide transcription analysis revealed wide-ranging differences in gene expression between the two selection processes [5]. Furthermore, markers of complete reprogramming are not exactly the same in the mouse and human systems; for instance, Nanog reactivation in human iPS cells does not specifically mark fully reprogrammed iPS cells, and alternative markers, such as DNMT3B [6] and perhaps hTERT expression levels [7], are better indicators of this developmental stage. On the other hand, retroviral silencing of the reprogramming factors is a common occurrence in completely reprogrammed mouse and human iPS cells [1,5,6,8,9]. Our lab has recently developed a lentiviral EOS reporter vector for the isolation and expansion of pluripotent stem cells, although EOS enriches for, but does not specifically mark, fully reprogrammed mouse iPS cells [10]. In the human iPS cell system, multiple markers of full reprogramming may need to be utilized, including iPS cell colonies of desirable morphology, gene expression and cell surface marker expression, such as Tra-1-60 and SSEA4. However, since generating chimeras is not an option in the human iPS cell context, these surrogate markers together with teratoma formation are likely to remain the most stringent way to demonstrate full reprogramming.
The robustness of somatic cell reprogramming using the 'Yamanaka' factors is evident when considering the sources of somatic cells used in reprogramming experiments, including liver and stomach [11], blood [12], pancreas [13] and the intestine [14]. The requirement for the exogenous factors varies with the starting cell type, with a recent paper reporting the derivation of iPS cells from neural stem cells using only Oct-4 [15]. Collectively, these findings demonstrate that the epigenetic landscapes that characterize different somatic cell types can be reorganized using a similar cocktail of transcription factors as the initiator of reprogramming, with Oct-4 playing a central and seemingly irreplaceable role in the reprogramming cascade. It remains to be determined whether somatic cells of different tissues of origin have distinct epigenetic structures that are more permissive to the derivation of fully reprogrammed iPS cells. Alternatively, there may be elite subsets of cells, yet to be discovered, that are epigenetically predisposed to reprogramming [16]. However, a recent study by Hanna and colleagues [17] demonstrated by single cell sorting of terminally differentiated secondary mouse B cells that every cell has the potential to give rise to Nanog-green fluorescent protein-activated iPS cell populations, albeit at different reprogramming rates, indicating that somatic cell reprogramming involves stochastic mechanisms.
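The idea that every cell can eventually reprogram, but with a different latency, can be illustrated with a toy calculation. The sketch below is not the analysis performed in [17]; it assumes, purely for illustration, that each cell has a fixed and independent probability `p` of completing reprogramming at each cell division, so that the waiting time follows a geometric distribution. The function name and the value of `p` are hypothetical.

```python
def cumulative_fraction_reprogrammed(p: float, divisions: int) -> float:
    """Expected fraction of cells reprogrammed after `divisions` divisions.

    Assumes (hypothetically) a constant, independent per-division
    probability `p` of completing reprogramming.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    # The chance of NOT having reprogrammed after n divisions is (1 - p)^n.
    return 1.0 - (1.0 - p) ** divisions


if __name__ == "__main__":
    p = 0.01  # illustrative value only, not a measured rate
    for n in (10, 50, 100, 500):
        print(n, round(cumulative_fraction_reprogrammed(p, n), 3))
```

Under this assumption the cumulative fraction approaches 1 for any `p` greater than zero, which matches the qualitative picture of a process that is stochastic in timing but open to every cell.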

Epigenetic restructuring during reprogramming
Although induced pluripotency is dependent on exogenously introduced transcription factors, the epigenetic changes that take place during reprogramming are the driving force of a gradual transition to the pluripotent state. The pioneering studies provided the evidence that, in the mouse system, transition to pluripotency is accompanied by global epigenetic changes. We view these changes as a seven-headed dragon, with the heads representing: 1, epigenetic reactivation of endogenous pluripotency genes, such as Oct-4 and Nanog; 2, establishment of bivalent chromatin domains at developmental loci and altered histone H3K4 and H3K27 trimethylation levels at ESC 'signature' gene promoters [2] (bivalent mechanisms are reviewed in [18]); 3, heterochromatin DNA hypomethylation of satellite repeats [4]; 4, inactive-X chromosome reactivation in female iPS cells; 5, maintenance of DNA methylation marks of imprinted gene loci [5]; 6, retroviral transgene silencing upon pluripotency establishment; and 7, the possibility that these molecular epigenetic modifications are accompanied by three-dimensional reorganization of chromatin fiber structures and nuclear subdomain localization (summarized in Figure 1). It is apparent that pluripotent ESCs and iPS cells share distinct epigenetic regulatory pathways, characterized both by the existence of bivalent chromatin domains and extensive DNA methylation marks at non-CpG dinucleotides [19].

Complete reprogramming is a rare event in a sea of intermediate cells
It is now well established that at least three classes of cells are derived during reprogramming, termed intermediate, partially reprogrammed (or pre-iPS) and fully reprogrammed iPS cells. However, only the latter two are capable of forming the ESC-like colonies that are commonly isolated during the clonal expansion of iPS cell lines. Reprogramming is a rare event, with initial studies in mice reporting efficiencies of 0.1% of the starting cell numbers giving rise to visible colonies on the induction plates. This number is even smaller when considering only colonies that are fully reprogrammed with all the epigenetic characteristics of an ESC-like state. Indeed, most of the cells that initiate the reprogramming process fail to contribute to the germline of adult chimeric mice [1,3,6,20]. Even though the establishment of partially reprogrammed iPS cells is undesirable for subsequent differentiation and disease modeling experiments, continued characterization of such cell lines will provide clues as to the various blocks that prevent bona fide reprogramming [2].
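To make the scale of this inefficiency concrete, the arithmetic can be sketched as follows. The 0.1% colony efficiency comes from the text above; the fraction of colonies that are fully reprogrammed (`full_fraction`) is not given in the text and is a hypothetical placeholder, as are the function names.

```python
def expected_colonies(starting_cells: int, efficiency: float = 0.001) -> float:
    """Expected number of visible colonies at a given efficiency (0.1% by default)."""
    if starting_cells < 0:
        raise ValueError("starting_cells must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return starting_cells * efficiency


def expected_fully_reprogrammed(
    starting_cells: int,
    efficiency: float = 0.001,
    full_fraction: float = 0.1,  # hypothetical value, for illustration only
) -> float:
    """Expected number of fully reprogrammed colonies among the visible ones."""
    if not 0.0 <= full_fraction <= 1.0:
        raise ValueError("full_fraction must lie between 0 and 1")
    return expected_colonies(starting_cells, efficiency) * full_fraction


if __name__ == "__main__":
    cells = 100_000
    print("visible colonies:", expected_colonies(cells))
    print("fully reprogrammed (assumed fraction):", expected_fully_reprogrammed(cells))
```

With 100,000 starting cells, an efficiency of 0.1% corresponds to roughly 100 visible colonies, and any fraction below 1 for fully reprogrammed colonies reduces the usable yield further.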
When retroviral transgene delivery is used for the reprogramming process, continued dependence on transgene expression is a hallmark characteristic of partially reprogrammed iPS cells. This persistent retroviral expression is rarely observed upon endogenous Sox2 [21] or Nanog [22,23] activation. Genome-wide transcriptional analysis of partially reprogrammed cell lines indicates that activation of ESC-specific genes by the Yamanaka factors requires cooperative interaction with additional pluripotency-associated genes not present in the partially reprogrammed cells, such as Nanog [24]. This differential binding of transcription factor complexes is not accounted for by the persistence of histone H3K27me3 chromatin marks at the ESC-specific target genes of the four reprogramming factors, implicating additional repressive epigenetic marks. As Oct-4 seems to be the main driving force of the reprogramming process, its association in transcriptional activator or repressor complexes in combination with epigenetic modifying enzymes is worth considering. Recently, Oct-4 was shown to be a member of the Nanog and Oct4-associated deacetylase (NODE) repressor complex consisting of Nanog, histone deacetylase (HDAC)1/2 and Mta1/2 and members of repressor chromatin remodeling complexes NuRD and Sin3A [25]. Knockdown of the components of the NODE complex leads to upregulation of developmentally regulated target genes and ESC differentiation towards the endoderm lineage. Interestingly, an independent study has demonstrated that a member of the NuRD complex, Mbd3 (methyl CpG-binding domain protein 3), is involved in the repression of the primitive endoderm and trophoblast-specific genes Gata6 and Cdx2, respectively [26]. In addition, Oct-4 interacts with SetDB1/Eset, a histone H3K9 methyltransferase, to restrict the expression of extra-embryonic lineage genes in the inner cell mass (ICM) of the developing blastocyst [27].
It would be interesting to determine whether transient overexpression of Eset and the NODE complex remodeling enzymes could increase the derivation of completely reprogrammed iPS cells during the late stages of the reprogramming process. Indeed, various chemical treatments specifically targeting epigenetic regulators that silence transcription increase the efficiency of induction of pluripotency.

Small molecule treatments increase reprogramming efficiencies
Somatic cell reprogramming is proposed to be a stochastic event [17], requiring precisely coordinated levels of transgene expression at the right time [8,21], and its inefficiency is at least partly due to epigenetic blocks encountered on the road to pluripotency. This is convincingly demonstrated by the series of experiments in the mouse system using chemical inhibitors of the chromatin remodeling enzymes G9a (a histone H3K9 methyltransferase) [28], HDACs [29], and DNA methyltransferases (DNMTs) [2,28], yielding higher reprogramming efficiencies in a variety of cell types (reviewed in [30,31]). However, it remains unresolved whether most chemical treatments only increase the efficiency of early reprogramming events, yielding a higher number of partially reprogrammed iPS cells. To date, two chemical treatments have been shown to induce the conversion of a stable partial iPS cell line to the fully reprogrammed iPS cell state. These conversions used 5AzaC (a DNMT inhibitor) treatment of B-cell derived iPS cells and the heterogeneous MCV8 iPS cell line [2] or treatment with a 2i/leukemia inhibitory factor (LIF) inhibitor cocktail targeting the mitogen-activated protein kinase kinase (MEK)/extracellular signal-regulated kinase (ERK) and glycogen synthase kinase-3 (GSK3) pathways [24,32]. However, only 2i/LIF treatment was shown to result in partial iPS cells converting to germline-competent fully reprogrammed iPS cells. Since individual partially reprogrammed iPS cells are arrested at different reprogramming stages, clonal lines respond differently to the reported chemical conversion approaches. In fact, Mikkelsen and colleagues [2] reported that an additional partially reprogrammed iPS cell line, MCV6, responded to the 5AzaC-mediated conversion to the pluripotent state only upon short hairpin RNA knockdown of four lineage-specifying transcription factors.
In addition, Chd1, a euchromatin-specific chromatin remodeling enzyme, has been shown to maintain the pluripotent ESC state, and its absence impedes the somatic cell reprogramming process [33]. Overall, the data so far suggest that repressive epigenetic modifiers impose various blockades on the reprogramming process. However, it is important to note that chemical inhibition of key epigenetic regulatory mechanisms within specific pluripotent cells may have undesired effects. For instance, inhibition of DNA methylation prematurely in the reprogramming process results in apoptosis of the intermediate cell types [34] and may induce DNA damage response pathways [35], while inhibition of HDACs results in differentiation of ESCs [36]. Thus, precise timing of the chemical treatments during the reprogramming process is required to achieve optimal conditions for the induction of pluripotency.
Currently, multiple epigenetic hallmark characteristics have been assigned to the fully reprogrammed iPS cell state. Since a wide range of global epigenetic modifications are taking place during the reprogramming process, the question of changes to the physical chromatin architecture becomes an interesting feature to explore. It has been known for a long time that ESC genomes are characterized by a more 'open' chromatin state but its role in pluripotency is difficult to ascertain. A wide range of epigenetic reconfiguration is carried out during the formation of the ICM, including random X-inactivation and resetting of loci-specific histone methylation levels. Therefore, genomic 'euchromatinization' may be a consequence of such epigenetic reprogramming events. However, an interesting possibility to consider is that establishment of more open, plastic chromosome territories within the nucleus is required for the cells of the ICM to respond to differentiation cues properly. Such a physical state would allow for quick reconfiguration of the chromatin state throughout this early embryonic development. This is supported by the fact that nucleosomal proteins are not static chromatin units in pluripotent stem cells, but rather exist in a hyperdynamic state where rapid exchange occurs throughout the cell cycle [37]. The reprogramming stage at which such conformational changes to the chromatin state are established remains to be explored in future iPS cell studies.

Applications of induced pluripotent stem cells
The process of artificial somatic cell reprogramming using defined factors will undoubtedly have a tremendous impact on the way we study human disease and will likely have applications in the field of regenerative medicine. Currently, efforts are being made to differentiate iPS cells and ESCs down different lineages into somatic cell types by recapitulating the in vivo differentiation cues in vitro [38,39] (and reviewed in [40]). Signaling molecules, such as activin A and bone morphogenetic protein, used in differentiation studies are also expressed in the developing embryo. These strategies have provided the proof of concept of the potential use of ESCs and iPS cells in regenerative medicine and disease modeling in humans. In vitro modeling of human disease for small molecule screening will likely become the first use of iPS cell technology for 'real-life' applications. In fact, iPS cell lines have been established from patients with multiple diseases, such as amyotrophic lateral sclerosis [41], familial dysautonomia [42], Fanconi anemia [43], Rett syndrome [10], spinal muscular atrophy [44] and a spectrum of other diseases [45]. Several of these studies demonstrated directed iPS cell differentiation to affected cell types, with Ebert and colleagues [44] and Lee and colleagues [42] reporting the first successful phenotype improvements using a chemical treatment. However, some concerns are yet to be addressed for widespread modeling of human diseases in vitro. In particular, each of the studies so far has reported a very limited number of established lines, and for disease modeling to be reliably explored, more control and patient lines need to be established in parallel using highly reproducible reprogramming methods. Another issue to consider is the delivery method of the reprogramming factors. The most favorable are non-integrating methods that result in genetically unmodified iPS cell lines.
This is especially important if the reprogrammed iPS cell lines will be utilized for clinical applications, where reactivation of the reprogramming factors can result in unpredictable phenotypic outcomes, such as cancer development. A wide range of innovative techniques has resulted in successful reprogramming using pMX-based retroviral vectors, lentiviruses, non-integrating adenoviruses, episomal vectors, and plasmid and piggyBac transposon systems. Although integration-free reprogramming methods are preferable, we have previously suggested that the use of retroviral transgenes provides a beacon of the completion of the reprogramming process and may be useful in cell lines not used for clinical applications [46]. This feature has been exploited in numerous iPS cell studies investigating the dynamics of the reprogramming process [1,21]. Even with non-integrating reprogramming technologies, significant heterogeneity between iPS cell lines may prove to be the biggest obstacle to reliable evaluation of disease phenotypes in vitro.
The range of the normal phenotype of a particular functional assay is yet to be determined in multiple disease model settings. Only when the full range has been determined and reproducible differentiation protocols have been established will we be able to model human diseases in culture with a high level of confidence. Overcoming this obstacle may involve determining the inherent variability in the starting population of undifferentiated ESCs and iPS cells. It is unknown whether this variability will persist to the differentiated cell state. Although initial reports were highly encouraging as to the similarities between newly derived iPS cells and existing ESC lines, recent studies support thorough molecular characterization of clonal iPS cell lines, suggesting that significant differences between iPS cells and ESCs remain at both the transcriptional [47] and epigenetic levels [48]. Inappropriate establishment of epigenetic marks in the undifferentiated iPS cells will undoubtedly yield heterogeneous differentiated cells for subsequent applications. In fact, a brief comparison of even human ESC lines demonstrated that differentiation preferences are highly inconsistent among human ESC lines, resulting in variable levels of lineage-specific gene activation [49]. Finally, pluripotent stem cells and adult mammalian progenitor cells have variable expression levels of genes controlling lineage specification [50]. Several genes with variable levels of expression in mouse ESCs have been described, with Nanog [51] and Stella [52], to our knowledge, being the best characterized. It remains to be investigated whether proper levels of variability of these pluripotency-related markers are established upon the transition of differentiated cells to fully reprogrammed iPS cells.
In the case of Stella, variable gene expression is regulated by reversible epigenetic modifications such as histone acetylation in the ICM, with more long-term repressive epigenetic silencing through DNA methylation occurring at later stages of development, in the epiblast-derived stem cells.
During these early phases of studying iPS cells, it is important to consider all possible obstacles to the safe and reliable use of iPS cells in these downstream applications [53]. Leading stem cell researchers remain cautious about possible false promises regarding the application of iPS cells. The starting iPS cell lines for in vitro differentiation experiments may not need to be pluripotent by the strictest criteria, as long as differentiation to the cell type of interest can be accomplished accurately and efficiently. For example, if one is interested in studying the cell types of the mesoderm lineage, the capacity of the iPS cell line to give rise to ectoderm/endoderm lineages may not be relevant. Even in these restricted directed differentiation experiments, appropriate epigenetic circuitry needs to be established for the lineage-specific genes to be re-expressed upon receiving differentiation signals.

Conclusions
Reprogramming of somatic cells progresses through several stages before the full pluripotent state is attained. We describe seven epigenetic features of reprogramming and illustrate their dynamic modifications at the different stages. Small molecules can enhance efficiencies and further promote full reprogramming by modifying the epigenetic landscape. As we transition into large-scale efforts to make patient-specific iPS cell lines and model human diseases in vitro, proper epigenetic remodeling may remain at the forefront of stem cell biology, and developing methods to stably establish ESC-like epigenetic circuitry during the reprogramming process will be of high priority.

Competing interests
The authors declare that they have no competing interests.