70 Chromatin and Epigenetics

**2. Finalizing the 3D Genome Architecture & Dynamics**

Heuristically, it is very instructive to see how the central part of the 3D genome architecture and dynamics could now be determined in detail, and how an evolutionarily consistent model (**Figure 1**) immediately arises from this process, in agreement with the entire history and heuristics of the field. This has been achieved by a highly integrated systems approach that holistically links: i) a novel high-quality, selective, high-throughput, high-resolution chromosome interaction capture (T2C) technique [25, 26, 67–69], which elucidates the structure with an unprecedented resolution of a few base pairs; ii) a novel *in vivo* FCS approach [27], which explores the structure and dynamics by measuring chromatin movement; and iii) a novel analytical approach [27] together with improved super-computer simulations of individual chromosomes and entire cell nuclei [7, 21–24, 26, 49–52, 70], which predict, analyse, and interpret the 3D architecture and dynamics from a theoretical standpoint. All of these are combined with iv) scaling analysis of the 3D architecture [21, 22, 26] and of the DNA sequence itself [22, 24, 26], since the architecture and its dynamics leave sequence "footprints" due to the co-evolutionary entanglement of structure and sequence. The combination of these resulted not only in a consistent

[Figure 1 graphic: panels labelled base pair, nucleosome, chromatin quasi-fibre, double-helix, super-helix, chromatin loop, rosette/band, chromosome (arms), chromosome, and nucleus (interphase and mitosis), arranged along axes for DNA length [bp], volume [µm³], and time [s].]

**Figure 1.** Overview of the size and time scaling of genome organization: the scaling and the levels of organization range over 9, 12, and 14 orders of magnitude! Initially, base pairs are formed that compose the DNA double helix (image: see [22]), which forms the nucleosome together with a histone core complex (image from [22]); nucleosomes condense into a chromatin quasi-fibre (simulation image; courtesy G. Wedemann). The DNA double helix also forms superhelices (AFM image of plasmid DNA; courtesy K. Rippe). The next compaction step consists of stable chromatin loops (FISH image; courtesy P. Fransz) forming stable loop aggregates/rosettes connected by a linker (EM image from [44]), which make up interphase chromosome arms and territories (FISH image; courtesy S. Dietzel) and the metaphase ideogram bands (image: see [22]). 46 chromosomes compose the human nucleus; they are decondensed in interphase (EM image; courtesy K. Richter) and condensed for separation during mitosis (image from [22]).

**2.1. Detailed Structure Determination by T2C**

To finally determine and structurally sequence the 3D genome architecture with the highest resolution, signal-to-noise ratio, interaction frequency range, and statistical significance, we developed targeted chromatin capture (T2C), a chromatin interaction technique of far better quality that specifically addresses the needs of genome architectural "sequencing" [25, 26, 67–69]. Briefly: i) chromatin is crosslinked, ii) cells are permeabilized for intra-nuclear enzymatic DNA restriction, iii) the extracted and strongly diluted crosslinked DNA is re-ligated primarily within the crosslinked complexes, iv) the chimeric DNA ligates are decrosslinked, purified, and finally shortened to <500 bp, v) a purified region-specific DNA interaction fragment library is selected using DNA capture arrays, and finally vi) high-throughput sequencing, mapping to the reference genome, interaction partner determination, and visual/quantitative analysis are conducted (**Figure 2**). Notably, we use only uniquely mapped sequences and, owing to the very nature of T2C, apply no other corrections that would entail information loss. This specific setup is not only far superior, with an improvement of 3 to 4 orders of magnitude compared to other interaction approaches (see Introduction), but also opens up nearly unlimited opportunities, such as multiplexing for complex research and diagnostics.

**Figure 2.** Simulated chromosome models [7, 21–23, 26, 49–52]: Volume-rendered images of simulated Random-Walk/Giant-Loop (RW/GL) and Multi-Loop-Subcompartment (MLS) models. As a starting conformation with the form and size of a metaphase chromosome (top), rosettes were stacked (a). From this, interphase chromosomes in thermodynamic equilibrium were obtained by decondensation with Monte Carlo and relaxing Brownian Dynamics steps. The simulated RW/GL model, here containing large 5 Mbp loops, notably shows that the large loops do not form distinct structures but intermingle freely (b). In contrast, in the MLS model with 126 kbp loops and linkers, the rosettes form distinct subchromosomal domains and chromatin territories in which the loops do not intermingle freely (c). In an RW/GL model with 126 kbp loops and 63 kbp linkers, distinct chromatin territories are again formed, but, in contrast to the MLS model, without subchromosomal domains (d). The MLS model not only balances stability and flexibility with respect to storage and transcription, but is also optimal for replication, since its essentially two-dimensional topology allows controlled duplication and separation during mitosis.
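The final analysis step of the T2C workflow (unique mapping and interaction partner determination) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `ReadEnd` record, the function names, the toy cut-site coordinates, and the choice to drop read pairs that fall within a single fragment are all assumptions made for illustration. The only element taken from the text is that non-uniquely mapped sequences are discarded and no further correction is applied.

```python
from bisect import bisect_right
from collections import Counter
from typing import Iterable, NamedTuple, Sequence, Tuple


class ReadEnd(NamedTuple):
    """One end of a sequenced ligation product (hypothetical record layout)."""
    position: int  # mapped coordinate in bp within the captured region
    unique: bool   # True if the aligner reported a single best location


def fragment_index(cut_sites: Sequence[int], position: int) -> int:
    """Index k of the restriction fragment with cut_sites[k] <= position < cut_sites[k + 1]."""
    return bisect_right(cut_sites, position) - 1


def count_interactions(
    cut_sites: Sequence[int],
    read_pairs: Iterable[Tuple[ReadEnd, ReadEnd]],
) -> Counter:
    """Count read pairs per unordered fragment pair, using uniquely mapped ends only."""
    n_fragments = len(cut_sites) - 1
    counts: Counter = Counter()
    for end_a, end_b in read_pairs:
        # Keep only pairs in which both ends map uniquely (as stated in the text).
        if not (end_a.unique and end_b.unique):
            continue
        i = fragment_index(cut_sites, end_a.position)
        j = fragment_index(cut_sites, end_b.position)
        # Skip ends that fall outside the captured region.
        if not (0 <= i < n_fragments and 0 <= j < n_fragments):
            continue
        # Assumption of this sketch: pairs within one fragment carry no
        # interaction information and are dropped.
        if i == j:
            continue
        counts[(min(i, j), max(i, j))] += 1
    return counts


# Toy example: four fragments delimited by five cut sites.
toy_cut_sites = [0, 400, 900, 1500, 2300]
toy_pairs = [
    (ReadEnd(50, True), ReadEnd(1000, True)),    # fragment 0 with fragment 2
    (ReadEnd(1200, True), ReadEnd(100, True)),   # fragment 2 with fragment 0
    (ReadEnd(450, True), ReadEnd(2000, False)),  # one end not unique: dropped
    (ReadEnd(450, True), ReadEnd(2000, True)),   # fragment 1 with fragment 3
    (ReadEnd(10, True), ReadEnd(20, True)),      # same fragment: dropped
    (ReadEnd(2300, True), ReadEnd(100, True)),   # outside the region: dropped
]
toy_counts = count_interactions(toy_cut_sites, toy_pairs)
```

Storing the counts sparsely, keyed by fragment pair, is a natural choice here because the text reports that the interaction maps are mostly empty.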


A Consistent Systems Mechanics Model of the 3D Architecture and Dynamics of Genomes

http://dx.doi.org/10.5772/intechopen.89836

73

Most importantly, however, T2C allows fundamental resolution limits to be reached at which "genomic" statistical mechanics and uncertainty principles apply [26]: with fragment lengths, and thus resolutions, of a couple of base pairs, a high interaction frequency range, and a high signal-to-noise ratio, not only is molecular resolution reached, and with it the fundamental limit of cross-linking techniques, but the mechanism of observation is now also on the same scale as the observables (in analogy to classical and quantum mechanics). Because the stochastics follow the bias of the system behaviour, the observables, the observation, and thus the measured values are constrained by what we call "genomic" statistical mechanics with corresponding uncertainty principles. This originates from the individual complexity of each highly resolved interaction, with a unique but coupled individual probabilistic fragment setting in each cell at a given time. Hence, the actual conditions and components can be determined with high accuracy only in part and with low accuracy otherwise, and they are eventually even entirely destroyed by the measurement. Thus, the central limit theorem applies, with an overlap of system-inherent and real noise stochastics, and in the end only probabilistic analyses and statements can be made, as is well known from classical mechanics and even more so from quantum (mesoscopic) systems. Consequently, population-based or multiple single-cell experiments have to be interpreted and understood in a "genomic" statistical mechanics manner with uncertainty principles, due to the inseparability of factors/parameters also seen there. In practical terms, valid results are obtained when the statistical limit is reached, i.e. when scaling up the experiment neither narrows the distribution any further nor leads to fundamental (overall) changes in the observables. If the statistical limit is reached and the quality parameters such as resolution, frequency range, and signal-to-noise ratio are sound, conclusions can nevertheless be drawn, as in many cases of classical mechanics and even more so of quantum (mesoscopic) systems.

Consequently, due to this sensitivity of T2C, we [26] were finally able to determine the missing parts of the 3D architecture on scales where a "genomic" statistical mechanics applies, with stable reproducibility, as can already be seen visually in colour-coded interaction maps (**Figure 2**): not only are rare interactions stably detected within an unprecedented frequency range spanning 5–6 orders of magnitude, but the maps are also reproducibly mostly empty (<10% of possible signals are taken). Both interactions and non-interactions show clearly dedicated interaction patterns on all spatial scales within and between domains, including their re-emergence as attenuated repetitions on other scales, since genomes are scale-bridging systems [22, 23]. All of these can be immediately identified as structural features, briefly (**Figure 2**):

**i.** On the largest genomic and thus spatial scale, subchromosomal domains are visible as square-like interaction domains (often unfortunately called TADs; [63]) featuring in general a higher average uniform interaction degree compared to interactions between domains, with a sharp drop at the edge of domains, as well as a clear linker region between the domains that connects them. The borders of the domains can be determined down to the single-fragment level and thus with very high resolution (see below). The interactions of domains with each other, and the observation that regions in the vicinity of the linker often interact more frequently than other parts of a domain, are mainly due to the breaking of spatial isotropy.

**ii.** At intermediate scales within the subchromosomal domains, the interaction pattern shows clearly distinct gaps and a quantifiable grid-like arrangement of interactions, which also continues outside and "crosses" with the linear pattern originating from the sequentially subsequent domain(s). These interactions on scales of tens of kilobase pairs undoubtedly originate from stable chromatin loops forming a stable loop aggregate/rosette-like architecture, since several consecutive loops coincide.

**iii.** On the smallest scale, a dense and high interaction frequency pattern is observed in the region from 3 to 10 kbp (i.e. <~5–15 and ~50 nucleosomes, respectively) along the diagonal. It varies independently of the local fragment size, with distinct interactions and non-interacting "gaps". This suggests that there are defined stable interactions on the nucleosome scale forming an irregular yet locally defined and compacted structure, i.e. a quasi-fibre with average properties (e.g. an average linear mass density).
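The two quality figures quoted for the interaction maps in this section, the fraction of possible signals that is actually occupied and the frequency range in orders of magnitude, together with the notion of a statistical limit, can be made concrete with a small sketch. The function names and toy numbers are hypothetical. The total-variation distance between two normalized maps is only one possible way to check whether scaling up the experiment still changes the distribution; the text does not specify which measure was used.

```python
import math
from typing import Dict, Tuple

PairCounts = Dict[Tuple[int, int], int]  # {(fragment i, fragment j): read count}, i < j


def map_statistics(counts: PairCounts, n_fragments: int) -> Tuple[float, float]:
    """Return (fill fraction, frequency range in orders of magnitude) of a sparse map."""
    possible = n_fragments * (n_fragments - 1) // 2  # number of distinct fragment pairs
    observed = [c for c in counts.values() if c > 0]
    if possible == 0 or not observed:
        raise ValueError("need at least two fragments and one observed interaction")
    fill_fraction = len(observed) / possible
    orders_of_magnitude = math.log10(max(observed) / min(observed))
    return fill_fraction, orders_of_magnitude


def total_variation(counts_a: PairCounts, counts_b: PairCounts) -> float:
    """Total-variation distance between two maps after normalizing each to sum 1.

    Hypothetical convergence measure: comparing, e.g., half of the reads with
    all reads, a value near 0 indicates that more data no longer changes the
    overall distribution.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    keys = set(counts_a) | set(counts_b)
    return 0.5 * sum(
        abs(counts_a.get(k, 0) / total_a - counts_b.get(k, 0) / total_b) for k in keys
    )


# Toy map over 100 fragments with three occupied fragment pairs.
toy_map = {(0, 1): 200_000, (0, 5): 150, (10, 60): 2}
fill, orders = map_statistics(toy_map, n_fragments=100)

# A data set with twice the depth but the same relative frequencies.
deeper_map = {pair: 2 * count for pair, count in toy_map.items()}
change = total_variation(toy_map, deeper_map)
```

In this toy case the frequency range is five orders of magnitude, the fill fraction is far below 10%, and doubling the depth leaves the normalized distribution unchanged.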




A detailed quantification [26, 27] of several regions yields a quasi-fibre compaction of 5 ± 1 nucleosomes per 11 nm, an average chromatin quasi-fibre persistence length of ~80 to 120 nm, and loops and linkers of ~30 to 100 kbp, which form multi-loop aggregates/rosettes with typical subchromosomal domain sizes of 300 kbp to 1.5 Mbp. Different cell types, species, or functional conditions showed only a relatively small variation of this theme [26, 27].
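To get a feeling for these numbers, the compaction can be converted into physical lengths. The following sketch rests on two assumptions that are not part of the quantification itself: a nucleosome repeat of 200 bp (the value implied by equating 10 kbp with ~50 nucleosomes) and, for the end-to-end estimate, the standard Kratky-Porod worm-like chain expression for an open linear chain. The latter can only serve as orientation for a linker-like stretch, since a loop is closed; it is not the authors' analytical model.

```python
import math

NUCLEOSOMES_PER_11_NM = 5.0  # from the text: 5 ± 1 nucleosomes per 11 nm
BP_PER_NUCLEOSOME = 200.0    # assumption: implied by "10 kbp is about 50 nucleosomes"


def contour_length_nm(length_bp: float) -> float:
    """Contour length of a quasi-fibre stretch containing `length_bp` base pairs."""
    nucleosomes = length_bp / BP_PER_NUCLEOSOME
    return nucleosomes / NUCLEOSOMES_PER_11_NM * 11.0


def wlc_rms_end_to_end_nm(contour_nm: float, persistence_nm: float) -> float:
    """Root-mean-square end-to-end distance of an open worm-like chain.

    Kratky-Porod: <R^2> = 2 Lp L [1 - (Lp / L) (1 - exp(-L / Lp))]
    """
    ratio = persistence_nm / contour_nm
    mean_square = 2.0 * persistence_nm * contour_nm * (
        1.0 - ratio * (1.0 - math.exp(-1.0 / ratio))
    )
    return math.sqrt(mean_square)


# Range of loop and linker sizes quoted in the text.
short_contour = contour_length_nm(30_000)   # 330 nm
long_contour = contour_length_nm(100_000)   # 1100 nm

# Orientation value for a linear 30 kbp linker with 100 nm persistence length.
linker_rms = wlc_rms_end_to_end_nm(short_contour, persistence_nm=100.0)
```

Under these assumptions, loops and linkers of 30 to 100 kbp correspond to contour lengths of 330 to 1100 nm, i.e. only a few persistence lengths, which is why chain stiffness cannot be neglected at this scale.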

All this is consistent with a variety of previous observations and predictions, such as the compacted fibre structures described throughout the literature (see e.g. [16, 17]), the internal structure of subchromosomal domains [7, 21, 22, 24, 38–40, 43, 49, 50], which agrees on all structural levels with the absolute nucleosome concentration distributions [18, 19], the dynamic and functional properties such as the architectural stability and movement of chromosomes [7, 22, 54, 71, 72], chromatin dynamics [73], and the diffusion of molecules inside nuclei (e.g. [22, 54, 72]), as well as recent genome-wide *in vivo* FCS measurements of the chromatin quasi-fibre dynamics [27], which also suggest such a chromatin quasi-fibre with variable, function-dependent properties. Moreover, other hypotheses (see Introduction; [26, 27]) about the 3D genome organization on these scales can clearly be ruled out, e.g. no compaction or a highly regular chromatin fibre, unstable/dynamic loops, or unstable/dynamic loop aggregates/rosettes: they would simply lead to other interaction patterns, and the intrinsic chromatin fibre dynamics with movements on the millisecond scale (Movies 1, 2 [26]) would lead to immediate structural dissolution. Most importantly, no other model leads to a consistent functional framework bridging the scales described here, as is also shown by the agreement with the scaling analysis of the 3D architecture [21, 22, 26] and of the DNA sequence itself [22, 24, 26]. Furthermore, not only can functional aspects such as the easy (de-)condensation during mitosis be readily explained, but we were also able to find this organization in the data of others across species and even across kingdoms (Imam et al., in preparation).

**2.2. Dynamics and Structure Revealed by FCS**

To investigate the 3D genome architecture and dynamics with an orthogonal, genome-wide, *in vivo* approach as well, a novel *in vivo* FCS technique that explores structure and dynamics by measuring chromatin movement was introduced together with a novel analytical approach [27]. It is based on the fact that a specific chromatin quasi-fibre and its higher-order architecture directly influence its intrinsic dynamics. Thus, the concept dissects intra-molecular polymer dynamics from the fluorescence intensity fluctuations measured with FCS in order to investigate meso-scale chromatin dynamics in living cells, and connects these to the underlying three-dimensional organization. In addition, the classical analytical polymer models were extended to include dynamics, physical properties, and accessibility. As the primary tracer protein for chromatin movement, a linker histone H1.0-EGFP construct was chosen [18, 19, 22]. On the one hand, H1.0 decorates chromatin globally and reflects its density. On the other hand, it binds only transiently, so that photobleached molecules are constantly replaced by fluorescent ones and chromatin dynamics thus becomes amenable to FCS analysis (see also [20, 54]). In this way, topologically and dynamically independent chromatin domains of 500 kbp to 1.5 Mbp in size were identified that are best described by a compacted chromatin fibre and a loop-cluster polymer model under theta-solvent conditions. In more detail, the formation of stable loops and stable multi-loop aggregates/rosettes from a chromatin fibre with certain density and flexibility properties again emerged as the prominent structural feature of dynamically independent domains, and this throughout the cell nucleus in living cells!
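The starting point of any FCS analysis of this kind is the normalized autocorrelation of the fluorescence intensity fluctuations, G(τ) = ⟨δI(t) δI(t+τ)⟩ / ⟨I⟩². The sketch below computes this standard quantity for a sampled intensity trace. It is a generic illustration with hypothetical names and a toy trace; the polymer-model fitting used in [27] to extract relaxation times and domain sizes is not reproduced here.

```python
from typing import List, Sequence


def autocorrelation(trace: Sequence[float], max_lag: int) -> List[float]:
    """Normalized fluctuation autocorrelation G(lag) for lag = 0 .. max_lag.

    G(lag) = <dI(t) * dI(t + lag)> / <I>^2, with dI = I - <I>, where the
    average at each lag runs over all available sample pairs.
    """
    n = len(trace)
    if not 0 <= max_lag < n:
        raise ValueError("max_lag must satisfy 0 <= max_lag < len(trace)")
    mean = sum(trace) / n
    if mean == 0:
        raise ValueError("mean intensity must be non-zero")
    delta = [value - mean for value in trace]
    result = []
    for lag in range(max_lag + 1):
        pairs = n - lag
        product_sum = sum(delta[t] * delta[t + lag] for t in range(pairs))
        result.append(product_sum / pairs / mean**2)
    return result


# Toy trace alternating between two intensity levels (period of two samples).
toy_trace = [1.0, 3.0] * 4
toy_g = autocorrelation(toy_trace, max_lag=2)
```

For the alternating toy trace the correlation changes sign at odd lags and returns to its initial value at even lags; in a real measurement the decay of G(τ) over the lag time is what carries the information about the dynamics.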
The detailed quantitative values of the parameters involved are again essentially the same as those already found in the T2C data: a quasi-fibre compaction of 5 ± 1 nucleosomes per 11 nm, an average persistence length of ~80 to 120 nm, and loops and linkers of ~30 to 100 kbp [27]. Notably, it cannot be stressed enough that the loops and multi-loop aggregates/rosettes form *stable* entities on the time scales accessible to FCS (between 10 μs and 10 to 20 s) and neither open, close, nor reform in any other way (stability on longer time scales of up to hours has long been known). This moves many an assumption currently proposed (see Introduction) into the realm of fairy tales, both conceptually and on the basis of hard experimental facts, in agreement with the research of the last ~30 years (e.g. [18–20, 22, 54, 71]). Visualization of simulated structures illustrates this clearly (Movies 1, 2 [26]): with unstable loops or aggregates, the structures described consistently throughout the literature would dissolve immediately, which has never been observed (despite attempts to measure it). This is also in agreement with the T2C results measured at the limit of resolution. Moreover, characteristic variations were found between eu- and heterochromatin: hydrodynamic relaxation times and gyration radii of independent chromatin domains are larger for open (161 ± 15 ms, 297 ± 9 nm) than for dense chromatin (88 ± 7 ms, 243 ± 6 nm) and increase globally upon chromatin hyperacetylation or ATP depletion. Thus, functional changes are a variation of a basic theme, e.g. more compact heterochromatic domains have a larger inaccessible volume fraction than more open euchromatic ones. Nevertheless, molecular diffusion is fast enough to roam a complete domain within a few microseconds, during which the domain itself appears static.
Relaxation of domains in the 100 ms range affects genome access in a protein concentration-dependent manner: highly abundant molecules at concentrations of several 100 nM 'fill' the fluctuating domain, so that a larger volume fraction than for a static TAD becomes adiabatically accessible. In contrast, for low-abundance molecules, encounters with specific loci within a domain are diffusion-limited, and these molecules sense a higher inaccessible volume fraction. Thus, domain dynamics result in a concentration-dependent differential accessibility that is more pronounced in heterochromatin than in euchromatin due to its shorter relaxation times [20, 22, 27, 54]. In this manner, the FCS approach can be extended to acquire complete nuclear maps and thus to "sequence" the dynamic organization of nuclei in living cells.

**2.3. Analytical and Computer Simulations Theoretic Evaluation**

To better understand the 3D genome organisation suggested e.g. by the above results, to evaluate hypotheses, and to plan future experiments, we have, since 1996, been the first to develop polymer models with pre-set conditions for *in silico* super-computer simulations (i.e. without attempting to fit data; [7, 21–23, 26, 49–52, 70]) and later also an analytical mathematical framework [27]. The simulations use a stretchable, bendable, and volume-excluded polymer (hydrodynamic) approximation of the 30 nm chromatin fibre consisting of individual homogeneous segments with a resolution of ~1.0 to 2.5 kbp, and combine Monte Carlo and Brownian Dynamics approaches (**Figures 2**–**4**). The analytical polymer approach extends Gaussian chain and Kratky-Porod model descriptions, in combination with the Rouse and Zimm models for polymer dynamics, and applies them for the first time to complex star and rosette topologies under real excluded volume conditions as well as dilute and semi-dilute solvent conditions [27]. Whereas the analytical model is exact, the simulations explore emerging effects not explicitly introduced into the analytical model.

Simulations (**Figure 2**) of the Random-Walk/Giant-Loop model, in which large individual loops (0.5–5.0 Mbp) are connected by a linker resembling a flexible backbone, as well as of the Multi-Loop Subcompartment (MLS) model, with rosette-like aggregates (0.5–2 Mbp) of smaller loops (60–250 kbp) connected by linkers (60–250 kbp), have already predicted that only an MLS model, i.e. a compacted quasi-fibre forming stable loops and stable loop aggregates/rosettes connected by a linker, can properly explain the formation of chromosome arms and territories [22], the spatial distances measured using fluorescence *in situ* hybridization (FISH) experiments [7, 21–23, 26, 49–52, 70], and beyond that even the general morphology of nuclei *in vivo* observed using histone fluorescence fusion proteins [22, 51], the nucleosome concentration distributions, and dynamic and functional properties such as the diffusion of macromolecules [18, 19, 22, 53, 54]. These models also already contained enough information/aspects to cover other architectures such as free random walks and random or fractal globules, as well as their stability and dynamics. Additionally, the visualization (**Figures 2**–**4**, Movies 1, 2 [26]) creates an immediate feeling for the behaviour of genomes in 3D, which by pure visual inspection already rules out many of the obscure suggestions mentioned in the Introduction.

With the unprecedented quality of both the interaction mapping by T2C and the FCS dynamic measurements (see above), the introduction of simulation and analytical models complex enough to approximate the 3D genome organization adequately showed even more clearly that only a quasi-fibre, stable loop, stable loop aggregate/rosette-like architecture is compatible with the measurements: in essence, the simulations and analytical models correctly describe even the slightest details of the T2C and FCS measurements, including many at first sight
