**2.4. DNA-Sequence Fine-Structured Multi-Scaling**

Since what is near in physical space should also be near (i.e. similar) in DNA sequence space, presumably genome-wide [22–24, 55], and because evolutionarily surviving mutations of all sorts are biased by the genome architecture itself and vice versa, the correlation and thus scaling behaviour of the DNA sequence [22–24, 26, 55] and its connection to the scaling of the 3D genome architecture - whether from T2C interaction mapping [26] or from simulations [21–23] - allow for a comprehensive investigation of genome organization in a unified, scale-bridging manner from a few base pairs to the mega base pair level. To this end, we used perhaps the simplest correlation analysis possible (to avoid information loss or biases): we calculated the mean square deviation of the base pair composition (purines/pyrimidines) within windows of different sizes, yielding the function *C(l)* and its local slope *δ(l)*, which measures the degree of correlation or, in more practical lay terms, resembles a spectral measure [22–24, 26]. For mammalian genome organization, in each of two different human and mouse strains: i) long-range power-law correlations were found on almost the entire observable scale, ii) the local correlation coefficients showed a species-specific multi-scaling behaviour, with close-to-random correlations on the scale of a few base pairs, a first maximum from 40 bp to 3.6 kbp, and a second maximum from 8 × 10<sup>4</sup> to 3 × 10<sup>5</sup> bp, and iii) an additional fine-structure is present in the first and second maxima. The correlation degree and behaviour within each species are nearly identical when comparing different chromosomes (with larger differences for the X and Y chromosomes). The behaviour on all scales is equivalent across the different measures used to investigate the long-range multi-scaling of the genome architecture, with the transitions between behaviours at similar scaling positions [26], and can be associated - down to single base pair resolution - with i) the nucleosome, ii) the compaction into a quasi-fibre, iii) the chromatin fibre regime, iv) the formation of loops, v) subchromosomal domains, and vi) their connection by linkers. Additionally, the previously proven association with nucleosomal binding on the fine-structural level [22–24] is not only found again, but also agrees with the fine-structure found in the interaction scaling. Since the correlation analysis is genome-wide (in contrast to the T2C-analysed regions so far) and since individual chromosomes show highly similar scaling, this clearly demonstrates the genome-wide validity of the 3D organization. Moreover, the existence and details of this behaviour show the stability and persistence of the architecture, since sequence reshuffling or other destructive measures would result in a loss of this pattern. The same would hold for an unstable architecture, which would not leave a defined footprint within the sequence. This is again in agreement with our simulations of the dynamics and the genome-wide *in vivo* FCS measurements [27]. Consequently, two analyses of completely independent "targets" - the T2C interaction experiments and the analysis of the DNA sequence - show once more the compaction into a chromatin quasi-fibre and a stable multi-loop aggregate/rosette genome architecture, and also prove the long-discussed notion that what is near in physical space is also near, i.e. more similar, in sequence space. Hence, the 3D architecture and DNA sequence organization are co-evolutionarily tightly entangled, leading consistently to the same conclusion whatever orthogonal high-quality method is used, and thus providing a theoretical framework for the understanding, testing, and engineering of genomes.

**3. Systems Consistency of the 3D Genome Organization**
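The windowed correlation analysis described above can be sketched in a few lines. This is a minimal illustration, not the exact published pipeline: the ±1 purine/pyrimidine mapping is the standard choice, but the function names and the log-log gradient used for the local slope *δ(l)* are illustrative assumptions:

```python
import numpy as np

def correlation_function(seq, window_sizes):
    """C(l): mean square deviation of purine/pyrimidine composition
    within sliding windows of size l."""
    # map purines (A, G) to +1 and pyrimidines (C, T) to -1
    x = np.array([1.0 if b in "AG" else -1.0 for b in seq])
    # cumulative sums give every window sum in O(1)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    C = []
    for l in window_sizes:
        window_means = (csum[l:] - csum[:-l]) / l
        C.append(np.mean((window_means - x.mean()) ** 2))
    return np.array(C)

def local_slope(C, window_sizes):
    """delta(l): local slope of C(l) in log-log space."""
    return np.gradient(np.log(C), np.log(window_sizes))
```

For an uncorrelated (e.g. reshuffled) sequence, C(l) decays as ~1/l, i.e. δ(l) ≈ −1 on all scales; persistent long-range correlations show up as systematically flatter slopes, and the maxima described above as scale-dependent deviations of δ(l).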

78 Chromatin and Epigenetics

The holistic combination of several new orthogonal approaches described above [26, 27], together with the heuristics of the field, leads to a consistent picture of genome architecture, dynamics, and organization in general: nucleosomes compact into a quasi-fibre folded into stable loops, which form stable multi-loop aggregates/rosettes connected by linkers, creating chromosome arms and entire chromosomes. Nevertheless, the heuristics of the field immediately raises the questions of i) whether we really now have an evolutionarily consistent picture of genome organization, ii) whether this is the unavoidable outcome of Darwinian natural selection and Lamarckian self-referenced manipulation (a notion we introduce here), and iii) whether we can now understand genome organization in its systems context within cells, organs, and the entire organism. In essence this already relates back to the fundamental question of how life emerged from the primordial soup ([5, 6, 22]; see details in the following sections), but in the context discussed here it can be addressed by first reflecting on the existing major functions of genomes, thus setting the stage: i) genomes need to stably store genetic information, ii) the information needs to be differentially read out to give rise to and regulate the molecular machinery, and iii) genomes need to replicate and mutate to spread and evolve:

**i.** Obviously the by far most important function is to stably store genetic information over long periods of time, though with enough flexibility to admit mutations - in short: without proper storage there is neither information retrieval, nor replication, nor evolutionary development. This obviously requires resistance against physical/chemical as well as internal or external mechanical destruction. Whereas the former acts mainly bottom-up, involving one or a group of chemical bonds in proximity through direct interactions in the molecular soup, the latter depends on the large-scale structure of the basic molecular components and thus acts indirectly, top-down, on chemical bonds: internal or external global stress is transferred and eventually accumulated via the global structure down to the molecular level, where it leads to mechanical failure. Both of these destruction paradigms - physico-chemical and structural/conformation-based - influence genome architecture on all its levels under evolutionary pressure. They can be formulated such that a) mechanical failure rates are minimized over very long time spans, and b) internal and external mechanical failure rates reach an optimum due to the right balance between internal stability, which increases with scale (over sensible ranges), and external stress, which decreases stability with increasing scale. From the well-known average DNA breakage length of ~300–500 bp after already relatively severe sonication, this translates directly to the nucleosome and chromatin quasi-fibre level, assuming that internal

nucleosomal attachment increases the stability and elongates the effective breakage length by a factor of 146 bp to 200 bp (the nucleosomal repeat length), i.e. the average breakage length of an uncompacted chromatin fibre is ~44 kbp, or in the extreme ~100 kbp, balancing the internal stability increase of the quasi-fibre by further compaction against the greater mechanical susceptibility due to local compaction clusters. Thus, the loop sizes found of 30-100 kbp, as well as the chromatin quasi-fibre persistence length of 80-120 nm, are just what one would theoretically expect as the evolutionary outcome. The same holds for the formation of stable multi-loop aggregates/rosettes, where the major player is internal stability - a function of quasi-fibre compaction, loop sizes, and loop numbers [51, 52] - giving rise to the naturally found size distribution between ~0.3-1.5 Mbp [21–24, 26, 27, 40–42]. Also on the level of entire chromosomes, internal and external stability criteria have reached an optimum during evolution concerning the number of subchromosomal domains as well as their total size and number within a genome, which again fits what one would theoretically expect: subchromosomal domain linkers are in the ballpark of loop sizes, and the number of subchromosomal domains is <200-300, which is just the optimum at which mechanical stress does not overly damage mitotic chromosomes under normal conditions. Consequently, the stability criteria are clearly satisfied while obviously still allowing enough flexibility by variation on this theme within the relatively broad boundary limits, with the various levels compensating for individual stretching of limits (e.g. bigger loops might be stabilised by higher quasi-fibre compaction). Beyond that, destruction of a complete structural element (e.g. a nucleosome or loop) relative to its characteristic scale seems never to exceed 1-5% - an important criterion for overall system resilience.
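The breakage-length estimate above is simple arithmetic; the following sketch makes it explicit, assuming (as the text argues) that the naked-DNA breakage length scales multiplicatively with the base pairs packed per nucleosome (variable names are illustrative):

```python
# average naked-DNA breakage length after relatively severe sonication (bp)
dna_break_low, dna_break_high = 300, 500

# DNA packed per nucleosome: 146 bp core particle, ~200 bp repeat length
core_bp, repeat_bp = 146, 200

# scaling the breakage length by the packing factor gives the expected
# average breakage length of an uncompacted chromatin fibre
fibre_break_low = dna_break_low * core_bp      # 43,800 bp, i.e. ~44 kbp
fibre_break_high = dna_break_high * repeat_bp  # 100,000 bp, i.e. 100 kbp
```

The result lands squarely in the 30-100 kbp range of the experimentally found loop sizes, which is the point of the stability argument.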


A Consistent Systems Mechanics Model of the 3D Architecture and Dynamics of Genomes

http://dx.doi.org/10.5772/intechopen.89836


**ii.** Access to and obstruction of genetic information - i.e. genetic information retrieval in a regulated fashion - is of course, next to pure storage, the major task of a genome, although without stable information storage, retrieval becomes arbitrarily complicated whether replication takes place or not. Since the information is read out by similar means as it is stored - i.e. molecularly, in contrast e.g. to an optical readout - retrieval relies in principle on two major conditions: a) the physical space needed for the regulation of the 3D architecture so that a readout can take place, and b) accessibility/obstruction of the genetic information for the readout machinery as well as for post-processing and transport of the transcribed information. For the first, the DNA, nucleosomes, chromatin quasi-fibre, loops, and loop aggregates/rosettes need the space to be modified and rearranged, i.e. a volume several times bigger than the actual structure must exist for ease of change. This naturally involves a certain compaction, since a homogeneous soup would not allow it. Since regulation and readout are carried out by molecular mechanisms, it is also obvious that only a low spatial occupancy allows moderately obstructed diffusional access of both the regulation and the readout machinery to DNA of a certain compaction degree. For such a scenario, the volume occupancy of the architecture in aqueous solution should be well (!) below the limit of ~50% (model dependent) known from percolation studies [74]; in terms of the performance expected of genomes, the volume occupancy should be <10%, since both the genomic architecture and the machinery need to access it for regulation by modification as well as for readout. For chromatin, experimental values lie between ~2.5% and ~8%, with a homogeneous mesh

spacing ranging from 115 to 65 nm ([22] and literature cited therein). Together with the other factors and molecules in the cell nucleus, such as proteins and RNA, which all have a similar density, the volume occupancy is still <25%. These percolation assumptions hold, of course, also for the dynamics of the structure itself, as pointed out above. At first sight this seems to be a dense system, but the architecture is constantly moving by Brownian motion, like a spaghetti soup with additional floating components [18–20, 22, 27, 53, 54]. For chemical reactions this is well known from diffusion-limited aggregation processes [75] as well as from percolating systems [75]. The described consistent multi-layered 3D organization, showing a multi-scaling of its volume occupancy as well as of the space in between, creates a scale-dependent accessibility and obstruction that extends the theoretical predictions for homogeneous though compacted systems with percolating space. Thus, under such conditions, the machinery necessary for transcription as well as transcript transport relies mainly on moderately obstructed diffusion and, despite its high overall concentration, finds an adequate multi-scale space [22, 53]. Consequently, similar to diffusion-limited (catalytic) processes, modification of the intrinsic architecture and dynamics of the entire genome organization is used for local or global fine-tuning of processes and thus for functional regulation. Concerning the stability of the 3D architecture, only a quasi-fibre with stable loop aggregates/rosettes allows, in terms of stability and flexibility, the local containment of large-scale interactions during the initiation of transcription, e.g. by enhancer-promoter interactions. For knot-free replication of the genome these (spatial) arguments also apply: whereas accessibility provides access for the machinery and space for duplication, spatial obstruction protects the structural integrity.
Interestingly, none of the described alternative architecture and dynamics hypotheses (see Introduction) agrees to even a sufficient degree with these fundamental necessities to guarantee genome function.
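The notion of moderately obstructed diffusion can be illustrated with a toy Monte Carlo sketch (not the authors' model): tracers performing a random walk on a periodic lattice with immobile obstacles, where an obstacle occupancy around 10% slows, but does not block, diffusion. All parameters here are illustrative:

```python
import random

def mean_square_displacement(obstacle_frac, n_walkers=400, n_steps=1000,
                             size=101, seed=1):
    """MSD of tracers random-walking on a 2D periodic lattice with
    randomly placed immobile obstacles - a toy obstructed-diffusion model."""
    rng = random.Random(seed)
    obstacles = {(x, y) for x in range(size) for y in range(size)
                 if rng.random() < obstacle_frac}
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    total = 0.0
    for _ in range(n_walkers):
        x = y = size // 2
        while (x % size, y % size) in obstacles:  # start on a free site
            x += 1
        x0, y0 = x, y
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)
            # attempted steps onto obstacle sites are rejected
            if ((x + dx) % size, (y + dy) % size) not in obstacles:
                x, y = x + dx, y + dy
        total += (x - x0) ** 2 + (y - y0) ** 2
    return total / n_walkers
```

For free diffusion the MSD grows linearly with the step number; at ~10% obstacles the effective diffusion coefficient drops only moderately, whereas near the 2D site-percolation threshold (~41% obstacles on the square lattice) diffusion collapses - the regime the argument above says genomes avoid by keeping volume occupancy well below the percolation limit.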




**iii.** Replication and extinction of genetic information is the most crucial intervention into genome organization, since - in contrast to the readout and regulation of genetic information by transcription - the entire structure and dynamics are affected by copying every single component of the organization. Here, an exact copy within a constrained space - not only sequence-wise, but also of the 3D architecture and dynamics - as well as its disentanglement are the crucial parameters, while structural stability/flexibility and the access/obstruction of genetic information must still be maintained. From protein folding it is well known that folding already takes place during amino-acid chain synthesis in the ribosome, leading to a 3D fold different from that obtained by relaxation of a finished, stretched-out amino-acid chain. Obviously, chromosome replication is likewise an adiabatic process (chromosomes also never fold from scratch, i.e. *de novo*, but always pass continuously from one state to another), which takes place in parallel throughout the entire cell nucleus. And here again, genome architecture and dynamics enable replication to take place easily - in principle compatible only with a chromatin quasi-fibre arranged in stable multi-loop aggregates/rosettes. This is due to the fact that this architecture, on the level of stable multi-loop aggregates/rosettes, follows a knot-free two-dimensional topology. Of course, genome architecture is not a simple two-dimensional object in space, considering the DNA double helix and nucleosomal twist and writhe, but nevertheless

in terms of replication disentanglement it is. Consequently, replication origins can be situated, and replication can start, anywhere in each chromatin loop, with replication forks moving in both directions until they hit a loop base (which is the reason for the bidirectional CTCF sites functioning as linear DNA markers for the directionally oriented replication machinery). During this procedure even the twist and writhe are copied and need to be untangled, as in the case of transcription. Upon hitting the loop bases, the two forks coming from two loops have to be joined and untangled, but no complex network of knots - as would appear even in a Random-Walk/Giant-Loop or, even more so, in a fractal-globule-like replication scenario - has to be cut and re-joined. Again, the theoretical predictions for loop size and loop numbers fit the experimental findings (see e.g. [39] and thereafter). Due to the two-dimensional topology of the multi-loop aggregates/rosettes, they can be separated very easily in 3D space (this idea was proposed and illustrated to the author by his at the time six-year-old son Leander Aurelius!). And again, the compaction and volume occupancy in the cell nucleus play an important role: the compaction into a chromatin fibre not only largely reduces the formation of DNA knots (perhaps almost to zero), but together with the volume occupancy in the cell nucleus also provides the room for undisturbed replication, with the right flexibility provided by the intrinsic dynamics, allowing the disentanglement of the replicated structures with minimal active, e.g. topoisomerase/decatenase-driven, processes.

In summary, the above proves even further - especially in the holistic combination with the presented new orthogonal approaches [26, 27] and including the heuristics of the field - that the described 3D genome organization - DNA forming nucleosomes compacted into a quasi-fibre folded into stable loops, forming stable multi-loop aggregates/rosettes connected by linkers creating chromosome arms and entire chromosomes (**Figure 1**) - indeed presents a consistent, scale-bridging systems statistical mechanics of genomes fulfilling the functional conditions necessary for storage, transcription, and replication. Additionally, the actual values found for the various parameters involved lie just in those "regions" one would expect as the unavoidable outcome of Darwinian natural selection and Lamarckian self-referenced manipulation (see below).

measurements as well as theoretical descriptions. Hence, from each of these "atomistic" basic units/elements, their collective behaviour can be derived by a statistical mechanics on each individual level, and a complex interwoven, scale-bridging - i.e. hierarchically back-referencing, networked - systems statistical mechanics, which obviously exists, can now be established in detail. This exceeds, and is much more complex than, the establishment of statistical mechanics at the turn of the 20th century, where from individual components, e.g. gas molecules, a statistical mechanics established the collective properties of the entire system, e.g. the entire gas, because genome organization is not a simple dualistic system of e.g. two levels, but a complex multilistic network system with back references. In detail this means determining experimentally the behaviour of one structural/dynamic level of the genome precisely, with its entire statistics, and then doing the same on the level emerging from the underlying one. In principle this is what we have already started by setting up an experimental and theoretical framework over the past 20 years to elucidate genome organization [7, 18–24, 26, 27, 49, 50], although only now, with the complete description of the general 3D genome architecture/dynamics, is it possible to fill the existing gaps in knowledge in detail, determine the values of parameters with high precision, and, in constant cycles of refinement, adjust the description to an ever higher degree of approximation. Thus, the difference from the development of statistical mechanics in classical and later quantum physics at the turn of the 20th century is that in biology many, and also much higher, levels are still determined by, and also act back on, even the very first level to a much higher degree.

This also immediately unites the at first sight contradictory theoretical descriptions of living systems by Ilya Prigogine [75], stating that living systems are far away from thermodynamic equilibrium, with those proposed by Georgi Gladyshev [76], stating that hierarchic substance stability is locally in thermodynamic equilibrium. Actually, these descriptions are even extended by the multilistic statistical systems mechanics, i.e. manifold recursive hierarchic back-referencing, which has not been described until now but is e.g. envisioned in efforts to extend quantum mechanics to higher-order complexities [77]. Consequently, a genomic multilistic statistical systems mechanics allows not only the description and testing of the basic properties of life, but also answers to perhaps the most fundamental questions of life, e.g. whether life can be extended time-wise beyond the currently obvious or assumed limits by engineered manipulation of one of its most central parts - the genome - a quest of epic dimensions appearing already, at least between the lines, in "What Is Life?" by Erwin Schrödinger [78].

**5. Genotype-Phenotype-Entanglement and Genome Ecology**

The most important implication of the findings described above is most likely the multilistic entanglement between genotype and phenotype as the natural outcome of Darwinian natural selection and Lamarckian self-referenced manipulation in a genome ecology framework, which is connected directly to the origin of genomes and life itself: while entropy grows like an inexorable river, local disturbances lead to ever more ordered self-organizing and self-sustaining resistors, more complex structures, and finally life. In the 1970s Manfred Eigen [5, 6] showed how autocatalytic chemical reaction networks emerged from the primordial soup and how they form ever more complex cooperatively organized networks and
