**4.5. Importance of the nucleosomal context in epigenetic read-out (PSIP1-PWWP & PHF1-Tudor)**

The complexity of nucleosome recognition by reader proteins is well illustrated by NMR-based studies on the recognition of H3K36me-nucleosomes by the PWWP domain of PSIP1 (Ledgf). These studies found that the PWWP domain binds an H3K36me peptide with an affinity orders of magnitude lower than it binds H3K36me3 in a nucleosomal context. Interestingly, a similar observation was made for the Tudor domain of the H3K36me reader PHF1 [85]: here, too, an isolated peptide model of the H3 tail showed decreased affinity. Because H3K36 lies close to the nucleosomal DNA, a role for DNA binding was hypothesized for both proteins. NMR studies identified, for PSIP1 and PHF1 alike, a binding site for nucleosomal DNA, resulting in a mechanism in which trimethyl lysine and nucleosomal DNA are bound simultaneously.

Recognition of Nucleosomes by Chromatin Factors: Lessons from Data-Driven Docking-Based… http://dx.doi.org/10.5772/intechopen.81016

**Figure 5.** (A) Structural model of nucleosome-bound RNF169 (red) and ubiquitin (green). (B, top) The proposed main acidic patch anchoring residue R700 (conserved position throughout the docking solutions) is shown in the conserved arginine anchor position within the acidic triad (Glu 60, Asp 89, Glu 91). (B, bottom) Side chain interactions between RNF169 MIU2 (red) and ubiquitin (green). Figure generated using the author-provided PDB file [69].

For PHF1-Tudor, a crystal structure of the domain bound to a trimethylated H3 tail peptide was already available. The additional importance of the nucleosomal context and of the synergistic binding mechanism can be understood from the corresponding nucleosome-bound structure (**Figure 6A**). In the case of PSIP1-PWWP, the domain structure was solved by NMR and, together with NMR titration data, used to determine a structural model of the nucleosome-bound protein (**Figure 6B**) [67, 68, 85]. Both structural models highlight the importance of the nucleosomal context in H3K36me3 recognition and show that complex formation critically depends on two synergistic binding processes. First, the residues that form the aromatic cage bind the trimethylated lysine of H3K36me3. This recognition of the PTM is crucial for binding, but the readers reach their full binding affinity only when, second, their positively charged surface residues interact with the nucleosomal DNA. This makes both studies outstanding examples of the synergistic interplay of epitopes in nucleosome-binding proteins (**Figure 6C**, **D**).
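To give a feeling for what an affinity difference of "orders of magnitude" means energetically, the following minimal sketch converts a ratio of dissociation constants into a free-energy difference via ΔΔG = RT ln(Kd,weak / Kd,tight). The function name and the two dissociation constants are purely hypothetical placeholders chosen for illustration; they are not measured values from the studies discussed here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_kd_ratio(kd_weak, kd_tight, temperature=298.15):
    """Free-energy gain (kcal/mol) of the tighter complex over the weaker one.

    Uses ddG = R * T * ln(kd_weak / kd_tight); both Kd values must share units.
    """
    if kd_weak <= 0 or kd_tight <= 0:
        raise ValueError("dissociation constants must be positive")
    return R_KCAL * temperature * math.log(kd_weak / kd_tight)


# Hypothetical example: peptide binding 100-fold weaker than nucleosome binding.
kd_peptide = 1e-3      # M, assumed placeholder
kd_nucleosome = 1e-5   # M, assumed placeholder
ddg = ddg_from_kd_ratio(kd_peptide, kd_nucleosome)
print(f"ddG for a 100-fold affinity gain: {ddg:.2f} kcal/mol")
```

Each order of magnitude in Kd corresponds to RT ln 10, roughly 1.4 kcal/mol at room temperature, which is within reach of a handful of favorable electrostatic contacts with DNA.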

The insights derived from these structural models were used to design experiments that validate the models and may offer tools for further research. In the case of PSIP1-PWWP, the structural model sparked current efforts to design nucleosome-mimicking peptides that modulate the PSIP1-chromatin interaction.

**Figure 6.** Structural model of nucleosome-bound PHF1 (red; A) and PSIP1-PWWP (green; B). The electrostatic potentials of the nucleosomal DNA and of the surface of PHF1 (C) and PWWP (D), respectively, act in combination with H3K36me3 recognition by the aromatic cage motif (trimethyl lysine side chain shown as sticks). Figure generated using the author-provided PDB file [85].

Chromatin and Epigenetics

established the molecular basis of this interaction. The α-helical MIU2 (motif interacting with ubiquitin) domain binds to a hydrophobic patch on the K13/15-conjugated ubiquitin while a disordered region anchors RNF169 on the nucleosome by binding to the acidic patch. They subsequently reconstructed a model structure that presents both epitopes in their nucleosome-bound state (**Figure 5A**). The work of Hu *et al.* combined traditional NOESY-based structure determination at the level of histone dimers with interaction studies at the nucleosome level and complemented these with SAXS data to arrive at a final model [89]. The authors also extended their findings to an NMR-based structural model for the complex with the DNA repair factor Rad18. Both RNF169 and Rad18 are known to interfere with the binding of 53BP1 to nucleosomes ubiquitinated at H2A K13/15. These NMR-based structural models have made it possible to hypothesize about the molecular mechanism of this interference.

**Figure 4.** (A) Structural model of HMGN2 (red) bound to the nucleosome. The binding occurs along the nucleosome surface and is driven by interactions with the acidic patch and nucleosomal DNA, resulting in HMGN2 competing with H1 for nucleosome binding. (B) Close-up view of the acidic patch-binding N-terminal HMGN2 region depicting the canonical arginine anchor R26 surrounded by the Glu 91, Asp 89, Glu 60 acidic triad motif of H2A. Figure generated using the author-provided PDB file [93].

#### **4.6. LANA goes solid state**

The studies mentioned above illustrate the potential of data-driven modeling of nucleosome-protein complexes based on state-of-the-art solution NMR. Recent advances in solid-state NMR (ssNMR) have enabled the detailed investigation of large, soluble biomolecular complexes. Very recently, our lab capitalized on these advances and tailored them for application to nucleosome-protein complexes [87]. Unlike the methyl-TROSY methods, this approach allows observation of all residues and thus, in principle, a more complete mapping of binding interfaces. In this approach, NMR spectra are recorded on sediments of nucleosomes or their complexes, generated by ultracentrifugation. After assignment of the NMR signals of histone H2A in the unbound nucleosome, spectra were recorded on the nucleosome in complex with the LANA peptide, analogous to the LANA crystal structure (**Figure 7A**) [61, 87]. Based on the chemical shift changes, the binding site of LANA could be mapped to the acidic patch and a structural model generated. The close agreement between the crystal structure and the ssNMR-derived structural model (**Figure 7B**) illustrates the power of this approach. In our view, ssNMR, just like the solution NMR approach, is an attractive alternative for structure determination of nucleosome-protein complexes. While its application remains to be extended to larger nucleosome-binding domains, we anticipate that it will be a valuable addition to the tool kit of chromatin structural biology.

**Figure 7.** (A) Structural model for nucleosome-bound LANA peptide. ssNMR data derived from NMR titration experiments were used to direct the docking simulation. (B) Alignment of the ssNMR-derived model for LANA (red) and the crystal structure (green, PDB: 1zla) shows remarkable accuracy of the docking-derived solution. For both, the canonical arginine anchor is depicted as sticks in the typical central position within the acidic triad of H2A (yellow).
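The step from chemical shift changes to a mapped binding site in Section 4.6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in [87]: the per-nucleus weights, the mean-plus-one-standard-deviation cutoff, the residue numbers and all shift values are invented for the example.

```python
import math
from statistics import mean, pstdev


def chemical_shift_perturbation(free, bound, weights):
    """Combine per-nucleus shift changes (ppm) into one value per residue.

    free, bound: dict mapping residue number -> {nucleus: shift in ppm}
    weights:     dict mapping nucleus -> scaling factor (assumed, see lead-in)
    """
    csp = {}
    for residue, shifts in free.items():
        if residue not in bound:
            continue  # residue not observed in the complex, skip it
        total = 0.0
        for nucleus, ppm in shifts.items():
            if nucleus in bound[residue]:
                delta = bound[residue][nucleus] - ppm
                total += (weights.get(nucleus, 1.0) * delta) ** 2
        csp[residue] = math.sqrt(total)
    return csp


def significant_residues(csp, n_sd=1.0):
    """Residues whose perturbation exceeds mean + n_sd * standard deviation."""
    values = list(csp.values())
    cutoff = mean(values) + n_sd * pstdev(values)
    return sorted(res for res, value in csp.items() if value > cutoff)


# Invented toy data: six residues, two of which shift upon binding.
free = {
    10: {"N": 120.0, "CA": 55.0}, 11: {"N": 118.0, "CA": 57.0},
    12: {"N": 121.0, "CA": 54.0}, 13: {"N": 122.0, "CA": 56.0},
    14: {"N": 119.0, "CA": 58.0}, 15: {"N": 117.0, "CA": 53.0},
}
bound = {
    10: {"N": 120.0, "CA": 55.0}, 11: {"N": 119.0, "CA": 57.5},
    12: {"N": 121.1, "CA": 54.0}, 13: {"N": 122.5, "CA": 56.6},
    14: {"N": 119.0, "CA": 58.1}, 15: {"N": 117.0, "CA": 53.0},
}
weights = {"N": 0.4, "CA": 1.0}  # arbitrary illustrative scaling

csp = chemical_shift_perturbation(free, bound, weights)
interface = significant_residues(csp)
print("residues flagged as interface:", interface)
```

Residues flagged this way would typically be passed to a docking program as interface restraints; the cutoff is a judgment call and should be checked against the noise level of the spectra.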

### **4.7. Modeling nucleosome-bound Rad6-Bre1 based on cross-linking MS**

Next to NMR, cross-linking mass spectrometry (XL-MS) has found increasing application as a data source on nucleosome-protein interactions. In cross-linking, intermolecular contacts between the proteins of interest are captured and converted to covalent connections. These connections are introduced by small-molecule linkers that are either specific for well-defined side chains or less specific, as in the case of radical-forming photo cross-linkers. Furthermore, cross-linkers possess a spacer between their terminal functional groups, which defines the distance range that can be cross-linked [99, 100]. Both characteristics can be tuned for the study of a specific system, which has resulted in a large variety of reported linker molecules. After cross-linking, the protein complex is digested with trypsin, yielding peptide fragments in which covalently cross-linked peptides stay connected. Analysis of these fragments by liquid chromatography mass spectrometry (LC-MS) identifies the sequence positions of the cross-linked residues. The cross-links can thus be converted to distance restraints between two residues, with the distance depending on the length of the cross-linker. These restraints can be used to guide structural modelling of the complex [80].

In one of the earliest examples for nucleosome complexes, XL-MS was used to map the binding sites of the various nucleosome-binding domains of the chromatin remodeling complex ISW2 onto the nucleosome surface [91]. These data were subsequently used to build a structural model of the ISW2-nucleosome complex. A recent case of cross-linking-based modeling in nucleosome research is the E2/E3 ubiquitin ligase complex Rad6-Bre1 (**Figure 8A**). Bre1 is known to act as a homodimer in a complex with Rad6 to specifically ubiquitinate H2B K123 [101, 102]. However, the molecular mechanism of this specific ubiquitination remained unknown, as no structure of the nucleosome-bound complex was available. Gallego *et al.* addressed exactly this problem by using XL-MS data to identify the binding interface between the Bre1 RING domain and the nucleosome. Next to nucleosomal DNA binding, they observed binding of the homodimer to the acidic patch (**Figure 8B**), which was verified by LANA-induced inhibition of Bre1 RING nucleosome binding. As a first step in the modeling, the authors modeled the Rad6-Bre1 complex structure based on homology with known E2/E3 RING ligases. Importantly, the resulting model was supported by the observed cross-links. The Rad6-Bre1 model could then be docked onto the nucleosome, guided by the observed cross-links. This provided the structural basis for the specificity of Bre1 towards H2B K123 ubiquitination [70].
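The conversion of cross-links into distance restraints described in this section can be sketched as below. This is a simplified illustration, not the protocol of the cited studies: the class and function names are hypothetical, and the spacer length, side-chain length and tolerance are assumed example values for a lysine-lysine linker that should be adapted to the actual chemistry.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CrossLink:
    chain_a: str
    res_a: int
    chain_b: str
    res_b: int


@dataclass(frozen=True)
class DistanceRestraint:
    link: CrossLink
    upper_bound: float  # maximum allowed C-alpha to C-alpha distance in angstrom


def crosslinks_to_restraints(links, spacer_length,
                             side_chain_length=6.5, tolerance=3.0):
    """Turn identified cross-links into upper-bound C-alpha distance restraints.

    upper bound = spacer + two side chains + a tolerance for flexibility.
    All three length parameters are assumptions of this sketch.
    """
    upper = spacer_length + 2 * side_chain_length + tolerance
    return [DistanceRestraint(link, upper) for link in links]


def violated_restraints(restraints, ca_coords):
    """Return restraints whose C-alpha distance in a model exceeds the bound."""
    violated = []
    for r in restraints:
        a = ca_coords[(r.link.chain_a, r.link.res_a)]
        b = ca_coords[(r.link.chain_b, r.link.res_b)]
        if math.dist(a, b) > r.upper_bound:
            violated.append(r)
    return violated


# Invented toy input: two cross-links between chains A and B.
links = [CrossLink("A", 10, "B", 20), CrossLink("A", 15, "B", 30)]
restraints = crosslinks_to_restraints(links, spacer_length=11.4)

# Invented C-alpha coordinates of a candidate model (angstrom).
model = {
    ("A", 10): (0.0, 0.0, 0.0), ("B", 20): (20.0, 0.0, 0.0),
    ("A", 15): (5.0, 0.0, 0.0), ("B", 30): (5.0, 40.0, 0.0),
}
violations = violated_restraints(restraints, model)
print(f"{len(violations)} of {len(restraints)} restraints violated")
```

Counting violated restraints is one simple way to check whether a homology or docking model is supported by the observed cross-links; in practice the restraints would also enter the docking scoring itself.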

**Figure 8.** (A) Structural model of homodimeric Bre1 (red) bound to the nucleosome together with the E2 ligase Rad6 (blue) with attached ubiquitin (green). The study was conducted by identifying the interactions between positive Bre1 RING residues and the acidic patch. The docking was further facilitated by the known target lysine residue. (B) Close-up view of Bre1 bound to both the acidic patch and nucleosomal DNA. The homodimeric nature allows the engagement of both epitopes in simultaneous binding. Figure generated using the author-provided PDB file [70].

#### **4.8. Adding new perspective on binding modes**

Data-driven structural models complement high-resolution structures in many ways. An interesting example is the RCC1-nucleosome interaction, which serves as a binding platform for the subsequent binding of Ran, a protein relevant during mitosis (see Section 3.2). Biochemical data have shown that Ran activity is increased in the nucleosome-bound complex. The crystal structure suggests no nucleosome-Ran interactions when Ran is modeled onto the RCC1 Ran-binding interface. Before the crystal structure of nucleosome-bound RCC1 was solved, however, a data-driven model was reported that does feature Ran-nucleosome interactions [62]. The authors suggest that, upon Ran binding, the contacts between nucleosomal DNA and the RCC1 N-terminal tail observed in the crystal are broken in favor of the Ran-nucleosome interactions observed in the model. Even though additional studies have to elucidate the exact mechanism of RCC1-Ran nucleosome binding, the combined use of crystal structure and data-driven model outlines a possible mechanism for further investigation.

#### **4.9. Debating H1**

Another cardinal topic is the nucleosome-bound state of linker histone H1. To date, the structure of the chromatosome, consisting of the four canonical histones and 166 bp of DNA in complex with a linker histone, is strongly debated. Here as well, there are contradictions between structural models and a crystal structure of the chromatosome. The crystal structure reported by Zhou *et al.* shows the globular domain of linker histone H5 (chicken H5) with truncated tails in an on-dyad binding mode, contacting both the entering and the leaving end of linker DNA [75]. For linker histone H1 (X. laevis H1.0b, human H1.5), a similar on-dyad binding mode was reported by cryo-EM and crystallography, independent of the absence or presence of the H1 tails [45]. In fact, while not vital for linker histone positioning, the H1 C-terminal domain preferentially binds one of the two linker DNAs, introducing asymmetry into the nucleosome-bound complex.

In contrast to the proposed on-dyad complex, computational studies on linker histone binding suggest an alternative, off-dyad binding geometry in which the linker histone interacts with only one strand of linker DNA [103]. This binding mode was shown experimentally for the globular domain of linker histone H1 (D. melanogaster). Here, NMR-based distance information, obtained through paramagnetic relaxation enhancement (PRE), was used to derive the nucleosome-binding mode of H1, revealing asymmetric, off-dyad binding [43]. Interestingly, PRE data also showed that mutating a set of five crucial amino acids in H5 to their equivalents in H1 is sufficient to change the binding mode of H5 from on-dyad (crystal) to off-dyad [90]. This underlines the importance of the linker histone subtype sequence and of the interacting residues in determining the binding mode towards the nucleosome [44].

**5. Conclusions**

Chromatin structural biology is a field as important as it is demanding. This is clear not only from the tremendous efforts necessary for the first nucleosome structure but also from the limited number of structures of nucleosome-protein complexes. While crystallography and cryo-EM have resulted in various high-resolution structures, not every interaction is accessible this way because of experimental limitations such as the need for crystallization, the fleeting nature of some complexes, or the pervasive role of highly dynamic protein regions. Here, an increasing number of studies shift towards a combined approach utilizing
