#### **4. RDC-based domain orientation analysis – Basics and limitations**

In this section, we describe the experimental procedure for determining the domain orientation of a multi-domain protein, from RDC data collection to structure determination. In addition, the limitations of the RDC-based analysis are discussed to emphasize the necessity of our TROSY-based DIORITE approach that follows.

#### **4.1 Collecting the RDC data**

RDCs are measured from a pair of 1H-coupled HSQC spectra recorded for the sample in the isotropic and anisotropic states. A 1H-coupled HSQC spectrum gives a pair of split peaks along the 15N axis for each 1H-15N correlation.

Complementary Use of NMR to X-Ray Crystallography

for the Analysis of Protein Morphological Change in Solution 423

Because each 1H-15N correlation appears as a doublet, the number of peaks on a 1H-coupled HSQC spectrum is doubled, which may increase signal overlap and hinder accurate reading of the peak positions. To avoid this drawback, a particular NMR technique, IPAP-HSQC, is used to separate the up-field and down-field components into different 2D spectra. IPAP-HSQC gives two separate spectra that contain in-phase and anti-phase doublets, respectively. Addition or subtraction of these spectra gives two 2D spectra, each displaying only the up-field or the down-field component of every doublet. This separation reduces the signal overlap found on a 1H-coupled HSQC spectrum and keeps the spectral resolution at the same level as in the original HSQC spectrum (Fig. 3).

The separation width between the up-field and down-field components measured from the IPAP-HSQC spectra gives $^{1}J_{NH}$ for the isotropic sample and $^{1}J_{NH} + D^{res}_{NH}$ for the aligned sample. Therefore, the RDC, $D^{res}_{NH}$, is obtained as their difference.
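The subtraction described above can be sketched as follows. This is only an illustration: the residue labels and splitting values are hypothetical, and the negative sign used for the isotropic splitting is an assumed convention.

```python
def rdc_from_splittings(iso_hz, aligned_hz):
    """Return per-residue RDCs (Hz): aligned splitting minus isotropic splitting.

    Residues missing from either spectrum are skipped.
    """
    return {res: aligned_hz[res] - iso_hz[res]
            for res in iso_hz if res in aligned_hz}


# Hypothetical doublet splittings (Hz) read from IPAP-HSQC spectra.
iso = {"G10": -93.2, "A11": -92.8, "K12": -93.5}   # isotropic: 1J_NH
aligned = {"G10": -85.0, "A11": -101.3}            # aligned: 1J_NH + D_NH

rdcs = rdc_from_splittings(iso, aligned)
for res, d in rdcs.items():
    print(f"{res}: D_NH = {d:+.1f} Hz")
```

Only residues observed in both states yield an RDC, which is why the hypothetical residue K12 is dropped here.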

#### **4.2 Domain orientation analysis based on the RDC data**

Here, we describe the domain orientation analysis based on the RDCs. The analysis should be applied to a protein whose structure is already known from X-ray crystallography. The primary interest of the analysis is in exploring the domain reorientation upon interaction with another protein or a ligand. In such cases, each domain is assumed to retain the same structure as in the crystal.

As described in the theory section, the alignment tensor for a weakly aligned protein is determined from the RDCs and the structure coordinates, Eq. (5). In Eq. (5), the direction cosines are calculated from the structure coordinates. Because the Saupe order matrix consists of five independent elements, at least five RDCs are needed to determine the Saupe order matrix for the corresponding part. Applying singular value decomposition (SVD) to the matrix comprising the relations for the observed RDCs gives the Saupe order matrix (Losonczi et al. 1999). Diagonalizing the Saupe order matrix gives the orientation of the alignment tensor frame relative to the molecular coordinate system and the magnitude of the order along each principal axis. As described in the theory part, the alignment tensor frame orientation is defined by the Euler angles (α, β, γ).
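A minimal numerical sketch of this procedure is given below. It is not the authors' implementation. It assumes the Cartesian form $D_i = D^0_{NH} \sum_{kl} S_{kl}\cos(\alpha_k)\cos(\alpha_l)$ with a prefactor of 21.7 kHz, uses NumPy's SVD-based least-squares solver, and works on synthetic bond vectors. The function names and the test tensor are hypothetical.

```python
import numpy as np

D0_NH = 21.7e3  # Hz; assumed static dipolar coupling constant


def design_matrix(nh_vectors):
    """One row per N-H vector; columns multiply (Syy, Szz, Sxy, Sxz, Syz)."""
    v = nh_vectors / np.linalg.norm(nh_vectors, axis=1, keepdims=True)
    x, y, z = v.T
    # Sxx is eliminated through the traceless condition Sxx = -Syy - Szz.
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])


def fit_saupe(nh_vectors, rdcs_hz, d0=D0_NH):
    """Least-squares (SVD-based) fit of the five Saupe elements to the RDCs."""
    A = d0 * design_matrix(nh_vectors)
    (syy, szz, sxy, sxz, syz), *_ = np.linalg.lstsq(A, rdcs_hz, rcond=None)
    return np.array([[-syy - szz, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, szz]])


def principal_frame(saupe):
    """Diagonalize; order the axes so that |Sxx| <= |Syy| <= |Szz|."""
    vals, vecs = np.linalg.eigh(saupe)
    order = np.argsort(np.abs(vals))
    return vals[order], vecs[:, order]


# Synthetic check: generate RDCs from a known tensor and recover it.
rng = np.random.default_rng(0)
nh = rng.normal(size=(20, 3))
unit = nh / np.linalg.norm(nh, axis=1, keepdims=True)
S_true = np.array([[-2e-4, 1e-4, 0.0],
                   [1e-4, -5e-4, 3e-4],
                   [0.0, 3e-4, 7e-4]])          # symmetric and traceless
rdcs = D0_NH * np.einsum("ik,kl,il->i", unit, S_true, unit)

S_fit = fit_saupe(nh, rdcs)
vals, axes = principal_frame(S_fit)
```

The columns of `axes` are the principal axes expressed in the molecular frame, from which Euler angles can be derived under whichever angle convention is adopted.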

We consider a two-domain protein here. We assume that the high-resolution structure of each domain is available and that each domain structure is the same as in the crystal. Based on the collected RDCs, the alignment tensor for each domain is determined independently according to the above procedure. As schematically drawn in Fig. 8, the tensor frames determined for the two domains are used as a guide to define the domain orientation in solution: the coordinates of one domain are rotated so that its tensor frame overlays that of the other. Note that the RDCs do not provide any distance information between the domains. If the inter-domain segment is highly flexible, additional distance restraints may be required to build the entire structure, and these should come from other experiments such as paramagnetic relaxation enhancement (Clore and Schwieters 2002).
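The overlay step can be sketched as a single rotation. The code below assumes that the principal axes of each domain tensor are stored as the columns of a 3x3 matrix in the common coordinate frame of the starting structure. The function names and the 30° example are hypothetical, and no translation between the domains is determined.

```python
import numpy as np


def make_right_handed(axes):
    """Flip one axis if needed so that the frame is a proper rotation."""
    axes = axes.copy()
    if np.linalg.det(axes) < 0:
        axes[:, 0] *= -1.0
    return axes


def overlay_rotation(axes_a, axes_b):
    """Rotation R that carries the tensor frame of domain B onto that of A."""
    a = make_right_handed(axes_a)
    b = make_right_handed(axes_b)
    return a @ b.T          # R @ b == a


def rotate_coords(coords, rotation, pivot):
    """Rotate an (N, 3) coordinate array about a pivot point."""
    return (coords - pivot) @ rotation.T + pivot


# Hypothetical example: the frame of domain B is turned by 30 degrees about z.
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
axes_a = np.eye(3)
axes_b = np.array([[c, -s, 0.0],
                   [s, c, 0.0],
                   [0.0, 0.0, 1.0]])

R = overlay_rotation(axes_a, axes_b)
coords_b = np.array([[1.0, 0.0, 0.0]])
rotated = rotate_coords(coords_b, R, pivot=np.zeros(3))
```

In practice the pivot would be chosen near the inter-domain linker, since the RDCs fix only the orientation and leave the relative position to other restraints.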

An alignment tensor determined from the RDCs has four possible orientations. Inversion of the tensor frame about any of its principal axes, that is, a 180° rotation, gives the same RDC values, so these orientations cannot be discriminated experimentally. To resolve this ambiguity in the orientation angles, additional alignment states are used, obtained with different aligning media such as charged bicelles or charged acrylamide gels. In domain orientation analysis, however, structural restrictions, including the length of the inter-domain linker and possible inter-domain steric clashes, may allow a single inter-domain orientation to be defined even from a single alignment experiment.
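The four-fold degeneracy can be checked numerically. The sketch below works in the principal axis frame, where the RDC depends only on the squared direction cosines. The principal values and bond vectors are arbitrary, and the prefactor is set to 1 for simplicity.

```python
import numpy as np

# Identity plus the 180-degree rotations about the x, y, and z principal axes.
FLIPS = [np.diag(d) for d in ([1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1])]


def predicted_rdcs(unit_vectors, principal_values, d0=1.0):
    """RDCs in the principal frame: d0 * sum_k S_kk * cos^2(alpha_k)."""
    return d0 * (unit_vectors**2 @ principal_values)


rng = np.random.default_rng(1)
v = rng.normal(size=(10, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
S_diag = np.array([-1e-4, -4e-4, 5e-4])   # arbitrary traceless principal values

reference = predicted_rdcs(v, S_diag)
unchanged = [np.allclose(predicted_rdcs(v @ F.T, S_diag), reference)
             for F in FLIPS]
```

Every entry of `unchanged` is true, so each of the four candidate domain orientations fits a single set of RDCs equally well and must be screened with linker length or steric criteria, as noted above.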

The alignment tensor magnitudes along each principal axis represent the extent of the aligning order. If they differ between the domains, this may indicate that the domains move differently from each other. The aligning orders thus give insight into the domain dynamics in a protein.

#### **4.3 Significance of the domain orientation analysis by RDCs**

The domain orientation analysis of a protein in solution gives invaluable information even when its high-resolution structure is available. There are cases in which the domain arrangement differs between the solution and crystal structures. The RDC analysis of maltose binding protein (MBP) in complex with β-cyclodextrin showed that the relative domain orientation in solution differed from that in the crystal (Skrynnikov et al. 2000b). This may indicate that crystal contacts cause a subtle change in the domain orientation. Bacteriophage T4 lysozyme in solution was shown, also by RDC analysis, to have a more open conformation than in the crystal structure (Goto et al. 2001). This observation appears compatible with the steric requirements for ligand binding.

These examples illustrate how the RDC-based domain analysis complements X-ray crystallography in determining the relative domain orientation or protein morphology. The complementary role of the RDC-based analysis is especially important in exploring the domain rearrangement upon binding to another protein or a ligand. Even if the structure of the complex cannot be solved by X-ray crystallography, it can be determined from the RDCs in solution when the structure of the ligand-free form is available. This approach does not require the tedious and time-consuming NOE analysis needed in conventional NMR structure determination; it needs only the backbone resonance assignments and a set of IPAP-HSQC spectra. The structure determination is, therefore, much more efficient than conventional NMR structure determination.

Fig. 8. Schematic representation of the procedure for determining the relative domain orientation from the alignment tensor of each domain. Sets of RDC data determine the alignment tensor for each domain independently. The domain orientation of a protein is established by rotating the coordinates of one domain so that its tensor frame overlays that of the other.

#### **4.4 Molecular size limitation in the RDC based approach**

The domain orientation analysis with the RDCs is now recognized as a useful technique to elucidate overall protein morphology in solution, complementing X-ray structure analysis. This approach has, however, a severe size limitation. Here, this obstacle to the RDC application is discussed.

The size limitation comes from the rapid transverse relaxation of one of the split components observed on a 2D IPAP-HSQC spectrum. The high-field component relaxes faster than the other component, and even faster than its HSQC counterpart. This is due to cross-correlated relaxation interference in the amide 15N spin relaxation process: for the high-field component, the cross-correlated relaxation adds to the relaxation rate, while for the low-field component, the interference reduces it. The transverse relaxation of the HSQC signal is free from the interference.

When the RDCs of a high molecular weight protein are measured with IPAP-HSQC, the high-field component of each amide spin pair broadens and loses so much intensity that it is no longer observed. The difficulty in observing the high-field component is even greater in an aligned state, because the residual dipolar interactions act as additional relaxation sources. For proteins over 20 kDa, it is usually hard to observe the high-field component in an aligned state, which makes the RDC measurement impossible. The RDC-based domain orientation analysis with IPAP-HSQC is therefore practically limited to proteins of up to around 20 kDa.

A simulation of the line broadening of each component of a doublet as a function of molecular size is shown in Fig. 9. In this figure, the low-field component, which has the longer transverse relaxation time, and the high-field component, which has the shorter one, are named TROSY and anti-TROSY, respectively. The slower transverse relaxation of the low-field component is due to the mechanism exploited in TROSY (Transverse Relaxation Optimized SpectroscopY). As the simulation demonstrates, the anti-TROSY component shows severe broadening even for a medium-size protein of 20 kDa.

The difference in line width between the TROSY and anti-TROSY components becomes considerable for higher molecular weight proteins. As seen in the simulation, a protein of 150 kDa gives a severely broadened anti-TROSY signal, which is already hard to observe. A protein of 800 kDa gives no observable anti-TROSY signal. The size limitation in the RDC-based approach is clearly demonstrated by this simulation.

It should be noted that the TROSY component can retain observable signal intensity even for an 800 kDa protein (Fig. 9). This motivated us to devise an approach that determines an alignment tensor using only the TROSY components.

#### **4.5 Existing remedies for overcoming the size limitation in the RDC-based approach**

Some remedies have been proposed to overcome the size limitation of the RDC application. They all rely on TROSY.

The difference in the transverse relaxation rates between the TROSY (low-field) and anti-TROSY (high-field) components split along the 15N axis is explained by the relaxation interference. The same effect is active for the signals split along the 1H dimension. When the 1H-15N single-bond correlation spectrum is observed without decoupling during both the t1 and t2 periods, each spin pair gives a quartet, with signals split in both the 1H and 15N dimensions. The pure TROSY signal is the component of the quartet that has the longest transverse relaxation time. When the protein is labeled with 15N and 2H, an unwanted relaxation process is diminished by breaking the 1H-1H dipolar interaction network in the protein; the TROSY effect is thereby enhanced, and 1H-15N correlation spectra can be observed for proteins over 100 kDa (Pervushin et al. 1997).

Fig. 9. Simulation of the molecular size dependency of the TROSY and anti-TROSY components observed on an IPAP-HSQC spectrum. The data represent a slice of a peak along the 15N axis. The simulation assumed a 750 MHz experiment. Black, blue, and red lines are the simulated peaks for sizes of 20 kDa, 150 kDa, and 800 kDa, respectively. The dotted line marks the assumed noise level.

One proposed remedy is the combined use of TROSY and HSQC. The difference between the TROSY and HSQC signal positions along the 15N axis is half of the doublet splitting, so its change upon alignment corresponds to half of the RDC. As discussed above, the transverse relaxation of the HSQC signal is slower than that of the anti-TROSY component in an IPAP-HSQC spectrum. Therefore, the size limitation problem should be alleviated by replacing the anti-TROSY signal with its HSQC counterpart. For an 81.4 kDa protein, the transverse relaxation times for the TROSY, anti-TROSY, and HSQC signals are reported to be 65 msec, 10 msec, and 30 msec, respectively (Tugarinov and Kay 2003). Considering the remaining difference in transverse relaxation time between the TROSY and HSQC signals, the combined use does not fully solve the problem but only alleviates it.

Another remedy is the use of J-scaled TROSY, which is also referred to as J-enhanced (JE) TROSY (Kontaxis, Clore and Bax 2000, Bhattacharya, Revington and Zuiderweg 2010). In this experiment, a short J-evolution step is added to the standard TROSY experiment, which induces a J-dependent shift change from the standard TROSY shift. In the standard TROSY experiment, the shift difference between the signals along the 15N axis at the same 1H chemical shift corresponds to $^{1}J_{NH}$, whilst in J-scaled TROSY this shift difference changes according to the additional duration for J-evolution. From the magnitude of the shift change induced by the applied J-evolution period, the apparent $^{1}J_{NH}$ coupling value is estimated. If the J-evolution period is long enough to allow full recovery of the $^{1}J_{NH}$ coupling, the observed signal position should coincide with that of the HSQC signal. Usually, to retain the intensity of the signal observed on J-scaled TROSY, a rather limited evolution time is set. In this J-evolution step, the coherences for TROSY and anti-TROSY are mixed.
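To put the relaxation times quoted above for the 81.4 kDa protein into perspective, the sketch below converts them into line widths and shows how an RDC would follow from TROSY/HSQC offsets. It assumes a Lorentzian line shape, for which the full width at half maximum is $1/(\pi T_2)$. The function names and the offsets in the usage line are hypothetical.

```python
import math


def linewidth_hz(t2_seconds):
    """Full width at half maximum (Hz) of a Lorentzian line with time T2."""
    return 1.0 / (math.pi * t2_seconds)


def rdc_from_trosy_hsqc(offset_iso_hz, offset_aligned_hz):
    """RDC (Hz) from TROSY-HSQC 15N offsets.

    Each offset is half of the doublet splitting, so the change upon
    alignment is half of the RDC and must be doubled.
    """
    return 2.0 * (offset_aligned_hz - offset_iso_hz)


# Transverse relaxation times (msec) quoted in the text for an 81.4 kDa protein.
t2_ms = {"TROSY": 65.0, "anti-TROSY": 10.0, "HSQC": 30.0}
widths = {name: linewidth_hz(t2 * 1e-3) for name, t2 in t2_ms.items()}

for name, width in widths.items():
    print(f"{name:10s} T2 = {t2_ms[name]:4.0f} msec  ->  line width = {width:5.1f} Hz")

# Hypothetical offsets (Hz) between the TROSY and HSQC peaks of one residue.
example_rdc = rdc_from_trosy_hsqc(-46.5, -42.5)
```

Under this assumption the line widths are about 4.9, 31.8, and 10.6 Hz, so the HSQC line is roughly one third as wide as the anti-TROSY line but still about twice as wide as the TROSY line. Because the measured offset change is doubled, any error in reading the peak positions is doubled in the resulting RDC as well.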

An equal mixture of the TROSY and anti-TROSY coherences gives the coherence observed as the HSQC signal. The larger the contribution of the anti-TROSY coherence to the observed signal, the broader the signal becomes. Therefore, in the J-scaled TROSY experiment, only partial recovery of the J-modulation is used, to keep the signal intensities on the J-scaled TROSY spectrum at an observable level.

Signals observed on a J-scaled TROSY spectrum have longer transverse relaxation times than those on an HSQC spectrum. Their transverse relaxation times are, however, still shorter than those of the real TROSY counterparts. The combined use of TROSY and J-scaled TROSY is indeed advantageous over the TROSY/HSQC combination. To determine more accurate RDCs, however, J-scaled TROSY requires a greater extent of mixing with the anti-TROSY coherence, which results in less sensitive J-scaled TROSY signals. The use of J-scaled TROSY is therefore not a complete remedy for the problem that concerns us.

In spite of their limitations, the existing approaches have expanded the RDC application to proteins of up to 50 kDa (Jain, Noble and Prestegard 2003). However, it is also reported that the rapid transverse relaxation of the non-TROSY component is already an obstacle in measuring the RDCs for an 81.4 kDa protein. Further expansion of the application limit is expected, and our DIORITE is one of the possible methods for this purpose.

#### **5. Alignment tensor determination using only TROSY**

As discussed above, the molecular size limitation in the RDC-based domain orientation analysis has not been completely overcome, although the existing approaches have given some successful results. Most of the biologically interesting multi-domain proteins tend to be over 100 kDa, and the existing approaches are not thought to be applicable to proteins of such high molecular weight. This is because they do not take full advantage of TROSY spectroscopy, which gives the longest transverse relaxation time for the observed signals. In contrast to the existing approaches, our approach, DIORITE, uses only TROSY spectra, in which the signals having the longest transverse relaxation times are observed. This may give considerable advantages over the existing methods with respect to the size limitation problem. In this section, we describe the theoretical aspects of the TROSY-based alignment tensor determination, which will allow the domain orientation analysis of proteins of higher molecular weight than before.

#### **5.1 Alignment induced TROSY shift changes**

The TROSY shift changes when a protein is transferred from an isotropic to an anisotropic state. This TROSY shift change along the 15N axis contains the effects of two anisotropic spin interactions observed on a peptide plane: the 1H-15N dipolar interaction and the 15N CSA. As depicted in Fig. 10, this alignment induced TROSY shift change, $\Delta\delta_{TROSY}$, contains half of the RDC and the full RCSA effect. In a Cartesian representation using the Saupe order matrix, the following relation should hold:

$$\begin{split} \Delta\delta_{\text{TROSY}} &= -\frac{1}{2}\left(\frac{\mu_0}{4\pi}\right)\frac{\gamma_H \gamma_N h}{2\pi^2 r_{NH}^3} \sum_{kl=x,y,z} S_{kl} \cos(\alpha_k) \cos(\alpha_l) + \frac{2}{3} \sum_{kl=x,y,z} \sum_{j=x,y,z} S_{kl} \cos(\theta_{kj}) \cos(\theta_{lj})\, \delta_{jj} \\ &= \sum_{kl=x,y,z} S_{kl} \left\{ \frac{1}{2} D^0_{NH} \cos(\alpha_k) \cos(\alpha_l) + \frac{2}{3} \Delta_{kl} \right\} \end{split} \tag{9}$$

where $\cos(\alpha_k)$ is the direction cosine of the NH bond vector relative to the molecular axis $k$ ($k$ = x, y, z). $D^0_{NH}$ is the static dipolar coupling constant, which equals 23.0 and 21.7 kHz for assumed NH bond lengths of 1.02 Å and 1.04 Å, respectively; in solution, the effective NH bond length is estimated to be 1.04 Å, a value that includes the effects of bond libration. The term $\cos(\theta_{kj})$ is the direction cosine of the CSA principal axis $j$ relative to the molecular axis $k$. The principal value of the CSA tensor along the $j$ axis is denoted as $\delta_{jj}$, and

$$\Delta_{kl} = \sum_{j=x,y,z} \cos(\theta_{kj}) \cos(\theta_{lj})\, \delta_{jj}, \qquad k, l = x, y, z \tag{10}$$

As done for the RDC, applying the SVD calculation to Eq. (9) written for the residues in a protein gives the Saupe order matrix. At least five $\Delta\delta_{TROSY}$ data are required to determine the Saupe order matrix, which is constituted by five independent elements. The alignment tensor is obtained through the diagonalization of the Saupe matrix.

Fig. 10. Schematic drawing of the relationship among the signals of a 1H-15N doublet in 1H-coupled HSQC spectra observed for a protein in the isotropic and aligned states. $\Delta\delta_{TROSY}$ contains the contributions from half of the RDC and the full RCSA.

DIORITE is the algorithm that determines the alignment tensor solely from TROSY spectra. DIORITE may be expected to give a more accurate alignment tensor than the 1H-15N RDCs, in particular for large proteins over 50 kDa, owing to the longest transverse relaxation time of the TROSY signal. DIORITE determines the alignment tensor from two anisotropic spin interactions, RDC and RCSA. As described in the theory part, RCSA contains tensorial information on the orientation of the peptide plane relative to the magnetic field, while RDC gives only the bond vector orientation. The RDC value does not change if the peptide plane is rotated about the NH bond axis, whereas the RCSA changes significantly. Because it includes the RCSA effect, DIORITE can discriminate differences in the peptide plane orientation. Therefore, DIORITE should be more informative than the RDC-based analysis in determining the domain orientation of a protein.
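The fit implied by Eq. (9) can be sketched in the same way as the RDC fit. The code below is an illustration under stated assumptions, not the DIORITE implementation. For each residue it forms $M_{kl} = \frac{1}{2} D^0_{NH}\cos(\alpha_k)\cos(\alpha_l) + \frac{2}{3}\Delta_{kl}$, with $\Delta_{kl} = \sum_j \cos(\theta_{kj})\cos(\theta_{lj})\,\delta_{jj}$, and then solves $\Delta\delta_{TROSY} = \sum_{kl} S_{kl} M_{kl}$ by SVD-based least squares. The CSA principal values are assumed to be already converted to Hz; they, the bond vectors, the CSA axes, and all function names are hypothetical.

```python
import numpy as np

D0_NH = 21.7e3  # Hz; value quoted in the text for an N-H distance of 1.04 Å


def residue_matrix(nh_vec, csa_axes, csa_vals_hz, d0=D0_NH):
    """M = (1/2) D0 v v^T + (2/3) Delta for one residue, in Hz."""
    v = nh_vec / np.linalg.norm(nh_vec)
    # Delta_kl = sum_j cos(theta_kj) cos(theta_lj) delta_jj;
    # the columns of csa_axes are the CSA principal axes in the molecular frame.
    delta = csa_axes @ np.diag(csa_vals_hz) @ csa_axes.T
    return 0.5 * d0 * np.outer(v, v) + (2.0 / 3.0) * delta


def design_row(m):
    """Coefficients of (Syy, Szz, Sxy, Sxz, Syz) after using Sxx = -Syy - Szz."""
    return np.array([m[1, 1] - m[0, 0], m[2, 2] - m[0, 0],
                     2 * m[0, 1], 2 * m[0, 2], 2 * m[1, 2]])


def fit_saupe_from_trosy(matrices, shifts_hz):
    """Least-squares (SVD-based) fit of the Saupe matrix to TROSY shift changes."""
    A = np.array([design_row(m) for m in matrices])
    (syy, szz, sxy, sxz, syz), *_ = np.linalg.lstsq(A, shifts_hz, rcond=None)
    return np.array([[-syy - szz, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, szz]])


# Synthetic check with arbitrary geometry and an arbitrary traceless CSA (Hz).
rng = np.random.default_rng(2)
csa_vals = np.array([60.0, 45.0, -105.0]) * 76.0   # hypothetical ppm * Hz/ppm
mats = []
for _ in range(15):
    axes, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthonormal axes
    mats.append(residue_matrix(rng.normal(size=3), axes, csa_vals))

S_true = np.array([[3e-4, -1e-4, 2e-4],
                   [-1e-4, -8e-4, 0.0],
                   [2e-4, 0.0, 5e-4]])               # symmetric and traceless
shifts = np.array([np.sum(S_true * m) for m in mats])

S_fit = fit_saupe_from_trosy(mats, shifts)
```

In a real application the CSA principal axes are tied to the peptide plane geometry rather than drawn at random, which is what gives the TROSY shift change its sensitivity to the peptide plane orientation.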


$$\Delta_{kl} = \sum_{j=x,y,z} \cos(\theta_{kj}) \cos(\theta_{lj}) \delta_{jj} \tag{10}$$

where $\cos(\alpha_k)$ is the direction cosine of the NH bond vector relative to the molecular axis $k$ ($k$ = x, y, z). $D^0_{\text{NH}}$ is the static dipolar coupling constant, which equals 23.0 and 21.7 kHz for assumed NH bond lengths of 1.02 Å and 1.04 Å, respectively; in solution, the effective NH bond length is estimated to be 1.04 Å, a value that includes the bond libration effects. The term $\cos(\theta_{kj})$ is the direction cosine of the CSA principal axis $j$ relative to the molecular axis $k$. The principal value of the CSA tensor along the $j$ axis is denoted as $\delta_{jj}$. As done for the RDC, applying the SVD calculation to the set of equations Eq. (9) for the residues in a protein gives the Saupe order matrix. At least five $\Delta\delta_{\text{TROSY}}$ data are required to determine the Saupe order matrix, which is constituted by five independent elements. The alignment tensor is obtained through the diagonalization of the Saupe matrix.
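The fitting procedure described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the DIORITE implementation itself: the function names are hypothetical, the CSA principal values are assumed to be already converted to Hz at the 15N frequency in use, and the SVD step is delegated to `numpy.linalg.lstsq`, which solves the least-squares problem with an SVD-based routine. The five fitted parameters are $S_{yy}$, $S_{zz}$, $S_{xy}$, $S_{xz}$ and $S_{yz}$, with $S_{xx} = -S_{yy} - S_{zz}$ imposed by the traceless condition.

```python
import numpy as np

# Static NH dipolar coupling for r_NH = 1.04 Å, as quoted in the text (Hz).
D0_NH_HZ = 21.7e3


def coefficient_matrix(nh_unit, csa_axes, csa_values_hz, d0_hz=D0_NH_HZ):
    """Per-residue 3x3 matrix A such that dTROSY = sum_kl S_kl * A_kl (Eq. 9).

    nh_unit       : (3,) unit NH vector in the molecular frame, i.e. cos(alpha_k).
    csa_axes      : (3, 3) array, csa_axes[k, j] = cos(theta_kj).
    csa_values_hz : (3,) CSA principal values delta_jj, assumed to be in Hz.
    """
    dipolar = 0.5 * d0_hz * np.outer(nh_unit, nh_unit)
    delta = csa_axes @ np.diag(csa_values_hz) @ csa_axes.T  # Eq. 10
    return dipolar + (2.0 / 3.0) * delta


def design_row(a):
    """Row of the linear system for the five independent Saupe elements."""
    return np.array([
        a[1, 1] - a[0, 0],  # coefficient of Syy
        a[2, 2] - a[0, 0],  # coefficient of Szz
        2.0 * a[0, 1],      # coefficient of Sxy
        2.0 * a[0, 2],      # coefficient of Sxz
        2.0 * a[1, 2],      # coefficient of Syz
    ])


def fit_saupe(coeff_matrices, observed_hz):
    """Least-squares (SVD-based) fit of the Saupe matrix to observed shifts."""
    observed_hz = np.asarray(observed_hz, dtype=float)
    if observed_hz.size < 5:
        raise ValueError("at least five TROSY shift changes are required")
    design = np.array([design_row(a) for a in coeff_matrices])
    params, *_ = np.linalg.lstsq(design, observed_hz, rcond=None)
    syy, szz, sxy, sxz, syz = params
    return np.array([
        [-syy - szz, sxy, sxz],
        [sxy, syy, syz],
        [sxz, syz, szz],
    ])


def alignment_tensor(saupe):
    """Diagonalize the Saupe matrix; axes sorted so that |Sxx| <= |Syy| <= |Szz|."""
    values, vectors = np.linalg.eigh(saupe)
    order = np.argsort(np.abs(values))
    return values[order], vectors[:, order]
```

In use, one `coefficient_matrix` would be built per residue from the domain coordinates and the chosen 15N CSA tensor, and `fit_saupe` would be run separately for each domain so that the resulting principal axes can be compared between domains.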

Fig. 10. Schematic drawing of the relationship among the signals of the 1H-15N doublet in 1H-coupled HSQC spectra observed for a protein in the isotropic and aligned states. $\Delta\delta_{\text{TROSY}}$ contains the contributions from a half of the RDC and the full RCSA.

DIORITE is the algorithm to determine the alignment tensor solely from TROSY spectra. DIORITE may be expected to give a more accurate alignment tensor than the 1H-15N RDCs, in particular for large proteins over 50 kDa, owing to the longest transverse relaxation time of the TROSY signal. DIORITE determines the alignment tensor based on two anisotropic spin interactions, RDC and RCSA. As described in the theory part, RCSA contains the tensorial orientational information of the peptide plane relative to the magnetic field, while RDC gives only the bond vector orientation. The RDC value does not change if the peptide plane is rotated about the NH bond axis, whereas the RCSA changes significantly. Because of the inclusion of the RCSA effect, DIORITE can discriminate differences in the peptide plane orientation. Therefore, DIORITE should be more informative than the RDC-based analysis in determining the domain orientation of a protein.


#### **5.2 CSA tensor parameters used in DIORITE**

DIORITE-based alignment tensor determination requires the 15N CSA tensor for each residue as an input value. As discussed in the part on the RCSA, it is not trivial to obtain an accurate 15N CSA tensor for each residue in a protein, because the tensor depends significantly on the local structure, including the backbone torsion angles and hydrogen bonding.

In the domain orientation analysis with DIORITE, we use high-resolution domain coordinates from X-ray crystallography, so the detailed structure of each domain is known before the analysis. Some recent reports on the experimental determination of the residue-specific 15N CSA tensors for small proteins in solution have demonstrated that the 15N CSA tensor value depends primarily on the backbone torsion angles. Based on this correlation, we proposed a practical protocol for the DIORITE analysis that uses secondary-structure-specific 15N CSA tensors as inputs.

Fig. 11. Comparison of the back-calculated $\Delta\delta_{\text{TROSY}}$ obtained by the DIORITE algorithm using the secondary-structure-specific and the residue-specific 15N CSA tensors. Red and green circles are the values obtained with the secondary-structure-specific and residue-specific tensors, respectively. Black circles are the observed $\Delta\delta_{\text{TROSY}}$.

The quality of the back-calculated $\Delta\delta_{\text{TROSY}}$ values was assessed on ubiquitin. The values obtained with the secondary-structure-specific 15N CSA tensors and those with the residue-specific tensors were compared (Fig. 11); the residue-specific 15N CSAs used here were determined by a set of elaborate spin relaxation analyses by Bodenhausen and co-workers (Loth, Pelupessy and Bodenhausen 2005). As demonstrated in the comparison, the use of the secondary-structure-specific 15N CSA tensors gives results consistent with those obtained with the residue-specific values within an error range (Fig. 11). The DIORITE analysis using the secondary-structure-specific 15N CSA tensors, therefore, can be a practical approach for exploring the domain orientation in a protein.

#### **5.3 DIORITE analysis using different magnetic field strengths**

The TROSY effect reduces the line width along the 15N dimension at higher magnetic fields; the original paper on TROSY estimated that the optimal frequency for obtaining the narrowest lines is around 1 GHz (Pervushin et al. 1997). In this section, the optimal magnetic-field strength for the DIORITE analysis is discussed.

The alignment-induced TROSY shift change, $\Delta\delta_{\text{TROSY}}$, shows a significant magnetic-field dependency due to the RCSA contribution. The magnitude of the RCSA is proportional to the applied magnetic-field strength, while the RDC is independent of it. The field dependency of the RCSA gives a peculiar profile to the $\Delta\delta_{\text{TROSY}}$ value.

In a peptide plane, the least shielded 15N CSA tensor axis, $\delta_{xx}$, is close to the NH bond; the angle between them ranges from 15 to 20 degrees (Fig. 2b). This small angle results in a cancellation between the RDC and RCSA contributions, making the observed $\Delta\delta_{\text{TROSY}}$ value smaller than the corresponding RDC. For most of the residues in a protein, the RDC and RCSA have opposite signs. Therefore, $\Delta\delta_{\text{TROSY}}$ should be roughly equivalent to a half of the RDC minus the RCSA in their absolute values. The $\Delta\delta_{\text{TROSY}}$ value tends to be approximately one-third of the RDC in absolute magnitude.

The cancellation effect depends on the magnetic-field strength, but the dependency is not linear. Using ubiquitin, we simulated the root-mean-square (rms) $\Delta\delta_{\text{TROSY}}$ values at different magnetic-field strengths, indicated by the proton resonance frequencies. In the simulation, the rms $\Delta\delta_{\text{TROSY}}$, which should correspond to the average magnitude of $\Delta\delta_{\text{TROSY}}$, decreases up to 800 MHz and then increases with the field strength. At lower magnetic-field strengths, the RCSA contribution is rather small and $\Delta\delta_{\text{TROSY}}$ is roughly approximated by a half of the RDC. The increasing RCSA contribution, which has a sign opposite to that of the RDC, at first reduces the $\Delta\delta_{\text{TROSY}}$ value, but at still higher magnetic fields the RCSA contribution becomes dominant in the $\Delta\delta_{\text{TROSY}}$ value, which may explain the profile.

The condition for weak alignment is carefully tuned to avoid severe signal broadening. The alignment order is typically tuned to around $10^{-3}$, giving a maximal absolute magnitude of the RDC of around 20 Hz. Under this condition, $\Delta\delta_{\text{TROSY}}$ is expected to be about 6 Hz. Even at 800 MHz, the worst case, the expected $\Delta\delta_{\text{TROSY}}$ should be 5.5 Hz. The accuracy in reading peak positions at a signal-to-noise ratio of 40:1 was estimated to be 0.6 Hz. The experimental error in the observed $\Delta\delta_{\text{TROSY}}$ is therefore estimated to be 0.8 Hz, considering the error propagation through subtracting the TROSY shift in the isotropic state from that in the aligned state. Accordingly, the expected $\Delta\delta_{\text{TROSY}}$ is well resolved relative to the error in peak picking. Note that the estimated reading error in the RDC should be 1.2 Hz, because two subtractions are required.
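The error estimates quoted above can be reproduced with a short calculation. The sketch below assumes that the reading errors of individual peak positions are independent and equal, so that they add in quadrature; the function name is hypothetical. One TROSY shift change involves two peak positions (aligned and isotropic), whereas one RDC involves four (a doublet splitting in each state).

```python
import math


def propagated_error_hz(peak_error_hz, n_peak_positions):
    """Error of a sum or difference of n independently read peak positions,
    each with the same uncertainty (quadrature addition is assumed)."""
    return peak_error_hz * math.sqrt(n_peak_positions)


peak_error_hz = 0.6  # reading error at S/N of 40:1, value quoted in the text

# TROSY shift change: aligned position minus isotropic position.
trosy_error_hz = propagated_error_hz(peak_error_hz, 2)

# RDC: doublet splitting (two peaks) in each of the two states.
rdc_error_hz = propagated_error_hz(peak_error_hz, 4)

print(f"TROSY shift change error: {trosy_error_hz:.2f} Hz")
print(f"RDC error: {rdc_error_hz:.2f} Hz")
```

Under this assumption the TROSY shift change error comes out at roughly 0.85 Hz, in line with the 0.8 Hz quoted above, and the RDC error at 1.2 Hz.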

The resolution of the TROSY spectrum is also a factor to be considered in discussing the performance of DIORITE at different magnetic-field strengths. Using the average 15N CSA tensor value estimated from ubiquitin, the dependencies of the 15N TROSY line width and the rms $\Delta\delta_{\text{TROSY}}$ value are plotted against the proton resonance frequency, where the values are in ppb (parts per billion) units instead of Hz to compare them in the context of spectral resolution (Fig. 12a). In this simulation, the optimal frequency to minimize the 15N line width of the observed TROSY signal is 972.5 MHz, close to but slightly different from the value obtained by the simpler estimation in the original paper. Compared with the line narrowing of the TROSY signal, the rms $\Delta\delta_{\text{TROSY}}$ value shows less dependency on the magnetic-field strength. The effective accuracy in observing the $\Delta\delta_{\text{TROSY}}$ value should be estimated by the $\Delta\delta_{\text{TROSY}}$ value divided by the 15N line width of the TROSY signal (Fig. 12b). In this simulated profile, the optimal magnetic-field strength for observing $\Delta\delta_{\text{TROSY}}$ is around 900 MHz in 1H resonance frequency. Therefore, the magnetic-field strength for the maximal TROSY effect is almost optimal for the DIORITE analysis. The DIORITE analysis can thus take full advantage of the TROSY effect on a 900 MHz NMR spectrometer, which is now commercially available.

Fig. 12. Field dependency of the TROSY line width and the root-mean-square (rms) $\Delta\delta_{\text{TROSY}}$. (a) TROSY line width (green) and rms $\Delta\delta_{\text{TROSY}}$ (red) as functions of the magnetic-field strength. (b) Field-strength dependency of the effective resolution in measuring $\Delta\delta_{\text{TROSY}}$.

#### **5.4 Practical aspects of the DIORITE data collection**

DIORITE analysis uses the TROSY chemical shift changes $\Delta\delta_{\text{TROSY}}$ induced by a weak alignment of the protein (Tate 2008, Tate et al. 2004). In general, the chemical shift is very sensitive to the solution conditions, including temperature, sample concentration, pH, ionic strength and so on. In measuring $\Delta\delta_{\text{TROSY}}$, we have to exclude all factors that change the chemical shift other than the alignment effects. To achieve this, an anisotropically compressed polyacrylamide gel is the most appropriate medium for aligning the protein. As described above, the anisotropically compressed gel (stretched gel) is made by inserting a cast gel chip having a slightly greater diameter than the inner diameter of the NMR tube. A gel chip having the same diameter as that of the NMR tube is not compressed after insertion, and so keeps the cavities within it isotropic. A protein in this non-compressed gel is not aligned and does not show any anisotropic spin interactions. This sample, therefore, can be used to record the reference TROSY spectrum for measuring $\Delta\delta_{\text{TROSY}}$. Because the non-compressed gel has the same acrylamide composition, the protein in this gel experiences the same chemical environment as in the compressed gel, which ensures that the observed TROSY shift changes come solely from the alignment effects. It is difficult to obtain such reference data with other aligning media, including bicelles and filamentous phage, so they cannot be the choice for the DIORITE analysis.

Fig. 13. Domain rotation angle change according to the ligand size in maltose binding protein (MBP). (a) -CD, (b) -CD and (c) -CD complex states. The hinge rotation angles induced by ligand binding were 13 deg., 14 deg. and 22 deg. for the cases of -CD, -CD, and -CD, respectively.

In measuring $\Delta\delta_{\text{TROSY}}$ values, a set of TROSY spectra for the aligned and isotropic samples is used. The formation of the anisotropic cavity in the stretched gel can be monitored by the splitting of the deuteron signal from HOD; the deuteron in a water molecule within an anisotropic cavity shows a doublet due to the residual quadrupolar coupling of the deuteron (Fig. 13a). Stronger anisotropy in the cavity gives a larger splitting. The water deuteron signal in the reference gel shows a singlet, confirming that the cavity in the reference gel is isotropic.

In measuring NMR spectra, the water deuteron signal is used for the frequency lock, which gives the frequency standard. Because the deuteron signal is split in the aligned state, the observed resonance positions are biased; one component of the doublet is used for the frequency lock, which biases the signals by a half of the splitting width. This offset should be subtracted from each signal position on the TROSY spectrum of the protein in the aligned state.
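The lock correction described above can be sketched as follows. This is an illustrative calculation under assumptions that are not stated in the text: the bias is taken to be the same for all nuclei when expressed in ppm (half of the deuteron splitting divided by the 2H resonance frequency), and the sign of the bias, which depends on which doublet component the lock has caught, is left as a parameter `lock_sign` to be checked on the spectrometer. All function and parameter names are hypothetical.

```python
def lock_bias_ppm(deuteron_split_hz, deuteron_freq_mhz):
    """Half of the HOD deuteron splitting, expressed in ppm of the 2H frequency."""
    return 0.5 * deuteron_split_hz / deuteron_freq_mhz


def trosy_shift_change_hz(aligned_ppm, isotropic_ppm, deuteron_split_hz,
                          deuteron_freq_mhz, nitrogen_freq_mhz, lock_sign=1):
    """Alignment-induced 15N TROSY shift change in Hz after lock correction.

    aligned_ppm, isotropic_ppm : 15N TROSY peak positions in ppm.
    lock_sign                  : +1 or -1, depending on which component of the
                                 deuteron doublet was used for the lock
                                 (an assumption of this sketch).
    """
    bias = lock_sign * lock_bias_ppm(deuteron_split_hz, deuteron_freq_mhz)
    corrected_aligned_ppm = aligned_ppm - bias
    return (corrected_aligned_ppm - isotropic_ppm) * nitrogen_freq_mhz
```

For a sense of scale, with a hypothetical 10 Hz deuteron splitting on a spectrometer where 2H resonates at 92.1 MHz and 15N at 60.8 MHz, the bias corresponds to a few Hz along the 15N axis, which is comparable to the expected shift change itself, so the correction cannot be neglected.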

#### **5.5 DIORITE analyses on MBP in different ligand bound states**

Here, we demonstrate an application of the DIORITE analysis. Maltose binding protein (MBP) has two domains. From a series of X-ray analyses, MBP is known to show domain reorientation upon ligand binding, and the domain rotation angle depends on the size of the ligand molecule.

MBP binds to β-cyclodextrin (β-CD), which comprises seven glucose units, and it shows a slight change in the relative domain orientation from that in the apo form, as demonstrated by X-ray analysis. MBP also binds to other types of cyclodextrins comprising different numbers of glucose units: α-CD (six glucose units) and γ-CD (eight glucose units). We expected to see the domain rotation angle change according to the size of the three types of CDs.

Using the anisotropically compressed gel (stretched gel) and the uncompressed reference gel, we collected a pair of TROSY spectra for MBP in complex with each of α-, β- and γ-CD. Using the apo-form MBP X-ray coordinates, we analyzed the relative domain orientation using DIORITE; the model structure was constructed based on the alignment tensors individually determined for the N- and C-terminal domains. For each complex, the back-calculated and observed $\Delta\delta_{\text{TROSY}}$ values were well correlated, suggesting that the alignment tensors for each domain were well determined (Fig. 13).
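The step from two individually determined alignment tensors to a relative domain rotation can be sketched as below. This is a simplified illustration, not the procedure used for Fig. 13: it assumes that both domains are rigid and share a common alignment frame in solution, that each tensor's principal axes are given as the columns of an orthonormal matrix in the apo crystal coordinate frame, and that among the rotations allowed by the sign ambiguity of the principal axes the one with the smallest angle is the physically relevant one. The function names are hypothetical.

```python
import itertools
import math

import numpy as np


def rotation_angle_deg(rotation):
    """Rotation angle of a proper 3x3 rotation matrix, in degrees."""
    cos_angle = (np.trace(rotation) - 1.0) / 2.0
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def smallest_domain_rotation(frame_n, frame_c):
    """Rotation that superimposes the C-domain principal axes on the N-domain ones.

    frame_n, frame_c : (3, 3) arrays whose columns are the principal axes of
                       the alignment tensors fitted for the two domains.
    Principal axes are defined only up to sign, so every sign combination that
    yields a proper rotation is tried and the smallest rotation is returned.
    """
    best = None
    for signs in itertools.product((1.0, -1.0), repeat=3):
        flipped = frame_c * np.array(signs)  # flip the sign of selected columns
        rotation = frame_n @ flipped.T       # maps flipped axes onto frame_n axes
        if np.linalg.det(rotation) < 0.0:
            continue                         # skip improper rotations
        angle = rotation_angle_deg(rotation)
        if best is None or angle < best[1]:
            best = (rotation, angle)
    return best
```

Applying the returned rotation to the C-terminal domain coordinates, about a suitably chosen hinge point, would give a model in which both domains share one alignment frame; choosing the pivot and resolving the remaining degeneracy require structural information beyond this sketch.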

The DIORITE analyses on the MBP complexes demonstrated that for the smallest ligand, α-CD, the domain rotation angle was significantly larger than that for the β-CD complex structure. On the other hand, MBP in complex with γ-CD retains almost the same domain orientation as in the β-CD complex; a ligand size beyond that of β-CD does not change the domain rotation angle.

These example analyses demonstrated that MBP shows significant domain reorientation according to the ligand size. It should be noted that the DIORITE analysis is very efficient for observing this ligand-dependent domain reorientation, as it requires just a pair of TROSY spectra collected for the sample in the aligned and isotropic states.

#### **6. Conclusion**

The number of X-ray structures is still growing exponentially every year. This huge collection of protein coordinates should prompt protein structure work that uses the existing protein coordinate data. In this review, we introduced protein structure analysis in solution with the assistance of protein structure data collected by X-ray crystallography.

There has been much discussion on the significance of solution structure determination by NMR. Most structures of single-domain proteins or isolated domains show only marginal deviations from the corresponding structures solved by X-ray crystallography. This diminishes the importance of NMR structure analysis, except in cases where crystallization is difficult. When an X-ray structure is available for a protein that behaves as an independent structural unit, its solution structure is not usually determined by NMR, because the crystal structure should not differ greatly from that in solution. Additionally, in most cases, the size limitation of NMR structure determination prohibits such solution structure analyses even when they are required. NMR solution structure analysis has therefore been recognized as a complementary method in rather limited cases.

The situation seems different in the structure analysis of multiple-domain proteins. Some examples have already appeared that show differences in the domain orientations of a protein between the solution and crystalline states. Such examples will increase, because it is becoming clear that many proteins have domains connected by linkers that, judging from their sequences, appear flexible or unstructured. In these proteins, the domain arrangement presumably tends to be defined artificially by crystal packing and thus does not represent the protein morphology in solution.

NMR techniques using weak alignment have paved the way to directly determining the relative domain orientation in solution, which had never been achieved by conventional NMR methods. Some variations of these methods were introduced in this review, along with their limitations in practical applications. Our DIORITE approach has a significant advantage over the existing methods in domain orientation analysis when it is applied to higher molecular weight proteins. Domain orientation analysis by DIORITE will extend the reach of X-ray structure assisted exploration of protein morphological changes in solution, which are associated with the exertion of function.
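To make the notion of a relative domain orientation concrete, the following is a minimal sketch of how a domain rotation angle could be derived once an alignment tensor has been determined for each domain. It is not the DIORITE implementation. It assumes that both tensors are given as symmetric 3×3 matrices in one common coordinate frame (for example that of the crystal structure), that their principal values are well separated, and that the smallest of the four equivalent rotations is the relevant one. The function names are hypothetical.

```python
import numpy as np

# Frames related by 180-degree flips about the principal axes describe the
# same alignment tensor, so all four equivalent frames are tried below.
FLIPS = [np.diag(d) for d in ((1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1))]


def principal_frame(tensor):
    """Right-handed principal-axis frame (as columns) of a symmetric 3x3 tensor."""
    eigvals, eigvecs = np.linalg.eigh(tensor)
    order = np.argsort(np.abs(eigvals))   # order axes by |principal value|
    frame = eigvecs[:, order]
    if np.linalg.det(frame) < 0:          # enforce a proper rotation
        frame[:, 0] = -frame[:, 0]
    return frame


def rotation_angle_deg(rot):
    """Rotation angle of a 3x3 rotation matrix, in degrees."""
    cos_angle = (np.trace(rot) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def domain_rotation(tensor_n, tensor_c):
    """Smallest rotation that superimposes the C-domain principal frame
    onto the N-domain principal frame (assumption: smallest angle wins)."""
    frame_n = principal_frame(tensor_n)
    frame_c = principal_frame(tensor_c)
    candidates = []
    for flip in FLIPS:
        rot = frame_n @ flip @ frame_c.T
        candidates.append((rotation_angle_deg(rot), rot))
    return min(candidates, key=lambda c: c[0])


if __name__ == "__main__":
    # Synthetic example: the C-domain tensor is the N-domain tensor
    # rotated by 20 degrees about the z axis.
    tensor_n = np.diag([1e-4, 3e-4, -4e-4])
    t = np.radians(20.0)
    rz = np.array([[np.cos(t), -np.sin(t), 0.0],
                   [np.sin(t), np.cos(t), 0.0],
                   [0.0, 0.0, 1.0]])
    angle, _ = domain_rotation(tensor_n, rz @ tensor_n @ rz.T)
    print(f"domain rotation angle: {angle:.1f} degrees")
```

Trying the four flipped frames reflects the well-known fourfold degeneracy of an orientation derived from a single alignment tensor. Selecting the smallest angle is only a convenience for models that start close to the reference structure; in practice the ambiguity must be resolved with additional information, such as linker geometry or a second alignment medium.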

Domain rearrangement in a protein, or protein morphological change, upon binding to a ligand or interacting with a partner protein will become much more important in discussing protein functional regulation once the high-resolution crystal structure in a specific state, for example the apo form, has been obtained. The combined use of DIORITE with X-ray structure data may give vivid views of how a protein works in solution by changing its morphology. NMR is now becoming a complementary partner to X-ray crystallography in protein structure analysis, in particular for higher molecular weight proteins.
