382 Current Trends in X-Ray Crystallography

these situations the SAXS data can be used to generate the missing amino acid loops in the known structure using the dummy chain approach (program BUNCH, Petoukhov and Svergun, 2005) and/or to obtain the spatial arrangement of known domains that forms the full structure (program SASREF, Petoukhov and Svergun, 2005). Both the generation of the missing loops and the optimization of the domain arrangement are performed with Monte Carlo methods which, as in the previous cases, do not lead to a unique solution. Even so, the obtained model is a good representation of the overall structure. Test examples are shown in Fig. 14 and Fig. 15. In Fig. 14 part of the lysozyme structure was clipped and, as can be seen in the curve, without the loop the atomic model cannot fit the experimental data correctly. With the addition of a dummy chain loop and its optimization it is possible to obtain a very good fit of the experimental data. The generated loop (blue loop in the model) is a reasonable approximation of the real loop superposed to it.

[Figure 14, right panel: I(q) [arb. u.] versus q [Å⁻¹]; legend: Experimental, Fit without the loop, Final model.]

Fig. 14. *Ab initio* modeling of the missing loop of a hypothetical structure using experimental SAXS data. Left: crystallographic structure of lysozyme (pdb entry *6lyz.pdb*) and the restored loop. Semitransparent structure: lysozyme structure with a missing part. Blue backbone: restored loop superposed to the real, clipped loop. Right: fit of the experimental data. Open circles: experimental SAXS data for lysozyme in solution. Dotted line: fit of the scattering data for the structure without the loop. Solid line: fit of the scattering data for the structure with the optimized loop.

In Fig. 15 a hypothetical heterodimer is shown. The optimization of the structure components does not give perfect agreement with the initial structure, but the similarity is remarkable, indicating that SAXS data can also be applied in these cases. The situations presented here are only a small sample of the possible applications of these modeling tools. Advanced modeling examples based on these procedures can be found in several articles in the literature (Svergun, 2007). An intrinsic problem of any SAXS modeling is the ambiguity that may arise in the results. In general, it is not possible to obtain a unique solution from the modeling procedure. It is therefore necessary to complement any scattering modeling with additional information in order to reduce the number of possible solutions. There are several ways of doing this. When available, information about binding sites or the specific arrangement of domains can be used as constraints in the modeling. Results from biochemical and biophysical techniques can provide useful information about structural changes or binding. For example, fluorescence spectroscopy and isothermal titration calorimetry can provide important information on binding and stoichiometry. In recent applications the simultaneous modeling of scattering

**3. Applications**

Two applications of SAXS analysis are presented. The first is an *in situ* aggregation study of lysozyme; the second is the structural characterization of a giant protein complex. Both illustrate how the SAXS technique can be used to investigate biological systems.

## **3.1 Lysozyme denaturation and aggregation induced by heat**

The structure of a protein is intrinsically related to its shape. The shape, in turn, results from protein folding. In the native state, proteins are known to adopt hierarchical structures, which might result from a multistep folding process. One way to investigate this characteristic is to induce protein denaturation. Denaturation, or unfolding, can be induced by changes in temperature or pH, or by the addition of denaturing agents such as sodium dodecyl sulfate (SDS). A study of heat-induced denaturation is presented here.

The experiments were performed at the SAXS beamline of the Brazilian Synchrotron Light Laboratory, Campinas, Brazil. The wavelength selected for the experiments was λ = 1.49 Å and the distance between the sample and the detector was 745 mm. The measurements were performed using a 1D Gabriel-type detector. The samples were exposed in a 1.5 mm capillary tube in a thermally controlled sample holder directly connected to the evacuated beam path. The experiments were performed with lysozyme samples at 10 mg/mL and pH 7.0 in a 10 mM phosphate buffer with 50 mM NaCl. Indirect Fourier transformations were performed using the GNOM program package, which enabled the correction of smearing effects. *Ab initio* models were built using the program DAMMIN.
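The wavelength and sample-detector distance quoted above fix the relation between a position on the detector and the momentum transfer q. The following is a minimal sketch, assuming the usual definition q = (4π/λ) sin θ with 2θ the scattering angle. The function name and the radial positions `r_mm` are hypothetical, since the detector channel positions are not given in the text.

```python
import math

def q_from_detector_position(r_mm, wavelength_A=1.49, distance_mm=745.0):
    """Momentum transfer q (1/Å) for a point r_mm away from the beam center.

    Assumes q = (4*pi/lambda) * sin(theta), with 2*theta = atan(r / D),
    where D is the sample-detector distance.
    """
    two_theta = math.atan2(r_mm, distance_mm)
    return 4.0 * math.pi / wavelength_A * math.sin(two_theta / 2.0)

# Hypothetical radial positions (mm), chosen only for illustration
for r_mm in (1.0, 10.0, 40.0):
    q = q_from_detector_position(r_mm)
    print(f"r = {r_mm:5.1f} mm -> q = {q:.4f} 1/Å")
```

At small angles the result is close to the approximation q ≈ 2πr/(λD), which is a convenient sanity check.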

Investigating Macromolecular Complexes in Solution by Small Angle X-Ray Scattering 385

Fig. 17. Average number of monomers per aggregate as a function of time. At least two aggregation rates can be found. The lines are only a guide to the eye.

Fig. 16. Aggregation of lysozyme induced by heat. Top left: scattering data (open circles) and desmeared IFT fits (solid lines). The frames were collected at 80 °C at intervals of 5 minutes (first: 5 min; last: 60 min). A frame of lysozyme at room temperature (open triangles and dotted line) was added for comparison. Top right: pair distance distribution functions *p(r)* for each dataset. The *p(r)* obtained from the SAXS data for lysozyme at room temperature (dotted line) was added for comparison. Bottom: *Ab initio* models restored for each frame, showing the increase in size of the average model. For comparison, the crystal structure of lysozyme is shown on the left as ribbons.

The results are shown in Fig. 16. As can be seen, when the protein solution is kept at 80 °C the SAXS profiles evolve as a function of time. As shown in equations 13 and 14, the forward scattering I(0) provides an estimate of the molecular weight. For a system that forms aggregates over time, the molecular weight obtained is an averaged value, since a distribution of sizes can be present in the system. However, because the forward scattering is proportional to the square of the particle volume, large particles contribute more to the final intensity. If one assumes that at each stage the aggregates have a similar size, then, since the total mass of protein is constant, it is possible to write,

$$I(0)_{\text{agg}} = \frac{c\,(\Delta\rho_M)^2}{N_A}\,\overline{M_{W_{\text{agg}}}} \tag{19}$$

If we normalize *I(0)agg* by the forward scattering of lysozyme measured at room temperature (native state) at the same concentration, the ratio is a good estimate of the average number of monomers per aggregate:

$$\frac{I(0)_{\text{agg}}}{I(0)_{\text{lyz},\,20^{\circ}\text{C}}} = N_{\text{mon}} \tag{20}$$
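The normalization in equation 20 can be sketched in a few lines. This is only an illustration: the function names and the I(0) values below are hypothetical and are not the measured data. The second function shows why the ratio is a mass-weighted average when several aggregate sizes coexist, assuming every species has the same contrast and that the total protein concentration is conserved.

```python
def monomers_per_aggregate(i0_agg, i0_native):
    """Equation 20: ratio of forward scattering intensities at equal concentration."""
    if i0_native <= 0:
        raise ValueError("native forward scattering must be positive")
    return i0_agg / i0_native

def weight_average_size(mass_fractions, sizes):
    """Apparent N_mon for a mixture: sum of (mass fraction * monomers per aggregate).

    Follows from I(0) being proportional to the sum of c_i * M_i
    when all species share the same contrast.
    """
    total = sum(mass_fractions)
    return sum(w * n for w, n in zip(mass_fractions, sizes)) / total

# Hypothetical I(0) values in arbitrary units (NOT the measured data)
i0_native = 0.020
i0_frames = {5: 0.021, 30: 0.100, 60: 0.900}
for minutes, i0 in sorted(i0_frames.items()):
    n_mon = monomers_per_aggregate(i0, i0_native)
    print(f"{minutes:2d} min: N_mon ~ {n_mon:.1f}")

# Half of the protein mass as monomers and half as 9-mers appears as N_mon = 5
print(weight_average_size([0.5, 0.5], [1, 9]))
```

The mixture example illustrates the remark above that large particles dominate the average: a small mass fraction of large aggregates raises the apparent number of monomers per aggregate considerably.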


The forward scattering for the native protein and for the first 5 min frame at 80 °C are almost identical, indicating that at this stage the protein is still monomeric. However, the differences in the scattering curve and in the *p(r)* function relative to the native state (Fig. 16) indicate that the protein has adopted a different conformation. This partially denatured state is known as the molten globule. Interestingly, the protein starts to denature at 80 °C and is stable at lower temperatures (data not shown). Using equation 20, it was possible to calculate the average number of monomers per aggregate, shown as a function of time in Fig. 17. At least two aggregation rates can be identified in this graph, which suggests that the aggregation is initially slow but accelerates after 30 minutes at 80 °C, at around 5 monomers per aggregate, reaching around 45 monomers per aggregate after 1 h at 80 °C. A visualization of the obtained aggregates is shown in Fig. 16, which agrees with the above conclusions. These results confirm other results from the literature indicating that the denaturation of lysozyme can be understood as a one-stage process (Hirai et al., 1998).
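One simple way to quantify two aggregation rates is to fit a straight line to each time window and compare the slopes. The sketch below is an assumption about how this could be done, not the procedure used for Fig. 17. The data points are hypothetical: they only mimic the behavior described in the text (about 1 monomer at 5 min, about 5 at 30 min, about 45 at 60 min), and the 30 min break point is taken from the text rather than fitted.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical N_mon values (NOT read from Fig. 17), one per 5 min frame
early_t = [5, 10, 15, 20, 25, 30]
early_n = [1.0, 1.8, 2.6, 3.4, 4.2, 5.0]
late_t = [30, 35, 40, 45, 50, 55, 60]
late_n = [5.0, 11.7, 18.3, 25.0, 31.7, 38.3, 45.0]

slope_early = least_squares_slope(early_t, early_n)
slope_late = least_squares_slope(late_t, late_n)
print(f"early rate: {slope_early:.2f} monomers/min")
print(f"late rate:  {slope_late:.2f} monomers/min")
```

With real data the break point would itself be uncertain, so a piecewise fit with a free break point, or a comparison against a single-rate model, would be a more careful choice.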

## **3.2 Shape and low resolution structure of extracellular hemoglobins calculated from SAXS data**

Given the inherent difficulty of crystallizing proteins with high molecular weights, low resolution studies in solution have been the main tool for the structural study of extracellular hemoglobins. The physicochemical properties of extracellular hemoglobins (erythrocruorins) have been under study since the 1930s. In particular, different oxygen affinities and cooperativities were reported for molecules with very similar heme content, dimensions and molecular weight. This led investigators to focus on the possible structural differences that could explain this diverse functional behavior. Two very comprehensive reviews on the structure of extracellular hemoglobins have been published by Chung and Ellerton (1979) and Weber (2001). The challenge has always been to elucidate the interactions among the more than 200 subunits of these respiratory proteins, which lead to the spontaneous, self-limited assembly and cooperative oxygen binding and are not yet completely understood. In this section the results of the study of the extracellular hemoglobin from *Glossoscolex paulistus*, with a molecular weight of ~3,100 kDa, are presented. Advanced methods of shape restoration from the X-ray scattering data allowed a description of the subunit arrangement of these molecules as well as the determination of dimensional parameters, which could also be confirmed by the results of hydrodynamic measurements and calculations for the proposed models. There are only minor differences between the properties already reported for the subunit structure of *Lumbricus terrestris* hemoglobin (Fushitani et al., 1991) and the previous works on the structural subunits of *G. paulistus* studied by pH-induced and high-pressure dissociation (Bonafe et al., 1991; Silva et al., 1989), indicating the similarity of these proteins in spite of the differences in molecular weight.

| Parameter | Experimental data | Model |
|---|---|---|
| Molecular weight [MDa] | 3.1 ± 0.2 \* | --- |
| Sedimentation coefficient [S] | 58 \*\* | 57 |
| Stokes radius [Å] | 139 \*\*\* | 138 |
| Diffusion coefficient [10⁻⁷ cm²/s] | 1.56 \*\*\* | 1.55 |
| f/fmin | 1.42 \*\*\* | 1.41 |

Table 2. Hydrodynamic parameters and molecular weight of the hemoglobin from *G. paulistus* obtained from SAXS models and experimental techniques. \* from experiment, \*\* from Costa MCP 1988, \*\*\* from S value.

Samples of *G. paulistus* hemoglobin were purified according to a standard procedure (Silva et al., 1989; Bonafe et al., 1991) at several concentrations. SAXS measurements were made using synchrotron radiation at the Brazilian Synchrotron Light Laboratory, with hemoglobin in 0.05 M TRIS-HCl buffer at pH 7.5. The hemoglobin concentrations used in the experiments varied from 0.5 to 40 mg/mL, and the final combination of the frames enabled the extrapolation to zero concentration. The scattered intensities were recorded with a linear position-sensitive detector, and the primary data correction was done using standard procedures. The *q* range was from 0.005 to 0.1882 Å⁻¹, with a radiation wavelength of λ = 1.74 Å. To collect the low- and high-angle scattering data, two sample-detector distances were used (1.74 m and 0.84 m). The samples were kept in a 1.5 mm diameter capillary tube sample holder at a constant temperature (20 °C). Indirect Fourier transformation was performed using the GNOM program package. *Ab initio* calculations were performed using the program DAMMIN. The experimental scattered intensity was normalized to absolute scale using water as a primary standard, which enabled the calculation of the protein molecular weight and volume. Finally, hydrodynamic properties of molecular models can be calculated using an approach initially developed for crystallographic structures (program HYDROPRO, de La Torre et al., 2000), which can be easily extended to dummy atom models when the molecular mass and partial specific volume of the protein are known (Arndt et al., 2002). As a result, several hydrodynamic parameters can be calculated and compared with the values obtained by other experimental methods. This comparison is very useful for checking the validity of the molecular conformation represented by the proposed 3D models.
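The hydrodynamic quantities compared in this kind of check are linked by standard relations, which can be sketched as follows. This is not the bead-model calculation performed by HYDROPRO; it only uses the Stokes-Einstein equation and the equivalent-sphere radius. The water viscosity at 20 °C (1.002 mPa·s) and the partial specific volume (0.733 cm³/g) are assumed values that do not come from the text, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
N_A = 6.02214076e23   # Avogadro constant, 1/mol

def diffusion_coefficient_cm2_s(stokes_radius_A, temp_K=293.15,
                                viscosity_Pa_s=1.002e-3):
    """Stokes-Einstein relation D = kT / (6*pi*eta*R_s), returned in cm^2/s."""
    radius_m = stokes_radius_A * 1e-10
    d_m2_s = K_B * temp_K / (6.0 * math.pi * viscosity_Pa_s * radius_m)
    return d_m2_s * 1e4  # m^2/s -> cm^2/s

def anhydrous_sphere_radius_A(molar_mass_g_mol, vbar_cm3_g):
    """Radius (Å) of the sphere with the same anhydrous volume as the protein."""
    volume_A3 = molar_mass_g_mol * vbar_cm3_g / N_A * 1e24  # cm^3 -> Å^3
    return (3.0 * volume_A3 / (4.0 * math.pi)) ** (1.0 / 3.0)

def frictional_ratio(stokes_radius_A, molar_mass_g_mol, vbar_cm3_g):
    """f/fmin as the Stokes radius divided by the anhydrous sphere radius."""
    return stokes_radius_A / anhydrous_sphere_radius_A(molar_mass_g_mol, vbar_cm3_g)

# Stokes radius and mass as quoted in Table 2; vbar is an ASSUMED value
d = diffusion_coefficient_cm2_s(139.0)
ratio = frictional_ratio(139.0, 3.1e6, 0.733)
print(f"D ~ {d * 1e7:.2f} x 10^-7 cm^2/s, f/fmin ~ {ratio:.2f}")
```

Under these assumptions the results fall close to the Table 2 entries. Small differences are expected, because the viscosity and partial specific volume actually used by the authors are not stated here.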

[Figure 18: Intensity [arb. u.] versus q [Å⁻¹]; legend: Experimental Data, IFT Fit. Inset: P(R) [arb. u.] versus R [Å].]

Fig. 18. Scattering curve (open circles) and IFT fitting (solid line) of hemoglobin from *G. paulistus*. Inset - pair distance distribution function (*p(r)*).

From the *p(r)* function we obtain a radius of gyration of 113.6 ± 0.7 Å and a maximum dimension of 300 ± 10 Å. Fig. 18 shows the excellent fit of the intensity curve and the *p(r)* function from which the values of Rg and Dmax were calculated. The molecular mass and particle volume were calculated for *G. paulistus* hemoglobin from the *I(0)* value, giving 3.1 ± 0.2 MDa and (3.8 ± 0.1) × 10⁶ Å³, respectively. These values are compatible with the dimensions obtained with SAXS and from electron micrographs (EM) of *G. paulistus* hemoglobin (Souza, 1990). The overall shape of the particle seen in the EM analysis shows P62 symmetry, which can be used as a constraint in the model calculation. The introduction of symmetry constraints decreases the number of degrees of freedom and consequently leads to the restoration of a better three-dimensional model (Svergun, 2000; Oliveira, 2001). P62 symmetry was therefore imposed in the model optimization. Fig. 19 presents one of the best three-dimensional molecular models obtained. Several runs of the optimization program were performed. For each model obtained, hydrodynamic parameters were calculated, and the models whose values did not agree with the experimental results were excluded. As a result, a model based on the SAXS results and also in agreement with other experimental data could be selected (Table 2). For comparison, Fig. 19 also shows the result obtained by Royer et al. (2000) for the hemoglobin of *Lumbricus terrestris* using protein crystallography and electron microscopy. These results show that the two proteins are quite similar in quaternary structure.
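The radius of gyration follows from the second moment of the pair distance distribution function, Rg² = ∫ r² p(r) dr / (2 ∫ p(r) dr). The sketch below evaluates this with a simple trapezoidal rule. Since the measured *p(r)* is not tabulated here, the check uses the analytical *p(r)* of a hypothetical homogeneous sphere of 300 Å diameter, for which Rg = √(3/5) · 150 Å ≈ 116.2 Å is known exactly. This test shape is an assumption chosen for illustration and is not a model of the hemoglobin.

```python
import math

def trapezoid(y, x):
    """Trapezoidal integration of samples y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))

def radius_of_gyration(r, p):
    """Rg from p(r): Rg^2 = int r^2 p(r) dr / (2 * int p(r) dr)."""
    second_moment = trapezoid([ri * ri * pi for ri, pi in zip(r, p)], r)
    norm = trapezoid(p, r)
    return math.sqrt(second_moment / (2.0 * norm))

def sphere_pr(r, radius):
    """Unnormalized analytical p(r) of a homogeneous sphere."""
    x = r / (2.0 * radius)
    if x < 0.0 or x > 1.0:
        return 0.0
    return r * r * (1.0 - 1.5 * x + 0.5 * x ** 3)

# Hypothetical test shape: sphere with Dmax = 300 Å, sampled every 1 Å
r_grid = [float(i) for i in range(0, 301)]
p_grid = [sphere_pr(r, 150.0) for r in r_grid]
print(f"Rg = {radius_of_gyration(r_grid, p_grid):.1f} Å")
```

The same function can be applied to the *p(r)* table written by an indirect Fourier transformation program, provided the columns are read into `r_grid` and `p_grid`.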

Fig. 19. A) Crystallographic structure of *Lumbricus terrestris* hemoglobin, from Royer et al. (2000). B) Calculated dummy atom models for the hemoglobin from *G. paulistus*, obtained with the program DAMMIN using P62 symmetry. The models are on the same scale.




Given the inherent difficulty of crystallizing proteins, particularly large ones such as the *G. paulistus* hemoglobin, the results presented demonstrate the capability of the SAXS technique and the new optimization methods to provide a fast and reliable procedure for investigating the shape and quaternary structure of large protein complexes. In addition, the correlation of SAXS results with hydrodynamic properties increases the reliability of the results and makes it possible to perform a model search integrated with hydrodynamic calculations.

Glycerol on the association of extracellular hemoglobin, *J. Biol. Chem.*, 266 (20), 13210-13216. ISSN 1083-351X.

Chacon, P.; Morán, F.; Días, J. F.; Pantos, E. and Andreu, J. M. (1998). "Low-resolution structures of proteins in solution retrieved from X-ray scattering with a genetic algorithm". *Biophys. J.*, 74, 2760-2775. ISSN (electronic): 1542-0086.

Chung, M. C. M. and Ellerton, H. D. (1979). "The physico-chemical and functional properties of extracellular respiratory haemoglobins and chlorocruorins". *Prog. Biophys. Molec. Biol.*, 35, 33-102. ISSN: 0079-6107.

Ciccariello, S. (1985). Deviations from the Porod Law due to Parallel Equidistant Interfaces. *Acta Cryst.*, A41, 560-568. ISSN (electronic): 1600-5724.

Costa, M. C. P.; Bonafe, C. F. S.; Meirelles, N. C. and Galembeck, F. (1988). "Sedimentation Coefficient and Minimum molecular weight of extracellular hemoglobin of Glossoscolex paulistus (oligochaeta)". *Brazilian J. Med. Biol. Res.*, 21, 115-118. ISSN 1678-4510.

Debye, P. (1915). Zerstreuung von Röntgenstrahlen. *Ann. Phys. (Leipzig)*, 46, 809-823. ISSN 0003-3804.

de La Torre, J. G.; Huertas, M. L. and Carrasco, B. (2000). "Calculation of Hydrodynamic Properties of Globular Proteins from their Atomic-Level Structure". *Biophys. J.*, 78, 719-730. ISSN (electronic): 1542-0086.

Feigin, A. and Svergun, D. I. (1987). Structure Analysis by Small-Angle X-Ray and Neutron Scattering. *Plenum Publishing Corporation, Plenum Press*, New York. ISBN: 0-306-42629-3.

Fritz, G. and Glatter, O. (2006). Structure and interaction in dense colloidal systems: evaluation of scattering data by the generalized indirect Fourier transformation method. *Journal of Physics - Condensed Matter*, 18 (36), S2403-S2419. ISSN (electronic): 1361-648X.

Fushitani, K. and Riggs, A. F. (1991). "The extracellular hemoglobin of earthworm Lumbricus terrestris - oxygenation properties of isolated chains, trimer and a reassociated product". *J. Biol. Chem.*, 266 (16), 10275-10281. ISSN (electronic): 1083-351X.

Glatter, O. (1977). A New Method for the Evaluation of Small-Angle Scattering Data. *J. Appl. Cryst.*, 10, 415-421. ISSN (electronic): 1600-5767.

Glatter, O. (1979). The Interpretation of Real-Space Information from Small-Angle Scattering Experiments. *J. Appl. Cryst.*, 12, 166-175.

Glatter, O. (1980). Computation of Distance Distribution Functions and Scattering functions of Models for Small Angle Scattering Experiments. *Acta Physica Austriaca*, 52, 243-256. ISSN: 0001-6713.

Glatter, O. (1980). Determination of Particle-Size Distribution Functions from Small-Angle Scattering Data by Means of the Indirect Transformation Method. *J. Appl. Cryst.*, 13, 7-11. ISSN (electronic): 1600-5767.

Glatter, O. (1981). Convolution Square Root of Band-Limited Symmetrical Functions and its Application to Small Angle Scattering Data. *J. Appl. Cryst.*, 14, 101-108. ISSN (electronic): 1600-5767.

Glatter, O. (1984). Improvements in Real-Space Deconvolution of Small-Angle Scattering Data. *J. Appl. Cryst.*, 17, 435-441. ISSN (electronic): 1600-5767.
