**4. Discussion**

290 The Complex World of Polysaccharides

To study biosynthesis of polysaccharide fragments, we analyzed the time course of incorporating radiolabeled glucose in polysaccharide chains (Fig. 22).

**Figure 22.** Growth to the maximal specific radioactivity of polysaccharides as dependent on the time of label presence in rats. The maximal specific radioactivity was taken as 100% for each fraction.

The highest radioactivity of the 0.15 M fraction was detected 60-75 min after a glucose load, while mature GAG (fraction 0.25-1.5 M) showed the highest radioactivity after 120 min. Thus, initiation of glycan synthesis is probably associated with the 0.15 M fraction, enriched in neutral oligosaccharides, which is consistent with common views of proteoglycan biosynthesis.

To identify the subcellular structures where generation of a polysaccharide chain is initiated, we studied the dynamics of label incorporation in the nuclear and microsomal fractions of the rat liver. The results of this experiment are shown in Fig. 23.

**Figure 23.** Dynamics of accumulation of radiolabeled glucose in the nuclear and microsomal fractions of the rat liver. As 100%, we used the radioactivity averaged over all fractions and all time points.

As Fig. 23 demonstrates, the label became detectable in cell structures 30 min after a glucose load. The label accumulated to the highest content in both fractions within 1 h of the experiment. It should be noted that, in the first hour, the accumulation rate was higher in the nuclear fraction (radioactivity 53% compared with 44% in the microsomal fraction).
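The captions of Figs. 22 and 23 describe two different ways of expressing radioactivity as a percentage: relative to the maximum of each fraction (Fig. 22) and relative to the mean over all fractions and time points (Fig. 23). The following minimal sketch contrasts the two. The function names and the counts are hypothetical and serve only as an illustration; they are not data from the study.

```python
def normalize_to_max(series):
    """Scale one fraction's time course so that its maximum equals 100%."""
    peak = max(series)
    return [100.0 * value / peak for value in series]


def normalize_to_grand_mean(table):
    """Scale all values so that the mean over all fractions and time points is 100%."""
    values = [v for series in table.values() for v in series]
    grand_mean = sum(values) / len(values)
    return {
        name: [100.0 * v / grand_mean for v in series]
        for name, series in table.items()
    }


# Hypothetical counts (arbitrary units) at four time points; NOT data from the study.
counts = {
    "nuclear": [20.0, 80.0, 40.0, 60.0],
    "microsomal": [10.0, 50.0, 90.0, 50.0],
}

by_max = {name: normalize_to_max(series) for name, series in counts.items()}
by_mean = normalize_to_grand_mean(counts)
```

With per-fraction normalization, every curve peaks at 100% and only the timing of the peaks can be compared. With normalization to the grand mean, the relative label content of the fractions remains comparable at each time point.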

Our results demonstrate that polysaccharides are similar to NA in many aspects. To a great extent, this is a consequence of the fact that NA is a polymer of ribose or deoxyribose having a base as a side-chain substituent. Some polysaccharides (e.g., HA, amylose, cellulose) are themselves capable of forming helical structures similar to NA.

Quantum chemical computations revealed the possibility of selective bonding between UDP-uronic acids and purines and between UDP-hexoses and pyrimidines contained in NA. The bonding strength per monomer unit in the polysaccharide–NA complex is similar to that computed for the NA double helix. This selectivity allows us to assume a binary genetic code for polysaccharides on the basis of the above principles in addition to the commonly accepted genetic code for amino acids. Quantum chemical computations were supported by the results of dot hybridization and spectral analysis of NA–polysaccharide complexes.

The spectrophotometric studies showed that amylose selectively binds with polypyrimidines (poly(dC)) while polyuronides bind with polypurines (poly(dA)). As is well known, amylose is a homopolysaccharide of a hexose (glucose), and a polyuronide is a homopolysaccharide of hexuronic acid. The difference between these monosaccharides is that glucose contains the hydroxymethyl group at C5, while hexuronic acid has the carboxyl group at C5. Apart from this, the two monosaccharides are identical. Hence the observed difference in physicochemical properties of the above polysaccharides can be interpreted in terms of the structural difference at C5 of their units. Consequently, our results suggest that purines (exemplified by A) selectively bind to the carboxyl group of uronic acid residues, while pyrimidines interact with the hydroxymethyl group of hexose residues contained in polysaccharides. These experimental findings agree with the results of quantum chemical computations performed for the NA–polysaccharide complexes, testifying again to their adequacy.

The bonding of nucleotides and glycans was detected in experiments aimed at studying the interaction of mature GAG (HA and CS) with fragmented calf thymus DNA. In primary structure, HA is a multiple tandem repeat of a unit consisting of a hexose and a uronic acid residue. In turn, tandem repeats of purine–pyrimidine account for a considerable proportion of DNA in higher organisms, including calf thymus DNA (Kiselev L.L. 2000, Singer M., et al. 1998). As our results demonstrate, HA did find complementary regions in calf thymus DNA. CS showed no complementary interactions with calf thymus DNA. It is known that C6 of hexose residues of CS is modified with the sulfo group, which provides an additional partial negative charge. This modification of hexoses dramatically affects the physicochemical properties of the polysaccharide and probably prevents hydrogen bonding.

Dot hybridization confirmed possible complementarity of hexoses to pyrimidines and uronic acid to purines. The specific bonds between NA and polysaccharides are comparable in strength with the bonds between complementary nucleotides in DNA.

Thus, analysis of the interactions between polysaccharides and NA showed that purines of NA are complementary to uronic acids of polysaccharides and that pyrimidines are complementary to hexoses. The relationships between the genetic apparatus of the cell and polysaccharides deserve further study in terms of the above complementarity principle. Such studies will probably yield a fundamentally novel view as to whether microheterogeneity of polysaccharide moieties of proteoglycans is genetically determined and related to purine–pyrimidine DNA repeats. Yet it is clear that the nature of these relationships needs additional comprehensive study.

The idea that polysaccharide moieties of proteoglycans contain information is not new. For instance, Zimina (Zimina N.P., et al. 1992, Zimina N.P., et al. 1987, Zimina N.P., et al. 1986, Zimina N.P., Rykova V.I., Dmitriev I.P. 1987) demonstrated that carbohydrate chains of proteoglycans are chemically heterogeneous and structurally irregular and concluded, quite justifiably, that the information content of proteoglycans provides a chemical basis for their intricate and highly specific functions in the cell. There is still no method for GAG sequencing, and data on the structure of their chains are circumstantial. Since chemical heterogeneity is widespread, these data make it possible to assume that proteoglycan chains are of irregular structure with a cluster arrangement of disaccharides. The clustering of bonds sensitive to testicular hyaluronidase and, consequently, of glucuronic acid residues was demonstrated for pig skin dermatan sulfate. The clusters are arranged along the chain without any distinct regularity (Fransson L.A., et al. 1982). Likewise, glucuronic acid-containing disaccharides are clustered at random along the chain of dermatan sulfates from the pig intestinal mucosa and the umbilical cord (Fransson L.A., et al. 1982). The clustering of various uronic acid residues is characteristic of dermatan sulfate from the human uterine neck. The irregularity of dermatan sulfate chains was suggested from chromatographic elution profiles of dermatan sulfate isolated from the bovine aortic intima and digested with chondroitinase AC (Oegema T.R., et al. 1979). Structural studies revealed a cluster organization of HS and heparin chains. Their molecules include extended regions consisting of low-sulfated disaccharides that contain glucuronic acid residues as a main component and only a minor amount of glucose residues (Bjork I., et al. 1982). Clusters sulfated to a high extent were also found: they consist of highly charged disaccharides harboring glucose residues as a main component. These clusters are possibly separated by less ordered regions where alternating sugars occur in similar proportions. Molecules vary in the size of their highly charged and low-charged regions.

It should be noted that the structure of oligosaccharide sites is an informative element of GAG as opposed to NA, which contain information in the form of a strict sequence of monomeric units (nucleotides). For instance, the capability of self-association of proteoHS and proteodermatan sulfates is due to so-called contact zones, specific regions with alternating disaccharides containing glucuronic acid and glucose residues with a certain arrangement of sulfo groups (Fransson L.-A. 1982). It is noteworthy that polysaccharides are characterized by a determined structural–functional interdependence similar to that of NA (Zimina N.P., et al. 1992).

The concept of nontemplate synthesis of polysaccharide components of proteoglycans does not allow a genetically grounded explanation of microheterogeneity (polymorphism) of proteoglycans. Yet data are continuously accumulating that polysaccharide components of proteoglycans are highly polymorphic and that their polymorphism shows distinct tissue, organ, and species specificities. It should be noted that glycans are still poorly understood and, consequently, the current views of glycans are similar to the views of NA in early molecular biology. Compounds belonging to one class strikingly differ in primary structure. We know today that, for instance, every mRNA is a unique element in realization of genetic information, but such functions are still to be elucidated in the case of glycans.

We characterized the fraction composition of polysaccharides contained in the rat liver homogenate. As Fig. 18 shows, rat liver polysaccharides include polymers of neutral sugars, hexoses, and hexuronic acids and vary in proportions of monomeric units and the degree of their modification (amination, sulfation, epimerization, etc.).

Correlation analysis revealed an association between time and synthesis of HS, which are GAG modified to the greatest extent. This result is beyond doubt because their modification is known to result in a high degree of sulfation and, consequently, their maturation takes more time as compared with maturation of other GAG groups.

In addition, the analysis results (Pearson empirical correlation coefficients and Spearman rank correlation coefficients) showed a strong linear correlation between cell nuclear structures (DNA) and the fraction of nuclear saccharides eluted from DEAE cellulose until 0.15 M NaCl. Such a correlation was not observed for total DNA and the 0.15 M fraction of the homogenate (oligosaccharides of microsomes, lysosomes, nuclei, other cell structures, and the intercellular matrix). The correlation testifies that nuclear synthesis of saccharides of this group increases after a glucose load, especially 1 h after loading. This finding was confirmed by quantitating polysaccharides with normalization with respect to the DNA content (Fig. 19).
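For readers unfamiliar with the two coefficients mentioned above, the following minimal sketch shows how the Pearson coefficient and the Spearman rank coefficient can be computed for paired measurements. It is an illustration only and uses the Python standard library. The function names are ours, and the sample values in the test are hypothetical rather than taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

The Pearson coefficient measures linear association, whereas the Spearman coefficient only requires a monotonic relationship. Agreement between the two therefore argues that a correlation does not hinge on a few extreme values.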

The 0.15 M fraction of the homogenate distinctly correlated with the contents of mature proteoglycans in the homogenate and the nucleus, implicating saccharides of this fraction in GAG maturation. In other words, an increase in synthesis of this saccharide fraction of the homogenate increased synthesis of proteoglycans (CS and HS). The fact that proteoglycan synthesis starts with this fraction was evident from the data on incorporation of radiolabeled glucose in GAG. It should be noted that the 0.15 M fraction varies in composition among cell structures. The nuclear fraction harbors neutral sugar components and uronic acids at 3:0.75, suggesting incomplete addition of uronic acids to neutral trisaccharide units, which takes place in the nucleus. The 0.15 M fraction of nuclear saccharides consists mostly (to 83%) of components eluted at 0.02 M NaCl. The 0.06 and 0.15 M subfractions occur at 11 and 6%, respectively. In microsomes, the proportion of the 0.02 M subfraction is decreased to 55%, while the proportions of the 0.06 and 0.15 M subfractions are increased to 22 and 23%, respectively. The changes observed in the homogenate are similar to those in microsomes, but the proportion of the 0.15 M subfraction is greater (up to 32%).

These findings associate the genetic apparatus of the cell with synthesis of oligosaccharides similar in composition to the universal tetrasaccharide of proteoglycans. The association is also evident from the results obtained with radiolabeled glucose: the label accumulation rate in nuclear structures was higher than in other structures 1 h after a glucose load.

Oligosaccharides with a neutral sugar : uronic acid ratio of 3:1 are most probably synthesized in the nucleus. Heteroglycan chains with a tandem arrangement of uronic acid– hexose units (1:1) are formed in structures of the microsomal fraction.

Analysis of incorporation of radiolabeled glucose in the nuclear and microsomal fractions of the cell (Fig. 23) showed that the incorporation rate is higher in the nucleus early (at the end of the first hour) after a glucose load. After 2 h, microsomes accumulate radiolabeled glucose to a high extent, which is probably due to a transfer of radiolabeled oligosaccharides from the nucleus and the formation of GAG chains. After 3 h, the content of radiolabeled saccharides in the cell nucleus increased again, most probably owing to a transfer of mature GAG from the EPR. This scenario agrees with modern views of proteoglycan synthesis.

It is of interest that the label appeared first in the 0.15 M fraction and then in mature GAG (the 0.25-1.5 M fraction) of the homogenate (Fig. 22). The initially high radioactivity of this fraction can be explained by the presence of glycans involved in initiating synthesis of the polysaccharide chain of GAG. GAG synthesis starts with xylosylation of core proteins and generation of a linker tetrasaccharide; then, the polysaccharide chain is formed by consecutive addition of hexoses and uronic acids in almost equal proportions.

Comparison of the data on radiolabeled glucose incorporation in GAG and fractions of cell structures showed that the high rate of label accumulation in the nucleus early (within the first hour) after a load was associated with the dynamics of label incorporation in polysaccharides eluted until 0.15 M NaCl. This was not the case with mature GAG, which showed the highest radioactivity only 2 h after a load. At this time, the label content is minimal in the nucleus and higher in microsomes. Hence, we can state that generation of the GAG chain in the cell is associated with microsomes, while synthesis of the linker tetrasaccharide, which belongs to the 0.15 M fraction, is associated with nuclear structures.

Saccharides of the 0.15 M fraction differ from classical GAG in their hexosamine content and probably harbor fragments with linker tetrasaccharides. Note that, until the uronic acid–hexosamine ratio reaches 1:1 (as characteristic of HA), such fragments, having a considerable portion of neutral sugars, are eluted from the column at an ionic strength lower than necessary for HA elution (0.15 M NaCl). Since proteoglycan synthesis starts with generation of the linker tetrasaccharide, the radioactivity of this fraction should be higher early after a radiolabeled glucose load. Our results fully agreed with this expectation.

Thus, the glycoside moiety of proteoglycans is probably synthesized in a stepwise manner in the cell. The linker tetrasaccharides are synthesized in structures associated with the cell nucleus, which is also evident from the results of correlation analysis. Then the fragments are transferred into the EPR, where the main glycan chain is synthesized, biochemically modified, and used to form proteoglycans. The GAG chain is synthesized in the EPR. Mature GAG are delivered into the nucleus in a small amount and are mostly exported into the intercellular space to produce the intercellular matrix. This scenario agrees with data of many studies that core proteins entering the EPR and Golgi system already have the linker tetrasaccharide (xylose-galactose-galactose-uronic acid) (Colman Y., et al. 2000, Zimina N.P., et al. 1992, Zimina N.P., et al. 1987, Silbert J.E. et al. 1995). These data indirectly support our assumption that DNA plays a role in synthesis of the universal tetrasaccharide of proteoglycans.


Based on our data, we suggest a conceptual mechanism underlying the functional association between the nuclear apparatus and the formation of the tetrasaccharide linking the protein core with the GAG moiety in proteoglycans.

Taken together, the results obtained with different methods allow us to propose a concept of template synthesis of proteoglycans with the involvement of tandem DNA repeats.

It is clear that proteoglycan synthesis starts with DNA transcription to yield pre-mRNA. The informative region of the mRNA for the core protein is formed according to the commonly accepted mechanism up to the serine codon. When the serine codon is followed by a triplet complementary to the trisaccharide uronic acid-hexose-hexose, the mode of transcription changes. It should be noted that the complementarity requirements are met only by eight DNA triplets (ACC, ACT, ATC, ATT, GCC, GCT, GTC, and GTT). The energy of bonding during NA synthesis is distributed as follows among these triplets: it is about 11.7 kcal/mol in one triplet (GCC), 9.6 kcal/mol in three triplets (GCT, GTC, and ACC), 7.6 kcal/mol in three triplets (GTT, ACT, and ATC), and 5.5 kcal/mol in one triplet (ATT). Thus, the ATT triplet has a bonding energy of only 5.5 kcal/mol per monomer, lower than in the DNA–polysaccharide triplet duplex (more than 7 kcal/mol).
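The eight triplets listed above are exactly the combinations of one purine followed by two pyrimidines, which mirrors the uronic acid-hexose-hexose order of the trisaccharide. The following minimal sketch enumerates them and attaches the energies quoted in the text. Grouping the quoted values by the number of G or C bases in a triplet is our reading of the numbers, not a rule stated in the text, and all names in the code are illustrative.

```python
from itertools import product

PURINES = "AG"       # paired with uronic acid in the proposed scheme
PYRIMIDINES = "CT"   # paired with hexoses in the proposed scheme

# Energies quoted in the text (kcal/mol), keyed by the number of G/C bases.
# ASSUMPTION: the G/C grouping is inferred from the quoted values.
ENERGY_BY_GC_COUNT = {3: 11.7, 2: 9.6, 1: 7.6, 0: 5.5}


def complementary_triplets():
    """All DNA triplets of the form purine-pyrimidine-pyrimidine."""
    return sorted(
        r + y1 + y2 for r, y1, y2 in product(PURINES, PYRIMIDINES, PYRIMIDINES)
    )


def gc_count(triplet):
    """Number of G or C bases in a triplet."""
    return sum(base in "GC" for base in triplet)


def triplet_energies():
    """Map each triplet to the bonding energy quoted in the text."""
    return {t: ENERGY_BY_GC_COUNT[gc_count(t)] for t in complementary_triplets()}
```

Under this reading, ATT is the only triplet without a G or C base, which is why it alone falls below the value of about 7 kcal/mol given for the DNA–polysaccharide duplex.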

Under certain conditions, this situation with the ATT codon may lead to appreciable competition with UDP-sugars in the formation of complementary pairs during RNA synthesis on DNA. As a result, synthesis of a trisaccharide may be more advantageous in terms of energy than addition of three structural units to RNA. Hence the following process is possible. Glycosyltransferase utilizes the first uronic acid residue bound to DNA adenine via complementary interactions and generates a bond between the saccharide and the last ribonucleotide of the serine codon of RNA, simultaneously converting uronic acid to xylose via decarboxylation at C5. Decarboxylation initiates separation of the saccharide unit from DNA. Then, Gal transferases generate glycoside bonds between xylose and two hexose (galactose) residues, which are hydrogen-bonded to the thymine tandem of DNA according to the above scheme.

It is clear that some pre-mRNAs synthesized *in vivo* under such conditions have the trisaccharide xylose-galactose-galactose in place of a ribonucleotide triplet at the site of RNA branching on the ATT codon of DNA. It is well known that the ATT triplet determines a stop codon (UAA) terminating synthesis of polypeptide chains, which indirectly supports our assumption. If this sequence is followed by tandem repeats (purine-pyrimidine, such as CA) with a strength of bonding in NA synthesis about 8.8 kcal/mol, synthesis of oligosaccharide is disadvantageous and the RNA is extended according to the DNA-RNA complementarity rule, which is more advantageous in terms of energy. Although the energy of bonds plays an important role in this situation, we assume that the process is still far more intricate.
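The two arguments in this paragraph can be restated as a small sketch: base-by-base complementation of the template triplet ATT gives the stop codon UAA, and the quoted energies decide which product is favored. The code assumes that the triplet is written in the template orientation, so that simple complementation yields the mRNA codon. The hard threshold is a deliberate simplification introduced here for illustration; as noted above, the actual process is assumed to be far more intricate. All names are illustrative.

```python
# Base-by-base DNA template -> RNA complement.
# ASSUMPTION: the triplet is written in the template (3'->5') orientation.
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Value quoted in the text for the DNA-polysaccharide duplex (kcal/mol).
GLYCAN_DUPLEX_ENERGY = 7.0


def transcribe_template(triplet):
    """Return the RNA triplet complementary to a DNA template triplet."""
    return "".join(RNA_COMPLEMENT[base] for base in triplet)


def favored_product(na_energy, glycan_energy=GLYCAN_DUPLEX_ENERGY):
    """Simplified rule: the pairing with the higher bonding energy wins."""
    return "saccharide" if na_energy < glycan_energy else "ribonucleotide"
```

For example, `favored_product(5.5)` (the ATT triplet) returns `"saccharide"`, whereas `favored_product(8.8)` (the purine-pyrimidine tandem repeat) returns `"ribonucleotide"`.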

Thus, it is possible that hnRNA contains pre-mRNAs that have a trisaccharide followed by a tandem ribonucleotide repeat consisting of more than 300 monomers (which is sufficient for generating the glycoside moiety of proteoglycans) and starting from 5'-P-terminal guanosine at the nuclear RNA branching site. This can explain why the corresponding RNA regions block reverse transcription and are resistant to some RNases. Guanine acts as a hnRNA branching site and, as a purine, is capable of ensuring further addition of the first uronic acid residue, which always follows the trisaccharide. With the above system of complementary interactions, the trisaccharide is analogous to UAA and, when immediately followed by GU, allows the processing and splicing of pre-mRNA according to the commonly accepted scheme. Like adenines, the two galactose residues may interact with two uridines of the spliceosome U-RNA through their hydroxymethyl groups. Thus, the sequence Gal-Gal-G-U provides a binding site for spliceosome structures, as characteristic of the splicing tetranucleotide consensus sequence (AAGU).

In such an RNA strand, saccharides are linked by the 3'-5' bond up to the end of the serine codon. C2 of the saccharide of the last nucleotide of the serine codon interacts with C5 of xylose, which is thereby capable of binding through C1 with serine when xylosylation is initiated. Then xylose C4 binds with C1 of a galactose dimer. C2 of the last galactose interacts with C5 of the saccharide moiety of guanosine. Thus, C3 of the last nucleotide of the serine codon and C3 of the last galactose are free for bonding, which allows a 3-1 bond with the first glucuronic acid residue. Uridine following guanosine in an RNA intron determines the addition of a hexose to the glucuronic acid residue through the 4-1 (or 3-1) bond during template synthesis of a polysaccharide (Fig. 24).

Invariant GG at the 5' end of the next exon, continuing the protein-coding sequence, represents the first two nucleotides of a glycine codon (GGA, GGG, GGC, or GGU). This GG is in the trisaccharide site (the RNA branching site) and, as the flanking intron is excised in the processingosome during RNA maturation, continues the mRNA coding region after the serine codon in the 5'-3' direction. The obligatory presence of invariant splicing-site GG in the triplet following the serine codon can explain the conservation of the xylosylation site (serine-glycine tandem) among core proteins, because only glycine codons start with two guanines.
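The statement that only glycine codons start with two guanines can be checked directly against the standard genetic code. The following minimal sketch builds the standard codon table in the conventional U, C, A, G order and compares the glycine codons with the set of all codons that begin with GG. The helper names are ours.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons ordered with the third base varying fastest.
# One-letter amino acid codes; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def codons_with_prefix(prefix):
    """All codons that start with the given bases."""
    return sorted(c for c in CODON_TABLE if c.startswith(prefix))


glycine_codons = codons_for("G")       # here "G" is the amino acid glycine
gg_codons = codons_with_prefix("GG")   # here "GG" is two guanine bases
```

Both lists contain the same four codons (GGA, GGC, GGG, and GGU), so a GG dinucleotide at the start of a triplet is sufficient to specify glycine regardless of the third base.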

The resulting transcript is capable of directing synthesis of a polypeptide chain wherein serine is covalently bound to the trisaccharide and then with glycine. Tandem RNA sequences, representing the intron side chain following the tetrasaccharide (xylose-galactose-galactose-uronic acid), allow a polysaccharide fragment to be synthesized according to the NA base–monosaccharide complementarity by glycosyltransferases that form 4-1 or 3-1 glycoside bonds, starting from the last saccharide. After splicing occurs and the two guanines find themselves in the exon part of the molecule, the tandem RNA fragment is linked to C2 of uronic acid through the terminal uridine. The fragment determines an ordered arrangement of monosaccharides for synthesis of a polysaccharide chain. This process is advantageous (bonding energy is about 7 kcal/mol) in systems having an excess of UDP-saccharides and glycosyltransferases and lacking mononucleotides and NA polymerases, as characteristic of membrane structures of EPR and the Golgi complex. We propose that the process be termed glycotranscription, because information contained in NA is directly transferred to the polysaccharide chain.


**Figure 24.** Hypothetical scheme of genetically determined template synthesis of proteoglycans.
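
The statement that only glycine codons start with two guanines can be checked directly against the standard genetic code. The following is a minimal sketch; the names `CODON_TABLE` and `codons_starting_with` are illustrative and not part of the original work, and the table is the standard code (NCBI translation table 1) written with RNA bases.

```python
from itertools import product

# Bases in the conventional U, C, A, G order used for the standard code.
BASES = "UCAG"
# One-letter amino acids for the 64 codons in that order ('*' marks a stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map every RNA codon to its amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_starting_with(prefix):
    """Return {codon: amino acid} for all codons that begin with `prefix`."""
    return {c: aa for c, aa in CODON_TABLE.items() if c.startswith(prefix)}


gg_codons = codons_starting_with("GG")
glycine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "G")

print("Codons starting with GG:", sorted(gg_codons))
print("Codons encoding glycine:", glycine_codons)
```

Both lists contain the same four codons, so in the standard code a GG doublet at the start of a triplet implies glycine and vice versa.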

It is clear that synthesis of the core protein chain on ribosomes is quite possible, because the tetrasaccharide does not occupy the 3'-5' bond of the last nucleotide of the serine codon and acts as a spacer linking the RNA intron. As a result, the newly synthesized core protein contains the linker tetrasaccharide and the RNA intron attached to the serine. The protein is delivered after synthesis into the site where the corresponding polysaccharide fragment is generated in membrane structures of the EPR and the Golgi complex. Serine is xylosylated in the EPR during biosynthesis of the core protein on ribosomes. This scenario does not contradict the modern views of GAG synthesis (Silbert J.E. et al. 1995).

It should be noted that pre-mRNA splicing is tissue-specific. Moreover, selection of the splicing site depends on the developmental stage in some cases (Singer M., et al. 1998). In other words, the primary structure of mRNA introns varies with tissue and ontogenetic stage. The same is true for the primary structure of the glycoside moiety of proteoglycans. It is known that proteoglycans are completely absent from unicellular and prokaryotic organisms, as is pre-mRNA processing. This fact suggests a relationship between glycan biosynthesis and mRNA maturation in terms of biological significance for the cell.

This assumption is supported by data obtained for mRNAs of CS core proteins. Translated in a cell-free wheat germ system, cartilage mRNA directed synthesis of a 340-kDa core protein. Immediately after translation, the protein already contained glycosylation signals for subsequent glycan synthesis (Hook M., et al. 1984), i.e., the linker tetrasaccharide was already present.

Based on quantum chemical analysis of the advantage of bonding, the hypothesis of proteoglycan synthesis on the NA template (Fig. 24) agrees with current views of the processes involved in realizing genetic information (genome structure, transcription, hnRNA processing and splicing, translation) (Singer M., Berg P. 1998) and eliminates the main contradictions of the existing concept of proteoglycan metabolism. The hypothesis explains why the site where the linker tetrasaccharide is attached to the protein core has not been identified in more than fifty years of studies on proteoglycans: the tetrasaccharide is already contained in RNA before translation. There is convincing evidence that the tetrasaccharide enters the EPR already covalently bound to the core protein (Colman Y., et al. 2000, Silbert J.E. et al. 1995). A role of protein core structures in determining the serine xylosylation site was rejected in recent studies. A "vitalistic" hypothesis has been formulated that ascribes this role to intracellular membranes. The hypothesis is based on the fact that glycosyltransferases are mostly found in membrane structures of the EPR and Golgi complex (Silbert J.E. et al. 1995). Yet this fact alone does not prove synthesis of the linker tetrasaccharide in these structures, because glycosyltransferases are also detectable in nuclei (nuclear membranes). Our biochemical studies implicate structures of the cell nucleus in initiation of synthesis of the linker tetrasaccharide.

According to our hypothesis, a serine is subject to xylosylation only when its codon is followed in DNA consecutively by the ATT stop codon (UAA in RNA) and CA (GU in RNA) of an intron, responsible for hnRNA branching and mRNA processing.
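
To make the proposed sequence rule concrete, the sketch below scans an RNA sequence for a serine codon immediately followed by UAA and GU. It is only an illustration under stated assumptions: the function name `find_candidate_sites` and the example sequence are hypothetical, the scan ignores the reading frame, and a match says nothing about whether xylosylation actually occurs at that position.

```python
import re

# The six serine codons of the standard genetic code (RNA alphabet).
SERINE_CODONS = ("UCU", "UCC", "UCA", "UCG", "AGU", "AGC")

# Serine codon + UAA (ATT on the template strand) + GU (CA on the template strand).
# The lookahead lets overlapping candidates be reported as well.
MOTIF = re.compile("(?=((?:%s)UAAGU))" % "|".join(SERINE_CODONS))


def find_candidate_sites(rna):
    """Return (position, matched text) for every Ser-UAA-GU occurrence.

    Positions are 0-based offsets of the first base of the serine codon.
    T is converted to U so that a DNA coding-strand sequence is also accepted.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(rna)]


# Hypothetical sequence made up for illustration only.
example = "AUGGCUUCAUAAGUACGAGCUAAGUCC"
for position, site in find_candidate_sites(example):
    print(position, site)
```

A frame-aware version would restrict matches to positions that are multiples of three from a known start codon; that refinement is left out here to keep the sketch short.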

The above scheme of proteoglycan synthesis allows generation of a linear heteroglycan of a particular size (about 300 monomers). The question is how template synthesis following this scheme proceeds in the case of HA, which consists of thousands of monomers. It is clear that RNA templates of a corresponding size are absent from EPR structures. It is possible that GAG synthesis in this case utilizes lasso-like intron RNAs, which result from cis-splicing and are potentially capable of directing cyclic synthesis of HA with any number of monomeric units. It is known that mRNA trans-splicing yields linear introns. Such RNA structures may direct template synthesis of branched homoglycans whose branching period is a multiple of the full NA helix turn (10-12 units). Monosaccharide 10 (12) overlies monosaccharide 1 of the nascent strand under these conditions, which allows an additional bond between them. Then synthesis proceeds through several such rounds to yield a branched polysaccharide structurally similar to glycogen. This hypothetical mechanism generating branched polysaccharides is supported by the fact that their branching period is a multiple of the full NA helix turn.

RNAs with the above structures may be components of small cytoplasmic RNA (scRNA), the role of which is still poorly understood. It is known that scRNAs account for less than 1% of total cell RNA. Some scRNAs contain tandem repeats, in particular, *Alu* sequences. These scRNAs are associated with membrane structures of the EPR. Some scRNAs (7SL RNA) are involved in transmembrane transport of polypeptides across the EPR lipid bilayer. The scRNA size varies from 90 to 330 nt, which fits well with our concept. It should be noted that synthesis of storage homoglycans probably requires no template and proceeds through a simpler mechanism, because their chains carry no information.

Many glycoproteins act as antigens on the plasma membrane. Immunohistochemical comparison of glycoproteins isolated from hepatocyte membranes of intact, embryonic, and regenerating liver and from hepatoma showed that the protein component of antigens is detectable on all hepatocytes regardless of the state of the liver. Moreover, the protein component is tissue-nonspecific and universal for most cell groups. The carbohydrate component of antigens proved to be specific. It is the structure of the polysaccharide component that determines the immunogenicity of a proteoglycan. Zimina (Zimina N.P., et al. 1987) studied the specifics of GAG synthesis in the liver of adult rats and embryos and in hepatoma and showed that GAG of tissues with active cell proliferation are sulfated to a lower extent as compared with the corresponding GAG of quiescent tissues. That is, polysaccharide fragments of the same proteoglycans differ depending on the state of hepatocytes.

According to the concept of nontemplate synthesis of GAG, the protein core is synthesized on polysomes of the rough EPR. The same protein components of proteoglycans were synthesized in all above cases. The identity of the protein core suggests that the glycoside components of these proteoglycans are synthesized in one site of the smooth EPR and, consequently, are also identical. Experimental results contradict this assumption. The variation of glycoside components is possible only when glycosyltransferase complexes of EPR membranes vary depending on the physiological state of hepatocytes. A glycosyltransferase complex of the smooth EPR should contain about 300 molecules of the enzyme generating the glycoside bond in a certain sequence. For instance, HS synthesis requires a complex of about 300 N-acetylglucosaminyltransferases and glucuronyltransferases occurring in equal proportions. Each of the 150 enzyme molecules, e.g., N-acetylglucosaminyltransferases, should be encoded by a separate gene with a nonconserved sequence coding for the site of enzyme attachment to a strongly specific membrane site of the smooth EPR. If GAG synthesis is template-independent, at least three types of such complexes are necessary for HS generation. About 450 N-acetylglucosaminyltransferase genes are required for this process. Since the enzyme is also involved in synthesis of heparin and other polysaccharides, N-acetylglucosaminyltransferase genes should occur at thousands of copies per genome. Yet it is known that DNA regions complementary to mRNAs have unique sequences and occur at a few copies per genome. Glycosyltransferase genes are not exceptions to this rule. Thus, the commonly accepted hypothesis of nontemplate synthesis of GAG cannot explain the genetically determined heterogeneity of polysaccharide components of proteoglycans. This hypothesis does not stand up, since recent works have demonstrated the informative value of the glycoside moiety of proteoglycans (Zimina N.P., et al. 1992, Zimina N.P., et al. 1987).
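
The gene-count estimate in this argument is simple arithmetic, which the following sketch reproduces. The variable names are illustrative only, and the input figures (about 300 enzymes per complex, two enzyme types in equal proportions, at least three complex types) are taken from the text rather than from independent data.

```python
# Figures quoted in the text for heparan sulfate (HS) under nontemplate synthesis.
enzymes_per_complex = 300  # about one enzyme molecule per monomer of the chain
enzyme_types = 2           # N-acetylglucosaminyltransferase and glucuronyltransferase
complex_types = 3          # minimum number of distinct complexes assumed for HS

# Equal proportions: each enzyme type contributes half of the complex.
per_type_per_complex = enzymes_per_complex // enzyme_types

# One gene per positioned enzyme molecule, repeated for every complex type.
genes_for_hs = per_type_per_complex * complex_types

print("Enzymes of one type per complex:", per_type_per_complex)
print("N-acetylglucosaminyltransferase genes needed for HS:", genes_for_hs)
```

With these inputs the sketch gives 150 enzymes of each type per complex and 450 genes for HS alone, matching the figures in the paragraph above.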

Our scheme of template synthesis of GAG solves this problem, because only a few glycosyltransferase genes are sufficient for generating any diversity of polysaccharide chains in this case.

Synthesis of some glycans with a high information content according to the glycotranscription principle makes it possible to assume, by analogy with RNA, the existence of reverse glycotranscription, whereby information is transferred from polysaccharide to RNA and then, by reverse transcriptase, to DNA fragments, which can be inserted into the genome to preserve the acquired information about new glycans in the genome structure. Such information is contained in specific tandem DNA repeats, which are unique for each individual. The relevant genetic systems are probably capable of being transmitted to the progeny and being fixed as a hereditary or acquired character by selection. It seems that both nontemplate and genetically determined template synthesis of polysaccharides exist in nature and are closely associated with each other through information flows.

It should be noted in this connection that Ronichevskaya and Rykova (Ronichevskaya G.M., Rykova V.I. 1977) observed that GAG suppress DNA replication in proliferating cells in a chalone-like manner. This observation can be explained by homology of polysaccharides to NA. We think that polysaccharide fragments are capable of blocking DNA polymerases via specific binding to complementary DNA repeats. As reported earlier, the substances examined in (Ronichevskaya G.M., et al. 1977) were isolated from RNA preparations, which provides indirect evidence for a role of RNA in glycan synthesis. According to our results, HA fragments form stable bonds with NA.

Further studies of proteoglycan polymorphism (species and tissue specificity, age-related and pathological changes) and development of sequencing techniques will probably demonstrate the specificity of proteoglycans at the level of individual organisms. By analogy, polymorphism of highly repetitive genome sequences is now beyond doubt and is widely used in genome fingerprinting. We assume that it is genomic repeats that are responsible for the information structure of glycans, whose polymorphism (microheterogeneity) has come to be commonly recognized.

The results of many studies implicate proteoglycans in intricate processes regulating cell proliferation (Ronichevskaya G.M., et al. 1977) and differentiation (Kinoshita S., et al. 1979, Kinoshita S., Yoshii K. 1979), cell recognition, and the organization of intercellular interactions (Henkart P., et al. 1973). In evolution, polysaccharides played a key role in the transition from unicellular organisms to multicellular entities. It is owing to polysaccharides that the individual cell functions as a structure with a particular function in the cell ensemble of the whole organism. In ontogeny, proteoglycans are associated with the aging of cells, tissues, and organs (Zimnitskii A.N., et al. 2004). A genetic defect in synthesis of GAG polysaccharide chains leads to severe hereditary disorders (Zimnitskii A.N., Bashkatov S.A. 2004). It is beyond doubt that the relevant processes are controlled by DNA, which carries genetic information.

To conclude, life is a mode of existence not only of protein and nucleic acid bodies but also of polysaccharide bodies, because all three biopolymers determine life as we observe it. Analysis of the distinct association of NA and polysaccharides will clarify the role of both polysaccharides and DNA repeats. The latter can no longer be described as waste or selfish, because some of them are potentially capable of directing polysaccharide synthesis.

We are grateful to Prof. N.K. Yankovskii (Institute of General Genetics, Russian Academy of Sciences, Moscow) and Dr. V.I. Salyanov (Institute of Molecular Biology, Russian Academy of Sciences, Moscow) for help in collecting and interpreting empirical data.
