#### **2.1 Galectin families**

Based on their overall structure, galectins are grouped into three families: a) prototype galectins, consisting of one carbohydrate recognition domain (CRD); b) chimera-type galectins, with one CRD and a non-lectin domain, whose only member known so far is galectin-3; and c) tandem-repeat galectins, which have two different CRDs linked by a short peptide (see Fig. 1) (Hirabayashi & Kasai, 1993; Leffler et al., 2004).

In this review we focus on galectin-1, -3 and -8 as representatives of the three galectin families.
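The three-way grouping described above can be written down as a small decision rule. The following is only an illustrative sketch: the names `DomainLayout` and `classify_family` are hypothetical and do not come from the cited literature, and the rule encodes nothing beyond the domain counts stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainLayout:
    """Hypothetical description of a galectin's domain architecture."""
    n_crds: int                          # number of carbohydrate recognition domains
    has_non_lectin_domain: bool = False  # e.g. the N-terminal extension of galectin-3


def classify_family(layout: DomainLayout) -> str:
    """Map a domain layout to one of the three structural families."""
    if layout.n_crds == 1 and not layout.has_non_lectin_domain:
        return "prototype"
    if layout.n_crds == 1 and layout.has_non_lectin_domain:
        return "chimera"
    if layout.n_crds == 2 and not layout.has_non_lectin_domain:
        return "tandem-repeat"
    raise ValueError("layout does not match any of the three families")


# The three representatives discussed in this review.
representatives = {
    "galectin-1": DomainLayout(n_crds=1),
    "galectin-3": DomainLayout(n_crds=1, has_non_lectin_domain=True),
    "galectin-8": DomainLayout(n_crds=2),
}

for name, layout in representatives.items():
    print(name, "->", classify_family(layout))
```

Layouts outside the three described families raise an error rather than being forced into a group, since the text defines no further categories.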

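Fig. 1. Galectin families regarding their overall structure (modified from Barondes et al., 1994) (Al-Ansari et al., 2009). Panel labels: prototype (galectins 1, 2, 5, 7, 10, 11, 13, 14, (15)); chimeric (galectin 3); tandem-repeat (galectins 4, 6, 8, 9, 12).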

Galectins are either divalent by virtue of their intrinsic protein structure (tandem-repeat galectins such as galectin-8) or form homotypic dimers to oligomers through site-specific interactions (prototype and chimera galectins such as galectin-1 and -3). Galectin-8 isoforms represent either prototype or tandem-repeat galectins, depending on the splice variant. Some galectin-8 splice variants consist only of the N-terminal CRD with different elongations and lack a second CRD; these variants are better grouped with the prototype galectins (Bidon et al., 2001; Al-Ansari et al., 2009; Zick et al., 2004). Prototype galectins such as galectin-1 form homodimers through hydrophobic interactions at the N-terminal amino acid residues (Cho & Cummings, 1997; Lobsanov et al., 1993). Dimerisation is an equilibrium reaction that depends on protein concentration but is independent of available soluble ligands (Cho & Cummings, 1995). In contrast, the chimeric galectin-3 forms oligomers (most likely pentamers) via its N-terminal collagen-like extension after ligand binding (Ahmad et al., 2004a; Birdsall et al., 2001; Nieminen et al., 2008).
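The concentration dependence of dimerisation can be illustrated with a simple two-state model, 2 M ⇌ D, with dissociation constant K_d = [M]²/[D] and total chain concentration c = [M] + 2[D]. The sketch below is a minimal illustration under that assumption only; the K_d value is a hypothetical placeholder, not a measured constant from the cited studies, and the function name `dimer_fraction` is introduced here for illustration.

```python
import math


def dimer_fraction(total_um: float, kd_um: float) -> float:
    """Fraction of protein chains present in dimers for 2 M <-> D.

    Solves (2 / Kd) * [M]^2 + [M] - c = 0 for the free monomer
    concentration [M], then returns 1 - [M] / c.
    """
    if total_um <= 0 or kd_um <= 0:
        raise ValueError("concentrations must be positive")
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0) / 4.0
    return 1.0 - monomer / total_um


KD_UM = 10.0  # hypothetical dissociation constant in micromolar (assumption)

for total in (0.1, 1.0, 10.0, 100.0):
    print(f"c = {total:6.1f} uM -> dimer fraction = {dimer_fraction(total, KD_UM):.2f}")
```

In this model the dimer fraction rises monotonically with total protein concentration and reaches one half when the total concentration equals K_d, which is consistent with an equilibrium that depends on protein concentration alone.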

#### **2.2 Carbohydrate recognition domains**

4 Biomaterials – Physics and Chemistry

dimensional structure of the carbohydrate recognition domain (CRD) (Barondes et al., 1994a; Barondes et al., 1994b). Several human galectin CRDs have been characterised by crystallography, including those of human galectin-1, galectin-3 and the N-terminal domain of galectin-8 (Ideo et al., 2011; Kishishita et al., 2008; Lobsanov et al., 1993; Lopez-Lucendo et al., 2004; Seetharaman et al., 1998). The C-terminal domain of galectin-8 has been investigated by NMR (Tomizawa et al., 2008). All of them show a globular fold consisting of two anti-parallel β-sheets with five to six strands each (Ideo et al., 2011; Kishishita et al., 2008; Lobsanov et al., 1993; Lopez-Lucendo et al., 2004; Seetharaman et al., 1998; Tomizawa et al., 2008). The CRDs analysed so far are encoded by three consecutive exons, with most of the conserved amino acids encoded on the middle one (Cooper & Barondes, 1999; Houzelstein et al., 2004).

The carbohydrate recognition domain is highly conserved throughout different galectins and organisms.


Fig. 2. Sequence alignment of human galectin-1, galectin-3 and the individual carbohydrate recognition domains of galectin-8 (residues 1-150 and 221-359 of isoform a). The sequences used are galectin-1 (NP\_002296), galectin-3 (P17931) and galectin-8 isoform a (NP\_963839), as published at http://www.ncbi.nlm.nih.gov/protein. Completely conserved amino acids are marked with an asterisk, conserved substitutions with a colon and semiconserved substitutions with a dot. Important amino acids mentioned in the following text are additionally highlighted: conserved amino acids of the binding pocket are highlighted in grey; residues of importance for binding are marked with an ellipse; galectin-1 cysteine residues are marked with circles. The alignment was performed using ClustalW2 at http://www.ebi.ac.uk with the default settings (Chenna et al., 2003; Larkin et al., 2007).
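As a rough illustration of how the inputs and annotations of such an alignment can be handled, the sketch below cuts the two galectin-8 CRDs out of a full-length sequence using the residue ranges quoted in the caption and tallies the conservation symbols of a consensus line. The sequence and the consensus string are placeholders (assumptions), not the real galectin-8 data, and the helper names `slice_residues` and `summarise_consensus` are hypothetical.

```python
from collections import Counter

# Residue ranges quoted in the Fig. 2 caption (1-based, inclusive).
GAL8_CRD_RANGES = {"N-CRD": (1, 150), "C-CRD": (221, 359)}

# Meaning of the consensus symbols as described in the caption.
CLUSTAL_SYMBOLS = {
    "*": "completely conserved",
    ":": "conserved substitution",
    ".": "semiconserved substitution",
}


def slice_residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[start - 1:end]


def summarise_consensus(consensus_line: str) -> dict:
    """Count each conservation symbol in a consensus line."""
    counts = Counter(ch for ch in consensus_line if ch in CLUSTAL_SYMBOLS)
    return {label: counts[symbol] for symbol, label in CLUSTAL_SYMBOLS.items()}


# Placeholder sequence, long enough for the quoted ranges to fit (assumption).
placeholder_sequence = "A" * 359
for name, (start, end) in GAL8_CRD_RANGES.items():
    segment = slice_residues(placeholder_sequence, start, end)
    print(name, "length:", len(segment))

# Placeholder consensus line, not taken from the real alignment.
print(summarise_consensus("*: .* "))
```

The conversion from 1-based inclusive residue numbers to Python's 0-based half-open slices is the step most prone to off-by-one errors, which is why it is isolated in its own function.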

The conserved amino acids are directly involved in carbohydrate binding, either through hydrogen bonds or through van der Waals interactions with the sugar moiety. Most of them form hydrogen bonds with the bound sugar unit. An important sequence motif in this context is His(158)-Asn(160)-Arg(162) (numbering according to human galectin-3, see Fig. 2). These three amino acids have been found to form hydrogen bonds with the bound galactose residue, for example in galectin-1 (Lobsanov et al., 1993), galectin-3 (Diehl et al., 2010; Seetharaman et al., 1998) and galectin-8 N-CRD (Ideo et al., 2011; Kishishita et al., 2008) (see Fig. 3). The sequence motif can also be found in galectin-8 C-CRD, but as no X-ray crystal structure is available for this CRD, the hydrogen bonds have not been verified yet. Additional residues are involved in the conserved binding process, either by hydrogen bonding (Glu184, Asn174; numbering according to human galectin-3, see Fig. 3) or by van der Waals interaction (Trp181; numbering according to human galectin-3) (Di Lella et al., 2009; Diehl et al., 2010; Lobsanov et al., 1993; Seetharaman et al., 1998).

Galectins: Structures, Binding Properties and Function in Cell Adhesion 7

Fig. 3. Human galectin-3 with the bound galactose unit of LacNAc (**PDB 1KJL**). Hydrogen bonds are shown as dotted lines for residues H158 (C4-OH), N160 (C4-OH), R162 (C4-OH and intramolecular O-atom), N174 (C6-OH), E184 (C6-OH) (Seetharaman et al., 1998; Sörme et al., 2005). Picture made with Swiss-PdbViewer (Guex & Peitsch, 1997).

The importance of the H-bonding amino acids mentioned above has been demonstrated by site-directed mutagenesis of human galectin-1. In those experiments, exchanging single amino acids involved in H-bonding eliminated binding to lactose-sepharose and/or asialofetuin (Hirabayashi & Kasai, 1991; Hirabayashi & Kasai, 1994). Although binding is not completely abolished, a significant influence of the conserved Trp residue on sugar binding was also shown in bovine and human galectin-1 (Abbott & Feizi, 1991; Hirabayashi & Kasai, 1991).

Arg186 is not completely conserved among the different galectins (see Fig. 2, ellipse). The N-terminal domain of galectin-8, for example, has an Ile at the corresponding position, resulting in a different fine specificity for glycans. Owing to this substitution, galectin-8 N-CRD favours lactose structures over LacNAc type II structures in the binding site. In this way, biological functions of galectin-8 that differ from those of other galectins such as galectin-3 are regulated (Salomonsson et al., 2010). Specific binding preferences resulting from the differences in amino acid sequence are discussed in more detail in chapter 4.

#### **2.3 Other specific features of galectins**

**2.3.1 Secretion**

Different tissues are known to produce galectins, and most of them secrete part of the cytosolic galectin pool. The amount of secreted galectin depends on cell type and differentiation status and can be regulated by external triggers (Cooper, 2002; Hughes, 1999). Examples of galectin-producing cells with relevance for regenerative medicine include neurons, epithelial cells of several tissues and liver cells, which produce either several different galectins or a specific subset of galectins (Dumic et al., 2006; Hughes, 1999).

Galectins act intra- and extracellularly. As far as is known, they are secreted via a non-classical mechanism that is not yet fully understood. They lack classical signal sequences for specific localisation but can be found in the extracellular space as well as inside the cell, even in the nucleus (Hughes, 1999). Although the complex regulation of secretion remains elusive, some explanations have been found: galectin-1 secretion depends on binding to a counter-receptor molecule and does not involve plasma membrane blebbing (Seelenmeyer et al., 2005; Seelenmeyer et al., 2008). Galectin-3 secretion also seems to be regulated by binding to other proteins such as chaperones, followed by vesicular secretion (Hughes, 1999; Mehul & Hughes, 1997). The N-terminal domain of galectin-3 is important for subcellular translocation and secretion of the protein (Gong et al., 1999).

**2.3.2 Galectin-1: Importance of reducing conditions**

The lectin activity of galectin-1 depends on reduced cysteine residues. Oxidised galectin-1 has no lectin activity but functions in the regeneration of nerve axons (Horie et al., 2004). Galectin-1 has six cysteine residues that are accessible to the solvent (see Fig. 2). Removing the most accessible cysteine (Cys2) (Lopez-Lucendo et al., 2004), or better all cysteine residues, significantly enhances protein stability under both reducing and non-reducing conditions (Cho & Cummings, 1995; Nishi et al., 2008), while none of them is necessary for lactose binding, as shown by site-directed mutagenesis and X-ray crystallography (Hirabayashi & Kasai, 1991; Lopez-Lucendo et al., 2004).

**2.3.3 Galectin-3: The only known chimera type galectin**

Galectin-3 has some specific properties due to its unique structure. It consists of three parts: 1) an N-terminal 12 amino acid leader sequence containing two phosphorylation sites, 2) a proline- and glycine-rich collagen-like domain necessary for oligomerisation and 3) the carbohydrate recognition domain (Ahmad et al., 2004a; Dumic et al., 2006; Kubler et al., 2008; Mehul & Hughes, 1997; Nieminen et al., 2008). The first few amino acids forming the leader peptide are important for the subcellular localisation and secretion of the protein (Gong et al., 1999). Moreover, phosphorylation of Ser6 seems to regulate the affinity for different ligands and thereby the cellular activity of galectin-3 (Dumic et al., 2006; Mazurek et al., 2000; Szabo et al., 2009; Yoshii et al., 2002). Galectin-3 can be cleaved by different proteases such as metalloproteinases-2 and -9 (gelatinases A and B, respectively), metalloproteinase-13 (collagenase-3) and, with low activity, metalloproteinase-1 (collagenase-1), separating the full-length CRD from the N-terminal extension (Guévremont et al., 2004; Ochieng et al., 1994). The main cleavage position is located between Ala62 and Tyr63, while other cleavage sites are recognised only by some specific proteases and to a lesser extent (Dumic et al., 2006;

