**2. Families and structures of galectins**

Galectins are defined by their β-galactoside binding ability and a common sequence of about 130 conserved amino acids. This sequence homology results in a similar overall three-dimensional structure of the carbohydrate recognition domain (CRD) (Barondes et al., 1994a; Barondes et al., 1994b). Several human galectin CRDs have been characterised by crystallography, including those of human galectin-1, galectin-3 and the N-terminal domain of galectin-8 (Ideo et al., 2011; Kishishita et al., 2008; Lobsanov et al., 1993; Lopez-Lucendo et al., 2004; Seetharaman et al., 1998). The C-terminal domain of galectin-8 has been investigated by NMR (Tomizawa et al., 2008). All of them show a globular fold consisting of two anti-parallel β-sheets of five to six strands each (Ideo et al., 2011; Kishishita et al., 2008; Lobsanov et al., 1993; Lopez-Lucendo et al., 2004; Seetharaman et al., 1998; Tomizawa et al., 2008). The CRDs analysed so far are encoded by three consecutive exons, with most of the conserved amino acids encoded by the middle one (Cooper & Barondes, 1999; Houzelstein et al., 2004).

**2.2 Carbohydrate recognition domains**

The carbohydrate recognition domain is highly conserved throughout different galectins and organisms.

Fig. 2. Sequence alignment of human galectin-1, galectin-3 and the individual carbohydrate recognition domains of galectin-8 (residues 1-150 and 221-359 of isoform a). The sequences used are galectin-1 (NP\_002296), galectin-3 (P17931) and galectin-8 isoform a (NP\_963839) as published on http://www.ncbi.nlm.nih.gov/protein. Completely conserved amino acids are marked with an asterisk, conserved substitutions with a colon and semiconserved substitutions with a dot. Important amino acids mentioned in the following text are additionally highlighted: conserved amino acids of the binding pocket are highlighted in grey; residues important for binding are marked with an ellipse; galectin-1 cysteine residues are marked with circles. The alignment was performed using ClustalW2 at http://www.ebi.ac.uk with the default settings (Chenna et al., 2003; Larkin et al., 2007).

The conserved amino acids are directly involved in carbohydrate binding, through either hydrogen bonds or van der Waals interactions with the sugar moiety. Most of the conserved amino acids form hydrogen bonds with the bound sugar unit. An important sequence motif in this context is His(158)-Asn(160)-Arg(162) (numbering according to human galectin-3, see Fig. 2). These three amino acids have been found to form hydrogen bonds with the bound galactose residue, for example in galectin-1 (Lobsanov et al., 1993), galectin-3 (Diehl et al., 2010; Seetharaman et al., 1998), and galectin-8 N-CRD (Ideo et al., 2011; Kishishita et al., 2008) (see Fig. 3). The sequence motif can also be found in galectin-8
