**2. Exploring the ECM**

organisms' physiology. The best-known examples of ECM-related tissues are the skin, where the ECM acts as a barrier against the outside environment, and the bones, where the ECM is strengthened by a mineral phase that allows the body to stand and move. However, these apparent structural and mechanical properties have overshadowed more subtle roles of the ECM in cell differentiation and function: ECMs are not restricted to load-bearing organs but are present and required in all types of tissues and organs. During the development of the embryo, neural crest cells lose their cell-cell adhesion properties in favor of cell–ECM interactions that allow them to move along the dorsal part of the embryo, reach their specific site of function, and give rise to the future skeleton. Likewise, tissue remodeling, as observed during healing processes, can release messenger molecules that were entrapped in the ECM, waiting for the right moment to trigger their signaling and healing functions [1]. Knowledge of ECM functions remains incomplete, mainly because the comprehensive study of the ECM is challenging. Indeed, the ECM is made of several high molecular weight proteins, proteoglycans, and polysaccharides self-arranged into fibers and networks that are difficult to solubilize and to isolate individually. Basic biochemistry techniques have led to the identification of the major components of ECMs, such as collagens and laminins, but as investigations progress, the collagen and laminin families keep growing and new ECM components with unknown functions are discovered [2]. Moreover, understanding the ECM means not only discovering new molecules but also unraveling their organization in the ECM network. The study of the ECM therefore requires a combination of identification and imaging techniques to give a valuable picture of its composition, organization, and, finally, function.
Interestingly, unraveling ECM complexity meets one of the fundamental questions for biologists: how to recreate and maintain life outside a living organism (literally *ex vivo* but commonly referred to as *in vitro*)?

294 Composition and Function of the Extracellular Matrix in the Human Body

The beginning of the 20th century brought the possibility of dissociating cells from living tissues and culturing them *ex vivo*. This new technique triggered the emergence of the new discipline of cell biology, which has produced most of the knowledge that we possess today on cell proliferation, differentiation, metabolism, cell fate, and death. However, *ex vivo* cell cultures were restricted to two-dimensional (2D) culture systems, originally on glass and subsequently on plastic dishes, occasionally coated with ECM molecules to favor cell adhesion. In parallel with the development of cell biology, the broad field of materials science was creating polymers and devices able to bring *ex vivo* cell culture to the third dimension, and to the 21st century. Dedicated to materials that interact with living tissues, the field of biomaterials encompasses several scientific disciplines, from physics and chemistry to biochemistry and medicine. Several types of three-dimensional (3D) materials have been engineered that may represent valuable tools for fundamental cell research, but a lack of knowledge on ECM structures has undermined their use in cell biology. On the other hand, cell biologists are not necessarily aware of the developments and possibilities created by extensive research in the field of 3D biomaterials, and this partly compromises the expansion of 3D cell culture models.

In this chapter, we will present basic techniques involved in the investigation of extracellular matrices and the data generated by their use to understand ECM composition and organization. Basic knowledge on ECM composition and organization should be useful for biomaterial

Extracellular matrices are multimolecular three-dimensional (3D) networks made of a large variety of ECM-specific molecules, and their compositions and organizations are tissue-specific. Exploring the ECM means (1) determining its distribution within the tissue and its relation to the cell content, (2) identifying and quantifying its components, and (3) characterizing the 3D architecture of the ECM network [2]. ECMs contain similar biomolecules, which can be organized in two main classes: (1) proteins and glycoproteins and (2) proteoglycans and polysaccharides. Variation in the composition or in the amount of certain ECM molecules dramatically changes the physical properties of the ECM, such as the tensile strength of the hard mineralized ECM in bones, the elasticity of the dermis of the skin, or even the transparency of the cornea of the eye. The biochemistry of ECM components strongly influences the techniques used to investigate them. Light microscopy associated with histological staining relies on differences in the biochemical features of tissues (i.e., hydrophobicity, electrical charge, and molecular weight). Proteomics associated with mass spectrometry is a powerful tool to exhaustively identify proteins in a complex sample, but the biochemistry of ECM proteins is particularly unfavorable to this method, which needs significant adaptation to be effective with ECM samples. Finally, electron microscopy is the ideal method to investigate the molecular and fibrillar organization of the ECM network.

#### **2.1. Biochemistry of the main ECM components**

#### *2.1.1. Proteins and glycoproteins*

A large diversity of proteins is found in ECMs, where they are the principal component. They are classified either as structural proteins, which are directly involved in the overall architecture of the ECM, or as soluble factors, which are globular proteins entrapped in the ECM network. Structural proteins are mainly fibrous, insoluble, high molecular weight molecules, including collagens, elastin, laminins, and fibronectins. They directly determine the shape and the mechanical properties of tissues and organs, and they can self-assemble as well as interact with each other to form fibrillar networks and complex 3D architectures. Most ECM proteins have sequences recognized by cells for adhesion, and some of them can specifically bind soluble growth factors or cytokines. These molecules carry several posttranslational modifications, such as hydroxylation of proline and lysine residues in collagens and O-glycosylation and N-glycosylation in laminins and fibronectin.

Collagens are found in all types of ECMs and are the main constituent of connective tissues like skin, bone, and tendons [3]. They belong to a large family of molecules with 28 members identified to date (numbered from collagen type I to type XXVIII). Collagens are trimeric proteins, made of three alpha-chains, specific to each collagen type, that assemble to form a super-helix structure. For some collagen types several alpha-chains exist, leading to multiple isoforms of the same collagen molecule and increasing the diversity and complexity of the collagen family. In ECMs, collagens are organized in different supramolecular assemblies that are specific to each collagen type and depend on its amino-acid sequence and on the 3D folding of its tertiary structure [4]. Fibril-forming collagens include collagen types I, II, III, V, and XI. They assemble in large fibrils (up to 500 nm in diameter) that can merge to form collagen fibers of micrometric size. All ECMs contain fibrillar collagens. Connective tissues are characterized by an abundant ECM content made mainly of collagen type I fibrils in dermis and bone, or of collagen type II fibrils in cartilage. Basement membranes (BM) are a specialized form of ECM mainly found in epithelial tissues and contain heterotypic fibrils combining collagen I and III or V [5]. The size and diameter of collagen fibrils are regulated by other ECM molecules like fibril-associated collagens or proteoglycans. Collagen fibrils and fibers are finally stabilized by covalent crosslinks, making these structures highly resistant to mechanical load and stresses. Network-forming collagens are mostly found in BM, where collagen type IV is the most abundant. Collagen IV molecules assemble in a hexameric superstructure that propagates to finally form a 2D network maintained by covalent crosslinks involving methionine and lysine residues [6].
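The numbering and the two supramolecular classes named above can be summarized in a short Python sketch. It is only an illustration: the helper names (`to_roman`, `collagen_class`) are hypothetical, and only the collagen types explicitly listed in this section are classified. All other types are deliberately left unclassified rather than guessed.

```python
# Roman numeral symbols sufficient for the range 1..28 used for collagen types.
ROMAN = [(10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

def to_roman(n):
    """Convert a small positive integer to a Roman numeral (enough for 1..28)."""
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)

# Classes as named in this section; the sets are not an exhaustive classification.
FIBRIL_FORMING = {"I", "II", "III", "V", "XI"}
NETWORK_FORMING = {"IV"}

def collagen_class(roman):
    """Return the supramolecular class given in the text, if any."""
    if roman in FIBRIL_FORMING:
        return "fibril-forming"
    if roman in NETWORK_FORMING:
        return "network-forming"
    return "not classified in this section"

# Collagen type I to type XXVIII, as numbered in the text.
collagen_types = [to_roman(i) for i in range(1, 29)]

for name in collagen_types[:5]:
    print(f"collagen type {name}: {collagen_class(name)}")
```

Keeping the unclassified types explicit makes it clear that the section names only a subset of the 28 family members.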

Laminins are high molecular weight (from 400 to 900 kDa), heterotrimeric glycoproteins and, along with collagen type IV, they are the main constituent of BM [7]. Although laminins are found in every BM, they form a large family of molecules, and their distribution among BMs is tissue-specific. A laminin molecule consists of the association of one alpha, one beta, and one gamma chain. To date, 5 alpha, 3 beta, and 3 gamma chains have been identified, which may be assembled in 16 different laminin molecules. All laminins share common structural features: a cross-shaped 3D structure with one long and two short arms, disulfide bridges between the chains that maintain their association, and the presence of several N-glycosylations on asparagine residues. Laminins self-assemble in a network interlaced with the collagen type IV network. Directed toward the cells, laminins give cues for cell adhesion through integrin receptors.
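A quick calculation puts these chain numbers in perspective. The sketch below assumes, purely for illustration, that any alpha chain could pair with any beta and any gamma chain. The chain labels are placeholders, and the only figures taken from the text are the chain counts (5, 3, 3) and the 16 identified laminins.

```python
from itertools import product

# Placeholder labels; only the counts (5 alpha, 3 beta, 3 gamma) come from the text.
ALPHA = [f"alpha{i}" for i in range(1, 6)]
BETA = [f"beta{i}" for i in range(1, 4)]
GAMMA = [f"gamma{i}" for i in range(1, 4)]

def theoretical_trimers(alphas, betas, gammas):
    """List every heterotrimer made of one alpha, one beta, and one gamma chain."""
    return list(product(alphas, betas, gammas))

trimers = theoretical_trimers(ALPHA, BETA, GAMMA)
IDENTIFIED = 16  # number of laminin molecules reported in the text

print(f"theoretical combinations: {len(trimers)}")
print(f"fraction identified: {IDENTIFIED / len(trimers):.2f}")
```

Under this assumption, 5 × 3 × 3 = 45 heterotrimers are conceivable, so the 16 identified laminins represent roughly a third of the theoretical combinations.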

Elastin is organized in fibers closely linked to fibrillar collagens, where it gives elasticity to tissues and compensates for the tensile strength of collagen fibers [8]. Elastin is secreted by cells as a 60–70 kDa monomeric soluble precursor, tropoelastin, which contains intermittent hydrophobic domains. Tropoelastin monomers self-assemble to form elastin fibers that are stabilized by enzymatic cross-linking through lysine residues, which renders the elastin network highly insoluble. Stacks of hydrophobic domains in the elastin network are responsible for its elastic properties and make elastin highly resistant to enzymatic degradation and to solubilization in aqueous solutions.

Fibronectin is a large (500 kDa) dimeric glycoprotein made of two nonidentical monomers linked by two disulfide bonds at their C-terminal ends [9]. The diversity of the monomers is due to alternative splicing of the fibronectin mRNA, as fibronectin is encoded by a single gene. Fibronectin is expressed by several cell types and is found in most ECMs. It assembles through disulfide bridges into oligomers and finally into insoluble fibers with diameters ranging from 10 nm to the micrometer range [10]. A soluble dimeric form may also be found circulating in the blood. The primary structure of fibronectin is arranged in several domains that specifically interact with collagens or with cells via integrins.

Globular, soluble proteins are associated with the ECM network of structural proteins. They include growth factors, cytokines, and ECM-specific proteolytic enzymes like matrix metalloproteinases (MMP). They play an important role in cell signaling, in the remodeling of the ECM network, and ultimately in the overall biological activity of ECMs. They can be linked to structural proteins by labile interactions at specific binding sites or be trapped in the high molecular weight chains of the structural proteins and proteoglycans. However, they are not core proteins of the ECM network, and their biochemistry is similar to that of most other globular proteins.

#### *2.1.2. Proteoglycans and polysaccharides*

Polysaccharides found in the ECMs of vertebrates are glycosaminoglycans (GAG) and are covalently linked to a core protein to form proteoglycans, except for hyaluronan, which is the only "pure" polysaccharide of ECMs [11]. Although this chapter focuses on mammalian ECMs, it should be mentioned that polysaccharides are the main ECM components of invertebrates and plants, represented by chitin and cellulose, respectively. Hyaluronan, also called hyaluronic acid, has the particularity of being synthesized at the plasma membrane by three different hyaluronan synthase enzymes, and not inside the Golgi apparatus like the GAGs of proteoglycans [12]. GAGs are linear, unbranched polysaccharides composed of tens to hundreds of disaccharide units. The combination of disaccharide units is highly heterogeneous, but can be specific for each individual chain. The disaccharide unit is made of glucosamine or galactosamine linked to another modified hexose, most often glucuronic acid, iduronic acid, or galactose. These monosaccharides are mainly modified by N-acetylation and N-sulfation. The nature of the disaccharide unit and the types of modifications lead to the formation of different types of GAG, including chondroitin sulfate, dermatan sulfate, keratan sulfate, heparan sulfate, and hyaluronan. At physiological pH, GAG chains are highly negatively charged due to the sulfate and carboxylic acid functions carried by the modified hexoses. The net negative charge of GAGs makes them highly hydrophilic, and thus they play an important role in the hydration of ECMs [13]. The high amounts of water associated with GAGs confer some mechanical properties on ECMs, especially resistance to compression, as in cartilage. Proteoglycans are abundant within the ECM, but may also be found at the cell membrane or intracellularly. The most active part of a proteoglycan is its GAG chains, which can interact with growth factors, cytokines, cell receptors, and other constituents of the ECM. However, their core proteins also possess interaction sites that make proteoglycans highly versatile molecules inside ECMs [14]. Through their interactions with ECM components, they play a role in ECM organization, but their most important role is to act as a reservoir for growth factors and to anchor signal molecules, which are released by specific enzymes, in particular after injuries, and favor wound healing.
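The GAG types listed above can be laid out as a small lookup table. The sketch below is an illustration only: the `Gag` record and the function name are hypothetical, and the disaccharide composition assigned to each GAG follows the usual textbook description rather than anything stated in this chapter, so it should be read as an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gag:
    name: str
    amino_sugar: str             # glucosamine or galactosamine
    partners: tuple              # second sugar(s) of the disaccharide unit
    sulfated: bool
    bound_to_core_protein: bool  # True when part of a proteoglycan

# Compositions are textbook assumptions, not data taken from this chapter.
GAGS = [
    Gag("chondroitin sulfate", "galactosamine", ("glucuronic acid",), True, True),
    Gag("dermatan sulfate", "galactosamine", ("iduronic acid", "glucuronic acid"), True, True),
    Gag("keratan sulfate", "glucosamine", ("galactose",), True, True),
    Gag("heparan sulfate", "glucosamine", ("glucuronic acid", "iduronic acid"), True, True),
    Gag("hyaluronan", "glucosamine", ("glucuronic acid",), False, False),
]

def free_polysaccharides(gags):
    """Return the GAGs that are not covalently linked to a core protein."""
    return [g.name for g in gags if not g.bound_to_core_protein]

print(free_polysaccharides(GAGS))
```

Filtering on `bound_to_core_protein` recovers the point made in the text: hyaluronan is the only GAG in this list that is not part of a proteoglycan.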
