**Meet the editor**

After earning a master's degree in physics and chemistry, Dr. Claire Lesieur switched to biochemistry and biophysics for her PhD. She worked on the pore-forming toxin aerolysin, her favorite example of protein fold plasticity: it starts as a soluble monomer and ends as a heptameric pore. Dr. Lesieur then did a postdoc on the oligomerization of the cholera toxin B into pentamers, aimed at isolating assembly intermediates. The difficulty of producing such intermediates experimentally drove her to explore computational approaches. The idea was to identify the amino acids that determine oligomerization and to investigate how they drive subunit association. Ultimately, one may hope that the molecular detail of the process will help in designing better inhibitors against pathological oligomers.

### Contents

**Preface**

**Section 1 What Chemical and Biological Oligomers Have in Common**

Chapter 1 **End-capped Oligomers of Ethylene, Olefins and Dienes, by means of Coordinative Chain Transfer Polymerization using Rare Earth Catalysts**

Thomas Chenal and Marc Visseaux

Chapter 2 **The Use of Ionic Liquids in the Oligomerization of Alkenes**

Csaba Fehér, Eszter Kriván, Zoltán Eller, Jenő Hancsók and Rita Skoda-Földes

Chapter 3 **Silk Fiber – Molecular Formation Mechanism, Structure-Property Relationship and Advanced Applications**

Xinfang Liu and Ke-Qin Zhang

Chapter 4 **Ethylene Oxide Homogeneous Heterobifunctional Acyclic Oligomers**

Calin Jianu

Chapter 5 **Higher Oligomeric Surfactants – From Fundamentals to Applications**

D. Jurašin and M. Dutour Sikirić

Chapter 6 **Oligomerization of Nucleic Acids and Peptides under the Primitive Earth Conditions**

Kunio Kawamura

**Section 2 Biological Oligomers**

Chapter 7 **Oligomerization of Biomacromolecules – Example of RNA Binding Sm/LSm Proteins**

Bozidarka L. Zaric

Chapter 8 **Protein Oligomerization**

Giovanni Gotte and Massimo Libonati

Chapter 9 **Oligomerization of Proteins and Neurodegenerative Diseases**

Dai Mizuno and Masahiro Kawahara

Chapter 10 **Structure and Function of Stefin B Oligomers – Important Role in Amyloidogenesis**

Ajda Taler-Verčič, Mira Polajnar and Eva Žerovnik

**Section 3 Computational Approaches**

Chapter 11 **The Assembly of Protein Oligomers – Old Stories and New Perspectives with Graph Theory**

Claire Lesieur

Chapter 12 **Geometry and Topology in Protein Interfaces – Some Tools for Investigations**

Giovanni Feverati

Chapter 13 **From Tilings to Fibers – Bio-mathematical Aspects of Fold Plasticity**

C. Lesieur and L. Vuillon

Chapter 14 **Characterization of Some Periodic Tiles by Contour Words**

Guy Cousineau

### Preface

This book is about the oligomerization of chemical and biological compounds. Oligomers are built by the association of several copies of a unit. The units that compose a chemical oligomer span a broad range of chemical compositions. The units that compose biological oligomers are more restricted. There are polymers of amino acid units, called proteins, and polymers of nucleotides, called DNA and RNA. There are also amphipathic polymers called lipids, which are made of a polar head and chains of carbon atoms, commonly C14 to C24. These are covalent biological polymers: the units are covalently bound to one another. Then there are protein oligomers, whose unit is a protein chain. Both chemical and biological oligomers include covalent and non-covalent oligomers, as described in several chapters (e.g. chapters 5 and 8). The book is divided into three sections.

Section 1 includes six chapters that highlight the ground shared by the two types of oligomers. Chapters 1 to 3 explore the use of chemical and biological oligomers as innovative materials. The first and second chapters deal with the synthesis of olefin oligomers. They lay out the challenges of synthesizing such molecules with particular properties and their applications in industry, in particular the oil industry. The chapters thus present technical challenges as well as applications. The first chapter, by Thomas Chenal and Marc Visseaux, describes Coordinative Chain Transfer Polymerization (CCTP) and how this method allows the production of functionalized end-capped olefin oligomers. The second chapter, by Csaba Fehér and co-authors, covers the use of ionic liquids as catalysts in the oligomerization of alkenes (olefins). The authors recall why olefin oligomers are necessary in the oil industry, give an overview of the different methods of synthesis and the different oligomeric products, and finally focus on the promising use of ionic liquids as efficient catalysts for producing olefin compounds. The use of ionic liquids, also called molten salts, is an alternative to two-phase catalysis. The detailed understanding of the mechanism of olefin oligomerization illustrates the knowledge gap between chemical and biological oligomerization. In chemical oligomerization, it is already possible, although still very challenging, to create tailor-made molecules with targeted properties. In contrast, research on biological oligomerization still mainly focuses on studying the oligomers themselves. The level of understanding required to produce biological compounds by design has not yet been reached. Nevertheless, synthetic biology is a booming field and the gap is certainly closing; the future is open to nano-biomaterials. The third chapter, by Xinfang Liu and Ke-Qin Zhang, on silkworm and spider fibers, is a perfect example of recent progress in that direction. The authors have produced a thorough review, covering everything from the basic structural elements of the fibers to their remarkable mechanical properties. Finally, the authors discuss their use as biomaterials in the textile industry.

Chapters 4 to 6 address more fundamental mechanistic problems and illustrate that chemical and biological oligomerization share basic prerequisites. An oligomer is defined as the association of several copies of a "unit" to achieve complex and diverse (novel) conformations. Although the "unit" may vary enormously in composition, the formation of an oligomer always entails a remarkable cut in the chemical or building cost, because a single chemical "piece" is used in a combinatorial manner to produce distinct three-dimensional local arrangements and yield a broad variety of global conformations. Building equivalent conformations with variable "pieces" would be chemically more expensive or, for biological oligomers, more costly in terms of coding. How a small local change produces a large variety of conformations and functions is well illustrated in the fourth chapter, by Calin Jianu. The author describes oligomeric derivatives of the ethylene oxide molecule, R1(OCH2CH2O)R2, obtained by simply changing R1, R2 and the environment. In chapter 5, D. Jurašin and M. Dutour Sikirić introduce conventional surfactants, which are amphipathic molecules made of a polar group and a hydrophobic group. They describe how surfactants can pass from a population of dispersed monomers to a bilayer (elongated oligomer), then to a micelle (cyclic oligomer) and up to supramolecular assemblies such as liquid crystals (phase diagram). In other words, surfactants make oligomers that grow in one direction (micelle and bilayer) and oligomers that grow in more than one direction (supramolecular assembly). Interestingly, protein oligomers that grow in one direction (cyclic oligomers) and protein fibers that grow in two directions are their biological equivalents. Thus the phase diagram also stands for the transition from protein oligomer to protein fiber. The importance of the growth direction in the formation of protein fibers is explored in chapter 13 by Laurent Vuillon and myself. Moreover, D. Jurašin and M. Dutour Sikirić discuss the novelty of introducing a covalent linkage, referred to as a spacer, between monomers, and the effects of the spacer (nature, length, rigidity) on the phase diagram and the oligomeric state of the surfactant. Again, there is a protein equivalent of such a spacer, the hinge loop, which introduces sufficient fold plasticity to allow a protein to change its oligomeric state. This mechanism, called domain swapping, is discussed in chapter 8 by Gotte and Libonati, together with its consequences when the function of the protein is lost upon the conformational change, leading to so-called conformational diseases (e.g. Alzheimer's disease). Conformational diseases and fibers are also discussed in detail in chapters 9 and 10. These analogies are interesting, and hopefully reading the chapters on chemical oligomers will open new perspectives for investigating biological oligomers and vice versa.


Of course, it is tempting to make another analogy: between a soup of amino acids interacting via weak bonds and conventional surfactants on the one hand, and between covalently bound amino acids, namely proteins, and covalent surfactants on the other. The length, rigidity and nature of the "spacer" could be compared to the backbone of the protein (loop, beta strand, alpha helix), which would introduce the flexibility needed to produce three-dimensional structures. Loops would introduce some "laxity", while beta strands and alpha helices would introduce some rigidity. The authors show that the covalent bond reduces the CMC (critical micelle concentration) and allows the formation of supramolecular structures at lower "unit" concentration. In that sense, introducing a peptide bond between amino acids may have promoted the formation of proteins at lower amino acid concentration, acting like a catalyst. Likewise, the "hinge loop" in domain swapping may allow oligomerization at lower protein concentration.

The presence of cyclic and elongated oligomers takes on a different meaning when reading chapter 6, by Kunio Kawamura, on the oligomerization of nucleic acids and peptides under primitive Earth conditions. Escaping cyclic oligomerization seems a decisive step towards the formation of the more complex molecules of life. This chapter makes an appropriate bridge between the chemical and the biological worlds and sheds light on the key stages necessary for life. It reviews the state of the art on the origin of life, from chemical molecules to DNA/RNA and proteins. The author takes us on a fascinating journey through time and space.


Section 2 includes chapters 7 to 10 and covers biological oligomerization. It starts with chapter 7, by Bozidarka L. Zaric, which describes the RNA world and Sm/LSm proteins, protein oligomers that participate in the maturation of RNA. It makes a good bridge between the chapter by Kunio Kawamura and chapters 8 to 10, which focus on protein oligomerization. Chapter 7 covers RNA maturation and the splicing mechanism, a nice example of the complexity of biological oligomers and of the combinatorial aspect of such complexity. The spliceosome, the cellular machinery that controls splicing and is made of protein and RNA entities, is dissected so that readers can understand this huge biological assembly and how all the partners are orchestrated to carry out one of the most important cellular activities in higher organisms. Chapter 8, by Gotte and Libonati, has already been mentioned. The authors provide a thorough review of protein oligomerization, illustrate the broad diversity of protein oligomers and the different technical methods used to study them, and discuss the plasticity that lets certain proteins undergo a fold transition between distinct oligomeric states. Chapters 9 and 10 explain how such fold plasticity may unfortunately lead to diseases, the so-called conformational diseases. Chapter 9, by Mizuno and Kawahara, focuses more specifically on the proteins involved in pathological oligomerization and neurodegenerative diseases. The authors present biophysical and biochemical results as well as the molecular and cellular context. Finally, chapter 10, by Ajda Taler-Verčič and co-authors, discusses the particular case of stefin B (cystatin B) oligomers and why it is a good model of conformational diseases. Interestingly, they discuss several familial mutations known to promote the transition from healthy cystatin B to pathological cystatin B oligomers and to lead to myoclonus epilepsy. This is another example of how a local change may have a great impact on the conformation of the molecule. The authors also describe some mechanisms of pathological assembly and the role of cellular factors such as lipids in these mechanisms.

Section 3 includes chapters 11 to 14 and presents results from computational and theoretical approaches, to stress their value as alternative and complementary methods to experimental approaches. Chapter 11, which I wrote, bridges the two by discussing both experimental and computational results. It also emphasizes recent advances in the understanding of protein oligomerization using graph and network theories. Chapter 12, by Giovanni Feverati, focuses on the area of contact between two adjacent chains in a protein oligomer, the so-called protein interface. The author proposes an algorithm to identify hot spots, the amino acids that determine the formation of a protein interface, and analyzes some properties of the interfaces over a dataset of 40 protein oligomers. This chapter is a good illustration of the results that can be obtained by a computational approach. Chapter 13, by Laurent Vuillon and myself, already mentioned, proposes a mathematical framework for fiber formation based on 2D tilings and symmetry operations. A particular effort has been made to guide the reader through the mathematics that capture the structural determinants required for fiber formation. The result may have implications beyond protein fibers, in biological quasi-crystals and viral assembly, but these aspects are not treated. The main breakthrough is a means of identifying, on the 3D structure, the basic local properties that make a protein oligomer more susceptible to fiber formation. Hopefully, this will open new avenues for designing therapeutic strategies to protect against pathological oligomerization. The book ends with chapter 14, by Guy Cousineau, in which the notion of 2D tiling is extended to other symmetries. This chapter is pure mathematics but comes as an extension of chapter 13, and we hope it will prove useful for understanding the construction of some biological oligomers.

The large conformational plasticity of oligomers, in which local features have global consequences, is the common notion that runs through the chapters. It is the main message of the book.

> **Claire Lesieur** AGIM, UGA-CNRS, France

**Section 1**

**What Chemical and Biological Oligomers Have in Common**



**Chapter 1**

## **End-capped Oligomers of Ethylene, Olefins and Dienes, by means of Coordinative Chain Transfer Polymerization using Rare Earth Catalysts**

Thomas Chenal and Marc Visseaux

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58217

#### **1. Introduction**

Polymerization catalysis has developed enormously with the progress of organometallic chemistry. Metallocenes, post-metallocenes and constrained geometry complexes (CGC) have been used as single-site catalysts by polymer chemists, with the aim of producing polymeric materials with improved properties that cannot be attained by other synthetic strategies [1]. Since the beginning of the 21st century, alongside the search for new organometallic structures that could be exploited as potential catalysts, new methods and concepts have been developed to better control polymerization catalysis: Living (up to Immortal) Polymerization [2], Chain Walking [3] and Chain Shuttling [4] have emerged. Mastering transfer reactions in polymerization catalysis, as in Coordinative Chain Transfer Polymerization (CCTP) [5], has recently reappeared as a tool that would allow better control of the whole process and would also open the way to unprecedented macromolecular architectures. Mastering transfer reactions is also of high interest because it allows the preparation of functionalized oligomers.

We describe in this chapter some representative examples of our recent results that illustrate the possibilities offered by controlling transfer reactions in rare-earth-mediated oligomerization catalysis. In particular, the combination of dialkylmagnesium with rare earth (RE) precatalysts afforded active systems for the oligomerization of a series of olefins including ethylene, octene, styrene, butadiene, isoprene and β-myrcene. Chain transfer to the magnesium atom afforded long-chain di(polyolefin)magnesium derivatives (see scheme 1). The characterization and applications of these end-capped oligomers are then discussed at length.
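As a rough illustration of why chain transfer to magnesium gives control over oligomer length, the sketch below estimates the average chain length under an idealized assumption that is not stated in the text: every alkyl group on magnesium grows one chain, so each MgR2 carries two chains, and the consumed monomer is shared equally among them. The function name, its parameters and the neglect of end-group mass are illustrative choices, not the authors' method.

```python
# Idealized chain-length estimate for chain transfer to a dialkylmagnesium.
# Assumptions (hypothetical, for illustration only):
#   - each MgR2 carries two growing chains (di(polyolefin)magnesium),
#   - all consumed monomer is distributed evenly over those chains,
#   - the mass of the end groups (R and Mg) is ignored.

ETHYLENE_MOLAR_MASS = 28.05  # g/mol, C2H4


def ideal_cctp_chain_length(monomer_mol, mg_mol, chains_per_mg=2,
                            monomer_molar_mass=ETHYLENE_MOLAR_MASS):
    """Return (degree of polymerization, number-average molar mass in g/mol)."""
    if monomer_mol < 0 or mg_mol <= 0 or chains_per_mg <= 0:
        raise ValueError("amounts must be positive (monomer may be zero)")
    n_chains = chains_per_mg * mg_mol      # moles of growing chains
    dp = monomer_mol / n_chains            # monomer units per chain
    mn = dp * monomer_molar_mass           # g/mol, end groups neglected
    return dp, mn


if __name__ == "__main__":
    # 1.0 mol of ethylene consumed in the presence of 0.01 mol of MgR2
    dp, mn = ideal_cctp_chain_length(1.0, 0.01)
    print(f"DP = {dp:.1f}, Mn = {mn:.1f} g/mol")
```

Under these assumptions the chain length is set by the monomer-to-magnesium ratio alone, which is the sense in which controlling transfer gives access to oligomers of a chosen size; real systems deviate when transfer is incomplete or slow.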

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


End-capped Oligomers of Ethylene, Olefins and Dienes, by means of Coordinative Chain Transfer…

http://dx.doi.org/10.5772/58217


**Scheme 1.** End-capped oligomers starting from dialkyl magnesium

Particular emphasis will be given to the ease of implementation of the process, which demonstrates its usefulness as a versatile tool for the elaboration of high added-value macromolecular objects. This ease is illustrated by the possibility, which we disclosed, of assessing a large set of RE/MgR2 combinations, including *in situ* synthesized catalysts as depicted in scheme 2. In particular, homoleptic, mono- or bis-substituted complexes can be prepared straightforwardly in the catalytic mixture, simply starting from RE salts [6].

[cat] = Rare Earth Salt + H-Ligand + MgR2

**Scheme 2.** *In situ* synthesis of catalytic species as a convenient process.

These results are rooted in the pioneering breakthrough of Prof. Mortreux [7-9], and their impact is well illustrated by subsequent exploitation by other groups [10-19] and by a booming development of chain transfer agents. Moreover, when the alkyl-Mg group contains a functional moiety (i.e. aromatic ether, amino or alkenyl), telechelic oligomers may be obtained [12] (see scheme 3).


**Scheme 3.** Examples of dialkylmagnesium initiators

The starting compounds of the process are dialkylmagnesium compounds MgR2 (see scheme 3). Solutions in hydrocarbons are commercially available, including at the industrial scale, for example dibutylmagnesium (Aldrich, 1 M in hexane), butylethylmagnesium (Texas Alkyls, 20% in heptane) or butyloctylmagnesium (Akzo Nobel, 20% in heptane). Alternatively, more sophisticated dialkylmagnesium compounds were synthesized to serve as functionalized initiators, for example bis(ortho-dimethylaminobenzyl)magnesium(THF), bis(ortho-methoxybenzyl)magnesium(THF) [20], bis(3-butenyl)magnesium(dibutylether)2 and bis(10-undecenyl)magnesium(dibutylether)2 [12]. Shorter compounds, i.e. diethyl-, dimethyl- and even dihydridomagnesium, were excluded from the scope of this study because of their insolubility. Some difficulties were encountered when using the well-known RMgX Grignard reagents, for two distinct reasons: first, the halide atoms clearly compete with alkyl groups on the active centers and disturb the equilibria leading to the formation of the active Cat-R species; second, ethers are the usual solvents of Grignard species, yet they inhibit the polymerization because of the high oxophilicity of the catalyst metal. Hence, specially designed initiators were conveniently obtained by converting Grignard reagents into dialkylmagnesium compounds (Schlenk equilibrium, see scheme 4) through precipitation of MgX2(dioxane) and concentration of the filtered solution, which reduced the ethers to a minimum level.

2 RMgX (Grignard reagent) ⇌ MgR2 (dialkylmagnesium) + MgX2

**Scheme 4.** Schlenk equilibrium


4 Oligomerization of Chemical and Biological Compounds


#### **2. Oligomers of Ethylene**

#### **2.1. Ease of implementation**

Unlike liquid monomers, which can be handled with classical Schlenk techniques, ethylene, which is gaseous at standard conditions, requires particular equipment, for example for a solution process. The solvents tested and found convenient were hydrocarbons: linear or cyclic alkanes and aromatics, possibly halogenated. Rigorous exclusion of air, moisture and protic impurities was achieved by working with a sealed reactor equipped with a vacuum line (see scheme 5). After a purge with clean gas, a simple wash with a sacrificial batch of dialkylmagnesium solution gave the required purity level. Since ethylene polymerization is highly exothermic, the reactor was carefully thermostated. Likewise, kinetic limitations due to the dissolution rate of ethylene were overcome by powerful stirring. The ethylene inlet was regulated at a constant pressure, atmospheric or higher. A mass flowmeter and a data acquisition system recorded the exact amount of ethylene consumed throughout the reaction, which allowed the average molar mass of the chains to be deduced at any time.
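The deduction of the average molar mass from the recorded ethylene uptake can be sketched as follows. This is only an illustrative calculation, not the authors' data treatment: it assumes that every Mg-alkyl bond carries one growing chain (two chains per magnesium), that all consumed ethylene ends up in these chains, and that the chains are hydrolysed to R-(CH2CH2)n-H. The function name and the default starting alkyl mass (the average of butyl and ethyl, as in butylethylmagnesium) are hypothetical choices made for the example.

```python
ETHYLENE_MOLAR_MASS = 28.054  # g/mol, C2H4
HYDROGEN_MASS = 1.008         # g/mol, chain end after hydrolysis


def theoretical_molar_mass(ethylene_consumed_g, mg_mol,
                           start_alkyl_mass=43.09, chains_per_mg=2):
    """Average molar mass (g/mol) of hydrolysed chains R-(CH2CH2)n-H.

    Assumptions (not from the source): two chains per magnesium atom,
    all consumed ethylene inserted, no chain termination.
    start_alkyl_mass defaults to the mean of butyl and ethyl groups.
    """
    n_chains = chains_per_mg * mg_mol
    units_per_chain = (ethylene_consumed_g / ETHYLENE_MOLAR_MASS) / n_chains
    return start_alkyl_mass + HYDROGEN_MASS + units_per_chain * ETHYLENE_MOLAR_MASS


# Example: 0.5 g of ethylene consumed on 200 µmol of dialkylmagnesium
example_mass = theoretical_molar_mass(0.5, 200e-6)
print(round(example_mass, 1))
```

Under these assumptions, half a gram of ethylene on 200 µmol of magnesium already corresponds to chains in the 1000-2000 g/mol range.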

**Scheme 5.** Reactor equipment

The catalytic species for ethylene insertion into alkyl chains were obtained *in situ* by adding to the dialkylmagnesium reagent a solution of a rare earth salt that also contained the necessary amount of ligand in its protio form, so as to obtain the pre-catalyst in solution (see next paragraph). As a typical example, two equivalents of pentamethylcyclopentadiene ligand afford the *in situ* formation of a decamethylmetallocene. Several metals were tested, for example yttrium, lanthanum, neodymium, samarium and lutetium [21-22]. Although the catalytic activity and selectivity clearly differed depending on the nature of the rare earth, these differences were of little importance as far as ethylene polymerization is concerned [9]. The neodymium species have however been much more studied in the literature, probably because of their interesting properties for the polymerization of other monomers, especially dienes. Different anions of the rare earth species were used, for example chloride, borohydride, alkoxides (tert-butoxides, phenoxides), carboxylates (versatate, 2-ethylhexanoate), phosphates and phosphonates. Once more, differences in reactivity were noticed, but the most important feature of the anionic ligand was the solubility of the corresponding salts. A comparative study of the chloride and versatate salts of neodymium gave exactly the same oligomer distribution at the same ethylene conversion, but the reaction started immediately with the versatate, which was soluble in toluene, whereas it was sluggish with the insoluble chloride salt. Neodymium versatate has the further advantage of being available at an industrial scale (Solvay-Rhodia Rare Earth Systems), while at a laboratory scale neodymium borohydride [23] was preferred for its well-defined structure and its purity, which allow easier spectroscopic characterization and hence better control of the generation of catalytic centers.

Well-defined organometallic compounds bearing two pentamethylcyclopentadienyl ligands and an alkyl group in the coordination sphere of the rare earth metal were initially established as best suited [24]. However, these catalytic species, while very active under polymerization conditions, were found to be excessively prone to decomposition in the absence of monomer and impossible to store. Hence, several synthetic routes were proposed to generate the catalyst *in situ*, starting from a series of precatalysts (see scheme 6). The earliest route (A) reported in the literature involved hydride species [21], which can be formed by hydrogenolysis of a crowded alkyl, bis(trimethylsilyl)methyl (TMS2CH-). A second route (B), developed at the turn of the 21st century [25-26], consisted in the use of a crystalline bis(pentamethylcyclopentadienyl) rare earth chloride. The latest route (C), proposed a few years ago [6,27-28], uses a "naked" rare earth salt directly as catalyst precursor, mixed with two equivalents of ligand precursor (pentamethylcyclopentadiene, C5Me5H, noted Cp\*H). When this mixture was added to a solution of excess dialkylmagnesium under atmospheric pressure of ethylene, it took only a few seconds for the activity to rise to the same level as with well-defined catalysts. This was interpreted in terms of an alkyl/X ionic metathesis between the rare earth and the magnesium atom, affording rare earth alkyls. In a subsequent step, the Ln-alkyl species was assumed to react with the ligand precursor Cp\*H, with irreversible alkane elimination, to form a {Cp\*2LnR} compound. Evidence for this mechanism came from an NMR study that excluded the other possible pathway: a mixture of dialkylmagnesium and two equivalents of Cp\*H gave the expected complex Cp\*2Mg only very slowly compared to the rate of catalyst formation.

**Scheme 6.** Various strategies leading to the formation of the active species.


The reaction was tested at a laboratory scale; the following details are given as an example. In a glove box, butylethylmagnesium (BEM, 200 µmol) was weighed in a syringe and diluted in 20 mL of toluene (previously degassed by argon bubbling and stored over molecular sieves). The polymerization reactor was purged under vacuum and filled with ethylene. Injection of the BEM solution (200 µmol, 20 mL toluene) into the reactor and stirring at 90°C passivated the inner walls of the reactor. Meanwhile, a new batch of BEM solution was prepared in the glove box. In a 2 mL syringe, neodymium versatate (NdVs3, 13.6 mg, 20.7 µmol) and pentamethylcyclopentadiene (Cp\*H, 7.0 mg, 51 µmol) were weighed and diluted with 1 mL of this new BEM solution. After evacuation of the passivation solution, the remaining 19 mL of BEM solution was injected into the reactor and stirred until pressure and temperature were stable. Injection of the small syringe solution started the polymerization. Monitoring made it possible to measure the amount of ethylene consumed over time, and consequently the theoretical chain length of the polyethylene produced. The reaction can be stopped at a desired degree of polymerization or, if left to run freely, it stops spontaneously after a short burst of activity, the average molar mass being around 1000-2000 g/mol. The polymerization could be quenched with any reagent known to react with Grignard species, typically an electrophile: for example, protic compounds gave alkanes, aldehydes or ketones gave secondary or tertiary alcohols, carbon dioxide gave carboxylic acids, iodine gave iodinated alkanes, etc. (see the functionalization section further on).
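The quantities quoted in this example can be cross-checked with a few lines of arithmetic. The sketch below only reuses the masses and amounts given in the text; the variable names are illustrative, and the assumption that each magnesium carries two alkyl chains is ours.

```python
# Molar mass of pentamethylcyclopentadiene, C10H16 (standard atomic masses)
CP_STAR_H_MOLAR_MASS = 10 * 12.011 + 16 * 1.008  # about 136.2 g/mol


def micromoles(mass_mg, molar_mass_g_per_mol):
    """Convert a mass in mg to an amount in µmol."""
    return mass_mg / molar_mass_g_per_mol * 1000.0


bem_umol = 200.0  # butylethylmagnesium in the fresh solution (19 mL + 1 mL)
nd_umol = 20.7    # neodymium versatate, as quoted in the text
cp_umol = micromoles(7.0, CP_STAR_H_MOLAR_MASS)  # 7.0 mg of Cp*H

mg_per_nd = bem_umol / nd_umol          # Mg/Nd ratio
cp_per_nd = cp_umol / nd_umol           # ligand precursor per Nd
chains_per_nd = 2 * bem_umol / nd_umol  # assumes two alkyl chains per Mg

# Molar mass of the neodymium salt implied by 13.6 mg for 20.7 µmol
nd_salt_molar_mass = 13.6 / nd_umol * 1000.0

print(round(cp_umol, 1), round(mg_per_nd, 1), round(cp_per_nd, 2))
```

The numbers are consistent with the text: 7.0 mg of Cp\*H is about 51 µmol, which is slightly more than two equivalents per neodymium, while magnesium is in roughly tenfold excess over the rare earth.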

#### **2.2. Mechanism of control**

The preferred paradigm for understanding the present reaction comes from the concept of "Aufbaureaktion" developed early on by Karl Ziegler in the fifties [29]. The process studied at that time aimed at growing the linear chains of alkylaluminum compounds by insertion of ethylene. When potential catalysts were introduced, the investigations proved particularly fruitful with titanium chloride and turned toward a new field nowadays known as "Ziegler-Natta polymerization". Although originally restricted to alkylaluminum compounds, the Aufbaureaktion concept can be transposed to alkylmagnesium compounds provided that a suitable catalyst is found. Fortunately, in our laboratory, studies of rare earth complexes as model compounds of Ziegler-Natta catalysts revealed that dialkylmagnesium compounds behave as efficient transfer agents during ethylene polymerization with these rare earth complexes. The striking fact was that the transfer process between magnesium and rare earth was fast and reversible. This means that, at a large Mg/RE ratio, the alkyl chains were essentially located on magnesium; nevertheless, insertion of ethylene seemed to occur homogeneously on all these chains, with the activity and selectivity of the rare earth catalyst. Each chain can be considered as continuously transferred from one metal atom to another, with insertion of ethylene occurring only while it resides on a lanthanide center. Alkyl chains were found to be inactive toward ethylene insertion when located on magnesium, so in that state they were named "dormant species", since they were only waiting for a transfer to the lanthanide (see scheme 7).


**Scheme 7.** Postulated mechanism for chain growth on dialkylmagnesium.
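The consequence of a fast and reversible transfer, namely that ethylene appears to insert homogeneously into all chains, can be illustrated with a toy simulation. This is a deliberately simplified sketch and not a kinetic model from the source: it assumes the exchange is so fast that, at each insertion event, every chain has the same probability of being the one located on the lanthanide. The chain count, number of insertions and end-group mass are arbitrary illustrative values.

```python
import random

ETHYLENE_MOLAR_MASS = 28.054  # g/mol per inserted C2H4 unit


def simulate_cctp(n_chains=1000, n_insertions=20000, seed=0):
    """Return the number of ethylene units inserted in each chain.

    Idealised assumption: chain exchange between Mg and Ln is much faster
    than insertion, so each insertion picks a chain uniformly at random.
    """
    rng = random.Random(seed)
    lengths = [0] * n_chains
    for _ in range(n_insertions):
        lengths[rng.randrange(n_chains)] += 1
    return lengths


def mn_and_dispersity(lengths, end_mass=44.1):
    """Number-average molar mass and dispersity Mw/Mn of the chains.

    end_mass is a hypothetical combined mass of the two chain ends.
    """
    masses = [end_mass + n * ETHYLENE_MOLAR_MASS for n in lengths]
    mn = sum(masses) / len(masses)
    mw = sum(m * m for m in masses) / sum(masses)
    return mn, mw / mn


lengths = simulate_cctp()
mn, dispersity = mn_and_dispersity(lengths)
print(round(mn, 1), round(dispersity, 3))
```

In this idealised limit all chains grow at the same average rate and the dispersity stays close to 1; slower exchange or chain termination, which the sketch ignores, would broaden the distribution.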


The mechanism proposed for the observed chain growth on dialkylmagnesium contrasts fundamentally with an anionic mechanism, such as the chain growth on alkyllithium observed in the relevant work of Bergbreiter [30]. In those studies, the alkyllithium species were modified with tetramethylethylenediamine (TMEDA) as a ligand of the lithium atom, which made the alkyl anion sufficiently reactive to attack the ethylene monomer. Apart from the cost problem arising from the need for up to three equivalents of TMEDA ligand per oligomer produced, the high reactivity of the species and the resulting secondary reactions required working at very low temperature for further uses.


The mechanism described here clearly belongs to the "catalyzed chain transfer polymerization" concept (CCTP, also named coordinative chain transfer polymerization), as recently reviewed by our group [31]. CCTP is one of the new concepts developed in coordination polymerization catalysis in recent years, which also include Living Degenerative Group-Transfer Coordination Polymerization [32], Chain Walking Polymerization (CWP) [33] and Chain Shuttling Polymerization (CSP) [4]. In the area of radical polymerization, a parallel can be drawn between Controlled Radical Polymerization mechanisms (for example Atom Transfer Radical Polymerization, ATRP; Nitroxide Mediated Polymerization, NMP; or Reversible Addition-Fragmentation chain Transfer, RAFT [34]) and the CCTP concept, which is restricted to coordination polymerization mechanisms. Put simply, CCTP can be considered a generalization of the original Aufbaureaktion, describing the fast and reversible transfer of chains between a large amount of an inexpensive species and a suitable catalyst. Other such combinations with different metals have been employed besides the dialkylmagnesium / rare earth metallocene association described here. For example, with diethylzinc, chain growth was observed using iron complexes with a pyridinediimine ligand [5] and zirconium complexes with a salicylaldimine ligand [35]. With trialkylaluminum compounds, yttrium complexes with a pyridinamine ligand [36] and chromium complexes with a Cp\* ligand were also found to be successful [37-38]. However, the resulting oligomers in these cases were end-capped with zinc and aluminum, respectively, and consequently have a reactivity different from that of the magnesium end-capped oligomers.

#### **2.3. Chain length distribution and other characterizations**

At the end of the ethylene uptake, a simple way to characterize the oligomers produced was to hydrolyze the C-Mg chain ends with any protic reagent, for example by pouring the homogeneous reactor solution at 90°C into cold methanol. The slurry had to be well stirred to obtain a fine precipitate, and argon bubbling was necessary to avoid side reactions with oxygen. Meanwhile, the presence of trace amounts of hydrochloric acid favored the dissolution of all metallic residues. When oligomers with fewer than 20 carbon atoms could be neglected, the precipitate was simply filtered and dried; otherwise, a small amount of methanol was used in order to recover the entire mixture for further analyses.

¹H and ¹³C NMR, as well as IR analyses, established the linearity of the chains; absolutely no branching was observed. The chain ends consisted of normal alkyl –(CH2)n–CH3 with only trace amounts of vinyl ends –CH=CH2. Some olefins could indeed be produced by a secondary reaction, through mechanisms named either β-H elimination or transfer to ethylene monomer. Low temperature and high magnesium concentration kept the amount of such defects very low. Typically, a temperature of 90°C and toluene solutions between 10⁻¹ and 10⁻² M represented a good compromise.

Several analytical methods were used to characterize the chain size distribution. For mixtures of rather high molar mass, size exclusion chromatography (SEC, or gel permeation chromatography, GPC) at 135°C in trichlorobenzene was found to be the best choice; an example is given in figure 1.

**Figure 1.** SEC analysis of an oligomer sample with *M̄*n = 1530 g/mol and dispersity Ð = 1.55

Gas chromatography (GC) was found to be best suited to oligomers with fewer than 100 carbons if the apparatus was equipped with a Simdist metal column, or fewer than 60 carbons with a usual fused silica column. The advantages of this analytical method lie in the identification and quantification of each kind of molecule with high precision, including the separation of impurities such as olefins up to 22 carbons. The use of alkane standards, commercially available for 36, 50 and even 60 carbons, ensured reliable calibration of the flame ionization detector, compensating for the obvious segregation during the evaporation step in the injection port. An example is given in figure 2 [39].

The ultimate characterization of the oligomers was achieved by mass spectrometry using a time-of-flight detector and a matrix-assisted laser desorption ionization process (MALDI-TOF). However, considering that alkanes are among the most difficult molecules to ionize, it is not surprising that we failed to analyze the sample after hydrolysis. Hence we tried to functionalize the crude oligomer mixture, while it still consisted of alkyl chains end-capped with magnesium, with a tag that was easily ionizable [40]. As tag precursor, we looked for a carbonyl compound with fused aromatic rings and electron-donating substituents. Carbonyl compounds are known to undergo addition of organomagnesium compounds, while fused aromatic rings with electron-donating substituents would give stable adducts with H+. On these criteria, the cheap and nicely colored Rhodamine B base was selected. The advantage of the MALDI-TOF method is, once again, the recognition of individual molecules, complementary to the GC method. However, quantification was distorted by segregation at the evaporation step, and we are aware that only the living fraction of the oligomer mixture, once functionalized, can be detected (dead chains, for example olefins, cannot be observed by this technique). The spectrum is given in figure 3, with an enlargement showing the good fit between the data and a simulated distribution based on Poisson statistics and ¹³C isotopic abundance.

Several analytical methods were used to characterize the distribution of the chain sizes. For mixtures having rather high molar mass, the size exclusion chromatography (SEC or gel permeation chromatography, GPC) at 135°C in trichlorobenzene were found to be the best

Gas chromatography (GC) was found best suited for oligomers having less than 100 carbons, if the apparatus was equipped with a Simdist metal column, or 60 carbons when using a usual fused silica column. Advantages of this analytical method lie in the identification and quan‐ tification of each kind of molecule with a high precision, including the separation of impurities like olefins up to 22 carbons. Use of alkane standards, commercially available for 36, 50 and even 60 carbons, ensured reliable calibration of the flame ionization detector, bypassing an obvious segregation during the evaporation step in the injection port. An example is given in

Ultimate characterization of oligomers was reached with mass spectrometry using a time of flight detector and a matrix assisted laser desorption ionization process (MALDI-TOF). However, considering that alkanes represent ones of the most difficult molecules to be ionized,

of methanol was used in order to recover the entire mixture for further analyses.

sium end capped oligomers.

10 Oligomerization of Chemical and Biological Compounds

represented a good compromise.

choice, an example is given in figure 1.

NMR 1

figure 2 [39].

**2.3. Chain length distribution and other characterizations**

it is not surprising that we failed to analyze the sample after hydrolysis. Hence we tried to functionalize the crude oligomer mixture, when it was still constituted of alkyl chains end capped with magnesium, with a tag that was easily ionisable [40]. As tag precursor, we looked for a carbonyl compound with fused aromatic cycles and electrodonor substituents. Carbonyl compounds are known to undergo addition of organomagnesium compounds, while fused aromatic cycles with electrodonor substituents would give stable adducts with H+ . With these criteria, the cheap and nicely colored Rhodamine B base was selected. Advantage of the MALDI-TOF analytical method once again is the recognition of individual molecules, com‐ plementary to the GC method. However, quantification was distorted by segregation at the evaporation step and we are aware that only the living fraction of the oligomers mixture, once functionalized, can be detected (dead chains, for example olefins, can't be observed by this technique). The spectrum is given in figure 3, with an enlargement showing the good fit between data and a simulated distribution based on Poisson statistics and 13C isotopic abundance.
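The idea of a simulated Poisson distribution can be illustrated with a minimal Python sketch. It assumes, as a simplification, that all chains start together and that the number of ethylene insertions per chain is purely Poisson-distributed. The function names and the 58.12 g/mol end-group mass are illustrative placeholders rather than values from the chapter, and the ¹³C isotopic pattern is not modelled.

```python
import math

ETHYLENE_MASS = 28.054  # g/mol for C2H4


def poisson_chain_fractions(mean_insertions, max_insertions=200):
    """Mole fraction of chains with n inserted ethylene units, n = 0..max."""
    log_mean = math.log(mean_insertions)
    return {
        # Work in log space so that large factorials never overflow a float.
        n: math.exp(-mean_insertions + n * log_mean - math.lgamma(n + 1))
        for n in range(max_insertions + 1)
    }


def molar_mass_averages(fractions, end_group_mass=58.12):
    """Return (Mn, Mw, dispersity) for a {n: mole fraction} distribution.

    end_group_mass is a hypothetical combined mass of both chain ends.
    """
    total = sum(fractions.values())
    masses = {n: end_group_mass + n * ETHYLENE_MASS for n in fractions}
    mn = sum(x * masses[n] for n, x in fractions.items()) / total
    mw = sum(x * masses[n] ** 2 for n, x in fractions.items()) / (total * mn)
    return mn, mw, mw / mn


if __name__ == "__main__":
    # 12 insertions on average is an arbitrary illustrative choice.
    mn, mw, dispersity = molar_mass_averages(poisson_chain_fractions(12.0))
    print(f"Mn = {mn:.0f} g/mol, Mw = {mw:.0f} g/mol, dispersity = {dispersity:.3f}")
```

With the end-group mass set to zero, the sketch reproduces the standard result Ð = 1 + 1/ν for a Poisson distribution of mean ν, which is why short living chains cannot reach Ð = 1 exactly.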

**Figure 2.** GC analysis of an oligomer sample with *M̄*<sub>n</sub>=392 g/mol and dispersity Ð=1.08. Axes: retention time (boiling point order) and carbon number (0 to 70); FID signal (mass abundance) and molar amount (mmol). Labels: internal standards C13 and C36, ligand (12-crown-4), linear alkanes, peaks E10 and E11, experimental data and model distribution (Mn = 392 g/mol, Ð = 1.08).

**Figure 3.** Mass spectrum by MALDI-TOF of an oligomer sample functionalized by Rhodamine B base.

#### **2.4. Functionalizations**

The functionalization of ethylene oligomers was demonstrated in the preceding paragraph by the use of Rhodamine B base. This reaction was an addition of the long-chain alkyl magnesium to the carbonyl group of an aromatic lactone. As seen in figure 3, the resulting molecule was an ethylene oligomer covalently bonded to a polar dye. The red color of the dye was retained in the final material, but other physical properties, such as solubility, were severely modified. As a matter of fact, it was easy to wash away all Rhodamine residues with methanol, but it was impossible to decolorize by any means a plastic container (LDPE) in which the Rhodamine-capped oligomers had been stored. A possible explanation is that the long chains were integrated into the bulk of the container walls. The dye, which is covalently bonded to these long chains, consequently had negligible leaching rates. Applications of this material as a permanent ink, non-leaching and readily dispersible in the bulk of non-polar polymers without any treatment or additives, appeared promising.

Long-chain primary alcohols were obtained by a straightforward functionalization of the oligomers with oxygen followed by hydrolysis. Indeed, one simply had to follow the next step of the Aufbaureaktion in its application as an industrial process. An analogy can be drawn between aluminum and magnesium as far as the reactivity of the alkyl metal with oxygen is concerned. The oxidation of alkyl magnesium was found to be fast, perhaps faster than hydrolysis, since we detected long-chain alcohols as impurities in the alkanes when the hydrolysis of alkyl magnesium was carried out in non-degassed methanol. Conversely, when long-chain alcohols were the desired product, alkane impurities were observed with a peculiar feature: the distribution of these alkanes had a higher mean molar mass, sometimes up to twice that of the parent distribution of alcohols. A possible explanation is a coupling mechanism of two alkyls resulting either from a reductive elimination or from a Wurtz-like reaction. The coupling would be induced by oxygen on unsolvated dialkyl magnesium compounds. We bypassed this difficulty by using ether additives. The best results (93% functionalization efficiency, see figure 4) were obtained using a small crown ether (12-crown-4). An advantage of this compound was its inertness toward the rare earth catalyst, allowing its addition at the beginning of the ethylene oligomerization. At this stage, the crown ether had a beneficial effect on the CCTP mechanism by impeding the aggregation of dialkyl magnesium compounds: first, the ethylene insertion kept a high rate irrespective of the magnesium concentration; second, the exchange of alkyl groups was found to be faster, as seen from an improvement of the dispersity. For example, figure 2 (Ð=1.08) was obtained with crown ether while figure 1 (Ð=1.55) was obtained without it.

Notably, a large range of oxygenated impurities would have more or less the same character (n donor or Lewis base), provided their stoichiometry with respect to magnesium is between 1 and 2. This explains why it is possible to run the oligomerization process with relatively unclean materials, with good activity and dispersity, but with doubtful and irreproducible results. Figure 4 is the GC trace of a sample of long-chain linear alcohols obtained under our best conditions, *i.e.* with 1 equivalent of 12-crown-4. An analysis by ¹³C NMR spectroscopy is presented in figure 5 for a similar sample but with slightly shorter chains and after precipitation and filtration [6].
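The factor of two reported for the alkane impurities is what one would expect if two chains drawn from the same distribution couple end to end. The following Python sketch illustrates that arithmetic only. It assumes random, independent pairing of chains, which is an assumption of this example (the text proposes coupling only as a possible explanation), and the parent distribution and function names are made up for illustration.

```python
def couple_chains(fractions):
    """Length distribution after random pairwise coupling of two chains.

    fractions maps a chain length (in carbons) to its mole fraction.
    The result is the self-convolution of the parent distribution.
    """
    coupled = {}
    for length_a, frac_a in fractions.items():
        for length_b, frac_b in fractions.items():
            key = length_a + length_b
            coupled[key] = coupled.get(key, 0.0) + frac_a * frac_b
    return coupled


def mean_length(fractions):
    """Number-average chain length of a {length: mole fraction} mapping."""
    total = sum(fractions.values())
    return sum(length * frac for length, frac in fractions.items()) / total


if __name__ == "__main__":
    # Hypothetical parent distribution, not data from the chapter.
    parent = {16: 0.2, 18: 0.3, 20: 0.3, 22: 0.2}
    dimers = couple_chains(parent)
    print(mean_length(parent), mean_length(dimers))
```

Under this assumption the coupled alkanes have exactly twice the number-average length of the parent chains, consistent with the upper bound described in the text.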

**Figure 4.** GC analysis of an oligomer sample functionalized to primary alcohols by oxygen. Axes: retention time (boiling point order) and carbon number (0 to 70); FID signal (mass abundance) and molar amount (mmol). Labels: C13 and C36 standards, 12-crown-4, alcohols 93 %, alkanes 7 %, peaks for alcohols with 18 and 20 carbons and alkanes with 22 and 24 carbons, model distribution (Mn = 420 g/mol, Ð = 1.06).

**Figure 5.** ¹³C NMR (CDCl3, 75 MHz) of an oligomer sample functionalized to primary alcohols by oxygen

Many other functionalizations have been achieved with magnesium end-capped oligomers of ethylene synthesized by rare earth/magnesium-based CCTP. For example, carboxylic acids were obtained in our laboratory by addition of carbon dioxide [9]. Star-shaped long-chain alkyl tin and alkyl silicon compounds were successfully tested [9,41]. Important work has also been carried out by Boisson and D'Agosto, who focused on the synthesis of thiol and iodine end-capped oligomers by addition of sulphur and iodine, respectively, to the long-chain dialkyl magnesium [11]. The iodo compounds were then transformed into the corresponding long-chain azides, which were used in click chemistry. Alternatively, these azides were hydrogenated into the corresponding amino end-capped oligomers. These examples are not exhaustive and the field of application should widen, since this represents only a part of the larger theme of Catalytic Chain Transfer as recently reviewed by Marks [42].

#### **2.5. Uses as macro-initiators**

A special application of magnesium end-capped oligomers was to consider them as macro-initiators for the synthesis of block copolymers. A polyethylenyl block can then be grafted to a second block of a different nature, giving rise to a specialty polymer. Since the linear alkyl group has a lipophilic character, with good crystallization ability, the interest was to associate it with a second block that is more polar or more amorphous. From a catalytic point of view, the second step generally needs its own catalyst, added together with the second monomer, since different monomers usually require different catalysts. A particularly elegant process in terms of atom economy arises, however, when the magnesium atom itself is the catalytic centre of the second step. As an example of the latter case, we tested the polymerization of methyl methacrylate with magnesium end-capped macro-initiators and obtained the corresponding PE-PMMA diblock [9]. A different situation was encountered when ε-caprolactone monomer was chosen for the second step. Indeed, the traces of rare earth compounds (1 to 2 % of the magnesium amount) were found to be very active and to compete with the magnesium activity, resulting in a polymodal distribution of PE-PCL block copolymers. A third example of block copolymerization, described recently [39], used isoprene as the second monomer and required the addition, at the second step, of one additional equivalent of rare earth salt per magnesium macro-initiator. The polyethylene-polyisoprene copolymers were obtained in a one-pot procedure (see scheme 8).

**Scheme 8.** Block copolymer of ethylene and isoprene with *trans* stereospecificity. Labels in the scheme: Nd-[cat] 1, Nd-[cat] 2, + MgR2, linear PE, *trans*-PI.

After several functionalization steps as described above, the end-capped oligomers of ethylene can be designed as sophisticated macro-initiators adapted to a particular polymerization mechanism. For example, RAFT and NMP mechanisms were successfully initiated with special macro-initiators based on long-chain alkyls obtained by CCTP [11]. As for functionalization, the field of applications of block copolymers is wide open, and new products can be synthesized on purpose, depending on the precise properties required (see scheme 9) [9,11-19].

**Scheme 9.** Functionalization of ethylene oligomers after CCTP process. Labels in the scheme: starting from R-(CH2-CH2)n-Mg-(CH2-CH2)m-R; oligomers R-(CH2-CH2)n-I, R-(CH2-CH2)n-1-CH=CH2, R-(CH2-CH2)n-OH, R-(CH2-CH2)n-SH, R-(CH2-CH2)n-COOH and star-shaped species with (CH2-CH2)n-R arms around Z (Z = Sn, Si); functional block copolymers ethylene-(meth)acrylates, ethylene-styrene, ethylene-conjugated dienes, ethylene-lactides and ethylene-lactones.

#### **3. Oligomers and co-oligomers of olefins and conjugated dienes**

#### **3.1. Octene**

In contrast to the high activity of the rare earth / magnesium bimetallic system for ethylene polymerization when using pentamethylcyclopentadienyl (Cp\*) ligands, the activity for other olefins such as octene was found to be very low. Moreover, poor selectivity hampered the propagation step through a secondary transfer reaction. This is an irreversible hydride transfer reaction, named π-allyl formation, between a living alkyl-metal catalyst and an olefin, giving rise to a dead alkane and a π-allyl-metal complex, the latter having little, if any, catalytic ability to insert a new olefin monomer (higher activation energy of the insertion step in the catalytic cycle). This problem was overcome by changing the ligand in the coordination sphere of the rare earth. In particular, replacing the methyl substituents on the cyclopentadienyl ring of Cp\* with several bulky silyl groups fulfilled the electronic and steric requirements for better activity and selectivity in octene polymerization. The synthesis of a specific ligand (see scheme 10) and the isolation of the corresponding rare earth complex were prerequisites for testing the ability of the CCTP concept to produce magnesium end-capped oligomers of octene. Successful results were obtained (*M̄*<sub>n</sub>=400-1300 g/mol, Ð=1.11-1.65), with longer reaction times than for ethylene (24 h) [43-44].

**Scheme 10.** Oligomerization of octene by CCTP with neodymium / magnesium system. Labels in the scheme: Nd, Me2Si, Me3Si, Cl, Li, THF, MgR2.

#### **3.2. Styrene**

As already mentioned for α-olefins, the activity of the decamethylmetallocene RE / Mg system was negligible for the insertion step of styrene in a polymerization catalytic cycle. Higher temperatures gave effective chain transfer of oligostyrene to dialkylmagnesium, but with competing radical and/or anionic mechanisms besides the coordination one [45-48]. By switching from Cp\* to other ligands, a stereoselective CCTP process yielded near-perfect syndiospecific and isospecific oligostyrenes end-capped with magnesium [49-51]. The use of half-metallocene systems (with only one Cp\* ligand per rare earth) and even of inorganic rare earth salts such as chloride, borohydride, alkoxide or phenoxide [52] gave successful results in the oligomerization of styrene with chain transfer to dialkyl magnesium. In a study centered on the structure/reactivity relationships of the pre-catalyst, it was shown that Ln(BH4)3(THF)x (x=3, Ln=Nd, La) as well as the mixed La(BH4)2Cl(THF)2.6 led to an efficient transmetallation of the growing polystyrene chain with the Mg chain transfer agent. However, ¹H NMR and MALDI-TOF studies established the simultaneous occurrence of some β-H abstraction. Such uncontrolled termination reactions were absent with LaCl3(THF)3, Cp\*Nd(BH4)2(THF)2 and Cp\*La(BH4)2(THF)2. The quantitative transfer efficiency observed led us to conclude that a Catalyzed Chain Growth on magnesium was taking place. Moreover, the reaction remained significantly syndioselective (85 %) with the latter two complexes, as observed previously when combined with only 1 equiv BEM (see scheme 11) [53-54].


**Scheme 11.** Oligomerization of styrene by CCTP with rare earth / magnesium system. 85% syndiotactic selectivity for Cp\*Ln(BH4)2(THF)2, Ln=Nd, La, and atactic polystyrene for LaCl3(THF)3.

Statistical copolymerization was assessed with 1-hexene and styrene monomers in the presence of the Cp\*La(BH4)2(THF)2/n-butylethylmagnesium (BEM) catalytic system. Poly(styrene-*co*-hexene) statistical copolymers were obtained with up to 46% yield and 23% 1-hexene content (see scheme 12). The occurrence of chain transfer reactions in the presence of excess BEM was established in the course of the statistical copolymerization through a significant decrease of the molecular weights *vs.* 1 equiv BEM, along with a narrowing of the dispersities. Thanks to this transfer process, the quantity of 1-hexene in the copolymer increased substantially, from 8.6 to 23.2 %, for an 80/20 1-hexene/styrene feed composition, in the presence of 10 equiv. BEM versus 1 equiv. [55]. These results extend the CCTP concept to a chain-transfer-induced control of the composition of statistical poly(styrene-*co*-hexene) copolymers.

**Scheme 12.** Half-lanthanocene/BEM-mediated styrene-hexene Coordinative Chain Transfer co-Polymerization.

#### **3.3. Butadiene**

cycle). This problem was overcome by changing the ligand in the coordination sphere of the rare earth. In particular, replacing the methyl substituents on the cyclopentadienyl ring of Cp\* with several bulky silyl groups met the electronic and steric requirements for better activity and selectivity in octene polymerization. The synthesis of a specific ligand (see scheme 10) and the isolation of the corresponding rare earth complex were prerequisites for testing the CCTP concept in the synthesis of magnesium end-capped oligomers of octene. Successful results were obtained (*M̄*<sub>n</sub>=400-1300 g/mol, Đ=1.11-1.65), with longer reaction times than for ethylene (24 h) [43-44].

**Scheme 10.** Oligomerization of octene by CCTP with neodymium / magnesium system.

#### **3.2. Styrene**

As already mentioned for α-olefins, the activity of the decamethylmetallocene RE / Mg system was found to be negligible for the styrene insertion step of a polymerization catalytic cycle. Higher temperatures gave effective chain transfer of oligostyrene to dialkylmagnesium, but with competing radical and/or anionic mechanisms alongside the coordination one [45-48]. By switching from Cp\* to other ligands, a stereoselective CCTP process yielded near-perfect syndiospecific and isospecific oligostyrenes end-capped with magnesium [49-51]. The use of half-metallocene systems (with only one Cp\* ligand per rare earth), and even of inorganic rare earth salts such as chloride, borohydride, alkoxide or phenoxide [52], gave successful results in the oligomerization of styrene by a chain transfer mechanism to dialkylmagnesium. In a study centered on the structure/reactivity relationships of the pre-catalyst, it was shown that Ln(BH4)3(THF)x (x=3, Ln=Nd, La) as well as the mixed La(BH4)2Cl(THF)2.6 led to an efficient transmetalation of the growing polystyrene chain with the Mg chain transfer agent. However, 1H NMR and MALDI-TOF studies established the simultaneous occurrence of some β-H abstraction. Such uncontrolled termination reactions were absent with LaCl3(THF)3, Cp\*Nd(BH4)2(THF)2 and Cp\*La(BH4)2(THF)2. The quantitative transfer efficiency observed led us to conclude that a Catalyzed Chain Growth on magnesium was operating. Moreover, the reaction remained significantly syndioselective (85 %) with the latter two, as observed previously when combined with only 1 equiv BEM (see scheme 11) [53-54].


Polymerization of butadiene with catalytic systems based on rare earth salts and dialkyl magnesium has long been known [56-58]. Chain transfer efficiency to magnesium is of prime importance, for example for the synthesis of functionalized polybutadienes with applications in the tyre industry [20]. Another striking feature of this polymerization is the regioselectivity of the reaction which, in principle, can afford three kinds of sequences, named 1,2- (or vinyl), 1,4-*trans* and 1,4-*cis* (see scheme 13). Using dialkyl magnesium in combination with rare earth salts was most generally effective for 1,4-*trans* selectivity [59], but 1,4-*cis* sequences have been obtained in some cases [60]. We recently established that the selectivity shifted slightly from high 1,4-*trans* (up to 97%) to more 1,2-regular (17%) simply by increasing the ratio of magnesium transfer agent to rare earth from 1 to 10 [61]. A chain transfer induced control of regioselectivity was hence evidenced (see table 1). Molar masses decreased as the amount of magnesium increased, reflecting the transfer efficiency, although with only moderate control (moderate dispersity). The presence of magnesium at the chain ends of the oligomers was confirmed by further functionalization, as mentioned below.
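Dispersity is the measure of control used throughout this section; Table 1 defines it as Đ = *M̄*<sub>w</sub> / *M̄*<sub>n</sub>. As a minimal sketch of how the two averages are obtained from a chain population, the Python function below takes (molar mass, number of chains) pairs. The sample population is invented for illustration and is not data from this work.

```python
def molar_mass_averages(population):
    """Return (Mn, Mw, dispersity) for a list of (molar_mass, n_chains) pairs."""
    total_chains = sum(n for _, n in population)
    total_mass = sum(m * n for m, n in population)
    # Number-average: total mass shared equally among all chains.
    mn = total_mass / total_chains
    # Weight-average: each chain is weighted by its own mass.
    mw = sum(m * m * n for m, n in population) / total_mass
    return mn, mw, mw / mn


# Hypothetical population (g/mol, number of chains); illustrative values only.
sample = [(2000.0, 10), (4000.0, 20), (6000.0, 10)]
mn, mw, dispersity = molar_mass_averages(sample)
print(f"Mn = {mn:.0f} g/mol, Mw = {mw:.0f} g/mol, dispersity = {dispersity:.3f}")
```

A population in which all chains have the same mass gives a dispersity of exactly 1; any spread in chain length raises the value.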

Statistical copolymerizations of styrene and butadiene were achieved under conditions similar to those of the experiments in table 1. The corresponding oligomers were obtained with *M̄*<sub>n</sub> ranging from 4.4 to 240 kg/mol, for Mg / Nd from 10 to 1, respectively, and styrene incorporation up to 16.9%. The microstructure consisted of 1,4-*trans* butadiene sequences with isolated styrene units inserted statistically in the chains, as identified by 13C NMR. Insertion rates were found to be higher for butadiene than for styrene. One of these copolymerization experiments was stopped early by quenching with a ketone (benzhydrylidene anthrone, Sigma-Aldrich, see scheme 14) in order to obtain light oligomers end-capped with a functional group. Analysis by MALDI-TOF mass spectrometry gave a detailed snapshot of the growing chains, evidencing the C-Mg reactivity of the magnesium end-capped oligomers (see figure 6). The observed statistical distributions of styrene and butadiene units follow the Poisson probability exactly. The lower insertion rate of styrene compared with butadiene was clear: after a 15-minute reaction, the average degree of polymerization was between 17 and 18 for butadiene, whereas it was between 1 and 2 for styrene.
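The statement that the monomer distributions follow the Poisson probability can be made concrete with a short sketch. It assumes mean degrees of polymerization of 17.5 for butadiene and 1.5 for styrene, that is, the midpoints of the ranges quoted above (the exact averages are not given here). It also treats the two monomer counts as independent, which is a simplifying assumption of this illustration rather than a result of the study.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k inserted units when the average number is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MEAN_BUTADIENE = 17.5  # assumed midpoint of "between 17 and 18"
MEAN_STYRENE = 1.5     # assumed midpoint of "between 1 and 2"


def chain_fraction(n_butadiene, n_styrene):
    """Fraction of chains with this composition, assuming independent Poisson counts."""
    return poisson_pmf(n_butadiene, MEAN_BUTADIENE) * poisson_pmf(n_styrene, MEAN_STYRENE)


# Share of chains carrying 0, 1, 2, ... styrene units.
for n_styrene in range(5):
    print(f"{n_styrene} styrene units: {poisson_pmf(n_styrene, MEAN_STYRENE):.3f}")

# Most probable number of butadiene units per chain.
most_probable_butadiene = max(range(60), key=lambda k: poisson_pmf(k, MEAN_BUTADIENE))
print("most probable butadiene count:", most_probable_butadiene)
```

Under these assumptions, a sizeable share of the chains contains no styrene at all, which is consistent with the isolated styrene units described above.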

**Scheme 13.** Oligomerization of butadiene by CCTP with rare earth / magnesium catalytic system.

| Run<sup>a</sup> | [Mg]/[Nd] | Time (h) | Isolated yield (%) | 1,4-*trans*<sup>b</sup> (%) | 1,4-*cis*<sup>b</sup> (%) | 1,2<sup>b</sup> (%) | *M̄*<sub>n</sub><sup>c</sup> (kg mol<sup>−1</sup>) | Đ<sup>d</sup> |
|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 96 | 95.0 | 3.5 | 1.5 | 49 | 1.29 |
| 2<sup>e</sup> | 1 | 2 | 43 | 96.7 | 1.9 | 1.4 | 47 | 1.38 |
| | 2 | 2 | 54 | 95.5 | 2.4 | 2.1 | 19 | 1.28 |
| | 3 | 2 | 16 | 93.0 | 2.4 | 4.6 | 5.5 | 1.20 |
| | 5 | 2 | 8 | 88.0 | 3.1 | 8.9 | 3.9 | 1.24 |
| | 5 | 8 | 20 | 89.9 | 2.1 | 8.0 | 3.9 | 1.43 |
| | 5 | 20 | 37 | 90.4 | 1.3 | 8.3 | 5.5 | 3.00 |
| | 10 | 20 | 18 | 80.7 | 3.5 | 15.8 | 3.8 | 2.90 |

a: Reactions at 50 °C, in 10 mL of toluene; pre-catalyst: Nd(BH4)3(THF)3 20 µmol; co-catalyst: BEM; [butadiene]/[Nd]=1000

b: Determined by 1H and 13C NMR spectroscopy in CDCl3

c: Number-average molecular weight measured by SEC with reference to PS standards. No correction factor applied.

d: Dispersity measured by SEC: Đ=*M̄*<sub>w</sub> / *M̄*<sub>n</sub>

e: With addition of Cp\*H (20 µmol) and supplementary BEM (10 µmol) in order to have an in situ prepared half-metallocene Cp\*NdX2.

**Table 1.** Butadiene polymerization with Nd(BH4)3(THF)3 and chain transfer with dialkyl magnesium.

**Scheme 14.** Statistical copolymerization of butadiene and styrene under CCTP conditions and further functionalization with benzhydrylidene anthrone.

#### **3.4. Isoprene**

In the presence of 1 to 10 equiv of alkyl magnesium as chain transfer agent, combined with Cp\*Ln(BH4)2(THF)n (Ln=La, n=2.5; Ln=Nd, n=2), the observed molecular weight distributions are monomodal, and the number-average molecular weight is close to the value calculated for two growing chains per magnesium atom. This, along with reasonable dispersities (Đ 1.3–1.9), highlights a rare earth catalyzed polyisoprene chain growth on magnesium (see scheme 15). With Ln(BH4)3(THF)3 (Ln=La, Nd) under the same conditions, the transfer efficiency is around 50–60%. With all the catalysts, the polymerization is significantly slower with an excess of BEM than with 1 equiv of magnesium dialkyl, as observed for butadiene [62]. The excess of BEM has another consequence for the polymerization process: the transmetalation is accompanied by a change in the selectivity of the reaction, from 98.5% 1,4-*trans* with 1 BEM to up to 46% 3,4-polyisoprene using 10 equiv of chain transfer agent. This provides a simple way to tune the microstructure of the polyisoprene just by adjusting the quantity of chain transfer agent [63]. More precisely, a gradual decrease of the 1,4-*trans* stereoselectivity, in favor of 3,4 selectivity, is observed with increasing quantities of magnesium dialkyl, leading to a great variety of poly(1,4-*trans*-isoprene)-based materials. By combining dialkylmagnesium and trialkylaluminum, we showed that the *trans*-selectivity can be preserved: a 1,4-*trans* stereoselective oligomerization of isoprene with CCTP character, leading to the growth of several poly(1,4-*trans*-isoprene) chains per catalyst metal, is achieved using the half-lanthanocene Cp\*La(BH4)2(THF)2 in combination with BEM and Al<sup>i</sup>Bu3 in 1/1/9, 1/1/19, or 1/1/39 quantities, respectively [64].

**Figure 6.** MALDI-TOF cumulative diagram of a poly(butadiene-*co*-styrene) oligomer initiated with butyl groups and functionalized with benzhydrylidene anthrone.

**Scheme 15.** Rare earth catalyzed polyisoprene chain growth on magnesium. Ln=Nd, La

An example of functionalization was carried out on magnesium end-capped oligomers of isoprene, by oxidation with oxygen followed by hydrolysis to give primary alcohols. Analysis by MALDI-TOF mass spectrometry (see figure 7) agreed with the expected structure for half of the observed peaks. The other half revealed the presence of unfunctionalized species, which may arise from the difficulty of avoiding hydrolysis or coupling reactions before or during the oxidation.

**Figure 7.** MALDI-TOF spectrum of isoprene oligomers functionalized as primary alcohols.
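To illustrate how such a spectrum can be assigned, the sketch below computes the expected monoisotopic m/z series for hydroxyl-terminated and unfunctionalized isoprene oligomers. The butyl initiating group and the Ag+ cationizing agent are assumptions made for this example only (neither is specified here), so the absolute values are illustrative. The spacing between the two series (one oxygen atom) and the repeat-unit spacing (one C5H8 unit) do not depend on those choices.

```python
# Monoisotopic atomic masses in u.
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
AG107 = 106.905097     # assumed cationizing agent: 107Ag+
ELECTRON = 0.00054858  # subtracted once for a singly charged cation


def oligomer_mz(n_units, hydroxylated, cation_mass=AG107):
    """m/z of [Bu-(C5H8)n-X + cation]+ with X = OH (hydroxylated) or H."""
    carbons = 4 + 5 * n_units
    # Bu-(C5H8)n-H and Bu-(C5H8)n-OH contain the same number of hydrogens.
    hydrogens = 10 + 8 * n_units
    oxygens = 1 if hydroxylated else 0
    neutral = carbons * MASS["C"] + hydrogens * MASS["H"] + oxygens * MASS["O"]
    return neutral + cation_mass - ELECTRON


for n in range(5, 9):
    print(f"n={n}: OH-capped {oligomer_mz(n, True):.3f}, "
          f"unfunctionalized {oligomer_mz(n, False):.3f}")
```

In a spectrum of this kind, the two families of peaks would therefore appear as parallel series with the same repeat spacing, offset from each other by the mass of one oxygen atom.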

#### **3.5. Myrcene**


As expected from our previous results with isoprene and butadiene, the polymerization of β-myrcene with neodymium borohydride-based coordination catalysts (Cp\*Nd(BH4)2(THF)2 and Nd(BH4)3(THF)3) in the presence of increasing excesses of BEM (1 to 20 equiv) shows highly efficient transfer reactions between neodymium and magnesium [65]. For 1-5 equiv of Mg co-catalyst, the measured molecular weights (by SEC and by NMR end-group integration) match the values calculated for the growth of two chains per magnesium atom quite well. As the BEM quantity increases, dispersities become narrower, which is consistent with rapid and reversible polymer chain transfer. In addition, the selectivity shifts from > 90 % 1,4-*trans*<sup>1</sup> (1 BEM) to 3,4-rich (64%, 20 BEM), thus illustrating the "tuning ability" of the BEM concentration in such processes, as already observed with isoprene and butadiene (see scheme 16).

<sup>1</sup> The selectivity was initially claimed as cis but recent additional experiments led us to reconsider it as trans. This will be published soon.
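The comparison between measured and calculated molar masses can be sketched as follows. The first function assumes that every magnesium atom carries two growing chains, as stated above, and neglects the mass of the end groups. The second inverts the relation to estimate how many chains per magnesium a measured value corresponds to. The monomer-to-magnesium ratio and the conversion used in the example are hypothetical values, not data from this work.

```python
def calculated_mn(monomer_molar_mass, monomer_per_mg, conversion, chains_per_mg=2):
    """Expected number-average molar mass (g/mol), end groups neglected."""
    return monomer_molar_mass * monomer_per_mg * conversion / chains_per_mg


def chains_per_magnesium(monomer_molar_mass, monomer_per_mg, conversion, measured_mn):
    """Average number of chains grown per Mg atom implied by a measured Mn."""
    return monomer_molar_mass * monomer_per_mg * conversion / measured_mn


MYRCENE = 136.24  # g/mol, C10H16

# Hypothetical run: 200 myrcene units per Mg, 90 % conversion.
mn_calc = calculated_mn(MYRCENE, monomer_per_mg=200, conversion=0.9)
print(f"calculated Mn: {mn_calc:.1f} g/mol")
print("chains per Mg:", chains_per_magnesium(MYRCENE, 200, 0.9, mn_calc))
```

A measured value close to the calculated one indicates that both alkyl groups of the dialkylmagnesium take part in chain growth; a measured value twice as large would correspond to a single chain per magnesium atom.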

**Scheme 16.** Oligomerization of myrcene by CCTP with rare earth / magnesium system. [Nd](BH4)=Nd(BH4)3(THF)3, Cp\*Nd(BH4)2(THF)2, MgR2=BEM.

#### **4. Conclusion**

Readily applicable syntheses of magnesium end-capped oligomers of olefinic unsaturated monomers, leading to tailor-made macromolecular objects, have been presented. The strategy consisted of Coordinative Chain Transfer Polymerization involving rare earth precatalysts combined with a dialkylmagnesium reagent. After work-up, organo-functionalized oligomers bearing a hydroxyl, anthrone or rhodamine group, among others, or a second polymer sequence could be prepared.

The rare earth precatalyst was advantageously prepared via an unprecedented *in situ* methodology, which made it possible to start from basic compounds such as dialkylmagnesium, commercial rare earth salts, and a cyclopentadiene (typically the permethylated one, C5Me5H), to afford a highly efficient metallocene-based catalytic combination.

A complete panel of analyses was carried out on the oligomer samples, providing a well-defined picture of the reaction mechanism and of the active species involved, along with the ideal operating conditions for accurately monitoring the process, in particular in the case of ethylene. Basically, the chain length of these oligomers could be tuned throughout the reaction simply by monitoring the ethylene uptake. After consumption of the monomer, the reaction was stopped by addition of a functionalization reagent, affording a rich variety of end-capped oligomers of ethylene. Similar processes were applied to other monomers, including olefins, styrene and conjugated dienes, as well as to copolymers of these monomers.

Applications of these compounds include copolymers and formulation additives for fine chemistry, plastics processing and electronics, pharmaceutics and cosmetics, dyes and adhesives, etc.

#### **Acknowledgements**

Pr. André Mortreux, former leader of the polymerization group in Lille, is gratefully acknowledged for his essential contribution in establishing the basics of ethylene CCTP and related applications.

#### **Author details**

Thomas Chenal\* and Marc Visseaux

University of Science and Technology of Lille, France

National Center for Scientific Research, Unit of Catalysis and Solid State Chemistry, Villeneuve d'Ascq, France

National School of Chemistry of Lille (ENSCL), France

#### **References**

[1] Gladysz, J. A. Frontiers in Metal-Catalyzed Polymerization: Designer Metallocenes, Designs on New Monomers, Demystifying MAO, Metathesis Déshabillé. Chem. Rev., 2000, 100(4), 1167-1168, and all articles in this special issue.

[2] Coates, G. W.; Hustad, P. D.; Reinartz, S. Catalysts for the Living Insertion Polymerization of Alkenes: Access to New Polyolefin Architectures Using Ziegler–Natta Chemistry. Angew. Chem., Int. Ed., 2002, 41, 2236-2257.

[3] Guan, Z.; Cotts, P. M.; McCord, E. F.; McLain, S. J. Chain Walking: A New Strategy to Control Polymer Topology. Science, 1999, 283, 2059-2062.

[4] Arriola, D. J.; Carnahan, E. M.; Hustad, P. D.; Kuhlman, R. L.; Wenzel, T. T. Catalytic Production of Olefin Block Copolymers via Chain Shuttling Polymerization. Science, 2006, 312, 714-719.

[5] Britovsek, G. J. P.; Cohen, S. A.; Gibson, V. C.; van Meurs, M. Iron Catalyzed Polyethylene Chain Growth on Zinc: A Study of the Factors Delineating Chain Transfer versus Catalyzed Chain Growth in Zinc and Related Metal Alkyl Systems. J. Am. Chem. Soc., 2004, 126, 10701-10712.

[6] Chenal, T.; Mortreux, A.; Visseaux, M. Method for Preparing Dialkyl Magnesium Compounds by Ethylene Polymerization and Uses Thereof. Patent WO2013/014383A1.

[7] Pelletier, J. F.; Bujadoux, K.; Olonde, X.; Adisson, E.; Mortreux, A.; Chenal, T. Long-Chain Dialkylmagnesium, its Preparation Process and Applications. Patent US005779942A, 1996.

[8] Pelletier, J. F.; Mortreux, A.; Olonde, X.; Bujadoux, K. Synthesis of New Dialkylmagnesium Compounds by Living Transfer Ethylene Oligo- and Polymerization with Lanthanocene Catalysts. Angew. Chem. Int. Ed. Engl., 1996, 35(16), 1854-1856.

[9] Chenal, T.; Olonde, X.; Pelletier, J. F.; Bujadoux, K.; Mortreux, A. Controlled polyethylene chain growth on magnesium catalyzed by lanthanidocene: A living transfer polymerization for the synthesis of higher dialkyl-magnesium. Polymer, 2007, 48, 1844-1856.

[10] Carpentier, J.-F.; Sarazin, Y. Alkaline-earth metal complexes in homogeneous polymerization catalysis. Topics in Organometallic Chemistry, 2013, 45, 141-189.

[11] Mazzolini, J.; Espinosa, E.; D'Agosto, F.; Boisson, C. Catalyzed Chain Growth (CCG) on a Main Group Metal: an Efficient Tool to Functionalize Polyethylene. Polym. Chem., 2010, 1, 793-800.

[12] German, I.; Kelhifi, W.; Norsic, S.; Boisson, C.; D'Agosto, F. Telechelic Polyethylene from Catalyzed Chain-Growth Polymerization. Angew. Chem. Int. Ed., 2013, 52(12), 3438-3441.

[13] Espinosa, E.; Charleux, B.; D'Agosto, F.; Boisson, C.; Tripathy, R.; Faust, R.; Soulié-Ziakovic, C. Di- and Triblock Copolymers Based on Polyethylene and Polyisobutene Blocks. Toward New Thermoplastic Elastomers. Macromolecules, 2013, 46, 3417-3424.

[14] Bieligmeyer, M.; Mehdizadeh Taheri, S.; German, I.; Boisson, C.; Probst, C.; Milius, W.; Altstadt, V.; Breu, J.; Schmidt, H.-W.; D'Agosto, F.; Forster, S. Completely Miscible Polyethylene Nanocomposites. J. Am. Chem. Soc., 2012, 134, 18157-18160.

[15] Mazzolini, J.; Boyron, O.; Monteil, V.; D'Agosto, F.; Boisson, C.; Sanders, G. C.; Heuts, J. P. A.; Duchateau, R.; Gigmes, D.; Bertin, D. Polyethylene End Functionalization using Thia-Michael Addition Chemistry. Polym. Chem., 2012, 3, 2383-2392.

[16] Lefay, C.; Glé, D.; Rollet, M.; Mazzolini, J.; Bertin, D.; Viel, S.; Schmid, C.; Boisson, C.; D'Agosto, F.; Gigmes, D.; Barner-Kowollik, C. Block Copolymers via Macromercaptan Initiated Ring Opening Polymerization. J. Polym. Sci. Part A: Polym. Chem., 2011, 49(3), 803-813.

[17] Akbar, S.; Beyou, E.; Chaumont, P.; Mazzolini, J.; Espinosa, E.; D'Agosto, F.; Boisson, C. Synthesis of Polyethylene-Grafted Multiwalled Carbon Nanotubes via a Peroxide-Initiating Radical Coupling Reaction and by Using Well-Defined TEMPO and Thiol End-Functionalized Polyethylenes. J. Polym. Sci. Part A: Polym. Chem., 2011, 49(4), 957-965.

[18] Espinosa, E.; Glassner, M.; Boisson, C.; Barner-Kowollik, C.; D'Agosto, F. Synthesis of Cyclopentadienyl Capped Polyethylene and Subsequent Block Copolymer Formation Via Hetero Diels-Alder (HDA) Chemistry. Macromol. Rapid Commun., 2011, 32, 1447-1453.

[19] Mazzolini, J.; Mokthari, I.; Briquel, R.; Boyron, O.; Delolme, F.; Monteil, V.; Bertin, D.; Gigmes, D.; D'Agosto, F.; Boisson, C. Thiol-End-Functionalized Polyethylenes. Macromolecules, 2010, 43, 7495-7503.

[20] Cortial, G.; Le Floch, P.; Nief, F.; Thuilliez, J. Novel Organometallic Compounds Containing a Metal Belonging to the Second Column of the Periodic Table, and Method for Preparing Same. Patent WO2010/139449A1.

[21] Jeske, G.; Lauke, H.; Mauermann, H.; Swepston, P. N.; Schumann, H.; Marks, T. J. Highly Reactive Organolanthanides. Systematic Routes to and Olefin Chemistry of Early and Late Bis(pentamethylcyclopentadienyl) 4f Hydrocarbyl and Hydride Complexes. J. Am. Chem. Soc., 1985, 107, 8091-8103.

[22] Watson, P. L.; Herskovitz, T. Homogeneous Lanthanide Complexes as Polymerization and Oligomerization Catalysts: Mechanistic Studies. ACS Symposium Series, 1983, 459-479.

[23] Cendrowski-Guillaume, S. M.; Le Gland, G.; Nierlich, M.; Ephritikhine, M. Lanthanide Borohydrides as Precursors to Organometallic Compounds. Mono(cyclooctatetraenyl) Neodymium Complexes. Organometallics, 2000, 19(26), 5654-5660.

[24] Ballard, D. G. H.; Courtis, A.; Holton, J.; McMeeking, J.; Pearce, R. Alkyl bridged complexes of the group 3A and lanthanoid metals as homogeneous ethylene polymerisation catalysts. Journal of the Chemical Society, Chemical Communications, 1978, 22, 994-995.

[25] Olonde, X.; Bujadoux, K.; Mortreux, A.; Petit, F. Catalysts and Process for the Preparation of Same for Use in the Polymerization of Ethylene. Patent WO 93/07180.

[26] Pettijohn, T. M. Olefin Polymerization Process and Polymer Produced. Patent US005350816A, 1994.

[27] Zinck, P.; Valente, A.; Terrier, M.; Mortreux, A.; Visseaux, M. Half-Lanthanidocenes Catalysts via the "Borohydride/Alkyl" Route: A simple Approach of Ligand Screening for the Controlled Polymerization of Styrene. C. R. Chimie, 2008, 11, 595-602.

[28] Visseaux, M.; Terrier, M.; Mortreux, A.; Roussel, P. Facile Synthesis of Lanthanidocenes by the "Borohydride/Alkyl Route" and Their Application in Isoprene Polymerization. Eur. J. Inorg. Chem., 2010, 2867-2876.

[29] Ziegler, K.; Gellert, H. G.; Kühlhorn, H.; Martin, H.; Meyer, K.; Nagel, K.; Sauer, H.; Zosel, K. Aluminium-organische Synthese im Bereich olefinischer Kohlenwasserstoffe. Angewandte Chemie, 1952, 64(12), 323-329.

[30] Bergbreiter, D. E.; Blanton, J. R.; Chandran, R.; Hein, M. D.; Huang, K.-J.; Treadwell, D. R.; Walker, S. A. Anionic Syntheses of Terminally Functionalized Ethylene Oligomers. J. Polym. Sci. Part A: Polym. Chem., 1989, 27(12), 4205-4226.

[31] Valente, A.; Mortreux, A.; Visseaux, M.; Zinck, P. Coordinative Chain Transfer Polymerization. Chem. Rev., 2013, 113, 3836-3857.

[32] Sita, L. R. Ex Uno Plures ("Out of One, Many"): New Paradigms for Expanding the Range of Polyolefins through Reversible Group Transfers. Angew. Chem. Int. Ed., 2009, 48, 2464-2472.

[33] Guan, Z.; Cotts, P. M.; McCord, E. F.; McLain, S. J. Chain Walking: A New Strategy to Control Polymer Topology. Science, 1999, 283, 2059-2062.

[34] D'Agosto, F.; Boisson, C. A RAFT Analogue Olefin Polymerization Technique using Coordination Chemistry. Australian Journal of Chemistry, 2010, 63(8), 1155-1158.

[35] Makio, H.; Ochiai, T.; Mohri, J.-I.; Takeda, K.; Shimazaki, T.; Usui, Y.; Matsuura, S.; Fujita, T. Synthesis of Telechelic Olefin Polymers via Catalyzed Chain Growth on Multinuclear Alkylene Zinc Compounds. J. Am. Chem. Soc., 2013, 135, 8177-8180.

[36] Kretschmer, W. P.; Meetsma, A.; Hessen, B.; Schmalz, T.; Qayyum, S.; Kempe, R. Reversible Chain Transfer between Organoyttrium Cations and Aluminum: Synthesis of Aluminum-Terminated Polyethylene with Extremely Narrow Molecular-Weight Distribution. Chem. Eur. J., 2006, 12, 8969-8978.

[37] Ganesan, M.; Gabbaï, F. P. [Cp\*Cr(C6F5)(Me)(Py)] as a Living Chromium(III) Catalyst for the "Aufbaureaktion". Organometallics, 2004, 23(20), 4608-4613.

[38] Bazan, G. C.; Rogers, J. S.; Fang, C. C. Catalytic Insertion of Ethylene into Al−C Bonds with Pentamethylcyclopentadienyl−Chromium(III) Complexes. Organometallics, 2001, 20(10), 2059-2064.

[39] Chenal, T.; Visseaux, M. Combining Polyethylene CCG and Stereoregular Isoprene Polymerization: First Synthesis of Poly(ethylene)-b-(trans-isoprene) by Neodymium Catalyzed Sequenced Copolymerization. Macromolecules, 2012, 45, 5718-5727.

[40] Lin-Gibson, S.; Brunner, L.; Vanderhart, D. L.; Bauer, B. J.; Fanconi, B. M.; Guttman, C. M.; Wallace, W. E. Optimizing the Covalent Cationization Method for the Mass Spectrometry of Polyolefins. Macromolecules, 2002, 35(18), 7149-7156.

[41] Lennon, P. J.; Mack, D. P.; Thompson, Q. E. Nucleophilic Catalysis of Organosilicon Substitution Reactions. Organometallics, 1989, 8, 1121-1122.

[42] Amin, S. B.; Marks, T. J. Versatile Pathways for In Situ Polyolefin Functionalization with Heteroatoms: Catalytic Chain Transfer. Angew. Chem. Int. Ed., 2008, 47, 2006-2025.

[43] Bogaert, S.; Chenal, T.; Mortreux, A.; Nowogrocki, G.; Lehmann, C. W.; Carpentier, J.-F. ansa-Bis(cyclopentadienyl) Ligands: Synthesis and Use in Olefin Oligomerization. Organometallics, 2001, 20, 199-205.

[44] Bogaert, S.; Chenal, T.; Mortreux, A.; Carpentier, J.-F. Unusual product distribution in ethylene oligomerization promoted by in situ ansa-chloroneodymocene-dialkylmagnesium systems. Journal of Molecular Catalysis A: Chemical, 2002, 190(1-2), 207-214.

[45] Bogaert, S.; Carpentier, J.-F.; Chenal, T.; Mortreux, A.; Ricart, G. Chlorolanthanocene-dialkylmagnesium systems for styrene bulk polymerization and styrene-ethylene block copolymerization. Macromolecular Chemistry and Physics, 2000, 201(14), 1813-1822.

[46] Sarazin, Y.; Chenal, T.; Mortreux, A.; Vezin, H.; Carpentier, J.-F. Binary cerium(IV) tert-butoxides-dialkylmagnesium systems: Radical versus coordinative polymerization of styrene. Journal of Molecular Catalysis A: Chemical, 2005, 238, 207-214.

[47] Menoret, S.; Carlotti, S.; Fontanille, M.; Deffieux, A.; Desbois, P.; Schade, C.; Schrepp, W.; Warzelhan, V. Retarded anionic polymerization, 5 influence of the structure of dialkylmagnesium additives on the reactivity of polystyryllithium species. Macromolecular Chemistry and Physics, 2001, 202(16), 3219-3227.

[48] Carlotti, S.; Desbois, P.; Warzelhan, V.; Deffieux, A. Retarded anionic polymerization (RAP) of styrene and dienes. Polymer, 2007, 48(15), 4322-4327.

[49] Sarazin, Y.; de Fremont, P.; Annunziata, L.; Duc, M.; Carpentier, J.-F. Syndio- and Isoselective Coordinative Chain Transfer Polymerization of Styrene Promoted by ansa-Lanthanidocene/Dialkylmagnesium Systems. Advanced Synthesis & Catalysis, 2011, 353(8), 1367-1374.

[50] Annunziata, L.; Duc, M.; Carpentier, J.-F. Chain Growth Polymerization of Isoprene and Stereoselective Isoprene-Styrene Copolymerization Promoted by an ansa-Bis(indenyl) allyl-Yttrium Complex. Macromolecules, 2011, 44(18), 7158-7166.

[51] Rodrigues, A.-S.; Kirillov, E.; Vuillemin, B.; Razavi, A.; Carpentier, J.-F. Binary ansa-lanthanidocenes/dialkylmagnesium systems versus single-component catalyst: Controlled synthesis of end-capped syndiotactic oligostyrenes. Journal of Molecular Catalysis A: Chemical, 2007, 273(1-2), 87-91.

[52] Gromada, J.; le Pichon, L.; Mortreux, A.; Leising, F.; Carpentier, J.-F. Neodymium alk(aryl)oxides-dialkylmagnesium systems for butadiene polymerization and copolymerization with styrene and glycidyl methacrylate. Journal of Organometallic Chemistry, 2003, 683(1), 44-55.

[53] Zinck, P.; Valente, A.; Mortreux, A.; Visseaux, M. In situ generated half-lanthanidocene based catalysts for the controlled oligomerisation of styrene: Selectivity, block copolymerization and chain transfer. Polymer, 2007, 48, 4609-4614.

[54] Zinck, P.; Valente, A.; Bonnet, F.; Violante, A.; Mortreux, A.; Visseaux, M.; Ilinca, S.; Duchateau, R.; Roussel, P. Reversible coordinative chain transfer polymerization of styrene by rare earth borohydrides, chlorides/dialkylmagnesium systems. J. Polym. Sci. Part A: Polym. Chem., 2010, 48, 802-814.

[55] Valente, A.; Zinck, P.; Mortreux, A.; Bria, M.; Visseaux, M. Half-lanthanocene/dialkylmagnesium-mediated coordinative chain transfer copolymerization of styrene and hexane. J. Polym. Sci. Part A: Polym. Chem., 2011, 49, 3778-3782.




#### **Chapter 2**

### **The Use of Ionic Liquids in the Oligomerization of Alkenes**

Csaba Fehér, Eszter Kriván, Zoltán Eller, Jenő Hancsók and Rita Skoda-Földes

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57478

#### **1. Introduction**

[56] Jenkins, D. K. Polymerization of Conjugated Dienes. Patent US004619982, 1986.

sium alkyl cocatalyst: 1. Polymer, 1985 26(1), 147-151.

sium alkyl cocatalyst: 2. Polymer, 1985 26(1), 152-158.

One Catalyst System. Polymer, 1994, 35(13), 2897-2898.

alysts. European Polymer Journal, 2013, 49, 4130-4140.

mer Chemistry., 2010, 48 (21), 4640-4647.

mer Chemistry, 2012, 50 (14), 2898-2905.

134-138.

30 Oligomerization of Chemical and Biological Compounds

2400-2409.

[57] Jenkins, D. K. Butadiene polymerization with a rare earth compound using a magne‐

[58] Jenkins, D. K. Butadiene polymerization with a rare earth compound using a magne‐

[59] Monakov, Y..B.; Duvakina, N.V.; Ionova, I.A. Polymerization of Butadiene with a Halogen-containing Trans-regulating Neodymium-magnesium Catalytic System in the Presence of Carbon Tetrachloride. Polymer Science-Series B, 2008, 50 (5-6),

[60] Jenkins, D. K. Sequential Formation of Trans and Cis Butadiene Homopolymers with

[61] Ventura, A.; Chenal, T.; Bria, M.; Bonnet, F.; Zinck, P.; Ngono-Ravache, Y.; Balanzat, E.; Visseaux, M. Trans-stereospecific polymerization of butadiene and random co‐ polymerization with styrene using borohydrido rare earths / magnesium dialkyl cat‐

[62] Terrier, M.; Visseaux, M.; Chenal, T.; Mortreux, A. Controlled Trans-stereospecific Polymerization of Isoprene with Lanthanide(III) Borohydride/Dialkylmagnesium Systems: the Improvement of the Activity and Selectivity, Kinetic Studies, and Mech‐ anistic Aspects. Journal of Polymer Science, Part A: Polymer Chemistry, 2007, 45 (12),

[63] Valente, A.; Zinck, P.; Mortreux, A.; Visseaux, M. Catalytic Chain Transfer (co-)Poly‐ merization: Unprecedented Polyisoprene CCG and a New Concept to Tune the Com‐ position of a Statistical Copolymer Macromol. Rapid Commun., 2009, 30, 528-531. [64] Valente, A.; Zinck, P.; Vitorino, M.J.; Mortreux, A.; Visseaux, M. Rare earths/main group metal alkyls catalytic systems for the 1,4-trans stereoselective coordinative chain transfer polymerization of isoprene. Journal of Polymer Science, Part A: Poly‐

Increasingly stringent quality standards for motor fuels, which target lower emissions of harmful substances from vehicles, require the development of new fuel production technologies or the improvement of existing ones. The need for environmentally friendly, relatively clean-burning and practically heteroatom-free blending fractions with a high *n*- and *i*-paraffin content has been increasing.

Among the options for converting lower-value light hydrocarbons (C3-C6 paraffins and olefins) into high-quality blending components, oligomerization is one of the most promising. It provides extra flexibility to respond to changes in market demand with regard to the required gasoline:diesel ratio. C3-C5 olefins obtained by fluid catalytic cracking can be oligomerized to branched products with higher octane numbers. The need for an increased overall diesel fuel yield can be addressed by C3-C5 olefin oligomerization operated in the trimer or tetramer mode, followed by hydrogenation of the products.

Besides their application as blending components in diesel fuels, triisobutenes are considered highly useful for the synthesis of specialty chemicals, including dodecylbenzene, base oils and solidifying agents for epoxy resins. Oligomerization and subsequent hydrogenation of other olefins, such as 1-octene and 1-decene, gives synthetic base oils with a high viscosity index (good lubricating properties), a low pour point and good oxidative stability.

The main challenges in the design of oligomerization catalysts are to reach high conversion and high selectivity. Consequently, several catalysts have been developed for the oligomerization of lower alkenes. [1] Both Brønsted and Lewis acids have been used in either homogeneous or heterogeneous phase. Transition-metal complexes, such as Ni-based homogeneous catalysts containing phosphine ligands, have also been developed. Ionic liquids were first introduced as highly polar media that ensure catalyst recovery under biphasic conditions.

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In recent years, the use of room-temperature molten salts, or ionic liquids, has received increasing attention. Ionic liquids are good solvents for inorganic complexes (catalysts) while being immiscible with most hydrocarbons. As such, they provide a nonaqueous alternative for two-phase catalysis. Ionic liquid solvents eliminate the need for volatile organic solvents. Furthermore, the nearly infinite combinations of suitable cations and anions make it possible to tailor their properties.

However, ionic liquids may serve not only as solvents but also as catalysts in the oligomerization reaction. Both Lewis and Brønsted acidic ionic liquids have been used efficiently. Another possibility is the development of supported ionic liquid phases, which combine the advantages of the ionic liquid with those of heterogeneous catalysis.

In the present chapter, after a short review of the importance and mechanism of light olefin oligomerization, the use of ionic liquids as solvents and catalysts in oligomerization reactions is presented. The effect of the composition of ionic liquids on catalytic activity and selectivity is discussed in detail.

#### **2. Industrial relevance of oligomerization of lower alkenes**

Oligomerization of light olefins is an important alternative for the production of higher molecular weight hydrocarbon mixtures useful as fuels (e.g. gasoline or diesel). [2] It makes it possible to upgrade low-value components of plant process streams, *e.g*. C3-C6 olefins from fluid catalytic cracking, from the cracking of polymers or wastes, or from Fischer-Tropsch synthesis.

The oligomerization of light olefins can produce isoolefins of different boiling point ranges. Depending on the level of oligomerization, components in the gasoline, jet fuel, diesel gas oil and base oil boiling point ranges can be formed. They can be transformed into isoparaffins by hydrogenation. Because the energy demand of transportation grows constantly, there is an increasing need for the production of isoparaffins.

Inland, aerial and waterway mobility is a keystone of sustainable development. According to forecasts, internal combustion engines will remain the dominant power source of vehicles (share ≥85%) until 2030-2035. [3]

Intermittent-duty engines are used in inland and waterway transport, while aircraft are powered by continuous jet engines. Of the former, Otto and Diesel engines are the most prevalent. The operational materials of these engines are the motor fuels and, in a wider sense, lubricants and other materials (*e.g*. oxygen source, refrigerants). [4]

The quantity of fuel used depends on the type of drive. [5,6]

Based on the fuel demand (Figure 1), liquid hydrocarbons (engine gasoline, diesel gas oil) are the most widely used fuels in inland transport. They must satisfy the most important general and special requirements (see also Table 1 and Table 2), such as:

- good combustion properties
	- high octane number (gasoline)
	- high cetane number (Diesel fuels)
	- relatively low combustion temperature (lower amount of NOx formation; lower load of construction materials)
	- lower total harmful material emission, *etc*.
- very low sulphur content (≤ 10 mg/kg)
- reduced aromatic content
	- gasoline: benzene content ≤ 1.0 v/v%; total aromatic content ≤ 35 v/v%
	- Diesel fuels: polyaromatic content ≤ 8 %
- reduced olefin content
	- gasoline ≤ 18 v/v%
- good blending ability with alternative components (not only with hydrocarbons but also with e.g. oxygen containing compounds)
- good additive sensitivity and compatibility with additives (e.g. nozzle, valve and combustion chamber cleanliness)
- compatibility with motor oils
- fewer harmful substances in the exhaust gas
- environmentally friendly (non toxic)
- easy biodegradation
- secure utilisation
- user-friendly
- long-time availability of quality and quantity all over the world
- low book-cost, etc.
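The numerical gasoline limits quoted in this section (sulphur ≤ 10 mg/kg, benzene ≤ 1.0 v/v%, total aromatics ≤ 35 v/v%, olefins ≤ 18 v/v%) can be read as a simple pass/fail screen for a blend. The following Python sketch is only an illustration of that idea: the dictionary keys, the function name `violations` and the sample composition are hypothetical and are not taken from any standard or from the chapter.

```python
# Maximum values quoted in the text for gasoline.
# The key names are illustrative assumptions, not standard identifiers.
GASOLINE_LIMITS = {
    "sulphur_mg_per_kg": 10.0,
    "benzene_vol_pct": 1.0,
    "aromatics_vol_pct": 35.0,
    "olefins_vol_pct": 18.0,
}


def violations(sample: dict[str, float], limits: dict[str, float]) -> list[str]:
    """Return the names of all properties that exceed their maximum limit.

    A property that is missing from the sample is also reported, because
    it cannot be shown to comply.
    """
    failed = []
    for name, maximum in limits.items():
        value = sample.get(name)
        if value is None or value > maximum:
            failed.append(name)
    return failed


# Hypothetical blend composition, for illustration only.
sample = {
    "sulphur_mg_per_kg": 8.0,
    "benzene_vol_pct": 0.7,
    "aromatics_vol_pct": 37.5,
    "olefins_vol_pct": 12.0,
}

print(violations(sample, GASOLINE_LIMITS))  # ['aromatics_vol_pct']
```

In this made-up case only the aromatic content is too high, which is the kind of constraint that blending with isoparaffin-rich fractions is meant to relieve.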

| Property | EU: EN 228 (1993) | EU: EN 228 (2000) | EU: EN 228 (2005) | EU: EN 228 (2013) | US: RFG Phase II. 2012 | US: CARB Phase 3 (2004/2006) | US: CARB Phase 3 (2009) | WWFC: 5. category (2012) |
|---|---|---|---|---|---|---|---|---|
| maximum sulphur content, mg/kg | 500 | 150 | 50/10 | 10 | 80/30 | 60/30 | 30/20/15 | 10 |
| maximum aromatic content, v/v% | - | 42 | 35 | 35 | - | 35 | 35 | 35 |
| maximum olefin content, v/v% | - | 18 | 18 | 18 | - | 10 | 10 | 10 |
| maximum benzene content, v/v% | 5.0 | 1.0 | 1.0 | 1.0 | 0.62/1.3<sup>a</sup> | 1.1 | 0.7 | 1.0 |
| maximum oxygen content, % | - | 2.7 | 2.7 | 2.7/3.7 | 2.7/3.5 | 1.8-3.5<sup>b</sup> | 1.8-3.5<sup>b</sup> | 2.7 |
| maximum ethanol content, v/v% | - | 5.0 | 5.0 | 5.0/10.0 | 10.0 | - | - | 5.0/10.0 |
| maximum Reid vapour pressure, kPa | 35-100 | 60/70 | 60/70 | 45-105 | 44-69 | 44-50 | 41-50 | 45-105 |

CARB: California Air Resources Board, WWFC: World Wide Fuel Charter; a: Commitment of crude oil refinery, b: Prohibited since 01. 01. 2004.

**Table 1.** Quality requirement changes of gasoline

| Property | EU: EN 590 (1999) | EU: EN 590 (2000) | EU: EN 590 (2005) | EU: EN 590 (2009) | US: Federal ASTM D 975-13 No. 2-D; S-15 | US: CARB (2008) | WWFC: 5. category (2012) |
|---|---|---|---|---|---|---|---|
| minimum cetane number | 48 | 51 | 51 | 51 | 40 | 53 | 55 |
| density at 15°C, kg/m3 | 820-860 | 820-845 | 820-845 | 820-845 | - | | 820-840 |
| maximum sulphur content, mg/kg | 500 | 350 | 50.0/10.0 | 10.0 | 15 | 15 | 10.0 |
| maximum polyaromatic content, % | - | 11 | 11 | 8 | - | 3.5 | 2.0 |
| maximum total aromatic content, % (m/m%) | - | - | - | - | 35.0 | 10 | 15 |
| maximum biodiesel (FAME) content, v/v% | | | 5.0 | 7.0 | 5.0 | | not allowed |
| maximum distillation recovery (95%) temperature, °C | 370 | 360 | 360 | 360 | 338 (90 v/v%) | - | 350 |

CARB: California Air Resources Board, WWFC: World Wide Fuel Charter

**Table 2.** Quality requirement changes of diesel gas oils

Based on these requirements, it can be stated unequivocally that paraffins (mainly iso- and cycloparaffins and, less importantly, *n*-paraffins) are the only components whose quantity is not limited directly or indirectly in fuels. The concentration of cycloparaffins with suitable boiling points is low in crude oil and their synthesis is still expensive. [4] As a consequence, from the aspects of performance, environmental protection and health, the most important components of gasoline and diesel gas oils are mixtures of isoparaffins of different carbon number (different boiling point) or fractions with a high isoparaffin content. For example, isoparaffins in the gasoline boiling point range have a high octane number and energy content (Figure 2), [4,7] their sensitivity is low, they are practically free of sulphur and aromatics, they are less toxic and, because of their 'cleaner ignition', they form less harmful material. Isoparaffins in the diesel gas oil boiling point range have a high energy content and cetane number (Figure 3) and good flow properties even at low temperature (low freezing point) (Figure 4). [8,9] Moreover, they are the most suitable hydrocarbons from the aspect of environmental protection. These properties are due to their high hydrogen content (C<sub>n</sub>H<sub>2n+2</sub>) and consequently low carbon content, their relatively easy biodegradation and their low toxicity.
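The remark about the high hydrogen content of paraffins can be made concrete with a short calculation. The Python sketch below compares the hydrogen mass fraction of an acyclic paraffin CnH2n+2 with that of a CnH2n hydrocarbon (mono-olefin or mono-ring cycloparaffin) and of benzene. The function names are illustrative, and the rounded atomic masses (12.011 and 1.008 g/mol) are the only data used; the comparison itself is an added illustration, not a calculation from the chapter.

```python
# Rounded standard atomic masses in g/mol.
M_C = 12.011
M_H = 1.008


def hydrogen_mass_fraction(n_carbon: int, n_hydrogen: int) -> float:
    """Return the mass fraction of hydrogen in a hydrocarbon CxHy."""
    mass_h = n_hydrogen * M_H
    mass_c = n_carbon * M_C
    return mass_h / (mass_c + mass_h)


def paraffin(n: int) -> tuple[int, int]:
    """(C, H) atom counts of an acyclic paraffin CnH2n+2 (n- or isoparaffin)."""
    return n, 2 * n + 2


def mono_olefin(n: int) -> tuple[int, int]:
    """(C, H) atom counts of a CnH2n mono-olefin or mono-ring cycloparaffin."""
    return n, 2 * n


if __name__ == "__main__":
    for n in (8, 12, 16):
        w_paraffin = hydrogen_mass_fraction(*paraffin(n))
        w_olefin = hydrogen_mass_fraction(*mono_olefin(n))
        print(f"C{n}: paraffin {w_paraffin:.3f}, CnH2n {w_olefin:.3f}")
    print(f"benzene (C6H6): {hydrogen_mass_fraction(6, 6):.3f}")
```

For a C8 paraffin the hydrogen mass fraction is about 0.159, against about 0.144 for any CnH2n hydrocarbon and about 0.077 for benzene, which is the ordering the paragraph above relies on.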


[Figure 1 plots fuel demand in million toe (axis 0 to 375) against year (2000 to 2040).]

**Figure 1.** Changing demand of fuel by type (toe: tonne of oil equivalent)

**Figure 2.** Research octane numbers of different hydrocarbons

[Figure 3 plots cetane number against carbon number (5 to 25) for paraffins, isoparaffins, olefins, mono-ring naphthenes, mono-ring aromatics, decalins, tetralins and naphthalenes, and marks the range of required values.]

**Figure 3.** Cetane numbers of different hydrocarbons

**Figure 4.** Freezing points of different hydrocarbons

Isoparaffin hydrocarbons are of key importance not only as energy carriers but also as lubricants in engine oils. Of the main components of engine oils (base oil and additives), hydrocarbon-based lubricating oils account for 65-80%. In terms of performance, paraffin hydrocarbons, and especially isoparaffins with a proper boiling point range, are the most suitable constituents (Table 3). [10, 11] These hydrocarbons have excellent lubricating properties (high viscosity index), good or suitable flow properties (-65 °C to -20 °C) and low evaporation loss; moreover, they are free of aromatics and sulphur. [12]

| Property | conventional | HC-1 | HC-2 | HC-3 | HC-4 | PAO |
|---|---|---|---|---|---|---|
| kinematic viscosity (100°C), mm2/s | 4 | 4 | 4 | 4 | 4 | 4 |
| viscosity index | 100 | 105 | 125 | 130 | 140 | 125 |
| volatility (NOACK), % | 23 | 18 | 14 | 13 | 11 | 12 |
| flow point, °C | -15 | -15 | -18 | -20 | -30 | -65 |
| n- and i-paraffins, % | 25 | 30 | 55 | 75 | 100 | 96 |
| aromatics, % | 24 | 05 | 0.3 | 0.1 | 0 | |
| sulphur content, % | <0.3 | <0.1 | <0.1 | <100 | <10 | <1 |

HC-1: hydrocracking base oil; HC-2: rigorous hydrocracking base oil; HC-3: hydroisomerized paraffinic base oil; HC-4: Fischer-Tropsch paraffinic base oil; PAO: poly(alpha-olefins)

**Table 3.** Properties of different base oils

Isoparaffin-rich fractions with different boiling point ranges can be produced in several ways:

- isomerisation of *n*-paraffins with a suitable carbon number (C5-C7 naphtha fractions, gas oil fraction)
- alkylation of isobutane with olefins (naphtha blending components)
- hydrocracking of higher molecular weight hydrocarbon mixtures
- oligomerization of light olefins to a suitable level, followed by hydrogenation to isoparaffins (indirect alkylation in the case of naphthas; gas oil targeted oligomerization and hydrogenation; base oil targeted oligomerization and hydrogenation) (Scheme 1, Scheme 2).

[Scheme 1 shows isobutene undergoing dimerization or trimerization, followed by hydrogenation to a C8 or a C12 isoparaffin, respectively.]

**Scheme 1.** Synthesis of naphtha boiling point range (C8) and jet fuel/gas oil boiling point range (C12) isoparaffins by an oligomerization-hydrogenation reaction sequence, starting from isobutene

[Scheme 2 shows C3-C5 olefins undergoing oligomerization followed by hydrogenation.]

**Scheme 2.** Base oil targeted oligomerization and hydrogenation of light olefins

Poly-α-olefins (PAOs), used as lubricant base oils, are synthesized by a two-step reaction sequence from linear α-olefins derived from ethene (Scheme 3). PAOs have good flow properties at low temperatures, relatively high thermal and oxidative stability, low evaporation losses at high temperatures, a high viscosity index, good friction behavior, good hydrolytic stability and good erosion resistance.

[Scheme 3 shows the oligomerization of ethene to 1-decene, followed by oligomerization of 1-decene to the dimer, the trimer and higher oligomers.]

**Scheme 3.** Synthesis of poly-α-olefins (PAOs)

In summary, the oligomerization of olefins plays an important role in the production of operational materials for internal combustion engines, and its importance continues to grow. A great advantage of this method is that the quantity of each product can be controlled through the level of oligomerization, which allows flexible adaptation to market demands.
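The link between the level of oligomerization and the product obtained is simple carbon-number arithmetic, which the following Python sketch spells out for isobutene (C4), the starting material of Scheme 1. The function names are illustrative assumptions; the only chemistry used is that an n-fold oligomer of a Cm mono-olefin is a C(n·m) olefin of formula CxH2x, and that hydrogenation of its one double bond gives the paraffin CxH2x+2.

```python
def oligomer_carbon_number(monomer_carbons: int, degree: int) -> int:
    """Carbon number of the oligomer built from `degree` identical olefin units."""
    if monomer_carbons < 2:
        raise ValueError("an olefin monomer has at least two carbon atoms")
    if degree < 2:
        raise ValueError("an oligomer contains at least two monomer units")
    return monomer_carbons * degree


def hydrogenated_formula(n_carbon: int) -> str:
    """Molecular formula of the paraffin obtained by hydrogenating a CnH2n olefin."""
    return f"C{n_carbon}H{2 * n_carbon + 2}"


if __name__ == "__main__":
    isobutene_carbons = 4
    for degree, name in ((2, "dimer"), (3, "trimer"), (4, "tetramer")):
        n = oligomer_carbon_number(isobutene_carbons, degree)
        print(f"{name}: C{n} olefin -> {hydrogenated_formula(n)} isoparaffin")
```

The dimer and trimer correspond to the C8 (naphtha range) and C12 (jet fuel/gas oil range) isoparaffins of Scheme 1; the tetramer line simply extends the same arithmetic one step further.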

Besides the production of valuable blending components for fuels in petroleum refineries, oligomerization can be used for the large-scale synthesis of fine chemicals and intermediates in the petrochemical industry. Linear C8-olefin dimers are highly desirable intermediates for the production of C9-plasticizers, which exhibit better thermal properties than those derived from branched isomers. The oligomerization of ethene to higher α-olefins is also an important industrial process, as the products, depending on the chain length, can be used as intermediates for plastics, plasticizers and surfactants.

During the quest for sustainable technologies for the production of oligomers, research concerning the use of ionic liquids as solvents and/or catalysts came into focus.

#### **3. Main features of ionic liquids used in oligomerization reactions**

Ionic liquids (ILs) are salts consisting of bulky organic cations and inorganic or organic anions. (Figure 5 shows the general formulas of the most commonly used ILs, as well as the structures of the compounds mentioned in this chapter.) They melt at relatively low temperatures, usually below 100 °C. They are good solvents for polar organic molecules and inorganic salts but do not dissolve apolar compounds. These properties, together with their ability to stabilize transition metal complexes in low oxidation states, make them ideal solvents for transition metal catalyzed reactions. [13] When the polarity of the products is sufficiently low, biphasic reactions take place. After the completion of the reaction, the products can be separated by simple decantation, and the metal catalyst remains in the IL phase, which can be reused. ILs have negligible vapor pressure and are not flammable, which makes them easy and safe to handle. Mainly because of their low volatility, they are considered to be 'green solvents'. However, their toxicity, which has been investigated more thoroughly only recently, [14] should also be taken into account. Because of the great variety of anion-cation pairs and the diversity in the side chains of the cations, an almost infinite number of ILs can be produced. Task-specific ILs are developed by fine-tuning their physical and chemical properties through a careful choice of the structure of the cation-anion pair.


From the point of view of oligomerization of alkenes, mainly acidic ILs (Figure 6) are of interest. [15] Acidity may be due either to the anion or to the cation of the IL. [16]


ILs with polynuclear metallic anions, such as chloroaluminate, chloroferrate or chlorozincate ions, show Lewis acidity [16] and, in the presence of protons, superacidity. [17] Brønsted acidity can be achieved by the addition of Brønsted acids such as HF or HCl to halide-based ILs. The advantage of these systems is that supporting the acid in the IL reduces its volatility.

The acidity of ILs also depends greatly on the structure of the cation. [MIM][BF4] shows higher acidity than dialkylimidazolium ILs. [18] At the same time, Brønsted acidic ILs are usually prepared by the introduction of alkanesulfonic acid or carboxylic acid groups as side chains of the cations (Figure 6).

**Figure 5.** Common cations and anions of ILs

Cation classes shown: imidazolium, pyridinium, pyrrolidinium, phosphonium, ammonium and sulfonium. Abbreviations given in the figure: [MIM]<sup>+</sup>: R = H, R' = Me, R'' = H; [EMIM]<sup>+</sup>: R = Et, R' = Me, R'' = H; [BMIM]<sup>+</sup>: R = Bu, R' = Me, R'' = H; [HMIM]<sup>+</sup>: R = Hex, R' = Me, R'' = H; [PY]<sup>+</sup>: R = R' = H; [BPY]<sup>+</sup>: R = Bu, R' = H; [BMPY]<sup>+</sup>: R = Bu, R' = Me; [MPYR]<sup>+</sup>: R = H, R' = Me. Anions shown: Cl<sup>−</sup>, Br<sup>−</sup>, I<sup>−</sup>; [AlCl4]<sup>−</sup>, [FeCl4]<sup>−</sup>; [Al2Cl7]<sup>−</sup>, [Al3Cl10]<sup>−</sup>, [Fe2Cl7]<sup>−</sup>; [ZnCl3]<sup>−</sup>, [CuCl2]<sup>−</sup>, [SnCl3]<sup>−</sup>; [BF4]<sup>−</sup>, [PF6]<sup>−</sup>, [SbF6]<sup>−</sup>; [NTf2]<sup>−</sup>, [OTf]<sup>−</sup>, [OTs]<sup>−</sup>; [NO3]<sup>−</sup>, [PO4]<sup>3−</sup>, [HSO4]<sup>−</sup>, [SO4]<sup>2−</sup>.

**Figure 6.** Cations and anions of acidic ILs

Cations shown: nitrogen-containing cations bearing (CH2)nSO3H, (CH2)4SO3H or (CH2)nCO2H side chains. Abbreviations given for the cations with a (CH2)4SO3H side chain: [MIMBs]<sup>+</sup>: R = Me; [BIMBs]<sup>+</sup>: R = Bu; [HIMBs]<sup>+</sup>: R = Hex. Anions shown: [H2PO4]<sup>−</sup>, [HSO4]<sup>−</sup>, [H(HF)2-3], [HCl.Al2Cl7], [M2Cl7].

#### **4. Oligomerization mechanisms relevant in IL solvents**

In the oligomerization of alkenes, the most widely used methods involve cationic oligomerization in the presence of acids and transition metal catalyzed oligomerization. [19] In the first case, a carbocation intermediate is formed by the transfer of a proton from the acid catalyst to the alkene (for the oligomerization of isobutene, see Scheme 4). The carbocation then acts as an electrophile and reacts with another alkene molecule to form a new carbocation built from two monomer units. Termination happens when the dimeric carbocation loses a proton. When the carbocation is reasonably stable, this termination reaction is slower than chain elongation, and the reaction leads to a trimer, then a tetramer and finally a polymer.

**Scheme 4.** Mechanism of cationic oligomerization of isobutene
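The competition between chain elongation and termination described above can be illustrated numerically. The following is a minimal sketch and not a model taken from the cited work: it assumes that every carbocation, whatever its length, adds another monomer with the same probability `p_growth` and otherwise loses a proton. The function name and the constant-probability assumption are hypothetical simplifications.

```python
def oligomer_fractions(p_growth, max_units=6):
    """Mole fraction of products containing n monomer units (n >= 2).

    Assumption (not from the source): a carbocation of any length grows
    with the same probability p_growth and terminates by proton loss with
    probability 1 - p_growth. A chain of n units has then grown (n - 2)
    times after formation of the dimeric cation and terminated once.
    """
    if not 0.0 <= p_growth < 1.0:
        raise ValueError("p_growth must lie in [0, 1)")
    if max_units < 2:
        raise ValueError("max_units must be at least 2")
    return {n: (1.0 - p_growth) * p_growth ** (n - 2)
            for n in range(2, max_units + 1)}


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        fractions = oligomer_fractions(p, max_units=5)
        summary = ", ".join(f"n={n}: {x:.3f}" for n, x in fractions.items())
        print(f"p_growth={p}: {summary}")
```

Under this assumption, a small `p_growth` (fast termination) gives mainly dimers, whereas values close to 1 (a stable carbocation, slow termination) shift the products towards trimers, tetramers and finally polymer.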

In the transition metal catalyzed reaction the catalytically active species is regenerated after coordination of the olefin, chain growth and termination by β-elimination, producing the primary oligomerization product. Dimerization of alkenes can be described as two successive insertions of the olefin into the catalytic species. [20]

In the industrial Dimersol™ process, the active catalyst is formed in situ by the reaction of a nickel(II) salt with an ethylaluminum halide derivative. In the first step, insertion can occur at either the C1 or the C2 carbon of the alkene, leading to various linear or branched dimers (for the mechanism of propene dimerization, see Scheme 5). At the same time, the nickel complexes are active catalysts in isomerization, leading to further isomeric products. [19] The regioselectivity of dimerization can be directed by the addition of appropriate ligands. For example, in nickel-catalyzed oligomerizations sterically demanding phosphines favor the formation of highly branched oligomers.


**Scheme 5.** Mechanism of nickel-catalyzed dimerization of propene

The main challenges in the design of oligomerization catalysts are to reach high conversion and high selectivity. Although transition metal catalysts can be fine-tuned efficiently to achieve these goals, a great drawback of homogeneous systems lies in catalyst recovery and recycling. The use of a two-phase solvent mixture is an attractive alternative, as it enables catalyst separation and reuse. ILs were first introduced in oligomerization reactions as co-catalysts, as well as polar solvents ensuring biphasic nickel-catalyzed reactions. [21]

#### **5. The use of ILs as solvents in oligomerization catalyzed by organometallic complexes**

#### **5.1. Oligomerization with Ni catalysts**

Chloroaluminate ILs, composed of imidazolium [22-34] or pyridinium [33, 35] halides, AlCl3 and, in most cases, an alkylaluminum compound (AlEtCl2 or AlEt2Cl), act both as a medium for catalyst immobilization and as a nickel activator (Table 4). The cationic nickel active species is immobilized in the ionic phase without the need for a special ligand. Because of the high solubility of the Ni-complex but poor solubility of the products in the IL, complete separation of the catalyst can be achieved by simple decantation. In principle, the recovered catalyst/IL phase can be reused continuously while maintaining its catalytic activity and selectivity.

Activities of catalysts in organic solvents were found to be considerably lower than those obtained from the same nickel precursors immobilized in an IL. [22] This can be explained by an increase in the electrophilic nature of the nickel center caused by the weakly coordinating chloroaluminate anions.

The efficiency of the IL/catalyst system depends noticeably on the composition of the chloroaluminate IL. In ILs with an aluminum molar fraction lower than 0.50 ('basic ionic liquids'), the presence of an excess of coordinating chloride anions inhibits catalytic activity due to the formation of stable anionic species such as NiCl4<sup>2−</sup> and NiCl3L<sup>−</sup>. [23, 24]

$$\text{NiCl}\_2\text{(PR}\_3\text{)}\_2 + 2\text{Cl}^- \rightarrow \text{NiCl}\_4^{2-} + 2\text{PR}\_3 \tag{1}$$

Acidic chloroaluminates, with an aluminum molar fraction higher than 0.50, show enhanced activity. At the same time, in acidic ILs composed of [BMIM]Cl / AlCl3, the formation of multinuclear species such as [Al2Cl7]<sup>−</sup> or [Al3Cl10]<sup>−</sup> was detected. In the presence of [Al2Cl7]<sup>−</sup>, the phosphine ligand, which ensures high branching in the product mixture, is abstracted from the coordination sphere of the metal, [24] and this alters the selectivity of the reaction.
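The distinction between basic and acidic chloroaluminate ILs rests on the aluminum molar fraction. As a minimal sketch, the snippet below computes this fraction from the molar amounts of the components and classifies the melt using the 0.50 threshold given in the text. The function names are hypothetical, the fraction is assumed to be defined as moles of aluminum compounds divided by total moles of aluminum compounds plus organic halide, and the label "neutral" for a fraction of exactly 0.50 is an assumption, since the text names only the basic and acidic regimes.

```python
def aluminum_mole_fraction(n_aluminum_compounds, n_organic_halide):
    """Aluminum molar fraction of a chloroaluminate IL.

    n_aluminum_compounds: total moles of aluminum compounds
        (e.g. AlCl3 plus AlEtCl2)
    n_organic_halide: moles of the organic halide salt (e.g. [BMIM]Cl)
    Assumed definition: n(Al) / (n(Al) + n(halide salt)).
    """
    total = n_aluminum_compounds + n_organic_halide
    if n_aluminum_compounds < 0 or n_organic_halide < 0 or total <= 0:
        raise ValueError("amounts must be non-negative and not both zero")
    return n_aluminum_compounds / total


def classify_melt(x_al, tolerance=1e-9):
    """Classify a chloroaluminate IL by its aluminum molar fraction."""
    if not 0.0 <= x_al <= 1.0:
        raise ValueError("a molar fraction must lie between 0 and 1")
    if x_al < 0.5 - tolerance:
        return "basic"    # excess of coordinating chloride anions
    if x_al > 0.5 + tolerance:
        return "acidic"   # multinuclear anions such as [Al2Cl7]- can form
    return "neutral"      # assumption: regime not named in the text


if __name__ == "__main__":
    # Hypothetical amounts in mol, chosen only for illustration
    for n_al, n_salt in [(0.8, 1.0), (1.0, 1.0), (1.2, 1.0)]:
        x = aluminum_mole_fraction(n_al, n_salt)
        print(f"x(Al) = {x:.3f} -> {classify_melt(x)}")
```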


$$\left[\text{HNiPR}\_3\right]^+ + \left[\text{Al}\_2\text{Cl}\_7\right]^- \rightarrow \left[\text{HNi}\right]^+ + \text{AlCl}\_3.\text{PR}\_3 + \left[\text{AlCl}\_4\right]^-\tag{2}$$

To avoid this, a weak competitive base, namely an aromatic hydrocarbon [24, 25] that did not interfere with the cationic nickel active species, was added to the reaction mixture. Thus, the acidity and the distribution of the aluminum anionic species could be controlled through the coordination of AlCl3 to the aromatic ring.

$$\left[\text{Al}\_2\text{Cl}\_7\right]^- + \text{ArH} \rightarrow \text{AlCl}\_3.\text{ArH} + \left[\text{AlCl}\_4\right]^-\tag{3}$$

As another disadvantage, the presence of [Al2Cl7]<sup>−</sup> leads to the formation of yellow, highly viscous heavy oligomers, characteristic of cationic oligomerization, even in the absence of a Ni-precursor. [23, 26] This side reaction arises from the superacidity caused by proton contamination of the ionic liquid.

$$\text{HCl} + \left[\text{Al}\_2\text{Cl}\_7\right]^- \rightarrow \left[\text{H}\right]^+\_{\text{nonsolvated}} + 2\left[\text{AlCl}\_4\right]^- \tag{4}$$

These problems were first eliminated by the use of AlEtCl2 instead of AlCl3. In these ILs, both the formation of higher oligomers and the loss of phosphine ligands could be avoided. The latter conclusion was based on the fact that, during dimerization of propene, the selectivity of the reaction shifted towards 2,3-dimethylbutenes, which is characteristic of catalysts with bulky ligands. [23]

In [BMIM]Cl / AlEtCl2 / AlCl3 systems, ethylaluminum species such as [AlEtCl3]<sup>−</sup>, [Al2Et2Cl5]<sup>−</sup>, [Al2EtCl6]<sup>−</sup> and [Al3Et3Cl7]<sup>−</sup> [36] are formed, depending on the amount of the alkylaluminum compound. The composition of the mixture greatly affects the acidity of the final IL, so the AlEtCl2 to AlCl3 ratio must be adjusted in order to optimize the efficiency of the catalyst/ionic liquid system. [24, 26]


Cationic side reactions in [BMIM]Cl / AlCl3 ILs could be suppressed completely by the addition of AlEtCl2, even in a molar fraction as low as 0.05. [26] This behavior was explained by hydride abstraction from the olefin by [Al2Cl7]<sup>−</sup>, generating an allylic cation (Scheme 6), with a subsequent alkylation consuming the co-catalyst.

**Scheme 6.** The role of alkylaluminum derivatives in the suppression of cationic oligomerization

However, the loss of AlEtCl2 may decrease the possibility of precatalyst activation in nickel-containing systems and consequently decrease their activity. This explanation is in agreement with the observation that increasing the AlCl3 content of the ILs decreases the activity, due to the lower availability of the co-catalyst, which is necessary for the alkylation of the nickel precursor.

Generally, the use of slightly acidic [BMIM]Cl / AlEtCl2 / AlCl3 mixtures is the most favorable. Catalytic activity of nickel precursors was found to be higher due to the more acidic nature of these ionic liquids compared to [BMIM]Cl / AlEtCl2. [24]

When an IL containing AlEtCl2 is brought into contact with a hydrocarbon layer, the dissociation equilibrium of polynuclear chloroethylaluminum anions is shifted towards the formation of chloroaluminum anions of lower nuclearity and neutral chloroethylaluminum compounds (Eqs 5, 6). Because the latter aluminum derivatives are totally miscible with hydrocarbons, ethylaluminum species are leached into the organic phase, and this leads to a change in the composition of the ionic liquid phase. [24] The extraction process, as well as the nature of the alkylaluminum species formed in various chloroaluminate ILs, was investigated in detail by Gilbert *et al*. using Raman spectroscopy [37] in order to optimize the acidity and catalytic activity of the system as well as to minimize the loss of AlEtCl2.

$$2\left[\text{Al}\_2\text{Et}\_2\text{Cl}\_5\right]^- \rightarrow 2\left[\text{AlCl}\_4\right]^- + \text{Al}\_2\text{Et}\_4\text{Cl}\_2\tag{5}$$

$$2\left[\text{Al}\_2\text{EtCl}\_6\right]^- \rightarrow 2\left[\text{AlCl}\_4\right]^- + \text{Al}\_2\text{Et}\_2\text{Cl}\_4 \tag{6}$$
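The stoichiometry of the chloroaluminate equilibria (Eqs 4-6) can be checked with a simple bookkeeping routine. The sketch below is only an illustration: it treats the ethyl group as a single unit labeled `Et`, writes each species as a hand-entered composition dictionary, and compares the totals on both sides of each equation. All names in the code are hypothetical helpers, not part of any chemistry library.

```python
from collections import Counter


def side_total(side):
    """Sum atoms, groups and charge over (coefficient, species) pairs."""
    total = Counter()
    for coefficient, species in side:
        for key, count in species.items():
            total[key] += coefficient * count
    # Drop zero entries so that a net charge of 0 equals "no charge entry"
    return {key: value for key, value in total.items() if value != 0}


def is_balanced(left, right):
    """True if both sides contain the same atoms, groups and net charge."""
    return side_total(left) == side_total(right)


# Compositions as written in the text; "Et" stands for one ethyl group
HCL = {"H": 1, "Cl": 1}
PROTON = {"H": 1, "charge": 1}
AL2CL7 = {"Al": 2, "Cl": 7, "charge": -1}
ALCL4 = {"Al": 1, "Cl": 4, "charge": -1}
AL2ET2CL5 = {"Al": 2, "Et": 2, "Cl": 5, "charge": -1}
AL2ETCL6 = {"Al": 2, "Et": 1, "Cl": 6, "charge": -1}
AL2ET4CL2 = {"Al": 2, "Et": 4, "Cl": 2}
AL2ET2CL4 = {"Al": 2, "Et": 2, "Cl": 4}

EQUATIONS = {
    4: ([(1, HCL), (1, AL2CL7)], [(1, PROTON), (2, ALCL4)]),
    5: ([(2, AL2ET2CL5)], [(2, ALCL4), (1, AL2ET4CL2)]),
    6: ([(2, AL2ETCL6)], [(2, ALCL4), (1, AL2ET2CL4)]),
}

if __name__ == "__main__":
    for number, (left, right) in EQUATIONS.items():
        print(f"Eq. {number} balanced: {is_balanced(left, right)}")
```

The same dictionaries could be extended with further species if other equations of the chapter are to be checked in the same way.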

Ni salts (such as NiF2 [25]), neutral (**1** (Figure 7) [23, 34], **2** [35], **3** [33, 34], **4** [34]) and cationic Ni-complexes ([Ni(MeCN)6][BF4]2 [25-27, 32], [Ni(MeCN)6][AlCl4]2, [Ni(MeCN)6][ZnCl4] [27]), as well as preformed (*e.g.* NiCl2(P(i-Pr)3)2 (**5**) [23, 24], NiBr2(PPh3)2 (**6**) [22], NiCl2(PCy3)2 (**7**) [25]) or *in situ* produced Ni-phosphine complexes (*e.g.* NiCl2+PBu3, NiCl2+PPh3, NiCl2+PCy3 [28]) and Ni-carbene complexes (**8**) [30], were found to be suitable catalyst precursors in chloroaluminate ILs (Table 4).


Various ligands/additives were found to affect greatly the activity and/or selectivity of the catalytic systems.

A decisive effect of the counter-anion of cationic Ni-complexes on the catalytic activity was observed. [27] Precursors with [ZnCl4]<sup>2−</sup> and [AlCl4]<sup>−</sup> anions led to enhanced catalytic activity (TOF: 6480 h<sup>−1</sup>) compared to [Ni(MeCN)6][BF4]2 (TOF: 2412 h<sup>−1</sup>). This phenomenon was attributed to the highly reactive behavior of the [BF4]<sup>−</sup> ligand towards alkylaluminum compounds. It can be assumed that a single active nickel hydride was formed in the ionic liquid, regardless of the precursor used, but in varying amounts depending on the counter-ion of the different salts.

In the case of [Ni(MeCN)6][BF4]2, the use of PCy3.CS2 as an additive led to a considerably higher reaction rate (TOF: 6840 h<sup>−1</sup>) in the oligomerization of 2-butene carried out in acidic ILs than the addition of the phosphine PCy3 itself (TOF: 3960 h<sup>−1</sup>). [28] A systematic study of the effect of PCy3.CS2 was carried out by de Souza. [29] According to NMR investigations of different IL/PCy3.CS2 systems, the improvement in catalytic activity is a consequence of a new anionic IL species, formed by the reaction of [Al2EtCl6]<sup>−</sup> with the zwitterionic PCy3.CS2, that is able to coordinate to the nickel center and has a strong electron-withdrawing effect on the metal.
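The TOF values quoted above can be put on a common scale by expressing them as ratios. The following sketch only rearranges numbers already given in the text; the helper name `relative_activity` and the constant names are hypothetical.

```python
def relative_activity(tof, tof_reference):
    """Ratio of two turnover frequencies given in the same unit (h^-1)."""
    if tof_reference <= 0:
        raise ValueError("the reference TOF must be positive")
    return tof / tof_reference


# TOF values (h^-1) as quoted in the text
TOF_ZNCL4_ALCL4_PRECURSORS = 6480   # precursors with [ZnCl4]2- / [AlCl4]- [27]
TOF_BF4_PRECURSOR = 2412            # [Ni(MeCN)6][BF4]2 [27]
TOF_WITH_PCY3_CS2 = 6840            # additive PCy3.CS2 [28]
TOF_WITH_PCY3 = 3960                # additive PCy3 [28]

counter_anion_gain = relative_activity(TOF_ZNCL4_ALCL4_PRECURSORS,
                                       TOF_BF4_PRECURSOR)
additive_gain = relative_activity(TOF_WITH_PCY3_CS2, TOF_WITH_PCY3)

if __name__ == "__main__":
    print(f"Counter-anion effect: {counter_anion_gain:.2f}-fold")
    print(f"PCy3.CS2 vs PCy3:     {additive_gain:.2f}-fold")
```

On these figures, changing the counter-anion raises the activity roughly 2.7-fold, and replacing PCy3 by PCy3.CS2 raises it roughly 1.7-fold. The two comparisons come from different studies and substrates, so the ratios should not be compared with each other directly.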

Various Ni-carbene complexes (**8**, Figure 7) were found to be more active catalysts of 1-butene oligomerization (with TOFs of 3820-7020 h<sup>−1</sup>) in chloroaluminate ILs than NiCl2(PCy3)2 (**7**, TOF = 2950 h<sup>−1</sup>) under the same conditions. [30]

At the same time, similar selectivities were obtained in the presence of [Ni(MeCN)6][BF4]2, a Ni-carbene complex, and a nickel complex in an imidazolium IL whose imidazolium ion is blocked with a methyl group in position 2, so the active species are thought to be similar. As the formation of a Ni-carbene complex is not possible in the last case, it was concluded that the active catalyst was formed by the coordination of aluminum species to the nickel center in all cases. [26]

In the [BMIM]Cl / AlEtCl2 / AlCl3 systems, selective dimerization of ethene [25, 31, 38], propene [23, 24, 32] and butenes [26-29] could usually be observed, due to the inhibition of cationic oligomerization/polymerization by the AlEtCl2 co-catalyst. [26]

It should be mentioned, however, that with a suitable change in the reaction temperature and the alkylaluminum co-catalyst/Ni ratio, oligomerization of ethene could be shifted towards the formation of trimers. Both the TOF values and the C6 selectivity passed through a maximum as the AlEt2Cl/Ni ratio was increased. [22]

Oligomerization in acidic ILs was reported to lead to highly branched products. Besides, isomerization of both products and starting material was observed in most cases. For example, oligomerization of 1-butene and 2-butene led to the same product distribution, showing that isomerization was not a limiting step. [28]

Isomerization of the oligomerization products could be minimized by the proper choice of temperature and pressure during dimerization of propene: good selectivity for 1-hexene, up to 63%, could be achieved. [32]

As linear C8-alkenes are highly desirable precursors of plasticizers, great efforts were made to develop a catalyst system producing linear oligomers. Complexes **2** and **3** (Figure 7), which show high selectivity in linear oligomerization in organic solvents, did not give satisfactory results in [BMPY]Cl / AlEt2Cl / AlCl3 ILs. To avoid cationic side reactions, the use of buffered IL systems, consisting of [BMPY]Cl / AlCl3 and an organic base, was investigated. [33, 35] The function of the base is to trap any free acidic species in the IL which may initiate cationic side reactions. Besides having suitable basicity, the base should be non-coordinating with respect to the catalytically active nickel center. Depending on the nature of the base, 32-72% linear selectivity could be achieved in 1-butene oligomerization. Catalytic activity and selectivity were found to be optimal in the presence of *N*-methylpyrrole (TOF: 2100 h<sup>−1</sup>, linear selectivity: 51%) or quinoline (TOF: 1240 h<sup>−1</sup>, linear selectivity: 64%).

In the oligomerization of propene, the same catalyst showed higher activity and dimer selectivity but lower linearity of the products in the buffered IL [EMIM]Cl / AlCl3 / *N*-methylpyrrole than in toluene. [34] Selective dimerization was due to the solubility of the product hexenes in the IL being 22 times lower than that of the starting material. The low linearity was attributed to ligand degradation in the presence of the [Al2Cl7]<sup>−</sup> anions present in the ionic liquid.

Highly selective dimerization of ethene, propene, 1-butene and 1-hexene could be carried out in the presence of **10** (Figure 7) in triphenylbismuth-buffered chloroaluminate ILs, such as [MPYR][Al2Cl7] / BiPh3. [39]

As chloroaluminate ILs promote isomerization of the double bond, giving internal olefins as the main products, the use of neutral ILs was also investigated. Hexafluorophosphate ILs with imidazolium cations were found to be suitable solvents for ethylene oligomerization with a cationic nickel-phosphine complex, leading to 91-95% linear hexene selectivity, together with an 89-94% 1-hexene selectivity in the C6 fraction. [40] The experiments showed enhanced activity in ILs (TOFs: 2058-12712 h<sup>−1</sup>) compared to the reaction in CH2Cl2 (1852 h<sup>−1</sup>). Activity decreased with increasing alkyl chain length of the imidazolium cation. This phenomenon was explained by inhibition of the cationic nickel catalyst by the oligomers formed: oligomerization activity was reduced both by the monophasic reaction in CH2Cl2 and by the increasing solubility of the products in the ILs with increasing alkyl chain length. The excellent 1-hexene selectivity is due to the low solubility of the oligomerization products in the IL. Since the primarily formed 1-olefins are quickly extracted into the organic layer, consecutive isomerization of these products at the Ni-center is suppressed. [41] A decisive negative effect of solvent impurities, such as chloride and water, on the outcome of the reaction was also revealed.

**Figure 7.** Some transition metal complexes used in oligomerization reactions

Active catalysts were generated from Ni(COD)2 and the Brønsted acid H(Et2O)2B[3,5-(CF3)2C6H3]4 in ILs such as [BMIM][NTf2] or [BMIM][SbF6]. [38] Contrary to organic solvents, the IL was able to stabilize and immobilize the active Ni species even in the absence of a coordinative ligand. In ethene oligomerization, high selectivity towards C4-C8 olefins (above 90%), together with good activity, with TOFs between 1400 and 12000 h<sup>−1</sup>, was achieved in most cases. Selectivity for 1-butene was around 35% in the presence of IL/catalyst systems leading to C4 products in good yields.

The main advantage of the use of ionic liquid solvents in oligomerization is the possibility of catalyst reuse.

In butene dimerization the catalytic system, a [BMIM]Cl / AlCl3 / AlEtCl2 ionic liquid with an aluminum molar fraction of 0.57 and a cationic nickel complex, was reused six times without any significant changes in the catalytic activity or selectivity. [22] The dimer distribution was almost constant throughout the catalytic runs. These indicate that the composition

Highly selective dimerization of ethene, propene, 1-butene and 1-hexene could be carried out in the presence of **10** (Figure 7) in triphenylbismuth buffered chloroluminate ILs, such as

As chloroaluminate ILs promote isomerization of the double bond giving internal olefins as the main products, the possibility of the use of neutral ILs was also investigated. Hexafluoro‐ phosphate ILs with imidazolium cations were found to be suitable solvents for ethylene oligomerization with a cationic nickel-phosphine complex leading to 91-95% linear hexene selectivity, together with a 89-94% 1-hexene selectivity in the C6 fraction. [40] The experiments showed enhanced activity in ILs (TOFs: 2058-12712 h-1) compared to the reaction in CH2Cl2 (1852 h-1). Decreasing activity was observed with increasing alkyl chain length of the imida‐ zolium cation. This phenomenon was explained by inhibition of the cationic nickel catalyst by the oligomers formed: oligomerization activity was reduced by both the monophasic reaction in CH2Cl2 and increasing solubility of the products in the ILs with increasing alkyl chain length. Excellent 1-hexene selectivity is due to the low solubility of the oligomerization products in the IL. Since the primarily formed 1-olefins are quickly extracted into the organic layer, consecutive isomerization of these products at the Ni-center is suppressed. [41] A decisive negative effect of solvent impurities, such as chloride and water, on the outcome of the reaction

Active catalysts were generated from Ni(COD)2 and the Brønsted acid, H(Et2O)2B[3,5- (CF3)2C6H3]4 in ILs such as [BMIM][NTf2] or [BMIM][SbF6]. [38] Contrary to organic solvents, the IL was able to stabilize and immobilize the active Ni species even in the absence of a coordinative ligand. In ethene oligomerization, high selectivity towards C4-C8 olefins (above

anions present in the ionic liquid.

−

to 63%, could be achieved. [32]

46 Oligomerization of Chemical and Biological Compounds

chinoline (TOF: 1240 h-1, linear selectivity: 64%).

degradation in the presence of [Al2Cl7]

[MPYR][Al2Cl7]/ BiPh3. [39]

was also revealed.

90%), together with good activity, with TOFs between 1400-12000 h-1, were achieved in most cases. Selectivity for 1-butene was around 35% in the presence of IL/catalyst systems leading to C4 products in good yields.

The main advantage of using ionic liquid solvents in oligomerization is the possibility of catalyst reuse.

In butene dimerization the catalytic system, a [BMIM]Cl / AlCl3 / AlEtCl2 ionic liquid with an aluminum molar fraction of 0.57 and a cationic nickel complex, was reused six times without any significant change in catalytic activity or selectivity. [22] The dimer distribution was almost constant throughout the catalytic runs. These results indicate that the composition of the catalytic species in the ionic liquid phase is not affected by the removal of the organic phase.


In contrast, a drop in activity with constant selectivity was observed in the second run in chloroaluminate ILs composed of [BMIM]Cl / AlCl3 / AlEtCl2 = 0.48/0.5/0.02. [26] At the same time, an increase in the TOFs from 5700 h⁻¹ to 6500 h⁻¹ and from 6100 h⁻¹ to 7600 h⁻¹ took place when the AlEtCl2 content of the IL was raised to 5% and 10%, respectively. This shows that a minimal amount of co-catalyst is needed, probably to maintain a significant amount of active species through alkylation of the nickel species after recycling.
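Compositions such as the one quoted above can be converted into an aluminum mole fraction, the quantity used earlier to characterize the [BMIM]Cl / AlCl3 / AlEtCl2 system (0.57). The following Python sketch is only an illustration: the function name, and the convention that AlCl3 and the alkylaluminum co-catalyst are both counted as aluminum species, are assumptions rather than a definition taken from the cited reports.

```python
def aluminum_mole_fraction(organic_salt, alcl3, alkylaluminum=0.0):
    """Mole fraction of aluminum species in a chloroaluminate mixture.

    Assumption: each mole of AlCl3 and of the alkylaluminum co-catalyst
    counts as one mole of aluminum species; the organic chloride salt
    makes up the rest of the mixture.
    """
    total = organic_salt + alcl3 + alkylaluminum
    if total <= 0:
        raise ValueError("composition must contain a positive amount of material")
    return (alcl3 + alkylaluminum) / total


# Composition quoted in the text: [BMIM]Cl / AlCl3 / AlEtCl2 = 0.48 / 0.5 / 0.02
x_al = aluminum_mole_fraction(0.48, 0.5, 0.02)
print(f"x(Al) = {x_al:.2f}")
```

Under this convention, a value above 0.5 corresponds to an excess of aluminum species over the organic chloride, that is, to an acidic chloroaluminate melt.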

Almost constant activity and selectivity could be ensured by adding a fresh supply of AlEt2Cl together with the *n*-heptane solvent after each run to the IL phase consisting of [BMIM]Cl / AlCl3 / AlEt2Cl in ethene oligomerization. After three cycles, 98% of the original amount of nickel remained in the IL phase. [22]

The stability of the catalyst could be enhanced by the use of ligands with imidazolium tags. [42] Recycling experiments showed that bis-(salicylaldimine) Ni(II) complexes bearing imidazolium moieties in the side chains (**11**, Figure 7) could be recycled more efficiently, retaining their catalytic activity (with TOF values between 49 and 41 h⁻¹) over at least three cycles of ethene oligomerization. In contrast, the activity of complex **12** dropped considerably upon reuse, with TOFs of 48 h⁻¹ and 30 h⁻¹ in the first and second batch, respectively.

The catalytic system consisting of the cationic Ni complex (**13**, Figure 7) in [BMIM][PF6] was used successfully for the biphasic oligomerization of ethene to higher α-olefins. It proved to be recyclable with little change in selectivity, although with somewhat lower activity. [40] The drop in TOF from 12712 h⁻¹ to 7952 h⁻¹ in the third run was attributed to the practical problem of quantitatively transferring the IL/catalyst mixture back into the autoclave under completely inert conditions.
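The recycling results above are easier to compare when each later-run TOF is expressed as a fraction of the first-run value. The sketch below does this arithmetic in Python for the TOF values quoted in the text. The function name is hypothetical, and pairing the 49 and 41 h⁻¹ values of complex **11** with the first and last of its three cycles is an assumption made for illustration.

```python
def activity_retained(tof_first, tof_later):
    """Fraction of the first-run TOF still observed in a later run."""
    if tof_first <= 0:
        raise ValueError("first-run TOF must be positive")
    return tof_later / tof_first


# TOF values in h^-1 as quoted in the text. The run assignment for
# complex 11 is an assumption (see the paragraph above this block).
runs = {
    "complex 13 in [BMIM][PF6], run 1 to run 3": (12712, 7952),
    "complex 11, run 1 to run 3 (assumed)": (49, 41),
    "complex 12, run 1 to run 2": (48, 30),
}

for label, (first, later) in runs.items():
    print(f"{label}: {activity_retained(first, later):.1%} of initial TOF")
```

On these numbers, the imidazolium-tagged complex **11** keeps a larger share of its initial activity than complex **12**, in line with the conclusion drawn in the text.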

Good results of catalyst reuse were obtained in ethene oligomerization in the presence of a catalyst generated from a Ni(0) complex and a Brønsted acid in [BMIM][SbF6]. [38]

The use of the IL solvent makes it possible to carry out the oligomerization reaction in a continuous mode. A loop reactor was designed where the biphasic reaction mixture was circulated by a pump with high flow rates. [33] Two static mixers in the reactor loop provided an efficient dispersion of the ionic catalyst solution in the organic phase. The product was separated from the IL/catalyst mixture with a gravity separator integrated in the reactor loop. After 3 h reaction time the [BMIM]Cl / AlCl3 / N-methylpyrrole / **3** catalytic system showed good activity (TOF=2700 h−1) and selectivity (selectivity to C8-product >98%, selectivity to linear C8-product=52%).

#### **5.2. Oligomerization with other transition metal catalysts**

To avoid the high rate of carbon–carbon double bond isomerization observed for nickel-catalyzed reactions, oligomerization catalysts based on other transition metals have also been tested (Table 4).

Cationic complexes of iron and cobalt, [Fe(MeCN)6][BF4]2 and [Co(MeCN)6][BF4]2, showed catalytic activities *ca*. one order of magnitude lower in chloroaluminate ILs than the analogous nickel catalyst in ethene oligomerization. [43] At the same time, they exhibited higher dimer selectivity (in the range of 79-100% for iron and 66-99% for cobalt) than the nickel derivative (43-68%) and excellent selectivity for 1-butene (up to 100% and 99% within the dimer fraction for the iron and cobalt complex, respectively). FeCl2 and CoCl2 turned out to be more active, but the absence of ligands resulted in considerably lower selectivity to α-olefins.


Similarly to the nickel-catalyzed reactions, the addition of the ionic liquid greatly increased catalytic activity of the cationic complexes. Also, systems with more acidic co-catalysts, such as AlEtCl2 and AlEt2Cl, showed higher activity than AlEt3 or methylaluminoxane.

Cyclodimerization of 1,3-butadiene, producing 4-vinyl-1-cyclohexene, was carried out in [BMIM][BF4] in the presence of the catalyst precursor [Fe(NO)2Cl]2 and a reducing agent, such as Zn or AlEtCl2. [44] In the latter case the conversion and turnover frequencies were similar to those obtained using Zn as reducing agent, but formation of some linear oligomers was observed. The use of the IL solvent greatly enhanced catalytic activity: the TOF achieved in the two-phase system (1404 h⁻¹) was higher than that under homogeneous conditions (253 h⁻¹). The catalytic system could be recovered and used in three successive experiments without any loss in catalytic activity or selectivity.
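To make the comparison of turnover frequencies concrete, the short Python sketch below computes a TOF from reaction quantities and the ratio of the two TOF values quoted above. It assumes the usual definition of TOF as moles of substrate converted per mole of catalyst per hour. The function names and the example amounts are hypothetical and are not taken from the cited work.

```python
def turnover_frequency(mol_converted, mol_catalyst, hours):
    """TOF in h^-1, taken here as mol substrate converted per mol catalyst per hour."""
    if mol_catalyst <= 0 or hours <= 0:
        raise ValueError("catalyst amount and reaction time must be positive")
    return mol_converted / (mol_catalyst * hours)


def enhancement_factor(tof_biphasic, tof_homogeneous):
    """Ratio of the two-phase TOF to the homogeneous TOF."""
    if tof_homogeneous <= 0:
        raise ValueError("reference TOF must be positive")
    return tof_biphasic / tof_homogeneous


# Hypothetical amounts, chosen only to show the units:
# 0.70 mol butadiene converted by 1.0 mmol of catalyst precursor in 0.5 h.
print(f"TOF = {turnover_frequency(0.70, 1.0e-3, 0.5):.0f} h^-1")

# TOF values quoted in the text for butadiene cyclodimerization.
print(f"enhancement = {enhancement_factor(1404, 253):.1f}x")
```

Expressed this way, the two-phase system turns over roughly five to six times faster than the homogeneous one.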

Bis(imino)pyridine cobalt(II) catalysts (**14**, Figure 7) activated by methylaluminoxane were also found to be active in ethene oligomerization in chloroaluminate ILs. [45] They showed activities for ethene conversion in the range of 4000–15300 h⁻¹. The presence of electron-withdrawing groups in the ligands, such as CF3, F, Cl or Br, enhanced catalytic activity. The catalysts exhibited high selectivity (above 90%) for the dimerization of ethene. The major product was 1-butene for all the catalysts.

Tungsten complexes, generally applied to induce metathesis reactions, could also be used for the dimerization of ethene in chloroaluminate ILs. [46] The tungsten-imido complex Cl2W=NPh(PMe3)3 was found to be an active catalyst in slightly acidic [BMIM]Cl / AlCl3 even without the addition of AlEtCl2. At the same time, the presence of the alkylaluminum cocatalyst was necessary in propene dimerization with the Cl4W=NPh catalyst. In the latter case, contrary to the experiment starting from the phosphine complex, some leaching of the tungsten species to the organic phase was observed.

Palladium/Lewis acid systems, such as Pd(OAc)2/Cu(OTf)2 and Pd(OAc)2/In(OTf)3, were found to be very efficient catalysts for the selective dimerization of styrene to 1,3-diphenyl-1-butene in [BMIM][PF6]. [47] Linear dimerization of 1,3-butadiene was achieved with PdCl2 in [BMIM][BF4] and [BMIM][PF6]. [48] Butadiene conversion was significantly higher in ILs than under homogeneous conditions. According to atomic absorption analysis, the palladium complex was almost completely retained in the ionic phase.


| alkene | catalyst<sup>a</sup> | ionic liquid | reference |
|---|---|---|---|
| **oligomerization with Ni catalysts** | | | |
| ethene | NiF2, NiCl2(PCy3)2 (**7**), [Ni(MeCN)6][BF4]2 | [BMIM]Cl / AlEtCl2 / AlCl3 | 25 |
| ethene | NiBr2(PPh3)2 (**6**) | [BMIM]Cl / AlEt2Cl / AlCl3 | 22 |
| ethene | **9** | [BMIM]Cl / AlEt2Cl / AlCl3 | 31 |
| ethene | **11**, **12** | [BMIM]Cl / AlEt2Cl / AlCl3 | 42 |
| ethene | Ni(COD)2 / H(Et2O)2B[3,5-(CF3)2C6H3]4 | [BMIM][NTf2], [BMIM][SbF6] | 38 |
| ethene | **13** | [BMIM][PF6] | 40, 41 |
| propene | [Ni(MeCN)6][BF4]2 | [BMIM]Cl / AlEtCl2 / AlCl3 | 32 |
| propene | NiCl2(P(i-Pr)3)2 (**5**), NiCl2(PBu3)2, NiCl2(PPh3)2, NiCl2(PBz3)2, NiCl2(PCy3)2 (**7**) | [BMIM]Cl / AlEtCl2 / AlCl3 | 23, 24 |
| propene | Ni(acac)2 (**1**) | [BMIM]Cl / AlEtCl2 / AlCl3 | 23 |
| propene | **8** | [BMIM]Cl / *N*-methylpyrrole / AlCl3 | 30 |
| propene | Ni(acac)2 (**1**), **3**, **4** | [EMIM]Cl / *N*-methylpyrrole / AlCl3 | 34 |
| propene | **10** | [MPYR][Al2Cl7] / BiPh3 | 39 |
| 1-butene | [Ni(MeCN)6][BF4]2, [Ni(MeCN)6][AlCl4]2, [Ni(MeCN)6][ZnCl4] | [BMIM]Cl / AlEtCl2 / AlCl3 | 26, 27 |
| 1-butene, 2-butene | NiCl2 + PBu3, NiCl2 + PPh3, NiCl2 + PCy3, NiCl2 + PCy3CS2 | [BMIM]Cl / AlEtCl2 / AlCl3 | 28 |
| 1-butene | [Ni(MeCN)6][BF4]2 + PCy3CS2 | [BMIM]Cl / AlEtCl2 / AlCl3 | 29 |
| 1-butene | **8** | [BMIM]Cl / *N*-methylpyrrole / AlCl3 | 30 |
| 1-butene | **2**, **3** | [BMPY]Cl / LiCl / AlCl3, [BMPY]Cl / AlEtCl2 / AlCl3, [BMPY]Cl / *N*-methylpyrrole / AlCl3, [BMIM]Cl / *N*-methylpyrrole / AlCl3 | 33, 35 |
| **oligomerization with other transition metal catalysts** | | | |
| ethene | [Fe(MeCN)6][BF4]2 | [BMIM]Cl / AlEtCl2 / AlCl3, [BMIM]Cl / AlEt2Cl / AlCl3, [BMIM]Cl / AlEt3 / AlCl3, [BMIM]Cl / MAO / AlCl3 | 43 |
| ethene | [Co(MeCN)6][BF4]2 | [BMIM]Cl / AlEtCl2 / AlCl3, [BMIM]Cl / AlEt2Cl / AlCl3, [BMIM]Cl / AlEt3 / AlCl3, [BMIM]Cl / MAO / AlCl3 | 43 |
| 1,3-butadiene | [Fe(NO)2Cl]2 / Zn or AlEtCl2 | [BMIM][BF4] | 44 |
| ethene | **14** | [BMIM]Cl / MAO / AlCl3 | 45 |
| ethene | Cl2W=NPh(PMe3)3 | [BMIM]Cl / AlCl3 | 46 |
| propene | Cl4W=NPh | [BMIM]Cl / AlEtCl2 / AlCl3 | 46 |
| styrene | Pd(OAc)2/Cu(OTf)2, Pd(OAc)2/In(OTf)3 | [BMIM][PF6] | 47 |
| 1,3-butadiene | PdCl2 | [BMIM][BF4], [BMIM][PF6] | 48 |

<sup>a</sup> For structures of catalysts **1**-**14**, see Figure 7.

**Table 4.** Oligomerization of alkenes in ILs with transition metal catalysts

#### **5.3. The use of ILs as solvents in acid-catalyzed oligomerizations**

Air-stable nonchloroaluminate ILs can be used as solvents for catalytic amounts of protic acids (Table 5). In this case the IL is used to immobilize the acid catalyst and to ensure a biphasic reaction. The acidity of the system can be tuned by the nature of the acid/IL composition. The acidity level of these systems can be evaluated by determining the Hammett acidity functions using UV-visible spectroscopy. [49, 50] (For a review, see [51].) These results were in good agreement with the catalytic data obtained for the oligomerization of isobutene in most ILs except for [BMIM][NTf2]. [52] The choice of the anion of the IL and the nature and concentration of the Brønsted acid were the main factors influencing catalytic performance. Although excellent conversions were obtained with HNTf2 or HOTf, dimer selectivity remained moderate and trimers were formed in large amounts. At the same acid concentrations, no reaction was observed either in organic solvents, such as heptane, or in water. Due to the solvating effect of water with respect to the proton, even the addition of water to an IL/acid system considerably decreased the acidity level of the proton.
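As an illustration of how a Hammett acidity function can be obtained from UV-visible data, the sketch below uses the standard relation H0 = pKa(indicator) + log10([I]/[IH⁺]). It is a minimal example under stated assumptions, not the procedure of the cited studies: it assumes that only the unprotonated indicator absorbs at the chosen wavelength, so that [I]/[IH⁺] = A / (A0 − A), where A0 and A are the absorbances without and with acid. The function name, the absorbance values and the choice of 4-nitroaniline (commonly tabulated pKa of 0.99) are illustrative.

```python
import math


def hammett_h0(pka_indicator, abs_without_acid, abs_with_acid):
    """Hammett acidity function from indicator absorbances.

    Assumption: only the unprotonated indicator I absorbs at the chosen
    wavelength, so [I]/[IH+] = A / (A0 - A).
    """
    if not 0 < abs_with_acid < abs_without_acid:
        raise ValueError("need 0 < A (with acid) < A0 (without acid)")
    ratio = abs_with_acid / (abs_without_acid - abs_with_acid)
    return pka_indicator + math.log10(ratio)


# Illustrative values only: 4-nitroaniline as indicator, hypothetical absorbances.
PKA_4_NITROANILINE = 0.99
for a_with_acid in (0.50, 0.10):
    h0 = hammett_h0(PKA_4_NITROANILINE, 1.00, a_with_acid)
    print(f"A = {a_with_acid:.2f}  ->  H0 = {h0:.2f}")
```

The lower the remaining absorbance of the unprotonated indicator, the lower (more acidic) the resulting H0 value.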

The catalytic system composed of [HMIM][BF4] and HBF4 showed high catalytic activity in the dimerization of α-methylstyrene (Scheme 7) to produce 4-methyl-2,4-diphenyl-1-pentene (**17**) at 60 °C. [53] The catalyst could be reused in four cycles without a loss of activity or selectivity. HBF4 alone was found to be a considerably less active catalyst, leading to a 6% conversion of α-methylstyrene under identical conditions. The IL [HMIM][BF4] showed no catalytic activity without the addition of an acid. The use of ILs with other anions or the addition of acids other than HBF4 led to lower activity and/or lower selectivity. At the same time, the cation of the IL had no significant influence: [BMIM][BF4]–HBF4, [MIM][BF4]–HBF4 and [EMIM][BF4]–HBF4 were similarly suitable compositions. The reaction temperature had a great effect on the selectivity of the reaction: at 120 °C, selective formation of **16** was observed.


**Scheme 7.** Oligomerization of α-methylstyrene

| alkene | acidic catalyst | ionic liquid | reference |
|---|---|---|---|
| isobutene | HNTf2 | [BMIM][NTf2], [BMIM][OTf], [BMIM][PF6], [BMIM][BF4], [BMIM][SbF6] | 52 |
| isobutene | HOTf | [BMIM][NTf2] | 52 |
| isobutene | CH3SO3H | [BMIM][NTf2] | 52 |
| isobutene | CF3CO2H | [BMIM][NTf2] | 52 |
| α-methylstyrene | HBF4 | [HMIM][BF4], [BMIM][BF4], [EMIM][BF4], [MIM][BF4] | 53 |

**Table 5.** Acid catalyzed oligomerization in ILs

#### **6. Ionic liquids as catalysts**

The first example of an IL-catalyzed oligomerization of low molecular weight olefins was described in 1993 (Table 6). [54] Ethene or propene was converted to a mixture of oligomeric saturated and unsaturated hydrocarbons. The [BMIM]Cl / AlCl3 system was shown to give unsaturated C4-C6 hydrocarbons with better selectivity than the more acidic [BPY]Cl / AlCl3.

As described in the previous section, highly viscous oligomeric products were found to form in acidic chloroaluminate ILs even in the absence of a nickel catalyst [23, 26], but according to the reports, this reaction could be suppressed in the presence of the alkylaluminum co-catalyst. At the same time, Stenzel et al. reported on the oligomerization of 1-alkenes in the IL with a composition of [BMIM]Cl / AlCl3 / AlEtCl2 = 1/1.1/0.1. [55] The reaction was much slower than the transition metal catalyzed oligomerization: 67% ethene conversion was achieved in 16 h at 60 °C. The total yield of oligomers decreased with increasing chain length of the monomer (to 7% in the oligomerization of 1-hexene).

Oligomerization of 1-hexene, [56, 57] 1-octene [58] and 1-decene [58] led to oligoalkylnaphthenic oils in chloroaluminate ionic liquids, such as [Et3NH]Cl / AlCl3, [PY]Cl / AlCl3 and 2,6-bis(morpholinylethyl)-4-methylphenol / AlCl3. The products were formed via an oligomerization–cyclization reaction sequence. The introduction of a titanium-containing modifier into the chloroaluminate IL leads to the formation of new catalytic centers that direct the oligomerization process toward the formation of oligomers with a higher molecular mass. [56]

Chloroaluminate ionic liquids are extremely moisture-sensitive and hydrolyze to release hydrogen chloride in contact with traces of water. Besides, they promote not only cationic oligomerization but also isomerization. As a consequence, the catalytic activity of chloroferrate(III) [59] and chlorogallate(III) ILs [60], which are less sensitive to hydrolysis, was also tested.

Acidic compositions of [Et3NH]Cl / FeCl3 and [C13H22N]Cl / FeCl3 showed high activity in isobutene oligomerization, leading to a mixture of diisobutene and triisobutene with high selectivity. [59] The conversion of isobutene increased with increasing reaction temperature, but at high temperature cracking reactions occurred. The addition of CuCl to iron(III) chloride ionic liquids increased the catalytic activity and raised the selectivity for diisobutene plus triisobutene to as much as 90% with a composition of CuCl / [Et3NH]Cl / FeCl3 = 0.25/1/1.5.

[EMIM][Ga2Cl7] was successfully used in the oligomerization of 1-pentene to produce, with high selectivity, a C20–C50 fraction that can be used as a base for synthetic automotive lubricants. [60]

Considerably different activities and selectivities of chloroaluminate and chloroferrate ILs were observed in the oligomerization of α-methylstyrene (Scheme 7). [61] With [Et3NH]Cl / 2AlCl3, 100% conversion was achieved in 5 minutes. The selectivity for 1,1,3-trimethyl-3-phenylindan (**16**) was as high as about 97% in the absence of organic solvents. [Et3NH]Cl / 2FeCl3, [BMIM]Br / 2AlCl3 and [BMIM]Br / 2FeCl3 turned out to be similarly active but less selective catalysts, producing 22-32% trimers as side products. In [BMIM]Cl / 2AlCl3, complete selectivity towards **16** was observed. [53] The high activity of the chloroaluminate ionic liquid was attributed to the strong


Stabilization of the intermediate carbenium ion (**15**, Scheme 7) by the IL with great polarity may make it possible for the positively charged carbon to attack the aromatic ring leading to the cyclodimer **16**. At the same time, in tertiary-amylalcohol as solvent, [Et3NH]Cl / 2FeCl3 oligomerizes α-methylstyrene to unsaturated dimer **17** with high selectivity. Interestingly, activity of [Et3NH]Cl / 2AlCl3 was much lower under these conditions, probably due to the

The Brønsted acidic IL, [MIM][BF4] was shown to catalyze selective dimerization of αmethylstyrene, without the addition of an acidic co-catalyst. [62] This can be explained by the higher acidity of [MIM][BF4] compared to dialkylimidazolium ILs. [63] In the dimerization

yl-2,4-diphenyl-1-pentene (**17**) was formed with 93% selectivity at 92% conversion of the substrate, while indan **16** could be obtained with 100% selectivity when the reaction temper‐

Other Brønsted acidic ILs, consisting of imidazolium cations with alkane sulfonic acid side chains, were found to be equivally active but less selective catalysts in the dimerization of αmethylstyrene. [64] At the same time, in oligomerization of isobutene, 68% and 94 % conversion and 99% selectivity to C8+C12 products, starting material for the production of high-octane gasoline blending components, could be achieved with [MIMBs][OTf] and [HIMBs][OTf] (Figure 6), respectively. The selectivity of the reaction was greatly dependent on the length of the side chain of the imidazolium cation. The use of an ionic liquid with smaller side chain led to higher selectivity for dimeric products. The increase in the catalytic activity of [HIMBs][OTf] was attributed to the higher solubility of isobutene in this IL because of the greater lipophilicity of the imidazolium cation with the longer alkyl side chain. It should be mentioned that using 1-alkenes as substrates, isomerization, instead of oligomerization, was found to be the main

A Brønsted acidic IL could also be used in catalytic amounts in a neutral IL as solvent. [MIMBs] [OTf] (Figure 6) was proved to be a more efficient acidic catalyst for isobutene dimerization in [BMIM][OTf] solvent than HNTf2 or HOTf regarding dimer selectivity (up to 88 % at 70% isobutene conversion). [52] The [MIMBs][OTf] / [BMIM][OTf] mixture retained its activity and

Despite the several advantages of ILs discussed in the previous sections, they also have some drawbacks including the difficulties in handling because of the high viscosity of some ILs and the problems for application in fixed bed reactors. Also, biphasic IL—organic systems require large amounts of the expensive ILs, which hinders industrial applications. These difficulties can be overcome by the use of supported ionic liquid phases (SILPs) prepared by the immo‐

Supported versions of the transition metal catalyzed processes, easy to use in a continuous mode reactor, were described. A buffered IL immobilized on a support material together with

C. The ionic liquid was successfully recycled six times.

The Use of Ionic Liquids in the Oligomerization of Alkenes

http://dx.doi.org/10.5772/57478

C 4-meth‐

55

reaction, a great temperature dependence on the selectivity was observed: at 60 o

better solvation of Al3+ than that of Fe3+ in the organic solvent.

ature was increased to 170 o

selectivity in ten subsequent cycles.

**7. Supported catalysts based on ILs**

bilization of ILs on solid supports. [65]

reaction.

**Table 6.** Oligomerization of alkenes with IL catalysts

Lewis acid, as well as superacidic protons, existing in the system due to the release of hydro‐ chloric acid in the presence of trace amounts of water (see Eq. 4).

Stabilization of the intermediate carbenium ion (**15**, Scheme 7) by the highly polar IL may make it possible for the positively charged carbon to attack the aromatic ring, leading to the cyclodimer **16**. In contrast, in tert-amyl alcohol as solvent, [Et3NH]Cl / 2FeCl3 oligomerizes α-methylstyrene to the unsaturated dimer **17** with high selectivity. Interestingly, the activity of [Et3NH]Cl / 2AlCl3 was much lower under these conditions, probably due to the better solvation of Al3+ than that of Fe3+ in the organic solvent.

The Brønsted acidic IL [MIM][BF4] was shown to catalyze the selective dimerization of α-methylstyrene without the addition of an acidic co-catalyst. [62] This can be explained by the higher acidity of [MIM][BF4] compared to dialkylimidazolium ILs. [63] In the dimerization reaction, the selectivity depended strongly on temperature: at 60 °C, 4-methyl-2,4-diphenyl-1-pentene (**17**) was formed with 93% selectivity at 92% conversion of the substrate, while indan **16** could be obtained with 100% selectivity when the reaction temperature was increased to 170 °C. The ionic liquid was successfully recycled six times.
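Conversion and selectivity figures such as those above can be combined into a single product yield, since yield = conversion × selectivity when both refer to the same substrate. The short sketch below applies this to the 60 °C result; the function name `product_yield` is an illustrative assumption, and the yield itself is derived here, not reported in ref. [62].

```python
def product_yield(conversion, selectivity):
    """Yield of one product as a fraction of the substrate charged.

    Both arguments are fractions between 0 and 1.
    """
    if not (0.0 <= conversion <= 1.0 and 0.0 <= selectivity <= 1.0):
        raise ValueError("conversion and selectivity must be fractions in [0, 1]")
    return conversion * selectivity


# 92% conversion with 93% selectivity for dimer 17 at 60 °C (values from the text).
yield_17 = product_yield(0.92, 0.93)
print(f"Yield of 17: {yield_17:.1%}")  # prints "Yield of 17: 85.6%"
```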

Other Brønsted acidic ILs, consisting of imidazolium cations with alkane sulfonic acid side chains, were found to be equally active but less selective catalysts in the dimerization of α-methylstyrene. [64] In the oligomerization of isobutene, on the other hand, 68% and 94% conversion and 99% selectivity to C8+C12 products, a starting material for the production of high-octane gasoline blending components, could be achieved with [MIMBs][OTf] and [HIMBs][OTf] (Figure 6), respectively. The selectivity of the reaction was greatly dependent on the length of the side chain of the imidazolium cation. The use of an ionic liquid with a shorter side chain led to higher selectivity for dimeric products. The higher catalytic activity of [HIMBs][OTf] was attributed to the higher solubility of isobutene in this IL because of the greater lipophilicity of the imidazolium cation with the longer alkyl side chain. It should be mentioned that with 1-alkenes as substrates, isomerization, instead of oligomerization, was found to be the main reaction.

A Brønsted acidic IL could also be used in catalytic amounts in a neutral IL as solvent. [MIMBs][OTf] (Figure 6) proved to be a more efficient acidic catalyst for isobutene dimerization in [BMIM][OTf] solvent than HNTf2 or HOTf with regard to dimer selectivity (up to 88% at 70% isobutene conversion). [52] The [MIMBs][OTf] / [BMIM][OTf] mixture retained its activity and selectivity in ten subsequent cycles.

| alkene | ionic liquid | reference |
|---|---|---|
| ethene | [BMIM]Cl / AlCl3; [BPY]Cl / AlCl3 | 54 |
| ethene | [BMIM]Cl / AlEtCl2 / AlCl3 | 55 |
| propene | [BMIM]Cl / AlCl3; [BPY]Cl / AlCl3 | 54 |
| propene | [BMIM]Cl / AlEtCl2 / AlCl3 | 55 |
| 1-butene | [BMIM]Cl / AlEtCl2 / AlCl3 | 55 |
| isobutene | [Et3NH]Cl / FeCl3; [C13H22N]Cl / FeCl3; [Et3NH]Cl / FeCl3 / CuCl | 59 |
| isobutene | [MIMBs][OTf]; [HIMBs][OTf] | 64 |
| isobutene | [MIMBs][OTf] / [BMIM][OTf] | 52 |
| 1-pentene | [BMIM]Cl / AlEtCl2 / AlCl3 | 55 |
| 1-pentene | [EMIM][Ga2Cl7] | 60 |
| 1-hexene | [BMIM]Cl / AlEtCl2 / AlCl3 | 55 |
| 1-hexene | [Et3NH]Cl / AlCl3; [PY]Cl / AlCl3; 2,6-bis(morpholinylethyl)-4-methylphenol/AlCl3; 2,6-bis(morpholinylethyl)-4-methylphenol/TiCl4 | 56, 57 |
| 1-octene | [Et3NH]Cl / AlCl3; [PY]Cl / AlCl3 | 58 |
| 1-decene | [Et3NH]Cl / AlCl3; [PY]Cl / AlCl3 | 58 |
| α-methylstyrene | [Et3NH]Cl / AlCl3; [Et3NH]Cl / FeCl3; [BMIM]Br / AlCl3; [BMIM]Br / FeCl3 | 61 |
| α-methylstyrene | [BMIM]Cl / AlCl3 | 53 |
| α-methylstyrene | [MIM][BF4] | 62 |
| α-methylstyrene | [MIMBs][OTf]; [HIMBs][OTf] | 64 |

**Table 6.** Oligomerization of alkenes with IL catalysts

#### **7. Supported catalysts based on ILs**

Despite the several advantages of ILs discussed in the previous sections, they also have some drawbacks, including difficulties in handling because of the high viscosity of some ILs and problems with their application in fixed bed reactors. Also, biphasic IL/organic systems require large amounts of the expensive ILs, which hinders industrial applications. These difficulties can be overcome by the use of supported ionic liquid phases (SILPs) prepared by the immobilization of ILs on solid supports. [65]

Supported versions of the transition metal catalyzed processes, easy to use in a continuous mode reactor, were described. A buffered IL immobilized on a support material together with an organometallic complex of the type **19** (Figure 8) was used in selective dimerization reactions. The OH groups of the support were modified with an aluminum halide or alkylaluminum halide (Figure 8). The immobilized buffered catalyst was formed by mixing the organometallic catalyst, the IL composition of [BMIM]Cl/AlCl3 and the coated support. The methodology was successfully used for the dimerization of propene in a fixed bed reactor. [66]


**Figure 8.** Modification of silica with an aluminum halide/alkylaluminum halide, and complex **19** (M = Ni, Fe, Co, Ti, V) used as the catalyst precursor

Immobilization of metallocene catalysts for the same reaction was achieved by similar methodology. [67]

Silica-supported SO3H-functionalized IL catalysts were used in the oligomerization of isobutene. [68] The catalysts were obtained by impregnation of silica supports with Brønsted acidic ILs with [MIMBs]+ or [BIMBs]+ cations, resulting in the formation of a solid material. It was shown that various factors, such as the properties and pretreatment of the solid support, the choice of cation and anion, as well as the reaction time and temperature, affected the outcome of the reaction considerably. Oligomerization of isobutene, carried out at 60 °C, led to the formation of C8 products with very good selectivity in each case. An increase in the temperature and/or reaction time led to an increase in the ratios of higher oligomers, with selectivities for C12+C16 products up to 85%. The catalysts comprising a trifluoromethanesulfonate anion could be reused several times without loss of activity. Total catalyst leaching over eight successive runs was 2.0% of the original load. In contrast, a quick inactivation of the catalysts obtained from ILs with hydrogensulfate anions was observed.

The TON and TOF values of the SILP catalysts were found to be ten times higher than those of the same ILs. When using the SILP catalysts, a smaller amount of the relatively expensive ionic liquid was sufficient for the reaction. The mass transport into the ionic liquid phase can be rate limiting due to the high viscosity of the ILs. This drawback can be circumvented by dispersing the ILs on support materials. Furthermore, the solid catalysts are easier to handle than the ionic liquids themselves, so separation and recovery of even small amounts of catalysts are simple.
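The tenfold difference in TON and TOF follows directly from how these quantities are defined: both are normalized to the amount of catalyst, so the same amount of converted substrate obtained with a tenth of the ionic liquid gives ten times the turnover. The sketch below illustrates this with the common definitions TON = mol substrate converted / mol catalyst and TOF = TON / time. All numerical values are hypothetical and chosen only for illustration; they are not data from the cited studies.

```python
def turnover_number(mol_substrate_converted, mol_catalyst):
    """TON: moles of substrate converted per mole of catalyst."""
    return mol_substrate_converted / mol_catalyst


def turnover_frequency(ton, hours):
    """TOF: turnover number per unit time (here in 1/h)."""
    return ton / hours


# Hypothetical example values (assumptions, not literature data).
mol_converted = 0.50   # mol isobutene converted in both experiments
hours = 2.0            # reaction time
bulk_il_mol = 0.010    # mol acidic IL used as bulk liquid catalyst
silp_il_mol = 0.001    # mol of the same IL dispersed on a support

ton_bulk = turnover_number(mol_converted, bulk_il_mol)
ton_silp = turnover_number(mol_converted, silp_il_mol)

print(f"TON bulk IL: {ton_bulk:.0f}, TOF: {turnover_frequency(ton_bulk, hours):.0f} 1/h")
print(f"TON SILP:    {ton_silp:.0f}, TOF: {turnover_frequency(ton_silp, hours):.0f} 1/h")
print(f"TON ratio SILP/bulk: {ton_silp / ton_bulk:.1f}")
```

The same comparison makes the economic argument explicit: a higher TON means less of the relatively expensive ionic liquid is needed per mole of product.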

Both mesoporous and microporous silica materials were used as solid supports. [69] A close relationship between catalytic activity and catalyst morphology was observed. Silica with mesoporous structure was able to adsorb a higher amount of the IL and produced SILP materials with higher catalytic activity at identical catalyst loadings. The formation of a stable film was a prerequisite for an unvarying selectivity of the catalyst, so impregnation at a high temperature was necessary to obtain a suitable composition, especially in case of microporous support materials.


As could be expected, immobilization of the IL led to a loss of the BET surface area of the supports. From the nitrogen adsorption/desorption isotherms of the supports and the solid catalysts, it could be concluded that the pores of mesoporous supports retained their shape during the catalyst preparation process, although the total pore volume decreased because of the active IL film on the walls of the pores. On the other hand, the microporous supports contained narrow pores that became almost totally filled with IL film during the catalyst preparation. The diffusion of isobutene into the micropores was blocked by the IL that filled the pores, so the contact surface between the IL phase and the organic phase was smaller than that of SILP catalysts with mesopores. Also, the acid capacity of the microporous material was found to be lower, in accordance with the different amounts of adsorbed ionic liquid. The smaller contact surface and lower acidity resulted in lower catalytic activity. Nevertheless, with a proper pretreatment, a stable catalyst with excellent C12 selectivity, exceeding even that of the mesoporous material, could be obtained with ILs supported on microporous silica.

Supported chloroaluminate ILs were used for the trimerization of isobutene in a feed containing a C4 mixture. [70] Both the support and the immobilization methodology were found to play a crucial role in the reaction. Catalysts obtained by mixing the IL with the silica support showed excellent oligomerization activity. In contrast, ILs immobilized on glass or molecular sieves induced only isobutane/butene alkylation. Exclusive alkylation also took place in the presence of catalysts prepared by grafting the IL on the silica. According to 29Si CP/MAS NMR measurements, in the latter case a covalent bond was formed between the cation and the support. On the contrary, the spectrum of the catalyst with oligomerization activity indicated the formation of a covalent bond between the anion of the Lewis acid and the silanol group on the surface of the support (Figure 9).

**Figure 9.** Structures of immobilized IL on silica support prepared by different methods: (a) by mixing the IL and the silica support, (b) by grafting

[BMIM]Cl was used as a template during the preparation of a β-zeolite, a support for nickel β-diimine catalysts that were highly active in the oligomerization of ethene. [71] The high regularity of the microspherical agglomerates obtained during the preparation of the support was attributed to the formation of micellar aggregates due to the IL. The nickel complex incorporated into the β-zeolite framework showed higher activity, higher C4 selectivity (up to 93.8%) and higher 1-butene selectivity (85.7% of the C4 fraction) than the homogeneous catalyst. The β-zeolite structure was found to work as a shape-selective support that inhibited re-coordination of 1-butene, thereby preventing isomerization and growth of the oligomer chain.
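Because the 1-butene selectivity is quoted relative to the C4 fraction, the overall selectivity for 1-butene among all products is the product of the two figures. The following sketch shows the arithmetic; the function name `nested_selectivity` is an illustrative assumption, and the resulting overall value is derived here rather than reported in ref. [71].

```python
def nested_selectivity(fraction_selectivity, selectivity_within_fraction):
    """Overall selectivity for a product reported as a share of a product fraction."""
    return fraction_selectivity * selectivity_within_fraction


# C4 selectivity up to 93.8%; 1-butene makes up 85.7% of the C4 fraction (from the text).
overall_1_butene = nested_selectivity(0.938, 0.857)
print(f"Overall 1-butene selectivity: {overall_1_butene:.1%}")  # prints "Overall 1-butene selectivity: 80.4%"
```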


#### **8. Industrial processes and patents based on the use of ionic liquids**

Some of the catalyst compositions, described in the previous sections, were also patented.

Researchers of the Institut Français du Pétrole (IFP) described the use of chloroaluminate ILs, composed of imidazolium or pyridinium halides and AlCl3, as solvents for nickel-catalyzed oligomerizations as early as 1987. [21] An improvement of the system by the addition of an alkylaluminum chloride co-catalyst [72] and the use of an aromatic hydrocarbon to control the acidity of the system were patented soon afterwards. [73]

In 1998, an industrial process based on chloroaluminate ILs and nickel catalysts, known as the Difasol™ technology, was commercialized. It can be considered a biphasic variant of the original homogeneous Dimersol X™ process that converts butenes to dimers. [19] As conversion is dependent on the initial concentration of butenes, the use of the latter technology is limited to C4 feeds with a minimum of 60% butene content. The liquid-liquid biphasic process can convert dilute feeds and can produce dimers with high selectivity that does not depend on conversion, owing to the low solubility of the product dimers in the ionic phase. The Difasol system is ideally suited for use as the finishing reaction section for a conventional Dimersol unit and, importantly, it can be fitted into existing Dimersol plants to give improved yields and lower catalyst consumption. The sequence increases the relative octene gain by 22-41% over the traditional single process. [74] By the use of an appropriate ligand with a coordinating nitrogen, the catalyst system containing a zerovalent nickel complex, an acid and the IL can be tuned to obtain either dimers, oligomers or polymers with good selectivity. [75]

BP Chemicals also patented catalytic systems containing nickel-complexes in buffered ILs with a composition of RnMCl3-n or RmM2Cl6-m (M: Al, Ga, B or Fe(III)), an organic halide and a weak base. [76]

Catalysts composed of a Lewis acid and an IL [77] as well as the catalytic activity of acidic chloroaluminate(III) and alkylchloroaluminate(III) ILs [78] in oligomerization of light olefins were patented in 1995. Using the latter system, the olefinic feedstock could simply be bubbled through the ionic liquid catalyst, or alternatively, in a batch process, the IL was injected into a charged autoclave. With the use of imidazolium ions with alkyl groups longer than C5, the reaction could be pushed towards polymerization.

Good trimer selectivity was obtained in oligomerization of isobutene in the presence of chloroindate ILs. Catalysts with ILs supported on silica were also used efficiently. [79]

Synthesis of polyalphaolefins, starting from 1-decene or 1-dodecene, was described in imidazolium, pyridinium, phosphonium, ammonium or sulfonium halides combined with an aluminum or gallium halide or alkylhalide. [80] A continuous mode of operation, using the same composition, was also patented by Chevron. [81] Not only the proper choice of the catalytic system but also the use of an appropriate operational mode is important to achieve optimal results. It was shown that the activity of the IL catalyst could be increased by emulsifying the IL with one or more liquid components. [82]

Efficient processes can be developed by the combination of oligomerization in ILs with other methodologies. An acidic IL catalyst consisting of an organic salt and a Lewis acid was used for the oligomerization of a light olefinic by-product fraction from a metallocene-catalyzed polyalphaolefin oligomerization process. [83]

Oligomerization of olefins present in the condensate recovered from the Fischer-Tropsch reactor can lead to high quality lube base oils. [84] However, the presence of oxygenates was found to interfere with the oligomerization of olefins catalyzed by Lewis acidic ILs. The amount of oxygenates could be reduced by contacting the feed with a hydrotreating catalyst prior to oligomerization. [85] The olefin stream formed by the Fischer-Tropsch synthesis and containing 1-alkenes with 5-18 carbon atoms could be converted into lubricating oils having a viscosity index of at least 120 and a pour point of -45 °C or less. [86]

Oligomerization of olefins together with an alkylation in the presence of isoparaffins could be carried out using the same IL catalysts. This may provide an efficient way to reduce the concentration of double bonds and at the same time to enhance the quality of the product. [87]

A Lewis acidic IL was supported on a porous solid and served as an adsorbent and activator for a Brønsted acid catalyst. The catalytic system was used in a fixed bed reactor constructed for oligomerization or combined oligomerization/alkylation reactions. [88] In order to respond to changing market demands, a process combining alkylation and oligomerization in ILs was developed that can be switched between two operational modes: one that favors the synthesis of C5+ products boiling at 137.8 °C or below, and another that favors C5+ products boiling above this temperature. [89]
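The unusual cut point of 137.8 °C is most easily understood as a round Fahrenheit value, since the standard conversion gives almost exactly 280 °F. The sketch below performs the conversion for the two temperatures mentioned in this section; that the patents specify these limits in Fahrenheit is an assumption made here, not something stated in the text.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


# Boiling-point cut between the two product slates (value from the text).
cut_point_f = celsius_to_fahrenheit(137.8)
# Pour point limit of the lubricating oils (value from the text).
pour_point_f = celsius_to_fahrenheit(-45.0)

print(f"137.8 °C = {cut_point_f:.1f} °F")  # prints "137.8 °C = 280.0 °F"
print(f"-45 °C = {pour_point_f:.1f} °F")   # prints "-45 °C = -49.0 °F"
```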

#### **9. Conclusions**


Catalysts composed of a Lewis acid and an IL [77] as well as the catalytic activity of acidic chloroaluminate(III) and alkylchloroaluminate(III) ILs [78] in oligomerization of light olefins were patented in 1995. Using the latter system, the olefinic feedstock could simply be bubbled through the ionic liquid catalyst, or alternatively, in a batch process, the IL was injected into a charged autoclave. With the use of imidazolium ions with alkyl groups longer than C5, the The use of ILs in catalytic reactions is a quickly expanding area of chemical research that has a great potential for industrial applications. Oligomerization of alkenes with ILs is one of the most promising methodologies to be implemented as industrial processes, although until now only one example, the DifasolTM technology, exists. [19]

The results presented in the previous sections show that ILs can really be used efficiently either as solvents or catalysts in oligomerization reactions. Catalytic activity of transition metal complexes was shown to be enhanced in ILs compared to organic solvents and ILs were proved to ensure easy product separation and efficient catalyst recycling. Technologies using ILs can convert not only pure alkenes but also process streams containing olefins to oligomers and can also be combined with other hydrocarbon conversion processes.

The effects of chloroaluminate ILs on nickel-catalyzed reactions are relatively well explored, and the investigations show that careful fine-tuning of the IL composition is necessary to obtain satisfactory results. However, in most cases little is known about the active transition metal species, and the choice of cations has been restricted mainly to imidazolium or pyridinium ions. There are still plenty of possibilities for the design of new IL compositions with better performance. This is even more true for ILs with oligomerization activity. Obtaining an IL composition with optimal acidity, one that achieves high conversion while avoiding polymerization or the formation of higher oligomers, is a key issue for future research.

However, one of the main drawbacks that hinders industrial applications of ILs is the still high cost of these materials. In this respect, the use of supported catalysts opens new possibilities. Besides reducing the quantity of IL necessary for the catalytic reactions, immobilization on solid supports makes it possible to use the catalysts in fixed bed reactors in continuous mode. At the same time, recent observations clearly show that, compared to non-immobilized IL catalysts, numerous additional factors influence catalytic performance. Not only the IL composition but also the choice of the support material, the porosity of the support and the immobilization methodology affect catalytic activity and selectivity.

#### **Abbreviations**

BIMBs: 1-(4-sulfobutyl)-3-butylimidazolium

BMIM: 1-butyl-3-methylimidazolium

BMPY: 1-butyl-4-methylpyridinium

BPY: 1-butyl-pyridinium

Bz: benzyl

Cy: cyclohexyl

EMIM: 1-ethyl-3-methylimidazolium

HIMBs: 1-(4-sulfobutyl)-3-hexylimidazolium

HMIM: 1-hexyl-3-methylimidazolium

IL: ionic liquid

MAO: methylaluminoxane

MIM: 1-methyl-3H-imidazolium

MIMBs: 1-(4-sulfobutyl)-3-methylimidazolium

MPYR: 1-methylpyrrolidinium

NTf2: bistrifluoromethylsulfonimide

OTf: trifluoromethanesulfonate

PAO: poly-α-olefin

PY: pyridinium

RON: research octane number

toe: tonne of oil equivalent

#### **Acknowledgements**

The present work was funded by the project TÁMOP-4.2.2.A-11/1/KONV-2012-0071, realized with the support of the Hungarian Government and the European Union, with the co-funding of the European Social Fund.

#### **Author details**

Csaba Fehér<sup>1</sup>, Eszter Kriván<sup>2</sup>, Zoltán Eller<sup>2</sup>, Jenő Hancsók<sup>2</sup> and Rita Skoda-Földes<sup>1</sup>

1 University of Pannonia, Institute of Chemistry, Department of Organic Chemistry, Hungary

2 University of Pannonia, Department of MOL Hydrocarbon and Coal Processing, Hungary

#### **References**

[1] Olah GA, Molnár A. Oligomerization and Polymerization. In: Hydrocarbon Chemistry (2nd edition). Hoboken, New Jersey: Wiley; 2003. p723-806.

[2] Olivier-Bourbigou H, Forestiere A, Saussine L, Magna L, Favre F, Hugues F. Olefin oligomerization for the production of fuels and petrochemicals. OIL GAS European Magazine 2010;2:97-102.

[3] Megatrends matter, Infineum Insight, 2012;(56):5

[4] Hancsók J, Kasza T. The importance of isoparaffins at the modern engine fuel production. In: W. Bartz (ed) Proceedings of 8th International Colloquium Fuels 2011, 8th International Colloquium Fuels, 19-20 January 2011, Stuttgart/Ostfildern, Germany: Esslingen: TAE, 2011. p.361-73.


[5] Bridsal C. The Outlook for Energy: A View to 2040. Mineral Oil Congress, Stuttgart, Germany April 9-10. 2013.

[6] BP Statistical Review, June 2013.

[7] Hancsók J, Magyar S, Holló A. Importance of isoparaffins in the crude oil refining industry. Chemical Engineering Transactions 2007;11:41-7.

[8] Hancsók J, Krár M, Magyar Sz, Boda L, Holló A, Kalló D. Investigation of the production of high cetane number biogasoil from pre-hydrogenated vegetable oils over Pt/HZSM-22/Al2O3. Microporous and Mesoporous Materials 2007;101(1-2):148-52.

[9] Krár M, Thernesz A, Tóth Cs, Kasza T, Hancsók J. Investigation of catalytic conversion of vegetable oil/gas oil mixtures. In: Halász I. (ed) Silica and Silicates in Modern Catalysis, Kerala, India: Transworld Research Network; 2010. p 435-55.

[10] Pölczmann G, Hancsók J. Production of high stability base oil from Fischer-Tropsch wax, 18th International Colloquium Tribology, January 10-12.2012, Book of Synopsis. Stuttgart/Ostfildern, Germany, 2012. p71.

[11] Pölczmann G, Valyon J, Szegedi Á, Mihályi RM, Hancsók J. Hydroisomerization of Fischer–Tropsch wax on Pt/AlSBA-15 and Pt/SAPO-11 catalysts. Topics in Catalysis 2011;54:1079–83.

[12] Auer J, Borsi Z, Hancsók J, Lakics L, Lenti M, Nemesnyik Á, Valasek I. Tribology 2. Lubricants and the investigations. Budapest: Tribotechnik Ltd.; 2003.

[13] Olivier-Bourbigou H, Magna L, Morvan D. Ionic liquids and catalysis: Recent progress from knowledge to applications. Applied Catalysis A 2010;373:1–56.

[14] Pham TPT, Cho CW, Yun YS. Environmental fate and toxicity of ionic liquids: A review. Water Research 2010;44:352–72.

[15] Nasirov FA, Novruzova FM, Aslanbeili AM, Azizov AG. Ionic liquids in catalytic processes of transformation of olefins and dienes (Review). Petroleum Chemistry 2007;47(5):309–17.

[16] Chiappe C, Rajamani S. Structural effects on the physico-chemical and catalytic properties of acidic ionic liquids: An overview. European Journal of Organic Chemistry 2011;(28):5517-39.

[17] Johnson KE, Pagni RM, Bartmess J. Brønsted Acids in Ionic liquids: fundamentals, organic reactions, and comparisons. Monatshefte für Chemie 2007;138:1077–101.

[18] Cui X, Zhang S, Shi F, Ma X, Lu L, Deng Y. The influence of the acidity of ionic liquids on catalysis. ChemSusChem 2010;3:1043-7.

[19] Forestière A, Olivier-Bourbigou H, Saussine L. Oligomerization of monoolefins by homogeneous catalysts. Oil & Gas Science and Technology – Rev. IFP 2009;64(6):649-67.

[20] Nikiforidis I, Görling A, Hieringer W. On the regioselectivity of the insertion step in nickel complex catalyzed dimerization of butene: A density-functional study. Journal of Molecular Catalysis A 2011;341:63–7.

[21] Chauvin Y, Commereuc D, Hirschauer A, Hugues F, Saussine L. Procédé de dimérisation ou de codimérisation d'oléfines. FP 2611700, 1987.

[22] Pei L, Liu X, Gao H, Wu Q. Biphasic oligomerization of ethylene with nickel complexes immobilized in organochloroaluminate ionic liquids. Applied Organometallic Chemistry 2009;23:455–59.

[23] Chauvin Y, Gilbert B, Guibard I. Catalytic dimerization of alkenes by nickel complexes in organochloroaluminate molten salts. Journal of the Chemical Society, Chemical Communications 1990;1715-6.

[24] Chauvin Y, Einloft S, Olivier H. Catalytic dimerization of propene by nickel-phosphine complexes in 1-butyl-3-methylimidazolium chloride/AlEtxCl3-x, (x=0, 1) ionic liquids. Industrial & Engineering Chemistry Research 1995;34:1149-55.

[25] Einloft S, Dietrich FK, de Souza RF, Dupont J. Selective two-phase catalytic ethylene dimerization by NiII complexes/AlEtCl2 dissolved in organoaluminate ionic liquids. Polyhedron 1996;15(19):3257-59.

[26] Thiele D, de Souza RF. The role of aluminum species in biphasic butene dimerization catalyzed by nickel complexes. Journal of Molecular Catalysis A 2007;264:293–8.

[27] Simon LC, Dupont J, de Souza RF. Two-phase n-butenes dimerization by nickel complexes in molten salt media. Applied Catalysis A 1998;175:215-20.

[28] Chauvin Y, Olivier H, Wyrvalski CN, Simon LC, de Souza RF. Oligomerization of n-butenes catalyzed by nickel complexes dissolved in organochloroaluminate ionic liquids. Journal of Catalysis 1997;165:275-8.

[29] de Souza RF, Thiele D, Monteiro AL. Effect of phosphine–CS2 adducts on the nickel-catalyzed butenes oligomerization in organochloroaluminate imidazolium ionic liquids. Journal of Catalysis 2006;241:232–4.

[30] McGuinness DS, Mueller W, Wasserscheid P, Cavell KJ, Skelton BW, White AH, Englert U. Nickel(II) heterocyclic carbene complexes as catalysts for olefin dimerization in an imidazolium chloroaluminate ionic liquid. Organometallics 2002;21:175-81.

[31] Bernardo-Gusmão K, Trevisan Queiroz LF, de Souza RF, Leca F, Loup C, Réau R. Biphasic oligomerization of ethylene with nickel–1,2-diiminophosphorane complexes immobilized in 1-n-butyl-3-methylimidazolium organochloroaluminate. Journal of Catalysis 2003;219:59–62.

[32] de Souza RF, Leal BC, de Souza MO, Thiele D. Nickel-catalyzed propylene dimerization in organochloroaluminate ionic liquids: Control of the isomerization reaction. Journal of Molecular Catalysis A 2007;272:6–10.


[33] Wasserscheid P, Eichmann M. Selective dimerisation of 1-butene in biphasic mode using buffered chloroaluminate ionic liquid solvents — design and application of a continuous loop reactor. Catalysis Today 2001;66:309–16.

[34] Eichmann M, Keim W, Haumann M, Melcher BU, Wasserscheid P. Nickel catalyzed dimerization of propene in chloroaluminate ionic liquids: Detailed kinetic studies in a batch reactor. Journal of Molecular Catalysis A 2009;314:42–8.

[35] Ellis B, Keim W, Wasserscheid P. Linear dimerisation of but-1-ene in biphasic mode using buffered chloroaluminate ionic liquid solvents. Chemical Communications 1999;337–8.

[36] Chauvin Y, Di Marco-Van Tiggelen F, Olivier H. Determination of aluminium electronegativity in new ambient temperature acidic molten salts based on 3-butyl-1-methylimidazolium chloride and AlEt3-xClx (x=0-3) by 1H nuclear magnetic resonance spectroscopy. Journal of the Chemical Society, Dalton Transactions 1993;1009–11.

[37] Gilbert B, Olivier-Bourbigou H, Favre F. Chloroaluminate ionic liquids: from their structural properties to their applications in process intensification. Oil & Gas Science and Technology – Rev. IFP 2007;62(6):745-59.

[38] Lecocq V, Olivier-Bourbigou H. Biphasic Ni-catalyzed ethylene oligomerization in ionic liquids. Oil & Gas Science and Technology – Rev. IFP 2007;62(6):761-73.

[39] Dötterl M, Alt HG. Heavy metal with a heavy impact: olefin dimerization reactions in triphenylbismuth buffered chloroaluminate ionic liquids. ChemCatChem 2011;3(11):1799-804.

[40] Wasserscheid P, Gordon CM, Hilgers C, Muldoon MJ, Dunkin IR. Ionic liquids: polar, but weakly coordinating solvents for the first biphasic oligomerisation of ethene to higher α-olefins with cationic Ni complexes. Chemical Communications 2001;1186–7.

[41] Wasserscheid P, Hilgers C, Keim W. Ionic liquids—weakly-coordinating solvents for the biphasic ethylene oligomerization to α-olefins using cationic Ni-complexes. Journal of Molecular Catalysis A 2004;214:83–90.

[42] Song KM, Gao HY, Liu FS, Pan J, Guo LH, Zai SB, Wu Q. Ionic liquid-supported bis-(salicylaldimine) nickel complexes: robust and recyclable catalysts for ethylene oligomerization in biphasic solvent system. Catalysis Letters 2009;131:566–73.

[43] Thiele D, de Souza RF. Oligomerization of ethylene catalyzed by iron and cobalt in organoaluminate dialkylimidazolium ionic liquids. Catalysis Letters 2010;138:50–5.

[44] Ligabue RA, Dupont J, de Souza RF. Liquid–liquid two-phase cyclodimerization of 1,3-dienes by iron-nitrosyl dissolved in ionic liquids. Journal of Molecular Catalysis A 2001;169:11–7.

[45] Thiele D, de Souza RF. Biphasic ethylene oligomerization using bis(imino)pyridine cobalt complexes in methyl-butylimidazolium organochloroaluminate ionic liquids. Journal of Molecular Catalysis A 2011;340:83–8.

[46] Olivier H, Laurent-Gérot P. Homogeneous and two-phase dimerization of olefins catalyzed by tungsten complexes. The role of imido ligands and Lewis acids. Journal of Molecular Catalysis A 1999;148:43–8.

[47] Peng J, Li J, Qiu H, Jiang J, Jiang K, Mao J, Lai G. Dimerization of styrene to 1,3-diphenyl-1-butene catalyzed by palladium–Lewis acid in ionic liquid. Journal of Molecular Catalysis A 2006;255:16–8.

[48] Silva SM, Suarez PAZ, de Souza RF, Dupont J. Selective linear dimerization of 1,3-butadiene by palladium compounds immobilized into 1-n-butyl-3-methyl imidazolium ionic liquids. Polymer Bulletin 1998;40:401–5.

[49] Crowhurst L, Welton T. Brønsted acidity in ionic liquids. EUCHEM Conference on Molten Salts and Ionic Liquids: 16-22 September 2006, Hammamet, Tunisia Abstract No. 1980.

[50] Robert T, Magna L, Olivier-Bourbigou H, Gilbert B. A comparison of the acidity levels in room-temperature ionic liquids. Journal of the Electrochemical Society 2009;156:F115-F121.

[51] Mihichuk LM, Driver GW, Johnson KE. Brønsted acidity and the medium: fundamentals with a focus on ionic liquids. ChemPhysChem 2011;12:1622–32.

[52] Magna L, Bildé J, Olivier-Bourbigou H, Robert T, Gilbert B. About the acidity-catalytic activity relationship in ionic liquids: Application to the selective isobutene dimerization. Oil & Gas Science and Technology – Rev. IFP 2009;64(6):669-679.

[53] Shen ZL, He XJ, Hu BX, Sun N, Mo WM. Study on selective dimerization of α-methylstyrenes promoted by ionic liquids. Catalysis Letters 2008;122:359–65.

[54] Goledzinowski M, Birss VI. Oligomerization of low molecular weight olefins in ambient temperature molten salts. Industrial & Engineering Chemistry Research 1993;32:1795-7.

[55] Stenzel O, Brüll R, Wahner UM, Sanderson RD, Raubenheimer HG. Oligomerization of olefins in a chloroaluminate ionic liquid. Journal of Molecular Catalysis A 2003;192:217–22.

[56] Azizov AG, Kalbalieva ES, Alieva RV, Bektashi NR, Aliev BM. Oligomerization of hexene-1 in the presence of catalytic systems on the basis of ionic liquids. Petroleum Chemistry 2010;50(1):56–64.

[57] Azizov AH, Aliyeva RV, Kalbaliyeva ES, Ibrahimova MJ. Selective synthesis and the mechanism of formation of the oligoalkylnaphthenic oils by oligocyclization of 1-hexene in the presence of ionic-liquid catalysts. Applied Catalysis A 2010;375:70–7.


[58] Ibragimova MD, Samedova FI, Gasanova RZ, Azmamedov NG, Eivazov EZ. Synthesis of oligooctene and oligodecene oils in the presence of chloroaluminate ionic liquids. Petroleum Chemistry 2007;47(1):61–6.

[59] Yang S, Liu Z, Meng X, Xu C. Oligomerization of isobutene catalyzed by iron(III) chloride ionic liquids. Energy & Fuels 2009;23:70–3.

[60] Atkins MP, Seddon KR, Swadźba-Kwaśny M. Oligomerisation of linear 1-olefins using a chlorogallate(III) ionic liquid. Pure and Applied Chemistry 2011;83(7):1391–406.

[61] Cai Q, Li J, Bao F, Shan Y. Tunable dimerization of α-methylstyrene catalyzed by acidic ionic liquids. Applied Catalysis A 2005;279:139–43.

[62] Wang H, Cui P, Zou G, Yang F, Tang J. Temperature-controlled highly selective dimerization of α-methylstyrene catalyzed by Brønsted acidic ionic liquid under solvent-free conditions. Tetrahedron 2006;62:3985–8.

[63] Cui X, Zhang S, Shi F, Ma X, Lu L, Deng Y. The influence of the acidity of ionic liquids on catalysis. ChemSusChem 2010;3:1043-1047.

[64] Gu Y, Shi F, Deng Y. SO3H-functionalized ionic liquid as efficient, green and reusable acidic catalyst system for oligomerization of olefins. Catalysis Communications 2003;4:597–601.

[65] Riisager A, Fehrmann R, Haumann M, Wasserscheid P. Supported ionic liquids: versatile reaction and separation media. Topics in Catalysis 2006;40:91-102.

[66] Dötterl M, Schmidt R, Engelmann T, Denner C, Alt HG. Selective olefin dimerization with supported metal complexes activated by alkylaluminum compounds or ionic liquids. US 2011/0004039, 2011.

[67] Engelmann T, Denner C, Alt HG, Schmidt R. Heterogeneous dimerization of alpha-olefins with activated metallocene complexes. US 2011/0004036, 2011.

[68] Fehér C, Kriván E, Hancsók J, Skoda-Földes R. Oligomerisation of isobutene with silica supported ionic liquid catalysts. Green Chemistry 2012;14:403-409.

[69] Fehér C, Kriván E, Kovács J, Hancsók J, Skoda-Földes R. Support effect on the catalytic activity and selectivity of SILP catalysts in isobutene trimerization. Journal of Molecular Catalysis A 2013;372:51–57.

[70] Liu S, Shang J, Zhang S, Yang B, Deng Y. Highly efficient trimerization of isobutene over silica supported chloroaluminate ionic liquid using C4 feed. Catalysis Today 2013;200:41–48.

[71] Mignoni ML, de Souza MO, Pergher SBC, de Souza RF, Bernardo-Gusmão K. Nickel oligomerization catalysts heterogenized on zeolites obtained using ionic liquids as templates. Applied Catalysis A 2010;374:26–30.

[72] Chauvin Y, Commereuc D, Guibard I, Hirschauer A, Olivier H, Saussin L. Non-aqueous liquid composition with an ionic character and its use as a solvent. US 5,104,840, 1992.

[73] Chauvin Y, Einloft S, Olivier H. Catalytic process for the dimerization of olefins. US 5,550,306, 1996.

[74] Commereuc D, Forestiere A, Hughes F, Olivier-Bourbigou H. Sequence of processes for olefin oligomerization. US 6,444,866, 2002.

[75] Lecocq V, Olivier Bourbigou H. Catalyst composition for dimerizing, co-dimerizing, oligomerizing and polymerizing olefins. US 6,951,831, 2005.

[76] Wasserscheid P, Keim W. Catalyst comprising a buffered IL and hydrocarbon conversion process, e.g. oligomerisation. WO9847616, 1998.

[77] Goledzinowski M, Birss VI. Oligomerization of low molecular weight olefins in ambient temperature melts. US 5,463,158, 1995.

[78] Abdul-Sada AK, Ambler PW, Hodgson PKG, Seddon KR, Stewart NJ. Ionic liquids. WO9521871, 1995.

[79] Earle MJ, Karkkainen J, Plechkova NV, Tomaszowska A, Seddon KR. Oligomerisation. US 2008/0306319, 2008.

[80] Hope KD, Driver MS, Harris TV. High viscosity polyalphaolefins prepared with ionic liquid catalyst. US 6,395,948, 2002.

[81] Hope KD, Stern DA, Twomey DW, Collins JB. Method for manufacturing high viscosity polyalphaolefins using ionic liquid catalysts. US 7,351,780, 2008.

[82] Bergmann LH, Hope KD, Benham EA, Stern DA. Method and system to add high shear to improve an ionic liquid catalyzed chemical reaction. US 8,163,856, 2012.

[83] Patil AO, Bodige S. Process for synthetic lubricant production. US 2009/0156874, 2009.

[84] Harris TV, Cheng, Lei GD. Process for the oligomerization of olefins in Fischer-Tropsch derived feeds. WO 2005/005352, 2005.

[85] Johnson DR, Harris TV, Driver MS. Hydrotreating of Fischer-Tropsch derived feeds prior to oligomerization using an ionic liquid catalyst. WO 2005/005353, 2005.

[86] Atkins MP, Smith MR, Ellis B. Lubricating oils. EP 0 791 643, 1997.

[87] Elomari S, Krug R. Process for the formation of a superior lubricant or fuel blendstock by ionic liquid oligomerization of olefins in the presence of isoparaffins. WO 2007/078595, 2007.

[88] Hommeltoft SI, Zhou Z. Supported ionic liquid reactor. US 2011/0318233, 2011.

[89] Timken K, Winter S, Lacheen HS, Hommeltoft SI. Market driven alkylation or oligomerization process. US 2011/0226669, 2011.


**Chapter 3**


> © 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **Silk Fiber — Molecular Formation Mechanism, Structure-Property Relationship and Advanced Applications**

Xinfang Liu and Ke-Qin Zhang

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57611

#### **1. Introduction**

Silk fibers spun by several species of arthropods have existed naturally for hundreds of millions of years. The ecological functions of silk fibers are closely related to their properties. For example, orb-weaving spiders produce a variety of silks with diverse properties, each tailored to a particular task (Figure 1) [1]. Most arthropod species produce silks for building structures that capture prey and protect their offspring against environmental hazards [2]. The most widely investigated silks are spider silk, in particular the dragline silk produced by the major ampullate glands, and the cocoon silk of *Bombyx mori* (*B. mori*). As a result of ongoing evolutionary optimization, silks from silkworms and spiders exhibit outstanding mechanical properties, such as strength, extensibility and toughness, which outperform those of most other natural and man-made fibers (Table 1) [3, 4]. Owing to their smooth texture, luster and strength, silks from silkworms have been used extensively in apparel and fashion applications for thousands of years [5]. Silks from spiders have also been utilized throughout history, for example as sutures and fishing equipment in ancient Greece and Australasia.

In contrast to the petrochemical-based synthetic polymers commonly used today, such as polyethylene, which is formed by polymerization of ethylene at high temperature and pressure or in the presence of metal-based catalysts, *B. mori* and spiders spin fibers from a highly concentrated, water-based protein solution under mild conditions [6, 7]. Owing to current trends in the exploration of natural biological materials and the demand for environmentally friendly (green) materials, investigation of the applications of silk fibers has steadily gained prominence. Silk fibers are emerging as candidates for applications even in non-apparel areas, due in part to recognition of their extraordinary mechanical properties as well as their biocompatibility and biodegradability. Currently, the promotion of the silkworm as a bio-factory for producing silk fibers suited to innovative and advanced functional biological applications is a major trend. Compared to silkworm silks, the potential commercial applications of many spider silks are still extremely limited, for reasons such as the difficulty of high-density spider farming, which is hampered by the cannibalistic nature of most spiders. Additionally, only ~12 m of silk can be obtained from a complete spider web, which is extremely small in comparison to the 600 to 900 m of silk yielded by one silkworm cocoon [8].
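The yield gap quoted above can be made concrete with a short calculation. The sketch below is only illustrative: it takes the ~12 m per web and 600 to 900 m per cocoon figures from the text at face value, and the constant and function names are hypothetical labels introduced here, not terms from the cited work.

```python
# Silk lengths quoted in the text, in metres (approximate values).
WEB_SILK_M = 12.0                 # silk obtainable from one complete spider web
COCOON_SILK_M = (600.0, 900.0)    # range of silk yielded by one silkworm cocoon


def cocoon_to_web_ratio(cocoon_m: float, web_m: float = WEB_SILK_M) -> float:
    """Return how many webs' worth of silk a single cocoon provides."""
    return cocoon_m / web_m


# Evaluate the ratio at both ends of the quoted cocoon range.
low, high = (cocoon_to_web_ratio(m) for m in COCOON_SILK_M)
print(f"One cocoon yields about {low:.0f} to {high:.0f} times the silk of one web")
```

On these figures, a single cocoon supplies the silk of roughly 50 to 75 webs, which helps explain why silkworm silk dominates commercial use.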

Artificial spinning is the most promising method of promoting the application of silk fibers, as it can output sufficient man-made fibers cost-effectively and with specific tailored proper‐ ties. Remarkable efforts for silk fiber reproduction via reconstituted/recombinant silk fibroin are currently underway [9, 10]. Reconstituted silk protein is derived from *B. mori* cocoons and the degummed *B. mori* silk fibers, they are soluble in concentrated LiBr aqueous solutions, yielding reconstituted silk protein solutions after dialysis [9]. Recent advances in transgenic technology enable the high level expression of recombinant proteins. Spider silk proteins have been produced by other organisms so that recombinant spider silk protein might be suitable for creation of artificial silk threads or other applications. Host organisms include bacteria, yeasts, animal cells and plants [3]. In order to biomimic native silks, the reconstituted or recombinant proteins used to spin artificial silks should possess amino acid similar to the ones found in native sites for sequence and composition. However, the properties of synthetic silk fibers currently do not meet the standards of native silks yet, as the composition, hierarchical structure and production conditions of natural silks all reportedly affect their mechanical properties [11]. Consequently, a profound knowledge of the natural formation process, chemical composition, relationships of structure and properties of silk fibers seems therefor

Silk Fiber — Molecular Formation Mechanism, Structure-Property Relationship and Advanced Applications

http://dx.doi.org/10.5772/57611


**Figure 1.** Schematic overview of different silk types produced by female orb-weaving spiders (*Araneae*). Each silk type (highlighted in red) is tailored for a specific purpose. (Reprinted from Ref. [1]. Copyright 2011, with permission from Elsevier.)


produce silk fibers suited to innovative and advanced functional biological applications is a major trend. Compared with silkworm silks, the potential commercial applications of many spider silks remain extremely limited, for reasons such as the difficulty of high-density spider farming, which is hampered by the cannibalistic nature of most spiders. Additionally, only ~12 m of silk can be obtained from a complete spider web, which is extremely small in comparison to the 600 to 900 m of silk yielded by one silkworm cocoon [8].

Artificial spinning is the most promising route to wider application of silk fibers, as it can produce man-made fibers in sufficient quantity, cost-effectively and with specifically tailored properties. Remarkable efforts to reproduce silk fibers from reconstituted or recombinant silk fibroin are currently underway [9, 10]. Reconstituted silk protein is derived from *B. mori* cocoons and degummed *B. mori* silk fibers; these are soluble in concentrated aqueous LiBr solutions and yield reconstituted silk protein solutions after dialysis [9]. Recent advances in transgenic technology enable high-level expression of recombinant proteins. Spider silk proteins have been produced in other organisms, so recombinant spider silk protein might be suitable for creating artificial silk threads or for other applications. Host organisms include bacteria, yeasts, animal cells and plants [3]. To mimic native silks, the reconstituted or recombinant proteins used to spin artificial silks should have an amino acid sequence and composition similar to those of native silk proteins. However, the properties of synthetic silk fibers do not yet meet the standards of native silks, as the composition, hierarchical structure and production conditions of natural silks all reportedly affect their mechanical properties [11]. Consequently, a profound knowledge of the natural formation process, the chemical composition, and the structure-property relationships of silk fibers is imperative.

**Table 1.** Comparison of mechanical properties of natural silks and other synthetic fibers[a]. ([a] Data taken from refs. [3, 4]. [b] RH, relative humidity.)

| **Fibers** | **Stiffness (GPa)** | **Strength (GPa)** | **Extensibility (%)** | **Toughness (MJ·m⁻³)** |
|---|---|---|---|---|
| *B. mori* cocoon silk | 7 | 0.6 | 18 | 70 |
| *B. mori* reeled silk | 15 | 0.7 | 28 | 150 |
| *A. diadematus* silk (dragline) | 10 | 1.1 | 27 | 180 |
| *A. diadematus* silk (flagelliform) | 0.003 | 0.5 | 270 | 150 |
| Wool (at 100% RH[b]) | 0.5 | 0.2 | 5 | 60 |
| Elastin | 0.001 | 0.002 | 15 | 2 |
| Nylon fiber | 5 | 0.95 | 18 | 80 |
| Kevlar 49 fiber | 130 | 3.6 | 2.7 | 50 |
| Carbon fiber | 300 | 4 | 1.3 | 25 |
| High-tensile steel | 200 | 1.5 | 0.8 | 6 |
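The units in Table 1 can be related by a rough consistency check. The sketch below assumes a fiber that stays linear-elastic until it breaks, so that toughness is the triangular area under the stress-strain curve; this shortcut and the function name `linear_toughness_mj_per_m3` are illustrative assumptions, not a method taken from refs. [3, 4].

```python
# Rough check of Table 1: for a fiber that stays linear-elastic until it
# breaks, toughness is the area of a triangle under the stress-strain curve.
# The linear-elastic shortcut is an assumption made for illustration only.

def linear_toughness_mj_per_m3(strength_gpa: float, extensibility_pct: float) -> float:
    """Return 0.5 * strength * strain, expressed in MJ/m^3."""
    strain = extensibility_pct / 100.0      # percent -> dimensionless strain
    stress_mpa = strength_gpa * 1000.0      # GPa -> MPa (1 MPa equals 1 MJ/m^3)
    return 0.5 * stress_mpa * strain

# Strength (GPa) and extensibility (%) as listed in Table 1.
brittle_fibers = {
    "Carbon fiber": (4.0, 1.3),
    "High-tensile steel": (1.5, 0.8),
}

for name, (strength, extensibility) in brittle_fibers.items():
    estimate = linear_toughness_mj_per_m3(strength, extensibility)
    print(f"{name}: ~{estimate:.0f} MJ/m^3")
```

For these two stiff, brittle materials the estimate lands close to the tabulated toughness (25 and 6 MJ·m⁻³). For dragline silk the same shortcut gives roughly 150 MJ·m⁻³ against the 180 listed, which is what one would expect if the real stress-strain curve is not a straight line.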

Thanks to recent developments in modern analytical techniques, significant progress has been made in the structural characterization of silk. Techniques that provide molecular information about silk include microscopic methods (atomic force microscopy (AFM), scanning and transmission electron microscopy (SEM and TEM), and scanning transmission x-ray microscopy (STXM)) and synchrotron x-ray methods (wide-angle x-ray diffraction (WAXD) and small-angle x-ray scattering (SAXS) combined with synchrotron radiation). Solid-state nuclear magnetic resonance (SS-NMR) is a powerful technique because it allows the study of the molecular structure and dynamics of semi-crystalline and amorphous materials. Raman and FTIR spectroscopy can reveal the dominant conformations present in a fiber. Raman microspectroscopy can be used to determine quantitative parameters characterizing the molecular structure (orientation and conformation, amino acid composition) of micrometer-sized biological samples. In this chapter, we provide an overview of the current understanding of silk fiber structure gained with these analytical methods, then describe in detail the structure-property relationships and the formation processes of silk fiber. Additionally, we explore the material morphologies and applications of these silk fibers.

#### **2. The structure-property relationship of silk fibers**

The structure-property relationship is one of the most intriguing 'mysteries' of silk fibers. Various studies have suggested a strong connection between the structures of silk fibers and their physical (e.g., mechanical) properties. Understanding the structure-property relationship requires background knowledge of the local structure, including the components and composition of silk fiber, the conformation and orientation of the constitutive units with respect to the fiber, and so on.

#### **2.1. The structure and composition of** *B. mori* **and spider dragline silk fibers**

In principle, the full range of properties of silk fibers can be calculated from their structural morphology and chemical composition. On the macroscopic level, the morphological structures of *B. mori* silk and spider dragline silk are very similar, as both possess a core-shell structure (Figure 2) [12]. The silk thread diameter varies across types and species. For example, the two core brins of *B. mori* silk fiber coated with sericin form a fiber about 20 µm in width. Spider dragline silks have a diameter of 3-5 µm and, to date, have been described as containing only one protein monofilament.


**Figure 2.** Examples of silk fibers produced by silkworms and spiders and a schematic illustration. (Reprinted from Ref. [12]. Copyright 2008, with permission from Elsevier.)

Silk fibers are polyamino acid-based fibrous proteins. In contrast to synthetic polymers, which are usually homopolymers or copolymers consisting of one or several simple monomers, in biopolymers such as silk the primary sequence and the linkages between the monomers are arranged in a strictly controlled manner and are responsible for the formation of a well-defined structure [13]. A range of microscopy methods, including SEM, TEM, and AFM, have been used to investigate the microstructure of silk fiber [14-19]. The results confirmed that silk fibers are composed of well-oriented bundles of nanofibrils. Generally, the coatings of silk fibers function as glue. The sericin coating, which accounts for 25-30% of the weight of *B. mori* silk fiber, glues the two core brins together. However, recent studies have provided evidence that the coating may also act as a fungicidal or bactericidal agent [20]. It may also have a role in the complex spinning process. Studies have demonstrated the presence of microvoids in both silk fibers [21, 22]. Microvoids are thought to develop during the final stages of the spinning process, in which the viscous aqueous protein solution is stretched or loaded into fibers.


Fibroin and spidroin are the two major families of silk proteins: fibroin is the chief component of silkworm silk fiber, while spidroin (also called spider fibroin) is its analogue in spider silk fiber. *B. mori* silk fibroin is composed of two protein chains, a heavy chain (H-fibroin) with a molecular weight of approximately 350 kDa and a light chain (L-fibroin, Mw ~ 26 kDa), covalently linked by a disulfide bond at the carboxy-terminus of the two subunits [23-25] (Figure 3a). The main proteinaceous constituents of spider dragline silk are typically two major ampullate spidroins, MaSp1 and MaSp2, which are estimated to be 250-350 kDa or larger [26-29]. A common feature of fibroins is their high content of alanine and glycine residues.

**Figure 3.** (a) Silkworm fibroin consisting of a covalently linked highly repetitive heavy chain and non-repetitive light chain. (b) Spider silk spidroins consist of a large repetitive core domain flanked by non-repetitive amino- (NRN) and carboxy-terminal (NRC) domains. (Figure slightly modified with permission from Ref. [30]. Copyright 2011 Wiley Periodicals, Inc.)

#### **2.2. Hierarchical structure of fibroin in** *B. mori* **and spider silk fibers**

The primary sequence plays an important role in defining the basic properties of the material. Despite being quite different in their primary structure, *B. mori* fibroin heavy chain and spider spidroins share fundamental similarities. Both have a large central core of repeated modular units (Figure 4), flanked by nonrepetitive amino- (NRN) [31, 32] and carboxy- (NRC) [29] terminal domains (Figure 3). The light chain of *B. mori* fibroin has a standard amino acid composition and a nonrepeating sequence, and plays only a marginal role in the fiber [33]. The organization of the repeating modular units can differ significantly, as seen in the sequences of different protein types. The *B. mori* fibroin heavy chain, the major component of *B. mori* fibroin, has an amino acid sequence composed of a highly repetitive (Gly-Ala)n sequence motif and tyrosine-rich domains [34]. In MaSp1, the modular units mainly consist of (Ala)n sequence motifs followed by several GGX motifs, with X representing a variable amino acid. In MaSp2, the GGX motif is replaced by the GPGXX motif, which contains more proline residues [26, 27]. The modular units are repeated up to several hundred times in the central core of *B. mori* fibroin heavy chain and spider spidroins, such that they largely determine the macroscopic properties of the fibers. The highly conserved sequences of the nonrepetitive amino- and carboxy-terminal domains are essential for fiber formation and are expected to be of functional relevance [35-39]. Moreover, analysis of the hydropathicity of these fibroins reveals hydrophobic and hydrophilic counterparts: the central region of the protein is mostly hydrophobic, while the nonrepetitive amino- and carboxy-terminal domains are more hydrophilic [40].
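The repeat motifs named above can be located in a one-letter amino acid sequence with simple pattern matching. The following is a minimal sketch: the minimum repeat counts in the patterns, the function name `find_motifs` and the toy sequence are all assumptions made for illustration, and the sequence is not taken from any of the database entries shown in Figure 4.

```python
import re

# Regular expressions for the repeat motifs named in the text.
# The minimum repeat counts (n >= 2 for Gly-Ala, n >= 4 for Ala) are
# illustrative assumptions, not thresholds from the cited literature.
MOTIFS = {
    "(Gly-Ala)n": re.compile(r"(?:GA){2,}"),
    "(Ala)n": re.compile(r"A{4,}"),
    "GGX": re.compile(r"GG[^G]"),
    "GPGXX": re.compile(r"GPG.."),
}

def find_motifs(sequence: str) -> dict[str, list[str]]:
    """Return the non-overlapping matches of each motif in a one-letter sequence."""
    return {name: pattern.findall(sequence) for name, pattern in MOTIFS.items()}

# Hypothetical toy sequence that mixes all four motif types.
toy_sequence = "GAGAGAGSGGYGGLGGQAAAAAAGPGQQGPGGY"

for name, hits in find_motifs(toy_sequence).items():
    print(f"{name}: {len(hits)} hit(s) {hits}")
```

Each pattern is scanned independently, so a residue can be counted under more than one motif (for example, the last GPGXX repeat of the toy sequence also contains a GGX match). A real analysis would need to decide how to resolve such overlaps.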


**Figure 4.** Typical amino acid sequences of the repetitive core of *B. mori* fibroin heavy chain, minor ampullate spidroins and major ampullate spidroins. The highly repetitive (Gly-Ala)n and (Ala)n sequence motifs are highlighted in red. The accession numbers for the sequences are P05790, P19837, P46804, AAC47009 and AAC47010, respectively.

The primary structural motifs have a preferred secondary structure and give rise to the structures higher up the hierarchy. NMR, circular dichroism (CD), IR and Raman spectroscopy are commonly used to obtain chemical, conformational, and orientational information on the secondary structures of silk proteins [41-51]. There are three major conformations of silk proteins: the random coil, the *α*-helix and the *β*-sheet [52-54]. Following the approach of Porter, the complex secondary structure of silk proteins can be reduced to fractions of ordered and disordered material, which are roughly equivalent to the crystalline and non-crystalline phases of silk proteins, respectively [55, 56].


The solid threads are characterized by well-oriented *β*-sheets, the dominant secondary structure in silk fibers [57-59]. The first Raman spectrum of *B. mori* silk fiber clearly showed the predominance of *β*-sheet, matching results previously obtained with other techniques [60]. The total *β*-sheet content is around 50% for *B. mori* silk, which matches the proportion of the (Gly-Ala)n motif [42, 43, 46]. Therefore, it is widely accepted that the highly repetitive (Gly-Ala)n sequence motif of *B. mori* fibroin adopts an antiparallel *β*-sheet conformation, namely the crystalline form silk II. The *β*-sheet crystallite is a molecular network built by crosslinking *β*-sheet segments of several neighboring silk protein molecules [61]. It can be indexed to a monoclinic space group with rectangular unit cell parameters of *a*=0.938 nm, *b*=0.949 nm and *c*=0.698 nm for *B. mori* silk [62]. Drummy *et al.* investigated *B. mori* silk fiber bundles using wide-angle X-ray scattering (WAXS), including the amorphous halo in the WAXS pattern [63]. They concluded that silk fiber is made up of crystalline regions connected by amorphous (non-crystalline) regions, each comprising approximately 50% of the total structure. These features agree with the structural model proposed earlier [64].

It is widely believed that the (Ala)n domains in spider dragline silk fibers adopt a *β*-sheet conformation, predominantly antiparallel, and are organized into crystallites [41, 61, 65]. Based on the x-ray diffraction pattern, the *β*-sheet crystallites can be indexed to an orthogonal unit cell with parameters of *a*=1.03 nm, *b*=0.944 nm and *c*=0.695 nm for *Nephila* spider dragline silk [66]. The reported content of *β*-sheet conformation in *Nephila* spider dragline silks ranges from 30% to 40% [46, 48]. This value is much higher than the 20% average degree of crystallinity (amount of *β*-sheet crystallites) reported by XRD [67-69]. The lower crystallinity of spider dragline silk appears to correlate closely with a lower content of *β*-sheets that are ordered highly enough to diffract [61].
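To make the link between unit cell parameters and a diffraction pattern concrete, the interplanar spacing of an orthogonal cell follows from 1/d² = h²/a² + k²/b² + l²/c², and Bragg's law converts it to a scattering angle. The sketch below uses the *Nephila* cell parameters quoted above; the choice of reflections and the Cu Kα wavelength (0.15418 nm) are assumptions for illustration and are not taken from ref. [66].

```python
import math

def d_spacing_nm(h, k, l, a, b, c):
    """Interplanar spacing (nm) of the (hkl) planes in an orthogonal unit cell."""
    inv_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if inv_d_squared == 0:
        raise ValueError("(000) is not a lattice plane")
    return 1.0 / math.sqrt(inv_d_squared)

def two_theta_deg(d_nm, wavelength_nm=0.15418):
    """First-order Bragg scattering angle 2-theta in degrees.

    The default wavelength (Cu K-alpha) is an assumed value.
    """
    return 2.0 * math.degrees(math.asin(wavelength_nm / (2.0 * d_nm)))

# Cell parameters (nm) quoted in the text for Nephila dragline silk.
NEPHILA_CELL = dict(a=1.03, b=0.944, c=0.695)

# Illustrative reflections only; the indexing used in the cited work may differ.
for hkl in [(2, 0, 0), (1, 2, 0), (0, 0, 2)]:
    d = d_spacing_nm(*hkl, **NEPHILA_CELL)
    print(hkl, f"d = {d:.3f} nm, 2-theta = {two_theta_deg(d):.1f} deg")
```

The same two functions apply to the rectangular *B. mori* cell by substituting its *a*, *b* and *c* values.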

The mechanical properties of silk fibers are expected to depend critically on the characteristics of the *β*-sheet crystallites. Significant characteristics include the crystallinity, the size (aspect ratio, distribution) and dispersion of the crystallites, the intercrystallite distance, and the degree of orientation in the silk fiber. The size of the *β*-sheet crystallites in *B. mori* silk fiber, determined by quantitative examination of dark-field TEM images, was found to be 20 to 170 nm in the axial direction and 1 to 20 nm in the lateral direction, and the crystallites were uniformly distributed throughout the fiber matrix [14]. Smaller crystallite sizes, measured from LVTEM and WAXS images, are a reasonable match to those calculated from Scherrer analysis of the x-ray fiber pattern [63]; the sizes are broadly distributed, from only a few nanometers to tens of nanometers in length. The average crystallite size of *Nephila* spider dragline silks is calculated to be approximately 2 × 5 × 7 nm based on X-ray diffraction patterns of single fibers and fiber bundles [70]. The correlation length related to the intercrystallite distance along the fiber axis is ~13-18 nm, as measured by SAXS for *Nephila* spider dragline silk. This value is consistent with the distance between the *β*-sheets inferred from the MaSp1 sequence [19]. Quantitative determination of the orientation of the secondary structure of silk protein molecules shows that the *β*-sheets are aligned parallel to the fiber axis [57] and that the *β*-sheet crystallites representing the highly ordered fraction are well oriented along the silk fiber [70-72]. Additionally, the *β*-sheets of *B. mori* silks are slightly better oriented than those of dragline silks, consistent with *B. mori* silk being more crystalline than spider dragline silk [46].
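Scherrer analysis, mentioned above, estimates a crystallite dimension from the width of a diffraction peak through L = Kλ / (β cos θ), where β is the peak's full width at half maximum in radians. The sketch below is a generic illustration: the shape factor K = 0.9, the Cu Kα wavelength and the example peak are assumed values, not data from ref. [63], and instrumental broadening is ignored.

```python
import math

def scherrer_size_nm(fwhm_deg, two_theta_deg, wavelength_nm=0.15418, shape_factor=0.9):
    """Estimate crystallite size (nm) from one diffraction peak.

    fwhm_deg      -- full width at half maximum of the peak, in degrees 2-theta
    two_theta_deg -- peak position, in degrees 2-theta
    The default wavelength (Cu K-alpha) and shape factor are assumptions.
    """
    beta = math.radians(fwhm_deg)               # peak width in radians
    theta = math.radians(two_theta_deg / 2.0)   # Bragg angle
    return shape_factor * wavelength_nm / (beta * math.cos(theta))

# Hypothetical peak: position 20 degrees, width 2 degrees.
size = scherrer_size_nm(fwhm_deg=2.0, two_theta_deg=20.0)
print(f"Estimated crystallite size: {size:.1f} nm")
```

Because size scales inversely with peak width, the broad reflections typical of silk correspond to crystallites only a few nanometers across, in line with the sizes reported in the paragraph above.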

The non-crystalline regions are often described as amorphous, poorly oriented, or randomly coiled sections of the peptide. The structural organization of the amorphous phase is not yet well understood. The existence of *β*-turn or *β*-spiral and helical conformations has been suggested for the amorphous domains [42, 46, 65, 73, 74]. Tyrosine residues, on average, may form distorted *β*-turns and distorted *β*-sheets in the amorphous matrix of *B. mori* silk, as characterized by 13C solid-state NMR [42, 75]. The Gly-rich regions in spider dragline silk have been described as an amorphous rubber on the basis of X-ray diffraction studies [76]. The precise structure of the GlyGlyX motif in MaSp1 has been somewhat controversial. Recent NMR studies provided evidence that the GlyGlyX motif adopts a less ordered helical-type structure or distorted *β*-sheets [48, 73]. However, for MaSp2 and for ADF 3 and 4 (the fibroins of the major ampullate dragline silk of the spider *Araneus diadematus*), the GlyProGlyXX repeat has been proposed to form a *β*-turn or spiral structure, stabilized by interchain hydrogen bonding [77]. The molecular chains in the amorphous phase are often considered to be randomly oriented. However, Raman [46] and SS-NMR [73] studies reported that the protein backbones in the amorphous regions of silk fibers are not randomly oriented but exhibit a certain degree of orientation along the fiber axis, albeit much less than the *β*-sheet crystallites. Moreover, the amorphous phase is more highly oriented in spider silks than in *B. mori* silk.
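One common way to put a number on "degree of orientation along the fiber axis" is the Hermans orientation function, f = (3⟨cos²φ⟩ − 1)/2, where φ is the angle between a chain segment and the fiber axis. It equals 1 for perfect alignment, 0 for random orientation and −0.5 for perpendicular alignment. The sketch below is a generic illustration with made-up angle sets; it is not the specific order-parameter analysis used in the Raman [46] or SS-NMR [73] studies.

```python
import math

def hermans_orientation(angles_deg):
    """Hermans orientation function from segment angles to the fiber axis (degrees)."""
    if not angles_deg:
        raise ValueError("at least one angle is required")
    mean_cos2 = sum(math.cos(math.radians(a)) ** 2 for a in angles_deg) / len(angles_deg)
    return 0.5 * (3.0 * mean_cos2 - 1.0)

# Hypothetical angle sets, chosen only to contrast the two phases.
crystallite_like = [2, 5, 8, 4, 6]       # narrow spread around the fiber axis
amorphous_like = [20, 45, 60, 35, 50]    # much broader spread

print(f"narrow spread: f = {hermans_orientation(crystallite_like):.2f}")
print(f"broad spread:  f = {hermans_orientation(amorphous_like):.2f}")
```

A weakly oriented amorphous phase would show a small positive f, whereas well-aligned crystallites approach 1.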

Recent computational approaches have been useful in modeling the nanostructure of silk. Molecular modeling integrates the available structural information and has been used to characterize the nanostructure of silk. Based on a bottom-up molecular computational approach using replica exchange molecular dynamics, Keten *et al*. reported atomic-level structures of MaSp1 and MaSp2 proteins from the *Nephila clavipes* spider dragline silk sequence. The simulations showed that poly-alanine segments in silk have an extremely high propensity for forming distinct and orderly *β*-sheet crystallites. Previous molecular dynamics simulations of poly-alanine aggregation also suggested that antiparallel orientation in the hydrogen-bonding direction and parallel stacking in the side-chain direction lead to stable *β*-sheets [78]. Glycine-rich regions are less orderly, predominantly forming helical-type structures and *β*-turns in amorphous domains. The density of hydrogen bonds in amorphous regions is lower than in *β*-sheet crystallites [79, 80]. These results are consistent with the available experimental evidence and may contribute to an improved understanding of the source of silk's strength and toughness.
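In replica exchange molecular dynamics, copies of the system are simulated at different temperatures and neighboring copies periodically attempt to swap configurations. The sketch below shows only the standard Metropolis acceptance rule for such a swap, min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/(k_B T). It is a generic illustration and not the code or parameters used by Keten *et al*.; the energies and temperatures are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis probability of exchanging configurations between two replicas.

    energy_i, energy_j -- potential energies (kcal/mol) of the current configurations
    temp_i, temp_j     -- temperatures (K) at which the two replicas are simulated
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

# Hypothetical neighboring replicas at 300 K and 310 K.
p = swap_probability(energy_i=-1000.0, energy_j=-995.0, temp_i=300.0, temp_j=310.0)
print(f"swap accepted with probability {p:.2f}")
```

Swaps that move a lower-energy configuration to the colder replica are always accepted, which is how high-temperature replicas help the cold ones escape local minima while sampling folded structures.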


Spider silk and *B. mori* silk feature unique physical properties – such as superior mechanical properties in terms of toughness (the amount of energy absorbed before breakage) (Table 1). So far, the maximum strength of spider dragline silk (dragline of *Caerostris darwini*) up to 1.7 GPa, which exceeds that of steel (1.5 GPa), is in the range of high-tech materials [1]. Due to its great extensibility, spider dragline silks have three times of toughness of man-made synthetic fibers like Kevlar 49 [3, 89]. Typical *B. mori* silk is presumed to be weaker and less extensible than spider dragline silk. However, when forcibly silking from immobilized silkworms

**2.3. The physical (mechanical) properties of silk fibers**

the arrow. (Adapted with permission from Ref. [81]. Copyright 2011, WILEY-VCH.)

representing the highly ordered fraction are well oriented along the silk fiber [70-72]. Additionally, the *β*-sheets of *B. mori* silks are slightly better oriented than those of dragline silks, consistent with the fact that *B. mori* silk is more crystalline than spider dragline silk [46].

The non-crystalline regions are often described as amorphous, poorly oriented, or randomly coiled sections of the peptide. The structural organization of the amorphous phase is not yet well understood. The existence of *β*-turn or *β*-spiral and helical conformations has been suggested for the amorphous domains [42, 46, 65, 73, 74]. In the amorphous matrix of *B. mori* silk, tyrosine residues may, on average, form distorted *β*-turns and distorted *β*-sheets, as characterized by 13C solid-state NMR [42, 75]. The Gly-rich regions in spider dragline silk have been described as amorphous rubber on the basis of X-ray diffraction studies [76]. The precise structure of the GlyGlyX motif in MaSp1 has been somewhat controversial. Recent NMR studies provided evidence that the GlyGlyX motif adopts a less ordered helical-type structure or distorted *β*-sheets [48, 73]. For MaSp2 and for ADF 3 and 4 (the fibroins of the major ampullate dragline silk of the spider *Araneus diadematus*), however, the GlyProGlyXX repeat has been proposed to form a *β*-turn or spiral structure, which is stabilized by interchain hydrogen bonding [77]. The molecular chains in the amorphous phase are often considered to be randomly oriented. However, Raman [46] and solid-state NMR (SS-NMR) [73] studies reported that the protein backbones in the amorphous regions of silk fibers are not randomly oriented but show a certain degree of orientation along the fiber axis, albeit much less than that of the *β*-sheet crystallites. Moreover, the level of orientation of the amorphous phase is higher for spider silks than for *B. mori* silk.

Recent computational approaches have been useful in modeling the nanostructure of silk. Molecular modeling integrates the known structural information and has been used to characterize the nanostructure of silk. Using a bottom-up molecular computational approach based on replica exchange molecular dynamics, Keten *et al*. reported atomic-level structures of the MaSp1 and MaSp2 proteins from the *Nephila clavipes* spider dragline silk sequence. Their results showed that poly-alanine segments in silk have an extremely high propensity to form distinct and orderly *β*-sheet crystallites. Previous molecular dynamics simulations of poly-alanine aggregation also suggested that anti-parallel orientation in the hydrogen-bonding direction and parallel stacking in the side-chain direction lead to stable *β*-sheets [78]. Glycine-rich regions are less orderly, predominantly forming helical-type structures and *β*-turns in the amorphous domains. The density of hydrogen bonds in the amorphous regions is lower than in the *β*-sheet crystallites [79, 80]. These results are consistent with the available experimental evidence and may contribute to an improved understanding of the source of silk's strength and toughness.

According to the prevalent characterizations mentioned above, silk fiber is considered a semicrystalline polymer with a hierarchical structure, in which highly oriented *β*-sheet crystallites connected by an amorphous matrix are organized into nanofibrils or fibrillar entities [81] (Figure 5). However, it has been proposed that silk contains a third phase, or interphase, consisting of weakly oriented *β*-sheet regions [68, 82-84] or oriented amorphous domains [85, 86]. Recent NMR and IR studies of silk used hydrogen-deuterium (H-D) exchange to differentiate among the three structures. The data revealed that the D2O-inaccessible *β*-sheets are associated with crystallites, while the D2O-accessible fraction consists of amorphous domains and interphase *β*-sheets [82-84]. In the case of spider dragline silk, crystalline components that are larger and more poorly oriented are detectable alongside the commonly observed ~2 nm *β*-sheet crystallites [69, 87, 88]. These larger ordered regions have been explained as '*Non Periodic Lattice (NPL)*' crystals, which form as a result of statistical matches between compatible sequences on adjacent molecular chains. It was also revealed that there is no discrete phase boundary between the *β*-sheet crystallite regions and the amorphous domains. The presence of the interphase has also been deduced from STXM studies on *Nephila* dragline silk, which indicate that highly oriented and unoriented domains are surrounded by a moderately oriented matrix [85]. In summary, the molecular structure of silk at the nanoscopic level is still debated.

**Figure 5.** (a) The hierarchical structure of spider dragline and silkworm silk fiber. Both spider dragline and fibroin are composed of numerous minute fibrils, which are separated into crystalline and amorphous segments. (b) The minute fibrils in silkworm *B. mori* silk as revealed in an AFM image (scale bar: 150 nm). The silk fiber direction is indicated by the arrow. (Adapted with permission from Ref. [81]. Copyright 2011, WILEY-VCH.)

#### **2.3. The physical (mechanical) properties of silk fibers**

Spider silk and *B. mori* silk feature unique physical properties, such as superior mechanical properties in terms of toughness (the amount of energy absorbed before breakage) (Table 1). So far, the maximum strength reported for spider dragline silk (the dragline of *Caerostris darwini*) is up to 1.7 GPa, which exceeds that of steel (1.5 GPa) and is in the range of high-tech materials [1]. Owing to their great extensibility, spider dragline silks are three times as tough as man-made synthetic fibers such as Kevlar 49 [3, 89]. Typical *B. mori* silk is presumed to be weaker and less extensible than spider dragline silk. However, when silk is forcibly drawn from immobilized silkworms at a certain spinning speed, the mechanical properties of the resulting *B. mori* silk improve greatly, to a level comparable to that of the toughest spider silk [4].


The mechanical properties of silk fibers can be described by stress-strain curves, which are generated by stretching the fibers at a specified strain rate. The stress is expressed as force per cross-sectional area, and the strain is the elongation normalized by the original length. Typical stress-strain curves for *B. mori* silkworm silk and spider dragline silk show elastic behavior followed by plastic deformation [90]. The linear portion of the curve, up to the yield point, is the elastic region; its slope defines Young's modulus [91], a measure of the stiffness of the fiber. After the yield point, the fiber deforms plastically and the stress-strain profile undergoes sudden changes in slope. This behavior indicates that a major structural transition from a rubberlike to a glassy state occurs in the fiber [92-94]. These characteristics have driven scientists to explore the structural origin of the high performance of silk fiber, with the goal of obtaining templates for designing novel materials with comparable properties.
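The quantities defined above can be extracted from a measured curve with a few lines of code. The following is a minimal Python sketch, not a procedure taken from the cited studies: the function name `analyze_stress_strain`, the user-chosen `elastic_limit`, and the bilinear example curve are illustrative assumptions. It estimates Young's modulus as the least-squares slope of the region taken to be elastic, and toughness as the area under the whole curve (energy absorbed per unit volume).

```python
import numpy as np


def analyze_stress_strain(strain, stress, elastic_limit):
    """Return (Young's modulus, toughness) from a stress-strain curve.

    strain        : dimensionless strain values (e.g. 0.05 for 5 %)
    stress        : stress values in Pa, same length as strain
    elastic_limit : strain up to which the curve is assumed linear
                    (chosen by the user; an assumption of this sketch)
    """
    strain = np.asarray(strain, dtype=float)
    stress = np.asarray(stress, dtype=float)

    # Young's modulus: slope of a straight line fitted to the elastic region.
    elastic = strain <= elastic_limit
    modulus = np.polyfit(strain[elastic], stress[elastic], 1)[0]

    # Toughness: area under the full curve by the trapezoidal rule (J/m^3).
    toughness = np.sum(0.5 * (stress[1:] + stress[:-1]) * np.diff(strain))
    return float(modulus), float(toughness)


# Hypothetical bilinear curve, for illustration only (not measured silk data):
# slope 10 GPa up to 2 % strain, then 2 GPa up to a breaking strain of 20 %.
strain = np.linspace(0.0, 0.2, 201)
stress = np.where(strain <= 0.02, 10e9 * strain, 0.2e9 + 2e9 * (strain - 0.02))

modulus, toughness = analyze_stress_strain(strain, stress, elastic_limit=0.015)
print(f"Young's modulus: {modulus / 1e9:.2f} GPa")
print(f"Toughness: {toughness / 1e6:.1f} MJ/m^3")
```

For real data, the choice of `elastic_limit` matters: including post-yield points in the fit lowers the estimated modulus, so the limit should be set below the observed yield point.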

#### **2.4. The structure-property relationship of silk fibers**

Evidently, the attractive macroscopic mechanical properties of silk fiber can be ascribed to structural effects. Most attention has focused on the nanometer scale: predominantly the primary and secondary structure, as well as the organization and arrangement of the protein molecules. In terms of the primary structure of silk proteins, the amino acid composition, the sequential order, and the number of motifs in each module are important for the mechanical properties of the final fibers. For example, the primary structure of *Antheraea pernyi* (*A. pernyi*) silk fibroin, produced by the 'wild' silkworm, and especially its motifs, is more like that of major ampullate spidroins than that of *B. mori* fibroin [34, 95]. It has been found that such 'wild' silkworm silk displays mechanical properties similar to those of spider dragline silk [96]. In addition, six novel silk proteins from *Mygalomorphae* (tarantulas) do not possess high tensile strength and elasticity, owing to the absence of the four motifs found in major ampullate spidroins [97]. Several studies have tried to establish correlations between specific peptide segments and the mechanical functions of silk fibers [81, 89, 96, 98-100]. For example, glycine and proline play important, specific roles in silk, as they modulate the backbone hydration and conformational order of the peptides and thereby govern the behavior of the fibers [101]. The proline-containing motif, GPGXX, was hypothesized to account for the elasticity of silk [102]. However, the primary structure of silk proteins alone does not explain the properties of silk fibers: even when the protein has the right primary structure, artificially spun silk fiber is far inferior to the native one [103].

The mechanical properties of silk fibers also depend crucially on spinning conditions such as humidity, temperature, and reeling speed [19, 104]. Variations in crystallinity and alignment can be found within a silk fiber because of variations in the reeling speed at which the sample was collected. These variations affect the formation of the *β*-sheet crystals and thereby the mechanical properties. As the reeling speed increases, the content of *β*-sheet structures in silk rises, and the orientation of both the crystalline and amorphous fractions increases [105, 106]. Additionally, the tensile properties (breaking stress and modulus) of silk fibers increase, while the breaking strain decreases [4, 19, 107]. It has also been shown that reducing the crystal size by increasing the reeling speed has a significant influence on the toughness and ultimate strength of the fiber [19]. Recently, Buehler *et al*. investigated *β*-sheet nanocrystals using the sequence from *B. mori* silk as a model system. They examined the key mechanical parameters of the silk *β*-sheet nanocrystals as a function of size and concluded that small nanocrystals are predominantly loaded in uniform shear, so that the hydrogen bonds in the *β*-sheet strands break by stick-slip motion with enhanced energy dissipation, leading to greater stiffness and fracture resistance of silk [108]. Molecular models of silk protein *β*-sheet crystals with varying *β*-strand lengths were also tested mechanically in molecular dynamics simulations, and *β*-strands of around eight residues in length were found to be optimal [109]. In summary, primary structure and spinning conditions both contribute to the observed structures higher up the hierarchy that are associated with the mechanical properties of the silk fiber.


Modern analytical technologies and tools have steadily advanced experimental studies of the structure of silk fibers, as described above. However, there is still no consensus on the hierarchical structure of silk at the nanometer scale. Several models have been proposed to interpret the structure-property relationship of silk fibers. The first was Termonia's early model [64], which hypothesized that silk is a hydrogen-bonded amorphous phase with embedded stiff crystal domains that act as multifunctional cross-links and create a thin layer of high modulus in the amorphous regions. In this model, the stiff hydrogen bonds are the first to break and give the fiber its high initial modulus, while the dynamic rubber phase redistributes the deformation field, which allows the nonlinear large-strain deformation to be predicted. The properties simulated with this theoretical model reproduce the combination of high initial modulus, strength, and toughness of dragline silk fiber. However, the model's theoretical modulus of 160 GPa for rigid *β*-sheet crystals, which assumes fully extended crystals, is much higher than the moduli of *β*-sheet crystals obtained from experiments and molecular dynamics simulations [110-112]. Porter and Vollrath *et al*. included morphological parameters by simplifying the complex structural arrangement of silk fiber into ordered and disordered fractions, which are best quantified by the number of amide-amide hydrogen bonds between adjacent chains. Each fraction imparts its own attributes (such as stiffness and energy dissipation) to the property profile. This model predicts a range of silk tensile properties in good agreement with experimental observations [56, 113]. Krasnov *et al*. established a viscoelastic model for *B. mori* silk in the form of a standard three-parameter Maxwell model, in which the elastic modulus is split into amorphous and crystalline elastances. The elastic modulus is then connected in parallel to the elements that represent the relaxation processes of the amorphous regions, which are observed in cyclic tensile stretching measurements on a single silkworm silk fiber. In this way they separated the mechanical properties of the crystalline and amorphous phases and described the interplay between the mechanical properties and the morphology of silk. The model fits the experimental test results well [111]. Buehler *et al*. presented a simple coarse-grained model in which a combination of a *β*-sheet nanocrystal and a semi-amorphous region, representing the fundamental building block of the silk fiber, is modeled by beads connected in series via multilinear springs. The mechanical behavior of these domains was simulated with this model, and the resulting stress-strain curve displays the characteristic shape observed for silk, providing a fundamental understanding of silk's mechanics. In general, the amorphous regions contribute to the elasticity of the material: they unravel first when silk is stretched, which leads to its large extensibility. Conversely, the highly ordered crystalline regions play a major role in determining the strength and stiffness of silks [114].
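As a point of reference for the viscoelastic description, the textbook three-parameter solid in its Maxwell form (a spring in parallel with a spring-dashpot arm) has a closed-form stress relaxation under a step strain. The sketch below implements only that generic relation. It is not the parameterization of Krasnov *et al*., it does not reproduce their split of the modulus into amorphous and crystalline elastances, and all names and numerical values are assumptions chosen for illustration.

```python
import math


def relaxation_stress(t, strain, e_parallel, e_maxwell, viscosity):
    """Stress (Pa) at time t (s) after a step strain is applied at t = 0,
    for the generic three-parameter solid in Maxwell form.

    e_parallel : modulus of the lone parallel spring (Pa)
    e_maxwell  : modulus of the spring inside the Maxwell arm (Pa)
    viscosity  : viscosity of the dashpot inside the Maxwell arm (Pa*s)
    """
    tau = viscosity / e_maxwell  # relaxation time of the Maxwell arm (s)
    return strain * (e_parallel + e_maxwell * math.exp(-t / tau))


# Hypothetical parameters, for illustration only (not fitted to silk data).
for t in (0.0, 10.0, 100.0):
    sigma = relaxation_stress(t, strain=0.01, e_parallel=5e9,
                              e_maxwell=5e9, viscosity=5e10)
    print(f"t = {t:6.1f} s   stress = {sigma / 1e6:.1f} MPa")
```

In this form the instantaneous response is governed by the sum of the two spring moduli, and the stress decays toward the value set by the parallel spring alone, which is the qualitative behavior a cyclic or relaxation test would probe.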


Multi-scale experimental and simulation analyses are the key to improving our systematic understanding of how structure and properties are linked. The mechanical mechanism at the macroscopic scale, namely that of the fibril, including its morphology, the consequences for mechanical behavior, and the mechanistic interplay with the nanostructure of silk, has also been elucidated [115-117]. At the same time, many experiments have assessed the effect of structural changes on the mechanical deformation of silk [118-121]. When mechanical loads are applied to the fibers, the conformation, reorientation, crystallite size, and other structural characteristics are monitored to explain the structure-property relationships.

**Figure 6.** A schematic model demonstrating how the silkworm and spider dragline fibers respond when they are subjected to stretching. There are two components in the alanine-rich regions of spider dragline silk: β-crystallites and intramolecular β-sheets. (Adapted with permission from Ref. [81]. Copyright 2011, WILEY-VCH.)

The experimental and computational investigations described above have explored the mechanical properties of *B. mori* silkworm silks and spider draglines at different levels of the structural hierarchy, from sequence to crystallites to fibrils to fibers, as well as the effect of structural changes on the overall mechanical behavior of silk fibers. In particular, a closer analysis of the mechanical response shows that spider draglines typically undergo strain hardening in the post-yield region [81]. On the basis of the '*β*-sheet splitting' mechanism [122], the occurrence of strain hardening in spider dragline has been explained in terms of structural factors. Spider dragline silk can acquire extra toughness as a strain-hardening material by breaking intramolecular *β*-sheets. *B. mori* silkworm silk, on the other hand, has far fewer intramolecular *β*-sheets in the amorphous region; it is therefore less extensible and exhibits only strain weakening after the yield point (Figure 6).


Furthermore, unlike *B. mori* silk and the other types of spider silk, the mechanical properties of spider dragline silks are greatly influenced by water. When an unconstrained dragline silk fiber is immersed in water or exposed to a relative humidity greater than 60%, the thread starts to swell radially, doubling in diameter and shrinking to half of its original length [123]. This process, known as supercontraction, is another interesting property of spider silk. A number of research groups have used different experimental techniques to understand supercontraction and its underlying mechanisms [124-126]. It is assumed that supercontraction results from the reorientation of hydrogen bonds within the chains of the protein molecules and is accompanied by the release of prestress [127-130]. Some researchers have attributed supercontraction mainly to the proline content of the MaSp2 protein. Notably, the proline content does not correlate with the mechanical performance of dry spider dragline silk fibers from different species, but it does influence the mechanical properties of wetted dragline silks [99, 126, 131, 132]. Recently, Guan *et al*. discussed the roles of the MaSp1 and MaSp2 proteins in supercontraction and quantified a maximum contraction of about 13% linked to the disordered component of the MaSp1 protein. The remaining supercontraction, up to a total of about 30%, is thus linked to the intrinsically disordered, proline-containing fraction of the MaSp2 protein [133]. After supercontraction, the fibers are referred to as supercontracted fibers and are commonly used to study the structure-property relationship of silk fiber [123, 134].
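The figures quoted above lend themselves to a quick arithmetic check. The sketch below only manipulates the numbers stated in the text; treating the fiber as a uniform cylinder is an added assumption, and the function and variable names are illustrative.

```python
def cylinder_volume_ratio(diameter_factor, length_factor):
    """Volume change of a uniform cylinder whose diameter and length are
    scaled by the given factors (volume scales as diameter**2 * length)."""
    return diameter_factor ** 2 * length_factor


# Diameter doubles and length halves (values quoted from [123]).
volume_ratio = cylinder_volume_ratio(2.0, 0.5)

# Contraction attributed to each protein (values quoted from [133]).
total_contraction_pct = 30.0   # approximate total supercontraction
masp1_contraction_pct = 13.0   # approximate maximum linked to MaSp1
masp2_contraction_pct = total_contraction_pct - masp1_contraction_pct

print(f"Volume ratio (wet/dry, cylinder assumption): {volume_ratio}")
print(f"Contraction left to MaSp2: {masp2_contraction_pct} percentage points")
```

Under the cylinder assumption, a fiber that doubles in diameter while halving in length occupies about twice its dry volume, and about 17 percentage points of the roughly 30% contraction fall to the MaSp2 fraction.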

#### **3. Silk protein assembly and silk fiber formation mechanism**

The remarkable mechanical properties of silk fibers have spawned great interest in determining their origin. Systematic studies of the natural spinning of silk fibers have revealed a highly sophisticated hierarchical process that transforms soluble silk protein into solid fibers with specific mechanical and functional properties. Although much is already known about the characteristics of the silk proteins and the silk fibers themselves, the process by which silk assembles and is spun into fibers is yet to be resolved. A detailed knowledge of silk fiber formation is critical for the biomimetic production of tough silk-like fibers.

#### **3.1. Natural spinning process for** *B. mori* **silk and spider draglines**

In nature, silk proteins are secreted and stored in the glands until they are processed into fibers. Morphological and histological studies demonstrate that the silk glands of the *B. mori* silkworm are a pair of tubes that join before the spinneret. Each gland can be divided into three parts: posterior, middle, and anterior [135]. Fibroin is synthesized in the posterior division, where it is present as a weak gel. The secreted protein is transported to the middle division, where sericin is synthesized and accumulates as a shell around the fibroin. As water is removed through the cell wall of the gland in the middle division, the highly concentrated, gel-like fibroin undergoes a gel-sol transition and forms a concentrated protein solution of 30 wt% [136]. Notably, this highly concentrated liquid protein, often referred to as the spinning dope, displays nematic liquid crystal properties [137-139]. The spinning dope is exposed to elongational flow as it moves forward in the anterior division. The shear force increases along the anterior division and the spinneret, which orients the liquid-crystalline dope. The liquid-crystalline spinning dope is then converted into a fiber containing water-insoluble silk II. This process is accompanied by extrusion through the spinneret into air, where the residual water evaporates. Another remarkable feature of fiber spinning is the stretching force produced by the repeated drawing back of the silkworm's head, which orients the protein molecules along the silk fiber.


The major gland responsible for the dragline silk of the spider *Nephila clavipes* consists of a long tail, a wider sac called the ampulla, and a spinning duct leading to the spinneret [140]. Each division of the gland has a distinct function in fiber formation. For instance, a highly viscous silk protein solution of ~50% (w/v) is secreted from the A-zone of the gland, which comprises the tail and two thirds of the sac. Further compounds forming the shell of the fiber may arise in the B-zone, which occupies the remainder of the sac. As in the *B. mori* silkworm, the viscosity of the spider's liquid crystalline protein decreases and the spinning dope moves forward into the spinning duct, where the orientation of the liquid crystalline protein into a fiber begins. Because the duct tapers, the shear force increases along its length, and the stress forces generated in the drawdown process bring the protein molecules into alignment. The protein molecules then join together through hydrogen bonds to give the final fiber with its anti-parallel *β*-sheet structure. As the silk protein molecules aggregate and crystallize, they become more hydrophobic, inducing the loss of water from the surface of the silk fiber [141-143].
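The effect of the tapering duct can be illustrated with a simple continuity argument. The sketch below is not taken from the cited studies: it assumes an incompressible fluid flowing at a constant volumetric rate through a duct of circular cross-section whose radius shrinks linearly, and it uses the Newtonian (Poiseuille) expression for the wall shear rate, although real spinning dope is non-Newtonian. The function names, radii and flow rate are hypothetical values chosen only for illustration.

```python
import math


def mean_velocity(flow_rate, radius):
    """Mean axial velocity (m/s) from continuity: v = Q / (pi * r^2)."""
    return flow_rate / (math.pi * radius ** 2)


def newtonian_wall_shear_rate(flow_rate, radius):
    """Nominal wall shear rate (1/s) for Poiseuille flow: 4Q / (pi * r^3)."""
    return 4.0 * flow_rate / (math.pi * radius ** 3)


def taper_profile(r_in, r_out, steps):
    """Radii along a duct that narrows linearly from r_in to r_out."""
    return [r_in + (r_out - r_in) * i / (steps - 1) for i in range(steps)]


# Hypothetical illustration values, not measurements from a silk gland.
Q = 1.0e-12                              # volumetric flow rate, m^3/s
radii = taper_profile(50e-6, 5e-6, 4)    # 50 um narrowing to 5 um

for r in radii:
    print(
        f"r = {r * 1e6:5.1f} um  "
        f"v = {mean_velocity(Q, r):.3e} m/s  "
        f"wall shear rate = {newtonian_wall_shear_rate(Q, r):.3e} 1/s"
    )
```

Under these assumptions a tenfold reduction in radius raises the mean velocity a hundredfold and the nominal wall shear rate a thousandfold, which is the qualitative trend described for the spinning duct.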

#### **3.2. Silk protein assembly and silk fiber formation mechanism on a structural view**

The formation of a solid fiber from soluble silk proteins is a remarkable process involving complex biochemical and physical changes. Several assembly models for silk spinning, such as the liquid spinning theory [136] and the micelle theory [144], have been proposed, but the details remain to be elucidated. To understand the mechanisms of silk protein assembly and fiber formation, the structure of the proteins stored in the *B. mori* silkworm gland and in the major ampullate gland of the spider must be clarified. *In vivo*, freshly secreted fibroin first adopts silk I (the crystalline form of *B. mori* silk fibroin found before the spinning process) and random-coil conformations [145]. Silk I is less stable than silk II: attempts to study its secondary structure by X-ray diffraction, electron diffraction or SS-NMR readily cause silk I to convert to silk II. Silk I therefore remains poorly understood. Most investigations of its structure have been based on model building of peptides such as (Gly-Ala)<sub>n</sub> [146, 147]. Comparison of these models with limited experimental data has resulted in a number of conflicting models of the structure of silk I. Recently, the structure of silk I has been proposed to be a repeated *β*-turn type II-like structure [148, 149]. The secreted dragline proteins are mainly natively unfolded within the gland and consist of random-coil and polyproline-II helix-like structures [150-152]. There is evidence that the polyalanine motifs form a polyproline-II helix-like structure. In particular, the polyproline-II conformation may be important for maintaining the highly concentrated spinning dope, since the extended polyproline-II structure could prevent the formation of intramolecular hydrogen bonds. Additionally, the polyproline-II helix in spider fibroin readily transforms into a *β*-sheet structure because the two conformations have similar dihedral angles.


It has been reported that *B. mori* fibroins and spider spidroins usually show a micellar-like structure [16, 144] owing to their amphiphilic sequence, in which short alternating hydrophilic and hydrophobic amino acid stretches are flanked by larger hydrophilic terminal regions [153, 154]. The hydrophilic blocks interspersed among the hydrophobic blocks prevent premature *β*-sheet formation, thus keeping the protein in solution. Silk fiber formation therefore involves the shear-induced conversion of silk protein with specific structural conformations into *β*-sheet structure. This conversion occurs in the spinning ducts and is followed by drawing down into a fibrillar structure. It has been shown that, upon passage through the gland and spinning duct, the proteins encounter remarkable changes in their environment, such as extensional flow and changes in protein concentration, pH and metal ion concentrations, which are thought to contribute to silk processing and to affect structural conformations [3, 142, 155-157]. The changes include removal of some water and slight acidification. In addition, the concentration of calcium ions (Ca<sup>2+</sup>) increases as the silk protein flows through the gland in *B. mori*. In the spider, by contrast, the elemental composition of the silk dope suggests that the Ca<sup>2+</sup> concentration stays constant while the potassium ion (K<sup>+</sup>) concentration increases. Meanwhile, sodium ions (Na<sup>+</sup>) and chloride ions (Cl<sup>−</sup>) are removed from the major ampullate gland.

Experiments performed *in vitro* can provide several relevant insights into the process of silk protein assembly and the formation mechanism of the silk fiber. To unravel the assembly mechanism, reconstituted or recombinant proteins have been used for fiber assembly under specific conditions [158-163]. Jin *et al*. characterized the change in the supramolecular structure of silk fibroin as the pH is reduced, demonstrating that the self-assembly of silk fibroin is a function of pH. When the pH was reduced from 6.8 to 4.8, a morphological transition of silk fibroin from spherical micelles to nanofibrils and a conformational transition from random coil to *β*-sheet were observed [158]. This transition may be driven by the stretching entropy effect related to the hydrophobic block in the protein. Schniepp *et al*. [160] deposited silk protein solution on mica substrates with and without shearing by spin-coating. Fibrillar structures were obtained only when shear force was applied during deposition. In another study, a microfluidic device was employed in which the ion concentrations and pH could be controlled while physical stress was applied simultaneously through the channel design [163]. Silk fibers formed after addition of phosphate, application of an elongational flow, and a pH change from 8 to 6.


**Figure 7.** Schematic formation mechanism of the hierarchical assembly from molecular silk fibroin to microfibers. (Reprinted from Ref. [1]. Copyright 2011, with permission from Elsevier.)

The chemical and mechanical stimuli together are likely to influence the fold of the nonrepetitive amino-terminal and carboxy-terminal domains and of the hydrophilic spacers within the hydrophobic core domain [37, 135, 158, 164-166]. Because the larger hydrophilic blocks at the chain ends of the protein molecules carry charged groups, they may play an important role in molecular assembly and conformational transition at a specific pH through decreased electrostatic repulsion. A significant step towards understanding the effect of the terminal domains on assembly was the determination of atomistic structures of the nonrepetitive terminal regions of MaSp proteins. Kessler and Scheibel's group reported the structure of the carboxy-terminal domain of *Araneus diadematus* ADF 3 by NMR spectroscopy [36], and Johansson, Knight and co-workers reported the structure of the amino-terminal domain of *Euprosthenops australis* MaSp1 by X-ray scattering [35]. Interestingly, both terminal domains are mainly composed of α-helical barrels but with different folds. The carboxy-terminal domain mediates homodimerization via a disulfide bond [36] and forms a clamp-like structural arrangement. It seems to be implicated in a number of different functions, including control of solubility and fiber formation. The carboxy-terminal domains are able to form supramolecular assemblies resembling a micellar-like structure, which is stabilized by chaotropic ions (Figure 7). The amino-terminal domains are monomeric at pH 6.8 and above [35]. Recent NMR and light scattering studies on the amino-terminal nonrepetitive domain of *Latrodectus hesperus* confirmed that a combination of pH and salt concentration controls the dimerization. The monomer is clearly stabilized at neutral pH in the presence of salt, whereas a lower pH and/or a reduced salt concentration causes the amino-terminal nonrepetitive domain to dimerize in an antiparallel fashion, creating head-to-tail dimers [165]. The pH dependence of silk fiber formation showed that oligomerization increases greatly when the pH drops to about 6.0, a response triggered by the amino-terminal domain [35, 165]. The structural changes of the amino-terminal nonrepetitive domains rearrange the position of the core regions within the micellar-like structure, which, together with mechanical stimuli, supports *β*-sheet formation. When chaotropic ions are exchanged for relatively kosmotropic ions, hydrophobic patches become exposed, which can enhance the assembly process. The amino-terminal and carboxy-terminal domains sense changes in salt, pH, and shear force, and the fine-tuned interplay between these parameters enables the silkworm and the spider to efficiently produce a stable, very tough fiber under mild conditions.
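The pH switch described above can be pictured with a simple two-state titration curve. The sketch below is an illustrative assumption rather than a model from the cited studies: it treats the amino-terminal domain as either monomeric or dimeric and uses a sigmoid centred on a midpoint pH. The midpoint of 6.0 echoes the pH at which oligomerization is reported to increase, while the `cooperativity` exponent and the function name are hypothetical. The effect of salt is not modelled.

```python
def dimer_fraction(ph, ph_mid=6.0, cooperativity=2.0):
    """Fraction of amino-terminal domains in the dimeric state at a given pH.

    Two-state sigmoid: 1 / (1 + 10 ** (n * (pH - pH_mid))).
    ph_mid and cooperativity (n) are illustrative assumptions, not fitted values.
    """
    return 1.0 / (1.0 + 10.0 ** (cooperativity * (ph - ph_mid)))


# Sample the curve at a few pH values spanning the neutral-to-acidic range.
for ph in (7.5, 6.8, 6.0, 5.5):
    print(f"pH {ph:.1f}: dimer fraction = {dimer_fraction(ph):.3f}")
```

With these assumed parameters the domain is almost entirely monomeric at pH 6.8 and above and largely dimeric below pH 6, which matches the qualitative behaviour reported in the text.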

#### **4. Advanced applications of silk fibers**


Traditionally, silk has been used in the construction of textiles. Current research on silk fibers focuses on innovative trends and advanced applications. The high proportion of essential amino acids in silk fibers indicates a high nutritive value, meaning that silk fibroin can be used as a dietary additive [167-169]. Furthermore, the amino acids glycine, alanine, serine and tyrosine are vital for nourishing the skin. The crystalline structure of silk protein reflects UV radiation, acting as a protective buffer between the skin and the environment. Extracts of silk protein are used in soap making, personal care and cosmetic products. Silk protein is also applied to enhance the gloss, brightness and softness of products. In addition, the production of advanced man-made super-fibers such as Kevlar involves petrochemical processing, which contributes to pollution. Interest in silk fibers is mainly due to the combination of their mechanical properties and the eco-friendly way in which they are made. Spider silk fibers have been envisioned for a variety of technical textiles, including parachute cords, protective clothing and composite materials in aircraft, which demand high toughness in combination with a thin, light texture.

Silks are biocompatible, biodegradable and have implant ability, as well as morphologic flexibility. Silk fiber has been used as extremely thin suture for eye or nerve surgery for long history [170]. Nowadays, one attractive application of silk fibers is act as a source of novel biomaterials. Recent progress with processing of silk fibers into various material forms, usually via the formation of the fibroin/spidroin solution, including thread, hydrogels, tubes, sponges, microspheres, particles and films [9, 171], promotes the field of applications for silk fibers in general (Figure 8) [172]. Silk protein can be modified by chemical treatment or used in combination with other materials and the silk-based biomaterials have been transformed for high-technology uses, with promising futures in the fields of biomedicine and material engineering. Numerous studies have demonstrated that fibroin supports cell attachment and proliferation for a variety of cell types [173-178]. Studies have established a potential for silkbased biomaterials use as tissue engineering scaffolds, such as skeletal tissue like bone [179], ligaments [180], and cartilage [181, 182], as well as skin [183], blood vessels [184] and nerve [185]. Silks can be designed and offer another biomedical applications, such as delivery of small molecule drugs, proteins and genes [186]. Silk fibroin possesses remarkable optical properties, such as near-perfect transparency in a visible range. It has been identified as a suitable material

and mimic the silkworm's or spider's ways of making and processing silks with tunable properties. Such knowledge is beneficial for further improvement of synthetic polymer-based materials. Combined with the discovery of new bio-inspired materials, the future application space seems more and more broad. As a whole, further development of related yield is

Silk Fiber — Molecular Formation Mechanism, Structure-Property Relationship and Advanced Applications

http://dx.doi.org/10.5772/57611

87

The authors thank the financial support from the National Science Foundation of China (NSFC) under Grant 51073113, 91027039 and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 10KJA540046. This work was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). We also acknowledge support from the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), Qing Lan Project for Excellent Scientific and Technological Innovation Team of Jiangsu Province (2012) and Project for Jiangsu Scientific and Technological Innovation Team (2013). The author, Xinfang Liu, especially thank the

support of the Postdoctoral Science Foundation of Jiangsu province (No. 1201030B).

National Engineering Laboratory for Modern Silk, College of Textile and Clothing

[1] Eisoldt L., Smith A., Scheibel T. Decoding the secrets of spider silk. Materials Today

[2] Craig C L. Evolution of arthropod silks. Annual Review of Entomology 1997; 42: 231–

[3] Heim M., Keerl D., Scheibel T. Spider silk: From soluble protein to extraordinary fi‐

[4] Shao Z., Vollrath F. Surprising strength of silkworm silk. Nature 2002; 418(15): 741–

ber. Angewandte Chemie, International Edition 2009; 48(20): 3584–3596.

underway.

**Acknowledgements**

**Author details**

**References**

267.

741.

Xinfang Liu and Ke-Qin Zhang\*

2011; 14(3): 80–86.

\*Address all correspondence to: kqzhang@suda.edu.cn

Engineering, Soochow University, Suzhou, China

**Figure 8.** Possible structure and technical applications of the silk fibers. The dotted line shows an example of the ver‐ satility of silk and the multiple possible applications. (Figure slightly modified with permission from Ref. [172]. Copy‐ right 2012, Elsevier.)

for the development of biophotonic components [187-189] in biomedical device performing electronics or sensors [190-198]. Surely, these impressive biopolymers are extremely promising for their potential applications in material science and engineering.

#### **5. Conclusions**

Our review in current chapter concentrated on *B. mori* silk and spider dragline. It links the physical and mechanical properties of native silk to the molecular make up, assembly and formation process in *B. mori* silkworm and spider. Over the last decade, there has been considerable progress in understanding the molecular structure of silk, which has inspired us a range of research utilizing the repeating modules of silk in combination with other chemical motifs to develop novel materials. The physical properties of silks highlight the potential for threads to act as high performance fibers. The advances in genetic engineering and gene sequencing enable the production of recombinant proteins in large amount and the exploration of various applications foreseeable in industry. The key point for this area is the development of suitable spinning technologies to reproducibly form threads with properties similar to that of the natural silk. The molecular assembly process of silk has provided us concepts to copy and mimic the silkworm's or spider's ways of making and processing silks with tunable properties. Such knowledge is beneficial for further improvement of synthetic polymer-based materials. Combined with the discovery of new bio-inspired materials, the future application space seems more and more broad. As a whole, further development of related yield is underway.

#### **Acknowledgements**

The authors thank the financial support from the National Science Foundation of China (NSFC) under Grant 51073113, 91027039 and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant 10KJA540046. This work was also supported by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD). We also acknowledge support from the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), Qing Lan Project for Excellent Scientific and Technological Innovation Team of Jiangsu Province (2012) and Project for Jiangsu Scientific and Technological Innovation Team (2013). The author, Xinfang Liu, especially thank the support of the Postdoctoral Science Foundation of Jiangsu province (No. 1201030B).

#### **Author details**

Xinfang Liu and Ke-Qin Zhang\*

\*Address all correspondence to: kqzhang@suda.edu.cn

National Engineering Laboratory for Modern Silk, College of Textile and Clothing Engineering, Soochow University, Suzhou, China

for the development of biophotonic components [187-189] in biomedical device performing electronics or sensors [190-198]. Surely, these impressive biopolymers are extremely promising for their potential applications in materials science and engineering.

**Figure 8.** Possible structure and technical applications of the silk fibers. The dotted line shows an example of the versatility of silk and the multiple possible applications. (Figure slightly modified with permission from Ref. [172]. Copyright 2012, Elsevier.)

**5. Conclusions**

This chapter has concentrated on *B. mori* silk and spider dragline silk. It links the physical and mechanical properties of native silk to the molecular makeup, assembly and formation process in the *B. mori* silkworm and the spider. Over the last decade, there has been considerable progress in understanding the molecular structure of silk, which has inspired a range of research that combines the repeating modules of silk with other chemical motifs to develop novel materials. The physical properties of silks highlight the potential for threads to act as high-performance fibers. Advances in genetic engineering and gene sequencing enable the production of recombinant proteins in large amounts and the exploration of various foreseeable industrial applications. The key point for this area is the development of suitable spinning technologies that reproducibly form threads with properties similar to those of natural silk. The molecular assembly process of silk has provided us with concepts to copy

#### **References**



[5] Iizuka E. Silk thread: Mechanism of spinning and its mechanical properties. Journal of Applied Polymer Science: Applied Polymer Symposium 1985; 41: 173–185.

[6] Heslot H. Artificial fibrous proteins: A review. Biochimie 1998; 80(1): 19–31.

[7] Asakura T., Kaplan D L. Silk production and processing. Encyclopedia of Agricultural Science 1994; 4: 1–11.

[8] Lewis R. Unraveling the weave of spider silk. Bioscience 1996; 46(9): 636–638.

[9] Rockwood D N., Preda R C., Yücel T., Wang X Q., Lovett M L., Kaplan D L. Materials fabrication from *Bombyx mori* silk fibroin. Nature Protocols 2011; 6(10): 1612–1631.

[10] Humenik M., Smith A M., Scheibel T. Recombinant spider silks – Biopolymers with potential for future applications. Polymer 2011; 3(1): 640–661.

[11] Fu C., Shao Z., Vollrath F. Animal silks: their structures, properties and artificial production. Chemical Communication 2009; 43: 6515–6529.

[12] Hardy J G., Römer L M., Scheibel T R. Polymeric materials based on silk proteins. Polymer 2008; 49(30): 4309–4327.

[13] Gührs K H., Weisshart K., Grosse F. Lessons from nature – protein fibers. Reviews in Molecular Biotechnology 2000; 74(2): 121–134.

[14] Shen Y., Johnson M A., Martin D C. Microstructural characterization of *Bombyx mori* silk fibers. Macromolecules 1998; 31(25): 8857–8864.

[15] Li S F., McGhie A J., Tang S L. New internal structure of spider dragline silk revealed by atomic force microscopy. Biophysical Journal 1994; 66(4): 1209–1212.

[16] Oroudjev E., Soares J., Arcidiacono S., Thompson J B., Fossey S A., Hansma H G. Segmented nanofibers of spider dragline silk: Atomic force microscopy and single-molecule force spectroscopy. Proceedings of the National Academy of Science of the United States of America 2002; 99(9, suppl.2): 6460–6465.

[17] Augsten K., Muehlig P., Hermann C. Glycoproteins and skin-core structure in *Nephila clavipes* spider silk observed by light and electron microscopy. Scanning 2000; 22(1): 12–15.

[18] Putthanarat S., Stribeck N., Fossey S A., Eby R K., Adams W W. Investigation of the nanofibrils of silk fibers. Polymer 2000; 41(21): 7735–7747.

[19] Du N., Liu X Y., Narayanan J., Li L., Lim M L M., Li D. Design of superior spider silk: From nanostructure to mechanical properties. Biophysical Journal 2006; 91(12): 4528–4535.

[20] Padamwar M N., Pawar A P. Silk sericin and its applications: A review. Journal of Scientific & Industrial Research 2004; 63(4): 323–329.

[21] Robson R M. Microvoids in *Bombyx mori* silk. An electron microscope study. International Journal of Biological Macromolecules 1999; 24(2-3): 145–150.

[22] Frische S., Maunsbach A B., Vollrath F. Elongate cavities and skin-core structure in *Nephila* spider silk observed by electron microscopy. Journal of Microscopy 1998; 189(1): 64–70.

[23] Takei F., Kikuchi Y., Kikuchi A., Mizuno S., Shimura K. Further evidence for importance of the subunit combination of silk fibroin in its efficient secretion from the posterior silk gland cells. The Journal of Cell Biology 1987; 105(1): 175–180.

[24] Tanaka K., Mori K., Mizuno S. Immunological identification of the major disulfide-linked light component of silk fibroin. Journal of Biochemistry (Tokyo) 1993; 114(1): 1–4.

[25] Tanaka K., Kajiyama N., Ishikura K., Waga S., Kikuchi A., Ohtomo K., Takagi T., Mizuno S. Determination of the site of disulfide linkage between heavy and light chains of silk fibroin produced by *Bombyx mori*. Biochimica et Biophysica Acta (BBA) – Protein Structure and Molecular Enzymology 1999; 1432(1): 92–103.

[26] Xu M., Lewis R V. Structure of a protein superfiber: spider dragline silk. Proceedings of the National Academy of Science of the United States of America 1990; 87(18): 7120–7124.

[27] Hinman M B., Lewis R V. Isolation of a clone encoding a second dragline silk fibroin. *Nephila clavipes* dragline silk is a two-protein fiber. The Journal of Biological Chemistry 1992; 267(27): 19320–19324.

[28] Beckwitt R., Arcidiacono S. Sequence conservation in the C-terminal region of spider silk proteins (spidroin) from *Nephila clavipes* (Tetragnathidae) and *Araneus bicentenarius* (Araneidae). The Journal of Biological Chemistry 1994; 269(9): 6661–6663.

[29] Sponner A., Vater W., Rommerskirch W., Vollrath F., Unger E., Grosse F., Weisshart K. The conserved C-termini contribute to the properties of spider silk fibroins. Biochemical and Biophysical Research Communications 2005; 338(2): 897–902.

[30] Eisoldt L., Thamm C., Scheibel T. The role of terminal domains during storage and assembly of spider silk proteins. Biopolymers 2012; 97(6): 355–361.

[31] Rising A., Hjälm G., Engström W., Johansson J. N-terminal nonrepetitive domain common to dragline, flagelliform, and cylindriform spider silk proteins. Biomacromolecules 2006; 7(11): 3120–3124.

[32] Motriuk-Smith D., Smith A., Hayashi C Y., Lewis R V. Analysis of the conserved N-terminal domains in major ampullate spider silk proteins. Biomacromolecules 2005; 6(6): 3152–3159.


[33] Zhou C Z., Confalonieri F., Jacquet M., Perasso R., Li Z G., Janin J. Silk fibroin: structural implications of a remarkable amino acid sequence. Proteins: Structure, Function, and Genetics 2001; 44(2): 119–122.

[34] Zhou C Z., Confalonieri F., Medina N., Zivanovic Y., Esnault C., Yang T., Jacquet M., Janin J., Duguet M., Perasso R., Li Z G. Fine organization of *Bombyx mori* fibroin heavy chain gene. Nucleic Acid Research 2000; 28(12): 2413–2419.

[35] Askarieh G., Hedhammar M., Nordling K., Saenz A., Casals C., Rising A., Johansson J., Knight S D. Self-assembly of spider silk proteins is controlled by a pH-sensitive relay. Nature 2010; 465(13): 236–238.

[36] Hagn F., Eisoldt L., Hardy J G., Vendrely C., Coles M., Scheibel T., Kessler H. A conserved spider silk domain acts as a molecular switch that controls fibre assembly. Nature 2010; 465(13): 239–242.

[37] He Y X., Zhang N N., Li W F., Jia N., Chen B Y., Zhou K., Zhang J., Chen Y., Zhou C Z. N-terminal domain of *Bombyx mori* fibroin mediates the assembly of silk in response to pH decrease. Journal of Molecular Biology 2012; 418(3-4): 197–207.

[38] Hagn F. A structural view on spider silk proteins and their role in fiber assembly. Journal of Peptide Science 2012; 18(6): 357–365.

[39] Ittah S., Cohen S., Garty S., Cohn D., Gat U. An essential role for the C-terminal domain of a dragline spider silk protein in directing fiber formation. Biomacromolecules 2006; 7(6): 1790–1795.

[40] Bini E., Knight D P., Kaplan D L. Mapping domain structures in silks from insects and spiders related to protein assembly. Journal of Molecular Biology 2004; 335(1-2): 27–40.

[41] Simmons A., Ray E., Jelinski L W. Solid-state 13C NMR of *Nephila clavipes* dragline silk establishes structure and identity of crystalline regions. Macromolecules 1994; 27(18): 5235–5237.

[42] Asakura T., Yao J. 13C CP/MAS NMR study on structural heterogeneity in *Bombyx mori* silk fiber and their generation by stretching. Protein Science 2002; 11(11): 2706–2713.

[43] Boulet-Audet M., Lefèvre T., Buffeteau T., Pézolet M. Attenuated total reflection infrared spectroscopy: An efficient technique to quantitatively determine the orientation and conformation of proteins in single silk fibers. Applied Spectroscopy 2008; 62(9): 956–962.

[44] Holland G P., Creager M S., Jenkins J E., Lewis R V., Yarger J L. Determining secondary structure in spider dragline silk by carbon-carbon correlation solid-state NMR spectroscopy. Journal of the American Chemical Society 2008; 130(30): 9871–9877.

[45] Colomban P., Dinh H M., Riand J., Prinsloo L C., Mauchamp B. Nanomechanics of single silkworm and spider fibres: a Raman and micro-mechanical *in situ* study of the conformation change with stress. Journal of Raman Spectroscopy 2008; 39(12): 1749–1764.

[46] Lefèvre T., Rousseau M E., Pézolet M. Protein secondary structure and orientation in silk as revealed by Raman spectromicroscopy. Biophysical Journal 2007; 92(8): 2885–2895.

[47] Holland G P., Jenkins J E., Creager M S., Lewis R V., Yarger J L. Quantifying the fraction of glycine and alanine in *β*-sheet and helical conformations in spider dragline silk using solid-state NMR. Chemical Communication 2008; 43: 5568–5570.

[48] Jenkins J E., Creager M S., Lewis R V., Holland G P., Yarger J L. Quantitative correlation between the protein primary sequences and secondary structures in spider dragline silks. Biomacromolecules 2010; 11(1): 192–200.

[49] Ling S., Qi Z., Knight D P., Shao Z., Chen X. Synchrotron FTIR microspectroscopy of single natural silk fibers. Biomacromolecules 2011; 12(9): 3344–3349.

[50] Dicko C., Knight D., Kenney J M., Vollrath F. Structural conformation of spidroin in solution: A synchrotron radiation circular dichroism study. Biomacromolecules 2004; 5(3): 758–767.

[51] Lefèvre T., Paquet-Mercier F., Rioux-Dubé J F., Pézolet M. Structure of silk by Raman spectromicroscopy: From the spinning glands to the fibers. Biopolymers 2012; 97(6): 322–336.

[52] Kaplan D., Adams W W., Farmer B., Viney C., editors. Silk polymers: materials science and biotechnology. American Chemical Society Symposium Series 1994.

[53] Marsh R E., Corey R B., Pauling L. An investigation of the structure of silk fibroin. Biochimica et Biophysica Acta 1955; 16: 1–34.

[54] Warwicker J O. Comparative studies of fibroins: II. The crystal structures of various fibroins. Journal of Molecular Biology 1960; 2(6): 350–362.

[55] Dekker M. In: Porter D. (ed.) Group interaction modeling of polymer properties. New York. 1995. P499.

[56] Porter D., Vollrath F., Shao Z. Predicting the mechanical properties of spider silk as a model nanostructured polymer. The European Physical Journal E 2005; 16(2): 199–

[57] Rousseau M E., Lefèvre T., Beaulieu L., Asakura L., Pézolet M. Study of protein conformation and orientation in silkworm and spider silk fibers using Raman microspectroscopy. Biomacromolecules 2004; 5(6): 2247–2257.

[58] Krimm S., Bandekar J. Vibrational Spectroscopy and conformation of peptides, polypeptides, and proteins. Advances in Protein Chemistry 1986; 38: 181–364.


[59] Dong J., Wan Z., Popov M., Carey P R., Weiss M A. Insulin assembly damps conformational fluctuations: Raman analysis of amide I line-widths in native states and fibrils. Journal of Molecular Biology 2003; 330(2): 431–442.

[60] Zheng S., Li G., Yao W., Yu T. Raman spectroscopic investigation of the denaturation process of silk fibroin. Applied Spectroscopy 1989; 43(7): 1269–1272.

[61] Simmons A H., Michal C A., Jelinski L W. Molecular orientation and two-component nature of the crystalline fraction of spider dragline silk. Science 1996; 271(5245): 84–87.

[62] Takahashi Y. Crystal structure of silk of *Bombyx mori*. In: Kaplan D., Adams W W., Farmer B., Viney C. (eds.) Silk polymers: materials science and biotechnology. American Chemical Society Symposium Series 1994. 544: P168–175.

[63] Drummy L F., Farmer B L., Naik R R. Correlation of the *β*-sheet crystal size in silk fibers with the protein amino acid sequence. Soft Matter 2007; 3(7): 877–882.

[64] Termonia Y. Molecular modeling of spider silk elasticity. Macromolecules 1994; 27(25): 7378–7381.

[65] Kümmerlen J., van Beek J D., Vollrath F., Meier B H. Local structure in spider dragline silk investigated by two-dimensional spin-diffusion nuclear magnetic resonance. Macromolecules 1996; 29(8): 2920–2928.

[66] Becker M A., Tuross N. Initial degradation changes found in *Bombyx mori* silk fibroin. In: Kaplan D., Adams W W., Farmer B., Viney C. (eds.) Silk polymers: materials science and biotechnology. American Chemical Society Symposium Series 1994. 544: P252–269.

[67] Yang Z., Grubb D T., Jelinski L W. Small-angle X-ray scattering of spider dragline silk. Macromolecules 1997; 30(26): 8254–8261.

[68] Plaza G R., Pérez-Rigueiro J., Riekel C., Perea G B., Agulló-Rueda F., Burghammer M., Guinea G V., Elices M. Relationship between microstructure and mechanical properties in spider silk fibers: identification of two regimes in the microstructural changes. Soft Matter 2012; 8(22): 6015–6020.

[69] Sampath S., Isdebski T., Jenkins J E., Ayon J V., Henning R W., Orgel J P R O., Antipoa O., Yarger J L. X-ray diffraction study of nanocrystalline and amorphous structure within major and minor ampullate dragline spider silks. Soft Matter 2012; 8(25): 6713–6722.

[70] Grubb D T., Jelinski L W. Fiber morphology of spider silk: The effects of tensile deformation. Macromolecules 1997; 30(10): 2860–2867.

[71] Grubb D T., Ji G. Molecular chain orientation in supercontracted and re-extended spider silk. International Journal of Biological Macromolecules 1999; 24(2-3): 203–210.

[72] Cruz D H., Rousseau M E., West M M., Pézolet M., Hitchcock A P. Quantitative mapping of the orientation of fibroin *β*-sheets in *B. mori* cocoon fibers by scanning transmission X-ray microscopy. Biomacromolecules 2006; 7(3): 836–843.

[73] van Beek J D., Hess S., Vollrath F., Meier B H. The molecular structure of spider dragline silk: Folding and orientation of the protein backbone. Proceedings of the National Academy of Science of the United States of America 2002; 99(16): 10266–10271.

[74] Marcotte I., van Beek J D., Meier B H. Molecular disorder and structure of spider dragline silk investigated by two-dimensional solid-state NMR spectroscopy. Macromolecules 2007; 40(6): 1995–2001.

[75] Asakura T., Yao J., Yamane T., Umemura K., Ulrich A S. Heterogeneous structure of silk fibers from *Bombyx mori* resolved by 13C solid-state NMR spectroscopy. Journal of the American Chemical Society 2002; 124(30): 8794–8795.

[76] Gosline J M., Denny M W., DeMont M E. Spider silk as rubber. Nature 1984; 309(5968): 551–552.

[77] Gatesy J., Hayashi C., Motriuk D., Woods J., Lewis R. Extreme diversity, conservation, and convergence of spider silk fibroin sequence. Science 2001; 291(5513): 2603–

[78] Ma B Y., Nussinov R. Molecular dynamics simulations of alanine rich *β*-sheet oligomers: Insight into amyloid formation. Protein Science 2002; 11(10): 2335–2350.

[79] Keten S., Buehler M J. Atomistic model of the spider silk nanostructure. Applied Physics Letters 2010; 96(15): 153701–153703.

[80] Keten S., Buehler M J. Nanostructure and molecular mechanics of spider dragline silk protein assemblies. Journal of the Royal Society Interface 2010; 7(53): 1709–1721.

[81] Du N., Yang Z., Liu X Y., Li Y., Xu H Y. Structural origin of the strain-hardening of spider silk. Advanced Functional Materials 2011; 21(4): 772–778.

[82] Li X., Eles P T., Michal C A. Water permeability of spider dragline silk. Biomacromolecules 2009; 10(5): 1270–1275.

[83] Ene R., Papadopoulos P., Kremer F. Partial deuteration probing structural changes in supercontracted spider silk. Polymer 2010; 51(21): 4784–4789.

[84] Paquet-Mercier F., Lefèvre T., Auger M., Pézolet M. Evidence by infrared spectroscopy of the presence of two type of *β*-sheets in major ampullate spider silk and silkworm silk. Soft Matter 2013; 9(1): 208–215.

[85] Rousseau M E., Cruz D H., West M M., Hitchcock A P., Pézolet M. Nephila clavipes spider dragline silk microstructure studies by scanning transmission X-ray microscopy. Journal of the American Chemical Society 2007; 129(13): 3897–3905.


[86] Fossey S A., Tripathy S. Atomistic modeling of interphases in spider silk fibers. International Journal of Biological Macromolecules 1999; 24(2-3): 119–125.

[87] Thiel B L., Guess K B., Viney C. Non-periodic lattice crystals in the hierarchical microstructure of spider (major ampullate) silk. Biopolymers 1997; 41(7): 703–719.

[88] Trancik J E., Czernuszka J T., Bell F I., Viney C. Nanostructural features of a spider dragline silk as revealed by electron and X-ray diffraction studies. Polymer 2006; 47(15): 5633–5642.

[89] Swanson B O., Blackledge T A., Beltran J., Hayashi C Y. Variation in the material properties of spider dragline silk across species. Applied Physics A: Materials Science & Processing 2006; 82(2): 213–218.

[90] Sirichaisit J., Brookes V L., Young R J., Vollrath F. Analysis of structure/property relationships in silkworm (*Bombyx mori*) and spider dragline (*Nephila edulis*) silks using Raman spectroscopy. Biomacromolecules 2003; 4(2): 387–394.

[91] Denny M. The physical properties of spider's silk and their role in the design of orb-webs. The Journal of Experimental Biology 1976; 65(2): 483–506.

[92] Gosline J M., Guerette P A., Ortlepp C S., Savage K N. The mechanical design of spider silks: From fibroin sequence to mechanical function. The Journal of Experimental Biology 1999; 202(23): 3295–3303.

[93] Hu X., Vasanthavada K., Kohler K., McNary S., Moore A M F., Vierra C A. Molecular mechanisms of spider silk. Cellular and Molecular Life Sciences 2006; 63(17): 1986–

[94] Ko F K., Jovicic J. Modeling of mechanical properties and structural design of spider web. Biomacromolecules 2004; 5(3): 780–785.

[95] Sezutsu H., Yukuhiro K. Dynamic rearrangement within the *Antheraea pernyi* silk fibroin gene is associated with four types of repetitive units. Journal of Molecular Evolution 2000; 51(4): 329–338.

[96] Fu C., Porter D., Chen X., Vollrath F., Shao Z. Understanding the mechanical properties of *Antheraea pernyi* silk – From primary structure to condensed structure of the protein. Advanced Functional Materials 2011; 21(4): 729–737.

[97] Garb J E., Dimauro T., Lewis R V., Hayashi C Y. Expansion and intragenic homogenization of spider silk since the triassic: Evidence from mygalomorphae (tarantulas and their kin) spidroins. Molecular Biology and Evolution 2007; 24(11): 2454–2464.

[98] Brooks A E., Steinkraus H B., Nelson S R., Lewis R V. An investigation of the divergence of major ampullate silk fibers from *Nephila clavipes* and *Argiope aurantia*. Biomacromolecules 2005; 6(6): 3095–3099.

[99] Liu Y., Sponner A., Porter D., Vollrath F. Proline and processing of spider silks. Biomacromolecules 2008; 9(1): 116–121.

[100] Brooks A E., Stricker S M., Joshi S B., Kamerzell T J., Middaugh C R., Lewis R V. Properties of synthetic spider silk fibers based on *Argiope aurantia* MaSp2. Biomacromolecules 2008; 9(6): 1506–1510.

[101] Rauscher S., Baud S., Miao M., Keeley F., Pomes R. Proline and glycine control protein self-organization into elastomeric or amyloid fibrils. Structure 2006; 14(11): 1667–

[102] Hayashi C Y., Shipley N H., Lewis R V. Hypotheses that correlate the sequence, structure, and mechanical properties of spider silk proteins. International Journal of Biological Macromolecules 1999; 24(2-3): 271–275.

[103] Shao Z., Vollrath F., Yang Y., Thøgersen H C. Structure and behavior of regenerated spider silk. Macromolecules 2003; 36(4): 1157–1161.

[104] Lee S M., Pippel E., Göesele U., Dresbach C., Qin Y., Chandran C V., Bräeuniger T., Hause G., Knez M. Greatly increased toughness of infiltrated spider silk. Science 2009; 324(5926): 488–492.

[105] Riekel C., Müller M. In situ X-ray diffraction during forced silking of spider silk. Macromolecules 1999; 32(13): 4464–4466.

[106] Riekel C., Vollrath F. Spider silk fibre extrusion: combined wide- and small-angle X-ray microdiffraction experiments. International Journal of Biological Macromolecules 2001; 29(3): 203–210.

[107] Khan M R., Morikawa H., Gotoh Y., Miura M., Ming Z., Sato Y., Iwasa M. Structural characteristics and properties of *Bombyx mori* silk fiber obtained by different artificial forcibly silking speeds. International Journal of Biological Macromolecules 2008; 42(3): 264–270.

[108] Keten S., Xu Z., Ihle B., Buehler M J. Nanoconfinement controls stiffness, strength and mechanical toughness of *β*-sheet crystals in silk. Nature Materials 2010; 9(4): 359–367.

[109] Xiao S., Stacklies W., Debes C., Gräter F. Force distribution determines optimal length of *β*-sheet crystals for mechanical robustness. Soft Matter 2011; 7(4): 1308–1311.

[110] Sinsawat A., Putthanarat S., Magoshi Y., Pachter R., Eby R K. X-ray diffraction and computational studies of the modulus of silk (*Bombyx mori*). Polymer 2002; 43(4): 1323–1330.

[111] Krasnov I., Diddens I., Hauptmann N., Helms G., Ogurreck M., Seydel T., Funari S S., Muller M. Mechanical properties of silk: Interplay of deformation on macroscopic and molecular length scales. Physical Review Letters 2008; 100(4): 048104/1–048104/4.

[112] Sinsawat A., Putthanarat S., Magoshi Y., Pachter R., Eby R K. The crystal modulus of silk (*Bombyx mori*). Polymer 2003; 44(3): 909–910.


[113] Vollrath F., Porter D. Spider silk as a model biomaterial. Applied Physics A: Materials Science & Processing 2006; 82(2): 205–212.

[114] Nova A., Keten S., Pugno N M., Redaelli A., Buehler M J. Molecular and nanostructural mechanisms of deformation, strength and toughness of spider silk fibrils. Nano Letters 2010; 10(7): 2626–2634.

[115] Papadopoulos P., Sölter J., Kremer F. Hierarchies in the structural organization of spider silk – a quantitative model. Colloid and Polymer Science 2009; 287(2): 231–236.

[116] Giesa T., Arslan M., Pugno N M., Buehler M J. Nanoconfinement of spider silk fibrils begets superior strength, extensibility, and toughness. Nano Letters 2011; 11(11): 5038–5046.

[117] Brown C P., Harnegea C., Gill H S., Price A J., Traversa E., Licoccia S., Rosei F. Rough fibrils provide a toughening mechanism in biological fibers. ACS Nano 2012; 6(3): 1961–1969.

[118] Lefèvre T., Paquet-Mercier F., Lesage S., Rousseau M E., Bédard S., Pézolet M. Study by Raman spectromicroscopy of the effect of tensile deformation on the molecular structure of *Bombyx mori* silk. Vibrational Spectroscopy 2009; 51(1): 136–141.

[119] Eles P T., Michal C A. A DECODER NMR study of backbone orientation in *Nephila clavipes* dragline silk under varying strain and draw rate. Biomacromolecules 2004; 5(3): 661–665.

[120] Seydel T., Kölln K., Krasnov I., Diddens I., Hauptmann N., Helms G., Ogurreck M., Kang S G., Koza M M., Müller M. Silkworm silk under tensile strain investigated by synchrotron X-ray diffraction and neutron spectroscopy. Macromolecules 2007; 40(4): 1035–1042.

[121] Brookes V L., Young R J., Vollrath F J. Deformation micromechanics of spider silk. Journal of Materials Science 2008; 43(10): 3728–3732.

[122] Wu X., Liu X Y., Du N., Xu G., Li B. Unraveled mechanism in silk engineering: Fast reeling induced silk toughening. Applied Physics Letters 2009; 95: 093703/1–093703/4.

[123] Jelinski L W., Blye A., Liivak O., Michal C., LaVerde G., Seidel A., Shah N., Yang Z. Orientation, structure, wet-spinning, and molecular basis for supercontraction of spider dragline silk. International Journal of Biological Macromolecules 1999; 24(2-3): 197–201.

[124] Holland G P., Jenkins J E., Creager M S., Lewis R V., Yarger J L. Solid-state NMR investigation of major and minor ampullate spider silk in the native and hydrated states. Biomacromolecules 2008; 9(2): 651–657.

[125] Shao Z., Vollrath F., Sirichaisit J., Young R J. Analysis of spider silk in native and supercontracted states using Raman spectroscopy. Polymer 1999; 40(10): 2493–2500.

[126] Savage K N., Gosline J M. The effect of proline on the network structure of major ampullate silks as inferred from their mechanical and optical properties. The Journal of Experimental Biology 2008; 211(12): 1937–1947.

[127] Liu Y., Shao Z., Vollrath F. Relationships between supercontraction and mechanical properties of spider silk. Nature Materials 2005; 4(12): 901–905.

[128] Bell F I., McEwen I J., Viney C. Supercontraction stress in wet spider dragline. Nature 2002; 416(6876): 37–37.

[129] Papadopoulos P., Ene R., Weidner I., Kremer F. Similarities in the structural organization of major and minor ampullate spider silk. Macromolecular Rapid Communications 2009; 30(9-10): 851–857.

[130] Blackledge T A., Boutry C., Wong S C., Baji A., Dhinojwala A., Sahni V., Agnarsson I. How super is supercontraction? Persistent versus cyclic responses to humidity in spider dragline silk. The Journal of Experimental Biology 2009; 212(13): 1981–1988.

[131] Liu Y., Shao Z., Vollrath F. Elasticity of spider silks. Biomacromolecules 2008; 9(7): 1782–1786.

[132] Boutry C., Blackledge T A. Evolution of supercontraction in spider silk: structure − function relationship from tarantulas to orb-weavers. The Journal of Experimental Biology 2010; 213(20): 3505–3514.

[133] Guan J., Vollrath F., Porter D. Two mechanisms for supercontraction in *Nephila* spider dragline silk. Biomacromolecules 2011; 12(11): 4030–4035.

[134] Glišović A., Vehoff T., Davies R J., Salditt T. Strain dependent structural changes of spider dragline silk. Macromolecules 2008; 41(2): 390–398.

[135] Wong Po Foo C., Bini E., Hensman J., Knight D P., Lewis R V., Kaplan D L. Role of pH and charge on silk protein assembly in insects and spiders. Applied Physics A: Materials Science and Processing 2006; 82(2): 223–233.

[136] Vollrath F., Knight D P. Liquid crystalline spinning of spider silk. Nature 2001; 410(6828): 541–548.

[137] Knight D P., Vollrath F. Liquid crystals and flow elongation in a spider's silk production line. Proceedings of the Royal Society B: Biological Science 1999; 266(1418): 519–523.

[138] Kerkam K., Viney C., Kaplan D L., Lombardi S. Liquid crystallinity of natural silk secretions. Nature 1991; 349(6310): 596–598.

[139] Willcox P J., Gido S P., Muller W., Kaplan D L. Evidence of a cholesteric liquid crystalline phase in natural silk spinning processes. Macromolecules 1996; 29(15): 5106–


[140] Lefèvre T., Boudreault S., Cloutier C., Pézolet M. Conformational and orientational transformation of silk proteins in the major ampullate gland of *Nephila clavipes* spiders. Biomacromolecules 2008; 9(9): 2399–2407.

[141] Vollrath F., Knight D P. Structure and function of the silk production pathway in the spider *Nephila edulis*. International Journal of Biological Macromolecules 1999;

[142] Knight D P., Vollrath F. Changes in element composition along the spinning duct in a

[143] Terry A E., Knight D P., Porter D., Vollrath F. pH induced changes in the rheology of silk fibroin solution from the middle division of *Bombyx mori* silkworm. Biomacromo‐

[144] Jin H J., Kaplan D L. Mechanism of silk processing in insects and spiders. Nature

[145] Monti P., Taddei P., Freddi G., Asakura T., Tsukada M. Raman spectroscopic characterization of *Bombyx mori* silk fibroin: Raman spectrum of silk I. Journal of Raman

[146] Asakura T., Yamane T., Nakazawa Y., Kameda T., Ando K. Structure of *Bombyx mori* silk fibroin before spinning in solid state studied with wide angle X-ray scattering and 13C cross-polarization/magic angle spinning NMR. Biopolymers 2001; 58(5): 521–

[147] Taddei P., Asakura T., Yao J., Monti P. Raman study of poly(alanine-glycine)-based peptides containing tyrosine, valine, and serine as model for the semicrystalline do‐

[148] Asakura T., Ashida J., Yamane T., Kameda T., Nakazawa Y., Ohgo K., Komatsu K. A repeated *β*-turn structure in poly(Ala-Gly) as a model for silk I of *Bombyx mori* silk fibroin studied with two-dimensional spin-diffusion NMR under off magic angle spinning and rotational echo double resonance. Journal of Molecular Biology 2001;

[149] Asakura T., Ohgo K., Komatsu K., Kanenari M., Okuyama K. Refinement of repeated *β*-turn structure for silk I conformation of *Bombyx mori* silk fibroin using 13C solid-

[150] Hirijida D H., Do K G., Wong S., Zax D., Jelinski L W. 13C NMR of *Nephila clavipes*

[151] Hronska M., van Beek J D., Willimason P T F., Vollrath F., Meier B H. NMR characterization of native liquid spider dragline silk from *Nephila edulis*. Biomacromolecules

[152] Lefèvre T., Leclerc J., Rioux-Dubé J F., Buffeteau T., Paquin M C., Rousseau M E., Cloutier I., Auger M., Gagné S M., Boudreault S., Cloutier C., Pézolet M. Conformation of spider silk proteins in situ in the intact major ampullate gland and in solution. Biomacromolecules 2007; 8(8): 2342–2344.

[153] Roemer L., Scheibel T. In: Scheibel T. (ed.) Fibrous proteins. Austin: Landes Bioscience; 2008. P137–151.

[154] Exler J H., Hummerich D., Scheibel T. The amphiphilic properties of spider silks are important for spinning. Angewandte Chemie International Edition 2007; 46(19): 3559–3562.

[155] Magoshi J., Magoshi Y., Becker M A., Nakamura S. Biospinning (silk fiber formation, multiple spinning mechanisms). In: Salamone J C. (ed.) Polymeric Materials Encyclopedia. CRC Press: New York. 1996. vol. 1: P667–679.

[156] Vollrath F., Knight D P., Hu X W. Silk production in a spider involves acid bath treatment. Proceedings of the Royal Society B: Biological Science 1998; 265(1398): 817–820.

[157] Hardy J G., Scheibel T R. Silk-inspired polymers and proteins. Biochemical Society Transactions 2009; 37(4): 677–681.

[158] Chen P., Kim H S., Park C Y., Kim H S., Chin I J., Jin H J. pH-triggered transition of silk fibroin from spherical micelles to nanofibrils in water. Macromolecular Research 2008; 16(6): 539–543.

[159] Dicko C., Kenney J M., Knight D., Vollrath F. Transition to a *β*-sheet-rich structure in spidroin in vitro: The effects of pH and cations. Biochemistry 2004; 43(44): 14080–14087.

[160] Greving I., Cai M., Vollrath F., Schniepp H C. Shear-induced self-assembly of native silk proteins into fibrils studied by atomic force microscopy. Biomacromolecules 2012; 13(3): 676–682.

[161] Xie F., Zhang H., Shao H., Hu X. Effect of shearing on formation of silk fibers from regenerated *Bombyx mori* silk fibroin aqueous solution. International Journal of Biological Macromolecules 2006; 38(3-5): 284–288.

[162] Kinahan M E., Filippidi E., Köster S., Hu X., Evans H M., Pfohl T., Kaplan D L., Wong J. Tunable silk: Using microfluidics to fabricate silk fibers with controllable properties. Biomacromolecules 2011; 12(5): 1504–1511.

[163] Rammensee S., Slotta U., Scheibel T., Bausch A R. Assembly mechanism of recombinant spider silk proteins. Proceedings of the National Academy of Science of the United States of America 2008; 105(18): 6590–6595.

[164] Gronau G., Qin Z., Buehler M J. Effect of sodium chloride on the structure and stability of spider silk's N-terminal protein domain. Biomaterial Science 2013; 1(3): 276–284.

state NMR and X-ray methods. Macromolecules 2005; 38(17): 7397–7403.

major ampullate silk gland. Biophysical Journal 1996; 71(6): 3442–3447.

mains of *Bombyx mori* silk fibroin. Biopolymers 2004; 75(4): 314–324.

ders. Biomacromolecules 2008; 9(9): 2399–2407.

*Nephila* spider. Naturwissenschaften 2001; 88(4): 179–182.

24(2-3): 243–249.

lecules 2004; 5(3): 768–772.

98 Oligomerization of Chemical and Biological Compounds

2003; 424(28): 1057–1061.

525.

306(2): 291–305.

2004; 5(3): 834–839.

Spectroscopy 2001; 32(2): 103–107.


[165] Hagn F., Thamm C., Scheibel T., Kessler H. pH-Dependent dimerization and salt-de‐ pendent stabilization of the N-terminal domain of spider dragline silk − implications for fiber formation. Angewandte Chemie International Edition 2011; 50(1): 310–313.

[178] Jin H J., Chen J., Karageorgiou V., Altman G H., Kaplan D L. Human bone marrow stromal cell responses on electrospun silk fibroin mats. Biomaterials 2004; 25(6):

Silk Fiber — Molecular Formation Mechanism, Structure-Property Relationship and Advanced Applications

http://dx.doi.org/10.5772/57611

101

[179] Wharram S E., Zhang X., Kaplan D L., McCarthy S P. Electrospun silk material sys‐ tems for wound healing. Macromolecular Bioscience 2010; 10(3): 246–257.

[180] Fan H., Liu H., Wong E J W., Toh S L., Goh J C H. In vivo study of anterior cruciate ligament regeneration using mesenchymal stem cells and silk scaffold. Biomaterials

[181] Mandal B B., Priya A S., Kundu S C. Novel silk sericin/gelation 3-D scaffolds and 2-D films: Fabrication and characterization for potential tissue engineering applications.

[182] Kasoju N., Bhonde R R., Bora U. Preparation and characterization of *Antheraea assama* silk fibroin based novel non-woven scaffolds for tissue engineering applications.

Journal of Tissue Engineering and Regenerative Medicine 2009; 3(7): 539–552. [183] Sugihara A., Sugiura K., Morita H., Ninagawa T., Tubouchi K., Tobe R., Izumiya M., Horio T., Abraham N G., Ikehara S. Promotive effects of a silk film on epidermal re‐ covery from full-thickness skin wounds. Proceedings of the Society for Experimental

[184] Unger R E., Sartoris A., Peters K., Motta A., Migliaresi C., Kunkel M., Bulnheim U., Rychly J., Kirkpatrick C J. Tissue-like self-assembly in cocultures of endothelial cells and osteoblasts and the formation of microcapillary-like structures on three-dimen‐

[185] Yang Y., Chen X., Ding F., Zhang P., Liu J., Gu X. Biocompatibility evaluation of silk fibroin with peripheral nerve tissues and cells in vitro. Biomaterials 2007; 28(9): 1643–

[186] Numata K., Kaplan D L., Silk-based delivery systems of bioactive molecules. Ad‐

[187] Lawrence B D., Cronin-Golomb M., Georgakoudi I., Kaplan D L., Omenetto F G. Bio‐ active silk protein biomaterial systems for optical devices. Biomacromolecules 2008;

[188] Perry H., Gopinath A., Kaplan D L., Negro L D., Omenetto F G. Nano- and micropat‐ terning of optically transparent, mechanically robust, biocompatible silk fibroin

[189] Omenetto F G., Kaplan D L. A new route for silk. Nature Photonics 2008; 2(11): 641–

sional porous biomaterials. Biomaterials 2007; 28(27): 3965–3976.

vanced Drug Delivery Reviews 2010; 62(15): 1497–1508.

films. Advanced Materials 2008; 20(16): 3070–3072.

1039–1047.

1652.

643.

9(4): 1214–1220.

2008; 29(23): 3324–3337.

Acta Biomaterialia 2009; 5(8): 3007–3020.

Biology and Medicine 2000; 225(1): 58–64.


[178] Jin H J., Chen J., Karageorgiou V., Altman G H., Kaplan D L. Human bone marrow stromal cell responses on electrospun silk fibroin mats. Biomaterials 2004; 25(6): 1039–1047.

[165] Hagn F., Thamm C., Scheibel T., Kessler H. pH-Dependent dimerization and salt-de‐ pendent stabilization of the N-terminal domain of spider dragline silk − implications for fiber formation. Angewandte Chemie International Edition 2011; 50(1): 310–313.

[166] Landreh M., Askarieh G., Nordling K., Hedhammar M., Rising A., Casals C., Astor‐ ga-Wells J., Alvelius G., Knight S D., Johansson J., Jornvall H., Bergman T. A pH-de‐ pendent dimer lock in spider silk protein. Journal of Molecular Biology 2010; 404(2):

[167] Dandin S B., Nirmal Kumar S. Bio-medical uses of silk and its derivative. Indian Silk

[168] Manohar Reddy R. Value addition span of silkworm cocoon – Time for utility opti‐ mization. International Journal of Industrial Entomology 2008; 17(1): 109–113.

[170] Altman G H., Diaz F., Jakuba C., Calabro T., Horan R L., Chen J., Lu H., Richmond J.,

[171] Vepari C., Kaplan D L. Silk as a biomaterial. Progress in Polymer Science 2007;

[172] Gronau G., Krishnaji S T., Kinahan M E., Giesa T., Wong J Y., Kaplan D L., Buehler M J. A review of combined experimental and computational procedures for assessing biopolymer structure-process-property relationships. Biomaterials 2012; 33(33): 8240–

[173] Meinel L., Hofmann S., Karageorgiou V., Kirker-Head C., McCool J., Gronowicz G., Zichner L., Langer R., Vunjak-Novakovic G., Kaplan D L. The inflammatory respons‐

[174] Gupta M K., Khokhar S K., Phillips D M., Sowards L A., Drummy L F., Kadakia M P., Naik R R. Patterned silk films cast from ionic liquid solubilized fibroin as scaffolds

[175] Servoli E., Maniglio D., Motta A., Predazzer R., Migliaresi C. Surface properties of silk fibroin films and their interaction with fibroblasts. Macromolecular Bioscience

[176] Min B M., Jeong L., Lee K Y., Park W H. Regenerated silk fibroin nanofibers: water vapor-induced structural changes and their effects on the behavior of normal human

[177] Unger R E., Wolf M., Peters K., Motta A., Migliaresi C., James Kirkpatrick C. Growth of human cells on a non-woven silk fibroin net: a potential for use in tissue engineer‐

es to silk films in vitro and in vivo. Biomaterials 2005; 26(2): 147–155.

for cell growth. Langmuir 2007; 23(3): 1315–1319.

cells. Macromolecular Bioscience 2006; 6(4): 285–292.

ing. Biomaterials 2004; 25(6): 1069–1075.

[169] Michael B. Food management. Proquest Agriculture Journals 2004; 60: 39.

Kaplan D L. Silk-based biomaterials. Biomaterials 2003; 24(3): 401–416.

328–336.

2006; 45(9):5–8.

100 Oligomerization of Chemical and Biological Compounds

32(8-9): 991–1007.

2005; 5(12): 1175–1183.


[190] Amsden J J., Perry H., Boriskina S V., Gopinath A., Kaplan D L., Negro L D., Omenet‐ to F G. Spectral analysis of induced color change on periodically nanopatterned silk films. Optics Express 2009; 17(23): 21271–21279.

**Chapter 4**

## **Ethylene Oxide Homogeneous Heterobifunctional Acyclic Oligomers**

Calin Jianu


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57610

#### **1. Introduction**

It is difficult, if not hazardous and partisan, to rank the major advances in organic chemistry over roughly the last 20 years by their role in the development of this science and of adjacent, complementary fields. It can, however, be said with certainty that the field of homogeneous (HOHAO) and heterogeneous (HEHAO) heterobifunctional acyclic oligomers of ethylene oxide [1], through the diversity of their macromolecular architectures and effects, is one of the most significant achievements. Their importance lies mainly in resolving, by affordable means, some fundamental problems of organic synthesis through the transfer into the same phase of reaction partners with different polarities (an organic substrate and a water-soluble, usually inorganic, reactant).

Pioneering attempts to structure homogeneous PEO chains with n = 3-20 were recorded between the fourth and sixth decades of the 20th century and are found in technical bulletins issued by large corporations (Hülles, Henkel, Union Carbide, Shell Oil, etc.). For reasons of intellectual property protection (in the absence of patents), the technological information recorded in these publications is summary with respect to processing parameters, conversions and secondary products.

B.A. Mulley [2] has the merit of being the first to de-centralize and systematize the efforts to structure the first proper homogeneous PEO chains in the true meaning of "homogeneous heterobifunctional acyclic oligomers of ethylene oxide" (HOHAO). The compounds he reported are, however, really just "homogeneous monofunctional acyclic oligomers of ethylene oxide".

The material presented below introduces the reader to the field of homogeneous heterobifunctional acyclic oligomers of ethylene oxide (HOHAO).

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

HOHAO, "tailor-made macromolecules" ("designer macromolecules") with the general structure shown in Figure 1, fall into the category of "niche" unitary organic compounds (derivatives of polyethylene glycols, PEGn).

R1-O-(-CH2CH2O-)n-R2

n = homogeneous oligomerization degree (strictly monitored value); R1, R2 = aliphatic, aromatic or mixed derivatization terminals

**Figure 1.** General structure of homogeneous heterobifunctional acyclic oligomers of ethylene oxide
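As a rough illustration of how the general structure in Figure 1 fixes the molar mass once n and the two terminals are chosen, the following Python sketch adds up the contributions of R1, the linking oxygen, the n ethylene oxide units and R2. The function name `hohao_molar_mass`, its parameters and the rounded atomic masses are assumptions introduced here for illustration; they do not come from the chapter.

```python
# Rounded standard atomic masses in g/mol (assumed values, for illustration only).
C, H, O = 12.011, 1.008, 15.999

# One repeating unit -CH2CH2O- contains 2 C, 4 H and 1 O.
EO_UNIT = 2 * C + 4 * H + O


def hohao_molar_mass(n, r1_mass, r2_mass):
    """Molar mass (g/mol) of R1-O-(CH2CH2O)n-R2.

    n        -- oligomerization degree; a single exact integer for a homogeneous oligomer
    r1_mass  -- molar mass of the R1 terminal in g/mol
    r2_mass  -- molar mass of the R2 terminal in g/mol
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # R1 + linking oxygen + n repeating units + R2
    return r1_mass + O + n * EO_UNIT + r2_mass


if __name__ == "__main__":
    # Hypothetical example: R1 = R2 = H, i.e. the parent PEGn diols,
    # sampled over the n = 3-20 range quoted for homogeneous PEO chains.
    for n in (3, 10, 20):
        print(n, round(hohao_molar_mass(n, H, H), 3))
```

Because n is a single, strictly monitored value in a homogeneous oligomer, the result is one exact molar mass rather than an average over a family of homologues.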

Why have unitary polyoxyethylene (PEO) chains established themselves? On the one hand, to eliminate the cumulative colloidal physico-chemical behavior of the whole group of chain homologues present in the polydisperse, heterogeneous structures (technical products) obtained by anionic polymerization; on the other hand, to delineate clearly the colloidal physico-chemical properties of each individual homologue.


#### **2. Ethylene oxide (EO) — Structure, properties, consequences**

The steadily increasing interest that polyethylene glycols (PEGn) and polyoxyethylene chains (PEO) have attracted for over a century from many researchers, laboratories and companies is largely due to the specific structure and properties of ethylene oxide [3,4]. A colorless gas at 25°C, with a sweetish odor and taste characteristic of ethers, noticeable particularly at concentrations above 500 ppm in air, it is readily soluble in water, ethanol and other organic solvents.

It is relatively thermally stable: in the absence of catalysts it does not dissociate up to 300°C, but above 570°C a major exothermic decomposition occurs.

At the beginning of the 20th century, Union Carbide inaugurated the first EO production plant, based on the air oxidation of ethylene in the presence of a metallic silver catalyst. Later, Shell Oil Co. replaced air with high-purity oxygen. The process runs at 200-300°C and 1-3 MPa, with oxidation yields of 63-75% and 75-82% for the air-based and oxygen-based routes, respectively.

The reactivity of the three-membered (two carbons, one oxygen) ether heterocycle (oxirane) (Figure 2a), explained by the "ring tension theory", favors nucleophilic attack by organic compounds bearing hydroxyl, thiol, primary and/or secondary amine, etc., functions, with cleavage of a C-O bond of the oxirane ring.

Its typical reactions are with nucleophiles and proceed via the SN2 mechanism, both in acidic media (weak nucleophiles: water, alcohols) and in alkaline media (strong nucleophiles: OH⁻, RO⁻, NH3, RNH2, RR'NH, etc.). The general reaction scheme is presented in Figure 2b.

Reactions of ethylene oxide with fatty alcohols proceed in the presence of sodium metal, sodium hydroxide or boron trifluoride and are used for the synthesis of surfactants.

**Figure 2.** The detailed structural geometry of ethylene oxide (EO) [3,4]: (a) oxirane ring (epoxide); (b) general reaction scheme with nucleophiles


The reaction is carried out in a special insulated reactor under an inert atmosphere (nitrogen) to prevent the possible explosion of ethylene oxide. Finally, the reaction mixture is neutralized, degassed and purified.

A narrow-range ethoxylated alcohol, also called a "peaked ethoxylated" alcohol, has a distribution curve that is narrower than that of the equivalent standard alcohol ethoxylate and a considerably lower content of unreacted alcohol. This gives the nonionic surfactant focused properties and a very low odor, even if it is based on a short-chain alcohol, and avoids the formulation problems often associated with standard alcohol ethoxylates [3].

The lower the degree of ethoxylation, the higher the amount of free alcohol [3].

Polyethylene glycol is produced by the reaction of ethylene oxide with water, ethylene glycol, or ethylene glycol oligomers [5]. The reaction is catalyzed by acid or base catalysts.

The size distribution can be characterized statistically by its weight average molecular weight (Mw) and its number average molecular weight (Mn), the ratio of which is called the polydispersity index (Mw/Mn). Mw and Mn can be measured by mass spectrometry.
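To make the definition concrete, the sketch below computes Mn, Mw and the polydispersity index from a discrete distribution of chain lengths, using the standard definitions Mn = ΣNiMi/ΣNi and Mw = ΣNiMi²/ΣNiMi. The sample compositions, the function names and the rounded molar masses are hypothetical and serve only as an illustration; the chains are assumed to be PEG diols H-(OCH2CH2)n-OH.

```python
EO_UNIT = 44.053     # g/mol, one CH2CH2O unit (rounded, assumed value)
END_GROUPS = 18.015  # g/mol, H and OH terminals of a PEG diol (rounded, assumed value)


def peg_mass(n):
    """Molar mass (g/mol) of the PEG diol homologue with n ethylene oxide units."""
    return n * EO_UNIT + END_GROUPS


def molar_mass_averages(chain_counts):
    """Return (Mn, Mw, PDI) for a mapping {n: number of chains with n EO units}."""
    total_chains = sum(chain_counts.values())
    if total_chains <= 0:
        raise ValueError("the distribution must contain at least one chain")
    # First and second moments of the molar mass distribution.
    sum_nm = sum(count * peg_mass(n) for n, count in chain_counts.items())
    sum_nm2 = sum(count * peg_mass(n) ** 2 for n, count in chain_counts.items())
    mn = sum_nm / total_chains   # number average
    mw = sum_nm2 / sum_nm        # weight average
    return mn, mw, mw / mn


# Hypothetical samples: a polydisperse technical product centred on n = 8
# and a homogeneous oligomer containing only the n = 8 homologue.
polydisperse = {6: 10, 7: 25, 8: 30, 9: 25, 10: 10}
homogeneous = {8: 100}

if __name__ == "__main__":
    for name, sample in (("polydisperse", polydisperse), ("homogeneous", homogeneous)):
        mn, mw, pdi = molar_mass_averages(sample)
        print(f"{name}: Mn={mn:.2f} Mw={mw:.2f} Mw/Mn={pdi:.4f}")
```

For a homogeneous oligomer every chain has the same mass, so Mn equals Mw and the index is 1; any spread in n pushes the index above 1.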

The different oligomeric and/or polymeric structures of ethylene oxide depend on the polymerization initiators. Ethylene glycol and its oligomers are preferable to water as starting materials ("lead compounds"), because they allow the creation of polymers with low polydispersity ("narrow molecular weight distribution", NMWD).

A number of recent reviews [5,6] have also covered PEGn chemistry and its applications in biotechnology and medicine, supported catalysis, aqueous two-phase systems in bioconversion, and as solvent and phase transfer catalyst (PTC) in organic synthesis.

PEGylation [7] is the covalent coupling of a PEGn to a macromolecule (*e.g.*, lipids, therapeutic proteins, etc.). Its effect is a prolongation of the biological effect (it produces a larger molecule with a prolonged half-life). PEGylation is similar to the structuring processes of heterogeneous heterobifunctional acyclic oligomers of ethylene oxide (HEHAO) (Figure 3).



**Figure 3.** Synthetic methods for the preparation of heterogeneous heterobifunctional acyclic oligomers of ethylene oxide (HEHAO): a) direct preparation from R1-OH and ethylene oxide; b) end-group derivatization of PEGn diols, HO-[CH2CH2O]n-H. b1) R1-O-[CH2CH2O]n-R1 and b3) R2-O-[CH2CH2O]n-R2: homobifunctional oligomers of ethylene oxide; b2) R1-O-[CH2CH2O]n-R2: heterobifunctional oligomers of ethylene oxide. R1, R2: derivatizing fragments.

On the basis of a "no observed adverse effect level" (NOAEL) decision, the higher homologues (PEGn-1500) have also been advised for doses of 600 mg/kg.

PEGn, PEO or POE refer to an oligomer or polymer of ethylene oxide. The three names are chemically synonymous.

Five concepts that recur in the specialized literature on polyethylene glycols, as such and/or derivatized, have together fueled the steadily increasing interest in homogeneous heterobifunctional acyclic oligomers of ethylene oxide (HOHAO):

**•** the particular physical properties of aqueous solutions of PEGn;

**•** the unique solvent properties and the coordination competence of cations present in solution;

**•** the solvent competences of liquid low-molecular-weight PEGn in chemical reactions;

**•** the employment of PEGn as such and/or derivatized as alternative PTC (phase-transfer catalysis);

**•** the acceptance of aqueous biphasic reactive extraction (ABRE) as a present phenomenon in the development of alternative processes for wood pulping and green catalytic oxidation systems.

In addition to their own phase-transfer activity, PEGn have also been employed as polymer supports for other phase-transfer catalysts (PTCs). PEGn have been modified with typical PTCs such as crown ethers, ammonium salts, cryptands, and polypodands to enhance phase transfer in two-phase reactions.

After 1960 the capacity of polyoxyethylene (PEO) chains, as such and/or derivatized, to coordinate alkali cations was also explicitly recognized. The striking analogy with crown polyethers inevitably raised questions, and provided answers, about their coordination capacity, stability and geometry. The certainties that emerged, first from scientific speculation and later from experimental evidence, preceded the conformational and geometrical interpretation of polyoxyethylene chains by several decades.

The infrared (IR) absorption characteristics of PEO-MY complexes differ from those of the individual species, but are similar to those seen in the MY-cyclic polyether systems reported earlier, suggesting that the coordination effect may be due to ion-dipole-type interactions in these systems.

This hypothesis becomes credible if one accepts the helical conformation (helix) of oligomeric PEO chains, with a minimum of 7 EO units per coil, and the electron-donor (Lewis base) character of the oxygen in the PEO chain.

### **3. Anionic polymerization of ethylene oxide**

Heterogeneous polyoxyethylene chains obtained by the anionic polymerization of ethylene oxide ("anionic ring-opening polymerization") with a polydispersity degree Mw/Mn < 1.1 are hydrophilic, flexible (specific spatial conformation), biocompatible "bridges".

Generically, an "oligomer" (*oligos* being Greek for "a few" and *mer* denoting the repeating primary structural unit, here ethylene oxide) can be defined as a molecular assembly composed of a small number of covalently linked monomeric units. Since ethylene oxide (EO) is the only monomer used, the macromolecular structure formed is a homo-oligomer (homomer).

In the cases analyzed here, oligomerization is the controlled attachment of n primary structural units of ethylene oxide within a macromolecular architecture. The boundary value of n between oligomerization and polymerization is not settled in the specialized literature: for HOHAO the range 3-20 is accepted, compared with 10-100 in general. For heterogeneous polyoxyethylene chains (polydisperse, with an average polymerization degree naverage), it is natural to assume the existence of homologous series (mixtures of oligomers and/or polymers) that share the same strictly defined structure but differ in molecular weight, in contrast to homogeneous oligomeric chains, which have a definite oligomerization degree n.

For heterogeneous polyoxyethylene (PEO) chains, a statistical distribution is accepted, expressed quantitatively through the Natta, Weibull/Nycander/Gold, Natta/Mantica, Poisson and related equations [8].
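To make the idea of a statistical distribution of chain lengths concrete, the following minimal sketch evaluates one of the named forms, the Poisson distribution, and the polydispersity it implies. It assumes the shifted Poisson form often used for living ring-opening polymerization (every chain carries at least one EO unit) and neglects end-group masses; the function names and the parameter `nu` are illustrative and do not come from the source.

```python
import math

def poisson_fractions(nu, n_max=100):
    """Mole fraction of chains with n EO units, n = 1..n_max.

    Assumed shifted Poisson form: P(n) = exp(-nu) * nu**(n-1) / (n-1)!
    where nu is the mean number of EO units added after the first one.
    """
    fractions = {}
    p = math.exp(-nu)              # P(1)
    for n in range(1, n_max + 1):
        fractions[n] = p
        p *= nu / n                # P(n+1) = P(n) * nu / n
    return fractions

def number_and_weight_average(fractions):
    """Number- and weight-average oligomerization degrees."""
    norm = sum(fractions.values())
    n_number = sum(n * p for n, p in fractions.items()) / norm
    n_weight = sum(n * n * p for n, p in fractions.items()) / (n_number * norm)
    return n_number, n_weight

# Example: a heterogeneous chain with naverage = 9 (nu = 8)
n_number, n_weight = number_and_weight_average(poisson_fractions(8.0))
polydispersity = n_weight / n_number   # equals Mw/Mn when end groups are neglected
print(f"n_number = {n_number:.2f}, Mw/Mn = {polydispersity:.3f}")
```

Under these assumptions the polydispersity is 1 + nu/(nu + 1)^2, so a chain with naverage = 9 falls just below the Mw/Mn < 1.1 limit quoted above, whereas a homogeneous chain has Mw/Mn = 1 by definition.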

### **4. Homogeneous polyoxyethylene chains — Preparation, structure, competences**

Although the structuring of homogeneous PEO chains has been a scientific challenge for more than six decades, there is still no single, rapid procedure for their synthesis, owing to preparative difficulties, complex process flows for purification and characterization and, consequently, the high cost of processing in industrial quantities.

During our research, the results obtained contributed decisively to confirming the direct participation of PEO chains in nucleophilic addition reactions (cyanoethylation, amidoethylation) (Figure 10) of polyethoxylated higher alcohols purified of free higher alcohols, polyethylene glycols (PEGn) and water: under similar conditions, the yields increase in proportion to the oligomerization degree (n) of the PEO chain [9,10].

After almost a century of investigation, it can reasonably be claimed that polyethylene glycols (PEGn) and their derivatives (glymes, oligoglymes, PEGylated compounds), like other classes of macromolecular compounds, have a secondary (conformational) and a tertiary structure (micellar macromolecular architectures) in addition to their primary structure.

The main qualities of these three-dimensional macromolecular architectures that bear on the study of HOHAO are dimensional flexibility, transfer mobility, the existence of "meander", "zig-zag", and "helix" conformations of variable geometry, free coaxial C-C/C-O rotation, and the absence of the "ring tensions" specific to rigid structures (crown polyethers).

With few exceptions, the preparation of homogeneous PEO chains, as such and derivatized, is reported with yields of 60-80% for relatively small oligomerization degrees (n=3-6) [2]. The laborious purification, complicated by "neighboring effects" ("sympathy effects") between two or more hydrophilic (polyoxyethylene) chain homologues with close physico-chemical constants, has limited the extension of synthetic efforts [10].

The main colloidal characteristics of HEHAO and HOHAO depend on their structure and on their heterogeneous or homogeneous composition, respectively. As mixtures with a wide distribution of hydrophobic (R1, R2) and hydrophilic (PEO) chain homologues, HEHAO express the cumulative colloidal behavior of each homologue present in the mixture, as well as their mutual interdependences. For this reason the experimental values of the main colloidal characteristics evaluated in the research carried out [9,10] were initially only indicative, even though they resulted from the mathematical processing of a considerable number of measurements.

The actual distribution of the PEO chain homologues, and hence of the oligomerization degrees (naverage), changes across the series naverage=3-18 from advanced symmetry for naverage≤8 to pronounced asymmetry for naverage=9-18.

The steadily increasing interest in a definite explanation of the colloidal properties of heterogeneous (polydisperse) polyoxyethylene chains is evident in the literature after the sixth decade of the 20th century, as is the research on obtaining, purifying and characterizing homogeneous polyoxyethylene chains. Two types of PEO conformation were postulated during this period: "zig-zag" and "meander" (Figure 4).

Today one can draw a unitary conclusion that eliminates the earlier, partly speculative assumptions. Major contributions in this area came from modern instrumental methods: "short distance diffraction techniques", "wide-angle-diffraction techniques" and "low-angle-diffraction techniques".

**Figure 4.** The "zig-zag" (a) and meander (b) conformations of the polyoxyethylene chain [11-15]

These constructive details about the PEO chain ("strain-free polyoxyethylene chain") and the manner of "packing" in the "macromolecular lattice" are supported by the following specifications:

**•** the "monoclinic unit cell" appears at four chain meanders;

**•** the meander has in its structure nine oxyethylene structural units (-CH2CH2-O-), *i.e.*, 4 x 9=36 total oxyethylene units in a "monoclinic unit cell";

**•** each oxyethylene unit is "twisted" relative to the neighboring structural unit, such that the main PEO chain returns to the original position at every tenth "lead/turn" (19.5 Å);

**•** the "repeat period" is identical (19.5 Å);

**•** in the "meander" conformation one oxyethylene structural unit has a length of 1.9 Å and a diameter of 4 Å, compared to the "zig-zag" conformation, where the same geometrical parameters are 3.5 Å and 2.5 Å, respectively.


Today it is accepted that the "zig-zag" conformation is specific to PEO chains with low oligomerization degrees, and the "meander" conformation to PEO chains with high oligomerization degrees.

It should be mentioned that the differences between some experimental results arise because polyoxyethylene chains were studied under different conditions: in solution; in the solid state, for average molecular weights between 2,400 and 100,000 (n=55-2300); and in the solid state, derivatized.
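The correspondence between average molecular weight and oligomerization degree quoted above can be checked with a short calculation. The sketch below assumes the diol structure HO-[CH2CH2O]n-H, with a repeat-unit mass of 44.05 g/mol and 18.02 g/mol for the two end groups; the function names are illustrative only.

```python
EO_UNIT = 44.05    # g/mol, one -CH2CH2O- repeat unit
END_GROUPS = 18.02 # g/mol, H and OH terminals of HO-[CH2CH2O]n-H

def peg_mass(n):
    """Molar mass of the PEG diol with n oxyethylene units."""
    return n * EO_UNIT + END_GROUPS

def peg_degree(mass):
    """Oligomerization (polymerization) degree for a given molar mass."""
    return (mass - END_GROUPS) / EO_UNIT

for mass in (2_400, 100_000):
    print(f"M = {mass:>7} g/mol  ->  n = {peg_degree(mass):.0f}")

# Triethylene glycol (PEG-3) as a homogeneous reference point
print(f"PEG-3: {peg_mass(3):.2f} g/mol")
```

The computed degrees (about 54 and about 2270) agree with the quoted range n=55-2300 to within rounding.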

Experimental evidence supporting these concepts, together with continuing advances in instrumental investigation, has decisively stimulated theoretical and practical interest in the synthesis of HOHAO in general, and of mono- and diderivatized homogeneous and heterogeneous polyoxyethylene chains in particular.


The rediscovery of crown polyethers and of their role as phase-transfer catalysts, together with the pronounced structural similarity between the two classes, gave an additional major impetus to the theoretical and practical conformational study of acyclic PEO chains, and suggested changes in terminology and their recognition as biomacromolecules with a major physiological role.

The homologous series of dimethylated polyethylene glycols [CH3(OCH2CH2)nOCH3; (n≥3)] also suggested the term *glyme* (oligoglyme) (*met*hylated *gly*cols). Although not yet widely accepted, there is a tendency to generalize this concept also to derivatized acyclic polyether (polyoxyethylene) chains.

What began as a simple, speculative working hypothesis able to explain phenomena or processes was confirmed by subsequent experimental studies based on X-ray investigations, electron microscopy and diffraction. These studies confirmed the capacity of the chains for intra- and interchain contraction, dependent on structure and medium, with the formation of "cavities" ("cages" of variable geometry) at oligomerization degrees (n, naverage) below and above nine ethylene oxide (EO) units, and of "sandwich" structures below three EO units [16].

The synergistic cumulation in a unitary structure of the conformational and colloidal qualities of homogeneous PEO chains with the possibility of controlled modification of the HLB (hydrophilic/hydrophobic balance) was and still is of wide theoretical and practical interest.
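As a rough illustration of how the HLB shifts when the homogeneous PEO chain is lengthened, the sketch below applies Griffin's method for nonionic ethoxylates (HLB = 20 × mass of the oxyethylene part / total molar mass). Griffin's method is a swapped-in textbook estimate and is not the procedure used in the source; the hydrophobe is taken as pure lauryl alcohol rather than the lauryl/myristyl (7/3) mixture, and all names are illustrative.

```python
EO_UNIT = 44.05          # g/mol, one -CH2CH2O- unit
LAURYL_ALCOHOL = 186.34  # g/mol, C12H25-OH (assumed hydrophobe)

def griffin_hlb(n, hydrophobe_mass=LAURYL_ALCOHOL):
    """Griffin HLB estimate for R-O-(CH2CH2O)n-H.

    Only the oxyethylene units are counted as the hydrophilic part,
    which is the usual convention for alcohol ethoxylates.
    """
    hydrophilic = n * EO_UNIT
    return 20.0 * hydrophilic / (hydrophobe_mass + hydrophilic)

# Homogeneous chain lengths mentioned in the text
for n in (3, 6, 9, 12, 18):
    print(f"n = {n:2d}  ->  HLB ~ {griffin_hlb(n):.1f}")
```

Because n is exactly defined for a homogeneous chain, the estimate refers to a single compound rather than to an average over a mixture of homologues.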

Technologies based on grafted PEGn conjugates launched products of major biological importance, for research and diagnosis (PEGn-modified proteins and liposomes, food, medical and analytical matrices), based on the accumulated knowledge on PEGn and the prospects of PEGn as biomaterial.

These considerations suggested the idea of structuring the HOHAO of the PEGn-L(R';2R) (2R';R) structured-lipids-type (Figure 5).

**Figure 5.** The structure of homogeneous heterobifunctional acyclic oligomers of ethylene oxide of the tailor-made-lipids type [10]: PEGn-L (lipids) (2R'; R) and PEGn-L (lipids) (1R'; 2R)

Figures 6-9 present the main process flow charts that formed the basis for the synthesis of homogeneous polyoxyethylene chains in HOHAO. In the stages with strongly ionic character, PTCs with homogeneous PEO chains were used, since they can activate nucleophilic agents by coordinating the alkali cation [9,10].


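**Figure 6.** Reaction scheme for the production of homogeneous polyethoxylated (n=3) lauryl/myristyl (7/3) alcohol [LM(EO)3H]: lauryl/myristyl alcohol (7/3) (LM-OH) reacts with p-toluene sulfochloride (Ts-Cl) to give lauryl/myristyl tosylate (7/3) (LM-TS) with loss of HCl; triethyleneglycol (PEG-3) is converted under inert atmosphere into monosodium triethyleneglycol (PEG-3-Na); LM-TS and PEG-3-Na then give LM(EO)3H and sodium p-toluenesulfonate.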

Homogeneous heterobifunctional acyclic oligomers of ethylene oxide (HOHAO) fall into the category of "niche" unitary organic compounds (derivatives of homogeneous polyethylene glycols, PEGn). Their synthesis, purification, and chemical and physico-chemical characterization depart from the classic heterodisperse character of the oligomerization and polymerization products of ethylene oxide (EO) through the unitary (homogeneous) structure of the polyoxyethylene (PEO) chain. This chain is "constructed" by the controlled, successive covalent grafting of lower oxyethylene units, diethyleneglycol (DEO) or triethyleneglycol (TEO), in an adapted Williamson synthesis, followed by derivatization of the two terminals ("end groups") with various fragments R1, R2 (Table 1) [9,10].

Aromatic sulphochlorides (tosyl chloride) react with higher alcohols (C8-C18), diols (PEGn), and alkyl (C8-C12) phenols (NF), respectively, in the presence of a base (pyridine), forming sulphonate esters, which are effective alkylating agents (Figures 6, 7, 9). Higher alcohols (C8-C18), diols (DEO, TEO, PEGn), and alkyl (C8-C12) phenols, respectively, react with thionyl chloride (SOCl2) in the presence of pyridine as organic base, generating the corresponding chloro derivatives (PEG2-2Cl; PEG3-2Cl; PEG6-2Cl) (Figures 8, 9). To direct the reaction toward mono- or dichlorination, one hydroxyl terminal can be protected by acetylation (Ac) with acetic anhydride (Ac2O), with the formation of PEGn-Ac; PEGn-2Ac; PEGn-Ac,Cl [9,10].

Under a controlled inert atmosphere, free of oxygen, CO2 and traces of water (Figures 6, 9), higher alcohols (C8-C18), alkyl (C8-C12) phenols and diols (DEO, TEO, PEGn) also form alcoholates or phenolates (PEG2-Na; PEG3-Na; PEG6-Na; PEG9-Na; PEG12-Na; PEG18-Na), which subsequently participate in a directed manner (Figures 7-9) in the nucleophilic substitution of chloride in the mono- and/or dichloro derivatives (PEG2-2Cl; PEG3-2Cl; PEG6-2Cl) for the elongation of the homogeneous PEO chain [9,10].
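The chain-length bookkeeping behind such an elongation can be sketched as follows. The stoichiometry is an assumption made for illustration only: one dichloro derivative PEGm-2Cl is taken to couple with two monosodium alcoholates PEGk-Na, giving a homogeneous diol with n = m + 2k oxyethylene units. The source names the building blocks but does not state which combinations were actually used, so the routes printed here are hypothetical.

```python
# Building blocks named in the text (number of EO units in each)
DICHLORIDES = (2, 3, 6)              # PEG2-2Cl, PEG3-2Cl, PEG6-2Cl
ALCOHOLATES = (2, 3, 6, 9, 12, 18)   # PEG2-Na ... PEG18-Na

def elongation_routes(target_n):
    """List (m, k) pairs with m + 2*k == target_n.

    Assumes one PEGm-2Cl reacts with two PEGk-Na (hypothetical
    stoichiometry for illustration, not taken from the source).
    """
    return [(m, k)
            for m in DICHLORIDES
            for k in ALCOHOLATES
            if m + 2 * k == target_n]

for n in (6, 9, 12, 18):
    for m, k in elongation_routes(n):
        print(f"PEG{m}-2Cl + 2 PEG{k}-Na -> PEG{n} + 2 NaCl")
```

Because every building block has an exactly defined length, each product has a single, definite oligomerization degree, which is the feature that distinguishes this route from the anionic polymerization of ethylene oxide.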

of HOHAO proper (Table 1). The common element of all the molecular architectures obtained was the directed synthesis of the homogeneous polyoxyethylene chains (n=3, 6, 9, 12, 18) (Figure 7), following the adapted Williamson variant [9,10]. This was followed by the controlled grafting (PEGylation) of fragments R1 (Figures 7, 8) and the derivatization of the second hydroxyl terminal with fragments R2 through a scheme of adapted classical reactions (Figures 10, 11). The literature reports similar structures of the nonionic-ionic type with heterogeneous polyoxyethylene chains (naverage=3) [9,10]. The present study extends the range of surface-active HOHAO to higher homogeneous polyoxyethylene chains (n=3-18).


To facilitate the presentation of the synthesized HOHAO (Figure 10), their chemical names have been encoded. The main organic functions are symbolized by the initials of their chemical names (*e.g.*, -propionitrile, PN; -primary ethylamine, EP), and the homologues of the base hydrocarbon chain by the initials of their trivial names (*e.g.*, lauryl/myristyl, LM; cetyl/stearyl, CS), followed in parentheses by the ratio (7/3), which gives their relative distribution. Hydrocarbon chains attached later by synthesis are symbolized by the number of carbon atoms they contain and are listed in natural order within the class (*e.g.*, in the cationic structure EC-1.1.16., the two methyl groups are written as (1.1.) and the hexadecyl chain as (16.)). For example, *N,N*-dimethyl-*N*-dodecyl (lauryl)-*N*-β-lauryl/myristyl (7/3) polyethyleneoxy (n=9) ethylammonium chloride has the symbol LM-(EO)9-EC-1.1.12.

The same was done for the structured-lipid HOHAO (Figure 11), which are assimilated with PEGn-L (lipids) conjugates.

Under equimolar AN/LM-OH conditions, raising the temperature within the range 25-35°C increases the cyanoethylation yields, whereas between 40-60°C the yields drop. With excess acrylic monomer the yields follow the same trend. Under these conditions the amount of acrylic oligomers formed stays below 1% between 25-40°C, independently of the excess of monomer introduced, but increases dramatically between 40-60°C. For cetyl/stearyl alcohol, under equimolar conditions or with excess monomer (AN/CS-OH), raising the temperature between 45-55°C (below this range the yields are low) favors the nucleophilic addition, the more so the larger the excess of acrylic monomer. In the range 55-70°C the yields drop under the same conditions, and the content of acrylic oligomers is higher than for lauryl/myristyl alcohol. In the cyanoethylation of homogeneous polyethoxylated (n=3) lauryl/myristyl alcohol, under equimolar conditions or with excess acrylic monomer, the addition yields increase between 25-35°C and then decrease appreciably between 45-60°C. The amount of acrylic oligomers formed follows roughly the same course as for lauryl/myristyl alcohol over the entire temperature range. For homogeneous polyethoxylated (n=3) cetyl/stearyl alcohol, under equimolar conditions or with excess monomer, the addition yields increase between 30-40°C and then drop between 45-65°C. The content of acrylic oligomers increases in proportion to the temperature and the excess of monomer. Compared with cetyl/stearyl alcohol, the cyanoethylation yields are higher at the same temperature, even below 45°C. These observations suggest that the cyanoethylation reaction is reversible (Figure 12), and that the polyoxyethylene chain favors the addition, while lengthening the hydrocarbon chain reduces the yields.

**Figure 7.** The preparation scheme of homogeneous polyethoxylated lauryl alcohol (n=3, 6, 9, 12, 18) under phase-transfer catalysis conditions (PTC5 – homogeneous β-nonylphenoxy polyethyleneoxy (n=24) methyl ether)

The process, which has a pronounced polar (ionic) character, is additionally favored by the presence of micellar phase-transfer catalysts (PTC1-5) (Figures 6-9, 13), which "sequester" the alkali cation (sodium) either in the flexible "cavity" formed by the helix of the homogeneous PEO chain for n≥8-9 (PTC1, PTC4, PTC5) or in the interchain space ("sandwich" type) for n≤3-4 (PTC2, PTC3).

#### **5. Homogeneous heterobifunctional acyclic oligomers of ethylene oxide**

**a.** Synthesis of homogeneous heterobifunctional acyclic oligomers of ethylene oxide from the category of surface-active compounds

An overview of the colloidal characteristics of the two major categories of surface-active structures (ionic and nonionic) suggested the idea of creating a new class of hybrid surface-active compounds of the nonionic-ionic type, with synergistically cumulated colloidal effects. Once the feasibility of the study had been confirmed, the HOHAO proper were structured (Table 1). The common element of all the molecular architectures obtained was the directed synthesis of the homogeneous polyoxyethylene chains (n=3,6,9,12,18) (Figure 7), following the adapted Williamson variant [9,10]. This was followed by the monitored grafting (PEGylation) with fragments R1 (Figures 7, 8) and by the derivatization of the second hydroxyl terminal with fragments R2, using a scheme of adapted classical reactions (Figures 10, 11). The literature reports similar structures of the nonionic-ionic type with heterogeneous polyoxyethylene chains (n<sub>average</sub>=3) [9,10]. The present study extends the range of surface-active HOHAO to longer homogeneous polyoxyethylene chains (n=3-18).


112 Oligomerization of Chemical and Biological Compounds


To facilitate the presentation of the synthesized HOHAO (Figure 10), their chemical names have been encoded. The main organic functions are symbolized by the initials of their chemical names (*e.g.*, propionitrile, PN; primary ethylamine, EP), and the homologues of the base hydrocarbon chain by the initials of their trivial names (*e.g.*, lauryl/myristyl, LM; cetyl/stearyl, CS), followed in parentheses by the ratio (7/3) that gives their relative distribution. The hydrocarbon chains attached later by synthesis are symbolized by the number of carbon atoms they contain, listed in natural order within the class (*e.g.*, in the cationic structure EC-1.1.16., the two methyl groups are written (1.1.) and the hexadecyl chain (16.)). For example, *N,N*-dimethyl-*N*-dodecyl (lauryl)-*N*-β-lauryl/myristyl (7/13) polyethyleneoxy (n=9) ethylammonium chloride has the symbol LM-(EO)9-EC-1.1.12.

The structured-lipid HOHAO (Figure 11), treated as PEGn-L (lipid) conjugates, were encoded in the same way.
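The encoding rules above can be sketched as a small helper. This is only an illustration of the naming convention as described in the text: the function name and its parameters are hypothetical, and the general pattern `base-(EO)n-function-chains` is inferred from the single worked example LM-(EO)9-EC-1.1.12.

```python
def hohao_symbol(base_chain, n, function, attached_chains=()):
    """Assemble a HOHAO symbol (hypothetical helper, not from the source).

    base_chain      -- initials of the base hydrocarbon chain, e.g. "LM" or "CS"
    n               -- number of ethylene oxide (EO) units in the homogeneous chain
    function        -- initials of the main organic function, e.g. "PN" or "EC"
    attached_chains -- carbon counts of chains attached later by synthesis
    """
    symbol = f"{base_chain}-(EO){n}-{function}"
    if attached_chains:
        # Each attached chain is written as its carbon count followed by a dot.
        symbol += "-" + "".join(f"{carbons}." for carbons in attached_chains)
    return symbol


# The worked example from the text: two methyl groups and one dodecyl chain.
print(hohao_symbol("LM", 9, "EC", (1, 1, 12)))  # LM-(EO)9-EC-1.1.12.
```

Keeping the attached chains as a tuple of carbon counts mirrors the way the text writes (1.1.) for two methyl groups and (16.) for a hexadecyl chain.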

Under equimolar AN/LM-OH conditions, raising the temperature from 25 to 35°C increases the cyanoethylation yields, whereas between 40 and 60°C the yields drop. With excess acrylic monomer the yields follow the same trend. Under these conditions the amount of acrylic oligomers formed stays below 1% between 25 and 40°C, regardless of the excess of monomer introduced, but increases dramatically between 40 and 60°C.

For cetyl/stearyl alcohol, under equimolar conditions or with excess monomer (AN/CS-OH), raising the temperature from 45 to 55°C (below this range the yields are low) favors the nucleophilic addition, the more so the larger the excess of acrylic monomer. Between 55 and 70°C the yields drop under the same conditions, and the content of acrylic oligomers is higher than for lauryl/myristyl alcohol.

In the cyanoethylation of homogeneous polyethoxylated (n=3) lauryl/myristyl alcohol, under equimolar conditions or with excess acrylic monomer, the addition yields increase between 25 and 35°C and then decrease appreciably between 45 and 60°C. The amount of acrylic oligomers formed follows roughly the same evolution as for lauryl/myristyl alcohol over the entire temperature range. For homogeneous polyethoxylated (n=3) cetyl/stearyl alcohol, under equimolar conditions or with excess monomer, the addition yields increase between 30 and 40°C and then drop between 45 and 65°C. The content of acrylic oligomers increases in proportion to the temperature and to the excess of monomer. Compared with cetyl/stearyl alcohol, the cyanoethylation yields are higher at the same temperature, even below 45°C.

These observations suggest that the cyanoethylation reaction is reversible (Figure 12), that the polyoxyethylene chain favors the addition, and that lengthening the hydrocarbon chain reduces the yields.
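The temperature windows reported above can be collected into a small lookup, which makes the four substrates easier to compare. This is a minimal sketch: the substrate labels and the function name are shorthand introduced here, and only the temperature ranges come from the text.

```python
# Temperature windows (in °C) in which the cyanoethylation yield is reported
# to rise or fall. The keys are shorthand labels chosen for this sketch.
YIELD_WINDOWS = {
    "LM-OH": {"rising": (25, 35), "falling": (40, 60)},       # lauryl/myristyl alcohol
    "CS-OH": {"rising": (45, 55), "falling": (55, 70)},       # cetyl/stearyl alcohol
    "LM-(EO)3-H": {"rising": (25, 35), "falling": (45, 60)},  # polyethoxylated (n=3) LM
    "CS-(EO)3-H": {"rising": (30, 40), "falling": (45, 65)},  # polyethoxylated (n=3) CS
}


def yield_trend(substrate, temp_c):
    """Return 'rising', 'falling' or 'not reported' for a substrate and temperature."""
    windows = YIELD_WINDOWS[substrate]
    # 'rising' is checked first, so a shared boundary (55°C for CS-OH) counts as rising.
    for trend in ("rising", "falling"):
        low, high = windows[trend]
        if low <= temp_c <= high:
            return trend
    return "not reported"


for name in YIELD_WINDOWS:
    print(name, yield_trend(name, 50))
```

Temperatures that fall between the reported windows are returned as "not reported" rather than interpolated, since the text gives no data for them.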



These conclusions suggest that the lauryl chain becomes more reactive than the myristyl chain as the homogeneous polyether chain lengthens.

**Table 1.** Homogeneous heterobifunctional acyclic oligomers R1-(PEO)n-R2 of ethylene oxide (selective exemplification)

| No. | R1 | n | R2 | Name | Competences | References |
|---|---|---|---|---|---|---|
| 1 | HO- | 3, 6, 9, 12, 18 | -OH | homogeneous polyethylene glycols | controlled PEGylation structures | 9,10 |
| 2 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | -OH | monoderivatized homogeneous polyethylene glycols | homogeneous polyethoxylated (C1-C20) alcohols | 9,10 |
| 3 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | propionitrile | homogeneous β-alkyl (C8-C18)/alkyl (C8-C12) aryl polyethyleneoxy propionitriles | PEGylation intermediates | 9,10,17,28 |
| 4 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | propionamide | homogeneous β-alkyl (C8-C18)/alkyl (C8-C12) aryl polyethyleneoxy propionamides | PEGylation intermediates | 9,10,29 |
| 5 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | ethyl/propyl amine | homogeneous β-alkyl (C8-C18)/alkyl (C8-C12) aryl polyethyleneoxy ethyl/propyl amines | PEGylation intermediates | 9,10,18 |
| 6 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | ethyl/propyl ammonium | homogeneous *N,N*-dimethyl-*N*-alkyl (C8-C18)/benzyl-*N*-β-alkyl (C1-C18)/polyethyleneoxy (n = 3-18) ethyl/propyl ammonium | flotation agents/phase-transfer catalysts | 9,24 |
| 7 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | salified ethyl/propyl amine | salified homogeneous β/γ-alkyl (C8-C18)/alkyl (C8-C12) aryl polyethyleneoxy ethyl/propyl amines | micellar flotation agents | 9,18,20 |
| 8 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | 2-imidazoline | homogeneous β-alkyl (C8-C18)/alkyl (C8-C12) aryl polyethyleneoxy 2-imidazolines | structuring intermediates of amphoteric surface-active compounds | 22,23,26 |
| 9 | Alk-O- / Alk-Aryl-O- | 3, 6, 9, 12, 18 | glycerol ester | glycerol esters containing carboxy ethyl/propyl polyethyleneoxy alkyl ether groups | tailor-made lipids, designer lipids, fat mimetics | 10,21,35 |

In the cyanoethylation of higher alcohols, longer reaction times favor the formation of β-alkyl-oxy-propionitriles up to 180 minutes, and the oligomerization of the acrylic monomer throughout the process. Beyond 180 minutes the cyanoethylation yields decrease, further confirming the reversible character of the nucleophilic addition under prolonged contact between the reactants (Figure 12).

**Figure 8.** Operations flow chart of the monitored structuring of homogeneous polyoxyethylene (PEO) chains (n=6-18) as such, accessing structural units of diethyleneglycol PEG2 (DEO) (2), triethyleneglycol PEG3 (TEO) (3) and/or homogeneous polyethylene glycols PEGn (n). a) mono- and diacetylation, respectively, of the homogeneous chain (PEO) (n=3) (protection); b) mono- and dichlorination, respectively, of the PEO chains protected by acetylation; c) schemes of (directed) structuring of homogeneous polyoxyethylene (PEO) chains (n=6,9,12,18) by phase-transfer catalysis (PTC1, PTC2, PTC3); PTC1 – homogeneous dimethylglyme; PTC2 – homogeneous β-alkyl (L/M) polyethyleneoxy (n=4) methyl ether; PTC3 – dicyanoethyl triethyleneglycol.
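The chain-length bookkeeping behind this directed structuring can be sketched in a few lines. The sketch assumes that each PEG alcoholate displaces exactly one chloride, so that a dichloro unit couples with two alcoholate units and a monochloro (acetyl-protected) unit with one. The function name and its parameters are hypothetical and only illustrate the arithmetic.

```python
def coupled_chain_length(alcoholate_n, chloro_n, alcoholates=2):
    """Number of EO units after coupling (illustrative arithmetic only).

    alcoholate_n -- EO units in each PEG alcoholate (PEGn-Na)
    chloro_n     -- EO units in the chlorinated PEG unit
    alcoholates  -- 2 for a dichloro derivative, 1 for a monochloro derivative
    """
    if alcoholates not in (1, 2):
        raise ValueError("expected 1 (monochloro) or 2 (dichloro) alcoholate units")
    return alcoholates * alcoholate_n + chloro_n


# Two alcoholate units plus one dichloro unit of the same size triple the chain.
for unit in (2, 3, 4, 6):
    print(f"2 x PEG{unit}-Na + PEG{unit}-2Cl -> n = {coupled_chain_length(unit, unit)}")

# One alcoholate unit plus one monochloro unit of the same size double it.
print("PEG3-Na + monochloro PEG3 -> n =", coupled_chain_length(3, 3, alcoholates=1))
```

Under these assumptions, units of 2, 3, 4 and 6 EO groups give chains of 6, 9, 12 and 18 units, which covers the n=6, 9, 12, 18 series named in the caption.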

Similar trends are observed in the cyanoethylation of nonylphenol and of homogeneous polyethoxylated nonylphenols, for the entire series of homogeneous and/or heterogeneous PEO chain homologues.

In the series of homogeneous polyethoxylated lauryl/myristyl alcohols (n=3-18), the maximum cyanoethylation yield is reached at shorter reaction times, which may suggest that the polyoxyethylene chain intervenes favorably in the cyanoethylation process (Figure 13).

Over the process as a whole, increasing the reaction time up to approx. 180 minutes favors all the reactions, including the formation of acrylic oligomers.

**Figure 10.** Reaction scheme of the processing of surface-active structures HOHAO [9,19]. Legend: R-O-(EO)n-H with n=3-18 and R=C12-C18, NF or EH; AN, acrylonitrile; AM, acrylamide; NF, nonylphenol; EH, 2-ethylhexyl; X, halide; B, inorganic and/or organic base; m=1-18.

Increasing the amount of catalyst above the optimum value (4-5 × 10<sup>-3</sup> mol/L) increases the alkalinity of the medium and promotes the oligomerization of the acrylic monomer. In the concentration range 15-50 × 10<sup>-3</sup> mol/L, the content of homogeneous β-lauryl/myristyl (7/3) polyethyleneoxy (n=3) propionitrile decreases, in parallel with a sharp rise in the acrylic oligomer content. In the series of polyoxyethylene chain homologues, the maximum cyanoethylation yield is reached at higher catalyst concentrations. At the same catalyst concentration, lengthening the polyether chain significantly increases the nucleophilic addition yields and also reduces the amount of acrylic oligomers formed, probably because the acrylic monomer is solvated in the polyether chain [9,10].

**Figure 9.** Flow chart of the processing and purification of homogeneous polyoxyethylene (PEO) chains (n=3, 6) monoderivatized with nonylphenol (NF) by condensation of tosylated nonylphenol (NF-TS) with monosodium protected (acetylated) diethyleneglycol under phase-transfer catalysis conditions; PTC4 – β-alkyl (L/M) polyethyleneoxy (n=16) ethylamine

The formation of acrylic oligomers was avoided by introducing ferrous cations, here as anhydrous ferrous sulphate (FeSO4), which inhibit the polymerization of the acrylic monomer. With 1% of the inhibitor, the cyanoethylation yield of homogeneous lauryl/myristyl (7/3) alcohol increases by more than 10%, with no formation of acrylic oligomers. Similar results are obtained in the homologous series of homogeneous (C12-C18) polyethoxylated (n=3-18) alcohols, and also for nonylphenol as such and for polyethoxylated nonylphenols, over the entire series of PEO chain homologues [9,10]. For the same size of the homogeneous polyoxyethylene chain, lengthening the hydrocarbon chain reduces the cyanoethylation yields through unfavorable steric effects.


The polyoxyethylene chain, through its specific conformation, intervenes in the cyanoethylation process in non-polar reaction media (toluene, etc.) by activating the nucleophile. As a result, for polyethoxylated higher alcohols with the same hydrocarbon chain but a variable (n=3-18) polyoxyethylene chain, the cyanoethylation yields increase as the polyoxyethylene chain lengthens.

The hydrocarbon chains of the higher alcohols, or more generally of the β-R-oxy-propionitriles present in the process, generate "steric hindrance" phenomena through their length or three-dimensional arrangement (Figure 14), which reduce the overall rate the more so the longer the chain. The polyoxyethylene chains, through their conformation, favor the formation of non-solvated nucleophiles, which accelerates the cyanoethylation. For this reason cyanoethylation is also favored by the presence of phase-transfer catalysts, which activate the nucleophile in the "reverse" micelle medium [9,25]. Unprotected glymes (polyethylene glycols, n=3-30), unlike protected ones (dicyanoethylated homogeneous polyethylene glycols, n=3,6,9,12,18, and homogeneous β-lauryl-polyethyleneoxy (n=3,6,9,12,18) propionitriles), increase the monomer consumption along with the reaction rate. Determining the partial reaction order with respect to the glymes allowed an indirect estimate of the size of the elementary coordination "cell" of the alkali cation, at a value of 8-9, and the thermodynamic expression of the rate constant likewise allowed the calculation of the "reverse" micelle-processing medium phase-transfer free energy [9].

**Figure 11.** Overall processing scheme of conjugates PEGn-L (2R';R) (R';2R) [10]: (a) controlled modification of homogeneous polyoxyethylene (PEO) chains (n=3, 6, 9, 12, 18), monoderivatized R (NF; EH), by cyanoethylation and total acid hydrolysis [R(EO)nPC]; (b) actual processing of conjugates PEGn-L (2R';R) (R';2R). (\*) Saponified lipid fraction of dog rose, wild sweet chestnut, grape and coriander seeds/fruits.

As a reversible reaction, cyanoethylation is influenced by temperature, time, excess reagent (monomer) and the addition products. Secondary products present in the unpurified technical raw materials (higher alcohols, polyethylene glycols, traces of water) and oligomers of AN lower the yields of nucleophilic addition in the synthesis of HOHAO by consuming monomer and complicating subsequent purification. The presence of oligomerization inhibitors favors the yields.

**Figure 12.** Mechanisms of the cleavage of the "ether bridge" formed in β-nonylphenoxypolyethyleneoxy (n=3-18) propionitriles under prolonged contact with the basic catalyst [CH3O<sup>-</sup>Na<sup>+</sup>]

The partial hydrolysis of homogeneous β-R-polyethyleneoxy-propionitriles (HOHAO) is a heterogeneous process because of the limited water solubility of the homogeneous polyethyleneoxy (n=0-18) nitriles and of their corresponding amides. The low reaction temperature, together with the reduced solubility and the waxy consistency of the homogeneous β-R-polyethyleneoxy (n=0-18) propionamides, seriously impedes high hydrolysis yields. Large amounts of water or high reaction temperatures increase the yields of total hydrolysis, with formation of the corresponding homogeneous β-R-polyethyleneoxy (n=0-18) propionic acids. In the research carried out, the nitriles were partially hydrolyzed with 90% sulphuric acid in the temperature range 0-15°C.
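The two hydrolysis stages described above can be written out as a sketch. The structures follow the R-O-(EO)n- notation used in Figure 10. The balanced form of the second step, with ammonium as the by-product in acid medium, is standard nitrile and amide chemistry and is not taken from the source.

```latex
\begin{align*}
\text{partial hydrolysis:}\quad
  \mathrm{R{-}O{-}(EO)_n{-}CH_2CH_2CN} + \mathrm{H_2O}
  &\xrightarrow{\;\mathrm{H^+}\;}
  \mathrm{R{-}O{-}(EO)_n{-}CH_2CH_2CONH_2} \\
\text{total hydrolysis:}\quad
  \mathrm{R{-}O{-}(EO)_n{-}CH_2CH_2CONH_2} + \mathrm{H_2O} + \mathrm{H^+}
  &\longrightarrow
  \mathrm{R{-}O{-}(EO)_n{-}CH_2CH_2COOH} + \mathrm{NH_4^+}
\end{align*}
```

Stopping at the amide is what calls for the mild conditions quoted in the text (90% sulphuric acid, 0-15°C), while excess water or higher temperatures carry the reaction through to the acid.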

Homogeneous β-R-polyethyleneoxy-propionic acids were also obtained through the acid-catalyzed exhaustive hydrolysis of homogeneous β-R-polyethyleneoxy-propionitriles. Depending on how the reaction products are processed, two classes of HOHAO are obtained: the free acids or their salts (soaps), R-(EO)n-PC, with confirmed surface-active properties [19,22,23]. The evolution of the total hydrolysis yield is determined by the heterogeneity of the reaction medium (biphasic system); micellar catalysis in emulsion and phase-transfer phenomena play a major role in the development of the processes [22].

**Figure 13.** Scheme of the principle of coordination in non-polar solvents of alkali cations in the cavity of the polyoxyethylene chain with 8-9 oxygen atoms of β-R-polyethyleneoxy-propionitriles


The nature of the acid catalyst HY (Y = Cl⁻; HOSO3⁻; CH3C6H4SO3⁻, paratoluenesulphonic acid (TS); C12H25C6H4SO3⁻, dodecylbenzenesulphonic acid (DBSH)) influences the hydrolysis yields favorably through the strength of the acid. While concentrated sulphuric acid at low temperatures gives predominantly homogeneous β-substituted propionamides, hydrochloric acid above 80°C gives the corresponding propionic acids. Under paratoluenesulphonic and/or dodecylbenzenesulphonic acid catalysis, the yields of homogeneous β-R-polyethyleneoxy-propionamides increase at low temperatures for low molar ratios HY/R-(EO)nPN, also on account of the homogenizing effect of these surface-active structures [22].

Increasing the temperature and the molar ratio HY/propionitrile favors the total hydrolysis yields. The amount of acid selectively influences the hydrolysis of nitriles. Thus, excess hydrochloric acid favors the formation of homogeneous β-substituted propionic acids, alongside their esters with higher alcohols that are either present as impurities or originate from the acid cleavage of "ether bridges" formed in the cyanoethylation or amidoethylation process or already present in the homogeneous polyoxyethylene chain, while excess *n*-dodecylbenzenesulphonic acid preferentially favors the formation of propionamides, and less so of propionic acids, without formation of the corresponding esters [28,29].

**Figure 14.** Three-dimensional conformation of HOHAO with structures generating difficulties of "packing" at hydrophilic and/or hydrophobic interfaces


In all cases, above 80°C the content of homogeneous β-R-polyethyleneoxy (n=0-18) propionitriles decreases markedly, regardless of the acid catalyst used, and propionate esters are observed. Above 110°C, in the presence of hydrochloric acid, increasing amounts of higher alcohol are observed, due to the cleavage of "ether bridges". Increasing the reaction time favors total acid hydrolysis. At high temperatures (over 80°C) in the presence of hydrochloric acid, the β-substituted propionitrile disappears from the reaction mixture after approx. 60 minutes, and the intermediate propionamide after 90 minutes. In the presence of TS, at the same temperature and amount of water, traces of propionitriles and propionamides can still be found after 180 minutes, and total conversion is generally not reached.

Hydrolysis of homogeneous β-R-polyethyleneoxy (n=3-9) propionitriles in the presence of free homogeneous polyethoxylated (n=3-9) higher alcohols (C12-C18) gives high hydrolysis yields. In parallel, the content of propionic ester increases, the more so the shorter the homogeneous polyoxyethylene chain of the polyethoxylated (n=3-9) higher alcohol introduced.
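The time course reported for hydrochloric acid above 80°C (nitrile gone after about 60 minutes, intermediate amide after about 90 minutes) is the pattern expected from two consecutive steps, nitrile → amide → acid. As a minimal sketch, and assuming that both steps are pseudo-first-order (water and acid in excess), which the text does not state, the three fractions can be followed with the standard closed-form solution. The rate constants `K1` and `K2` are hypothetical and are not fitted to the chapter's data.

```python
import math


def consecutive_first_order(a0, k1, k2, t):
    """Return (nitrile, amide, acid) amounts at time t for A -> B -> C.

    Assumes two irreversible first-order steps with k1 != k2.
    """
    if math.isclose(k1, k2):
        raise ValueError("k1 and k2 must differ for this closed form")
    nitrile = a0 * math.exp(-k1 * t)
    amide = a0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    acid = a0 - nitrile - amide  # mass balance
    return nitrile, amide, acid


# Hypothetical rate constants in 1/min, chosen only for illustration.
K1, K2 = 0.08, 0.05

for minutes in (0, 30, 60, 90, 180):
    nitrile, amide, acid = consecutive_first_order(1.0, K1, K2, minutes)
    print(f"{minutes:4d} min  nitrile={nitrile:.3f}  amide={amide:.3f}  acid={acid:.3f}")
```

With these illustrative constants the nitrile fraction falls below the amide fraction well before the acid plateau is reached, which is qualitatively the sequence described for hydrochloric acid; a slower second step would mimic the traces of amide still found after 180 minutes with TS.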

The favorable effect of homogeneous polyethoxylated (n=3-9) higher alcohols (C12-C18) with increasing polyoxyethylene chain length on the total acid hydrolysis of propionitriles implies the existence of micellar catalysis (n≥9) or of emulsion phenomena (for n=3-9) [9].

**Figure 15.** Probable mechanism for the exhaustive acid hydrolysis of β-nonylphenoxy polyethyleneoxy (n=3-18) propionitriles (β-substituted propionamide; β-substituted propionic acid) [22,23]

Under the same conditions, increasing the length of the homogeneous polyoxyethylene chain in the nitrile subjected to hydrolysis favors the yields, owing to its stabilizing effect on the hydrolysis intermediates. Increasing the length of the hydrocarbon chain lowers the total acid hydrolysis yield, probably for steric reasons similar to those in cyanoethylation.

Because the hydrolysis of β-substituted propionitriles is a heterogeneous process that takes place both at the interface between the two phases (water/organic) and inside the two phases, owing to the mutual solubility of the two reagents, the use of diverse structures of the aforementioned phase-transfer catalysts from the class of the cationic HOHAO synthesized [*N,N,N*-trimethyl-*N*-β-lauryl/myristyl (7/3) oxy-ethylammonium chloride, LM-O-EC-1.1.1] increases the total hydrolysis yields without cleavage of the "ether bridges", even under mild reaction conditions [28,29].

**b.** Synthesis of homogeneous heterobifunctional acyclic oligomers of ethylene oxide in the category of customized (structured) lipids.

In the series of efforts to diversify the macromolecular architectures, structured HOHAO lipids similar to the conjugates PEGnL (Figure 5) were obtained, with composition and physico-chemical and functional characteristics specific to the physiological benefits of cell membranes.

An adapted classical reaction scheme was followed, with these steps (Figure 14):

**•** synthesis, purification and characterization of the homologous series of homogeneous β-alkyl (EH) / alkyl aryl (NF) polyethyleneoxy (n=0-18) propionic acids, the HOHAO presented above, R(EO)nPC (Figure 10);

**•** solid/liquid extraction in petroleum ether (b.p.=30-60°C) of glycerides from the ground seed material of coriander (*Coriandrum sativum*, R'co), grapes (*Vitis vinifera*, R's), dog rose (*Rosa canina*, R'm), and nuts (fruits) of wild sweet chestnut (*Castanea vesca*, R'ca), respectively.

In all the variants studied, unsaturation in variable proportions is confirmed, through the higher acids oleic (C18;1Δ), linoleic (C18;2Δ) and linolenic (C18;3Δ).

The content of saturated higher acids, approx. 10.1% C16 in grapes (s), 10.9% C16 in wild sweet chestnut (ca), 3.18% C16 in dog rose hips (m) and 2.7% C16 in coriander (co), does not change the "fingerprint" of the vegetable lipid fractions rich in ω3 acids, which are of major interest in the composition of functional lipids reported in recent decades.

In the saponified lipid fractions from grapes and dog rose hips the C18;2Δ acids predominate, with shares of 57% and 83%, respectively. The preparation scheme of PEGn-L continues with:

**•** separation of unsaponifiables, followed by the exhaustive acid hydrolysis (HCl) of free higher acids in R'co; R'm; R'ca; R's, their purification and gas-chromatographic characterization;

**•** directed esterification of glycerol mono- and/or diprotected with R(EO)nPC or free higher acids R'co; R'm; R'ca; R's, respectively.

**c.** Fundamental colloidal competences of surface-active HOHAO


Two basic surface-active properties were evaluated, suggesting potential directions for exploiting the synthesized HOHAO depending on the structural elements of the respective homologous series (hydrophilic-hydrophobic index, HLB; surface tension and critical micelle concentration, respectively) [33]. The determinations allowed structure-surface activity correlations to be formulated [30].

Surface tension, as a form of free energy (expressed in N/m or dyne/cm) independent of the shape of the interface separating two phases in a system, is a function of temperature, time, and the structural characteristics of the HOHAO considered. Because the interface equilibrium is established within a short period, the existence of both a static and a dynamic surface tension is accepted [34]. The latter manifests itself in aqueous floats of HOHAO and subsequently acquires great practical significance in their actual use. Two aspects can be distinguished:

**•** the capacity of HOHAO to reduce the surface tension, expressed as the concentration required to achieve a given reduction of the surface tension;

**•** the effectiveness of HOHAO, expressed as the minimum value to which the surface tension can be reduced.
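The two aspects can be read off a measured series of surface tension against concentration. The sketch below is only an illustration under stated assumptions: the data points are hypothetical, the 20 dyne/cm target reduction is an arbitrary choice, and the interpolation on a logarithmic concentration scale is a convenience rather than the procedure used in the chapter. It also shows the unit relation 1 dyne/cm = 1 mN/m = 0.001 N/m.

```python
import math

DYNE_PER_CM_TO_N_PER_M = 1e-3  # 1 dyne/cm = 1 mN/m


def capacity_concentration(conc, sigma, sigma_solvent, reduction):
    """Concentration at which sigma has dropped by `reduction` below the solvent value.

    Uses linear interpolation of sigma against log10(concentration) and
    returns None when the target is not bracketed by the measured points.
    """
    target = sigma_solvent - reduction
    points = sorted(zip(conc, sigma))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo >= target >= s_hi and s_lo != s_hi:
            frac = (s_lo - target) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


def effectiveness(sigma):
    """Lowest surface tension reached in the measured series."""
    return min(sigma)


# Hypothetical measurements: concentration in mol/L, sigma in dyne/cm.
conc = [1e-5, 1e-4, 1e-3, 1e-2]
sigma = [68.0, 56.0, 40.0, 38.0]

c_target = capacity_concentration(conc, sigma, sigma_solvent=72.0, reduction=20.0)
print("capacity (mol/L):", c_target)
print("effectiveness (N/m):", effectiveness(sigma) * DYNE_PER_CM_TO_N_PER_M)
```

A lower capacity concentration means less surfactant is needed for the same effect, while a lower effectiveness value means a lower surface tension can ultimately be reached; the two need not vary in the same direction within a homologous series.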

Some of the colloidal characteristics of the class are commented on selectively here. Following the evolution of the surface tension in the homologous series of cationic HOHAO synthesized, the following can be stated:

**•** for the same n and R2, R2 = -N(CH3)2CmH2m+1, m=1-18, the capacity to reduce the surface tension increases in the series LM<CS, and the effectiveness of reduction decreases in the order LM>CS, as a result of the increase of the length of the hydrocarbon chain R1;

**•** for the same n and R1, the capacity to reduce the surface tension decreases in the series of chain homologues R2, and the effectiveness of reduction increases with the R2 chain's length, due to the movement of the hydrophilic ionic polar group towards the center (Figure 14);

**•** for the same R1 and R2, the effectiveness of reducing the surface tension increases in the order (EO)0 < (EO)3 < (EO)6 < (EO)9 ... < (EO)20, while the capacity to reduce the surface tension decreases in the same order, (EO)0 > (EO)6 ... > (EO)20, due to the increased hydrophilic (polar) character of the cationic structure and its solubility.

The most pronounced capacity to reduce the surface tension, found in the homologous series of *N,N*-dimethyl-*N*-alkyl (C1-C4)-*N*-β-cetyl/stearyl (7/3) oxy-ethylammonium chlorides and *N,N*-dimethyl-*N*-alkyl (C1-C4)-*N*-β-cetyl/stearyl (7/3) polyethyleneoxy (n=3-6) ethylammonium chlorides, in comparison with the *N,N*-dimethyl-*N*-alkyl (C8-C16)-*N*-β-lauryl/myristyl (7/3) oxy-ethylammonium chlorides and *N,N*-dimethyl-*N*-alkyl (C1-C4)-*N*-β-lauryl/myristyl (7/3) polyethyleneoxy (n=3-18) ethylammonium chlorides, can be explained by the difference in the intensity of repulsion between the ionic and nonionic polar hydrophilic groups identically oriented at the separation interface [9]. In the first case the role of the entropic factor diminishes significantly on account of the increase of the micellization free energy, while in the second case it increases due to the reduction of the micellization free energy at the separation interface, with the consequence that increased numbers of surface-active cationic molecules (micelles) accumulate. Another explanation can be found in the relationship between the capacity to reduce the surface tension and the concentration of the surfactant at the liquid-air interface, the latter being decisively influenced by the free energy of diffusion from the aqueous float to the interface and by the free energy of formation of cationic HOHAO micelles. The decrease of the micelle formation entropy corresponds to an advanced ordering in the micelle, which accumulates around itself a greater number of water molecules the higher the polarity of the ionic hydrophilic group.

The movement of the ionic or nonionic polar group towards the center of the cationic structure is followed by a pronounced reduction of the entropic effect in solution (the effective length decreases along with the degree of "packing" at the interface), and therefore of the capacity to reduce the surface tension [9].

In the homologous series of *N,N*-dimethyl-*N*-alkyl (C1-C16)-*N*-β-alkyl (C12-C18) polyethyleneoxy (n=0-20) ethylammonium chlorides, the value of the critical micelle concentration (CMC) under the same environmental conditions is influenced by:

**•** the increase of the hydrocarbon chain's R1 length (R2 and n identical), which lowers the CMC value by reducing the solubility;

**•** the increase of the hydrocarbon chain's R1 length (R2 and n identical), which lowers the CMC value due to the reduction of the degree of "packing" at the interface;

**•** the increase of the hydrocarbon chain's length (R1 and R2 identical), which lowers the CMC value through the high hydrophilicity of the HOHAO structures.

**d.** Fundamental colloidal competences and coordination (sequestration) competences of homogeneous heterobifunctional acyclic oligomers of ethylene oxide from the category of customized (structured) lipids

The numerous hopes for obtaining and technologically implementing the conjugates PEGn-L(2R';R)(R';2R), due primarily to their diversified structure, were based on their potential colloidal qualities: colloidal (micellar) solubility, interface phenomena, and interfacial surface tension (σ), correlated with the critical micelle concentrations (CMC).

In the structure of customized lipids the independent variable, the homogeneous oligomerization degree (n), allows, within a homologous series, the controlled modification of the hydrophilic/hydrophobic balance (HLB index), and implicitly of the range of later practical applications.

From the evaluation of HLB values the following can be stated:

**•** increasing the homogeneous oligomerization degree (n) for the same hydrocarbon chain R and/or R', for either the PEGn-L (R';2R) or the PEGn-L (2R';R) conjugates, produces a more prominent increase of the HLB index in the first case, regardless of the nature of R and/or R';

**•** increasing the share of hydrocarbon chain R' relative to R also reduces the HLB index value;

**•** modifying the chain R(NF;EH) for the same structure of conjugate PEGn-L and the same homogeneous oligomerization degree (n) causes variations in the HLB index due to the structural differences between the two chains;

**•** modifying the hydrocarbon chain R' (R'ca; R's; R'm; R'co) for the same homologous series of conjugate PEGn-L and homogeneous oligomerization degree (n) does not bring about significant variations of the HLB index either, for the same reasons;

**•** there is good agreement between the HLB values determined [10] and calculated [30], which justifies the premises and the colloidal operating strategies;

**•** in the series of the 48 PEGn-L conjugates studied, the overall range of variation of the HLB index is 2.5-14, which allowed their indicative grouping into the following classes: HLB=1-4: lipophilic (insoluble in water), non-dispersible; HLB=4-6: partially dispersible (partially soluble in water); HLB=6-8: unstable microemulsions (after vigorous mechanical stirring); HLB=8-10: stable microemulsions (after gentle mechanical stirring); HLB=10-13: translucent (opalescent) to clear in the upper area of the domain; HLB>13: soluble, transparent.
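The indicative HLB grouping reported for the 48 PEGn-L conjugates (1-4, 4-6, 6-8, 8-10, 10-13, above 13) can be written as a small lookup. This is a sketch, not part of the original work: the function name is hypothetical, and because the quoted ranges share their end points, the assignment of a boundary value to the lower class is an assumption made here.

```python
from bisect import bisect_left

# Upper limits of the first five classes; each limit is treated as inclusive (assumption).
UPPER_BOUNDS = [4, 6, 8, 10, 13]
CLASSES = [
    "lipophilic, non-dispersible (insoluble in water)",
    "partially dispersible (partially soluble in water)",
    "unstable microemulsion (after vigorous mechanical stirring)",
    "stable microemulsion (after gentle mechanical stirring)",
    "translucent (opalescent) to clear",
    "soluble, transparent",
]


def dispersibility_class(hlb):
    """Map an HLB index to the indicative dispersibility class quoted in the text."""
    if hlb < 1:
        raise ValueError("HLB below the range quoted in the text")
    return CLASSES[bisect_left(UPPER_BOUNDS, hlb)]


for value in (2.5, 7.0, 12.86, 14.0):
    print(value, "->", dispersibility_class(value))
```

The two extremes of the reported range, 2.5 and 14, fall in the first and last class respectively, which is why conjugates at the low end need a different handling from those at the high end.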


Surface tension (σ) is a function of temperature, duration and structural characteristics also in the series of conjugates PEGn-L (2R';1R) (R';2R).

**•** the coordination competences can also be extrapolated through the structural parameter ionic radius (Å) to a coordination number (N.C.). The correlation relative to the sodium cation (centered cubic lattice with N.C.=6) took into account that in the case of transitional metal cations the evaluated interatomic distances are smaller due to the polarizability of the anions under the influence of proper metal cations, and that the atomic volume of alkali and

Ethylene Oxide Homogeneous Heterobifunctional Acyclic Oligomers

http://dx.doi.org/10.5772/57610

127

**•** the preliminary study of the coordination competences for certain transitional metal cations (known promoters of lipid autooxidation) has technological importance in avoiding and/or eliminating autooxidation processes due to the high degree of unsaturation of the hydro‐

**•** the phase-transfer competences assessed comparatively through the partition coefficient values KD (1,2-dichloroethane) and/or K'D (isooctane) suggest that the hydrophilicity of conjugates PEGn-L(2R';1R) (R';2R) favors the interphase distribution of the "host-guest"

**•** because there are no significant differences between the KD and K'D values, the evaluated

**•** the phase-transfer competences for the same cumulative homogeneous oligomerization degree (nC) are lower for the conjugates PEGn-L(2R';1R) (R';2R) with hydrocarbon chains R(EH) compared with R(NF) due to differences in hydrophobicity [EH(C8) < NF(C15)] and

The comparative evaluation of the colloidal experimental data of conjugates PEGn-L (2R';1R) (R';2R) themselves revealed that not all the structures obtained offer directly potential colloidal and coordination competences due to the different degree of dispersibility in aqueous media

Their exclusion from further tests does not represent the acceptable technical solution, which

**•** the study of cumulated colloidal competences in mixed systems PEGn-L (2R';1R) (R';2R) (HLB=1-8) with homogeneous polyoxyethylene chains (n=9,18), monoderivatized

**•** the study of cumulated colloidal competences in mixed systems PEGn-L (2R';1R) (R';2R)

In both variants it was counted on the cumulative properties recognized in the specialized literature, similar for mixtures with different proportions of colloidal (surface-active) com‐ pounds, but also of the structural units composing them. Thus, in order to shift the cumulative hydrophilic/hydrophobic balance (HLBc) towards increasing hydrophilic character, it was originally resorted to homogeneous polyoxyethylene (PEO) chains (n=3-18) monoderivatized R(EH;NF) with high HLB values (HLBEH9=15.35; HLBNF9=12.86; HLBEH18=17.37; HLBNF18=14.24). In this respect we can also admit the existence of a cumulated homogeneous oligomerization degree (nc), exemplified randomly by PEG3-L (Rs;2NF) (nc=6) or PEG18-L (Rs;2NF) (nc=36).

experimental data confirm the reality of the phase-transfer processes studied;

alkaline-earth cations is larger than that of transitional cations;

(PEO chain-metal cation) systems in industrial processing;

(HLB=1-8) / PEGn-L (2R';1R) (R';2R) (HLB=9-13) (HLB>13).

[2EH(2C8) < 2NF(2C15)], respectively.

(HLB=1-8) (increased lipophilicity).

suggested two future work strategies:

R(EH;NF);

carbon chains (R'ca; R's; R'm; R'co) in conjugates PEGn-L(2R';1R) (R';2R);

Surface tension (σ) is a function of temperature, duration and structural characteristics also in the series of conjugates PEGn-L (2R';1R) (R';2R).

The evaluation of this colloidal characteristic, in close connection with the critical micelle concentration (CMC), sought to assess the ability of these structures to reduce the surface tension overall, and also the effectiveness of providing a minimum surface tension for a given concentration.

From the comparative interpretation of the experimental data the following correlations can be formulated:

**•** for the same homogeneous oligomerization degree (n) and the same hydrocarbon chain (R'), the capacity to reduce the surface tension (σ) increases in the series R(EH)→R(NF), while the effectiveness of reduction decreases in the order R(NF)→R(EH), probably due to differences in chain length, C8 (EH) and C15 (NF), respectively;

**•** for the same homogeneous oligomerization degree (n) and the same hydrocarbon chain (R), the capacity to reduce the surface tension (σ) decreases significantly in the hydrocarbon chain series R' (R'co ≥ R'm > R'ca > R's), while the effectiveness of reduction of (CMC) increases in the same order, probably due to the movement of the hydrophilic polar groups towards the center of the structure of conjugates PEGn-L (Figure 14);

**•** for the same hydrocarbon chains R and R', the capacity to reduce the surface tension (σ) decreases with increasing homogeneous oligomerization degree (n), while the effectiveness of reduction of (CMC) increases in the same homologous series of polyoxyethylene (PEO) chains (n=3-18), probably due to the intensification of the hydrophilic character of conjugates PEGn-L, and of the solubility in the polar medium (water).

From the comparative interpretation of the experimental data on the coordination characteristics the following can be stated [10]:

**•** the sequestration (coordination) competences of the conjugates PEGn-L(2R';1R) (R';2R) depend on their concentration (below and/or above CMC) in the processing environment;

**•** the coordination competences in the homologous series of derivatized homogeneous polyoxyethylene (PEO) chains (n=3,9,18) studied can be correlated with the values of the main colloidal parameters: HLB index, cumulative homogeneous oligomerization degree (nc), surface tension (σ);

**•** the coordination competences depend on the ionic radius (Å) in the series of metal cations studied: Mg<sup>2+</sup>(0.65), Na<sup>+</sup>(0.95), Ca<sup>2+</sup>(0.99), K<sup>+</sup>(1.33) [10], below and above the CMC values of the respective conjugates PEGn-L(2R';1R) (R';2R); the premises formulated in the specialized literature on the similarity of the geometric coordinates of the conformational "host site" (diameter, radius, area etc.) of the homogeneous polyoxyethylene (PEO) chains (n≥9) with the geometric coordinates of the "guest" (ionic radius, diameter of metallic cations) are confirmed;

**•** the coordination competences follow the same trend in the case of coordination of transition metal cations Ni<sup>2+</sup>(0.69), Co<sup>2+</sup>(0.72), Mn<sup>2+</sup>(0.80);

**•** the coordination competences can also be extrapolated through the structural parameter ionic radius (Å) to a coordination number (N.C.). The correlation relative to the sodium cation (centered cubic lattice with N.C.=6) took into account that in the case of transition metal cations the evaluated interatomic distances are smaller due to the polarizability of the anions under the influence of the respective metal cations, and that the atomic volume of alkali and alkaline-earth cations is larger than that of transition metal cations;


The comparative evaluation of the colloidal experimental data of the conjugates PEGn-L (2R';1R) (R';2R) themselves revealed that not all the structures obtained directly offer potential colloidal and coordination competences, owing to their different degrees of dispersibility in aqueous media (HLB=1-8) (increased lipophilicity).

Excluding them from further tests is not an acceptable technical solution, which suggested two future work strategies:


In both variants, the approach relied on the cumulative properties recognized in the specialized literature, which hold for mixtures with different proportions of colloidal (surface-active) compounds as well as for the structural units composing them. Thus, in order to shift the cumulative hydrophilic/hydrophobic balance (HLBc) towards increasing hydrophilic character, homogeneous polyoxyethylene (PEO) chains (n=3-18) monoderivatized R(EH;NF) with high HLB values (HLB<sub>EH9</sub>=15.35; HLB<sub>NF9</sub>=12.86; HLB<sub>EH18</sub>=17.37; HLB<sub>NF18</sub>=14.24) were used initially. In this respect we can also admit the existence of a cumulated homogeneous oligomerization degree (nc), exemplified by PEG3-L (Rs;2NF) (nc=6) or PEG18-L (Rs;2NF) (nc=36).

Three categories of conjugates PEGn-L (2R';1R) (R';2R) have been selected in the intervals HLB=1-4, HLB=3-6 and HLB=6-8, respectively, each with eight representatives, to which 1%, 10% and 20%, respectively, of homogeneous polyoxyethylene (PEO) chain (n=3-18) monoderivatized R(EH;NF) were added in a controlled manner, corresponding to the hydrocarbon chain R in the conjugates PEGn-L (2R';1R) (R';2R) (HLB=1-8) evaluated. The HLBc values (Pearson rule), verified experimentally by sampling, agreed within the error limit of the confidence interval (± 1%).
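As a rough illustration of how a cumulative balance of this kind might be estimated, the sketch below assumes simple mass-fraction additivity of HLB values. This is a common approximation and may differ in detail from the "Pearson rule" used in the chapter. The HLB values of the monoderivatized PEO chains are those quoted above; the conjugate HLB of 5.0 and the function name `cumulative_hlb` are hypothetical and serve only as an example.

```python
def cumulative_hlb(components):
    """Mass-fraction-weighted mean HLB of a mixture.

    components: list of (hlb, mass_fraction) pairs; fractions must sum to 1.
    Assumes linear additivity of HLB values, which is an approximation.
    """
    total = sum(frac for _, frac in components)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mass fractions must sum to 1")
    return sum(hlb * frac for hlb, frac in components)


# HLB of the R(EH)-monoderivatized PEO chain (n=18) quoted in the text
HLB_EH18 = 17.37

# Hypothetical conjugate HLB inside the 1-8 range (illustrative value only)
HLB_CONJUGATE = 5.0

# Addition levels of 1%, 10% and 20%, as in the text
for added in (0.01, 0.10, 0.20):
    hlbc = cumulative_hlb([(HLB_CONJUGATE, 1.0 - added), (HLB_EH18, added)])
    print(f"{added:.0%} PEO chain added -> HLBc = {hlbc:.2f}")
```

Under these assumptions, even a 20% addition of the highly hydrophilic chain leaves the hypothetical conjugate below HLB 8, which suggests why the size of the shift depends strongly on the starting HLB of the conjugate.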

#### **6. Conclusions and perspectives**

The typical colloidal characteristics, surface tension (σ), critical micelle concentration (CMC) and hydrophilic-hydrophobic balance (HLB), were correlated and interpreted with the structural parameters for the 18 conjugates PEGn-L(2R';1R) (R';2R) nominated and subsequently associated in binary and ternary systems, overall finding that:

**•** increasing the homogeneous oligomerization degree (n) and the unsaturation in the chains R' (R's; R'ca; R'm; R'co) 1Δ→2Δ→3Δ favors the hydrophilicity of conjugates PEGn-L(2R';1R) (R';2R);

**•** between the chains R(EH) and R(NF) differences in the colloidal behavior exist and manifest themselves noticeably in favor of increasing the hydrophobic character of conjugates [R(EH)(C8) < R(NF)(C15)], due to the movement of the homogeneous polyoxyethylene (PEO) chains (n=3-18) towards the center of the structure of conjugates [R(NF)(C15) < R(EH)(C8)];

**•** the exclusive modification of the ratio R'/R affected the interface and transfer colloidal characteristics (R' > R: decrease; R' < R: increase), probably due to "steric restrictions" adversely affecting "the degree of packing" of conjugates PEGn-L(2R';R) (R';2R) at interfaces;

**•** a high degree of unsaturation affects the colloidal characteristics, the more so as the share of higher unsaturated acids 2Δ and 3Δ is higher, due to their possibilities of spatial arrangement as geometric isomers, but also to the reduction of the capacity of free rotation (C-C) in the structure of conjugates PEGn-L(2R';R) (R';2R);

**•** the colloidal phenomena in the structural category of systems PEGn-L(2R';R) (R';2R) studied differ for concentration values below CMC [conformational competences are present exclusively due to the homogeneous polyoxyethylene (PEO) chains (n=3,9,18)] and values above CMC, where both the conformational competences and the ones of micellar solubilization of homogeneous polyoxyethylene (PEO) chains (n=3,9,18) occur cumulatively and simultaneously;

**•** the gradual presence of homogeneous polyoxyethylene (PEO) chains (n=9, 18) monoderivatized R(EH;NF) along with conjugates PEGn-L(2R';R) (R';2R) with HLB < 6 in binary and/or ternary associated systems favors the structuring of systems PEGn-L with HLB' nc ≥ 8, which widens the range of structural variants capable of interface and transfer phenomena, beneficial to the technological practice.

Overall it can be stated that obtaining and characterizing, for the first time, homogeneous heterobifunctional oligomers of ethylene oxide (HOHAO) as "niche" structures represents a "challenge" with real future perspectives.

The potential applications envisaged are based primarily on their structure but also on their varied composition, which allow the expression in perspective of colloidal phenomena [wetting, foaming/defoaming with the three components (strength, density, stability) [31,32], softening agent, micellar solubilization, controlled emulsification, adsorption at the interface in normal micelles (aqueous medium) and reverse micelles (non-aqueous media), cleaning under a wide variety of conditions (including resistance to hard water), chemical interface processes (chemisorption), coordination/sequestration and phase transfer, etc.]. The diversity of lipid conjugates and their competences recommends them as specialized structural units of the cell membrane walls.

Preliminary tests, both completed and ongoing, support these assertions.

#### **Acknowledgements**

My entire gratitude to my mentor, Prof. Eng. Ionel Jianu, Ph.D., the founder of the agrifood school of Timișoara, in whose team I started and trained as a researcher and discovered the hitherto unsuspected potential of homogeneous and heterogeneous polyoxyethylene chains.

#### **Author details**

Calin Jianu\*

Address all correspondence to: calin.jianu@gmail.com

Department of Food Science, Faculty of Food Processing Technology, Banat's University of Agricultural Sciences and Veterinary Medicine, Timișoara, Romania

#### **References**

[1] Thompson MS, Vadala TP, Vadala ML, Lin Y, Riffle JS. Synthesis and applications of heterobifunctional poly(ethylene oxide) oligomers. Polymer 2008; 49(2) 345-373.

[2] Mulley BA. Synthesis of homogeneous nonionic surfactants. In: Schick MJ. (ed.) Nonionic Surfactants (Chapter 13). New York: Marcel Dekker; 1967. p421-439.


[3] Edwards CL. Polyoxyethylene alcohols. In: van Os NM. (ed.) Nonionic Surfactants: Organic Chemistry; Surfactant Science Series. Volume 72. New York: Marcel Dekker; 1998. p87-121.

[4] Edwards CL. Chemistry and Handling of Ethylene Oxide. In: van Os NM. (ed.) Nonionic Surfactants: Organic Chemistry Science Series (Chapter 1). New York: Marcel Dekker; 1998. p2-4.

[5] Harris J. Poly(ethylene) glycol Chemistry: Biotechnical and Biomedical Applications. New York: Plenum Press; 1992.

[6] Chen J, Spear SK, Huddleston JG, Rogers RD. Polyethylene glycol and solutions of polyethylene glycol as green reaction media. Green Chemistry 2005; 7(2) 64-82.

[7] Abuchowski A, McCoy JR, Palczuk NC, van Es T, Davis FF. Effect of covalent attachment of polyethylene glycol on immunogenicity and circulating life of bovine liver catalase. Journal of Biological Chemistry 1977; 252(11) 3582-3586.

[8] Edwards CL. Polyoxyethylene Chain Length Distribution. In: van Os NM. (ed.) Nonionic Surfactants: Organic Chemistry Science Series (Chapter 1). New York: Marcel Dekker; 1998. p4-34.

[9] Jianu I. Aminoethers surfaceactive agents. Cationic surfaceactive agents. PhD thesis. Polytechnic University; 1984.

[10] Jianu C. Research concerning the potential of some metal ion complexing additives to improve food value in horticultural raw matter. PhD thesis. Banat's University of Agricultural Sciences and Veterinary Medicine; 2006.

[11] Yokogama Y, Hirajima R, Morigaki K, Yamaguchi Y, Ueda K. Alkali-cation affinities of polyoxyethylene dodecylethers and helical conformation of their cationized molecules studied by electrospray mass spectrometry. Journal of the American Society for Mass Spectrometry 2007; 18(11) 1914-1920.

[12] Jackson AT, Scrivens JH, Williams JP, Baker ES, Gidden J, Bowers MT. Microstructural and Conformational studies of polyether copolymers. International Journal of Mass Spectrometry 2004; 238(3) 287-297.

[13] Gidden J, Wyttenbach T, Jackson AT, Scrivens JH, Bowers MT. Gas-Phase Conformations of Synthetic Polymers: Poly(Ethylene Glycol), Poly(Propylene Glycol) and Poly(Tetramethylene Glycol). Journal of the American Chemical Society 2000; 122(19) 4692-4699.

[14] Bailey FE, Koleske JV. Configuration and Hydrodynamic Properties of Polyoxyethylene Chain in Solution. In: Schick MJ. (ed.) Nonionic Surfactants: Physical Chemistry (Chapter 16). New York: Marcel Dekker; 1998. p927-971.

[15] Gejji SP, Tegenfeldt J, Lindgren J. Conformational analysis of poly(ethylene oxide) oligomers: diglyme. Chemical Physics Letters 1994; 226(3-4) 427-432.

[16] Rösch M. The configuration of the polyethyleneoxide chain of nonionic surfactants (part 1). Tenside Detergents 1971; 8 302-313. (in German)

[17] Jianu, I., Process for the cyanoethylation of higher alcohols C12-C18, RO Patent 63637\*\* (A2), International Classification: C07C41/00; C07C43/00; C07C41/00; C07C43/00; (IPC1-7): C07C41/00; C07C43/00.

[18] Drugarin C. and Jianu I., Process for the preparation of new amphoteric surface-active agents, RO Patent 78634 (A2), International Classification: (IPC1-7): C07C87/30.

[19] Jianu, I., Process for obtaining polyethoxylated soaps, RO Patent 85437 (A2), International Classification: C11D1/02; C11D1/02; (IPC1-7): C11D1/02.

[20] Drugarin C. and Jianu I., Process for the preparation of surface-active agents, RO Patent 77367 (A2), International Classification: C07D233/08; C07D233/00; (IPC1-7): C07D233/08.

[21] Jianu C, Jianu I. Colloidal competences of new tailor-made lipids. Journal of Food Agriculture and Environment 2010; 8(3-4) 148-155.

[22] Jianu C. Synthesis of nonionic-anionic colloidal systems based on alkaline and ammonium beta-nonylphenol polyethyleneoxy (n=3-20) propionates/dodecylbenzene sulfonates with prospects for food hygiene. Chemistry Central Journal 2012; 6:95, doi: 10.1186/1752-153X-6-95.

[23] Jianu C. Colloidal competences of some food cleaning agents based on alkaline and ammonium nonionic soaps. Journal of Food Agriculture and Environment 2012; 10(1) 10-15.

[24] Drugarin C. and Jianu, I., Process for the preparation of new cationic agents, RO Patent 82439\* (A3), International Classification: (IPC1-7): C07C87/30.

[25] Jianu I. Beitrage zum kinetischen und thermodynamischen studium der cyanoethylierungs reaktionen höherer alkohole (C10-C18). In: Proceeding World Surfactants Congress. Surfactants in our World Today and Tomorrow, vol. II 1984; p188-196. (in German).

[26] Drugarin C. and Jianu I., Imidazoline derivatives and process for the preparation thereof, RO Patent 82335\*\* (A2), C.A. 102, 26882p, 1985, International Classification: C07D233/20; C07D233/00; (IPC1-7): C07D233/20.

[27] Weinheimer RM, Varineau PT. Polyoxyethylene alkylphenols. In: van Os NM. (ed.) Nonionic Surfactants: Organic Chemistry. New York: Marcel Dekker; 1998.

[28] Drugarin C, Jianu I. The hydrolysis of β-[(p)-Alkylphenyloxy polyethyleneoxy]-propionitrils under phase transfer catalysis-Part I: The synthesis of β-[(p)-Alkylphenyloxy polyethyleneoxy]-propionitrils. Tenside Detergents 1983; 20 128-129.

[29] Drugarin C, Jianu I. The hydrolysis of β-[(p)-Alkylphenyloxy polyethyleneoxy]-propionitrils under phase transfer catalysis-Part II: The synthesis of β-[(p)-Alkylphenyloxy polyethyleneoxy]-propionic acids. Tenside Detergents 1983; 20 130-131.

[30] Marszall L. HLB of nonionic surfactants: PIT and EIP methods. In: Schick MJ. (ed.) Nonionic Surfactants, Physical Chemistry; Surfactant Science Series. Volume 23. New York: Marcel Dekker; 1998. p493-547.

[31] Drugarin C, Jianu I. Correlation between molecular structure and surface activity, Part I: The foaming power of the class of β-Alkyloxyethyl and β-Alkylpoly(ethylenoxy)-ethyl-alkyl-dimethylammonium halides. Tenside Detergents 1983; 20 76-77.

[32] Drugarin C, Jianu I. Correlation between molecular structure and surface activity, Part II: Stability and density of the foam of β-Alkyloxyethyl and β-Alkylpoly(ethylenoxy)-ethyl-alkyl-dimethylammonium halides. Tenside Detergents 1983; 20 78-79.

[33] SR ISO/TR 896:1995 Surface Active Agents-Scientific Classification identical with the International ISO/TR 893:1977.

[34] SR EN 14370:2005 Surface active agents. Determination of surface tension. Ring variante (identical with the European Standard EN 14370:2004).

[35] Drugarin C. and Jianu, I., Process for obtaining synthetic triglycerides, RO Patent 82041 (A2), C.A. 103, 55720k, 1985. International Classification: C07C67/22; C07C69/003; C07C67/00; C07C69/00; (IPC1-7): C07C67/22; C07C69/003.

**Chapter 5**

## **Higher Oligomeric Surfactants — From Fundamentals to Applications**

D. Jurašin and M. Dutour Sikirić

Laboratory for synthesis and processes of self-assembling of organic molecules, Division of physical chemistry, Ruđer Bošković Institute, Zagreb, Croatia

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57655

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

Surfactants (short for surface-active agents) are organic compounds containing in one molecule both lyophobic (hydrophobic) and lyophilic (hydrophilic) parts (Fig. 1). Owing to this amphiphilic structure, surfactants exhibit specific properties in solution as well as in the solid state. When present at low concentrations in solution, they adsorb at all available interfaces (liquid/gas, liquid/liquid, liquid/solid) and as a consequence dramatically change their free energy. At higher concentrations, above the so-called critical micellization concentration (cmc), when all the interfaces are occupied, surfactants self-assemble in the bulk into various aggregates: micelles, vesicles and liquid crystals. The type of supramolecular structure formed depends upon the structure and concentration of the surfactant, the presence of electrolyte, temperature, etc. [1, 2].

Surfactants' versatile phase behavior and ability to form different structures, with sizes from the nano- to the micro-scale, are the reason why they are widely used in various industrial processes, ranging from classical (paints, cosmetics, pharmaceuticals, foods) to modern technologies (synthesis of advanced materials, environmental protection). Moreover, surfactants have important roles in living organisms. Examples are pulmonary surfactants, proteins, biological membranes, which can be considered to be self-assembled bilayers of surface-active compounds (phospholipids), etc. [1, 2].

In the constant search for more efficient and environmentally friendly surfactants, both academic and industrial interest has been focused on the design and preparation of novel, complex surfactant structures with improved properties in comparison with conventional surfactants, i.e. those containing only one hydrophilic and one hydrophobic group.

Novel surfactants which have attracted considerable interest in the last two decades are oligomeric surfactants. These compounds are made up of two (dimeric surfactants) or more (higher oligomeric surfactants) amphiphilic moieties covalently linked at the level of the head groups, or very close to them, by a spacer group [3]. This means that, in theory, it is possible to synthesize oligomeric surfactants using two or more molecules of identical and/or different conventional surfactants and connecting them with a spacer group varying in chemical nature, length, hydrophobicity and rigidity. The number of possible architectures, and thus properties, is vast [3].

**Figure 1.** Schematic representation of a) monomeric and b) dimeric surfactant molecule.

The first to report on dimeric or gemini surfactants (Fig. 1b) in the scientific literature were Bunton and collaborators in 1971 [4]. They synthesized bisquaternary ammonium surfactants and studied the rate of nucleophilic substitution in their micellar solutions. This work was followed by Devinsky and collaborators in 1985, who synthesized a great variety of bisquaternary ammonium surfactants and investigated their surface activity and micellization [5]. In 1990 Okahara and collaborators synthesized the first anionic gemini surfactants, with two sulphate groups and two alkyl chains [6]. In the 1990s work by Zana's, and later Esumi's group, on bisquaternary ammonium surfactants in which they have shown that these

The fact that the dimeric surfactants possess properties superior to the corresponding monomers was the motivation to extend the concept of gemini surfactants to higher oligomers (degree of oligomerization ≥ 3), expecting that their properties would be even better. These expectations were further supported by theoretical considerations predicting that the critical micellization concentration decreases continuously with increasing degree of oligomerization, while preferentially small spherical micelles form at low concentration and wormlike or threadlike micelles at high concentration [19,20]. In addition, since higher oligomeric surfactants represent transitional structures between conventional and polymeric surfactants, investigation of their properties can give insight into the behavior of the latter.

The first higher oligomeric surfactants synthesized and investigated were trimeric [9, 11, 16, 17] and tetrameric [13] linear quaternary ammonium bromides with alkyl spacers. Up to now, different cationic, anionic and nonionic oligomeric surfactants, with a degree of oligomerization up to 7 [21] and a linear, ring-type or star-like topology [22] of the molecule, have been investigated.

However, the efforts in the investigation of higher oligomeric surfactants are hindered by their more complex synthesis and purification. Recently, progress in this area has been reported. White and Warr have synthesized oligomeric alkylpyridinium surfactants by a simple elimination–addition reaction [23]. Feng and collaborators have also reported a new method of synthesis for oligomeric surfactants by atom transfer radical polymerization (ATRP) [24, 25].

In this chapter the influence of the oligomerization degree, the length of the hydrophobic chains, the nature of the spacer and the topology of the molecule on the properties of higher oligomeric surfactants<sup>1</sup> in solution and at interfaces will be discussed. Comparison will be made with the behavior of the corresponding monomeric and dimeric surfactants. Possible applications of higher oligomeric surfactants will also be discussed.

<sup>1</sup> The term oligomeric surfactants as used in this chapter does not include surfactant molecules in whose structure only part of the moiety, i.e. the head group or the tail, is repeated. Examples of such amphiphilic molecules are nonionic oligomeric surfactants like Brij or polyetheramine surfactants.

#### **2. Classification of the oligomeric surfactants**

Oligomeric surfactants, like conventional ones, are most commonly classified, depending on the type of the hydrophilic headgroup, into four major groups:

**•** anionic – hydrophilic group is negatively charged,

**•** cationic – hydrophilic group is positively charged,

**•** nonionic – hydrophilic group bears no charge, but solubility in water is a consequence of its high polarity,

**•** zwitterionic – molecule contains both positively and negatively charged hydrophilic groups.

So far cationic, anionic and nonionic higher oligomeric surfactants have been synthesized, but no synthesis of zwitterionic ones has been reported.

In constant search for more efficient and environmentally friendly surfactants, both academic and industrial interest has been focused on design and preparation of novel, complex, surfactant structures, with improved properties in comparison with conventional surfactants, i.e. those containing only one hydrophilic and one hydrophobic group.

Novel surfactants which have attracted considerable interest in last two decades are oligomeric surfactants. These compounds are made up of two (dimeric surfactants) or more (higher oligomeric surfactants) amphiphilic moieties covalently linked at the level of the head groups or very close to them by a spacer group [3]. This means that, in theory, it is possible to synthesize oligomeric surfactants using two or more molecules of identical and/ or different conventional surfactant and connecting them with a spacer group varying in chemical nature, length, hydrophobicity and rigidity. The number of possible architec‐ tures, and thus properties, is vast [3].

First to report about dimeric or gemini surfactant (Fig. 1. b) in scientific literature were Bunton and collaborators in 1971 [4] They have synthetized bisquaternary ammonium surfactants and studied rate of nucleofilic substitution in their micellar solutions. This work was followed by Devinsky and collaborators in 1985, who synthesized the great variety of bisquaternary ammonium surfactants and investigated their surface activity and micellization [5]. In 1990 Okahara and collaborators synthesized first anionic gemini surfactants, with two sulphate groups and two alkyl chains [6]. In 1990s work by Zana's, and latter Esumi's group, on bisquaternary ammonium surfactants in which they have shown that these surfactants posses unique properties and various self-assembly behaviors compared to the corresponding monomeric surfactants, motivated the investigation of different dimeric surfactants [7-17].

Constantly growing interest for the investigation and synthesis of novel gemini surfactants is a consequence of their superior properties in comparison to the conventional ones [3]:


In addition, the Krafft temperatures of dimeric surfactants with hydrophilic spacers are generally very low giving to these surfactants the capacity to be used in cold water [3]. Some cationic dimeric surfactants even have interesting biological activity [5, 18].

The fact that the dimeric surfactants posses properties superior to the corresponding mono‐ mers, was motivation to extend the concept of gemini surfactants to higher oligomers (degree of oligomerization ≥ 3), expecting that their properties would be even better. This expectations were further supported by theoretical considerations predicting that the critical micellization concentration decreases continuously with increasing degree of oligomerization, while preferentially small spherical micelles form at low concentration and wormlike or threadlike micelles at high concentration [19,20]. In addition, since higher oligomeric surfactants repre‐ sent transitional structures between conventional and polymeric surfactants, investigation of their properties can give insight in the behavior of the latter.

First synthesized and investigated higher oligomeric surfactants were trimeric [9, 11, 16, 17] and tetrameric [13] linear quaternary ammonium bromides with alkyl spacers. Up to now different cationic, anionic and nonionic oligomeric surfactants, with degree of oligomerization up to 7 [21] and linear, ring-typed or star-like topology [22] of the molecule have been investigated.

However, the efforts in investigation of higher oligomeric surfactants are hindered by the more complex synthesis and purification. Recently progress in this area has been reported. White and Warr have synthesized oligomeric alkylpyridinium surfactants by a simple elimination– addition reaction [23]. Also Feng and collaborators have reported a new method of synthesis for oligomeric surfactants by atom transfer radical polymerization (ATRP) [24, 25].

In this chapter the influence of the oligomerization degree, length of hydrophobic chains, nature of the spacer and topology of the molecule on the properties of higher oligomeric surfactants1 in the solution and at the interfaces will be discussed. Comparison will be made with behavior of corresponding monomeric and dimeric surfactants. Possible applications of higher oligomeric surfactants will also be discussed.

#### **2. Clasification of the oligomeric surfactants**

ranging from classical (paints, cosmetics, pharmaceuticals, foods) to modern technologies (synthesis of advanced materials, environmental protection). Moreover, surfactants have important roles in living organisms. Examples are pulmonary surfactants, proteins, biological membranes which can be considered to be self-assembled bilayers of surface active compounds

In constant search for more efficient and environmentally friendly surfactants, both academic and industrial interest has been focused on design and preparation of novel, complex, surfactant structures, with improved properties in comparison with conventional surfactants,

Novel surfactants which have attracted considerable interest in last two decades are oligomeric surfactants. These compounds are made up of two (dimeric surfactants) or more (higher oligomeric surfactants) amphiphilic moieties covalently linked at the level of the head groups or very close to them by a spacer group [3]. This means that, in theory, it is possible to synthesize oligomeric surfactants using two or more molecules of identical and/ or different conventional surfactant and connecting them with a spacer group varying in chemical nature, length, hydrophobicity and rigidity. The number of possible architec‐

First to report about dimeric or gemini surfactant (Fig. 1. b) in scientific literature were Bunton and collaborators in 1971 [4] They have synthetized bisquaternary ammonium surfactants and studied rate of nucleofilic substitution in their micellar solutions. This work was followed by Devinsky and collaborators in 1985, who synthesized the great variety of bisquaternary ammonium surfactants and investigated their surface activity and micellization [5]. In 1990 Okahara and collaborators synthesized first anionic gemini surfactants, with two sulphate groups and two alkyl chains [6]. In 1990s work by Zana's, and latter Esumi's group, on bisquaternary ammonium surfactants in which they have shown that these surfactants posses unique properties and various self-assembly behaviors compared to the corresponding monomeric surfactants, motivated the investigation of different dimeric surfactants [7-17].

Constantly growing interest for the investigation and synthesis of novel gemini surfactants is a consequence of their superior properties in comparison to the conventional ones [3]:

**•** their cmcs are one or two order of magnitude lower than for the corresponding monomeric

**•** their aqueous solution can have a very high viscosity or even show viscoelastic properties at relatively low surfactant concentrations, whereas the solutions of corresponding mono‐

In addition, the Krafft temperatures of dimeric surfactants with hydrophilic spacers are generally very low giving to these surfactants the capacity to be used in cold water [3]. Some

i.e. those containing only one hydrophilic and one hydrophobic group.

(phospholipids), etc. [1, 2].

134 Oligomerization of Chemical and Biological Compounds

tures, and thus properties, is vast [3].

**•** they are more efficient in lowering surface tension.

**•** also, they have better: solubilizing, wetting and foaming properties.

cationic dimeric surfactants even have interesting biological activity [5, 18].

mers remain low viscous as water,

surfactants.

Oligomeric surfactants, like conventional ones, are most commonly classified depending on the type of the hydrophylic headgroup surfactants in four major groups:


So far cationic, anionic and nonionic higher oligomeric surfactants have been synthesized, but no synthesis of zwitterionic has been reported.

<sup>1</sup> The term oligomeric surfactants as used in this chapter does not include surfactants molecules in whose structure only part of the moiety, i.e. head group or tail, is repeated. Example of such amphiphilic molecules are nonionic oligomeric surfactants like Brij or polyetheramine surfactants.

Since oligomeric surfactants are made up of two or more amphiphilic moieties covalently linked at the level of the head groups or very close to them (Fig. 1. b) by a hydrophilic or hydrophobic, flexible or rigid spacer group, they can also be classified based on:

**•** the number of amphiphilic moieties present in the molecule – dimeric (gemini), trimeric, tetrameric, etc.,

**•** the molecular structure (features of their spacer groups) – linear, ring-type and star-shaped.

**Figure 2.** Molecular structure of cationic oligomeric surfactants (quaternary alkyl ammonium salts). *m* = number of C atoms in the alkyl chains, *s* = number of C atoms in the spacer; spacers *R*: *m*-xylylene (mx), *p*-xylylene (px) and *trans*-1,4-buten-2-ylene (tb).

The structures of typical examples of the higher oligomeric surfactants reviewed in this chapter are given in Fig. 2-4. They include:

**• cationic** – different derivatives of quaternary ammonium salts were prepared, such as:

	- **◦** *quaternary ammonium surfactants with alkyl spacers* [3, 7-17, 26, 27] (Fig. 2. 2A, 3A, 4A) – they are usually denoted as *m*-*s*-(*m*-*s*)<sub>*x*</sub>-*m*, where *m* is the number of carbon atoms in the hydrophobic chain, *s* is the number of carbon atoms in the spacer and *x* = *j* – 2, where *j* is the degree of oligomerization. Up to now, these have been the most investigated oligomeric surfactants, due to the relative ease of their synthesis and the possibility of tailoring surfactant properties by changing the spacer and chain length. The majority of these surfactants are bromide salts; chlorides have been synthesized to a lesser extent.
	- **◦** *oligomeric quaternary ammonium surfactants prepared using epichlorohydrin* [28, 29] (Fig. 2. 2B, 3B) – these surfactants have short polar spacers containing –OH groups.
	- **◦** *oligomeric quaternary ammonium chlorides with trans-1,4-buten-2-ylene, m-xylylene and p-xylylene spacers* (Fig. 2. 2C, 3C, 4B) – the spacer groups differ in both nature and length, and are all rigid. Chloride was chosen as the counterion to increase solubility in water and to provide lower Krafft temperatures [30, 31].
	- **◦** *polyoxyethylene ether trimeric quaternary ammonium surfactants* (Fig. 2. 3I) – these surfactants contain chains consisting of both polyoxyethylene and dodecyl alkyl groups [32].
	- **◦** *star-shaped* trimeric quaternary ammonium bromides with different chain lengths [33] (Fig. 2. 3D).
	- **◦** *star-shaped trimeric, tetrameric and hexameric quaternary ammonium salts with amide groups* (Fig. 2. 3E, 3F, 4C, 6A) – amide groups were chosen to increase the solubility of these surfactants. The spacers are rigid. Nonsymmetric and symmetric trimeric surfactants were prepared with a slight difference in the spacer [22, 34-36].
	- **◦** Tris[2-hydroxy-3-(alkyldimethylammonio)-propoxymethyl]ethane – these surfactants were designed in order to obtain surfactants with enhanced antimicrobial properties [37] (Fig. 2. 3G, 3H).

**• anionic**

	- **◦** *ring-type trimeric* surfactants synthesized by introducing three hydrocarbon chains to cyanuric chloride (Fig. 3. 3J) – they can dissolve in water only at pH around 13, which limits their applicability [38].
	- **◦** *triple chain* surfactants with three hydrocarbon chains and two or three carboxylate headgroups (Fig. 3. 3K, 3L) – they exhibit the same solubility problem as the ring-type trimeric surfactants [39].
	- **◦** *tetrameric surfactants with multiple-ring spacers* based on dioxane rings with different flexibility (flexible, semi-flexible and rigid) of the spacers [40] (Fig. 3. 4D-G).

**• nonionic**

	- **◦** *tyloxapol* (Fig. 4. 7A) – the repeating unit is close to Triton X-100 and the maximum degree of polymerization is about 7 [21]. This surfactant is commercially available.
	- **◦** *n-alkylphenol polyoxyethylene trimeric surfactants* with different lengths of the hydrophilic oxyethylene chains and of the hydrophobic methylene chains [41] (Fig. 4. 3M).
	- **◦** trimeric surfactants derived from tris(2-aminoethyl)amine (Fig. 4. 3N) – these surfactants offer the possibility of changing the hydrophilic/lipophilic balance while keeping the molecular skeleton the same [42].

**Figure 3.** Molecular structure of anionic oligomeric surfactants.

**Figure 4.** Molecular structure of nonionic oligomeric surfactants.

#### **3. Oligomeric surfactants in solution**

The unique properties that surfactants exhibit in aqueous solution and in the solid state are a consequence of their amphiphilic nature. When present at low concentrations in aqueous solution, surfactants tend to concentrate at the available interfaces and in that way reduce the free energy of the system. At higher concentrations, when all the interfaces are saturated, the free energy of the system can be reduced, depending on the experimental conditions, by crystallization of the surfactant from the solution or by the formation of supramolecular aggregates (micelles, vesicles, liquid crystals, etc., Fig. 5). The concentration above which micelles are formed is called the critical micellization concentration (cmc). Micelles are thermodynamically stable dispersed species in equilibrium with surfactant monomers [1, 2].

**Figure 5.** Modes of surfactant reduction of surface and interfacial energies. After ref [1].

#### **3.1. Solubility of surfactants**

The overall solubility of many ionic compounds increases as temperature increases. This effect is the result of the physical characteristics of the solid phase, namely the crystal lattice energy and the heat of hydration of the material being dissolved.

In the case of ionic surfactants, it is often observed that the solubility undergoes a sharp, discontinuous increase at some characteristic temperature, named the Krafft temperature (*T*<sub>K</sub>) (Fig. 6). Below the Krafft temperature the solubility of the surfactant is determined by the solid state properties, while above it the surfactant solubility increases due to the formation of micelles, which are the thermodynamically favored form [1, 2, 43].

The Krafft temperature varies with alkyl chain length and structure, as well as with the counterion. The Krafft temperature can be lowered by introducing chain branching, multiple bonds in the alkyl chain or bulkier hydrophilic groups in the surfactant molecules. In this way the intermolecular interactions that promote crystallization are reduced [1, 2, 43].

**Figure 6.** Schematic representation of the solubility curve for ionic surfactants. The Krafft temperature (*T*<sub>K</sub>) is the temperature at which the surfactant solubility equals the cmc. Above *T*<sub>K</sub> surfactant molecules form a dispersed phase; below *T*<sub>K</sub> hydrated crystals are formed. After ref. [43].

The Krafft temperature is usually determined either by measuring the change of electrical conductivity with temperature or by visually observing the change in turbidity of a supersaturated surfactant solution (usually 1 wt %).

Knowledge of the Krafft temperature is crucial in many applications, since below *T*<sub>K</sub> the surfactant will clearly not perform efficiently; typical characteristics such as maximum surface tension lowering and micelle formation cannot be achieved.
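The conductivity route mentioned above can be illustrated with a short sketch. The chapter does not say how the break in the conductivity versus temperature curve is located, so the approach used here (fitting two straight lines and taking their intersection as the Krafft temperature) is an assumption, as are the function names and the synthetic, noise-free data. It is not a procedure taken from the cited references.

```python
# Sketch only: locate a break in a conductivity-temperature curve by
# fitting two straight lines and intersecting them (assumed procedure).

def fit_line(xs, ys):
    """Least-squares slope a and intercept b of y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def squared_error(xs, ys, a, b):
    """Sum of squared residuals of the points about the line a*x + b."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


def krafft_from_conductivity(temps, kappas, min_pts=3):
    """Return the temperature where the two best-fitting lines intersect.

    temps must be sorted in increasing order; every candidate split keeps
    at least min_pts points on each side.
    """
    best = None
    for i in range(min_pts, len(temps) - min_pts + 1):
        a1, b1 = fit_line(temps[:i], kappas[:i])
        a2, b2 = fit_line(temps[i:], kappas[i:])
        err = (squared_error(temps[:i], kappas[:i], a1, b1)
               + squared_error(temps[i:], kappas[i:], a2, b2))
        if best is None or err < best[0]:
            best = (err, a1, b1, a2, b2)
    _, a1, b1, a2, b2 = best
    if a1 == a2:
        raise ValueError("no break found: both segments have the same slope")
    return (b2 - b1) / (a1 - a2)


# Hypothetical data (arbitrary conductivity units): slow rise below 20 °C,
# steeper rise above it. These numbers are invented for illustration.
temps = [float(t) for t in range(10, 31)]
kappas = [50 + 2 * t if t <= 20 else 90 + 12 * (t - 20) for t in temps]

t_k = krafft_from_conductivity(temps, kappas)
print(f"estimated break at {t_k:.1f} °C")
```

With real measurements the data are noisy and the transition is less sharp, so the result depends on the temperature range chosen on either side of the break; the visual turbidity method provides an independent check.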

Ionic dimeric *m*-*s*-*m* surfactants with *m* ≤ 12 are generally highly soluble in water. Krafft temperatures below 0 °C have been reported for many series of anionic dimeric surfactants with hydrophobic or hydrophilic spacers [3].

To the best of our knowledge, Krafft temperatures of higher oligomeric surfactants have been reported only for cationic ones. The majority of the reported values are below 0 °C, which is important for their possible applications in cold water. Besides the relative ease of their synthesis, the low Krafft temperature is one of the main reasons why oligomeric quaternary ammonium surfactants have received much attention.

While the trimeric 12-2-12-2-12 surfactant has a Krafft temperature in the vicinity of 0 °C, *T*<sub>K</sub> for the corresponding tetramer 12-2-12-2-12-2-12 is 32 °C [27]. It is interesting to note that the Krafft temperature does not change regularly with the degree of oligomerization in this series of oligomeric dodecyl quaternary ammonium surfactants, since the Krafft temperature of monomeric DTAB (1A) is below 0 °C and that of dimeric 12-2-12 is 15 °C [3].

Trimeric quaternary ammonium surfactants with polar spacers (3B) have also shown good water solubility and their Krafft temperatures are all below 0 °C [28].

In the series of quaternary ammonium surfactants with *trans*-1,4-butenylene, *m*- and *p*-xylylene spacers (2C, 3C, 4B) it was shown that the Krafft temperatures are reduced by a higher degree of oligomerization [30]. Interestingly, the *T*K values of the trimeric and tetrameric surfactants with the *p*-xylylene spacer (3Cpx and 4Bpx) are below 0 °C, whereas the analogous dimer 2Cpx has a Krafft temperature of 23 °C [30].

The Krafft temperatures of the trimeric surfactants 3E and 3F [34] and of the polyoxyethylene ether trimeric quaternary ammonium surfactant 3I [32] are also below 0 °C. The *T*K values of the star-shaped trimeric surfactants 3D were found to be lower than 5 °C [33].

Despite the large number of hydrophobic alkyl chains, the Krafft temperature of the hexameric surfactant 6A is also below 0 °C [22].

#### **3.2. Adsorption at the air/water interface**


Many surfactant applications are based on the ability of surfactants to adsorb at various interfaces in an oriented fashion. Differences in the surface activity of surfactants are a consequence of differences in their packing density at the air/water interface. The packing density is reflected in the values of the surface excess concentration (*Γ*max) and the surface area occupied by a surfactant molecule (*a*min). The higher the surface excess concentration, and consequently the lower the area per molecule, the more efficient the surfactant is. *a*min is not only a measure of the efficiency of adsorption; it also gives the first information about the orientation and packing of the surfactant at the interface [1, 2, 43].

The maximum surface excess concentration of a surfactant (*Γ*max) can be calculated from surface tension (*γ*) measurements, i.e. from the maximal slope (d*γ*/dlog *c*) of the *γ* vs. log *c* curve below the cmc (Fig. 7):

$$\Gamma_{\max} = -\frac{1}{2.303\,nRT} \left(\frac{\partial \gamma}{\partial \log c}\right)_T \tag{1}$$

where *c* denotes concentration, *R* is the gas constant (8.314 J mol⁻¹ K⁻¹), *T* is the absolute temperature and *n* is the number of solute species whose concentration at the interface changes with a change in the surfactant bulk concentration (*c*) [2].
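As a rough numerical illustration of eq. 1, the sketch below converts a pre-cmc slope of the *γ* vs. log *c* plot into *Γ*max. The function name `surface_excess` and the example slope of -30 mN/m per decade are hypothetical, chosen only for illustration. The slope has to be given in N/m so that the result comes out in mol/m².

```python
R = 8.314  # gas constant, J mol^-1 K^-1


def surface_excess(slope, n, temperature):
    """Return Gamma_max in mol/m^2 according to eq. 1.

    slope       -- d(gamma)/d(log10 c) below the cmc, in N/m (negative)
    n           -- number of solute species in eq. 1
    temperature -- absolute temperature in K
    """
    return -slope / (2.303 * n * R * temperature)


# Hypothetical input: -30 mN/m per decade, n = 2, 25 degrees C
gamma_max = surface_excess(slope=-0.030, n=2, temperature=298.15)
print(f"Gamma_max = {gamma_max * 1e6:.2f} x 10^-6 mol/m^2")
```

Because *n* appears in the denominator, the same measured slope gives a smaller *Γ*max when a larger *n* is assumed.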

From the surface excess concentration, the area per molecule at the interface, *a*min, in nm², is calculated from the relation:

$$a_{\min} = \frac{10^{18}}{N_{\rm A} \Gamma_{\max}} \tag{2}$$

where *N*A is Avogadro's number and *Γ*max is expressed in mol/m² [2]. In addition to *Γ*max and *a*min, the following quantities can be used to assess surfactant performance in lowering surface tension and its preference for adsorption in comparison to micellization [1, 2, 43]:

**•** the maximum reduction of surface tension (*γ*cmc) which can be attained in the solution of a certain surfactant regardless of its concentration. A lower *γ*cmc means a more surface active surfactant.

**•** the concentration required to produce a surface tension reduction of 20 mN m⁻¹ (*C*20), usually expressed as the negative logarithm of that concentration, p*C*20. The larger the p*C*20, the more efficiently the surfactant is adsorbed at the interface and the more efficiently it reduces surface tension.

**•** the cmc/*C*20 ratio, a convenient measure of the relative effects of structural factors on the micellization and adsorption processes; the larger the value of the cmc/*C*20 ratio, the greater the tendency of the surfactant to adsorb at the interface, relative to its tendency to form micelles.

Surface activity of higher oligomeric surfactants was assessed based on *Γ*max, *a*min, *γ*cmc and p*C*20 values. Data reported in the available literature are summarized in Table 1.

**Figure 7.** Plot of surface tension versus log of the bulk phase concentration for an aqueous solution of a surfactant. Three distinctive parts of the curve represent [45]:

**i.** At low surfactant concentrations surfactant monomers form a monolayer at the air/water interface. Surface tension decreases with the surfactant bulk concentration due to the increasing surfactant surface concentration at the air/water interface.

**ii.** At concentrations below, but close to, the cmc the slope of the curve is constant because the surface concentration has reached its maximum value.

**iii.** At concentrations above the cmc, surface tension remains almost constant, due to the constant monomer concentration.

**Table 1.** The surface excess concentration (*Γ*max), the minimum area per molecule at the air/solution interface (*a*min), the number of solute species whose concentration at the interface changes with a change in the surfactant bulk concentration (*n*), the maximum reduction of surface tension (*γ*cmc) and the negative logarithm of the concentration required to produce a surface tension reduction of 20 mN m⁻¹ (p*C*20), obtained at the given temperature (*T*), for higher oligomeric and corresponding monomeric and dimeric surfactants. Abbreviated names of the listed surfactants are given according to Figures 2-4.

| Surfactant | 10⁶ *Γ*max (mol molecule/m²) | 10⁶ *Γ*max (mol alkyl chain/m²) | *a*min (nm²/molecule) | *a*min (nm²/alkyl chain) | *n* | *γ*cmc (mN m⁻¹) | p*C*20 | *T* (°C) | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| *Cationic* | | | | | | | | | |
| 1A | 3.5; 2.7 | 3.5 | 0.62 | 0.59 | 2 | 38.6; 38.9 | 2.3 | 25; 30 | 15, 16; 27 |
| 10-2-10 | | | | | | 31.8 | | 25 | 44 |
| 12-2-12 | 2.7; 1.7 | 5.4 | 1.04 | 0.72 | 3 | 30.3; 31.4 | 3.8 | 25; 30 | 15, 16; 27 |
| 12-3-12 | 2.3 | 4.6 | | 0.48 | 3 | | | 25 | 15 |
| 12-6-12 | 1.35 | 2.7 | | 0.72 | 3 | | | 25 | 7, 15 |
| 8-2-8-2-8 | 1.07 | | 1.55 | | 4 | 35.1 | | 25 | 26 |
| 10-2-10-2-10 | 1.38 | | 1.21 | | 4 | 25.8 | | 25 | 26 |
| 12-2-12-2-12 | 1.11; 1.3 | | 1.49; 1.27 | | 4 | 36.4; 36.0 | 3.7 | 25; 30 | 26; 27 |
| 12-3-12-3-12 | 1.75 | 5.25 | | 0.49 | 4 | | | 25 | 15 |
| 12-6-12-6-12 | 0.7 | 2.1 | | 0.83 | 4 | | | 25 | 15 |
| 12-2-12-2-12-2-12 | 0.9 | | 1.84 | | 5 | 29.7 | 4.0 | 40 | 27 |
| 12-3-12-4-12-3-12 | 1.3 | 5.2 | | | | | | 25 | 15 |
| 1B | | | | | | 40.5 | | 23 | 30 |
| 2B-12 Cl | | | | | | 35 | | 20 | 28 |
| 3B-12 Cl | | | | | | 32 | | 20 | 28 |
| 2Ctb | | | | | | 41.5 | | 23 | 30 |
| 2Cmx | | | | | | 43 | | 23 | 30 |
| 2Cpx | | | | | | 45.0 | | 23 | 30 |
| 4Bmx | | | | | | 38.5 | | 23 | 30 |
| 3I | | | | | | 38.9 | | 25 | 32 |
| 3H-8 | 2.8 | | 0.593 | | | 34.78 | 4.21 | 20 | 37 |
| 3H-10 | 2.88 | | 0.575 | | | 34.03 | 4.70 | 20 | 37 |
| 3H-12 | 2.94 | | 0.565 | | | 33.02 | 5.39 | 20 | 37 |

\*decyltrimethylammonium bromide, \*\*bisquaternary ammonium dichloride with dodecyl chain (R(CH₃)₂N⁺CH₂CH(OH)CH₂N⁺(CH₃)₂R·2Cl⁻)

The comparison of experimentally obtained *Γ*max and *a*min values is straightforward for single-chain surfactants, but far from it for oligomeric surfactants. The problem lies in the prefactor *n* in the expression for the surface excess concentration (eq. 1), which represents the number of species at the interface whose concentration changes with the bulk surfactant concentration. This value depends on the degree of dissociation of ionic surfactants, which is not known exactly for all surfactants. There are two extreme cases: one assumes no dissociation, thus treating the surfactant as one particle, and the other assumes complete dissociation. For single-chain surfactants the situation is simple: *n* is 1 for nonionic surfactants, which are not ionized, and 2 for ionic surfactants, which are considered to be fully ionized, when both ions are univalent. However, introducing additional structural elements into the surfactant molecule complicates the situation. Not only is the degree of dissociation unknown for many oligomeric surfactants, but it can also vary within the same series of surfactants with either the degree of oligomerization or the spacer length.

This problem was first encountered with dimeric *m*-*s*-*m* surfactants. The reported values were obtained using either *n*=2, assuming that one of the headgroups is neutralized by a counterion [46, 47], or *n*=3, assuming that the surfactant is fully ionized [7]. To allow comparison between different series of surfactants, some studies reported values obtained with both. Not even studies in which the surface excess was determined directly by small-angle neutron scattering managed to solve the problem, because it was shown that the degree of dissociation depends on the nature of the spacer [48]. Nowadays it is commonly assumed that for the series of linear quaternary ammonium surfactants with dodecyl chains and short ethylene spacers, *n*=3 for the dimer, 4 for the trimer and 5 for the tetramer [15, 27].
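The sensitivity to the choice of *n* can be made concrete with a short sketch that evaluates eq. 1 and eq. 2 for the same measured slope under two assumptions for a dimeric surfactant (*n*=2 and *n*=3). The slope value and the helper names are hypothetical. The factor 10¹⁸ used here is our unit-conversion assumption for *Γ*max in mol/m² and *a*min in nm² (1 m² = 10¹⁸ nm²); a prefactor of 10¹⁴ would apply to *Γ*max expressed in mol/cm².

```python
R = 8.314            # gas constant, J mol^-1 K^-1
AVOGADRO = 6.022e23  # Avogadro's number, mol^-1


def surface_excess(slope, n, temperature):
    """Gamma_max in mol/m^2 from the pre-cmc slope in N/m (eq. 1)."""
    return -slope / (2.303 * n * R * temperature)


def area_per_molecule(gamma_max):
    """a_min in nm^2 for Gamma_max in mol/m^2 (eq. 2, 1 m^2 = 1e18 nm^2)."""
    return 1e18 / (AVOGADRO * gamma_max)


slope = -0.030  # hypothetical slope, N/m per decade of concentration
results = {}
for n in (2, 3):
    gamma = surface_excess(slope, n, temperature=298.15)
    results[n] = (gamma, area_per_molecule(gamma))
    print(f"n = {n}: Gamma_max = {gamma * 1e6:.2f} x 10^-6 mol/m^2, "
          f"a_min = {results[n][1]:.2f} nm^2")
```

Since *a*min is proportional to *n*, switching from *n*=2 to *n*=3 raises the calculated area by 50 % for identical experimental data, which is why values from different studies can only be compared when the same *n* was used.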

**Table 1.** Continued.

| Surfactant | 10⁶ *Γ*max (mol molecule/m²) | 10⁶ *Γ*max (mol alkyl chain/m²) | *a*min (nm²/molecule) | *a*min (nm²/alkyl chain) | *n* | *γ*cmc (mN m⁻¹) | p*C*20 | *T* (°C) | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| *Cationic* | | | | | | | | | |
| 3D-8 | 0.875 | | 1.90 | | 4 | | | 25 | 33 |
| 3D-10 | 0.883 | | 1.88 | | 4 | 33.4 | | 25 | 33 |
| 3D-12 | 0.820 | | 2.03 | | 4 | 32.3 | | 25 | 33 |
| 3D-14 | 0.970 | | 1.70 | | 4 | 32.1 | | 25 | 33 |
| 4C | | | | | | 47.6 | | 25 | 36 |
| *Anionic* | | | | | | | | | |
| C12H25OSO3Na | 3.16 | | 0.53 | | | 32.5 | 2.5 | 25 | 38 |
| C11H23COONa | 2.43 | | 0.69 | | | 37.5 | 2.3 | 20 | 38 |
| 3J-4 | 3.54 | | 0.47 | | 1 | 32.9 | 3.8 | 25 | 38 |
| 3J-10 | 4.84 | | 0.34 | | 1 | 28.6 | 5.0 | 25 | 38 |
| C10H21CH(NH)2COONa | 3.37 | | 0.49 | | 1 | 46.0 | 3.3 | 25 | 39 |
| 3K | 9.13 | | 0.18 | | 1 | 29.3 | 5.9 | 25 | 39 |
| 3L | 2.57 | | 0.65 | | 1 | 33.2 | 6.1 | 25 | 39 |
| 4D | | | 0.95-1.18 | | 4-5 | 30 | 5.13 | 20 | 40 |
| 4E | | | 0.83-1.04 | | 4-5 | 28 | 4.50 | 20 | 40 |
| 4F | | | 1.02-1.28 | | 4-5 | 32 | 4.54 | 20 | 40 |
| *Nonionic* | | | | | | | | | |
| 3N 8-PEG400 | 1.82 | | 0.91 | | 1 | 35.4 | | 25 | 42 |
| 3N 8-PEG1000 | 2.21 | | 0.89 | | 1 | 33.2 | | 25 | 42 |
| 3N 8-PEG2000 | 2.57 | | 0.83 | | 1 | 31.1 | | 25 | 42 |
| 3N 10-PEG400 | 1.84 | | 0.93 | | 1 | 36 | | 25 | 42 |
| 3N 10-PEG1000 | 3.33 | | 0.90 | | 1 | 35.8 | | 25 | 42 |
| 3N 10-PEG2000 | 3.78 | | 0.85 | | 1 | 33.3 | | 25 | 42 |
| 3N 12-PEG400 | 1.93 | | 0.94 | | 1 | 36.8 | | 25 | 42 |
| 3N 12-PEG1000 | 2.45 | | 0.91 | | 1 | 36 | | 25 | 42 |
| 3N 12-PEG2000 | 4.02 | | 0.85 | | 1 | 34.3 | | 25 | 42 |

In addition, when comparing *Γ*max and *a*min values obtained in different studies, one should be aware that they can be expressed either per molecule or per alkyl chain. The latter is more convenient for determining the influence of the spacer in a series of surfactants with the same degree of oligomerization.
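The conversion between the two conventions is simple arithmetic, sketched below under the assumption that every alkyl chain of an oligomer contributes equally. The function name and the input values are hypothetical and serve only as an illustration.

```python
def per_alkyl_chain(gamma_per_molecule, area_per_molecule, chains):
    """Convert per-molecule values to per-alkyl-chain values.

    gamma_per_molecule -- Gamma_max in mol molecule/m^2
    area_per_molecule  -- a_min in nm^2/molecule
    chains             -- number of alkyl chains per surfactant molecule
    """
    gamma_per_chain = gamma_per_molecule * chains  # mol alkyl chain/m^2
    area_per_chain = area_per_molecule / chains    # nm^2/alkyl chain
    return gamma_per_chain, area_per_chain


# Hypothetical trimeric surfactant with three alkyl chains
gamma_chain, area_chain = per_alkyl_chain(1.0e-6, 1.66, chains=3)
print(f"{gamma_chain * 1e6:.2f} x 10^-6 mol alkyl chain/m^2, "
      f"{area_chain:.2f} nm^2/alkyl chain")
```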

Based only on the structure of the surfactant molecule, *a*min could be expected to increase with the degree of oligomerization because of the increased number of headgroups in the molecule. However, since higher oligomers are more surface active, in some cases their molecules are more closely packed at the air/water interface. As a result, *a*min sometimes does not vary with the degree of oligomerization, and it may be the same as or even smaller than that of the corresponding monomer.

Most of the pioneering work on oligomeric surfactant adsorption at the air/water interface was done by Zana and collaborators [3, 7], who laid the basis for the understanding of oligomeric surfactant behavior at the air/water interface.

The influence of the oligomerization degree, spacer and alkyl chain length of *m*-*s*-(*m*-*s*)*x*-*m* (1A-4A) surfactants on surface activity was investigated by different groups [15-17, 26, 27]. For the surfactants with dodecyl chains, the dependence of the surface area occupied by a surfactant molecule (expressed per alkyl chain) on the oligomerization degree depends on the spacer length. For spacer lengths *s*=2 and 3, *a*min remains nearly constant going from monomer to trimer [15, 26, 27], while for *s*=6 a slight increase going from dimer to trimer has been observed [15].




For *s*=2, the values of both the maximum surface excess concentration and the minimum area per molecule adsorbed at the air/solution interface indicate a vertical orientation of the hydrophobic tails towards the air for monomeric and oligomeric surfactants, with the spacers of the latter located at the interface [27].

The *γ*cmc values for the dimeric (2A; *m*=12, *s*=2) and tetrameric (4A; *m*=12, *s*=2) surfactants are much lower than that of their monomeric counterpart, 1A. However, the *γ*cmc value of the trimer (3A; *m*=12, *s*=2) is higher than those of the dimer and tetramer, but still lower than that of monomer 1A. Both the efficiency and the effectiveness of adsorption at the air/solution interface are lower for the trimer than for the dimer and tetramer. The differences in surface activity within the oligomeric series may be attributed to different packing densities at the air/solution interface. The p*C*20 values of these oligomeric surfactants are much higher than that of monomer 1A. Within the oligomeric series, however, the degree of oligomerization does not significantly affect the p*C*20 value [27].
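For readers who want to reproduce such comparisons, the two efficiency measures used in this section, p*C*20 and the cmc/*C*20 ratio, can be computed as in the following sketch. The concentrations are hypothetical and are assumed to be in mol/dm³; the function names are ours.

```python
import math


def pc20(c20):
    """Negative decadic logarithm of C20 (C20 assumed in mol/dm^3)."""
    return -math.log10(c20)


def cmc_c20_ratio(cmc, c20):
    """Ratio cmc/C20; larger values mean adsorption is favored over micellization."""
    return cmc / c20


# Hypothetical surfactant: C20 = 1e-4 mol/dm^3, cmc = 1e-3 mol/dm^3
c20 = 1.0e-4
cmc = 1.0e-3
print(f"pC20 = {pc20(c20):.2f}, cmc/C20 = {cmc_c20_ratio(cmc, c20):.1f}")
```

An increase of one unit in p*C*20 corresponds to a tenfold lower concentration needed to reduce the surface tension by 20 mN m⁻¹.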

In the series of trimeric 12-*s*-12-*s*-12 surfactants *a*min increases with increasing *s*, as is the case for the corresponding series of dimeric surfactants [15]. Among the trimeric quaternary ammonium bromide surfactants of the *m*-2-*m*-2-*m* type, where *m*=8, 10, or 12, the 10-2-10-2-10 surfactant has the highest efficiency: it occupies the smallest area at the air/water interface and has the lowest *γ*cmc [26].

Reliable values of surface tension could not be obtained for the 12-3-12-4-12-3-12 tetramer [15].

It can be concluded that in the series *m*-*s*-(*m*-*s*)*x*-*m* (1A-4A) higher oligomeric surfactants exhibit somewhat better adsorption properties than monomeric and dimeric surfactants. However, the difference in surface activity is larger between monomer and dimer than between dimer and higher oligomers.

Kim [28] and Chelebicki [29] synthesized oligomeric quaternary ammonium salts using epichlorohydrin and epibromohydrin. In that way an alkyl spacer containing a hydroxy group was introduced (2B, 3B). The *γ*cmc values of compounds without a central alkyl chain (gemini surfactant; 2B, *m*=1) were higher than those of the corresponding lower analogs: bisquaternary ammonium dichloride with dodecyl chain (R(CH₃)₂N⁺CH₂CH(OH)CH₂N⁺(CH₃)₂R·2Cl⁻) and monomeric 1B. In contrast, compounds having three dodecyl chains (trimeric surfactant, 3B) show lower *γ*cmc than both monomer and dimer. In addition, a positive charge in the center of the molecule contributes to the lowering of *γ*cmc [28]. Chelebicki et al. [29] varied the counterions (Cl⁻, Br⁻) and the alkyl chain length at the central nitrogen atom (2B; *m*=2-8). The obtained *Γ*max and *a*min values indicated that 2B surfactants with longer alkyl chains are packed more tightly at the air/water interface. The p*C*20 values of all 2B (*X*=Br⁻) surfactants were higher than those of monomer 1B and the analogous dichloride bis-ammonium salts 2B (*X*=Cl⁻). According to the obtained *γ*cmc values, the investigated bromide salts reduce the surface tension of water more than gemini surfactants and analogous dichloride bis-ammonium salts [29].

In conclusion, introducing an additional hydrophobic chain improves the surface activity of 2B and 3B surfactants. Surface activity is also influenced by the counterion: bromides are more surface active than chlorides.

Laschewsky et al. [30] prepared a large number of oligomeric quaternary ammonium surfactants with different degrees of oligomerization and various spacers. In the three investigated series (2C, 3C, 4B) the spacer length was chosen to be in the range between C4 and C6, because the most pronounced changes in the properties of dimeric surfactants were reported for rather short spacer groups. The spacer groups employed, namely *trans*-1,4-buten-2-ylene, *m*-xylylene, and *p*-xylylene, can be considered rigid, thus chemically fixing the distance between the cationic groups within the same molecule. *Γ*max and *a*min values were not reported, in order to avoid controversy over which *n* value should be used (see section 3.2). Therefore surface activity was assessed based only on *γ*cmc. The surface tension at the cmc decreases with the degree of oligomerization, the effect being more pronounced for the longer spacers. The nature of the spacer has an important influence on the packing of the molecules in the monolayer. The *γ*cmc values obtained for the three series with different spacers suggest that, with increasing degree of oligomerization, the packing density of the dodecyl chains in the adsorbed monolayer stays approximately the same for the series with the *trans*-butenylene spacer, increases somewhat for the series with the *m*-xylylene spacer, and improves considerably for the series with the *p*-xylylene spacer [30].


The greater possibility of adjusting surfactant physico-chemical properties by changing molecular structure and conformation has motivated the synthesis and characterization of several star-shaped surfactants.

Yoshimura et al. [33] investigated star-shaped trimeric surfactants consisting of three quaternary ammonium surfactants linked to a tris(2-aminoethyl)amine core (3D). Each ammonium group had two methyls and a straight alkyl chain of 8, 10, 12, or 14 carbons. In comparison with the corresponding monomeric and gemini surfactants with dodecyl chains, *Γ*max of 3D is smaller and *a*min is much larger. However, *a*min calculated per hydrocarbon chain is close to that of a monomeric surfactant, and *a*min per molecule is slightly larger than that of the linear-type trimeric surfactants. This indicates that 3D adsorbs at the air/water interface in an orientation that gives high surface activity through interactions of the multiple hydrocarbon chains, despite the strong electrostatic repulsion between the multiple quaternary ammonium headgroups. These results were also supported by p*C*20 and the cmc/*C*20 ratio [33].
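The inverse relation between *Γ*max and *a*min noted above can be illustrated with a short calculation. The sketch below assumes the standard textbook forms, *Γ*max = −(1/*nRT*)·d*γ*/d ln *C* (Gibbs adsorption isotherm) and *a*min = 1/(*N*A·*Γ*max); the function names, the synthetic data and the prefactor *n* = 4 (one trimeric surfactant ion plus three counterions) are illustrative assumptions and are not taken from [33].

```python
import math

R = 8.314462618      # gas constant, J/(mol K)
N_A = 6.02214076e23  # Avogadro constant, 1/mol


def least_squares_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def gamma_max(conc_molar, gamma_mn_per_m, n_prefactor, temperature_k=298.15):
    """Maximum surface excess in mol/m^2 from pre-cmc surface tension data."""
    ln_c = [math.log(c) for c in conc_molar]
    gamma_si = [g * 1e-3 for g in gamma_mn_per_m]  # mN/m -> N/m
    slope = least_squares_slope(ln_c, gamma_si)    # d(gamma)/d(ln C)
    return -slope / (n_prefactor * R * temperature_k)


def a_min_nm2(gamma_max_mol_m2):
    """Minimum area per molecule in nm^2 (1 m^2 = 1e18 nm^2)."""
    return 1e18 / (N_A * gamma_max_mol_m2)


# Hypothetical, exactly log-linear data below the cmc (illustration only).
conc = [1e-5, 2e-5, 5e-5, 1e-4]                   # mol/dm^3
gamma = [10.0 - 5.0 * math.log(c) for c in conc]  # mN/m

g_max = gamma_max(conc, gamma, n_prefactor=4)     # n = 4 is an assumption
area = a_min_nm2(g_max)
print(f"Gamma_max = {g_max:.3e} mol/m^2, a_min = {area:.2f} nm^2")
```

Because *a*min is simply the reciprocal of *Γ*max scaled by the Avogadro constant, a smaller surface excess always appears as a larger area per molecule. The choice of *n* changes both values by a constant factor, so it should be stated whenever such numbers are compared between monomeric and oligomeric surfactants.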

Trimeric surfactant 3D exhibited *γ*cmc values smaller than those of the corresponding monomeric surfactants, and almost the same as those of the dimeric surfactants. Their *γ*cmc values were also lower than those of the aforementioned linear-type cationic trimeric surfactants with three dodecyl chains and two spacers such as ethylene (3A; *m*=12, *s*=2), *trans*-1,4-buten-2-ylene (3Ctb), *m*-xylylene (3Cmx), or *p*-xylylene (3Cpx). Although the adsorption of the trimeric star-shaped surfactants at the air/water interface was slow, they adsorbed strongly and oriented themselves at the interface, indicating that they lower the surface tension of water efficiently [33].

Wang et al. [22, 34-36] synthesized a variety of oligomeric star-shaped surfactants: trimeric (3E, 3F), tetrameric (4C) and hexameric (6A). However, because of their unusual aggregation behavior and properties, surface activity was investigated only for the tetrameric molecule (4C). The obtained *γ*cmc was higher than that of most of the reported cationic gemini and higher oligomeric surfactants. In addition, the surface tension continues to decrease significantly above the cmc over a large concentration range. This was attributed to the formation of premicellar aggregates below the cmc through hydrophobic interaction among the hydrophobic chains of different molecules [36].


Higher Oligomeric Surfactants — From Fundamentals to Applications

http://dx.doi.org/10.5772/57655


Neutral and cationic series of trimeric *β*-hydroxy amino ammonium surfactants with different alkyl chain lengths (3G, 3H; *m*=8, 12, 18) were synthesized by Grau et al. [37]. Poor solubility of the neutral trimers (3G) prevented a reliable determination of their surface active properties. On the other hand, the cationic compounds (3H) displayed a sharp break in the surface tension *vs.* concentration (on log scale) curves and a final plateau, indicating a well-defined cmc. The variation of *Γ*max and *a*min with alkyl chain length is almost negligible. The values of *a*min decrease somewhat with increasing alkyl chain length, which was attributed to the flexibility of the spacing group and stronger intermolecular van der Waals forces at increasing chain lengths [37].

All three cationic compounds (3H) have good surface activity, as indicated by their p*C*20 values. The cmc/*C*20 ratios indicate that the compound with the dodecyl alkyl chain has a slightly greater preference for adsorption than for micellization [37].

In contrast to the relatively large amount of data for oligomeric cationic surfactants, data for anionic and nonionic surfactants are scarce.

Yoshimura and Esumi [38, 39] have synthesized diverse anionic surfactants. The physico-chemical properties of all surfactants were investigated in alkaline solution at pH 13 (3J-L).

The surface activity of two trimeric surfactants (3K, 3L) was compared with that of the corresponding single-chain surfactant, sodium 2-aminododecanoate. All parameters, *Γ*max, *a*min and *γ*cmc, show that the investigated trimeric surfactants lower the surface tension more efficiently than the single-chain surfactant. The surface area occupied by a 3L molecule is somewhat larger than that of the single-chain surfactant. On the other hand, *a*min for 3K is very small. This can be attributed to the smaller electrostatic repulsion between the chains in the 3K molecule, because its central chain, unlike that of 3L, bears no charged groups. This enables closer packing of the chains. Of the two triple-chain surfactants, 3K shows the lower surface tension. It is considered that the orientation of 3L, derived from tris(2-aminoethyl)amine, is less effective at the air/water interface because of its bulky structure compared with 3K, derived from 3-aza-1,5-pentanediamine [39].

The obtained cmc/*C*20 values for both trimeric surfactants 3K and 3L indicate their preference to adsorb at the air/water interface, owing to the difficulty of packing three hydrocarbon chains into micelles. The cmc/*C*20 of 3L is also much larger than that of 3K, suggesting that it is easier for the former to adsorb at the interface than for the latter [39].
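The two adsorption indicators used throughout this section can be written down in a few lines. The sketch below assumes the usual definitions, with *C*20 the surfactant concentration that lowers the surface tension of the solvent by 20 mN/m and p*C*20 = −log10 *C*20. The surfactant names and concentrations are hypothetical and serve only to show how the two quantities are read.

```python
import math


def pc20(c20_molar):
    """pC20 = -log10(C20); a larger value means more efficient adsorption."""
    return -math.log10(c20_molar)


def cmc_c20_ratio(cmc_molar, c20_molar):
    """cmc/C20; a larger value means adsorption is favoured over micellization."""
    return cmc_molar / c20_molar


# Hypothetical values in mol/dm^3, chosen only for illustration.
examples = {
    "surfactant_X": {"cmc": 1.0e-3, "c20": 1.0e-4},
    "surfactant_Y": {"cmc": 1.0e-3, "c20": 5.0e-4},
}

for name, v in examples.items():
    print(name,
          f"pC20 = {pc20(v['c20']):.2f}",
          f"cmc/C20 = {cmc_c20_ratio(v['cmc'], v['c20']):.1f}")
```

With equal cmc values, the hypothetical surfactant X reaches the 20 mN/m reduction at a lower concentration, so both its p*C*20 and its cmc/*C*20 are larger. This is the same reading applied above to 3L relative to 3K.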

Although the areas occupied per molecule of the ring-type trimeric surfactants (3J) were expected to be large, because they possess three hydrocarbon chains and a bulky triazine ring, they turned out to be small. The *a*min of the surfactant with the shorter chain was comparable to those of single-chain surfactants. An increase in hydrocarbon chain length resulted in even smaller *a*min. It is suggested that the ring-type trimeric surfactant molecules pack densely at the air/water interface and are therefore highly surface active, probably because of the hydrophobic interactions between the multiple hydrocarbon chains [38].

Ring-type trimeric surfactants showed higher efficiency in reducing the surface tension in alkaline solution than single-chain sodium dodecanoate and sodium dodecyl sulfate (SDS). Increasing the length of the side chains lowers *γ*cmc. The obtained values are similar to those obtained for the 3K and 3L anionic trimeric surfactants [38].


The p*C*20 values of the ring-type trimeric surfactants were larger than those of the monomeric anionic surfactants (sodium dodecanoate and SDS) and increased with increasing hydrophobic chain length, indicating that a long hydrophobic chain in the molecule facilitates a more closely packed arrangement at the air/water interface and a more efficient adsorption [38]. In comparison with 3K and 3L, based on cmc/*C*20 values, ring-type surfactants are more likely to form micelles in the bulk solution.

Grau et al. [40] synthesized tetrameric anionic surfactants with multiple ring spacers which are flexible (4E), semi-flexible (4D, 4F) or rigid (4G). To the best of our knowledge this is the only study of the influence of spacer nature on the surface activity of anionic oligomeric surfactants. Dioxane groups in the spacer confer wettability on the synthesized surfactants. It was shown that surfactant 4G, with the most rigid spacer, displayed only minor surface activity at 20 °C, but all four surfactants are surface active at 40 °C.

The obtained *a*min values are less than four times the value for the single-chain surfactant sodium 1-decanesulfonate (C10H21SO3Na), indicating that these tetrameric surfactants are somewhat more closely packed at the air–solution interface than the single-chain reference compound [40].

The *γ*cmc values are smaller than those of single-chain surfactants, but similar to those of double- and triple-decyl chain surfactants with two sulfonate groups. The *γ*cmc values of the investigated surfactants are not meaningfully different [40]. The p*C*20 value of surfactant 4D is larger than those of the corresponding 4E-G surfactants, which are quite similar. This reveals that the p*C*20 values decrease with an increase in the number of dioxane rings, in accordance with the fact that the shortest spacing group provides the maximum efficiency [40].

Although nonionic surfactants are widely applied, especially in emulsion formulations and in drug delivery, there are only a few papers dealing with higher oligomeric nonionic surfactants. One of the reasons may be that various polymeric nonionic surfactants are commercially available.

The surface activity of nonionic oligomeric 3M surfactants was compared with that of oligomeric trimeric nonylphenol polyoxyethylene surfactants (TNP) and monomeric nonylphenol polyoxyethylene ether surfactants (NP) [41]. The *γ*cmc and *Γ*max values of the TNP surfactants are lower than those of the corresponding monomeric NP surfactants. For both TNP and NP surfactants *a*min decreases greatly as the oxyethylene chain length is increased. In general, low *a*min values suggest close packing of the surfactants at the air/water interface with an almost perpendicular orientation of the surfactant molecule. The area per surfactant molecule of TNP is three times smaller than that of NP, which indicates that the trimeric surfactant molecules of TNP are not arranged side by side at the air–water interface, but in a staggered three-dimensional arrangement. This could be explained by increased hydrophobic interactions due to the increase in the number and length of the chains. Hence, TNP surfactants exhibit much better surface activities, including strong adsorption at the surface and wetting ability [41].


The surface activity of 3M surfactants is not improved as much relative to monomeric surfactants. *a*min increases with increasing hydrophobic spacer length. This may be explained as follows: as the number of carbon atoms in the hydrophobic spacer of 3M increases, the molecule becomes more hydrophobic. This decreases the saturation adsorption of 3M at the air–water interface, resulting in an increase of the surface tension at the cmc, *γ*cmc.

Mohamed et al. [42] prepared a series of trimeric nonionic surfactants based on tris(2-aminoethyl)amine with varying alkyl and poly(ethylene glycol) chain lengths, 3N. It was found that *a*min for the prepared trimeric surfactants increases with increasing alkyl chain length, because the surfactant molecules adsorb at the air/water interface and orient themselves so that the hydrophobes are directed away from water. In contrast, *a*min decreases with increasing hydrophilic chain length within the group. This behavior was previously reported for nonionic surfactants and is explained by the longer poly(ethylene glycol) chains coiling in order to minimize any probable interactions between them [42].

In terms of minimum surface tension, octyl-3N surfactants proved to be the most efficient in lowering the surface tension of aqueous solutions. Increasing the hydrocarbon chains from octyl to decyl to dodecyl caused an increase of *γ*cmc, whereas increasing the poly(ethylene glycol) chain length within the same group leads to a decrease in *γ*cmc [42].

In conclusion, the described higher oligomeric surfactants do possess better surface activity than monomeric surfactants. However, the change is, in most cases, smaller than that on going from monomer to dimer. It was found that the influence of the nature and length of the spacer, as well as of the alkyl chain length, on adsorption is not the same for different surfactant series.

#### **3.3. Micellization**

One of the main characteristics of surfactants is that the physico-chemical properties of surfactant solutions change abruptly over a small concentration range (see for example Fig. 7). This is a consequence of a significant change in the nature of the solute species, i.e. the formation of supramolecular aggregates called micelles. The surfactant concentration at which the change occurs is called the critical micelle concentration (cmc). The cmc can also be defined as the minimum surfactant concentration at which micelles are formed and remain in dynamic equilibrium with free monomers [1, 2, 43].

The main driving force for micelle formation in aqueous solution is the effective interaction between the hydrophobic parts of the surfactant molecules. Interactions opposing micellization may include electrostatic repulsive interactions between charged head groups of ionic surfactants, repulsive osmotic interactions between chainlike polar head groups such as oligo(ethylene oxide) chains, or steric interactions between bulky head groups [49].

Cmc values are characteristic of a given surfactant. Factors influencing the cmc value are alkyl chain length, type of hydrophilic group and/or counterions, ionic strength, pH, pressure, temperature, etc.


Micelle size and shape are further properties of crucial importance for the application of surfactants. Micelles can vary in size and shape (e.g. spherical, cylindrical, disklike, wormlike), depending on the structure of the molecule and the experimental conditions (e.g. surfactant concentration, presence of electrolytes, temperature, etc.). The aggregation number of the micelles, i.e. the number of surfactant molecules present, ranges between 50 and 200, and can be determined by static light scattering (SLS) or small angle neutron scattering (SANS).

The change of free energy of micellization (Δ*G*mic) tells us whether micellization is a spontaneous process (Δ*G*mic < 0) or not (Δ*G*mic > 0) and gives the magnitude of its driving force. An expression for the free energy of micellization of oligomeric surfactants has been derived by Zana [12]:

$$
\Delta G_{\rm mic} = \left(\frac{1}{j} + \beta\right) RT \ln \mathrm{cmc} - \left(\frac{RT}{j}\right) \ln j \tag{3}
$$

where *β* is the degree of counterion association to the micelle/solution interface and *j* is the number of alkyl chains connected by spacer groups. The micelle ionization degree, *α*, is defined as *α* = 1 − *β*.
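As a numerical illustration of Eq. (3), the following sketch evaluates Δ*G*mic for the 12-3-12 dimer using the cmc (0.96 mmol/dm3) and ionization degree (0.22) listed for it in Table 2. The text does not state the concentration scale used inside the logarithm; here the cmc is expressed in moles of alkyl chain per dm3 (*j* times the molar cmc). This is an assumption, adopted because it gives values close to those tabulated for 12-3-12 and 12-6-12. The function name and argument names are illustrative.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def delta_g_mic(cmc_molar, beta, j, temperature_k=298.15):
    """Free energy of micellization per mole of alkyl chain, J/mol (Eq. 3).

    cmc_molar : cmc in mol/dm^3 of surfactant molecules
    beta      : degree of counterion association (beta = 1 - alpha)
    j         : number of alkyl chains per surfactant molecule
    """
    # Assumption: the cmc in the logarithm is counted per alkyl chain.
    cmc_per_chain = j * cmc_molar
    rt = R * temperature_k
    return (1.0 / j + beta) * rt * math.log(cmc_per_chain) - (rt / j) * math.log(j)


# 12-3-12 dimer at 25 C: cmc = 0.96 mmol/dm^3, alpha = 0.22 (Table 2)
dg_dimer = delta_g_mic(0.96e-3, beta=1 - 0.22, j=2)
print(f"Delta G_mic = {dg_dimer / 1000:.1f} kJ per mol alkyl chain")
```

Under these assumptions the result is about −20.7 kJ/mol, against the tabulated −Δ*G*mic of 20.8 kJ/mol. For *j* = 1 the second term vanishes and the expression reduces to the familiar (1 + *β*)·*RT*·ln cmc used for conventional ionic surfactants.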

The fact that dimeric and higher oligomeric surfactants have cmcs several orders of magnitude lower than the corresponding monomeric surfactants was among the main reasons for the investigation of these surfactants. Lower cmcs mean that less surfactant is needed, which has both financial and environmental benefits. The general trend observed is a decrease of cmc with increasing degree of oligomerization, which is mainly attributed to thermodynamic reasons [3, 30]: the entropic loss resulting from micellization of the surfactants becomes smaller. However, within a series of the same surfactants with different degrees of oligomerization, the difference in cmc is largest on going from monomer to dimer, and becomes smaller on going from one higher oligomer to the next. The micellization properties of higher oligomeric surfactants were assessed based on cmc and Δ*G*mic values. Available data are summarized in Table 2.

For the dimeric, trimeric and tetrameric dodecyl quaternary ammonium bromides (1A-4B) the general trend of decreasing cmc with increasing degree of oligomerization was observed [15, 16, 27]. The cmc values increase with spacer length for trimeric surfactants, as is the case for the corresponding dimeric surfactants. In addition, the values of Δ*G*mic (per mole of dodecyl chain) are all around −20 kJ/mol, irrespective of the degree of oligomerization and/or spacer length, as reported by In et al. [15]. A recent study showed that, on a per chain basis, for 1A – 4A surfactants (*m*=12, *s*=2) Δ*G*mic is more negative for the monomer than for the oligomers [27]. This may be attributed to steric hindrance that prevents the short ethylene spacer from becoming part of the micelle core. The Δ*G*mic values within the oligomeric series become less negative with increasing degree of oligomerization, as a consequence of the lower driving force for micellization. Both Δ*G*ads and Δ*G*mic are negative, showing spontaneous adsorption and micellization. Differences in their magnitudes reveal that these surfactants have a greater preference for adsorption than for micellization. In the series of trimeric *m*-2-*m*-2-*m* surfactants (3A; *m*=8, 10, 12) the cmc decreases linearly with increasing alkyl chain length, as for monomeric surfactants [26]. However, the effect of hydrocarbon chain length on the cmc is smaller for trimeric than for monomeric surfactants.

Kim et al. [28] observed that the cmc values of the quaternary ammonium compounds with two dodecyl chains (2B; *m*=1) are 2 orders of magnitude lower, and those of the compounds with three dodecyl chains (3B; *m*=12) are 4 orders of magnitude lower, than that of the conventional dodecyltrimethylammonium chloride (1B), regardless of the number of hydrophilic ammonio groups in the molecule. The cmc values of compounds having a dimethylammonio group in the center of the molecule are of the same order as those of compounds in which it is not present. This means that the additional charge in the center of the molecule has only a small effect on the cmc of the quaternary ammonium salts. In addition, there is a linear relationship between the total carbon number in the hydrophobic group and the cmc on a semilogarithmic scale for all investigated quaternary ammonium salts, interestingly irrespective of the number of hydrophilic ammonio groups in the molecule [28].
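A semilogarithmic relation of this kind is a straight line in log10(cmc) against carbon number. As a minimal sketch, the fit below uses the three 3H entries of Table 2 (3H-8, 3H-10 and 3H-12), assuming that the first number in each of those rows is the cmc in mmol/dm3 and that the suffix is the alkyl chain length. The data of Kim et al. [28] are not reproduced in this chapter, so the 3H series is used only as a stand-in to show the procedure.

```python
import math


def fit_line(x, y):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# 3H-8, 3H-10, 3H-12 rows of Table 2 (assumed to be cmc in mmol/dm^3)
chain_length = [8, 10, 12]
cmc_mmol = [0.622, 0.223, 0.049]

intercept, slope = fit_line(chain_length, [math.log10(c) for c in cmc_mmol])
print(f"log10(cmc) = {intercept:.2f} + ({slope:.3f}) * m")
```

The fitted slope is roughly −0.28 per carbon atom, i.e. the cmc falls by about a factor of two for each added methylene group in each chain. With only three points this is an illustration of the procedure, not a test of linearity.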

An increase in alkyl chain length from 2 to 8 resulted in a linear decrease of the cmc values of 2B surfactants with both bromide and chloride counterions [29]. However, the type of counterion influences the cmc values: the values for the chlorides were higher than for the bromides. The obtained values were 2 orders of magnitude lower than those of the corresponding monomeric 1A and 1B. They were also lower than those of dimeric *m*-*s*-*m* surfactants with hydrophobic flexible spacers (2A; *s*=3, 4, 6). The authors explained this by the formation of H-bonds between the two OH groups in the spacer and water molecules, which facilitates the bending of the spacer toward the aqueous phase, forming the convex micellar surface [29].

**Table 2.** The critical micellization concentration determined by surface tension (cmc) and by electrical conductivity measurement (cmc), the degree of counterion dissociation (α), aggregation number (*N*agg) and the free energy of micellization (Δ*G*mic) obtained at a given temperature (*T*) of higher oligomeric and corresponding monomeric and dimeric surfactants. Abbreviated names of listed surfactants are given according to Figures 2-4.

**Surfactant**

*Cationic* 

12-2-12

0.9 0.8 1

12-3-12 0.96 0.22 20.8 25 15, 50

12-6-12 1.03 0.33 18.8 25 15, 50

10-2-10-2-10 0.95 25 26

25

26

27

30

0.6 0.8 0.26 19.3

12-3-12-3-12 0.14 0.19 21.5 25 15

12-6-12-6-12 0.28 0.30 19 25 15

12-2-12-2-12-2-12 0.7 0.8 0.35 16.7 40 27

12-3-12-4-12-3-12 0.06 0.20 22.6 25 15

1B 18.3 22 0.34 21.94 34.3 23 29, 30, 51

BQADC\*\* 0.78 20 28

2B-12 Cl 0.0062 20 28

3B – 12 Cl 0.0096 20 28

1C 7.0 27.3 23 30, 51

2Ctb 2.0 15.5 23 30, 31, 51

2C mx 1.5 11.3 23 30, 31, 51

2Cpx 2.1 10.5 23 30, 31, 51

3Ctb 0.36 5.0 23 30, 51

3Cmx 0.28 5.4 23 30, 51

3Cpx 0.29 3.5 23 30, 51


4Btb 0.12 3.8 23 30, 51

4Bmx 0.09 23 30

4Bpx 1.3 3.5 23 30, 51

3I 1.0 1.0 25 32

3H - 8 0.622 20 37

3H - 10 0.223 20 37

3H- 12 0.049 20 37

\*decyltrimethylammonium bromide

\*\*bisquaternary ammonium dichloride with dodecyl chain (R(CH3)2N+CH2CH(OH)CH2+N(CH3)2R·2Cl-)

12-2-12-2-12 0.065

**cmc**

**mmol / dm3**

**cmc**

**mmol / dm3**

1A 14 14 15.1 15 0.25 0.29 18.3 18.1 25 30

10-2-10 5.0 25 44

0.27 23.2 25 30

**-***G***mic kJ / mol alkyl chain** 

*N***agg**

*T* 

**C**

**Ref**

15, 16

27

16

27

**o**


In conclusion, Kim and Chlebicki showed that introducing an additional hydrophobic chain results in lower cmc values. The type of counterion also influences the micellization process: bromide salts aggregate at lower concentrations than chlorides. The obtained results are in accordance with the generally observed trend for oligomeric surfactants.

The cmc values of oligomeric quaternary ammonium surfactants with various rigid spacers (2C, 3C, 4B) are much lower than those of the structurally closely related surfactant monomers, 1B and 1C [30]. The decrease in cmc is more pronounced going from monomers to dimers than going from trimers to tetramers. The chemical nature of the spacer was shown to influence the cmc values among surfactants of the same degree of oligomerization. For the isomeric *m*-xylylene and *p*-xylylene spacers, the cmc slightly increases with increasing spacer length, similar to the behavior of oligomeric surfactants with flexible alkyl spacers. The oligomers with the *m*-xylylene spacer have lower cmcs than those with the butenylene spacer, despite their increased length, due to their higher hydrophobicity [30]. An exception to this behavior was the 4Bpx surfactant. The reason lies in the formation of premicellar aggregates at very low concentration, which shifts the cmc to values higher than expected. Premicellar aggregation can occur in solutions of conventional surfactants that are sufficiently hydrophobic (at least 14 carbon atoms) and in those of dimeric *m*-*s*-*m* surfactants (for 12-*s*-12 with *m* ≥ 12, for *m*-8-*m* with *m* ≥ 14, and for 16-*p*-xylylene-16) [11].



The cmc of the cationic trimeric surfactants 3H decreases linearly with the number of carbon atoms in the alkyl chain. In other words, cmc values decrease with alkyl chain length even for a very hydrophobic molecule with three octadecyl chains. Calculated cmc/*C*20 values indicate that the trimeric surfactant with 16 C atoms in the alkyl chains has a slightly greater tendency to adsorb at the air/water interface than to form micelles [37].
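The chain-length dependence described for the 3H series can be checked with a simple straight-line fit. The sketch below assumes that the linearity refers to the logarithm of the cmc (as is stated explicitly for the 3D series) and takes the three 3H entries of Table 2 at face value as cmc values in mmol/dm3; the helper name `fit_log_cmc` is hypothetical.

```python
import math

# (carbon atoms per alkyl chain, cmc in mmol/dm3), read from the 3H rows of Table 2
data_3h = [(8, 0.622), (10, 0.223), (12, 0.049)]

def fit_log_cmc(points):
    """Least-squares line log10(cmc) = a + b*n; returns (a, b)."""
    n_pts = len(points)
    xs = [n for n, _ in points]
    ys = [math.log10(c) for _, c in points]
    x_mean = sum(xs) / n_pts
    y_mean = sum(ys) / n_pts
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

intercept, slope = fit_log_cmc(data_3h)
print(f"log10(cmc) = {intercept:.2f} + ({slope:.3f}) * n")
```

A negative slope means that each additional carbon atom in the chains lowers log cmc by a roughly constant amount; with only three points the fit is indicative rather than conclusive.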

In the 3D series [33] the cmc values decrease with increasing chain length from 10 to 14. The cmc of 3D-8 could not be determined because, owing to the short chains, micelles did not form even at the highest concentration studied and the solution simply became turbid. The cmc value of the 3D-12 surfactant is lower than those of the corresponding monomeric and gemini surfactants. The cmc in that series of surfactants decreases by an order of magnitude for each additional surfactant moiety. Compared with the linear 12-2-12 surfactant, the cmc of 3D-12 was slightly higher, in line with the authors' assumption that tris(2-aminoethyl)amine would have greater hydrophobicity. 3D-12 also exhibited a cmc lower than those of the linear cationic trimeric surfactants with spacers such as *trans*-1,4-buten-2-ylene (3Ctb), *m*-xylylene (3Cmx), and *p*-xylylene (3Cpx) [33]. The relationship between the logarithm of the cmc and the hydrocarbon chain length (as for monomeric and gemini surfactants) or the chain number was linear [33]. This means that the longer the chains and the higher the number of chains in the surfactant, the lower the cmc will be.

The values of p*C*20 and the cmc/*C*20 ratio in the 3D series were much larger than those of most ionic monomeric surfactants. In addition, the absolute values of Δ*G*ads are significantly greater than those of Δ*G*mic for all hydrocarbon chain lengths. Both facts suggest that the adsorption of star-type trimeric surfactants is preferred over micellization [33].
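The two adsorption descriptors used above follow from their standard definitions: *C*20 is the surfactant concentration that lowers the surface tension of water by 20 mN/m, p*C*20 is its negative decadic logarithm, and a larger cmc/*C*20 ratio indicates a stronger preference for adsorption over micellization. The sketch below only illustrates the arithmetic; the *C*20 value is hypothetical, since this section does not report it, and the function names are not from the source.

```python
import math

def pc20(c20_mol_dm3):
    """Adsorption efficiency: -log10 of the concentration (mol/dm3)
    that lowers the surface tension of water by 20 mN/m."""
    return -math.log10(c20_mol_dm3)

def cmc_c20_ratio(cmc_mol_dm3, c20_mol_dm3):
    """Ratio above 1; it grows as adsorption is favoured over micellization."""
    return cmc_mol_dm3 / c20_mol_dm3

cmc = 0.139e-3   # mol/dm3, 3D-12 entry of Table 2 (assumed to be the surface tension value)
c20 = 1.0e-5     # mol/dm3, hypothetical value chosen only for illustration
print(f"pC20 = {pc20(c20):.2f}, cmc/C20 = {cmc_c20_ratio(cmc, c20):.1f}")
```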


**Table 2.** Continued.

**Surfactant**

*Cationic*

3E 4C 6A

*Anionic* 

C12H25OSO3Na

1.31a

0.11a

0.29

0.24

3J-4 0.19 32 25 38

3J-10 0.011 58020 25 38

C10H21CH(NH)2COONa 1 34.2 25 39

3K 0.0063 10.7 25 39

3L 0.0167 9.9 25 39

a 20 40

a 20 40

a 20 40

3N 8-PEG400 8.23 × 10<sup>-4</sup> 25 42

3N 8-PEG1000 7.57 × 10<sup>-4</sup> 25 42

3N 8-PEG2000 7.1 × 10<sup>-4</sup> 25 42

3N 10-PEG400 7.92 × 10<sup>-4</sup> 25 42

3N 10-PEG1000 7.34 × 10<sup>-4</sup> 25 42

3N 10-PEG2000 6.91 × 10<sup>-4</sup> 25 42

3N 12-PEG400 7.50 × 10<sup>-4</sup> 25 42

3N 12-PEG1000 7.15 × 10<sup>-4</sup> 25 42

3N 12-PEG2000 6.63 × 10<sup>-4</sup> 25 42

a the critical aggregation concentration is reported

4D 0.0468

4E 0.178

4F 0.105

*Nonionic* 

8.2 25 38

1.13 a

**cmc mmol / dm3**

 **cmc**

**mmol / dm3**

 3D-10 1.17 1.60 30.0 25 33

3D-12 0.139 0.177 36.3 25 33

3D-14 0.00647 0.384 33.7 25 33

3F 0.33a 0.39a 0.50 25 34

0.20a 0.32a 0.45 25 34

0.08 0.12 0.73 25 36

25 22

**-***G***mic kJ / mol alkyl chain** 

*N***agg**

*T* 

**C**

**Ref** 


**o**

Trimeric surfactants 3E and 3F display unusual aggregation behavior in aqueous solution [34]. Both trimeric molecules form vesicles just above the critical aggregation concentration (cac), and the vesicles then gradually transform into micelles as the surfactant concentration increases. Normally, surfactants form small aggregates first, and the aggregates may then become larger with increasing surfactant concentration. The reported cac value of 3E is lower than that of 3F. Both values are slightly lower than the cmc of cationic gemini surfactants. The enthalpy changes for aggregation have large negative values for both surfactants, indicating that they have similar aggregation behavior and ability, and that their aggregation is dominated by hydrophobic interaction. This is expected because the investigated trimeric surfactants have similar molecular structures. 3F is a completely symmetric molecule, whereas 3E is not symmetric and has a spacer slightly different from that of 3F [34].

Tetrameric (4C) and hexameric (6A) surfactants synthesized by the same group of authors [22, 36] also display interesting aggregation behavior, which will be addressed in more detail in the next part of the chapter. The reported cmc value of tetrameric 4C (0.08 mmol dm-3) is at least an order of magnitude smaller than those of cationic gemini surfactants [36]. This is in accordance with the fact that an increasing number of alkyl chains in the surfactant molecule results in decreasing cmc values.

Hexameric 6A, like 3E and 3F, forms vesicles rather than micelles above the cac. It is reported that 6A displays two cac values (C1 and C2) as a consequence of an aggregate transformation caused by changes of the surfactant configuration through hydrophobic interaction among the hydrocarbon chains [22].


The enthalpy change for the aggregation of 6A exhibits a very large negative value, much larger than those of other surfactants with a closely similar cationic ammonium amphiphilic moiety. This enthalpy change should concern the entire aggregation process, including both the first and the second aggregation processes at C1 and C2 [22].

The enthalpy change of aggregation for 6A is much more negative than for 1A, 2A (*m*=12, *s*=6), 3F and 4C, due to the much stronger inter- and intramolecular hydrophobic interactions between the alkyl chains. That is to say, the cooperative hydrophobic interaction becomes stronger as the number of hydrophobic chains in a surfactant molecule increases. Of course, hydrogen bonding between amide groups can also increase the enthalpy change per amphiphilic moiety for 3F, 4C, and 6A. However, each amphiphilic moiety of these molecules has one amide group, and the significantly enhanced enthalpy change per amphiphilic moiety for 6A confirms that the contribution of each hydrocarbon chain to the inter- and intramolecular hydrophobic interaction in the 6A aggregation becomes much stronger than that for the other surfactants [22].

The cmc values of the ring-type anionic trimeric surfactant 3J decrease with increasing alkyl chain length from 4 to 10 [38]. The obtained cmc/*C*20 ratios are very small compared with those of single-chain surfactants, suggesting that trimeric anionic ring-type surfactants are more likely to form micelles in the bulk solution due to the interactions between the multiple hydrocarbon chains.

The anionic triple-chain surfactants 3K and 3L [39] have cmc values one to two orders of magnitude lower than that of the corresponding single-chain sodium 2-aminododecanoate. This indicates that both surfactants have excellent micelle-forming ability at low concentration. The cmc of 3K is also lower than that of 3L, showing the effect of the number of hydrophilic groups on the cmc. In contrast, the absolute values of Δ*G*mic per hydrocarbon chain of 3K and 3L are much smaller than that of the single-chain surfactant. This indicates that the steric hindrance of closely connected hydrocarbon chains makes it difficult for the triple-chain surfactants to form micelles. This result is supported by the large cmc/*C*20 ratio [39]. The opposite result for the cmc/*C*20 ratio was obtained for the ring-type anionic trimeric surfactant 3J [38].

The cmc values of the 4D-4G tetrameric surfactants depend on the architecture of the spacer [40]. Increasing the number of dioxane rings in the spacer increases the cmc value; surfactant 4D has the smallest cmc value among the investigated surfactants. The dioxane rings appear to act as hydrophilic units, which contribute to increasing cmc values. Furthermore, a comparison between homologous compounds, such as surfactants 4E and 4F, reveals that, as expected, the higher the flexibility of the spacer, the higher the cmc. This behavior is more evident at higher temperature (40 °C) [40].

The cmc/*C*20 values indicate that 4D-4G surfactants have a higher preference to aggregate than gemini surfactants. The authors proposed that this might be because the four hydrophobic groups are more suitably oriented to accommodate themselves in the interior of the aggregates than the two hydrophobic groups of the gemini surfactants. The preference for aggregation, relative to adsorption, increases with the number of dioxane rings and the rigidity of the spacer [40].


As a rule, cmc values of nonionic surfactants are lower than those of ionic surfactants due to the weaker electrostatic repulsion of the hydrophilic groups at the micelle/water interface. Although the data for nonionic higher oligomeric surfactants are scarce, this trend has been observed in the higher oligomeric nonionic surfactants investigated so far. The cmc value of Tyloxapol (7A) determined by time-resolved fluorescence quenching (TRFQ) is in the micromolar range, i.e., about a hundred times lower than that of the "monomer" TX100 [21].

The cmc values of trimeric nonylphenol polyoxyethylene and monomeric nonylphenol polyoxyethylene ether surfactants 3M increase with the number of oxyethylene groups in the spacer due to the greater hydrophilic character. On the other hand, increasing the hydrophobic chain length results in a lower cmc. Short hydrophobic chains are stretched and in contact with water. Therefore they need more free energy to form a micelle, and as a result the cmc values of these surfactants are higher. Long hydrophobic chains, longer than the equilibrium distance of electrostatic repulsion of the head groups, pack inside the hydrophobic core of the micelle, which may decrease the free energy and consequently lower the cmc value [41].

The cmc values of trimeric nonionic 3N surfactants are small, which suggests that they easily form aggregates in solution. As expected, the cmc values decrease with increasing alkyl chain length. However, they also decrease with increasing poly(ethylene glycol) chain length [42].

In conclusion, the cmc values of higher oligomeric surfactants are smaller than those of the corresponding monomeric and dimeric surfactants. However, the changes become less significant as the degree of oligomerization increases. Reported cmc/*C*20 ratios indicate that, as expected, structural factors have a dominant role in determining the preference of higher oligomeric surfactants toward adsorption or micellization. This is corroborated by the wide range of Δ*G*mic values obtained for different types of oligomeric surfactants (Table 2).
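The diminishing effect of each further alkyl chain can be made concrete by comparing successive cmc ratios. The sketch below uses the cmc entries of Table 2 for the 12-3-12 family as an illustration; note that the tetramer has a 3-4-3 spacer pattern, so the series is only approximately homologous, and the helper name `stepwise_ratios` is hypothetical.

```python
# cmc in mmol/dm3, keyed by the number of dodecyl chains, as listed in Table 2
cmc_by_chains = {
    2: 0.96,   # 12-3-12
    3: 0.14,   # 12-3-12-3-12
    4: 0.06,   # 12-3-12-4-12-3-12
}

def stepwise_ratios(cmc_map):
    """Return {(j, j_next): cmc_j / cmc_j_next} for consecutive chain numbers."""
    keys = sorted(cmc_map)
    return {(a, b): cmc_map[a] / cmc_map[b] for a, b in zip(keys, keys[1:])}

ratios = stepwise_ratios(cmc_by_chains)
for (a, b), r in ratios.items():
    print(f"{a} -> {b} chains: cmc drops by a factor of {r:.1f}")
```

The drop from dimer to trimer is several times larger than the drop from trimer to tetramer, in line with the statement that the changes become less significant at higher degrees of oligomerization.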

#### **3.4. Properties of higher oligomeric surfactants aggregates**

The most striking feature of the dimeric surfactants with short spacers, in comparison with monomeric ones, is their tendency to form elongated micelles already at relatively low concentrations, without added electrolyte. Zana has pointed out that the origin of the different aggregation behavior of monomeric and dimeric surfactants lies in the different distribution of the head group distances at the micelle/water interface in these two classes of surfactants [3, 10]. For the monomeric surfactants, the head groups are randomly distributed on the surface separating the aqueous phase and the hydrophobic core of the micelle. The distribution of distances between head groups has a maximum at a thermodynamic equilibrium distance (*d*T) determined by the balance of the opposing forces involved in micelle formation. In the case of dimeric surfactants, the distribution becomes bimodal. One maximum corresponds to the thermodynamic distance; the other, narrower one lies at a distance *d*S that corresponds to the length of the spacer. The length of the spacer is determined not only by the number of atoms in the spacer but also by its conformation. The distance *d*S can be adjusted to be smaller than, equal to or larger than *d*T by changing the structure of the spacer, which opens the possibility of creating a variety of structures [3]. The bimodal distribution of head group distances and the presence of the chemical link between head groups strongly affect the curvature of surfactant layers, and thus the micelle shape. From this point of view, additional alkyl chains and hydrophilic or hydrophobic spacers in molecules of higher oligomeric surfactants complicate the aggregation processes even more. For higher oligomeric surfactants, most of the data on aggregation behavior available to date concern linear and star-shaped dodecyl quaternary ammonium surfactants. The experimental techniques used to determine the aggregation number and structure in micellar solutions of oligomeric surfactants are SANS, TRFQ, SLS, dynamic light scattering (DLS) and transmission electron microscopy at cryogenic temperatures (cryo-TEM).

In et al. [15] have shown, using cryo-TEM, that the sequence of aggregate morphologies of the linear dodecyl quaternary ammonium surfactants 1A-4A (*m*=12, *s*=3) with increasing degree of oligomerization from 1 to 4 is:

spherical micelles (monomer) → linear wormlike micelles (dimer) → branched wormlike micelles (trimer) → closed-loop (ring) micelles (tetramer).

These results were confirmed by molecular modeling and molecular dynamics simulations [19, 20]. The changes in micelle shape can have a strong impact on the rheology of oligomeric surfactant solutions, as discussed in section 3.5.

In the 12-*s*-(12-*s*)*x*-12 series with a long spacer, *s*=6, there is practically no change in micelle shape with increasing degree of oligomerization. It was shown that the aggregation numbers of these oligomeric surfactants are similar to the aggregation number of DTAB (1A) spherical micelles [15].

The size distributions of the micelles of the trimeric surfactants *m*-2-*m*-2-*m* (3A; *m*=8, 10, 12) were obtained by DLS at concentrations 2- to 16-fold the cmc [26]. A bimodal size distribution was obtained for all three surfactants. The smaller hydrodynamic diameter was determined to be 3.8–5.2, 3.8–5.4 and 4.8–6.6 nm for 8-2-8-2-8, 10-2-10-2-10 and 12-2-12-2-12, respectively. The hydrodynamic diameter of the larger particles was 30–50 nm. The aggregation numbers at the cmc, determined by SLS, increase with increasing alkyl chain length from 7±3 for 8-2-8-2-8 to 8±2 for 10-2-10-2-10 and 18±1 for 12-2-12-2-12. The aggregation number of 12-2-12-2-12 corresponds to 54 (3 × 18) alkyl chains in the micelle, which is almost identical to the aggregation number of DTAB (1A) and to twice that of 12-2-12 [26]. The small aggregation numbers of 8-2-8-2-8 and 10-2-10-2-10 were attributed to their shorter hydrocarbon chains [26].
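The conversion between an aggregation number and the number of alkyl chains in a micelle is simple arithmetic, sketched below in Python. The function and variable names are illustrative and do not come from the cited work; the values are the mean SLS aggregation numbers quoted above, with the reported uncertainties ignored.

```python
def alkyl_chains_per_micelle(n_agg: int, chains_per_molecule: int) -> int:
    """Number of alkyl chains in a micelle built from n_agg surfactant molecules."""
    return n_agg * chains_per_molecule


# Mean aggregation numbers at the cmc quoted in the text for the 3A trimers [26]
TRIMER_N_AGG = {"8-2-8-2-8": 7, "10-2-10-2-10": 8, "12-2-12-2-12": 18}

# Each trimeric molecule carries three alkyl chains
chains = {
    name: alkyl_chains_per_micelle(n_agg, 3)
    for name, n_agg in TRIMER_N_AGG.items()
}

for name, count in chains.items():
    print(f"{name}: {count} alkyl chains per micelle")
```

Expressing micelle size in alkyl chains rather than in molecules is what allows the trimer to be compared directly with the monomer DTAB and the dimer 12-2-12.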

Particle size measurements at a concentration twice the cmc revealed that the apparent hydrodynamic diameter of the monomeric, dimeric, trimeric and tetrameric dodecyl ammonium bromides with an ethylene spacer is around 3.6 nm in all cases, indicating the formation of spherical micelles [27]. This is in accordance with theoretical considerations [19] and with experimental data for oligomeric surfactants showing that their micelles are spherical at the cmc [15]. It has been shown that both DTAB (1A) and 12-2-12 form spherical micelles, DTAB even at fairly high concentrations and 12-2-12 up to 1.3 wt% [50, 51]. The larger particles observed in Yoshimura's study [26] of the trimers were not detected in this case.
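Hydrodynamic diameters from DLS are conventionally obtained from the measured translational diffusion coefficient through the Stokes–Einstein relation, *d*<sub>H</sub> = *k*<sub>B</sub>*T* / (3π*ηD*). The Python sketch below illustrates that relation only. The cited studies do not state their temperature or solvent viscosity here, so 25 °C and the viscosity of pure water are assumptions, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_diameter(diffusion_coeff, temperature=298.15, viscosity=0.89e-3):
    """Stokes-Einstein diameter (m) from a diffusion coefficient (m^2/s).

    The default temperature (K) and viscosity (Pa*s, water near 25 C)
    are assumed values, not taken from the cited studies.
    """
    return K_B * temperature / (3.0 * math.pi * viscosity * diffusion_coeff)


def diffusion_coefficient(diameter, temperature=298.15, viscosity=0.89e-3):
    """Inverse relation: diffusion coefficient (m^2/s) for a sphere of given diameter (m)."""
    return K_B * temperature / (3.0 * math.pi * viscosity * diameter)


# Diffusion coefficient expected for a 3.6 nm micelle under the assumed conditions
d_micelle = 3.6e-9
D = diffusion_coefficient(d_micelle)
print(f"D = {D:.3e} m^2/s")
print(f"d_H = {hydrodynamic_diameter(D) * 1e9:.2f} nm")
```

Because the relation assumes non-interacting spheres, the diameters it yields for elongated or interacting micelles are apparent values, which is why the text speaks of an apparent hydrodynamic diameter.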

Chlebicki et al. [29] reported the coexistence of small spherical micelles and large, nearly spherical aggregates when the alkyl chain at the central nitrogen is short (2B). When this alkyl chain is longer, only larger aggregates are observed. In addition, the aggregate size was found to increase with increasing surfactant concentration. It can be concluded that the size distribution of the investigated 2B surfactants depends on the alkyl chain length and concentration but not on the counterion, Br<sup>−</sup> or Cl<sup>−</sup>.

DLS revealed the existence of only small aggregates (2 nm or smaller) in solutions of the 2C, 3C and 4B surfactants (regardless of the spacer) up to 1 wt% of surfactant [51]. TRFQ measurements were used to determine the aggregation numbers (*N*<sub>agg</sub>). The relatively low values, *N*<sub>agg</sub> < 40 for solutions of ca. 3 wt%, were ascribed to the chloride counterions, which are less strongly bound to the cationic head groups (higher degree of ionization) than bromide, resulting in stronger electrostatic repulsion between the hydrophilic heads as well as steric hindrance due to the higher hydration of the latter. The nature of the spacer has a major influence on the aggregation number of the oligomeric surfactants. In general, the shorter the spacer, the larger the aggregation number. For the trimers, the aggregation number decreases in the series *N*<sub>agg</sub> (tb) > *N*<sub>agg</sub> (mx) > *N*<sub>agg</sub> (px), in accordance with the behavior of dimeric surfactants. In contrast to the *m*-*s*-*m* type of surfactants, the aggregation numbers of the 2C, 3C and 4B surfactants decrease with the degree of oligomerization. In this case, the decrease of *N*<sub>agg</sub> with increasing degree of oligomerization could be ascribed to the fact that further addition of long rigid spacers between the head groups reduces the overall flexibility of the structures and hence makes it difficult for the higher oligomers to pack tightly [51].

The structure of the micelles formed in solutions of the star-shaped trimeric surfactants 3D was significantly influenced by the alkyl chain length. For *m*=10, ellipsoidal micelles formed; for *m*=12, the ellipsoidal micelles transformed into threadlike micelles with increasing concentration; and for *m*=14, threadlike micelles formed already at low concentration and no transitions were observed as the concentration increased [33].

The unusual aggregation behavior of the star-shaped trimeric 3E and 3F, tetrameric 4C and hexameric 6A surfactants was explained by the dominant role of hydrophobic interactions, which enable the molecules to change their configuration [22, 34-36].

Due to the rigid spacer and the intramolecular electrostatic repulsion among the quaternary ammonium headgroups, the hydrophobic chains of 3E and 3F pack loosely, and the 4C molecule presents a stretched configuration at low concentrations [34-36].

Trimeric surfactants (3E and 3F) form vesicles just above the cac. With increasing concentration, the hydrophobic interaction becomes strong enough to pack the hydrophobic tails tightly and turn the molecular conformation into a pyramid-like shape, which results in the gradual transformation of vesicles into spherical micelles.

For tetrameric 4C, these interactions are strong enough already below the cac, so 4C forms large network-like premicellar aggregates that change into small spherical micelles at high surfactant concentration, for the same reason as for the trimeric surfactants [34-36].

The star-shaped hexameric surfactant (6A) also forms network-like premicellar aggregates well below the cac, which transform into small spherical micelles at high concentration. Its aggregation behavior is more complex, since two cacs are observed. Between the two cacs, the hydrophobic interaction becomes stronger, so that 6A may adopt a claw-like configuration. Above the second cac, the hydrophobic interaction continues to strengthen and causes the molecular configuration to convert into a pyramid-like shape, which drives the transition from large spherical aggregates to small spherical micelles, as for the 3E, 3F and 4C surfactants [22]. This research also shows that introducing more alkyl chains into the molecule results in a more complex aggregation process of oligomeric surfactants.

Yoshimura and Esumi [38, 39] investigated the size and aggregation number of the aggregates formed by anionic trimeric surfactants (3K, 3L, 3J) in solution by DLS and SLS.

The aggregation numbers of the ring-type trimeric surfactants 3J-4 and 3J-10 were 32 and 580±20, respectively, as determined by SLS [38]. The *N*<sub>agg</sub> of 3J-10 is very large, probably due to the strong cohesion arising from the interactions between the three longer hydrocarbon chains in the ring-type trimeric surfactant. The size of the micelles was determined by DLS at concentrations 4- to 6-fold the cmc of 3J. In solutions of both surfactants, smaller and larger aggregates were detected. Both kinds of aggregates increased in size with increasing alkyl chain length [38].

The large difference between the aggregation numbers of 3K (*N*<sub>agg</sub>=1304±8) and 3L (*N*<sub>agg</sub>=39±2) was explained by the strong attractive interactions between hydrocarbon chains as well as the decrease in electrostatic repulsion due to the fewer hydrophilic groups of 3K in comparison with 3L, which has a usual micelle size [39]. DLS measurements revealed that aggregates 20–30 nm in size are formed in solutions of 3K, while in solutions of 3L aggregates of sizes 2–10 and 15–40 nm coexist. In the case of 3L, the smaller ones are considered to correspond to micelles and the larger ones to large aggregates similar to those of 3K [39].

The aggregation numbers of Tyloxapol (7A) and of TX100 micelles were determined by TRFQ. The Tyloxapol micelles were found to be smaller than the TX100 micelles. This behavior is opposite to that found for ionic surfactant oligomers (2A-4A) with respect to their corresponding monomers. Cryo-TEM showed that the Tyloxapol micelles remain spheroidal up to a concentration of about 10 wt%. At 15 wt%, some regions of ordered elongated micelles were also observed, which could be the precursors of the hexagonal phase known to occur at about 35 wt% [21].

#### **3.5. Viscosity**

The peculiar aggregation behavior of oligomeric *m*-*s*-(*m*-*s*)*x*-*m* surfactants significantly influences the rheological behavior of their solutions. The change of micelle shape, in the series of surfactants with short spacers, from spherical to wormlike or threadlike micelles with increasing degree of oligomerization affects the viscosity of the surfactant solution. The phenomenon that attracts special attention in this sense is the viscoelastic behavior observed in solutions of long wormlike micelles. For example, dimeric surfactants with a short spacer, such as 12-2-12, give rise to wormlike micelles at fairly low surfactant concentrations, even in the absence of added salt [3, 9-11, 14, 15, 52].

No such changes were observed in solutions of surfactants with longer spacers, as in going from the monomeric [53] to the dimeric (12-6-12) [10] and trimeric (12-6-12-6-12) analogues [15]. This indicates that large, rodlike aggregates were not formed in the solutions of these surfactants. The same absence of the effect was observed by Laschewsky et al. [30], who concluded that, despite general theoretical predictions made for oligomeric surfactants [19, 20], the remarkable thickening power of certain oligomeric surfactants is apparently restricted to molecular structures with very short spacer groups (namely *s*=2 or 3).

The rheological behavior of star-shaped trimeric surfactants strongly depends on the alkyl chain length [33]. The shear-rate dependence of the viscosity of star-shaped 3D-10 solutions was the same as that of water. On the basis of previous findings, it was concluded that 3D-10 forms spherical or ellipsoidal micelles [33].

At lower concentrations, the viscosity of the 3D-12 solution was also the same as that of water. At higher concentrations, however, the viscosity decreased with increasing shear rate, a behavior known as shear thinning that is typical of rod- or wormlike micelles and chainlike polymers. The results indicate that a transition from spherical to rodlike micelles occurs with increasing concentration of 3D-12, which was confirmed by SANS and cryo-TEM. When the concentration was increased further, growth of wormlike micelles was observed, accompanied by the extrusion of water from the micelles. In order to elucidate this behavior, the surface charge per unit length and the end-cap energy of monomeric 1A, dimeric 12-2-12 and trimeric 3D-12 were obtained from an analysis of the volume fraction dependence of the zero-shear viscosity. The results showed that the molecular structure (linear or star-shaped) and the degree of oligomerization (*j*=1, 2 or 3) have no influence on the surface charge per unit length. However, the end-cap energy of the wormlike micelles decreased in the order 3D-12 > 12-2-12 > 1A, indicating that wormlike micelles form more easily in solutions of trimeric and dimeric surfactants, even at lower concentrations. The crucial difference in molecular structure between 3D-12 and 12-2-12 is the number of spacer chains. It seems that the spacer is responsible for the increase in end-cap energy, one of the reasons being the limitation of intramolecular motion as the number of spacer groups in the molecule increases. The shape of the molecule also affects the end-cap energy. The trimeric star-shaped 3D surfactant is rounder than the linear one, which results in a lower end-cap energy of its wormlike micelles [54].

The viscosity of 3D-14 solutions was higher than that of water and did not change with increasing shear rate, which was attributed to the presence of threadlike micelles in solution [33].
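The three rheological signatures described for the 3D series (water-like, shear thinning, and viscous but independent of shear rate) can be told apart with a simple power-law (Ostwald–de Waele) description, η = *K*·γ̇<sup>*n*−1</sup>. The Python sketch below is an illustration only: the cited work does not state that this model was used, and the consistency *K* and flow index *n* values are hypothetical, chosen merely to reproduce the qualitative trends.

```python
def power_law_viscosity(shear_rate: float, consistency: float, flow_index: float) -> float:
    """Apparent viscosity (Pa*s) from the power law eta = K * shear_rate**(n - 1)."""
    if shear_rate <= 0:
        raise ValueError("shear rate must be positive")
    return consistency * shear_rate ** (flow_index - 1.0)


def classify(flow_index: float) -> str:
    """Qualitative flow behavior implied by the flow index n."""
    if flow_index < 1.0:
        return "shear thinning"
    if flow_index > 1.0:
        return "shear thickening"
    return "Newtonian"


# Hypothetical (K, n) pairs; none of these values come from the cited studies
SAMPLES = {
    "water-like (cf. 3D-10)": (1.0e-3, 1.0),
    "shear thinning (cf. concentrated 3D-12)": (0.5, 0.4),
    "viscous, rate independent (cf. 3D-14)": (5.0e-3, 1.0),
}

for label, (k, n) in SAMPLES.items():
    etas = [power_law_viscosity(rate, k, n) for rate in (1.0, 10.0, 100.0)]
    print(label, "->", classify(n), [f"{eta:.2e}" for eta in etas])
```

A flow index of 1 gives a viscosity that is constant with shear rate whatever its magnitude, whereas a flow index below 1 gives the decrease with shear rate reported for concentrated 3D-12 solutions.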

Even these few studies reveal the complex rheological behavior of higher oligomeric surfactants. The results show that the number and length of the spacers in the surfactant molecule play a key role in aggregation and thus in rheological behavior. It seems that the interplay between the contributions of the spacer and the alkyl chains is even more subtle than for dimeric surfactants. Considering the unusual aggregation behavior of tetrameric 4C and hexameric 6A, it would be interesting to see how it is reflected in their rheological behavior.

#### **4. Applications**

Synthetic surfactants are nowadays present in many everyday products and are used in many industrial processes. Although dimeric and higher oligomeric surfactants have shown better properties than conventional ones, the difficulty of synthesizing them in sufficient quantities is hindering their commercial application. In addition, before a new surfactant can be used in a given application, its toxicity and harmfulness must be assessed, which requires both time and money. Considering their improved physico-chemical properties, it would be worthwhile to perform such tests. The possibility of using higher oligomeric surfactants in several different applications has been investigated.

monomeric and dimeric surfactants, the emulsification ability and the strength of interfacial

Higher Oligomeric Surfactants — From Fundamentals to Applications

http://dx.doi.org/10.5772/57655

163

Abdul-Raouf et al. have studied factors affecting stability of oil-in-water emulsions prepared by shearing together known amounts of Land Belayim crude oil and aqueous solutions of nonionic 3N surfactants. It was observed that equilibrium interfacial tension of nonionic 3N surfactants decreases with the length of ethylene oxide chain for the same alkyl chain length. After 24 h coalescence process has started only in a few emulsions, indicating that the emul‐ sions were mostly stable. The stability of the emulsions depended on oil percentage and oil/ water ratio. The emulsions with higher oil percentage were found to be more stable. The effectiveness of the emulsifiers decreases with increasing alkyl chain length for 3N surfactants. The emulsions stability increased with increasing surfactants concentration from 300 to 400 ppm, but decreased by further increasing the surfactants concentration to 500 ppm. These findings led to the conclusion that there is an optimum concentration at which the emulsion droplets are fully encapsulated, preventing agglomeration and coalescence which occur at lower concentrations. At higher concentrations surfactant molecules interact with each other and cause the disorder of the arrangement at the interface which facilitates the coalescence [42].

Surfactants that can self-assemble in smooth bilayers are promising water-based lubricants. However, the cohesion of adsorbed layers is not always satisfactory. Lagleize et al. [55] have investigated the possibility of combining surface-adsorbing surfactants and coadsorbing polymer in order to obtain dense and cohesive lubricant films by self-assembly on mica surfaces. The three studied quarternary ammonium surfactants, monomeric cetyl triethylam‐ monium chloride, dimeric 12-3-12 and trimeric 12-3-12-3-12 form flat bilayers at the negatively charged mica surface at the concentrations above cmc [3]. It was shown that the degree by which coadsorption of the anionic–neutral poly(acrylic acid)–polyacrylamide diblock copoly‐ mer reinforces the adsorbed layers, as well as the nature and the characteristic times of shearinduced dynamic transitions between states of low-and high-friction forces depend on the degree of oligomerization. Behaviour of systems containing dimeric and trimeric surfactants under shear and compression are qualitatively simillar, while systems with monomeric surfactant have shown different response indicating lower cohesion of the layers [55].

Many practical applications of the surfactants are based on their ability to adsorb at different surfaces. Understanding the interactions between surfactants and mineral surfaces are of special interest for agriculture and oil recovery and consequently for the environment.

Esumi et al. [17] investigated adsorption of 2A and 3A (*m*=12, *s*=2) surfactants, on silica as well as adsolubilization of 2-naphtol. Adsorption of dimeric and trimeric surfactant differs from the monomeric as revealed by adsorption isotherms. Density of adsorbed surfactants decreases with oligomerization degree. The ratio of maximum amount of 2-naphtol adsolubilized to the adsorbed amount of surfactant on silica increases with the oligomerization degree. From a

films are greatly enhanced [22].

**4.4. Lubricants**

**4.5. Adsorption at mineral surfaces**

#### **4.1. Solubilization**

Solubilization of poorly water-soluble or insoluble compounds is among most frequent surfactant application.

Laschewsky at el. [30] investigated solubilization capacity of a 2C, 3C and 4B for *p*-xylene and 2,3-dimethylbut-2-ene. The results indicated that the solubilization depends on both the chemical nature of the spacer and the couple surfactant-solubilizate used. Degree of oligome‐ rization, within a given series of oligomers, didn't have significant influence on the solubili‐ zation capacity.

#### **4.2. Foaming**

Foams are encountered in many important technological areas.

It was shown that the trimeric surfactant 3B has almost the same foaming ability and the foam stability as corresponding bisquaternary ammonium dichlorides which indicates that balance of hydrophobicity and hydrophilicity strongly affects the foaming properties [28]. Simple test of bubbling the air through aqueous solution of a fixed concentration until a given height of foam was produced was used for determining foaming ability of 2C, 3C and 4B [30]. For both trimeric and tetrameric surfactants it was observed that the short spacer favor foam formation and stability. Trimeric surfactants stabilize foam significantly more than corresponding dimeric, but the difference between corresponding trimers and tetramers is marginal. This effect cannot be explained by the difference in viscosity.

#### **4.3. Emulsification**

Oil/water emulsions are fine dispersions encountered in many household and industrial products. The role of surfactants in emulsions is to stabilize them by adsorbing at the water/oil interface, which prevents separation into oil and water phases.

The ability of the hexameric quaternary ammonium salt 6A to emulsify heptane, dodecane, toluene and xylene was investigated. As described in section 3.4, 6A forms network-like premicellar aggregates at very low concentrations, which compelled the authors to study its emulsion-forming efficiency at concentrations far below the first cac. Oil/water emulsions formed quickly after vigorous shaking. Surfactant 6A can emulsify heptane and dodecane, with the emulsions remaining stable at the level of 60–70 %, but it cannot emulsify toluene and xylene. The proposed reason is the greater compatibility of 6A with linear fatty acids due to its long hydrophobic alkyl chains. Owing to the hexameric structure, the emulsification ability and the strength of the interfacial films are greatly enhanced in comparison with monomeric and dimeric surfactants [22].

Abdul-Raouf et al. studied the factors affecting the stability of oil-in-water emulsions prepared by shearing together known amounts of Land Belayim crude oil and aqueous solutions of nonionic 3N surfactants. The equilibrium interfacial tension of the 3N surfactants decreased with increasing ethylene oxide chain length at the same alkyl chain length. After 24 h coalescence had started in only a few emulsions, indicating that the emulsions were mostly stable. Emulsion stability depended on the oil percentage and the oil/water ratio; emulsions with a higher oil percentage were more stable. The effectiveness of the emulsifiers decreased with increasing alkyl chain length. Emulsion stability increased when the surfactant concentration was raised from 300 to 400 ppm, but decreased when it was raised further to 500 ppm. These findings led to the conclusion that there is an optimum concentration at which the emulsion droplets are fully encapsulated, which prevents the agglomeration and coalescence that occur at lower concentrations. At higher concentrations surfactant molecules interact with each other and disorder the arrangement at the interface, which facilitates coalescence [42].

#### **4.4. Lubricants**

Surfactants that can self-assemble into smooth bilayers are promising water-based lubricants. However, the cohesion of the adsorbed layers is not always satisfactory. Lagleize et al. [55] investigated the possibility of combining surface-adsorbing surfactants and a coadsorbing polymer in order to obtain dense and cohesive lubricant films by self-assembly on mica surfaces. The three quaternary ammonium surfactants studied, monomeric cetyl triethylammonium chloride, dimeric 12-3-12 and trimeric 12-3-12-3-12, form flat bilayers on the negatively charged mica surface at concentrations above the cmc [3]. The degree to which coadsorption of the anionic–neutral poly(acrylic acid)–polyacrylamide diblock copolymer reinforces the adsorbed layers, as well as the nature and the characteristic times of the shear-induced dynamic transitions between states of low and high friction forces, depend on the degree of oligomerization. The behaviour of systems containing dimeric and trimeric surfactants under shear and compression is qualitatively similar, while systems with the monomeric surfactant showed a different response, indicating lower cohesion of the layers [55].

#### **4.5. Adsorption at mineral surfaces**

Many practical applications of surfactants are based on their ability to adsorb at different surfaces. Understanding the interactions between surfactants and mineral surfaces is of special interest for agriculture and oil recovery, and consequently for the environment.

Esumi et al. [17] investigated the adsorption of the 2A and 3A (*m*=12, *s*=2) surfactants on silica, as well as the adsolubilization of 2-naphthol. Adsorption isotherms revealed that the adsorption of the dimeric and trimeric surfactants differs from that of the monomeric one. The density of adsorbed surfactant decreases with the oligomerization degree. The ratio of the maximum amount of 2-naphthol adsolubilized to the amount of surfactant adsorbed on silica increases with the oligomerization degree. From a two-step adsorption-adsolubilization process it was concluded that the oligomers are adsorbed on silica much more strongly than the monomeric surfactant, keeping 2-naphthol in their adsorbed layers.

The same group investigated the competitive adsorption of the pesticide paraquat and the 2A and 3A (*m*=12, *s*=2) surfactants on clay [56]. The results indicate competitive adsorption between paraquat and the surfactants. The oligomerization degree did not have a significant influence on the replacement of paraquat.

In et al. [15] investigated the adsorption of the 12-3-12-3-12 and 12-6-12-6-12 trimers and the 12-3-12-4-12-3-12 tetramer on silica and compared it with the behavior of the corresponding dimers. The values of the surface excess concentration per mole of adsorbed chain indicate that 12-3-12-3-12 and 12-3-12-4-12-3-12 form a bilayer on silica, while the value for 12-6-12-6-12 indicates that it forms cylindrical micelles, similar to 12-6-12.
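To make the quantity "surface excess per mole of adsorbed chain" more concrete, the following minimal sketch converts a surface excess into the projected area available to one alkyl chain. It uses only the standard relation between surface excess and area per adsorbed species; the function name, the `leaflets` parameter and the numerical input are illustrative assumptions, not values or procedures taken from In et al. [15].

```python
AVOGADRO = 6.02214076e23  # chains per mole (exact SI value)


def area_per_chain_nm2(surface_excess_mol_per_m2, leaflets=1):
    """Projected area per alkyl chain (nm^2) within one leaflet.

    surface_excess_mol_per_m2: moles of alkyl chains adsorbed per m^2.
    leaflets: 1 for a monolayer; 2 if the chains are ASSUMED to be split
              evenly between the two leaflets of a bilayer.
    """
    if surface_excess_mol_per_m2 <= 0:
        raise ValueError("surface excess must be positive")
    chains_per_m2_per_leaflet = surface_excess_mol_per_m2 * AVOGADRO / leaflets
    return 1e18 / chains_per_m2_per_leaflet  # 1 m^2 = 1e18 nm^2


# Hypothetical surface excess, chosen only to illustrate the arithmetic.
example_gamma = 5.0e-6  # mol of chains per m^2
monolayer_area = area_per_chain_nm2(example_gamma)
bilayer_area = area_per_chain_nm2(example_gamma, leaflets=2)
print(f"monolayer: {monolayer_area:.3f} nm^2 per chain")
print(f"bilayer:   {bilayer_area:.3f} nm^2 per chain in each leaflet")
```

Comparing such an area with the cross-section of a closely packed alkyl chain is one way to judge whether a measured surface excess is compatible with a dense bilayer or with a looser arrangement; the criteria actually used in the cited study are not reproduced here.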

The efficiency of the oil recovery process is affected by the wettability of oil reservoir rocks. The trimeric 12-2-12-2-12 surfactant can alter the wettability of both water-wet and oil-wet mica surfaces more efficiently than monomeric or dimeric cationic surfactants. The change in wettability is a consequence of surfactant adsorption as a monolayer on the mica surface. The orientation of the surfactant molecules in the monolayer depends on the type of surface (oil-wet or water-wet) [57].

Atomic force microscopy (AFM), X-ray photoelectron spectroscopy (XPS) and PC model calculations showed that the trimeric 3F forms highly ordered bilayers on the mica surface [35]. Such an ordered structure is induced by the match between the three charged surfactant head groups and the negatively charged sites on the mica surface. In addition, the formation of the bilayer is promoted by intermolecular bonding and hydrophobic interactions. It was concluded that the structural features of an oligomeric surfactant can greatly affect its manner of adsorption, which can help in the design of self-assembling molecules for the fabrication of surface patterns.

#### **4.6. Synthesis of advanced materials**

So far, higher oligomeric surfactants have been used in the synthesis of two classes of materials: mesoporous silica materials and ZnO quantum dots.

Preparation of materials with pores of controlled size, shape and connectivity is of great importance for practical applications in which the shape of molecules must be recognized. Examples are catalysis, molecular sieves, selective adsorption, sensing, etc. Different applications place different demands on the material properties. Therefore, new routes of synthesis or modifications of existing ones are constantly sought. Surfactants, in general, are frequently used as templates in the synthesis of inorganic materials. Among the different meso- and microporous materials, silica-based materials hold a special place.

Although different gemini surfactants of the bis-quaternary ammonium type have been used in the preparation of various cubic, hexagonal and lamellar mesoporous silica materials [3], among higher oligomers only 14-2-14-2-14 [58] and 10-2-10-2-10 [59] have been used as structure-directing agents. With the 14-2-14-2-14 surfactant it was possible to synthesize high-quality, ordered two-dimensional hexagonal mesoporous silica under mild conditions. In the presence of the 10-2-10-2-10 surfactant, highly ordered supermicroporous silica with a two-dimensional hexagonal pore structure and pore sizes from 1.92 to 2.16 nm was obtained.

Quantum dots are inorganic nanoparticles with photoluminescent properties. They can be prepared either in water or in organic solvents. Synthesis in organic solvents offers better shape control and higher crystallinity of the product, since organic ligands (most often alkyl amines) are used to control crystal growth. As a result, particles coated with a hydrophobic layer are obtained. If the nanoparticles are to be used in water, this layer should be removed without affecting their optical properties. A strategy based on the encapsulation of nanoparticles in aggregates of amphiphilic molecules is proving successful in this respect.

Dazzazi et al. [60] investigated the phase transfer of highly monodisperse ZnO nanocrystals using monomeric (DTAB, 1A), dimeric (12-3-12 2Br⁻), trimeric (12-3-12-3-12 3Br⁻) and polymeric alkyl ammonium surfactants. Transfer yields of 60 % could be obtained with the oligomers and the polymer, but no measurable transfer was observed with the monomeric surfactant. The trimer and the polymer were more efficient than the dimer. The results were explained by more quantitative molecular aggregation of the surfactants at the nanocrystal surface with increasing degree of oligomerization. In addition, the dynamics of molecular exchange between the bulk and the double-layer coating slows down with increasing degree of oligomerization. The nanocrystals obtained exhibited strong photoluminescence in water, as well as long-term chemical and photochemical stability.

#### **4.7. Antistatic properties**

It was shown that the antielectrostatic effect of the oligomeric quaternary ammonium surfactant 2B, derived from epichlorohydrin, strongly depends on the counterion, i.e. bromide or chloride. However, both chlorides and bromides showed very good antistatic properties, similar to those of the commercially available antistatic agent Catanac. Increasing the central chain length had no observed influence on the antielectrostatic properties [29].

#### **4.8. Biomedical applications**

Surfactants are frequently used in pharmacy for the preparation of drug carriers or systems for targeted drug delivery. In addition, many drug molecules are amphiphilic and therefore surface active.

Motivated by the fact that some dimeric surfactants have antimicrobial activity, the neutral and cationic trimeric surfactants 3G and 3H were tested against fungi and Gram-positive and Gram-negative bacteria. Both types of trimeric surfactants were most efficient against Gram-positive bacteria. Cationic trimers were more efficient than neutral ones. Compounds with 12 C atoms in the alkyl chain were found to be the most active, while those with 8 and 18 C atoms were almost inactive [37].

A proposed approach to the prevention or therapy of Alzheimer's disease is to decrease or eliminate the neuritic plaques composed of fibrillar *β*-amyloid (A*β*). The strong tendency of 4C to self-assemble even below the cmc prompted the authors to study the disassembly of amyloid fibrils in its presence. Both premicellar and micellar 4C aggregates can effectively disassemble mature A*β*(1-40) fibrils in aqueous solution. Unlike 4C, 12-6-12 loses its efficiency with decreasing concentration, indicating that the significant self-aggregation ability of 4C below the cmc could be the key factor in fibril disassembly. The authors proposed two key features of the fibril disassembly. The first is the binding of the positively charged 4C surfactant to the negatively charged fibrils through electrostatic interactions. The second is the self-assembly of the bound 4C molecules, which leads to fibril disaggregation and the formation of mixed surfactant/A*β*(1-40) aggregates [61].

#### **5. Conclusion**

Current investigations in surfactant science are driven by the requirement to design surfactants that possess enhanced physico-chemical properties, by new uses of surfactants in complex systems and by specific applications in modern technologies. Hence, investigations of the structure–property relationship in surfactant systems are very important for designing new surfactants and their supramolecules for specific applications [62]. Although, in theory, higher oligomeric surfactants are a textbook example of how properties can be tailored by changing the structural elements of the surfactant molecule, in practice investigations of higher oligomeric surfactants are still driven more by the feasibility of the synthesis than by the intended application. An additional reason is that the process of their aggregation is still largely unpredictable.

So far, mostly cationic oligomeric quaternary ammonium surfactants have been synthesized and investigated, owing to the relative ease of their synthesis, their low Krafft temperatures and their interesting rheological behavior. Oligomeric quaternary ammonium surfactants with different molecular architectures, namely linear [15-17, 26, 27], dissymmetric [62-65] and star-shaped molecules [22, 33, 34, 36], have been synthesized. Anionic and nonionic oligomeric surfactants have been studied to a much lesser extent.

Common conclusions that can be drawn for these different classes of higher oligomeric surfactants are:

- **•** most frequently the dodecyl chain was chosen as the hydrophobic building block because it is long enough to confer good surfactant properties on amphiphiles, while still short enough that good water solubility can be expected for higher oligomeric or polymeric derivatives. Also, in most cases, within a given series the surfactants with dodecyl chains have the best properties,
- **•** an increase in the number of alkyl chains, i.e. the degree of oligomerization, within an oligomeric surfactant series:
	- **◦** enhances the surface activity,
	- **◦** shifts the critical micelle concentration (cmc) to lower concentrations, although the changes become less significant as the degree of oligomerization increases above 2,
- **•** a linear relationship between the total carbon number in the hydrophobic group and the cmc on a semilogarithmic scale has been shown for different series of oligomeric quaternary ammonium surfactants, as for monomers and dimers,
- **•** quaternary ammonium oligomeric surfactants with short spacers (*m*=12, *s*=2 or 3) and star-shaped topology exhibit peculiar aggregation behavior, which significantly influences the rheological behavior of their solutions. Such behavior was not observed for the oligomeric surfactants with longer spacers,
- **•** the length and nature of the spacer are the most dominant factors in determining the overall surfactant behavior within a series of surfactants with the same alkyl chain length and hydrophilic group. However, the influence of the nature and length of the spacer on adsorption and micellization is not the same for different surfactant series, i.e. it depends on the entire molecular architecture.

Although the solid-state properties of surfactants in general attract considerable attention due to their polymorphism and mesomorphism, to the best of our knowledge only one study of the solid-state properties of higher oligomeric surfactants has been reported [67].

Despite the obstacles, the results of a number of studies that have shown the potential of oligomeric surfactants for different applications give additional motivation for future research.
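The linear semilogarithmic relationship between the cmc and the total carbon number of the hydrophobic group, noted among the conclusions, has the form log cmc = *A* − *B*·*n*, commonly known as the Klevens relation. The sketch below fits *A* and *B* by ordinary least squares. The function names are hypothetical and the data points are synthetic values generated from an assumed line purely for illustration; they are not measured cmc values from any of the cited studies.

```python
import math


def fit_semilog_cmc(carbon_numbers, cmcs):
    """Least-squares fit of log10(cmc) = a - b * n; returns (a, b)."""
    xs = list(carbon_numbers)
    ys = [math.log10(c) for c in cmcs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope  # b is reported as a positive decrement


def predict_cmc(a, b, n):
    """cmc predicted for total carbon number n, in the units of the fitted data."""
    return 10 ** (a - b * n)


# Synthetic illustration only: 24, 36 and 48 carbons stand for a dimer,
# trimer and tetramer with dodecyl chains; the cmc values are generated
# from an ASSUMED line with a = 1.0 and b = 0.1, not taken from experiment.
carbon_numbers = [24, 36, 48]
cmcs = [10 ** (1.0 - 0.1 * n) for n in carbon_numbers]

a, b = fit_semilog_cmc(carbon_numbers, cmcs)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real data, a plot of log cmc against *n* for one surfactant series would show how closely that series follows a straight line; the coefficients are expected to differ between series.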

#### **Acknowledgements**

To our dear mentor, Dr. sc. Nada Filipović-Vinceković, for introducing us to the wonderful world of surfactants.

#### **Author details**

D. Jurašin and M. Dutour Sikirić

Laboratory for Synthesis and Processes of Self-Assembling of Organic Molecules, Division of Physical Chemistry, Ruđer Bošković Institute, Zagreb, Croatia

#### **References**

[1] Meyers D. Surfactants Science and Technology. 3rd Ed. Hoboken, New Jersey: John Wiley & Sons Inc.; 2006.

[2] Rosen MJ. Surfactants and Interfacial Phenomena. 3rd Ed. Hoboken, New Jersey: Wiley Interscience; 2004.

[3] Zana R. Dimeric and Oligomeric Surfactants. Behavior at Interfaces and in Aqueous Solution: a Review. Advances in Colloid and Interface Science 2002;97 205–253.

[4] Bunton CA, Robinson L, Schaak J, Stam MF. Catalysis of Nucleophilic Substitutions by Micelles of Dicationic Detergents. Journal of Organic Chemistry 1971;36 2346–2350.

[5] Devinsky F, Masarova L, Lacko I. Surface Activity and Micelle Formation of Some New Bisquaternary Ammonium Salts. Journal of Colloid and Interface Science 1985;105 235–239.

[6] Zhu Y-P, Masuyama A, Okahara M. Preparation and Surface Active Properties of Amphipathic Compounds With Two Sulfate Groups and Two Lipophilic Alkyl Chains. Journal of American Oil Chemist Society 1990;67 459–463.

[7] Alami E, Beinert G, Marie P, Zana R. Alkanediyl-α,ω-bis(dimethylalkylammonium bromide) surfactants. 3. Behavior at the air-water interface. Langmuir 1993;9 1465–1467.

[8] Zana R, Levy H, Papoutsi D, Beinert G. Micellization of Two Triquaternary Ammonium Surfactants in Aqueous Solution. Langmuir 1995;11 3694–3698.

[9] Danino D, Talmon Y, Levy H, Beinert G, Zana R. Branched Threadlike Micelles in an Aqueous Solution of a Trimeric Surfactant. Science 1995;269 1420–1421.

[10] Danino D, Talmon Y, Zana R. Alkanediyl-α,ω-bis(dimethylalkylammonium bromide) Surfactants (Dimeric Surfactants). 5. Aggregation and Microstructure in Aqueous Solutions. Langmuir 1995;11 1448–1456.

[11] Zana R. Dimeric (Gemini) Surfactants: Effect of the Spacer Group on the Association Behavior in Aqueous Solution. Journal of Colloid and Interface Science 2002;248 203–220.

[12] Zana R. Critical Micellization Concentration of Surfactants in Aqueous Solution and Free Energy of Micellization. Langmuir 1996;12 1208–1211.

[13] Zana R, In M, Levy H, Duportail G. Alkanediyl-α,ω-bis(dimethylalkylammonium bromide). 7. Fluorescence Probing Studies of Micelle Micropolarity and Microviscosity. Langmuir 1997;13 5552–5557.

[14] In M, Warr GG, Zana R. Dynamics of Branched Threadlike Micelles. Physical Review Letters 1998;83 2278–2281.

[15] In M, Bec V, Aguerre-Chariol O, Zana R. Quaternary Ammonium Bromide Surfactant Oligomers in Aqueous Solution: Self-Association and Microstructure. Langmuir 2000;16 141–148.

[16] Esumi K, Taguma K, Koide Y. Aqueous Properties of Multichain Quaternary Cationic Surfactants. Langmuir 1996;12 4039–4041.

[17] Esumi K, Goino M, Koide Y. Adsorption and Adsolubilization by Monomeric, Dimeric or Trimeric Quaternary Ammonium Surfactant at Silica/water Interface. Journal of Colloid and Interface Science 1996;183 539–545.

[18] Hait SK, Moulik SP. Gemini surfactants: A distinct class of self-assembling molecules. Current Science 2002;82 1101–1111.

[19] Maiti PK, Lansac Y, Glaser MA, Clark NA, Rouault Y. Self-Assembly in Surfactant Oligomers: A Coarse-Grained Description through Molecular Dynamics Simulations. Langmuir 2002;18 1908–1918.

[20] Wu H, Xu J, He X, Zhao J, Wen H. Mesoscopic Simulation of Self Assembly in Surfactant Oligomers by Dissipative Particle Dynamics. Colloids and Surfaces A: Physicochemical and Engineering Aspects 2006;290 239–246.

[21] Regev O, Zana R. Aggregation Behavior of Tyloxapol, a Nonionic Surfactant Oligomer in Aqueous Solution. Journal of Colloid and Interface Science 1999;210 8–17.

[22] Fan Y, Hou Y, Xiang J, Yu D, Wu C, Tian M, Han Y, Wang Y. Synthesis and Aggregation Behavior of a Hexameric Quaternary Ammonium Surfactant. Langmuir 2011;27 10570–10579.

[23] White KA, Warr GG. Linkage by elimination/addition: A simple synthesis for a family of oligomeric alkylpyridinium surfactants. Journal of Colloid and Interface Science 2009;337 304–306.

[24] Su X, Feng Y, Wang B, Lu Z, Wei L. Oligomeric cationic surfactants prepared from surfmers via ATRP: synthesis and surface activities. Colloid and Polymer Science 2011;289 101–110.

[25] Su X, Chu Z, Shuai Y, Guo Z, Feng Y. Oligomeric alkylpyridinium surfactants prepared via ATRP. e-Polymers 2011; no. 046.

[26] Yoshimura T, Yoshida H, Ohno A, Esumi K. Physicochemical Properties of Quaternary Ammonium Bromide-type Trimeric Surfactants. Journal of Colloid and Interface Science 2003;267 167–172.

[27] Jurašin D, Habuš I, Filipović-Vinceković N. Role of the Alkyl Chain Number and Head Groups Location on Surfactants self-assembly in Aqueous Solutions. Colloids and Surfaces A: Physicochemical and Engineering Aspects 2010;368 119–128.

[28] Kim T-S, Kida T, Nakatsuji Y, Ikeda I. Preparation and Properties of Multiple Ammonium Salts Quaternized by Epichlorohydrin. Langmuir 1996;12 6304–6308.

[29] Chlebicki J, Węgrzyńska J, Wilk KA. Surface-active, Micellar, and Antielectrostatic Properties of Bis-ammonium salts. Journal of Colloid and Interface Science 2008;323 372–378.


[30] Laschewsky A, Wattebled L, Arotcaréna M, Habib−Jiwan JL, Rakotoaly RH. Synthe‐ sis and Properties of Cationic Oligomeric Surfactants. Langmuir 2005;21 7170−7179.

[44] Yoshimura T, Nagata Y, Esumi K. Interactions of quaternary ammonium salt-type gemini surfactants with sodium poly(styrene sulfonate). Journal of Colloid and Inter‐

Higher Oligomeric Surfactants — From Fundamentals to Applications

http://dx.doi.org/10.5772/57655

171

[45] Ottewill RH. Introduction. In: Tadros ThF. (ed) Surfactants. London; Academic Press,

[46] Pinazo, A, Diz M., Solans C, Pes MA, Erra P, Infante, MR. Synthesis and Properties of Cationic Surfactants Containing a Disulfide Bond Journal of American Oil Chemist

[47] Devínsky F, Lacko I, Bitterová F, Tomečková L. Relationship between Structure, Sur‐ face Activity and Micelle Formation of Some new Bisquaternary Isosteres of 1,5-Pen‐ tanediammonium Dibromides. Journal of Colloid and Interface Science 1986;114

[48] Li ZX, Dong CC, Thomas RK. Neutron Reflectivity Studies of the Surface Excess of Gemini Surfactants at the Air−Water Interface Langmuir 1999;15 4392-4396.

[49] Malmsten M. Surfactants and Polymers in Drug Delivery. New York; Marcel Dekker,

[50] Zana R, Benrraou Rueff, Alkanediyl-α,ω-bis(dimethylalkylammonium bromide) Sur‐ factants. 1. Effect of the Spacer Chain Length on the Critical Micelle Concentration

[51] Wattebled L., Laschewsky A., Moussa A., Habib−Jiwan J.−L. Aggregation Numbers of Cationic Oligomeric Surfactants: A Time-Resolved Fluorescence Quenching Study.

[52] Bernheim-Groswasser A, Zana R, Talmon Y. Sphere-to-cylinder Transition in Aque‐ ous Micellar Solution of a Dimeric (Gemini) Surfactant. Journal of Physical Chemis‐

[53] Lianos P, Lang J, Zana R. Fluorescence Probe Study of the Effect of Concentration on the State of Aggregation of Dodecylalkyldimethylammonium Bromides and Dialkyl‐ dimethylammonium Chlorides in Aqueous Solution *Journal of Colloid and Interface*

[54] Kusano T, Iwase H, Yoshimura T, Shibayama M. Structural and Rheological Studies on Growth of Salt-Free Wormlike Micelles Formed by Star-Type Trimeric Surfac‐

[55] Lagleize J.-M, Richetti P, Drummond C. Effect of Surfactant Oligomerization Degree on Lubricant Properties of Mixed Surfactant-Diblock Copolymer Films. Tribology

and Micelle Ionization Degree Langmuir 1991;7 1072-1075.

face Science 2004;275 618-622.

Inc.: 1984. 1-18.

314-322.

Inc.: 2002.

Langmuir 2006;22 2551−2557.

try B 2000;104 4005–4009.

*Science* 1983;91 276-279.

Letters 2010;39 31–38.

tants. Langmuir 2012; 28 16798-16806.

Society 1993;70 37-42.


[44] Yoshimura T, Nagata Y, Esumi K. Interactions of quaternary ammonium salt-type gemini surfactants with sodium poly(styrene sulfonate). Journal of Colloid and Inter‐ face Science 2004;275 618-622.

[30] Laschewsky A, Wattebled L, Arotcaréna M, Habib−Jiwan JL, Rakotoaly RH. Synthe‐ sis and Properties of Cationic Oligomeric Surfactants. Langmuir 2005;21 7170−7179.

[31] Laschewsky A, Lunkenheimer K, Rakotoaly RH, Wattebled L. Spacer Effects in Di‐ meric Cationic Surfactants. Colloid and Polymer Science 2005;283 469-479.

[32] Zhang J, Zheng Y, Yu P, He L, Wang H, Wang R. Synthesis, Characterization and Surface-Activity of a Polyoxyethylene Ether Trimeric Quaternary Ammonium Sur‐

[33] Yoshimura T, Kusano T, Iwase H, Shibayama M, Ogawa T, Kurata H. Star-Shaped Trimeric Quaternary Ammonium Bromide Surfactants:Adsorption and Aggregation

[34] Wu C, Hou Y, Deng M, Huang X, Yu D,Xiang Y, Liu Y, Li Z, Wang Y. Molecular Conformation-Controlled Vesicle/Micelle Transition of Cationic Trimeric Surfactants

[35] Hou Y, Cao M, Deng M, Wang Y. Highly-Ordered Selective Self-Assembly of a Tri‐ meric Cationic Surfactant on a Mica Surface. Langmuir 2008;24 10572-10574.

[36] Hou Y, Han Y, Deng M, Xiang J, Wang Y. Aggregation Behavior of a Tetrameric Cati‐

[37] Murguia MC, Cristaldi MD, Porto A, Di Conza J, Grau RJ. Synthesis, Surface-Active Properties, and Antimicrobial Activities of New Neutral and Cationic Trimeric. Sur‐

[38] Yoshimura T, Esumi K. Physicochemical Properties of Ring-Type Trimeric Surfac‐

[39] Yoshimura T, Esumi K. Physicochemical properties of anionic triple-chain surfac‐ tants in alkaline solutions. Journal of Colloid and Interface Science 2004;276 450–455.

[40] Murguia MC, Cabrera MI, Guastavino JF, Grau RJ. New oligomeric surfactants with multiple-ring spacers: Synthesis and tensioactive properties. Colloids and Surfaces A:

[41] Yang F, Li G, Qi J, Zhang SM, Liu R. Synthesis and surface activity properties of al‐ kylphenol polyoxyethylene nonionic trimeric surfactants. Applied Surface Science

[42] El-Sayed Abdul-Raouf M, Mahmoud Abdul-Raheim AR, El-Sayed Maysour N, Mo‐ hamed H. Synthesis, Surface-Active Properties, and Emulsification Efficiency of Tri‐ meric-Type Nonionic Surfactants Derived From Tris(2-aminoethyl)amine. Journal of

[43] Eastoe J. Surfactant Aggregation and Adsorption at Interfaces. In: Cosgrove T. (ed.) Colloid Science Principles, Methods and Applications, 2nd Edition. Chichester, UK:

factant. Journal of Surfactants and Detergents 2010;13 155–158.

onic Surfactant in Aqueous Solution. Langmuir 2009;26 28−33.

factants. Journal of Surfactants and Detergents 2008;11 41–48.

tants from Cyanuric Chloride. Langmuir 2003;19 3535-3538.

Physicochemical and Engineering Aspects 2005;262 1–7.

Surfactants and Detergents 2011;14 185–193.

John Wiley & Sons; 2010. 61-89.

2010;257 312–318.

Properties. Langmuir 2012;28 9322−9331.

170 Oligomerization of Chemical and Biological Compounds

in Aqueous Solution. Langmuir 2010;26 7922–7927.


[56] Esumi K, Takeda Y, Koide Y. Competitive adsorption of cationic surfactant and pesti‐ cide on laponite, Colloids and Surfaces A: Physicochemical and Engineering Aspects 1998;135 59-62.

**Chapter 6**



## **Oligomerization of Nucleic Acids and Peptides under the Primitive Earth Conditions**

Kunio Kawamura


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58222

#### **1. Introduction**

#### **1.1. Why are nucleic acids and peptides so important for the emergence of life-like systems?**

How life originated on the primitive Earth remains a frontier of science. Today, primitive life-like systems are mostly considered to have emerged through chemical evolution on the Earth, although some scientists are evaluating the panspermia hypothesis [1-3]. Energy sources external and internal to the Earth, such as cosmic rays, ultraviolet radiation, meteorite impacts, volcanic eruptions, and submarine hydrothermal vent systems, produced organic molecules from inorganic materials such as primitive atmospheric gases, minerals, and materials dissolved in the ocean. These organic molecules polymerized and eventually gave rise to several biological functions. Complex mixtures of organic molecules would then have evolved into primitive life-like systems. The Miller-Urey experiment in 1953 evaluated, for the first time, a scenario leading from simple molecules to organic molecules in a simulated primitive atmosphere-ocean system [4]. After the Miller-Urey experiment, the origin of life became a scientific subject, and a number of simulation experiments were carried out to elucidate how life-like systems originated on this planet [5-7].

Knowledge of the primitive environments of the Earth and the surrounding universe, in relation to chemical evolution, has accumulated alongside the simulation experiments. For instance, the environment assumed at the time of the Miller-Urey experiment is no longer accepted. This does not devalue the Miller-Urey experiment; rather, knowledge of the primitive Earth has accumulated continuously and has improved the reliability of simulation experiments. The primitive Earth environments are difficult to estimate because the evidence from the oldest rocks and fossil organisms can be traced back only to about 3.9 billion years ago, which is close to the oldest reported evidence of life-like systems [8, 9], although the best current evidence is closer to 3.5 billion years ago [10].

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

DNA, RNA, and proteins are recognized as the most important biological materials and play central roles in the present life-like systems. These molecules possess a number of functions, such as genetic information storage, transcription and translation of genetic information, and enzymatic activity, which are based on the unique three-dimensional structures of oligo- and polynucleotides and proteins. A certain length of polymerized nucleic acid and peptide chains is essential for displaying systematized biological functions, although small molecules also play important roles in living organisms. Thus, the formation and accumulation of nucleic acid and peptide polymers of a certain length are essential processes toward the emergence of life-like systems.

Metabolism, amplification (replication), and evolution are generally considered the essential requisites for regarding a life-like system as alive [11]. These requisites are strongly related to the chain of machinery that maintains the genetic coding and translation system in a cell consisting of DNA, RNA, and proteins, as shown in Figure 1. Until the discovery of ribozymes, it was generally considered that the phenotype molecules of cell-type organisms are proteins while the genotype molecules are DNA. The function of DNA is the storage of genetic information. A DNA sequence is copied to mRNA and translated into the amino acid sequence of a polypeptide on the ribosome using tRNA (Figure 1, left). The peptide becomes a protein after final processing and folding into a three-dimensional structure. In modern organisms, DNA is regarded as the blueprint and protein as the building block created on the basis of the blueprint. The information flow from DNA to protein is universal in all present organisms [12]. On the other hand, one of the important functions of proteins is enzymatic. Protein enzymes are considered to control the formation of other biologically important molecules, such as sugars, lipids, vitamins, and coenzymes; these functional molecules are constructed indirectly through the catalytic actions of protein enzymes. In addition, protein enzymes maintain the reactions involved in DNA replication, RNA formation, and protein metabolism itself.
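The DNA to mRNA to polypeptide flow described above can be illustrated with a toy sketch. The example sequence and the function names are hypothetical and chosen only for illustration; the codon table is a small excerpt of the standard genetic code, and the cellular machinery (polymerases, tRNA, ribosome) is reduced to plain string operations.

```python
# Toy illustration of the DNA -> mRNA -> peptide information flow.
# CODON_TABLE is a small excerpt of the standard genetic code, not the full table.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "AAA": "Lys", "GCU": "Ala", "UAA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """Copy a DNA coding strand (5'->3') into mRNA by replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read codons from the first AUG until a stop codon or an unlisted codon."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3])
        if residue is None or residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Hypothetical example sequence (assumption, not from the chapter).
dna = "ATGTTTGGCAAAGCTTAA"
mrna = transcribe(dna)
peptide = translate(mrna)
print(mrna)
print("-".join(peptide))
```

The sketch only shows the direction of the information flow: the sequence is stored in DNA, copied to RNA, and read out three bases at a time into amino acids.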

**Figure 1.** Information flow in the present organisms (left) and in an RNA-based life-like system (right)

A schematic pathway for the chemical evolution of RNA and protein-like molecules is illustrated in Figure 2. Judging from the modern functions of RNA and proteins, it is assumed that proteins, in contrast to RNA molecules, could not have possessed the capability to store information during chemical evolution. However, the interactions between RNA and protein-like molecules would have facilitated each other's chemical evolution. Such cooperative chemical evolution would have resulted in an RNA-protein world after the RNA world system.

**Figure 2.** Chemical evolution of RNA and protein-like molecules towards RNA-protein world

Here, the terms "protein" and "peptide" should be reconsidered for the purposes of the present chapter. Normally, a protein is loosely defined as a very long peptide that has a biochemical function, while peptides are frequently regarded as simply smaller molecules than proteins. However, an important characteristic from the viewpoint of the origin of life should be pointed out: proteins are synthesized by the genetic translation system in the cell. Accordingly, if a primitive long peptide was synthesized on the basis of genetic information, which might have been constructed from RNA molecules, the long peptide is regarded as a protein. The assignment method must therefore be elucidated together with the origin of proteins. In the present chapter, the term "protein" is defined as a peptide molecule of a certain length that is created using an assignment method in a life-like system. The terms "abiotic-peptide" and "protein-like" are used to describe oligomerized molecules made from amino acids linked by peptide bonds. In addition, the length at which abiotic-peptides could have gained biological functions is important when considering the oligomerization of amino acids. Biological functions of proteins and peptides are based on their three-dimensional structures, so oligopeptides of a minimum length would be regarded as candidate functional molecules. The minimum length of a peptide possessing a biological function might be as short as a 10-mer, since a 10-mer peptide could form a rigid three-dimensional structure [13].

In conclusion, DNA, RNA, and proteins are regarded as the most important molecules and play important roles in the flow of genetic information. The origin of genetic information is therefore an important issue in relation to the origin of life. The relationship between DNA and protein has been an issue for many years and is known as the "chicken and egg problem". DNA replication, the transcription of DNA to RNA, and the translation to proteins are maintained by protein enzymes, while the genetic information of these proteins is encoded in DNA sequences. Thus, the relationship between DNA and protein resembles that between the egg and the chicken. However, the discovery of RNA enzymes [14], that is, ribozymes, suggested that RNA molecules played a central role in ancient life-like systems, in which RNA not only preserved genetic information but also carried out enzymatic functions. This is the so-called RNA world hypothesis [15-17], which has been extensively investigated for the last two decades. The genetic information flow in an RNA-based life-like system is shown in Figure 1 (right). RNA molecules perform both functions, that is, storage of genetic information and expression of enzymatic activities. Several lines of evidence support the RNA world hypothesis: the important functions of mRNA, tRNA, and rRNA during the translation from DNA to protein, the enzymatic functions of ribozymes, the role of ATP as a high-energy phosphate, and the roles of coenzymes possessing nucleotide moieties. Nowadays, it is understood that DNA sequences, as genotype molecules, are assigned to amino acid sequences and RNA sequences, as phenotype molecules, in the cell. The presence of an assignment method between genotype and phenotype has been added as an important requisite for a life-like system [18, 19]. In an RNA-based life-like system, RNA molecules would have played the roles of both genotype and phenotype.


#### **2. Difficulties of dehydration in aqueous solution**

The formation of both nucleic acids and proteins *in vivo* is a dehydration process in an aqueous medium. In other words, from the thermodynamic viewpoint, the hydrolytic degradation of both nucleic acids and proteins proceeds spontaneously in an aqueous medium. The formation of these biopolymers is therefore thermodynamically unfavorable unless a condensation reagent and/or an activated monomer is used. In present organisms, nucleoside triphosphate monomers are primarily used as the monomer units to form nucleic acids, and amino acids activated as aminoacyl-tRNA (Figure 3) are used to form peptide bonds on the ribosome. Similarly, dehydration using a condensation agent and/or activation of the monomers in organic solvent is a standard strategy for the organic chemical synthesis of nucleic acids and proteins.
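The thermodynamic argument can be made concrete with a minimal numerical sketch. It assumes a condensation 2 M ⇌ D + H2O in dilute solution with the activity of water taken as 1, and it uses an illustrative positive free energy change of +15 kJ/mol; this value, the monomer concentration, and the function names are assumptions for illustration and are not taken from the chapter.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def equilibrium_constant(delta_g: float, temperature: float = 298.15) -> float:
    """K = exp(-dG/RT) for 2 M <=> D + H2O, with water activity taken as 1."""
    return math.exp(-delta_g / (R * temperature))


def dimer_at_equilibrium(m0: float, k: float) -> float:
    """Solve D = K * (m0 - 2 D)^2 for the physically meaningful (smaller) root.

    The root is written in a form that avoids subtractive cancellation.
    """
    b = 4.0 * k * m0 + 1.0
    disc = b * b - 16.0 * k * k * m0 * m0
    return 2.0 * k * m0 * m0 / (b + math.sqrt(disc))


# Assumed, illustrative inputs (not measured values from the chapter).
delta_g = 15_000.0   # J/mol, positive: condensation is unfavorable
m0 = 0.01            # mol/L total monomer

k = equilibrium_constant(delta_g)
dimer = dimer_at_equilibrium(m0, k)
print(f"K = {k:.3e} L/mol")
print(f"fraction of monomer present as dimer = {2.0 * dimer / m0:.3e}")
```

With any positive free energy change of this order, only a very small fraction of the monomer is condensed at equilibrium, which is why an activated monomer or a condensation reagent is needed to shift the balance.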

*In vivo*, RNA oligomerization proceeds by RNA polymerases on a DNA template from nucleoside 5'-triphosphates (Figure 4), which are the activated nucleotide monomers. A polypeptide forms on a ribosome in the presence of mRNA molecules and aminoacyl-tRNA molecules, which are formed from amino acids and tRNA molecules by aminoacyl-tRNA synthetases. However, modern DNA templates, ribosomes, and enzymes were not present on the primitive Earth. Thus, processes of RNA formation that do not rely on these organic materials should be identified to clarify the origin of primitive life-like systems.

Furthermore, an important factor should be pointed out for the investigation of prebiotic oligomerization: a life-like system is a thermodynamically open system, in which energy and materials flow in from and out to the environment. Dissipative structures that could not form at equilibrium can form in such an open system. It should therefore be considered that oligonucleotides and abiotic-peptides could have accumulated under conditions far from equilibrium.
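The contrast between a closed and an open system can be sketched with a toy kinetic model. It is not a model from the chapter: the reaction scheme (activated monomer condensing to a dimer, with loss of activation and hydrolysis of the dimer), all rate constants, and the feed and dilution rates are hypothetical values chosen only to show the qualitative behavior.

```python
def simulate(feed, dilution, a0, k_c=1.0, k_i=0.1, k_h=0.05,
             dt=0.01, t_end=500.0):
    """Integrate a toy condensation model with the explicit Euler method.

    act   : activated monomer (mol/L)
    dimer : condensation product (mol/L)
    k_c   : condensation rate constant, act + act -> dimer
    k_i   : loss of activation by hydrolysis
    k_h   : hydrolysis of the dimer
    feed, dilution : inflow of activated monomer and outflow rate (open system)
    All parameter values are hypothetical.
    """
    act, dimer = a0, 0.0
    for _ in range(round(t_end / dt)):
        d_act = feed - 2.0 * k_c * act * act - (k_i + dilution) * act
        d_dimer = k_c * act * act - (k_h + dilution) * dimer
        act += d_act * dt
        dimer += d_dimer * dt
    return act, dimer


# Closed system: one initial charge of activated monomer, no exchange.
_, dimer_closed = simulate(feed=0.0, dilution=0.0, a0=0.05)
# Open system: continuous supply of activated monomer and continuous outflow.
_, dimer_open = simulate(feed=0.01, dilution=0.05, a0=0.05)

print(f"closed system, dimer at end: {dimer_closed:.3e} mol/L")
print(f"open system, dimer at end:   {dimer_open:.3e} mol/L")
```

In the closed run the dimer forms transiently and is then lost to hydrolysis, whereas in the open run the continuous supply of activated monomer sustains a steady dimer concentration. This is only a qualitative picture of accumulation away from equilibrium, under the stated assumptions.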

**Figure 3.** Activation of amino acid in aminoacyl-tRNA


**Figure 4.** Structures of nucleoside 5'-triphosphate

### **3. Primitive earth environments: Extreme compared with the present earth environments**


The most primitive life is considered to have emerged on the primitive earth between 4.6 and 3.9 billion years ago; the earth formed 4.6 billion years ago, and the oldest evidence of life has been pointed out in 3.8-billion-year-old rock [8]. Because physical evidence from 4.6 – 3.9 billion years ago is hardly obtainable, the environments of the earth during this period have been estimated theoretically on the basis of astrophysics and earth science. At the early stage, the surface of the earth was covered with a magma ocean. Presumably, a large number of giant meteorites, celestial bodies, and planetesimals struck the primitive earth until 3.8 billion years ago [20]. Although the ocean would have formed by 4.2 billion years ago [21, 22], a giant meteorite impact could have totally evaporated its water; a life-like system, which might have emerged before 3.8 billion years ago, could have been destroyed under such extreme conditions. Furthermore, the moon was much closer than it is today, which caused strong tidal action. These factors should have strongly affected the surface temperature of the earth. On the other hand, some scientists have assumed that the primitive earth was frozen, since the solar luminosity was lower than at present [23, 24]. Thus, the temperature of the primitive ocean, in which life originated, remains speculative [25-27], although it would have been higher than at present.

At the same time, it should also be considered that the ancient environments of the earth were heterogeneous, as the present environments are. The local heterogeneity of the ancient environments is even more difficult to elucidate because of the limitations of theoretical estimation. The presence of the ocean would suggest the presence of mantle convection on the ancient earth before 4.2 billion years ago [28], so that hydrothermal vent systems would have been present in the primitive ocean. In addition, it should be considered that large continents had not yet formed on the primitive earth. The primitive atmospheric conditions should also be taken into account: the primitive atmosphere is nowadays considered to have consisted mainly of CO2, CO, N2, and H2O and not to have been strongly reducing [4], whereas at the time of the Miller-Urey experiment a relatively strongly reducing atmosphere including CH4, NH3, and H2O was assumed [27].

In conclusion, the primitive earth environments should have been extreme and heterogeneous. Nevertheless, the simulation experiments of chemical evolution of nucleic acids and abiotic peptides have normally been carried out under mild conditions. One reason is that the ancient environments of the earth should have been so extreme that simulation experiments under such conditions could not easily be carried out. Thus, these extreme conditions must be considered as possible earth environments in simulation experiments of chemical evolution.

### **4. Successful examples of the formation of nucleic acids**


#### **4.1. RNA formation using the present high-energy nucleotide monomers**

The discovery of ribozymes suggested that RNA molecules played important roles in the emergence of life. If the RNA world hypothesis is correct, then RNA or RNA-like molecules should have accumulated on the primitive earth. It is now becoming known that the diversity of RNA molecules in present-day organisms is very large, including mRNA, tRNA, rRNA, ribozymes, and small noncoding RNA [29]. The biological functions of tRNA, rRNA, and ribozymes arise from the formation of three-dimensional structures. Presumably, RNA oligomers of a certain length would be necessary to display such functions.

The importance of RNA molecules to the origin of life was speculated upon many years ago [30, 31], before the RNA world hypothesis was proposed [15]. Thus, simulation experiments of the chemical evolution of RNA molecules had been extensively carried out before the proposal of the RNA world hypothesis. RNA molecules should have evolved without enzymes and template DNA molecules. On the other hand, there are some enzymes in modern organisms and viruses (Table 1) that catalyze the formation of RNA molecules with or without a template nucleic acid. RNA polymerase catalyzes RNA formation from nucleoside 5'-triphosphate (NTP) in the presence of a DNA template [32], and Qβ replicase from a virus catalyzes RNA formation in the presence of an RNA template [33]. In addition, polynucleotide phosphorylase (PNPase) catalyzes the formation of RNA from nucleoside 5'-diphosphate (NDP) without a template nucleic acid [34]; the role of PNPase is considered to be primarily the formation of NDP from polynucleotides, but the enzyme also catalyzes the formation of polynucleotide from NDP. Thus, the enzyme is used for the preparation of polynucleotides in the laboratory. The existence of different enzymes for the formation of RNA suggests that oligonucleotides could have formed under primitive earth conditions through multiple pathways.

The pathways for the formation of oligonucleotides without enzymes have been extensively investigated. The potential of the present high-energy nucleotide phosphates, that is, NDP and NTP, as monomer units was examined in the absence of an enzyme and a template nucleic acid. However, efficient formation of oligonucleotides has been observed from neither NDP nor NTP [35]. The high-energy nucleotide phosphates possess sufficient Gibbs free energy to form oligonucleotides, so the difficulty of oligomerizing NDP and NTP is probably due to the relatively small rate constants of oligonucleotide formation without enzyme compared with those of the degradation or hydrolysis of these monomers. Thus, condensation agents were used for the oligomerization of nucleotide monomers on a template nucleotide polymer, although these reactions are normally not so efficient [36-40]. These condensation agents are not considered prebiotic, so nucleoside 2', 3'-cyclic phosphates were also examined [41, 42]. Recently, a plausible prebiotic pathway to 2', 3'-cyclic pyrimidine nucleotide monomers, which might have served as activated nucleotide monomers for oligonucleotides, was proposed [43]. Very recently, nucleoside 3', 5'-cyclic monophosphate was used to produce oligonucleotides at moderate temperatures, where the oligonucleotides were detected using gel electrophoresis [44].
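The kinetic argument above can be made explicit with a minimal rate-law sketch. The symbols are illustrative assumptions and not quantities reported in the cited studies: $[M]$ is the concentration of NDP or NTP, $k_f$ is an apparent second-order rate constant for non-enzymatic phosphodiester bond formation between two monomers, and $k_h$ is a pseudo-first-order rate constant for hydrolysis or degradation of the monomer.

$$
\frac{d[M]}{dt} = -2k_f[M]^2 - k_h[M], \qquad
\phi = \frac{2k_f[M]}{2k_f[M] + k_h}
$$

Here $\phi$ is the instantaneous fraction of consumed monomer that is incorporated into oligomers. When $k_h \gg 2k_f[M]$, $\phi \approx 2k_f[M]/k_h \ll 1$, so in this simplified picture almost all monomer is lost to hydrolysis even though bond formation is thermodynamically allowed.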


An efficient template-directed formation of oligonucleotides was not achieved from nucleoside 5'-monophosphate monomers [36-40, 45] or from an activated nucleotide intermediate [46] in the presence of a condensation agent. Finally, a plausible primitive activated nucleotide monomer was discovered, namely nucleoside 5'-monophosphorimidazolide (ImpN) (Figure 5), which is activated by an N-P bond on the phosphate group of the nucleotide monomer [47-49]. For instance, adenosine 5'-monophosphorimidazolide (ImpA) forms oligoadenylate (oligoA) in the presence of Zn2+ without a template [50], and guanosine 5'-monophosphorimidazolide (ImpG) efficiently forms oligoguanylate (oligoG) of up to 40 nucleotide units on a polycytidylic acid (polyC) template in the presence of Pb2+ or Zn2+ without any enzyme [51, 52]. The prebiotic RNA formation reactions are summarized in detail in the following sections.

**Figure 5.** Activated nucleotide monomers

The details and variations of the reaction using activated nucleotide monomers have been extensively investigated in the absence and presence of a template, in the presence of metal ion catalysts and clay mineral catalysts, and using different types of activated nucleotide monomers and non-ribose-based nucleotide analogues. As a variation of ImpN, nucleoside 5'-monophospho-2-methylimidazolide (2MeImpN) has been frequently used [53].

| Enzyme | Monomer | Template |
|---|---|---|
| RNA polymerase | Nucleoside 5'-triphosphate | DNA |
| Qβ replicase | Nucleoside 5'-triphosphate | RNA |
| Polynucleotide phosphorylase | Nucleoside 5'-diphosphate | No template |

**Table 1.** Formation of RNA in the presence of different types of enzymes.

#### **4.2. Prebiotic formation and organic synthesis of the activated nucleotide monomer**

It was successfully demonstrated that oligonucleotides form from activated nucleotide monomers without enzymes. However, the formation of the activated nucleotides under primitive earth conditions is also an important issue. Possible pathways for the spontaneous formation of activated nucleotides have barely been elucidated [48, 54, 55]. Moreover, the accumulation of activated nucleotide monomers would not be very likely, since these pathways involve moisture-controlled N-P bond formation and the formation of nucleotide phosphate under dry conditions [54]. Furthermore, activated nucleotide monomers are substantially hydrolyzed to nucleoside 5'-monophosphate [56]. Because the yield of activated nucleotide monomers formed under these simulation conditions is not high, simulation experiments of oligonucleotide formation under primitive earth conditions are normally carried out using a sufficient amount of activated nucleotide monomers, such as ImpN, prepared by organic synthesis. The organic synthesis of ImpN has been established, and the yields of ImpN and its analogues are normally over 95% by a coupling method using a sulfide compound [57].

#### **4.3. Spontaneous formation of oligonucleotide using metal catalyst (Metal-catalyzed reaction)**

Spontaneous oligomerization of the activated nucleotide monomers should be the next step in the chemical evolution of oligonucleotides once these monomers had formed under primitive earth conditions. A spontaneous formation pathway was first found in the presence of metal ion catalysts (Figure 6) [50, 58]. The maximum length of the oligonucleotides formed by the Metal-catalyzed reaction reaches ca. 10 nucleotide units [59-61]. The efficiency of the Metal-catalyzed reaction depends relatively little on the type of nucleotide base. Catalytic metal ions, such as Zn2+, Pb2+, and UO₂²⁺, are normally active in the simulation experiments at concentrations that are fairly high compared with those in the present ocean. Regioselective formation of 3', 5'- or 2', 5'-linked oligonucleotides has been observed: the formation of 2', 5'-linked oligonucleotides is favored with Pb2+ or UO₂²⁺, and that of 3', 5'-linked oligonucleotides is favored with Zn2+ [50, 59-61]. Although the reason why 3', 5'-linked RNA was selected during the chemical evolution of RNA is not yet established, these experimental data may be related to it. In addition, 2', 5'-linked oligonucleotide formation has been applied as a practical synthetic method for 2', 5'-linked oligonucleotides [59]. Mechanistic analysis of Metal-catalyzed reactions suggested that the acceleration by metal ions is due to the formation of complexes of the metal ions with phosphate groups and/or bases, which enhance the association between two monomers, or between a monomer and an elongating oligomer, prior to phosphodiester bond formation [62].


**Figure 6.** Metal-catalyzed reaction

**Figure 7.** RNA formation on montmorillonite clay

#### **4.4. Spontaneous formation of oligonucleotide using clay catalyst (Clay-catalyzed reaction)**

The importance of minerals for the chemical evolution of biomolecules and biopolymers has long been assumed [63]. Thus, simulation experiments of oligonucleotide formation have been extensively carried out [64-66]. Efficient formation of oligonucleotides on a clay mineral, naturally occurring montmorillonite, was first discovered in 1992, and the reaction has been extensively investigated [67]. Oligonucleotides of up to ca. 10 nucleotide units form spontaneously in a one-pot reaction in the presence of the montmorillonite catalyst. The efficiency of the Clay-catalyzed reaction is somewhat dependent on the type of nucleotide base; the formation of oligoguanylate (oligoG) is somewhat less effective than that of the others [68-71]. In these reactions, both 2', 5'-linked and 3', 5'-linked oligonucleotides form, and pyrophosphate-linked isomers are also observed. There is a trend that the oligonucleotides formed from purine ImpN contain mainly 3', 5'-linked isomers, whereas those formed from pyrimidine ImpN contain mainly 2', 5'-linked isomers. The percentage of pyrophosphate-linked isomers is normally small.

Furthermore, fairly large amounts of cyclic oligonucleotides, such as cyclic 2-mers, cyclic 3-mers, and cyclic 4-mers, are generally observed. The ratio of the cyclic isomers also depends on the type of nucleotide base [70]. For instance, the 2-mer fraction on an anion-exchange HPLC column of the products from ImpU and ImpI contains 60 – 96% cyclic 3-mers. Longer cyclic oligomers are also observed in the fractions of higher length. The cyclization of 6-mers and longer oligonucleotides, which possess a phosphate group at the 5'-terminus and a 2'-OH at the 3'-terminus, was observed in the presence of a condensation agent [72, 73]; this reaction was designed to examine the competition between elongation and cyclization. The cyclization of oligonucleotides is considered a termination reaction in the chemical evolution of RNA.


Oligonucleotide formation proceeds with montmorillonite from different sources, but it does not proceed at all with a clay from Otay [69]. Binding of ImpN to the different clay minerals is observed in these reaction systems, and there is a strong correlation between the yield of oligonucleotides and the binding constant of the activated nucleotide monomer on the clay. Thus, binding is a necessary step for the acceleration of oligonucleotide formation on the clay. About 80% of the bound activated nucleotide monomer is on the surface of the montmorillonite, where negative charges are present, and about 20% is on the edge [74]. The activated nucleotide monomer is bound through a Mg2+ ion bridge, which forms a complex with the phosphorimidazolide group (Figure 7) [69]. The low efficiency of oligonucleotide formation in the absence of Mg2+ supports the Mg2+ bridge binding model, and Ca2+ ions can also form the bridge [69].

On the other hand, the binding of the activated nucleotide monomers is fairly dependent on the type of nucleotide base [69, 70]. ImpN with a purine base binds more strongly than ImpN with a pyrimidine base. Although the apparent binding ratios of ImpC and ImpU on the clay are very low, the yields of oligonucleotides are not much different from those of ImpA and ImpG. This fact suggests that the apparent binding of ImpN includes both binding that is effective and binding that is not effective for oligonucleotide formation on the clay mineral. The non-effective binding reflects a dependence of the binding on the hydrophobicity and stacking of the nucleotide bases.
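To illustrate how a binding constant translates into an apparent bound fraction, the following is a minimal sketch that assumes a single-site Langmuir-type isotherm. The isotherm, the function name `fraction_bound`, and the numerical values are illustrative assumptions; they are not the model or the data of the cited studies, and the sketch describes only apparent binding, which, as noted above, is not always effective for oligonucleotide formation.

```python
def fraction_bound(k_bind, free_conc):
    """Fractional occupancy of binding sites for an assumed Langmuir isotherm.

    k_bind    : hypothetical binding constant (1/M)
    free_conc : free activated-monomer concentration (M)
    """
    if k_bind < 0 or free_conc < 0:
        raise ValueError("binding constant and concentration must be non-negative")
    x = k_bind * free_conc
    return x / (1.0 + x)


if __name__ == "__main__":
    # Hypothetical constants chosen only to contrast a stronger and a weaker binder.
    for label, k in (("purine-like ImpN", 500.0), ("pyrimidine-like ImpN", 50.0)):
        print(label, round(fraction_bound(k, 0.01), 3))
```

In this simplified picture a tenfold larger binding constant gives a clearly higher occupancy at the same free concentration, which is the kind of base dependence described for purine and pyrimidine ImpN.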

By using the Clay-catalyzed reaction, it was shown that oligonucleotides of up to ca. 50 nucleotide units form when ImpN is fed continuously to the montmorillonite-aqueous solution system [75]. Furthermore, an ImpN analogue in which the imidazole moiety is replaced with a purine derivative forms oligonucleotides of ca. 50 nucleotide units in a one-step reaction [76, 77]. These modifications of the Clay-catalyzed reaction indicate that oligonucleotides could have accumulated under the primitive earth conditions if the conditions were as mild as at present.

The reaction mechanism of the Clay-catalyzed reaction was extensively investigated on the basis of reaction kinetics and binding thermodynamics. The binding of ImpN or its analogues is the first step in the formation of oligonucleotides. Mg2+ ions are necessary for this binding: Mg2+ forms a bridge that attaches the phosphate group of ImpN to the negatively charged clay surface [69, 74]. Kinetic analysis showed that the rate constant for the formation of the 2-mer is much smaller than that for the 3-mer and longer oligonucleotides. This suggests that the association of two ImpN molecules to form the 2-mer is much weaker than that of ImpN with an elongating oligonucleotide. A similar trend was observed for the Metal-catalyzed reaction [62]. The regioselectivity for 2', 5'- or 3', 5'-linked isomers observed with pyrimidine or purine ImpN is probably due to different binding conformations of the activated nucleotide monomer on the clay surface. Hydrolysis of ImpN proceeds simultaneously with the oligomerization, but oligonucleotide formation is much faster than ImpN hydrolysis.

#### **4.5. Template-directed formation of oligonucleotide (Template-directed reaction)**

According to the scenario of the chemical evolution of RNA, the replication of oligonucleotides should be the next step after the spontaneous formation of oligonucleotides by Metal-catalyzed and Clay-catalyzed reactions, since the replication of RNA would have resulted in the replication of genetic information. Orgel and coworkers demonstrated the template-directed formation of oligoG from ImpG or 2MeImpG on a polyC template without using any enzyme (Figure 8) [51-53]. These reactions produce oligonucleotides of up to ca. 40-mer. The efficiency of the Template-directed reaction using ImpG depends on metal ions: Pb2+ ion catalyzes the formation of 2', 5'-linked oligoG and Zn2+ ion catalyzes the formation of 3', 5'-linked oligoG. It is interesting that Zn2+ is present in the active center of modern RNA polymerase (Figure 9) [78]. Besides, 2MeImpG forms 3', 5'-linked oligoG without any metal ions. Fewer cyclic isomers form in the Template-directed reaction, whereas cyclic 2-mers, 3-mers, and higher oligomers are frequently observed in Clay-catalyzed reactions. The Template-directed reaction depends on pH and temperature. Normally, the efficiency of the reaction is highest at around pH 8 [53]. This is due to the activity of the activated nucleotide monomers.

**Figure 8.** Template-directed formation of RNA

It is noted that Template-directed reactions using ImpN or 2MeImpN with the other nucleotide bases, that is, adenine, uracil, and cytosine, do not proceed efficiently to form oligonucleotides; oligoA forms with very low efficiency from ImpA on a polyU template [49], and oligoU and oligoC are not formed at all from ImpU and ImpC on a complementary template. The low efficiency is probably because the association between two ImpN molecules, or between ImpN and an elongating oligonucleotide, on a complementary template is very weak for ImpN with an adenine, uracil, or cytosine base.

A possible pathway was demonstrated in which the Template-directed reaction, starting from a mixture of the four types of ImpN with different bases on a template that includes complementary bases, partially incorporates different bases to give oligonucleotides other than oligoG [57, 79]. The efficiency of the Template-directed reaction increases with increasing cytidine content of the template. In addition, it was confirmed that the Template-directed reaction using ImpA, ImpU, and ImpC partially proceeds on hairpin oligonucleotides [80]. Furthermore, the Template-directed reaction proceeds if activated oligoU is used as the starting material on a polyA template, where the association of oligoU would be stronger than that of ImpU monomers [81]. Although these results suggest that the Template-directed reaction partially proceeds for mixed oligonucleotides, its fidelity is not high. The limitations of the Template-directed reaction as a prebiotic route to RNA have been pointed out [82]. In other words, it remains unresolved how the replication of oligonucleotides could have proceeded in the absence of enzymes. In addition, protocells and vesicles could have enhanced the chemical evolution of RNA molecules [83, 84]. In conclusion, a universal pathway for replication using activated nucleotide monomers has not yet been identified.


**Figure 9.** 3', 5'-linked (left) and 2', 5'-linked (right) RNA

#### **4.6. Kinetic analysis of prebiotic oligonucleotide formation**

First, the reaction mechanism of the Template-directed reaction has been extensively investigated on the basis of kinetic analysis [85]. The rate constants for the formation of the 2-mer, 3-mer, and longer oligomers increase in the order 2-mer << 3-mer < 4-mer and higher. Hydrolysis of the activated nucleotide proceeds simultaneously with the oligomerization (Figure 10) [56, 85]. In the presence of the template, the rate constants of oligomerization are greater than that of hydrolysis. The association between two activated monomers, and that between an activated monomer and the elongating oligoG, is important for oligonucleotide formation. A similar trend was observed for the Clay-catalyzed reaction [69, 70]. Detailed kinetic analysis showed that two or three activated nucleotide monomers align on the polyC template prior to phosphodiester bond formation [85]. For instance, the association between two ImpG molecules would be stronger than that between two ImpC molecules, as determined by the strength of stacking of the nucleotide bases. This is consistent with the fact that the efficiency of the Template-directed reaction with ImpA, ImpU, and/or ImpC is much lower than that with ImpG.

**Figure 10.** Hydrolytic degradation of the activated nucleotide

On the other hand, kinetic analysis of the Clay-catalyzed reaction showed that the rate constants for the formation of the 2-mer, 3-mer, and 4-mer and higher oligonucleotides increase in the order 2-mer < 3-mer < 4-mer and higher; this trend is consistent with the Template-directed reaction [69, 70]. In the presence of clay, the hydrolysis of the activated nucleotide monomer competes to some extent with the oligomerization. The kinetics suggested that the association between two activated monomers, and that between an activated monomer and an elongating oligomer, are important in determining the rate constants of 2-mer, 3-mer, and 4-mer formation, as in the Template-directed reaction. The association is mainly due to stacking of the nucleotide bases, which increases with the length of the oligonucleotide up to the 4-mer and then remains constant. Kinetic analysis of the Metal-catalyzed reaction also implies a mechanism similar to that of the Template-directed and Clay-catalyzed reactions (Figure 11). In conclusion, the kinetic investigations indicate that the association of an activated nucleotide monomer with another activated nucleotide monomer or with an elongating oligomer is important in determining the efficiency of oligomerization in the presence of any additive, such as a polynucleotide template, a clay surface, or metal ions.

**Figure 11.** Importance of associate formation for oligonucleotide elongation
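The kinetic picture described in this section, in which 2-mer formation is much slower than the later elongation steps while hydrolysis depletes the activated monomer, can be explored with a toy rate-equation model. The Python sketch below is illustrative only. The function names `simulate` and `units_in_oligomers`, the explicit Euler integration, the assumption that every coupling step irreversibly consumes activated monomer, and all numerical rate constants are hypothetical choices made for this example; they are not values or methods taken from the cited kinetic studies.

```python
def simulate(k2, k3, k4, k_hy, m0=1.0, max_len=6, dt=1e-3, steps=20000):
    """Integrate a toy oligomerization model with explicit Euler steps.

    Species: activated monomer M, hydrolyzed monomer H, and oligomers of
    length 2..max_len. Assumed (hypothetical) reactions:
        M + M         -> 2-mer       rate k2 * [M]^2
        2-mer + M     -> 3-mer       rate k3 * [2-mer] * [M]
        n-mer + M     -> (n+1)-mer   rate k4 * [n-mer] * [M], n >= 3
        M             -> H           rate k_hy * [M]
    The longest oligomer (max_len) does not elongate further.
    """
    m = m0
    h = 0.0
    olig = [0.0] * (max_len + 1)  # olig[n] is the n-mer concentration
    for _ in range(steps):
        r_hy = k_hy * m
        r_dim = k2 * m * m
        r_el = [0.0] * (max_len + 1)  # r_el[n]: n-mer + M -> (n+1)-mer
        for n in range(2, max_len):
            k = k3 if n == 2 else k4
            r_el[n] = k * olig[n] * m
        # Each dimerization uses two monomers, each elongation uses one.
        m += dt * (-r_hy - 2.0 * r_dim - sum(r_el))
        h += dt * r_hy
        olig[2] += dt * (r_dim - r_el[2])
        for n in range(3, max_len + 1):
            olig[n] += dt * (r_el[n - 1] - r_el[n])
    return m, h, olig


def units_in_oligomers(olig, m0=1.0):
    """Fraction of the initial monomer units that ended up in oligomers."""
    return sum(n * c for n, c in enumerate(olig)) / m0


if __name__ == "__main__":
    # Hypothetical rate constants with k2 << k3 < k4, scanned over k_hy.
    for k_hy in (0.05, 0.5, 5.0):
        _, _, olig = simulate(k2=0.1, k3=5.0, k4=10.0, k_hy=k_hy)
        print(f"k_hy={k_hy}: fraction in oligomers = "
              f"{units_in_oligomers(olig):.3f}")
```

In this sketch the slow 2-mer step acts as the bottleneck: once a 2-mer exists it elongates quickly, so the fraction of monomer units that reach oligomers is governed largely by how `k2` compares with `k_hy`.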

#### **4.7. Variations of the activated nucleotide monomers**

The activated nucleotide monomers with an N-P bond were normally tested using ImpN or 2MeImpN, in which the imidazole and 2-methylimidazole moieties are the leaving groups attached to the phosphate group through the N-P bond. Although the methyl group might seem to have little effect, it has a fairly large influence on the formation of long oligonucleotides [86]. For instance, the efficiency of the Template-directed reaction using 2MeImpG in the absence of a metal catalyst is higher than that using ImpG. This is probably because stacking of 2-methylimidazole enhances the association between monomers or between a monomer and an elongating oligomer. The solubility of 2MeImpN in aqueous solution is lower than that of ImpN. Both imidazole and 2-methylimidazole are potentially prebiotic compounds. Analogues of ImpN with different leaving groups were also examined. It was found that the efficiency of oligonucleotide formation using different leaving groups with an N-P bond is correlated with the acidity of the leaving group. Oligonucleotide formation indeed proceeds efficiently with such ImpN analogues in Clay-catalyzed reactions, and higher oligonucleotides were observed in some cases (Figure 12) [87, 88]. These studies suggest that a variety of pathways could have been possible for the formation of RNA under the primitive earth conditions.


**Figure 12.** Variations of the activated nucleotide

On the other hand, it is claimed that RNA would not be suitable as an initial material for preserving genetic information, since ribose, as a component of RNA, is generally considered difficult to form under the primitive earth conditions. Ribose does not form efficiently under prebiotic conditions, although many types of sugars are readily formed by the formose reaction [89, 90], which is considered a major pathway for the formation of sugars. In addition, RNA is considered unstable compared with hexose. Naturally, nucleotide analogues of RNA built on a sugar other than ribose could have formed under prebiotic conditions. Furthermore, the phosphodiester backbone could be replaced with a peptide-bonded backbone [91-93] (Figure 13). These possibilities were also experimentally verified. Indeed, activated nucleotide analogues with a hexose backbone also form oligonucleotide analogues in the Template-directed reaction [94, 95]. These examples suggest that template direction could have played important roles for a variety of analogous materials under prebiotic conditions.

**Figure 13.** RNA and peptide nucleic acid

Finally, as mentioned above, ImpN monomers containing deoxyribose are known not to form oligonucleotides, since the reactivity of ImpN with deoxyribose is very low. This is due to the lower reactivity associated with the 2'-H group, which in deoxyribose replaces the 2'-OH group of ribose. The formation of DNA from such activated nucleotide monomers has not yet been investigated in depth, since DNA molecules are not considered to have been the initial genetic material on the primitive earth.

#### **4.8. Chiral selection of oligonucleotide using the prebiotic RNA formation models**

The RNA polymerization model reactions have been studied extensively. These studies were normally carried out using homochiral materials because of the difficulty of preparing the starting materials. The selection of a single chirality, which is D-ribose for present-day nucleic acids, is a major issue in the study of the origin of life. Although this question has not yet been solved, some attempts were made to evaluate the efficiency of Template-directed, Clay-catalyzed, and Metal-catalyzed reactions using heterochiral materials [96-99]. The preparation of activated nucleotide monomers with L-ribose was achieved, although it involves very complicated organic synthetic procedures. These reactions showed that homochiral oligomerization is preferred over heterochiral oligomerization in both the Clay-catalyzed and Template-directed reactions. The results may indicate how homochiral biochemistry was selected during the chemical evolution of nucleotides and abiotic-peptides (Figure 14).


**Figure 14.** Homochiral selectivity for prebiotic RNA formation

#### **4.9. Temperature dependence of the prebiotic oligomerization at medium to high temperature**

According to geoscientific evidence, the temperature of the primitive earth would have been much higher than at present. In addition, submarine hydrothermal vent systems would have been present and played important roles in the emergence of life, and the last universal common ancestor, as deduced from phylogenetic tree analysis, could have been close to hyperthermophiles [100, 101]. Naturally, the heterogeneity of the primitive earth environment should be considered, so the previous studies on the formation of RNA can be regarded as biased toward very mild conditions. As mentioned above, the oligomerization of the activated nucleotide monomers does not proceed in acidic or alkaline solutions because of the hydrolysis of the activated nucleotide. Besides, the temperature dependence of these reactions was extensively investigated in relation to the hydrothermal origin of life hypothesis.

In general, the efficiency of oligonucleotide formation by Metal-catalyzed, Clay-catalyzed, and Template-directed reactions decreases with increasing temperature. Kinetic analyses successfully clarified the reason for the low efficiency at high temperatures [102-104]. The investigations provided the rate constants for these reaction systems, that is, for the formation of the 2-mer (*k*<sub>2</sub>), 3-mer (*k*<sub>3</sub>), and 4-mer (*k*<sub>4</sub>), the hydrolysis of the activated nucleotide (*k*<sub>hy</sub>), the formation of the pyrophosphate linkage (*k*<sub>py</sub>), and the hydrolysis of the oligonucleotides formed by these prebiotic oligomerizations. Comparison of the temperature dependence of *k*<sub>2</sub>, *k*<sub>3</sub>, and *k*<sub>4</sub> shows that *k*<sub>2</sub> has a smaller activation energy than *k*<sub>3</sub> and *k*<sub>4</sub>. In addition, the magnitude of *k*<sub>2</sub> relative to *k*<sub>hy</sub> decreases with increasing temperature, whereas that of *k*<sub>3</sub> and *k*<sub>4</sub> does not; the activation energy for *k*<sub>2</sub> is smaller than that for *k*<sub>hy</sub>, but that for *k*<sub>3</sub> and *k*<sub>4</sub> is comparable to that for *k*<sub>hy</sub>. The main reason for the low efficiency of oligonucleotide formation at high temperature is that the formation of the 2-mer becomes slow relative to the hydrolysis of the activated nucleotide.
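The link between activation energy and the temperature trend described in the preceding paragraph can be written out explicitly. The short derivation below is a sketch that assumes each rate constant follows the Arrhenius equation with a temperature-independent pre-exponential factor; the symbols $A_i$ (pre-exponential factor), $E_i$ (activation energy), $R$ (gas constant), and $T$ (absolute temperature) are introduced here for illustration and are not notation from the cited studies.

```latex
% Assumed Arrhenius form for each rate constant (i = 2, 3, 4, hy)
k_i = A_i \exp\!\left(-\frac{E_i}{RT}\right)

% Magnitude of 2-mer formation relative to hydrolysis
\frac{k_2}{k_{\mathrm{hy}}}
  = \frac{A_2}{A_{\mathrm{hy}}}
    \exp\!\left(\frac{E_{\mathrm{hy}} - E_2}{RT}\right)

% Temperature dependence of that ratio
\frac{\mathrm{d}}{\mathrm{d}T}\,\ln\frac{k_2}{k_{\mathrm{hy}}}
  = -\frac{E_{\mathrm{hy}} - E_2}{RT^{2}} < 0
  \quad \text{when } E_2 < E_{\mathrm{hy}}
```

Under this assumption, a smaller activation energy for 2-mer formation than for hydrolysis means that the ratio $k_2/k_{\mathrm{hy}}$ falls as the temperature rises, whereas for the 3-mer and 4-mer steps, with $E_3 \approx E_4 \approx E_{\mathrm{hy}}$, the corresponding ratios stay roughly constant.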

A similar trend was observed for Metal-catalyzed and Clay-catalyzed reactions on the basis of kinetic analysis [103, 104]. In addition, the association affects not only the efficiency of these reactions but also the regioselectivity and the formation of cyclic isomers. The fact that the DNA double helix is stabilized primarily by base stacking, with hydrogen bonding a minor factor [105], is consistent with the trend in the primitive RNA formation reactions. These analyses suggest that the accumulation of the activated nucleotide monomer would be difficult at high temperatures, since the hydrolysis of the activated nucleotide becomes fast, unless the activated nucleotide monomers are continuously supplied by a plausible pathway. Nevertheless, such conformations of DNA and RNA must be maintained at high temperatures, since thermophilic organisms can survive at temperatures of at least 100-120 °C [106, 107].

#### **5. Successful examples of the formation of abiotic-peptides**

#### **5.1. Introduction**


Proteins are regarded as the practical entities that maintain a life-like system, while DNA in present organisms, or primitive RNA in the RNA world, is regarded as the blueprint of such a system. The question of which material came first for the emergence of a life-like system on the primitive earth, nucleic acids or proteins, has been controversial for many years, although it might be resolved by the RNA world hypothesis. However, the RNA world hypothesis has a number of drawbacks, so the question continues to be discussed. As mentioned above, the term "protein" in the present chapter implies a functional material that is synthesized on the basis of genetic information. If a protein-like molecule consists merely of peptide and its sequence is not encoded in a genetic coding system, the material is called an abiotic-peptide, to distinguish it from peptides possessing biological functions *in vivo*. This terminology is not inconsistent with, for instance, a hypothesis that RNA and protein-like molecules formed independently and became interrelated only after these molecules had accumulated.

In the first place, abiotic-peptide formation from amino acids is also a dehydration, so it is not favorable in aqueous medium unless the amino acid is activated or a condensation agent is used. Peptides are normally regarded as more stable against hydrolysis than nucleic acids, although peptides in aqueous medium are still constantly exposed to hydrolytic degradation. In present organisms, peptides form on ribosomes from aminoacyl-tRNA, which is regarded as an activated form of the amino acid. Furthermore, condensation agents and activation techniques are used in the modern organic synthesis of peptides. Thus, it should be noted that, from the viewpoint of thermodynamics, abiotic-peptides do not form spontaneously even if amino acids are mixed in aqueous medium. This is the same situation as for the oligomerization of DNA and RNA.

#### **5.2. Thermal condensation of amino acid mixture**

Spontaneous condensation of amino acid mixtures hardly proceeds in aqueous solution, because dehydration is disadvantageous in aqueous medium. It was found, however, that thermal condensation products form from amino acid mixtures containing, for example, glutamic acid and aspartic acid under dry conditions at 180-200 °C [108-110] (Figure 15). Microspheres can be formed by treating the condensation products with boiling water; these microspheres were named proteinoids. It is surprising that the condensation products form readily, normally with molecular weights over 10000 Da, and involve peptide bonding. The peptide bonds are not controlled, however; branched sequences and non-peptide bonding are probably involved. The lactam formed by heating glutamic acid and similar amino acids melts at high temperatures, and the other amino acids dissolve in the melt and then polymerize. The biochemical functions of proteinoids have been extensively investigated [111-113].

**Figure 15.** Thermal condensation of amino acids

Although proteinoids partially involve peptide bonding, they might not reflect a plausible ancient pathway for the accumulation of protein-like molecules, since large continents were not present on the primitive earth. How proteinoids, which contain peptide bonding only incompletely, could have evolved into modern proteins remains a serious open question. In addition, as mentioned above, these condensation products are not categorized as proteins, since they are not synthesized under the direction of genetic information. It is also generally recognized that a protein does not replicate alone. Furthermore, these condensation products do not seem to be useful in organic synthesis. Thus, research on the thermal condensation of amino acids has not progressed much so far.

#### **5.3. Formation of protein-like and abiotic-peptide molecules in aqueous medium**

The condensation of amino acids to form peptide bonds is disadvantageous in aqueous medium [114]. Activation of the amino acids or the use of a condensation agent is necessary for peptide bond formation [115-120]. Thus, pathways to oligopeptides have been investigated in simulation experiments under primitive earth environments. Several types of minerals and metal ions were evaluated to see whether they enhance the formation of long oligopeptides. Nevertheless, plausible oligomerization has met with only limited success [119].
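Why an unactivated condensation is disadvantageous in water can be made concrete with the relation *K* = exp(-Δ*G*/*RT*). The sketch below estimates the equilibrium dipeptide concentration for the reaction 2 amino acid ⇌ dipeptide + H<sub>2</sub>O. It assumes a water activity of 1 and negligible depletion of the amino acid, and the free-energy change `DELTA_G` is a hypothetical positive value used only for illustration, not a figure taken from [114].

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def equilibrium_dipeptide(delta_g, amino_acid_molar, temp_kelvin=298.15):
    """Equilibrium dipeptide concentration (M) for 2 AA <=> dipeptide + H2O.

    Assumes water activity of 1 and that the amino acid concentration is
    essentially unchanged by the reaction (valid when conversion is small).
    """
    k_eq = math.exp(-delta_g / (R * temp_kelvin))
    return k_eq * amino_acid_molar ** 2

DELTA_G = 15e3  # J/mol, hypothetical positive free-energy change (assumption)

conc = equilibrium_dipeptide(DELTA_G, 0.1)
print(f"equilibrium dipeptide from 0.1 M amino acid: {conc:.2e} M")
```

With any positive Δ*G* of this order, only a tiny fraction of the amino acid is present as dipeptide at equilibrium, which is why activation or a condensation agent is needed.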

#### **5.4. Hydrothermal synthesis of oligopeptides**

Presumably, the temperature of the primitive earth surface was much higher than at present. Furthermore, phylogenetic analysis of present organisms suggested that the last universal common ancestor (LUCA), which is located at the branching between Bacteria and Archaea, would have been a hyperthermophilic organism [100, 101], although this assumption is still strongly disputed [121, 122]. The importance of hydrothermal systems has been extensively investigated [123], and the resulting idea is named the "hydrothermal origin of life hypothesis". Although LUCA is not the origin of life, it would reflect ancient organisms and the environments in which life emerged. Recent discoveries of ecosystems near submarine hydrothermal vent systems in the deep ocean support the hydrothermal origin of life hypothesis [124]. Thus, simulation experiments on the formation of oligopeptides under hydrothermal conditions have been carried out, initially using batch reactors. Condensation products involving peptide bonding, as well as bonding between silicate and organic materials, were observed [125].

More recently, flow reactors were developed, and the behavior of biomolecules in hydrothermal systems has been extensively investigated to simulate submarine vent systems in the deep ocean [126]. The simulation experiments showed that oligopeptides form from glycine monomers at temperatures over 250 °C, where oligoglycine of up to 6 amino acid units was detected. The efficiency of direct oligopeptide formation from amino acids is normally low; the maximum yield is 0.1 – 1 % at most [126, 127]. The primary reason is that dehydration is not advantageous in aqueous solution, so that the oligopeptides, once produced, are always exposed to degradation. In addition, the formation of diketopiperazine (DKP) inhibits further elongation of oligopeptides (Figure 16); DKP is a cyclic dipeptide and is considered very stable. Rapid accumulation of DKP from glycine is also observed in simulation reactions of the submarine vent system.

**Figure 16.** DKP formation from amino acids

In our group, on the other hand, a hydrothermal micro flow reactor using a fused-silica capillary has been developed (Figure 17) [128-130]. It enables hydrothermal reactions to be monitored at temperatures up to 400 °C on time scales of 2 ms – 200 s. Using this flow reactor, a more efficient elongation of the 4-mer and 5-mer of alanine to the 5-mer and 6-mer was discovered at 250 – 310 °C within 5 – 20 s [131]. Furthermore, it was found that oligopeptides, including ones of 20 amino acid units, form within 180 s at 270 – 310 °C [132]. These reactions were designed to avoid the DKP formation that occurs during the direct oligomerization of monomeric amino acids; DKP formation is also regarded as a problem in the organic synthesis of peptides. DKP readily forms from dipeptides at high temperatures; for instance, DKP forms from the alanine dipeptide within 10 s at 275 °C (Figure 18). In contrast, DKP formation is relatively slow if the elongation starts from the 4-mer or 5-mer of alanine. In these reactions the 4-mer and 5-mer are eventually converted to DKP, but because this conversion is relatively slow, elongation of the peptides is observed while it proceeds. The reaction scheme is shown in Figure 18. The elongation yield reaches 10 %, which is ca. 100-fold greater than that of direct formation from monomeric amino acids. This is surprising, since neither a condensation reagent nor a catalyst is necessary. The kinetic analysis implies that the peptide bonds within the 4-mer would compensate for the formation of the new peptide bond that yields the 5-mer at very high temperatures.
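The observation that elongation appears transiently while the peptides are slowly lost to DKP can be sketched with a minimal first-order model. This is an illustrative simplification and not the kinetic scheme of Figure 18: it assumes that the 4-mer elongates to the 5-mer with a pseudo-first-order rate constant `K_EL` and that both the 4-mer and the 5-mer degrade with the same rate constant `K_DEG`. Both values are hypothetical. Under these assumptions the 5-mer fraction has the closed form exp(-*k*<sub>deg</sub>*t*)(1 - exp(-*k*<sub>el</sub>*t*)).

```python
import math

def five_mer_fraction(t, k_el, k_deg):
    """Fraction of the initial 4-mer present as 5-mer at time t (s).

    Analytical solution of  A' = -(k_el + k_deg) * A,  B' = k_el * A - k_deg * B
    with A(0) = 1 and B(0) = 0, where A is the 4-mer and B the 5-mer.
    """
    return math.exp(-k_deg * t) * (1.0 - math.exp(-k_el * t))

def time_of_max_yield(k_el, k_deg):
    """Time (s) at which the 5-mer fraction passes through its maximum."""
    return math.log(1.0 + k_el / k_deg) / k_el

# Hypothetical rate constants in 1/s (assumptions for illustration only).
K_EL, K_DEG = 0.02, 0.1

t_max = time_of_max_yield(K_EL, K_DEG)
print(f"maximum 5-mer fraction {five_mer_fraction(t_max, K_EL, K_DEG):.3f} at t = {t_max:.1f} s")
```

The model reproduces the qualitative picture only: the elongated product rises, passes through a maximum after a few seconds, and then decays, so a short and well-controlled residence time in a flow reactor is what makes the intermediate observable.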

**Figure 17.** Principle of hydrothermal flow reactor

**Figure 18.** Pathways of elongation and degradation of oligopeptides

To investigate the role of minerals under hydrothermal conditions, a new type of flow reactor technique was established in our group [133]. The system consists of a mineral-packed column reactor placed in a high-temperature heating system. It enables hydrothermal reactions to be monitored at temperatures up to 300 °C on a time scale of 2 – 200 s in the presence of mineral particles. Using this method, it was discovered that the oligopeptide elongation from the alanine 4-mer to the 5-mer within 5 – 20 s is notably enhanced by naturally occurring carbonate minerals, such as calcite and dolomite.

#### **6. Unsolved drawbacks regarding the origin of building blocks for a primitive life-like system**

#### **6.1. General difficulties of oligomerization of nucleic acids and abiotic-peptides**

First, dehydration in aqueous solution is disadvantageous from the viewpoint of thermodynamics. As discussed for oligonucleotides and oligopeptides in the preceding sections, activation is the key technique for forming long oligomers. In the case of nucleotides, the activated nucleotide monomer, that is, the imidazolide of a nucleoside 5'-monophosphate, is a useful material. However, a prebiotic activation method compatible with the primitive earth environments has not been identified for peptide formation. Aminoacyl-tRNA is universally used in present organisms, and some minerals might have acted as activation or condensation agents for oligopeptide formation. However, these possibilities have not yet been extensively evaluated.

Second, it is generally true that the formation of short cyclic oligonucleotides [70, 72, 73, 134] and DKP [131] inhibits further elongation of oligomers (Figure 19). *In vivo*, cyclization is completely controlled by RNA polymerase and by peptide synthesis on ribosomes. Nucleic acid templates, mineral catalysts, and metal ion catalysts tend to inhibit the cyclization reaction. These additives probably promote a steric alignment of monomers that disfavors cyclization, although general rules for the inhibition of cyclization have not been elucidated. Indeed, in primitive oligomerization, less cyclic dinucleotide forms during oligoG formation in the Template-directed reaction, probably because of Watson-Crick type hydrogen bonding and base stacking. In contrast, the cyclic 2-mer, 3-mer, and 4-mer form fairly readily in Clay-catalyzed reactions. For peptides, elongation starting from the 4-mer or higher is effective in suppressing DKP formation. In present organisms, the alignment of amino acid monomers is supported by the machinery of the ribosome as well as by mRNA and tRNA. However, if the hydrothermal hypothesis is correct, oligopeptide formation would have occurred even under extreme conditions such as those of the submarine hydrothermal vent system. In such a system, the alignment of amino acids is assumed to be very difficult, since weak interactions among biomolecules would not operate at high temperatures.

**Figure 19.** Cyclization and elongation of prebiotic oligomers

Third, the primitive earth environments have not yet been successfully estimated. Furthermore, the heterogeneity of earth conditions should be considered. Improved knowledge of the primitive environment is essential, and simulation experiments under a variety of conditions would be helpful. Most simulation reactions for oligonucleotide formation have been carried out at fairly mild temperatures and pressures. By contrast, abiotic-peptide formation has been studied under a variety of conditions, from mild to hydrothermal. A wider range of simulation experiments should be carried out as long as the exact earth environments remain unidentified. General rules may be found in the future on the basis of such a variety of simulation experiments.

#### **6.2. Formation of simple oligomers to functional entities**

The genetic information flow in present organisms consists of DNA replication, transcription of the DNA sequence into mRNA, and translation of the mRNA sequence into protein. According to the RNA world hypothesis, the roles of DNA and protein were covered by RNA molecules, so the replication of RNA molecules is the first step in an RNA-based life-like system [135, 136]. Here, the possibility of prebiotic RNA replication is briefly discussed.

In a former section, the difficulty of the Template-directed reaction for the nucleotide bases A, U, and C on the complementary template was described. This would be due to the weak stacking between the activated monomers and the elongating oligomer. Thus, attempts have been made to enhance the association. It has been shown that an intercalator somewhat enhances the efficiency of oligonucleotide formation by enhancing the association [137]. However, practical replication conditions using primitive activated nucleotide monomers have not been identified [82, 135].

The biological functions of oligonucleotides and oligopeptides would appear only at a certain oligomer length, such as 30-mer, 50-mer, or 100-mer. Transfer RNA (tRNA), which consists of ca. 80 nucleotides, would be one of the smallest functional RNAs. On the other hand, it has recently been found that small non-coding RNAs and so-called microRNAs play important roles *in vivo*. In addition, short peptides are also known to have several important roles, while the average length of proteins is ca. 100-mer. Oligonucleotides of 40 – 50-mer can form in Template-directed and Clay-catalyzed reactions. This fact suggests that different types of functional RNA molecules could have been included in a large, randomly formed RNA pool on the primitive earth. Several functional RNA sequences would have been amplified by replication machinery, that is, machinery like the ribosome, in which some main parts are made of RNA molecules. As mentioned, the Template-directed reaction is efficient only for the formation of oligoG on a polyC template. However, the Template-directed reaction can proceed using an activated oligonucleotide as the monomer in the case of oligoU formation on a polyA template. If this type of Template-directed reaction is universal for different combinations of monomer and template, replication of RNA molecules could have evolved under primitive earth conditions. In addition, it was confirmed that a heterogeneous oligoC, formed spontaneously by the Clay-catalyzed reaction and containing both 2',5'- and 3',5'-linked isomers, can serve as a template for the Template-directed formation of oligoG [138]. On the other hand, oligopeptide formation from amino acid mixtures has not yet been elucidated, although some thermal condensation products of amino acids, such as proteinoids, were observed. The formation of amino acids is considered to be easier than that of nucleotide monomers, since amino acids are readily found among the products of gas reactions driven by different types of energy sources. However, this does not straightforwardly mean that oligopeptide formation from prebiotic materials is easier than oligonucleotide formation.
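The scale of the randomly formed RNA pool mentioned above can be gauged by counting the possible sequences of a given length. The short sketch below assumes four nucleotide bases and linear sequences only; the function name and the conversion to moles are illustrative additions, not part of the cited studies.

```python
AVOGADRO = 6.022e23  # molecules per mole

def sequence_space(length, alphabet_size=4):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length

for n in (30, 40, 50):
    count = sequence_space(n)
    # Amount of material needed to hold a single copy of every sequence.
    print(f"{n}-mer: {count:.2e} sequences, {count / AVOGADRO:.2e} mol for one copy of each")
```

The count grows by a factor of four with every added nucleotide, so a pool of 40 – 50-mers can sample only part of the sequence space, which is one reason why amplification and selection matter.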

The *in vitro* selection technique has been used extensively to create several types of artificial functional RNA molecules [139, 140]. This technique showed that an RNA replication system catalyzed by RNA molecules could have evolved if amplification and selection processes were present. Naturally, *in vitro* selection requires the tools and materials of molecular biology. However, if amplification based on the Template-directed reaction occurred under the primitive earth conditions, the amplified RNA molecules would have been exposed to different types of selection pressure. This would have caused the accumulation of functional RNA molecules, such as those with RNA replicase function, under the primitive earth conditions.
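The enrichment expected from repeated rounds of selection followed by amplification can be illustrated with a minimal numerical sketch. The model below is an assumption introduced here for illustration, not a procedure taken from the cited studies: each round is assumed to retain functional molecules `selection_factor` times more efficiently than non-functional ones, and amplification is assumed to copy all survivors equally, so that it restores the pool size without changing their ratio. The function name `enrich`, the initial fraction, and the selection factor are hypothetical.

```python
def enrich(fraction, selection_factor, rounds):
    """Return the functional fraction of the pool before and after each round.

    Assumed toy model: in every round functional molecules survive
    `selection_factor` times more efficiently than non-functional ones,
    and the survivors are then amplified equally (ratio unchanged).
    """
    history = [fraction]
    for _ in range(rounds):
        functional = fraction * selection_factor  # relative survivors, functional
        nonfunctional = 1.0 - fraction            # relative survivors, non-functional
        fraction = functional / (functional + nonfunctional)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: one functional molecule per 10^10, 100-fold selection.
    for n, f in enumerate(enrich(1e-10, 100.0, 8)):
        print(f"round {n}: functional fraction = {f:.3e}")
```

Under these assumptions the odds of finding a functional molecule are multiplied by the selection factor in every round, so even a very rare functional sequence would come to dominate the pool after a modest number of rounds.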



#### **6.3. Relationship between RNA and protein-like molecules**

Cooperation between RNA and proteins is essential in modern organisms. Thus, it should be evaluated whether the formation of RNA and that of peptides could have enhanced each other. According to previous studies, the Template-directed reaction is not enhanced in the presence of proteinoids [141]. On the other hand, possible roles of nucleotides and oligonucleotides in the formation of oligopeptides have not yet been well evaluated. Presumably, nucleotides and oligonucleotides would not have been helpful for abiotic peptide synthesis under hydrothermal conditions, since they are much less stable than amino acids and peptides [129, 135, 136, 142]. This implies that the chemical evolution of RNA would have started after the accumulation of oligopeptides.

#### **6.4. Accumulation of RNA molecules and abiotic-peptide under the primitive earth conditions**

According to the former sections, RNA could have formed from activated nucleotides in the absence of enzymes. Abiotic peptides also form without the translation system on the ribosome. However, the accumulation of oligonucleotides and oligopeptides would not have been easy under the primitive earth conditions, since these oligomers are spontaneously hydrolyzed in aqueous medium, although at low temperatures their rates of hydrolysis are normally slower than their rates of formation. The hydrolysis rate increases with increasing temperature.

The accumulation of oligomers should be considered from the viewpoint that a life-like system is a thermodynamically open system. That is, the accumulation of oligomers is determined by both their rate of formation and their rate of degradation. Thus, although the rate of degradation is large at high temperatures, accumulation would be possible if the formation rate is much larger than the degradation rate. The formation rate could be enhanced by several prebiotic catalysts. Experimental simulations of these conditions have not yet succeeded well, since most simulation experiments were carried out under static conditions. Some experiments in which activated monomers were fed to a batch reaction successfully showed the possible formation of long oligonucleotides and oligopeptides [75]. A flow reactor in which activated monomers are fed continuously would be useful for designing a chemical evolutionary system.

### **7. Concluding remarks: Possible future applications by learning the primitive oligomerization processes**

These studies support the view that oligomer formation in aqueous solution is possible in the absence of enzymes. The reaction processes differ from conventional organic synthetic techniques. Therefore, the formation reactions of oligomers that possess biological functions could be improved by learning from the chemistry of living organisms. Organic chemistry is now expected to become a technology compatible with global environmental protection. Conventional organic synthesis requires large amounts of organic solvent, which is unsuitable for environmental protection. In contrast, environmentally harmless organic synthetic methods are expected to be created by learning from living organisms. However, organic synthesis is generally difficult in aqueous medium. According to the examples shown here, oligonucleotides and oligopeptides can be synthesized in aqueous medium. Such fundamental research will help meet the demand for environmentally harmless organic synthesis.

#### **Acknowledgements**

This research was supported by a Grant-in-Aid for Exploratory Research, 24657166, from JSPS of Japan.

#### **Author details**

Kunio Kawamura\*

Address all correspondence to: kawamura@shudo-u.ac.jp

Department of Human Environmental Studies, Hiroshima Shudo University, Hiroshima, Japan

#### **References**

[1] Arrhenius S. Die Verbreitung des Lebens im Weltenraum. Die Umschau 1903; 7: 481-485.

[2] Crick FHC., Orgel LE. Directed Panspermia. Icarus 1973; 19(3): 341-346.

[3] Horneck G., Rettberg P., Reitz G., Wehner J., Eschweiler U., Strauch K., Panitz C., Starke V., Baumstark-Khan C. Protection of Bacterial Spores in Space, A Contribution to the Discussion on Panspermia. Origins Life Evol Biospheres 2001; 31: 527-547.

[4] Miller SL. A Production of Amino Acids under Possible Primitive Earth Conditions. Science 1953; 117: 528-529.

[5] Ferris JP., Joshi PC., Edelson EH., Lawless JG. HCN-Plausible Source of Purines, Pyrimidines and Amino-Acids on Primitive Earth. J Mol Evol 1978; 11: 293-311.

[6] Schlesinger G., Miller SL. Prebiotic Synthesis in Atmospheres Containing CH4, CO, and CO2. J Mol Evol 1983; 19: 376-382.

[7] Cleaves HJ., Chalmers JH., Lazcano A., Miller SL., Bada JL. A Reassessment of Prebiotic Organic Synthesis in Neutral Planetary Atmospheres. Origins Life Evol Biospheres 2008; 38(2): 105-115.

[8] Mojzsis SJ., Arrhenius G., McKeegan KD., Harrison TM., Nutman AP., Friend CRL. Evidence for Life on Earth before 3,800 Million Years Ago. Nature 1996; 384(7): 55-59.

[9] Van Kranendonk MJ. Volcanic Degassing, Hydrothermal Circulation and the Flourishing of Early Life on Earth: A Review of the Evidence from c. 3490-3240 Ma Rocks of the Pilbara Supergroup, Pilbara Craton, Western Australia. Earth Sci Rev 2006; 74(3-4): 197-240.

[10] Kuppers BO. Molecular Theory of Evolution. Outline of a Physico-chemical Theory of the Origin of Life. Berlin: Springer-Verlag; 1985.

[11] Schopf JW. Microfossils of the Early Archean Apex Chert: New Evidence of the Antiquity of Life. Science 1993; 260: 640-646.

[12] Crick F. Central Dogma of Molecular Biology. Nature 1970; 227: 561-563.

[13] Honda S., Yamasaki K., Sawada Y., Morii H. 10 Residue Folded Peptide Designed by Segment Statistics. Structure 2004; 12: 1507-1518.

[14] Cech TR., Zaug AJ., Grabowski PJ. In Vitro Splicing of the Ribosomal RNA Precursor of Tetrahymena: Involvement of a Guanosine Nucleotide in the Excision of the Intervening Sequence. Cell 1981; 27: 487-496.

[15] Gilbert W. Origin of life: the RNA world. Nature 1986; 319: 618-618.

[16] Cech TR. A model for the RNA-catalyzed replication of RNA. Proc Natl Acad Sci USA 1986; 83: 4360-4363.

[17] Lazcano A. The Biochemical Roots of the RNA World: from Zymonucleic Acid to Ribozymes. Hist. Phil. Life Sci. 2012; 34(3): 407-423.

[18] Nemoto N., Husimi Y. A Model of the Virus-type Strategy in the Early Stage of Encoded Molecular Evolution. J. Theor Biol 1995; 176: 67-77.

[19] Kawamura K. Civilization as a Biosystem Examined by the Comparative Analysis of Biosystems. BioSystems 2007; 90: 139-150.

[20] Gomes R., Levison HF., Tsiganis K., Morbidelli A. Origin of the cataclysmic Late Heavy Bombardment period of the terrestrial planets. Nature 2005; 435(7041): 466-469.

[21] Maher KA., Stevenson DJ. Impact Frustration of the Origin of Life. Nature 1988; 331(6157): 612-614.

[22] Sleep NH., Zahnle KJ., Kasting JF., Morowitz HJ. Annihilation of Ecosystems by Large Asteroid Impacts on the Early Earth. Nature 1989; 342(6246): 139-142.

[23] Sagan C., Mullen G. Earth and Mars: Evolution of Atmospheres and Surface Temperatures. Science 1972; 177(4043): 52-56.

[24] Newman MJ., Rood RT. Implications of Solar Evolution for the Earth's Early Atmosphere. Science 1977; 198(4321): 1035-1037.

[25] Walker JCG. Carbon-dioxide on the Early Earth. Origins Life Evol Biosphere 1985; 16(2): 117-127.

[26] Kasting JF., Ackerman TP. Climatic Consequences of Very High-carbon Dioxide Levels in the Earth's Early Atmosphere. Science 1986; 234(4782): 1383-1385.

[27] Kasting JF. Earth's early atmosphere. Science 1993; 259(5097): 920-926.

[28] Mojzsis SJ., Harrison TM., Pidgeon RT. Oxygen-isotope Evidence from Ancient Zircons for Liquid Water at Earth's Surface 4,300 Myr Ago. Nature 2001; 409: 178-181.

[29] Olivas WM., Muhlrad D., Parker R. Analysis of the Yeast genome: Identification of New Non-coding and Small ORF-containing RNAs. Nucl Acid Res 1997; 25(22): 4619-4625.

[30] Eigen M. Selforganization of Matter and the Evolution of Biological Macromolecules. Naturwissenschaften 1971; 58: 465-523.

[31] Orgel LE., Crick FHC. Anticipating an RNA World. Some Past Speculations on the Origin of Life: Where are They Today? FASEB J 1993; 7: 238-239.

[32] Chamberlin M., Berg P. Deoxyribonucleic Acid-directed Synthesis of Ribonucleic Acid by an Enzyme from Escherichia Coli. Proc Natl Acad Sci USA 1962; 48(2): 81-94.

[33] Haruna I., Spiegelman S. An RNA "Replicase" Induced by and Selective for a Viral RNA: Isolation and Properties. Proc Natl Acad Sci USA 1963; 50(2): 905-911.

[34] Grunberg-Manago M. Enzymatic Synthesis of Nucleic Acids. Ann Rev Biochem 1962; 31: 301-332.

[35] Kawamura K., Kuranoue K., Umehara M. Chemical Evolution of RNA and RNA Polymerases: Implications from Search for the Prebiotic Pathway of Formation of RNA from Adenosine 5'-Triphosphate in the Presence of Thermal Condensation Products of Amino Acids as Primitive Enzymes. Viva Origino 2002; 30(3): 123-134.

[36] Sulston J., Lohrmann R., Orgel LE., Todd Miles H. Nonenzymatic Synthesis of Oligoadenylates on a Polyuridylic Acid Template. Proc Natl Acad Sci USA 1968; 59: 726-733.

[37] Sulston J., Lohrmann R., Orgel LE., Todd Miles H. Specificity of Oligonucleotide Synthesis Directed by Polynucleic Acid. Proc Natl Acad Sci USA 1968; 60: 409-415.

[38] Lohrmann R., Orgel LE. Prebiotic Synthesis: Phosphorylation in Aqueous Solution. Science 1968; 161(3836): 64-66.

[39] Sulston J., Lohrmann R., Orgel LE., Schneider-Bernloehr H., Todd Miles H. Non-enzymatic Synthesis of Deoxyadenylate Oligonucleotides on a Polyuridylate Template. J Mol Biol 1969; 37: 151-155.

[40] Sulston J., Lohrmann R., Orgel LE., Schneider-Bernloehr H., Todd Miles H. Non-enzymatic Oligonucleotide Synthesis on a Polycytidylate Template. J Mol Biol 1969; 40: 227-234.

[41] Renz M., Lohrmann R., Orgel LE. Catalyst for the Polymerization of Adenosine Cyclic 2',3'-Phosphate on a Poly(U) Template. Biochim Biophys Acta 1971; 240: 463-471.

[42] Verlander MS., Lohrmann R., Orgel LE. Catalysts for the Self-Polymerization of Adenosine Cyclic 2',3'-Phosphate. J Mol Evol 1973; 2: 303-316.

[43] Powner MW., Gerland B., Sutherland JD. Synthesis of Activated Pyrimidine Ribonucleotides in Prebiotically Plausible Conditions. Nature 2009; 459: 239-242.

[44] Costanzo G., Pino S., Ciciriello F., Di Mauro E. Generation of Long RNA Chains in Water. J Biol Chem 2012; 284(48): 33206-33216.

[45] Ibanez JD., Kimball AP., Oró J. Possible Prebiotic Condensation of Mononucleotide by Cyanamide. Science 1971; 173: 444-445.

[46] Schneider-Bernloehr H., Lohrmann R., Sulston JE., Orgel LE., Todd Miles H. Specificity of Template-directed Synthesis with Adenine Nucleotides. J Mol Biol 1970; 47: 257-260.

[47] Weimann BJ., Lohrmann R., Orgel LE., Schneider-Bernloehr H., Sulston JE. Template-Directed Synthesis with Adenosine-5'-phosphorimidazolide. Science 1968; 161(3839): 387-387.

[48] Orgel LE., Lohrmann R. Prebiotic Chemistry and Nucleic Acid Replication. Acc Chem Res 1974; 7: 368-377.

[49] Lohrmann R., Orgel LE. Studies of Oligoadenylate Formation on a Poly(U) Template. J Mol Evol 1979; 12: 237-257.

[50] Sawai H., Orgel LE. Oligonucleotide Synthesis Catalyzed by the Zn2+ Ion. J Am Chem Soc 1975; 97(12): 3532-3533.

[51] Lohrmann R., Orgel LE. Efficient catalysis of polycytidylic acid-directed oligoguanylate formation by Pb2+. J Mol Biol 1980; 142(4, 5): 555-567.

[52] Bridson PK., Orgel LE. Catalysis of accurate poly(C)-directed synthesis of 3'-5'-linked oligoguanylates by Zn2+. J Mol Biol 1980; 144(4): 567-577.

[53] Inoue T., Orgel LE. Oligomerization of (Guanosine 5'-Phosphor)-2-methylimidazolide on Poly(C): An RNA Polymerase Model. J Mol Biol 1982(1); 201-207.

[54] Lohrmann R. Formation of Nucleoside 5'-Phosphoramidates under Potentially Prebiological Conditions. J Mol Evol 1977; 10: 137-154.

[55] Lohrmann R., Orgel LE. Prebiotic Activation Processes. Nature 1973; 244: 418-420.

[56] Kanavarioti A., Bernasconi CF., Doodokyan DL., Alberas DJ. Magnesium-ion Catalyzed P-N Bond Hydrolysis in Imidazolide-Activated Nucleotides-Relevance to Template-directed Synthesis of Polynucleotides. J Am Chem Soc 1989; 111(18): 7247-7257.

[57] Joyce GF., Inoue T., Orgel LE. Non-enzymatic template-directed synthesis on RNA random copolymers: Poly(C, U) templates. J Mol Biol 1984; 176(2): 279-306.

[58] Sawai H. Oligonucleotide Formation Catalyzed by Divalent Metal Ions. The Uniqueness of the Ribosyl System. J Mol Evol 1988; 27(3): 181-186.

[59] Sawai H., Shibata T., Ohno M. Preparation of Oligoadenylates with 2'-5' Linkage Using Pb2+ Ion Catalyst. Tetrahedron 1981; 37(3): 481-485.

[60] Sawai H., Kuroda K., Hojo T. Uranyl Ion as a Highly Effective Catalyst for Internucleotide Bond Formation. Bull Chem Soc Jpn 1989; 62(2): 2018-2023.

[61] Sawai H., Higa K., Kuroda K. Synthesis of Cyclic and Acyclic Oligocytidylates by Uranyl Ion Catalyst in Aqueous Solution. J Chem Soc Perkin Trans 1 1992; (4): 505-508.

[62] Kawamura K., Maeda J. Kinetic Analysis of Oligo(C) Formation from the 5'-Monophosphorimidazolide of Cytidine with Pb(II) Ion Catalyst at 10-75 °C. Origins Life Evol Biospheres 2007; 37: 153-165.

[63] Bernal JD. The Physical Basis of Life. Proc R Soc Lond A 1949; 62: 537-558.

[64] Ponnamperuma C., Shimoyama A., Friebele E. Clay and the Origin of Life. Origins Life 1982; 12: 9-40.

[65] Ferris JP., Huang CH., Hagan WJ. Montmorillonite: A Multifunctional Mineral Catalyst for the Prebiological Formation of Phosphate Esters. Origins Life Evol Biosphere 1988; 18(1-2): 121-133.

[66] Ferris JP., Ertem G., Agarwal VK. The Adsorption of Nucleotides and Polynucleotides on Montmorillonite Clay. Origins Life Evol Biosphere 1989; 19: 153-164.


[67] Ferris JP., Ertem G. Oligomerization of Ribonucleotides on Montmorillonite: Reaction of the 5'-Phosphorimidazolide of Adenosine. Science 1992; 257(5075): 1387-1389.

[68] Ferris JP., Ertem G. Montmorillonite Catalysis of RNA Oligomer Formation in Aqueous Solution. A Model for the Prebiotic Formation of RNA. J Am Chem Soc 1993; 115(26): 12270-12275.

[69] Kawamura K., Ferris JP. Kinetics and Mechanistic Analysis of Dinucleotide and Oligonucleotide Formation from the 5'-Phosphorimidazolide of Adenosine on Na+-Montmorillonite. J Am Chem Soc 1994; 116(17): 7564-7572.

[70] Kawamura K., Ferris JP. Clay Catalysis of Oligonucleotide Formation: Kinetics of the Reaction of the 5'-Phosphorimidazolides of Nucleotides with the Non-basic Heterocycles Uracil and Hypoxanthine. Origins Life Evol Biosphere 1999; 29(6): 563-591.

[71] Ertem G., Ferris JP. Template-directed Synthesis Using the Heterogeneous Templates Produced by Montmorillonite Catalysis. A Possible Bridge between the Prebiotic and RNA Worlds. J Am Chem Soc 1997; 119(31): 7197-7201.

[72] Kawamura K., Nakahara N., Okamoto F., Okuda N. Temperature Dependence of the Cyclization of Guanine and Cytosine Mix Hexanucleotides with Water-soluble Carbodiimide at 0-75 °C. Viva Origino 2003; 31(4): 221-232.

[73] Kawamura K., Okamoto F., Okuda N. Influence of Template Oligonucleotides on the Condensation of Oligonucleotides in the Presence of Water-soluble Carbodiimide: Search for Model Reactions of the Formation of RNA Which Could Be Effective at High Temperatures. Viva Origino 2004; 32: 68-80.

[74] Ertem G., Ferris JP. Formation of RNA Oligomers on Montmorillonite: Site of Catalysis. Origins Life Evol Biosphere 1998; 28: 485-499.

[75] Ferris JP., Hill AR., Liu RH., Orgel LE. Synthesis of Long Prebiotic Oligomers on Mineral Surfaces. Nature 1996; 381: 59-61.

[76] Huang WH., Ferris JP. Synthesis of 35-40 Mers of RNA Oligomers from Unblocked Monomers. A Simple Approach to the RNA World. Chem Comm 2003; 12: 1458-1459.

[77] Huang WH., Ferris JP. One-step, Regioselective Synthesis of up to 50-mers of RNA Oligomers by Montmorillonite Catalysis. J Am Chem Soc 2006; 128(27): 8914-8919.

[78] Lohrmann R., Bridson PK., Orgel LE. Efficient Metal-ion Catalyzed Template-Directed Oligonucleotide Synthesis. Science 1980; 208(4451): 1464-1465.

[79] Inoue T., Orgel LE. A nonenzymatic RNA polymerase model. Science 1983; 219(4586): 859-862.

[80] Wu TF., Orgel LE. Nonenzymatic Template-Directed Synthesis on Oligodeoxycytidylate Sequences in Hairpin Oligonucleotides. J Am Chem Soc 1992; 114(1): 317-322.

[81] Sawai H., Wada K. Nonenzymatic Template-Directed Condensation of Short-chained Oligouridylates on a Poly(A) Template. Origins Life Evol Biosphere 2000; 30: 503-511.

[82] Hill Jr AR., Orgel LE., Wu T. The Limits of Template-directed Synthesis with Nucleoside-5'-Phosphoro(2-Methyl)Imidazolides. Origins Life Evol Biosphere 1993; 23(5-6): 285-290.

[83] Szostak JW., Bartel DP., Luisi PL. Synthesizing Life. Nature 2001; 409: 387-390.

[84] Mansy SS., Schrum JP., Krishnamurthy M., Tobe S., Treco DA., Szostak JW. Template-directed Synthesis of a Genetic Polymer in a Model Protocell. Nature 2008; 454: 122-126.

[85] Kanavarioti A., Bernasconi CF., Alberas DJ., Baird EE. Kinetic Dissection of Individual Steps on the Poly(C)-Directed Oligoguanylate Synthesis from Guanosine 5'-Monophosphate 2-Methylimidazolide. J Am Chem Soc 1993; 115(19): 8537-8546.

[86] Inoue T., Orgel LE. Substituent Control of the Poly(C)-Directed Oligomerization of Guanosine 5'-Phosphorimidazolide. J Am Chem Soc 1981; 103: 7666-7667.

[87] Prabahar KJ., Cole TD., Ferris JP. Effect of Phosphate Activating Group on Oligonucleotide Formation on Montmorillonite-the Regioselective Formation of 3',5'-Linked Oligoadenylates. J Am Chem Soc 1994; 116(24): 10914-10920.

[88] Prabahar KJ., Ferris JP. Adenine Derivatives as Phosphate-activating Groups for the Regioselective Formation of 3',5'-Linked Oligoadenylates on Montmorillonite: Possible Phosphate-activating Groups for the Prebiotic Synthesis of RNA. J Am Chem Soc 1997; 119(19): 4330-4337.

[89] Loew O. Ueber Formaldehyd und Dessen Condensation. J Prak Chem 1886; 321: 321-351.

[90] Breslow R. On the Mechanism of the Formose Reaction. Tetrahedron Lett 1959; 21: 22-26.

[91] Eschenmoser A., Loewenthal E. Chemistry of Potentially Prebiological Natural-products. Chem Soc Rev 1992; 21(1): 1-16.

[92] Dueholm KL., Egholm M., Behrens C., Christensen L., Hansen HF., Vulpius T., Petersen KH., Berg RH., Nielsen PE., Buchardt O. Synthesis of Peptide Nucleic-acid Monomers Containing the 4 Natural Nucleobases-Thymine, Cytosine, Adenine, and Guanine and their Oligomerization. J Org Chem 1994; 59(19): 5767-5773.

[93] Joyce GF. The Antiquity of RNA-based Evolution. Nature 2002; 418: 214-221.

[94] Pitsch S., Krishnamurthy R., Bolli M., Wendeborn S., Holzner A., Minton M., Lesueur C., Schlonvogt I., Jaun B., Eschenmoser A. Pyranosyl-RNA (P-RNA)-Base-Pairing Selectivity and Potential to Replicate-Preliminary Communication. Helv Chim Acta 1995; 78(7): 1621-1635.


[95] Bolli M., Micura R., Eschenmoser A. Pyranosyl-RNA: Chiroselective self-assembly of base sequences by ligative oligomerization of tetranucleotide-2',3'-cyclophosphates (with a commentary concerning the origin of biomolecular homochirality). Chem & Biol 1997; 4(4): 309-320.

[96] Joyce GF., Schwartz AW., Miller SL., Orgel LE. The Case for an Ancestral Genetic System Involving Simple Analogs of the Nucleotides. Proc Natl Acad Sci USA 1987; 84: 4398-4402.

[97] Joshi PC., Pitsch S., Ferris JP. Homochiral Selection in the Montmorillonite-Catalyzed and Uncatalyzed Prebiotic Synthesis of RNA. Chem Commun 2000; 2497-2498.

[98] Urata H., Ando C., Ohmoto N., Shimamoto Y., Kobayashi Y., Akagi M. Efficient and Homochiral Selective Oligomerization of Racemic Ribonucleotides on Mineral Surface. Chem Lett 2001; 324-325.

[99] Osawa K., Urata H., Sawai H. Chiral Selection in Oligoadenylate Formation in the Presence of a Metal Ion Catalyst or Poly(U) Template. Origins Life Evol Biosphere 2005; 35: 213-223.

[100] Nisbet EG. Origin of life: RNA and Hot-water Springs. Nature 1986; 322: 206-206.

[101] Pace NR. Origin of Life - Facing up to the Physical Setting. Cell 1991; 65: 531-533.

[102] Kawamura K., Umehara M. Kinetic Analysis of the Temperature Dependence of the Template-directed Formation of Oligoguanylate from the 5'-Phosphorimidazolide of Guanosine on a Poly(C) Template with Zn2+. Bull Chem Soc Jpn 2001; 74(5): 927-935.

[103] Kawamura K., Maeda J. Kinetic Analysis of Oligo(C) Formation from the 5'-Monophosphorimidazolide of Cytidine with Pb(II) Ion Catalyst at 10-75 °C. Origins Life Evol Biospheres 2007; 37: 153-165.

[104] Kawamura K., Maeda J. Kinetics and Activation Parameter Analysis for the Prebiotic Oligocytidylate Formation on Na+-Montmorillonite at 0-100 °C. J Phys Chem A 2008; 112: 8015-8023.

[105] Yakovchuk P., Protozanova E., Frank-Kamenetskii MD. Base-stacking and Base-pairing Contributions into Thermal Stability of the DNA Double Helix. Nucl Acid Res 2006; 34(2): 564-574.

[106] Blochl E., Rachel R., Burggraf S., Hafenbradl D., Jannasch HW., Stetter KO. Pyrolobus Fumarii, Gen. and Sp. Nov., Represents a Novel Group of Archaea, Extending the Upper Temperature Limit for Life to 113 °C. Extremophiles 1997; 1: 14-21.

[107] Cowan DA. The Upper Temperature for Life-Where Do We Draw the Line? Trends Microbiol 2004; 12: 58-60.

[108] Fox SW., Harada K. Thermal Copolymerization of Amino Acids to a Product Resembling Protein. Science 1958; 128(3333): 1214-1215.

[109] Fox SW., Harada K. The Thermal Copolymerization of Amino Acids Common to Protein. J Am Chem Soc 1960; 82: 3745-3751.

[110] Harada K., Fox SW. The Thermal Copolymerization of Aspartic Acid and Glutamic Acid. Arch Biochem Biophys 1960; 86(2): 274-280.

[111] Fox SW., Krampitz G. Catalytic Decomposition of Glucose in Aqueous Solution by Thermal Proteinoids. Nature 1964; 203: 1362-1364.

[112] Nakashima T., Fox SW. Selective Condensation of Aminoacyl Adenylates by Nucleoproteinoid Microparticles. Proc Natl Acad Sci USA 1972; 69(1): 106-108.

[113] Fox SW. Metabolic Microspheres Origins and Evolution. Naturwissenschaften 1980; 67: 378-383.

[114] Shock EL. Stability of Peptides in High-temperature Aqueous-solutions. Geochim Cosmochim Acta 1992; 56(9): 3481-3491.

[115] Rode BM., Schwendinger MG. Copper-catalyzed Amino Acid Condensation in Water - A Simple Possible Way of Prebiotic Peptide Formation. Origins Life Evol Biosphere 1990; 20(5): 401-410.

[116] Bujdak J., Rode BM. Silica, Alumina, and Clay-Catalyzed Alanine Peptide Bond Formation. J Mol Evol 1997; 45: 457-466.

[117] Rode BM. Peptides and the Origin of Life. Peptides 1999; 20(6): 773-786.

[118] Meng M., Stievano L., Lambert JF. Adsorption and Thermal Condensation Mechanisms of Amino Acids on Oxide Supports. 1. Glycine on Silica. Langmuir 2004; 20(3): 914-923.

[119] Leman L., Orgel LE., Ghadiri MR. Carbonyl Sulfide-mediated Prebiotic Formation of Peptides. Science 2004; 306(5694): 283-286.

[120] Lambert J-F. Adsorption and Polymerization of Amino Acids on Mineral Surfaces: A Review. Origins Life Evol Biospheres 2008; 38: 211-242.

[121] Forterre P. A Hot Topic: the Origin of Hyperthermophiles. Cell 1996; 85: 789-792.

[122] Galtier N., Tourasse N., Gouy M. A Nonhyperthermophilic Common Ancestor to Extant Life Forms. Science 1999; 283: 220-221.

[123] Holm NG. Marine Hydrothermal Systems and the Origin of Life. Origins Life Evol Biosphere 1992; 22: 1-242.

[124] Baross JA., Hoffman SE. Submarine Hydrothermal Vents and Associated Gradient Environments as Sites for the Origin and Evolution of Life. Origins Life Evol Biosphere 1985; 15(4): 327-345.

[125] Yanagawa H., Kojima K. Thermophilic Microspheres of Peptide-Like Polymers and Silicates Formed at 250 °C. J Biochem 1985; 97: 1521-1524.


[139] Tuerk C., Gold L. Systematic Evolution of Ligands by Exponential Enrichment: RNA Ligands to Bacteriophage T4 DNA Polymerase. Science 1990; 249: 505-510.

[140] Ellington AD., Szostak JW. In Vitro Selection of RNA Molecules That Bind Specific Ligands. Nature 1990; 346: 818-822.

[141] Kawamura K., Kuranoue K., Nagahama M. Prebiotic Inhibitory Activity of Protein-like Molecules to the Template-directed Formation of Oligoguanylate from Guanosine 5'-Monophosphate 2-Methylimidazolide on a Polycytidylic Acid Template. Bull Chem Soc Jpn 2004; 77(7): 1367-1375.

[142] Kawamura K., Yukioka M. Kinetics of the Racemization of Amino Acids at 225-275 °C Using a Real-time Monitoring Method of Hydrothermal Reactions. Thermochimica Acta 2001; 375(1-2): 9-16.

[126] Imai E., Honda H., Hatori K., Brack A., Matsuno K. Elongation of Oligopeptides in a Simulated Submarine Hydrothermal System. Science 1999; 283: 831-833.

[127] Imai E., Honda H., Hatori K., Matsuno K. Autocatalytic Synthesis of Oligoglycine in a Simulated Submarine Hydrothermal System. Origin Life Evol Biosphere 2000; 29:

[128] Kawamura K. Monitoring of Hydrothermal Reactions In 3 ms Using Fused-silica Ca‐

[129] Kawamura K. Monitoring Hydrothermal Reactions on the Millisecond Time Scale Using a Micro-tube Flow Reactor and Kinetics of ATP Hydrolysis for the RNA World

[130] Kawamura K. Development of Micro-flow Hydrothermal Monitoring Systems and their Applications to the Origin of Life Study on Earth. Anal Sci 2011; 27(7): 675-683.

[131] Kawamura K., Nishi T., Sakiyama T. Consecutive Elongation of Alanine Oligopepti‐ des at the Second Time Range Under Hydrothermal Condition Using a Micro Flow

[132] Kawamura K., Shimahashi M. One-step Formation of Oligopeptide-like Molecules from Glu and Asp in Hydrothermal Environments. Naturwissenschaften 2008; 95:

[133] Kawamura K., Takeya H., Kushibe T., Koizumi Y. Mineral-enhanced Hydrothermal Oligopeptide Formation at the Second Time Scale. Astrobiology 2011; 11: 461-469.

[134] Horowitza ED., Engelharta AE., Chena MC., Quarlesa KA., Smitha MW., Lynna DG., Hud NV. Intercalation as a Means to Suppress Cyclizationand Promote Polymeriza‐ tion of Base-pairing Oligonucleotides in a Prebiotic World. Proc Natl Acad Sci USA

[135] Kawamura K. Drawbacks of the Ancient RNA-based Life-like System Under Primi‐

[136] Kawamura K. Reality of the Emergence of Life-like Systems from Simple Prebiotic Polymers on Primitive Earth. In: Seckbach J., Gordon R. (ed.) Genesis-in the Begin‐ ning: Precursors of Life, Chemical Models and Early Biological Evolution. Springer;

[137] Hud NV., Jaina SS., Lia X., Lynn DG. Addressing the Problems of Base Pairing and Strand Cyclization in Template-Directed Synthesis A Case for the Utility and Neces‐ sity of 'Molecular Midwives' and Reversible Backbone Linkages for the Origin of

[138] Ertem G., Ferris JP. Synthesis of RNA Oligomers on Heterogeneous Templates. Na‐

249-259.

208 Oligomerization of Chemical and Biological Compounds

449-454.

2010; 107(12): 5288-5293.

2012. p123-144.

ture 1996; 379: 238-240.

pillary Tubing. Chem Lett 1999; 125-126.

Hypothesis. Bull Chem Soc Jpn 2000; 73: 1805-1811.

Reactor System. J Am Chem Soc 2005; 127: 522-523.

tive Earth Conditions. Biochimie 2012; 94(7): 1441-1450.

proto-RNA. Chem & Biodiversity 2007; 4: 768-783.


**Section 2**

**Biological Oligomers**

**Chapter 7**


## **Oligomerization of Biomacromolecules – Example of RNA Binding Sm/LSm Proteins**

Bozidarka L. Zaric

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57592

#### **1. Introduction**

#### **1.1. A multitude of RNAs**

The central dogma of molecular biology states that genetic information flows in one direction only, from DNA to proteins via an intermediate called messenger ribonucleic acid (mRNA) [1,2,3]. Originally, ribonucleic acid (RNA) was thought to have roles in information transfer and structure maintenance. Today, we know that RNA performs a remarkable range of functions in the living cell: control of gene expression, chromosome-end maintenance, housekeeping activities, sorting of proteins in the cell and the definition of metazoan development [3]. Although enzymatic activity is mostly associated with proteins, it was shown in the early 1980s that RNA molecules can catalyze chemical reactions; RNAs with catalytic activity are called **ribozymes**. The discovery of ribozymes led to the hypothesis that RNA could have been the original molecule of life on Earth about four billion years ago: a biopolymer able to self-replicate that could both store information and catalyze chemical reactions. RNA would have been self-sufficient as the original molecule of life [4]. The discovery of the unexpectedly wide variety of functions carried out by RNA was accompanied by the identification of a multitude of further types of small, non-coding RNAs (small nuclear RNAs, small nucleolar RNAs, small interfering RNAs, micro RNAs), highlighting the versatility of RNA as a biochemical tool for the cell [2].

**rRNAs** are structural and catalytic elements of the ribosome. In the nucleolus of eukaryotic cells, more than 100 tandemly repeated units of rRNA genes are transcribed into long precursor transcripts [7,8]. Following transcription, pre-rRNA is cleaved to form mature rRNAs, which assemble with approximately 80 proteins to form the large and small ribosomal subunits prior to their export to the cytoplasm. **SnoRNAs (small nucleolar RNAs)** participate in both the modification and cleavage events that occur during ribosome biogenesis [5]. A second group of small RNAs are the **tRNAs**, which are essential in translation [8,10]. **Micro RNAs** are non-coding RNAs of 22-24 nucleotides in length. They down-regulate gene expression by attaching themselves to mRNAs, thereby preventing them from being translated into protein. Another type of non-coding RNA is the **small interfering RNA**. These small RNAs control the lifetime of mRNAs by interacting with them and labeling them for destruction [2, 4]. **piRNAs** are a large class of small non-coding RNAs that act together with piwi proteins. piRNAs have been linked to both epigenetic and post-transcriptional gene silencing of retrotransposons and other genetic elements in germline cells [11,12]. Telomerase is a complex of proteins and RNA that is responsible for maintaining the natural ends of chromosomes. Telomerase acts as a reverse transcriptase because its mechanism of action is to copy an RNA template into DNA [10].

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1. The RNA family** (Reprinted from reference 2)

Small nuclear RNAs are components of the macromolecular machinery (the spliceosome) that has a role in the maturation of mRNA. They are termed **U snRNAs** (uridyl-rich small nuclear RNAs). U1, U2, U4, U5, U7, U11 and U12 are synthesized in the nucleus by RNA polymerase II. They are then transported to the cytoplasm, where they associate with the U snRNP proteins (proteins that associate with uridyl-rich small nuclear RNAs to generate U snRNPs, uridyl-rich small nuclear ribonucleoprotein particles), and are subsequently re-imported into the nucleus [1, 13]. U6 and U8 snRNAs also belong to the class of small nuclear RNAs, but their synthesis, biogenesis and function differ from those of the U snRNAs mentioned above. U3 snRNA shares the common denomination, but it is found in the nucleolus, has a role in pre-rRNA processing and carries a C/D box motif, which technically makes it a member of the C/D class of snoRNAs [14].

#### **1.2. Structure of RNA and association with proteins**

DNA and RNA have similar covalent structures, the only differences being the change from a 2'-deoxyribose sugar to a ribose sugar and from a methyl group in thymine to a hydrogen in uracil. RNA has much wider biological activities and adopts a wider range of structures. DNA double helices preferentially assume the B-form structure in solution, whereas RNA double helices are found in the A-form. The **RNA A-form double helix** has a narrow and deep major groove, which prevents proteins from recognizing RNA in a manner analogous to the way they recognize DNA. An RNA molecule can locally adopt several types of secondary structure (bulges, hairpins, internal loops) [15, 16]. Eukaryotic mRNAs are almost always associated with RNA-binding proteins. RNA-binding proteins generally have a modular structure and contain RNA-binding domains of 70–150 amino acids that mediate RNA recognition. Three major classes of eukaryotic RNA-binding protein domains are known: the RNA-recognition motif (RRM), the double-stranded RNA-binding domain (dsRBD) and the K-homology (KH) domain [17].

#### **1.3. All mRNA processing steps are coupled**

Eukaryotic gene expression is a complex, stepwise process that begins with transcription (synthesis of pre-mRNA) [18]. Mature mRNAs are produced in the cell nucleus from the primary transcripts of coding genes (pre-mRNAs) by a series of processing events that include capping, splicing and 3' end polyadenylation. Mature mRNAs are then transported to the cytoplasm. All modification steps are coupled and influence each other. RNA polymerase II is a key molecular coordinator of these processing events, and its phosphorylation has a regulatory role [19, 20, 21].

#### **1.4. Removal of introns and the splicing reaction**

In 1977, a number of research groups discovered that the genes of higher organisms are often made up of coding base sequences (called exons) interrupted by non-coding base sequences (introns). During transcription, all parts of the gene are copied to form a strand of pre-mRNA. The introns are then removed and the exons stitched together so that the now continuous exons can be translated to produce a protein. This splicing of the pre-mRNA is a multistage process carried out by the spliceosome, which is among the most complex macromolecular machineries in the cell [22].

Splicing of precursors to mRNAs occurs in two steps, each involving a single transesterification reaction [23]. Assembly and function of the spliceosome require approximately 300 polypeptides and five snRNAs, not counting gene-specific RNA-binding factors [23]. There are two distinct types of spliceosome in most cells. The major-class, or U2-type, spliceosome is universal in eukaryotes, whereas the minor-class, or U12-type, spliceosome is not present in some organisms. The evolutionary relation between these two spliceosomes is uncertain.

#### **1.5. Types of introns**


The pre-mRNA contains conserved elements at its intron/exon boundaries that determine the proper sites for the splicing reaction (Figure 3). The 5' splice site contains a conserved consensus sequence, AG/GURAGU (R = purine; / denotes the exon/intron boundary). The branch site lies between 18 and 100 bases upstream of the 3' splice site and has the consensus CUR**A**Y in vertebrates (**A** is the branching nucleotide, Y is a pyrimidine). In higher eukaryotes, a polypyrimidine tract of variable length is often located between the branch site and the 3' splice site. The 3' splice site has the consensus YAG/R in mammals. This class of introns is spliced by the U2-type spliceosome. The U12-type introns have different consensus sequences and are spliced by the U12-type spliceosome [24]. The number of known U12-type introns is still very small. U12-type introns are present in many vertebrate, nematode, insect and plant species.

**Figure 2.** The two chemical steps of splicing (Reprinted from reference 23)

**Figure 3.** Splice site consensus sequences. Comparison of splice site consensus sequences for human U2-dependent and U12-dependent introns. The most conserved regions, the 5' splice site (5' SS), branch point (BP) and 3' splice site (3' SS), are shown with their consensus sequences (R = purine, Y = pyrimidine). The polypyrimidine tract often present in metazoan U2-dependent introns is indicated as (Py). (Reprinted from reference 25)
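The U2-type consensus sequences quoted above can be turned into simple search patterns. The following Python sketch is only an illustration and is not part of the cited work: the helper names (`consensus_to_regex`, `find_sites`) and the toy pre-mRNA are hypothetical, and an exact-match search is assumed, whereas real splice sites tolerate deviations from the consensus and are usually scored with position weight matrices.

```python
import re

# Degenerate symbols used in the text: R = purine (A/G), Y = pyrimidine (C/U).
DEGENERATE = {"R": "[AG]", "Y": "[CU]"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'AGGURAGU' into a regular expression."""
    return "".join(DEGENERATE.get(base, base) for base in consensus)


# Consensus sequences from the text, written without the exon/intron slash.
FIVE_PRIME_SS = consensus_to_regex("AGGURAGU")  # AG/GURAGU
BRANCH_SITE = consensus_to_regex("CURAY")       # CURAY, branching A at offset 3
THREE_PRIME_SS = consensus_to_regex("YAGR")     # YAG/R


def find_sites(rna: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all exact matches, overlaps included."""
    # A zero-width lookahead lets overlapping matches be reported as well.
    return [m.start() for m in re.finditer(f"(?={pattern})", rna.upper())]


# Hypothetical toy pre-mRNA: exon end, intron with branch site and
# polypyrimidine tract, then the start of the next exon.
toy_pre_mrna = "GGAG" + "GUAAGU" + "CCC" + "CUAAC" + "U" * 18 + "CAG" + "GCC"

if __name__ == "__main__":
    print("5' splice site starts:", find_sites(toy_pre_mrna, FIVE_PRIME_SS))
    print("branch site starts:   ", find_sites(toy_pre_mrna, BRANCH_SITE))
    print("3' splice site starts:", find_sites(toy_pre_mrna, THREE_PRIME_SS))
```

Because the motifs are short, an exact-match scan of a genomic sequence would return many false positives; the sketch is meant only to make the notation (R, Y and the boundary slash) concrete.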

#### **1.6. Small nuclear ribonucleoproteins, snRNPs**

Small ribonucleoproteins (RNPs) are tight complexes of one or more proteins with a short RNA molecule (usually 60-300 nucleotides). RNPs inhabit the nuclear and cytoplasmic compartments of the eukaryotic cell [25]. Those that reside in the nucleus, the small nuclear ribonucleoproteins (snRNPs), can themselves be divided into two families. There are **snRNPs** of the nucleoplasm, whose function lies in preparing messenger RNA for export into the cytoplasm. A different set of snRNPs, called **snoRNPs**, reside in the nucleolus [26]. The **nucleolus** is a structure in the nucleus composed of proteins and nucleic acids. Its function is to transcribe ribosomal RNA (rRNA) and combine it with proteins to form ribosomes. There are about 200 distinct kinds of snRNPs (they differ in their RNA or protein components), with abundances ranging from 10⁴ copies per cell (for snoRNPs) to over 10⁶ copies per cell (for the snRNPs of the major spliceosome). They generally play a role in gene expression. One exception is the telomerase snRNP, essential for genome maintenance, which is present in only a few copies per cell. Spliceosomes are formed around the pre-mRNA substrate by the successive assembly of five small nuclear ribonucleoprotein particles (snRNPs): U1, U2, U4, U5 and U6. Each of these particles is composed of a small nuclear RNA (snRNA), seven Sm core proteins common to all snRNPs (except for the U6 snRNP, which contains a related set of seven proteins, the Sm-like proteins) and several snRNP-specific proteins. The snRNPs play a central role in the process of splicing. They are responsible for the recognition of splice sites and the definition of exon/intron boundaries. These interactions are partially mediated through base pairing and are dynamic, so that the spliceosome complex changes during the process of splicing.

#### **1.7. U snRNP biogenesis**


Subsequent to transcription by RNA polymerase II and capping, pre-U1 snRNA assembles with several factors, including cap-binding proteins (CBP), a phosphorylated adaptor for RNA export (PHAX), Crm1 and Ran-GTP, which together mediate export of U1 snRNA to the cytoplasm. After export, Sm proteins interact with the U snRNAs to form the snRNP Sm core. This step is facilitated by the SMN complex (survival of motor neurons complex). The SMN complex is composed of the SMN protein and other proteins called Gemins (Gemins 2-7). Nuclear re-import is mediated by snurportin-1 (SPN1), which binds to the m3G cap structure of the snRNA. After import, these factors dissociate. The U1-specific proteins are imported independently into the nucleus, where assembly into the mature U1 snRNP occurs [27]. This pathway is shared by the U1, U2, U4 and U5 snRNPs.

#### **1.8. Spliceosome assembly**

Assembly of a spliceosome for excision of an intron requires recognition of sequences at the 5' splice site as well as at the branch site and the nearby 3' splice site. U1 snRNA binds to the 5' end of the intron through sequence complementarity. There are reports showing that the U1 snRNA recognizes the 5' splice site within a preassembled penta-snRNP complex [28]. The U2 snRNP complex associates with the branch region. Early snRNP/pre-mRNA complexes are preferentially committed to splicing as compared to free RNA and are therefore called **commitment complexes** (CCs). The process of U2 snRNP association is ATP-dependent, and four proteins are critical for recognition [29]. Subsequent to the binding of the U2 snRNP complex, a tri-snRNP complex containing U4/U6 snRNP and U5 snRNP associates in an ATP-dependent manner to form complex A2-1. It is likely that the U1 snRNA/pre-mRNA duplex dissociates at this stage. The 5' splice site sequence is probably paired on the intron side to U6 snRNA and on the exon side to U5 snRNA. The transition between complex A2-1 and A1 requires destabilization of at least the U4/U6 di-snRNA. As only three snRNAs (U2, U6 and U5) are associated with the spliceosome at the moment of catalysis, and as U5 snRNA pairing with exon sequences is not essential, the catalytic site is created by U6 snRNA, by U2 snRNA, or by both. The action of certain proteins is required for the transition to the second step in splicing. The catalytic site for the second step is created by U6 snRNA, U2 snRNA or associated proteins. The reannealing of the released U4 and U6 snRNPs and their association with U5 forms the U4/U6.U5 tri-snRNP complex, which is then ready to reassemble on another commitment complex.

**Figure 4.** The U snRNP biogenesis pathway (Reprinted from reference 27)

The classical view of spliceosome assembly has been challenged by Stevens et al. [30]. This group isolated from yeast a penta-snRNP complex which, when supplied with soluble components, does splice pre-mRNA.

**Figure 5.** A simplified view of the splicing process (Reprinted from reference 5)


#### **1.9. mRNA stabilization, degradation**

Regulation of mRNA decay rates is an important control point in determining the abundance of cellular transcripts. Some mRNAs have half-lives that are 100 times shorter than the cellular generation time, whereas others have half-lives spanning several cell cycles [21]. The poly(A) tail is important for the stabilization of mRNA. It interacts with the poly(A)-binding protein (PABP), which makes direct contact with a specific region of the translation initiation factor (eIF4E). The translation initiation factor (eIF4) interacts with the cap-binding proteins. In this way, a ternary complex (PABP, translation initiation factor and cap-binding protein, bound to the poly(A) tail) is formed which circularizes mRNA *in vitro*, promoting translation and stabilization of mRNAs [21]. Several sequence elements can regulate the mRNA turnover rate, either by promoting it (destabilizer elements) or by inhibiting it (stabilizer elements). Important elements are the A+U-rich elements (AREs), located in the 3' untranslated regions (UTRs) of mRNAs [31]. At least four different pathways of mRNA degradation have been reported in eukaryotic cells [32]. In most cases, degradation of the transcript begins with the shortening of the poly(A) tail at the 3' end of the mRNA. Shortening of the poly(A) tail is followed by removal of the 5' cap structure (decapping), which exposes the transcript to digestion by a 5' to 3' exonuclease. The family of LSm proteins is involved in the decapping step of mRNA degradation. Alternatively, transcripts can be degraded in the 3' to 5' direction after deadenylation, a process catalyzed by the exosome [33]. Another mRNA degradation pathway is **nonsense-mediated decay (NMD)**, which provides the strongest evidence for a link between translation and turnover [34].
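To make the link between decay rate and transcript abundance concrete, the sketch below assumes the simplest possible model, which is not taken from the chapter: transcripts are synthesized at a constant rate and degraded with first-order kinetics. Under that assumption the decay constant is ln 2 divided by the half-life, and the steady-state level is the synthesis rate divided by the decay constant. All function names and numbers are illustrative.

```python
import math


def decay_constant(half_life: float) -> float:
    """First-order decay constant k for a given half-life (k = ln 2 / t_half)."""
    return math.log(2) / half_life


def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of an mRNA pool left after `elapsed` time, with no new synthesis."""
    return 0.5 ** (elapsed / half_life)


def steady_state_level(synthesis_rate: float, half_life: float) -> float:
    """Steady-state abundance when constant synthesis balances first-order decay."""
    return synthesis_rate / decay_constant(half_life)


if __name__ == "__main__":
    # Two hypothetical transcripts made at the same rate (1 copy per minute)
    # whose half-lives differ 100-fold, as in the range quoted in the text.
    unstable = steady_state_level(synthesis_rate=1.0, half_life=3.0)
    stable = steady_state_level(synthesis_rate=1.0, half_life=300.0)
    print(f"unstable mRNA: {unstable:.1f} copies at steady state")
    print(f"stable mRNA:   {stable:.1f} copies at steady state")
    print(f"ratio:         {stable / unstable:.0f}")
```

In this model a 100-fold difference in half-life translates directly into a 100-fold difference in steady-state abundance at equal synthesis rates, which illustrates why decay is an effective control point.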

#### **2. Family of Sm-LSm proteins**

#### **2.1. Sm proteins, assembly of U1, U2, U4, U5 snRNPs**

The Sm proteins were first discovered as antigens targeted by so-called **anti-Sm antibodies** in a patient with a form of systemic lupus erythematosus (SLE), a debilitating autoimmune disease. They were named Sm proteins in honor of Stephanie Smith, a patient who suffered from SLE. Other proteins with very similar structures were subsequently discovered and named LSm proteins. The proteins common to the U1, U2, U4 and U5 snRNPs are named Sm proteins because they are recognized by anti-Sm autoantibodies (isolated from the serum of patients with autoimmune diseases) [35, 36]. Eight proteins (B', B, D1-D3, E, F and G) have been characterized in human cells. All of the Sm core proteins are encoded by separate genes [37], with the exception of B and B'. B and B' result from alternative splicing of a single gene (gene 6628, located on chromosome 20 at locus 20p13) and differ only in 11 amino acids at the C-terminus [38]. In neural tissues, SmN replaces SmB and SmB' [39]. Two sequence motifs, named Sm1 and Sm2, are found in all known Sm proteins and define the family [40]. The N-terminal Sm1 motif is composed of 32 amino acids. The Sm2 motif, located closer to the C-terminus, is shorter, spanning only 14 amino acids [35]. Sm motif 1 and Sm motif 2 are separated by a linker of variable length. The alignment of the sequences of human Sm proteins reveals a striking conservation of the two motifs. The majority of the Sm proteins have amino- or carboxy-terminal extensions.

**Figure 6.** Primary and secondary structure of Sm proteins (Reprinted from reference 41). Amino acid sequence alignment of the human Sm (D1, D2, D3, B/B', E, F, and G) proteins with secondary structure elements. Wavy line, helix; arrows, β strands. The β strands within the Sm1 and Sm2 motifs are colored blue and yellow, respectively. The β strands and interconnecting loops are numbered consecutively from the N terminus. The conserved Sm1 and Sm2 motifs are indicated and the conserved residues within these motifs are highlighted in blue (hydrophobic), grey (hydrophobic, less well conserved), orange (neutral polar), red (basic) and green (acidic).

Solved structures of members of this protein family (PDB codes: 1d3b, 1b34, 1hk9, 1h64, 1i8f, 1i4k, 1kq1, 3bw1, 1th7) show that the fold is highly conserved. It is defined by an N-terminal helix followed by a five-stranded anti-parallel β sheet. Strands β1, β2, and β3 are part of the Sm1 motif, whereas the Sm2 motif forms strands β4 and β5. The five-stranded β sheet is strongly bent in the middle, and the conserved hydrophobic residues form a hydrophobic core [41].

The Sm proteins bind to the Sm site of U snRNAs [42]. The Sm site consensus sequence (PuAU4-6GPu) has a central uridine-rich tract and flanking purines. *In vitro* studies reveal that the single-stranded U-rich region and the 5' adenosine of the Sm site play critical roles in Sm protein assembly. The uridine bases and the 2' hydroxyl groups collectively provide binding determinants [43]. In the absence of U snRNA, the seven Sm proteins form three stable subcomplexes (D3B or D3B', D1D2, and EFG). These subcomplexes then form a heptameric ring around the snRNA Sm site, and the resulting complex is termed the **Sm core**. snRNP core assembly is an ordered pathway that involves formation of a sub-core particle followed by formation of the full Sm core, which promotes cap hypermethylation and pre-snRNP import [44]. The Sm fold is necessary and sufficient for the formation of specific inter-subunit interactions. Biochemical results indicate that there is one copy of each Sm protein in the snRNP core domain and therefore support the heptameric ring model of the snRNP core domain [45]. None of the individual Sm proteins has a known RNA recognition motif, so another type of interaction with RNA must be involved. Crosslinking studies indicate that Sm motif 1 is responsible for interactions with RNA, and Sm motif 2 for protein-protein interactions [43].
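
The Sm site consensus quoted above (PuAU4-6GPu, where Pu is a purine) can be read as a simple pattern: one purine, an adenosine, four to six uridines, a guanosine and a closing purine. As a minimal sketch, and not a validated annotation tool, the following Python function scans an RNA string for that pattern. The function name `find_sm_sites` and the example sequence are hypothetical and chosen only for illustration; real snRNA Sm sites can deviate from the consensus.

```python
import re

# Consensus from the text: Pu A U(4-6) G Pu, with Pu = purine (A or G).
SM_SITE = re.compile(r"[AG]AU{4,6}G[AG]")


def find_sm_sites(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each consensus Sm site.

    DNA-style input is tolerated by converting T to U first.
    Matches are non-overlapping, as produced by re.finditer.
    """
    sequence = rna.upper().replace("T", "U")
    return [(m.start(), m.group()) for m in SM_SITE.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real snRNA.
    print(find_sm_sites("GGCAAUUUUUGACC"))
```

A pattern match of this kind only flags candidate sites. According to the text, binding also depends on the region being single-stranded and on the 2' hydroxyl groups, which a sequence pattern cannot capture.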

**Figure 7. Proposed Higher-Order Assembly of the Human Core snRNP Proteins**. The seven core Sm proteins (B/B', D1, D2, D3, E, F, and G) are arranged within the seven-membered ring based on the crystal structures of the D1D2 (1b34) and D3B (1d3b) complexes and pairwise interactions deduced from biochemical and genetic experiments. (Reprinted from reference 41)

Basic residues of human and yeast SmB, SmD1 and SmD3 are reported to be responsible for import of the Sm core particle [45]. *In vitro*, the snRNP core domain can be assembled from purified components [46]. Assembly of the spliceosomal class of snRNPs *in vivo* is an active process that is mediated by several factors, including the product of the SMN gene (survival of motor neurons gene). Mutations in the SMN gene are responsible for spinal muscular atrophy (SMA), an autosomal recessive disorder associated with loss of motor neurons [47]. The SMN protein is ubiquitously expressed in all tissues of metazoan organisms, reflecting the fact that it provides a fundamental activity required by all cells. The SMN protein is predominantly cytosolic, but it is also found in the nucleus, namely in a few spherical nuclear domains that overlap with the so-called Cajal bodies (where snRNPs and snoRNPs are localized). These spherical domains have been called Gemini of Cajal bodies (Gems). Proteins associated with the SMN protein are called Gemins. The SMN complex interacts *in vitro* with Sm and LSm proteins that contain symmetrically methylated RG (arginine-glycine) repeats [48]. Symmetrically methylated RG repeats of SmD1, SmD3 and LSm4 are generated by the action of the so-called methylosome [48, 49]. The SMN complex binds to the human hypermethylase (TGS1), which suggests that SMN may have a role in formation of the snRNA m3G cap structure. It has been proposed that after binding of SMN to the Sm core proteins, SMN promotes engagement of TGS1 with the m7G-capped snRNP particle. According to that model, SMN dissociates from the C-terminal part of the B/B' Sm proteins, followed by association of TGS1 with the Sm core. This step allows formation of the m3G cap. The association of Snurportin1 with the m3G cap can promote release of TGS1 and generate an import-competent snRNP [50]. According to these data, the SMN complex interacts with protein components of U snRNPs, but there are also reports [51] of sequence-specific interactions between U1 snRNA and the SMN complex. The nuclear localization signal of U snRNPs is composed of the U snRNA 2,2,7-trimethylguanosine (m3G) cap and the Sm core domain. Snurportin1 binds to the m3G cap but not to the Sm core. Snurportin1 has an N-terminal importin beta binding domain and a carboxy-terminal m3G cap binding domain [52]. The importin beta binding domain allows snRNP cargo to be imported in a Ran-independent fashion. After import of snRNPs into the nucleus, Snurportin1 dissociates from its cargo and is exported back into the cytoplasm using Crm1, a receptor for leucine-rich nuclear export signals [53]. The SMN complex not only mediates snRNP core assembly but is an integral component throughout snRNP core biogenesis in the cytoplasm. It cannot be excluded that SMN is actually the long-sought Sm core nuclear localization signal receptor [54].

**Figure 8.** Schematic model of the role of SMN in snRNP core biogenesis in the cytoplasm. (Reprinted from reference 54)

#### **2.2. LSm proteins**

Sm and Sm-like proteins are found in all kingdoms of life: Eukarya, Archaea and Bacteria. Because Archaea have been proposed to be related to the ancestor of the eukaryotic nuclear genome, the presence of these proteins in Archaea suggests that an LSm protein gene was present in the last common ancestor. Archaea harbour one or two genes which encode Sm motif-containing proteins. The *in vivo* functions of archaeal Sm proteins remain unknown, in contrast to those of the eukaryotic and bacterial homologs, even though a high-resolution structure from an archaeal system is known (PDB code 1ljo) [55]. LSm proteins have been identified in plants as well [56]. Eukaryotic genomes have more than 20 Sm/LSm genes each, corresponding to the LSm and Sm proteins which are components of Sm and LSm complexes. Database searches in the yeast genome revealed 16 Sm motif-containing proteins. Some Sm-like proteins were found to interact weakly with some Sm proteins, most probably via non-specific Sm domain interactions [57], but some of the LSm proteins interact with Sm proteins as part of the U7 snRNP. In yeast there are nine LSm proteins, in humans more than eight. Each of the human LSm proteins has one orthologue in yeast. Yeast LSm2p-LSm7p share sequence identity with human LSm2-LSm7 ranging from 41 to 62%. LSm9p appears to be present only in yeast. Yeast LSm8p aligns best with human LSm8 (26% identity). In addition, LSm proteins are highly conserved throughout all eukaryotic kingdoms, as the homologues in insect, nematode and plant databases share between 50 and 75% identity with their human counterparts. Each of the LSm proteins in humans can clearly be best aligned with one of the canonical Sm proteins. However, their sequence identities are not high enough to allow the conclusion that LSm proteins undergo the same protein-protein interactions as Sm proteins [58].
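
The identity percentages quoted above depend on how identity is counted. As a hedged illustration, the sketch below computes percent identity over the aligned, gap-free columns of two pre-aligned sequences. This is only one of several conventions in use and is not necessarily the one applied in references [57, 58]. The function name `percent_identity` and the two short sequences are hypothetical and do not correspond to real LSm proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must already be aligned, and therefore of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns in which both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    print(round(percent_identity("MKLV-AT", "MRLVGAT"), 1))
```

Dividing by the full alignment length, or by the length of the shorter sequence, would give different values for the same alignment, so identity figures from different sources are comparable only when the convention is known.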

**Figure 9.** Structural Alignment of Human Sm/LSm proteins

Similar to canonical Sm proteins, the LSm proteins are recognized by antibodies from patients suffering from systemic lupus erythematosus (SLE) [59]. Sm/LSm proteins always appear as homomeric (in prokaryotes) or heteromeric (in eukaryotes) ring-like multimers. These ring-shaped complexes, which generally contain six or seven subunits, are the functional unit of LSm proteins. All canonical Sm proteins are essential for vegetative growth of yeast, whereas depletion of individual LSm proteins in yeast has variable effects. In mouse embryos, LSm4-null zygotes survived to the blastocyst stage but died shortly afterwards [60].

#### **2.3. Role of LSm2-8 oligomers in U6 snRNP assembly**

The LSm2-8 complex was isolated from HeLa cell nuclear extract in an RNA-free form. Electron micrographs revealed a doughnut-shaped hetero-oligomer, similar to the Sm core snRNPs [58]. LSm proteins have a high affinity for single-stranded oligo-U, but they do not recognize the canonical Sm binding site. In yeast and humans, LSm2-8 forms a heteroheptameric ring around the 3' end of U6 snRNA, which consists of a U-rich tract. The Sm core RNP is extremely salt stable; in contrast, LSm-U6 snRNA dissociates at salt concentrations higher than 0.5 M or in the presence of competitor RNA, indicating that the LSm-U6 complex is less stable [58]. U6 snRNA has no conserved Sm site and does not associate with Sm proteins. Its biogenesis pathway differs in many respects from the U1, U2, U4 and U5 snRNP pathways: it is transcribed by RNA polymerase III and capped by γ-monomethyltriphosphate. The 3' end of pre-U6 snRNA is elongated during maturation and subsequently trimmed, leaving a 2',3'-cyclic phosphate in most organisms. The enzymes involved in this process are specific for U6 snRNA, and U6 snRNA does not leave the nucleus [61]. Mature U6 snRNA shows nucleoplasmic localization [62]. However, one study reports that U6 snRNA is present in the cytoplasmic compartment of mouse fibroblast cells [63]. This result would suggest that the LSm2-8 complex may act as a nuclear localization signal, but the cytoplasmic localization of the U6 snRNP is highly questionable. The actual function of the LSm2-8 complex associated with U6 snRNA appears to be connected to U6 snRNP assembly and function. Mutants with decreased levels of LSm2-8 show splicing defects that correlate with a reduced level of U6 snRNA. How the LSm2-8 complex affects the U6 snRNP remains unclear. One possibility is that LSm proteins facilitate conformational rearrangements during the splicing cycle, U4/U6 annealing and formation of the U4/U6/U5 tri-snRNP [64].

#### **2.4. Role of LSm proteins in protecting mRNA 3` end termini from degradation**

LSm proteins have additional roles apart from splicing. Yeast strains which lack LSm1-7p fail to grow at higher temperatures and accumulate mRNA shortened at the 3' end by 20-30 nucleotides. The simplest model proposes that the LSm1-7 complex binds to the mRNA and sterically inhibits endo- and exonucleases. Nuclear LSm2-8 binds to the 3' end of U6 snRNA, suggesting that LSm2-8 similarly protects the 3' end of U6 snRNA from degradation [65].

#### **2.5. Role of LSm oligomer proteins in U8 snoRNP organization**

U8 snoRNP is required for processing of the 5.8S and 28S rRNAs, which together with the 5S rRNA build up the large ribosomal subunit. In *Xenopus* extract, LSm2, 3, 4, 6, 7, and 8 are bound as a heterohexamer to the conserved third stem-loop sequences of U8 snoRNA [66].

#### **2.6. LSm oligomers as part of U7 snRNP**

Maturation of the 3' ends of the non-polyadenylated histone mRNAs occurs by endonucleolytic cleavage mediated by the U7 snRNP [67]. U7 snRNA contains a non-canonical Sm site. Purified U7 snRNP lacks the D1 and D2 proteins but has LSm10 (14 kDa) and LSm11 (50 kDa) instead [68].

#### **2.7. LSm protein oligomers in mRNA degradation**

Yeast two-hybrid assays reveal multiple interactions between the eight LSm proteins, suggesting the existence of more than one LSm protein complex. Each human LSm protein is capable of interacting with multiple other LSm proteins and splicing factors, such as prp24, prp4, and SmD1 [69]. Coprecipitation experiments demonstrated that LSm1p (LSmXp is the nomenclature for yeast LSm proteins) together with LSm2p-LSm7p forms a new seven-subunit complex [70, 71]. The LSm1-7 complex plays a role in mRNA degradation [72], whereas LSm2-8 has a role in the stabilization of the U6 snRNP. These two protein complexes thus have very different functions. LSm1p mutants accumulate full-length capped transcripts, but mutations in LSm1p do not stabilize mRNAs containing premature stop codons, suggesting that the LSm1-7 complex is not involved in NMD [71]. The function of the LSm1-7 complex is most likely to interact with the mRNA substrate and accelerate decapping. Decapping is mediated by a decapping enzyme consisting of Dcp1a, Dcp1b, and the catalytic subunit Dcp2. The LSm1-7 proteins are localized in discrete cytoplasmic foci, which also contain key decapping factors required for 5'-to-3' mRNA degradation. Coexpression of LSm proteins increases the number of foci [73]. LSm1 and LSm8 are closely related to each other and to the SmB protein. The 33 C-terminal amino acids of LSm1 are necessary but not sufficient for proper cellular localization of hLSm1 [73]. Finally, it has been demonstrated [73] that the foci are actual degradation centers, where mRNA degradation occurs. This suggests that the cytoplasm of cells is more organized than previously thought. The bacterial Hfq protein (PDB code 1hk9) chaperones RNA-RNA interactions, much as LSm proteins chaperone RNA-protein interactions, and can protect the 3' end of a transcript from exonucleolytic decay while encouraging degradation through other pathways [74].

**Figure 10.** Three different heptameric complexes contain Sm or LSm proteins. (Reprinted from reference 71)

A particularly interesting example of forming higher order complexes-oligomers is the Sm/LSm protein family (whose various complexes are described above), whose members are engaged in a variety of RNA processing events, forming complexes which differ sometimes only by one out of seven subunits. Another important aspect of the Sm/LSm protein family is that these proteins never occur in isolation; for proper functioning they require complex formation. Hence, the way to better understand Sm/LSm protein function is to study Sm/LSm complexes. It is difficult to determine the connection between the oligomeric state of a given

#### **2.8. LSm proteins in the processing of pre-tRNAs**

It has been reported that depletion of LSm proteins in yeast leads to strong accumulation of unspliced tRNA species. The absence of LSm proteins most probably alters the pattern of processing intermediates [75].

#### **2.9. LSm2-7 complex associated with snR5**

An LSm2-7 hexameric complex is found associated with snR5 in *Saccharomyces cerevisiae*. This RNA is a member of the class of snoRNAs that function in pseudouridylation of rRNA. The snR5-associated LSm complex may be a hexamer, but it cannot be excluded that one LSm protein in this complex is present in more than one copy, or that an as yet unidentified yeast protein associates with LSm2-7, thereby closing the heptameric ring [76]. LSm2-8 interacts with external 3' sequences on U6 snRNA. The LSm1-7 complex interacts with the 3' UTR of mRNA, but there could well be secondary structure elements between the LSm1-7 binding site and the mRNA 3' end. U8 snoRNA and snR5 bind LSm proteins via internal RNA sequences, suggesting that LSm rings can assemble onto the RNA. LSm proteins have a role in the biogenesis and function of at least a subset of nucleolar RNAs. One possibility is that LSm proteins assist base pairing of snoRNAs with their rRNA targets, to conduct pseudouridylation and ribose modifications.


**Figure 10.** Three different heptameric complexes contain Sm or LSm proteins, reprinted from [71]

A particularly interesting example of the formation of higher-order oligomeric complexes is the Sm/LSm protein family (whose various complexes are described above). Its members are engaged in a variety of RNA processing events and form complexes that sometimes differ by only one out of seven subunits. Another important aspect of the Sm/LSm protein family is that these proteins never occur in isolation; they require complex formation for proper functioning. Hence, the way to better understand Sm/LSm protein function is to study Sm/LSm complexes. It is difficult to determine the connection between the oligomeric state of a given protein and its function *in vivo*. Reconstitution *in vitro* of two human LSm complexes with seven subunits each, LSm1-7 and LSm2-8, has been described [77, 78, 79]. The LSm2-8 complex binds to the 3' end of U6 snRNA in the cell nucleus. The closely related cytoplasmic LSm1-7 complex binds to the 3' UTR of mRNAs destined for degradation. Remarkably, LSm1-7 differs from LSm2-8 only by the exchange of a single subunit, LSm1 for LSm8.
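The single-subunit difference between the two heptamers can be made explicit with a small set comparison. The following Python sketch is only an illustration: the subunit names come from the text, while the variable and function names (`LSM1_7`, `LSM2_8`, `compare_complexes`) are hypothetical and do not come from any published tool.

```python
# Subunit composition of the two human heptamers, as named in the text.
LSM1_7 = frozenset(f"LSm{i}" for i in range(1, 8))  # LSm1 ... LSm7
LSM2_8 = frozenset(f"LSm{i}" for i in range(2, 9))  # LSm2 ... LSm8


def compare_complexes(a, b):
    """Return (shared, only_in_a, only_in_b) as sorted lists of subunit names."""
    return sorted(a & b), sorted(a - b), sorted(b - a)


shared, only_1_7, only_2_8 = compare_complexes(LSM1_7, LSM2_8)
print("shared subunits: ", ", ".join(shared))
print("unique to LSm1-7:", ", ".join(only_1_7))
print("unique to LSm2-8:", ", ".join(only_2_8))
```

Representing each complex as a set of subunit names ignores the order of subunits around the ring, which is enough for comparing composition but not for describing neighbor contacts.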

Sequence comparisons of the yeast LSm protein family indicate that each canonical Sm protein has a corresponding LSm protein, with the exception of SmB, which aligns almost equally well with LSm1 and LSm8. Based on these sequence comparisons, co-expression vectors encoding the LSm homologs of the Sm sub-complexes (LSm2-3 for SmD1-D2, LSm4-8 for SmD3-B, and LSm5-6-7 for SmE-F-G) were constructed, and the proteins were expressed in bacteria [77]. LSm4 and LSm1 were over-expressed individually for the reconstitution of LSm1-7 [77].

The two heteroheptameric complexes were reconstituted from smaller building blocks: LSm2-8 from two heterodimers and one heterotrimer (LSm2-3, LSm4-8, LSm5-6-7), and LSm1-7 from one heterodimer, one heterotrimer and two singly expressed proteins (LSm2-3, LSm5-6-7, LSm1, LSm4). Reconstitution of the heteroheptamers was achieved by mixing equimolar amounts of the appropriate proteins at 37°C in the presence of 4 M urea, which disrupts higher-order structures, because these proteins have a tendency to oligomerize. After incubation, the mixture of pure recombinant proteins was dialyzed against native buffer. The mixture was then applied to size exclusion chromatography, followed by anion exchange chromatography. The last step in the purification of homogeneous heteroheptamers was a second size exclusion chromatography (peak profile shown in Figure 11), and the respective fractions were analyzed on a polyacrylamide gel (shown in Figure 12).
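As a bookkeeping check on the reconstitution scheme, the following Python sketch verifies that a set of building blocks supplies each subunit of a heptamer exactly once. It is a minimal illustration under stated assumptions: the building blocks for LSm1-7 are taken to be LSm2-3, LSm5-6-7, and singly expressed LSm1 and LSm4, following the statement above that LSm4 and LSm1 were singly over-expressed; the names `BUILDING_BLOCKS`, `EXPECTED` and `check_stoichiometry` are hypothetical.

```python
from collections import Counter

# Building blocks per complex; each tuple lists the subunits of one
# co-expressed (or singly expressed) component.
# Assumption: LSm1-7 uses singly expressed LSm1 and LSm4.
BUILDING_BLOCKS = {
    "LSm2-8": [("LSm2", "LSm3"), ("LSm4", "LSm8"), ("LSm5", "LSm6", "LSm7")],
    "LSm1-7": [("LSm2", "LSm3"), ("LSm5", "LSm6", "LSm7"), ("LSm1",), ("LSm4",)],
}

# Expected subunit content of each heptamer.
EXPECTED = {
    "LSm2-8": {f"LSm{i}" for i in range(2, 9)},
    "LSm1-7": {f"LSm{i}" for i in range(1, 8)},
}


def check_stoichiometry(blocks, expected):
    """Report subunits that are missing, unexpected, or supplied more than once."""
    counts = Counter(subunit for block in blocks for subunit in block)
    supplied = set(counts)
    return {
        "missing": sorted(expected - supplied),
        "extra": sorted(supplied - expected),
        "duplicated": sorted(s for s, n in counts.items() if n > 1),
    }


for name, blocks in BUILDING_BLOCKS.items():
    report = check_stoichiometry(blocks, EXPECTED[name])
    status = "ok" if not any(report.values()) else report
    print(name, status)
```

The check only covers composition. It says nothing about equimolar mixing, refolding after urea treatment, or the order of subunits in the ring.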


**Figure 11.** Second size exclusion chromatography step

[Chromatogram: UV absorbance at 280 nm and 260 nm (0 to 400 mAU) versus elution volume (0 to 30 ml), with conductivity trace and fraction marks; peak labels 12.60, 19.46, 21.22, 24.48, 26.63, 27.98.]

**Figure 12.** SDS PAGE gel

Negative stain electron micrographs show that reconstituted LSm2-8 has a ring-like architecture with a diameter of about 8 nm. The overall dimensions are similar to those previously observed for the native LSm2-8 complex isolated from HeLa cell nuclear extract (8 nm) and core snRNP domain [58]. The central cavity observed for the recombinant LSm2-8 complex is larger than in the native LSm2-8 complexes (3 vs. 2 nm, respectively). The LSm1-7 rings appear to be slightly smaller, measuring ~ 7 nm across with a pore diameter of less than 1.5 nm. Thus, recombinant LSm1-7 and LSm2-8 complexes are similar to one another and to the native Sm/LSm complexes at this level. In all LSm co-crystal structures solved with RNA oligonucleotides, the RNA molecules mainly wrap around the rim of the pore.

**Figure 13.** Electron micrographs of reconstituted complex LSm1-7 (c) and LSm2-8 (d) (reprinted from reference 77).

One method that can be used for the identification and characterization of RNA-binding proteins is the electrophoretic mobility shift assay (EMSA). The method is based on the change in the electrophoretic mobility of a nucleic acid molecule upon binding to a protein or another molecule. A labeled RNA containing the binding sequence is first incubated with a sample containing the RNA-binding proteins, and the mixture is then analyzed on a non-denaturing gel. The unbound RNA has a characteristic electrophoretic mobility, whereas RNA bound in a complex migrates more slowly. Functionality of the reconstituted LSm2-8 and LSm1-7 complexes has been demonstrated *in vitro* using this assay [77]. That the oligomeric complexes are functional *in vivo* has been shown by injecting fluorescently labeled complexes into the cytoplasm of living cells [77]. They localized to the expected cellular compartments: LSm2-8 accumulated in the nucleus, whereas the LSm1-7 complex remained in the cytoplasm.

The structure-function relationships within the Sm/LSm protein family reflect three major interconnected features, which illustrate why it is so important to solve the structures of Sm/LSm hetero-oligomeric complexes. First, Sm/LSm protein function is in general strictly dependent on complex formation. This holds for RNA binding, biogenesis of Sm/LSm-containing RNPs, interaction with non-Sm effector proteins, and RNA processing activity. The required interaction interfaces are apparently always three-dimensional structural sites generated from several Sm/LSm subunits. High-resolution structural information is clearly required to explain the molecular basis for this phenomenon. Second, exchange of only one or two subunits between one hetero-oligomeric (mostly heptameric) Sm/LSm complex and another changes its whole biology (see above). How such subtle structural changes can have such large functional effects can only be addressed by solving the crystal structures of the respective complexes. Lastly, the ability of individual Sm/LSm proteins to assemble with different homologous binding partners to form architecturally very similar, yet functionally diverse, complexes argues for a very fine balance between flexibility and specificity in the respective Sm-Sm interactions. Clearly, in order to understand the "molecular recognition code" governing this balance, more structural information on such interactions is indispensable. Recently, the crystal structure of the *Saccharomyces cerevisiae* LSm2-8 complex bound to U6 snRNA was determined (pdb code 4M7D) [80].

#### **Acknowledgements**

This work was supported by the grant No. 172001 from the Ministry of Science and Education, Republic of Serbia.

#### **Author details**

Bozidarka L. Zaric

ICTM – Centre of Chemistry, University of Belgrade, Belgrade, Serbia

#### **References**

[1] Mark G. Caprara and Timothy W. Nilsen (2000) RNA: Versatility in form and function, *Nature Structural Biology*, (7): 831-835.

[2] Steven Buckingham (2003) The Major World of microRNAs, Horizon Symposia, 1-4, http://www.nature.com/horizon/rna/background/understanding.html

[3] Molecular Biology of the Cell (2001) Bruce Alberts, Dennis Bray, Julian Lewis, Martin Raff, Keith Roberts, James D. Watson, Chapter 6, Third edition, Garland Publishing, Inc., New York and London.

[4] Pietzsch J. (2003) Understanding the RNAissance, Horizon Symposia, 1-4, http://www.nature.com/horizon/rna/background/understanding.html

[5] Hartman E. and Hartmann R. (2003) The enigma of ribonuclease P evolution, *Trends in Genetics*, 19(10): 561-568.

[6] Altman S., Ribonuclease P, Chapter 14 in *The RNA World* (Second edition), Cold Spring Harbor, New York.

[7] Tollervey D. and Kiss T. (1997) Function and synthesis of small nucleolar RNAs. *Current Opinion Cell Biology*, (3): 337-342.

[8] Gerbi S. A. and Lange T. S. (2002) All Small Nuclear RNAs (snRNAs) of U4/U6.U5 tri-snRNP Localize to Nucleoli, Identification of the Nucleolar Localization Element of U6 snRNA, *Molecular Biology of the Cell*, (13): 3123-3137.

[9] Trotta C. R. and Abelson J. (2000) tRNA splicing: An RNA World Add-on or an Ancient Reaction? *RNA World*, second edition, Cold Spring Harbor Laboratory Press.

[10] Wong JM, Collins K. (2003) Telomere maintenance and disease. *Lancet*, 362(9388): 983-8.

[11] Bortvin A. (2013) PIWI-interacting RNAs (piRNAs) - a mouse testis perspective. *Biochemistry (Mosc).* 78(6): 592-602.

[12] Burroughs AM, Ando Y, Aravind L. (2013) New perspectives on the diversification of the RNA interference system: insights from comparative genomics and small RNA sequencing. *Wiley Interdiscip Rev RNA*, Dec 5. doi: 10.1002/wrna.1210.

[13] Steitz J. A. (1998) "Snurps", *Scientific American*, 36-41.


[14] Pendrak ML, Roberts DD. (2011) Ribosomal RNA processing in Candida albicans. *RNA*, 17(12): 2235-48.

[15] Varani G. and Pardi A. (1994) Structure of RNA, Chapter 1, in *Protein-RNA interactions*, Nagai and Mattaj, IRL Press, Oxford.

[16] Varani G. and Kiyoshi N. (1998) RNA Recognition by RNP proteins during RNA processing, *Annu. Rev. Biophys. Biomol. Struct.* 27: 402-45.

[17] Pérez-Cañadillas J. M. and Varani G. (2001) Recent advances in RNA-protein recognition, *Current Opinion Structural Biology*, 11: 53-58.

[18] Maniatis T. and Reed R. (2002) An extensive network of coupling among gene expression machines, *Nature*, 416: 499-504.

[19] Proudfoot NJ, Furger A, Dye MJ. (2002) Integrating mRNA processing with transcription, *Cell*, 108(4): 501-512.

[20] Kornblihtt AR, de la Mata M, Fededa JP, Munoz MJ, Nogues G. (2004) Multiple links between transcription and splicing, *RNA*, 10: 1489-1498.

[21] Cougot N, van Dijk E, Babajko S, Seraphin B. (2004) Cap-Tabolism, *Trends in Biochemical Sciences*, 29(8): 436-44.

[22] Jurica Melissa and Melissa J. Moore (2002) Capturing splicing complexes to study structure and mechanism, *Methods*, 28: 336-345.

[23] Nilsen Timothy (2003) The spliceosome: The most complex macromolecular machine in the cell? *BioEssays*, 25: 1147-1149.

[24] Will CL, Schneider C, Hossbach M, Urlaub H, Rauhut R, Elbashir S, Tuschl T, Luhrmann R. (2004) The human 18S U11/U12 snRNP contains a set of novel proteins not found in the U2-dependent spliceosome. *RNA*, 10(6): 929-41.

[25] Yi-Tao Yu, Scharl E., Smith C. M. and Steitz J. A. The growing world of small nuclear ribonucleoproteins, *RNA World*, second edition, Cold Spring Harbor Laboratory Press.

[26] Balakin AG, Smith L, Fournier MJ. (1996) The RNA world in the nucleolus: Two major families of small RNAs defined by different box elements with related function, *Cell*, 86: 823-834.

[27] Will C. L. and Lührmann R. (2001) Spliceosomal U snRNP biogenesis, structure and function, *Current Opinion in Cell Biology*, (13): 290-301.

[28] Malca H, Shomron N, Ast G. (2003) The U1 snRNP Base Pairs with the 5' Splice Site within a Penta-snRNP complex, *Molecular and Cellular Biology*, 23(10): 3442-3455.

[29] Pascolo E. and Seraphin B. (1997) The Branchpoint Residue Is Recognized during Commitment Complex Formation before Being Bulged out of the U2 snRNA-pre-mRNA Duplex, *Molecular and Cellular Biology*, 17(7): 3469-3476.

[30] Stevens SW, Ryan DE, Ge HY, Moore RE, Young MK, Lee TD, Abelson J. (2002) Composition and functional characterization of the yeast spliceosomal penta-snRNP, *Molecular Cell*, 9: 31-44.

[31] Chen Ching-Yi, Roberto Gherzi, Shao-En Ong, Edvard L. Chan, Reinout Raijmakers, Ger J. M. Pruijin, Georg Stoecklin, Christof Moroni, Matthias Mann and Michael Karin (2001) AU binding proteins recruit the exosome to degrade ARE-containing mRNAs. *Cell*, 107(4): 451-64.

[32] Tucker M. and Parker R. (2000) Mechanisms and Control of mRNA decapping in Saccharomyces cerevisiae, *Annual Rev. Bioch.*, 69: 571-95.

[33] Lejeune Fabrice, Xiaojie Li and Lynne E. Maquat (2003) Nonsense-Mediated mRNA Decay in Mammalian Cells Involves Decapping, Deadenylating and Exonucleolytic Activities, *Molecular Cell*, 12: 675-687.

[34] Miles F. Wilkinson and Ann-Bin Shyu (2002) RNA Surveillance by nuclear scanning? *Nature Cell Biology*, 4: 144-147.

[35] Luhrmann R., Kastner B. and Bach M. (1990) Structure of spliceosomal snRNPs and their role in pre-mRNA splicing, *Biochim. Biophys. Acta*, 1087: 265-292.

[36] Lerner E. A., Lerner M. R., Hardin J. A., Janeway C. A. and Steitz J. A. (1981) Monoclonal antibodies to nucleic acid containing cellular constituents: probes for molecular biology and autoimmune disease. *PNAS*, 78: 2737-2741.

[37] Hermann H, et al. (1995) snRNP Sm proteins share two evolutionary conserved sequence motifs which are involved in Sm protein-protein interaction, *EMBO Journal*, 14(9): 2076-2088.

[38] van Dam A, Winkel I, Zijlstra-Baalbergen J, Smeenk R, Cuypers HT. (1989) Cloned human snRNP proteins B and B' differ only in their carboxy-terminal part. *The EMBO Journal*, 8(12): 3853-60.

[39] McAllister G., Amara S. G. and Lerner M. R. (1988) Tissue-specific expression and cDNA cloning of small nuclear ribonucleoprotein-associated polypeptide N. *PNAS USA*, 85(14): 5296-300.

[40] Salgado-Garrido J., Bragado-Nilsson E., Kandels-Lewis S., Seraphin B. (1999) Sm and Sm-like proteins assemble in two related complexes of deep evolutionary origin, *The EMBO Journal*, 18(12): 3451-3462.

[41] Kambach C., Walke S., Young R., Avis J. M., de la Fortelle E., Raker V. A., Luhrmann R., Li J., Nagai K. (1999) Crystal Structures of Two Sm protein Complexes and their Implications for the Assembly of the Spliceosomal snRNPs, *Cell*, 96: 375-387.

[42] Raker V. A., Plessel Gabriele and Luhrmann Reinhard (1996) The snRNP core assembly pathway: identification of stable core protein heteromeric complexes and an snRNP subcore particle in vitro, *The EMBO Journal*, 15(9): 2256-2269.


[43] Raker V. A., Hartmuth K., Kastner B., Luhrmann R. (1999) Spliceosomal U snRNP core assembly: Sm proteins assemble onto an Sm site RNA Nonanucleotide in a specific and Thermodynamically Stable Manner, *Molecular and Cell Biology*, 19(10): 6554-6565.

[44] Walke S., Bragado-Nilsson E., Seraphin B., Nagai K. (2001) Stoichiometry of the Sm proteins in yeast spliceosomal snRNPs supports the heptamer ring model of the core domain, *J. Mol. Biol.*, 308: 49-58.

[45] R. Bordonne (2001) Functional characterization of nuclear localization signals in yeast Sm proteins, *Molecular Cell. Biology*, 20(21): 7943-54.

[46] Stark H, Dube P, Luhrmann R, Kastner B. (2001) Arrangement of RNA and proteins in the spliceosomal U1 small nuclear ribonucleoprotein particle, *Nature*, (409): 539-542.

[47] Schrank B., Götz R., Gunnersen J. M., Ure J. M., Toyka K. V., Smith A. and Sendtner M. (1997) Inactivation of the survival motor neuron gene, a candidate for human spinal muscular atrophy, leads to massive cell death in early mouse embryos, *PNAS*, 94: 9920-9925.

[48] Brahms H., Meheus L., de Brabandere V., Fischer U., Luhrmann R. (2001) Symmetrical dimethylation of arginine residues in spliceosomal Sm protein B/B' and the Sm-like protein LSm4, and their interaction with the SMN protein, *RNA*, 7: 1531-1542.

[49] Friesen W. J., Wyce A., Paushkin S., Abel L., Rappsilber J., Mann M., Dreyfuss G. (2002) A Novel WD repeat protein Component of the Methylosome Binds Sm proteins, *Journal of Biol. Chemistry*, 277: 8243-8247.

[50] Gubitz AK, Feng W, Dreyfuss G. (2004) The SMN complex, *Experimental Cell Research*, 296: 51-56.

[51] Yong J, Pellizzoni L, Dreyfuss G. (2002) Sequence-specific interaction of U1 snRNA with the SMN complex, *The EMBO Journal*, 21(5): 1188-1196.

[52] Narayanan U, Ospina JK, Frey MR, Hebert MD, Matera AG. (2002) SMN, the spinal muscular atrophy protein forms a pre-import snRNP complex with snurportin1 and importin β, *Human Molecular Genetics*, 11(15): 1785-1795.

[53] Paraskeva E, Izaurralde E, Bischoff FR, Huber J, Kutay U, Hartmann E, Luhrmann R, Gorlich D. (1999) CRM-1 mediated Recycling of Snurportin1 to the Cytoplasm, *The Journal of Cell Biology*, 145(2): 255-264.

[54] Massenet S., Pellizzoni L., Paushkin S., Mattaj I. W., Dreyfuss G. et al. (2002) The SMN Complex is Associated with snRNPs throughout their Cytoplasmic Assembly Pathway, *Molecular and Cellular Biology*, 22(18): 6533-6541.

[55] Cameron Mura, Peter S. Randolph, Jennifer Patterson and Aaron E. Cozen (2013) Archaeal and eukaryotic homologs of Hfq: A structural and evolutionary perspective on Sm function. *RNA Biology*, 10(4): 636-651.

[56] Anna Golisz, Pawel J. Sikorski, Katarzyna Kruszka and Joanna Kufel (2013) Arabidopsis thaliana LSM proteins function in mRNA splicing and degradation, *Nucleic Acids Research*, 41: 6232-6249.

[57] Camasses A., Bragado-Nilsson E., Martin R., Seraphin B., Bordonne R. (1998) Interactions within yeast Sm core complex: from proteins to amino acids, *Mol. Cell. Biol.*, 18: 1956-1966.

[58] Achsel T., Brahms H., Kastner B., Bachi A., Wilm M., Luhrmann R. (1999) A doughnut-shaped heteromer of human Sm-like proteins binds to the 3' end of U6 snRNA, thereby facilitating U4/U6 duplex formation in vitro, *The EMBO Journal*, 18(20): 5789-5802.

[59] Eystathioy T, Peebles CL, Hamel JC, Vaughn JH, Chan EK. (2002) Autoantibody to the hLSm4 and the Heptameric LSm complex in Anti-Sm Sera, *Arthritis and Rheumatism*, 46(3): 726-734.

[60] Hirsch E., Oohashi T., Ahmad M., Stamm S., Fassler R. (2000) Peri-Implantation Lethality in Mice Lacking the Sm Motif-containing protein LSm4, *Molecular and Cellular Biology*, 20(3): 1055-1062.

[61] Vankan P., McGuigan C., Mattaj I. W. (1990) Domains of U4 and U6 snRNAs required for snRNP assembly and splicing complementation, *The EMBO Journal*, 9: 3397-3404.

[62] Lange TS, Gerbi SA. (2000) Transient Nucleolar Localization of U6 snRNA in Xenopus Laevis Oocytes, *Molecular Biology of the Cell*, 11: 2419-2428.

[63] Fury M. G. and Zieve G. W. (1996) U6 snRNA maturation and stability, *Experimental Cell Research*, 228: 63-69.

[64] Mayes A. E., Verdone L., Legrain P., Beggs J. D. (1999) Characterization of Sm-like proteins in yeast and their association with U6 snRNA, *The EMBO Journal*, 18(15): 4321-4331.

[65] Weihai He and Parker R. (2001) The Yeast Cytoplasmic LSmI/pat1p complex protects mRNA 3' termini from partial degradation, *Genetics*, 158: 1445-1455.

[66] Tomasevic N., Peculis B. A. (2002) Xenopus LSm proteins Bind U8 snoRNA via an Internal Evolutionarily Conserved Octamer Sequence, *Molecular and Cellular Biology*, 22(12): 4101-4112.

[67] Muller B. and Schumperli D. (1997) The U7 snRNP and the hairpin binding protein: Key players in histone mRNA metabolism, *Seminars in Cell and Developmental Biology*, 8: 567-577.

234 Oligomerization of Chemical and Biological Compounds


[68] Pillai RS, Grimmler M, Meister G, Will CL, Luhrmann R, Fischer U, Schumperli D. (2003) Unique Sm core structure of U7 snRNPs: assembly by a specialized SMN complex and the role of a new component, LSm11 in histone RNA processing, *Genes and Development*, 17: 2321-2333.

[69] Fromont-Racine M, Mayes AE, Brunet-Simon A, Rain JC, Colley A, Dix I, Decourty L, Joly N, Ricard F, Beggs JD, Legrain P. (2000) Genome-wide protein interaction screens reveal functional networks involving Sm-like proteins, *Yeast*, 95-110.

[70] Tharun S., He W., Mayes A. E., Lennertz P., Beggs J. D., Parker R. (2000) Yeast Sm-like proteins function in mRNA decapping and decay, *Nature*, 404: 515-518.

[71] Bouveret E., Rigaut G., Shevchenko A., Wilm M., Seraphin B. (2000) A Sm-like protein complex that participates in mRNA degradation, *The EMBO Journal*, 19 (7):

[72] Pannone B. K. and Wolin S. L. (2000) RNA degradation: Sm-like proteins WRING the neck of mRNA, *Current Biology*, 10: 478-482.

[73] Ingelfinger D, Arndt-Jovin DJ, Luhrmann R, Achsel T. (2002) The human LSm1-7 proteins colocalize with the mRNA-degrading enzymes Dcp1/2 and Xrn1 in distinct cytoplasmic foci, *RNA*, 8: 1489-1501.

[74] Cougot N, van Dijk E, Babajko S, Seraphin B. (2004) Cytoplasmic foci are sites of mRNA decay in human cells, *The Journal of Cell Biology*, 165(1): 31-40.

[75] Carol J. Wilusz and Jeffrey Wilusz (2013) Lsm proteins and Hfq: Life at the 3' end.

[76] Kufel J, Allmang C, Verdone L, Beggs JD, Tollervey D. (2002) LSm proteins Are Required for Normal Processing of the Pre-tRNA and Their Efficient Association with La-Homologous Protein Lhp1p, *Molecular and Cellular Biology*, 22(14): 5248-56.

[77] Fernandez CF, Pannone BK, Chen X, Fuchs G, Wolin SL. (2004) An LSm2-7 Complex in Saccharomyces cerevisiae Associates with the Small Nucleolar RNA snR5, Molecu‐

[78] Zaric B., Chami M., Remigy H., Engel A., Ballmer-Hofer K., Winkler FK and Kambach C. (2005) Reconstitution of two Recombinant LSm protein complexes reveals aspects of their architecture, assembly and function, *Journal of Biological Chemistry*, 280:

[79] Karen C. M. Moraes, Naomi Bergman, Bozidarka Zaric, Christian Kambach, Carol J. Wilusz and Jeffrey Wilusz (2007) Lsm proteins bind to and stabilize RNAs containing poly(A) tracts, *Nature Structural Molecular Biology*, 14(9): 824-31.

[80] Bozidarka Zaric and Christian Kambach (2008) "Reconstitution of recombinant human LSm complex for biochemical, biophysical and cell biological studies". *Methods*

[81] Zhou L, Hang J, Zhou Y, Wan R, Lu G, Yin P, Yan C, Shi Y. (2013) Crystal structures of the Lsm complex bound to the 3' end sequence of U6 small nuclear RNA. *Nature*, Nov 17. doi: 10.1038/nature12803.

lar Biology of the *Cell,* (15):2842-2852.

*RNA Biology* 10:4, 592–601.

16066-16075.

*in Enzymology* 448:57-74.

1661-1671.


#### **Chapter 8**

### **Protein Oligomerization**

Giovanni Gotte and Massimo Libonati

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57489

#### **1. Introduction**

Protein oligomerization is a wide and fascinating topic concerning the behavior of proteins that can form supramolecular structures, either naturally or artificially. Proteins can homo- or hetero-oligomerize through covalent, almost always irreversible stabilization, or through often reversible associations mediated by electrostatic and hydrophobic interactions or H-bonds. The structural and functional aspects of protein oligomerization have acquired increasing importance, especially in the last two decades. Improvements in the quality of X-ray analyses and in the potential of NMR, as well as the advent of dynamic light scattering (DLS) and surface plasmon resonance (SPR) techniques, have made it possible to understand previously unknown features and to correct notions that were wrongly believed to be true. Protein oligomerization is often crucial in triggering various physiological pathways. Conversely, in different compartments other protein oligomers can be the first deleterious seed leading to protein fibrillization, an event implicated in several devastating neurodegenerative diseases. In the latter case, the isolation and analysis of the oligomeric species, considered to be the real toxic agents, remained elusive for a long time. Only very recently have new techniques, such as solid-state NMR, cryo-transmission electron microscopy (Cryo-TEM), high-resolution atomic-force spectroscopy, and molecular modeling, yielded structural and functional data that can clarify the determinants of a very complicated pathway.

In the light of these recent discoveries, the chapter aims to explore the most important structural and functional aspects of natural or artificial oligomers.

In the first section after this introduction we introduce a tentative rationalization of the terms related to the wide world of protein oligomerization; in the second we describe the structural and mechanistic features of the different protein oligomers that can form natively or artificially; in the third we analyze the stability of protein oligomers and the factors influencing or affecting it; in the fourth we describe the functional (benign) versus the aberrant interactions determined by protein oligomerization; in the fifth, finally, we mention some hints about possible industrial applications of protein oligomers.

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Numerous cases and literature reports concerning different proteins will be described, especially those whose oligomers have been extensively studied. In this context, ribonucleases (RNases), and ribonuclease A in particular, which have become useful and interesting models, will be considered.

### **2. Natural and/or artificial protein polymers: Oligomers, multimers, aggregates, fibrils; tentative definitions and classifications**

Before describing and analyzing the protein oligomerization events in detail, some definitions are necessary.

First of all, proteins can self- or cross-associate either naturally or artificially. Artificial association can occur when the environmental conditions of a protein solution are changed, or when chemical cross-linking reaction(s) are introduced.

If a monomeric protein, i.e., one lacking quaternary structure, is considered as the starting species, the first polymeric seed is obtained when a dimer, or different dimers, form. Polymerization can then continue towards trimer(s), tetramer(s), up to decamer(s) and so on. All these protein species are polymers that in the literature are termed, in turn, oligomers, multimers, large aggregates, protofibrils and fibrils. The smallest subset of different subunits forming an oligomer is the structural unit of an oligomeric protein, and can also be called a protomer. A protomer can be a single protein subunit or several different subunits that assemble in a defined stoichiometry to form an oligomer. For example, hemoglobin consists of four subunits, two α-chains and two β-chains. The oligomer stoichiometry is thus α2β2. Hemoglobin is a heterotetramer, but it is also a dimer of two αβ-protomers. Several authors use the terms protein oligomers, multimers or aggregates to define protein species that range between a dimer and a proto-fibril. Consequently, the reader can infer the intended meaning only from the context in which these terms are used. These different definitions are due to the lack of a clear-cut terminology for the various supramolecular protein products that form.

Although the elegant classification of homomers recently reported by Levy & Teichmann [1] is based on the symmetry and size of the structures, i.e., the number of subunits, it does not distinguish between homo-oligomers and large homo-polymers. Thus, we introduce here a tentative rationalization of the terminologies. In addition, rules that may help to assign the right terminology are lacking, partly because protein oligomers can be distinguished in two different ways: a) on the basis of the number of subunits forming the polymer (i.e., the polymerization degree) or b) on the basis of the molecular weight (M.W.) of the polymer.

These two different approaches are the principal source of confusion. In fact, if an oligomer is defined as being formed by two to eight-ten protein subunits, a protein of 50 kDa can still be considered oligomeric when its M.W. reaches 400-500 kDa, whereas a 10 kDa protein is oligomeric only up to a weight of 80-100 kDa. Conversely, if the oligomeric limit were a M.W. of, hypothetically, 100 kDa, the former protein would exceed the oligomer limit already as a trimer, while the latter would still be an oligomer when composed of ten subunits, i.e., as a decamer. Protein oligomerization is also heavily involved in the formation of fibrils, amyloidogenic or not, and in this scenario it is difficult to find the most appropriate terms for oligomers or other supramolecular structures preceding fibrillization [2] because of the transient nature of these structures. Moreover, it is almost impossible to determine the exact number of subunits forming a fibril, although the main criterion to distinguish an oligomer from a fibril is that the former is soluble, while the latter is not.
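The arithmetic behind these two criteria can be made explicit with a minimal Python sketch. The function names, the 2-10 subunit window and the inclusive 100 kDa cut-off are illustrative assumptions taken from the hypothetical figures in the paragraph above; they are not established thresholds.

```python
# Illustrative sketch only: the subunit window (2-10) and the 100 kDa limit
# are the hypothetical figures used in the text, not accepted standards.

def mw_range_by_subunits(monomer_kda, min_subunits=2, max_subunits=10):
    """Return the (lowest, highest) M.W. in kDa of the species that count as
    oligomers when the criterion is the number of subunits."""
    return monomer_kda * min_subunits, monomer_kda * max_subunits


def is_oligomer_by_mw(monomer_kda, n_subunits, mw_limit_kda=100.0):
    """Return True if an n-mer counts as an oligomer when the criterion is a
    molecular-weight ceiling (treated here as inclusive, an assumption)."""
    return n_subunits >= 2 and monomer_kda * n_subunits <= mw_limit_kda


if __name__ == "__main__":
    for monomer_kda in (50.0, 10.0):
        low, high = mw_range_by_subunits(monomer_kda)
        print(f"{monomer_kda:g} kDa monomer, subunit criterion: {low:g}-{high:g} kDa")
        for n in (2, 3, 10):
            verdict = is_oligomer_by_mw(monomer_kda, n)
            print(f"  {n}-mer ({monomer_kda * n:g} kDa), oligomer by M.W. limit: {verdict}")
```

Under these assumptions a 50 kDa trimer (150 kDa) already exceeds the M.W. limit, while a 10 kDa decamer (100 kDa) still qualifies, which is the discrepancy between the two criteria described in the text.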

Consequently, the M.W. classification (low-molecular/high-molecular weight oligomers) becomes prevalent in the 'fibrillization' context [3, 4], while the classification based on the polymerization degree [5, 6] can be useful in the context of natural or artificial, limited or controlled, self- or hetero-protein(s) association.

In short, to classify protein polymerization (oligomerization) events concretely we can follow two paths: one concerning controlled oligomerization, the other concerning protein oligomers trapped before they undergo fibrillization, using the Aβ-amyloid peptide of 40-42 residues as the monomeric unit [7]. From these two different, although possibly overlapping, terminologies, which refer to different structures and/or structural elements, the following classification could be adopted:

- **1.** Limited protein polymerization (not resulting in fibers)
	- **a.** Oligomers: from dimer(s) to pentamer(s)
	- **b.** Multimers or higher-order oligomers: from hexamer(s) to pentadecamers
	- **c.** Very large oligomers or large multimers: from pentadecamers up to more than 20-30mers.
- **2.** Uncontrolled and extensive oligomerization driving to fibrillization:
	- **a.** (Aβ) Pre-Fibrillar Oligomers: from a tetramer of 18 kDa to structures of about 75 kDa [7].
	- **b.** (Aβ) Fibrillar Oligomers: from a dimer or a tetramer of 9-18 kDa to structures of about 500 kDa [7].
	- **c.** Proto-fibrils: structures lacking order and periodic symmetry of mature fibrils; shorter and less linear than fibrils [8], and often deriving from 'globulomers' [9] or 'annular oligomers' [10], already called 'ring-like shaped' annular aggregates [11].
	- **d.** Fibrils: ordered, symmetrical, long, varying in length, extending towards several micrometers [12], stacked filaments of unbranched cross-β-spine [13] pairs, ranging from about 60 up to 200 Å in diameter or width [8, 14].
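The size classes proposed for limited (non-fibrillar) polymerization can be sketched as a small lookup on the number of subunits. This is only an illustration: the function name is hypothetical, and because the pentadecamer is named as the boundary of two adjacent classes, assigning it to the lower class is an assumption made here.

```python
# Illustrative sketch only: boundaries follow the tentative classification
# for limited (non-fibrillar) polymerization; the function name is hypothetical.

def classify_limited_polymer(n_subunits: int) -> str:
    """Map a polymerization degree (number of subunits) to a size class."""
    if n_subunits < 1:
        raise ValueError("a protein species needs at least one subunit")
    if n_subunits == 1:
        return "monomer"
    if n_subunits <= 5:
        # Dimers to pentamers.
        return "oligomer"
    if n_subunits <= 15:
        # Hexamers to pentadecamers. Assumption: the pentadecamer, named as
        # the boundary of both neighbouring classes, goes to the lower one.
        return "multimer (higher-order oligomer)"
    # Above the pentadecamer, up to more than 20-30mers.
    return "very large oligomer (large multimer)"


if __name__ == "__main__":
    for n in (1, 2, 5, 6, 15, 16, 30):
        print(n, classify_limited_polymer(n))
```

Such a lookup applies only to the first branch of the classification; the fibrillization branch depends on M.W., solubility and morphology rather than on a subunit count.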

Thus, different types of oligomers deriving from the same protein or domain can certainly overlap in terms of size and M.W. Consequently, it must also be taken into account that the same protein can be considered fibrillogenic or non-fibrillogenic not only on the basis of its dimensions or M.W., but also because of the morphology, toxicity, pathway of formation, or method of artificial formation of its oligomers [2, 7, 8]. Furthermore, these supramolecular species can be toxic or harmless depending on whether or not they satisfy certain requisites [2, 15] that will be discussed in the following sections.


Based on what has been reported, the term oligomers and other terms related to supramolecular protein structures may still be somewhat misleading and equivocal. In any case, the tentative classification proposed here, based on two different contexts, i.e., fibrillogenesis or not, will be used in the following sections.

#### **3. Protein oligomers**

#### **3.1. Homo/hetero-dimeric or oligomeric proteins**

Several studies have focused, especially in the last two decades, on the features of homopolymeric or, more precisely, homo-oligomeric proteins, i.e., supramolecular structures formed by self-associating proteins. In this context, a large number of structural, mechanistic, physicochemical, and functional elements have been discovered and elucidated as important determinants that tune and control protein self-association.

In contrast, apart from the very well-known heterotetrameric hemoglobin, less is known about hetero-oligomeric protein complexes. These hetero-structures consist of chains of different sequences, whose association pathway is statistically less favorable and less easily controllable, either qualitatively or quantitatively, than protein self-association. Nevertheless, protein hetero-oligomerization is a very important phenomenon in the formation of molecular machines, as for motor proteins (kinesin, microtubules), and it can alternatively be obtained artificially by the use of asymmetric bifunctional reagents. New discoveries of natural hetero-protein association events have rekindled interest in this topic.

#### **3.2. Covalently linked oligomers**

Although not very frequently, protein cross-linking can occur naturally, forming covalently linked species that display quaternary structures, or active covalent complexes starting from inactive monomeric precursors. Post-translational modifications, photochemical event(s), or co-enzyme binding (i.e., going from apo- to holo-forms) can also induce protein self- or cross-linking.

Natural cross-linking can sometimes occur through free cysteines of two different subunits that can couple to form intermolecular disulfides depending on the redox state of the environment. This is the case, for example, of bovine seminal RNase (BS-RNase), the only member of the large pancreatic-type secretory ribonuclease superfamily that is dimeric in nature [16]. Other proteins that can form covalent supramolecular structures are structural proteins such as collagen or elastin; in the latter, desmosine bridges between lysine residues form or are disrupted in tissues or vessels after elastic stress or relaxation [17].

Alternatively, artificial cross-linking can produce oligomers through the reaction of a protein with dehydrating molecules, such as EDC or carbodiimides in general, or through the use of several bifunctional reagents, such as dialdehydes or diimidoesters [18, 19]. These two latter chemicals display two terminal reactive groups separated by a variable number of unreactive spacers, such as methylenes. They are often symmetric, but sometimes asymmetric. Which of them is advantageous depends on the goal to be reached, for example homo- or hetero-protein cross-linking, respectively.


Some of these reagents were already extensively used in the late 1950s and 1960s to produce protein oligomers that were useful, after limited proteolysis, for studying protein primary and tertiary structures and, thus, protein conformations. Later, the formation of covalently linked oligomers sometimes allowed the production of protein derivatives with increased activity, higher stability against proteases, and so on.

Among the first classes of reagents used in protein cross-linking were dialdehydes, such as glutaraldehyde (Figure 1 a), and diimidoesters [19], like dimethyladipimidate, dimethylpimelimidate and dimethylsuberimidate (Figure 1 b).

Both classes of cross-linkers react mainly with lysine residues, which act as nucleophiles towards the aldehydic or imidic carbon of the cross-linker (Figure 1 c) under slightly basic conditions (pH ranging from 7.5 to 8.5-9). This reaction can yield more than 20% of dimers, as well as decreasing amounts of trimers, tetramers and traces of higher-order oligomers.

**Figure 1. (a)** Glutaraldehyde. **(b)** Diimidoesters: n=4, dimethyl-adipimidate; n=5,-pimelimidate; n=6,-suberimidate. **(c)** Mechanism of reaction of a protein with an imidoester (dimethylsuberimidate).

The advantage of diimidoesters over dialdehydes, like glutaraldehyde, is that the latter are toxic, generally more reactive than diimidoesters, and consequently often able to introduce unwanted changes in a protein. Moreover, they sometimes strongly favor intramolecular side reactions, with possible inactivation of the protein. The reactivity of diimidoesters, in contrast, can be better controlled and, although protein lysines are modified, the overall charge of the protein is maintained. Consequently, although the oligomeric yields obtained with diimidoesters are lower than those obtained with dialdehydes, the products are more specific and more active. Finally, the longer the spacer, the higher the intermolecular yield and the less rigid the oligomeric product obtained [19]. Thus, dimethylsuberimidate is more useful for artificially oligomerizing a protein than the shorter diimidoesters mentioned above.

derivatives in the spacers, and coupled, in the second terminus, with imidoesters, diones, thiones, 2-iminothiolane, etc. Almost all of these bifunctional cross-linkers involve a two-step reaction in which one of the partners is first modified and activated (for example, with 2-iminothiolane) so that it can react with the appropriate partner and form the new hetero-dimeric adduct. Some of the most important reactants used for artificial heterodimerization are 1-(3-((2,5-dioxopyrrolidinyl)oxy-carbonyl)phenyl)-1H-pyrrole-2,5-dione (MBS), N-succinimidyl-3-(2-pyridyldithio)propionate (SPDP), S-4-succinimidyloxycarbonyl-α-methyl benzyl thiosulfate (SMBT), or 4-succinimidyloxycarbonyl-α-methyl-α-(2-pyridyldithio)toluene (SMPT). The details of these hetero-oligomerization reactions are beyond the aim of this chapter but are clearly described in the review of Fracasso and colleagues [21].

Another dione, 4-phenyl-1,2,4-triazoline-3,5-dione (PTD), structurally similar to maleimides but containing nitrogens in the five-membered ring, can also be useful to covalently cross-link proteins through amino groups [22].

Other important bifunctional reagents are carbodiimides (R1-N=C=N-R2), such as EDC, which exploits its dehydrating power to create new isopeptide bonds between the side chains of Lys and Glu or Asp residues of proteins [23], thus producing a "zero-length" cross-link. This reagent can be very useful to covalently fix previously formed oligomeric protein aggregates [24], and it avoids unwanted insertions of chemicals or net charge modifications in the protein complex.

Divinyl sulfone (DVS) [25] and dinitrodifluorobenzene (DFDNB) [26] are two bifunctional reagents that deserve to be mentioned (Figure 3). They lack spacers, and serve to stabilize preformed structures without affecting the oligomers' conformations [25, 27-29].

**Figure 3. (a)** Divinyl sulfone (DVS). **(b)** Dinitrodifluorobenzene (DFDNB)

DVS is specific for histidines, while DFDNB reacts with lysines. The limited dimensions of both molecules allow cross-linking only between residues that are very close to each other. For this reason subsequent chromatographic or electrophoretic analyses can highlight important conformational features of the protein or its oligomers. The cross-linking can lead to intra- or inter-molecular adducts, thus revealing whether the protein was monomeric or already oligomeric before the covalent stabilization [25].

Finally, covalent "zero-length" protein oligomerization can be obtained, without chemicals, by sealing a lyophilized protein under vacuum at high temperature, up to 85 °C, for 24-96 hours [30]. This process makes it possible to obtain dimers, trimers, and traces of tetramers of ribonuclease A (RNase A) and lysozyme, without the introduction of chemical groups which could

Other reagents that can be used to obtain protein oligomers are the bifunctional N-substituted maleimide derivatives [20], such as *N,N'*-1,2-, 1,3-, or 1,4-phenylenedimaleimide (ortho-, meta-, para-PDM) (Figure 2 a-c), or derivatives with spacers of varying length between the two reactive maleimides (Figure 2 d-f). The reaction occurs between the maleimide and the free sulfhydryl group of a cysteine of the protein, as an addition that saturates the double bond of the maleimide, which is conjugated with two carbonyls (Figure 2 g) [20].

The cross-linking event can be irreversible or reversible. Reversibility can be favored, for example, by the use of spacers containing disulfide bonds (Figure 2 g). These "redox" spacers can be applied to all bifunctional reagents displaying spacers if the reversibility of the reaction is desired, and are very tricky for asymmetric bifunctional linkers.

This makes it possible, for example, to obtain species that are delivered into cells as heterodimers and are then released as dissociated monomers by the reducing environment of the cytosol.

Asymmetric bifunctional reagents can be very useful to covalently link antibodies, or parts of them (light/heavy chains), with proteins, protein domains, toxins, or any other molecule of biological interest. They can combine maleimides or succinimides, with or without dithio-derivatives in the spacers, coupled at the second terminus with imidoesters, diones, thiones, 2-iminothiolane, etc. Almost all of these bifunctional cross-linkers involve a two-step reaction in which one of the partners is first modified and activated (for example, with 2-iminothiolane) so that it can react with the appropriate partner and form the new heterodimeric adduct. Some of the most important reagents used for artificial heterodimerization are 1-(3-((2,5-dioxopyrrolidinyl)oxy-carbonyl)phenyl)-1H-pyrrole-2,5-dione (MBS), N-succinimidyl-3-(2-pyridyldithio)propionate (SPDP), S-4-succinimidyloxycarbonyl-α-methylbenzyl thiosulfate (SMBT), and 4-succinimidyloxycarbonyl-α-methyl-α-(2-pyridyldithio)toluene (SMPT). The details of these hetero-oligomerization reactions are beyond the scope of this chapter but are clearly described in the review by Fracasso and colleagues [21].

Another dione, 4-phenyl-1,2,4-triazoline-3,5-dione (PTD), which is structurally similar to maleimides but contains nitrogen atoms in the five-membered ring, can also be used to covalently cross-link proteins through their amino groups [22].

Other important bifunctional reagents are carbodiimides (R1-N=C=N-R2), such as EDC, which exploits its dehydrating power to create new isopeptide bonds between the side chains of Lys and Glu or Asp residues of proteins [23], thus producing a "zero-length" cross-link. This reagent can be very useful to covalently fix previously formed oligomeric protein aggregates [24], and it avoids unwanted insertions of chemical groups or modifications of the net charge of the protein complex.

Divinyl sulfone (DVS) [25] and dinitrodifluorobenzene (DFDNB) [26] are two bifunctional reagents that deserve to be mentioned (Figure 3). They lack spacers, and can be used to stabilize preformed structures without affecting the oligomers' conformations [25, 27-29].

**Figure 3. (a)** Divinyl sulfone (DVS). **(b)** Dinitrodifluorobenzene (DFDNB)


**Figure 2. Some bifunctional maleimides. (a), (b), (c)** ortho-, meta- and para-*N,N'*-phenylenedimaleimide (o-, m-, p-PDM); **(d)** 1,4-bismaleimidobutane (BMB); **(e)** 1,8-bismaleimidotriethyleneglycol (BM[PEO]3); **(f)** dithio-bis-maleimidoethane (DTME); **(g)** reaction of a bis-maleimide derivative with sulfhydryl groups (free, reduced cysteines). Figure adapted from [20].


DVS is specific for histidines, while DFDNB reacts with lysines. The small size of both molecules allows cross-linking only between residues that are very close to each other. For this reason, subsequent chromatographic or electrophoretic analyses can highlight important conformational features of the protein or its oligomers. The cross-linking can lead to intra- or intermolecular adducts, thus revealing whether the protein was monomeric or already oligomeric before the covalent stabilization [25].

Finally, covalent "zero-length" protein oligomerization can be obtained without chemicals, by sealing a lyophilized protein under vacuum at high temperature, up to 85 °C, for 24-96 hours [30]. This process yields dimers, trimers, and traces of tetramers of ribonuclease A (RNase A) and lysozyme, without the introduction of chemical groups that could totally or partially inactivate the protein residues involved in the reaction. In fact, the heat-vacuum treatment of proteins induces dehydration between some of the Lys and Asp or Glu side chains, thus producing newly formed intermolecular isopeptide bonds. However, this reaction, at first considered to be specific for a single pair of Lys and Glu residues of RNase A [31], was later found to involve more than one of these acidic or basic residues, thus producing a mixture of heterogeneous products [32].

Last, but not least, cross-linked oligomers can also be produced through UV photochemistry. The UV treatment of a monomer or of pre-formed non-covalent oligomers [33] produces covalently stabilized oligomers. Even if the photochemical treatment inactivates the native protein or its possibly pre-formed oligomers, important structural information can nevertheless be derived from these artificial modifications [34, 35].

#### **3.3. Non-covalent protein oligomers**

Protein association very often occurs naturally, without covalent modification(s), through homo- or hetero-association mediated by a network of weak bonds formed by electrostatic or hydrophobic interactions and/or specific H-bond(s). If the interface between monomers or protomers is large, almost all types of interactions can occur, and they are more frequently conserved. Small interfaces have probably been acquired recently in evolution [36]. These interactions can be crucial for the active forms of several classes of proteins, such as enzymes and transporters. These interaction(s)/association(s) may occur naturally because of the sequence and structural features of the subunits that build the oligomeric complex(es). Otherwise, they can form because of environmental changes, such as changes in pH or ionic strength, or as a consequence of an increase in the local concentration of the monomers.

These events allow the protein concentration to exceed its dimerization dissociation constant(s) KD1/2 (equation 1 and Figure 4, red and orange).

$$K_{D} = \frac{[M]^{2}}{[D]} \qquad \text{with } M = \text{monomer},\ D = \text{dimer} \tag{1}$$
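To make equation 1 concrete, the following minimal sketch computes how a protein partitions between monomer and dimer at a given total concentration. It assumes a single dimeric species with one KD, ideal behavior, and a total concentration expressed in monomer units, [M] + 2[D]; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def monomer_dimer_equilibrium(c_total, k_d):
    """Return ([M], [D]) at equilibrium for 2 M <-> D.

    c_total: total protein concentration in monomer units, [M] + 2[D]
    k_d:     dimer dissociation constant, K_D = [M]^2 / [D] (same units)
    """
    if c_total < 0 or k_d <= 0:
        raise ValueError("c_total must be >= 0 and k_d must be > 0")
    # Mass balance c_total = [M] + 2 [M]^2 / K_D gives a quadratic in [M];
    # the positive root is taken.
    m = (k_d / 4.0) * (math.sqrt(1.0 + 8.0 * c_total / k_d) - 1.0)
    d = m * m / k_d
    return m, d


def dimer_fraction(c_total, k_d):
    """Fraction of all protein chains that are part of a dimer."""
    if c_total == 0:
        return 0.0
    _, d = monomer_dimer_equilibrium(c_total, k_d)
    return 2.0 * d / c_total


if __name__ == "__main__":
    k_d = 1e-6  # hypothetical K_D of 1 micromolar
    for c_total in (1e-8, 1e-7, 1e-6, 1e-5, 1e-4):
        print(f"c_total = {c_total:.0e} M -> dimer fraction = "
              f"{dimer_fraction(c_total, k_d):.3f}")
```

Under these assumptions, half of the chains are dimeric when the total concentration equals KD, and the dimeric fraction rises steadily as the concentration increases above it.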


A further increase in concentration can raise the degree of polymerization, with the formation of trimers (Figure 4, green), tetramers (Figure 4, magenta, pink, violet), and larger oligomers or multimers, provided that the entropy cost is balanced by favorable interface interaction(s).

The free energy of multimerization is reported in equation 2 [37].

$$\Delta G = \Delta G^{0} + RT \ln\left(\frac{[nP]}{[P]^{n}}\right) \tag{2}$$

in which [P] is the concentration of the exposed protein segment(s), or protein interfaces, that are prone to interact with other protomers [37].
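Equation 2 can be evaluated numerically as in the sketch below. It rests on assumptions not stated in the text: [nP] is read as the concentration of the n-mer, concentrations are expressed relative to a 1 M standard state so that the logarithm is dimensionless, and the value of ΔG0 used in the example is purely hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def multimerization_free_energy(delta_g0, conc_multimer, conc_protomer, n,
                                temperature=298.15):
    """Evaluate equation 2: dG = dG0 + R*T*ln([nP] / [P]^n).

    delta_g0:      standard free energy of multimerization (J/mol)
    conc_multimer: [nP], taken here as the n-mer concentration (M)
    conc_protomer: [P], concentration of interaction-prone protomer (M)
    n:             number of protomers in the multimer
    """
    if conc_multimer <= 0 or conc_protomer <= 0:
        raise ValueError("concentrations must be positive")
    quotient = conc_multimer / conc_protomer ** n
    return delta_g0 + R * temperature * math.log(quotient)


if __name__ == "__main__":
    dg0 = -30_000.0  # hypothetical standard free energy, J/mol
    for p in (1e-7, 1e-6, 1e-5):
        dg = multimerization_free_energy(dg0, 1e-9, p, n=2)
        print(f"[P] = {p:.0e} M -> dG = {dg / 1000:.1f} kJ/mol")
```

In this reading, a negative ΔG means that further association is favorable under the given concentrations; because [P] enters with the exponent n, increasing the protomer concentration lowers ΔG more steeply for larger multimers.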


**Figure 4. Schematic view of protein oligomerization.** A native monomer (blue) can become a dimer if the KD of dimerization (equation 1) is exceeded. Two possible dimeric conformers with different interfaces or interface areas are shown to form with KD1 and KD2, respectively. These events can lead to the formation of different trimers (three green conformers are shown), tetramers (magenta, pink, violet) and so on. Trimers and tetramers are unlikely to form from isolated monomers for entropic reasons, but can grow from pre-existing oligomer(s) (or from two dimers). Oligomers can follow different pathways (dotted lines) to form higher-order oligomers or multimers, and can undergo conformational rearrangements that compensate for the entropy cost to be paid to associate the subunits.

An example of a natural oligomeric protein is the oxygen transporter hemoglobin (Hb), which is functionally active only as an α2β2 tetramer (or, in the fetus, α2γ2, endowed with a greater affinity for oxygen), with its subunits associated through salt bridges and other weak interactions. The loss of these interactions drives Hb to switch its conformation from the deoxygenated (tense, T) to the oxygenated (relaxed, R) form, and this change is due to allosteric interactions triggered by the first oxygen bound (Figure 5). Hb is active only as a tetramer, whereas myoglobin, the oxygen collector in tissues, is active as a monomer.

**Figure 5.** Allosteric control of haemoglobin (Hb) induced by oxygen binding.

Another important oligomeric transport protein is transthyretin (TTR), formerly called prealbumin, one of the transporters of the hormone thyroxine and of the lipocalin retinol-binding protein (RBP), the specific carrier of vitamin A [38]. TTR is natively a 55-kDa dimer of dimers, or homotetramer, mainly composed of β-sheets [39]. Its peculiarity lies in the pathological pathway it follows when destabilized by pathogenic mutations: monomers dissociate from the tetrameric assembly and undergo uncontrolled aggregation and fibrillization through the formation of intermediate annular oligomers [10]. These findings will be further discussed below.


Other cases of non-covalent association between proteins are provided, for example, by pyridoxal 5'-phosphate (PLP) enzymes, such as aspartate aminotransferase (AAT), alanine-glyoxylate aminotransferase (AGT) (Figure 6 a), dopa decarboxylase (DDC) (Figure 6 b), and cystalysin (a lyase, like DDC). Inactive as monomers, these enzymes are active only as dimers, although a detailed analysis of their dimerization pathways has been performed only with *Treponema denticola* cystalysin mutants [40].

The family of PLP-dependent enzymes is very large, comprising five different fold types, and some members of the family are active as tetramers or hexamers. AGT [41] and DDC [42], shown in Figure 6, are dimers belonging to Fold Type I, as are cystalysin and AAT. Belonging to Fold Type I means that each subunit of the holo-form hosts a PLP molecule, but the active site is composite, i.e., formed by residues of both subunits. Enzymes belonging to Fold Type II, instead, are active as dimers or oligomers in which each subunit binds one PLP, but they evolved to form one active site per subunit, and they often carry allosteric regulation domains.

**Figure 6. Three-dimensional structure of holo, natively dimeric (a)** human liver alanine-glyoxylate aminotransferase (hAGT) [41] and **(b)** pig kidney dopa decarboxylase (pkDDC) [42] (Burkhard P. *et al* 2001, *NSB* **8** (11) 963-7). The two α/β subunits are shown in different colours, and for AGT the location of the two PLP molecules is shown in green. AGT is peculiar for the wrapping of each terminal arm (Nα/Nβ) into the region occupied by the complementary subunit.

This evolutionary pathway allowed them to build domains that act as allosteric regulators. The association of the subunits of these PLP-enzymes is often mediated mainly by hydrophobic interactions, since the monomeric forms are readily prone to extensive uncontrolled aggregation [35]. In any case, because the inter-subunit surfaces are often very large, several other types of contacts occur, such as electrostatic interactions or H-bonds.

Another interesting example of an oligomeric protein is mammalian phenylalanine hydroxylase (PAH), a homotetrameric enzyme made of four 50 kDa subunits whose overall 3D structure has not yet been solved. Its activity depends on its tetrameric structure and on the presence of tetrahydrobiopterin (BH4), and it is allosterically regulated by the substrate itself (Phe). Some mutations are pathogenic because they destabilize and inactivate the tetramer, and consequently lead to the onset of phenylketonuria (PKU) [43].


A natively oligomeric protein can also switch towards higher-order oligomers. Indeed, the natively homotetrameric L-rhamnulose-1-phosphate aldolase becomes an octamer upon a single A88F mutation alone [44]. The introduction of a single residue with a large nonpolar side chain (Phe, Trp) can be sufficient to drive the native oligomer towards larger multimeric complex(es).

It should be mentioned here that, out of a total of about 450 well-characterized enzymes, only about 140 are monomeric. Of the other 310, 200 are homo-oligomers/multimers: 125 homodimers, 50 homotetramers and 25 structures larger than tetramers. The remaining 110 or so are hetero-oligomers/multimers [45].
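The proportions implied by these counts can be checked with a short calculation. The sketch below simply restates the approximate figures quoted above from [45]; the dictionary keys are labels chosen for illustration.

```python
# Approximate enzyme counts quoted in the text (from [45]).
census = {
    "monomers": 140,
    "homodimers": 125,
    "homotetramers": 50,
    "homo-oligomers larger than tetramers": 25,
    "hetero-oligomers/multimers": 110,
}

total = sum(census.values())
homo_oligomers = sum(count for name, count in census.items()
                     if name.startswith("homo"))
oligomers = total - census["monomers"]

if __name__ == "__main__":
    for name, count in census.items():
        print(f"{name}: {count} ({100 * count / total:.1f}% of {total})")
    print(f"all oligomers: {oligomers} ({100 * oligomers / total:.1f}%)")
```

The categories add up to the quoted total of about 450, and roughly two thirds of these enzymes are oligomeric.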

Protein oligomerization can also be a non-native event: natively monomeric proteins can naturally and non-covalently undergo oligomerization as a sort of post-translational event that can act as a switch between active and inactive products. This is true, for example, for several transmembrane receptors, which often display kinase activity. Upon ligand binding, the intracellular domain dimerizes; this event triggers (auto)phosphorylation of the intracellular domain, which undergoes conformational changes and is able to activate a signal transduction cascade that induces or tunes important physiological phenomena. Examples of families of this type of receptor are the growth hormone, interferon, cytokine, Tyr-kinase, and G-protein-coupled receptor (GPCR) families [46]. The members of the latter family were initially thought to act as monomers, but several pharmacological, biochemical and biophysical data indicate that GPCRs function as cooperatively controlled dimers [47].

Another very interesting example is provided by Caspase-3, -7, and -9, a family of proteins involved in apoptosis. Under physiological conditions, Caspase-9 exists as an inactive monomer that forms a 1:1 complex with the Apaf-1 cofactor in the presence of Cytochrome C and ATP to produce a heteromultimer. This complex co-localizes with a multiple array of Caspase-9 molecules, whose concentration consequently increases above the dissociation constant KD of the Caspase-9 homodimer. This event allows the homodimer to form through the exposure of an activation loop, and the active dimer provides the catalytic activity necessary to activate Caspase-3 and -7 [48].

Finally, other interesting examples of natural oligomeric proteins are the tetrameric membrane channel-forming complexes that allow specific ions (Na+ or K+) or water to permeate cells, such as aquaporins or aquaglyceroporins [49].

Proteins can also form large pathogenic oligomers or multimers that can evolve towards pathogenic supramolecular structures. Important examples of these harmful events are the uncontrolled aggregation of the Glu6Val mutant of Hb (E6V-Hb or HbS) in sickle cell anaemia and the formation of amyloid or amyloid-like fibrils, as occurs with several proteins related to severe neurodegenerative diseases. These latter products, which go beyond the oligomeric state, will be discussed later in greater detail, within the physio-pathological functional consequences of protein oligomerization.

have been proposed. Furthermore, 3D-DS is almost uniquely related to protein self-association, although rare cases of hetero-association exist.

Eisenberg and co-workers, who gave the name 3D-DS to this mechanism of reciprocal exchange of domains between proteins, immediately underlined the two-faced nature, malignant and/or benign, of this protein-protein interaction, defining it as "entangling alliances between proteins" [50]. The term "entanglement" can be read as expressing a negative fate for a protein that has no possibility of escaping the interaction, although this 'jailing' cannot be known 'a priori'. This is the case for amyloidogenic proteins, which form, through this mechanism, fibrils involved in neurodegenerative diseases. Conversely, the term "alliances" clearly indicates the benign face of 3D-DS. In fact, a protein can self-associate to acquire novel activities absent in the monomer, to enhance pre-existing ones [57, 58], or to control them, for example allosterically, as occurs for swapped dimeric RNases [59, 60].

The mechanism is made possible by the presence of a hinge loop (red in Figure 7) located between two different structured domains of the protein. The flexible loop changes its conformation depending on the environment, and can direct the swappable domain(s) into the complementary subunit(s). The parts of a protein to be exchanged can be elements of secondary structure, such as α-helices or β-sheets (RNase A, BS-RNase), or entire domains (DT). The loop is usually composed of a few residues, which can differ from one protein to another. Interestingly, a single point mutation can induce dramatic changes in the flexibility of the loop and switch the protein towards or against self-association through 3D-DS [61-64]. Other important factors that govern the 3D-DS event(s) are obviously protein concentration and intermolecular interactions [65, 66].

Some proteins can be constitutively domain-swapped, as it is for the member of the cyclin-dependent kinase p13suc1, which is natively a mixture of a monomer and a domain-swapped dimer [67]. The monomeric and dimeric states are in equilibrium, and the domain-swapped dimer is favored by the presence of proline residues in the hinge loop [68].

Another very interesting natively domain-swapped protein is one of the two conformers of BS-RNase (Figure 8). This protein is the only natural dimer of the large pancreatic-type RNase superfamily, whose prototype is RNase A. Furthermore, BS-RNase is a mixture of two isoforms [60], which are both covalently dimeric because of the presence of two antiparallel disulfide bonds, i.e., occurring between Cys-31 and -32 of one subunit and Cys-32 and -31, respectively, of the other [69]. About 30% of BS-RNase is dimeric only thanks to these disulfides and is called M=M [60, 70] (Figure 8, left panel), while about 70% of the molecules additionally swap their N-terminal helices to form the second BS-RNase conformer, called MxM (Figure 8, right panel) [55, 60]. This swapping event implies interesting functional consequences that will be discussed later.

#### **3.4. Self/cross-association through three-dimensional (3D) domain swapping**

A peculiar and interesting way to form protein dimers, oligomers or large multimers, either naturally or artificially, is the reciprocal exchange of small or large regions (peptides or entire domains) between monomeric subunits. Each monomer exploits a short flexible hinge loop in its sequence to direct a definite domain (or more than one) into the corresponding partner subunit, which reciprocally swaps an identical domain into the former (Figure 7). This mechanism was named three-dimensional domain swapping (3D-DS) by Eisenberg and co-workers when they discovered that diphtheria toxin (DT) can form a dimer by intertwining an entire domain [50]. Beyond dimers, this mechanism, known for DT as well as for other proteins, can also lead to larger oligomers [51], exploiting a small flexible loop (Figure 7) that can adopt different conformations under different environmental conditions.

The domain-swapped oligomer reconstitutes the native contacts present in the monomer (closed interface [52], green in Figure 7), except in the hinge loop, while a new interface (open interface [52], magenta in Figure 7) forms only in the oligomer and stabilizes it.

**Figure 7. Schematic view of the 3D domain swapping (3D-DS) mechanism**. The opening movement of the loop present in the 'starting' monomer (blue) allows the dimer to form by recreating the intramolecular interdomain interface of the monomer (closed interface, green [52]) as an intermolecular dimeric interface. This subsequently drives the formation of a new dimeric interface (open interface, magenta [52]), absent in the monomer. Through this mechanism a protein can form an active dimer while maintaining its functional units (F.U.). The same mechanism can lead to oligomers of higher stoichiometry (number of associated chains) than dimers. (Modified from [53]).

3D domain swapping (3D-DS) was hypothesized about fifty years ago [54] for RNase A and was then confirmed by several brilliant crystallographic results obtained in the 1990s, the first with BS-RNase [55]. In the last two decades, domain swapping has been found to involve more than sixty proteins [56], and about 300 domain-swapped structures have been solved in crystals or in solution, while an even higher number of models of swapped oligomeric proteins has been proposed. Furthermore, 3D-DS is almost uniquely related to protein self-association, although rare cases of hetero-association exist.

Eisenberg and co-workers, who gave the name 3D-DS to this mechanism of reciprocal exchange of domains between proteins, immediately underlined the double-faced nature, malignant and/or benign, of this protein-protein interaction, defining it as "entangling alliances between proteins" [50]. The term "entanglement" can be read as expressing a negative fate for a protein, which has no possibility of evading the interaction, although this 'jailing' cannot be known 'a priori'. This is the case for amyloidogenic proteins, which form, through this mechanism, fibrils involved in neurodegenerative diseases. Conversely, the term "alliances" clearly indicates the benign face of 3D-DS. In fact, a protein can self-associate to acquire novel activities absent in the monomer, to enhance pre-existing ones [57, 58], or to control them, for example allosterically, as occurs for swapped dimeric RNases [59, 60].

The mechanism is made possible by a hinge loop (red in Figure 7) located between two structured domains of the protein. The flexible loop changes its conformation depending on the environment and can direct the swappable domain(s) into the complementary subunit(s). The parts of a protein that are exchanged can be elements of secondary structure, such as α-helices or β-sheets (RNase A, BS-RNase), or entire domains (DT). The loop is usually composed of a few residues, which differ from one protein to another. Interestingly, a single point mutation can induce dramatic changes in loop flexibility and shift the protein towards or against self-association through 3D-DS [61-64]. Other important factors that govern 3D-DS events are protein concentration and intermolecular interactions [65, 66].

Some proteins can be constitutively domain-swapped, as is the case for the cyclin-dependent kinase member p13suc1, which is natively a mixture of a monomer and a domain-swapped dimer [67]. The monomeric and dimeric states are in equilibrium, and the domain-swapped dimer is favored by the presence of proline residues in the hinge loop [68].

Another very interesting natively domain-swapped protein is one of the two conformers of BS-RNase (Figure 8). This protein is the only natural dimer of the large pancreatic-type RNase superfamily, whose prototype is RNase A. BS-RNase is a mixture of two isoforms [60], both covalently dimeric because of two antiparallel disulfide bonds, formed between Cys-31 and Cys-32 of one subunit and Cys-32 and Cys-31, respectively, of the other [69]. About 30% of BS-RNase is dimeric only thanks to these disulfides and is called M=M [60, 70] (Figure 8, left panel), while about 70% of the molecules additionally and spontaneously swap their N-terminal helices to form the second conformer, called MxM (Figure 8, right panel) [55, 60]. This swapping event has interesting functional consequences that will be discussed later.

**Figure 8. Structure of the two dimeric conformers of BS-RNase:** left panel, the unswapped (M=M) conformer [70]; right, the swapped isoform (MxM) [55]. The two subunits, A and B, and their N-termini, are highlighted, as well as the disulfide bonds that covalently link the two subunits in both isoforms [60].

The BS-RNase hinge loop and open interface have been extensively studied [64]: while Pro 19 and Leu 28 are key residues for the stability of the MxM isoform [65], mutation of the entire 16-21 loop together with an R80S mutation dramatically inverted the swapping tendency of the protein [63].

Other proteins that are natively monomeric can be induced to form domain-swapped oligomers, either naturally or artificially. This is the case for the 13.7 kDa RNase A (Figure 9 A). This enzyme can form various non-covalent oligomers when it is lyophilized from 40-50% acetic acid solutions, and it was the first protein for which the 3D-DS mechanism was hypothesized, occurring through the swapping of its N-terminal ends [54]. In 1998 this idea was confirmed by the crystal structure of the N-terminal-swapped dimer of RNase A [71], now called the RNase A N-dimer, or ND [5] (Figure 9 B). Three years later, the crystal structure of another dimeric conformer of RNase A was solved, revealing that the protein can also swap its C-terminus to form a second dimeric conformer [72], since called the C-dimer, or CD [5] (Figure 9 C).

Several RNase A oligomers larger than dimers have also been found [6, 29, 73, 74]. Among them, three trimers (Figure 9 D-F) and six different tetrameric conformers have been extensively or partially characterized (Figure 9, panels G-N), as well as several larger multimers [6, 27, 29, 73-75].

These findings indicate that the fold of RNase A is highly versatile, despite the well-known overall stability of the protein. Its dimers are not exclusively artificial, given that traces of ND are present in a native mixture [77, 78] and that CD has been detected to form and to be subsequently degraded during protein expression in cells [79]. The capability of this enzyme to swap both termini greatly increases the number of structures it can form: in fact, the linear or quasi-linear, but not cyclically closed, oligomeric structures shown in Figure 9 (panels D and G-M) can form thanks to the simultaneous swapping of N- and C-termini [6, 27, 28, 75, 76].

**Figure 9. RNase A 3D domain-swapped oligomers. A**, native RNase A monomer; **B**, N-dimer, ND [71]; **C**, C-dimer, CD [72]; **D**, N+C-swapped trimeric model [27, 75]; **E**, cyclic C-swapped-only trimer [75]; **F**, cyclic C-swapped-only trimeric model [75]; **G-N**, tetramers, all N+C-swapped [6, 27, 28, 76], except the cyclic C-swapped-only model of panel **N** [76]. The sizes of the oligomers shown tentatively represent the relative proportions obtainable after lyophilization from 40% acetic acid solutions. RNase A also forms pentamers, hexamers [6, 29, 73], and larger multimers, up to tetradecamers [6, 74], not shown here.

The capability to swap multiple domains greatly increases the 'swapping capacity' (SC), defined as the upper limit of subunits with which a protein may interact [80]. In fact, if a protein displays N swappable domains, SC is defined as $N^2 + N$ [80]. Thus, if N = 1 then SC = 2 (1 + 1), while if N = 2 SC rises to 6 (4 + 2), and so on. RNase A was thought to be unique in its capability to swap more than one domain, but BS-RNase [81], cyanovirin-N and the mammalian DUF59-Fam96a protein have recently displayed similar multiple-DS behavior [82, 83]. In particular, the natively N-swapped dimeric BS-RNase, known since 1969 to be able to self-associate [57], was found to form both N- and C-swapped tetramers and multimers [81, 84]. In addition, it has recently been reported that a monomerized BS-RNase [85] can be induced to form a C-swapped dimer similar to that of RNase A [86].
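The arithmetic behind the swapping capacity can be checked with a few lines of Python. This is only an illustrative sketch of the formula SC = N² + N quoted from [80]: the function name `swapping_capacity` is a hypothetical helper introduced here, and the value for N = 3 is a plain extrapolation of the formula, not an experimentally documented case.

```python
def swapping_capacity(n_domains: int) -> int:
    """Return SC = N^2 + N for a protein with N swappable domains.

    SC is the upper limit of subunits with which the protein may interact,
    as defined in the text; the helper name itself is hypothetical.
    """
    if n_domains < 0:
        raise ValueError("number of swappable domains must be non-negative")
    return n_domains ** 2 + n_domains


if __name__ == "__main__":
    # N = 1 and N = 2 are the cases worked out in the text;
    # N = 3 only extrapolates the same formula.
    for n in (1, 2, 3):
        print(f"N = {n}: SC = {swapping_capacity(n)}")
```

For N = 1 and N = 2 the function reproduces the values given above (2 and 6), which shows how quickly the number of possible partners grows once a second swappable domain is available, as in RNase A.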

Furthermore, RNase A was first thought to be induced to oligomerize through an initial, only partial denaturation [72] by treatment with 40% aqueous acetic acid [54], while it was later shown that the protein undergoes almost complete denaturation (except for its four disulfides) under acidic conditions [87]. When lyophilization is followed by re-dissolution of the powder in 'benign' buffers [87], i.e., buffers that slow the regression towards the monomer, as phosphate does for RNase oligomers, about 70% of the initial protein regains its native monomeric form and the remaining 30% forms various domain-swapped oligomers. The debate between theories based on a partial [88] or total [87] denaturation pathway that a generic protein must follow to oligomerize through 3D-DS is still open, and what has been established for RNase A [87] cannot be assumed to hold for all proteins.

RNase A and BS-RNase can also oligomerize through the same 3D-DS mechanism if very highly concentrated water-alcohol solutions of the enzyme are heated up to 60 °C and then stabilized with phosphate buffers, avoiding the lyophilization step [78, 81]. In this way, the absolute and relative amounts of the various N- or C-swapped oligomers change depending on the environmental conditions. Point mutations that modulate the hydrophilic/hydrophobic nature of the N-/C-swappable domains of RNase A confirmed this tendency [89].

Human RNases such as human pancreatic RNase (hp-RNase) can also dimerize spontaneously through 3D-DS when certain point mutations are introduced [90-92], although no oligomers larger than dimers have been detected among its aggregation products.

The 3D-DS mechanism can allow a protein to go beyond the dimeric state towards larger oligomerization, multimerization and possibly fibrillization, through multiple swapping and/or other ways of stabilizing the aggregates. Examples are the formation of disulfide bonds, or the repetition of the same type of swapping several times, which forms open-ended structures through a propagative or runaway 3D-DS [93, 94] (Figure 10).

**Figure 10. DS in non-cyclic oligomers larger than dimers** (modified from [94]). The first two models display open-ended edge subunits. The increase in stability of an open-ended structure is proportional to the number of subunits present between the two edges. The three models shown are an evolution of those reported in [93].

Beyond RNases, several other proteins involved in important biological processes have been shown to form domain-swapped oligomers. One domain-swapping-prone protein that in recent years has been found to be structurally highly versatile is cyanovirin-N, an 11 kDa protein that inhibits HIV [95]. It can be active either as a monomer or as a metastable domain-swapped dimer [96]. Interestingly, some mutants that become active only as domain-swapped dimers were recently found to form two different, relatively stable 3D-DS dimeric conformers, one 3D-DS trimer and two 3D-DS tetramers [82].

Other important examples of proteins able to form domain-swapped structures are the following: i) Cytochrome C, which has been known to polymerize since 1962 [97], but was only recently shown to form these inactive supramolecular structures via a runaway 3D-DS of its C-termini; in particular, domain-swapped dimers, trimers, tetramers and polymers up to ~40-mers have been characterized [98]. ii) BCL-XL, an anti-apoptotic protein belonging to the BCL-2 family, which can form active C-terminal-swapped dimers when highly concentrated [99] or, alternatively, when heated up to 50 °C [100]. iii) Cadherins, which are cell adhesion proteins that dimerize through β-strand swapping to mediate adhesion itself [101]. Incidentally, several cell receptor proteins are known to dimerize to become active (see above), but less is known about the mechanism responsible for their dimerization; possibly, some of them undergo 3D-DS. iv) Finally, histones are also known to fold through 3D-DS in their evolutionary pathway [102].

3D-DS can also be favored, or hindered, by point or multiple mutations, as is well known for hp-RNase, or by the conditions under which crystallization occurs. This is true for barnase, a 12 kDa RNase from *Bacillus amyloliquefaciens*, which forms a DS cyclic trimer under relatively mild conditions [103], and for the DS dimer formed in crystals by Grb2-SH2 domains [104].

Last, but not least, some amyloidogenic proteins form fibrils through the initial formation of domain-swapped dimers and oligomers, which are the starting point of their massive self-association [80]. The possibility that 3D-DS overlaps with the mechanism of amyloid fibril formation was first hypothesized by Eisenberg and colleagues, who explained it as a compatibility between 3D-DS and the polyglutamine (polyQ) cross-β steric zippers [72]. This idea was supported by the structural similarity between the Asn-based open interface of the RNase A CD [72] and the fibrillogenic poly-Q expansions [105], which are structured as cross-β spines [13]. The theory was further strengthened by the discovery that both prion protein and cystatin-C, two amyloidogenic, cross-β-spine-prone proteins, form 3D-DS dimers [106, 107] (Figure 11, panels A and B). Later, abundant experimental evidence confirmed this hypothesis. In fact, after the discovery that the prion protein (PrP), which is associated with the lethal neurodegenerative Creutzfeldt-Jakob disease (CJD) and with scrapie, dimerizes through 3D-DS [106], the dimerization event was shown to be the rate-limiting step in the conversion towards the infectious fibrillogenic form of PrP [108]. Conversion to fibrils is then promoted by an unlocking of the globular domain combined with a redox process, both triggered by 3D-DS [109, 110]. These events lead to domain-swapped oligomers and multimers stabilized by newly formed intermolecular disulfide bonds [110], but do not affect the overall tertiary structure of the main globular domain of PrP [109] (Figure 11 A).

Furthermore, cystatins, a class of proteins that includes stefins and that inhibits cysteine proteases, can also dimerize through 3D-DS [113]. In particular, the L68Q mutant of human cystatin-C (hCC) can dimerize through 3D-DS [107], inducing severe, massive amyloidosis in brain arteries and lethal cerebral hemorrhages. The 13.3 kDa hCC forms fibers through a preliminary domain-swapped dimer+dimer tetrameric rearrangement [114] and a subsequent propagation of 3D-DS [111]. The finding that prevention of 3D-DS inhibits cystatin-C dimerization and multimerization [115], together with studies on the hinge loop governing the 3D-DS event [116], confirms that 3D-DS plays a key role in cystatin-C fibrillogenesis [117] (Figure 11 B).

Another important amyloidogenic protein able to self-associate through 3D-DS is β2-microglobulin (β2-m), the 10.9 kDa light chain of the type-I histocompatibility complex, which can seed as amyloid fibrils during long-term hemodialysis treatments. Like PrP, β2-m dimerizes through 3D-DS [118] and forms propagated domain-swapped amyloid fibrils stabilized by disulfide

Again, other amyloidogenic proteins dimerize and massively aggregate through 3D-DS. These are: i) the immunoglobulin G-binding B1 domain, which forms 3D-DS conformation‐ ally different dimers [119] and tetramers [120] induced by core-domain mutations before forming fibrils [121]; ii) T7-endonuclease I which forms runaway domain-swapped fibrils stabilized by core-domain intermolecular disulfides [122]. iii) Cell cycle protein Cks1, which fibrillize through the preliminary formation of a domain-swapped dimer [123]. Converse‐ ly, for another important amyloidogenic protein, TTR, 3D-DS is to date only hypothe‐ sized [124], while the direct stacking model interaction between subunits [125] is the one

Before leaving the "3D-DS toward amyloidosis" topic, it has to be mentioned that wt RNase A, despite its high structural versatility [5], and its high SC, and although displaying some amyloid-prone short sequences [126, 127], is not able to form amyloid or amyloid-like (i.e., *in vitro*) fibers. Its core domain, in fact, is stabilized by four disulfide bonds, and 'self-chaperones' the whole protein from falling towards fibrillization [128]. Incidentally, the only pancreatictype RNase known to date to form fibrils is the eosinophil cationic protein ECP [129]. Again, Eisenberg and co-workers showed that, contrary to wt RNase A, some mutants, such as poly Q- and poly G-RNase A, spontaneously form "native-like" amyloid fibrils through C-terminal and N-terminal 3D-DS, respectively [128, 130]. The term "native-like" is referred to the evidence that the core-domains of each protomer forming the fiber remain conserved and natively structured in it [128, 130]. These findings confirm once again that 3D-DS and cross-

Finally, the 3D-DS protein dimers or oligomers mentioned above are homo-polymers, and almost all domain-swapped proteins nowadays known are indeed homo-oligomers. Domainswapped hetero-oligomers are extremely rare, but one to be mentioned is the IX/X-bp antico‐ agulant complex. This is a domain-swapped dimer forming between two homologous subunits which cross-associate through an intermolecular disulfide bond, but also intertwin‐ ing a flexible loop located in the central part of each subunit [131]. Contrary to what it could be expected, the dimeric hybrid obtained by associating RNase A with a monomerized BS-RNase [85] did not show a domain-swapped nature. It consists, instead, of two different

conformers associated through hydrophobic and electrostatic interactions [132].

bonds [112] (Figure 11 C).

still nowadays accepted.

β-zipper spines are events that can overlap [72].

**Figure 11.** DS amyloid fibrils of **(A)** human prion protein (hPrP), **(B)** human cystatin-C (hCC), and **(C)** β2-microglobulin (β2-m). In all cases, fibrillogenesis is promoted by the preliminary formation of a domain-swapped dimer which evolves towards fibrils through a redox pathway. All three fibers display the features of a generic amyloid fiber shown in panel **D**. The figure summarizes pictures reported in [106, 107, 110-112].

Another important amyloidogenic protein able to self-associate through 3D-DS is β2-microglobulin (β2-m), the 10.9 kDa light chain of the type-I histocompatibility complex, which can deposit as amyloid fibrils during long-term hemodialysis treatments. Like PrP, β2-m dimerizes through 3D-DS [118] and forms propagated domain-swapped amyloid fibrils stabilized by disulfide bonds [112] (Figure 11 C).

Other amyloidogenic proteins also dimerize and massively aggregate through 3D-DS. These are: i) the immunoglobulin G-binding B1 domain, which forms conformationally different 3D-DS dimers [119] and tetramers [120], induced by core-domain mutations, before forming fibrils [121]; ii) T7 endonuclease I, which forms runaway domain-swapped fibrils stabilized by core-domain intermolecular disulfides [122]; iii) the cell cycle protein Cks1, which fibrillizes through the preliminary formation of a domain-swapped dimer [123]. Conversely, for another important amyloidogenic protein, TTR, 3D-DS has to date only been hypothesized [124], while the model of direct stacking interactions between subunits [125] is the one still accepted.

Before leaving the "3D-DS toward amyloidosis" topic, it has to be mentioned that wt RNase A, despite its high structural versatility [5] and its high SC, and although it displays some amyloid-prone short sequences [126, 127], is not able to form amyloid or amyloid-like (i.e., *in vitro*) fibers. Its core domain, in fact, is stabilized by four disulfide bonds and 'self-chaperones' the whole protein, preventing it from falling towards fibrillization [128]. Incidentally, the only pancreatic-type RNase known to date to form fibrils is the eosinophil cationic protein ECP [129]. Eisenberg and co-workers also showed that, contrary to wt RNase A, some mutants, such as poly Q- and poly G-RNase A, spontaneously form "native-like" amyloid fibrils through C-terminal and N-terminal 3D-DS, respectively [128, 130]. The term "native-like" refers to the evidence that the core domain of each protomer forming the fiber remains conserved and natively structured in it [128, 130]. These findings confirm once again that 3D-DS and cross-β-zipper spines are events that can overlap [72].

Finally, the 3D-DS protein dimers or oligomers mentioned above are homo-polymers, and almost all domain-swapped proteins known to date are indeed homo-oligomers. Domain-swapped hetero-oligomers are extremely rare, but one worth mentioning is the IX/X-bp anticoagulant complex. This is a domain-swapped dimer formed between two homologous subunits which cross-associate not only through an intermolecular disulfide bond, but also by intertwining a flexible loop located in the central part of each subunit [131]. Contrary to what could be expected, the dimeric hybrid obtained by associating RNase A with a monomerized BS-RNase [85] did not show a domain-swapped nature. It consists, instead, of two different conformers associated through hydrophobic and electrostatic interactions [132].

#### **4. Stability of the protein oligomers**

Protein oligomers are supramolecular structures which are sometimes 'chosen' by nature *ab initio*, or often built up as a response to natural or non-natural events. In both cases, oligomers can follow different fates. They can be highly stable, mainly when formed by irreversible phenomena, or can represent metastable or even transient events, thus undergoing fast or slow dissociation. Therefore, there are clearly several cases to be analyzed.

First of all, covalently linked oligomers are obviously the most stable supramolecular protein structures, unless they interact with proteases or are held together by disulfide bonds. Disulfides can be affected by slight redox changes of the environment and reduced to free cysteines or cysteine-like adducts, thus unchaining the two (or more) covalently linked protein molecules. This can be the case for hetero-dimers formed by immunotoxins or other artificial conjugates, which can be released in the cell by the reducing power of the cytosol [21]. A similar but natural event concerns BS-RNase, whose two intermolecular disulfide bridges can be reduced in the cytosol: the M=M isoform yields two monomers, while a non-covalent dimer (NCD) [133] survives from MxM [55] thanks to the 3D-DS of its N-termini.

The majority of protein oligomers form through weak non-covalent associations, which can often lead to metastable dimers or oligomers. Their lability is essentially related to the nature of the interaction(s) between the subunits and to the extent of the interface area. H-bonds are weaker than electrostatic interactions, but they can be crucial in anchoring a domain into a specific orientation that can be further stabilized by hydrophobic or electrostatic interactions. Thus, the balance of enthalpy and entropy contributions is decisive in determining whether a protein oligomer survives or not, taking into account that entropy opposes protein-protein association, a phenomenon which is instead favored by a high concentration of the protomers [37, 50, 77], as was reported above in equations 1 and 2.
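The enthalpy-entropy balance described above can be written explicitly. The relation below is the standard thermodynamic expression for an association equilibrium and is not taken from equations 1 and 2; the symbols $\Delta H^{\circ}_{\mathrm{assoc}}$, $\Delta S^{\circ}_{\mathrm{assoc}}$ and the standard concentration $c^{\circ}$ are notation assumed here for illustration only.

$$
\Delta G^{\circ}_{\mathrm{assoc}} = \Delta H^{\circ}_{\mathrm{assoc}} - T\,\Delta S^{\circ}_{\mathrm{assoc}} = RT \ln\frac{K_D}{c^{\circ}}
$$

In this picture, an unfavorable (negative) $\Delta S^{\circ}_{\mathrm{assoc}}$ makes the $-T\,\Delta S^{\circ}_{\mathrm{assoc}}$ term positive, so the oligomer persists only if the interface contacts supply a sufficiently negative $\Delta H^{\circ}_{\mathrm{assoc}}$, which corresponds to a low $K_D$.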

For example, human AGT (see Figure 6 a) is active only in the dimeric state, while the enzyme loses its activity if dimerization is hindered. This occurs because, without dimerization, the two composite active sites are incomplete, although a PLP coenzyme is present in each subunit [41]. It is now well known that mutations destabilizing the large interface located between the two subunits can dramatically lower the activity of AGT and possibly drive the enzyme towards monomerization [35]. This latter event induces an unwanted and uncontrolled aggregation which traps monomers, blocking the natural dimerization of AGT [35]. Pathologic effects due to the loss of dimeric, or in general oligomeric, native structures are a general feature of PLP-dependent enzymes.

In general, mutations can affect the overall conformation of protein dimers and/or oligomers and induce monomerization or, vice versa, uncontrolled aggregation and even fibrillization. This is the case for TTR (see above), whose native homotetramers are destabilized by point mutations such as V30M or L55P. The tetrameric assembly in these variants is weakened and the protein easily monomerizes, then undergoing fibrillization through dimeric and octameric annular intermediates [10]. Another protein whose oligomeric state is dramatically affected by point mutations is hemoglobin, which extensively self-associates in its sickle-cell variant (HbS).

Formation of protein oligomers can be induced in some cases by point mutations, as for hp-RNase [91], or as a consequence of changes in environmental conditions (pH, temperature, protein concentration). This is true for the various domain-swapped dimers and oligomers of RNase A and BS-RNase. Indeed, several studies have been performed on these two pancreatic-type enzymes, which display 82% identity [53]. Thus, the 23 residues out of 124 that differ between them have been considered as key stabilizing or destabilizing determinants. In particular, in both variants several hinge-loop residues and/or others belonging to the swappable N-terminal or C-terminal domains, or interacting with them, have been mutated [64, 65, 134, 135]. Many of them have been found to be key residues in stabilizing the domain-swapped oligomers, while others were found to promote oligomer dissociation towards the native proteins, i.e., monomeric RNase A or dimeric BS-RNase [134].

Other factors which affect the stability of the oligomers are pH, temperature, and the ionic strength of the medium. It is easily understandable that acidic or basic pH values destabilize oligomeric assemblies, as well as native monomeric proteins. Temperature plays a very important role in stabilizing or destabilizing non-covalent protein oligomers. In fact, subunit motions increase with temperature; thus, a metastable adduct can be forced to dissociate quickly by raising the temperature. Furthermore, heat allows a protein to access its denatured state [136], which is a destabilizing event per se. However, a very recent study reveals that low temperatures, beyond affecting the folding of native monomers [137], can be useful to study the denaturation and dissociation of dimers through NMR procedures [138].

All these effects, as well as the role played by ionic strength, are qualitatively and quantitatively different from one case to another, because each oligomeric complex can display different shapes, dynamics, and intersubunit surfaces.

In addition, dynamic motions can differently affect the stability of oligomeric conformers belonging to the same protein. This is the case for RNase A, whose dimers display different flexibility, higher for CD than for ND [139, 140]. The dissociation of multimeric assemblies can follow different pathways, thus producing different smaller products. This is true, for example, for porphobilinogen synthase (PBGS), an octamer which can dissociate to tetramers and dimers either symmetrically, through a consecutive loss of dimeric adducts, or asymmetrically, through the detachment of one subunit at a time [141].

The concentration of a protein is crucial in determining whether it oligomerizes, dissociates, or even evolves toward larger multimers and fibrils. These phenomena are ruled by the KD values associated with each oligomerization process (see equations 1 and 2 and Figure 4); thus, dilution can be a means by which oligomers and multimers are destabilized and dissociated [93]. Consequently, macromolecular crowding in general [142] can deeply affect the propensity of a protein to oligomerize by influencing the oligomerization yields [143] and/or the dissociation pathways and kinetics [74].
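The effect of dilution can be illustrated numerically. The sketch below assumes the simplest possible case, a single monomer-dimer equilibrium 2 M ⇌ D with KD = [M]²/[D], which may not match the exact form of equations 1 and 2; the function name `dimer_fraction` and the concentration values are hypothetical and chosen only for illustration.

```python
import math


def dimer_fraction(total_conc: float, kd: float) -> float:
    """Fraction of protomers found in dimers for the equilibrium 2 M <-> D.

    Assumptions (illustrative, not taken from the chapter's equations):
      KD = [M]^2 / [D]            dissociation constant
      Ct = [M] + 2 [D]            mass balance in protomer units
    total_conc and kd must share the same concentration unit.
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("total_conc and kd must be positive")
    # Substituting [D] = [M]^2 / KD into the mass balance gives
    # 2 [M]^2 + KD [M] - KD Ct = 0; take the positive root.
    monomer = (-kd + math.sqrt(kd * kd + 8.0 * kd * total_conc)) / 4.0
    return 1.0 - monomer / total_conc


if __name__ == "__main__":
    kd = 1.0  # arbitrary units; hypothetical value
    for ct in (100.0, 10.0, 1.0, 0.1, 0.01):
        print(f"Ct = {ct:>6} x KD -> dimer fraction = {dimer_fraction(ct, kd):.3f}")
```

Under these assumptions, half of the protomers are dimeric when the total protomer concentration equals KD, and the dimer fraction falls steadily as the sample is diluted below that value.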

Proteolysis, whether natural or induced, limited or massive, is also a way through which the activities of proteins are switched on or off, or otherwise tuned. Oligomerization can in turn affect proteolysis: a domain that in a native monomer is exposed to the action of a proteolytic agent can be partially or totally hidden by dimerization or oligomerization, so that proteolysis is slowed down or even blocked; conversely, a region which in the native monomer is well structured and therefore protected, or even hidden, can be destabilized, destructured and exposed after oligomerization, becoming susceptible to proteolysis. This is what happens with RNase A, whose ND is definitely more susceptible to proteolysis than the CD conformer [144].

Another interesting case is represented by the pore-forming toxin (PFT) families, which are most often produced as soluble monomers and are proteolytically cleaved by host proteases, leading to their oligomerization and pore formation. This occurs, for example, with aerolysin [145], with protective antigen (PA) from anthrax [146], and with the thiol-activated cytolysin (TACY) pore-forming family [147].

Finally, it has to be underlined that all the factors able to stabilize or destabilize protein oligomers, i.e., pH, temperature, ionic strength, protein concentration, molecular crowding, and mutations, are not independent of each other. This makes the scenario more complicated than expected and clearly indicates that an 'absolute stability' of a protein cannot be easily defined. This is true, for example, for RNase A domain-swapped oligomers (see Figure 9). Their high number of conformers and the possible interconversions between oligomers make the picture quite complicated [74], and only some data concerning the stabilities of the dimers have been reported to date [89, 148, 149]. In an apparent contradiction, RNase A-ND was sometimes reported to be more stable than CD, while under other environmental conditions the situation is the opposite. These data clearly indicate that different combinations of all the environmental conditions reported above are crucial to stabilize or destabilize different dimeric or oligomeric structures.

#### **5. Oligomeric proteins: functional vs aberrant interactions**

#### **5.1. Gain or loss of function(s) after protein oligomerization**

Natural oligomerization of proteins has been shaped by evolution to control their biological features, for instance enzyme activity. In some cases, proteins are inactive unless they dimerize or oligomerize because of the large hydrophobic surface the monomer exposes to the solvent, as reported above for PLP-enzymes. Oligomerization can be constitutive, as for PLP-enzymes, or induced by signal molecules, as for membrane proteins. Self- or hetero-association of proteins can also be artificial, but can in any case lead to new activities or block activities that in certain situations become unwanted, for example in the feedback control of some enzymes.

Dimerization and oligomerization can also increase or lower pre-existing activities. In this case the phenomenon can be considered as a fine controller and tuner of important activities. These gain- or loss-of-function events occur by exposing or hiding active surfaces or, for example, by inducing positive or negative allostery that the native monomer cannot provide. These events can be ruled not only by changing the oligomerization/polymerization state of the proteins, but also by conformational changes induced by ligands, as occurs with Hb (see Figure 5). Gain or loss of native functions induced by oligomerization can be benign or malignant, and nature itself can drive either towards physiologic or towards pathological events, as happens, respectively, with actin [150] or with the sickle-cell anaemia associated with a point mutation in hemoglobin (HbS). Other important pathological implications, i.e., oligomers evolving towards fibrils, will be discussed later. A benign case that deserves to be mentioned is rhodopsin, the pigment involved in the phototransduction events that are crucial for vision, which physiologically organizes itself in a supramolecular ensemble to be active [151].

An interesting case in which protein oligomerization induces a gain of function concerns RNase A. The native monomeric enzyme only degrades single-stranded (ss) RNA and is not cytotoxic, while its artificial dimers and oligomers, formed either through non-covalent 3D-DS or through covalent bonds, also become active against double-stranded (ds) RNA [29, 57, 73, 152]. Moreover, they can become selectively cytotoxic towards cancer cells both *in vitro* and *in vivo* [153-156], although more recent results indicate that cytotoxicity can also depend on the type of cell line studied [32]. This acquired cytotoxic power can be considered benign, given the selectivity towards malignant cells, and can be mainly ascribed to the ability of oligomeric RNases to evade the ribonuclease inhibitor (RI), which tightly traps monomeric RNases [157]. This is also the main reason why only MxM BS-RNase (see Figure 8) is cytotoxic: in fact, the reducing environment of the cytosol allows only the MxM domain-swapped isoform to survive as a non-covalent dimer (NCD) and to evade RI, while M=M becomes a monomeric, RI-susceptible derivative [158]. Another example of cytotoxic protein oligomers is represented by the ones formed by p13suc1 [159]. The native cell-cycle regulatory protein is a monomer/DS-dimer mixture [160] that can form large native-like cytotoxic aggregates through 3D-DS, and the structural determinants governing its swapping mechanism have been extensively studied, as reported above [62, 68]. Thus, the artificial induction, quenching, or control of the oligomerization event(s) can be useful to avoid, induce, or tune several biological properties of proteins, such as enzymatic activity or the onset of cytotoxicity.

#### **5.2. Protein oligomerization towards fibrillization and/or amyloidosis**

Protein self- or cross-association can be naturally or artificially controlled to the level of oligomers or multimers, but it can sometimes proceed to uncontrolled massive aggregation, often resulting in fibers, as already reported in the domain swapping section. These supramolecular structures can be benign, as in muscle tissues (actin [150]), or, very often, harmful. The latter is the case for the insoluble fibrous polymers of sickle-cell HbS, and for amyloid or amyloid-like fibrils, which are often, although not always [2, 15], associated with deleterious neurodegenerative diseases [37]. Proteins can become prone to fibrillization because of ageing, or after changes in environmental conditions, such as crowding, or pH or temperature shocks. In addition, point mutations can induce, and often speed up, fibrillization, as for TTR V30M or L55P [161, 162], for the human prion protein [163], or for the homo-tetrameric p53 tumor suppressor protein [164, 165]. Several lethal amyloidoses, for example those linked to TTR or β2-m, or to other toxic fibrillogenic proteins such as prions, display a premature onset when associated with familial pathogenic mutations, or can cross species transmission barriers [166]. Cancer, too, has recently been considered a possible prion-like disease, owing to the fibrillogenic behavior of some p53 mutants [167].




In this complex scenario, it is known [163], and increasingly accepted, that the first oligomeric/multimeric species preceding the formation of protofibrils and fibrils are the toxic agents responsible for the onset of the associated pathology(ies) [2]. Thus, much work is presently devoted to uncovering the structural and functional properties of these 'toxic oligomers', a very difficult endeavor indeed, because these species show a great tendency to fibrillize quickly.

Over the last five years, it has become clear that oligomers produced by the same protein can be toxic or non-toxic depending on the way they are produced [2, 7, 15]. This was found, for example, with the spherical oligomers of the N-terminal domain of *Escherichia coli* HypF (HypF-N) [15], and also with other amyloidogenic proteins. Several different large supramolecular structures can be detected with new techniques, such as solid-state NMR, Cryo-TEM, High-Resolution Atomic-Force Spectroscopy, and Molecular Modeling. For example, TTR has been found to fibrillize through the formation of annular oligomers derived from monomers that had, in turn, detached from the native tetramer [10].

Several other amyloidogenic proteins have been extensively studied. Among them, the number of those that follow the 3D-DS mechanism is continuously increasing [80]; examples are hPrP, hCC, and β2-m, whose properties have already been discussed. In recent years, several proteins initially not considered to follow 3D-DS were instead discovered to undergo this mechanism. Nevertheless, the mechanism by which an oligomer or fiber forms depends essentially on the symmetry of the interfacial association, and 3D-DS is not mandatory to meet this requirement. Thus, the formation of the cross-β-spine 'intima' of fibrils could follow an end-to-end stacking mechanism, the same one followed by non-amyloidogenic proteins, such as hemoglobin-S [164], or by non-harmful proteins, such as actin [170] or tubulin [171]. Although amyloid fibers themselves are not oligomers, several oligomeric precursors of amyloidogenic proteins are continuously studied, despite their transient nature. Among them, human lysozyme, *Sulfolobus solfataricus* acylphosphatase (AcP), and human superoxide dismutase-1 (SOD1), the latter associated with the devastating disease amyotrophic lateral sclerosis, have been deeply investigated for their propensity to form amyloid fibers through amyloidogenic oligomers [167]. Further detailed studies, which are beyond the scope of this chapter and focus on the nature of amyloidogenic oligomers and their differences from non-amyloidogenic ones, have recently been reviewed [2]. Interesting findings concerning the polymorphism of oligomers and the structural rearrangements they can undergo (Figure 12) to follow either a harmful fibrillogenic pathway or a harmless non-fibrillogenic one [7] can be found in that review [2].

**Figure 12.** Evolution of partially or totally unfolded protein monomers towards ordered fibrils through oligomeric intermediates displaying different sizes and conformations. (Modified from [2] and [88]).

#### **6. Oligomeric proteins and industry**


Besides artificial chemistry products, industry has often used natural or artificially modified bio-products to produce bio-materials or bio-fibers. In this context, nano-particles and nanomaterials certainly represent a new fascinating frontier to be developed.

Peptides and proteins can be driven towards controlled oligomerization, polymerization and/or fibrillization to form products displaying useful physico-chemical and mechanical features that can be adopted for new or renewed industrial applications. Considerable effort has been devoted to the engineering of short peptides and small biomolecules [168-170], while much less is known about the industrial application of protein oligomers, multimers and/or fibrils. The most common materials formed from peptides and proteins are hydrogels, which can be applied in tissue engineering and in drug delivery [169]. These materials are typically formed by hydrated cross-linked fibers that somewhat resemble the agarose or polyacrylamide gels extensively used in biochemical laboratories.

The 'intima' of these hydrogel structures and their morphology define their size, biocompatibility, and mechanical properties, whether elastic or rigid. Only a few hints concerning the interesting 'industrial' applications of protein oligomers will be given here.

For example, the 3D-DS mechanism has been reported to be applicable in material design [94]. Indeed, domain-swapped oligomeric peptides have been produced to obtain hydrogels [171], and proteins can also be artificially designed to undergo 3D-DS and form oligomers that could be useful to produce biomaterials [172]. Beyond small oligomers, the ability of some proteins to become fibrillogenic through propagative or runaway domain swapping [80, 94] could, or should, be exploited to engineer variants that form harmless 3D-DS fibers with special morphological and mechanical properties. These supramolecular structures could also become materials devoid of direct biological applications, but with industrial and ecological relevance. Indeed, proteins that, for example, combine 3D-DS with a covalent stabilization of their fibrillar products through the formation of novel disulfides (like β2-m and recombinant PrP) [110, 112] should be useful to obtain reversible products that could be easily disassembled and recycled under reducing conditions and, thus, without negative ecological consequences.

Many examples of 'benign' (non-amyloidogenic) protein fibers could become valuable for industrial applications. One example is the trimeric hexon protein, displaying a novel triple β-spiral fibrous fold with implications for the design of a new class of artificial, silk-like fibrous materials [173].

The potential industrial applications of protein oligomers, multimers, and fibrils are certainly far from being completely explored, and what we have reported here about 3D-DS applications confirms that further in-depth investigations are warranted.

#### **7. Conclusions**

All the notions reported in this chapter indicate that the complexity of the protein oligomerization topic grows every day. In particular, the increasing number of studies focused on natural protein oligomerization and the improved quality of investigation techniques have greatly increased the complexity of analyzing the structural and functional features of protein oligomers. Furthermore, the refinement of the chemical strategies used to obtain cross-linked artificial oligomers has allowed industry and laboratories to obtain less heterogeneous products that are, in addition, only slightly modified with respect to the native monomeric protein(s).

Again, the discovery that some proteins can naturally oligomerize by combining covalent linkages with weak interactions has made the scenario even more complicated. This is the case, for example, of proteins that form domain-swapped fibrils additionally stabilized by newly formed disulfides. The same is true for membrane proteins that associate, and sometimes covalently stabilize their interaction, in response to effectors that can act as activators or can quench cell signals.

Thus, the aim of this chapter is to focus the reader's attention on the principal features of protein oligomerization. We kept separate, when possible, the covalently linked oligomers from the products of non-covalent protein self-association, as well as the natural events, constitutive or induced, from the artificial ones. When we were not able to separate these aspects well, we tried to give as clear a picture as possible. This difficulty itself underlines how much interest the topic of protein oligomerization is attracting and how many further studies it deserves.

#### **Author details**

Giovanni Gotte\* and Massimo Libonati

\*Address all correspondence to: giovanni.gotte@univr.it

Department of Life and Reproduction Sciences, Biological Chemistry Section, University of Verona, Verona, Italy

#### **References**

[1] Levy ED, Teichmann S. Structural, evolutionary, and assembly principles of protein oligomerization. Prog Mol Biol Transl Sci. 2013;117:25-51.

[2] Bemporad F, Chiti F. Protein misfolded oligomers: experimental approaches, mechanism of formation, and structure-toxicity relationships. Chem Biol. 2012;19:315-27.

[3] Haass C, Selkoe DJ. Soluble protein oligomers in neurodegeneration: lessons from the Alzheimer's amyloid beta-peptide. Nat Rev Mol Cell Biol. 2007;8:101-12.

[4] Sakono M, Zako T. Amyloid oligomers: formation and toxicity of Abeta oligomers. FEBS J. 2010;277:1348-58.

[5] Libonati M, Gotte G. Oligomerization of bovine ribonuclease A: structural and functional features of its multimers. The Biochemical Journal. 2004;380:311-27.

[6] Cozza G, Moro S, Gotte G. Elucidation of the ribonuclease A aggregation process mediated by 3D domain swapping: A computational approach reveals possible new multimeric structures. Biopolymers. 2008;89:26-39.

[7] Glabe CG. Structural classification of toxic amyloid oligomers. The Journal of Biological Chemistry. 2008;283:29639-43.

[8] Fandrich M. Oligomeric intermediates in amyloid formation: structure determination and mechanisms of toxicity. Journal of Molecular Biology. 2012;421:427-40.

[9] Barghorn S, Nimmrich V, Striebinger A, Krantz C, Keller P, Janson B, et al. Globular amyloid beta-peptide oligomer - a homogenous and stable neuropathological protein in Alzheimer's disease. J Neurochem. 2005;95:834-47.

[10] Pires RH, Karsai A, Saraiva MJ, Damas AM, Kellermayer MS. Distinct annular oligomers captured along the assembly and disassembly pathways of transthyretin amyloid protofibrils. PLoS One. 2012;7:e44992.

[11] Lashuel HA, Hartley D, Petre BM, Walz T, Lansbury PT, Jr. Neurodegenerative disease: amyloid pores from pathogenic mutations. Nature. 2002;418:291.

[12] Fandrich M. On the structural definition of amyloid fibrils and other polypeptide aggregates. Cell Mol Life Sci. 2007;64:2066-78.

[13] Nelson R, Sawaya MR, Balbirnie M, Madsen AO, Riekel C, Grothe R, et al. Structure of the cross-beta spine of amyloid-like fibrils. Nature. 2005;435:773-8.

[14] Sipe JD, Cohen AS. Review: history of the amyloid fibril. Journal of Structural Biology. 2000;130:88-98.

[15] Campioni S, Mannini B, Zampagni M, Pensalfini A, Parrini C, Evangelisti E, et al. A causative link between the structure of aberrant protein oligomers and their toxicity. Nature Chemical Biology. 2010;6:140-7.

[16] D'Alessio G, Parente A, Guida C, Leone E. Dimeric structure of seminal ribonuclease. FEBS Letters. 1972;27:285-8.

[17] Urry DW. What is elastin; what is not. Ultrastruct Pathol. 1983;4:227-51.

[18] Hartman FC, Wold F. Cross-linking of bovine pancreatic ribonuclease A with dimethyl adipimidate. Biochemistry. 1967;6:2439-48.

[19] Wang D, Wilson G, Moore S. Preparation of cross-linked dimers of pancreatic ribonuclease. Biochemistry. 1976;15:660-5.

[20] Green NS, Reisler E, Houk KN. Quantitative evaluation of the lengths of homobifunctional protein cross-linking reagents used as molecular rulers. Protein Science: a publication of the Protein Society. 2001;10:1293-304.

[21] Fracasso G, Bellisola G, Castelletti D, Tridente G, Colombatti M. Immunotoxins and other conjugates: preparation and general characteristics. Mini Rev Med Chem. 2004;4:545-62.

[22] Ayers NA, Nadeau OW, Read MW, Ray P, Carlson GM. Effector-sensitive cross-linking of phosphorylase b kinase by the novel cross-linker 4-phenyl-1,2,4-triazoline-3,5-dione. The Biochemical Journal. 1998;331 (Pt 1):137-41.

[23] Sheehan JC, Hlavka JJ. The cross-linking of gelatin using a water-soluble carbodiimide. Journal of the American Chemical Society. 1957;79.

[24] Lopez-Alonso JP, Diez-Garcia F, Font J, Ribo M, Vilanova M, Scholtz JM, et al. Carbodiimide EDC Induces Cross-Links That Stabilize RNase A C-Dimer against Dissociation: EDC Adducts Can Affect Protein Net Charge, Conformation, and Activity. Bioconjug Chem. 2009.

[25] Ciglic MI, Jackson PJ, Raillard SA, Haugg M, Jermann TM, Opitz JG, et al. Origin of dimeric structure in the ribonuclease superfamily. Biochemistry. 1998;37:4008-22.

[26] Lin SH, Konishi Y, Denton ME, Scheraga HA. Influence of an extrinsic cross-link on the folding pathway of ribonuclease A. Conformational and thermodynamic analysis of cross-linked (lysine7-lysine41)-Ribonuclease A. Biochemistry. 1984;23:5504-12.

[27] Nenci A, Gotte G, Bertoldi M, Libonati M. Structural properties of trimers and tetramers of ribonuclease A. Protein Science: a publication of the Protein Society. 2001;10:2017-27.

[28] Gotte G, Libonati M. Oligomerization of ribonuclease A: two novel three-dimensional domain-swapped tetramers. The Journal of Biological Chemistry. 2004;279:36670-9.

[29] Gotte G, Laurents DV, Libonati M. Three-dimensional domain-swapped oligomers of ribonuclease A: identification of a fifth tetramer, pentamers and hexamers, and detection of trace heptameric, octameric and nonameric species. Biochimica et Biophysica Acta. 2006;1764:44-54.

[30] Simons BL, King MC, Cyr T, Hefford MA, Kaplan H. Covalent cross-linking of proteins without chemical reagents. Protein Science: a publication of the Protein Society. 2002;11:1558-64.

[31] Simons BL, Kaplan H, Fournier SM, Cyr T, Hefford MA. A novel cross-linked RNase A dimer with enhanced enzymatic properties. Proteins. 2007;66:183-95.

[32] Vottariello F, Costanzo C, Gotte G, Libonati M. "Zero-length" dimers of ribonuclease A: further characterization and no evidence of cytotoxicity. Bioconjug Chem. 2010;21:635-45.

[33] Fancy DA, Kodadek T. Chemistry for the analysis of protein-protein interactions: rapid and efficient cross-linking triggered by long wavelength light. Proceedings of the National Academy of Sciences of the United States of America. 1999;96:6020-4.

[34] Lumb MJ, Danpure CJ. Functional synergism between the most common polymorphism in human alanine:glyoxylate aminotransferase and four of the most common disease-causing mutations. The Journal of Biological Chemistry. 2000;275:36415-22.

[35] Cellini B, Montioli R, Paiardini A, Lorenzetto A, Maset F, Bellini T, et al. Molecular defects of the glycine 41 variants of alanine glyoxylate aminotransferase associated with primary hyperoxaluria type I. Proceedings of the National Academy of Sciences of the United States of America. 2010;107:2896-901.

[36] Levy ED, Boeri Erba E, Robinson CV, Teichmann SA. Assembly reflects evolution of protein complexes. Nature. 2008;453:1262-5.

[37] Eisenberg D, Jucker M. The amyloid state of proteins in human diseases. Cell. 2012;148:1188-203.

[38] Monaco HL, Rizzi M, Coda A. Structure of a complex of two plasma proteins: transthyretin and retinol-binding protein. Science. 1995;268:1039-41.

[39] Blake C, Serpell L. Synchrotron X-ray studies suggest that the core of the transthyretin amyloid fibril is a continuous beta-sheet helix. Structure. 1996;4:989-98.


[40] Montioli R, Cellini B, Bertoldi M, Paiardini A, Voltattorni CB. An engineered folded PLP-bound monomer of Treponema denticola cystalysin reveals the effect of the dimeric structure on the catalytic properties of the enzyme. Proteins. 2009;74:304-17.

[41] Zhang X, Roe SM, Hou Y, Bartlam M, Rao Z, Pearl LH, et al. Crystal structure of alanine:glyoxylate aminotransferase and the relationship between genotype and enzymatic phenotype in primary hyperoxaluria type 1. Journal of Molecular Biology. 2003;331:643-52.

[42] Burkhard P, Dominici P, Borri-Voltattorni C, Jansonius JN, Malashkevich VN. Structural insight into Parkinson's disease treatment from drug-inhibited DOPA decarboxylase. Nature Structural Biology. 2001;8:963-7.

[43] Flydal MI, Martinez A. Phenylalanine hydroxylase: function, structure, and regulation. IUBMB Life. 2013;65:341-9.

[44] Grueninger D, Treiber N, Ziegler MO, Koetter JW, Schulze MS, Schulz GE. Designed protein-protein association. Science. 2008;319:206-9.

[45] Marianayagam NJ, Sunde M, Matthews JM. The power of two: protein dimerization in biology. Trends Biochem Sci. 2004;29:618-25.

[46] Hebert TE, Bouvier M. Structural and functional aspects of G protein-coupled receptor oligomerization. Biochem Cell Biol. 1998;76:1-11.

[47] Terrillon S, Bouvier M. Roles of G-protein-coupled receptor dimerization. EMBO Rep. 2004;5:30-4.

[48] Renatus M, Stennicke HR, Scott FL, Liddington RC, Salvesen GS. Dimer formation drives the activation of the cell death protease caspase 9. Proceedings of the National Academy of Sciences of the United States of America. 2001;98:14250-5.

[49] Agre P, Kozono D. Aquaporin water channels: molecular mechanisms for human diseases. FEBS Letters. 2003;555:72-8.

[50] Bennett MJ, Choe S, Eisenberg D. Domain swapping: entangling alliances between proteins. Proceedings of the National Academy of Sciences of the United States of America. 1994;91:3127-31.

[51] Steere B, Eisenberg D. Characterization of high-order diphtheria toxin oligomers. Biochemistry. 2000;39:15901-9.

[52] Bennett MJ, Schlunegger MP, Eisenberg D. 3D domain swapping: a mechanism for oligomer assembly. Protein Science: a publication of the Protein Society. 1995;4:2455-68.

[53] Benito A, Laurents DV, Ribo M, Vilanova M. The structural determinants that lead to the formation of particular oligomeric structures in the pancreatic-type ribonuclease family. Curr Protein Pept Sci. 2008;9:370-93.

[54] Crestfield AM, Stein WH, Moore S. On the aggregation of bovine pancreatic ribonuclease. Archives of Biochemistry and Biophysics. 1962;Suppl 1:217-22.

[55] Mazzarella L, Capasso S, Demasi D, Di Lorenzo G, Mattia CA, Zagari A. Bovine seminal ribonuclease: structure at 1.9 A resolution. Acta Crystallogr D Biol Crystallogr. 1993;49:389-402.

[56] Gronenborn AM. Protein acrobatics in pairs--dimerization via domain swapping. Current Opinion in Structural Biology. 2009;19:39-49.

[57] Libonati M. Molecular aggregates of ribonucleases. Some enzymatic properties. Ital J Biochem. 1969;18:407-17.

[58] Libonati M, Bertoldi M, Sorrentino S. The activity on double-stranded RNA of aggregates of ribonuclease A higher than dimers increases as a function of the size of the aggregates. The Biochemical Journal. 1996;318 (Pt 1):287-90.

[59] Piccoli R, D'Alessio G. Relationships between nonhyperbolic kinetics and dimeric structure in ribonucleases. The Journal of Biological Chemistry. 1984;259:693-5.

[60] Piccoli R, Tamburrini M, Piccialli G, Di Donato A, Parente A, D'Alessio G. The dual-mode quaternary structure of seminal RNase. Proceedings of the National Academy of Sciences of the United States of America. 1992;89:1870-4.

[61] Rousseau F, Schymkowitz JW, Itzhaki LS. The unfolding story of three-dimensional domain swapping. Structure. 2003;11:243-51.

[62] Rousseau F, Schymkowitz JW, Wilkinson HR, Itzhaki LS. Intermediates control domain swapping during folding of p13suc1. The Journal of Biological Chemistry. 2004;279:8368-77.

[63] Ercole C, Spadaccini R, Alfano C, Tancredi T, Picone D. A new mutant of bovine seminal ribonuclease with a reversed swapping propensity. Biochemistry. 2007;46:2227-32.

[64] Picone D, Di Fiore A, Ercole C, Franzese M, Sica F, Tomaselli S, et al. The role of the hinge loop in domain swapping. The special case of bovine seminal ribonuclease. The Journal of Biological Chemistry. 2005;280:13771-8.

[65] Ercole C, Avitabile F, Del Vecchio P, Crescenzi O, Tancredi T, Picone D. Role of the hinge peptide and the intersubunit interface in the swapping of N-termini in dimeric bovine seminal RNase. Eur J Biochem. 2003;270:4729-35.

[66] Yang S, Levine H, Onuchic JN. Protein oligomerization through domain swapping: role of inter-molecular interactions and protein concentration. Journal of Molecular Biology. 2005;352:202-11.

[67] Rousseau F, Schymkowitz JW, Wilkinson HR, Itzhaki LS. The structure of the transition state for folding of domain-swapped dimeric p13suc1. Structure. 2002;10:649-57.


[68] Rousseau F, Schymkowitz JW, Wilkinson HR, Itzhaki LS. Three-dimensional domain swapping in p13suc1 occurs in the unfolded state and is controlled by conserved proline residues. Proceedings of the National Academy of Sciences of the United States of America. 2001;98:5596-601.

C-swapped tetramers and multimers with increasing biological activities. PLoS One.

Protein Oligomerization http://dx.doi.org/10.5772/57489 271

[82] Koharudin LM, Liu L, Gronenborn AM. Different 3D domain-swapped oligomeric cyanovirin-N structures suggest trapped folding intermediates. Proceedings of the National Academy of Sciences of the United States of America. 2013;110:7702-7.

[83] Chen KE, Richards AA, Ariffin JK, Ross IL, Sweet MJ, Kellie S, et al. The mammalian DUF59 protein Fam96a forms two distinct types of domain-swapped dimer. Acta

[84] Adinolfi S, Piccoli R, Sica F, Mazzarella L. BS-RNase tetramers: an example of do‐

[85] Spadaccini R, Ercole C, Gentile MA, Sanfelice D, Boelens R, Wechselberger R, et al. NMR studies on structure and dynamics of the monomeric derivative of BS-RNase:

[86] Sica F, Pica A, Merlino A, Russo Krauss I, Ercole C, Picone D. The multiple forms of bovine seminal ribonuclease: Structure and stability of a C-terminal swapped dimer.

[87] Lopez-Alonso JP, Bruix M, Font J, Ribo M, Vilanova M, Jimenez MA, et al. NMR spectroscopy reveals that RNase A is chiefly denatured in 40% acetic acid: implica‐ tions for oligomer formation by 3D domain swapping. Journal of the American

[88] Dobson CM. Protein misfolding, evolution and disease. Trends Biochem Sci.

[89] Gotte G, Donadelli M, Laurents DV, Vottariello F, Morbio M, Libonati M. Increase of RNase a N-terminus polarity or C-terminus apolarity changes the two domains' pro‐ pensity to swap and form the two dimeric conformers of the protein. Biochemistry.

[90] Russo N, Antignani A, D'Alessio G. In vitro evolution of a dimeric variant of human

[91] Canals A, Pous J, Guasch A, Benito A, Ribo M, Vilanova M, et al. The structure of an engineered domain-swapped ribonuclease dimer and its implications for the evolu‐

[92] Merlino A, Avella G, Di Gaetano S, Arciello A, Piccoli R, Mazzarella L, et al. Structur‐ al features for the mechanism of antitumor action of a dimeric human pancreatic ri‐ bonuclease variant. Protein Science: a publication of the Protein Society. 2009;18:50-7.

[93] Schlunegger MP, Bennett MJ, Eisenberg D. Oligomer formation by 3D domain swap‐ ping: a model for protein assembly and misassembly. Adv Protein Chem.

pancreatic ribonuclease. Biochemistry. 2000;39:3585-91.

tion of proteins toward oligomerization. Structure. 2001;9:967-76.

Crystallogr D Biol Crystallogr. 2012;68:637-48.

FEBS Letters. 2013;587:3755-62.

Chemical Society. 2010;132:1621-30.

1999;24:329-32.

2006;45:10795-806.

1997;50:61-122.

main-swapped oligomers. FEBS Letters. 1996;398:326-32.

new insights for 3D domain swapping. PLoS One. 2012;7:e29076.

2012;7:e46804.


C-swapped tetramers and multimers with increasing biological activities. PLoS One. 2012;7:e46804.

[82] Koharudin LM, Liu L, Gronenborn AM. Different 3D domain-swapped oligomeric cyanovirin-N structures suggest trapped folding intermediates. Proceedings of the National Academy of Sciences of the United States of America. 2013;110:7702-7.

[68] Rousseau F, Schymkowitz JW, Wilkinson HR, Itzhaki LS. Three-dimensional domain swapping in p13suc1 occurs in the unfolded state and is controlled by conserved pro‐ line residues. Proceedings of the National Academy of Sciences of the United States

[69] Di Donato A, D'Alessio G. Interchain disulfide bridges in ribonuclease BS-1. Biochem

[70] Berisio R, Sica F, De Lorenzo C, Di Fiore A, Piccoli R, Zagari A, et al. Crystal struc‐ ture of the dimeric unswapped form of bovine seminal ribonuclease. FEBS Letters.

[71] Liu Y, Hart PJ, Schlunegger MP, Eisenberg D. The crystal structure of a 3D domainswapped dimer of RNase A at a 2.1-A resolution. Proceedings of the National Acade‐

[72] Liu Y, Gotte G, Libonati M, Eisenberg D. A domain-swapped RNase A dimer with implications for amyloid formation. Nature Structural Biology. 2001;8:211-4.

[73] Gotte G, Bertoldi M, Libonati M. Structural versatility of bovine ribonuclease A. Dis‐ tinct conformers of trimeric and tetrameric aggregates of the enzyme. Eur J Biochem.

[74] Lopez-Alonso JP, Gotte G, Laurents DV. Kinetic analysis provides insight into the mechanism of ribonuclease A oligomer formation. Archives of Biochemistry and Bio‐

[75] Liu Y, Gotte G, Libonati M, Eisenberg D. Structures of the two 3D domain-swapped RNase A trimers. Protein Science: a publication of the Protein Society. 2002;11:371-80.

[76] Liu Y, Eisenberg D. 3D domain swapping: as domains continue to swap. Protein Sci‐

[77] Park C, Raines RT. Dimer formation by a "monomeric" protein. Protein Science: a

[78] Gotte G, Vottariello F, Libonati M. Thermal aggregation of ribonuclease A. A contri‐ bution to the understanding of the role of 3D domain swapping in protein aggrega‐

[79] Geiger R, Gautschi M, Thor F, Hayer A, Helenius A. Folding, quality control, and se‐ cretion of pancreatic ribonuclease in live cells. The Journal of Biological Chemistry.

[80] Bennett MJ, Sawaya MR, Eisenberg D. Deposition diseases and 3D domain swap‐

[81] Gotte G, Mahmoud Helmy A, Ercole C, Spadaccini R, Laurents DV, Donadelli M, et al. Double domain swapping in bovine seminal RNase: formation of distinct N- and

ence: a publication of the Protein Society. 2002;11:1285-99.

tion. The Journal of Biological Chemistry. 2003;278:10763-9.

publication of the Protein Society. 2000;9:2026-33.

my of Sciences of the United States of America. 1998;95:3437-42.

of America. 2001;98:5596-601.

270 Oligomerization of Chemical and Biological Compounds

2003;554:105-10.

1999;265:680-7.

physics. 2009;489:41-7.

2011;286:5813-22.

ping. Structure. 2006;14:811-24.

Biophys Res Commun. 1973;55:919-28.


[94] Nagarkar RP, Hule RA, Pochan DJ, Schneider JP. Domain swapping in materials de‐ sign. Biopolymers. 2010;94:141-55.

[107] Janowski R, Kozak M, Jankowska E, Grzonka Z, Grubb A, Abrahamson M, et al. Hu‐ man cystatin C, an amyloidogenic protein, dimerizes through three-dimensional do‐

Protein Oligomerization http://dx.doi.org/10.5772/57489 273

[108] Luhrs T, Zahn R, Wuthrich K. Amyloid formation by recombinant full-length prion proteins in phospholipid bicelle solutions. Journal of Molecular Biology.

[109] Hafner-Bratkovic I, Bester R, Pristovsek P, Gaedtke L, Veranic P, Gaspersic J, et al. Globular domain of the prion protein needs to be unlocked by domain swapping to support prion protein conversion. The Journal of Biological Chemistry.

[110] Lee S, Eisenberg D. Seeded conversion of recombinant prion protein to a disulfidebonded oligomer by a reduction-oxidation process. Nature Structural Biology.

[111] Wahlbom M, Wang X, Lindstrom V, Carlemalm E, Jaskolski M, Grubb A. Fibrillogen‐ ic oligomers of human cystatin C are formed by propagated domain swapping. The

[112] Liu C, Sawaya MR, Eisenberg D. beta(2)-microglobulin forms three-dimensional do‐ main-swapped amyloid fibrils with disulfide linkages. Nature Structural & Molecu‐

[113] Staniforth RA, Giannini S, Higgins LD, Conroy MJ, Hounslow AM, Jerala R, et al. Three-dimensional domain swapping in the folded and molten-globule states of cys‐ tatins, an amyloid-forming structural superfamily. The EMBO Journal.

[114] Sanders A, Jeremy Craven C, Higgins LD, Giannini S, Conroy MJ, Hounslow AM, et al. Cystatin forms a tetramer through structural rearrangement of domain-swapped dimers prior to amyloidogenesis. Journal of Molecular Biology. 2004;336:165-78. [115] Nilsson M, Wang X, Rodziewicz-Motowidlo S, Janowski R, Lindstrom V, Onnerfjord P, et al. Prevention of domain swapping inhibits dimerization and amyloid fibril for‐ mation of cystatin C: use of engineered disulfide bridges, antibodies, and carboxyme‐ thylpapain to stabilize the monomeric form of cystatin C. The Journal of Biological

[116] Orlikowska M, Jankowska E, Kolodziejczyk R, Jaskolski M, Szymanska A. Hingeloop mutation can be used to control 3D domain swapping and amyloidogenesis of

[117] Olafsson I, Grubb A. Hereditary cystatin C amyloid angiopathy. Amyloid.

human cystatin C. Journal of Structural Biology. 2011;173:406-13.

main swapping. Nature Structural Biology. 2001;8:316-20.

Journal of Biological Chemistry. 2007;282:18318-26.

2006;357:833-41.

2011;286:12149-56.

2003;10:725-30.

2001;20:4774-81.

2000;7:70-9.

lar Biology. 2011;18:49-55.

Chemistry. 2004;279:24236-45.


[107] Janowski R, Kozak M, Jankowska E, Grzonka Z, Grubb A, Abrahamson M, et al. Hu‐ man cystatin C, an amyloidogenic protein, dimerizes through three-dimensional do‐ main swapping. Nature Structural Biology. 2001;8:316-20.

[94] Nagarkar RP, Hule RA, Pochan DJ, Schneider JP. Domain swapping in materials de‐

[95] Barrientos LG, Gronenborn AM. The highly specific carbohydrate-binding protein cyanovirin-N: structure, anti-HIV/Ebola activity and possibilities for therapy. Mini

[96] Barrientos LG, Louis JM, Botos I, Mori T, Han Z, O'Keefe BR, et al. The domainswapped dimer of cyanovirin-N is in a metastable folded state: reconciliation of X-

[97] Margoliash E, Lustgarten J. Interconversion of horse heart cytochrome C monomer

[98] Hirota S, Hattori Y, Nagao S, Taketa M, Komori H, Kamikubo H, et al. Cytochrome c polymerization by successive domain swapping at the C-terminal helix. Proceedings of the National Academy of Sciences of the United States of America.

[99] O'Neill JW, Manion MK, Maguire B, Hockenbery DM. BCL-XL dimerization by three-dimensional domain swapping. Journal of Molecular Biology. 2006;356:367-81.

[100] Denisov AY, Sprules T, Fraser J, Kozlov G, Gehring K. Heat-induced dimerization of

[101] Vendome J, Posy S, Jin X, Bahna F, Ahlsen G, Shapiro L, et al. Molecular design prin‐ ciples underlying beta-strand swapping in the adhesive dimerization of cadherins.

[102] Hadjithomas M, Moudrianakis EN. Experimental evidence for the role of domain swapping in the evolution of the histone fold. Proceedings of the National Academy

[103] Zegers I, Deswarte J, Wyns L. Trimeric domain-swapped barnase. Proceedings of the National Academy of Sciences of the United States of America. 1999;96:818-22. [104] Schiering N, Casale E, Caccia P, Giordano P, Battistini C. Dimer formation through domain swapping in the crystal structure of the Grb2-SH2-Ac-pYVNV complex. Bio‐

[105] Perutz MF, Johnson T, Suzuki M, Finch JT. Glutamine repeats as polar zippers: their possible role in inherited neurodegenerative diseases. Proceedings of the National

[106] Knaus KJ, Morillas M, Swietnicki W, Malone M, Surewicz WK, Yee VC. Crystal structure of the human prion protein reveals a mechanism for oligomerization. Na‐

Academy of Sciences of the United States of America. 1994;91:5355-8.

BCL-xL through alpha-helix swapping. Biochemistry. 2007;46:734-40.

Nature Structural & Molecular Biology. 2011;18:693-700.

of Sciences of the United States of America. 2011;108:13462-7.

and polymers. The Journal of Biological Chemistry. 1962;237:3397-405.

sign. Biopolymers. 2010;94:141-55.

ray and NMR structures. Structure. 2002;10:673-86.

Rev Med Chem. 2005;5:21-31.

272 Oligomerization of Chemical and Biological Compounds

2010;107:12854-9.

chemistry. 2000;39:13376-82.

ture Structural Biology. 2001;8:770-4.


[118] Eakin CM, Attenello FJ, Morgan CJ, Miranker AD. Oligomeric assembly of nativelike precursors precedes amyloid formation by beta-2 microglobulin. Biochemistry. 2004;43:7808-15.

[130] Sambashivan S, Liu Y, Sawaya MR, Gingery M, Eisenberg D. Amyloid-like fibrils of ribonuclease A with three-dimensional domain-swapped and native-like structure.

Protein Oligomerization http://dx.doi.org/10.5772/57489 275

[131] Mizuno H, Fujimoto Z, Koizumi M, Kano H, Atoda H, Morita T. Structure of coagu‐ lation factors IX/X-binding protein, a heterodimer of C-type lectin domains. Nature

[132] Vatzaki EH, Allen SC, Leonidas DD, Trautwein-Fritz K, Stackhouse J, Benner SA, et al. Crystal structure of a hybrid between ribonuclease A and bovine seminal ribonu‐

[133] Sica F, Di Fiore A, Merlino A, Mazzarella L. Structure and stability of the non-cova‐ lent swapped dimer of bovine seminal ribonuclease: an enzyme tailored to evade ri‐ bonuclease protein inhibitor. The Journal of Biological Chemistry. 2004;279:36753-60.

[134] Ercole C, Colamarino RA, Pizzo E, Fogolari F, Spadaccini R, Picone D. Comparison of the structural and functional properties of RNase A and BS-RNase: a stepwise muta‐

[135] Merlino A, Picone D, Ercole C, Balsamo A, Sica F. Chain termini cross-talk in the swapping process of bovine pancreatic ribonuclease. Biochimie. 2012;94:1108-18.

[136] Meersman F, Heremans K. Temperature-induced dissociation of protein aggregates:

[137] Babu CR, Hilser VJ, Wand AJ. Direct access to the cooperative substructure of pro‐ teins and the protein ensemble via cold denaturation. Nature Structural & Molecular

[138] Jaremko M, Jaremko L, Kim HY, Cho MK, Schwieters CD, Giller K, et al. Cold dena‐ turation of a protein dimer monitored at atomic resolution. Nature Chemical Biology.

[139] Merlino A, Vitagliano L, Ceruso MA, Mazzarella L. Dynamic properties of the N-ter‐

[140] Merlino A, Ceruso MA, Vitagliano L, Mazzarella L. Open interface and large quater‐ nary structure movements in 3D domain swapped proteins: insights from molecular dynamics simulations of the C-terminal swapped dimer of ribonuclease A. Biophys J.

[141] Selwood T, Jaffe EK. Dynamic dissociating homo-oligomers and the control of pro‐ tein function. Archives of Biochemistry and Biophysics. 2012;519:131-43.

[142] Zhou HX, Rivas G, Minton AP. Macromolecular crowding and confinement: bio‐ chemical, biophysical, and potential physiological consequences. Annual Review of

minal swapped dimer of ribonuclease A. Biophys J. 2004;86:2383-91.

accessing the denatured state. Biochemistry. 2003;42:14234-41.

clease--the basic surface, at 2.0 A resolution. Eur J Biochem. 1999;260:176-82.

Nature. 2005;437:266-9.

Biology. 2004;11:352-7.

2013;9:264-70.

2005;88:2003-12.

Biophysics. 2008;37:375-97.

Structural Biology. 1997;4:438-41.

genesis approach. Biopolymers. 2009;91:1009-17.


[130] Sambashivan S, Liu Y, Sawaya MR, Gingery M, Eisenberg D. Amyloid-like fibrils of ribonuclease A with three-dimensional domain-swapped and native-like structure. Nature. 2005;437:266-9.

[118] Eakin CM, Attenello FJ, Morgan CJ, Miranker AD. Oligomeric assembly of nativelike precursors precedes amyloid formation by beta-2 microglobulin. Biochemistry.

[119] Byeon IJ, Louis JM, Gronenborn AM. A protein contortionist: core mutations of GB1 that induce dimerization and domain swapping. Journal of Molecular Biology.

[120] Kirsten Frank M, Dyda F, Dobrodumov A, Gronenborn AM. Core mutations switch monomeric protein GB1 into an intertwined tetramer. Nature Structural Biology.

[121] Li J, Hoop CL, Kodali R, Sivanandam VN, van der Wel PC. Amyloid-like fibrils from a domain-swapping protein feature a parallel, in-register conformation without na‐

tive-like interactions. The Journal of Biological Chemistry. 2011;286:28988-95.

[122] Guo Z, Eisenberg D. Runaway domain swapping in amyloid-like fibrils of T7 endo‐ nuclease I. Proceedings of the National Academy of Sciences of the United States of

[123] Bader R, Seeliger MA, Kelly SE, Ilag LL, Meersman F, Limones A, et al. Folding and fibril formation of the cell cycle protein Cks1. The Journal of Biological Chemistry.

[124] Laidman J, Forse GJ, Yeates TO. Conformational change and assembly through edge beta strands in transthyretin and other amyloid proteins. Accounts of Chemical Re‐

[125] Serag AA, Altenbach C, Gingery M, Hubbell WL, Yeates TO. Arrangement of subu‐ nits and ordering of beta-strands in an amyloid sheet. Nature Structural Biology.

[126] Sawaya MR, Sambashivan S, Nelson R, Ivanova MI, Sievers SA, Apostol MI, et al. Atomic structures of amyloid cross-beta spines reveal varied steric zippers. Nature.

[127] Goldschmidt L, Teng PK, Riek R, Eisenberg D. Identifying the amylome, proteins ca‐ pable of forming amyloid-like fibrils. Proceedings of the National Academy of Scien‐

[128] Teng PK, Anderson NJ, Goldschmidt L, Sawaya MR, Sambashivan S, Eisenberg D. Ribonuclease A suggests how proteins self-chaperone against amyloid fiber forma‐

[129] Torrent M, Odorizzi F, Nogues MV, Boix E. Eosinophil cationic protein aggregation: identification of an N-terminus amyloid prone region. Biomacromolecules.

tion. Protein Science: a publication of the Protein Society. 2012;21:26-37.

ces of the United States of America. 2010;107:3487-92.

2004;43:7808-15.

274 Oligomerization of Chemical and Biological Compounds

2003;333:141-52.

2002;9:877-85.

America. 2006;103:8042-7.

2006;281:18816-24.

search. 2006;39:576-83.

2002;9:734-9.

2007;447:453-7.

2010;11:1983-90.


[143] Ercole C, Lopez-Alonso JP, Font J, Ribo M, Vilanova M, Picone D, et al. Crowding agents and osmolytes provide insight into the formation and dissociation of RNase A oligomers. Archives of Biochemistry and Biophysics. 2011;506:123-9.

[156] Cafaro V, Bracale A, Di Maro A, Sorrentino S, D'Alessio G, Di Donato A. New mu‐ teins of RNase A with enhanced antitumor action. FEBS Letters. 1998;437:149-52. [157] Kobe B, Deisenhofer J. Mechanism of ribonuclease inhibition by ribonuclease inhibi‐ tor protein based on the crystal structure of its complex with ribonuclease A. Journal

Protein Oligomerization http://dx.doi.org/10.5772/57489 277

[158] Kim JS, Soucek J, Matousek J, Raines RT. Mechanism of ribonuclease cytotoxicity.

[159] Rousseau F, Wilkinson H, Villanueva J, Serrano L, Schymkowitz JW, Itzhaki LS. Do‐ main swapping in p13suc1 results in formation of native-like, cytotoxic aggregates.

[160] Schymkowitz JW, Rousseau F, Irvine LR, Itzhaki LS. The folding pathway of the cellcycle regulatory protein p13suc1: clues for the mechanism of domain swapping.

[161] Thylen C, Wahlqvist J, Haettner E, Sandgren O, Holmgren G, Lundgren E. Modifica‐ tions of transthyretin in amyloid fibrils: analysis of amyloid from homozygous and heterozygous individuals with the Met30 mutation. The EMBO Journal.

[162] McCutchen SL, Colon W, Kelly JW. Transthyretin mutation Leu-55-Pro significantly alters tetramer stability and increases amyloidogenicity. Biochemistry.

[163] Apostol MI, Sawaya MR, Cascio D, Eisenberg D. Crystallographic studies of prion protein (PrP) segments suggest how structural changes encoded by polymorphism at residue 129 modulate susceptibility to human prion disease. The Journal of Biological

[164] Levy CB, Stumbo AC, Ano Bom AP, Portari EA, Cordeiro Y, Silva JL, et al. Co-locali‐ zation of mutant p53 and amyloid-like protein aggregates in breast tumors. The In‐

[165] Ano Bom AP, Rangel LP, Costa DC, de Oliveira GA, Sanches D, Braga CA, et al. Mu‐ tant p53 aggregates into prion-like amyloid oligomers and fibrils: implications for

[166] Apostol MI, Wiltzius JJ, Sawaya MR, Cascio D, Eisenberg D. Atomic structures sug‐ gest determinants of transmission barriers in mammalian prion disease. Biochemis‐

[167] Silva JL, Rangel LP, Costa DC, Cordeiro Y, De Moura Gallo CV. Expanding the prion concept to cancer biology: dominant-negative effect of aggregates of mutant p53 tu‐

ternational Journal of Biochemistry & Cell Biology. 2011;43:60-4.

cancer. The Journal of Biological Chemistry. 2012;287:28152-62.

mour suppressor. Bioscience Reports. 2013;33.

of Molecular Biology. 1996;264:1028-43.

Structure. 2000;8:89-100.

1993;12:743-8.

1993;32:12119-27.

try. 2011;50:2456-63.

Chemistry. 2010;285:29671-5.

The Journal of Biological Chemistry. 1995;270:31097-102.

Journal of Molecular Biology. 2006;363:496-505.


[156] Cafaro V, Bracale A, Di Maro A, Sorrentino S, D'Alessio G, Di Donato A. New mu‐ teins of RNase A with enhanced antitumor action. FEBS Letters. 1998;437:149-52.

[143] Ercole C, Lopez-Alonso JP, Font J, Ribo M, Vilanova M, Picone D, et al. Crowding agents and osmolytes provide insight into the formation and dissociation of RNase A

[144] Nenci A, Gotte G, Maras B, Libonati M. Different susceptibility of the two dimers of ribonuclease A to subtilisin. Implications for their structure. Biochimica et Biophysica

[145] Abrami L, Fivaz M, van der Goot FG. Adventures of a pore-forming toxin at the tar‐

[146] Milne JC, Furlong D, Hanna PC, Wall JS, Collier RJ. Anthrax protective antigen forms oligomers during intoxication of mammalian cells. The Journal of Biological Chemis‐

[147] Billington SJ, Jost BH, Songer JG. Thiol-activated cytolysins: structure, function and

[148] Bucci E, Vitagliano L, Barone R, Sorrentino S, D'Alessio G, Graziano G. On the ther‐ mal stability of the two dimeric forms of ribonuclease A. Biophysical Chemistry.

[149] Vottariello F, Giacomelli E, Frasson R, Pozzi N, De Filippis V, Gotte G. RNase A oli‐ gomerization through 3D domain swapping is favoured by a residue located far

[150] Rafelski SM, Theriot JA. Crawling toward a unified model of cell mobility: spatial and temporal regulation of actin dynamics. Annual Review of Biochemistry.

[151] Dell'Orco D. A physiological role for the supramolecular organization of rhodopsin

[152] Gotte G, Testolin L, Costanzo C, Sorrentino S, Armato U, Libonati M. Cross-linked trimers of bovine ribonuclease A: activity on double-stranded RNA and antitumor

[153] Bartholeyns J, Baudhuin P. Inhibition of tumor cell proliferation by dimerized ribo‐ nuclease. Proceedings of the National Academy of Sciences of the United States of

[154] Matousek J, Gotte G, Pouckova P, Soucek J, Slavik T, Vottariello F, et al. Antitumor activity and other biological actions of oligomers of ribonuclease A. The Journal of

[155] Merlino A, Russo Krauss I, Perillo M, Mattia CA, Ercole C, Picone D, et al. Toward an antitumor form of bovine pancreatic ribonuclease: the crystal structure of three non‐

and transducin in rod photoreceptors. FEBS letters. 2013;587:2060-6.

role in pathogenesis. FEMS, Microbiology Letters. 2000;182:197-205.

oligomers. Archives of Biochemistry and Biophysics. 2011;506:123-9.

get cell surface. Trends in Microbiology. 2000;8:168-72.

from the swapping domains. Biochimie. 2011;93:1846-57.

Acta. 2001;1545:255-62.

276 Oligomerization of Chemical and Biological Compounds

try. 1994;269:20607-12.

2005;116:89-95.

2004;73:209-39.

action. FEBS Letters. 1997;415:308-12.

Biological Chemistry. 2003;278:23817-22.

covalent dimeric mutants. Biopolymers. 2009;91:1029-37.

America. 1976;73:573-6.


[168] Bucciantini M, Giannoni E, Chiti F, Baroni F, Formigli L, Zurdo J, et al. Inherent tox‐ icity of aggregates implies a common mechanism for protein misfolding diseases. Nature. 2002;416:507-11.

**Chapter 9**



## **Oligomerization of Proteins and Neurodegenerative Diseases**


Dai Mizuno and Masahiro Kawahara

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57482

### **1. Introduction**

Oligomerization of amino acids through peptide bonds (-CONH-) forms proteins (or peptides), which are major components of our bodies. Although the primary structure (the linear sequence of amino acids) of a protein mainly determines its characteristics, its secondary structure (the conformation) is also a critical determinant of its shape and function. The conformation (random coil, α-helix, or ß-sheet) is constrained by the environment surrounding the protein. Hydrogen bonds between amino acids within the peptide chain form the α-helix structure. Meanwhile, ß-sheets (ß-pleated sheets) consist of ß-strands that are laterally connected by hydrogen bonds between their peptide bonds.

Recent neurochemical evidence indicates that the oligomerization of proteins and the formation of ß-sheet structures are linked with several neurodegenerative diseases, such as Alzheimer's disease (AD), prion diseases, triplet repeat diseases, and dementia with Lewy bodies (DLB). The disease-related proteins, such as ß-amyloid protein (AßP) in AD, prion protein in prion diseases, polyglutamine in triplet repeat diseases, and α-synuclein in DLB, are specific to each disease (Table 1). However, all of these amyloidogenic proteins share common characteristics: they form amyloid with ß-sheet structures and they exhibit cytotoxicity. Therefore, a new concept termed "conformational disease" was proposed, suggesting that protein conformation is an important determinant of its toxicity and, consequently, of the development of the related disease [1].

These conformational diseases are included in amyloidosis. At 1853, Virchow found the abnormal accumulates in tissues and named "amyloid", since they exhibited similar charac‐ teristics with *amylum*. At 1968, amyloid was determined to be the oligomers of proteins with ß-sheet structures. The accumulation of amyloid causes various diseases (amyloidosis) including familial amyloid polyneuropathy (FAP), amyloid-light chain amyloidosis, dialysis

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


or histidine (His) or phosphorylated amino acids, and cause cross-linking of the proteins (Fig. 1). Furthermore, all of these amyloidogenic proteins were reported to have the ability to bind metals as shown in Table 1. Our and other numerous studies reported that oligomers cause neurodegeneration by induction of Ca2+ dyshomeostasis through the formation of amyloid channels on neuronal membranes [3,4]. The beneficial characteristics of carnosine (ß alanyl histidine) as a drug for the treatment for these neurodegenerative diseases are also discussed.


amyloidosis, *etc.* [2]. All of these diseases share common properties: amyloids are deposited in various tissues or organs as protease-resistant, insoluble fibril-like structures (amyloid fibrils) that are stained by Congo red, a ß-sheet-specific dye. However, the component of the amyloid differs in each disease. For example, the major component of amyloid in FAP patients is transthyretin, whereas ß2-microglobulin is deposited in patients with dialysis amyloidosis. There are no effective treatments for amyloidosis.

In this chapter, we review the implication of protein oligomerization in the pathogenesis of these neurodegenerative diseases. Considering that the amyloidogenic proteins are commonly present in our brain, factors which influence oligomerization play crucial roles in their pathogenesis. Among such factors, we focus on trace elements such as Al, Zn, Cu, and Fe. Metals bind firmly to metal-binding residues of proteins, such as tyrosine (Tyr), histidine (His), or phosphorylated amino acids, and cause cross-linking of the proteins (Fig. 1). Furthermore, all of these amyloidogenic proteins were reported to have the ability to bind metals, as shown in Table 1. Our studies and numerous others have reported that oligomers cause neurodegeneration by inducing Ca2+ dyshomeostasis through the formation of amyloid channels on neuronal membranes [3,4]. The beneficial characteristics of carnosine (ß-alanyl histidine) as a drug for the treatment of these neurodegenerative diseases are also discussed.

| Disease | Amyloidogenic protein or its fragment peptide | Primary sequence | Metal | ß-sheet formation | Cytotoxicity | Pore formation |
|---|---|---|---|---|---|---|
| Alzheimer's disease | AßP(1-42) and AßP(1-42) truncated C-terminal | *DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA* | Al, Zn, Cu, Fe | + | + | + |
| Prion disease | Prion protein: *PrP106-126* | MANLGCWMLVLFVATWSDLGLCKKRPKPGGWNTGGSRYPGQGSPGGNRYPPQGGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQPHGGGWGQGGGTHSQWNKPSKP*KTNMKHMAGAAAAGAVVGG*LGGYVLGSAMSRPIIHFGSDYEDRYYRENMHRYPNQVYYRPMDEYSNQNNFVHDCVNITIKQHTVTTTTKGENFTETDVKMMERVVEQMCITQYERESQAYYQRGSSMVLFSSPPVILLISFLIFLIVG | Zn, Cu, M, Fe | + | + | + |
| Dementia with Lewy bodies (DLB) | α-synuclein; *NAC* (a fragment of α-synuclein) | MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVLYVGSKTKEGVVHGVTTVAEKTKEQVSNVGGAVVTGVTAVAHKTVEGAGNFAAATGLVKKDQKNESGFGPEGTMENSENMPVNPNNETYEMPPEEEYQDYDPEA | Cu, Fe, Al | + | + | + |
| Triplet-repeat disease | Polyglutamine | MATLEKLMKAFESLKSF*QQQQQQQQQQQQQQQQQQQQQQQ*PPPPPPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPPGP | Fe | + | + | + |
| Diabetes mellitus | Human amylin | KCNTATCATQRLANFLVHSSNNFGAILSSTNVGSNTY | Cu, Al | + | + | + |

\* The sequence of the fragment peptide of each amyloidogenic protein (PrP106-126, NAC, polyglutamine) is indicated in italics.

**Table 1.** Characteristics of amyloidogenic proteins and the related peptides

M stands for metal.

**Figure 1.** Trace elements act as cross-linkers of amyloidogenic proteins.

### **2. Alzheimer's disease and oligomerization of AßP**

#### **2.1. Amyloid cascade hypothesis**

Alzheimer's disease (AD) is a severe type of senile dementia, affecting a large portion of elderly people worldwide. It is characterized by profound memory loss and inability to form new memories. The pathological hallmarks of AD are the presence of numerous extracellular deposits (senile plaques) and intraneuronal neurofibrillary tangles (NFTs). The degeneration of synapses and neurons in the hippocampus or cerebral cortex is also observed. The major components of NFTs are phosphorylated tau proteins, and those of senile plaques are ß-amyloid proteins (AßPs) [5]. Although the precise cause of AD remains elusive, numerous biochemical, cell biological, and genetic studies have supported the idea, termed the "amyloid cascade hypothesis", that AßP accumulation and the consequent neurodegeneration play a central role in AD [6]. Moreover, recent studies on the identified AßP species have indicated that the oligomerization of AßP and its conformational changes are critical in the neurodegeneration process [7].

AßP is a small peptide of 39–43 amino acids. It is derived from the proteolytic cleavage of a large precursor protein (amyloid precursor protein; APP). AßP is released by cleavage at its N-terminus by ß-secretase (BACE), followed by intramembrane cleavage at its C-terminus by γ-secretase. Genetic studies of early-onset cases of familial AD indicated that APP mutations and AßP metabolism are associated with AD. It was also revealed that mutations in the presenilin genes account for the majority of cases of early-onset familial AD. Presenilins have been revealed to be one of the γ-secretases, and their mutations also influence the production of AßP and its neurotoxicity.

Primate AßP: DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIATVIVITLVMLKKK-

Rodent AßP: DAEF*G*HDSG*F*EV*R*HQKLVFFAEDVGSNKGAIIGLMVGGVVIAT

AßP is secreted from its precursor protein, APP, by transmembrane cleavage by ß-secretase and γ-secretase. The sequences of primate (human or monkey) AßP(1-42) and rodent (rat or mouse) AßP(1-42) are compared.

**Figure 2.** Structure of AßP

Yankner *et al.* reported that the first 40 amino acid residues of AßP (AßP(1–40)) caused the death of cultured rat hippocampal neurons or neurodegeneration in the brains of experimental animals. Thereafter, it was agreed that the aggregation and the subsequent conformational change of AßP contribute to its neurotoxicity. AßP is a hydrophobic peptide with an intrinsic tendency to self-assemble to form insoluble oligomers with ß-pleated sheet structures. Pike *et al.* revealed that aged AßP(1–40) (aggregated by incubation at 37°C for several days) was considerably more toxic to cultured neurons than freshly prepared AßP(1–40). Simmons *et al.* revealed that the ß-sheet content of AßP, observed by circular dichroism (CD) spectroscopy, correlates with its neurotoxicity.

Furthermore, the longer peptide variant, AßP(1–42), polymerizes more rapidly than AßP(1–40). AßP(1–42) enhances the aggregation of AßP(1–40) and becomes a seed of the amyloid fibrils. AßP(1–42) is more abundant in the brains of AD patients than in those of age-matched controls. Mutations of APP and of the presenilin genes increase the production of AßP(1–42) in transfected cell lines.

Recent approaches using size-exclusion chromatography, gel electrophoresis, and atomic force microscopy have demonstrated that there are several stable types of soluble oligomers: naturally occurring soluble oligomers (dimers or trimers), ADDLs (AßP-derived diffusible ligands), AßP globulomers, or protofibrils. Hartley separated aggregated AßP(1–40) into low-molecular-weight (mainly monomer), protofibrillar, and fibril fractions by size-exclusion chromatography, and found that the protofibrillar fraction caused marked changes in the electrical activity of cultured neurons and neurotoxicity. Walsh *et al.* found that the intracerebral administration of conditioned medium from cultured cells transfected with the human APP gene inhibited long-term potentiation (LTP), a form of synaptic information storage well known as a paradigm of memory mechanisms. They also demonstrated that LTP was blocked by SDS-stable low-molecular-weight oligomers (dimers, trimers, or tetramers) but not by AßP monomers or larger aggregates. Natural AßP oligomers (derived from the cerebrospinal fluid of AD patients) cause the loss of dendritic spines and synapses, and block LTP. Klein and colleagues reported that ADDLs obtained from sedimentation by clustering are highly toxic to cultured neurons. They also reported that ADDLs inhibited LTP and exhibited adverse effects on synaptic plasticity such as abnormal spine morphology, decreased spine density, and decreased synaptic proteins. Based on these and numerous other findings, it is widely accepted that AßP oligomers, but not monomers or fibrils, are synaptotoxic and neurotoxic. These studies further strengthened and modified the amyloid cascade hypothesis, which now suggests that AßP oligomers are neurotoxic and crucial for the pathogenesis of AD [8,9].

[Figure 3 schematic: monomer (α-helix or random coil, soluble, non-toxic) → soluble oligomer (protofibril, ADDLs, globulomer, *etc.*; ß-sheet, soluble, toxic) → amyloid fibril (ß-sheet, insoluble, less or non-toxic). Acceleratory factors: aging (incubation), metals (Al, Cu, Zn, *etc.*), oxidation, mutation, racemization, *etc.* Inhibitory factors: rifampicin, curcumin, polyphenol, aspirin, ß-sheet breaker peptide, *etc.*]

AßP monomers exhibit random-coil or α-helix structures. However, under aging conditions or in the presence of some acceleratory factors, AßP self-aggregates and forms several types of oligomers (SDS-soluble oligomers, ADDLs, globulomers, or protofibrils, *etc.*) and finally forms insoluble aggregates termed amyloid fibrils. Soluble oligomeric AßPs are toxic, whereas the monomeric and fibrillar forms are rather nontoxic.

**Figure 3.** Oligomerization of AßP

#### **2.2. Metal-induced oligomerization of AßP**

Considering that AßP is secreted from APP into the brain of young people or of normal subjects, factors which influence (accelerate or delay) the oligomerization may become important determinants of the pathogenesis of AD. Various factors, such as the concentration of peptides, the oxidation, mutation, and racemization of AßP, pH, composition of solvents, temperature, and trace elements, can influence the oligomerization processes. A considerable proportion of the aspartic acid (Asp) or serine (Ser) residues of AßP accumulated in senile plaques are racemized. Tomiyama *et al.* reported that racemized D-Asp23-AßP aggregates more easily than the L-type. Meanwhile, several substances such as rifampicin, curcumin, and aspirin have been reported to inhibit AßP oligomerization *in vitro*. Rifampicin, a drug used to treat Hansen's disease, may be an interesting inhibitor of oligomerization since patients with Hansen's disease have a low susceptibility to AD. Aspirin and other NSAIDs (non-steroidal anti-inflammatory drugs) inhibit AßP oligomerization and simultaneously attenuate inflammation.


Among these factors, trace elements such as aluminum (Al), zinc (Zn), copper (Cu), and iron (Fe) are of particular interest. The accumulation of AßP is rarely observed in the brains of rodents (rats or mice) as compared to humans or monkeys. As shown in Fig. 2, the amino acid sequences of human and rodent AßP are similar, differing by only three amino acids. Nevertheless, rodent AßP has a weaker tendency to oligomerize than human AßP [10]. Considering that these three amino acids (Arg5, Tyr10, and His13) have the ability to bind metals and that trace metals have cross-linking ability, trace elements might play important roles in the accumulation of AßP in the human brain.
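The three-residue difference described above can be checked directly from the sequences printed in Figure 2. The following is a minimal sketch, assuming both sequences are truncated to residues 1-42; the helper name `substitutions` is hypothetical and introduced only for illustration.

```python
# Residues 1-42 of primate and rodent AßP, as printed in Figure 2 of this chapter.
PRIMATE_ABP = "DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"
RODENT_ABP = "DAEFGHDSGFEVRHQKLVFFAEDVGSNKGAIIGLMVGGVVIA"


def substitutions(ref, other):
    """Return (position, ref_residue, other_residue) for every mismatch.

    Positions are 1-based, matching the residue numbering used in the text.
    This is a hypothetical helper, not part of any library.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, other), start=1)
        if a != b
    ]


diffs = substitutions(PRIMATE_ABP, RODENT_ABP)
for pos, primate_res, rodent_res in diffs:
    print(f"position {pos}: {primate_res} (primate) -> {rodent_res} (rodent)")
```

The comparison reports substitutions at positions 5 (R to G), 10 (Y to F), and 13 (H to R), that is, the Arg5, Tyr10, and His13 residues that the text identifies as metal-binding sites in the primate peptide.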

Exley *et al.* first demonstrated by CD spectroscopy that Al induces a conformational change in AßP(1-40). Furthermore, exposure to Al causes the accumulation of AßP in cultured neurons and in the brains of experimental animals or humans. Pratico *et al.* found that Al-fed mice transfected with the human APP gene (Tg 2576) exhibited pathological changes similar to those of the AD brain, including a marked increase in the amount of AßP in both the secreted form and the accumulated form; an increased deposition of senile plaques was also observed [11]. The neuropathological case study of the accidental Al exposure that occurred in 1988 at Camelford (Cornwall, U.K.) indicated that exposure to Al, even if short-term, could cause the accumulation of AßP and severe amyloid angiopathy [12]. Since there have been studies indicating a link between Al in drinking water and the pathogenesis of AD, Al-induced oligomerization may be directly implicated in AD pathogenesis [13].

Bush *et al.* demonstrated that Zn2+ and Cu2+ caused the oligomerization of AßP [14]. However, the metal-induced oligomerization of AßP and other amyloidogenic proteins is complex and controversial. The morphologies of AßP oligomers treated with Al, Cu, Fe, and Zn were reported to be quite different. Zatta and his colleagues demonstrated that metals including Al, Cu, Fe, and Zn differentially alter the oligomerization of AßP and its toxicity. We have shown by immunoblotting and precipitation that Al enhances the polymerization of AßP(1-40) and forms SDS-stable oligomers *in vitro* [15,16]. The oligomerized AßP(1-40) is heat- or SDS-stable but redissolves on adding deferoxamine, a chelator of Al. The oligomerization induced by Al is more marked than that induced by other metals, including Zn, Fe, Cu, and Cd. Furthermore, while Zn-aggregated AßPs are rarely observed on the surface of cultured neurons several days after exposure, Al-aggregated AßPs bind tightly to the surface of cultured neurons and form fibrillar deposits. These results suggest that Al-induced AßP oligomers have a strong affinity for membrane surfaces and undergo minimal degradation by proteases compared to Zn-induced oligomers. Furthermore, AßP coupled with Al was reported to be highly toxic compared to normal AßP.

Considering the implications of metals in AD pathogenesis, chelation therapy for AD treatment is of great interest. Clioquinol (quinoform), a chelator of Cu2+ or Zn2+, inhibits oligomerization of AßP and attenuates the accumulation of amyloid in the brains of experimental animals. Clinical trials using its analogue PBT2 are under way. Deferoxamine (DFO), a chelator of Al and Fe, attenuates the decline of daily living skills in AD patients. Silicates, which couple with Al and reduce its toxicity, are also candidates for chelation therapy in AD [17].

#### **2.3. Oligomerization-induced neurotoxicity of AßP**


There is considerable interest in the mechanisms by which AßP oligomers cause neurotoxicity. Exposure to AßP causes various adverse effects on neuronal survival, such as the production of reactive oxygen species, the induction of cytokines, the induction of endoplasmic reticulum (ER) stress, and the abnormal increase of intracellular calcium levels ([Ca2+]i), *etc.* [18]. Although these effects may be interwoven, the disruption of Ca2+ homeostasis is regarded as an important determinant, considering that it occurs upstream of the other effects [19,20]. Ca2+ ions are essential for normal brain functions. They are involved with key enzymes such as kinases, phosphatases, and proteases. Therefore, Ca2+ influx is tightly controlled, and the intracellular Ca2+ levels ([Ca2+]i) are strictly maintained by Ca2+ channels, *etc*. Ca2+ is also implicated in the phosphorylation of the tau protein and in APP sequestration. Increasing evidence indicates that presenilins are involved in capacitative Ca2+ entry or ER Ca2+ signaling, and that their mutations affect Ca2+-regulated functions including AßP production [21].

There is also considerable interest in the mechanism by which AßPs interact with neurons and disrupt Ca2+ homeostasis. In 1993, Arispe *et al.* first demonstrated that AßP(1–40) directly incorporates into artificial lipid bilayer membranes and forms cation-selective ion channels [22]. The channels, termed "amyloid channels", were revealed to be giant multi-level pores that allow a large amount of Ca2+ to pass through. Their activity was blocked by Zn2+ ions, which are abundantly present in the brain. Furthermore, soluble AßP oligomers, but not amyloid fibrils, were reported to increase membrane permeability. Durell *et al.* proposed a 3-D structural model of the amyloid channels, obtained from a computer simulation of the secondary structure of AßP(1–40) in membranes, that showed 5- to 8-mers aggregating to form pore-like structures on the membranes. Multimeric (tetramer to hexamer) pore-like structures of AßPs on reconstituted membranes were observed using atomic force microscopy (AFM). Jang *et al.* established a model of amyloid channels on the membranes and observed that pentameric AßPs form pores whose dimensions, shapes, and subunit organizations are in good agreement with AFM studies [23]. These results strongly support the idea termed the "amyloid channel hypothesis", which suggests that the direct incorporation of AßPs and the subsequent imbalances of Ca2+ and other ions through amyloid channels might be the primary event in AßP neurotoxicity. In this respect, AßP might share its mechanism of toxicity with various antimicrobial or antifungal peptides that also exhibit channel-forming activity and cell toxicity.

To determine whether AßPs form channels on neuronal cell membranes as well as on artificial lipid bilayers, we employed membrane patches from a neuroblastoma cell line (GT1-7 cells), which exhibits several neuronal characteristics such as the extension of neurites and the expression of neuron-specific proteins or receptors [24]. After the excised membrane patches of GT1-7 cells in the bath solution were exposed to AßP(1–40), a current derived from the amyloid channels appeared. The amyloid channels formed on the GT1-7 cell membranes were cation-selective, multilevel, voltage-independent, and long-lasting; the channel activity was inhibited by the addition of Zn2+ and recovered by a zinc chelator, *o*-phenanthroline [25]. These features were considerably similar to those observed on artificial lipid bilayers. Meanwhile, AßP(40–1), a peptide bearing the reversed sequence of AßP(1–40), did not form any channels. Thus, we can conclude that AßPs are directly incorporated into neuronal membranes to form calcium-permeable pores. In order to test the amyloid channel hypothesis, we examined whether AßP altered the [Ca2+]i levels in neurons using a high-resolution multi-site video imaging system with fura-2 as the fluorescent probe reporting cytosolic free calcium. This multi-site fluorometry system enables the simultaneous long-term observation of temporal changes in [Ca2+]i of more than 50 neurons. We observed an AßP-induced abnormal increase in [Ca2+]i in GT1-7 cells [26-28] as well as in primary cultured rat hippocampal neurons [29]. Shortly after exposure to AßP(1–40), a marked increase in [Ca2+]i occurred among many, but not all, neurons. We also observed apoptotic death of cultured neurons after the exposure to AßPs and the consequent rise in the [Ca2+]i levels.


Oligomerization of Proteins and Neurodegenerative Diseases

http://dx.doi.org/10.5772/57482


Considering the results of our study together with those of the other studies, we propose the following hypothetical scheme of neurodegeneration induced by oligomerization of AßP (Fig. 4).

AßPs are normally secreted from APP into the cerebrospinal fluid and are usually degraded proteolytically by neprilysin within a short period. However, upregulation of AßP secretion from APP, or an increased ratio of AßP(1–42) to AßP(1–40), may render AßPs liable to be retained in the brain. It has been demonstrated that APP or presenilin gene mutations promote this process. AßP possesses positive charges at neutral pH. Therefore, the net charge of the outer membrane surface may be a determinant when secreted AßPs bind to cellular membranes (Fig. 4(A)). The distribution of phospholipids on cellular membranes is usually asymmetrical, and negatively charged phospholipids such as PS exist on the inner membrane surfaces. Disruption of this asymmetrical distribution is the first hallmark of apoptotic cell death [30]. Therefore, the binding of AßP to neuronal membranes seldom occurs in normal and young brains. This idea may explain why AD occurs in aged subjects although AßPs are also secreted in the brains of young subjects. After incorporation into the membrane, the conformation of AßPs changes and the accumulated AßPs aggregate on the membranes (Fig. 4(B)). The ratio of cholesterol to phospholipids in the membrane may alter membrane fluidity, thereby affecting the process from step (A) to (B). AßP oligomerization *in vitro* will also enhance the velocity of channel formation. Considering that natural oligomers (dimers or trimers) are more toxic than monomers or fibrils, it is probable that these oligomers form tetrameric or hexameric pores and exhibit neurotoxicity. Micro-circumstances on the membranes, such as rafts, are suitable locations that facilitate this process. Finally, aggregated AßP oligomers form ion channels (Fig. 4(C)), leading to the various neurodegenerative processes. The processes required for channel formation (from steps (A) to (C)) may require a long life span and determine the rate of the entire process. Unlike endogenous Ca2+ channels, these AßP channels are not regulated by the usual blockers. Thus, once they are formed on membranes, a continuous influx of Ca2+ is initiated. However, zinc ions (Zn2+), which are secreted into synaptic clefts in a neuronal activity-dependent manner, inhibit AßP-induced Ca2+ entry, and thus have a protective function in AD.

Once AßP channels are formed on neuronal membranes, the homeostasis of Ca2+ and other ions is disrupted. Disruption of Ca2+ homeostasis triggers several apoptotic pathways, such as the activation of calpain and the induction of caspase, and promotes numerous degenerative processes, including the production of reactive oxygen species (ROS) and the phosphorylation of tau, thereby accelerating neuronal death. Mutations of presenilins cause disturbances in capacitive Ca2+ entry and may influence these pathways. Free radicals also induce membrane disruption, by which unregulated Ca2+ influx is further amplified. The disruption of Ca2+ homeostasis also influences the production and processing of APP. Thus, a vicious cycle of neurodegeneration is initiated. This hypothesis explains the long delay in AD development: AD occurs only in senile subjects, even though AßPs are also normally secreted in younger or normal subjects. Various environmental factors, such as foods or trace metals, as well as genetic factors, will influence these processes and contribute to AD pathogenesis [31].

#### **3. Prion diseases and other amyloidosis**


The disease-related amyloidogenic proteins exhibit similarities in the formation of ß-pleated sheet structures, abnormal deposition as amyloid fibrils in the tissues, and induction of apoptotic degeneration. Prion diseases, including human kuru, Creutzfeldt-Jakob disease, and bovine spongiform encephalopathy (BSE), are associated with the conversion of a normal prion protein (PrPC) to an abnormal scrapie isoform (PrPSc) [32]. The ß-sheet region of PrPSc is suggested to play a crucial role in its transmissible degenerative processes. A peptide fragment of PrP corresponding to residues 106–126 (PrP106-126) has been reported to cause death in cultured hippocampal neurons. We investigated the oligomerization of PrP106-126 and its neurotoxicity in primary cultured rat hippocampal neurons [33]. Like AßP, PrP106-126 formed amyloid-like fibrils with ß-sheet structures during the aging process, as observed by atomic force microscopy and thioflavin T staining. Oligomerization and the formation of ß-sheet structure enhanced the neurotoxicity of PrP106-126. The presence of Zn or Cu inhibited ß-sheet formation of PrP106-126 and attenuated its neurotoxicity. Furthermore, the thickness of PrP106-126 fibrils was decreased in the presence of Zn or Cu.

Electrophysiological and morphological studies have revealed that PrP106-126, like AßP, forms amyloid channels [34]. Lin *et al.* reported that PrP106-126 forms cation-permeable pores in artificial lipid bilayers. The activity of PrP channels was also blocked by Zn2+. Kourie *et al.* investigated the detailed characteristics of channels formed by PrP106-126, concluding that it was directly incorporated into lipid bilayers and formed cation-selective, copper-sensitive ion channels. They also revealed that quinacrine, a potent therapeutic drug, possibly blocks amyloid channels induced by PrP106-126.

The oligomerization and fibrillation of α-synuclein have been implicated in the formation of abnormal inclusions, termed Lewy bodies, and in the etiology of dementia with Lewy bodies (DLB). Non-amyloid component (NAC), a fragment peptide of α-synuclein, accumulates in Alzheimer's senile plaques and causes apoptotic neuronal death. Lashuel *et al.* demonstrated by electron microscopy that α-synuclein forms annular pore-like structures [35].

The elongation of a polyglutamine-coding CAG triplet repeat in the responsible genes underlies the pathogenesis of triplet-repeat diseases such as Huntington's disease and Machado-Joseph disease. Hirakura *et al.* reported that polyglutamine formed ion channels in lipid bilayers.

Lal *et al.* investigated the oligomerization and conformational changes of AßP, synuclein, amylin, and other amyloidogenic proteins using gel electrophoresis and AFM imaging, and demonstrated that these amyloidogenic proteins form annular channel-like structures on bilayer membranes [36]. We have demonstrated that these amyloidogenic peptides, like AßP, also cause elevations in [Ca2+]i [3,31]. Considering these results together, as shown in Table 1, it is suggested that the oligomerization of disease-related amyloidogenic proteins and the induction of apoptotic degeneration by disruption of calcium homeostasis *via* unregulated amyloid channels may be the molecular basis of neurotoxicity in these diseases.

#### **4. Conclusion**

This hypothesis about the pathogenesis of conformational diseases may help in the development of drugs for these diseases. We focus on carnosine (ß-alanyl histidine) as such a protective drug. Carnosine is a naturally occurring dipeptide and is commonly present in vertebrate tissues, particularly within the skeletal muscles and nervous tissues [37]. It is found at high concentrations in the muscles of animals or fish that exhibit high levels of exercise, such as horses, chickens, and whales. Thus, it is believed that carnosine plays important roles in the buffering capacity of muscle tissue, and the administration of carnosine has been reported to induce hyperactivity in animals.

**Figure 4.** Amyloid channel hypothesis. Secretion of AßP from synapses, its direct incorporation into membranes, and the formation of oligomeric amyloid channels are depicted. Details are discussed in the text.

In the brain, a considerable amount of carnosine is localized in the neurons of the olfactory bulb. It is secreted into synaptic clefts along with the excitatory neurotransmitter glutamate during neuronal excitation. Carnosine reportedly has several beneficial effects, including antioxidant activity, the ability to chelate metal ions, and inhibition of the Maillard reaction. Furthermore, carnosine is reported to have anti-crosslinking properties. Attanasio *et al.* reported that carnosine inhibited the fibrillation of alpha-crystallin. It was also demonstrated that carnosine inhibited the oligomerization and subsequent neurotoxicity of AßP. Corona *et al.* showed that dietary supplementation of carnosine attenuated mitochondrial dysfunction and the accumulation of AßP in Alzheimer's model mice [38]. We also showed that carnosine attenuated the neuronal death induced by the prion protein fragment peptide (PrP106-126) by changing its conformation [33]. The carnosine level is significantly reduced in the serum of AD patients. These results suggest possible beneficial effects of carnosine as a treatment for AD and prion diseases. We also demonstrated that carnosine attenuates Zn-induced neuronal death and is a candidate drug for vascular dementia [39,40]. All of these functions of carnosine (e.g., antioxidant, anti-glycating, anti-crosslinking, and scavenging toxic aldehydes) are related to the aging processes. The level of carnosine varies during development and is low in aged animals. Therefore, it is highly possible that carnosine protects against external toxins and acts as an endogenous protective substance against neuronal injury, senescence, and aging. We have applied for patents for carnosine and related compounds as drugs for the vascular type of senile dementia (Patent No. 5382633, Patent No. JP5294194).

In conclusion, further research into the role of protein oligomerization and Ca homeostasis via amyloid channels might lead to the development of new treatments for neurodegenerative diseases.

#### **Acknowledgements**

The authors would like to thank Mr. M. Yanagita, Ms. A. Komuro, and Ms. N. Kato for their technical assistance. This work was partially supported by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan and by a Grant from Cooperation for Innovative Technology and Advanced Research in Evolutional Area (CITY AREA) from the Miyazaki Prefectural Industrial Support Foundation.

#### **Author details**

Dai Mizuno\* and Masahiro Kawahara

Department of Bio-Analytical Chemistry, Musashino University, Research Institute of Pharmaceutical Sciences, Musashino University, Nishitokyo-shi, Tokyo, Japan

#### **References**

[1] Carrell, R.W. & Lomas, D.A. Conformational disease. *The Lancet*, 1997; 350: 134-138.

[2] Loo D *et al.* Proteomics in molecular diagnosis: typing of amyloidosis. *J Biomed Biotechnol.* 2011; 754109. doi: 10.1155/2011/754109.

[3] Kawahara M. Role of calcium dyshomeostasis via amyloid channels in the pathogenesis of Alzheimer's disease. *Current Pharmaceutical Design* 2010; 16: 2779-2789.

[4] Demuro A *et al.* Calcium dysregulation and membrane disruption as a ubiquitous neurotoxic mechanism of soluble amyloid oligomers. *J Biol Chem* 2005; 280: 17294-300.

[5] Selkoe, D.J. The molecular pathology of Alzheimer's disease. *Neuron* 1991; 6: 487-498.

[6] Hardy J and Selkoe DJ. The amyloid hypothesis of Alzheimer's disease: progress and problems on the road to therapeutics. *Science* 2002; 297: 353-6.

[7] Kawahara M, Negishi-Kato M, Sadakane Y. Calcium dyshomeostasis and neurotoxicity of Alzheimer's ß-amyloid protein. *Expert Rev Neurother*. 2009; 9: 681-93.

[8] Walsh DM and Selkoe DJ. Aß oligomers - a decade of discovery. *J. Neurochem*. 2007; 101: 1172-1184.

[9] Klyubin I *et al.* Alzheimer's disease Aβ assemblies mediating rapid disruption of synaptic plasticity and memory. *Mol Brain*. 2012; 5: doi: 10.1186/1756-6606-5-25.

[10] Dyrks T *et al.* Amyloidogenicity of rodent and human ß A4 sequences. *FEBS letters* 1993; 324: 231-236.

[11] Pratico D *et al.* Aluminum modulates brain amyloidosis through oxidative stress in APP transgenic mice. *FASEB J* 2002; 16: 1138-40.

[12] Exley C and Esiri MM. Severe cerebral congophilic angiopathy coincident with increased brain aluminium in a resident of Camelford, Cornwall, UK. *J Neurol Neurosurg Psychiatry* 2006; 77: 877-9.

[13] Kawahara M and Kato-Negishi M. Link between aluminum and the pathogenesis of Alzheimer's disease: the integration of the aluminum and amyloid cascade hypotheses, *Int. J. Alzheimer Dis* 2011; doi: 10.4061/2011/276393.

[14] Bush AI. The metal theory of Alzheimer's disease. *J Alzheimers Dis.* 2013; 33 Suppl 1: S277-81.

[15] Kawahara M *et al.* Aluminum promotes the aggregation of Alzheimer's β-amyloid protein *in vitro*. *Biochem Biophys Res Commun* 1994; 198: 531-535.

[16] Kawahara M *et al.* Effects of aluminum on the neurotoxicity of primary cultured neurons and on the aggregation of β-amyloid protein. *Brain Res Bull* 2001; 55: 211-217.

[17] Faux NG *et al.* PBT2 rapidly improves cognition in Alzheimer's Disease: additional phase II analyses. *J Alzheimers Dis*. 2010; 20(2): 509-16.

[18] Small DH, Mok SS and Bornstein JC. Alzheimer's disease and Aß toxicity: from top to bottom. *Nat Rev Neurosci.* 2001; 2: 595-8.

[19] Brorson JR *et al.* The Ca2+ influx induced by beta-amyloid peptide 25-35 in cultured hippocampal neurons results from network excitation. *J Neurobiol*. 1995; 26: 325-38.

[20] Camandola S and Mattson MP. Aberrant subcellular neuronal calcium regulation in aging and Alzheimer's disease. *Biochim Biophys Acta*. 2011; 1813: 965-73.

[21] Green KN and LaFerla FM. Linking calcium to Aß and Alzheimer's disease. *Neuron* 2008; 59: 190-4.

[22] Arispe N, Pollard HB, and Rojas E. Alzheimer disease amyloid ß protein forms calcium channels in bilayer membranes: Blockade by tromethamine and aluminum. *Proc Natl Acad Sci USA*, 1993; 90: 567-571.

[23] Jang H *et al.* β-Barrel topology of Alzheimer's β-amyloid ion channels. *J. Mol. Biol.* 2010; 404: 917-34.

[24] Mellon PL *et al.* Immortalization of hypothalamic GnRH neurons by genetically targeted tumorigenesis. *Neuron,* 1990; 5: 1-10.

[25] Kawahara M, Arispe N, Kuroda Y, Rojas E. Alzheimer's disease amyloid β-protein forms Zn2+ [...] hypothalamic neurons. *Biophys J*, 1997; 73: 67-75.

[26] Kawahara M, Arispe N, Kuroda Y, Rojas E. Alzheimer's β-amyloid, human islet amylin and prion protein fragment evoke intracellular free-calcium elevations by a common mechanism in a hypothalamic GnRH neuronal cell-line. *J Biol Chem*, 2000; 275: 14077-14083.

[27] Kawahara M and Kuroda Y. Molecular mechanism of neurodegeneration induced by Alzheimer's β-amyloid protein: channel formation and disruption of calcium homeostasis. *Brain Res. Bull.,* 2000; 53: 389-397.

[28] Kawahara M and Kuroda Y. Intracellular calcium changes in neuronal cells induced by Alzheimer's β-amyloid protein are blocked by estradiol and cholesterol. *Cell Mol Neurobio*, 2001; 21: 1-13.

[29] Kato-Negishi M and Kawahara M. Neurosteroids block the increase in intracellular calcium level induced by Alzheimer's β-amyloid protein in long-term cultured rat hippocampal neurons. *Neuropsychiatr Dis Treat*, 2008; 4: 209-218.

[30] Mountz JD *et al.* Molecular imaging: new applications for biochemistry. *J Cell Biochem Suppl.* 2002; 39: 162-71.

[31] Kawahara M *et al.* Membrane incorporation, channel formation, and disruption of calcium homeostasis by Alzheimer's ß-amyloid protein, *Int J Alzheimer Dis*, 2011; 304583.

[32] Prusiner SB. Prions. *Proc. Natl. Acad. Sci. USA.,* 1998; 95: 13363-83.

[33] Kawahara M, Koyama H, Nagata T, Sadakane Y. Zinc, copper, and carnosine attenuate neurotoxicity of prion fragment PrP106-126, *Metallomics* 2011; 3: 726-734.

[34] Kourie JI and Culverson A. Prion peptide fragment PrP[106-126] forms distinct cation channel types. *J. Neurosci. Res.,* 2000; 62: 120-33.

[35] Lashuel HA and Lansbury PT Jr. Are amyloid diseases caused by protein aggregates that mimic bacterial pore-forming toxins? *Q Rev. Biophys*. 2002; 39: 167-201.

[36] Lal R *et al.* Amyloid β ion channel: 3D structure and relevance to amyloid channel paradigm. *Biochim Biophys Acta*, 2007; 1768: 1966-75.

[37] Hipkiss AR. Carnosine and its possible roles in nutrition and health, *Adv Food Nutr Res*, 2009; 57: 87-154.

[38] Corona C *et al.* Effects of dietary supplementation of carnosine on mitochondrial dysfunction, amyloid pathology, and cognitive deficits in 3xTg-AD mice. *PLoS One* 2011; 6(3): e17971.

[39] Koyama H, Konoha K, Sadakane Y, Ohkawara S, Kawahara M. Zinc neurotoxicity and the pathogenesis of vascular-type dementia: Involvement of calcium dyshomeostasis and carnosine, *J. Clin Toxicol.* 2012; S3-002, doi: 10.4172/2161-0495.

[40] Mizuno D and Kawahara M. The molecular mechanism of zinc neurotoxicity and the pathogenesis of vascular type dementia, *Intern J Mol Sci.* 2013; 14: 22067-81.


**Chapter 10**


> © 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Role in Amyloidogenesis**

http://dx.doi.org/10.5772/57570

**1. Introduction**

long axis of the fibrils.

Ajda Taler-Verčič, Mira Polajnar and Eva Žerovnik

Additional information is available at the end of the chapter

## **Structure and Function of Stefin B Oligomers – Important Role in Amyloidogenesis**

Ajda Taler-Verčič, Mira Polajnar and Eva Žerovnik

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/57570

#### **1. Introduction**

Many functional proteins act as oligomers, and their oligomerization is a well-controlled and regulated process. Such proteins cannot perform their physiological functions without oligomerization. In contrast, protein misfolding and the subsequent non-physiological oligomerization impair the primary functions of the protein and produce a "gain of toxic function", because the prefibrillar oligomers are toxic to cells. Misfolding, oligomerization and aggregation are the causes of the so-called conformational diseases. Several of them are neurodegenerative, but some also affect other vital organs. The accumulation of intracellular protein aggregates (various inclusions) and of extracellular protein deposits causes severe cellular degeneration, such as the neurodegeneration of affected neurons. Different proteins form rather similar but not identical fibrillar structures, all showing the cross-β structure, in which the β-strands of continuous β-sheets run perpendicular to the long axis of the fibrils.

Neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, Huntington's disease, prion diseases and many others have in common the aggregation of proteins into amyloid fibrils. Type II diabetes is also an amyloid disease, although it is not neurodegenerative. It is now believed that ordered prefibrillar oligomers or protofibrils may be responsible for cell death and that mature fibrils may even be neuroprotective. Most amyloid-prone proteins form different oligomers in the lag phase of amyloid fibril formation. Conformational diseases (Table 1) are difficult to diagnose at an early stage, because they are usually asymptomatic during their development. Even when early diagnosis is possible, there is an ethical reason to hesitate over informing the patient or the relatives, because there are still no therapies that would slow down or stop the progression of this type of disease. Many different proteins are being studied in order to understand the common molecular mechanism of conformational diseases and to develop appropriate treatments. Studies use *in vitro* (different spectroscopic methods), *ex vivo* (different cell cultures) and *in vivo* (mouse models) systems to clarify the changes in protein conformation and the downstream effects during the whole process.

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

| Conformational disease (organ/systemic) | Protein |
| --- | --- |
| Alzheimer's disease, Down's syndrome (Trisomy 21), Hereditary cerebral angiopathy (Dutch) | Amyloid precursor protein (Aβ 1-42) |
| Kuru, Gerstmann-Sträussler-Scheinker Syndrome (GSS), Creutzfeldt-Jakob disease, Scrapie (sheep), Bovine spongiform encephalopathy ("mad cow") | Prion protein |
| Type II diabetes mellitus (adult onset) | Islet amyloid polypeptide (amylin) |
| Dialysis-associated amyloidosis | β2-microglobulin |
| Familial amyloid polyneuropathy | Transthyretin |
| Reactive amyloidosis, familial Mediterranean fever | Serum amyloid A |
| Familial amyloid polyneuropathy (Finnish) | Gelsolin |
| Hereditary cerebral angiopathy – Iceland | Cystatin C |
| Diffuse Lewy body disease, Parkinson's disease | α-synuclein |
| Familial British dementia | FBDP |
| Familial Danish dementia | FDDP |
| Fronto-temporal dementia | Tau |
| Spinocerebellar ataxias | Ataxins |
| Senile cardiac amyloidosis | Atrial natriuretic factor |
| Macroglobulinemia | Gamma-1 heavy chain |
| Primary systemic amyloidosis | Ig-lambda, Ig-kappa |
| Familial polyneuropathy – Iowa (Irish) | Apolipoprotein A1 |
| Nonneuropathic hereditary amyloid with renal disease | Fibrinogen alpha, Lysozyme |
| Amyotrophic lateral sclerosis | Superoxide dismutase-1 |
| Spinocerebellar ataxia 17 | TATA box-binding protein |
| Spinal and bulbar muscular atrophy | Androgen receptor |
| Triplet repeat diseases (Huntington's, spinocerebellar ataxias) | Polyglutamine tracts (Huntingtin) |

**Table 1.** Diseases of protein misfolding: amyloidoses and non-amyloidoses (reviewed in [1])

Both the amyloid-forming proteins involved in particular conformational diseases and model proteins are used (Table 2). Stefin B is a model protein for studying amyloid fibril formation.

| Model protein/peptide | Model protein/peptide |
| --- | --- |
| SH3 domain p85 phosphatidylinositol 3-kinase | Fibronectin type III |
| Phosphoglycerate kinase | Acylphosphatase |
| HypF N-terminal domain (*E. coli*) | Amphoterin (human) |
| Endostatin (human) | Met aminopeptidase |
| Fibroblast growth factor (*N. viridescens*) | Apolipoprotein CII |
| VI domain (murine) | B1 domain of IgG binding protein |
| Apomyoglobin (equine) | Apocytochrome c |
| Stefin B (human) | ADA2H |
| Curlin CgsA subunit | Monellin |
| Serpins | |

**Table 2.** Non-disease-related amyloid-forming proteins/peptides: model proteins (reviewed in [1]). Serpins lead to disease when mutations induce aggregation or a change of conformation (see, e.g., serpinopathies and the conformational dementias by *David A. Lomas and Robin W. Carrell* [2]).

#### **2. Human stefin B**

Stefins are endogenous cysteine protease inhibitors [3], which are ubiquitously expressed in human tissues [4]. They are specific for the papain family of cysteine proteases and are classified in clan IH in the MEROPS scheme [5]. They are mainly intracellular inhibitors, although they have also been found outside the cell in body fluids [6]. They do not have a signal peptide, and they bear the cystatin motif QXVXG, which is the main site involved in binding to target enzymes.

Human stefin B (also termed cystatin B) has 98 amino acid residues (Mr = 11 kDa) and no carbohydrate groups or disulphide bridges [7], although it contains one free cysteine [8]. It is an intracellular protein that binds tightly and reversibly to papain-like cysteine proteases. We will simply call it stefin B from now on.

The main function of stefin B is protection against inappropriate proteolysis by lysosomal cysteine proteases [9]. It is an inhibitor of cathepsins B, L, H and S [7, 10, 11]. However, it has some additional functions. It was found to interact with five known non-protease proteins: neurofilament light chain (NFL), brain β-spectrin, RACK-1, human myotubularin related protein 8 (Mtrp) and human T-cell activation protein (Tcrp) [12]. NFL and β-spectrin are specific to the nervous system. Stefin B (cystatin B) knock-out mice show a neurological disorder (loss of cerebellar granule cells, with apoptotic bodies, chromatin condensation and some other changes). This suggests that stefin B has an essential anti-apoptotic role in the cerebellum [12].

Stefin B is overexpressed in patients with hepatocellular carcinoma and is therefore used, in combination with some other proteins, as a marker for this disease (it is elevated in the serum already at an early stage of the disease and is therefore easy to detect) [13]. In different types of cancer (human colorectal cancer [14], gastric cancer [15], esophageal carcinoma [16], prostatic adenocarcinoma [17], bladder cancer [18]…), both lower expression levels and higher activity due to higher expression levels have been detected.

Stefin B is localized both in the cytosol and in the nucleus [19, 20]. It is expressed in neurons and glial cells in the brain, but with slightly different localizations: in neurons it localizes to the nucleus, while in astrocytes it is found in the nucleus and in the cytoplasm [21]. In the nucleus it inhibits cathepsin L, whose substrates are transcription factors. Interaction with the nucleosomal histones H2A.Z, H2B and H3 and with cathepsin L in the nucleus has also been reported [19]. Therefore, it likely regulates transcription. Furthermore, stefin B was reported to regulate cell cycle progression: entry into the S phase is delayed [19]. Increased expression of stefin B in the nucleus of T98G astrocytoma cells delays caspase-3 and caspase-7 activation, and this delay is independent of cathepsin inhibition [22].

Stefin B plays an important role in the immune system. It upregulates the release of nitric oxide from interferon-γ-activated macrophages [23]. The protein is involved in the innate immune response to bacterial challenge in the leech *Theromyzon tessulatum* [24]. It has an essential role in protecting the central nervous system from apoptosis [25] and from oxidative stress [26]. Increased levels of stefin B were found in the senile plaques of Alzheimer's and Parkinson's diseases and in samples from patients suffering from senile dementia [27].

Stefin B functions are summarized in Table 3.

Loss of function caused by alterations in the cystatin B gene (dodecamer repeat expansions in the promoter region or point mutations in the coding region) is the cause of progressive myoclonus epilepsy of type 1 (EPM1), also known as Unverricht-Lundborg disease [25, 28]. EPM1 mutants are polymeric and aggregation-prone *in vivo* [29]. The mutants G4R and R68X [30, 31] and G50E and Q71P [32, 33] have been studied *in vitro* and *ex vivo*. Only the G4R mutant folds like the wild type protein [31]; all the others lack tertiary structure and are partially unfolded [31, 33]. G4R and R68X also form amyloid fibrils [31]. Both G50E and Q71P are more susceptible to cleavage by proteases because of their partially unfolded structure [33], which likely contributes to loss of function in cells. Stefin B deficiency triggers neurodegeneration through impaired redox homeostasis [26]. G4R and R68X do not have any inhibitory activity, while G50E and Q71P are much less active than the wild type protein [32]. The missense mutants other than G4R form rather large but diffuse aggregates in cells when overexpressed, while G4R forms small aggregates similar to those of the wild type protein [32].

Cystatin C, a secreted extracellular protein, is another cysteine protease inhibitor, and mutations in its gene are the cause of hereditary cerebral amyloid angiopathy [34]. Human cystatin C is a risk factor for late-onset Alzheimer's disease. The protein co-deposits with amyloid β peptide (Aβ peptide) in amyloid plaques in patients with Alzheimer's disease [35]. Moreover, cystatin C binds both to the whole amyloid precursor protein (APP) and to Aβ peptide, and it inhibits Aβ peptide amyloid fibril formation *in vitro* [36].

| Stefin B functions |
| --- |
| inhibition of cathepsins B, L, H and S [7, 10, 11]; |
| interaction with five non-protease proteins: neurofilament light chain (NFL), brain β-spectrin, RACK-1, human myotubularin related protein 8 (Mtrp) and human T-cell activation protein (Tcrp) (NFL and β-spectrin are specific to the nervous system) – essential anti-apoptotic role in the cerebellum [12]; |
| positive/negative progression of cancer [13-18]; |
| found mainly in the nucleus of proliferating cells and both in the nucleus and cytoplasm of differentiated cells; |
| regulates cell cycle [19]; |
| cell specific expression [21]; |
| increased expression in the nucleus delays caspase activation [22]; |
| upregulation of nitric oxide release from interferon-γ-activated macrophages [23]; |
| involved in immune response to bacterial challenge of the leech *Theromyzon tessulatum* [24]; |
| essential role in some of the neurons in the central nervous system, protecting the cells against apoptosis [25]; |
| protecting cells from oxidative stress [26]; |
| increased level in the senile plaques of Alzheimer's and Parkinson's diseases and in patients suffering from senile dementia [27]; |
| protects thymocytes against cell death [37]; |
| inhibits Aβ peptide amyloid fibril formation in an oligomer-specific manner [38]. |

**Table 3.** Stefin B has many different functions.


#### **3. Stefin B oligomers** *in vitro* **and in cells**

Stefin B can adopt different oligomeric states *in vitro* and in cells. On size-exclusion chromatography (SEC), the wild type protein elutes as a set of well-defined oligomers: apart from monomers, there are dimers, tetramers and even higher oligomers (Figure 1) [39]. The Y31 isoform is predominantly dimeric [31], while the Y31 P79S mutant is tetrameric [40]. All oligomers can be isolated as separate peaks by SEC and remain stable for weeks at pH 7.0 and 4 °C [38].

The SEC results have been confirmed up to decamers by electrospray-ionization mass spectrometry (ESI-MS) [41]. In cells, stefin B is present in both monomeric and oligomeric forms. Oligomer size ranges between 10 and 250 kDa. The higher oligomeric species are resistant to 1% SDS and 8 M urea and partially resistant to reducing agents (DTT treatment). The low molecular mass species comprise monomers, dimers, trimers and pentamers. Stefin B polymers *in vivo* seem to grow by monomer addition and not by addition of domain-swapped dimers. The protein binds to many different proteins of various sizes [29]. Endogenous stefin B already forms small punctate aggregates in cells, and the amount of aggregates increases after overexpression [30].
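As a rough guide to the sizes quoted above, the mass of an n-mer can be approximated as n times the monomer mass. The following Python sketch assumes the monomer mass of about 11 kDa given in Section 2 and ignores any bound partner proteins, so it is only an order-of-magnitude aid; the function and variable names are illustrative and do not come from the cited studies.

```python
# Approximate masses of stefin B oligomers, assuming every subunit
# contributes the monomer mass of about 11 kDa quoted in the text.
MONOMER_KDA = 11.0  # Mr of human stefin B (98 residues), from the text

OLIGOMER_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer",
                  5: "pentamer", 6: "hexamer", 8: "octamer", 10: "decamer"}


def oligomer_mass_kda(n_subunits: int, monomer_kda: float = MONOMER_KDA) -> float:
    """Return the approximate mass of an n-mer in kDa."""
    if n_subunits < 1:
        raise ValueError("an oligomer needs at least one subunit")
    return n_subunits * monomer_kda


def max_subunits(mass_limit_kda: float, monomer_kda: float = MONOMER_KDA) -> int:
    """Largest whole number of subunits whose total mass stays within the limit."""
    return int(mass_limit_kda // monomer_kda)


if __name__ == "__main__":
    for n, name in OLIGOMER_NAMES.items():
        print(f"{name:>9}: ~{oligomer_mass_kda(n):.0f} kDa")
    # Upper end of the 10-250 kDa range reported for cells
    print("subunits within 250 kDa:", max_subunits(250.0))
```

Under these assumptions a decamer corresponds to about 110 kDa, and the upper end of the range observed in cells would correspond to roughly 22 subunits if the species contained stefin B alone. Because the protein also binds other proteins in cells [29], that subunit count should be read as an upper bound rather than a measured stoichiometry.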

**Figure 1.** Elution profile of stefin B wild type protein (S3E31 form) at pH 7.0 from SEC. The whole sample (top trace) and separated oligomers, as indicated, are shown.

Oligomers have been observed for many other proteins, especially the amyloidogenic ones. Oligomeric species appear in the lag phase of the reaction and are usually more stable under non-amyloidogenic conditions. Size and morphology of those oligomers can be determined by SEC, mass spectrometry, dynamic light scattering and analytical ultracentrifugation. Some examples are given below.

Serpins are serine protease inhibitors that are also used as a model for studies of amyloid fibril formation. They have key regulatory functions in the inflammatory, complement, coagulation and fibrinolytic cascades [2]. The mechanism of their fibril formation is known in crystallographic detail [42, 43]. They form amyloid fibrils through domain-swapping [44]. Serpins oligomerize in preference to interacting with other proteins [45]. They form both smaller oligomers and condensed longer polymers [46]. Monomers of some serpins are metastable, with the transition to the dimer as the rate-limiting step. Once dimers are formed, they can connect to each other to form tetramers or recruit monomers to form trimers and also much longer oligomers [47].

Aβ peptide forms a set of oligomers from monomer to hexadecamer; all even- and odd-numbered oligomers have been detected by ESI-MS [48]. The whole set of oligomeric species has also been observed for many other amyloid-forming proteins (insulin [49], β2-microglobulin [50, 51], SH3 domain [52]…).

#### **4. Inhibitory activity of stefin B oligomers**


For a long time it was thought that the monomer is the only active form of stefin B, and that all other oligomeric forms serve as a reservoir of monomers or possess additional functions. We have now shown that all types of oligomers are active against the cysteine protease papain (Figure 2). Dimers, which are presumably domain-swapped (discussed in the structural characterization part), show the lowest activity at an enzyme to inhibitor ratio of 1:2. They, as well as the other oligomers (tetramers and higher), are fully active at an enzyme to inhibitor ratio of 1:4 (Figure 2). The SEC-isolated oligomers are all >95% pure, and the amount of monomer present in the other oligomer samples was tested to show that the inhibitory effect is not due to monomer contamination. The higher oligomers are a mixture of various oligomers, starting from hexamers. The highest peak represents octamers, which are the prevalent species, and we therefore propose that the average size of the higher oligomers is an octamer (this was used for determining the molar ratio). The sample of the higher oligomers contains a small amount of all the other oligomers, and we suspect that some of the inhibitory effect could be due to this contamination, but possibly not the whole effect. The results are in agreement with the detection of different oligomers in cells [29] and suggest that all detected species could retain inhibitory function.

**Figure 2.** Different stefin B oligomers all exert inhibitory activity against the cysteine protease papain. To evaluate the inhibitory activity of stefin B, the BANA test was performed [53]. Stefin B monomers, dimers, tetramers and higher oligomers were prepared in eight different molar ratios [E]:[I], including 1:45, 1:22, 1:11, 1:4, 1:2 and 1:0.2. Higher absorbance means lower inhibitory activity. The average oligomeric state in the sample of higher oligomers (the highest peak from SEC) is the octamer.
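The molar ratios used in this section depend on what is counted as one inhibitor particle. A minimal Python sketch of that arithmetic follows. It assumes the 11 kDa subunit mass from Section 2 and treats the higher-oligomer sample as octamers, as proposed above. The papain mass of 23.4 kDa and the example amounts are illustrative assumptions, not values taken from this chapter, and the function names are hypothetical.

```python
SUBUNIT_KDA = 11.0   # stefin B monomer mass, from the text
PAPAIN_KDA = 23.4    # assumed approximate mass of papain (not from the chapter)


def inhibitor_mass_ug(enzyme_ug, inhibitor_per_enzyme, n_subunits,
                      enzyme_kda=PAPAIN_KDA, subunit_kda=SUBUNIT_KDA):
    """Mass of an n-mer (in ug) needed for an [E]:[I] ratio of
    1:inhibitor_per_enzyme, counting each oligomer as one inhibitor particle."""
    # 1 kDa equals 1 ug per nmol, so ug divided by kDa gives nmol.
    enzyme_nmol = enzyme_ug / enzyme_kda
    oligomer_nmol = enzyme_nmol * inhibitor_per_enzyme
    return oligomer_nmol * n_subunits * subunit_kda


def subunits_per_enzyme(inhibitor_per_enzyme, n_subunits):
    """Stefin B chains per enzyme molecule at the given oligomer-based ratio."""
    return inhibitor_per_enzyme * n_subunits


if __name__ == "__main__":
    # Example: 23.4 ug of enzyme (1 nmol under the assumed mass)
    for label, ratio, n in (("dimer, 1:2", 2, 2), ("octamer, 1:4", 4, 8)):
        mass = inhibitor_mass_ug(23.4, ratio, n)
        chains = subunits_per_enzyme(ratio, n)
        print(f"{label}: {mass:.0f} ug inhibitor, {chains} chains per enzyme")
```

The point of the sketch is that a ratio of 1:4 expressed per octamer corresponds to 32 stefin B chains per enzyme molecule, whereas 1:4 expressed per monomer corresponds to only 4. Comparisons between oligomer classes are therefore sensitive to the assumed average oligomer size.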

#### **5. Structural characterization of different oligomeric states of stefin B**

Crystal or solution structures of stefin B monomer and oligomers have so far been characterized for the monomer in complex with the target enzyme [54], the dimer under amyloid-forming conditions [55] and the tetramer [40], and a model of the amyloid fibrils has been proposed [56]. The domain-swapped dimer of stefin A has also been determined by heteronuclear NMR [57].

Stefin B was first crystallized in complex with the cysteine protease papain (Figure 3A) [54]. The structure of monomeric stefin B consists of a five-stranded β-sheet wrapped around a five-turn α-helix, with an additional carboxy-terminal strand running along the convex side of the β-sheet. This type of tertiary structure is conserved throughout the cystatin family (Figure 3A).

A high-resolution structure of the stefin B dimer was determined by heteronuclear NMR [55]. Furthermore, changes within the dimer under amyloid-forming conditions were observed and flexible residues were defined. Even under amyloid-forming conditions the structure remains largely folded; the main differences are in the flexible loop regions. The prolines at positions 74 and 79 play an important role in the orientation of the loop between strands 4 and 5 (the same loop is involved in "hand-shaking" in tetramer formation). Four different dimer states have been observed in solution.

When the proline at position 79 was mutated to serine, the tetrameric form of the protein became favourable [40]. This tetramer has been crystallized and its structure determined (Figure 3B) [40]. The tetramer consists of two domain-swapped dimers connected to each other by the so-called "hand-shaking" mechanism. Hand-shaking occurs along with *trans* to *cis* isomerization of the proline at position 74. This proline is conserved throughout the cystatin family. Both domains of a dimer preserve the fold of the monomer (the same as observed in the complex with papain). Each domain is composed of the N-terminal part of one chain (strand 1, the α-helix and strand 2) and the C-terminal part of the other chain (strands 3, 4 and 5); between the two domains there is a linker region (two peptide segments, one from each chain).
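The chain composition of the two domains in a domain-swapped dimer can be written out explicitly. The Python sketch below is a bookkeeping illustration only. The element labels follow the description above, while the chain names `A` and `B` and all function names are hypothetical, and no coordinates or loop geometry are modelled.

```python
# Secondary structure elements as grouped in the text.
N_TERMINAL = ("strand 1", "alpha-helix", "strand 2")
C_TERMINAL = ("strand 3", "strand 4", "strand 5")


def monomer_domain(chain):
    """Unswapped fold: every element comes from the same chain."""
    return {element: chain for element in N_TERMINAL + C_TERMINAL}


def swapped_dimer(chain_a="A", chain_b="B"):
    """Two domains, each built from the N-terminal part of one chain
    and the C-terminal part of the other chain."""
    domain_1 = {**{e: chain_a for e in N_TERMINAL},
                **{e: chain_b for e in C_TERMINAL}}
    domain_2 = {**{e: chain_b for e in N_TERMINAL},
                **{e: chain_a for e in C_TERMINAL}}
    return domain_1, domain_2


if __name__ == "__main__":
    d1, d2 = swapped_dimer()
    for name, domain in (("domain 1", d1), ("domain 2", d2)):
        print(name, domain)
```

Each domain contains the same six elements as the monomer, which is why the monomer fold is preserved, but the elements are contributed by two different chains. The exchange takes place at the junction between strand 2 and strand 3.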

The structured core of human stefin B amyloid fibrils has been determined (Figure 3C) [56]. Based on the H/D exchange rates, the structure could be divided into a protected region (inside the fibril core) and an unprotected region (sticking out of the fibril). Strands 2, 3, 4 and 5 are protected, while strand 1, the α-helix and the loops between strands 3 and 4 and between strands 4 and 5 are unprotected. The loop between strands 2 and 3 is the same as the one involved in the domain-swapping mechanism. The fibril core would therefore be made of domain-swapped dimers, but with the loop between strands 4 and 5 in the position found in the tetramer.


**Figure 3.** Stefin B structure. (A) stefin B monomer (PDB 1stf) [54], (B) stefin B tetramer (PDB 2oct) [40], (C) secondary structure representation of stefin B fibril [56]

#### **6. Initial oligomers on the way to fibril formation**

That stefin B forms amyloid fibrils *in vitro* under relatively mild conditions was first shown in 2002 [58]; various solution conditions were probed, and fibrils and protofibrils were imaged by AFM and TEM [59]. The fibril formation of stefin B was compared with that of stefin A, a much more stable homologue [60], and with that of proline mutants [61], EPM1 mutants [62] and chimeras between stefins A and B [63].

A model for the mechanism of stefin B amyloid fibril formation has been proposed on the basis of the temperature and concentration dependence of the kinetics [64]. By following changes during the lag phase of the reaction with SEC and ESI MS, we were able to improve the model of amyloid fibril formation (Figure 4) and to further explain the role of various off-pathway oligomers [41]. Dimers seem to be the building blocks from which fibril formation starts and the fibrils grow [56]. When the process was started from any kind of oligomer, these were transformed into dimers [41]. NMR studies have shown that four types of dimers are in fact observed in the lag phase under amyloid-forming conditions [55].
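To make the terms "lag phase" and fibril growth more concrete, the following minimal sketch uses a generic empirical sigmoid that is often fitted to ThT fluorescence traces. It is not the kinetic model of [64] or [41]; the function names and all parameter values are hypothetical and serve only as an illustration. In this description the lag time is taken as the point where the tangent at the curve midpoint crosses the baseline, i.e. `t_half - 2 / k_app`.

```python
import math


def tht_signal(t, baseline, amplitude, k_app, t_half):
    """Empirical sigmoid for a ThT trace (illustrative, not the model of [64]).

    t and t_half share one time unit; k_app is the apparent growth rate
    in reciprocal time units.
    """
    return baseline + amplitude / (1.0 + math.exp(-k_app * (t - t_half)))


def lag_time(k_app, t_half):
    """Time at which the midpoint tangent intersects the baseline."""
    return t_half - 2.0 / k_app


# Hypothetical parameters (time in hours, signal in arbitrary units).
BASELINE, AMPLITUDE, K_APP, T_HALF = 0.0, 100.0, 0.5, 20.0

for hours in (0, 10, 16, 20, 24, 30):
    value = tht_signal(hours, BASELINE, AMPLITUDE, K_APP, T_HALF)
    print(f"t = {hours:2d} h  signal = {value:6.2f}")

print(f"lag time = {lag_time(K_APP, T_HALF):.1f} h")
```

With such a description, a change in the lag phase and a change in the growth phase appear as changes in two separate parameters, which is why the two phases can be discussed independently.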


**Figure 4.** The proposed mechanism for amyloid fibril formation by stefin B [41]. Non-toxic species are coloured green, toxic are coloured red and potentially toxic are coloured orange.

#### **7. Oligomeric state related toxicity**

Higher oligomers of stefin B are as toxic to cells as the molten globule state (mutant P74S Y31) [39]. Higher oligomers still have native-like CD spectra, which would mean that they are properly folded, but they show increased ANS binding. The toxicity seems to correlate with ANS binding (an indicator of exposed hydrophobic patches in the protein structure) [39]. The MTS test has shown that isolated dimers and tetramers are not toxic (Figure 5A), while higher oligomers show some toxicity to neuroblastoma cells (SH-SY5Y). In accordance, lower-order species (monomers, dimers and tetramers) do not increase caspase-3-like activity (Figure 5B) [39]. All these results correlate with insertion into lipid membranes, where toxic species insert more effectively than non-toxic ones. Monomers are not internalized into the cytoplasm when added to the cell medium, while higher oligomers are [39].

Confirming the above results, another study [65] demonstrated that the most toxic species are the P74S Y31 mutant of stefin B, which is a molten globule at neutral pH, and the prefibrillar aggregates formed at pH 3, which are also in the molten globule conformation.

A growing body of evidence shows that the soluble oligomers formed during amyloid fibril formation exert toxicity and likely cause neurodegeneration [66]. Small soluble oligomers are the cause of synaptic dysfunction, whereas large, insoluble aggregates are likely a reservoir of those toxic species [66]. In the case of Aβ peptide, a significant correlation has been found between the levels of soluble oligomers and the degree of synaptic alteration, neurodegeneration and cognitive decline, whereas no correlation between insoluble deposits and those symptoms has been demonstrated [67]. Non-fibrillar dimers or oligomers of α-synuclein could play a major role in the progression of Parkinson's disease [68]. In accordance with a generic process of amyloid-type protein aggregation, toxicity is not limited to oligomers of amyloidogenic proteins related to amyloid diseases: HypF-N from *E. coli*, the SH3 domain of PI3 kinase and some other proteins also form oligomers that are toxic to fibroblasts and neurons, whereas amyloid fibrils of the same proteins show only low toxicity [69, 70].

Conformation-dependent antibodies distinguish between soluble oligomers [71-73] and amyloid fibrils [74, 75]. Many of those antibodies recognize generic epitopes regardless of the amino acid sequence, which suggests that different proteins form structurally similar oligomers and fibrils. It was shown that an antibody recognizing Aβ peptide fibrils also recognizes fibrils of transthyretin, islet amyloid polypeptide, β2-microglobulin and polyglutamine, and that the same antibody recognizes neither soluble oligomers of those proteins nor their native forms [75]. Antibodies recognizing soluble oligomers also inhibit their toxicity, which suggests that soluble oligomers share not only structural properties but also the mechanism of toxicity [72].


**Figure 5.** Viability of cells exposed to prefibrillar aggregates and oligomers of stefin B. (A) MTS test: viability of the SH-SY5Y cells was measured after exposure for 16 hours to serum-free medium, which contained prefibrillar oligomers of stefin B – StBpH3, StBpH5, StBpH3F (filtered), and separated dimers, tetramers and higher-order oligomers of stefin B at pH 7 – StBpH7; each at a final concentration of 44 µM. As a control for the effect of a non-toxic protein, cells were exposed for the corresponding length of time to 44 µM of soluble stefin A prepared at pH 5 (StApH5). All the results are relative to the cells alone, with no added substance or protein in the medium. (B) DEVD-ase activity after exposing SH-SY5Y cells to prefibrillar aggregates. 200 µL of concentrated StBpH3 aggregates were added to 800 µL of cell medium to a final concentration of 44 µM (final pH was checked to be neutral). The cells were treated for four hours with 0.5 µM staurosporine (STS). Statistics: as described [39].
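The dilution quoted in the caption of Figure 5 implies a concentration for the StBpH3 stock that is not stated explicitly. The short calculation below is a sketch that assumes the two volumes are simply additive and that the stock is the only source of protein; the function name is hypothetical.

```python
def stock_concentration(final_conc_um, stock_volume_ul, medium_volume_ul):
    """Back-calculate the stock concentration from C1 * V1 = C2 * V2.

    Assumes additive volumes and no protein in the medium itself.
    """
    total_volume_ul = stock_volume_ul + medium_volume_ul
    return final_conc_um * total_volume_ul / stock_volume_ul


# Values taken from the caption: 200 µL of stock into 800 µL of medium, 44 µM final.
stock_um = stock_concentration(44.0, 200.0, 800.0)
print(f"implied stock concentration: {stock_um:.0f} µM")
```

Under these assumptions the concentrated aggregates would have been about 220 µM before the five-fold dilution.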

Both oligomers and amyloid fibrils are harmful to cells; oligomers are mainly responsible for the toxic effect, whereas amyloid fibrils act as steric obstacles.

• the disruption of the tissue architecture and functions promoted by the invasion of the extracellular space of organ by amyloid [76, 77]


• the destabilization of intracellular and extracellular membranes by oligomers [78, 79]

• the apoptotic cell death and receptor-mediated toxicity triggered by the oligomer interaction with various neuronal receptors [80]

• the oligomer-mediated impairment of the P/Q-type calcium currents [81]

• the impaired maturation of autophagosomes to lysosome mediated by the oligomer accumulation [82]

• the dysfunction of autophagy, a lysosomal pathway for degrading organelles and proteins [83]

• the oxidative damage-induced disruption of the cell viability promoted by the incorporation of redox metals into amyloid fibrils and subsequent generation of reactive oxygen species [84-88]

• the general disorganization of cellular protein homeostasis associated with the exhaustion of the cell defence mechanisms, such as chaperone system [89, 90]

• proteasome inhibition [91]

• the loss of crucial protein functions and/or gain of toxic functions

**Table 4.** Cellular degeneration after protein oligomerization/deposition (summarized in [92])

#### **8. Oligomeric state related membrane interactions**

Amyloid-forming proteins interact with biological membranes and even form pores, which are similar to those formed by the pore-forming proteins [1, 93]. Interactions between stefin B and membranes have been extensively studied [65, 94, 95]. Three different proteins have been studied: the wt protein, the Y31 isoform and the G4R mutant (a mutant involved in EPM1 disease). Some comparisons have also been made between the native and prefibrillar states of the protein (the latter prepared by incubation at pH 3.3 or 4.8) (Figure 6) [94]. Prefibrillar oligomers may be organized in such a way that they are more amphipathic than the native protein and therefore acquire a higher surface-seeking potential. All forms of the protein insert into acidic lipid membranes, cause permeabilization of unilamellar vesicles and destabilize the membranes [94, 95]. Nevertheless, there are some significant differences between them concerning pore formation. The mutant G4R does not form pores but breaks the membrane very fast (in a few minutes), while both the wt protein and the Y31 isoform form pores; wt pores are cation selective (and may not be deleterious for the cell) and Y31 pores are anion selective [95]. The wt sample was separated into monomers, dimers, tetramers and higher oligomers. Monomers, dimers and tetramers insert into lipid membranes with approximately the same critical pressure, while the higher oligomers (the average size is around octamers) insert more effectively [65].

The exact mechanism of amyloid-induced toxicity is still controversial; nevertheless, pore formation in the plasma or mitochondrial membrane is now the leading theory of pathogenesis. Aβ peptide needs anionic phospholipids for binding and insertion into membranes [96-101]. The monomeric form does not interact with membranes. Amylin (islet amyloid polypeptide) readily forms pores in planar lipid bilayers, preferring negatively charged lipids; furthermore, the channel activity is inversely proportional to the amount of negative surface charge in the membrane [102]. The native form of prion protein does not form pores, while the mutant prion protein forms irreversible pores in the lipid bilayer. The authors claim that pore formation is the main cause of prion disease [103]. α-Synuclein, a natively unfolded protein, also forms pores in lipid membranes but, in contrast to the other proteins described, it is not known how the oligomeric form increases membrane permeability without forming discrete channels [104]. The literature contains contradictory results on the effects that amyloid proteins exert on membranes. For the same proteins, some reports state that they affect membranes by a non-channel mechanism and others that they affect membranes by the formation of ion channels. The two mechanisms of pore formation in membranes by the amyloid-forming proteins are summarized in Table 5.


**Figure 6.** Binding of stefin B variants to liposomes as measured by SPR. (A) Binding to PC LUV. A comparison of binding of 70 µM stefin variants to PC LUV immobilized on the surface of an L1 sensor chip. (B–D) A comparison of the binding of prefibrillar form of stefin variants to PG LUV. The concentration of the protein was 10, 20, 40, 50, 60 and 70 µM (curves from the bottom to the top) in each case. The thick gray line represents binding of native stefin variants at 70 µM. (B) StB-wt; (C) StBY31; (D) G4R [95].

| Membrane poration followed by nonspecific membrane leakage – increased conductivity of the membranes by a non-channel mechanism | Specific ion transport through ion channel followed by destabilized ionic homeostasis |
|---|---|
| Aβ40 peptide [105] | Aβ40 peptide [106, 107] |
| Aβ42 peptide [105] | Aβ42 peptide [108] |
| α-synuclein [105] | Aβ22-35 peptide [109] |
| IAPP or amylin [105, 110] | α-synuclein [111] |
| Polyglutamine [105] | IAPP or amylin [112] |
| Prion (106-126) H1 [105] | stefin B [95] |
| | SOD1 [113] |

**Table 5.** The mechanisms underlying globular peptides induced cell dysfunction

#### **9. Copper binding to stefin B and its inhibition of amyloid fibril formation**

Divalent metal ions (Cu(II), Zn(II), Fe(II)) are often observed to colocalize with amyloid plaques *in vivo*, at much higher concentrations than are usually present in the normal environment. This has led to the hypothesis that these metal ions bind to mature fibrils and influence the fibril formation reaction [114].


Indeed, it was proven that human stefin B is a copper-binding protein [114]. It shows picomolar affinity for copper at pH 7 and nanomolar affinity at pH 5. Both the wild-type protein and the Y31 isoform bind copper, while the Y31 P79S mutant, which is tetrameric, does not. Monomers and dimers are able to bind copper, while other oligomers are not. Copper binding does not change the conformation of the protein; however, it inhibits amyloid fibril formation (Figure 8). Of note, it does not prevent aggregation into prefibrillar oligomers, which can be even more toxic to cells. The protein binds two Cu2+ ions. Copper binding promotes protein dimerization at neutral pH, while at acidic pH the protein dimerizes even without added copper [114].
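To give a feeling for what picomolar versus nanomolar affinity means at the micromolar concentrations used in the fibrillation assays, the sketch below evaluates a simple 1:1 binding equilibrium. This is a deliberate simplification: the text states that the protein binds two Cu2+ ions, and the dissociation constants used here (1 pM and 1 nM) are hypothetical order-of-magnitude placeholders, not measured values from [114]. The concentrations of 45 µM protein and 46 µM Cu2+ are those quoted in the caption of Figure 7.

```python
import math


def fraction_protein_bound(protein_total, copper_total, kd):
    """Fraction of protein in the complex for a single-site 1:1 equilibrium.

    All arguments are in mol/L. Uses the exact quadratic solution, so it
    remains valid when the ligand is not in excess (illustrative model only).
    """
    b = protein_total + copper_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * copper_total)) / 2.0
    return complex_conc / protein_total


PROTEIN = 45e-6   # mol/L, from the caption of Figure 7
COPPER = 46e-6    # mol/L, from the caption of Figure 7

# Hypothetical dissociation constants standing in for "picomolar" and "nanomolar".
for label, kd in (("pH 7, Kd ~ 1 pM", 1e-12), ("pH 5, Kd ~ 1 nM", 1e-9)):
    bound = fraction_protein_bound(PROTEIN, COPPER, kd)
    print(f"{label}: fraction of protein bound = {bound:.4f}")
```

Within this simplified model, nearly all of the protein would be copper-bound at either pH when protein and copper are both present at tens of micromolar, so the difference in affinity matters mainly at much lower concentrations.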

Later, NMR was used to detect the three possible copper-binding sites of stefin B. The first is in the α-helix facing away from the β-sheets, together with the loop between strands 4 and 5. The second potential binding site is the C-terminus together with the loop between strands 3 and 4 (this binding site is more likely in the dimeric form of the protein). The third binding site is present in the dimer only and is the loop between strands 2 and 3, which is the stretched loop in the domain-swapped dimer, together with a few residues at the N-terminus [55].

It was confirmed that copper affects stefin B fibril formation. In more detail, it slows down the elongation phase. It was also shown that the presence of copper destabilizes the protein structure, and it was therefore concluded that copper preferentially binds to the slightly unfolded state of the protein. The final fibril morphology does not differ whether copper ions are present or not [55].

Metal ions have different effects on amyloid-forming proteins. They can speed up fibril formation, reduce the lag phase, retard fibrillization or even stop the process completely [115-117].


**Figure 7.** Inhibition of fibrillation of stefin B by Cu2+ as probed by ThT fluorescence. Final protein concentration was in all cases 45 µM and final concentration of Cu2+ 46 µM, giving a protein to Cu2+ ratio of 1 : 1. (A) Stefin B wild type (E31 isoform) at pH 5, 40 °C, 0 and 50 µM Cu2+ in the buffer. (B) Stefin B wild type (E31 isoform) at pH 5, 10% TFE, 25 °C, 0 and 50 µM Cu2+ in the buffer [114].

Binding of metal ions to prion protein increases its resistance to proteolysis and induces structural changes, which might play an important role in the conversion process [118].

SOD1 aggregation is enhanced and modulated by Ca2+ ions; at physiological pH Ca2+ induces conformational changes that increase β-sheet content and can also divert the aggregation from amyloid fibrils to amorphous aggregates [119]. Cu2+ and Zn2+ accelerate deposition of Aβ40 peptide and Aβ42 peptide, which results in amorphous aggregates [120, 121]. Fe3+ induces the deposition of fibrillar amyloid plaques of Aβ40 peptide and Aβ42 peptide [120]. Prion protein can directly influence neuronal zinc concentrations [122]. In prion plaques a significant dysregulation of copper and manganese was found: copper was depleted and manganese was enriched [123]. Fe3+ and Al3+ enhance both formation of mixed oligomers and recruitment of α-synuclein in pre-formed tau oligomers [124]. Copper and selenium inhibit amylin fibril formation, while aluminium and manganese promote it [125]. The copper-amylin complex has anti-aggregating and anti-apoptotic properties, quenching the metal-catalysed ROS [126]. Zinc also inhibits amylin fibril formation; furthermore, it favours the formation of amylin hexamers and inhibits the formation of dimers [127].

#### **10. Interaction between oligomers of stefin B and Aβ peptide** *in vitro* **and in cells – "amateur chaperones"**

"Professional chaperones" (the heat shock protein family) prevent protein misfolding and subsequent protein aggregation. "Amateur chaperones" bind amyloidogenic proteins and may affect their aggregation process. Both types of chaperones colocalize with the pathological lesions of Alzheimer's disease and may be involved in the conformational changes, aggregation, accumulation, persistence and clearance of Aβ peptide in the brain. They may therefore serve as potential targets for the medical treatment of patients with Alzheimer's disease [128].

Aβ peptides are produced by sequential proteolytic cleavage of the amyloid precursor protein (APP) by β-secretase and γ-secretase, whereas cleavage by α-secretase occurs within the Aβ sequence and precludes Aβ formation. Cathepsin B, which can be inhibited by stefin B, is a likely contributor to β-secretase activity [129]. Cysteine protease inhibitors reduce both the Aβ peptide level in the brain and β-secretase activity *in vivo* [129]. Stefin B therefore has an effect on the production of Aβ peptides and, furthermore, it inhibits Aβ peptide fibril formation [38].

Two isoforms of stefin B (Y31 and wild-type E31) have been studied. The Y31 isoform is predominantly a dimer [31], while the wild-type protein exists as a mixture of monomers, dimers, tetramers and even higher oligomers [39]. ThT fluorescence and transmission electron microscopy (TEM) have shown that the Y31 isoform completely inhibits Aβ peptide fibril formation (Figure 8, I). A direct interaction between the two proteins has also been shown by SPR measurements, where a concentration-dependent interaction was reported, and by ESI-MS, where a complex between the stefin B dimer and the Aβ peptide monomer was detected (Figure 8, II) [38]. Furthermore, isolated oligomers of the wild-type protein have been studied: among them, only the tetramer inhibits Aβ peptide fibril formation, while the higher oligomers show only weak inhibition. Stefin B also colocalizes with Aβ peptide aggregates in cells (shown by confocal microscopy) and with the C-terminal fragment of APP, which comprises the Aβ peptide sequence (shown by immunoprecipitation).


**Figure 8.** (I) Inhibition of Aβ peptide fibril formation by stefin B measured by ThT fluorescence and (II) detection of the complex by ESI-MS. (I) The Aβ peptide concentration was 17 µM throughout, at pH 7.3 and 40 °C. (I, A) Aβ peptide alone, a 1:1 molar ratio of Aβ to Y31 stefin B (complete inhibition) and a 1:1 molar ratio of Aβ to E31 stefin B. (I, B) Aβ peptide alone and at 1:1 molar ratios with E31 stefin B monomers, dimers, tetramers and higher oligomers. The protein concentration of stefin B was 17 µM. (II) ESI-MS spectra of the Aβ(1–40) peptide, the stefin B dimer (Y31 variant) and their mixture: (II, A) 2 µM Aβ(1–40) peptide; (II, B) a mixture of 2 µM Aβ peptide and 2 µM stefin B; (II, C) 2 µM stefin B. Peaks corresponding to the Aβ peptide-stefin B complex are denoted with an asterisk, and the numbers above the peaks denote the charge states of the ions [38].

To test whether binding between domain-swapped dimers and Aβ peptide is a generic property, other cystatins were examined. Of these, only cystatin C dimers inhibited Aβ peptide fibril formation (30% inhibition), while stefin A dimers exhibited no such effect [140].

Several molecules that interact with Aβ peptide have been reported to date (Table 6).

| Molecule/protein | Type of interaction |
|---|---|
| stefin B | Inhibits Aβ peptide fibril formation; interaction *in vitro* and in cells [38]. |
| cystatin C | Concentration-dependent inhibition of Aβ peptide fibril formation [36]. |
| laminin | Inhibits Aβ peptide fibril formation; induces depolymerisation of preformed fibrils [130]. |
| polyphenols | Inhibit Aβ peptide fibril formation [131]. |
| heme | Prevents Aβ peptide aggregation [132]. |
| GroEL | Prevents Aβ peptide aggregation [133]. |
| apolipoprotein E | Slows down the oligomerization of Aβ peptide [134]. |
| myelin basic protein | Inhibitor of Aβ peptide fibrillar assembly [135]. |
| ferulic acid | Inhibits Aβ peptide oligomer formation from monomers and at the same time accelerates fibril formation from already formed oligomers [136]. |
| S14G humanin | Inhibits aggregation into fibrils and disaggregates preformed fibrils; reduces the cytotoxic effect of Aβ peptide [137]. |
| albumin | Increases the lag phase and decreases the total amount of fibrils [138]. |
| crocetin | Inhibits Aβ peptide fibril formation, destabilizes preformed fibrils, stabilizes Aβ peptide oligomers and prevents their conversion into fibrils [139]. |

**Table 6.** Molecules interacting with Aβ peptide

#### **11. Conclusions and perspectives**

Stefin B has so far proved to be a good model system for studying amyloid fibril formation. It exhibits nearly all the features shared by other amyloid-forming proteins: it forms mature fibrils under mildly acidic conditions, or even at neutral pH at somewhat higher temperature; it forms membrane pores and thereby promotes membrane leakage; it binds copper ions; and its oligomers are toxic. It is not only a model protein but could also be termed an "amateur chaperone" that affects Aβ peptide fibril formation both *in vitro* and *in vivo* [38].

We are trying to extend our *in vitro* knowledge to cell cultures to contribute even more to the understanding of conformational disease. It is hoped that new knowledge of protein oligomerization and aggregation on the molecular and cellular levels will contribute to the development of new therapeutic strategies for patients with various conformational diseases, including the neurodegenerative ones.

#### **Acknowledgements**

All this work has been done by several contributors over the past years. We are thankful to Manca Kenig, Sabina Rabzelj, Slavko Čeru, Katja Škerget, Aida Smajlović and Saša Jenko Kokalj, and likewise to our colleagues Vito Turk and Nataša Kopitar Jerala (both JSI, Slovenia) and Selma Berbić (University of Tuzla, BiH). Many collaborators from different fields helped us to gain different perspectives on the problem: Peep Palumaa (TTU, Estonia) – mass spectrometry, Dušan Turk (JSI, Slovenia) – structural biology, Magda Tušek Žnidarič (NIB, Slovenia) – transmission electron microscopy, Andrej Vilfan (JSI, Slovenia) – calculations and mathematical models, Rosemary A. Staniforth (University of Sheffield, UK) – NMR, Miha Škarabot (JSI, Slovenia) – atomic force microscopy, and many others.

This work was supported by the program P1-0140 and the projects J7-4050 (led by E. Z.) via the Slovenian Research Agency (ARRS) and ARRS young investigators grants (A. T.-V.).

#### **Author details**

Ajda Taler-Verčič1,2, Mira Polajnar1,2 and Eva Žerovnik1,2\*

\*Address all correspondence to: eva.zerovnik@ijs.si

1 Department of Biochemistry, Molecular and Structural Biology, Jožef Stefan Institute, Ljubljana, Slovenia

2 Jožef Stefan International Postgraduate School, Ljubljana, Slovenia

#### **References**

[1] Kagan B.L., J. Thundimadathil, Amyloid peptide pores and the beta sheet conformation. Adv Exp Med Biol 2010; 677: p. 150-167.

[2] Lomas D.A., R.W. Carrell, Serpinopathies and the conformational dementias. Nat Rev Genet 2002; 3(10): p. 759-768.

[3] Turk V., W. Bode, The cystatins: protein inhibitors of cysteine proteinases. FEBS Lett 1991; 285(2): p. 213-219.

[4] Jerala R., M. Trstenjak, B. Lenarcic, V. Turk, Cloning a synthetic gene for human stefin B and its expression in E. coli. FEBS Lett 1988; 239(1): p. 41-44.

[5] Rawlings N.D., D.P. Tolle, A.J. Barrett, MEROPS: the peptidase database. Nucleic Acids Res 2004; 32(Database issue): p. D160-164.

[6] Abrahamson M., A.J. Barrett, G. Salvesen, A. Grubb, Isolation of six cysteine proteinase inhibitors from human urine. Their physicochemical and enzyme kinetic properties and concentrations in biological fluids. J Biol Chem 1986; 261(24): p. 11282-11289.

[7] Brzin J., M. Kopitar, V. Turk, W. Machleidt, Protein inhibitors of cysteine proteinases. I. Isolation and characterization of stefin, a cytosolic protein inhibitor of cysteine proteinases from human polymorphonuclear granulocytes. Hoppe Seylers Z Physiol Chem 1983; 364(11): p. 1475-1480.

[8] Pol E., I. Bjork, Role of the single cysteine residue, Cys 3, of human and bovine cystatin B (stefin B) in the inhibition of cysteine proteinases. Protein Sci 2001; 10(9): p. 1729-1738.

[9] Turk B., V. Turk, D. Turk, Structural and functional aspects of papain-like cysteine proteinases and their protein inhibitors. Biol Chem 1997; 378(3-4): p. 141-150.

[10] Bromme D., R. Rinne, H. Kirschke, Tight-binding inhibition of cathepsin S by cystatins. Biomedica biochimica acta 1991; 50(4-6): p. 631-635.

[11] Turk V., J. Brzin, M. Kotnik, B. Lenarcic, T. Popovic, A. Ritonja, M. Trstenjak, L. Begic-Odobasic, W. Machleidt, Human cysteine proteinases and their protein inhibitors stefins, cystatins and kininogens. Biomedica biochimica acta 1986; 45(11-12): p. 1375-1384.

[12] Di Giaimo R., M. Riccio, S. Santi, C. Galeotti, D.C. Ambrosetti, M. Melli, New insights into the molecular basis of progressive myoclonus epilepsy: a multiprotein complex with cystatin B. Human molecular genetics 2002; 11(23): p. 2941-2950.

[13] Lee M.J., G.R. Yu, S.H. Park, B.H. Cho, J.S. Ahn, H.J. Park, E.Y. Song, D.G. Kim, Identification of cystatin B as a potential serum marker in hepatocellular carcinoma. Clin Cancer Res 2008; 14(4): p. 1080-1089.

[14] Sheahan K., S. Shuja, M.J. Murnane, Cysteine protease activities and tumor development in human colorectal carcinoma. Cancer research 1989; 49(14): p. 3809-3814.

[15] Plebani M., L. Herszenyi, R. Cardin, G. Roveroni, P. Carraro, M.D. Paoli, M. Rugge, W.F. Grigioni, D. Nitti, R. Naccarato, et al., Cysteine and serine proteases in gastric cancer. Cancer 1995; 76(3): p. 367-375.

[16] Shiraishi T., M. Mori, S. Tanaka, K. Sugimachi, T. Akiyoshi, Identification of cystatin B in human esophageal carcinoma, using differential displays in which the gene expression is related to lymph-node metastasis. International journal of cancer. Journal international du cancer 1998; 79(2): p. 175-178.

[17] Mirtti T., K. Alanen, M. Kallajoki, A. Rinne, K.O. Soderstrom, Expression of cystatins, high molecular weight cytokeratin, and proliferation markers in prostatic adenocarcinoma and hyperplasia. The Prostate 2003; 54(4): p. 290-298.

[18] Feldman A.S., J. Banyard, C.L. Wu, W.S. McDougal, B.R. Zetter, Cystatin B as a tissue and urinary biomarker of bladder cancer recurrence and disease progression. Clinical cancer research : an official journal of the American Association for Cancer Research 2009; 15(3): p. 1024-1031.

[19] Ceru S., S. Konjar, K. Maher, U. Repnik, I. Krizaj, M. Bencina, M. Renko, A. Nepveu, E. Zerovnik, B. Turk, N. Kopitar-Jerala, Stefin B interacts with histones and cathepsin L in the nucleus. J Biol Chem 2010; 285(13): p. 10078-10086.

[20] Riccio M., R. Di Giaimo, S. Pianetti, P.P. Palmieri, M. Melli, S. Santi, Nuclear localization of cystatin B, the cathepsin inhibitor implicated in myoclonus epilepsy (EPM1). Experimental cell research 2001; 262(2): p. 84-94.

[21] Brannvall K., H. Hjelm, L. Korhonen, U. Lahtinen, A.E. Lehesjoki, D. Lindholm, Cystatin-B is expressed by neural stem cells and by differentiated neurons and astrocytes. Biochem Biophys Res Commun 2003; 308(2): p. 369-374.

[22] Sun T., V. Turk, B. Turk, N. Kopitar-Jerala, Increased expression of stefin B in the nucleus of T98G astrocytoma cells delays caspase activation. Front Mol Neurosci 2012; 5: p. 93.

[23] Verdot L., G. Lalmanach, V. Vercruysse, S. Hartmann, R. Lucius, J. Hoebeke, F. Gauthier, B. Vray, Cystatins up-regulate nitric oxide release from interferon-gamma-activated mouse peritoneal macrophages. J Biol Chem 1996; 271(45): p. 28077-28081.

[24] Lefebvre C., C. Cocquerelle, F. Vandenbulcke, D. Hot, L. Huot, Y. Lemoine, M. Salzet, Transcriptomic analysis in the leech Theromyzon tessulatum: involvement of cystatin B in innate immunity. Biochem J 2004; 380(Pt 3): p. 617-625.

[25] Pennacchio L.A., D.M. Bouley, K.M. Higgins, M.P. Scott, J.L. Noebels, R.M. Myers, Progressive ataxia, myoclonic epilepsy and cerebellar apoptosis in cystatin B-deficient mice. Nat Genet 1998; 20(3): p. 251-258.

[26] Lehtinen M.K., S. Tegelberg, H. Schipper, H. Su, H. Zukor, O. Manninen, O. Kopra, T. Joensuu, P. Hakala, A. Bonni, A.E. Lehesjoki, Cystatin B deficiency sensitizes neurons to oxidative stress in progressive myoclonus epilepsy, EPM1. J Neurosci 2009; 29(18): p. 5910-5915.

[27] Ii K., H. Ito, E. Kominami, A. Hirano, Abnormal distribution of cathepsin proteinases and endogenous inhibitors (cystatins) in the hippocampus of patients with Alzheimer's disease, parkinsonism-dementia complex on Guam, and senile dementia and in the aged. Virchows Archiv. A, Pathological anatomy and histopathology 1993; 423(3): p. 185-194.

[28] Pennacchio L.A., A.E. Lehesjoki, N.E. Stone, V.L. Willour, K. Virtaneva, J. Miao, E. D'Amato, L. Ramirez, M. Faham, M. Koskiniemi, J.A. Warrington, R. Norio, A. de la Chapelle, D.R. Cox, R.M. Myers, Mutations in the gene encoding cystatin B in progressive myoclonus epilepsy (EPM1). Science 1996; 271(5256): p. 1731-1734.

[29] Cipollini E., M. Riccio, R. Di Giaimo, F. Dal Piaz, G. Pulice, S. Catania, I. Caldarelli, M. Dembic, S. Santi, M. Melli, Cystatin B and its EPM1 mutants are polymeric and aggregate prone in vivo. Biochim Biophys Acta 2008; 1783(2): p. 312-322.

[30] Ceru S., R. Layfield, T. Zavasnik-Bergant, U. Repnik, N. Kopitar-Jerala, V. Turk, E. Zerovnik, Intracellular aggregation of human stefin B: confocal and electron microscopy study. Biol Cell 2010; 102(6): p. 319-334.

[31] Rabzelj S., V. Turk, E. Zerovnik, In vitro study of stability and amyloid-fibril formation of two mutants of human stefin B (cystatin B) occurring in patients with EPM1. Protein science : a publication of the Protein Society 2005; 14(10): p. 2713-2722.

[32] Polajnar M., S. Ceru, N. Kopitar-Jerala, E. Zerovnik, Human stefin B normal and patho-physiological role: molecular and cellular aspects of amyloid-type aggregation of certain EPM1 mutants. Front Mol Neurosci 2012; 5: p. 88.

[33] Polajnar M., R. Vidmar, M. Vizovisek, M. Fonovic, N. Kopitar-Jerala, E. Zerovnik, Influence of partial unfolding and aggregation of human stefin B (cystatin B) EPM1 mutants G50E and Q71P on selective cleavages by cathepsins B and S. Biol Chem 2013; 394(6): p. 783-790.

[34] Olafsson I., L. Thorsteinsson, O. Jensson, The molecular pathology of hereditary cystatin C amyloid angiopathy causing brain hemorrhage. Brain pathology 1996; 6(2): p. 121-126.

[35] Levy E., M. Sastre, A. Kumar, G. Gallo, P. Piccardo, B. Ghetti, F. Tagliavini, Codeposition of cystatin C with amyloid-beta protein in the brain of Alzheimer disease patients. Journal of neuropathology and experimental neurology 2001; 60(1): p. 94-104.

[36] Sastre M., M. Calero, M. Pawlik, P.M. Mathews, A. Kumar, V. Danilov, S.D. Schmidt, R.A. Nixon, B. Frangione, E. Levy, Binding of cystatin C to Alzheimer's amyloid beta inhibits in vitro amyloid fibril formation. Neurobiol Aging 2004; 25(8): p. 1033-1043.

[37] Kopitar-Jerala N., A. Schweiger, R.M. Myers, V. Turk, B. Turk, Sensitization of stefin B-deficient thymocytes towards staurosporin-induced apoptosis is independent of cysteine cathepsins. FEBS Lett 2005; 579(10): p. 2149-2155.

[38] Skerget K., A. Taler-Vercic, A. Bavdek, V. Hodnik, S. Ceru, M. Tusek-Znidaric, T. Kumm, D. Pitsi, M. Pompe-Novak, P. Palumaa, S. Soriano, N. Kopitar-Jerala, V. Turk, G. Anderluh, E. Zerovnik, Interaction between oligomers of stefin B and amyloid-beta in vitro and in cells. J Biol Chem 2010; 285(5): p. 3201-3210.


[39] Ceru S., S.J. Kokalj, S. Rabzelj, M. Skarabot, I. Gutierrez-Aguirre, N. Kopitar-Jerala, G. Anderluh, D. Turk, V. Turk, E. Zerovnik, Size and morphology of toxic oligomers of amyloidogenic proteins: a case study of human stefin B. Amyloid : the international journal of experimental and clinical investigation : the official journal of the International Society of Amyloidosis 2008; 15(3): p. 147-159.

[40] Jenko Kokalj S., G. Guncar, I. Stern, G. Morgan, S. Rabzelj, M. Kenig, R.A. Staniforth, J.P. Waltho, E. Zerovnik, D. Turk, Essential role of proline isomerization in stefin B tetramer formation. Journal of molecular biology 2007; 366(5): p. 1569-1579.

[41] Taler-Vercic A., T. Kirsipuu, M. Friedemann, A. Noormagi, M. Polajnar, J. Smirnova, M.T. Znidaric, M. Zganec, M. Skarabot, A. Vilfan, R.A. Staniforth, P. Palumaa, E. Zerovnik, The role of initial oligomers in amyloid fibril formation by human stefin B. Int J Mol Sci 2013; 14(9): p. 18362-18384.

[42] Dunstone M.A., W. Dai, J.C. Whisstock, J. Rossjohn, R.N. Pike, S.C. Feil, B.F. Le Bonniec, M.W. Parker, S.P. Bottomley, Cleaved antitrypsin polymers at atomic resolution. Protein Sci 2000; 9(2): p. 417-420.

[43] Huntington J.A., N.S. Pannu, B. Hazes, R.J. Read, D.A. Lomas, R.W. Carrell, A 2.6 A structure of a serpin polymer and implications for conformational disease. Journal of molecular biology 1999; 293(3): p. 449-455.

[44] Guo Z., D. Eisenberg, Runaway domain swapping in amyloid-like fibrils of T7 endonuclease I. Proc Natl Acad Sci U S A 2006; 103(21): p. 8042-8047.

[45] Carrell R.W., A. Mushunje, A. Zhou, Serpins show structural basis for oligomer toxicity and amyloid ubiquity. FEBS Lett 2008; 582(17): p. 2537-2541.

[46] Mushero N., A. Gershenson, Determining serpin conformational distributions with single molecule fluorescence. Methods Enzymol 2011; 501: p. 351-377.

[47] Zhou A., R.W. Carrell, Dimers initiate and propagate serine protease inhibitor polymerisation. Journal of molecular biology 2008; 375(1): p. 36-42.

[48] Kloniecki M., A. Jablonowska, J. Poznanski, J. Langridge, C. Hughes, I. Campuzano, K. Giles, M. Dadlez, Ion mobility separation coupled with MS detects two structural states of Alzheimer's disease Abeta1-40 peptide oligomers. Journal of molecular biology 2011; 407(1): p. 110-124.

[49] Nettleton E.J., P. Tito, M. Sunde, M. Bouchard, C.M. Dobson, C.V. Robinson, Characterization of the oligomeric states of insulin in self-assembly and amyloid fibril formation by mass spectrometry. Biophys J 2000; 79(2): p. 1053-1065.

[50] Smith A.M., T.R. Jahn, A.E. Ashcroft, S.E. Radford, Direct observation of oligomeric species formed in the early stages of amyloid fibril formation using electrospray ionisation mass spectrometry. Journal of molecular biology 2006; 364(1): p. 9-19.

[51] Smith D.P., L.A. Woods, S.E. Radford, A.E. Ashcroft, Structure and dynamics of oligomeric intermediates in beta2-microglobulin self-assembly. Biophys J 2011; 101(5): p. 1238-1247.

[52] Ruzafa D., B. Morel, L. Varela, A.I. Azuaga, F. Conejero-Lara, Characterization of oligomers of heterogeneous size as precursors of amyloid fibril nucleation of an SH3 domain: an experimental kinetics study. PLoS One 2012; 7(11): p. e49690.

[53] Lenarcic B., I. Krizaj, P. Zunec, V. Turk, Differences in specificity for the interactions of stefins A, B and D with cysteine proteinases. FEBS Lett 1996; 395(2-3): p. 113-118.

[54] Stubbs M.T., B. Laber, W. Bode, R. Huber, R. Jerala, B. Lenarcic, V. Turk, The refined 2.4 A X-ray crystal structure of recombinant human stefin B in complex with the cysteine proteinase papain: a novel type of proteinase inhibitor interaction. The EMBO journal 1990; 9(6): p. 1939-1947.

[55] Paramore R., G.J. Morgan, P.J. Davis, C.A. Sharma, A. Hounslow, A. Taler-Vercic, E. Zerovnik, J.P. Waltho, M.J. Cliff, R.A. Staniforth, Mapping local structural perturbations in the native state of stefin B (cystatin B) under amyloid forming conditions. Frontiers in molecular neuroscience 2012; 5: p. 94.

[56] Morgan G.J., S. Giannini, A.M. Hounslow, C.J. Craven, E. Zerovnik, V. Turk, J.P. Waltho, R.A. Staniforth, Exclusion of the native alpha-helix from the amyloid fibrils of a mixed alpha/beta protein. J Mol Biol 2008; 375(2): p. 487-498.

[57] Staniforth R.A., S. Giannini, L.D. Higgins, M.J. Conroy, A.M. Hounslow, R. Jerala, C.J. Craven, J.P. Waltho, Three-dimensional domain swapping in the folded and molten-globule states of cystatins, an amyloid-forming structural superfamily. The EMBO journal 2001; 20(17): p. 4774-4781.

[58] Zerovnik E., M. Pompe-Novak, M. Skarabot, M. Ravnikar, I. Musevic, V. Turk, Human stefin B readily forms amyloid fibrils in vitro. Biochim Biophys Acta 2002; 1594(1): p. 1-5.

[59] Zerovnik E., M. Skarabot, K. Skerget, S. Giannini, V. Stoka, S. Jenko-Kokalj, R.A. Staniforth, Amyloid fibril formation by human stefin B: influence of pH and TFE on fibril growth and morphology. Amyloid : the international journal of experimental and clinical investigation : the official journal of the International Society of Amyloidosis 2007; 14(3): p. 237-247.

[60] Jenko S., M. Skarabot, M. Kenig, G. Guncar, I. Musevic, D. Turk, E. Zerovnik, Different propensity to form amyloid fibrils by two homologous proteins-Human stefins A and B: searching for an explanation. Proteins 2004; 55(2): p. 417-425.

[61] Smajlovic A., S. Berbic, C. Schiene-Fischer, M. Tusek-Znidaric, A. Taler, S. Jenko-Kokalj, D. Turk, E. Zerovnik, Essential role of Pro 74 in stefin B amyloid-fibril formation: dual action of cyclophilin A on the process. FEBS Lett 2009; 583(7): p. 1114-1120.


[62] Kenig M., S. Berbic, A. Krijestorac, L. Kroon-Zitko, M. Tusek, M. Pompe-Novak, E. Zerovnik, Differences in aggregation properties of three site-specific mutants of recombinant human stefin B. Protein Sci 2004; 13(1): p. 63-70.

ies against alzheimer amyloid-beta by immunization with a thioredoxin-constrained B-cell epitope peptide. J Biol Chem 2007; 282(15): p. 11436-11445.

[74] Kayed R., E. Head, F. Sarsoza, T. Saing, C.W. Cotman, M. Necula, L. Margol, J. Wu, L. Breydo, J.L. Thompson, S. Rasool, T. Gurlo, P. Butler, C.G. Glabe, Fibril specific, conformation dependent antibodies recognize a generic epitope common to amyloid fibrils and fibrillar oligomers that is absent in prefibrillar oligomers. Mol Neurodegener 2007; 2: p. 18.

[75] O'Nuallain B., R. Wetzel, Conformational Abs recognizing a generic amyloid fibril epitope. Proc Natl Acad Sci U S A 2002; 99(3): p. 1485-1490.

[76] Merlini G., V. Bellotti, Molecular mechanisms of amyloidosis. N Engl J Med 2003; 349(6): p. 583-596.

[77] Tan S.Y., M.B. Pepys, Amyloidosis. Histopathology 1994; 25(5): p. 403-414.

[78] Caughey B., P.T. Lansbury, Protofibrils, pores, fibrils, and neurodegeneration: separating the responsible protein aggregates from the innocent bystanders. Annu Rev Neurosci 2003; 26: p. 267-298.

[79] Lashuel H.A., D. Hartley, B.M. Petre, T. Walz, P.T. Lansbury, Jr., Neurodegenerative disease: amyloid pores from pathogenic mutations. Nature 2002; 418(6895): p. 291.

[80] Ferreira S.T., M.N. Vieira, F.G. De Felice, Soluble protein oligomers as emerging toxins in Alzheimer's and other amyloid diseases. IUBMB Life 2007; 59(4-5): p. 332-345.

[81] Nimmrich V., C. Grimm, A. Draguhn, S. Barghorn, A. Lehmann, H. Schoemaker, H. Hillen, G. Gross, U. Ebert, C. Bruehl, Amyloid beta oligomers (A beta(1-42) globulomer) suppress spontaneous synaptic activity by inhibition of P/Q-type calcium currents. J Neurosci 2008; 28(4): p. 788-797.

[82] Nixon R.A., Autophagy in neurodegenerative disease: friend, foe or turncoat? Trends Neurosci 2006; 29(9): p. 528-535.

[83] Powers E.T., R.I. Morimoto, A. Dillin, J.W. Kelly, W.E. Balch, Biological and chemical approaches to diseases of proteostasis deficiency. Annu Rev Biochem 2009; 78: p. 959-991.

[84] Barnham K.J., G.D. Ciccotosto, A.K. Tickler, F.E. Ali, D.G. Smith, N.A. Williamson, Y.H. Lam, D. Carrington, D. Tew, G. Kocak, I. Volitakis, F. Separovic, C.J. Barrow, J.D. Wade, C.L. Masters, R.A. Cherny, C.C. Curtain, A.I. Bush, R. Cappai, Neurotoxic, redox-competent Alzheimer's beta-amyloid is released from lipid membrane by methionine oxidation. J Biol Chem 2003; 278(44): p. 42959-42965.

[85] Cuajungco M.P., L.E. Goldstein, A. Nunomura, M.A. Smith, J.T. Lim, C.S. Atwood, X. Huang, Y.W. Farrag, G. Perry, A.I. Bush, Evidence that the beta-amyloid plaques of Alzheimer's disease represent the redox-silencing and entombment of abeta by zinc. J Biol Chem 2000; 275(26): p. 19439-19442.


[62] Kenig M., S. Berbic, A. Krijestorac, L. Kroon-Zitko, M. Tusek, M. Pompe-Novak, E. Zerovnik, Differences in aggregation properties of three site-specific mutants of re‐

[63] Kenig M., S. Jenko-Kokalj, M. Tusek-Znidaric, M. Pompe-Novak, G. Guncar, D. Turk, J.P. Waltho, R.A. Staniforth, F. Avbelj, E. Zerovnik, Folding and amyloid-fibril forma‐ tion for a series of human stefins' chimeras: any correlation? Proteins 2006; 62(4): p.

[64] Skerget K., A. Vilfan, M. Pompe-Novak, V. Turk, J.P. Waltho, D. Turk, E. Zerovnik, The mechanism of amyloid-fibril formation by stefin B: temperature and protein con‐

[65] Ceru S.,E. Zerovnik, Similar toxicity of the oligomeric molten globule state and the

[66] Haass C.,D.J. Selkoe, Soluble protein oligomers in neurodegeneration: lessons from the Alzheimer's amyloid beta-peptide*.* Nature reviews. Molecular cell biology 2007;

[67] McLean C.A., R.A. Cherny, F.W. Fraser, S.J. Fuller, M.J. Smith, K. Beyreuther, A.I. Bush, C.L. Masters, Soluble pool of Abeta amyloid as a determinant of severity of neurodegeneration in Alzheimer's disease*.* Annals of neurology 1999; 46(6): p.

[68] El-Agnaf O.M., S.A. Salem, K.E. Paleologou, L.J. Cooper, N.J. Fullwood, M.J. Gibson, M.D. Curran, J.A. Court, D.M. Mann, S. Ikeda, M.R. Cookson, J. Hardy, D. Allsop, Alpha-synuclein implicated in Parkinson's disease is present in extracellular biologi‐ cal fluids, including human plasma*.* FASEB journal : official publication of the Feder‐ ation of American Societies for Experimental Biology 2003; 17(13): p. 1945-1947. [69] Bucciantini M., G. Calloni, F. Chiti, L. Formigli, D. Nosi, C.M. Dobson, M. Stefani, Prefibrillar amyloid protein aggregates share common features of cytotoxicity*.* The

[70] Malisauskas M., J. Ostman, A. Darinskas, V. Zamotin, E. Liutkevicius, E. Lundgren, L.A. Morozova-Roche, Does the cytotoxic effect of transient amyloid oligomers from common equine lysozyme in vitro imply innate amyloid toxicity? The Journal of bio‐

[71] Hrncic R., J. Wall, D.A. Wolfenbarger, C.L. Murphy, M. Schell, D.T. Weiss, A. Solo‐ mon, Antibody-mediated resolution of light chain-associated amyloid deposits*.* Am J

[72] Kayed R., E. Head, J.L. Thompson, T.M. McIntire, S.C. Milton, C.W. Cotman, C.G. Glabe, Common structure of soluble amyloid oligomers implies common mechanism

[73] Moretto N., A. Bolchi, C. Rivetti, B.P. Imbimbo, G. Villetti, V. Pietrini, L. Polonelli, S. Del Signore, K.M. Smith, R.J. Ferrante, S. Ottonello, Conformation-sensitive antibod‐

centration dependence of the rates*.* Proteins 2009; 74(2): p. 425-436.

prefibrillar oligomers*.* FEBS Lett 2008; 582(2): p. 203-209.

Journal of biological chemistry 2004; 279(30): p. 31374-31382.

logical chemistry 2005; 280(8): p. 6269-6275.

of pathogenesis*.* Science 2003; 300(5618): p. 486-489.

Pathol 2000; 157(4): p. 1239-1246.

combinant human stefin B*.* Protein Sci 2004; 13(1): p. 63-70.

918-927.

318 Oligomerization of Chemical and Biological Compounds

8(2): p. 101-112.

860-866.


[86] Huang X., C.S. Atwood, M.A. Hartshorn, G. Multhaup, L.E. Goldstein, R.C. Scarpa, M.P. Cuajungco, D.N. Gray, J. Lim, R.D. Moir, R.E. Tanzi, A.I. Bush, The A beta pep‐ tide of Alzheimer's disease directly produces hydrogen peroxide through metal ion reduction*.* Biochemistry 1999; 38(24): p. 7609-7616.

anchoring versus accelerated surface fibril formation*.* Journal of molecular biology

Structure and Function of Stefin B Oligomers – Important Role in Amyloidogenesis

http://dx.doi.org/10.5772/57570

321

[98] Hertel C., E. Terzi, N. Hauser, R. Jakob-Rotne, J. Seelig, J.A. Kemp, Inhibition of the electrostatic interaction between beta-amyloid peptide and membranes prevents be‐ ta-amyloid-induced toxicity*.* Proc Natl Acad Sci U S A 1997; 94(17): p. 9412-9416. [99] Hirakura Y., M.C. Lin, B.L. Kagan, Alzheimer amyloid abeta1-42 channels: effects of

[100] McLaurin J.,A. Chakrabartty, Characterization of the interactions of Alzheimer betaamyloid peptides with phospholipid membranes*.* Eur J Biochem 1997; 245(2): p.

[101] Terzi E., G. Holzemann, J. Seelig, Interaction of Alzheimer beta-amyloid pep‐ tide(1-40) with lipid membranes*.* Biochemistry 1997; 36(48): p. 14845-14852.

[102] Janson J., R.H. Ashley, D. Harrison, S. McIntyre, P.C. Butler, The mechanism of islet amyloid polypeptide toxicity is membrane disruption by intermediate-sized toxic

[103] Lin M.C., T. Mirzabekov, B.L. Kagan, Channel formation by a neurotoxic prion pro‐

[104] Zakharov S.D., J.D. Hulleman, E.A. Dutseva, Y.N. Antonenko, J.C. Rochet, W.A. Cramer, Helical alpha-synuclein forms highly conductive ion channels*.* Biochemistry

[105] Kayed R., Y. Sokolov, B. Edmonds, T.M. McIntire, S.C. Milton, J.E. Hall, C.G. Glabe, Permeabilization of lipid bilayers is a common conformation-dependent activity of soluble amyloid oligomers in protein misfolding diseases*.* J Biol Chem 2004; 279(45):

[106] Kawahara M., N. Arispe, Y. Kuroda, E. Rojas, Alzheimer's disease amyloid beta-pro‐ tein forms Zn(2+)-sensitive, cation-selective channels across excised membrane

[107] Lin H., Y.J. Zhu, R. Lal, Amyloid beta protein (1-40) forms calcium-permeable, Zn2+ sensitive channel in reconstituted lipid vesicles*.* Biochemistry 1999; 38(34): p.

[108] Rhee S.K., A.P. Quist, R. Lal, Amyloid beta protein-(1-42) forms calcium-permeable,

[109] Di Scala C., J.D. Troadec, C. Lelievre, N. Garmy, J. Fantini, H. Chahinian, Mechanism of cholesterol-assisted oligomeric channel formation by a short Alzheimer beta-amy‐

[110] Green J.D., L. Kreplak, C. Goldsbury, X. Li Blatter, M. Stolz, G.S. Cooper, A. Seelig, J. Kistler, U. Aebi, Atomic force microscopy reveals defects within mica supported lip‐

patches from hypothalamic neurons*.* Biophys J 1997; 73(1): p. 67-75.

Zn2+-sensitive channel*.* J Biol Chem 1998; 273(22): p. 13379-13382.

solvent, pH, and Congo Red*.* J Neurosci Res 1999; 57(4): p. 458-466.

amyloid particles*.* Diabetes 1999; 48(3): p. 491-498.

tein fragment*.* J Biol Chem 1997; 272(1): p. 44-47.

2007; 46(50): p. 14369-14379.

loid peptide*.* J Neurochem 2013.

p. 46363-46366.

11189-11196.

2004; 335(4): p. 1039-1049.

355-363.


anchoring versus accelerated surface fibril formation*.* Journal of molecular biology 2004; 335(4): p. 1039-1049.

[98] Hertel C., E. Terzi, N. Hauser, R. Jakob-Rotne, J. Seelig, J.A. Kemp, Inhibition of the electrostatic interaction between beta-amyloid peptide and membranes prevents be‐ ta-amyloid-induced toxicity*.* Proc Natl Acad Sci U S A 1997; 94(17): p. 9412-9416.

[86] Huang X., C.S. Atwood, M.A. Hartshorn, G. Multhaup, L.E. Goldstein, R.C. Scarpa, M.P. Cuajungco, D.N. Gray, J. Lim, R.D. Moir, R.E. Tanzi, A.I. Bush, The A beta pep‐ tide of Alzheimer's disease directly produces hydrogen peroxide through metal ion

[87] Huang X., M.P. Cuajungco, C.S. Atwood, M.A. Hartshorn, J.D. Tyndall, G.R. Hanson, K.C. Stokes, M. Leopold, G. Multhaup, L.E. Goldstein, R.C. Scarpa, A.J. Saunders, J. Lim, R.D. Moir, C. Glabe, E.F. Bowden, C.L. Masters, D.P. Fairlie, R.E. Tanzi, A.I. Bush, Cu(II) potentiation of alzheimer abeta neurotoxicity. Correlation with cell-free hydrogen peroxide production and metal reduction*.* J Biol Chem 1999; 274(52): p.

[88] Opazo C., X. Huang, R.A. Cherny, R.D. Moir, A.E. Roher, A.R. White, R. Cappai, C.L. Masters, R.E. Tanzi, N.C. Inestrosa, A.I. Bush, Metalloenzyme-like activity of Alz‐ heimer's disease beta-amyloid. Cu-dependent catalytic conversion of dopamine, cho‐ lesterol, and biological reducing agents to neurotoxic H(2)O(2)*.* J Biol Chem 2002;

[89] Macario A.J.,E. Conway de Macario, Sick chaperones, cellular stress, and disease*.* N

[90] Muchowski P.J.,J.L. Wacker, Modulation of neurodegeneration by molecular chaper‐

[91] Almeida C.G., R.H. Takahashi, G.K. Gouras, Beta-amyloid accumulation impairs multivesicular body sorting by inhibiting the ubiquitin-proteasome system*.* J Neuro‐

[92] Uversky V.N., Mysterious oligomerization of the amyloidogenic proteins*.* Febs J

[93] Anderluh G.,E. Zerovnik, Pore formation by human stefin B in its native and oligo‐ meric states and the consequent amyloid induced toxicity*.* Front Mol Neurosci 2012;

[94] Anderluh G., I. Gutierrez-Aguirre, S. Rabzelj, S. Ceru, N. Kopitar-Jerala, P. Macek, V. Turk, E. Zerovnik, Interaction of human stefin B in the prefibrillar oligomeric form with membranes. Correlation with cellular toxicity*.* Febs J 2005; 272(12): p. 3042-3051.

[95] Rabzelj S., G. Viero, I. Gutierrez-Aguirre, V. Turk, M. Dalla Serra, G. Anderluh, E. Zerovnik, Interaction with model membranes and pore formation by human stefin B:

[96] Alarcon J.M., J.A. Brito, T. Hermosilla, I. Atwater, D. Mears, E. Rojas, Ion channel for‐ mation by Alzheimer's disease amyloid beta-peptide (Abeta40) in unilamellar lipo‐

somes is determined by anionic phospholipids*.* Peptides 2006; 27(1): p. 95-104.

[97] Bokvist M., F. Lindstrom, A. Watts, G. Grobner, Two types of Alzheimer's beta-amy‐ loid (1-40) peptide membrane interactions: aggregation preventing transmembrane

studying the native and prefibrillar states*.* Febs J 2008; 275(10): p. 2455-2466.

reduction*.* Biochemistry 1999; 38(24): p. 7609-7616.

37111-37116.

277(43): p. 40302-40308.

320 Oligomerization of Chemical and Biological Compounds

sci 2006; 26(16): p. 4277-4288.

2010; 277(14): p. 2940-2953.

5: p. 85.

Engl J Med 2005; 353(14): p. 1489-1501.

ones*.* Nat Rev Neurosci 2005; 6(1): p. 11-22.


id bilayers induced by the amyloidogenic human amylin peptide*.* Journal of molecu‐ lar biology 2004; 342(3): p. 877-887.

[121] Tougu V., A. Karafin, K. Zovo, R.S. Chung, C. Howells, A.K. West, P. Palumaa, Zn(II)-and Cu(II)-induced non-fibrillar aggregates of amyloid-beta (1-42) peptide are transformed to amyloid fibrils, both spontaneously and under the influence of metal

Structure and Function of Stefin B Oligomers – Important Role in Amyloidogenesis

http://dx.doi.org/10.5772/57570

323

[122] Watt N.T., H.H. Griffiths, N.M. Hooper, Neuronal zinc regulation and the prion pro‐

[123] Johnson C.J., P.U. Gilbert, M. Abrecht, K.L. Baldwin, R.E. Russell, J.A. Pedersen, J.M. Aiken, D. McKenzie, Low copper and high manganese levels in prion protein pla‐

[124] Nubling G., B. Bader, J. Levin, J. Hildebrandt, H. Kretzschmar, A. Giese, Synergistic influence of phosphorylation and metal ions on tau oligomer formation and coaggre‐ gation with alpha-synuclein at the single molecule level*.* Mol Neurodegener 2012; 7:

[125] Mirhashemi S.M.,M.E. Shahabaddin, Evaluation of aluminium, manganese, copper and selenium effects on human islets amyloid polypeptide hormone aggregation*.*

[126] Lee E.C., E. Ha, S. Singh, L. Legesse, S. Ahmad, E. Karnaukhova, R.P. Donaldson, A.M. Jeremic, Copper(ii)-human amylin complex protects pancreatic cells from amy‐ lin toxicity*.* Physical chemistry chemical physics : PCCP 2013; 15(30): p. 12558-12571.

[127] Salamekh S., J.R. Brender, S.J. Hyung, R.P. Nanga, S. Vivekanandan, B.T. Ruotolo, A. Ramamoorthy, A two-site mechanism for the inhibition of IAPP amyloidogenesis by

[128] Wilhelmus M.M., R.M. de Waal, M.M. Verbeek, Heat shock proteins and amateur chaperones in amyloid-Beta accumulation and clearance in Alzheimer's disease*.* Mol

[129] Hook G., V.Y. Hook, M. Kindy, Cysteine protease inhibitors reduce brain beta-amy‐ loid and beta-secretase activity in vivo and are potential Alzheimer's disease thera‐

[130] Morgan C.,N.C. Inestrosa, Interactions of laminin with the amyloid beta peptide. Im‐ plications for Alzheimer's disease*.* Braz J Med Biol Res 2001; 34(5): p. 597-601. [131] Porat Y., A. Abramowitz, E. Gazit, Inhibition of amyloid fibril formation by polyphe‐ nols: structural similarity and aromatic interactions as a common inhibition mecha‐

[132] Zhao L.N., Y. Mu, L.Y. Chew, Heme prevents amyloid beta peptide aggregation through hydrophobic interaction based on molecular dynamics simulation*.* Phys

Pakistan journal of biological sciences: PJBS 2011; 14(4): p. 288-292.

zinc*.* Journal of molecular biology 2011; 410(2): p. 294-306.

chelators*.* J Neurochem 2009; 110(6): p. 1784-1795.

tein*.* Prion 2013; 7(3): p. 203-208.

ques*.* Viruses 2013; 5(2): p. 654-662.

Neurobiol 2007; 35(3): p. 203-216.

peutics*.* Biol Chem 2007; 388(9): p. 979-983.

nism*.* Chem Biol Drug Des 2006; 67(1): p. 27-37.

Chem Chem Phys 2013; 15(33): p. 14098-14106.

p. 35.


[121] Tougu V., A. Karafin, K. Zovo, R.S. Chung, C. Howells, A.K. West, P. Palumaa, Zn(II)-and Cu(II)-induced non-fibrillar aggregates of amyloid-beta (1-42) peptide are transformed to amyloid fibrils, both spontaneously and under the influence of metal chelators*.* J Neurochem 2009; 110(6): p. 1784-1795.

id bilayers induced by the amyloidogenic human amylin peptide*.* Journal of molecu‐

[111] Fantini J.,N. Yahi, The driving force of alpha-synuclein insertion and amyloid chan‐ nel formation in the plasma membrane of neural cells: key role of ganglioside-and

[112] Zhao J., Y. Luo, H. Jang, X. Yu, G. Wei, R. Nussinov, J. Zheng, Probing ion channel activity of human islet amyloid polypeptide (amylin)*.* Biochim Biophys Acta 2012;

[113] Oladzad Abbasabadi A., A. Javanian, M. Nikkhah, A.A. Meratan, P. Ghiasi, M. Nem‐ at-Gorgani, Disruption of mitochondrial membrane integrity induced by amyloid ag‐ gregates arising from variants of SOD1*.* International journal of biological

[114] Zerovnik E., K. Skerget, M. Tusek-Znidaric, C. Loeschner, M.W. Brazier, D.R. Brown, High affinity copper binding by stefin B (cystatin B) and its role in the inhibition of

[115] Atwood C.S., R.D. Moir, X. Huang, R.C. Scarpa, N.M. Bacarra, D.M. Romano, M.A. Hartshorn, R.E. Tanzi, A.I. Bush, Dramatic aggregation of Alzheimer abeta by Cu(II) is induced by conditions representing physiological acidosis*.* The Journal of biologi‐

[116] Raman B., T. Ban, K. Yamaguchi, M. Sakai, T. Kawai, H. Naiki, Y. Goto, Metal iondependent effects of clioquinol on the fibril growth of an amyloid {beta} peptide*.* The

[117] Zou J., K. Kajita, N. Sugimoto, Cu(2+) Inhibits the Aggregation of Amyloid beta-Pep‐ tide(1-42) in vitro We thank JEOL for the AFM measurement. This work was sup‐ ported in part by Grants-in-Aid from the Japanese Ministry of Education, Science, Sports, and Culture, and a Grant from "Research for the Future" Program of the Ja‐ pan Society for the Promotion of Science to N.S*.* Angewandte Chemie 2001; 40(12): p.

[118] Kuczius T.,R. Kelsch, The effect of copper and zinc binding on the solubility and re‐ sistance to proteolysis of physiological prion protein PrP depends on the tissue

[119] Leal S.S., I. Cardoso, J.S. Valentine, C.M. Gomes, Calcium ions promote superoxide dismutase 1 (SOD1) aggregation into non fibrillar amyloid: a link to toxic effects of calcium overload in amyotrophic lateral sclerosis (ALS)? The Journal of biological

[120] Ha C., J. Ryu, C.B. Park, Metal ions differentially influence the aggregation and depo‐ sition of Alzheimer's beta-amyloid on a solid template*.* Biochemistry 2007; 46(20): p.

source and the PrP glycotypes*.* Journal of cellular biochemistry 2013.

cholesterol-binding domains*.* Adv Exp Med Biol 2013; 991: p. 15-26.

lar biology 2004; 342(3): p. 877-887.

322 Oligomerization of Chemical and Biological Compounds

macromolecules 2013; 61C: p. 212-217.

cal chemistry 1998; 273(21): p. 12817-12826.

amyloid fibrillation*.* Febs J 2006; 273(18): p. 4250-4263.

Journal of biological chemistry 2005; 280(16): p. 16157-16162.

1818(12): p. 3121-3130.

2274-2277.

chemistry 2013.

6118-6125.


[133] Yagi-Utsumi M., T. Kunihara, T. Nakamura, Y. Uekusa, K. Makabe, K. Kuwajima, K. Kato, NMR characterization of the interaction of GroEL with amyloid beta as a mod‐ el ligand*.* FEBS Lett 2013; 587(11): p. 1605-1609.

**Section 3**

**Computational Approaches**



**Chapter 11**


## **The Assembly of Protein Oligomers — Old Stories and New Perspectives with Graph Theory**

Claire Lesieur

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58576

#### **1. Introduction**

Proteins are biological entities made of a chain of amino acids bound to one another in a specific order, called the primary structure or the amino acid sequence of the protein. Depending on the sequence and the environment, the protein acquires a three-dimensional shape called the tertiary structure (3D-structure), conformation or fold, suitable for its biological function. The functional shape is the native structure of the protein. The set of reactions leading to the native structure is the folding of the protein. The vast majority of proteins are oligomers, which function only after the association of several copies of their chains. Homo-oligomers have chains with identical sequences and hetero-oligomers have chains with different sequences. The number of associated chains defines the quaternary structure of the oligomer, or its stoichiometry [1]. According to the Protein Data Bank (PDB), where the known 3D structures of proteins are stored, the most frequently observed quaternary structure in all taxa is the dimer (Fig. 1A). A taxon is a set of living organisms grouped because of some shared characteristics. Besides dimers, there exists a large variety of assemblies in terms of quaternary structures and point group symmetries (Fig. 1). In forming a protein oligomer, subunit association has to be considered in addition to folding.
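As an illustration of these definitions, the short Python sketch below classifies an assembly from the sequences of its chains. The function name `describe_assembly`, the name table and the toy sequences are assumptions made for this example. Real PDB entries would also need care with purification tags, missing residues and modified amino acids before two sequences can be called identical.

```python
# Names for the most common stoichiometries; larger assemblies fall back to "n-mer".
OLIGOMER_NAMES = {1: "monomer", 2: "dimer", 3: "trimer", 4: "tetramer",
                  5: "pentamer", 6: "hexamer", 7: "heptamer"}


def describe_assembly(chains):
    """Classify an assembly from a mapping {chain_id: amino acid sequence}."""
    n_chains = len(chains)
    if n_chains == 0:
        raise ValueError("an assembly needs at least one chain")
    name = OLIGOMER_NAMES.get(n_chains, f"{n_chains}-mer")
    if n_chains == 1:
        return name
    # One distinct sequence -> homo-oligomer; two or more -> hetero-oligomer.
    n_distinct = len(set(chains.values()))
    kind = "homo" if n_distinct == 1 else "hetero"
    return f"{kind}-{name}"


# Hypothetical toy sequences, not taken from any PDB entry.
print(describe_assembly({"A": "MKTAYIAK", "B": "MKTAYIAK"}))
print(describe_assembly({"A": "MKTAYIAK", "B": "MKTAYIAK", "C": "GSHMLEDP"}))
```

The stoichiometry is simply the number of chains, while the homo/hetero distinction depends only on whether all sequences are identical.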

Folding involves the formation of interactions/bonds between atoms of the amino acids of a single chain. These are intramolecular (within a single molecule) amino acid interactions (Fig. 2A). Chain association involves the formation of interactions/bonds between atoms of the amino acids provided by at least two individual chains. These are intermolecular (between two molecules) amino acid interactions (Fig. 2B). Here the protein chain is considered as the molecule.
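The distinction between intramolecular and intermolecular interactions can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the chapter: the function name `classify_contacts`, the 5 Å distance cutoff and the toy coordinates are assumptions, and a real analysis would also exclude atoms that are covalently bonded neighbors within a chain.

```python
from itertools import combinations
from math import dist


def classify_contacts(atoms, cutoff=5.0):
    """Split close atom pairs into intra- and intermolecular contacts.

    atoms  : list of (chain_id, (x, y, z)) tuples, coordinates in angstroms
    cutoff : maximum distance (assumed value) for two atoms to be "in contact"
    Returns two lists of index pairs (i, j) with i < j.
    """
    intra, inter = [], []
    for (i, (chain_i, xyz_i)), (j, (chain_j, xyz_j)) in combinations(enumerate(atoms), 2):
        if dist(xyz_i, xyz_j) <= cutoff:
            # Same chain -> intramolecular (folding); different chains -> intermolecular (association).
            (intra if chain_i == chain_j else inter).append((i, j))
    return intra, inter


# Hypothetical toy coordinates: two atoms on chain A, two on chain B.
atoms = [("A", (0.0, 0.0, 0.0)), ("A", (3.0, 0.0, 0.0)),
         ("B", (6.0, 0.0, 0.0)), ("B", (20.0, 0.0, 0.0))]
intra, inter = classify_contacts(atoms)
print("intramolecular contacts:", intra)
print("intermolecular contacts:", inter)
```

Because the chain identifier is the only thing that separates the two lists, the same distance criterion serves both folding contacts and interface contacts.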

The twenty natural amino acids share four atoms called the backbone atoms and are distinguished by a set of atoms called the side chain atoms. These atoms can make different types of chemical bonds. First, the amino acids are linked to one another by a covalent bond, called the peptide bond, that involves two backbone atoms. Covalent bonds are thus used to make a chain of amino acids arranged in a specific order, the primary sequence of the protein. Cysteine is the only amino acid that can make a supplementary covalent bond, called a disulfide bond, using the sulfur atom of its side chain (methionine also contains a sulfur atom, but it does not form disulfide bonds). There can be intramolecular disulfide bonds (between two cysteines of one chain) or intermolecular disulfide bonds (between two cysteines, each provided by a distinct chain), the latter making a covalent oligomer. Some collagen trimers are stabilized by inter-chain disulfide bonds [4]. The collagen oligomers have been reviewed recently [5, 6]. Other examples of covalent oligomers can be found in the chapter on protein oligomerization by Giovanni Gotte and Massimo Libonati. Disulfide bonds can significantly increase the stability of a chain or of an oligomer, but this is not always the case and needs to be measured case by case [7]. Covalent bonds are strong interactions, as it takes a large amount of energy to break them (50-110 kcal/mol). In living organisms, an enzyme (a protease in the case of the peptide bond) is necessary to cut a covalent bond.

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 1. Protein oligomer features.** The data in A and B are obtained by screening the PDB. **A.** Distribution of taxa according to quaternary structures. **B.** Distribution of taxa according to point group symmetries. **C.** Protein oligomers belong to three different point group symmetries. Cn, cyclic with n rotational axes. One ferritin is given as an example for a C3 symmetry (PDB 2F7N). Dn, dihedral with n rotational axes plus 2-fold perpendicular axes. One ferritin is given as an example for a D2 symmetry (PDB 2RBD). Cubic. T, tetrahedral with four 3-fold axes and six 2-fold axes. Ferritin is given as an example (PDB 1DPS) [2]. O, octahedral (octahedron or hexahedron) with six 4-fold axes, eight 3-fold axes and twelve 2-fold axes. One ferritin is given as an example (PDB 1LB3). I, icosahedral with twelve 5-fold axes, twenty 3-fold axes and thirty 2-fold axes. Ferritin is given as an example (PDB 1K4R). Icosahedral is not a cubic point group symmetry but has been conflated with the cubic point group symmetry in chemistry [3].

**Figure 2. Interactions in proteins. A. Protein monomer.** Monomeric proteins perform their biological function with a single chain. The formation and the stability of their native 3D structure involve only intramolecular atomic interactions (interactions within a chain). **B. Protein oligomer.** Oligomeric proteins need to assemble several copies of their chain to perform their biological function. The formation and the stability of their native structure involve intramolecular atomic interactions (interactions within one chain) to acquire their fold and intermolecular atomic interactions (interactions between chains) to acquire their quaternary structure. The pictures are generated with RasMol. The protein chains are shown as ribbons of different colors. For a few amino acids, all atoms are shown as balls and sticks to highlight intra- and intermolecular atomic interactions. **C. Recognition modes.** A protein interface is made of two sets of atoms, one per chain, spatially organized to yield chemical and geometrical complementarity. Two simple cases are presented. On the left are two interacting α-helices and on the right are two interacting β-strands. Because of the geometry, the interacting amino acids produce particular sequence motifs/patterns: **a**bc**d**efg, with *a* and *d* residues interacting in the α-helical interface, and **a**b**c**da**b**d**c**d, with *a* and *c* residues interacting in the β-strands.

Second, the tertiary and quaternary structures of proteins, as well as folding and chain association, involve mostly non-covalent bonds between the atoms of the amino acids, called weak bonds because it takes a small amount of energy to break them (1-7 kcal/mol). These are hydrogen bonds, hydrophobic bonds, electrostatic bonds (between charges), polar bonds (between dipoles) and van der Waals interactions. Under physiological conditions, the weak bonds continuously form and break. The secondary structures of proteins, α-helices and β-sheets (intramolecular β-strand interactions), are stabilized by hydrogen bonds between backbone atoms of the amino acids. The same holds for intermolecular β-sheets, except that the hydrogen bonds are between backbone atoms of amino acids located on different chains. Last but not least, proline deserves a mention in terms of folding and association. Its side chain geometry is particular and can adopt two positions, named *cis* and *trans*, which affect the relative position of its neighboring amino acids accordingly. The consequence is the existence of two different local three-dimensional states. The transition between the *cis* and *trans* conformations is called a *cis-trans* isomerization and is known to slow down the folding of a protein, and also to affect the association of chains indirectly [8-11].
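The statement that weak bonds continuously form and break under physiological conditions, while covalent bonds need an enzyme, can be made semi-quantitative by comparing the bond energies quoted above with the thermal energy RT. The sketch below is a rough thermodynamic illustration only: the Boltzmann factor exp(-E/RT) is used as an indicator of how readily thermal agitation supplies an energy E, the temperature of 310 K is an assumed physiological value, and the function name is hypothetical.

```python
from math import exp

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def boltzmann_factor(energy_kcal_per_mol, temperature_k=310.0):
    """Return exp(-E/RT), the relative weight of thermally supplying energy E."""
    return exp(-energy_kcal_per_mol / (R_KCAL * temperature_k))


# Energy ranges as quoted in the text: weak bonds 1-7 kcal/mol, covalent bonds from 50 kcal/mol.
for label, energy in [("weak bond, lower bound", 1.0),
                      ("weak bond, upper bound", 7.0),
                      ("covalent bond, lower bound", 50.0)]:
    print(f"{label:28s} {energy:5.1f} kcal/mol  exp(-E/RT) = {boltzmann_factor(energy):.2e}")
```

At 310 K, RT is about 0.6 kcal/mol, so the weakest bonds are comparable to the thermal energy, whereas the factor for a covalent bond is vanishingly small.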


The Assembly of Protein Oligomers — Old Stories and New Perspectives with Graph Theory

http://dx.doi.org/10.5772/58576


#### **2. Overview of protein assembly**

The zone of contact between two associated chains is called the protein interface. The protein interface is made of intermolecular amino acid interactions. Every chain provides a domain that recognizes another domain, or the same domain, on another chain and associates with it. The association is based on the chemical and geometrical complementarities of the two domains. These complementarities are built on the spatial layout of the intermolecular amino acid interactions (Fig. 2C). These layouts are referred to as recognition modes and have been extensively studied [12-20]. Yet the rules that would enable us to predict recognition modes from sequences remain elusive.

Understanding and predicting the modes of recognition of protein interfaces is essential for several reasons. First, oligomers are involved in many cellular activities, and faulty interactions have numerous consequences, among them certain diseases. Second, it is important to distinguish biologically significant interfaces from non-specific interfaces observed in protein crystals in order to properly assess biological assemblies in x-ray structures [21-24]. Along the same line, it is still not trivial to determine experimentally whether a protein is an oligomer and, if so, its quaternary structure, so any tool that predicts quaternary structure is helpful. Third, the knowledge on protein interfaces is used in synthetic biology to engineer artificial oligomers for several purposes, from drug delivery devices to the development of new materials [25-29]. As an example, one can read the chapter by Keqin Zhang on silkworm and spider protein fibers and their potential use in the fabric industry.

#### **2.1. Intermolecular amino acid interactions**

Protein interfaces have been extensively investigated [14, 30-35]. But because protein interfaces are large and rather flat in nature, they lack the kind of spatial constraints that only a limited number of sequences can satisfy. Thus the sequences of protein interfaces rarely share trivial profiles or patterns, in contrast to protein-small molecule interfaces or residues involved in enzymatic active sites [36, 37].

It therefore became clear very early on that looking at the 3D structures of protein interfaces was a necessary alternative to sequence analysis [34, 38-43]. The increasing number of 3D structures available for oligomeric proteins has also favored such investigations and the development of computational methods to identify and study the amino acids at protein interfaces on large-scale datasets. In this chapter, we essentially review these computational advances and discuss little of the experimental progress. One can read the chapter on protein oligomerization by Giovanni Gotte and Massimo Libonati for information on experimental approaches, or the information provided in the following publication [44].

The first benefit of computational approaches is that relatively simple algorithms can discriminate intermolecular amino acid interactions from intramolecular ones systematically and efficiently.

#### *2.1.1. Identification of the intermolecular contacts at protein interfaces*


Many algorithms are available to identify the amino acids involved in intermolecular interactions from the x-ray coordinates of the 3D structure of a protein oligomer (reviewed in [13, 21, 45]). The coordinates are accessible at the Protein Data Bank (PDB, http://www.rcsb.org/pdb/home/home.do) [46]. There are databases where interfaces have been classified according to their 3D organization, residue conservation and residue types [47]. Some databases are used to complement cellular networks (interactomes) with structural information on the binding modes between cellular partners [12, 18-20, 48].

The classical algorithms are based on three different measures: (i) accessible surface area, (ii) Voronoi cells and (iii) arithmetic distances. More recent algorithms use graph theory measures such as centrality and clustering coefficient.

**Accessible surface area (ASA).** The first method calculates the solvent accessible surface area by rolling a probe of a given radius over the van der Waals surface of the protein atoms; the surface traced by the centre of the probe is the accessible surface [49]. Typically, the probe has the same radius as water (1.4 Å), and hence the surface described is referred to as the solvent accessible surface. The ASA is calculated for the monomer and for the oligomer, and the interface residues are those whose ASA differs between the two. ASA is currently used to discriminate biological contacts (large ASA, 1600 ± 400 Å²) from crystal contacts (small ASA, <400 Å²) [50, 51]. The PDB entries are now processed accordingly and provide both biological assembly and asymmetric unit coordinates. The biological assembly entry includes a remark that states whether the oligomeric state is "author provided" (experimentally shown to be an oligomer), "software determined", or both. Alternatively, the biological assembly can be downloaded directly from the Protein Quaternary Structure server (PQS) at EBI (http://pqs.ebi.ac.uk) [50]. ASA can be calculated with different servers and programs such as PISA (Protein, Interfaces, Structures and Assemblies, http://www.ebi.ac.uk/msd-srv/prot\_int/) or Naccess (http://www.bioinf.manchester.ac.uk/naccess/), both essentially implementing the Lee & Richards method [52].
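The ASA-difference step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of PISA or Naccess: it assumes that per-residue ASA values (in Å²) have already been computed for the isolated monomer and for the same chain within the oligomer. The residue labels, the ASA values and the `min_delta` threshold are hypothetical.

```python
def interface_residues(asa_monomer, asa_oligomer, min_delta=1.0):
    """Return {residue: buried area} for residues that lose accessible
    surface when the chain is placed in the oligomer.

    min_delta (in square angstroms) is an assumed noise threshold,
    not a value taken from the text.
    """
    buried = {}
    for residue, asa_free in asa_monomer.items():
        # A residue missing from the oligomer table is treated as unchanged.
        delta = asa_free - asa_oligomer.get(residue, asa_free)
        if delta >= min_delta:
            buried[residue] = delta
    return buried


# Hypothetical per-residue ASA values for chain A (square angstroms).
asa_monomer = {"A:LEU12": 95.0, "A:GLU45": 120.0, "A:TRP88": 60.0, "A:GLY101": 30.0}
asa_oligomer = {"A:LEU12": 20.0, "A:GLU45": 120.0, "A:TRP88": 5.0, "A:GLY101": 30.0}

buried = interface_residues(asa_monomer, asa_oligomer)
total_buried = sum(buried.values())
print(sorted(buried), total_buried)
```

In this toy input only the two residues whose ASA decreases upon association are reported as interfacial; the summed buried area is the quantity that would be compared with the size ranges quoted above.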

**Voronoi cells.** The second method selects interfacial residues based on the Voronoi diagram or its closely related power diagram [53-55]. The Voronoi diagram associates to each atom its Voronoi cell, namely the convex polyhedron that contains all points of space closer to that atom than to any other atom. Instead of the Euclidean distance $|ax|$ between a point *x* and an atom centered in *a*, the power diagram uses the power distance $p(x)$ of *x* with respect to the ball of radius *r* that represents the atom, $p(x) = |ax|^2 - r^2$. The cell of an atom then comprises all points of space that have a smaller power distance to that atom than to any other atom. Its facets belong to the radical plane, which contains the intersection of the spheres if they do intersect. The Voronoi (or power) diagram offers a natural definition of contacts: two atoms are in contact if and only if their Voronoi cells share a facet. The use of Voronoi diagrams has been extended to assess the reconstruction of protein assemblies, with the impressive example of the Nuclear Pore Complex [56].
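The difference between the Euclidean and the power distance is easiest to see on two atoms of unequal radius. The sketch below only evaluates $p(x) = |ax|^2 - r^2$ and assigns a point to the atom with the smallest value; it does not construct the diagram itself. The atom names, coordinates and radii are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def power_distance(x, centre, radius):
    """Power distance p(x) = |ax|^2 - r^2 of point x to a ball (centre, radius)."""
    return math.dist(x, centre) ** 2 - radius ** 2


def closest_euclidean(x, atoms):
    """Atom whose centre is nearest to x (ordinary Voronoi assignment)."""
    return min(atoms, key=lambda name: math.dist(x, atoms[name][0]))


def closest_power(x, atoms):
    """Atom with the smallest power distance to x (power diagram assignment)."""
    return min(atoms, key=lambda name: power_distance(x, *atoms[name]))


# Two hypothetical atoms: name -> (centre, radius), in angstroms.
atoms = {
    "a": ((0.0, 0.0, 0.0), 1.0),
    "b": ((4.0, 0.0, 0.0), 2.0),
}
x = (1.9, 0.0, 0.0)

print(closest_euclidean(x, atoms), closest_power(x, atoms))
```

The point lies slightly closer to the centre of the small atom, but it falls in the power cell of the larger one, which is why the power diagram is preferred when atoms are represented by balls of different radii.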

**Arithmetic distances.** The third method also requires the 3D structure (available in the PDB) and calculates Euclidean distances between atoms of amino acids on different chains, so that only intermolecular atomic interactions are detected [23–25]. The pairs of atoms selected are those within a cut-off distance of each other, classically around 5.0 Å, so that any type of chemical bond between the atoms is considered (H-bonds, electrostatic interactions, van der Waals forces, salt bridges and hydrophobic attractions). The pairs of atoms selected as part of the interface therefore depend on the choice of the cut-off distance. This is a serious issue because a single distance cannot fully describe a spatial arrangement, and the geometry of the interface may not be faithfully represented by the set of selected pairs [57]. The need for a cut-off distance prevents a natural reading of the geometry of the interface. Better alternatives select interacting pairs of atoms as nearest-neighbor atoms instead of using a cut-off [40, 58-60]. This measure reads the whole geometry of the interface better and therefore supplies a more accurate set of intermolecular contact pairs. Differences in the sets of atoms selected according to distances are illustrated in Figure 3.
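The three distance-based selections compared in Figure 3B (cut-off, all closest atoms, mutually closest atoms) can be contrasted on a toy interface. The following sketch is an assumed, brute-force formulation of these selections, not the implementation of the cited methods; atom names and coordinates are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math


def nearest(xyz, other_chain):
    """Name of the atom of other_chain that is closest to the point xyz."""
    return min(other_chain, key=lambda name: math.dist(xyz, other_chain[name]))


def cutoff_pairs(chain1, chain2, cutoff=5.0):
    """All intermolecular atom pairs separated by at most cutoff angstroms."""
    return {(i, j) for i, p in chain1.items() for j, q in chain2.items()
            if math.dist(p, q) <= cutoff}


def all_closest_pairs(chain1, chain2):
    """Each atom paired with its nearest atom on the other chain."""
    pairs = {(i, nearest(p, chain2)) for i, p in chain1.items()}
    pairs |= {(nearest(q, chain1), j) for j, q in chain2.items()}
    return pairs


def mutually_closest_pairs(chain1, chain2):
    """Pairs in which each atom is the nearest neighbor of the other."""
    return {(i, j) for i, j in all_closest_pairs(chain1, chain2)
            if nearest(chain1[i], chain2) == j and nearest(chain2[j], chain1) == i}


# Hypothetical atom coordinates (angstroms) for two chains.
chain1 = {"a1": (0.0, 0.0, 0.0), "b1": (3.0, 0.0, 0.0)}
chain2 = {"a2": (0.0, 3.8, 0.0), "b2": (3.0, 4.5, 0.0), "c2": (9.0, 4.0, 0.0)}

print(sorted(cutoff_pairs(chain1, chain2)))
print(sorted(all_closest_pairs(chain1, chain2)))
print(sorted(mutually_closest_pairs(chain1, chain2)))
```

On this input the three selections differ: the cut-off keeps a secondary pair that is merely nearby, the all-closest selection reaches an atom lying beyond 5 Å, and the mutually closest selection keeps only the two reciprocal pairs.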

In addition, residue conservation or spatial chemical conservation can be used to yield a set of intermolecular amino acid interactions based on both structural and sequence information [47]. The method requires the PDB files and a multiple sequence alignment as input data. Individual residues are represented in terms of regional alignments that reflect both their structural environment and their evolutionary variation, as defined by the alignment of homologous sequences. Conservation within the multiple alignments is measured with either the Shannon or the von Neumann entropy [61]. Conservation scores are also efficient in discriminating genuine biological assemblies from crystal contacts [22, 24]. Several algorithms exist; the most efficient ones, such as Evolutionary Trace, map conservation scores onto the 3D structures [62-65].
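As an illustration of an entropy-based conservation score, the sketch below computes the Shannon entropy of each column of a small alignment; a low entropy indicates a conserved position. It is a simplified assumption of how such a score can be obtained: gaps, sequence weighting and the von Neumann variant are not handled, and the four aligned sequences are made up.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residue distribution in one column."""
    counts = Counter(column)
    n = len(column)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def column_entropies(alignment):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [shannon_entropy(column) for column in zip(*alignment)]


# Hypothetical alignment of four homologous sequences.
alignment = ["MKLV", "MRLI", "MKLA", "MKLG"]
entropies = column_entropies(alignment)
print([round(h, 3) for h in entropies])
```

Columns 1 and 3 are invariant and score zero, whereas the last column, where every sequence carries a different residue, reaches the maximum for four sequences (2 bits).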


**Figure 3. Selection of the amino acids in interactions at interfaces. A. Schematic of an interface between chain 1 and chain 2.** Each chain is symbolized by a line, and its atoms are indicated by black dots and letters with indices corresponding to the chain. Distances, and thus potential interactions, between atoms are indicated by dotted lines. Only a few interactions are shown for the sake of clarity. **B. Selection of atoms in interactions at the interface.** The same schematic is reproduced after selection of the atoms in interaction at the interface. The top schematic is a selection based on mutually closest atoms, the middle one is a selection of all closest atoms and the bottom one is a selection of atoms at distances shorter than a cut-off of 5 Å. **C. Coarse-grained graphs of the interface.** Based on the selected atoms, a graph of the interactions between amino acids is drawn. The top, middle and bottom graphs correspond to the top, middle and bottom selections illustrated in B, respectively.

**Hotspots.** In the mid-nineties, the specific energetic contribution of the side chain atoms to protein interfaces was investigated using alanine scanning mutagenesis, because replacing a residue by alanine reduces its interactions to those of the backbone atoms [66]. It was found that only a small subset of interfacial residues was sensitive to an alanine mutation, indicating that the energetic contribution of the interfacial residues is not distributed uniformly. Some key "hot spot" residues contribute dominantly to the binding free energy. Thorn and Bogan deposited hot spots from alanine scanning mutagenesis experiments in the ASEdb database (http://nic.ucsf.edu/asedb/) [67]. BID (the Binding Interface Database) is another database of experimental hot spots, which collects all available experimental data related to hot spots in protein interfaces (http://tsailab.chem.pacific.edu/wikiBID/index.php/Main\_Page) [68]. In parallel, several lines of experimental evidence, based not on alanine scanning mutagenesis but on the kinetics of assembly, have shown that only some amino acids of the interfaces regulate chain association [69, 70]. Hotspot (or hot spot) is now a colloquial term that distinguishes a residue relevant for interface formation from the others.

Several hotspot predictors exist, based on ASA, Voronoi cells and distances and combined or not with evolutionary conservation; some are reviewed in [21, 47].

**Algorithms based on graph theory.** More recently, methods based on graph theory have also been proposed to identify hotspots. A graph, or network, is a mathematical representation of pairwise relations between objects. A graph is made of vertices (or nodes) and of lines, called edges (or links), that connect them. Proteins have been described as networks, with the amino acids of a protein chain considered as the nodes and the interactions between amino acids as the links. These networks, referred to as protein structure networks or amino acid networks, describe the entire protein and are used to infer global characteristics of the protein. Likewise, protein interfaces have been described as networks of interacting hot spots, with the hot spots as nodes and the interactions between hot spots as links. It is important to realize that protein interfaces are not networks but sub-networks (sub-graphs), as they describe local properties of the protein, namely the interfaces. A good overview of network measures can be found in [71].
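The node-and-link representation can be made concrete with a small amino acid network stored as an adjacency dictionary. The sketch computes a few standard graph measures (degree, local clustering coefficient, shortest path length and closeness centrality) in plain Python. It is an illustrative assumption rather than the method of any cited work: the residue labels and contacts are invented, and closeness is only one of several possible centrality measures.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected graph as {node: set of neighbors} from a list of contacts."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def clustering(graph, node):
    """Fraction of pairs of neighbors of node that are themselves linked."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention for nodes with fewer than two neighbors
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


def path_lengths(graph, source):
    """Shortest path length (number of links) from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_path_length(graph):
    """Average shortest path length over all ordered pairs of connected nodes."""
    total = pairs = 0
    for u in graph:
        for v, d in path_lengths(graph, u).items():
            if v != u:
                total += d
                pairs += 1
    return total / pairs


def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by the sum of their distances."""
    dist = path_lengths(graph, node)
    return (len(dist) - 1) / sum(dist.values())


# Hypothetical residue contacts (chain:residue labels are made up).
edges = [("A:R35", "B:E11"), ("A:R35", "B:D14"), ("B:E11", "B:D14"),
         ("A:R35", "A:L36"), ("A:L36", "B:F20")]
graph = build_graph(edges)

degree = {node: len(neighbors) for node, neighbors in graph.items()}
mean_degree = sum(degree.values()) / len(degree)
most_central = max(graph, key=lambda node: closeness(graph, node))
print(mean_degree, mean_path_length(graph), most_central)
```

For real structures a dedicated graph library would normally replace these hand-written functions; the plain-Python version is used here only to keep the example self-contained.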

made of many nodes with few links and few nodes with many links, called hubs. Hubs are absent in random and single scale networks. Hubs are communication devices that allow most nodes of the networks to be connected with others. This has noticeable consequences in terms of the vulnerability of the networks to changes on the hubs or elsewhere [83]. This is discussed later in the chapter. Proteins are essentially random networks or single-scale networks [84].

The Assembly of Protein Oligomers — Old Stories and New Perspectives with Graph Theory

The clustering coefficient, called *C*, is based on the degree of the nodes and it measures the probability that a node A which is connected to B, itself connected to C, has to be connected to C as well. Calculated over all nodes of the networks, it identifies clusters of nodes highly connected to one another and hence it discriminates different clusters as distinct communities. The calculation of clustering coefficient is detailed in [71]. In protein oligomers, the protein interface is made of many bonds between two adjacent chains and few bonds between nonadjacent chains. Hence the interface makes a cluster in terms of graph and hotspots have been

Protein interfaces are either analyzed based on all interfacial residues or on hotspots only.

To infer the features of protein interfaces, the method is simple: a dataset of protein oligomers/ protein interfaces is built, an algorithm is applied to each one of the interfaces to identify intermolecular contacts and the features of the intermolecular contacts are analyzed using statistics. Classically the parameters to describe a protein interface are: (i) interface size, expressed either in number of amino acids or in ASA, (ii) the number of regions of interfaces over the full-length chain, (iii) the chemical properties of the amino acids (amino acid fre‐

divided by its frequency in a reference set, generally the full-length chain). The size of protein interfaces is an important parameter because it may vary depending on the strength of the

Evolutionary conservation, protein folds, secondary structures, quaternary structures or crystallographic B-factors can also be considered depending on the question and the criteria used to build the dataset. The idea is to find enough specific features to distinguish the residues

Many dataset are built on proteins sharing properties at the level of their full-length chains (function, organism, superfamily, folds, and quaternary structures) but without necessarily sharing features at the level of their interfaces [39, 87-89]. In particular, the geometries of the interfaces are not necessarily looked at and therefore interfaces with different geometries are often compared. But it is generally assumed that proteins related in terms of folds or functions associate in similar ways. However, a screen over a large dataset of dimers, performed by Keskin *et al.* has shown that a non-negligible amount of protein oligomers have interfaces sharing features although they have different folds and functions [90]. This set is referred in the paper as "type II", following the term "type I" used for protein interfaces sharing features

in the protein interface

http://dx.doi.org/10.5772/58576

335

successfully identified by clustering coefficient [85].

of protein interfaces from the rest of the residues.

*2.2.1. Dataset based on features of the full-length protein*

association [86].

**2.2. Features of the intermolecular contacts at protein interfaces**

quency, the interface propensity, namely the frequency of a residue *ai*

The sub-graphs are built of pairs of amino acids in intermolecular interactions (Fig. 3C). As for the previous methods, the 3D-structures of the protein oligomer are used to build the graph and different measures are used to infer the amino acid in interactions. Basically, the atoms are considered as points in space, each chain of a protein oligomer constituting a distinct set of points. All distances between atoms of the different set are calculated and any two atoms within a given cut off distance or closest atoms are considered linked. A coarse-grain graph is built by replacing the interacting atoms by their respective interacting amino acids.

*The measure of path length and centrality is used (Central nodes).* In graphs, the notion of links between amino acids goes beyond chemical/physical bonds which are based on arithmetic distances. In a graph, any two amino acids are connected by a geodesic distance which is the shortest distance between them, measured as the minimum number of links that need to be crossed to connect them by the shortest path. This distance is called the path length and is symbolized by the letter *l*. The mean path length, *<l>,* represents the average over the shortest paths between all pairs of nodes and offers a measure of the network's overall navigability. This introduces the notion of contacts through communication routes in addition to the more classical notion of geometrical/chemical contacts. This novel notion will have vast applications in the field of protein structure and protein dynamics. It has been already used for investigating protein allosteric mechanisms as discussed later [72-77].

Several measures of the centrality of a graph (closeness, betweenness) are associated with geodesic distances. Basically the numbers of shortcut paths going through every node are calculated and the most central nodes are those with the highest numbers of shortcut paths going through them. In other words, the centrality is finding nodes at the crossroad of communication routes. Centrality measures have been used to identify residues important for the function and for the fold of proteins [78-80]. It has also been used to identify hot spot residues in protein interfaces [81, 82]. One has to be careful to keep in mind that a central node needs not to be at the center of the protein or at the center of the interface, but it is necessarily at a crossroad of many paths.

*The measure of degree and clustering coefficient is used*. The number of links a node has, in the context, the number of interactions an amino acid has with other amino acids (number of contact amino acids), is called the degree and is symbolized by the letter *k*. The mean degree <*k*> represents the average degree over all the nodes of the network. The degree distribution of networks is informative on the characteristics of the network [71]. Networks with a power law degree distribution are called scale-free, a name that is rooted in statistical physics literature. It indicates the absence of a typical degree for the nodes in the network (one that could be used to characterize the rest of the nodes). This is in strong contrast to random networks, which have Poisson degree distributions, and for which the degree of all nodes is in the vicinity of the average degree *<k>,* which can be considered typical. There are also exponential degree distributions which are single-scale networks. Scale free networks are made of many nodes with few links and few nodes with many links, called hubs. Hubs are absent in random and single scale networks. Hubs are communication devices that allow most nodes of the networks to be connected with others. This has noticeable consequences in terms of the vulnerability of the networks to changes on the hubs or elsewhere [83]. This is discussed later in the chapter. Proteins are essentially random networks or single-scale networks [84].

The clustering coefficient, called *C*, is based on the degree of the nodes and measures the probability that two neighbors of a node are themselves connected: if a node A is connected to B, and B is connected to C, it is the probability that A is connected to C as well. Calculated over all nodes of the network, it identifies clusters of nodes highly connected to one another and hence discriminates different clusters as distinct communities. The calculation of the clustering coefficient is detailed in [71]. In protein oligomers, the protein interface is made of many bonds between two adjacent chains and few bonds between nonadjacent chains. Hence the interface forms a cluster in graph terms, and hot spots have been successfully identified by the clustering coefficient [85].
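A minimal sketch of this measure is given below. It assumes the usual local definition, namely the fraction of pairs of neighbours of a node that are themselves linked, which may differ in detail from the formulation given in [71]; the function names and the toy graph are illustrative only.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pair of neighbours can be formed
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def mean_clustering(adj):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Toy graph: a triangle A-B-C with one pendant node D attached to B.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
c_b = local_clustering(toy, "B")
c_mean = mean_clustering(toy)
```

Nodes A and C sit in a fully connected neighbourhood, whereas B, which also links the pendant node D, has only one of its three neighbour pairs connected.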

Protein interfaces are either analyzed based on all interfacial residues or on hotspots only.

#### **2.2. Features of the intermolecular contacts at protein interfaces**


To infer the features of protein interfaces, the method is simple: a dataset of protein oligomers/protein interfaces is built, an algorithm is applied to each interface to identify the intermolecular contacts, and the features of the intermolecular contacts are analyzed using statistics. Classically, the parameters used to describe a protein interface are: (i) the interface size, expressed either in number of amino acids or in ASA, (ii) the number of interface regions over the full-length chain, and (iii) the chemical properties of the amino acids (the amino acid frequency and the interface propensity, namely the frequency of a residue *a<sub>i</sub>* in the protein interface divided by its frequency in a reference set, generally the full-length chain). The size of protein interfaces is an important parameter because it may vary depending on the strength of the association [86].
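The interface propensity defined in (iii) can be illustrated with a few lines of Python. This is only a sketch under simple assumptions: the reference set is the full-length chain, interface residues are given as 0-based positions, and the sequence and positions are invented for the example.

```python
from collections import Counter

def interface_propensity(chain_seq, interface_positions):
    """Frequency of each residue type in the interface divided by
    its frequency in the full-length chain (the reference set)."""
    interface = [chain_seq[i] for i in interface_positions]
    f_interface = Counter(interface)
    f_chain = Counter(chain_seq)
    return {
        aa: (f_interface[aa] / len(interface)) / (f_chain[aa] / len(chain_seq))
        for aa in f_interface
    }

# Invented 8-residue chain with an interface made of positions 4, 5 and 6.
chain = "AAAALLKK"
propensity = interface_propensity(chain, [4, 5, 6])
```

A propensity above 1 means the residue type is over-represented in the interface relative to the whole chain; in this toy case leucine accounts for two thirds of the interface but only a quarter of the chain.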

Evolutionary conservation, protein folds, secondary structures, quaternary structures or crystallographic B-factors can also be considered depending on the question and the criteria used to build the dataset. The idea is to find enough specific features to distinguish the residues of protein interfaces from the rest of the residues.

#### *2.2.1. Dataset based on features of the full-length protein*

Many datasets are built on proteins sharing properties at the level of their full-length chains (function, organism, superfamily, fold, and quaternary structure) but without necessarily sharing features at the level of their interfaces [39, 87-89]. In particular, the geometries of the interfaces are not necessarily examined, and therefore interfaces with different geometries are often compared. It is generally assumed that proteins related in terms of folds or functions associate in similar ways. However, a screen over a large dataset of dimers, performed by Keskin *et al.*, has shown that a non-negligible number of protein oligomers have interfaces sharing features although they have different folds and functions [90]. This set is referred to in the paper as "type II", following the term "type I" used for protein interfaces sharing features and derived from protein oligomers having similar folds and/or functions. To establish the determinants of the construction of an interface, it is simpler to look at type II interfaces because the evolutionary pressure on the fold and the function of the protein chain is alleviated compared to type I interfaces.


Globally, the results of studies on protein interface datasets (mainly type I) revealed the importance of hydrophobic interactions in the formation of protein interfaces, greater residue conservation, and chemical properties similar to those of surface residues but packing like that of core residues [34, 38]. The latter two properties are consistent with the fate of a protein interface. The topology of a soluble protein is defined by surface residues, which are accessible to the solvent, and core residues, which are, on the contrary, buried and inaccessible to the solvent. The amino acids of a protein interface have the solubility requirement of surface residues because the domains of the interface are initially accessible to the solvent to allow binding. To achieve stable binding, the domains need to minimize voids and maximize packing, as core residues do.

While the role of hydrophobic residues is consistent across datasets, the importance of polar and charged residues in interfaces varies considerably between datasets. Altogether, this indicates that hydrophobic residues are involved in promiscuous interactions while polar and charged residues yield alternative recognition modes and hence give each type of interface its specificity.

To date, there is no single property sufficiently unambiguous to distinguish the protein interface from the rest of the protein, and considerable disagreement exists on which properties are actually useful. Conservation is an excellent example of a property both widely used and widely debated. De Vries and Bonvin, as well as Neuvirth, raise the matter of having so many algorithms and no consensus on the parameters truly relevant to the formation of a protein interface [45, 91]. This may well explain the contradictory results on protein interface properties.

Most studies are performed on the features of individual hot spots. Yet protein interfaces result from intermolecular pairwise interactions and are likely encoded at the level of pairs. Supporting this view, the few studies investigating the features of pairs of hot spots show sufficient specificity of the residue pair preferences for accurate prediction [40, 58, 92, 93].

#### *2.2.2. Dataset based on features of the interfaces*

To investigate properties responsible for interface formation, an alternative is to build a dataset based on the features of the interfaces and not on the features of the full-length chain. For instance, one can build a dataset of proteins sharing the same geometry of interface.

#### *2.2.2.1. β-strand geometry*

Interfaces made of two interacting β-strands (intermolecular β-strands) have been extensively studied because they are present in many conformational diseases such as Alzheimer's disease, Parkinson's disease or serpinopathies [94-98]. Supporting the view of comparing interfaces with identical geometries, the proteins involved in conformational diseases share no function, fold, or quaternary structure, yet they have a common local fold involved in the intermolecular contacts that lead to fiber formation. Their pathological form, whether a fiber or an oligomer, involves interactions between two β-strands, each provided by a different chain (intermolecular β-strands). These intermolecular β-strands share several structural properties. They are recognized by the same antibody A11 [99]. Their formation depends on interactions between atoms of the backbone, a result that has led to the proposal that aggregation is a generic property of the polypeptide chain [100, 101]. They adopt a cross-β structure which can be predicted from sequences by the PIRA (Parallel 'In Register' Arrangement) model, a network made of single pairs of residues [102-107]. Different predictors of the aggregation-prone sequences involved in fiber formation are now available [96, 98, 108-111].


We have studied a dataset of 1056 interfaces present in 755 protein oligomers not known to be involved in conformational diseases [59]. Like others, we found no specificity at the level of individual hot spots. The chemical properties of the individual hot spots and their distribution along the sequence characterize only the secondary structure and the solubility of the β-strands. In contrast, the interaction pairs give the interface some specificity. Interestingly, the interfaces are best described by two sets of interaction pairs: pairs involving backbone atoms, made essentially of hydrophobic and/or small residues, and pairs involving at least one atom of the side chain, preferentially made of charged, polar, long and medium residues. In terms of amino acid preferences, the backbone pairs have properties common to intramolecular β-strand interactions and to the intermolecular β-strands involved in fiber formation. Thus hydrophobic amino acids, whether in pairs or as individual residues, do not give any specificity to interfaces. This explains why they appear in every dataset. On the other hand, the side chain pairs have particular geometrical characteristics in terms of number of atoms, branching and length. They also show preferred chemical pairings different from those measured for β-fibers [98]. However, this result is based only on a comparison with the literature and could be due to differences in the datasets and/or the algorithms.

The geometry of the side chains has so far been neglected, although it appears in our study as a key parameter. Using the Steiner Minimal Tree (SMT) approach, MacGregor Smith *et al.* proposed an elegant geometrical representation of the amino acids that was successfully applied to the problem of protein folding [112]. It will be interesting to extend this approach to protein interfaces to see whether the specificity of protein interfaces may be provided by the geometry of the amino acids rather than by their chemistry alone. A similar double layer of interactions has been observed at the interfaces between colicins and their cognate immunity proteins [113]. One set of the intermolecular residues was common to all colicin-immunity members and produced a low binding affinity between the colicin and its cognate immunity protein, while the other set was made of variable residues providing high affinity and specificity to the colicin for a particular cognate. A double layer of interactions has also been reported in monomeric proteins (intramolecular networks) [84].

As mentioned earlier, proteins and protein interfaces are now described as networks of amino acids in interaction or as sub-networks of hot spots in interaction, respectively. This relatively new concept offers the possibility of looking at the layout of interactions in addition to the amino acid properties. It is now clear that the network of interactions is as important as the components of the network in giving the protein its properties in terms of folding, function, evolution and interface formation [59, 78, 84, 114-117].


We have observed in our dataset of 1056 β-strand interfaces that the side chain pairs also have specific network features. The side chain hot spot sub-networks have nodes with more contacts than the backbone networks or the PIRA (Parallel 'In Register' Arrangement) networks used to predict fiber sequences [98]. Yet the β-strand interfaces have no hubs and maintain a low interconnectedness (little communication between residues of the interface), probably a mechanism to resist the effect of mutations by secluding the nodes of the networks (Fig. 7). At the same time, the side chain residues make fewer than three contacts, which avoids stringency on the choice of amino acids capable of making an interface and gives the β-strand interface high sequence plasticity. Robustness and plasticity of networks are well explored by graph theory and there are several very inspiring papers on that topic [83, 118-120]. This point is discussed in more detail later in the chapter.

#### *2.2.2.2. α-coiled interfaces*

To date, the only interfaces accurately predicted from sequences are α-coiled coil interfaces [121-124]. Intermolecular residues follow a regular packing, called knobs-into-holes, that produces the α-coiled coil helix-helix assembly [125]. In the simplest case (dimer), the α-coiled coil sequence displays a repeating pattern of seven amino acids, the so-called heptad repeat, labeled *abcdefg*, with hydrophobic residues at the *a* and *d* positions (Fig. 2C). These hydrophobic intermolecular contacts constitute the seam of the core of the knobs-into-holes interface. The repeats can be shorter than 20 residues or span many hundreds of amino acids.
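The heptad pattern lends itself to a very simple sequence test, sketched below. This is only a toy illustration and not one of the published predictors [121-124]: the set of residues counted as hydrophobic, the function names and the idealized sequence are all assumptions made for the example.

```python
HEPTAD = "abcdefg"
HYDROPHOBIC = set("AILMFVWY")  # assumed definition of "hydrophobic"

def heptad_score(seq, register=0):
    """Fraction of a and d positions occupied by hydrophobic residues.
    `register` is the index in 'abcdefg' assigned to the first residue."""
    hits = total = 0
    for i, residue in enumerate(seq):
        if HEPTAD[(i + register) % 7] in "ad":
            total += 1
            hits += residue in HYDROPHOBIC
    return hits / total if total else 0.0

def best_register(seq):
    """Register (0 to 6) that places the most hydrophobic residues at a and d."""
    return max(range(7), key=lambda r: heptad_score(seq, r))

# Idealized sequence of three heptads with leucine at a and d.
ideal = "LEALKEK" * 3
score = heptad_score(ideal)
shifted_register = best_register("EK" + ideal)
```

Scanning the seven possible registers recovers the frame in which the *a* and *d* positions are hydrophobic, which is the regularity that makes these interfaces readable from sequence.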

There are obvious reasons why it has been possible to understand α-coiled coil interfaces when other geometries still elude us. First, α-helices are geometrically more constrained than β-strands. Second, backbone interactions do not participate in α-helix interfaces because the hydrogen bond networks are made intramolecularly. Hence, there is no "backbone noise" that interferes with the side chain information.

#### *2.2.3. Interfaces and quaternary structures*

The quaternary structures of protein oligomers and the features of their interfaces are related, and different methods are currently being developed to understand such relations [126, 127].

In some cases such a relation is more or less understood. For example, in higher-order α-coiled coil oligomers (above dimer), additional (peripheral) knobs-into-holes take place and broaden the helical contacts [128]. Such multiple repeats lead to multi-faceted helices, which combine repeats of different amino acid compositions to accommodate quaternary structures accordingly [129, 130]. Thus, it is possible by analyzing amino acid sequences to predict the quaternary structures of α-coiled coil oligomers [129-132].

The relation between the interface features and the quaternary structure is less understood in β-strand interfaces, with a few exceptions such as the legume lectin family [81, 82]. Combining clustering algorithms with sequence alignments, motifs of sequentially and structurally conserved residues are detected at the β-strand interfaces of lectins. The different motifs are built on a subset of residues at the interface that provide a specific 3D-orientation of the β-strands. Consensus patterns at the interfaces have been found for the different quaternary structures of the lectins. Briefly, there are nine different kinds of quaternary structures in legume lectins: Canonical, ECorL-type, GS4-type, DBL-type, ConA-type, PNA-type, GS1-type, DB58-type, and Arcelin-5-type (monomeric). Seven different consensus patterns are observed, including types II (canonical), X1 (DB58-type), X2 (noncanonical interface of ConA), X3 (ECorL-type, handshake), X4 (GS4-type, back to back), and the unusual interfaces of PNA and GS1.


There has long been experimental evidence of sequences that are responsible for the quaternary structure of proteins. These sequences, called registration sequences, are located upstream of the interface region and promote the oligomerization of monomeric proteins when genetically added [133]. Proline and histidine residues located upstream of interfaces have also been shown to regulate the association between chains [8, 11, 134-136]. Collagen α-fibers and silkworm/spider β-fibers contain several repeats composed of proline residues which also participate in the quaternary structures. Whether these residues belong to the interfaces or are systematically located outside the interface regions has not yet been established.

In summary, residues located both within and outside the interfaces participate in the quaternary structure and the chain assembly. This implies that protein assembly is regulated at two levels: at the level of intramolecular interactions (residues outside interfaces) and at the level of intermolecular interactions (residues in interfaces). Thus it is also necessary to investigate the residues involved in intramolecular interactions, to discriminate those participating only in folding reactions from those participating in both folding and interface formation. The latter residues probably coordinate the whole assembly process by regulating communication between the folding and association steps.

### **3. Intermolecular and intramolecular amino acid interactions: The mechanism of protein assembly**

As mentioned at the beginning, protein assembly, or protein oligomerization, entails folding and association reactions. Thus, to have a full picture of the mechanism of assembly, besides studying intermolecular amino acid interactions, it is necessary to investigate intramolecular amino acid interactions and to understand how both types of interactions are coordinated. Different models of assembly have been recently reviewed [137, 138].

First, let's consider the simple case of the formation of a dimer. There are two routes to a dimer. One is the three-state model, in which unfolded monomers *U* (state 1) fold into monomers *M* (state 2), which subsequently associate into dimers *D* (state 3). The alternative route is the two-state model, in which unfolded monomers *U* (state 1) associate directly into folded dimers *D* (state 2). Intramolecular and intermolecular interactions occur sequentially in the three-state model but concomitantly in the two-state model. One can anticipate that folding and association are related but independent in the three-state model, and concerted in the two-state model. In terms of networks, one can speculate that the three-state model suggests a protein organized in two remotely connected sub-graphs, one governing the intramolecular interactions and the other the intermolecular interactions. In contrast, the two-state model suggests two connected sub-graphs.
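The two routes can be contrasted with a toy mass-action simulation. The sketch below is only an illustration under stated assumptions: every step is treated as irreversible, the rate constants `k_fold` and `k_assoc` are arbitrary values that do not come from the chapter or its references, and the function name `simulate` is hypothetical. It shows the one qualitative difference described above, namely that a folded monomer *M* is populated on the three-state route and never appears on the two-state route.

```python
def simulate(route, k_fold=1.0, k_assoc=1.0, u0=1.0, dt=0.001, steps=20000):
    """Integrate toy mass-action kinetics with explicit Euler steps.

    route="three-state": U -> M (k_fold), then M + M -> D (k_assoc)
    route="two-state":   U + U -> D (k_assoc), folding concerted with association
    All steps are assumed irreversible; units are arbitrary.
    """
    u, m, d = u0, 0.0, 0.0
    peak_m = 0.0
    for _ in range(steps):
        if route == "three-state":
            fold = k_fold * u            # rate of U -> M
            assoc = k_assoc * m * m      # rate of 2 M -> D
            u -= fold * dt
            m += (fold - 2.0 * assoc) * dt
            d += assoc * dt
        elif route == "two-state":
            assoc = k_assoc * u * u      # rate of 2 U -> D
            u -= 2.0 * assoc * dt
            d += assoc * dt
        else:
            raise ValueError("route must be 'three-state' or 'two-state'")
        peak_m = max(peak_m, m)          # highest folded-monomer level seen
    return {"U": u, "M": m, "D": d, "peak_M": peak_m}


if __name__ == "__main__":
    for route in ("three-state", "two-state"):
        print(route, simulate(route))
```

In both routes the quantity U + M + 2D stays constant, which is a quick sanity check on the bookkeeping of chains.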


Discriminating the route of assembly is crucial in terms of drug design strategy. In the three-state model, the interface is likely to be similarly organized in the folded monomer and in the folded dimer. Thus, it is reasonable to use the X-ray structure of the native protein oligomer as a template to design assembly inhibitors. In contrast, in the two-state model the interface in the unfolded monomer differs from the interface in the folded dimer, so assembly inhibitors designed on the native structure of the protein oligomer are unlikely to recognize the unfolded monomer and block the assembly at the monomeric stage (or at an early stage). This is one illustration of why it is important to anticipate the mechanism of assembly.

#### **3.1. Can the mechanism of assembly be predicted by investigating evolutionary relationship?**

D'Alesio offers good historical reviews on this question [139, 140]. The interfaces of dimers assembling by a two-state model are found to share patterns with intramolecular interactions observed in monomers. It is proposed that such dimers have evolved from mutations in an existing monomer that led to its unfolding, followed by further mutations that yielded a viable dimer with intermolecular interactions similar to the intramolecular interactions present in the initial monomer. This mechanism is reminiscent of the domain swapping mechanism, which is well presented in the chapter by Giovanni Gotte and Massimo Libonati and reviewed in [141]. In such a situation the evolution to the dimer depends initially on the evolution of a viable monomer towards unfolding induced by random mutation. There would be no folded monomer on the assembly route because the monomer could not bear the mutations in a folded state. The evolutionary route between the dimer and the monomer suggests the presence of epistatic mutations (mutations that have different effects in combination and individually). Here the folding and association steps are related and dependent on one another, in terms of both evolution and mechanism of assembly. In contrast, dimers assembling through a three-state model were not found to share motifs with monomers. The folding of the monomer might then be a natural route towards association, and folding and association would appear evolutionarily and mechanistically related but independent.

Next, let's look beyond the simple case of dimers. Possible relations between evolution and assembly mechanism have been further explored by looking at protein oligomers sharing the same functions (superfamily) but adopting different quaternary structures. In one study the authors exploit the symmetry of oligomers to establish a relationship between evolution and assembly mechanisms [142]. From an initial screen of 5375 PDB structures, they found tetramers with D2 point group symmetry having homologous dimers with C2 point group symmetry, and hexamers with D3 point group symmetry having homologous dimers with C2 point group symmetry or homologous trimers with C3 point group symmetry. In total, 49 protein oligomers with a symmetry relation from Cn to Dn are reported. They found evolutionary links between the Cn and Dn counterparts and experimentally isolated the *in vitro* disassembly Cn intermediates for ten of the Dn oligomers. In addition, for five cases, the Cn intermediates were also shown to be formed during *in vitro* reassembly. They concluded that the evolutionary and assembly pathways were related and that assembly intermediates could be predicted solely from the atomic structure.


But this conclusion might be overoptimistic, because it is based on very few cases (49 of 5375) and because protein structure evolution and protein assembly are plastic in terms of mechanisms, such that to date it remains difficult to establish either route.

For example, a single mutation has been found responsible for a transition from Cn to Dn point group symmetry; it is not obvious how such a global change could have been anticipated by simply considering the full-length 3D structure of the wild-type protein [143]. On the other hand, it shows that protein assembly is regulated by local properties, since a single mutation is enough to alter the global assembly. This strongly suggests that the solution lies in understanding the local properties and how they propagate information to regulate the global shape.

Hemoglobin is another complex example of a protein sharing a function but adopting distinct quaternary structures, for which evolutionary and assembly routes are not easily drawn even though the structures are available (Fig. 4).

*Synechocystis* produces a monomeric hemoglobin, the cyanoglobin (PDB 1S69), with a C1 point group symmetry; the human hemoglobin is tetrameric (PDB 2HHB) with a C2 point group symmetry; *Oligobrachia mashikoi* produces a 12-mer hemoglobin (PDB 2ZS1) with a D3 point group symmetry; and the giant earthworm hemoglobin contains 144 chains (PDB 2GTL) with a D6 point group symmetry. In such a case, the different hemoglobin point group symmetries and quaternary structures may result from coding constraints of their respective organisms rather than from a relation in terms of evolution or assembly intermediates. As described by Crick in the early 60s, symmetric assemblies require fewer distinct kinds of specific interaction interfaces than asymmetric assemblies [144]. Likewise, higher symmetries require fewer distinct interfaces than lower symmetries, and thus the smaller a genome, the more often its protein structural complexity may rely on high symmetry. This is consistent with the frequent occurrence of proteins with icosahedral symmetry in viruses, while most eukaryotic molecular machines have C1 symmetry (Fig. 1B). However, some oligomers with high point group symmetry have also been discovered in eukaryotes, bacteria and archaea, with the vault proteins assembling 78 copies (PDB 2ZUO, 2ZV4, 2ZV5), the encapsulin (PDB 3DKT) and the vault from *Pyrococcus furiosus* (PDB 2E0Z). Clearly one has to be cautious in interpreting data and statistics derived from the PDB.

In addition, hemoglobin is an interesting example of how a single function is provided by a combinatorial set of assemblies using the same protein fold. There are many other examples (e.g. ferritin, rubisco), but α-coiled coil oligomers probably offer the largest combinatorial set of assemblies. It was recently shown that they formed before LUCA (last universal common ancestor), by independent routes and most likely as the result of all possible geometric solutions to packing helices in a stable way [145]. Again, this illustrates how diverse evolutionary routes are.


**Figure 4. Plasticity of quaternary structures fulfilling a single biological function: the hemoglobin example.** The hemoglobin chain exists as a single fold which is copied and assembled with different stoichiometries (number of chains) and different symmetries across species to maintain the same biological function. A few cases are shown, from a hemoglobin monomer to a 144-mer assembly. The structures are shown as ribbons except where a space-filling representation better illustrates the symmetry of the assembly. The pictures were generated with Rasmol. The PDB codes and the symmetries of the hemoglobins are indicated above their respective structures.

The relation between evolution and assembly routes assumes that an oligomer evolves/assembles from a monomeric entity. But the reverse situation also exists, as for the native tachylectin-2 monomer, which has been proposed to have evolved from a pentameric ancestor through short, functional gene segments that, at later stages, duplicated, fused and rearranged [146]. The authors propose that new folds evolved through the structural plasticity of assembly intermediates.

This last example illustrates, quite ironically, that protein folds and quaternary structures still hold surprises and that a direct relation between the evolution of protein oligomers and the mechanism of their assembly is not systematic. Both evolution and assembly certainly involve multiple parameters, which makes their prediction rather challenging.

#### **3.2. Can the mechanism of assembly be predicted by experimental approaches?**

The two-state and three-state models are depicted in Figure 5A.

**Figure 5. Mechanisms of assembly.** Two mechanisms of assembly have been described and experimentally observed. In the lock and key mechanism, the protein chain folds before association; it is also called the three-state model because the protein can be observed in three states: unfolded monomer, folded monomer and native oligomer (top route). In the fly-casting mechanism, the protein chains associate in a more or less partially folded state and only subsequently acquire their native folded conformation; it is also referred to as the two-state model because, for a dimer, the protein exists in only two states, unfolded monomer or native dimer. These two models are illustrated with the assemblies of the two related AB5 toxins, the heat labile enterotoxin B pentamer (LTB5) and the cholera toxin B pentamer (CtxB5). The two toxins share 94 % sequence identity and almost superimposable atomic structures but nevertheless assemble through two different mechanisms.

The three-state model is the oldest and most classical mechanism observed; it is generally referred to as the lock and key mechanism. There is plenty of experimental evidence for both the two-state and three-state mechanisms. Non-native oligomers, namely oligomers with native quaternary structures but non-native folds, have long been isolated experimentally and are common assembly intermediates [69, 70, 133, 134, 147-153]. Such intermediates are typical borderline cases, as they might be produced by a lock and key mechanism or by a fly-casting mechanism. There are clear examples of proteins assembling by fly-casting, with unfolded monomers able to associate [8, 134, 151, 154, 155]. The RING domain protein family of scaffolding oligomers presents an interesting case of the formation of a stable, partially folded tetramer along the oligomerization route to a native 24-mer [156-158]. The C4 symmetry tetramer is populated because of its fast formation from monomers and its slow conversion into a D4 24-mer (6 x 4). The transition to the D4 symmetry 24-mer is rate-limiting because of the slow proline *cis-trans* isomerization that regulates the association of two monomers via the ligation of Zn sites. Likewise, dimer, trimer and tetramer assembly intermediates are isolated along the route to the native cholera toxin B pentamer (CtxB5) because the formation of one of the toxin interfaces is regulated by a proline *cis-trans* isomerization [8]. The CtxB assembly intermediates acquire some of their native secondary structure upon association because their main interface involves the formation of an intermolecular β-sheet; this folding/association step is regulated by histidine residues [134].


Proline and histidine residues are rare at interfaces but are often found upstream of interface regions and act indirectly on their formation, as mentioned at the beginning of the chapter (see introduction). Registration sequences that control the quaternary structure of protein oligomers are also located outside interfaces. In fact, several residues located outside interfaces have been shown to be involved indirectly in chain association, and a variety of small amino acid modules have been proposed to act upon assembly by different processes. Basically, they introduce the flexibility required to modulate the 3D position of interface domains so as to increase the chance of successful encounters [138, 159].

There are cases of proteins sharing functions, high sequence identities, folds and quaternary structures but following distinct assembly mechanisms. For example, the two related AB5 toxins, heat labile enterotoxin (LTB5) and cholera toxin (CtxB5), have 94 % sequence identity, almost superimposable 3D structures and identical quaternary structures, but nevertheless assemble through different mechanisms under identical experimental conditions (Fig. 5A). LTB5 follows a lock and key mechanism whereas CtxB5 assembles through a fly-casting mechanism [8, 134, 135]. Out of 103 amino acids, 11 are different, among which only two are in the interface. The cpn10 heptamers are another such example [70].

The role of only a few residues in controlling an assembly or a disassembly mechanism is also evidenced in the so-called conformational diseases, where a single amino acid mutation is enough to redirect the native protein conformation to an aberrant conformation such as a fiber, through unfolding/refolding steps [169-175]. Consequently the protein loses its function, leading to the disease.

This tends to show that the assembly of a protein is in fact regulated by only a few amino acids, indicating that small differences are enough to go from a fly-casting to an induced-fit mechanism. This is in good agreement with allosteric mechanisms and the MWC (Monod, Wyman, Changeux) theory that unifies the fly-casting and induced-fit routes into a single mechanism [176]. Accordingly, protein assembly can be expressed as a single scheme with transitions between the fly-casting and the induced-fit mechanisms depending on thermodynamic equilibrium and kinetic rates (Fig. 6).

There are several lines of evidence for such transitions in biology, some of which are illustrated in Figure 7. For example, in the course of evolution proteins may change their folding and/or assembly routes upon random mutations. Or proteins very similar in sequence and structure may favor different routes because of small amino acid differences in their sequence and/or environmental factors. This illustrates the plasticity of proteins in terms of mechanism of formation and in terms of quaternary structures, but also supports the fact that it is the local characteristics (a few amino acids) that impact the global structure of a protein.

#### **3.3. Can the mechanism of assembly be predicted by computational approaches?**

The two-state model was revisited by Wolynes' laboratory, which showed that an unfolded protein has a greater capture radius for a specific binding site than the folded state with its restricted conformational freedom [160]. In this binding scenario, the unfolded state binds weakly at a relatively large distance and then folds as the protein approaches the binding site: the "fly-casting mechanism" (Fig. 5A). In 2004, Wolynes introduced the notion that certain characteristics of the atomic structure, such as the interface size and hydrophobicity and the ratio of the number of interfacial contacts to the number of intramonomeric contacts, make it possible to determine whether a homodimer assembles by a fly-casting or a lock and key mechanism [161]. A large ratio of interfacial to monomeric contacts is typical of a two-state model and of the fly-casting mechanism.
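The contact ratio mentioned above can be computed from coordinates in a few lines. The sketch below is a minimal illustration, not the procedure of [161]: the function name `contact_ratio`, the one-point-per-residue representation, the distance cutoff, the minimum sequence separation and the choice to divide by the total intramonomeric contacts of both chains are all assumptions made here for the example.

```python
from itertools import combinations
from math import dist


def contact_ratio(chain_a, chain_b, cutoff=5.0, min_seq_sep=3):
    """Return (interfacial, intramonomeric, ratio) contact counts for a dimer.

    chain_a, chain_b: lists of (x, y, z) tuples, one point per residue
    (for instance C-alpha positions). Two residues are "in contact" when
    their points lie within `cutoff`. Intramonomeric pairs closer than
    `min_seq_sep` positions along the chain are ignored.
    """
    def intra(chain):
        return sum(
            1
            for (i, p), (j, q) in combinations(enumerate(chain), 2)
            if j - i >= min_seq_sep and dist(p, q) <= cutoff
        )

    interfacial = sum(1 for p in chain_a for q in chain_b if dist(p, q) <= cutoff)
    intramonomeric = intra(chain_a) + intra(chain_b)
    if intramonomeric == 0:
        raise ValueError("no intramonomeric contacts with these settings")
    return interfacial, intramonomeric, interfacial / intramonomeric


if __name__ == "__main__":
    # Toy coordinates: two parallel three-residue chains, 4 units apart.
    a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
    b = [(0.0, 4.0, 0.0), (4.0, 4.0, 0.0), (8.0, 4.0, 0.0)]
    print(contact_ratio(a, b, cutoff=4.5, min_seq_sep=1))
```

No threshold separating the two mechanisms is given here because the text does not state one; the function only reports the ratio.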

Computational approaches also provide evidence supporting the lock and key mechanism, the fly-casting mechanism and a series of in-between mechanisms that involve back-and-forth between folding and association reactions. The latter rest on an "induced-fit" principle in which intermolecular contacts "catalyze" folding (allostery, conformational gating, induced fit) [162-165].

Recently, molecular dynamics (MD) simulations have been combined with network analysis to provide a detailed understanding of the route of assembly. For example, coarse-grained transition networks (CGTNs) can be derived from MD simulations to show the transitions between oligomers of different sizes [166, 167]. In a recent report, the role of the sequence in the aggregation kinetics and assembly mechanisms was described in great detail [168]. Briefly, MD is performed and each conformation observed in the simulation is assigned a state defined by a set of digits. Based on the MD, an *N* x *N* transition matrix is built for the *N* states, with the matrix elements defined by the number of occurrences of each transition between two states. The transition matrix is converted into a graph called a KTN (Kinetic Transition Network), with the nodes corresponding to the states and the edges to the transitions. Such graphs provide measures of the population of the different states and of the probability of transition between them. Energy barriers are associated with the transitions, and disconnectivity graphs are constructed with min-cut algorithms to evaluate the energy barrier between one conformation and another. The dynamics of aggregation was also evaluated using the FPTD (First Passage Time Distribution), which informs on the most populated states and on the kinetics. Although such an approach has not yet been applied to the assembly of a full-length protein, there is no doubt that this combination of molecular dynamics with graph theory would provide new directions in predicting protein assembly mechanisms.
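The bookkeeping behind such a network can be sketched as follows. This is a minimal illustration, not the pipeline of [166-168]: the state labels, the function names `build_ktn` and `first_passage_times`, and the toy trajectory are hypothetical, the transition matrix is stored sparsely as a dictionary of dictionaries, and energy barriers, disconnectivity graphs and min-cut analysis are left out.

```python
from collections import Counter, defaultdict


def build_ktn(trajectory):
    """Build state populations and a directed transition graph.

    trajectory: sequence of hashable state labels, one per MD frame.
    Returns (populations, edges), where edges[a][b] is the number of
    observed a -> b transitions between consecutive frames (a != b).
    """
    populations = Counter(trajectory)
    counts = Counter(zip(trajectory, trajectory[1:]))
    edges = defaultdict(dict)
    for (a, b), n in counts.items():
        if a != b:                      # self-transitions are not graph edges
            edges[a][b] = n
    return populations, dict(edges)


def first_passage_times(trajectory, source, target):
    """Frames elapsed between entering `source` and first reaching `target`."""
    times, start = [], None
    for t, state in enumerate(trajectory):
        if state == source and start is None:
            start = t                   # clock starts at first arrival in source
        elif state == target and start is not None:
            times.append(t - start)     # first hit of target stops the clock
            start = None
    return times


if __name__ == "__main__":
    # Toy trajectory; each label stands for one digit-encoded state.
    traj = ["100", "100", "110", "100", "110", "111", "111", "110", "111"]
    populations, edges = build_ktn(traj)
    print(populations)
    print(edges)
    print(first_passage_times(traj, "100", "111"))
```

Dividing each row of `edges` by its sum would turn the counts into transition probability estimates, and a histogram of the returned first passage times gives an empirical first passage time distribution.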

In fact, graph theory is well suited to investigating the residues involved in intramolecular interactions, the residues involved in intermolecular interactions and the cross-talk between them. Basically, sub-graphs or clusters are produced and the allosteric communication between the different clusters is investigated. This has been used for enzyme/ligand intermolecular interactions and for interfaces [74]. It appears that the intramolecular networks maintain the robustness of the structure, while the interface residues are more plastic to accommodate the flexible motion required for association.

Obviously, folding and association reactions intertwine to orchestrate protein assembly. This means that the key factor for protein assembly is the balance between intramolecular and intermolecular interactions.

#### **4. Local key contacts regulate global conformations**

There are cases of proteins that share functions, high sequence identity, folds and quaternary structures but follow distinct assembly mechanisms. For example, the two related AB5 toxins, heat-labile enterotoxin (LTB5) and cholera toxin (CtxB5), have 94 % sequence identity, almost superimposable 3D structures and identical quaternary structures, but nevertheless assemble through different mechanisms under identical experimental conditions (Fig. 5A). LTB5 follows a lock and key mechanism whereas CtxB5 assembles through a fly-casting mechanism [8, 134, 135]. Out of 103 amino acids, 11 differ, of which only two are in the interface. The cpn10 heptamers are another such example [70].

The role of only a few residues in controlling an assembly or a disassembly mechanism is also evident in the so-called conformational diseases, where a single amino acid mutation is enough to redirect the native conformation of the protein to an aberrant conformation such as a fiber, through unfolding/refolding steps [169-175]. Consequently, the protein loses its function, leading to the disease.

This tends to show that the assembly of a protein is in fact regulated by only a few amino acids, indicating that small differences are enough to go from a fly-casting to an induced-fit mechanism. This is in good agreement with allosteric mechanisms and the MWC (Monod, Wyman, Changeux) theory, which unifies the fly-casting and induced-fit routes into a single mechanism [176]. Accordingly, protein assembly can be expressed as a single scheme with transitions between the fly-casting and the induced-fit mechanisms depending on thermodynamic equilibria and kinetic rates (Fig. 6).

There are several lines of evidence for such transitions in biology, some of which are illustrated in Figure 7. For example, in the course of evolution, proteins may change their folding and/or assembly routes upon random mutations. Likewise, proteins that are very similar in sequence and structure may favor different routes because of small amino acid differences in their sequences and/or environmental factors. This illustrates the plasticity of proteins in terms of mechanism of formation and of quaternary structure, and also supports the idea that local characteristics (a few amino acids) impact the global structure of a protein.

**Figure 6. Kinetic scheme of protein assembly.** A protein monomer may exist in an unfolded state *U* and fold into a folded state *M*. A protein oligomer may assemble (top reactions) from unfolded monomers *U*, which associate into "partially" folded states *Ui*, with *i* going from 2 to *n*, *n* being the number of chains, until they assemble into a non-native oligomeric state *Un*, which finally folds into the native folded oligomer *On*. Alternatively, a protein oligomer may assemble (bottom reactions) from unfolded monomers *U*, which fold and associate into dimers *D*, trimers *T*, etc., until they reach the native oligomeric state *On*. Each conformational state exists in equilibrium and may go from one state to another according to the *kon* and *koff* rates of the reaction. Only the formation of the native state is considered irreversible. The populations of every species and of the transition species depend on the kinetic parameters.
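To make the dependence of the populations on the kinetic parameters concrete, the bottom route of the scheme can be sketched for the simplest case of a dimer (*n* = 2), where the native state is *D*. This is only an illustration under stated assumptions: mass-action kinetics, a forward Euler integration, and arbitrary rate constants (`k_fold`, `k_unfold`, `k_on`) that are not taken from any measurement. The association step is treated as irreversible, following the caption.

```python
def simulate_dimer_route(u0=1.0, k_fold=1.0, k_unfold=0.1, k_on=5.0,
                         dt=1e-3, steps=20000):
    """Integrate U <-> M and M + M -> D with a forward Euler scheme.

    u, m and d are the concentrations of unfolded monomer, folded monomer
    and native dimer. All rate constants are hypothetical.
    """
    u, m, d = u0, 0.0, 0.0
    for _ in range(steps):
        fold = k_fold * u        # U -> M
        unfold = k_unfold * m    # M -> U
        assoc = k_on * m * m     # M + M -> D (rate of dimer formation)
        u += dt * (unfold - fold)
        m += dt * (fold - unfold - 2.0 * assoc)
        d += dt * assoc
    return u, m, d


# Same starting point, two hypothetical association rates.
fast = simulate_dimer_route(k_on=5.0)
slow = simulate_dimer_route(k_on=0.05)
```

Comparing `fast` and `slow` shows how a single rate constant shifts the populations of *U*, *M* and *D* at a given time. The total amount of chains, `u + m + 2 * d`, is conserved by construction.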

How can the few amino acids that are key determinants of a protein fold or an assembly be identified, and what properties must they have to affect the mechanism of assembly and/or its final output? Graph theory is at present probably one of the most suitable approaches to investigate such questions. For example, the effects of the mutations involved in conformational diseases have been considered in terms of networks. Recently, a novel approach using graph-based signatures has shown that the impact of a mutation correlates with the atomic-distance patterns surrounding an amino acid residue [177]. The authors showed that the signatures can be used to predict the stability changes of a wide range of mutations occurring in the tumor suppressor protein p53.

We have also investigated the effects of mutations on graph features and the possible consequences in terms of the disease mechanism [59]. As briefly mentioned earlier, we have seen that the networks of the 1056 intermolecular β-strands present in "healthy" protein oligomers avoid hubs (highly connected residues), which makes them robust to mutation. The intermolecular β-strands are essentially disconnected graphs, so a mutation would not spread damage far in the network. We compared these "healthy" networks with the β-strand interface of the p53 tetramer, which has known familial mutations related to dissociation of the tetramer and fiber formation, and associated with cancer [169]. The p53 network has a higher interconnectedness because its nodes have higher degrees (i.e. more contacts) than those in the "healthy" networks, with the consequence that a single node modification (i.e. a mutation) is enough to reorganize the interactions in the entire network. Thus the higher connectivity of the p53 network leads to a greater sensitivity to rewiring (rearrangement of links upon node modification) than the disconnected graphs observed in healthy proteins. In some cases, such ample rewiring probably promotes chain dissociation, the first step to fiber formation. We have now started to investigate why the p53 network has a higher interconnectedness than the "healthy" networks. The p53 tetramer has D2 point group symmetry and its interfaces adopt a local central symmetry because the two interacting domains have identical sequences and the residues are paired in an anti-parallel manner. In contrast, 60 % of the interfaces of "healthy" proteins have domains made of different sequences and their β-interfaces have no local symmetry.

**Figure 7. Evidence of transitions between different states and different reaction paths.** The kinetic scheme described in Figure 6 is reproduced at the center of the figure. The transitions between different protein conformations or different reaction paths are indicated by green arrows. The X-ray structures of proteins undergoing such transitions are shown as illustrations. The transition can take place during evolution and because of mutation (e.g. tachylectin-2, two-state model). It can take place because of mutation and lead to conformational diseases, as for the p53 tumor suppressor. The main route may depend on a difference of a few amino acids, as for the two related toxins CtxB5 and LTB5. Or else the protein, like the pore-forming toxin aerolysin, may adopt different quaternary structures and go from one to another because of environmental factors (e.g. pH, proteolytic cleavage, cell receptor, etc.).

Let's consider three intermolecular networks: one with different amino acid sequences and no local symmetry, a second with identical sequences arranged in a parallel manner (horizontal axis symmetry) and a third with identical sequences arranged in an anti-parallel manner (rotational axis symmetry) (Fig. 8).

**Figure 8. Effect of local symmetry on network features. A.** Protein interfaces may be formed by the association of domains with identical sequences (left and right panels) or different sequences (middle panel). In the former case, the two domains may be aligned in a parallel or antiparallel manner (in-register or out-of-register arrangement). Identical sequences have an intrinsic symmetry in their amino acid pairing: if residue 1 interacts with residue 2, then residue 2 interacts with residue 1. Such a symmetry constraint produces motifs in the network that are not necessarily present in "asymmetrical" interfaces made of domains with two different sequences. This is illustrated on a simple network. **B. Connected components.** Considering a slightly more complex network, one can see that the motifs result from the elements of symmetry, a horizontal axial symmetry or a rotational axial symmetry for a parallel or an antiparallel arrangement, respectively. The dotted boxes indicate the connected components, namely the residues that are connected to each other. One can see the effect of the symmetry on the total number of connected components. **C. Propagation of changes.** The effect of a single node modification on the network, indicated by an M for mutation, is considered. Assuming there is an effect as long as there is a link between two nodes, the symmetry enables the changes (red links) to propagate within the network. **D.** The average degree *<k>* of the nodes is given for each of the networks.

Let's now look at the consequences for the network features. The first consequence of local symmetry between identical sequences is a multiplication of the number of interactions in the interface and therefore an intrinsic increase of the network interconnectedness (Fig. 8B). As observed for the p53 case, such an increase would make the network sensitive to rewiring, which in terms of the protein may introduce a vulnerability to chain dissociation or chain reorganization. The second consequence is a decrease in the number of distinct connected components (Fig. 8B). This again would increase the propagation of changes within the network upon mutation, because the amino acids are not secluded from one another. It is interesting that local symmetry is enough to improve the communication within the network without significantly altering the average degree <*k*>. This means that, even without hubs, the protein interfaces become highly connected by long paths. This preliminary analysis suggests that interfaces made of domains with different sequences might be more resistant to fold plasticity because of the absence of sequence symmetry. Protein oligomers that undergo a transition to pathological assemblies (fibers or oligomers) probably have global and local properties that make them amenable to fold plasticity. How the local properties alter the global properties remains to be explored.
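The three measures discussed above (number of connected components, average degree <*k*> and propagation from a modified node) can be computed on toy networks with a few lines of code. The sketch below is only an illustration: the two edge lists (`secluded` and `chained`) are hypothetical and are not the networks of Figure 8. They were chosen so that both have the same number of nodes and links, and no node with more than two contacts. Propagation is modeled as in the figure caption: a change reaches every node linked, directly or indirectly, to the modified one.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (residue, residue) contacts."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def reachable(adjacency, start):
    """Nodes reached from `start` by breadth-first search (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def connected_components(adjacency):
    """List of node sets, one per connected component."""
    remaining = set(adjacency)
    components = []
    while remaining:
        component = reachable(adjacency, next(iter(remaining)))
        components.append(component)
        remaining -= component
    return components


def average_degree(adjacency):
    return sum(len(neighbors) for neighbors in adjacency.values()) / len(adjacency)


def max_degree(adjacency):
    return max(len(neighbors) for neighbors in adjacency.values())


# Hypothetical interface contacts between chain A (A1..A4) and chain B (B1..B4).
secluded = [("A1", "B1"), ("B1", "A2"), ("A2", "B2"), ("B2", "A1"),
            ("A3", "B3"), ("B3", "A4"), ("A4", "B4")]
chained = [("A1", "B1"), ("B1", "A2"), ("A2", "B2"), ("B2", "A3"),
           ("A3", "B3"), ("B3", "A4"), ("A4", "B4")]

summary = {}
for name, edges in (("secluded", secluded), ("chained", chained)):
    adjacency = build_adjacency(edges)
    summary[name] = {
        "components": len(connected_components(adjacency)),
        "average_degree": average_degree(adjacency),
        "max_degree": max_degree(adjacency),
        "reach_of_mutation_at_A1": len(reachable(adjacency, "A1")),
    }
```

In this toy case both networks have the same average degree and no hubs, yet a modification at `A1` reaches only part of the `secluded` network and all of the `chained` one. The layout of the links, not their number, makes the difference.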

#### **5. Conclusion**

The novel result obtained with graph theory is that the layout of the interactions, called the network topology, is extremely important for understanding the formation of an interface and the plasticity of fold and quaternary structure. It is also important to understand that the keys do not lie in hot-spot features but in the residues whose local properties spread enough global effects to regulate or affect the structure of the full-length chain. In other words, the formation of interfaces and quaternary plasticity rest on the residues that control allosteric transitions, mechanisms now revisited using propagation measures in networks. This local-to-global transition is also investigated with mathematical concepts in the chapter by Laurent Vuillon and Claire Lesieur.

The take-home message of the chapter is that computational approaches efficiently complement experimental approaches in gaining insight into protein assembly. Future challenges lie in understanding how intramolecular and intermolecular interactions are coordinated and what determines allosteric transitions. Graph theory and network approaches open new avenues to explore such problems and are likely to provide important breakthroughs. Briefly, graph theory can help in identifying key intramolecular and intermolecular interactions, as well as in investigating how they communicate, by analyzing the topology of the networks, isolating appropriate clusters and determining propagation routes (allostery).

Now, it may still be too early to grasp which network measures are most relevant to the problem of protein assembly and how they can be interpreted in terms of a protein's needs. For example, proteins and protein interfaces have been described as random networks with Poisson degree distributions centered on a characteristic average degree <*k*> [178]. This means that all nodes have on average the same number of links (or contacts) and that there are no hubs. Proteins have also been described as single-scale networks with an exponential degree distribution and, again, no hubs [59, 84]. In some cases, the random network is attributed to the backbone interactions while the single-scale network is attributed to the side-chain interactions. A network made of a minimum number of contacts seems rather coherent, as proteins probably minimize the number of links (bonds) per amino acid to reduce the "building" cost in terms of bonds and the sequence stringency.
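What "a random network with a Poisson degree distribution and no hubs" means in practice can be sketched by generating a random graph and looking at its degrees. This is an illustration only, not an analysis of a protein: the number of nodes (`n_nodes`), the target average degree (`target_k`) and the random seed are arbitrary choices, and links are drawn independently with the same probability (an Erdős–Rényi graph). For a Poisson-like distribution the variance of the degrees is close to their mean and the largest degree stays within a few times <*k*>.

```python
import random
from collections import Counter


def random_network_degrees(n_nodes, p_link, seed=0):
    """Degrees of a random graph in which each pair of nodes is linked
    independently with probability p_link."""
    rng = random.Random(seed)
    degrees = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < p_link:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def degree_summary(degrees):
    """Mean, variance and maximum of a list of node degrees."""
    n = len(degrees)
    mean = sum(degrees) / n
    variance = sum((k - mean) ** 2 for k in degrees) / n
    return mean, variance, max(degrees)


n_nodes = 300    # arbitrary network size
target_k = 4.0   # arbitrary target average degree <k>
degrees = random_network_degrees(n_nodes, target_k / (n_nodes - 1))
mean_k, var_k, k_max = degree_summary(degrees)
histogram = Counter(degrees)  # number of nodes observed for each degree
```

In a network with hubs, by contrast, `k_max` would be many times larger than `mean_k` and the variance would greatly exceed the mean.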

**Author details**

Address all correspondence to: claire.lesieur@agim.eu

[1] Goodsell DS, Olson AJ. Structural symmetry and protein function. Annu Rev Bio‐

The Assembly of Protein Oligomers — Old Stories and New Perspectives with Graph Theory

http://dx.doi.org/10.5772/58576

351

[2] Grant RA, Filman DJ, Finkel SE, Kolter R, Hogle JM. The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nat Struct Biol. 1998 Apr;5(4):294-303.

[3] Lloyd DR. Cubic Icosahedra? A Problem in Assigning Symmetry. Journal of Chemi‐

[4] Cheung DT, DiCesare P, Benya PD, Libaw E, Nimni ME. The presence of intermolec‐ ular disulfide cross-links in type III collagen. J Biol Chem. 1983 Jun 25;258(12):7774-8.

[5] Ricard-Blum S. The collagen family. Cold Spring Harbor perspectives in biology. 2011 Jan;3(1):a004978. PubMed PMID: 21421911. Pubmed Central PMCID: 3003457.

[6] Gordon MK, Hahn RA. Collagens. Cell and tissue research. 2010 Jan;339(1):247-57.

[7] Lesieur C, Frutiger S, Hughes G, Kellner R, Pattus F, van der Goot FG. Increased sta‐ bility upon heptamerization of the pore-forming toxin aerolysin. J Biol Chem. 1999

[8] Lesieur C, Cliff MJ, Carter R, James RF, Clarke AR, Hirst TR. A kinetic model of in‐ termediate formation during assembly of cholera toxin B-subunit pentamers. J Biol

[9] McLaughlin SH, Bulleid NJ. Molecular recognition in procollagen chain assembly.

[10] Reimer U, Scherer G, Drewello M, Kruber S, Schutkowski M, Fischer G. Side-chain effects on peptidyl-prolyl cis/trans isomerisation. J Mol Biol. 1998 Jun 5;279(2):449-60.

phys Biomol Struct. 2000;29:105-53. PubMed PMID: 10940245.

PubMed PMID: 19693541. Pubmed Central PMCID: 2997103.

Chem. 2002 May 10;277(19):16697-704. PubMed PMID: 11877421.

Matrix Biol. 1998 Feb;16(7):369-77. PubMed PMID: 9524357.

Dec 17;274(51):36722-8. PubMed PMID: 10593978.

AGIM-FRE3405 UJF-CNRS, Grenoble, France

PubMed PMID: 9546221.

PubMed PMID: 6863264.

PubMed PMID: 9642049.

cal Education. 2010;87(8):823-6.

Claire Lesieur\*

**References**

Simultaneously, proteins are described as small world because they have small average path length <*l*>. Small <*l*> generally indicates that most nodes, namely amino acids, of the network are within the reach of each other. Such node accessibility would suggest that a single modification anywhere in the network (ie any mutation in a protein) would easily spread changes in the whole network, a hazardous situation for a protein and in contradiction with the fact that protein folds and functions resist most mutations. Small world networks generally have hubs, highly connected nodes that govern the network communication routes. But proteins are random or single-scale networks and as such are not expected to have hubs, at least not hubs with many more links than other nodes. The absence of hubs is good as it reduces the protein vulnerability to mutation. We have measured <*l*> from 10 to 19 in protein interfaces for networks made of about 300 nodes (unpublished). For comparison the world wide web has similar <*l*>=19, but 800 million nodes. Maybe it just happens that some worlds are smaller than other.

It is therefore not so simple to deconvoluate the topology of a network with a small average <*l*> depending if it is a random, single scale or scale-free (power law degree distribution) network. Theoretical developments aiming at this understanding are proposed and allow considering distribution of connected components, distribution of clustering coefficient and approximation of <*l*>. Such work will help analyzing the network measures obtained for amino acid networks [179].

One problem of network is the number of interactions and nodes generated to describe a protein network and how to discriminate a hierarchy within these set of interactions to understand the determinant ones. To this goal, one elegant strategy is to experimentally measure kinetics and affinity to prioritize interactions in networks [180]. Such approaches would complement MD simulations and help discriminating the good from the bad.

#### **Acknowledgements**

A special thanks to Kave Salamatian and Laurent Vuillon for numerous fruitful discussions on network and graph theories. We thank the federation of research FR2914 MSIF (Modeliza‐ tion, Simulations, Interaction Fundamentals) for supporting our work on interdisciplinary research.

#### **Author details**

Claire Lesieur\*

all nodes have on average the same number of links (or contacts) and there are no hubs. Proteins have also been described as single-scale networks with an exponential degree distribution and, again, no hubs [59, 84]. In some cases, the random network is attributed to the backbone interactions while the single-scale network is attributed to the side-chain interactions. A network made of a minimum number of contacts seems rather coherent, as proteins probably minimize the number of links (bonds) per amino acid to reduce the "building" cost in terms of bonds and the sequence stringency.

Simultaneously, proteins are described as small-world networks because they have a small average path length <*l*>. A small <*l*> generally indicates that most nodes, namely amino acids, of the network are within reach of each other. Such node accessibility would suggest that a single modification anywhere in the network (i.e., any mutation in a protein) would easily spread changes through the whole network, a hazardous situation for a protein and in contradiction with the fact that protein folds and functions resist most mutations. Small-world networks generally have hubs, highly connected nodes that govern the network communication routes. But proteins are random or single-scale networks and as such are not expected to have hubs, at least not hubs with many more links than other nodes. The absence of hubs is beneficial, as it reduces the protein's vulnerability to mutation. We have measured <*l*> from 10 to 19 in protein interfaces for networks made of about 300 nodes (unpublished). For comparison, the World Wide Web has a similar <*l*> = 19, but 800 million nodes. Maybe it just happens that some worlds are smaller than others.

It is therefore not so simple to deconvolute the topology of a network with a small average <*l*>, depending on whether it is a random, single-scale or scale-free (power-law degree distribution) network. Theoretical developments aiming at this understanding have been proposed and allow considering the distribution of connected components, the distribution of the clustering coefficient and an approximation of <*l*>. Such work will help in analyzing the network measures obtained for amino acid networks [179].

One problem with networks is the number of interactions and nodes generated to describe a protein network, and how to discriminate a hierarchy within this set of interactions to understand the determinant ones. To this end, one elegant strategy is to experimentally measure kinetics and affinity to prioritize interactions in networks [180]. Such approaches would complement MD simulations and help discriminate the good from the bad.
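The two network measures used in the discussion above, the degree distribution and the average path length <*l*>, can be illustrated with a minimal sketch. It assumes an unweighted, undirected network stored as a dictionary of neighbour sets (one node per amino acid, one link per contact); the function names and the two eight-node toy networks are hypothetical illustrations, not the author's data or software.

```python
from collections import Counter, deque


def degree_distribution(adj):
    """Map each degree k to the number of nodes that have k links."""
    return dict(Counter(len(neighbours) for neighbours in adj.values()))


def average_path_length(adj):
    """Mean shortest-path length <l> over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def ring(n):
    """Hub-free toy network: every node has exactly two links."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def star(n):
    """Toy network with a single hub (node 0) linked to all other nodes."""
    adj = {0: set(range(1, n))}
    for i in range(1, n):
        adj[i] = {0}
    return adj


if __name__ == "__main__":
    for name, graph in (("ring", ring(8)), ("star", star(8))):
        print(name, degree_distribution(graph), round(average_path_length(graph), 2))
```

With eight nodes, the hub-free ring gives <*l*> = 16/7 (about 2.29), whereas the star, whose hub carries seven links, gives <*l*> = 1.75. This is the usual association between hubs and short paths; a real amino acid network would be built from the contacts of a structure rather than from these toy graphs.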

Address all correspondence to: claire.lesieur@agim.eu

AGIM-FRE3405 UJF-CNRS, Grenoble, France

#### **References**


[11] Tacnet P, Cheong EC, Goeltz P, Ghebrehiwet B, Arlaud GJ, Liu XY, et al. Trimeric reassembly of the globular domain of human C1q. Biochim Biophys Acta. 2008 Mar;1784(3):518-29. PubMed PMID: 18179779.

[12] Tuncbag N, Gursoy A, Nussinov R, Keskin O. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM. Nature Protocols. 2011;6(9):1341-54.

[13] Tuncbag N, Kar G, Keskin O, Gursoy A, Nussinov R. A survey of available tools and web servers for analysis of protein–protein interactions and interfaces. Briefings in Bioinformatics. 2009;10(3):217.

[14] Guney E, Tuncbag N, Keskin O, Gursoy A. HotSprint: database of computational hot spots in protein interfaces. Nucleic acids research. 2008;36(suppl 1):D662-D6.

[15] Shoemaker BA, Panchenko AR. Deciphering protein-protein interactions. Part II. Computational methods to predict protein and domain interaction partners. PLoS Comput Biol. 2007 Apr 27;3(4):e43. PubMed PMID: 17465672. Pubmed Central PMCID: 1857810.

[16] Wass MN, David A, Sternberg MJ. Challenges for the prediction of macromolecular interactions. Curr Opin Struct Biol. 2011 Jun;21(3):382-90. PubMed PMID: 21497504.

[17] Juan D, Pazos F, Valencia A. High-confidence prediction of global interactomes based on genome-wide coevolutionary networks. Proc Natl Acad Sci U S A. 2008 Jan 22;105(3):934-9. PubMed PMID: 18199838. Pubmed Central PMCID: 2242690.

[18] Mosca R, Ceol A, Stein A, Olivella R, Aloy P. 3did: a catalog of domain-based interactions of known three-dimensional structure. Nucleic Acids Res. 2014 Jan 1;42(1):D374-9. PubMed PMID: 24081580.

[19] Mosca R, Pons T, Ceol A, Valencia A, Aloy P. Towards a detailed atlas of protein-protein interactions. Curr Opin Struct Biol. 2013 Dec;23(6):929-40. PubMed PMID: 23896349.

[20] Mosca R, Ceol A, Aloy P. Interactome3D: adding structural details to protein networks. Nature methods. 2013 Jan;10(1):47-53. PubMed PMID: 23399932.

[21] Janin J, Bahadur RP, Chakrabarti P. Protein-protein interaction and quaternary structure. Q Rev Biophys. 2008 May;41(2):133-80. PubMed PMID: 18812015.

[22] Valdar WS, Thornton JM. Conservation helps to identify biologically relevant crystal contacts. J Mol Biol. 2001 Oct 19;313(2):399-416. PubMed PMID: 11800565.

[23] Ponstingl H, Henrick K, Thornton JM. Discriminating between homodimeric and monomeric proteins in the crystalline state. Proteins. 2000 Oct 1;41(1):47-57. PubMed PMID: 10944393.

[24] Elcock AH, McCammon JA. Identification of protein oligomerization states by analysis of interface conservation. Proc Natl Acad Sci U S A. 2001 Mar 13;98(6):2990-4. PubMed PMID: 11248019. Pubmed Central PMCID: 30594.

[25] Lai YT, King NP, Yeates TO. Principles for designing ordered protein assemblies. Trends Cell Biol. 2012 Dec;22(12):653-61. PubMed PMID: 22975357.

[26] Woolfson DN, Bartlett GJ, Bruning M, Thomson AR. New currency for old rope: from coiled-coil assemblies to alpha-helical barrels. Curr Opin Struct Biol. 2012 Aug;22(4):432-41. PubMed PMID: 22445228.

[27] Channon K, Bromley EH, Woolfson DN. Synthetic biology through biomolecular design and engineering. Curr Opin Struct Biol. 2008 Aug;18(4):491-8. PubMed PMID: 18644449.

[28] Ringler P, Schulz GE. Self-assembly of proteins into designed networks. Science. 2003 Oct 3;302(5642):106-9. PubMed PMID: 14526081.

[29] King NP, Lai Y-T. Practical approaches to designing novel protein assemblies. Current opinion in structural biology. 2013.

[30] Gursoy A, Keskin O, Nussinov R. Topological properties of protein interaction networks from a structural perspective. Biochemical Society Transactions. 2008;36:1398-403.

[31] Tuncbag N, Gursoy A, Guney E, Nussinov R, Keskin O. Architectures and functional coverage of protein-protein interfaces. J Mol Biol. 2008 Sep 5;381(3):785-802. PubMed PMID: 18620705. Pubmed Central PMCID: 2605427.

[32] Shulman-Peleg A, Shatsky M, Nussinov R, Wolfson HJ. Spatial chemical conservation of hot spot interactions in protein-protein complexes. BMC Biol. 2007;5:43. PubMed PMID: 17925020.

[33] Ma B, Nussinov R. Trp/Met/Phe hot spots in protein-protein interactions: potential targets in drug design. Current topics in medicinal chemistry. 2007;7(10):999-1005.

[34] Jones S, Thornton JM. Principles of protein-protein interactions. Proc Natl Acad Sci U S A. 1996 Jan 9;93(1):13-20. PubMed PMID: 8552589.

[35] Laskowski RA, Thornton JM. Understanding the molecular machinery of genetics through 3D structures. Nature Reviews Genetics. 2008;9(2):141-51.

[36] Toogood PL. Inhibition of protein-protein association by small molecules: approaches and progress. Journal of medicinal chemistry. 2002 Apr 11;45(8):1543-58. PubMed PMID: 11931608.

[37] Gadek TR, Nicholas JB. Small molecule antagonists of proteins. Biochemical pharmacology. 2003 Jan 1;65(1):1-8. PubMed PMID: 12473372.


[38] Lo Conte L, Chothia C, Janin J. The atomic structure of protein-protein recognition sites. J Mol Biol. 1999 Feb 5;285(5):2177-98. PubMed PMID: 9925793.

[39] Chakrabarti P, Janin J. Dissecting protein-protein recognition sites. Proteins. 2002 May 15;47(3):334-43. PubMed PMID: 11948787.

[40] Ofran Y, Rost B. Analysing six types of protein–protein interfaces. Journal of molecular biology. 2003;325(2):377-87.

[41] Bashton M, Chothia C. The geometry of domain combination in proteins. J Mol Biol. 2002 Jan 25;315(4):927-39. PubMed PMID: 11812158.

[42] Xu D, Tsai CJ, Nussinov R. Hydrogen bonds and salt bridges across protein-protein interfaces. Protein Eng. 1997 Sep;10(9):999-1012. PubMed PMID: 9464564.

[43] Selbig J, Argos P. Relationships between protein sequence and structure patterns based on residue contacts. Proteins. 1998 May 1;31(2):172-85. PubMed PMID: 9593191.

[44] Shoemaker BA, Panchenko AR. Deciphering protein-protein interactions. Part I. Experimental techniques and databases. PLoS Comput Biol. 2007 Mar 30;3(3):e42. PubMed PMID: 17397251. Pubmed Central PMCID: 1847991.

[45] de Vries SJ, Bonvin AM. How proteins get in touch: interface prediction in the study of biomolecular complexes. Curr Protein Pept Sci. 2008 Aug;9(4):394-406. PubMed PMID: 18691126.

[46] Guex N, Peitsch MC. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997 Dec;18(15):2714-23. PubMed PMID: 9504803.

[47] Shulman-Peleg A, Shatsky M, Nussinov R, Wolfson HJ. MultiBind and MAPPIS: webservers for multiple alignment of protein 3D-binding sites and their interactions. Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W260-4. PubMed PMID:

[48] Orchard S, Ammari M, Aranda B, Breuza L, Briganti L, Broackes-Carter F, et al. The MIntAct project--IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res. 2014 Jan 1;42(1):D358-63. PubMed PMID: 24234451.

[49] Chothia C, Janin J. Principles of protein-protein recognition. Nature. 1975 Aug 28;256(5520):705-8. PubMed PMID: 1153006.

[50] Henrick K, Thornton JM. PQS: a protein quaternary structure file server. Trends Biochem Sci. 1998 Sep;23(9):358-61. PubMed PMID: 9787643.

[51] Janin J, Rodier F. Protein-protein interaction at crystal contacts. Proteins. 1995 Dec;23(4):580-7. PubMed PMID: 8749854.

[52] Lee B, Richards FM. The interpretation of protein structures: estimation of static accessibility. J Mol Biol. 1971 Feb 14;55(3):379-400. PubMed PMID: 5551392.

[53] Poupon A. Voronoi and Voronoi-related tessellations in studies of protein structure and interaction. Curr Opin Struct Biol. 2004 Apr;14(2):233-41. PubMed PMID: 15093839.

[54] Cazals F, Proust F, Bahadur RP, Janin J. Revisiting the Voronoi description of protein-protein interfaces. Protein Sci. 2006 Sep;15(9):2082-92. PubMed PMID: 16943442.

[55] Bouvier B, Grunberg R, Nilges M, Cazals F. Shelling the Voronoi interface of protein-protein complexes reveals patterns of residue conservation, dynamics, and composition. Proteins. 2009 Aug 15;76(3):677-92. PubMed PMID: 19280599.

[56] Dreyfus T, Doye V, Cazals F. Probing a continuum of macro-molecular assembly models with graph templates of complexes. Proteins. 2013 Nov;81(11):2034-44. PubMed PMID: 23609891.

[57] Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie. 2008;90:626-39.

[58] Ofran Y, Rost B. Protein–protein interaction hotspots carved into sequences. PLoS computational biology. 2007;3(7):e119.

[59] Feverati G, Achoch M, Vuillon L, Lesieur C. Intermolecular β-Strand Networks Avoid Hub Residues and Favor Low Interconnectedness: A Potential Protection Mechanism against Chain Dissociation upon Mutation. PloS one. 2014;9(4):e94745.

[60] Feverati G, Lesieur C. Oligomeric interfaces under the lens: gemini. PloS one. 2010;5(3):e9897.

[61] Caffrey DR, Somaroo S, Hughes JD, Mintseris J, Huang ES. Are protein-protein interfaces more conserved in sequence than the rest of the protein surface? Protein Sci. 2004 Jan;13(1):190-202. PubMed PMID: 14691234.

[62] Lichtarge O, Bourne HR, Cohen FE. An evolutionary trace method defines binding surfaces common to protein families. J Mol Biol. 1996 Mar 29;257(2):342-58. PubMed PMID: 8609628.

[63] Armon A, Graur D, Ben-Tal N. ConSurf: an algorithmic tool for the identification of functional regions in proteins by surface mapping of phylogenetic information. J Mol Biol. 2001 Mar 16;307(1):447-63. PubMed PMID: 11243830.

[64] Pupko T, Bell RE, Mayrose I, Glaser F, Ben-Tal N. Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics. 2002;18 Suppl 1:S71-7. PubMed PMID: 12169533.


[65] Landgraf R, Xenarios I, Eisenberg D. Three-dimensional cluster analysis identifies interfaces and functional residue clusters in proteins. J Mol Biol. 2001 Apr 13;307(5):1487-502. PubMed PMID: 11292355.

[66] Clackson T, Wells JA. A hot spot of binding energy in a hormone-receptor interface. Science. 1995;267(5196):383-6.

[67] Bogan AA, Thorn KS. Anatomy of hot spots in protein interfaces. J Mol Biol. 1998 Jul 3;280(1):1-9. PubMed PMID: 9653027.

[68] Fischer TB, Arunachalam KV, Bailey D, Mangual V, Bakhru S, Russo R, et al. The binding interface database (BID): a compilation of amino acid hot spots in protein interfaces. Bioinformatics. 2003 Jul 22;19(11):1453-4. PubMed PMID: 12874065.

[69] Guidry JJ, Shewmaker F, Maskos K, Landry S, Wittung-Stafshede P. Probing the interface in a human co-chaperonin heptamer: residues disrupting oligomeric unfolded state identified. BMC Biochem. 2003 Oct 2;4:14. PubMed PMID: 14525625.

[70] Luke K, Perham M, Wittung-Stafshede P. Kinetic folding and assembly mechanisms differ for two homologous heptamers. J Mol Biol. 2006 Oct 27;363(3):729-42. PubMed PMID: 16979655.

[71] Barabasi AL, Oltvai ZN. Network biology: understanding the cell's functional organization. Nature reviews Genetics. 2004 Feb;5(2):101-13. PubMed PMID: 14735121.

[72] Bhattacharyya M, Vishveshwara S. Probing the allosteric mechanism in pyrrolysyl-tRNA synthetase using energy-weighted network formalism. Biochemistry. 2011 Jul 19;50(28):6225-36. PubMed PMID: 21650159.

[73] De Ruvo M, Giuliani A, Paci P, Santoni D, Di Paola L. Shedding light on protein-ligand binding by graph theory: the topological nature of allostery. Biophys Chem. 2012 May;165-166:21-9. PubMed PMID: 22464849.

[74] Daily MD, Gray JJ. Allosteric communication occurs via networks of tertiary and quaternary motions in proteins. PLoS Comput Biol. 2009 Feb;5(2):e1000293. PubMed PMID: 19229311. Pubmed Central PMCID: 2634971.

[75] Tsai CJ, Del Sol A, Nussinov R. Protein allostery, signal transmission and dynamics: a classification scheme of allosteric mechanisms. Molecular bioSystems. 2009 Mar;5(3):207-16. PubMed PMID: 19225609. Pubmed Central PMCID: 2898650.

[76] Gunasekaran K, Ma B, Nussinov R. Is allostery an intrinsic property of all dynamic proteins? Proteins: Structure, Function, and Bioinformatics. 2004;57(3):433-43.

[77] del Sol A, Tsai CJ, Ma B, Nussinov R. The origin of allosteric functional modulation: multiple pre-existing pathways. Structure. 2009 Aug 12;17(8):1042-50. PubMed PMID: 19679084. Pubmed Central PMCID: 2749652.

[78] Amitai G, Shemesh A, Sitbon E, Shklar M, Netanely D, Venger I, et al. Network analysis of protein structures identifies functional residues. J Mol Biol. 2004 Dec 3;344(4):1135-46. PubMed PMID: 15544817.

[79] Vendruscolo M, Dokholyan NV, Paci E, Karplus M. Small-world view of the amino acids that play a key role in protein folding. Phys Rev E Stat Nonlin Soft Matter Phys. 2002 Jun;65(6 Pt 1):061910. PubMed PMID: 12188762.

[80] Dokholyan NV, Li L, Ding F, Shakhnovich EI. Topological determinants of protein folding. Proc Natl Acad Sci U S A. 2002 Jun 25;99(13):8637-41. PubMed PMID: 12084924. Pubmed Central PMCID: 124342.

[81] del Sol A, O'Meara P. Small-world network approach to identify key residues in protein-protein interaction. Proteins. 2005 Feb 15;58(3):672-82. PubMed PMID: 15617065.

[82] Brinda KV, Kannan N, Vishveshwara S. Analysis of homodimeric protein interfaces by graph-spectral methods. Protein Eng. 2002 Apr;15(4):265-77. PubMed PMID:

[83] Albert R, Jeong H, Barabasi AL. Error and attack tolerance of complex networks. Nature. 2000 Jul 27;406(6794):378-82. PubMed PMID: 10935628.

[84] Greene LH, Higman VA. Uncovering network systems within protein structures. J Mol Biol. 2003 Dec 5;334(4):781-91. PubMed PMID: 14636602.

[85] Brinda KV, Vishveshwara S. Oligomeric protein structure networks: insights into protein-protein interactions. BMC Bioinformatics. 2005;6:296. PubMed PMID: 16336694. Pubmed Central PMCID: 1326230.

[86] Dey S, Pal A, Chakrabarti P, Janin J. The subunit interfaces of weakly associated homodimeric proteins. J Mol Biol. 2010 Apr 23;398(1):146-60. PubMed PMID: 20156457.

[87] Talavera D, Robertson DL, Lovell SC. Characterization of protein-protein interaction interfaces from a single species. PloS one. 2011;6(6):e21053.

[88] Kim WK, Henschel A, Winter C, Schroeder M. The many faces of protein-protein interactions: A compendium of interface geometry. PLoS Comput Biol. 2006 Sep 29;2(9):e124. PubMed PMID: 17009862.

[89] Winter C, Henschel A, Kim WK, Schroeder M. SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res. 2006 Jan 1;34(Database issue):D310-4. PubMed PMID: 16381874.

[90] Keskin O, Tsai CJ, Wolfson H, Nussinov R. A new, structurally nonredundant, diverse data set of protein–protein interfaces and its implications. Protein Science. 2004;13(4):1043-55.

[91] Neuvirth H, Heinemann U, Birnbaum D, Tishby N, Schreiber G. ProMateus--an open research approach to protein-binding sites analysis. Nucleic Acids Res. 2007 Jul;35(Web Server issue):W543-8. PubMed PMID: 17488838. Pubmed Central PMCID: 1933218.

[92] Yan C, Wu F, Jernigan RL, Dobbs D, Honavar V. Characterization of protein-protein interfaces. Protein J. 2008 Jan;27(1):59-70. PubMed PMID: 17851740.

[93] Yan C, Dobbs D, Honavar V. A two-stage classifier for identification of protein-protein interface residues. Bioinformatics. 2004 Aug 4;20 Suppl 1:i371-8. PubMed PMID: 15262822.

[94] Cheng P-N, Pham JD, Nowick JS. The Supramolecular Chemistry of β-Sheets. Journal of the American Chemical Society. 2013.

[95] Khakshoor O, Nowick JS. Artificial beta-sheets: chemical models of beta-sheets. Curr Opin Chem Biol. 2008 Dec;12(6):722-9. PubMed PMID: 18775794.

[96] López De La Paz M, Serrano L. Sequence determinants of amyloid fibril formation. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(1):87.

[97] Lopez De La Paz M, Goldie K, Zurdo J, Lacroix E, Dobson CM, Hoenger A, et al. De novo designed peptide-based amyloid fibrils. Proc Natl Acad Sci U S A. 2002 Dec 10;99(25):16052-7. PubMed PMID: 12456886.

[98] Trovato A, Chiti F, Maritan A, Seno F. Insight into the structure of amyloid fibrils from the analysis of globular proteins. PLoS computational biology. 2006;2(12):e170.

[99] Kayed R, Head E, Thompson JL, McIntire TM, Milton SC, Cotman CW, et al. Common structure of soluble amyloid oligomers implies common mechanism of pathogenesis. Science. 2003;300(5618):486-9.

[100] Guijarro JI, Sunde M, Jones JA, Campbell ID, Dobson CM. Amyloid fibril formation by an SH3 domain. Proc Natl Acad Sci U S A. 1998 Apr 14;95(8):4224-8. PubMed PMID: 9539718. Pubmed Central PMCID: 22470.

[101] Dobson CM. Protein misfolding, evolution and disease. Trends Biochem Sci. 1999;24:329-32.

[102] Petkova AT, Ishii Y, Balbach JJ, Antzutkin ON, Leapman RD, Delaglio F, et al. A structural model for Alzheimer's beta-amyloid fibrils based on experimental constraints from solid state NMR. Proc Natl Acad Sci U S A. 2002 Dec 24;99(26):16742-7. PubMed PMID: 12481027. Pubmed Central PMCID: 139214.

[103] Der-Sarkissian A, Jao CC, Chen J, Langen R. Structural organization of alpha-synuclein fibrils studied by site-directed spin labeling. J Biol Chem. 2003 Sep 26;278(39):37530-5. PubMed PMID: 12815044.

[104] Kajava AV, Aebi U, Steven AC. The parallel superpleated beta-structure as a model for amyloid fibrils of human amylin. J Mol Biol. 2005 Apr 29;348(2):247-52. PubMed PMID: 15811365.

[105] Krishnan R, Lindquist SL. Structural insights into a yeast prion illuminate nucleation and strain diversity. Nature. 2005 Jun 9;435(7043):765-72. PubMed PMID: 15944694. Pubmed Central PMCID: 1405905.

[106] Margittai M, Langen R. Template-assisted filament growth by parallel stacking of tau. Proc Natl Acad Sci U S A. 2004 Jul 13;101(28):10278-83. PubMed PMID: 15240881. Pubmed Central PMCID: 478563.

[107] Lv G, Kumar A, Giller K, Orcellet ML, Riedel D, Fernandez CO, et al. Structural comparison of mouse and human alpha-synuclein amyloid fibrils by solid-state NMR. J Mol Biol. 2012 Jun 29;420(1-2):99-111. PubMed PMID: 22516611.

[108] Fernandez-Escamilla A-M, Rousseau F, Schymkowitz J, Serrano L. Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nature Biotechnology. 2004;22(10):1302-6.

[109] Galzitskaya OV, Garbuzynskiy SO, Lobanov MY. Prediction of amyloidogenic and disordered regions in protein chains. PLoS computational biology. 2006;2(12):e177.

[110] Thompson MJ, Sievers SA, Karanicolas J, Ivanova MI, Baker D, Eisenberg D. The 3D profile method for identifying fibril-forming segments of proteins. Proceedings of the National Academy of Sciences of the United States of America. 2006;103(11):4074-8.

[111] Belli M, Ramazzotti M, Chiti F. Prediction of amyloid aggregation in vivo. EMBO Rep. 2011 Jul;12(7):657-63. PubMed PMID: 21681200. Pubmed Central PMCID:

[112] Smith JM, Jang Y, Kim MK. Steiner minimal trees, twist angles, and the protein folding problem. Proteins: Structure, Function, and Bioinformatics. 2007;66(4):889-902.

[113] Levin KB, Dym O, Albeck S, Magdassi S, Keeble AH, Kleanthous C, et al. Following evolutionary paths to protein-protein interactions with high affinity and selectivity. Nature structural & molecular biology. 2009;16(10):1049-55.

[114] Caflisch A. Network and graph analyses of folding free energy surfaces. Curr Opin Struct Biol. 2006 Feb;16(1):71-8. PubMed PMID: 16413772.

[115] Bahar I, Chennubhotla C, Tobi D. Intrinsic dynamics of enzymes in the unbound state and relation to allosteric regulation. Curr Opin Struct Biol. 2007 Dec;17(6):633-40. PubMed PMID: 18024008. Pubmed Central PMCID: 2197162.

[116] Chennubhotla C, Bahar I. Signal propagation in proteins and relation to equilibrium fluctuations. PLoS Comput Biol. 2007 Sep;3(9):1716-26. PubMed PMID: 17892319. Pubmed Central PMCID: 1988854.

[117] Jin Y, Turaev D, Weinmaier T, Rattei T, Makse HA. The evolutionary dynamics of protein-protein interaction networks inferred from the reconstruction of ancient networks. PLoS One. 2013;8(3):e58134. PubMed PMID: 23526967. Pubmed Central PMCID: 3603955.


[118] Liu YY, Slotine JJ, Barabasi AL. Controllability of complex networks. Nature. 2011;473:167-73.

[119] Albert R, Barabasi AL. Topology of evolving networks: local events and universality. Physical review letters. 2000 Dec 11;85(24):5234-7. PubMed PMID: 11102229.

[120] Callaway DS, Newman ME, Strogatz SH, Watts DJ. Network robustness and fragility: percolation on random graphs. Physical review letters. 2000 Dec 18;85(25):5468-71. PubMed PMID: 11136023.

[121] Lupas A, Van Dyke M, Stock J. Predicting coiled coils from protein sequences. Science. 1991 May 24;252(5010):1162-4. PubMed PMID: 2031185.

[122] Gruber M, Soding J, Lupas AN. Comparative analysis of coiled-coil prediction methods. J Struct Biol. 2006 Aug;155(2):140-5. PubMed PMID: 16870472.

[123] Bartoli L, Fariselli P, Krogh A, Casadio R. CCHMM\_PROF: a HMM-based coiled-coil predictor with evolutionary information. Bioinformatics. 2009 Nov 1;25(21):2757-63. PubMed PMID: 19744995.

[124] Wolf E, Kim PS, Berger B. MultiCoil: a program for predicting two- and three-stranded coiled coils. Protein Sci. 1997 Jun;6(6):1179-89. PubMed PMID: 9194178. Pubmed Central PMCID: 2143730.

[125] Crick FHC. The packing of alpha-helices: simple coiled-coils. Acta Crystallogr. 1953;6:689–97.

[126] Poupon A, Janin J. Analysis and prediction of protein quaternary structure. Methods Mol Biol. 2010;609:349-64. PubMed PMID: 20221929.

[127] Comeau SR, Camacho CJ. Predicting oligomeric assemblies: N-mers a primer. J Struct Biol. 2005 Jun;150(3):233-44. PubMed PMID: 15890272.

[128] Walshaw J, Woolfson DN. Extended knobs-into-holes packing in classical and complex coiled-coil assemblies. J Struct Biol. 2003 Dec;144(3):349-61. PubMed PMID: 14643203.

[129] Calladine CR, Luisi BF, Pratap JV. A "mechanistic" explanation of the multiple helical forms adopted by bacterial flagellar filaments. J Mol Biol. 2013 Mar 11;425(5):914-28. PubMed PMID: 23274110. Pubmed Central PMCID: 3605589.

[130] Calladine CR, Sharff A, Luisi B. How to untwist an alpha-helix: structural principles of an alpha-helical barrel. J Mol Biol. 2001 Jan 19;305(3):603-18. PubMed PMID:

[131] Moutevelis E, Woolfson DN. A periodic table of coiled-coil protein structures. J Mol Biol. 2009 Jan 23;385(3):726-32. PubMed PMID: 19059267.

[132] Testa OD, Moutevelis E, Woolfson DN. CC+: a relational database of coiled-coil structures. Nucleic Acids Res. 2009 Jan;37(Database issue):D315-22. PubMed PMID: 18842638.

[133] Papanikolopoulou K, Forge V, Goeltz P, Mitraki A. Formation of highly stable chimeric trimers by fusion of an adenovirus fiber shaft fragment with the foldon domain of bacteriophage t4 fibritin. J Biol Chem. 2004b Mar 5;279(10):8991-8. PubMed PMID: 14699113.

[134] Zrimi J, Ng Ling A, Giri-Rachman Arifin E, Feverati G, Lesieur C. Cholera toxin B subunits assemble into pentamers-proposition of a fly-casting mechanism. PLoS One. 2010;5(12):e15347. PubMed PMID: 21203571.

[135] Ruddock LW, Coen JJ, Cheesman C, Freedman RB, Hirst TR. Assembly of the B subunit pentamer of Escherichia coli heat-labile enterotoxin. Kinetics and molecular basis of rate-limiting steps in vitro. J Biol Chem. 1996b Aug 9;271(32):19118-23. PubMed PMID: 8702586.

[136] Dang LT, Purvis AR, Huang RH, Westfield LA, Sadler JE. Phylogenetic and functional analysis of histidine residues essential for pH-dependent multimerization of von Willebrand factor. Journal of Biological Chemistry. 2011.

[137] Hashimoto K, Nishi H, Bryant S, Panchenko AR. Caught in self-interaction: evolutionary and functional mechanisms of protein homooligomerization. Phys Biol. 2011 Jun;8(3):035007. PubMed PMID: 21572178. Pubmed Central PMCID: 3148176.

[138] Csermely P, Palotai R, Nussinov R. Induced fit, conformational selection and independent dynamic segments: an extended view of binding events. Trends Biochem Sci. 2010 Oct;35(10):539-46. PubMed PMID: 20541943. Pubmed Central PMCID: 3018770.

[139] D'Alessio G. The evolutionary transition from monomeric to oligomeric proteins: tools, the environment, hypotheses. Prog Biophys Mol Biol. 1999;72(3):271-98. PubMed PMID: 10581971.

[140] D'Alessio G. Oligomer evolution in action? Nat Struct Biol. 1995 Jan;2(1):11-3. PubMed PMID: 7719846.

[141] Eisenberg D, Jucker M. The Amyloid State of Proteins in Human Diseases. Cell. 2012;148(6):1188-203.

[142] Levy ED, Erba EB, Robinson CV, Teichmann SA. Assembly reflects evolution of protein complexes. Nature. 2008;453(7199):1262-5.

[143] Luo M, Singh RK, Tanner JJ. Structural determinants of oligomerization of delta(1) pyrroline-5-carboxylate dehydrogenase: identification of a hexamerization hot spot. J Mol Biol. 2013 Sep 9;425(17):3106-20. PubMed PMID: 23747974. Pubmed Central PMCID: 3743950.


[158] Kentsis A, Borden KL. Construction of macromolecular assemblages in eukaryotic processes and their role in human disease: linking RINGs together. Curr Protein Pept Sci. 2000 Jul;1(1):49-73. PubMed PMID: 12369920.

[159] Pereira-Leal JB, Levy ED, Teichmann SA. The origins and evolution of functional modules: lessons from protein complexes. Philos Trans R Soc Lond B Biol Sci. 2006 Mar 29;361(1467):507-17. PubMed PMID: 16524839. Pubmed Central PMCID:

[160] Shoemaker BA, Portman JJ, Wolynes PG. Speeding molecular recognition by using the folding funnel: the fly-casting mechanism. Proc Natl Acad Sci U S A. 2000 Aug 1;97(16):8868-73. PubMed PMID: 10908673.

[161] Levy Y, Wolynes PG, Onuchic JN. Protein topology determines binding mechanism. Proc Natl Acad Sci U S A. 2004 Jan 13;101(2):511-6. PubMed PMID: 14694192.

[162] Spaar A, Dammer C, Gabdoulline RR, Wade RC, Helms V. Diffusional encounter of barnase and barstar. Biophys J. 2006 Mar 15;90(6):1913-24. PubMed PMID: 16361332.

[163] Ehrlich LP, Nilges M, Wade RC. The impact of protein flexibility on protein-protein docking. Proteins. 2005 Jan 1;58(1):126-33. PubMed PMID: 15515181.

[164] Gabdoulline RR, Wade RC. Protein-protein association: investigation of factors influencing association rates by brownian dynamics simulations. J Mol Biol. 2001 Mar 9;306(5):1139-55. PubMed PMID: 11237623.

[165] Gabdoulline RR, Wade RC. Simulation of the diffusional association of barnase and barstar. Biophys J. 1997 May;72(5):1917-29. PubMed PMID: 9129797.

[166] Noe F, Fischer S. Transition networks for modeling the kinetics of conformational change in macromolecules. Curr Opin Struct Biol. 2008 Apr;18(2):154-62. PubMed PMID: 18378442.

[167] Wales DJ. Energy landscapes: some new horizons. Curr Opin Struct Biol. 2010 Feb;20(1):3-10. PubMed PMID: 20096562.

[168] Barz B, Wales DJ, Strodel B. A kinetic approach to the sequence-aggregation relationship in disease-related protein assembly. J Phys Chem B. 2014 Jan 30;118(4):1003-11. PubMed PMID: 24401100. Pubmed Central PMCID: 3908877.

[169] Higashimoto Y, Asanomi Y, Takakusagi S, Lewis MS, Uosaki K, Durell SR, et al. Unfolding, aggregation, and amyloid formation by the tetramerization domain from mutant p53 associated with lung cancer. Biochemistry. 2006;45(6):1608-19.

[170] Fujita T, Kiyama M, Tomizawa Y, Kohno T, Yokota J. Comprehensive analysis of p53 gene mutation characteristics in lung carcinoma with special reference to histological subtypes. International journal of oncology. 1999;15(5):927-34.

[144] Crick FH, Watson JD. Structure of small viruses. Nature. 1956 Mar 10;177(4506):473-5. PubMed PMID: 13309339.

[145] Rackham OJ, Madera M, Armstrong CT, Vincent TL, Woolfson DN, Gough J. The evolution and structure prediction of coiled coils across all genomes. J Mol Biol. 2010

[146] Yadid I, Kirshenbaum N, Sharon M, Dym O, Tawfik DS. Metamorphic proteins me‐ diate evolutionary transitions of structure. Proc Natl Acad Sci U S A. 2010 Apr 20;107(16):7287-92. PubMed PMID: 20368465. Pubmed Central PMCID: 2867682. [147] King J, Wood WB. Assembly of bacteriophage T4 tail fibers: the sequence of gene product interaction. J Mol Biol. 1969 Feb 14;39(3):583-601. PubMed PMID: 5390559.

[148] King J. Assembly of the tail of bacteriophage T4. J Mol Biol. 1968 Mar 14;32(2):231-62.

[149] Rennell D, Bouvier SE, Hardy LW, Poteete AR. Systematic mutation of bacteriophage T4 lysozyme. J Mol Biol. 1991 Nov 5;222(1):67-88. PubMed PMID: 1942069.

[150] Goldenberg DP, Berget PB, King J. Maturation of the tail spike endorhamnosidase of Salmonella phage P22. J Biol Chem. 1982 Jul 10;257(13):7864-71. PubMed PMID:

[151] Perham M, Chen M, Ma J, Wittung-Stafshede P. Unfolding of heptameric co-chapero‐ nin protein follows "fly casting" mechanism: observation of transient nonnative hep‐ tamer. J Am Chem Soc. 2005 Nov 30;127(47):16402-3. PubMed PMID: 16305220. [152] Bascos N, Guidry J, Wittung-Stafshede P. Monomer topology defines folding speed of heptamer. Protein Sci. 2004 May;13(5):1317-21. PubMed PMID: 15075408.

[153] Tacnet P, Thielens N, Arifin Giri Rachman E, Hirst TR, Lesieur C. Cholera toxin B as‐ sembly intermediates provide some explanation to the existence of a pentameric tox‐

[154] Pell LG, Cumby N, Clark TE, Tuite A, Battaile KP, Edwards AM, et al. A conserved spiral structure for highly diverged phage tail assembly chaperones. J Mol Biol. 2013

[155] Aghera N, Udgaonkar JB. Kinetic studies of the folding of heterodimeric monellin: evidence for switching between alternative parallel pathways. J Mol Biol. 2012 Jul

[156] Kentsis A, Gordon RE, Borden KL. Control of biochemical reactions through supra‐ molecular RING domain self-assembly. Proc Natl Acad Sci U S A. 2002 Nov

26;99(24):15404-9. PubMed PMID: 12438698. Pubmed Central PMCID: 137729. [157] Kentsis A, Gordon RE, Borden KL. Self-assembly properties of a model RING do‐ main. Proc Natl Acad Sci U S A. 2002 Jan 22;99(2):667-72. PubMed PMID: 11792829.

Jul 24;425(14):2436-49. PubMed PMID: 23542344.

13;420(3):235-50. PubMed PMID: 22542529.

Pubmed Central PMCID: 117363.

473-5. PubMed PMID: 13309339.

362 Oligomerization of Chemical and Biological Compounds

PubMed PMID: 4868421.

7045114.

in state.

Oct 29;403(3):480-93. PubMed PMID: 20813113.


[171] Mateu MG, Fersht AR. Nine hydrophobic side chains are key determinants of the thermodynamic stability and oligomerization status of tumour suppressor p53 tetra‐ merization domain. The EMBO Journal. 1998;17(10):2748-58.

**Chapter 12**


© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



#### **Geometry and Topology in Protein Interfaces – Some Tools for Investigations**

Giovanni Feverati


Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58420

#### **1. Introduction**

The present work is motivated by the biological problem of understanding, and possibly predicting, the assembly of biological molecules, in particular proteins. This is one of the most common processes in living cells, so it is essential to understand its key aspects, especially given its implication in several pathologies, from bacterial infections (cholera, anthrax, ...) to protein misfolding diseases (Alzheimer, Parkinson, ...) [1–4]. The stable association of different subunits requires the formation of specific intermolecular bonds, which together constitute what is called an interface. Unfortunately, in spite of extensive analyses, identifying the patterns in the polypeptide chain that are responsible for the establishment of an interface remains difficult.

Geometry has developed the ability to measure and characterize complex shapes, but it is not a priori obvious that it may also reveal important aspects of the interactions. To understand this point, let us consider the following examples.

The main geometrical elements of the modern Golden Gate Bridge in San Francisco and of the ancient Roman aqueduct bridge Pont du Gard, in the Gard department in France, are arcs, but with one main difference: the three arcs that form the suspension system of the Golden Gate Bridge are concave upward, while the many arcs that form the Pont du Gard are concave downward. Indeed, in the first case the arcs resist longitudinal tension, while in the second case they resist longitudinal compression. Stones are unsuited to resisting strong tension, while they resist huge compression perfectly well. Thus, architectural elements that have to undergo strong tension are made of wood or steel, but not of stone. Notice that the simple observation of the geometrical form, the concave upward or downward aspect of the bridges, has led us to understand the basic interactions and to formulate constraints on the possible choices of materials. This argument can be pushed much further: structural analysis in architecture and engineering largely relies on Euclidean geometry (diagrams of forces are diagrams of vectors). Even if the elastic properties of construction materials and the action of gravity are important, the main ingredient in studying the equilibrium of forces is geometry.


A second example is taken from Einstein's general relativity, where the notion of gravitational interaction and the space-time geometry are fully identified by the equivalence principle and Einstein's field equations. The equivalence principle was first formulated by A. Einstein in 1907, when he recognized that the local behaviour of falling bodies is equivalent to the effect of being in an accelerated reference system (this holds for local effects only). Einstein's field equations (1915) provide a mathematical formulation of the principle. In summary, spatial and temporal distances carry full information on the gravitational interaction.


This connection between geometry and interactions also works at the atomic scale. The perfectly planar and hexagonally symmetric form of the benzene molecule, compared to the non-planar and less symmetric cyclohexane molecule, is a clear indication of the different nature of the corresponding carbon-carbon bonds and of the sp2 or sp3 hybridization, respectively.

In the early 1950s, F. Crick [5] observed that the formation of the coiled-coil protein interface is due to the appropriate geometrical and chemical complementarity of the two interacting domains, as in a lock and key mechanism. The key has a particular geometrical form combined with some contact points, which together give it the capacity to associate with one lock.

Moving on from all these examples, in our group we have developed tools to investigate the geometry of the interfaces in multi-chain proteins. In essence, we measure shapes and compare measures between different proteins. Our input data are protein atomic positions, such as those from the Protein Data Bank (PDB) repository of protein structural data.

In particular, we compare protein interfaces of similar geometrical form. Given the previous examples, it is no surprise that the geometry provides information on the interactions at the interfaces. In our previous publications [6–8], careful statistical analyses have been performed and have led us to formulate constraints on the amino acid sequences and atom pairs that are compatible with a given geometrical form. The long term perspective of our work is to rationalize the interface, namely to establish a clear understanding of its sequence-structure relationship, in order to develop interface prediction tools and to help advance interface design.

#### **1.1. Basic information on interfaces**

The shape and the function of proteins are normally encoded within their sequences, i.e. in their amino acid compositions, but it is not yet possible, by simply reading the primary sequence of a protein, to predict its three-dimensional structure or, in the case of an oligomeric protein, its quaternary organization. One of the difficulties is the nonlinear encoding of the information in the sequence, namely the fact that the three-dimensional structure is often generated and stabilized by bonds between residues that are not contiguous, or are even very far apart, along the chain. Another difficulty is the degeneracy between sequences and structures: several sequences can code for the same shape, which indicates a versatile role of the amino acids. The secondary structures of proteins, which are mainly composed of *α* helices, *β* structures and loops, are partially understood, and several prediction programs are now available. Prediction of 3D structures is mainly based on homology, namely comparison of sequences that have similar three-dimensional elements. A rich collection of prediction tools is available at [9]. In some cases, the prediction also takes into account the known geometrical constraints present in amino acids.


Oligomeric proteins associate by forming an interface. Various descriptions of the interface have been proposed.

The simplest definition of the interface between two adjacent polypeptide chains *A* and *B* is provided by selecting the set of pairs of atoms, one from each chain, whose distance is lower than a given cut-off, typically fixed near 0.5 nm (cut-off interface). This definition does not provide any measure to distinguish pairs. As such, it provides little information: firstly, physical interactions decrease as distance grows; secondly, two interactions of equal strength may not play the same role if they are in different parts of the molecules, inserted in different local atomic environments. In another definition, the interface is identified with the surface buried between the two components *A* and *B*, namely with those atoms that belong to the surface of *A* and *B* and that lose solvent accessibility once the complex *AB* is formed [10]. This makes use of the Van der Waals atomic radii, and leads one to distinguish a rim (exposed to the solvent) from a core (inaccessible to the solvent). The interface can also be defined by constructing the Voronoi *α*-complex [10], namely the set of Voronoi restricted balls. The construction follows a precise mathematical procedure, and determines the volume in which an atom interacts more than its neighbours.
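As an illustration of the first definition only, the cut-off interface can be sketched in a few lines of Python. This is a minimal sketch, not the implementation used in the cited works: the function name `cutoff_interface`, the representation of atoms as bare coordinate triples and the toy coordinates are assumptions made for the example. Coordinates are taken in nm here; PDB files store them in ångström, so a 0.5 nm cut-off would correspond to 5 Å.

```python
from math import dist  # Euclidean distance, available from Python 3.8

def cutoff_interface(atoms_a, atoms_b, cutoff_nm=0.5):
    """Return (i, j, distance) for every pair of atoms, one from each
    chain, whose distance is strictly below the cut-off."""
    pairs = []
    for i, a in enumerate(atoms_a):
        for j, b in enumerate(atoms_b):
            d = dist(a, b)
            if d < cutoff_nm:
                pairs.append((i, j, d))
    return pairs

# Toy coordinates in nm (hypothetical, not taken from any PDB entry).
chain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
chain_b = [(0.3, 0.0, 0.0), (0.0, 0.4, 0.0), (2.0, 0.0, 0.0)]

interface = cutoff_interface(chain_a, chain_b)
# Only the first atom of chain A has partners within 0.5 nm.
print([(i, j) for i, j, _ in interface])
```

The output is an unordered bag of pairs: nothing in it distinguishes a pair of reciprocal nearest neighbours from a pair that merely happens to fall inside the cut-off, which is the limitation noted in the text.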

Both the buried surface and the Voronoi restricted balls methods focus on the volume of the atoms and on the importance of the specific chemical properties of each atom. They make use of a cut-off and describe the interactions by using the Van der Waals radius. In contrast to these descriptions of the interface, we felt the need to develop a more detailed analysis of the structural organization of a protein interface, in order to evaluate the specific role of each residue and the rules of pairing.

In [11], we showed that many aspects of the structural organization of a protein interface can be effectively described by a graph, namely an ensemble of nodes and edges, constructed following a precise geometrical analysis of the three-dimensional structure of the interface, known as symmetrization or symmetric minimization of distances. The algorithm and the graph theory terms are described in Methods. The graph describes how the different atoms are connected. In fact, among its edges one recognizes the known hydrogen bonds present at the interface, which are obtained as a bonus, because the symmetric minimization does not make use of them (see Methods). An example of an interaction graph is given in Figure 1.
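To make the notion of an interaction graph more concrete, the following Python sketch groups a list of inter-subunit atom pairs into residue-residue edges and labels each edge as BB (both atoms from the backbone) or SC (at least one atom from a side chain), following the convention of the Figure 1 caption. It is only an illustration under stated assumptions: the function name `residue_graph`, the tuple layout of the atoms, the use of the PDB atom names N, CA, C, O for the backbone, and the residue numbers in the toy input are all hypothetical and are not taken from the 1EEI structure or from the cited works.

```python
from collections import defaultdict

# Assumed PDB atom names for the backbone atoms N, C-alpha, C and O.
BACKBONE = {"N", "CA", "C", "O"}

def residue_graph(atom_pairs):
    """Group atom pairs into residue-residue edges.

    Each atom is ((chain, residue_number), atom_name). The result maps
    a pair of residues to the set of edge labels found between them.
    """
    edges = defaultdict(set)
    for (res_1, name_1), (res_2, name_2) in atom_pairs:
        both_backbone = name_1 in BACKBONE and name_2 in BACKBONE
        edges[(res_1, res_2)].add("BB" if both_backbone else "SC")
    return dict(edges)

# Hypothetical atom pairs between sub-units D and E.
atom_pairs = [
    ((("D", 31), "N"), (("E", 97), "O")),   # backbone-backbone
    ((("D", 31), "O"), (("E", 97), "N")),   # backbone-backbone
    ((("D", 33), "OG"), (("E", 95), "O")),  # side chain involved
]

graph = residue_graph(atom_pairs)
for (res_1, res_2), labels in graph.items():
    print(res_1, res_2, sorted(labels))
```

Keeping the labels as a set per residue pair allows the same two residues to be connected in both the BB and the SC graph, which a single label per edge would not permit.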

Statistical analyses have been performed on the case of *β* interfaces, which are formed by two adjacent *β* strands, one from each subunit [6–8]. In [8] the analysis was extended to a dataset of 755 proteins. It is known that there are three possible orientations of the adjacent *β* strands: they can be anti-parallel (by far the most common case), parallel or oblique. The latter actually includes all the cases that do not fall into the previous two, for example perpendicular *β* strands. The most significant results of these statistical analyses are summarized here (please refer to the Methods for the precise definition of the motifs).

• Two typical interaction graphs have been observed, one for the parallel and one for the anti-parallel orientation. The anti-parallel case shows a BB graph of type ladder, were rungs are typically spaced of 2 amino acids. The parallel case shows a BB graph of type zigzag, in which one recognizes a separation of 2 amino acids in each oscillation of the

10.5772/58420

369

http://dx.doi.org/10.5772/58420

**2. Methods**

**2.1. Symmetric minimization of distances**

Start: *R*<sup>0</sup> ←

*i* ← 0

**while** *Ri* �= { } **do:** min*A*(*a*) ← min

> *LA* ←

*LB* ← 

Output: *S*0, *S*1, *S*2, ...

interface. Please see the caption of Figure 2.

*Si* ← *LA* ∩ *LB Ri*<sup>+</sup><sup>1</sup> ← *Ri* − *Si i* ← *i* + 1 **end while**

min*B*(*b*) ← min

Input: *A* ← atoms of the first subunit

*B* ← atoms of the second subunit

(*a*, *b*) : *a* ∈ *A*, *b* ∈ *B*

The description of the interface, that has been developed starting with [11] and that will be used in this paper, was introduced to help extract the structural organization of the interface. It focuses on the way atoms pair, keeping into account the local connectivity, namely the possibility that atoms interact with other atoms, according to their local arrangement. It is based on the notion of symmetric minimization of distance pairs, defined by the symmetric minimization algorithm, presented here in pseudocode. The flow chart is given in Figure 3. The needed mathematical explanations and demonstrations have been provided in [12].

*d*(*a*, *b*) ← A metric ; typically, the inter-atomic distance is used

) : (*a*, *b*′

, *b*) : (*a*′

(*a*, *b*) ∈ *Ri* : *d*(*a*, *b*) = min*A*(*a*)

(*a*, *b*) ∈ *Ri* : *d*(*a*, *b*) = min*B*(*b*)

The sets *R*0, *Ri* are sets of edges. The empty set is indicated with { }. The symbols − and

∩ indicate set difference and set intersection respectively. The symbols min*A*(*a*), min*B*(*b*) indicate the functions that give the shortest distances in the neighbourhood of the indicated point. The symbols *LA*, *LB* indicate the shortest edges relative to the given iteration *Ri*. The

In summary, the symmetric minimization is a recursive method that, at the first iteration, defines the lowest level of the interface as the pairs of atoms that are reciprocal nearest neighbours. The nearest neighbour condition must be verified for both the two subunit atoms, as the name itself suggests. These pairs form the lowest level *S*0, called symmetrized

reciprocal shortest edge sets are indicated with *Si*, and called symmetrized levels *i*.

) ∈ *Ri* and *b*′ ∈ *B*

Geometry and Topology in Protein Interfaces -- Some Tools for Investigations

, *b*) ∈ *Ri* and *a*′ ∈ *A*

*d*(*a*, *b*′

*d*(*a*′

**Figure 1.** Full interaction graph of the level 0 of the protein 1EEI interface (see Methods). The upper horizontal line represents the sub-unit D, the lower one the sub-unit E. With the crosses we indicate the residues that participate to the interaction graph; their name and membership to specific secondary structures is indicated, when available. The dots represent the residues that do not participate to the interface. The dotted-dashed lines represent pairs of atoms both from the backbone of the residues (BB graph). The solid lines indicate that at least one atom of the pair is from the side chain (SC graph).

zigzag. In some cases, the zigzag topology reduces to a simple vertex V (defined in Methods).

Both these specific topologies identified in the BB graph correspond to the known hydrogen bond graph between backbone atoms. While it is not surprising to find them in the BB graph, the surprise comes from the fact that the position of hydrogen atoms has not been used as input by our algorithms (in most cases it is not even given in the PDB files). Thus, the symmetric minimization algorithm is able to reconstruct the backbone hydrogen bonds network by the unique input of the backbone atoms *N*, *Cα*, *C*,*O* coordinates, with an accuracy of 90%. In other words, the backbone hydrogen bonds satisfy the mathematical property of symmetrically minimized distances and the backbone non-hydrogen atoms that engage in BB hydrogen bonds are reciprocal nearest neighbours, with the indicated accuracy. This identification has been obtained using the web server RING (see Methods) to calculate the hydrogen bonds and compare with the graphs (see also [6]). Among the 120 pairs of residues in the graphs (anti-parallel and parallel cases only), 108 are recognized by RING as hydrogen bonds and 12 are not. Thus, the symmetric minimization recognizes hydrogen bonds with accuracy of 90%.

• Interfaces of the oblique family have very small or absent BB graph, thus the two BB graphs are rather specific to the respective anti-parallel and parallel cases. It is also rare to find BB graphs in non *β* interfaces.

In fact, in these cases the BB graph is intra-chain, namely it develops between atoms of the same sub-unit and is almost absent in the interface.

• The amino acids are not randomly paired. Rather, the frequency with which residues are connected in the interaction graph clearly deviates from the expected frequency calculated from the average frequency of residues in the interface. This indicates that the edges in the interaction graph form in order to provide to the interface very specific features.

**Figure 1.** Full interaction graph of the level 0 of the protein 1EEI interface (see Methods). The upper horizontal line represents the sub-unit D, the lower one the sub-unit E. With the crosses we indicate the residues that participate to the interaction graph; their name and membership to specific secondary structures is indicated, when available. The dots represent the residues that do not participate to the interface. The dotted-dashed lines represent pairs of atoms both from the backbone of the residues (BB graph). The solid lines indicate that at least one atom of the pair is from the side chain (SC graph).

#### **2. Methods**

#### **2.1. Symmetric minimization of distances**

The description of the interface that has been developed starting with [11], and that will be used in this paper, was introduced to help extract the structural organization of the interface. It focuses on the way atoms pair, taking into account the local connectivity, namely the possibility that atoms interact with other atoms according to their local arrangement. It is based on the notion of symmetric minimization of distance pairs, defined by the symmetric minimization algorithm, presented here in pseudocode. The flow chart is given in Figure 3. The necessary mathematical explanations and proofs have been provided in [12].

> Input: *A* ← atoms of the first subunit; *B* ← atoms of the second subunit; *d*(*a*, *b*) ← a metric (typically, the inter-atomic distance is used)

$$\begin{aligned} \text{Start:} & \quad R\_0 \leftarrow \left\{ (a, b) : a \in A, b \in B \right\} \\ & i \leftarrow 0 \\ & \text{while } R\_i \neq \{ \ \} \text{ do:} \\ & \text{min}\_A(a) \leftarrow \min \left\{ d(a, b') : (a, b') \in R\_i \text{ and } b' \in B \right\} \\ & \text{min}\_B(b) \leftarrow \min \left\{ d(a', b) : (a', b) \in R\_i \text{ and } a' \in A \right\} \\ & L\_A \leftarrow \left\{ (a, b) \in R\_i : d(a, b) = \min\_A(a) \right\} \\ & L\_B \leftarrow \left\{ (a, b) \in R\_i : d(a, b) = \min\_B(b) \right\} \\ & S\_i \leftarrow L\_A \cap L\_B \\ & R\_{i+1} \leftarrow R\_i - S\_i \\ & i \leftarrow i + 1 \\ & \text{end while} \end{aligned}$$

Output: *S*<sub>0</sub>, *S*<sub>1</sub>, *S*<sub>2</sub>, ...

The sets *R*<sub>0</sub>, *R*<sub>*i*</sub> are sets of edges. The empty set is indicated with { }. The symbols − and ∩ indicate set difference and set intersection respectively. The symbols min<sub>*A*</sub>(*a*), min<sub>*B*</sub>(*b*) indicate the functions that give the shortest distances in the neighbourhood of the indicated point. The symbols *L*<sub>*A*</sub>, *L*<sub>*B*</sub> indicate the shortest edges relative to the given iteration *R*<sub>*i*</sub>. The reciprocal shortest edge sets are indicated with *S*<sub>*i*</sub>, and called symmetrized levels *i*.

In summary, the symmetric minimization is a recursive method that, at the first iteration, defines the lowest level of the interface as the pairs of atoms that are reciprocal nearest neighbours. As the name suggests, the nearest neighbour condition must hold for both atoms of the pair, one from each subunit. These pairs form the lowest level *S*<sub>0</sub>, called symmetrized interface. An example is given in Figure 2 and its caption.

The algorithm can be repeated on all the edges that have not been retained at the lowest level. This produces a new set of reciprocally shortest edges, not contained in *S*<sub>0</sub>, that forms the symmetrized set *S*<sub>1</sub>, named level 1. The repetition of the algorithm, up to exhaustion of all the atom pairs, provides a hierarchy of levels

$$S\_0, \; S\_1, \; S\_2, \; \dots, \; S\_M \tag{1}$$


that define the symmetrized interface (SI). Thus, inter-atomic distances between the two sub-units are ranked according to levels. Notice that, as proteins are of finite and not of infinite size, there must be a maximum level. Also, the algorithm is free of ambiguities, even in case of regular structures. In Figure 1 and in previous papers the lowest level only was analysed, while it is a purpose of this paper to start the investigation of the higher levels.
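The pseudocode above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the analyses in this chapter: atoms are plain coordinate tuples, the function name `symmetric_minimization` and the toy coordinates are hypothetical, and the metric `d` defaults to the Euclidean distance (`math.dist`) but can be replaced by any distance function.

```python
from itertools import product
from math import dist


def symmetric_minimization(A, B, d=dist):
    """Return the levels S_0, S_1, ... as sorted lists of index pairs (i, j).

    A, B: sequences of coordinates for the two subunits.
    d:    any metric on those coordinates (Euclidean distance by default).
    """
    # R_0: every pair with one atom from each subunit, stored with its distance.
    remaining = {
        (i, j): d(a, b)
        for (i, a), (j, b) in product(enumerate(A), enumerate(B))
    }
    levels = []
    while remaining:
        # min_A(a) and min_B(b): shortest remaining distance seen by each atom.
        min_a, min_b = {}, {}
        for (i, j), dij in remaining.items():
            min_a[i] = min(dij, min_a.get(i, dij))
            min_b[j] = min(dij, min_b.get(j, dij))
        # S_i = L_A ∩ L_B: edges that are shortest for both of their endpoints.
        level = sorted(
            (i, j)
            for (i, j), dij in remaining.items()
            if dij == min_a[i] and dij == min_b[j]
        )
        levels.append(level)
        # R_{i+1} = R_i - S_i: only the edges are removed, the atoms stay.
        for edge in level:
            del remaining[edge]
    return levels


# Toy coordinates (hypothetical, not taken from any PDB entry).
subunit_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
subunit_b = [(0.0, 1.0, 0.0), (3.0, 0.0, 0.0)]
levels = symmetric_minimization(subunit_a, subunit_b)
for n, level in enumerate(levels):
    print(f"S_{n}: {level}")
```

The loop always terminates because the globally shortest remaining edge is necessarily shortest for both of its endpoints, so every level is non-empty. The sketch rescans all remaining pairs at each iteration, which is adequate for a small example but not tuned for large interfaces.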

The symmetric minimization of distances is a variant of the case *k* = 1 of the known *k*-reciprocal nearest neighbour method (kRNN), discussed in [13] and now used in the domain of hierarchical classification and object retrieval in images. A modern presentation is in [14]. The lowest level *S*<sub>0</sub> can be obtained equivalently with both methods, but at higher levels the equivalence breaks down. The purpose of kRNN is to assess the relative proximity of several images containing the same object appearing in different scenes. Each image is a point in some very high-dimensional space. Using an appropriate metric, the closest images are found and agglomerated in a cluster. At the next iteration, new images join the cluster. The kRNN method compares high-dimensional vectors, and it can be applied to atomic coordinates. Once two atoms are found to be reciprocal nearest neighbours at some iteration, they are removed from the pool (and put in a cluster) before the next iteration starts. In the symmetric minimization, on the other hand, two atoms that are reciprocal nearest neighbours are not removed from the pool: the edge they form is removed but the atoms remain. This is the role of the sets *R*<sub>*i*</sub> in the algorithm: at the next iteration, the same atoms may be nearest neighbours of other atoms. In kRNN, the sets of edges *R*<sub>*i*</sub> would simply be replaced by some *A*<sub>*i*</sub>, *B*<sub>*i*</sub>, obtained by removing from the initial *A* and *B* the atoms found at each iteration.

The choice of working with edges comes from the goal of describing the interactions from a geometrical point of view. Moreover, the binding energy in a protein interface accumulates between all pairs of atoms, at least in a suitable range of distances, whether or not they are nearest neighbours. Using edges, the information about all the neighbours of an atom is recorded and used at one level or another. Using atoms, some of the edges are not evaluated and some information seems to be lost, at least for purposes related to protein structure.
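The difference between removing edges and removing atoms can be made concrete with a small sketch. The atom-removal branch below is a simplified stand-in for the scheme described in the text, not an implementation of the kRNN method of [13]; the function names and the toy coordinates are hypothetical.

```python
from math import dist


def reciprocal_pairs(pairs):
    """Edges of `pairs` that are the shortest for both of their endpoints."""
    min_a, min_b = {}, {}
    for (i, j), dij in pairs.items():
        min_a[i] = min(dij, min_a.get(i, dij))
        min_b[j] = min(dij, min_b.get(j, dij))
    return sorted(
        (i, j) for (i, j), dij in pairs.items()
        if dij == min_a[i] and dij == min_b[j]
    )


def peel_levels(A, B, remove_atoms):
    """Successive sets of reciprocal nearest-neighbour pairs.

    remove_atoms=False: symmetric minimization (matched edges leave the pool).
    remove_atoms=True:  atom-removal variant (matched atoms leave the pool).
    """
    pairs = {(i, j): dist(a, b) for i, a in enumerate(A) for j, b in enumerate(B)}
    levels = []
    while pairs:
        found = reciprocal_pairs(pairs)
        levels.append(found)
        if remove_atoms:
            used_a = {i for i, _ in found}
            used_b = {j for _, j in found}
            pairs = {
                (i, j): dij for (i, j), dij in pairs.items()
                if i not in used_a and j not in used_b
            }
        else:
            for edge in found:
                del pairs[edge]
    return levels


# Toy coordinates (hypothetical).
subunit_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
subunit_b = [(0.0, 1.0, 0.0), (3.0, 0.0, 0.0)]
by_edges = peel_levels(subunit_a, subunit_b, remove_atoms=False)
by_atoms = peel_levels(subunit_a, subunit_b, remove_atoms=True)
print("edges removed:", by_edges)
print("atoms removed:", by_atoms)
```

In this toy case both variants agree on the lowest level, but the atom-removal variant never assigns the pairs (1, 0) and (0, 1) to any level, whereas the edge-based variant ranks all four pairs.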

The classification of pairs into levels is reminiscent of perturbative calculations, very common in physics, where the lowest order contains the strongest interaction and the higher orders introduce weaker and weaker terms. The ranking, and the set of levels in equation (1), have been computed on the basis of inter-atomic distances: the higher the level, the larger the distance between atoms. Physically speaking, moving to larger distances implies a tendency to move to weaker interactions. Here, force fields and types of atoms have not been used in the symmetric minimization, thus rising to higher rank only indicates a tendency to weaker interactions and does not hold in a strict sense.

Imagining a mechanical model of balls and sticks, and a quantity of glue, can one mount a human size model of the protein interface? Yes, if the sequence in equation (1) is followed.


**Figure 2.** Example of symmetric minimization. The distance (*ab*) is the smallest for the atom *a* and, at the same time, is the smallest for *b*. Thus, the edge (*ab*) is retained in the symmetrized interface. The distance (*bc*) is the smallest for (*c*) but not for (*b*) thus it is not retained.

In other words, the geometrical action of ranking the pairs of an interface corresponds to the sequence of actions that one has to follow to construct the interface with the minimum amount of steric effects. The sequence read in the opposite sense, from the maximum level *S*<sub>*M*</sub> to *S*<sub>0</sub>, indicates the steps to access the closest atoms of the interface by removing one layer after the other.

It is important to stress that each edge of *S*<sub>0</sub> behaves as a nucleation centre, or a bud, because every edge that appears at level 1 is attached to a level 0 edge; thus, level 0 edges act as the starting points of a growth process where, at each higher level, new edges attach to the bud. The growth occurs simultaneously from the various level 0 edges. Each bud is technically a cluster. At some level, an edge appears that joins two different buds; the edge length is the resolution threshold for the two buds. Certain edges do not start a growth process; this occurs when the atoms joined by these edges are not further connected to the rest of the interface. Also, a bud must be present at level 0 and cannot appear at a higher level: *S*<sub>0</sub> already contains all the buds. In other words, the various parts of the interface are already contained in *S*<sub>0</sub>, and a set smaller than *S*<sub>0</sub> is possibly insufficient to reconstruct the full interface. This provides the mathematical justification for calling *S*<sub>0</sub> a framework of the interface. Indeed, as we have empirically observed in our previous publications, *S*<sub>0</sub> is the smallest set that can describe the interface. This is the most important point of the whole construction; it is based on the theorems proved in [12] but is published here for the first time.

In summary, the symmetric minimization has been introduced to respond to the following needs and possesses the following features [12].

**Scalable.** The symmetric minimization may be applied to proteins or objects of any size, without size limits. Moreover, it can be applied also to objects at the human or interstellar scale.



**Scale-free.** No length scale has been imposed from outside.

**Local.** It is based on the local arrangement of atoms (or points) and not on global features. So, it captures the differences occurring in situations like a dense atomic packing or a dilute packing.

**Intrinsic scales.** It defines a set of characteristic scales, intrinsic to the interface itself. Indeed, the symmetric minimization allows the interface to be divided into clusters (the buds previously presented). The first edge that joins two different clusters (two buds) is a characteristic scale for the interface.

**Metrics independent.** It is independent of the explicit distance function adopted. Namely, the actual distance used can be different from the Euclidean distance, non-Euclidean geometries being allowed if they use positive definite metrics (a distance).

**Figure 3.** Flow chart of the symmetric minimization algorithm. The chart proceeds as follows: construction of *R*<sub>0</sub> (set of atom pairs, one from sub-unit *A*, one from sub-unit *B*); iteration counter *i* = 0; from *R*<sub>*i*</sub>, construction of the sets *L*<sub>*A*</sub> (for each atom of *A*, the closest partner in *B*, within *R*<sub>*i*</sub>) and *L*<sub>*B*</sub> (for each atom of *B*, the closest partner in *A*, within *R*<sub>*i*</sub>); symmetrized level *S*<sub>*i*</sub> = *L*<sub>*A*</sub> ∩ *L*<sub>*B*</sub>; the rest *R*<sub>*i*+1</sub> = *R*<sub>*i*</sub> − *S*<sub>*i*</sub>; if *R*<sub>*i*+1</sub> is empty the algorithm ends, otherwise *i* → *i* + 1 and the construction of *L*<sub>*A*</sub> and *L*<sub>*B*</sub> is repeated.

#### **2.2. Interaction graphs**

The previous subsection describes how the data are analysed. By construction, each set in equation (1) is a collection of edges whose endpoints are points in some metric space. Mathematically, this corresponds to the notion of graph, namely a set of nodes (here the points in the metric space) and a set of edges joining some of the nodes. Thus, from now on each set representing a level will be naturally identified with a graph. In the present case it is important to stress that the graph has been obtained by evaluating distances, thus each edge is associated with the physical distance between its endpoints. Mathematically speaking, this is a weighted graph.

As the graph nodes were initially atoms of one or the other of two adjacent subunits, the graph is automatically bipartite: each edge has an endpoint on one subunit and the other on the second subunit. Edges between atoms of the same subunit are not considered here, as the analysis focuses on the interface.

By construction, a pair of atoms can be connected by a single edge only, as there is just one distance value between them. Thus parallel edges (edges with the same end nodes) are absent in this description, which can be called the full atom description.

Since the first paper on the subject [11], the levels *Si* have been coarse-grained in order to facilitate the human interpretation of the data and to transfer the information to the residue scale. This is actually the scale used by biological entities to store and transfer information: DNA and RNA code for amino acids, not for atoms, and proteins form from residue chains and not from individual atoms.

In the coarse-grained representation the information appears at the amino acid resolution: all the atoms of a residue are identified and represented with the residue name itself. All edges ending on the atoms of the residue are now referred to the residue itself. In graph theory this procedure is called fusion. If (*a*, *b*), (*a*′, *b*′) ∈ *S*<sub>*i*</sub> are edges of one of the graphs, let us call *F* (fusion) the equivalence relation such that

$$(a, b)\, F\, (a', b') \iff a \text{ and } a' \text{ belong to the same residue and } b, b' \text{ belong to the same residue} \tag{2}$$

Then, the following quotient set defines equivalence classes

$$S\_i^{aa} = \frac{S\_i}{F} \tag{3}$$

whose elements are all the edges that start and end on the same residues (*aa* means amino acid resolution). As a consequence, the graphs *S*<sub>*i*</sub><sup>*aa*</sup> can have parallel edges; this happens when two residues are connected by more than one pair of atoms.
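The fusion step can be sketched in a few lines of Python. The sketch assumes that a level is given as a list of atom pairs and that a lookup table maps each atom to its residue; the atom identifiers, the residue labels and the function name `fuse` are invented for illustration.

```python
from collections import Counter


def fuse(atom_edges, residue_of):
    """Coarse-grain atom-level edges to residue-level edges.

    Returns a Counter mapping (residue, residue) to the number of atom pairs
    that connect the two residues, i.e. the multiplicity of the fused edge.
    """
    return Counter((residue_of[a], residue_of[b]) for a, b in atom_edges)


# Hypothetical atoms and residues: a1, a2 belong to the same residue of chain D.
residue_of = {
    "a1": "D:res1", "a2": "D:res1", "a3": "D:res2",
    "b1": "E:res1", "b2": "E:res1", "b3": "E:res2",
}
atom_level = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
fused = fuse(atom_level, residue_of)
for residue_pair, multiplicity in fused.items():
    print(residue_pair, multiplicity)
```

Here the two atom pairs joining `D:res1` and `E:res1` fall into the same equivalence class, so the fused graph carries a residue-level edge of multiplicity 2, which is exactly the situation in which parallel edges appear.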

The subsequent analyses will always refer to these coarse-grained graphs, while the initial symmetric minimization has always been done with atoms. The graphs *S*<sub>*i*</sub><sup>*aa*</sup> can be represented in a very effective way: the first subunit residues are represented as a horizontal line of equispaced dots; similarly, the second subunit residues appear as dots in a horizontal line below the previous one. Clearly, the two horizontal lines represent the chains of residues connected by the peptide bonds. Since the graph is bipartite, edges can only join a node from one horizontal line to a node of the other horizontal line. An example is shown in Figure 1.

In fact we found it very convenient to distinguish two sub-graphs. The dotted-dashed lines represent pairs of atoms both from the backbone of the residues (BB graph). The solid lines indicate that at least one atom of the pair is from the side chain (SC graph). Mathematically speaking, this is called an edge-labelled graph, the two possible labels being BB and SC. In principle, there is a third edge label; indeed, the nodes of the same sub-unit form a sequence connected by the peptide bond. Albeit this type of edge is not explicitly shown in the graphs, it is implicitly present and motivates the choice of organizing the residues along straight lines.
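The BB/SC labelling rule can be written as a short sketch. It assumes the usual PDB atom names for the four backbone atoms (`N`, `CA`, `C`, `O`); the function name `edge_label` is hypothetical, and special cases such as the terminal oxygen `OXT` are treated here as side chain, which is a choice of this sketch only.

```python
# Backbone atom names in PDB convention: N, C-alpha, C, O.
BACKBONE = {"N", "CA", "C", "O"}


def edge_label(atom_a, atom_b):
    """Label an inter-subunit edge from the names of its two atoms.

    'BB' if both atoms belong to the backbone, 'SC' if at least one of them
    belongs to a side chain.
    """
    if atom_a in BACKBONE and atom_b in BACKBONE:
        return "BB"
    return "SC"


labels = [edge_label("N", "O"), edge_label("CB", "O"), edge_label("OG", "NZ")]
print(labels)
```

With these labels each fused edge can be drawn either as a dotted-dashed line (BB) or as a solid line (SC), following the convention of Figure 1.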


Notice that while the full atom graphs are necessarily disjoint, *S*<sub>*i*</sub> ∩ *S*<sub>*j*</sub> = { } for *i* ≠ *j*, this is no longer true for the fused graphs: it is not a priori granted that they are different.

In summary, the sets in equation (3) characterize the interaction between subunits; each set is a bipartite and edge-labelled graph where parallel edges may occur.

The amino acid resolution graph of the interaction between subunits has proved to be extremely effective in interpreting data and designing statistical analyses. Notice that the choice of marking non-interacting residues with a dot and interacting residues with a cross is purely conventional and does not add information.

#### **2.3. Topological analysis of the graphs**

The analysis will proceed with the inspection and comparison of the interaction graphs of the level 1 of the interface, with some additional information from the level 2. The focus will be on the topology, namely on the organisation of edges in the graphs, whose importance has already been shown in the previous publications [6, 11]. As already stated, the amino acid resolution will be systematically adopted.

The interface graphs will be analysed using the following motifs.


rungs and separation *δ* = 2. In Figure 22 the residues C, S, F (47, 49, 51) of chain A and the residues Y, E, E (110, 108, 106) of chain B form a BB ladder of separation *δ* = 2. The two most common separations are *δ* = 1 or 2, ladder1 and ladder2 respectively. Cases with more than 3 rungs are known. In Figure 14, the residues from 98 to 102 (of the chain D) with the residues from 29 to 25 (chain E) form a ladder1 with 5 rungs.

**Multiple edges:** presence of parallel edges in the graph (edges with the same endpoints). The term multiple edges seems more appropriate in the present case, given that the word parallel is already used to indicate the orientation of the *β*-strands in the interface. In Figure 8, the residues E and F are connected by 4 edges, three of type SC and one of type BB. In Figure 22, the residues S and E are connected by three BB edges.

It is important to look for these motifs starting from the largest elements, to avoid useless multiple counting. In particular, a zigzag of 8 nodes obviously contains all the shorter ones, from 4 to 7 nodes. There is no need to record all of them, the largest one being the most informative. Also, a V3 or zigzag motif automatically contains the simplest vertex V, often more than once. Thus, the vertex V is not counted when it appears inside a V3 or a zigzag. In Figure 33, the zigzag5 (residues V, M, I, V, A) contains three V motifs. In Figure 30, the vertex V3 (residue H) contains three V motifs.

These topological elements may appear alone or in combination and have been chosen, as in other publications, because they represent the most common motifs in interface graphs. It may happen that a graph contains more than one motif, in which case all of them are recorded. For example, a common situation is that one of the nodes of a zigzag5 is also a vertex V3. In that case, both motifs are registered. The main reason is that a systematic classification of graphs needs to be accurate and free of ambiguous search criteria, thus it is definitely preferable to accept some redundancy in the identification of motifs rather than to introduce untested criteria. For example, a ladder often has a separation of 2 in one part of the graph and a separation of 1 in another part, with an edge (a rung) in common between the two parts. This edge is then recorded as part of both a ladder of separation 2 and a ladder of separation 1. Indeed, so far, no acceptable criterion has been found to decide whether an edge must be considered part of a ladder with separation 1 or 2. In Figure 28, there is a BB ladder formed by M, I, K, I (chain A) and V, L, I, E (chain B). Residues M and I are separated by 2, while residues I and K, and K and I, are separated by 1.
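The motif definitions above can be turned into a simple search over an edge list. The following is a minimal sketch under stated assumptions, not the procedure used by the authors: the function names are hypothetical, edges are plain `(position on chain A, position on chain B)` pairs, multiple edges are collapsed before the search, and the rule that a V inside a zigzag is not counted is not implemented. The pairing 47-110, 49-108, 51-106 used in the example is our reading of the Figure 22 ladder described in the text.

```python
from collections import defaultdict


def vertex_motifs(edges):
    """Label residues by their number of distinct partners: 2 -> "V", more -> "V3"."""
    partners = defaultdict(set)
    for a, b in edges:
        partners[("A", a)].add(b)
        partners[("B", b)].add(a)
    return {node: ("V" if len(p) == 2 else "V3")
            for node, p in partners.items() if len(p) >= 2}


def ladders(edges, delta):
    """Return maximal ladders as lists of rungs (M, N), (M + delta, N - delta), ..."""
    rungs = set(edges)  # multiple edges count once
    found = []
    for m, n in sorted(rungs):
        if (m - delta, n + delta) in rungs:
            continue  # this rung is not the first one of its ladder
        ladder = [(m, n)]
        while (m + delta, n - delta) in rungs:
            m, n = m + delta, n - delta
            ladder.append((m, n))
        if len(ladder) >= 2:  # a ladder needs at least two rungs
            found.append(ladder)
    return found


# Assumed pairing for the BB ladder of Figure 22 (separation 2).
figure22 = [(47, 110), (49, 108), (51, 106)]
ladder2 = ladders(figure22, delta=2)

# Hypothetical toy edges: A1 has two partners, B20 has three.
toy_vertices = vertex_motifs([(1, 20), (1, 22), (3, 20), (5, 20)])
```

Starting each ladder only from its first rung reports the largest ladder once, in line with the recommendation to search from the largest elements.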

#### **2.4. The dataset**


In [6] a dataset of 39 oligomeric proteins was chosen on the basis of the presence, in the three-dimensional structure, of a clearly recognizable *β* interface. The same set is used here. It is listed in Table 1, with the indication of the chains and the intervals participating in the *β* interfaces. All proteins are homomeric, with stoichiometry from 3 to 8. This set is characterized by the absence of sequence homology, structural homology or functional homology. Viral and membrane proteins are absent, given their specificities. Thus, the set can be considered representative of the general behaviour of *β* interfaces, without reference to specific classes of proteins. In the dataset there are anti-parallel, parallel and oblique orientations of the adjacent *β* strands. The oblique family is considered for comparison only and is not investigated in depth, because its graphs are less structured and it is difficult to find similarities between proteins.

#### **2.5. Residue interaction network generator (RING)**

RING is a web server [15] with software for rendering a protein structure into a network of interactions. Nodes represent single amino acids and edges represent the non-covalent bonding interactions that exist between them. In particular, this web server has been used to calculate the hydrogen bonds for the proteins of the dataset. The hydrogen bonds have not been supplied to the symmetric minimization method but have only been used to compare some results.


#### **3. Results**

The dataset has been described in Methods and its proteins are listed in Table 1.

The graphs of the level 0 were provided as Supporting Information to the publication [6] and are downloadable in open access, as well as the publication itself.

In this section the description focuses on the level 1 and, to a minor extent, on the level 2. The graphs of the level 1 are provided in Figures 3 to 26, and 27 to 34. Given their minor role, the graphs of the level 2 will be provided on demand.

In the following description, the results are grouped and numbered from Result 1 (R1) to Result 5 (R5).

**R1. BB graph at level 1.** The level 1 graphs are not identical to the level 0 ones.

In all the anti-parallel cases (Figures 3 to 26), the presence of a level 1 BB graph is observed, of size comparable with the one observed at level 0, namely with a similar number of edges. It is very small in just one case (2ojw level 1), where a single edge is present.

Similarly, in all the parallel cases (Figures 27 to 34), there is a level 1 graph, with a number of edges similar to the one found at level 0.

**R2. BB graph structure at level 1.** In the anti-parallel orientation, the graphs present a ladder structure in 21 out of 24 cases. Its separation is of 2 amino acids in 20 cases out of 24 and of 1 amino acid in 15 out of 24 cases<sup>1</sup>. The zigzag connection is absent. The vertex V is present in 13 graphs out of 24. The V3 motif is present in 9 out of 24 cases. The zigzag4 is counted 12 out of 24 times. Just 4 multiple edges are observed. In all cases at least one ladder motif shows up at level 0 or at level 1.

In the parallel orientation, the graphs show a common presence of the zigzag5 topological element, in 4 out of 8 graphs, often accompanied by one or more V3, in 5 out of 8 graphs. The zigzag5 motifs are probably not completely independent from those that appear at level 0, as they always show up together. Larger statistics are needed to confirm or disprove this aspect. The zigzag opening is of 1 or 2 amino acids. There are also 4 ladders of separation 1. Multiple bonds appear once out of 8 cases<sup>2</sup>.

**R3. BB graph structure at level 2.** A level 2 BB graph is observed in all cases, often less populated than the graphs of the lower levels.

In the anti-parallel orientation, there are 13 ladders out of 24. The other cases show a V or zigzag4 motif, and in one case a zigzag5 is found.

<sup>1</sup> At level 0 the ladder separation is of 2 residues (23 cases out of 24). There are multiple edges in 21 cases out of 24.

<sup>2</sup> At level 0 there are 5 zigzag5 out of 8. After, one finds 2 V and 1 zigzag4. Multiple bonds are absent.

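In the parallel orientation, the zigzag topology and the vertices V3 are absent at this level. On the contrary, these are the most common topological elements at the lower levels. At this level we just find the ladder motif with separation of 1 or 2 amino acids. In some cases the BB graph reduces to one edge.

**R4. Other orientation.** In interfaces other than the anti-parallel or parallel *β*, it was systematically observed in previous publications, and is confirmed here, that the BB graphs are rare and poorly structured, being composed of one or two edges. This obviously holds true also at levels 1 and 2.

**R5. SC graph.** These graphs are more elaborate and rich in a variety of elements. The ladder motif is present in nearly half the graphs, at all the levels examined (0, 1, 2). The V, V3 and zigzag elements are very common in all three orientations. A more complete analysis of the SC graphs will be carried out in future publications.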


#### **4. Discussion**


In Methods, four definitions of interface have been introduced.

The participating amino acids are actually very similar: in [11], a paper fully dedicated to comparing the level 0, *S*<sup>aa</sup><sub>0</sub>, with published interface data, more than 85% similarity was detected. At level 0, the difference lies more in the description that emerges. In the buried surface or Voronoi cells based interface approach, it is very natural to distinguish between rim and core: a solvent molecule that tries to penetrate the interface first has to visit the rim, and may afterwards force its way into the core. The symmetrized levels instead present the growth of the interface, as in a budding process, from the set *S*<sub>0</sub>. The hierarchy in (1) indicates the dynamical sequence of events that may construct the interface with minimal steric effects. In a sense, this introduces the notion of time in an otherwise static view. The distinction between rim and core is possibly given by the number of connections of a residue: a highly connected residue must be in the core and cannot be in the rim. Vice versa, a poorly connected residue is in the rim. Indeed, the part of a residue surface that is exposed to the solvent will not be connected to other atoms, while an atom that is completely buried will have connections with all its neighbours. This indicates that a minimal distinction between rim and core may appear if one compares a few close levels, like *S*<sub>0</sub>, *S*<sub>1</sub> and *S*<sub>2</sub>. The point is that the more levels an atom is present at, the more connected it is; thus the rim must correspond to those atoms that appear few times across the levels, the others being in the core.
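The rim/core criterion suggested above can be sketched as a count of how many levels each atom appears in. This is only an illustration under assumptions that are ours: the function names, the threshold `min_levels` and the toy atom labels are hypothetical, and the text does not fix a numerical cut-off.

```python
from collections import Counter


def level_presence(levels):
    """Map each atom to the number of levels in which it takes part in an edge."""
    presence = Counter()
    for edges in levels:
        atoms = {atom for edge in edges for atom in edge}
        presence.update(atoms)  # each atom counts once per level
    return presence


def split_rim_core(levels, min_levels=2):
    """Atoms seen in at least min_levels levels go to the core, the others to the rim."""
    presence = level_presence(levels)
    core = {atom for atom, k in presence.items() if k >= min_levels}
    rim = set(presence) - core
    return rim, core


# Hypothetical toy levels S0, S1, S2, each given as a set of atom pairs.
s0 = {("a1", "b1"), ("a2", "b2")}
s1 = {("a1", "b2")}
s2 = {("a1", "b3")}
rim, core = split_rim_core([s0, s1, s2])
```

With these toy levels, `a1` appears at all three levels and `b2` at two, so both fall in the core, while the atoms seen only once are assigned to the rim.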

The importance of how the information circulates in an interface was first shown in [8]. The graphs *S*<sup>aa</sup><sub>*i*</sub> show the presence of long range correlations that are not easily detected with approaches based on the contact surface (buried surface or Voronoi cells based interface). The motifs that allow this transfer are the zigzags, especially the long ones, and the vertices V and V3. On the contrary, a pure ladder topology, in BB or SC or both, has no way to correlate far apart atoms. The frequent presence of zigzag, V and V3 motifs implies the existence of constraints on the positions or the physico-chemical properties of non-neighbouring atoms, and sometimes of very far apart atoms. Indeed, in Figure 27, the mere establishment of the path formed by the residues S2, E123, L5, L124, I7, V127 (where BB and SC are both present) requires satisfying physical conditions due to the volume of the atoms and the distribution of electric charge. Each residue has its internal constraints, among which are the fixed length of covalent bonds and the planarity of the surface *C<sup>α</sup>*, *C*, *O*, *N* that contains the peptide bond. The sequence of edges transfers constraints and establishes a correlation between the outer amino acids S2 and V127 even if they are not in physical contact. The exact microscopic description of the constraints is, however, not easy to find. Notice that this example of information transfer along the interface has been detected thanks to a description guided by graph theory and based on edges.
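The idea that a chain of edges links residues that are not in contact can be illustrated with a breadth-first search on the path quoted above for Figure 27. This is a minimal sketch: the function name is hypothetical, the edge list contains only the consecutive pairs of that path (the full graph of Figure 27 is not reproduced here), and edge labels are ignored.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search on an undirected graph given as a list of node pairs."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk back to the start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, ()):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None  # the two residues are not connected


# Consecutive pairs of the path S2, E123, L5, L124, I7, V127 (Figure 27).
edges = [("S2", "E123"), ("E123", "L5"), ("L5", "L124"),
         ("L124", "I7"), ("I7", "V127")]
path = shortest_path(edges, "S2", "V127")
steps = len(path) - 1
```

Here S2 and V127 share no edge, yet they are joined by a path of five edges, which is the sense in which constraints can propagate between residues that are far apart.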


In summary, the main difference that emerges when comparing the buried surface and Voronoi cells based interfaces with SI is that the first two definitions focus on the spatial organization of the interface, while the latter may suggest a temporal organization and makes it possible to evaluate how the information circulates in the interface.

The result R1 clearly states that the level 1 graphs can have a BB graph of size comparable to that of level 0, and are thus still informative. At level 2 a smaller BB graph is observed. Preliminary results on levels higher than 2 indicate that the BB graphs are still present.

The result R2 indicates that the level 1 graphs have a structure similar to the one found at level 0, in other words that these two levels present several common elements. Instead, from R3, the level 2 graphs seem to be organized in a different way. The main structure of the level 2 graphs is the ladder, in both the anti-parallel and parallel orientations, which indicates that these orientations are not distinguishable at this level, and possibly above. Preliminary results on levels higher than 2 indicate that the main distinctive motif of the parallel orientation, namely the zigzag5, is quite rare. Thus, it seems that levels 0 and 1 are those that contain the most useful geometrical and topological information.

In [8] we have already used the properties in R2 and R3, by implementing algorithms that, from the PDB structure, are able to characterize an interface and tell whether it has a *β* structure, and what its orientation is. These algorithms are based on level 0 only<sup>3</sup>. The analysis of level 1 graphs confirms and expands this possibility, because the information from both levels can now be combined for a more accurate recognition.

The BB graphs at level 0 have been previously associated with structural hydrogen bonds that are present in the anti-parallel and in the parallel orientations of *β* strands.

It is possible that level 1 (for both the BB and SC graphs) does not describe proper chemical bonds but weaker dipole-dipole or van der Waals interactions. This comparison has not yet been explored for level 1.

#### **4.1. Counting the degrees of freedom**

The question we address now is whether the description provided by the graphs is sufficient to reconstruct the shape of the interface.

<sup>3</sup> Other interface arrangements are not yet recognized.

To reconstruct a shape in three dimensions one needs 3 coordinates for each point, that is 3*N* coordinates, where *N* is the number of points. Actually, the absolute position of the centre of mass and the spatial orientation of the object are totally irrelevant, thus three overall translations and three rotations can be removed from the counting, which reduces the number of needed relative positions or distances to

$$\text{degrees of freedom} = 3N - 6 \tag{4}$$

As an example, consider the graph of 2ojw in Figure 19; there are *N* = 7 amino acids (even the amino acids that do not participate in the graph but lie within the considered regions must be counted), thus one needs 3*N* − 6 = 15 relative positions. The distance between two consecutive *C<sup>α</sup>* is fixed in all polypeptides (as the distance TG in the graph). In the present case there are 5 of them. The distance between the first and the third of three consecutive *C<sup>α</sup>* (distances TT and GI in the graph) is fixed by the general properties of polypeptides, which makes 3 more distances. The graph has 3 edges, namely 3 other distances, thus one is left with 15 − 5 − 3 − 3 = 4 more distances to be fixed. This indicates that this graph is insufficient to reconstruct the shape. A more general counting is possible. In [8] the average numbers of amino acids (18) and of edges (12) in a *β* interface have been evaluated. Their ratio is very close to 3/2 = 1.5, thus it is reasonable to assume that if there are *N* residues in the interface, there will be nearly 2*N*/3 edges (actually, multiple edges should not be counted here; this may further reduce the number of known edges). Also, we expect *N* − 2 pairs of consecutive *C<sup>α</sup>* and *N* − 4 groups of three consecutive *C<sup>α</sup>*, namely we have

$$\text{known distances} = (N - 2) + (N - 4) + \frac{2}{3}N = \frac{8}{3}N - 6\tag{5}$$

We subtract the number of known distances from the number of degrees of freedom and we are left with the number of distances that are still needed to fix the shape

$$\text{unknown distances} = (3N - 6) - \left(\frac{8}{3}N - 6\right) = \frac{1}{3}N \tag{6}$$

Thus a single level does not provide enough data but two levels provide sufficient information to fix the shape of the interface.
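The counting in equations (4) to (6) can be checked numerically. The sketch below is only a worked illustration with hypothetical function names; it reproduces the 2ojw example (*N* = 7, with 5, 3 and 3 known distances) and the average case of [8] (*N* = 18). Exact fractions are used so that 2*N*/3 is not rounded.

```python
from fractions import Fraction


def degrees_of_freedom(n):
    """Equation (4): relative positions needed for n points in three dimensions."""
    return 3 * n - 6


def unknown_distances(n, consecutive, next_nearest, edges):
    """Distances still to be fixed once the known ones are subtracted."""
    return degrees_of_freedom(n) - consecutive - next_nearest - edges


def unknown_distances_average(n):
    """Equations (5) and (6): general counting with about 2n/3 edges."""
    known = (n - 2) + (n - 4) + Fraction(2, 3) * n
    return degrees_of_freedom(n) - known


# 2ojw example from the text: 15 - 5 - 3 - 3 = 4.
left_2ojw = unknown_distances(7, consecutive=5, next_nearest=3, edges=3)

# Average interface of [8]: N = 18 residues, N/3 = 6 distances missing.
left_average = unknown_distances_average(18)
```

If a second level is assumed to contribute a number of edges comparable to the first, as R1 suggests, it would supply about 2*N*/3 further distances, more than the *N*/3 that remain unknown after a single level.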

Of course, a full evaluation of the interface degrees of freedom needs a much more complex calculation at the level of atoms, but the present counting suggests that the few lowest levels should be enough to provide an accurate description of the interface shape and of the position of most of the atoms.

A more complete account of the problem of reconstructing the shape of a set of points from an incomplete set of distances is given in [16].

#### **4.2. Perspectives**


The result R5 clearly indicates the greater complexity of the side chains with respect to the backbone. In the paper [8] the role of the residues with multiple interactions, namely V and V3, has been studied in detail in *S*<sup>aa</sup><sub>0</sub> and has been correlated to the length of the side chains. Following this observation, one could introduce a parametrization based on the length of the side chain. The discussion on the information flow in the interface points in the same direction, namely towards dedicating a new publication to the study of the SC graphs.


**Figures 4 to 6.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **3\_1PM4\_AB\_82\_95\_64\_80\_level\_1**, **3\_1SNR\_AB\_109\_129\_326\_342\_level\_1**, **3\_1Y13\_AB\_9\_21\_161\_174\_level\_1**.

| PDB name | chains | range on the chains | orientation of the *β* interface |
|---|---|---|---|
| 1JN1 | AB | 120-140, 1-20 | oblique |
| 1PM4 | AB | 82-95, 64-80 | anti-parallel |
| 1SJN | AB | 1-12, 118-130 | parallel |
| 1SNR | AB | 109-129, 326-342 | anti-parallel |
| 1T0A | AB | 1-15, 125-141 | oblique |
| 1Y13 | AB | 9-21, 161-174 | anti-parallel |
| 2BAZ | AB | 1-8, 116-130 | parallel |
| 2BCM | BA | 43-53, 17-26 | anti-parallel |
| 2BT9 | AB | 44-57, 1-16 | oblique |
| 2GVH | AB | 189-202, 59-73 | anti-parallel |
| 2I9D | AB | 148-166, 17-33 | anti-parallel |
| 2JCA | AB | 1-17, 103-124 | oblique |
| 2P90 | AB | 71-88, 168-180 | anti-parallel |
| 1J8D | AB | 19-29, 30-40 | anti-parallel |
| 1L3A | AD | 118-129, 88-98 | parallel |
| 1PVN | AD | 489-496, 432-438 | anti-parallel |
| 2A7R | AD | 1-16, 327-339 | parallel |
| 2H5X | AD | 1-8, 18-29 | anti-parallel |
| 3BF0 | AB | 445-468, 178-197 | anti-parallel |
| 1B09 | AB | 197-206, 99-112 | oblique |
| 2XSC | AB | 62-69, 8-16 | oblique |
| 1EEI | DE | 94-103, 21-33 | anti-parallel |
| 1EFI | DH | 23-33, 94-103 | anti-parallel |
| 1FB1 | AE | 125-138, 218-237 | anti-parallel |
| 1HI9 | AB | 66-84, 175-191 | anti-parallel |
| 1NQU | AE | 1-6, 43-54 | parallel |
| 1WUR | AB | 186-197, 93-105 | anti-parallel |
| 2OJW | AB | 42-48, 188-195 | anti-parallel |
| 2RCF | AB | 72-83, 8-21 | anti-parallel |
| 1U1S | AB | 48-60, 54-69 | anti-parallel |
| 2BVC | AF | 211-219, 33-40 | oblique |
| 2GJV | AB | 43-56, 102-112 | anti-parallel |
| 2Z9H | AB | 5-18, 77-89 | anti-parallel |
| 1HX5 | AG | 3-13, 92-99 | anti-parallel |
| 1OEL | AG | 34-43, 511-524 | parallel |
| 1WNR | AG | 1-10, 87-96 | anti-parallel |
| 2RAQ | AB | 33-46, 76-91 | anti-parallel |
| 1Q3S | AB | 46-57, 515-527 | parallel |
| 2V9U | AB | 140-148, 170-177 | parallel |

**Table 1.** Table of the proteins considered in this paper, from [6]. In summary, we have 24 anti-parallel, 8 parallel and 7 oblique orientations.
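The panel titles of the level 1 graphs (for instance 3\_1Y13\_AB\_9\_21\_161\_174\_level\_1) appear to encode the same fields as the rows of Table 1. The sketch below splits such a label into its parts. It assumes, without confirmation from the text, that the leading integer is the stoichiometry of the oligomer and that the four numbers are the residue ranges on the two chains; the names `InterfaceId` and `parse_interface_id` are hypothetical and are not part of the author's C++ codes.

```python
import re
from typing import NamedTuple, Tuple


class InterfaceId(NamedTuple):
    """Fields assumed to be encoded in a panel title (an interpretation, not a specification)."""
    stoichiometry: int              # assumed meaning of the leading integer
    pdb: str                        # four-character PDB code
    chains: str                     # the two chain identifiers, e.g. "AB"
    range_first: Tuple[int, int]    # residue range on the first chain
    range_second: Tuple[int, int]   # residue range on the second chain
    level: int                      # graph level (0, 1, 2, ...)


LABEL = re.compile(
    r"(\d+)_([0-9][0-9A-Z]{3})_([A-Z]{2})_(\d+)_(\d+)_(\d+)_(\d+)_level_(\d+)"
)


def parse_interface_id(label: str) -> InterfaceId:
    """Parse a label such as '3_1Y13_AB_9_21_161_174_level_1'."""
    # Accept markdown-escaped underscores ("\_") as well as plain ones.
    match = LABEL.fullmatch(label.strip().replace("\\_", "_"))
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    n, pdb, chains, a1, a2, b1, b2, level = match.groups()
    return InterfaceId(int(n), pdb, chains,
                       (int(a1), int(a2)), (int(b1), int(b2)), int(level))


if __name__ == "__main__":
    print(parse_interface_id("3_1Y13_AB_9_21_161_174_level_1"))
```

Note that the ranges in some panel titles differ from those listed in Table 1 (for example 2P90), so a parsed label should not be assumed to reproduce the table row exactly.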

**Figures 7 to 9.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **3\_2BCM\_BA\_43\_53\_17\_26\_level\_1**, **3\_2GVH\_AB\_189\_202\_59\_73\_level\_1**, **3\_2I9D\_AB\_148\_166\_17\_33\_level\_1**.

**Figures 10 to 12.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **3\_2P90\_AB\_71\_81\_168\_180\_level\_1**, **4\_1J8D\_AB\_19\_29\_30\_40\_level\_1**, **4\_1PVN\_AD\_489\_496\_432\_438\_level\_1**.

**Figures 13 to 15.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **4\_2H5X\_AD\_1\_8\_18\_29\_level\_1**, **4\_3BF0\_AB\_451\_463\_178\_190\_level\_1**, **5\_1EEI\_DE\_94\_103\_21\_33\_level\_1**.

**Figures 16 to 18.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **5\_1EFI\_DH\_23\_33\_94\_103\_level\_1**, **5\_1FB1\_AE\_125\_138\_218\_237\_level\_1**, **5\_1HI9\_AB\_75\_84\_175\_185\_level\_1**.

**Figures 19 to 21.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **5\_1WUR\_AB\_186\_199\_93\_105\_level\_1**, **5\_2OJW\_AB\_42\_48\_188\_195\_level\_1**, **5\_2RCF\_AB\_72\_83\_8\_21\_level\_1**.

**Figures 22 to 24.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **6\_1U1S\_AB\_48\_55\_60\_66\_level\_1**, **6\_2GJV\_AB\_43\_56\_102\_112\_level\_1**, **6\_2Z9H\_AB\_5\_18\_77\_89\_level\_1**.

**Figures 25 to 27.** Anti-parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **7\_1HX5\_AG\_3\_13\_92\_99\_level\_1**, **7\_1WNR\_AG\_1\_10\_87\_96\_level\_1**, **7\_2RAQ\_AB\_33\_46\_76\_91\_level\_1**.

**Figures 28 to 30.** Parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **3\_1SJN\_AB\_1\_12\_118\_130\_level\_1**, **3\_2BAZ\_AB\_1\_8\_116\_130\_level\_1**, **4\_1L3A\_AD\_118\_129\_88\_98\_level\_1**.

**Figures 31 to 33.** Parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **4\_2A7R\_AD\_1\_12\_330\_339\_level\_1**, **5\_1NQU\_AE\_1\_6\_46\_54\_level\_1**, **7\_1OEL\_AG\_34\_43\_511\_524\_level\_1**.

**Figures 34 and 35.** Parallel orientation of the *β*-strands. Panel titles (level 1 graphs): **8\_1Q3S\_AB\_46\_57\_515\_527\_level\_1**, **8\_2V9U\_AB\_140\_148\_170\_177\_level\_1**.

#### **5. Conclusion**

The whole analysis presented so far has been triggered by the problem of investigating biological interfaces, namely interfaces that form during the biochemical activity in a cell, between or inside proteins, protein-DNA or protein-RNA complexes and so on.

In the introduction, motivations are given for the possibility of using geometry to understand the interactions. In the Results and Discussion sections, the topological, but intrinsically geometrical, properties of the interaction graphs have been presented with the first goal of learning how to distinguish the two main orientations and the second goal of estimating aspects that have been systematically neglected in all the previous publications.

Both these aspects have been clearly addressed in this paper. The tools to distinguish the two orientations are now more precise.

The results are the consistency of the information from level 1 and level 0 and the limited amount of information from level 2. Both results suggest enriching previous statistics and prediction models with some input from level 1.

These results are confirmed by the naive counting of the degrees of freedom: at least in a coarse-grained view, the two levels *Saa* <sup>0</sup> and *Saa* <sup>1</sup> provide enough information to reconstruct the shape of the interface (completeness).
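The simplest form of this counting can be written out explicitly. What follows is only the standard rigid-body argument, stated for *N* ≥ 3 points that are not all on a line; how the edges of the level 0 and level 1 graphs are compared with this number is not reproduced here.

$$
\underbrace{3N}_{\text{coordinates}} \;-\; \underbrace{3}_{\text{overall translations}} \;-\; \underbrace{3}_{\text{overall rotations}} \;=\; 3N - 6
$$

For instance, a description based on *N* = 10 points leaves 3 · 10 − 6 = 24 independent parameters, so any set of geometric constraints that claims to fix the shape must supply at least that many independent relations.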

In Methods, the notion of *Saa* <sup>0</sup> as the minimal description of the interface, or framework, already used in all our previous publications, has been presented here with solid mathematical arguments: as all buds are present in *Saa* <sup>0</sup> and no bud can appear later, it is legitimate to call *Saa* <sup>0</sup> a framework, because a smaller set would miss some buds, namely some parts of the interface. We find that this property and the completeness of the interface add significant support to the validity of the methods.

Also, it is important to stress the ability of the symmetric minimization to detect the BB hydrogen bonds from the knowledge of the positions of non-hydrogen atoms only: in this case, geometry intrinsically reveals the chemical interactions, without making use of a cut-off or other external scales.

#### **Acknowledgements**

This work is funded by the Rhône-Alpes region.

It is a pleasure to thank Claire Lesieur for most valuable comments and suggestions, and Laurent Vuillon for his help.

With sadness, the author remembers the friend and colleague Laurent Fournier, recently deceased, for his useful suggestions to improve the C++ codes used in the analyses.

#### **Author details**

Giovanni Feverati

Fédération de recherche MSIF, University of Savoie and CNRS, Annecy-le-Vieux, France

#### **References**

[1] Iacovache I, van der Goot G F, Pernot L. Pore formation: An ancient yet complex form of attack. Biochim Biophys Acta, vol. 1778, num. 7-8, p. 1611-23 (2008).

[2] Lesieur C, Vecsey-Semjen B, Abrami L, Fivaz M, van der Goot G F. Membrane insertion: The strategies of toxins (review). Mol Membr Biol 14: 45-64 (1997).

[3] Kirkitadze M D, Bitan G, Teplow D B. Paradigm shifts in Alzheimer's disease and other neurodegenerative disorders: the emerging role of oligomeric assemblies. J Neurosci Res 69: 567-577 (2002).

[4] Harrison R S, Sharpe P C, Singh Y, Fairlie D P. Amyloid peptides and proteins in review. Rev Physiol Biochem Pharmacol 159: 1-77 (2007).

[5] Crick F H C. The packing of alpha-helices: simple coiled-coils. Acta Crystallogr 6: 689-697 (1953).

[6] Feverati G, Achoch M, Zrimi J, Vuillon L, Lesieur C. *β*-strand interfaces of non-dimeric protein oligomers are characterized by scattered charge residues pattern. PLoS ONE 7(4): e32558 (2012). The article and the Supporting Information are freely downloadable from PLoS ONE: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0032558#s5.

[7] Zrimi J, Ng Ling A, Giri-Rachman Arifint E, Feverati G, Lesieur C. Cholera toxin B subunits assemble into pentamers: proposition of a fly-casting mechanism. PLoS ONE 5(12): e15347 (2010).

[8] Feverati G, Achoch M, Vuillon L, Lesieur C. Residue interaction networks of *β*-strand interfaces from healthy protein oligomers have few HUBs (multiple contact residues) and low interconnectedness: a potential protection mechanism against network rewiring and chain dissociation. Under revision by PLoS ONE.

[9] ExPASy: Bioinformatics resource portal, http://www.expasy.org/proteomics/protein\_structure

[10] Janin J, Bahadur R P, Chakrabarti P. Protein-protein interaction and quaternary structure. Q Rev Biophys 41: 133-180 (2008).

[11] Feverati G, Lesieur C. Oligomeric interfaces under the lens: Gemini. PLoS ONE 5(3): e9897 (2010).

[12] Feverati G, Lesieur C, Vuillon L. Symmetrization: ranking and clustering in protein interfaces. Refereed proceedings of the conference "Mathematics of distances and applications", MDA 2012, Varna. Publisher: ITHEA, Sofia. Editors: M. Deza, M. Petitjean, K. Markov.

[13] Lance G N, Williams W T. A General Theory of Classificatory Sorting Strategies. 1. Hierarchical Systems. The Computer Journal 9(4): 373-380 (1967).

[14] Murtagh F, Contreras P. Methods of Hierarchical Clustering. Data Mining and Knowledge Discovery. Wiley-Interscience, Vol. 2, No. 1, pp. 86-97 (2012).

[15] Martin A J M, Vidotto M, Boscariol F, Di Domenico T, Walsh I, Tosatto S C E. Residue Interaction Network Generator (RING). http://protein.bio.unipd.it/ring/

[16] Liberti L, Lavor C, Maculan N, Mucherino A. Euclidean Distance Geometry and Applications. SIAM Review 56(1): 3-69. doi 10.1137/120875909


**Chapter 13**

#### **From Tilings to Fibers – Bio-mathematical Aspects of Fold Plasticity**

C. Lesieur and L. Vuillon

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58577

© 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### **1. Introduction**

Protein oligomers are made by the association of protein chains via intermolecular amino acid interactions (interactions between subunits), forming so-called protein interfaces. This chapter proposes mathematical concepts to investigate the shape constraints that protein interfaces must satisfy in order to promote oligomerization. First, we focus on tiling the plane (2 dimensions) by translation with abstract shapes. Using the fundamental theorem of Beauquier and Nivat, we show that the shapes of the tiles must be either like a square or like a hexagon to tile the whole plane. Second, we look in more detail at the tiling of a cylinder and discuss its relevance in constructing protein fibers. The universality of such "building" properties is investigated through biological examples. This chapter is written jointly by a mathematician and a biologist in order to present bio-mathematical aspects of fiber constructions.
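The "square-like or hexagon-like" condition can be tried out on small examples. For polyominoes, the Beauquier-Nivat criterion says that a tile covers the plane by translation exactly when its boundary word, read up to a cyclic shift, factors as X·Y·Z·X̂·Ŷ·Ẑ with at most one factor empty, where X̂ is the path X walked backwards. The following is a minimal brute-force sketch of that test, not an efficient algorithm. The encoding of the boundary as a string over `r`, `u`, `l`, `d` (unit steps, counter-clockwise) and the function names are assumptions made for this illustration only.

```python
# Unit steps along the boundary: r(ight), u(p), l(eft), d(own).
COMPLEMENT = {"r": "l", "l": "r", "u": "d", "d": "u"}


def hat(word: str) -> str:
    """Return the same path walked backwards: reverse the word and flip each step."""
    return "".join(COMPLEMENT[step] for step in reversed(word))


def tiles_by_translation(boundary: str) -> bool:
    """Brute-force search for a factorization X Y Z hat(X) hat(Y) hat(Z).

    `boundary` is assumed to be the closed boundary word of a polyomino.
    At most one of the three factors may be empty (square-like case).
    """
    n = len(boundary)
    if n == 0 or n % 2:
        return False
    half = n // 2
    for start in range(n):                      # try every cyclic shift
        word = boundary[start:] + boundary[:start]
        first, second = word[:half], word[half:]
        for a in range(1, half):                # X is non-empty and not the whole half
            for b in range(0, half - a + 1):    # Y or Z may be empty, never both
                x, y, z = first[:a], first[a:a + b], first[a + b:]
                if second == hat(x) + hat(y) + hat(z):
                    return True
    return False


if __name__ == "__main__":
    shapes = {
        "unit square": "ruld",
        "L-tromino": "rrululdd",
        "U-pentomino": "rrruuldluldd",
    }
    for name, word in shapes.items():
        print(name, tiles_by_translation(word))
```

The unit square and the L-tromino pass the test, whereas the U-pentomino does not: no translated copy can fill its notch without overlapping the original tile.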

Proteins are made by polymerization of 20 different amino acids via a covalent bond called the peptide bond. The ordered amino acids read along the peptide bonds constitute the protein backbone and are referred to as the primary structure or the primary sequence. The secondary structure involves hydrogen bonding to form *α*-helix or pleated-sheet structures. The intricate folding of the polypeptide chain in a globular protein via long-range atomic interactions is referred to as the tertiary structure. Some proteins (hemoglobin, for example) have a quaternary structure: several polypeptide chains are nested together. These proteins are called protein oligomers. The different chains associate by the formation of contact zones called protein interfaces, made of contacts between atoms of the amino acids (Fig. 1A). The majority of proteins are organized as oligomers and not as single monomers (for a review see the chapter in this book by Gotte). Protein oligomers adopt different symmetries and different stoichiometries (number of repeated monomers). According to the PDB (Protein Data Bank [30]), where all available atomic structures of proteins are stored, protein oligomers exist in cyclic, dihedral and cubic point group symmetries. Most proteins have *C*<sub>*n*</sub> or *D*<sub>*n*</sub> symmetry (see [17, 22, 24]) and almost only viral proteins adopt cubic symmetry [35]. High stoichiometry complexes in eukaryotes most often have a *C*<sub>1</sub> symmetry (identity). Some proteins adopt helical symmetry and construct fibers (see [24]).


In this chapter, we investigate the construction of biological fibers by using a mathematical model for fibers. Essentially, the formation of the fiber is considered as the tiling of a cylinder of infinite height, derived from an initial two-dimensional tiling of the plane followed by the translation of a single tile. This single tile could be replaced by an *n*-mer in order to construct more complex fibers. Of course, proteins have three-dimensional shapes (3D structure), and a biological fiber may have a cross section and a more complex internal organization that are not considered here, because the idea is to construct a complete mathematical model for fibers that is applicable to any 3D structure of the protein. This "simplification" is supported biologically because many severe human diseases are associated with the formation of fibers by structurally and functionally unrelated proteins. Notorious examples are Alzheimer's disease (*Aβ*-amyloid), Parkinson's disease (synuclein), cerebral amyloid angiopathy (cystatin C-amyloidosis) and type II diabetes (IAPP, amylin). It is important to realize that fiber formation is also observed in cancer (p53), cardiovascular diseases (transthyretin, serpin) and inflammatory diseases (serpin) (reviewed in [2, 6, 9, 26, 27]). These proteins (indicated in brackets next to each disease) have the fold plasticity to undergo a transition from an oligomeric state to a fiber state, a change that leads to the loss of the protein function and to the pathologies called conformational diseases (Fig. 1B). Because this transition is shared by unrelated proteins, it is reasonable to assume that the change is based on generic binding properties of the interface. The idea of this work is to define the shape constraints of the interfaces of a fiber (infinite-height cylinder) compared to the interfaces of an oligomer (finite-height cylinder), to ultimately address the problem of conformational diseases. The 2D tiling is relevant because the answer lies in the properties of surfaces.

As mentioned, we are particularly interested in determining invariant properties, i.e. properties of the surface that are true for any 3D structure of the protein. In other words, the goal is to trace the local properties (the interface) necessary to carry out a global change. The relation between local properties and quaternary structures is discussed in detail in the work by Claverie, Hofnung and Monod and in the Monod, Changeux, Wyman (MCW) model ([8, 28]).

10.5772/58577

397

http://dx.doi.org/10.5772/58577

From Tilings to Fibers – Bio-mathematical Aspects of Fold Plasticity


Clearly the transition to a fiber involves other determinants besides those of the protein interface, for example the spatial positioning of the interface relative to the main body of the protein, a typical folding problem for which 3D-tiling would be more appropriate. Nevertheless this problem is not addressed in the present chapter, which focuses essentially on the transition from an oligomer to a fiber seen from the protein interface viewpoint. For more information on the mechanisms of assembly and on 3D determinants, one can read the chapter by Claire Lesieur titled "The Assembly of Protein Oligomers: Old Stories and New Perspectives with Graph Theory". For fiber formation, readers can refer to the work by Gebauer [11, 12].

This chapter proposes a mathematical approach to the problem using the principles of tiling and considering the protein interfaces as the boundaries of a tile. The chapter is organized as follows. (1) This introduction. (2) We focus on the usual *Cn* and *Dn* symmetries that build finite height cylinders and we explain the construction of interfaces between chains. (3) We give a formal description of the boundary of a tile by using abstract geometrical objects called polyominoes, and we state the fundamental Theorem of Beauquier-Nivat, which relates the properties of the boundary of a tile to its capacity to tile the plane. (4) We present the regular tilings of the plane, which are tilings constructed by translation along all integral combinations of 2 vectors. (5.1) These two directions of periodicity of the tilings allow us to construct fibers (that are infinite cylinders) and impose the use of either 2 interfaces or 3 interfaces. (5.2) We extend our construction to tile the fiber by *n*-mers instead of a single chain. (5.3) We explain how the fold plasticity is able to make a transition from a finite cylinder to a fiber in the p53 case. (6.1) We investigate the case of non regular tilings in order to design fibers with only 1 direction of periodicity, which implies strong constraints on the shape of the tile. (6.2) We also investigate the transition from a fiber to a tiling of the whole three dimensional space. (7) We conclude by comparing the fiber case to the tiling of the whole 3D space. Throughout this chapter, biological illustrations are presented in parallel with the mathematical descriptions.

**Figure 1. Protein oligomers and fibers. A. Protein oligomer.** As an example, the x-ray structure of the cholera toxin B pentamer (*CtxB*5) is shown (PDB code 3CHB). Each chain is indicated in a different ribbon color. The top and bottom images represent a top view and a side view of the pentamer, respectively. The atomic interactions involved in a protein interface are highlighted in space fill representation showing all atoms. Each chain contributes one domain of the interface, also called segment or side. *CtxB*5 has a *C*<sup>5</sup> symmetry that is well described by a cylinder (bottom image). **B. Oligomer to fiber transition.** Each monomer is indicated by a different color.

#### **2.** *Cn* **and** *Dn* **symmetries on oligomerization**


Oligomers with *Cn* symmetry can be constructed from a single chain replicated around a rotation axis of order *n*, that is by successive rotations of angle 360/*n* degrees (in other words, *n* equals the number of monomers in the oligomer). Such an oligomer adopts a cylindrical surface, with each chain associating with 2 adjacent chains via a single interface (Fig. 1A). The interface is made of 2 interacting domains, also called segments or sides, each one provided by one chain (Fig. 1A). It is important to distinguish clearly between having one interface and having multiple regions, or patches, of interface. An interface can be made of amino acids that are not contiguous along the backbone and therefore make up several "regions" (Fig. 2A). Yet the different regions still constitute a single interface because they are only able to bind to a single side. As an analogy, consider a poster to be stuck on a wall: whether you put several patches of glue on the reverse side of the poster or only one, it will still bind to only one surface.

The *Cn* oligomers have only 1 interface and they polymerize along a single direction, which is why they describe a cylindrical surface (Fig. 1A). An important notion is the number of adjacent chains in the oligomerization. In the *Cn* symmetry, each chain has exactly 2 adjacent chains: the *M*th chain is adjacent to the (*M* − 1)th and the (*M* + 1)th chains, the indices being computed modulo *n* (see Fig. 3). To summarize, *Cn* is an oligomerization with 1 interface (possibly with distinct regions) and with 2 adjacent chains for each chain (see Fig. 3 and Fig. 4).
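The modulo *n* adjacency rule can be illustrated with a minimal sketch. The function name `adjacent_chains` and the 0-based chain numbering are assumptions made for this illustration only; they do not come from the chapter.

```python
def adjacent_chains(m: int, n: int) -> tuple[int, int]:
    """Return the two chains adjacent to chain m in a C_n oligomer.

    Chains are numbered 0 .. n-1 (an assumption of this sketch), and the
    neighbours are chain (m - 1) and chain (m + 1), computed modulo n.
    """
    return ((m - 1) % n, (m + 1) % n)


# Example: a pentamer such as CtxB5 (n = 5).
n = 5
neighbours = {m: adjacent_chains(m, n) for m in range(n)}

# Adjacency is symmetric: if k is adjacent to m, then m is adjacent to k.
symmetric = all(m in adjacent_chains(k, n)
                for m in range(n) for k in neighbours[m])
```

Because of the modulo, the last chain closes the ring by binding back to the first one, which is what makes the cylinder finite.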


**Figure 2. Protein interfaces. A. Left, rectangular schematic of the backbone of a protein oligomer.** Two interacting chains are indicated, chain *M* and *M* + 1. Chain *M* participates in one side of the interface indicated by the red boxes and chain *M* + 1 participates in the complementary sides indicated in green. In the example the protein has 3 distinct regions of interface. Right, Illustration of the 3 regions of interface on the x-ray structure of CtxB5 (see fig. 1A). **B.** *Dn* **symmetry oligomer.** Example with the protein 3GVF (PDB code), a *D*<sup>3</sup> symmetry oligomer made of 6 chains organized as 3 dimers.

A slightly more complicated symmetry is the *Dn* symmetry, which is generated by a rotation axis plus 2-fold axes of symmetry perpendicular to it. Such an oligomer might look as if it polymerizes in 2 directions, because it has one region of interface orthogonal to another one, so the former can bind in a perpendicular direction. However, the "orthogonal" interface cannot grow beyond forming a dimer, so a *Dn* symmetry is in fact similar to a *Cn* symmetry in which a dimer polymerizes instead of a monomer (Fig. 2B).

**Figure 3. Abstract view of** *C*<sup>4</sup> **symmetry.** A tetramer formed by a single interface between the parts $I$ and $\bar{I}$. Each chain is adjacent via the interface to exactly 2 other chains.

**Figure 4. Heptamer with 4H56 PDB code with each chain adjacent to 2 other chains.** Note that in the beta barrel the adjacency between pairs of chains is the same as in the other part of the heptamer. This implies an oligomerization with exactly 1 interface.

### **3. Definitions and notation**


A *polyomino* is a fundamental object in the theory of tilings, introduced by Golomb [16]. Polyominoes are building blocks for tilings and will be for us an abstract version of a single chain in an oligomerization process.

A polyomino is a juxtaposition of unit squares with corners on the **Z**<sup>2</sup> grid (the corners of each unit square are integer points) such that every 2 adjacent squares of the polyomino share a unit segment, and such that any 2 squares of the polyomino can be joined by a path inside the polyomino that moves by unit steps between adjacent unit squares (see Fig. 5).

The simplest polyomino has only one unit square and is called a "mino" (Fig. 6).

There are 2 distinct polyominoes made of 2 adjacent unit squares; they are called "dominoes". Note that a domino can be horizontal or vertical, which is why we find 2 dominoes (Fig. 7).


**Figure 5. Path inside a polyomino**

**Figure 6. The most simple polyomino: the mino**

**Figure 7. The 2 dominoes**

There are 6 polyominoes made of 3 unit squares: a horizontal bar of 3 unit squares, a vertical bar of 3 unit squares and four "*L*-triominoes", that is the "*L*" shape formed by 3 adjacent unit squares and its 3 rotations by 90, 180 and 270 degrees (Fig. 8).

**Figure 8. The 6 triominoes**

Similarly, the number of polyominoes with four unit squares (namely the tetraminoes) is 19.

In addition, the usual definition requires that there is no hole in the interior of a polyomino (Fig. 9).


**Figure 9. Union of unit squares with a hole in the interior**

The number of polyominoes with *n* unit squares is given by sequence A001168 of the On-Line Encyclopedia of Integer Sequences (http://oeis.org/A001168): 1, 2, 6, 19, 63, 216, 760, 2725, 9910, 36446, 135268, 505861, 1903890, 7204874, 27394666, 104592937, 400795844, 1540820542, 5940738676, 22964779660, 88983512783, 345532572678, 1344372335524, 5239988770268, 20457802016011, 79992676367108, 313224032098244, 1228088671826973, ... We recognize the beginning of the sequence: 1 mino, 2 dominoes, 6 triominoes, 19 tetraminoes and so on.
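The first terms of this sequence can be checked by brute force. The following is a minimal sketch, not an efficient enumeration algorithm: it grows polyominoes one unit square at a time and identifies shapes that differ only by a translation. The function names are chosen for this illustration, and the growth procedure does not test for holes, which is harmless for the small sizes used here since a polyomino needs at least 7 unit squares to enclose a hole.

```python
def normalize(cells):
    """Translate a set of unit squares so that its minimal x and y are 0."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells)


def polyominoes(n):
    """Return all polyominoes with n unit squares, up to translation only."""
    shapes = {frozenset({(0, 0)})}  # the mino
    for _ in range(n - 1):
        grown = set()
        for shape in shapes:
            for x, y in shape:
                # Try to attach a new unit square on each side of (x, y).
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    cell = (x + dx, y + dy)
                    if cell not in shape:
                        grown.add(normalize(shape | {cell}))
        shapes = grown
    return shapes


counts = [len(polyominoes(n)) for n in range(1, 6)]
print(counts)
```

Rotated or reflected copies are counted as distinct shapes, in agreement with the 2 dominoes and 6 triominoes described above.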

We do not know whether a closed formula exists (i.e. a mathematical formula depending on *n* that counts the number of distinct polyominoes with *n* unit squares); this is an open research problem for combinatorists.

We have defined polyominoes and now we would like to tile the whole plane by translations of a single polyomino.

Let *P* be a polyomino. A *tiling by translation* of *P* is a covering of the whole plane **R**<sup>2</sup> by translated images of *P* such that there is no hole in the tiling and no overlap. A polyomino that tiles the plane by translation is called a *tile* (Fig. 10).

**Figure 10. Tilings of the plane**


Note that we use only translated images of *P*, and neither rotations nor reflections nor glide reflections.

In order to find a characterization of the polyominoes that tile the plane by translation, we focus on the boundary of a polyomino *P*. We code the paths of the boundary by 4 letters: $a$ represents a left step, $b$ an up step, $\bar{a}$ a right step and $\bar{b}$ a down step.

Now, we take a starting point *o* on the boundary and follow the boundary clockwise, unit segment by unit segment, from *o* until the first return to *o*. A *boundary word of P*, written **b**(*P*), codes this path: starting from the origin *o*, **b**(*P*) is the concatenation of the labels of the boundary unit segments read clockwise. Note that a boundary word cannot contain the factors $a\bar{a}$, $\bar{a}a$, $b\bar{b}$ or $\bar{b}b$; otherwise the coded figure would not be a union of unit squares and thus could not be a polyomino.

For example, a boundary word of the mino is $ab\bar{a}\bar{b}$, and its cyclic permutations $b\bar{a}\bar{b}a$, $\bar{a}\bar{b}ab$ and $\bar{b}ab\bar{a}$ are the other boundary words, obtained by moving the starting point. All these cyclic permutations define the same polyomino (Fig. 11).


**Figure 11. Coding of the boundary of the mino**

Thus, for a boundary word $w = w_1 w_2 \cdots w_n$ where all the $w_i$ are letters, that is $w_i \in \Sigma = \{a, b, \bar{a}, \bar{b}\}$, we define the conjugates of $w$ as follows: if $w = uv$, then $vu$ is a conjugate of $w$. We read on Fig. 11 the boundary word $\bar{a}\bar{b}ab$ of a mino, and we construct the 3 other possible boundary words (obtained by changing the origin of the reading) as conjugates: we obtain $b\bar{a}\bar{b}a$ by taking $u = \bar{a}\bar{b}a$ and $v = b$, $ab\bar{a}\bar{b}$ by taking $u = \bar{a}\bar{b}$ and $v = ab$, and $\bar{b}ab\bar{a}$ by taking $u = \bar{a}$ and $v = \bar{b}ab$.

This notion of boundary word is efficient because each polyomino can be characterized by its boundary word.

For example, the *L*-triomino can be defined directly by its boundary word $b\bar{a}\bar{b}\bar{a}\bar{b}aab$ (Fig. 12). Note that each conjugate of the word $b\bar{a}\bar{b}\bar{a}\bar{b}aab$ leads to the same *L*-triomino shape.

**Figure 12. Coding of the boundary of the** *L***-triomino**

We now define a formal operation on words over the alphabet $\Sigma = \{a, b, \bar{a}, \bar{b}\}$. This operation will be crucial to define interfaces between tiles and, for us, abstract interfaces between chains.

We define the bar operator $w \mapsto \overline{w}$ on words over the alphabet $\Sigma$ as follows. If a word $w$ of length $n$ is written $w = w_1 w_2 \cdots w_n$, where all the $w_i$ are letters, that is $w_i \in \Sigma$, then

$$(i)\ \overline{w} = \overline{w_n}\,\overline{w_{n-1}}\,\cdots\,\overline{w_1}$$

$$(ii)\ \overline{\overline{a}} = a \text{ and } \overline{\overline{b}} = b.$$

Point (i) means that to apply the bar operator to a word $w$, we bar each letter of $w$ read in reverse order. Point (ii) means that applying the bar operator twice to a letter gives back this letter.

In fact, if $w$ is a part of a boundary word read clockwise, then $\overline{w}$ is the same path read anticlockwise. In Fig. 13 we see an example of a path read clockwise, associated with $w = b\bar{a}\bar{b}\bar{a}\bar{b}$, and the same path read anticlockwise, associated with $\overline{w}$. Using point (i), we reverse the word and take the bar of each letter, and thus find $\overline{w} = \bar{\bar{b}}\,\bar{\bar{a}}\,\bar{\bar{b}}\,\bar{\bar{a}}\,\bar{b}$; simplifying according to point (ii), we finally have $\overline{w} = baba\bar{b}$.
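The bar operator is easy to experiment with. In the sketch below, the barred letters $\bar{a}$ and $\bar{b}$ are written as uppercase `A` and `B`, so that barring a letter amounts to swapping its case; this encoding and the function name `bar` are assumptions of the illustration.

```python
def bar(word):
    """Bar operator: read the word in reverse order and bar each letter.

    With the assumed encoding (A = a-bar, B = b-bar), barring a letter
    swaps its case, so point (ii) holds automatically.
    """
    return word[::-1].swapcase()


w = "bABAB"        # b a-bar b-bar a-bar b-bar, the path of Fig. 13
w_bar = bar(w)     # the same path read anticlockwise
print(w_bar)
```

Applying `bar` twice returns the original word, which reflects the fact that reversing the reading direction twice gives back the same path.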

**Figure 13. Coding of a finite path clockwise and anti clockwise**


In a tiling of the plane, each tile is surrounded by a certain number of adjacent copies of *P*. The spirit of the Beauquier-Nivat Theorem [1] is that by carefully investigating a boundary word of *P* we can recover the possible surroundings of *P*.

**Theorem 3.1** (Beauquier, Nivat)**.** *A polyomino P tiles the plane by translation if and only if the boundary word* **b**(*P*) *is equal, up to a cyclic permutation of the symbols, to the factorization* $X \cdot Y \cdot Z \cdot \overline{X} \cdot \overline{Y} \cdot \overline{Z}$*, where one of the variables in the factorization may be empty.*

If the boundary word is equal to $X \cdot Y \cdot Z \cdot \overline{X} \cdot \overline{Y} \cdot \overline{Z}$, such a polyomino is called a *pseudo hexagon*; if it is equal to $X \cdot Y \cdot \overline{X} \cdot \overline{Y}$, such a polyomino is called a *pseudo square*.
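For short boundary words, the condition of the theorem can be tested by exhaustive search. The following sketch is a naive brute force written for illustration, not the method used in [1]: it tries every cyclic permutation of the word and every way of cutting its first half into $X$, $Y$ and $Z$. The barred letters are written as uppercase `A` and `B`, the function names are hypothetical, and the input is assumed to be a valid boundary word of a polyomino.

```python
def bar(word):
    """Bar operator: reverse the word and bar each letter (A = a-bar, B = b-bar)."""
    return word[::-1].swapcase()


def bn_factorization(word):
    """Search for X, Y, Z such that a conjugate of word equals
    X Y Z bar(X) bar(Y) bar(Z), with X and Y non-empty and Z possibly empty.

    Returns the first (X, Y, Z) found, or None if there is none.
    """
    n = len(word)
    if n % 2:
        return None
    half = n // 2  # X Y Z and bar(X) bar(Y) bar(Z) have the same length
    for shift in range(n):
        w = word[shift:] + word[:shift]
        first, second = w[:half], w[half:]
        for i in range(1, half):
            for j in range(1, half - i + 1):
                x, y, z = first[:i], first[i:i + j], first[i + j:]
                if second == bar(x) + bar(y) + bar(z):
                    return x, y, z
    return None


mino = bn_factorization("abAB")            # pseudo square: Z is empty
l_triomino = bn_factorization("bABABaab")  # boundary word of Fig. 12
```

An empty `Z` in the result corresponds to a pseudo square factorization, and a non-empty `Z` to a pseudo hexagon factorization. The search is quadratic in the number of cuts for each of the cyclic permutations, which is acceptable only for small examples such as these.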

#### **Examples of factorizations and associated tilings of the plane.**

A mino has a unique factorization, as a pseudo square: its boundary word is $\mathbf{b}(P) = a\,b\,\bar{a}\,\bar{b}$, thus $X = a$ and $Y = b$, with $\overline{X} = \bar{a}$ and $\overline{Y} = \bar{b}$. In fact a mino tiles in a unique way and has the property that each tile is surrounded by 4 adjacent tiles (Fig. 14).

**Figure 14. Tiling by a mino and the 4 adjacent tiles of the grey mino**

A horizontal domino, with boundary word $\mathbf{b}(P) = a\,a\,b\,\bar{a}\,\bar{a}\,\bar{b}$, has exactly 2 factorizations. One is a pseudo square, obtained by taking $X = aa$ and $Y = b$, so that $\overline{X} = \bar{a}\bar{a}$ and $\overline{Y} = \bar{b}$; it gives a tiling like a "non robust brickwall". In this case each tile is surrounded by 4 adjacent tiles (Fig. 15).

**Figure 15. Tiling of the plane by a domino like a pseudo square and the 4 adjacent tiles of the grey domino**

The other factorization is a pseudo hexagon, obtained by taking $X = a$, $Y = a$, $Z = b$, so that $\overline{X} = \bar{a}$, $\overline{Y} = \bar{a}$ and $\overline{Z} = \bar{b}$; it gives a tiling like a "robust brickwall". In this case each tile is surrounded by 6 adjacent tiles (Fig. 16).

**Figure 16. Tiling of the plane by a domino like a pseudo hexagon and the 6 adjacent tiles of the grey domino**

Note that the difference between the 2 tilings by a horizontal domino is the number of adjacent copies of the polyomino that surround the domino at the origin. Thus a domino tiles either like a pseudo square, if each tile has 4 adjacent tiles, or like a pseudo hexagon, if each tile has 6 adjacent tiles.
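The count of adjacent copies can be checked mechanically. The following is a minimal sketch, not the authors' method: it assumes that the "non robust brickwall" is generated by the translations (2, 0) and (0, 1) and the "robust brickwall" by (1, 1) and (-1, 1), which is how we read Fig. 15 and Fig. 16. The function name `adjacent_copies` is hypothetical.

```python
from itertools import product

def adjacent_copies(cells, v1, v2, reach=4):
    """Count the translated copies cells + i*v1 + j*v2 sharing an edge with cells."""
    cells = set(cells)
    # Unit squares just outside the tile, across one of its edges.
    border = {(x + dx, y + dy)
              for (x, y) in cells
              for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))} - cells
    count = 0
    for i, j in product(range(-reach, reach + 1), repeat=2):
        if (i, j) == (0, 0):
            continue  # skip the tile itself
        tx, ty = i * v1[0] + j * v2[0], i * v1[1] + j * v2[1]
        copy = {(x + tx, y + ty) for (x, y) in cells}
        if copy & border:  # the copy touches the tile along an edge
            count += 1
    return count

domino = [(0, 0), (1, 0)]  # horizontal domino as two unit squares
print(adjacent_copies(domino, (2, 0), (0, 1)))   # pseudo square tiling: 4
print(adjacent_copies(domino, (1, 1), (-1, 1)))  # pseudo hexagon tiling: 6
```

Copies that only touch the tile at a corner are not counted, which matches the notion of adjacency used for the interfaces.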

For the thin cross with 9 unit squares, with boundary word $a\,a\,b\,\bar{a}\,\bar{a}\,b\,b\,\bar{a}\,\bar{b}\,\bar{b}\,\bar{a}\,\bar{a}\,\bar{b}\,a\,a\,\bar{b}\,\bar{b}\,a\,b\,b$, it is impossible to find a factorization, and this proves that this thin cross does not tile the plane (the square with the symbol "?" cannot be covered) (Fig. 17).

**Figure 17. A thin cross that doesn't tile the plane**

The triomino L, with boundary word $b\,\bar{a}\,\bar{b}\,\bar{a}\,\bar{b}\,a\,a\,b$, has only one factorization, as a pseudo hexagon; thus it tiles the plane in a unique way (Fig. 18).

**Figure 18. The** *L***-triomino tiles the plane like a pseudo hexagon**

The little thin cross with 5 unit squares, with boundary word $a\,b\,\bar{a}\,b\,\bar{a}\,\bar{b}\,\bar{a}\,\bar{b}\,a\,\bar{b}\,a\,b$, has 2 factorizations as a pseudo square and thus 2 distinct ways of tiling, according to the 2 factorizations with $X = aba$, $Y = b\bar{a}b$ and with $X = bab$, $Y = \bar{a}b\bar{a}$ (Fig. 19).

To summarize, for each polyomino we are able to decide if it tiles the plane by translation by considering the factorizations of its boundary word. Now, we investigate regular tilings of the plane.
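The decision procedure above can be sketched as a brute-force search over cyclic rotations and cut points. This is only an illustration under stated assumptions: barred letters are encoded as upper case (`A` for $\bar{a}$, `B` for $\bar{b}$), the function names `bar` and `factorizations` are hypothetical, and no attempt is made to match the efficient algorithms of the literature.

```python
def bar(word):
    """Bar operator: read the word in reverse order and bar each letter."""
    return word[::-1].swapcase()

def factorizations(word):
    """Yield (kind, X, Y, Z) for each rotation of word equal to
    X Y Z bar(X) bar(Y) bar(Z), with Z possibly empty."""
    n = len(word)
    if n % 2:
        return
    half = n // 2
    for r in range(n):                      # every cyclic permutation
        w = word[r:] + word[:r]
        for i in range(1, half):            # |X| >= 1
            for j in range(1, half - i + 1):  # |Y| >= 1, |Z| >= 0
                X, Y, Z = w[:i], w[i:i + j], w[i + j:half]
                if w[half:] == bar(X) + bar(Y) + bar(Z):
                    yield ("hexagon" if Z else "square", X, Y, Z)

def kinds(word):
    """Set of factorization types found for a boundary word."""
    return {kind for kind, *_ in factorizations(word)}

print(kinds("abAB"))                  # mino: pseudo square only
print(kinds("aabAAB"))                # domino: pseudo square and pseudo hexagon
print(kinds("bABABaab"))              # triomino L: pseudo hexagon only
print(kinds("aabAAbbABBAABaaBBabb"))  # thin cross with 9 squares: empty set
```

The same factorization is reported once per rotation that realizes it, so the generator is meant for deciding which types exist, not for counting distinct tilings.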

**Figure 19. The 2 tilings like a pseudo square by a thin cross with 5 unit squares**

#### **4. Regular tilings by translation**

The theorem of Beauquier-Nivat gives a correspondence between the way of tiling and the factorization, that is, the number of copies of the polyomino that surround a given polyomino. For the mino, we have 4 copies to cover the whole boundary (because a unit square is a pseudo square that tiles like a square). For the triomino L we have 6 copies to cover the whole boundary (because a triomino L is a pseudo hexagon that tiles like a hexagon). In fact we extend the local correspondence between the coding of a part of the boundary word and a geometrical translation: $\vec{v}_1$ translates $X$ to $\overline{X}$ and $\vec{v}_2$ translates $Y$ to $\overline{Y}$ (and possibly $\vec{v}_3$ translates $Z$ to $\overline{Z}$). Note that we have 2 or 3 translations according to the pseudo square or pseudo hexagon case, respectively, and that by extending these translations to the whole tiling each polyomino has the same local surrounding.

It is interesting to introduce the notion of regular tiling. In a regular tiling the surrounding by adjacent tiles is the same for each polyomino. The mino and the triomino *L* tile only in a regular way by extending the local vectors to the whole tiling.

A *regular tiling* is a tiling by translation of a polyomino *P* such that each tile in the tiling has the same surrounding by translated copies of the tile *P* according to a given factorization of its boundary word. Each factorization leads to a regular tiling of the plane by translation by extending the local translation to the whole plane in the following way.

If *P* is a pseudo square, the factorization $\mathbf{b}(P) = X \cdot Y \cdot \overline{X} \cdot \overline{Y}$ defines 4 sides of the tile, where the sides in correspondence are identified by the pairings $(X, \overline{X})$ and $(Y, \overline{Y})$. The translations $\vec{v}_1$, $\vec{v}_2$ corresponding to these pairings allow us to tile the whole plane in a regular way, by applying integral combinations of the translations $\vec{v}_1$ and $\vec{v}_2$ to *P* in order to generate the whole tiling (Fig. 14 or Fig. 15).

In the case of a pseudo hexagon the construction with 6 sides is similar. If *P* is a pseudo hexagon, the factorization $\mathbf{b}(P) = X \cdot Y \cdot Z \cdot \overline{X} \cdot \overline{Y} \cdot \overline{Z}$ defines 6 sides of the tile, where the sides in correspondence are identified by the pairings $(X, \overline{X})$, $(Y, \overline{Y})$ and $(Z, \overline{Z})$. The translations $\vec{v}_1$, $\vec{v}_2$ and $\vec{v}_3$ corresponding to these pairings allow us to tile the whole plane in a regular way, by applying integral combinations of the translations $\vec{v}_1$ and $\vec{v}_2$ to *P* in order to generate the whole tiling (Fig. 16 or Fig. 18). Note that the third vector of translation $\vec{v}_3$ is an integral combination of $\vec{v}_1$ and $\vec{v}_2$ ($\vec{v}_3 = \vec{v}_1 - \vec{v}_2$ in Fig. 16).

Observe that 2 distinct factorizations of the boundary word of *P* give 2 distinct regular tilings of the plane (Fig. 19 in the case of 4 adjacent tiles for each tile, or Fig. 20 in the case of 6 adjacent tiles for each tile). Be careful: while a regular tiling is always given by integral combinations of 2 vectors $\vec{v}_1$ and $\vec{v}_2$, the boundary of each tile in the pseudo hexagon case uses 6 sides, in correspondence 2-by-2 with the translations $\vec{v}_1$, $\vec{v}_2$ and $\vec{v}_3$. Are you able to recover the 6 adjacent polyominoes of the grey polyomino of Fig. 20? Hint: in Fig. 20 one adjacency is only by a single vertical step (for the left regular tiling) or a single horizontal step (for the right regular tiling).

**Figure 20. Two regular tilings of the plane by the same polyomino**
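As an illustration of how a factorization yields the translations of a regular tiling, the sketch below computes the vectors that carry each side onto its barred partner and checks, on a finite window, that the translated copies cover every unit square exactly once. It is a sketch under assumptions: barred letters are written in upper case, the tile is given explicitly as a list of unit squares, the names `side_translations` and `covers_exactly_once` are hypothetical, and the sign and numbering of the vectors are a convention of this sketch that may differ from the figures.

```python
from itertools import product

STEP = {"a": (1, 0), "A": (-1, 0), "b": (0, 1), "B": (0, -1)}

def displacement(word):
    """Sum of the unit steps of a word (upper case = barred letter)."""
    return (sum(STEP[c][0] for c in word), sum(STEP[c][1] for c in word))

def side_translations(X, Y, Z=""):
    """Translations carrying the sides X, Y (and Z) onto their barred partners
    for a boundary word X Y Z bar(X) bar(Y) bar(Z)."""
    x, y, z = displacement(X), displacement(Y), displacement(Z)
    v1 = (y[0] + z[0], y[1] + z[1])      # X onto bar(X)
    v2 = (z[0] - x[0], z[1] - x[1])      # Y onto bar(Y)
    if not Z:
        return [v1, v2]
    v3 = (-x[0] - y[0], -x[1] - y[1])    # Z onto bar(Z)
    return [v1, v2, v3]

def covers_exactly_once(cells, v1, v2, window=6, reach=12):
    """Check that the copies cells + i*v1 + j*v2 cover each unit square
    of the window [-window, window]^2 exactly once."""
    count = {}
    for i, j in product(range(-reach, reach + 1), repeat=2):
        tx, ty = i * v1[0] + j * v2[0], i * v1[1] + j * v2[1]
        for cx, cy in cells:
            p = (cx + tx, cy + ty)
            if abs(p[0]) <= window and abs(p[1]) <= window:
                count[p] = count.get(p, 0) + 1
    return all(count.get((x, y), 0) == 1
               for x in range(-window, window + 1)
               for y in range(-window, window + 1))

cross = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # thin cross, 5 squares
v1, v2 = side_translations("aba", "bAb")
print(v1, v2, covers_exactly_once(cross, v1, v2))

domino = [(0, 0), (1, 0)]
v1, v2, v3 = side_translations("a", "a", "b")
print(v1, v2, v3, covers_exactly_once(domino, v1, v2))
```

In the pseudo hexagon case the third vector is redundant for generating the tiling: with the convention used here, `v3` equals `v2 - v1`, in line with the remark that $\vec{v}_3$ is an integral combination of the two others.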

#### **5. From tilings of the plane to tilings of a fiber**

#### **5.1. Regular tiling case**

From now on, we deal with the form of the oligomerization of proteins, either like a cylinder or like a fiber. For us a cylinder has finite height (Fig. 1A) and a fiber is an object of possibly non finite height (Fig. 25).

In our model we would like to construct a fiber (that is, a cylinder of non finite height) by a transformation of regular tilings. As the tiling is regular, the whole tiling is invariant by translation by the 2 non null and non collinear vectors $\vec{v}_1$ and $\vec{v}_2$ defined in the previous section. $\vec{v}_1$ and $\vec{v}_2$ are the translation vectors that respectively send $X$ to $\overline{X}$ and $Y$ to $\overline{Y}$. In mathematics the parallelogram constructed on $\vec{v}_1$ and $\vec{v}_2$ at a given integer point is called the fundamental domain, that is, the minimal part of the tiling used to construct the whole tiling by integral combinations of $\vec{v}_1$ and $\vec{v}_2$.

Let $\vec{e}_1$ be a horizontal unit vector. We choose a direction parallel to $\vec{e}_1$, namely $m\vec{v}_1 + n\vec{v}_2 = k\vec{e}_1$ with *m* and *n* given integers. Then we construct the little circle (that is, by the usual definition, the horizontal circle of the cylinder), whose perimeter is equal to *k*, by superposing the tiles of the original tiling and the tiles translated by $k\vec{e}_1$. Indeed, as the tiling is invariant by translation by integral combinations of $\vec{v}_1$ and $\vec{v}_2$, we are able to recover the same shape by using the translation $m\vec{v}_1 + n\vec{v}_2$ (which is by construction equal to $k\vec{e}_1$). Fig. 21 shows an example of a cylinder constructed with a regular tiling by a mino, using the identification of the two bold vertical borders by the translation $8\vec{v}_1$ (here, for a mino, $\vec{v}_1 = \vec{e}_1$). In the associated cylinder the two vertical borders are collapsed into a single border, and this transformation is general as soon as we identify in the tiling two infinite borders in correspondence by a translation of vector $m\vec{v}_1 + n\vec{v}_2$.

**Figure 21. From tiling to cylinder by using the translation of 8 times the vector $\vec{e}_1$.**

In the next example (Fig. 22), the vectors $\vec{v}_1$ and $\vec{v}_2$ send respectively $X = b\bar{a}b$ to $\overline{X} = \bar{b}a\bar{b}$ and $Y = aba$ to $\overline{Y} = \bar{a}\bar{b}\bar{a}$. In order to find a horizontal direction we take 2 times the vector $\vec{v}_1$ and 1 time the vector $\vec{v}_2$. Note that the distance between the point *O* and the point *O* translated by $2\vec{v}_1 + \vec{v}_2$ is exactly 5. If we want a little circle with perimeter 10 we consider the translation $2(2\vec{v}_1 + \vec{v}_2)$, that is, 4 times $\vec{v}_1$ and 2 times $\vec{v}_2$. Now we make a fiber by rolling up the plane and identifying the 2 sides in bold (each is a border of the tiling given by a staircase, and the two are in correspondence by the translation of vector $4\vec{v}_1 + 2\vec{v}_2$) (Fig. 22). In order to have a more general construction, and in particular to construct a twist in the fiber, we could use a vector $\vec{r}_1$ with rational coordinates instead of $\vec{e}_1$ and find a direction parallel to $\vec{r}_1$ given by the equation $m\vec{v}_1 + n\vec{v}_2 = k\vec{r}_1$.
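The search for a horizontal direction and the rolling up of the plane can be sketched as follows. The coordinates $\vec{v}_1 = (2, 1)$ and $\vec{v}_2 = (1, -2)$ are an assumption of this sketch: they are one choice consistent with the statement that $2\vec{v}_1 + \vec{v}_2$ is horizontal of length 5 for the thin cross with 5 unit squares. The function names are hypothetical.

```python
from math import gcd

def horizontal_period(v1, v2):
    """Return (m, n, k) with k > 0 minimal such that m*v1 + n*v2 = (k, 0)."""
    (x1, y1), (x2, y2) = v1, v2
    if x1 * y2 - x2 * y1 == 0:
        raise ValueError("v1 and v2 must not be collinear")
    g = gcd(y1, y2)               # positive, since the vectors are not collinear
    m, n = y2 // g, -y1 // g      # guarantees m*y1 + n*y2 == 0
    k = m * x1 + n * x2
    if k < 0:
        m, n, k = -m, -n, -k
    return m, n, k

def wrapped_owners(cells, v1, v2, perimeter, height, reach=20):
    """Roll the tiling onto a cylinder: map each unit square (x mod perimeter, y)
    of a band of the given height to the set of wrapped tiles covering it."""
    owners = {}
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            tx = i * v1[0] + j * v2[0]
            ty = i * v1[1] + j * v2[1]
            tile = (tx % perimeter, ty)   # tiles differing by the period coincide
            for cx, cy in cells:
                x, y = cx + tx, cy + ty
                if 0 <= y < height:
                    owners.setdefault((x % perimeter, y), set()).add(tile)
    return owners

cross = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
v1, v2 = (2, 1), (1, -2)              # assumed coordinates, see the lead-in
m, n, k = horizontal_period(v1, v2)
print(m, n, k)                        # 2 1 5
owners = wrapped_owners(cross, v1, v2, perimeter=2 * k, height=4)
print(len(owners), all(len(t) == 1 for t in owners.values()))
```

Every unit square of the band is owned by exactly one wrapped tile, which is the discrete counterpart of identifying the two bold borders: the translation by the perimeter belongs to the lattice, so the rolled-up tiling is well defined.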

Conversely, if a polyomino tiles a fiber by translation, we recover the whole tiling by finding the little circle and the invariance of the fiber by the translation $k\vec{e}_1$, then considering in $\mathbb{R}^2$ the strip at the origin and adding all the strips of $\mathbb{R}^2$ obtained by the translations $\ell k\vec{e}_1$ with $\ell \in \mathbb{Z}$.

To summarize, in order to construct a model of fiber using a regular tiling by a single polyomino, the polyomino must tile the plane in a regular way and thus must have at least one factorization as a pseudo square or as a pseudo hexagon. In the fiber, if the tiling is regular, then the number of adjacent copies that surround each tile is either 4 or 6. This implies that the number of distinct interfaces is 2 or 3.

Of course the model is very flat, and in real fibers the molecules are in a 3 dimensional space. Nevertheless this model allows us to make hypotheses about the number of interface sites, and we try to find either 2 or 3 interface sites in the fiber.

**Figure 22. Two boundaries in correspondence by the translation $4\vec{v}_1 + 2\vec{v}_2$.**

We have explained that to build a fiber with the shape of a cylinder of non finite height, it is necessary to have 2 vectors, namely 2 directions of polymerization growth. We have also seen that each tile of the fiber has 4 or 6 adjacent chains, formed via 2 or 3 interfaces, respectively. There are examples of such fibers tiled with a single protein: the famous tobacco mosaic virus, which was the first virus to be discovered, has 4 adjacent chains to each chain (see Fig. 23 for the abstract view and Fig. 24 for a part of a real fiber), and the polymerization in fiber with name 3J1R has 6 adjacent chains to each chain (see Fig. 25 for the abstract view and Fig. 26 for a part of a real fiber).

#### **5.2. Tiling fibers by** *n***-mers**

This chapter is an investigation of the combinatorial possibilities for the construction of the fiber. Either a fiber is composed by a regular tiling by a single chain (Fig. 23 for the abstract view and Fig. 24 for a part of a real fiber with a pseudo square tiling; Fig. 25 for the abstract view and Fig. 26 for a part of a real fiber with a pseudo hexagon tiling), or the regular tiling of the fiber is composed by an *n*-mer (in mathematics we call this object a meta-tile). Thus globally the fiber is a tiling by a meta-tile, and this meta-tile could be a single chain or an *n*-mer.

In real life we find fibers tiled either by a single chain or by an *n*-mer. In Fig. 27 we see a real fiber with name 3J2U constructed with a tetramer of three chains, namely A, B and 2 times the chain C. The tetramer is used to tile the fiber of pseudo hexagon type, like in Fig. 25, by substituting each pseudo hexagon by the tetramer *A B C C*. You will notice in the next section, about p53, that the final construction of the fiber uses a tiling by dimers (see Fig. 33D).

Another example is the construction using oligonucleotides of DNA inspired by the seminal work of Winfree [37]. These techniques lead to nanotubes composed of dimers surrounded by either 4 adjacent dimers [23] or 6 adjacent dimers [39].

Conway has a criterion for tiling the plane in the case of a tile and a rotation of the tile by 180 degrees. In his article, he forms a meta-tile using the tile *T* and its rotation Rot180(*T*), and this meta-tile is able to tile the plane by translation. Guy Cousineau wrote an article to explain the combinatorics of tilings with axial symmetry and translation (see the chapter of Cousineau in this book). In real life, fibers could be constructed for example with dimers. In general the dimer in the fiber is either invariant by a rotation of 180 degrees or invariant by an axial symmetry, and then the dimer is used to tile a fiber by translation (see Fig. 28). Thus our model allows us first to make an interface for forming the dimer, and then to make a fiber by translation of the dimer according to the tiling constraints, that is, by having exactly 2 or 3 interfaces for the tiling of the fiber by the dimer. To recap, we have constructed either 3 interfaces (one for the dimer and 2 for a tiling like a pseudo square) or 4 interfaces (one for the dimer and 3 for a tiling like a pseudo hexagon).

**Figure 23. Fiber with a pseudo square shape.** Each tile is surrounded by 4 tiles.

**Figure 24. Tobacco Mosaic Virus with 2TMV PDB code: an example of tiling of a fiber with 4 adjacent chains.**

**Figure 25. Fiber with a pseudo hexagon shape.** Each tile is surrounded by 6 tiles. Remark that pseudo hexagon shapes appear in particular when there is a tilt on the fiber.

**Figure 26. Fiber with 3J1R PDB code: an example of tiling of a fiber with 6 adjacent chains.**

For higher stoichiometry, we construct a *n*-mer invariant by rotation of 360/*n* degrees and we copy this meta-tile in order to tile the whole fiber (see Fig. 29 for an abstract view of the replacement by a tetramer).

According to the previous tiling constraints, as the number of interface sites is 2 or 3, it is easier to make a fiber with a monomer, dimer, trimer, tetramer or hexamer, because we combine the shape of the *n*-mers with the form of the interfaces in order to build the whole fiber. We think that the best values are 1, 2, 3, 4 and 6 because these values are also in accordance with the crystallographic constraints (the laws of crystallography [31] allow only periodic tilings with rotations of order 1, 2, 3, 4 and 6, that is by rotations of 360, 180, 120, 90 or 60

10.5772/58577

413

http://dx.doi.org/10.5772/58577

From Tilings to Fibers – Bio-mathematical Aspects of Fold Plasticity

**Figure 29. Replacement of each pseudo square by a tetramer.**

**5.3. From fold plasticity to fibers: the P53 case**

(PDB 3KIF, Fig. 32) (see [38]).

constructed in synthetic biology.

fiber is given by symmetrical *n*-mers. On the other hand, dimers of pentamer or heptamers are able to fit with some deformations a hexagonal tile ie not all monomers are identical (Fig. 31). Such a dimer of pentamer has been observed in nature e.g. synthetic tachylectin

Likewise the pore forming toxin aerolysin can form a dimer of heptamer after a single amino acid mutation (see [33]). It remains to be established whether the dimeric versions of the pentamer and heptamer are more prompt to form fibers than their single oligomeric counterpart. Of course the combinatoric is more and more complicated for *n* > 7 and to construct the fiber the position of the interfaces on the pseudo square or on the pseudo hexagon is split in different parts for *n* > 7 and this construction is very constrained and maybe this phenomenon is difficult to see in real life. Nevertheless such fibers could be

We give some examples of replacements of a pseudo square by a dimer, by a tetramer and by an octamer and of a pseudo hexagon by a dimer, by a trimer and by a hexamer (see Fig. 30).

Notice that the main difference between tiling a finite height cylinder and tiling a fiber (an infinite height cylinder) is the number of directions of the polymerization growth, one in the former case and two in the latter case. Thus a transition from a cylinder to a fiber implies a transition from a single interface to 2 (4 adjacent chains) or 3 interfaces (6 adjacent chains). For this to occur one simple possibility is to start from an oligomer which has 2 regions of interfaces and loosen the constraints on one so it opens to bind in another direction. Common

**Figure 27. Fiber with 3J2U PBD code containing a tetramer inside each pseudo hexagon**

**Figure 28. Fiber with a dimer inside each pseudo hexagon**

degrees). Of course accordingly a transition from pentamer or heptamer (*n* = 5 or 7) to fiber would be expected to be more difficult at least with the mechanism described in this chapter. In this direction quasicrystals come from aperiodical tilings of the whole plane and non periodical structures could appear in viruses [35], nevertheless non periodical ways to form a fiber by translation of a single tile is described in the next section. Indeed, to form Penrose like tilings we must have two tiles with aperiodic arrangements [5, 31] and it is still an open problem to find an aperiodic tiling of the whole plane with a single tile [15] (we expect by the theorem of Beauquier-Nivat that the boundary of such tile (if it exists) must be fractal...). In our fiber model the rules seem to be restrictive because such symmetry fits directly without deformation into a square or hexagonal tile. Of course in real life, the *n*-mers could be symmetrical ([17, 22, 24]) but also non-symmetrical ([4, 32]). We think that for tiling constraints on the border of each *n*-mers the less expansive solution in order to construct a

**Figure 29. Replacement of each pseudo square by a tetramer.**

18

**Figure 27. Fiber with 3J2U PBD code containing a tetramer inside each pseudo hexagon**

degrees). Of course accordingly a transition from pentamer or heptamer (*n* = 5 or 7) to fiber would be expected to be more difficult at least with the mechanism described in this chapter. In this direction quasicrystals come from aperiodical tilings of the whole plane and non periodical structures could appear in viruses [35], nevertheless non periodical ways to form a fiber by translation of a single tile is described in the next section. Indeed, to form Penrose like tilings we must have two tiles with aperiodic arrangements [5, 31] and it is still an open problem to find an aperiodic tiling of the whole plane with a single tile [15] (we expect by the theorem of Beauquier-Nivat that the boundary of such tile (if it exists) must be fractal...). In our fiber model the rules seem to be restrictive because such symmetry fits directly without deformation into a square or hexagonal tile. Of course in real life, the *n*-mers could be symmetrical ([17, 22, 24]) but also non-symmetrical ([4, 32]). We think that for tiling constraints on the border of each *n*-mers the less expansive solution in order to construct a

**Figure 28. Fiber with a dimer inside each pseudo hexagon**

fiber is given by symmetrical *n*-mers. On the other hand, dimers of pentamer or heptamers are able to fit with some deformations a hexagonal tile ie not all monomers are identical (Fig. 31). Such a dimer of pentamer has been observed in nature e.g. synthetic tachylectin (PDB 3KIF, Fig. 32) (see [38]).

Likewise, the pore-forming toxin aerolysin can form a dimer of heptamers after a single amino acid mutation (see [33]). It remains to be established whether the dimeric versions of the pentamer and heptamer are more prone to form fibers than their single oligomeric counterparts. Of course, the combinatorics become more and more complicated for *n* > 7: to construct the fiber, the position of the interfaces on the pseudo square or on the pseudo hexagon is split into different parts. This construction is very constrained, and the phenomenon may be difficult to observe in real life. Nevertheless such fibers could be constructed in synthetic biology.

We give some examples of replacements of a pseudo square by a dimer, by a tetramer and by an octamer and of a pseudo hexagon by a dimer, by a trimer and by a hexamer (see Fig. 30).
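The crystallographic restriction invoked above to single out *n* = 1, 2, 3, 4 and 6 can be checked numerically. The sketch below assumes the standard trace argument, which is not spelled out in this chapter: a rotation of order *n* that preserves a planar lattice has an integer matrix in a lattice basis, so its trace 2cos(2π/*n*) must be an integer. The function name and the tolerance are illustrative choices.

```python
import math


def is_crystallographic(n: int, tol: float = 1e-9) -> bool:
    """True if a rotation of 360/n degrees can preserve a planar lattice.

    Assumption (standard trace argument): the rotation has an integer matrix
    in a lattice basis, so its trace 2*cos(2*pi/n) must be an integer.
    """
    trace = 2.0 * math.cos(2.0 * math.pi / n)
    return abs(trace - round(trace)) < tol


allowed = [n for n in range(1, 13) if is_crystallographic(n)]
print(allowed)  # orders compatible with a periodic tiling
```

Among the orders 1 to 12 only 1, 2, 3, 4 and 6 pass the test, which is why pentamers and heptamers are treated separately in this section.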

#### **5.3. From fold plasticity to fibers: the p53 case**

Notice that the main difference between tiling a finite height cylinder and tiling a fiber (an infinite height cylinder) is the number of directions of polymerization growth: one in the former case and two in the latter. Thus a transition from a cylinder to a fiber implies a transition from a single interface to 2 interfaces (4 adjacent chains) or 3 interfaces (6 adjacent chains). For this to occur, one simple possibility is to start from an oligomer which has 2 regions of interfaces and to loosen the constraints on one of them so that it opens to bind in another direction. Common biological examples of such a transition are domain swapping ([9, 25]), where the presence of a flexible linker frees one interface for fiber formation.

Another possibility is to start from a dimer with 2 regions of interfaces and again loosen one of the regions, but this time via a mutation. If the modified residue controlled the interactions between the 2 regions, its modification may loosen one region, opening it to form a second interface. Since the initial molecule is a dimer, the mutation will free 2 new interfaces, enabling growth in 2 directions. The formation of the fiber will depend on the symmetry of the dimer. To illustrate this possibility, let us consider the p53 tetramer (see Fig. 33). Several familial point mutations, such as G334V or a mutation of R337, are responsible for the transition of p53 into fibers or into a non-native oligomeric state, leading to impairment of the protein function and to cancer development [27]. The p53 tetramer has $D_2$ symmetry and is a dimer of dimers. Each monomer is made of a *β*-strand, a small helix and a long *α*-helix (Fig. 33B). Contacts exist between the residue R337 of the small helix and the residues E349 and D352 of the long helix of the adjacent chain within a dimer of p53 (Fig. 33A). There are also contacts between the residues L350 and K351 of one monomer of a dimer and the same residues on a monomer of the other dimer (Fig. 33A). The mutation of R337 might inhibit the formation of the small helix and replace it by a more flexible linker, which would be too mobile to allow the contacts with the residues E349 and D352 and might subsequently also loosen the contacts between L350 and K351 (Fig. 33B and C). As a consequence, the relative position of the *β*-strand and the long *α*-helix would be altered, allowing the formation of 2 new interfaces (Fig. 33D).

#### **6. Non periodical fiber case and tiling of the whole space case**

#### **6.1. From 1-periodical tiling to fibers**


**Figure 30. Replacement of a pseudo square and of a pseudo hexagon.**

**Figure 31. Replacement of pseudo square by 2 non regular pentamers.**

**Figure 32. Double asymmetrical pentamer.**

In fact, all the preceding constructions are based on regular tilings of the plane. We are able to relax this constraint and to investigate non regular tilings (Fig. 34). Non regular tilings of the plane are constructed by perturbation of regular tilings: we push some strips in the direction $v_1$ in order to break the periodicity in the direction $v_2$. The seminal article of Beauquier and Nivat [1] proves an important proposition: a tiling (regular or not) by translation of a polyomino is always 1-periodic, that is, invariant under a translation by a non null vector. This proposition is crucial for us because it allows us to take a non regular tiling by a polyomino *P* and to map it onto a fiber.

We take a given tiling *T* by translation of a given polyomino *P*. As the tiling *T* is 1-periodic, it is invariant under translation by a vector $v_1$. Thus the translation of *T* by $v_1$, denoted $\mathrm{Trans}_{v_1}(T)$, is equal to *T* (to be more formal and to define equality between tilings we should introduce equivalence classes of tilings up to translation, but this is too formal for a chapter written for biologists). By composition of translations, translating *T* by *k* times the vector $v_1$ also gives *T*, that is, $\mathrm{Trans}_{k v_1}(T) = T$. Thus globally the tiling *T* is invariant under the translations $k v_1$ for $k \in \mathbb{N}$.
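The composition step can be written out as a short induction on $k$. This is only a sketch: it uses nothing beyond the invariance $\mathrm{Trans}_{v_1}(T) = T$ and the fact that translating by $(k+1)v_1$ is the same as translating by $k v_1$ and then by $v_1$.

```latex
\begin{align*}
\mathrm{Trans}_{0 \cdot v_1}(T) &= T
  && \text{(translation by the null vector)} \\
\mathrm{Trans}_{(k+1) v_1}(T)
  &= \mathrm{Trans}_{v_1}\bigl(\mathrm{Trans}_{k v_1}(T)\bigr)
  && \text{(composition of translations)} \\
  &= \mathrm{Trans}_{v_1}(T)
  && \text{(induction hypothesis)} \\
  &= T
  && \text{(1-periodicity of } T\text{)}
\end{align*}
```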

This global invariance of the tiling *T* implies a strong property of the boundary word of *P*. In fact, the original article of Beauquier and Nivat contains the following proposition: if the tiling *T* is 1-periodic, then the boundary word is of the form $X W \bar{X} V$ with $W = U^r$ and $V = \bar{U}^r$. Thus the periodicity property can be read locally in the boundary word.


**Figure 33. Transition from oligomer to fiber, the example of the p53 case. A. The p53 tetrameric domain is made of 2 dimers.** Each monomer is made of a *β*-strand followed by a small helix and ending with a long *α*-helix parallel to the *β*-strand (PDB code 1SAK). The backbone of each monomer is indicated by a different color. The spacefill amino acids are the residues in contact with R337, a residue whose mutation is related to some cancer development. **B. Backbone representation of the p53 dimer.** The model proposes that upon mutation of R337, connections between the 2 monomers within a dimer are loosened, enabling free movement of the long *α*-helix relative to the *β*-strand. C. Schematic of the open p53 dimer. D. As the structural change takes place in the four monomers, it offers enough new interfaces for growth in 2 directions, $V_1$ and $V_2$.

**Figure 34. Non regular tiling with dominoes.** The grey domino is surrounded by 5 dominoes, thus this tiling is certainly not regular. Note that the horizontal strips are invariant under the horizontal vector $v_1$ and also under the vector $3v_1$. Thus we are able to map the left border to the right border in order to form an infinite height cylinder.

For example, for a bar of 5 unit squares with boundary word $a\,bbbbb\,\bar{a}\,\bar{b}\bar{b}\bar{b}\bar{b}\bar{b}$, we have several factorizations. There is one factorization as a pseudo square, with $X = a$, $Y = bbbbb$, $\bar{X} = \bar{a}$ and $\bar{Y} = \bar{b}\bar{b}\bar{b}\bar{b}\bar{b}$, and there are 4 factorizations as a pseudo hexagon: $X = a$, $Y = b$, $Z = bbbb$, $\bar{X} = \bar{a}$, $\bar{Y} = \bar{b}$, $\bar{Z} = \bar{b}\bar{b}\bar{b}\bar{b}$; or $X = a$, $Y = bb$, $Z = bbb$, $\bar{X} = \bar{a}$, $\bar{Y} = \bar{b}\bar{b}$, $\bar{Z} = \bar{b}\bar{b}\bar{b}$; or ... or $X = a$, $Y = bbbb$, $Z = b$, $\bar{X} = \bar{a}$, $\bar{Y} = \bar{b}\bar{b}\bar{b}\bar{b}$, $\bar{Z} = \bar{b}$. In this example the periodic part of the boundary word is $U^5 = bbbbb$ with $U = b$, and $\bar{U}^5 = \bar{b}\bar{b}\bar{b}\bar{b}\bar{b}$. Note that in this case $U^5 = bbbbb$ appears in the pseudo square factorization.

Another example, without factorization as a pseudo square, is a polyomino constructed as a union of 3 L-triominoes, forming a polyomino with 9 unit squares and boundary word $\bar{b}\,aababab\,\bar{a}b\,\bar{a}\bar{b}\bar{a}\bar{b}\bar{a}\bar{b}$. Here the factorizations are $X = \bar{b}a$, $Y = a$, $Z = babab$, $\bar{X} = \bar{a}b$, $\bar{Y} = \bar{a}$, $\bar{Z} = \bar{b}\bar{a}\bar{b}\bar{a}\bar{b}$; or $X = \bar{b}a$, $Y = aba$, $Z = bab$, $\bar{X} = \bar{a}b$, $\bar{Y} = \bar{a}\bar{b}\bar{a}$, $\bar{Z} = \bar{b}\bar{a}\bar{b}$; or $X = \bar{b}a$, $Y = ababa$, $Z = b$, $\bar{X} = \bar{a}b$, $\bar{Y} = \bar{a}\bar{b}\bar{a}\bar{b}\bar{a}$, $\bar{Z} = \bar{b}$. In this case $U^3 = ababab$ with $r = 3$ and $U = ab$, and $\bar{U}^3 = \bar{b}\bar{a}\bar{b}\bar{a}\bar{b}\bar{a}$ with $r = 3$ and $\bar{U} = \bar{b}\bar{a}$. Note that in this case $U^3$ does not appear as a single factor of the boundary word factorization but is split into 2 parts: the first is $Y$ and the second is $Z$, because $W = U^3 = YZ$. Note in the picture the periodic part in a staircase shape (Fig. 35).
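The factorizations listed in these two examples can be enumerated mechanically. The following sketch makes several assumptions that are not in the chapter: barred letters are encoded as upper-case letters (`A` for $\bar{a}$, `B` for $\bar{b}$), the barred word is taken to be the reversed word with every step inverted, and only the starting point of the boundary word given in the text is tried (a boundary word is cyclic, so other starting points could give further factorizations). The function names are illustrative.

```python
def hat(word: str) -> str:
    """Reverse the path and invert every step (a <-> A, b <-> B)."""
    return word[::-1].swapcase()


def factorizations(boundary: str):
    """Yield (X, Y, Z) with boundary == X + Y + Z + hat(X) + hat(Y) + hat(Z).

    X and Y are non-empty; an empty Z is the pseudo-square case.
    Only the given starting point of the boundary word is examined.
    """
    if len(boundary) % 2:
        return
    half = len(boundary) // 2
    first, second = boundary[:half], boundary[half:]
    for i in range(1, half):                # X = first[:i]
        for j in range(i + 1, half + 1):    # Y = first[i:j], Z = first[j:]
            x, y, z = first[:i], first[i:j], first[j:]
            if hat(x) + hat(y) + hat(z) == second:
                yield x, y, z


def classify(boundary: str):
    """Split the factorizations into pseudo squares and pseudo hexagons."""
    found = list(factorizations(boundary))
    squares = [f for f in found if f[2] == ""]
    hexagons = [f for f in found if f[2] != ""]
    return squares, hexagons


BAR = "abbbbbABBBBB"           # bar of 5 unit squares
NINE = "BaabababAbABABAB"      # union of 3 L-triominoes, 9 unit squares

for name, word in (("bar", BAR), ("nine", NINE)):
    squares, hexagons = classify(word)
    print(name, "pseudo squares:", squares, "pseudo hexagons:", hexagons)
```

With this encoding the bar gives one pseudo-square and four pseudo-hexagon factorizations, and the 9-square polyomino gives three pseudo-hexagon factorizations and no pseudo square, in agreement with the text for the chosen starting point.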

**Figure 35. Non regular tiling and staircase shape.**


This condition on the border is a necessary condition to find a possible periodicity. In particular, if we want to make the tiling 1-periodic we must push the strip along this direction of periodicity.

In order to construct a fiber, we take the tiling and map it onto a fiber by superposing tiles according to the direction of periodicity given by $v_1$. We choose a non null value for $k$; we are able to construct a fiber because the whole tiling is invariant under the translation $k v_1$, and we superpose the tiles of the original tiling with the tiles translated by $k v_1$. Thus we make a fiber that contains exactly $k$ polyominoes along the direction $v_1$. An important implication of this construction is the possibility of shifting the strips using the staircase shape. This means that the fiber has one degree of freedom and can tilt along the periodic direction.
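The wrapping step can be sketched on the simplest case, a tiling by horizontal dominoes as in Fig. 34. The setting below is an assumption made for illustration: each row *y* is a strip of dominoes pushed by an offset `shifts[y]` of 0 or 1, so every row is invariant under $v_1 = (2, 0)$ even when the shifts make the tiling non regular. The function name and the labels are hypothetical.

```python
def wrap_domino_tiling(shifts, k):
    """Wrap rows of horizontal dominoes onto a cylinder of circumference 2*k.

    shifts[y] is the horizontal offset (0 or 1) of the strip in row y.
    Identifying x with x + 2*k (translation by k*v1, v1 = (2, 0)) glues the
    left border to the right border.
    Returns a dict {(x mod 2k, y): (row, domino index)}.
    """
    circumference = 2 * k
    fiber = {}
    for y, shift in enumerate(shifts):
        for i in range(k):            # k dominoes along v1 in each ring
            for dx in (0, 1):         # the two unit squares of a domino
                cell = ((2 * i + shift + dx) % circumference, y)
                if cell in fiber:
                    raise ValueError(f"cell {cell} is covered twice")
                fiber[cell] = (y, i)
    return fiber


shifts = [0, 1, 1, 0, 1]              # pushed strips: a non regular tiling
k = 3
fiber = wrap_domino_tiling(shifts, k)
rings = [{label for (x, yy), label in fiber.items() if yy == y}
         for y in range(len(shifts))]
print(len(fiber), [len(ring) for ring in rings])
```

Every cell of the cylinder is covered exactly once and each ring contains exactly *k* dominoes, whatever the shifts, which is the degree of freedom described above.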

10.5772/58577

419

http://dx.doi.org/10.5772/58577

From Tilings to Fibers – Bio-mathematical Aspects of Fold Plasticity

out we recover the 2 ways of tilings either by a pseudo square or a pseudo hexagon. Now we have a complete theory of fiber construction in biology and new robust bio-mathematical

A future exploration can select from the entire PDB, the structures with the global symmetry most appropriate for fiber formation. This set can be screened further for isolating oligomers with only 2 interfaces or only 3 interfaces. These oligomers contain the necessary elements for fiber formation. We can then sort the dataset into two categories, cases known to form pathological fibers and the others. Comparing the characteristics of the interfaces of the cases whose fate is to become pathological fibers and of the "non pathological" cases would help distinguishing the parameters providing the plasticity needed for the transition to fiber from

We would like to thank Claudia Billat who reads carefully a previous version of this article.

[1] Beauquier D. and Nivat M., On translating one polyomino to tile the plane, *Disc. Comput.*

[2] Bellotti V., Chiti F. (2008) Amyloidogenesis in its biological environment: challenging a fundamental issue in protein misfolding diseases. Current opinion in structural biology

[3] Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F., Jr., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977) The Protein Data Bank. A computer-based archival file for macromolecular structures. Eur J Biochem 80, 319-324

[4] Brown, Jerry H. (2006) Breaking symmetry in protein dimers: Designs and functions.

[5] de Bruijn, N. G. (1981) Algebraic theory of Penrose's nonperiodic tilings of the plane. I,

[6] Cheng P.-N., Pham J.D. and Nowick J.S. (2013). The Supramolecular Chemistry of

tools in order to reinvestigate our favorite fibers.

the parameters providing resistance to fiber formation.

\*Address all corespondence to: Laurent.Vuillon@univ-savoie.fr

2 Laboratoire de mathématiques, Université de Savoie, France

II. Nederl. Akad. Wetensch. Indag. Math., 1:43, 39–66.

*β*-Sheets. Journal of the American Chemical Society.

1 Université Joseph Fourier, AGIM, Grenoble, France

**Acknowledgements**

C. Lesieur1 and L. Vuillon2<sup>∗</sup>

*Geom.*, 6 (1991) 575-592.

Protein Science,15:1,1–13.

**Author details**

**References**

18: 771-779.

#### **6.2. From fibers to tilings of the space**

To end this section we give some hints about a possible transition from fiber to tiling of the whole 3 dimensional space. In fact the protein chains are in a 3 dimensional space and we could investigate regular tiling of a 3 dimensional space by a translation of a single polycube (generalization in 3D of the polyomino) that is a union of unit cubes face to face. Such kind of regular tilings leads to tile the whole space by using 3 vectors of translation and in [14] Gambini and Vuillon show that the surrounding of a tile could have *n* adjacent tiles with a fixed integer *n* ≥ 6. For tiling the plane or fibers the number of interfaces is finite and equals to 2 or 3 and in contrast for tiling the space by a polycube the number of interfaces could be greater than each integer *n* with *n* ≥ 3. This means that the number of interfaces could not be finite a priori (see [14]). One interesting question is the following, could we observe in real life a transition by fold plasticity from a fiber to a tiling of the whole 3 dimensional space ?

#### **7. Conclusion**

In this chapter, we focus on mathematical aspects of fibers. The main goal consists on the explanation of basic rules in order to construct cylinders or fibers based on regular tilings by translation of a single chain or of a juxtaposition of chains forming a *n*-mer. We notice that the interfaces must be 2 or 3 according to the tiling on pseudo squares or pseudo hexagons. This is the basic construction and leads to a fiber with a pseudo square tile or with a pseudo hexagon tile. Now, in order to add more combinatorial complexity, we are able to replace each tile on the fiber by a *n*-mer constructed by adding at least another interface. This result inherits the strong combinatorial properties of tilings by polyominoes explored by Beauquier and Nivat [1, 13] and we add the power of combinatorial construction of the *n*-mers. We could now use this construction to explore the structure of real fibers or to construct synthetic fibers. The steps of investigation will be the following: first we look globally at the fiber by searching the smallest object that tiles the fiber (Fig. 23 for the abstract view with pseudo square tiling and Fig. 24 for a part of a real fiber and Fig. 25 for the abstract view with pseudo hexagon tiling and Fig.26 for a part of real fiber). If this object (called meta-tile) is a single chain then we stop the investigation because the structure is found. Otherwise the meta-tile that tiles the fiber is composed by a *n*-mer and we investigate carefully the structure

of this *<sup>n</sup>*-mer (in Fig. 25 the tiling meta-tile is a 4-mer with form *A B C C* and in the p53 fiber,

the object is a 2-mer (see Fig. 33D). Thus each fiber is composed if we zoom out by a tiling by translation by a meta-tile which is a pseudo square or is a pseudo hexagon and if we zoom in the meta-tile is a *n*-mer. Of course if *n* = 1 then the tiling is by a single chain and if *n* ≥ 2 the tiling of the fiber is done by a meta-tile which is a *n*-mer. Remark that for *n* = 1 as we tile by a single shape the number of interfaces is 2 for the pseudo square case or 3 for the pseudo hexagon case. If *n* ≥ 2 then we must add at least 1 interface to form the *n*-mer and thus globally the number of interfaces is at least 3 for the pseudo square case or 4 for the pseudo hexagon case. Our model covers all cases because by zooming in the meta-tile inherits the intrinsic combinatorial complexity of the *n*-mer construction and by zooming out we recover the 2 ways of tilings either by a pseudo square or a pseudo hexagon. Now we have a complete theory of fiber construction in biology and new robust bio-mathematical tools in order to reinvestigate our favorite fibers.

A future exploration can select from the entire PDB, the structures with the global symmetry most appropriate for fiber formation. This set can be screened further for isolating oligomers with only 2 interfaces or only 3 interfaces. These oligomers contain the necessary elements for fiber formation. We can then sort the dataset into two categories, cases known to form pathological fibers and the others. Comparing the characteristics of the interfaces of the cases whose fate is to become pathological fibers and of the "non pathological" cases would help distinguishing the parameters providing the plasticity needed for the transition to fiber from the parameters providing resistance to fiber formation.

#### **Acknowledgements**

We would like to thank Claudia Billat who reads carefully a previous version of this article.

### **Author details**

24

space ?

**7. Conclusion**

of this construction is the possibility of shifting the strips using the stair case shape. This means that the fiber has one degree of liberty and can tilt along the periodic direction.

**6.2. From fibers to tilings of the space**

To end this section, we give some hints about a possible transition from a fiber to a tiling of the whole 3-dimensional space. Protein chains live in a 3-dimensional space, so we can investigate regular tilings of this space by translation of a single polycube (the 3D generalization of the polyomino), that is, a union of unit cubes joined face to face. Such regular tilings cover the whole space using 3 translation vectors, and Gambini and Vuillon show in [14] that the surrounding of a tile could have *n* adjacent tiles for a fixed integer *n* ≥ 6. For tilings of the plane or of fibers, the number of interfaces is finite and equal to 2 or 3. In contrast, for tilings of the space by a polycube, the number of interfaces can exceed any given integer *n* ≥ 3. This means that the number of interfaces has no a priori bound (see [14]). One interesting question is the following: could we observe in real life a transition by fold plasticity from a fiber to a tiling of the whole 3-dimensional space?

In this chapter, we focus on mathematical aspects of fibers. The main goal is to explain the basic rules for constructing cylinders or fibers based on regular tilings by translation of a single chain, or of a juxtaposition of chains forming an *n*-mer. We notice that the number of interfaces must be 2 or 3, according to whether the tiling uses pseudo squares or pseudo hexagons. This is the basic construction, and it leads to a fiber with a pseudo square tile or with a pseudo hexagon tile. To add more combinatorial complexity, we can replace each tile on the fiber by an *n*-mer constructed by adding at least one more interface. This result inherits the strong combinatorial properties of tilings by polyominoes explored by Beauquier and Nivat [1, 13], to which we add the power of the combinatorial construction of the *n*-mers. We could now use this construction to explore the structure of real fibers or to construct synthetic fibers. The steps of the investigation are the following. First, we look globally at the fiber by searching for the smallest object that tiles it (Fig. 23 for the abstract view with pseudo square tiling, Fig. 24 for a part of a real fiber, Fig. 25 for the abstract view with pseudo hexagon tiling and Fig. 26 for a part of a real fiber). If this object (called the meta-tile) is a single chain, then we stop the investigation because the structure is found. Otherwise, the meta-tile that tiles the fiber is an *n*-mer and we investigate carefully the structure of this *n*-mer: in Fig. 25 the meta-tile is a 4-mer with form *A B C C*, and in the p53 fiber the object is a 2-mer (see Fig. 33D). Thus, if we zoom out, each fiber is a tiling by translation of a meta-tile which is a pseudo square or a pseudo hexagon, and if we zoom in, the meta-tile is an *n*-mer. Of course, if *n* = 1 then the tiling is by a single chain, and if *n* ≥ 2 the tiling of the fiber is done by a meta-tile which is an *n*-mer. Note that for *n* = 1, since we tile by a single shape, the number of interfaces is 2 in the pseudo square case or 3 in the pseudo hexagon case. If *n* ≥ 2, then we must add at least 1 interface to form the *n*-mer, and thus the total number of interfaces is at least 3 in the pseudo square case or 4 in the pseudo hexagon case. Our model covers all cases because by zooming in the meta-tile inherits the intrinsic combinatorial complexity of the *n*-mer construction and by zooming
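The interface counts for fibers stated in this chapter (2 or 3 for a single chain, at least one more for an *n*-mer with *n* ≥ 2) can be summarised in a few lines. This is only an illustrative sketch: the function name `minimum_interfaces` and the string labels for the two tile types are assumptions made for the example, not notation from the chapter.

```python
def minimum_interfaces(n, tile):
    """Lower bound on the number of interfaces in a fiber tiling.

    n    -- number of chains in the meta-tile (n = 1 means a single chain)
    tile -- "pseudo-square" or "pseudo-hexagon" (hypothetical labels)
    """
    if n < 1:
        raise ValueError("a meta-tile contains at least one chain")
    # A single shape tiling by translation has 2 or 3 interfaces.
    base = {"pseudo-square": 2, "pseudo-hexagon": 3}[tile]
    # Building an n-mer (n >= 2) requires at least one extra interface.
    return base if n == 1 else base + 1


if __name__ == "__main__":
    for tile in ("pseudo-square", "pseudo-hexagon"):
        for n in (1, 2, 4):
            print(tile, n, minimum_interfaces(n, tile))
```

The value returned for *n* ≥ 2 is only a lower bound, since an *n*-mer may involve more than one additional interface.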

C. Lesieur<sup>1</sup> and L. Vuillon<sup>2∗</sup>

\*Address all correspondence to: Laurent.Vuillon@univ-savoie.fr


#### **References**


[7] Chiang P.K., Lam M.A. and Luo Y. (September 2008). "The many faces of amyloid beta in Alzheimer's disease". Current molecular medicine 8 (6): 580–4. doi:10.2174/156652408785747951. PMID 18781964.

[8] Claverie, P. and Hofnung, M. and Monod, J. (1968) Sur certaines implications de l'hypothese d'équivalence stricte entre les protomeres des protéines oligomériques. CR Séanc. Acad. Sci 266, 1616–1618.

[9] Eisenberg D. and Jucker M. (2012). The Amyloid State of Proteins in Human Diseases. Cell 148: 1188-1203.

[10] Ferreira S.T., Vieira M.N. and De Felice F.G. (2007). Soluble protein oligomers as emerging toxins in Alzheimer's and other amyloid diseases. IUBMB life 59 (4–5): 332–45. doi:10.1080/15216540701283882. PMID 17505973.

[11] Gebauer, D. and Völkel, A. and Cölfen, H. (2008). Stable prenucleation calcium carbonate clusters. Science 322 5909: 1819–1822.

[12] Gebauer, D. and Cölfen, H. (2011). Prenucleation clusters and non-classical nucleation. Nano Today 6 6: 564–584.

[13] Gambini I. and Vuillon L., An algorithm for deciding if a polyomino tiles the plane, *Theoretical Informatics and Applications*, vol. 41, 2 (2007), 147–155.

[14] Gambini I. and Vuillon L., How many faces can polycubes of lattice tilings by translation of **R**<sup>3</sup> have?, *the electronic journal of combinatorics*, Vol 18, (2011), P199.

[15] Gambini I. and Vuillon L., Non lattice periodic tilings of **R**<sup>3</sup> by single polycubes, *Theoretical Computer Science*, 432, (2012), 52–57.

[16] Golomb S. W., Checker boards and polyominoes, *Amer. Math. Monthly*, vol. 61, 10 (1954) 675–682.

[17] Goodsell, D.S. and Olson, A.J. 2000. Structural symmetry and protein function. Annu. Rev. Biophys. Biomol. Struct. 29: 105–153.

[18] Grunbaum B. and Shephard G.C., Tilings with congruent tiles, *Bull. Amer. Maths. Soc.*, vol 3, 3 (1980) 951–974.

[19] Haataja L, Gurlo T, Huang CJ, Butler PC (May 2008). "Islet amyloid in type 2 diabetes, and the toxic oligomer hypothesis". Endocrine Reviews 29 (3): 303–16. doi:10.1210/er.2007-0037. PMC 2528855. PMID 18314421.

[20] Höppener J.W., Ahrèn B. and Lips C.J. (August 2000). "Islet amyloid and type 2 diabetes mellitus". The New England Journal of Medicine 343 (6): 411–9. doi:10.1056/NEJM200008103430607. PMID 10933741.

[21] Irvine G.B., El-Agnaf O.M., Shankar G.M. and Walsh D.M. (2008). "Protein aggregation in the brain: the molecular basis for Alzheimer's and Parkinson's diseases". Molecular medicine (Cambridge, Mass.) 14 (7–8): 451–64. doi:10.2119/2007-00100.Irvine. PMC 2274891. PMID 18368143.

[22] Janin J., Bahadur R.P. and Chakrabarti P. Protein-protein interaction and quaternary structure. Q Rev Biophys 2008;41(2):133–80.

[23] LaBean T. and Park S. H. (2006). Self-assembled DNA Nanotubes. Nanotechnologies for the Life Sciences.

[24] Levy E.D. and Teichmann S., Structural, Evolutionary, and Assembly Principles of Protein Oligomerization. In Jesús Giraldo and Francisco Ciruela, editors: Progress in Molecular Biology and Translational Science, Vol. 117, Burlington: Academic Press, 2013, pp. 25–51.

[25] Liu, Y., Gotte, G., Libonati, M. and Eisenberg, D. (2001) A domain-swapped RNase A dimer with implications for amyloid formation. Nat Struct Biol 8, 211–214.

[26] Lomas D.A. and Carrell R.W. (2002) Serpinopathies and the conformational dementias. Nature Reviews Genetics 3: 759–768.

[27] Lwin, T. Z., Durant, J. J. and Bashford, D. (2007) A fluid salt-bridging cluster and the stabilization of p53. J Mol Biol 373, 1334–1347.

[28] Monod J., Wyman J., and Changeux J.-P. (1965). On the nature of allosteric transitions: a plausible model. J. Mol. Biol. 12: 88–118.

[29] Ochieng J. and Chaudhuri G. (2010) Cystatin superfamily. J Health Care Poor Underserved 21: 51-70.

[30] Protein Data Bank: http://www.rcsb.org/pdb/home/home.do

[31] Senechal, M. (1995) Quasicrystals and geometry, Cambridge University Press.

[32] Swapna L.S., Srikeerthana K. and Srinivasan N. (2012) Extent of Structural Asymmetry in Homodimeric Proteins: Prevalence and Relevance. PLoS ONE 7(5): e36688. doi:10.1371/journal.pone.0036688

[33] Tsitrin Y, Morton CJ, el-Bez C, Paumard P, Velluz MC, et al. (2002) Conversion of a transmembrane to a water-soluble protein complex by a single point mutation. Nat Struct Biol 9: 729-733.

[34] Tuncbag, N., Kar, G., Keskin, O., Gursoy, A., and Nussinov, R. (2009) A survey of available tools and web servers for analysis of protein–protein interactions and interfaces. Briefings in Bioinformatics 10, 217.

[35] Twarock, R., (2006) Mathematical virology: a novel approach to the structure and assembly of viruses. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 364, 1849, 3357–3373.

[36] de Vries, S. J., and Bonvin, A. M. (2008) How proteins get in touch: interface prediction in the study of biomolecular complexes. Curr Protein Pept Sci 9, 394-406.

[37] Winfree, E., Liu, F., Wenzler, L. A., and Seeman, N.C., Design and self-assembly of two-dimensional DNA crystals., Nature 1998, 394, 539-544.

[38] Yadid I, Kirshenbaum N, Sharon M, Dym O, Tawfik DS (2010) Metamorphic proteins mediate evolutionary transitions of structure. Proc Natl Acad Sci U S A 107: 7287-7292.

[39] Yin, P., Hariadi, R. F., Sahu, S., Choi, H. M., Park, S. H., LaBean, T. H., and Reif, J. H. (2008). Programming DNA tube circumferences. Science, 321(5890), 824-826.

**Chapter 14**

#### **Characterization of Some Periodic Tiles by Contour Words**

Guy Cousineau

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/58631

#### **1. Introduction**

Beauquier and Nivat [1] showed that aperiodicity cannot appear in a tiling by a polyomino using translations only: any such tiling is at least half-periodic. This is not the case when we associate a tile with one of its reflected (or rotated) images, which is a natural thing to do since, in the real world, the same molecule can appear in various rotational positions or have an isomer.

Figure 1 shows a half-periodic tiling, which is quite different from those with only one tile, and an aperiodic one in which the reflected image appears only once.

**Figure 1.** A half-periodic tiling and an aperiodic one

Moreover, the local structure of such tilings offers more possibilities: a tile can be surrounded by an arbitrary number of other tiles, as can easily be inferred from the example in Figure 2.

Therefore, a general study of the tiles involved in such tilings appears to be difficult. It is also worth noting that there is a lack of notation and methods for studying the decompositions of tiles and their possible surroundings.

In this paper, we provide simple word techniques to characterize some polygons that tile the plane together with one of their reflected images. We have mainly concentrated on tilings with symmetry **pg** [5]. Within periodic tilings involving glide reflections, this is the most interesting group to consider since it is a subgroup of all symmetry groups involving glide reflections. However, we hope that these techniques will also be useful in the future for investigating more general problems in the spirit of [1].

> ©2012 Cousineau, licensee InTech. This is an open access chapter distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. © 2014 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

**Figure 2.** An unusual surrounding

#### **2. Polygonal tiles**

Studies similar to ours are usually placed within the framework of polyominoes. The symmetry group **pg** can indeed be studied in this framework since it involves only one direction of reflection, which can be made, for instance, horizontal by rotating the axes. However, this is unnecessarily restrictive and we shall use polygons instead. The restriction to polyominoes simplifies the description of tiles by contour words, which can be expressed with a simple four-letter alphabet (up, down, left, right), whereas the contour word associated with a polygon requires an infinite alphabet (the complex numbers). However, the use of polygons does not make the proofs more complex and allows for more general results.

To be precise, we shall use a notion of "polygonal tile" which is a slight extension of the notion of polygon. A polygonal tile is a sequence of points (vertices) that forms a simple closed line. The edges of a polygonal tile are the segments that join two consecutive vertices. Here, two consecutive edges can have the same direction, and this distinguishes polygonal lines from what are usually called polygons. A polygon corresponds to an uncountable infinity of different polygonal lines.

When we consider possible tilings with a given polygonal line, we are only interested in tilings in which the vertices of the tiling, i.e. points shared by three tiles or more, correspond to vertices of these tiles and in which, when two tiles are adjacent, the vertices of one tile fit the vertices of the other tile. This restriction is analogous to the restriction to transformations that preserve the "grid" (points with integer coordinates) when one deals with polyominoes. It will enable us to obtain factorizations of the contour words.

#### **2.1. Definitions and notations**

We shall represent polygonal tiles by words on the alphabet **C** of nonzero complex numbers. Given a word $u$ in **C**∗, we denote by $|u|$ its length, by $\Sigma(u)$ the complex sum of its letters, and by $R(u)$ and $I(u)$ the real and imaginary parts of $\Sigma(u)$.

The contour word of a polygonal tile is completely defined when a starting point is chosen among its vertices. Otherwise, it is only defined modulo a circular shift of its letters. The symbol ≡ will denote equality modulo circular shift and the symbol = will denote identity of words.

Given a word $u$, we shall denote by $(-u)$ the word $u$ in which each letter is replaced by its opposite, by $\bar{u}$ the word $u$ in which each letter is replaced by its conjugate, and by $\tilde{u}$ the word $u$ in which the order of the letters has been reversed.

These three operations commute and are involutive. Therefore, the images of a word $u$ to be considered are exactly

$$
u \qquad -u \qquad \bar{u} \qquad \tilde{u} \qquad -\bar{u} \qquad -\tilde{u} \qquad \tilde{\bar{u}} \qquad -\tilde{\bar{u}}
$$

#### **2.2. Polygonal tiles, isometries and orientation**

The effects of isometries on contour words are the following:

• The translations leave the contour words invariant.

• The rotations by angle $\theta$ transform a contour word $u$ into the word obtained by multiplying each letter of $u$ by the complex number with modulus 1 and argument $\theta$. In particular, half-turns change $u$ into $(-u)$.

• The reflections about a horizontal axis change $u$ into $\bar{u}$. The other reflections are obtained by applying a rotation to $\bar{u}$. In particular, the reflections about a vertical axis change $u$ into $-\bar{u}$.

• The glide reflections have the same effect as the associated reflections.

Let us also mention that a change of orientation changes $u$ into $-\tilde{u}$. As a consequence, a contour factor $v$ is centro-symmetric if and only if $v = \tilde{v}$ ($v$ is a palindrome).
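The word operations of section 2.1 and the effects of isometries listed in section 2.2 can be tried out on small examples. The following is a minimal sketch, assuming contour words are stored as Python lists of `complex` letters; all function names (`opposite`, `conjugate`, `reverse`, `rotate`, `reorient`, `images`) are hypothetical and chosen only for this illustration.

```python
import cmath


def opposite(u):
    """(-u): replace each letter by its opposite."""
    return [-z for z in u]


def conjugate(u):
    """Conjugate word: replace each letter by its complex conjugate."""
    return [z.conjugate() for z in u]


def reverse(u):
    """Reversed word: same letters in reverse order."""
    return u[::-1]


def rotate(u, theta):
    """Effect of a rotation by angle theta: multiply each letter by exp(i*theta)."""
    factor = cmath.exp(1j * theta)
    return [factor * z for z in u]


def reorient(u):
    """Effect of a change of orientation: opposite of the reversed word."""
    return opposite(reverse(u))


def is_palindrome(v):
    """A factor is centro-symmetric when it equals its reversal."""
    return v == reverse(v)


def images(u):
    """The eight images of u under the three commuting involutions."""
    result = []
    for use_opposite in (False, True):
        for use_conjugate in (False, True):
            for use_reverse in (False, True):
                v = list(u)
                if use_opposite:
                    v = opposite(v)
                if use_conjugate:
                    v = conjugate(v)
                if use_reverse:
                    v = reverse(v)
                result.append(v)
    return result


if __name__ == "__main__":
    # Contour of the unit square, counterclockwise from its lower-left vertex.
    square = [1 + 0j, 1j, -1 + 0j, -1j]
    print(reorient(square))            # same square, traversed clockwise
    print(rotate(square, cmath.pi))    # half-turn: close to opposite(square)
    print(len(images([1 + 2j, 3 + 1j])))
```

Because `rotate` uses floating-point arithmetic, its result for a half-turn agrees with `opposite` only up to rounding error, whereas the three involutions are exact on letters with integer coordinates.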

#### **2.3. Periodic tiles**

All periodic tilings symmetry groups have as a subgroup **p1**, the group generated by two independent translations. The tiles that can produce tilings with symmetry **p1** are called pseudo-hexagons in [1], and their contours have, in our notation, the form

$$
u v w (-\tilde{u}) (-\tilde{v}) (-\tilde{w}) \tag{1}
$$

We shall use this characterization as a basis to deduce characterizations for other symmetry groups. We shall start with symmetry group **p2**, for which we shall give a proof that the Conway criterion [4, 8] is a characterization. Then, we shall proceed to symmetry group **pg**.

Beauquier and Nivat have shown that the form (1) characterizes not only the tiles that tile the plane with symmetry **p1** but, more generally, the tiles that tile the plane by translations only. Similarly, it might well be the case that the Conway criterion characterizes the tiles that tile the plane by translations only together with their image by half-turn, and that the criterion we give in section 4.2 characterizes the tiles that tile the plane by translations only together with their image by reflection. However, we have not been able to obtain such a result. This remains, to our knowledge, a conjecture.
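Form (1) can be tested directly on a given contour word by trying every circular shift and every way of cutting the first half of the word into three factors. The sketch below is a brute-force illustration only, not the algorithm of [1]. It assumes that contour words are lists of `complex` letters with exactly representable coordinates, that $u$ and $v$ are non-empty and that $w$ may be empty (which gives a pseudo-square); the function names are hypothetical.

```python
def opp_rev(word):
    """Opposite of the reversed word, written (-~u) in form (1)."""
    return [-z for z in reversed(word)]


def pseudo_hexagon_factorization(contour):
    """Return (u, v, w) such that some circular shift of the contour equals
    u v w (-~u)(-~v)(-~w), or None when no such factorization exists."""
    n = len(contour)
    if n == 0 or n % 2 == 1:
        return None  # form (1) always has even length
    half = n // 2
    for shift in range(n):
        shifted = contour[shift:] + contour[:shift]
        first, second = shifted[:half], shifted[half:]
        for a in range(1, half + 1):              # |u| >= 1
            for b in range(1, half - a + 1):      # |v| >= 1, w may be empty
                u, v, w = first[:a], first[a:a + b], first[a + b:]
                if second == opp_rev(u) + opp_rev(v) + opp_rev(w):
                    return u, v, w
    return None


if __name__ == "__main__":
    square = [1 + 0j, 1j, -1 + 0j, -1j]
    # L-shaped tromino, counterclockwise from its lower-left corner.
    l_tromino = [1 + 0j, 1 + 0j, 1j, -1 + 0j, 1j, -1 + 0j, -1j, -1j]
    triangle = [2 + 0j, -2 + 2j, -2j]
    print(pseudo_hexagon_factorization(square))
    print(pseudo_hexagon_factorization(l_tromino))
    print(pseudo_hexagon_factorization(triangle))
```

The search is cubic in the length of the word, which is acceptable for small examples. A triangle is rejected immediately because its contour word has odd length.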

Let us assume $|w| < |u|$. This implies $u = \bar{w}u'$ and $x = w\bar{w}u'$. This implies that $x$ overlaps with its image obtained by applying twice the glide reflection *gr* or, equivalently, twice the translation *t*. But this is impossible in symmetry **pg** because this translation belongs to the symmetry group (as the composition of the glide reflection with itself) and, in a periodic tiling, an element of a tile contour cannot overlap with its image by translation. The only possibility is therefore $|w| \geq |u|$. In that case, $x$ factorizes into $x = \bar{u}vu$ and $\bar{x}$ is equal to $u\bar{v}\bar{u}$. The tile contour then has a factor $\bar{u}vu\bar{v}\bar{u}$.

Finally, the two factors $x$ and $\bar{x}$ can be disjoint. The tile contour has shape $xy\bar{x}z$ where $y$ may be empty. We can note that the overlapping case is a special case of this last one, where $\bar{u}$ and $u$ play the parts of $x$ and $\bar{x}$.

#### **3. Tiles for symmetry p2**

#### **3.1. Presentation of symmetry group p2**

The symmetry group **p2** can be generated by three half-turns or by one half-turn and two translations. The latter is more convenient since it leads to a presentation which allows for a characterization of normal forms by a canonical rewriting system.

#### **2.4. Translation adjacency**

A polygonal tile can be made adjacent to a translated image of itself only if its contour has both a factor $x$ and a factor $(-\tilde{x})$. These two factors are necessarily disjoint because otherwise we have $x = yz$ and $(-\tilde{x}) = zu$, which implies $(-\tilde{z})(-\tilde{y}) = zu$ and finally $(-\tilde{z}) = z$. Such a word cannot be a factor of a contour word since this implies $\Sigma(z) = 0$.

Therefore, a tile that can be made adjacent to a translated image of itself has the shape $xy(-\tilde{x})z$, and the translation associated with the superposition of the factors $x$ and $(-\tilde{x})$, which we shall denote by $T_x$, is defined by the complex number $\Sigma(y) = -\Sigma(z)$. Such a decomposition $xy(-\tilde{x})z$ of a contour word will be called a T-decomposition. A T-decomposition will be said to be exact when $x$ is maximal for the translation $T_x$, i.e. when the two following conditions are satisfied:

• $y$ cannot be factorized into $y = y_1 y' (-\tilde{y}_1)$ with $|y_1| \neq 0$

• $z$ cannot be factorized into $z = z_1 z' (-\tilde{z}_1)$ with $|z_1| \neq 0$

A T-decomposition $xy(-\tilde{x})z$ is therefore exact if and only if the contact of the corresponding tile with its image by translation $T_x$ is exactly $(-\tilde{x})$.

#### **2.5. Half-turn adjacency**

A tile that can be made adjacent to its image by a half-turn must have both a factor $x$ and a factor $\tilde{x}$. These factors could be disjoint, overlapping or identical. We shall show that the only case we have to consider is the case where $x$ is a palindromic factor equal to $\tilde{x}$. Such a palindromic factor will be said to be exact if it is not the center of a bigger palindrome.

To clarify the discussion, let us denote by *ht* the corresponding half-turn, by $A$ the starting point of factor $x$, by $B$ its end point, and by $A'$ and $B'$ the images of $A$ and $B$ by the half-turn *ht*.

The case where $x$ and $\tilde{x}$ are identical corresponds to $A' = B$ and $B' = A$. It is not possible to have $A' = A$ and $B' = B$ because a half-turn has only one fixpoint.

The case where the factors $x$ and $\tilde{x}$ are disjoint corresponds to a situation where the order of the four points considered is either $A, B, A', B'$ or $A, B, B', A'$. However, the former case is impossible because, since half-turns are involutive, we have $ht(A') = A$ and $ht(B') = B$. Therefore the image of $AB'$ is $A'B$, a factor strictly contained in $AB'$, which is impossible since half-turns are isometries and preserve lengths. We are left with the latter case, where the order is $A, B, B', A'$. Let us denote by $w$ the factor $BB'$. Since the image by *ht* of $BB'$ is $B'B$, if *ht* is a tiling transformation, then $w$ must be a palindrome and $AA' = xw\tilde{x}$ is a bigger palindrome that includes both $x$ and $\tilde{x}$ and is its own image under *ht*.

Now, in the last case, where the factors $x$ and $\tilde{x}$ overlap, we can eliminate the case where $AB$ is included in $A'B'$ or $A'B'$ is included in $AB$ using the length argument, and we are left with the cases $A, A', B, B'$ and $A, B', B, A'$. Here again, the former can be eliminated by the length argument. In the latter, the factor $x$ is decomposed into $x = uw$ where $w$ corresponds to $B'B$ and must therefore be a palindrome, and again $x$ and $\tilde{x}$ are part of a bigger palindromic factor $uw\tilde{u}$ which is its own image under *ht*.

#### **2.6. Reflection adjacency**

We are mainly interested here in symmetry **pg** which has only one direction of (glide) reflexion. Also, if we study tilings by a tile and one of its images by reflection, we are also in a situation where one reflection direction is privileged. Therefore, it will be convenient to consider that this unique reflection direction is horizontal. This will enable us to express reflections or glide reflections by applying the complex conjugation operation to its contour word.

A tile that can be made adjacent to its image by reflection or glide reflection must therefore possess both a factor $x$ and a factor $\overline{x}$. These factors can be identical, overlapping or disjoint. When they are identical, *x* contains only real numbers: it is made of a sequence of horizontal segments. This cannot occur in symmetry **pg** because glide reflections have no fixpoints.

When the two factors $x$ and $\overline{x}$ overlap, let us denote by *u* their common factor. We have $x = wu$ and $\overline{x} = uw'$ and thus $uw' = \overline{w}\,\overline{u}$. If we are in symmetry **pg**, the transformation that transforms $x$ into $\overline{x}$ is a glide reflection *gr* which can be decomposed into a horizontal reflection *r* and a horizontal translation *t*. The effect of the glide reflection *gr* on factor *x* is completely described by the complex number Σ(*w*), which has an imaginary part that corresponds to the effect of the reflection *r* and a real part that corresponds to the effect of the translation *t*. The complex number $\Sigma(w\overline{w})$, which is a real number, corresponds to applying the horizontal translation *t* twice.

Let us assume |*w*| < |*u*|. This implies $u = \overline{w}u'$ and $x = w\overline{w}u'$. This implies that *x* overlaps with its image obtained by applying the glide reflection *gr* twice or, equivalently, the translation *t* twice. But this is impossible in symmetry **pg** because this translation belongs to the symmetry group (as the composition of the glide reflection with itself) and, in a periodic tiling, an element of a tile contour cannot overlap with its image by translation. The only possibility is therefore |*w*| ≥ |*u*|. In that case, *x* factorizes into $x = uv\overline{u}$ and $\overline{x}$ is equal to $\overline{u}\,\overline{v}\,u$. The tile contour then has a factor $uv\overline{u}\,\overline{v}\,u$.

Finally, the two factors $x$ and $\overline{x}$ can be disjoint. The tile contour has shape $xy\overline{x}z$ where *y* may be empty. We can note that the overlapping case is a special case of this last one, where $u$ and $\overline{u}$ play the parts of $x$ and $\overline{x}$.
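The word operations used in sections 2.4 to 2.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a contour word is represented as a tuple of complex steps, and the helper names (`sigma`, `rev`, `neg`, `conj`, `is_cyclic_factor`, `t_decompositions`) are ours and do not come from the text. The search for T-decompositions is a plain brute-force enumeration and does not test exactness.

```python
def sigma(word):
    """Complex displacement of a path given as a tuple of complex steps."""
    return sum(word, 0j)

def rev(word):
    """The word read backwards (written with a tilde in the text)."""
    return tuple(reversed(word))

def neg(word):
    """Every step negated (the image of the path by a half-turn)."""
    return tuple(-s for s in word)

def conj(word):
    """Every step conjugated (reflection in a horizontal mirror)."""
    return tuple(s.conjugate() for s in word)

def is_cyclic_factor(factor, contour):
    """True if `factor` occurs in the contour read cyclically."""
    doubled = contour + contour
    n = len(factor)
    return any(doubled[i:i + n] == factor for i in range(len(contour)))

def t_decompositions(contour):
    """Yield (x, y, z) such that some shift of the contour is x y (-x~) z."""
    n = len(contour)
    for shift in range(n):
        w = contour[shift:] + contour[:shift]
        for lx in range(1, n // 2 + 1):
            x = w[:lx]
            target = neg(rev(x))
            for ly in range(n - 2 * lx + 1):
                if w[lx + ly:2 * lx + ly] == target:
                    yield x, w[lx:lx + ly], w[2 * lx + ly:]

# Unit square, counterclockwise: right, up, left, down.
square = (1 + 0j, 1j, -1 + 0j, -1j)
decompositions = list(t_decompositions(square))
```

For the unit square, every decomposition found satisfies Σ(*y*) = −Σ(*z*), as stated for T-decompositions, and the conjugate of the vertical step is again a factor of the contour, which is the situation required for a reflection.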

#### **3. Tiles for symmetry p2**

#### **3.1. Presentation of symmetry group p2**

The symmetry group **p2** can be generated by three half-turns or by one half-turn and two translations. The latter is more convenient since it leads to a presentation which allows for a characterization of normal forms by a canonical rewriting system.

If we denote the generators by *T* (the half-turn) and *X*, *Y* (the translations), the equations are:


Characterization of Some Periodic Tiles by Contour Words

http://dx.doi.org/10.5772/58631


$$XY = YX \ \ \ \ T^2 = I \ \ \ \ TXT = X^{-1} \ \ \ \ TYT = Y^{-1}$$

We give below the associated canonical rewriting system [6] together with an example:

$$\begin{array}{lllll}
T^{-1} \to T & TY^{-1} \to YT & TX^{-1} \to XT & X^{-1}Y \to YX^{-1} & T^2 \to 1 \\
XY \to YX & X^{-1}Y^{-1} \to Y^{-1}X^{-1} & XY^{-1} \to Y^{-1}X & TX \to X^{-1}T & TY \to Y^{-1}T
\end{array}$$

The normal forms are of two sorts: $Y^pX^q$ (translations) and $Y^pX^qT$ (half-turns).

It thus appears that a tile is adequate for an (isohedral) tiling with symmetry **p2** if and only if this tile joined to its image by the half-turn *T* forms an adequate tile for symmetry group **p1**. This observation explains in a simple way why the Conway criterion is a necessary condition in section 3.3.
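The reduction to normal form can be sketched as plain string rewriting. The sketch below encodes the rules of the **p2** rewriting system with lowercase letters standing for inverses (`x` for the inverse of *X*, and so on). The four free-cancellation rules at the end of the list are our assumption: they are not part of the system as printed, but are needed as soon as a word contains a generator next to its inverse. The function name `normalize` is ours.

```python
# Rules of the p2 rewriting system; lowercase letters denote inverses.
RULES_P2 = [
    ("t", "T"), ("Ty", "YT"), ("Tx", "XT"), ("xY", "Yx"), ("TT", ""),
    ("XY", "YX"), ("xy", "yx"), ("Xy", "yX"), ("TX", "xT"), ("TY", "yT"),
    # Assumed in addition: a generator cancels against its inverse.
    ("Xx", ""), ("xX", ""), ("Yy", ""), ("yY", ""),
]

def normalize(word, rules=RULES_P2):
    """Apply the rules until none matches; returns a word Y^p X^q or Y^p X^q T."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                # Rewrite the leftmost occurrence of the first applicable rule.
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word
```

For instance, `normalize("TXT")` returns `"x"`, which is the relation $TXT = X^{-1}$, and `normalize("XTYX")` returns `"yT"`, a half-turn of the form $Y^pX^qT$ with *p* = −1 and *q* = 0.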

#### **3.2. The Conway criterion**

A polygonal tile satisfies the Conway criterion if and only if it has the shape $uvw(-\widetilde{u})xy$ where *v*, *w*, *x* and *y* are palindromes. It is easy to see that if we choose any of its palindromic factors and join it with its half-turn image according to this factor, we obtain a pseudo-hexagon.

Take for instance factor *y*. We obtain

$$uvw(-\widetilde{u})\,x\,(-u)(-v)(-w)\,\widetilde{u}\,(-x)$$

or, with a left shift

$$vw(-\widetilde{u})\,x\,(-u)(-v)(-w)\,\widetilde{u}\,(-x)\,u$$

Taking $z = (-\widetilde{u})x(-u)$, we have $-z = \widetilde{u}(-x)u$ and the polygon contour appears as:

$$vwz(-v)(-w)(-z)$$

where *v*, *w* and *z* are palindromes.

This is a specific form of pseudo-hexagon which is invariant by half-turn.
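The criterion and the gluing step above can be checked mechanically on a small example. The following sketch is a brute-force search, not an efficient algorithm, and it makes several assumptions of ours: contour words are tuples of complex steps, the helper names are hypothetical, empty factors are allowed, and the function does not verify that the contour is a simple polygon. The example tile is an L-tromino on the square grid.

```python
def rev(word):
    return tuple(reversed(word))

def neg(word):
    return tuple(-s for s in word)

def is_palindrome(word):
    return word == rev(word)

def conway_decomposition(contour):
    """Return (u, v, w, x, y) with a shift of the contour equal to
    u v w (-u~) x y and v, w, x, y palindromes, or None if none exists."""
    n = len(contour)
    for shift in range(n):
        c = contour[shift:] + contour[:shift]
        for lu in range(n // 2 + 1):
            u = c[:lu]
            for lv in range(n - 2 * lu + 1):
                v = c[lu:lu + lv]
                if not is_palindrome(v):
                    continue
                for lw in range(n - 2 * lu - lv + 1):
                    k = lu + lv + lw
                    w = c[lu + lv:k]
                    if not is_palindrome(w) or c[k:k + lu] != neg(rev(u)):
                        continue
                    rest = c[k + lu:]
                    for lx in range(len(rest) + 1):
                        x, y = rest[:lx], rest[lx:]
                        if is_palindrome(x) and is_palindrome(y):
                            return u, v, w, x, y
    return None

# L-tromino, counterclockwise from its bottom-left corner.
tromino = (1 + 0j, 1 + 0j, 1j, -1 + 0j, 1j, -1 + 0j, -1j, -1j)
u, v, w, x, y = conway_decomposition(tromino)

# Join the tile with its half-turn image along y, then shift left by |u|.
joined = u + v + w + neg(rev(u)) + x + neg(u) + neg(v) + neg(w) + rev(u) + neg(x)
shifted = joined[len(u):] + joined[:len(u)]
z = neg(rev(u)) + x + neg(u)
```

Whatever decomposition the search returns, `shifted` equals `v + w + z + neg(v) + neg(w) + neg(z)` and `z` is a palindrome, which is the half-turn invariant pseudo-hexagon described above.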

#### **3.3. Conway criterion characterizes p2**

Conversely, let us state the condition for a tile to produce a pseudo-hexagon when joined with its image by half-turn.

First, as mentioned in section 2.5, the tile's contour must have a palindromic factor and therefore have shape *xy* where *y* is a palindrome. When the tile is joined to its image by the half-turn, the resulting tile has contour *x*(−*x*) and this contour must be that of a pseudo-hexagon. Therefore, there must exist words *u*, *v* and *w* such that:

$$x(-x) \equiv uvw(-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$$

The word *x*(−*x*) must thus be identical to a shift of $uvw(-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$.

If $x(-x) = uvw(-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$, then $x = uvw$ and $-x = (-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$, which is equivalent to $x = \widetilde{u}\,\widetilde{v}\,\widetilde{w}$. We thus have $uvw = \widetilde{u}\,\widetilde{v}\,\widetilde{w}$ which implies $u = \widetilde{u}$, $v = \widetilde{v}$ and $w = \widetilde{w}$. The words *u*, *v* and *w* are palindromes and the primitive tile *xy* is equal to *uvwy* where *u*, *v*, *w* and *y* are palindromes. We are in a special case where the tile is made of four centro-symmetric factors.

If $x(-x) \equiv uvw(-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$ but $x(-x) \neq uvw(-\widetilde{u})(-\widetilde{v})(-\widetilde{w})$, by shifting if necessary the names $u$, $v$, $w$, $\widetilde{u}$, $\widetilde{v}$, $\widetilde{w}$, we can assume without loss of generality that *u* can be decomposed into $u = u_1u'u_2$ in such a way that $|u_1| = |u_2|$ and either

$$x(-x) = u'u_2vw(-\widetilde{u_2})(-\widetilde{u'})(-\widetilde{u_1})(-\widetilde{v})(-\widetilde{w})u_1$$

or


$$x(-x) = u_2vw(-\widetilde{u_2})(-\widetilde{u'})(-\widetilde{u_1})(-\widetilde{v})(-\widetilde{w})u_1u'$$

The second case differs from the first by the orientation only. Therefore, it is sufficient to deal with the first case.

We have $x = u'u_2vw(-\widetilde{u_2})$ and $-x = (-\widetilde{u'})(-\widetilde{u_1})(-\widetilde{v})(-\widetilde{w})u_1$, which is equivalent to $x = \widetilde{u'}\,\widetilde{u_1}\,\widetilde{v}\,\widetilde{w}\,(-u_1)$, and thus $u'u_2vw(-\widetilde{u_2}) = \widetilde{u'}\,\widetilde{u_1}\,\widetilde{v}\,\widetilde{w}\,(-u_1)$.

This implies $u' = \widetilde{u'}$, $u_2 = \widetilde{u_1}$, $v = \widetilde{v}$ and $w = \widetilde{w}$.

The primitive tile's contour can thus be written $u'u_2vw(-\widetilde{u_2})y$ or, by shifting it to the left, $u_2vw(-\widetilde{u_2})yu'$ where $v$, $w$, $y$ and $u'$ are palindromes, which corresponds exactly to the Conway criterion.

#### **4. Tiles for symmetry pg**

#### **4.1. Presentation of symmetry group pg**

The symmetry group **pg** can be generated with two glide reflections with parallel mirrors and equal associated translations. The presentation has just one equation, $G_1^2 = G_2^2$, where $G_1$ and $G_2$ are the two glide reflections.

It can also be generated by a single glide reflection *G* and a translation *X* perpendicular to the mirror of *G*. The presentation still has just one equation which is *XGX* = *G*. This second presentation is more interesting. We give below the associated canonical rewriting system [6] together with an example:

$$GX \to X^{-1}G \qquad GX^{-1} \to XG \qquad G^{-1}X \to X^{-1}G^{-1} \qquad G^{-1}X^{-1} \to XG^{-1}$$

The normal forms are $X^pG^{2q}$ (translations) and $X^pG^{2q+1}$ (glide reflections). Taking $Y = G^2$ (*Y* is a translation parallel to the mirror of *G*), the normal forms can be described as $X^pY^q$ (translations) and $X^pY^qG$ (glide reflections).

This shows that a tile is adequate for symmetry **pg** if and only if this tile joined to its image by the glide reflection *G* forms an adequate tile for symmetry group **p1**.
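The normal forms of **pg** can also be computed without string rewriting, by reading a word from left to right and using the relation $GX = X^{-1}G$ to keep every power of *X* on the left. The sketch below does this. The function name `pg_normal_form`, the lowercase convention for inverses, and the concrete maps `G` and `X` used for the geometric check are our assumptions, chosen only to illustrate the relation *XGX* = *G*.

```python
def pg_normal_form(word):
    """Return (p, k) such that the word equals X^p G^k in pg.

    Letters: 'X' and 'G' are the generators, 'x' and 'g' their inverses.
    """
    p, k = 0, 0
    for letter in word:
        if letter in "Xx":
            step = 1 if letter == "X" else -1
            # Moving X to the left across an odd power of G inverts it.
            p += step if k % 2 == 0 else -step
        elif letter == "G":
            k += 1
        elif letter == "g":
            k -= 1
        else:
            raise ValueError(f"unknown generator: {letter!r}")
    return p, k

def describe(word):
    """Name the normal form: X^p Y^q or X^p Y^q G, with Y = G^2."""
    p, k = pg_normal_form(word)
    q, odd = divmod(k, 2)
    return (p, q, "glide reflection" if odd else "translation")

# One concrete model (assumed): a horizontal glide and a vertical translation.
def G(z):
    return z.conjugate() + 1

def X(z):
    return z + 1j
```

With these definitions, `pg_normal_form("XGX")` and `pg_normal_form("G")` both give `(0, 1)`, and `describe("GGX")` returns `(1, 1, "translation")`, that is the translation *XY*.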

#### **4.2. Criterion for pg tiles**

We are going to show that a tile is adequate for symmetry **pg** if and only if it has one of the shapes

$$u\;v\;w\;\overline{v}\;\overline{u}\;(-\widetilde{w}) \tag{2}$$


$$\overline{u}\;u\;w\;v\;\overline{v}\;(-\widetilde{w}) \tag{3}$$

In both cases, one of the factors may be empty.

Let us first show that we obtain a pseudo-hexagon when we make such a tile adjacent to its reflected image using factor *u* or *v*. We shall give the proof for the first form: the second one is similar.

Since the factors *u* and *v* play similar parts, we shall use the reflection according to factor *u*. The reflected tile has contour $\overline{u}\,\overline{v}\,\overline{w}\,v\,u\,(-\overline{\widetilde{w}})$ in the reverse orientation or, if we give the right orientation,

$$(-\overline{\widetilde{u}})\;\overline{w}\;(-\widetilde{u})\;(-\widetilde{v})\;(-\overline{\widetilde{w}})\;(-\overline{\widetilde{v}})$$

Gluing together the primitive tile and its reflected image using factors $\overline{u}$ and $(-\overline{\widetilde{u}})$, we obtain a polygon with contour

$$u\;v\;w\;\overline{v}\;\overline{w}\;(-\widetilde{u})\;(-\widetilde{v})\;(-\overline{\widetilde{w}})\;(-\overline{\widetilde{v}})\;(-\widetilde{w})$$

Taking $x = w\,\overline{v}\,\overline{w}$, which implies $-\widetilde{x} = (-\overline{\widetilde{w}})\,(-\overline{\widetilde{v}})\,(-\widetilde{w})$, this contour can be written

$$u\;v\;x\;(-\widetilde{u})\;(-\widetilde{v})\;(-\widetilde{x})$$

which is a pseudo-hexagon. The translation *X* corresponds to the complex number Σ(*uv*) and the translation *Y* to the complex number $\Sigma(vx) = \Sigma(v\,w\,\overline{v}\,\overline{w})$.
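The gluing computation above can be replayed numerically on one tile of form (2). In the sketch below the factors `u`, `v` and `w` are an arbitrary choice of ours (they only need the real part of Σ(*uv*) to vanish so that the contour closes), and the helper names are hypothetical. The code checks the word identities only; it does not verify that the words describe simple polygons.

```python
def sigma(word):
    return sum(word, 0j)

def rev(word):
    return tuple(reversed(word))

def neg(word):
    return tuple(-s for s in word)

def conj(word):
    return tuple(s.conjugate() for s in word)

# Example factors (assumed): Re(sigma(u + v)) = 0, so the tile closes.
u = (1j, 1 + 0j)
v = (1j, -1 + 0j)
w = (-1 + 0j, -1 + 0j)

# Form (2): u v w conj(v) conj(u) (-w~).
tile = u + v + w + conj(v) + conj(u) + neg(rev(w))

# Reflected tile, read in the right orientation; it ends with (-conj(u)~).
reflected = neg(rev(conj(tile)))

# Glue along conj(u): drop it from the tile and drop its partner,
# the last factor of the reflected contour.
glued = u + v + w + conj(v) + reflected[:-len(u)] + neg(rev(w))

# Expected pseudo-hexagon u v x (-u~) (-v~) (-x~) with x = w conj(v) conj(w).
x = w + conj(v) + conj(w)
hexagon = u + v + x + neg(rev(u)) + neg(rev(v)) + neg(rev(x))

X = sigma(u + v)  # translation perpendicular to the mirror
Y = sigma(v + x)  # translation parallel to the mirror
```

On this example `glued` and `hexagon` are the same word, `X` is purely imaginary (a vertical translation) and `Y` is real (a horizontal translation), in agreement with the presentation of **pg** in section 4.1.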

#### **4.3. The criterion characterizes pg tiles**

Let us assume that some tile with contour $m_1$ glued to its reflected image produces a pseudo-hexagon. The conditions for a tile to fit its reflected image have been stated in section 2.6.

In all cases suitable for **pg**, we have


$$m_1 = x\;y\;\overline{x}\;z \tag{4}$$

and the obtained polygon has contour

$$m_2 = x\;y\;(-\overline{\widetilde{z}})\;(-\widetilde{x})\;(-\overline{\widetilde{y}})\;z \tag{5}$$

We can also use one more piece of information that comes from the structure of group **pg**. When we compose any glide reflection belonging to the symmetry group **pg** with itself, we obtain a translation that belongs to the symmetry group. Therefore, if in the contour $m_1$ the factor $\overline{x}$ is the image of the factor $x$ in some glide reflection, then in the contour $m_2$ the factor $(-\widetilde{x})$ is the image of the factor $x$ in the corresponding translation. Note that it is written $(-\widetilde{x})$ and not $x$ because it is read in the reverse order in the contour of $m_2$. However, even if *x* is an exact factor for the glide reflection, it is not necessarily the case that it is an exact factor for the corresponding translation: we can only assume that *x* is part of the exact factor for the translation. This distinction will lead to the two possible forms for **pg** tiles.

If we assume that *x* is an exact factor for the translation then *x* appears in the pseudo-hexagon decomposition of $m_2$. Therefore, $m_2$ can be written as

$$m_2 = x\;u\;v\;(-\widetilde{x})\;(-\widetilde{u})\;(-\widetilde{v}) \tag{6}$$

and therefore, we must have

$$u\;v = y\;(-\overline{\widetilde{z}}) \tag{7}$$


and

$$(-\widetilde{u})\;(-\widetilde{v}) = (-\overline{\widetilde{y}})\;z \tag{8}$$

and in the special case where *z* is empty (the case where *y* is empty is similar), we get

$$u\;v = y \tag{9}$$

and

$$(-\widetilde{u})\;(-\widetilde{v}) = (-\overline{\widetilde{y}}) \tag{10}$$

We shall first deal with this special case. The equation (10) can be rephrased into

$$y = \overline{v}\,\overline{u}\tag{11}$$

and we have

$$u\;v = \overline{v}\;\overline{u} \tag{12}$$

This equation has two types of solutions (see lemma 5.1):

• The general solution: $u = (w\overline{w})^p w$ and $v = \overline{w}(w\overline{w})^q$.

At this point, we have to use the fact that one of the translations which associates $(-\widetilde{u})$ to $u$ or $(-\widetilde{v})$ to $v$ is a vertical translation. This fact implies *R*(*u*) = *R*(*v*) = *R*(*x*) and therefore *p* = *q* and $u = \overline{v}$.

The tile has shape $m_1 = x\,\overline{x}\,(-\widetilde{u})\,(-\overline{\widetilde{u}})$ or, setting $y = (-\widetilde{u})$, $m_1 = x\,\overline{x}\,y\,\overline{y}$ (see figure 3).

**Figure 3.** Special case of the first form

• The solution on reals: $u = a^p$ and $v = a^q$.


For the same reason, we must have *p* = *q*. This is only a subcase of the previous one.
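Both families of solutions of equation (12) are easy to check by direct substitution. The sketch below does so for small exponents, with words represented as tuples of complex steps. The function names and the sample words are our own illustrative choices, and the check covers only these samples: it is not a proof of lemma 5.1.

```python
def conj(word):
    """Every step conjugated."""
    return tuple(s.conjugate() for s in word)

def general_solution(w, p, q):
    """Return u = (w conj(w))^p w and v = conj(w) (w conj(w))^q."""
    block = w + conj(w)
    return block * p + w, conj(w) + block * q

def satisfies_eq12(u, v):
    """Equation (12): u v = conj(v) conj(u)."""
    return u + v == conj(v) + conj(u)

w = (1 + 1j, 1j)       # an arbitrary sample word
a = (1 + 0j, 2 + 0j)   # a word made of real steps only

general_ok = all(
    satisfies_eq12(*general_solution(w, p, q))
    for p in range(3) for q in range(3)
)
real_ok = satisfies_eq12(a * 2, a * 3)
```

When *p* = *q*, the two words returned by `general_solution` are conjugates of each other, which is the condition $u = \overline{v}$ obtained above from the vertical translation.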

Now we go back to equations (7) and (8). Here again, we shall use the fact that one of the translations corresponding to *u* and *v* is a vertical one. Let us assume that the vertical translation is associated with *u*. Let us also assume that *R*(*x*) ≥ 0, which implies $R(x) = R(\overline{x}) \geq 0$. This is just to ensure that all the vertical projections as they are defined below do exist. The case *R*(*x*) ≤ 0 is similar with *x* and *z* having inverted roles.

Let us denote by 1, 2, 3, 4, 5, 6 the vertices that bound the factors $x$, $y$, $(-\overline{\widetilde{z}})$, $(-\widetilde{x})$, $(-\overline{\widetilde{y}})$ and $z$. Let us also denote by 2′, 3′ the images of 2, 3 in the vertical translation and by 5′, 6′ the images of 5 and 6 in its inverse (see figure 4).

**Figure 4.** General case 1

The factor *u* has extremities 2 and 5′ and the factor *v* has extremities 5′ and 4. Let us denote by $w_1$ the factor with extremities 3 and 6′ and by $w_2$ the factor with extremities 5′ and 4. The factor with extremities 3′ and 2′ is equal to $(-\widetilde{y})$ and therefore we have

$$(-\overline{\widetilde{z}}) = w_1\,\overline{y}\,w_2 \tag{13}$$

We also have

$$z = (-\widetilde{w_1})\,(-\widetilde{y})\,(-\widetilde{w_2}) \tag{14}$$

since the factors [3, 6′], [6′, 5′] and [5′, 4] are images by translation of factors [3′, 6], [6, 5] and [1, 2′]. This last equality can also be written

$$(-\overline{\widetilde{z}}) = \overline{w_2}\,\overline{y}\,\overline{w_1} \tag{15}$$

and we have

$$w_1\,\overline{y}\,w_2 = \overline{w_2}\,\overline{y}\,\overline{w_1} \tag{16}$$

**Figure 5.** Two tilings with non-**pg** symmetry

1 where aperiodicity appears.

glide reflection as shown on figure 6.

and

These solutions with *p* �= *q* correspond to tiles that produce tilings with other symmetries than **pg** or no symmetry at all as those on figure 5 which are only half-periodic or on figure


Characterization of Some Periodic Tiles by Contour Words

http://dx.doi.org/10.5772/58631


since the factors [3, 6′], [6′, 5′] and [5′, 4] are images by translation of the factors [3′, 6], [6, 5] and [1, 2′].

and we have

$$(-\overline{z}) = \overline{w\_2}\,y\,\overline{w\_1} \tag{15}$$

This last equality can also be written

$$w\_1\,y\,w\_2 = \overline{w\_2}\,y\,\overline{w\_1} \tag{16}$$

which is an instance of equation

$$x\,y\,z = \overline{z}\,y\,\overline{x}$$

Fortunately, we shall not have to solve this equation in full generality because we can use the fact that *R*(*w*<sub>1</sub>) = *R*(*w*<sub>2</sub>), i.e. in the above equation *R*(*x*) = *R*(*z*). We shall show that in this case *x* = *z̄*. If this is not the case then, since *x* and *z̄* are both prefixes of the same word, either *x* = *z̄u* for some *u* or *z̄* = *xv* for some *v*. These two cases being symmetrical, we shall deal with the first one only. Since *x* and *z* have equal real projections, *R*(*u*) = 0 (*u* is a pure imaginary number). On the other hand, we have *z̄uyz* = *z̄yzū*, which implies *I*(*z̄uyz*) = *I*(*z̄yzū*), which is equivalent to *I*(*u*) = *I*(*ū*). But this is possible only if *I*(*u*) = 0 which, since we already know that *R*(*u*) = 0, implies Σ(*u*) = 0, i.e. *u* would be a closed (looping) factor, which cannot exist in a contour word except for the entire contour word. So, we do have *x* = *z̄*.

So, this shows that in our tile, *w*<sub>1</sub> = *w̄*<sub>2</sub> and the contour word can finally be written, setting $w = (-\overline{w\_2})$,

$$w\,x\,y\,\overline{x}\,\overline{w}\,(-\overline{y})$$

which is of the required shape (form 2).

Notice that if we do not use the fact that one of the translations is vertical, then the real projections of *w*<sub>1</sub> and *w*<sub>2</sub> need not be equal. In that case, we also have to consider solutions such as:

$$\bullet \quad w\_1 = (w\overline{w})^p w \; , \; (-\overline{y}) = (w\overline{w})^n \; , \; w\_2 = \overline{w} (w\overline{w})^q$$

$$\bullet \quad w\_1 = (w\overline{w})^p \; , \; (-\overline{y}) = (w\overline{w})^n w \; , \; w\_2 = (\overline{w}w)^q$$

These solutions with *p* ≠ *q* correspond to tiles that produce tilings with symmetries other than **pg**, or with no symmetry at all, such as those of figure 5, which are only half-periodic, or those of figure 1, where aperiodicity appears.

**Figure 5.** Two tilings with non-**pg** symmetry
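The role of the condition *R*(*w*<sub>1</sub>) = *R*(*w*<sub>2</sub>) can be probed on small examples. The sketch below is only an illustration, not part of the proof. It encodes words as tuples of Python complex numbers, reads the bar as letterwise conjugation, and tests the equation *x y z* = *z̄ y x̄* on the first family listed above. The helper names `conj`, `satisfies` and `first_family` are hypothetical, and the middle word *y* = (*w̄w*)<sup>*n*</sup> used in the unbalanced case is an assumption chosen for this check; the text constrains that case through (−*ȳ*), whose definition precedes this passage.

```python
def conj(word):
    """Letterwise complex conjugate of a word given as a tuple of complex steps."""
    return tuple(letter.conjugate() for letter in word)


def satisfies(x, y, z):
    """Check the word equation x y z = conj(z) y conj(x) by tuple comparison."""
    return x + y + z == conj(z) + y + conj(x)


def first_family(w, p, q):
    """Hypothetical helper: x = (w w̄)^p w and z = w̄ (w w̄)^q."""
    wb = conj(w)
    return (w + wb) * p + w, wb + (w + wb) * q


w = (1 + 0j, 1j)               # a non-real word: one step east, one step north
arbitrary_y = (2 + 0j, -1j)    # an unrelated middle word

# Balanced case p == q: z equals conj(x), so any middle word is accepted.
x, z = first_family(w, p=1, q=1)
balanced_ok = satisfies(x, arbitrary_y, z)

# Unbalanced case p != q: a generic middle word is rejected ...
x, z = first_family(w, p=0, q=2)
unbalanced_generic = satisfies(x, arbitrary_y, z)

# ... but a middle word of the assumed form (w̄ w)^n is accepted (here n = 3).
unbalanced_special = satisfies(x, (conj(w) + w) * 3, z)

print(balanced_ok, unbalanced_generic, unbalanced_special)
```

In this encoding the balanced case reduces to *z* = *x̄*, which is why the middle word plays no role there, whereas for *p* ≠ *q* the equation only holds when the middle word is itself built from *w* and *w̄*.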

Now, we go back to the case where the factor *x* of *m*<sub>2</sub> in equation 5 is not an exact factor of the horizontal translation. This means that *x* is included inside some translation factor *axb* where *ab* is non-empty. We assume that the glide reflection maps *x* to *x̄* and that *x* is an exact factor for this glide reflection. If we apply the glide reflection again to *x̄*, we obtain the image of *x* under the horizontal translation, which has to belong to the part of *m*<sub>2</sub> which is not in *m*<sub>1</sub> because, otherwise, *m*<sub>1</sub> would overlap with its image under the horizontal translation.

Let us show that if *a* and *b* are both non-empty, they cannot both belong to the contour of *m*<sub>1</sub>. If *a* is non-empty, its image under the glide reflection does not belong to the contour of *m*<sub>1</sub> because otherwise *x* would not be an exact factor. But then, for the same reason, the image of *a* under the horizontal translation cannot belong to the contour of *m*<sub>2</sub> outside *m*<sub>1</sub>, so it has to belong to *m*<sub>1</sub>. The situation is the same for *b*. Consequently, the images of *x*, *a* and *b* are such that the first one should be outside *m*<sub>1</sub> and the two others inside *m*<sub>1</sub>. But then we would have to assume that the image of *x* under the horizontal translation is all the part of *m*<sub>2</sub> that is outside *m*<sub>1</sub>, so *m*<sub>1</sub> would have to be equal to *x x̄*, and *a* and *b* would both be empty.

So, one of the factors *a* or *b*, if they are non-empty, has to be outside the contour of *m*<sub>1</sub> and therefore belongs to the part of *m*<sub>2</sub> that is outside *m*<sub>1</sub>. Let us say it is *b*. This implies that, in the contour of *m*<sub>1</sub>, the two factors *x* and *x̄* are adjacent and *b* has to be the image of *a* under the glide reflection, as shown on figure 6.

$$m\_1 = a\,x\,\overline{x}\,(-\overline{a})\,z \tag{17}$$

and

$$m\_2 = a\,x\,(-\overline{\overline{a}})\,(-\overline{\overline{z}})\,\overline{a}\,(-\overline{\overline{x}})\,(-\overline{a})\,z \tag{18}$$

We know that the factor $a\,x\,(-\overline{\overline{a}})$ is exact for the horizontal translation. So, we can make explicit the fact that *m*<sub>2</sub> is a pseudo-hexagon by factorizing $(-\overline{\overline{z}})$ into $(-\overline{\overline{z}}) = u\,v$ with $z = (-\overline{u})(-\overline{v})$. The two definitions of *z* lead us to the equation

$$u\,v = \overline{v}\,\overline{u} \tag{19}$$

and we use the fact that one of the translations associated to factors *u* and *v* is vertical to deduce that *R*(*u*) = *R*(*v*). Therefore *u* = *v̄* and we can conclude that

$$m\_1 = a\,x\,\overline{x}\,(-\overline{a})(-\overline{u})(-\overline{\overline{u}}) \tag{20}$$

which corresponds to the form 3.

**Figure 6.** General case 2

This completes our proof. It is clear from the examples of figure 5 that the main result presented here is only a small step toward the characterization of polygons that tile the plane together with one of their reflected images, because our proofs are highly dependent on the specificities of symmetry **pg**. A plausible conjecture is that a polygon has this property if and only if, suitably oriented, its contour has a factorization of form 2 or 3. But the contour word techniques that we have used here will certainly require further development in order to tackle such a conjecture.

#### **5. Technical lemmas**

#### **Solutions of the equation** *x* ≡ *x̄*

This section studies the solutions of the equation *x* ≡ *x̄*. This equation, where the symbol ≡ denotes equality modulo circular shift, corresponds to two equations using equality on words.

1. *x* = *x̄*

In that case, each letter of *x* must be its own conjugate. The word *x* is made only of reals.

2. *x* = *u v* for two non-empty words *u* and *v*, and *u v* = *v̄ ū*. This case is treated by lemma 5.1.

To study such equations, the alphabet of complex numbers divides naturally between reals, which are their own conjugates, and non-reals, which have a conjugate distinct from themselves. The lemmas presented below are very similar to those presented in [7] for ordinary equations on words but are slightly more complex due to the partition of the alphabet. The equations will have two kinds of solutions: solutions on *R*<sup>∗</sup>, which are the solutions of the considered equations where we drop the conjugation operations, and the general solutions.

**Lemma 5.1.** *The solutions of equation u v* = *v̄ ū are*

$$u = w^{m} \; , \; v = w^{n}$$

*where w* ∈ *R*<sup>∗</sup> *and*

$$u = (w\overline{w})^m w \; , \; v = \overline{w} (w\overline{w})^n$$

*where w* ∈ *C*<sup>∗</sup>*.*

To prove this lemma, we shall also have to prove the following one:

**Lemma 5.2.** *The solutions of equation u v* = *v̄ u are*

$$u = w^{m} \; , \; v = w^{n}$$

*where w* ∈ *R*<sup>∗</sup> *and*

$$u = (w\overline{w})^m w \; , \; v = (\overline{w}w)^n.$$

The proofs of these two lemmas can be done simultaneously by induction on |*uv*|. We shall assume |*uv*| ≥ 2. The base case will correspond to |*uv*| = 2 and |*uv*| = 3.

• If |*uv*| = 2, *u* and *v* are letters and *u* = *v̄* in the three cases. This leads to solutions *u* = *ū* = *v* = *v̄* on reals and *u* = *v̄* on non-real complex numbers for lemma 5.1. For lemma 5.2, there is no solution on non-real complex numbers.

• If |*uv*| = 3, we have either *u* = *ab* and *v* = *c* or *u* = *a*, *v* = *bc* where *a*, *b*, *c* are letters. For lemma 5.1, the two hypotheses lead to *a* = *ā* = *b* = *b̄* = *c* = *c̄* and we have only the solutions *u* = *a*, *v* = *aa* and *u* = *aa*, *v* = *a* where *a* is a real letter. For lemma 5.2, the hypothesis *u* = *ab* and *v* = *c* also leads to *a* = *ā* = *b* = *b̄* = *c* = *c̄* and we have only the two real solutions. On the contrary, the hypothesis *u* = *a*, *v* = *bc* leads to *a* = *b̄* = *c* and we have the solution *u* = *a*, *v* = *āa*.

• In the inductive case, let us assume that the two lemmas are true when |*uv*| ≤ *n* and consider two words *u* and *v* such that |*uv*| = *n* + 1.

	- Lemma 5.1: *u v* = *v̄ ū*. We compare the lengths of *u* and *v*.
	- \* If |*u*| = |*v*|, then *u* = *v̄*.
	- \* If |*u*| > |*v*|, take *u* = *v̄u*<sub>1</sub>. We then also have *ū* = *u*<sub>1</sub>*v*, hence *u* = *ū*<sub>1</sub>*v̄* and thus *ū*<sub>1</sub>*v̄* = *v̄u*<sub>1</sub> or *u*<sub>1</sub>*v* = *vū*<sub>1</sub>. By induction hypothesis, the solutions of this equation are *u*<sub>1</sub> = *w*<sup>*m*</sup>, *v* = *w*<sup>*n*</sup> where *w* ∈ *R*<sup>∗</sup>, which gives *u* = *w*<sup>*m*+*n*</sup>, *v* = *w*<sup>*n*</sup>, and *u*<sub>1</sub> = (*w̄w*)<sup>*m*</sup>, *v* = *w̄*(*ww̄*)<sup>*n*</sup>. In that case *u* = *v̄u*<sub>1</sub> = (*ww̄*)<sup>*m*+*n*</sup>*w* and *v* = *w̄*(*ww̄*)<sup>*n*</sup>.
	- \* If |*u*| < |*v*|, taking *v̄* = *uv*<sub>1</sub>, we also have *v* = *v*<sub>1</sub>*ū*, hence *uv*<sub>1</sub> = *v̄*<sub>1</sub>*u*. By induction hypothesis, the solutions of this equation are *u* = *w*<sup>*m*</sup>, *v*<sub>1</sub> = *w*<sup>*n*</sup> where *w* ∈ *R*<sup>∗</sup>, which gives *u* = *w*<sup>*m*</sup>, *v* = *w*<sup>*m*+*n*</sup>, and *u* = (*ww̄*)<sup>*m*</sup>*w*, *v*<sub>1</sub> = (*w̄w*)<sup>*n*</sup>, which gives *v* = (*w̄w*)<sup>*m*+*n*</sup>*w̄*.
	- Lemma 5.2: *u v* = *v̄ u*. We compare the lengths again.
	- \* If |*u*| = |*v*|, then *u* = *v̄* = *v* (real solutions).
	- \* If |*u*| > |*v*|, take *u* = *v̄u*<sub>1</sub>. We also have *u* = *u*<sub>1</sub>*v*, hence *u*<sub>1</sub>*v* = *v̄u*<sub>1</sub>. By induction hypothesis, the solutions of this equation are *u*<sub>1</sub> = *w*<sup>*m*</sup>, *v* = *w*<sup>*n*</sup> where *w* ∈ *R*<sup>∗</sup>, which gives *u* = *w*<sup>*m*+*n*</sup>, *v* = *w*<sup>*n*</sup>, and *u*<sub>1</sub> = (*ww̄*)<sup>*m*</sup>*w*, *v* = (*w̄w*)<sup>*n*</sup>, which gives *u* = (*ww̄*)<sup>*m*+*n*</sup>*w* and *v* = (*w̄w*)<sup>*n*</sup>.
	- \* If |*u*| < |*v*|, take *v̄* = *uv*<sub>1</sub>. Here again we have *v* = *v*<sub>1</sub>*u*, which implies *uv*<sub>1</sub> = *v̄*<sub>1</sub>*ū*. By induction hypothesis, the solutions of this equation are *u* = *w*<sup>*m*</sup>, *v*<sub>1</sub> = *w*<sup>*n*</sup> where *w* ∈ *R*<sup>∗</sup>, which gives *u* = *w*<sup>*m*</sup>, *v* = *w*<sup>*m*+*n*</sup>, and *u* = (*ww̄*)<sup>*m*</sup>*w*, *v*<sub>1</sub> = *w̄*(*ww̄*)<sup>*n*</sup>, which gives *v* = (*w̄w*)<sup>*m*+*n*+1</sup>.

*The author thanks Luc Boasson, Maurice Nivat and Laurent Vuillon for their help and support in preparing this paper and also the referee whose remarks and questions were extremely useful.*

#### **Author details**

#### Guy Cousineau

Fédération de Recherche, CNRS-Université de Savoie Modélisation, Simulation, Interactions Fondamentales, France

#### **References**

[1] D. Beauquier and M. Nivat. On translating one polyomino to tile the plane. *Discrete and Computational Geometry*, 6:575–592, 1991.

[2] H.S.M. Coxeter. *Introduction to geometry*. John Wiley and sons, 1980.

[3] H.S.M. Coxeter and W.O.J. Mauser. *Generators and relations for discrete groups*. Springer Verlag, 1972.

[4] M. Gardner. More about tiling the plane: the possibilities of polyominoes, polyiamonds and polyhexes. *Scientific American*, 53:112–115, 1975.

[5] B. Grünbaum and G.C. Shephard. *Tilings and patterns*. Wiley and sons, 1987.

[6] Ph. Lechenadec. *Canonical forms in finitely presented algebras*. Pitman, 1986.

[7] M. Lothaire. *Combinatorics on words*. Cambridge University Press, 1997.

[8] D. Schattschneider. Will it tile? Try the Conway criterion! *Mathematics Magazine*, 53:224–233, 1980.

[9] H.A.G. Vijshoff and J. Van Leeuwen. Arbitrary versus periodic storage schemes and tesselations of the plane using one type of polyomino. *Inform. Control*, 62:1–25, 1984.

### *Edited by Claire Lesieur*

Many thanks to the authors for their high-quality chapters and to the referees for helping to improve the manuscripts. The book is interdisciplinary: it covers fields from organic chemistry to mathematics and raises different aspects of oligomerization. It is a great source of information, as every chapter introduces general knowledge and deep details. Mixing communities is meant to instigate novel ideas and, hopefully, to help look at oligomerization with new eyes.

Photo by Nongkran\_ch / iStock

Oligomerization of Chemical and Biological Compounds
