products such as bio-fuels on a commercial scale. The possible socio-economic benefits from synthetic biology research are thus enormous, but so is the possibility of the technology's misuse. The concerns range from bioethical and environmental worries to bio-terrorism, say, by malicious release of genetically engineered viruses targeted at specific ethnic groups. The main concern here is the illegal creation and growth of bio-weapons.

The socio-economic promise of synthetic biology has spurred both public and private investments and made people reflect on its consequences and impact on human society. All players involved in creating and commercialising this knowledge- and capital-intensive emerging technology are obviously deeply interested in knowing how they would gain or lose from the intellectual property (IP) system in place and whether, from their respective perspectives, that system needs to be changed, replaced, or abolished.

The idea of DNA as an information carrier gained currency in the 1950s with the discovery of the double-helix structure of cellular DNA by James Watson and Francis Crick in 1953 [12]. Prior to that, biologists spoke of biological "specificity". In 1953, Watson and Crick noted: "...it therefore seems likely that the precise sequence of the bases is the *code* which carries the genetical *information*..." (Emphasis added) [13]. Now the language of information is pervasive in molecular biology: genes are linear sequences of bases (like letters of an alphabet) that carry information (like words) for the production of proteins (like sentences). For the process of going from DNA sequences to proteins we use words like "transcription" and "translation", and we talk of passing genetic "information" from one generation to another. It is rather uncanny that molecular biology can be understood by ignoring chemistry and treating DNA as a computer program (with enough input data included) held in the stored memory of a computer (the cellular machinery). It is this aspect that bioinformatics exploits. It is analogous to viewing Euclidean geometry not in terms of drawings but in terms of algebra. In our current understanding, DNA is an informational polymer. It is a vast chemical information database that *inter alia* carries the complete set of instructions for making all the proteins a cell will ever need. As Albert Lehninger lyrically put it, understanding DNA is the study of "the molecular logic of the living state." [14].

The intellectual property (IP) system, as it stands, did not anticipate the convergence of the patenting of information-carrying living matter, a knowledge-based global economic system, and the ascendancy of a research-centric and innovative biotechnology industry. The IP system is therefore already under great strain, because biotechnology-related IP has been patched onto an existing patent system in an *ad hoc* manner. For example, in the complex legal maze, intellectual property rights (IPR) related to DNA synthesis, which is at the core of synthetic biology, may be inadvertently infringed by DNA synthesis companies in terms of enforceable trade secret, trademark, copyright or patent laws, simply by constructing DNA sequences for their clients [15].

That DNA is an information-encoded molecule makes the interpretation of IP laws that much more difficult for judges, who are generally ignorant of the deep science that supports biotechnology. Indeed, organisms are defined by the information encoded in their genomes, and since the origin of life that information is believed to have been encoded using a two-base-

**2. Synthetic biology: Its aims and relevance**

Synthetic biology is a revolutionary development in the life sciences. It is a highly multidisciplinary field in which molecular biology, the physical sciences and engineering merge to design and construct new biological parts, novel artificial biological pathways, organisms or devices and systems, including the re-design of existing natural biological systems for useful purposes. We may call it bioengineering. It has already produced tumour-seeking microbes for cancer treatment, photosynthetic systems to produce energy, artificial life, etc. Like engineering, it too aims to produce standardized components and connectors, manufacturing and assembly processes, test vehicles and certification processes, etc. to enable production and marketing of increasingly sophisticated and functional systems on a mass scale. In a sense, "Synthetic biology is the engineering of biology: the synthesis of complex, biologically based (or inspired) systems which display functions that do not exist in nature. This engineering perspective may be applied at all levels of the hierarchy of biological structures – from individual molecules to whole cells, tissues and organisms. In essence, synthetic biology will enable the design of 'biological systems' in a rational and systematic way." [17].

Enormous expectations rest on future advancements in systems biology as it has the potential to radically change the way we approach key technologies, such as medicine and manufacturing. Current efforts have focused on creating highly generic capabilities (the building blocks) in the form of bio-tools and bio-processes that can be scaled for industrial application. Given the high intellectual calibre of the synthetic biology research community, it appears inevitable that the scientific knowledge they produce and place in the public domain will quickly be translated into industrial applications of high economic value by equally talented industry researchers. This raises obvious concerns about the ownership and control of generated intellectual property that may lead to high commercial value in a twenty-first century economy that truly belongs to the life sciences. The key enabling technology is DNA synthesis. The workspace includes microbes, mammalian cells, plants, etc. Its applications include therapeutics, energy (e.g., fuels), chemicals, agriculture, etc.

The pioneering paper of Watson and Crick [12] that elucidated the double helix structure of cellular DNA has been hailed as the greatest discovery in biology since Darwin's theory of evolution. In their paper, they showed that the structure was made possible by the unique base pairing of nucleotides adenine (A) with thymine (T), and guanine (G) with cytosine (C), each member of a pair belonging to opposing strands. It is this pairing that allows base pairs to be arbitrarily stacked as a double helix. In a famous understatement, they wrote: "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material." It was the potential for explaining the biological function of DNA, rather than any compelling structural evidence, that led to the widespread acceptance of the double helix model. The helical structure was not rigorously determined by X-ray crystallography until the late 1970s [18]. Whereas cells were regarded as the basic building blocks of living organisms during the nineteenth century, the Watson and Crick paper [12] shifted attention from cells to DNA molecules in the middle of the twentieth century, when geneticists began to seriously explore the molecular structure of genes.

In his 2013 State of the Union message, President Barack Obama said:

If we want to make the best products, we also have to invest in the best ideas... Every dollar we invested to map the human genome returned \$140 to our economy... Today, our scientists are mapping the human brain to unlock the answers to Alzheimer's... Now is not the time to gut these job-creating investments in science and innovation. Now is the time to reach a level of research and development not seen since the height of the Space Race.<sup>1</sup>

On 02 April 2013, President Obama unveiled a bold new research initiative designed to revolutionize our understanding of the human brain.<sup>2</sup> The ultimate aim of the BRAIN (Brain Research through Advancing Innovative Neurotechnologies) initiative is to help researchers find new ways to treat, cure, and even prevent brain disorders, such as Alzheimer's disease, epilepsy, and traumatic brain injury. Undoubtedly, synthetic biology will play a signal role in this initiative and much of the needed basic research will happen in the universities.

<sup>1</sup> See http://www.whitehouse.gov/the-press-office/2013/02/12/president-barack-obamas-state-union-address.

<sup>2</sup> Fact Sheet: BRAIN Initiative, The White House, 02 April 2013, http://www.whitehouse.gov/the-press-office/2013/04/02/fact-sheet-brain-initiative

#### **2.1. DNA carries information**

DNA is Nature's digital recording medium. The molecular instructions for creating living organisms are encoded in the complex DNA molecule, a portion of which passes from parent to offspring during reproduction. Natural DNA is a linear sequence of four types of nucleotides: A, T, G, and C. Each organism's DNA sequence is unique and autobiographical; it determines the organism's characteristics, *e.g.*, the colour of a person's eyes, the shape of the nose, resistance to disease, etc. Other molecules in a biological cell "read" the DNA sequence and set in motion the physical and chemical processes the cell calls for. For example, the vast information carried by the DNA includes the complete set of instructions for making all the proteins a cell will ever need. Over the years biologists have discovered certain tricks for manipulating DNA in a manner similar to manipulating character strings in a text. For example, they can copy DNA fragments using the polymerase chain reaction (PCR) or clone them using a cloning vector; cut DNA using molecular scissors called restriction enzymes; join two complementary DNA strands into a double-stranded molecule in a process called hybridization; and measure the size of DNA fragments without sequencing them using a technique called gel electrophoresis. The enormous potential of CRISPR genome editing technology lies in its ability to precisely insert DNA into a cell *in vivo*. For example, CRISPR allows one to snip out mutated DNA and replace it with the correct sequence. It thus offers a possible means of treating many genetic disorders [19].
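The analogy between DNA manipulation and string manipulation can be made concrete with a small sketch. The Python below is illustrative only: the function names are hypothetical, a strand is modelled as a plain string over A, T, G and C, and the chemistry is ignored entirely. EcoRI, which recognizes GAATTC and cuts after the first G, is used as an example restriction enzyme; it is not taken from the text above.

```python
# Illustrative only: DNA is treated as a Python string over the letters A, T, G, C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(strand: str) -> str:
    """Return the strand that pairs with `strand`, written in the same reading direction."""
    return strand.translate(COMPLEMENT)[::-1]


def can_hybridize(strand_a: str, strand_b: str) -> bool:
    """True if the two single strands are perfectly complementary."""
    return strand_b == reverse_complement(strand_a)


def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Cut `sequence` at every occurrence of the recognition `site`.

    `cut_offset` is the position inside the site where the cut falls
    (1 for the example enzyme EcoRI, which cuts GAATTC after the G).
    """
    fragments, start = [], 0
    pos = sequence.find(site)
    while pos != -1:
        fragments.append(sequence[start:pos + cut_offset])
        start = pos + cut_offset
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


def gel_lengths(fragments: list) -> list:
    """Fragment sizes, largest first: the information a gel gives without sequencing."""
    return sorted((len(f) for f in fragments), reverse=True)


sequence = "AAGAATTCGGGAATTCTT"          # made-up example sequence
fragments = digest(sequence, "GAATTC", cut_offset=1)
print(fragments)
print(gel_lengths(fragments))
print(can_hybridize("ATGC", "GCAT"))
```

Joining the fragments back together reproduces the original string, which mirrors the fact that a restriction digest cuts the molecule without losing any bases.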

#### **2.2. The extended DNA alphabet**

Since the late 1990s, researchers have discovered that DNA construction can be extended beyond the natural bases (C, G, A, and T) to include man-made ones, and that artificial DNA can be constructed. An expanded "DNA alphabet" will obviously allow more information to be crammed into a DNA strand of a given number of nucleotides, by way of a larger variety of coding patterns (*e.g.*, an extended genetic code), and thus enable a wider range of applications, from precise molecular probes and nano-machines to useful new life forms [20, 21].
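The gain in information capacity can be estimated with elementary counting. The sketch below assumes an idealized alphabet in which every letter is equally usable, and it takes a six-letter alphabet (the four natural bases plus one unnatural pair) purely as an example; the function names are hypothetical.

```python
import math


def bits_per_base(alphabet_size: int) -> float:
    """Maximum information per nucleotide position, in bits."""
    return math.log2(alphabet_size)


def codon_count(alphabet_size: int, codon_length: int = 3) -> int:
    """Number of distinct codons of the given length."""
    return alphabet_size ** codon_length


for size in (4, 6):  # natural alphabet vs. one extra unnatural base pair (assumed)
    print(size, round(bits_per_base(size), 3), codon_count(size))
```

Under these assumptions a four-letter alphabet carries 2 bits per position and yields 64 triplet codons, while a six-letter alphabet carries about 2.585 bits per position and yields 216.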

Watson and Crick [12] showed that natural bases form two base pairs (A-T, G-C) as a result of specific hydrogen bonding patterns. The unnatural base pairs created by Romesberg's group [8] also pair stably and selectively in DNA. Unlike natural pairs, which rely only on hydrogen bonding, these new base pairs draw upon unnatural hydrogen-bonding topologies as well as upon shape complementarity and hydrophobic forces, and they are also synthesized with high fidelity by DNA polymerases. Romesberg *et al.* have succeeded in creating DNA strands, with high fidelity, using the two natural base pairs and a third unnatural base pair of their design [8]. In a sense, researchers may well be anticipating and pre-empting evolutionary events that, left to themselves, would have taken a few million years to occur.

#### **2.3. CRISPR technology**

CRISPR technology is a new way of making precise, targeted changes to the genome of a cell or an organism. CRISPRs are often associated with cas genes, which code for CRISPR-related proteins. By inserting a plasmid containing cas genes and specifically designed CRISPRs, an organism's genome can be cut at any desired location. Since its invention in 2012, the CRISPR/Cas system has been widely used for gene editing (silencing, enhancing or changing specific genes) in basic research. In nature, CRISPR/Cas is a prokaryotic adaptive immune system that confers resistance to foreign genetic elements such as plasmids and phages, and so provides a form of acquired immunity. CRISPR spacers recognize and silence these exogenous genetic elements, much as RNAi does in eukaryotic organisms. Building the elaborate system of DNA-cutting proteins and guide RNA sequences requires extensive engineering for it to function in eukaryotic cells and to insert new genes where the targeted host DNA is excised. For a quick introduction to CRISPR technology, see [22].
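How a guide sequence picks out a cut site can be illustrated with a simple string search. The sketch below rests on assumptions not stated in the text above: it uses the commonly described rule for the *Streptococcus pyogenes* Cas9 enzyme, in which a 20-nucleotide target must be immediately followed by a short motif of the form NGG (N being any base), and it scans only the strand it is given. The function name is hypothetical, and real guide design also checks the opposite strand and possible off-target matches.

```python
import re


def find_target_sites(genome: str, guide_length: int = 20) -> list:
    """Return (start, target, motif) for each candidate site on the given strand.

    Assumption: a site is `guide_length` bases followed by an NGG motif.
    A lookahead is used so that overlapping candidates are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(genome)]


genome = "A" * 20 + "TGG" + "CC"   # made-up example sequence
for start, target, motif in find_target_sites(genome):
    print(start, target, motif)
```

The motif requirement is why, in this simplified model, only some positions in a genome qualify as candidate sites.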

On 15 April 2014, the USPTO issued the first patent covering CRISPR-Cas9 gene editing technology (US8697359, CRISPR-Cas systems and methods for altering expression of gene products) to Feng Zhang, the sole inventor, just six months after the patent application was filed on 15 October 2013. The patent is assigned to MIT and the Broad Institute, with Broad managing the patent's licensing. The patent claims a modified version of the CRISPR-Cas9 system that is found naturally in bacteria, which microbes use to defend themselves against viruses. The patent, *inter alia*, claims methods for designing and using CRISPR's molecular components. It is widely expected that Broad will adopt a liberal licensing policy that would make the technology available to scientists for research around the world.

CRISPR is already revolutionizing biomedical research because it provides a very efficient way of recreating disease-related mutations in lab animals and cultured cells. It also holds the promise of treating genetic diseases in humans in unprecedented ways, *e.g.*, by directly correcting mutations on a patient's chromosomes. Mental illnesses too may find similar remedies. Since 2012, CRISPR's use in research has spread like wildfire. The chemistry behind the Cas9 protein is still being explored.

#### **2.4. NGS + CRISPR technologies**

First-generation DNA sequencing, developed in 1975 by Frederick Sanger [23], remained the gold standard for two and a half decades. It was used in the Human Genome Project, which took 13 years and cost \$3 billion to sequence the human genome and was completed in 2003. In comparison, next-generation sequencing (NGS) uses non-Sanger-based, high-throughput technologies that sequence millions or billions of DNA strands in parallel and are much faster and cheaper. In fact, an entire genome can be sequenced in a day. And when NGS is coupled with powerful computational algorithms, say, to answer questions related to the mutational spectrum of an organism on a genome-wide scale, we have phenomenal opportunities to understand our biological selves. Targeted sequencing facilitates the discovery of disease-causing mutations for the diagnosis of pathological conditions, and of genes and regulatory elements associated with disease [24, 25]. For trends in DNA sequencing costs, see http://www.genome.gov/sequencingcosts/. (In 2014, the cost was less than \$0.1 per raw mega-base of DNA sequence compared to about \$1k in 2004; during 2007-2010, the cost fell sharply.) NGS is not yet ready for clinical use.
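A back-of-the-envelope calculation shows what these per-mega-base figures mean for a whole genome. The sketch below assumes a human genome of roughly 3,000 mega-bases and a single pass over it (coverage of 1); both are simplifying assumptions rather than figures from the text, and real projects sequence each base many times over. The names are hypothetical.

```python
GENOME_MEGABASES = 3_000  # assumption: roughly 3 billion bases in a human genome


def raw_cost(cost_per_megabase: float, coverage: float = 1.0) -> float:
    """Raw sequencing cost in dollars for one genome at the given coverage."""
    return cost_per_megabase * GENOME_MEGABASES * coverage


cost_2004 = raw_cost(1_000.0)  # about $1k per mega-base
cost_2014 = raw_cost(0.1)      # upper bound, since the 2014 figure is "less than $0.1"

print(cost_2004, cost_2014, cost_2004 / cost_2014)
```

On these assumptions the raw cost of one pass over a genome drops from about \$3 million to under \$300, a reduction of at least four orders of magnitude in a decade.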

For recent advances in CRISPR-Cas9 technology, see [26]. In principle, NGS and CRISPR technology together would allow one to change a genome at will to almost anything one wants, and even to elicit enough detailed information about a person's disease risks, ancestry and other traits to determine that person's identity. Clearly, such advances raise privacy, ethical, legal, and other social issues that are at present barely understood and therefore need careful study. An NIH-initiated study in the U.S. [27] notes: "The ongoing evolution of genomic research and health care requires a continuing analysis of the normative underpinnings of beliefs, practices and policies regarding research, health and disease. In addition, as personal genomic information permeates many aspects of society, it has profound implications for how we understand ourselves as individuals and as members of families, communities, and society--and even for how we understand what it means to be human. Long-held beliefs about the continuum between health and disease may be transformed, as will concepts of free will and responsibility."
