disease; there is no carrier of a dominant genetic disease, because every person who has the mutated allele gets the disease.

A diploid genome sequence showed that we are genetically more diverse than was previously claimed on the basis of haploid genome sequences (International Human Genome Sequencing Consortium, 2001, 2004; Venter et al., 2001), and that the difference between the two homologous chromosomes of a pair, one inherited from each parent, is bigger than we thought. There were more than 4.1 million DNA sequence variants in this new diploid genome. Single-base variations (single nucleotide polymorphisms, SNPs) are the major variants; small insertions or deletions (indels) and large deletions and duplications (copy number variations) also contribute significantly to the genomic variation (L. Gross, 2007; Levy et al., 2007).

J. Wang et al. sequenced the diploid genome of a Chinese individual (named YH) and found about 3 million SNPs in YH's genome, of which 13.6% were new compared to the SNP database dbSNP. Comparing the three known genome sequences, they found that the genomes of YH, Venter, and Watson shared 1.2 million SNPs, and that the unique SNPs were 31.8% (YH), 30.1% (Venter), and 33.0% (Watson), respectively (J. Wang et al., 2008).

Koreans and Chinese are historically related, and they might have the same ancestors. Nevertheless, the diploid genome sequence of a Korean male (named SJK) was significantly different from that of the Chinese YH: there were 1.3 million different SNPs between the two persons, even though SJK shared more SNPs with YH than with the Caucasians Venter and Watson or with the Nigerian Yoruba male. Of SJK's SNPs, 420,083 (12.2%) had not been found in the dbSNP database before, and 39.87% were SJK-specific (S. Ahn et al., 2009).

More than 99% of the genomic DNA sequence of a Japanese male was the same as the reference human genome, but there were still 3,132,608 single nucleotide variations (SNVs) compared to six other reported human genomes (Fujimoto et al., 2010).

**3.8 Copy number variations (CNV)**

We are in a new era of personalized genomic medicine. With the significantly advanced and simplified DNA sequencing tools and methods that will become available in a few years, we will be able to determine the whole genomic DNA sequence of any person in a few hours at an affordable price (less than one thousand dollars) (L. Gross, 2007).

Besides each person's unique protein-coding sequences, deletions, insertions, and inversions, copy number variations are one of the main reasons that we differ from each other genetically. Copy number variations have been a hot research topic in recent years. The studies aim to uncover diseases that are caused or influenced by copy number variations, and to learn how copy number variations might determine, regulate, and affect our genetic traits and social behaviours. Park et al. discovered 5,177 CNVs in 30 Korean, Chinese, and Japanese individuals, of which 3,547 were putative Asian-specific CNVs (H. Park et al., 2010). On average, each genome carries about 40.3 CNVs, and the median CNV length is 18.9 kb. CNV regions occupy about 8% of the human genome (Yim et al., 2010).

Current research data show that genetically unrelated persons differ significantly from each other in protein-coding sequences, single-base variations (SNPs), small nucleotide insertions and deletions (called indels), and copy number variations.

It is time to compare genomic DNA sequences among family members, relatives, and genetically unrelated persons, to confirm that genomic DNA sequences are much more similar among family members and relatives than among genetically unrelated persons. For example, we are interested to see whether a son's Y chromosome sequence is the same as his biological father's, or how many differences there are between the two if they are not the same. I assume it will be proved that genomic DNA sequences are much more similar among family members than among genetically unrelated persons. Recent research showed that chromosomes with insertions or deletions could affect the process of meiosis (J. Wang et al., 2010). Therefore, if a healthy donor is a family member or relative of a patient, their genomic DNAs could be matched much better, and there should be fewer immunological reactions and rejections.

A gene might be expressed only from the chromosome of paternal or of maternal origin as a result of genomic imprinting, and some genetic diseases, such as Prader-Willi syndrome, Angelman syndrome, and Beckwith-Wiedemann syndrome, are due to genomic imprinting (Falls et al., 1999; Hall, 1990; Tycko, 1994). Additionally, some genetic diseases, such as X-linked severe combined immunodeficiency, glucose-6-phosphate dehydrogenase deficiency, pyruvate dehydrogenase deficiency, Wiskott-Aldrich syndrome, and Becker/Duchenne muscular dystrophy, are sex-linked. Hence, genomic DNA from both a healthy male and a healthy female might be introduced into somatic cells and stem cells of a patient to correct the mutated genes in vitro, possibly making the gene therapy more efficient and effective. Finally, the corrected cells would be given back to the same patient.

#### **3.9 Human gene's exons are separated by introns**

Many human genes contain several exons and introns, and the exons are separated by introns in the human genomic DNA. Introns in a gene can be 10 to 100 times longer than the exons. Statistically, the average exon length is about 170 bp, whereas the average intron size is about 5,419 bp; the average human gene has about 8.8 exons and 7.8 introns. The human nebulin gene has 147 introns. Some introns, such as intron 44 of the human dystrophin gene, can be more than 250,000 bp in length (Hawkins, 1988; Lodish et al., 2008; Sakharkar et al., 2004; V. Tran et al., 2005). Introns are removed from the primary transcript (pre-mRNA) by RNA splicing to form the mature mRNA (Berget et al., 1977; Chow et al., 1977), during or shortly after transcription. The mRNA exits the nucleus via nuclear pores and binds to ribosomes. The ribosome moves along the mRNA and selects the right tRNA by matching the anticodon on a tRNA to a codon on the mRNA strand. Each tRNA carries one specific amino acid, which is attached by an enzyme called an aminoacyl-tRNA synthetase. This is the process of translation: an mRNA sequence is translated into a protein sequence (Goldman, 2008; Lodish et al., 2008).
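The steps described above (transcription, removal of introns by splicing, and codon-by-codon translation) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the exon and intron sequences are invented, the codon table holds only the five codons the toy gene needs, and the function names (`build_gene`, `transcribe`, `splice`, `translate`) are hypothetical. It is a simplified model, not a description of any real gene or bioinformatics tool.

```python
# Toy model of splicing and translation. All sequences below are invented
# for illustration; they do not come from a real human gene.

# Minimal subset of the standard genetic code (mRNA codon -> amino acid).
CODON_TABLE = {"AUG": "M", "GCU": "A", "UGG": "W", "AAA": "K", "UAA": "*"}

# Hypothetical exons and introns, written as the coding DNA strand.
# The toy introns start with GT and end with AG, like most real introns.
EXONS = ["ATGGCT", "TGGAAA", "GCTTAA"]
INTRONS = ["GTAAGTCCCTTTAG", "GTGAGTAAACAG"]


def build_gene(exons, introns):
    """Interleave exons and introns; return the gene and the exon spans."""
    gene, spans = "", []
    for i, exon in enumerate(exons):
        spans.append((len(gene), len(gene) + len(exon)))
        gene += exon
        if i < len(introns):
            gene += introns[i]
    return gene, spans


def transcribe(coding_strand):
    """Simplified transcription: copy the coding strand with T replaced by U."""
    return coding_strand.replace("T", "U")


def splice(pre_mrna, exon_spans):
    """Remove introns by joining only the exon segments of the pre-mRNA."""
    return "".join(pre_mrna[start:end] for start, end in exon_spans)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


gene, exon_spans = build_gene(EXONS, INTRONS)
pre_mrna = transcribe(gene)
mrna = splice(pre_mrna, exon_spans)
protein = translate(mrna)
print(len(gene), len(mrna), protein)
```

Even in this toy gene the spliced mRNA is much shorter than the genomic sequence, and the exons that are adjacent in the mRNA are not adjacent in the genomic DNA.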

The human dystrophin gene is the largest known human gene. It is more than 2,400 kb long and has at least 79 exons; its intron 44 is 250 kb long, and its second largest intron, intron 2, is 170 kb long. About 99% of the dystrophin gene sequence lies in introns. The human dystrophin gene is located at Xp21.2 and is mutated in patients with Duchenne and Becker muscular dystrophies (Dwi Pramono et al., 2000; Golubovsky & Manton, 2005; Koenig et al., 1987, 1988; Nishio et al., 1994; Roberts, 2001; V. Tran et al., 2005; Zhang et al., 2007).
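As a rough arithmetic check, the figures quoted in this section can be combined to see how much of a gene is intronic. The sketch below is only a back-of-the-envelope calculation. It assumes that the average exon and intron sizes can simply be multiplied by the average counts, and it treats the "more than" and "at least" values for dystrophin as exact; the variable and function names are illustrative.

```python
# Back-of-the-envelope check using only numbers quoted in the text.
# Lower bounds ("more than 2,400 kb") are treated as exact values here,
# which is an assumption made for the sake of the arithmetic.

AVG_EXON_BP = 170
AVG_INTRON_BP = 5419
AVG_EXONS_PER_GENE = 8.8
AVG_INTRONS_PER_GENE = 7.8


def intron_fraction(exon_total, intron_total):
    """Fraction of a gene's length that is made up of introns."""
    return intron_total / (exon_total + intron_total)


# Average human gene: total exon and intron length in base pairs.
avg_exon_total = AVG_EXONS_PER_GENE * AVG_EXON_BP
avg_intron_total = AVG_INTRONS_PER_GENE * AVG_INTRON_BP
avg_fraction = intron_fraction(avg_exon_total, avg_intron_total)

# Dystrophin gene, in kilobases.
DYSTROPHIN_GENE_KB = 2400
INTRON_44_KB = 250
INTRON_2_KB = 170
DYSTROPHIN_INTRON_FRACTION = 0.99

two_largest_share = (INTRON_44_KB + INTRON_2_KB) / DYSTROPHIN_GENE_KB
dystrophin_exonic_kb = DYSTROPHIN_GENE_KB * (1 - DYSTROPHIN_INTRON_FRACTION)

print(f"average gene: {avg_fraction:.1%} intronic")
print(f"dystrophin introns 44 and 2: {two_largest_share:.1%} of the gene")
print(f"dystrophin exonic sequence: about {dystrophin_exonic_kb:.0f} kb")
```

Under these assumptions an average gene is already about 97% intronic, the two largest dystrophin introns alone account for 17.5% of the gene, and only about 24 kb of the dystrophin gene is exonic.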

Human hemoglobin is the protein in red blood cells responsible for transferring oxygen from the lungs to the cells of other parts of the human body. Fetal human hemoglobin has two α chains and two γ chains; each polypeptide chain carries a heme group. After birth, expression of the γ-globin genes is turned off, and the two γ chains are replaced by two β chains. Therefore, adult human hemoglobin has two α chains, two β chains, and four heme groups (Feng et al., 2001; Groudine et al., 1983; Hardison, 1996; Yin et al., 2007). The human α-globin gene cluster lies on chromosome 16 (16p13.3) and is about 30 kb long; it has 7 genes: zeta, pseudozeta, mu, pseudoalpha-1, alpha-2, alpha-1, and theta (Barbour et al., 2000; Entrez Gene, 2011; Feng et al., 2001; Higgs et al., 1989). The human β-globin gene cluster is about 100 kb long; it is located on chromosome 11 (11p15.5) and has 5 genes in the order epsilon, gamma-G, gamma-A, delta, and beta. Both the α-globin gene and the β-globin gene have three exons and two introns (Higgs et al., 1989; Yin et al., 2007).

Typically, in viral or plasmid vector mediated gene therapy, normal mRNAs are reverse transcribed into cDNAs; specific cDNAs are amplified by PCR; the PCR products are purified and digested with restriction enzymes; the digested PCR products are inserted into the viral or plasmid vectors; and the vectors containing the normal genes are transfected or transformed into cells, in order to express normal proteins or to correct the mutated genes in vivo.

This procedure has a problem. As described above, a mutated gene might be split by several introns and spread over several places in the genomic DNA, and the cDNA clones of the normal genes are too short to match and find the mutated genes. It is therefore hard to correct the mutated genes in vivo, although the cloned genes might express normal proteins transiently. Transferring normal human genomic DNA into cells from patients could overcome this difficulty.

#### **3.10 Non-coding sequences of genome sequences, and the miracle silkworm**

We are living in an age in which the genomes of many important organisms have been sequenced (S. Ahn et al., 2009; Fujimoto et al., 2010; Holmes et al., 2005; International Human Genome Sequencing Consortium, 2001, 2004; International Silkworm Genome Consortium, 2008; Levy et al., 2007; O'Brien et al., 1999; Venter et al., 2001; J. Wang et al., 2008; Xia et al., 2004). We have gained valuable information from the genome sequence data of these organisms, but we are still far from knowing the secret of life. A silkworm has a short but magical life cycle, which proceeds as follows. It starts from a tiny egg. In a suitable environment, the egg turns into a small worm (larva). The small worm eats mulberry leaves greedily and thoroughly, day and night, and grows bigger and bigger through four rounds of shedding its skin. One day it starts to weave a silk house for itself, a cocoon, and in about 2 days it has made a beautiful and perfect white cocoon. The silkworm pees before weaving the cocoon, which makes its body smaller so that it can fit inside. Inside the cocoon the worm changes into a pupa; before this happens the worm poops, which makes its body smaller still. After about two weeks the pupa becomes a moth, and it is time to get out of the cocoon. The moth is very smart: it pees inside the cocoon, and the chemicals in the urine are powerful. One of them is a special enzyme that can break down the cocoon wall; it softens one end of the cocoon, so that the moth can get out without trouble. The female moth comes out of the damaged cocoon, releases sex pheromones to attract males, mates with a few males, and lays eggs after mating. A new life cycle then starts if the environment is appropriate; otherwise the eggs go through a period of hibernation.

When you think about this miraculous life cycle, you have to believe that the abilities, talents, and skills of a silkworm are not learned from others or from the environment, because no one actually teaches it to do these things step by step. This is especially true for the first silkworm in a group, which starts doing these things earlier than all the others. These inborn skills must have been inherited from its parents, and they are encoded in its genome.

It is estimated that there are only about 20,000-25,000 protein-coding genes in humans; the majority of the genome sequence is non-coding (International Human Genome Sequencing Consortium, 2004). We might have ignored some small protein-coding genes and some alternatively spliced genes, so the actual number of protein-coding genes might be bigger than we have claimed (A. Ahn & Kunkel, 1993; Black, 2003; Dwi Pramono et al., 2000; Muntoni et al., 2003; Nishio et al., 1994; L. Song et al., 2003; V. Tran et al., 2005; Zhang et al., 2007). So far we do not clearly know the meaning and usefulness of these non-coding sequences; the only thing we are almost certain of is that they must have meaning and usefulness. We have read many books, newspapers, and journals; we have watched hundreds of movies and TV shows; we have travelled to numerous places and met a lot of people. We do not know why and how we can remember all these things, or why childhood memories can be stored in our brains for many years and recalled so long afterwards. If we could transfer the information from one person's brain to a computer, it might take up millions of gigabytes of DVD space. In a human brain cell, only genomic DNA molecules could have the capacity to store such huge quantities of information. The mechanism of memory is one of the biggest challenges facing human beings; we should be able to uncover the secret of our brains with our own brains if we are on the right track. One day we might be able to know all the secrets of the silkworm and other organisms, including humans.

**3.11 Graft-versus-host disease (GVHD)**

It is often hard to find a human leukocyte antigen (HLA)-identical sibling or a well-matched unrelated donor when a patient needs a hematopoietic stem cell transplant (HSCT). Sometimes a patient has to receive a mismatched or partially matched bone marrow or cord-blood transplant, because no HLA-matched unrelated donor is available and the transplant is needed urgently. Acute and chronic graft-versus-host disease is the most severe and common long-term side effect of allogeneic HSCT. Acute GVHD is more likely to occur after mismatched marrow transplantation, and chronic GVHD is the major cause of late death of HSCT patients (Eapen et al., 2010; Laughlin et al., 2001, 2004; Mastaglio et al., 2010; Rocha et al., 2004). Cells seem to be able to tolerate foreign DNA without immunological reactions, as shown by animal cloning experiments, transgenic animal models, and human and animal replication phenomena. Therefore, the possible approach I described above might have great benefits and advantages. Hopefully some of the genetic diseases listed below could be cured or improved by using this gene therapy method.

**3.12 Fanconi anemia**

Fanconi anemia (FA) is a rare chromosomal recessive genetic disease. As cited above, there are at least 14 subtypes of Fanconi anemia, and 14 genes whose mutation can cause FA have been cloned. The FANCB gene is on the X chromosome and is the only one located on a sex chromosome; the other 13 FA genes are on autosomes. FA was first described by the Swiss pediatrician Guido Fanconi (1892-1979) in 1927 (Joenje & Patel, 2001; Lobitz & Velleuer, 2006; L. Song, 2009; Tischkowitz & Hodgson, 2003).

Mouse models of Fanconi anemia are currently available; the FancA, FancC, FancG, FancD1, and FancD2 genes have been deleted or mutated in these mice (Parmar et al., 2009).
