**2. Methods and results**

Changes in the cellular environment, such as a sudden change in gravity, are likely to alter the fundamental activities of genes, and any change in the physiological function of a cell or an organism is most likely the result of changes in the expression of certain genes. Genes from many cell types have been shown to be sensitive to microgravity environments (reviewed by Clement 2012). With the advent of high-throughput genomic technologies such as microarrays, large-scale genome-wide studies have been performed to assess the mRNA levels of cultured cells and organisms exposed to microgravity. This is an effective approach because cells control the mRNA abundance of genes through transcription (especially transcription initiation), nuclear pre-mRNA processing, mRNA transport, mRNA stability, etc. The cellular abundance of mRNAs is critical to gene function and protein production, and it is fine-tuned by non-coding regulatory RNAs such as miRNAs. Since the turn of the century, microarray studies have been increasingly used in space life sciences to assess the abundance of mRNAs in response to microgravity. Microgravity biotechnologies combined with microarray technology have been used successfully to study the effect of microgravity on gene expression in a wide variety of cell types. In a previous review, data were combined from all retrievable microarray-based microgravity research to identify the most frequently altered putative "major space genes" [10]. At that time we identified 26 microarray-based microgravity studies in mammalian cells or tissues that had some form of published gene list. In addition, we included the then available results (published gene lists) from four Xenopus studies. Candidate major space genes were defined as genes that appeared to have significantly altered expression levels in at least four studies. The resulting list comprised merely eight potential space genes: CD44, CTGF, CYR61, FN1, MT2, MT1, MARCKS, and TUBA4A [4].

Since 2011, substantially more progress has been made because significantly more studies have been published with retrievable gene lists. The combination of a greater number of studies and a general increase in the availability of published gene lists has enabled us to greatly expand our list of putative "major space genes" from the initial number of eight [4] to the present number of 129 at the same initial level of stringency (a gene's expression was found to be altered by microgravity in four or more studies). Thus, this paper is an extended review and meta-analysis of gene expression profiles to identify major space genes, with emphasis on findings in mammalian cells. To accomplish this, we first defined the method and scope of the current literature-based study to identify the putative major space genes from published microarray-based microgravity studies. We then obtained our novel data at three different confidence levels for the putative major space genes. We further refined the criterion for putative major space genes to include only genes that were found to have altered expression patterns in five or more studies or model cell lines. This higher stringency of selection yielded a more focused group of 35 putative major space genes. Furthermore, we identified 13 genes as the most likely candidates for the major space genes because they have been reported most frequently (≥ 6 studies) as microgravity-sensitive genes. We then performed bioinformatics analysis at each of the three confidence levels of the putative major space genes. We present and discuss the lists of candidate major space genes that are most frequently altered by microgravity environments. We also review and discuss recent advances in the area of microarray-based microgravity research.

94 Biotechnology

The scope of the current study includes all the microarray-based microgravity studies on gene expression regulation that have been documented in the literature. For the initial data collection, we started with a PubMed search using terms such as "microarray and microgravity", "space flight and microarray" and "gene expression and microgravity". From these searches, we identified 48 mammalian microarray studies of microgravity effects. Among these 48 studies, some form of published gene list was available for 38 different cell lines in 35 microarray publications of mammalian cells exposed to microgravity; these provide the initial "materials" on which the current study is based. In this Methods and Results section, we present the methods and results together since they are intimately linked in the current approach. We present them in the following stages: first, the scope of the study data collection, tabulated in Table 1; second, the compilation of the "Master" gene list; third, the identification of the putative major space genes at three different levels of stringency; fourth, bioinformatics analysis of these putative major space genes using the Database for Annotation, Visualization and Integrated Discovery (DAVID) and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING).

#### **2.1. Compilation of published gene expression data into a "Master" gene list**

We collected information on microgravity-sensitive genes from the literature into a tabular format so that the source reference, the model cell type, the type of microgravity, the duration of exposure, the platform of the gene expression analysis, the magnitude and direction of gene expression regulation, etc. were all included in the "Master" gene list. The publications that contributed data to the current study are shown in Table 1. The first step in the analysis of the collected data pool was to convert the published gene expression data into a format that can be compared directly. Since much of the comparison was across species, we chose to use gene symbols rather than accession numbers, mainly because accession numbers differ across species whereas gene symbols are typically the same. In addition, some of the gene lists included both accession numbers and gene symbols, others included accession numbers but no gene symbols, and still others included gene symbols but no accession numbers. Therefore, we chose to use gene symbols for all further comparative analysis of these published data. Specifically, we used the DAVID Gene ID Conversion Tool [11] to convert all the genes differentially expressed in microgravity into the same format for comparison. To do this, we copied the accession numbers from each study and uploaded them into the DAVID Gene ID Conversion Tool. Once uploaded, the accession numbers are automatically converted into the chosen format; in this case, we chose official gene symbols.
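The conversion step described above was done with the DAVID web tool. As a minimal offline sketch of the same idea, the snippet below maps accession numbers to gene symbols through a local lookup table. The function name `to_gene_symbols`, the lookup table, and the accession strings are all hypothetical placeholders, not DAVID's interface or real identifiers.

```python
def to_gene_symbols(accessions, lookup):
    """Map accession numbers to gene symbols using a local lookup table.

    Returns (symbols, unmapped): symbols are de-duplicated in first-seen
    order; unmapped holds the accessions that had no entry in the table.
    """
    symbols, unmapped, seen = [], [], set()
    for accession in accessions:
        key = accession.strip().upper()  # tolerate case and whitespace differences
        symbol = lookup.get(key)
        if symbol is None:
            unmapped.append(accession)
        elif symbol not in seen:
            seen.add(symbol)
            symbols.append(symbol)
    return symbols, unmapped


# Hypothetical lookup table: two placeholder accessions point to the same symbol.
lookup = {"ACC_0001": "FN1", "ACC_0002": "CD44", "ACC_0003": "FN1"}
symbols, unmapped = to_gene_symbols(
    ["acc_0001", "ACC_0003", "ACC_0002", "ACC_9999"], lookup
)
print(symbols)
print(unmapped)
```

Keeping the unmapped accessions separate makes it easy to see which entries of a published gene list could not be carried into a symbol-based comparison.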

We were then able to assemble a "Master" gene list from the 38 published gene lists (PGL) using gene symbols for direct comparison. Our main interest was to determine whether a gene was differentially expressed in microgravity. For biological and technical repeats, the data had already been averaged in the original publications, and the averaged data were presented in the PGL. There are also a few time-course studies that used microarrays to profile microgravity effects on gene expression. If a gene was differentially expressed at any time point in a time-course study, it was included in the master list with its magnitude and direction of differential regulation. Even if a gene was differentially regulated in different directions at different time points, we counted it as a differentially regulated gene. Some of these differences in expression are discussed later in this paper.

This "Master" gene list is by no means a complete gene list since many of the publications in the scope of our current study do not include the full list of differentially regulated genes. Significantly, this master gene list provides the data necessary for the identification of putative major space genes at relatively high confidence levels.
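The time-course rule used for the master list (a gene counts as differentially regulated if it changes at any time point, even when the direction differs between time points) can be sketched as follows. This is an illustrative sketch only: the function name, the `"+/-"` label for mixed directions, and the example calls with placeholder gene names are assumptions, not data from the reviewed studies.

```python
from collections import defaultdict


def collapse_time_course(calls):
    """Collapse per-time-point calls into one entry per gene.

    calls: iterable of (gene, time_point, direction), direction "+" or "-".
    A gene seen in both directions is kept and labelled "+/-".
    """
    directions_by_gene = defaultdict(set)
    for gene, _time_point, direction in calls:
        directions_by_gene[gene].add(direction)

    collapsed = {}
    for gene, directions in directions_by_gene.items():
        collapsed[gene] = "+/-" if len(directions) > 1 else next(iter(directions))
    return collapsed


# Made-up calls for one hypothetical time-course study.
example_calls = [
    ("GENE_A", "4h", "+"),
    ("GENE_A", "24h", "-"),
    ("GENE_B", "24h", "+"),
]
collapsed = collapse_time_course(example_calls)
print(collapsed)
```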



| Organism/Cell or Tissue Type | Microgravity | Array type | Citation |
|---|---|---|---|
| **Human** | | | |
| Renal | RWV/STS | Incyte | [12] |
| Renal | RWV/STS | Incyte | [13] |
| Liver | RWV | 6K Human Array | [14] |
| Jurkat | STS | GeneFilter 20k array | [15] |
| Fibroblast | STS | in house | [16] |
| T-Cells | B | unknown | [17] |
| T-Cells | RPM | Affymetrix Human Genome Focus Array | [18] |
| T-Cells | RWV | Affymetrix Human U133A Array | [19] |
| Muscle | BR | Human AceGene Chip | [20] |
| Endothelial | RPM | unknown | [21] |
| Liver | RWV | Agilent 22k Human Microarray V2 | [22] |
| Osteoblast | RPM | Atlas Glass Human 3.8 Microarray | [23] |
| Skin | RWV | Agilent 22k Human Microarray V2 | [24] |
| Muscle | BR | MWG human 23k oligo array\_version 3 | [25] |
| Osteoblast | DL | Affymetrix Human U133 Plus 2.0 Array | [26] |
| Muscle | ULLS | unknown | [27] |
| Lymphoblastoids | ISS | Agilent 44k Whole Genome Microarray | [28] |
| Stem Cells | RWV | Affymetrix Human U133 Plus 2.0 Array | [29] |
| Lymphoblastoids | RWV | Illumina HumanWG-6 V4 BeadChip/RT2 miRNA PCR Array | [30] |
| T-Cells | RWV/ISS | Affymetrix Human U133 Plus 2.0 Array | [31] |
| Lymphoblastoids | ISS | Panorama Ab Microarray | [32] |
| Thyroid Cancer | RPM | Illumina HumanWG-6\_V2\_0\_R3\_11223189\_A array | [33] |
| Endothelial | P | Illumina HumanWG-6\_V2\_0\_R3\_11223189\_A array | [34] |
| Endothelial | RPM | Illumina HumanWG-6\_V2\_0\_R3\_11223189\_A array | [35] |

RWV = Rotating Wall Vessel; HLS = Hind Limb Suspension; RPM = Random Positioning Machine; P = Parabolic Flight; D = Denervation; B = Balloon; STS = Space Shuttle; ULLS = Unilateral Lower Limb Suspension; DL = Diamagnetic Levitation; ISS = International Space Station

**Table 1.** Microarray Based Studies of Microgravity Effect on Mammalian Cells

#### **2.2. Identification of putative major space genes with different levels of stringency**

Compiling the published gene lists into the master list provided us with an accurate and convenient platform to identify putative major space genes at various levels of stringency using the simple "vote counting" method. At the most basic level, we identified 1199 genes that were differentially regulated in two or more of the documented studies. One level higher, we found 298 genes that appeared to be affected by microgravity in three or more microarray-based microgravity studies. When we set the bar at four or more studies, we identified 129 genes (Table 2), in sharp contrast to the 8 genes found a few years ago at this same level of stringency. Because of the increase in the number of relevant studies, we were able to go beyond the level of four or more studies, which was the highest level possible in our previous report [4]. One step further, we isolated 35 candidate major space genes reported in five or more studies (Table 2). Further still, we found 13 genes that were reported in six or more studies to be microgravity sensitive (Table 2). These two additional levels of higher stringency gave a significantly higher level of confidence in the selection of putative major space genes. We performed further bioinformatics analysis on the differentially regulated genes of the top three stringency levels: gene lists of 129 genes (in ≥ 4 studies), 35 genes (in ≥ 5 studies), and 13 genes (in ≥ 6 studies), respectively.
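The "vote counting" selection described above can be illustrated with a short sketch: each study contributes at most one vote per gene, and a gene passes a stringency level when its vote total reaches the threshold. The function names and the toy gene lists below are assumptions made for illustration and do not reproduce the actual master list.

```python
from collections import Counter


def vote_count(published_gene_lists):
    """Count, for each gene symbol, the number of studies that report it."""
    counts = Counter()
    for genes in published_gene_lists.values():
        counts.update(set(genes))  # set(): one vote per study, even if listed twice
    return counts


def genes_at_stringency(counts, min_studies):
    """Return the genes reported in at least min_studies studies, sorted by symbol."""
    return sorted(gene for gene, n in counts.items() if n >= min_studies)


# Toy published gene lists (made up for this example).
toy_lists = {
    "study_A": ["FN1", "MYC", "FN1"],
    "study_B": ["FN1", "CD44"],
    "study_C": ["FN1", "MYC"],
    "study_D": ["FN1"],
}
counts = vote_count(toy_lists)
print(genes_at_stringency(counts, 2))
print(genes_at_stringency(counts, 4))
```

Raising `min_studies` shrinks the candidate list, which mirrors how the thresholds of four, five and six studies give progressively smaller gene sets in the text.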


Identification of Putative Major Space Genes Using Genome-Wide Literature Data http://dx.doi.org/10.5772/60412 99


| Genes Differentially Regulated in 4 Studies | | | | |
|---|---|---|---|---|
| ADAMTS1 | CCT7 | ETFA | MFNG | RPL29 |
| ADORA2A | CD59 | FOSL1 | MMGT1 | RPL9 |
| ALDOA | CD9 | FST | MMP1 | RPLP0 |
| ANPEP | CD93 | GARS | MRPS35 | SERPINE1 |
| ANXA2 | CDH1 | GJB2 | MX1 | SGK1 |
| ANXA3 | CDV3 | GNG10 | NOTCH1 | SLC16A3 |
| AP1S1 | CFLAR | GPNMB | NTN4 | SNX7 |
| AP3M1 | CKS1B | HBEGF | PDGFRB | SPRY2 |
| ASAP1 | CLDN11 | HERPUD1 | PDIA4 | SRGN |
| ASNS | CLIC3 | ID1 | PECAM1 | TCP1 |
| ATF3 | CNBP | IGFBP6 | PKIA | TFB2M |
| ATP5F1 | CNIH | ITGAV | PLAT | TGM2 |
| ATP6V0D1 | COL8A1 | JUNB | PLOD2 | TLR4 |
| BIRC3 | CXCL2 | KYNU | PLSCR4 | TRIB3 |

**Table 2.** Putative List of Major Space Genes differentially regulated in 4 or more studies

#### **2.3. Bioinformatics analysis of the putative major space genes**

To gain a better understanding of the putative major space genes at the top three stringency levels, we subjected the genes listed in Table 2 to further bioinformatics analysis using the Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7 [59, 60] and the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) [61, 62].

The DAVID gene enrichment analysis allowed us to identify enriched Gene Ontology (GO) terms as well as statistically significant pathways. Each of the top three gene lists was uploaded to the DAVID Functional Annotation tool (http://david.abcc.ncifcrf.gov) to identify the statistically significant KEGG pathways as well as the frequency of genes belonging to a particular Gene Ontology term. DAVID uses a modified Fisher Exact P-value for gene enrichment analysis and statistically determines the over-representation of functional gene categories in a gene list. P-values equal to or smaller than 0.05 are considered strongly enriched [59, 60]. We obtained the potential KEGG pathways as well as enriched functional clusters as defined by DAVID [59, 60].
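DAVID's modified Fisher Exact P-value is usually described as the EASE score: one gene is removed from the list hits before a one-sided Fisher exact (hypergeometric) test is computed, which makes the result more conservative. The sketch below shows that calculation using only the Python standard library. The function names and the counts in the example are illustrative assumptions and are not taken from the analyses reported here.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the category."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def fisher_one_sided(list_hits, list_total, pop_hits, pop_total):
    """Standard one-sided Fisher exact P-value for over-representation."""
    return hypergeom_upper_tail(list_hits, list_total, pop_hits, pop_total)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Modified Fisher exact P-value: same test with one list hit removed."""
    return hypergeom_upper_tail(max(list_hits - 1, 0), list_total, pop_hits, pop_total)


# Illustrative counts: 3 of 300 list genes fall in a category that
# contains 40 of 30000 background genes.
p_fisher = fisher_one_sided(3, 300, 40, 30000)
p_ease = ease_score(3, 300, 40, 30000)
print(f"Fisher exact: {p_fisher:.4f}, EASE: {p_ease:.4f}")
```

With these counts the standard test falls below 0.05 while the modified score does not, which shows how the adjustment penalizes categories supported by only a few genes.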

For the 129 genes differentially expressed in ≥ 4 microgravity studies, the pathway analysis resulted in eight pathways at P value ≤ 0.05. The KEGG pathway analysis showed that the largest number of enriched genes were in pathways directly related to various cancers. The second largest pathway identified was focal adhesion (Table 3).


| Term | Count | PValue | Genes |
|---|---|---|---|
| Bladder cancer | 4 | 0.02 | IL8, CDH1, MYC, MMP1 |
| Focal adhesion | 8 | 0.02 | CAV2, CAV1, ITGAV, ITGB4, PDGFRB, ITGA10, BIRC3, FN1 |
| Small cell lung cancer | 5 | 0.028 | CKS1B, ITGAV, BIRC3, MYC, FN1 |
| ECM-receptor interaction | 5 | 0.028 | CD44, ITGAV, ITGB4, ITGA10, FN1 |
| Ribosome | 5 | 0.031 | RPL17, RPL9, RPLP0, RPL10A, RPL29 |
| Pathways in cancer | 10 | 0.035 | CKS1B, FOS, IL8, ITGAV, PDGFRB, CDH1, BIRC3, MYC, MMP1, FN1 |
| Pathogenic Escherichia coli infection | 4 | 0.043 | LOC399942, TUBA4A, CDH1, TLR4 |
| NOD-like receptor signaling pathway | 4 | 0.053 | CCL2, IL8, CXCL2, BIRC3 |

**Table 3.** KEGG pathway analysis of 129 putative space genes

We also conducted DAVID functional cluster analysis to determine functionally enriched gene sets from the list of 129 genes differentially regulated in ≥ 4 studies. We set the classification stringency to Highest and used a P-value cut-off of ≤ 0.05 for inclusion of a term on the list. Based on these criteria, we generated a list with 40 functionally enriched GO categories (Table 4). Some of the top functional categories (based on P-value) were regulation of apoptosis (17.8%), ion homeostasis (8.5%), cell motility (7.75%), and insulin-like growth factor binding (4.65%).

| Term | Count | % | PValue |
|---|---|---|---|
| GO:0005520~insulin-like growth factor binding | 6 | 4.65 | 1.85E-06 |
| GO:0042981~regulation of apoptosis | 23 | 17.8 | 4.54E-06 |
| GO:0043066~negative regulation of apoptosis | 14 | 10.9 | 2.57E-05 |
| GO:0043065~positive regulation of apoptosis | 13 | 10.1 | 6.73E-04 |
| GO:0016477~cell migration | 10 | 7.75 | 0.001106 |
| GO:0051674~localization of cell | 10 | 7.75 | 0.002297 |
| GO:0048870~cell motility | 10 | 7.75 | 0.002297 |
| GO:0006873~cellular ion homeostasis | 11 | 8.53 | 0.002595 |
| GO:0007596~blood coagulation | 6 | 4.65 | 0.002624 |
| GO:0055082~cellular chemical homeostasis | 11 | 8.53 | 0.002909 |
| GO:0007599~hemostasis | 6 | 4.65 | 0.00336 |
| GO:0050801~ion homeostasis | 11 | 8.53 | 0.004886 |
| GO:0030005~cellular di-, tri-valent inorganic cation homeostasis | 8 | 6.2 | 0.005339 |
| GO:0032496~response to lipopolysaccharide | 5 | 3.88 | 0.005751 |
| GO:0002237~response to molecule of bacterial origin | 5 | 3.88 | 0.008466 |
| GO:0006469~negative regulation of protein kinase activity | 5 | 3.88 | 0.008811 |
| GO:0051412~response to corticosterone stimulus | 3 | 2.33 | 0.009483 |

**Table 4.** GO categories for the 129 space genes. Processed through DAVID with stringency set at highest

Next, we submitted the list of 35 genes that were differentially regulated in five or more studies to DAVID for bioinformatics analysis. The KEGG pathway analysis identified the focal adhesion and extracellular matrix (ECM)-receptor interaction pathways as those with the largest number of enriched genes (Table 5).


**Table 5.** KEGG Pathways associated with 35 genes that are differentially regulated in 5 or more studies

We processed the same list of 35 genes through the DAVID Functional Clustering Tool using the highest stringency setting and generated a list with 31 enriched GO categories (Table 6). Some of the top categories are cell adhesion (22.9%), biological adhesion (22.9%), response to steroid hormone stimulus (20%), response to hormone stimulus (20%), response to endogenous stimulus (20%), regulation of apoptosis (20%), regulation of programmed cell death (20%), regulation of cell death (20%), and insulin-like growth factor binding (11.4%).


| Term | Count | % | P Value |
|---|---|---|---|
| GO:0048545~response to steroid hormone stimulus | 7 | 20 | 6.09E-06 |
| GO:0005520~insulin-like growth factor binding | 4 | 11 | 2.47E-05 |
| GO:0009725~response to hormone stimulus | 7 | 20 | 2.28E-04 |
| GO:0009719~response to endogenous stimulus | 7 | 20 | 3.87E-04 |
| GO:0007155~cell adhesion | 8 | 23 | 0.00126 |
| GO:0022610~biological adhesion | 8 | 23 | 0.00128 |
| GO:0019838~growth factor binding | 4 | 11 | 0.00178 |
| GO:0005539~glycosaminoglycan binding | 4 | 11 | 0.00403 |
| GO:0016477~cell migration | 5 | 14 | 0.00435 |
| GO:0030247~polysaccharide binding | 4 | 11 | 0.00525 |
| GO:0001871~pattern binding | 4 | 11 | 0.00525 |
| GO:0051495~positive regulation of cytoskeleton organization | 3 | 8.6 | 0.00535 |
| GO:0051674~localization of cell | 5 | 14 | 0.00634 |
| GO:0048870~cell motility | 5 | 14 | 0.00634 |
| GO:0042981~regulation of apoptosis | 7 | 20 | 0.01206 |
| GO:0043067~regulation of programmed cell death | 7 | 20 | 0.01263 |
| GO:0010941~regulation of cell death | 7 | 20 | 0.01284 |
| GO:0006916~anti-apoptosis | 4 | 11 | 0.01357 |
| GO:0010638~positive regulation of organelle organization | 3 | 8.6 | 0.01736 |
| GO:0030005~cellular di-, tri-valent inorganic cation homeostasis | 4 | 11 | 0.01756 |
| GO:0055066~di-, tri-valent inorganic cation homeostasis | 4 | 11 | 0.02011 |
| GO:0030003~cellular cation homeostasis | 4 | 11 | 0.02357 |
| GO:0030324~lung development | 3 | 8.6 | 0.02416 |
| GO:0030323~respiratory tube development | 3 | 8.6 | 0.02554 |
| GO:0060541~respiratory system development | 3 | 8.6 | 0.02839 |
| GO:0055080~cation homeostasis | 4 | 11 | 0.03198 |
| GO:0014706~striated muscle tissue development | 3 | 8.6 | 0.03393 |
| GO:0060537~muscle tissue development | 3 | 8.6 | 0.03711 |
| GO:0051493~regulation of cytoskeleton organization | 3 | 8.6 | 0.04324 |
| GO:0044087~regulation of cellular component biogenesis | 3 | 8.6 | 0.04674 |
| GO:0030246~carbohydrate binding | 4 | 11 | 0.04741 |

**Table 6.** 31 enriched GO categories generated from the list of 35 putative space genes.

To visualize the associations between the genes in the network, we performed further bioinformatics analysis using STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) [61, 62]. STRING allows us to examine co-occurrence, co-expression, and experimental evidence for relationships between the genes of interest. For our analysis, physical and functional interactions among the genes were determined using the high confidence score of 0.7. We uploaded the 129 genes that were differentially regulated in at least 4 of the studies to STRING, and the resulting gene association network is shown in Figure 1. The blue lines indicate an association; the thicker the line, the higher the level of confidence. Most of the genes that cluster near the center with strong associations are among the 35 genes we identified as differentially regulated in five or more studies. For example, FN1 (identified in 6 or more studies) shows a strong association with MYC, EGR1 and CTGF, all of which were also identified in 6 or more studies. FN1 also shows a strong association with LOX, CD44, IGFBP3 and IL8, which were identified in 5 or more studies. FOS, another gene identified in 6 or more studies, shows a strong association with MYC, EGR1, MMP10, and IL8, which are genes identified in 5 or more studies.
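The STRING step keeps only associations whose combined confidence score reaches the high-confidence cut-off of 0.7. A minimal sketch of that filtering, followed by a count of how many retained associations each gene has, is given below. The edge list uses placeholder gene names and invented scores and is not STRING output. The function names are likewise assumptions for illustration.

```python
def filter_edges(edges, min_score=0.7):
    """Keep only (gene_a, gene_b, score) edges at or above the confidence cut-off."""
    return [(a, b, score) for a, b, score in edges if score >= min_score]


def degree(edges):
    """Count, for each gene, the number of retained associations it takes part in."""
    counts = {}
    for a, b, _score in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Invented example edges: (gene_a, gene_b, combined confidence score).
example_edges = [
    ("GENE_A", "GENE_B", 0.90),
    ("GENE_A", "GENE_C", 0.95),
    ("GENE_D", "GENE_B", 0.99),
    ("GENE_A", "GENE_E", 0.45),  # below the cut-off, so it is dropped
]
kept = filter_edges(example_edges, min_score=0.7)
degrees = degree(kept)
print(sorted(degrees.items(), key=lambda item: (-item[1], item[0])))
```

Genes with the highest counts correspond to the densely connected nodes near the center of a network view.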

**Figure 1.** 129 genes that were differentially regulated in at least 4 of the studies were uploaded to STRING. This view shows the evidence of the association between genes. The thicker the line the higher the confidence level.

STRING analysis of the 35 genes differentially regulated in 5 or more studies shows more clearly the strong association between FN1, EGR1, CTGF, LOX, MYC, FOS, IGFBP3, and CD44 (Figure 2). Note that the genes FN1, EGR1, CTGF, MYC, and FOS are among the genes that were differentially regulated in six or more studies, and were therefore identified as the candidate major space genes at the highest confidence level in the present study.

**Figure 2.** 35 genes that were differentially regulated in at least 5 of the studies were uploaded to STRING. This view shows the evidence of the association between genes. The thicker the line the higher the confidence level.

To further examine the nature of the top 13 genes that were identified in six or more studies as gravity sensitive, we compiled them into a table showing the species, cell types, types of microgravity, duration in microgravity and sources of references, as well as the directions of the differentially regulated genes (Table 7). From this table we can see that none of the genes was consistently differentially regulated in the same direction. Genes that tend to co-express, such as MT1 and MT2 or CTGF and CYR61, also seemed to co-express in the same direction in these studies. It is not clear why there is such a divergence in the expression patterns. Variables such as different cell types, different species, different forms of microgravity, duration of exposure, and different microarray platforms may be contributing factors.
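The check for consistent direction of regulation across studies can be sketched as a simple tally of "+" and "-" calls per gene. The function name, the placeholder gene names and the calls below are made up for illustration and do not reproduce Table 7.

```python
def direction_summary(calls_by_gene):
    """Tally up- and down-regulation calls per gene and flag consistent genes.

    calls_by_gene: dict mapping a gene symbol to a list of "+" / "-" calls,
    one per study or condition.
    """
    summary = {}
    for gene, calls in calls_by_gene.items():
        up = calls.count("+")
        down = calls.count("-")
        summary[gene] = {
            "up": up,
            "down": down,
            # consistent only if every call points in the same direction
            "consistent": up == 0 or down == 0,
        }
    return summary


# Made-up calls for two hypothetical genes across six studies.
example = {
    "GENE_A": ["+", "-", "+", "+", "-", "+"],
    "GENE_B": ["-", "-", "-", "-", "-", "-"],
}
summary = direction_summary(example)
for gene, stats in summary.items():
    print(gene, stats)
```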



BR = Bed rest; HLS = Hind limb suspension; ISS = International Space Station; ML = Magneto Levitation; PF = Parabolic flight; RPM = Random Positioning Machine; RWV = Rotating Wall Vessel; STS = Space Shuttle. "+" indicates up-regulation; "-" indicates down-regulation.

**Table 7.** The top 13 putative space genes organized according to types of microgravity. The columns under the gene symbols show the direction of differential regulation in each cell type and microgravity condition.
