**4. Gene signature databases of various diseases**

Gene signature databases of diseases are a complementary resource for drug repositioning. Importantly, disease gene signatures are, to some extent, robust across different tissues and experiments (Dudley et al. 2009). As mentioned in the introduction, it is difficult to model many diseases in parallel in a high-throughput manner. Researchers have collected gene signature datasets for numerous diseases. In practice, however, biologists usually focus on one specific disease and can therefore derive its gene signature themselves. Once they have that signature, they can query a library of drug gene signatures directly to obtain candidate drugs for the disease.
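The query step described above can be illustrated with a minimal sketch. It assumes the disease signature is given as two disjoint sets of up- and down-regulated genes and that each drug profile maps genes to a signed expression change; the function names, gene labels, and drug labels are hypothetical placeholders, and the sign-agreement score is a deliberately simple stand-in for the enrichment-based statistics used by real connectivity tools.

```python
from typing import Dict, List, Set, Tuple


def reversal_score(
    disease_up: Set[str],
    disease_down: Set[str],
    drug_profile: Dict[str, float],
) -> float:
    """Sign-agreement score in [-1, 1] (illustrative, not a published method).

    Negative values mean the drug tends to move signature genes in the
    direction opposite to the disease, i.e. it may reverse the signature.
    Assumes disease_up and disease_down are disjoint.
    """
    total = 0
    n_shared = 0
    for gene in disease_up | disease_down:
        if gene not in drug_profile:
            continue  # gene not measured for this drug
        change = drug_profile[gene]
        disease_sign = 1 if gene in disease_up else -1
        drug_sign = (change > 0) - (change < 0)  # +1, 0, or -1
        total += disease_sign * drug_sign
        n_shared += 1
    return total / n_shared if n_shared else 0.0


def rank_drugs(
    disease_up: Set[str],
    disease_down: Set[str],
    drug_library: Dict[str, Dict[str, float]],
) -> List[Tuple[str, float]]:
    """Rank drugs from most reversing (lowest score) to most mimicking."""
    scores = [
        (drug, reversal_score(disease_up, disease_down, profile))
        for drug, profile in drug_library.items()
    ]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical toy data, for illustration only.
disease_up = {"GENE_A", "GENE_B"}
disease_down = {"GENE_C"}
drug_library = {
    "drug_1": {"GENE_A": -1.2, "GENE_B": -0.8, "GENE_C": 0.9},
    "drug_2": {"GENE_A": 1.0, "GENE_B": 0.5, "GENE_C": -0.3},
    "drug_3": {"GENE_A": -0.4, "GENE_B": 0.6},
}

for drug, score in rank_drugs(disease_up, disease_down, drug_library):
    print(f"{drug}\t{score:+.2f}")
```

In this toy setting the drugs at the top of the ranking are those whose profiles oppose the disease signature, which is the intuition behind treating them as repositioning candidates.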

The gene signatures of diseases were mainly collected from GEO. ADEPTUS (Annotated Disease Expression Profiles Transformed into a Unified Suite) supplied about 14,000 ready-to-use gene signature profiles annotated with Disease Ontology terms [41], and established a standard approach to deriving gene signatures for various diseases. The STARGEO (Search Tag Analyze Resource for GEO) project used a crowdsourcing approach to annotate disease-related samples in GEO so that robust disease signatures could be identified by meta-analysis [42]. It covered about 250 types of diseases and can be further improved through its web server. The DrugVsDiseasedata (Drug versus Disease data) package defined 45 disease gene signatures collected from GEO, such as breast, small-cell lung, cervical, bladder, and prostate cancer [43]. Recently, Porcu et al. used Mendelian randomization to show that differentially expressed genes reflect disease-induced rather than disease-causing changes in the transcriptome. Thus, identifying the upstream genes that cause a disease is a promising direction for the analysis of disease transcriptome data.
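To make the idea of deriving a robust signature by meta-analysis more concrete, the sketch below pools one gene's per-study effect sizes (for example, log fold changes between disease and control samples) with a standard inverse-variance fixed-effect model. This is a generic textbook technique chosen for illustration; it is not confirmed to be the procedure used by STARGEO or any other resource named above, and the function name and input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pool_fixed_effect(
    effects: Sequence[float],
    variances: Sequence[float],
) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effect pooling of one gene's effect sizes.

    effects:   per-study effect estimates (e.g. log fold changes)
    variances: per-study sampling variances of those estimates
    Returns (pooled_effect, standard_error, z_score).
    """
    if len(effects) == 0 or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # More precise studies (smaller variance) receive larger weights.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    std_err = math.sqrt(1.0 / total_weight)
    return pooled, std_err, pooled / std_err


# Hypothetical values for a single gene measured in two studies.
pooled, std_err, z = pool_fixed_effect([1.0, 2.0], [1.0, 1.0])
print(f"pooled effect = {pooled:.3f}, SE = {std_err:.3f}, z = {z:.3f}")
```

A gene whose pooled effect stays large relative to its standard error across independent studies is a more credible signature member than one that is significant in a single dataset only.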

Although several gene signature datasets of diseases exist, more effort is needed to broaden the range of diseases they cover. The Disease Ontology is a useful reference when searching for a disease. As the number of disease gene signatures grows, the search space available to matching algorithms expands, which increases the chance of connecting drugs with diseases.
