**5. Querying Integrated Pathway**

Once data integration is complete, the biologist will want to extract information from the integrated data. Several mechanisms exist for extracting information from the resulting integrated database; some of these are described below.

Granular computing with a semantic network structure captures the abstraction and incompleteness associated with biological plant pathway data. It is inspired by the ways in which humans granulate information and reason with coarse-grained information. The three basic concepts underlying human cognition are granulation, organization, and causation. Granulation involves decomposition of a whole into parts, organization involves integration of parts into a whole, and causation involves association of causes and effects. The fundamental issues in granular computing are granulation of the universe, description of granules, and relationships between granules. The basic ideas of crisp information granulation have appeared in related fields, such as interval analysis, quantization, rough set theory, the Dempster-Shafer theory of belief functions, divide and conquer, cluster analysis, machine learning, databases, and many others. Granules may be induced by 1) equivalence of attribute values, 2) similarity of attribute values, or 3) equality of attribute values. We use granules to define the user queries associated with the integrated pathway. Based on the choice of the user (the biologist), granules can be defined to view the integrated pathway at the desired level of detail. This gives the biologist flexibility in using the information.
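
As an illustration of granules induced by equality of attribute values, the following minimal sketch partitions a handful of pathway nodes into granules according to an attribute chosen by the user. The node records, the attribute names (`type`, `compartment`), and the function `granulate` are hypothetical and are introduced only for this example; they are not part of the integrated database described here.

```python
from collections import defaultdict


def granulate(records, attribute):
    """Group record ids into granules; records sharing the same value
    of `attribute` fall into the same granule."""
    granules = defaultdict(list)
    for record in records:
        granules[record[attribute]].append(record["id"])
    return dict(granules)


# Hypothetical pathway nodes (illustrative values only).
nodes = [
    {"id": "n1", "type": "enzyme", "compartment": "chloroplast"},
    {"id": "n2", "type": "metabolite", "compartment": "chloroplast"},
    {"id": "n3", "type": "enzyme", "compartment": "cytosol"},
    {"id": "n4", "type": "metabolite", "compartment": "cytosol"},
]

# The same data viewed through two different user-chosen granulations.
by_type = granulate(nodes, "type")
by_compartment = granulate(nodes, "compartment")
```

Choosing a different attribute yields a different partition of the same nodes, which is the sense in which the granulation can follow the biologist's choice of view.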

Previous approaches to metabolic network reconstruction have used various algorithmic methods to link metabolic information to genes, such as name matching in IdentiCS [52] and EC codes in metaSHARK [53]. The AUtomatic Transfer by Orthology of Gene Reaction Associations for Pathway Heuristics (AUTOGRAPH) method [54] uses manually curated metabolic networks, orthologues, and their related reactions to compare predicted gene-reaction associations.
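
The general idea of linking genes to reactions through shared EC codes can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure implemented in metaSHARK or any of the cited tools; the gene names, reaction identifiers, and the function `link_genes_to_reactions` are hypothetical.

```python
from collections import defaultdict


def link_genes_to_reactions(gene_ec, reaction_ec):
    """Associate each gene with the reactions that share one of its EC codes."""
    # Index reactions by EC code for direct lookup.
    ec_to_reactions = defaultdict(set)
    for reaction, ec in reaction_ec.items():
        ec_to_reactions[ec].add(reaction)

    links = {}
    for gene, ec_codes in gene_ec.items():
        matched = set()
        for ec in ec_codes:
            matched |= ec_to_reactions.get(ec, set())
        links[gene] = sorted(matched)
    return links


# Hypothetical annotations: each gene carries zero or more EC codes.
gene_ec = {
    "geneA": ["1.1.1.1"],
    "geneB": ["2.7.1.1", "1.1.1.1"],
    "geneC": ["4.2.1.11"],
}
# Hypothetical reaction table keyed by reaction identifier.
reaction_ec = {"R_adh": "1.1.1.1", "R_hex": "2.7.1.1"}

links = link_genes_to_reactions(gene_ec, reaction_ec)
```

A gene whose EC codes match no reaction in the table, such as `geneC` above, receives an empty list, which marks it as a candidate for other linking strategies such as name matching or orthology.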

Arrendondo [55] proposes a process for the continuous improvement of the inference system used, which is applicable to any such data mining application. It involves comparing several classifiers, namely Support Vector Machines (SVMs), fuzzy classifiers generated by human experts, fuzzy classifiers generated by a Genetic Algorithm (GA), and neural networks, using different training data models. In this approach, all classifiers were trained and tested with four different data sets: three biological data sets and one synthetically generated mixture data set. The results showed a highly accurate prediction capability, with the mixture data set providing some of the best and most reliable results.
