**1. Introduction**

A fundamental search activity begins with the formulation of a search intention and mines meaningful information from the available information space. This helps the user gain intellectual skills and cognitive understanding. Traditional search systems usually support lookup searching, in which the user has a clear idea of the information goal. This type of search relies on the traditional 'Query-Result' paradigm, in which the user poses a query to retrieve relevant documents, browses through the results and analyzes them to fulfill an information need. This approach performs well for short navigational information requests and fulfills an information location need, but it fails for an information discovery need [39]. For discovery-oriented applications, such as uncovering information patterns in genomics, health care data or scientific data, additional assistance is required to formulate queries and navigate the data space to obtain the desired information [16]. In such scenarios, the user is usually uncertain about the information goal and/or less familiar with the data semantics and context, which makes phrasing the information request challenging [12]. Also, initial search aims and intentions evolve as new information is encountered. Hence, the burden of analyzing, reorganizing and keeping track of the information gathered falls on the user alone [16, 17]. Exploratory search is an emerging research area that recognizes the importance of the user's efforts in the multiple phases of discovering, analyzing and learning. Exploratory search systems can deliver good-quality information thanks to their recall-oriented reformulation of a short, ill-phrased typed query into a precise query [23, 29, 37, 38].

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Query Morphing: A Proximity-Based Approach for Data Exploration

http://dx.doi.org/10.5772/intechopen.77073

A user's search tasks can be categorized into three behaviors, Lookup, Learn and Investigate, as shown in **Figure 1**. The user may perform several types of search task in parallel; the searches are therefore depicted as overlapping clouds. Generally, there is interplay between search tasks; for example, a lookup task may interplay with an investigate or learn task. If we analyze the search behaviors, we can relate traditional search tasks to the lookup tasks, in which carefully formulated queries yield precise results with minimal relevance comparison. For exploratory search tasks, the system seeks more involvement than just query specification and result presentation. The tasks allied with exploratory search are of the learn and investigate types. Learning behaviour aims at knowledge acquisition: the user tries to develop additional knowledge about the domain and to better understand the problem context. It is an iterative process that simulates analogical thinking and relates users' experiences to return a set of data objects. Reformulating queries and comparing results take much time in learning tasks. Investigate behaviour reduces gaps in knowledge and transforms existing data into new knowledge.

**Figure 1.** Exploratory search and sub-activities.

From Natural to Artificial Intelligence - Algorithms and Applications

The growth of several competing technologies has led to the generation of large volumes of structured and unstructured operational and transactional data. The key sources include sensors, lab simulators, social media and web pages. In this setting, a fundamental understanding of the complex schema and content is necessary to formulate a data retrieval request; otherwise, the user often ends up with an empty or very large result set. For such situations, we propose an initiative towards 'Query Reformulation' as a vital task of query evaluation, named 'Query morphing'. The proposal extracts relevant and additional data objects from the available data space and then identifies suggestions for intermediate query reformulations.

Morphing refers to a gradual process of transformation of an input, as in image morphing [24, 10] and data morphing [20]. Some traditional information retrieval techniques that transform the initial query submitted by the user are mapped in **Figure 2**. These transformation techniques aim to retrieve relevant information and to improve system performance. Query reformulation techniques perform various transformations, through the user's cognitive effort or system assistance, and formulate semantically equivalent queries to reduce costs [26, 32, 40]. Pre-classified data is required because database abstraction is performed for query reformulation. Successful reformulation depends on understanding the searcher's intent, and query rewriting can be a good option for this kind of query transformation. Query rewriting can be viewed as a generalization of query relaxation, query expansion [1, 40] and query substitution techniques [2, 40]. Query expansion techniques retrieve additional documents by evaluating the input and expanding the original user query with additional terms. Query relaxation techniques contrast with expansion techniques [3]: relaxation generalizes the query, because an ill-phrased query sometimes leads to too few answers. Query substitution techniques transform the original query based on typical possible alternatives [4]. An off-the-shelf dictionary/thesaurus is required for all these query transformation techniques [5].
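As a minimal sketch of how expansion, relaxation and substitution differ, the Python fragment below applies each to a toy query. The thesaurus entries, the `RangePredicate` type and all function names are hypothetical illustrations, not part of any cited system; widening a numeric range is only one common way to relax a query.

```python
from dataclasses import dataclass

# Hypothetical thesaurus; a real system would load an off-the-shelf resource.
THESAURUS = {
    "car": ["automobile", "vehicle"],
    "cheap": ["inexpensive", "affordable"],
}


def expand_query(terms, thesaurus=THESAURUS):
    """Query expansion: add related terms so that more documents can match."""
    expanded = list(terms)
    for term in terms:
        for synonym in thesaurus.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


@dataclass(frozen=True)
class RangePredicate:
    """A numeric selection condition: low <= attribute <= high."""
    attribute: str
    low: float
    high: float


def relax_predicate(pred, factor=0.1):
    """Query relaxation: widen the range by `factor` of its width on each side."""
    margin = (pred.high - pred.low) * factor
    return RangePredicate(pred.attribute, pred.low - margin, pred.high + margin)


def substitute_term(terms, old, new):
    """Query substitution: replace one term with an alternative term."""
    return [new if term == old else term for term in terms]
```

For instance, `expand_query(["cheap", "car"])` adds the thesaurus entries for both terms, while `relax_predicate(RangePredicate("price", 100.0, 200.0), factor=0.5)` widens the range to run from 50.0 to 250.0.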

Techniques grouped towards the left part of **Figure 2** assist users in precise and unambiguous query formulation and execution. Various relevant query recommendations are generated and suggested to assist users in real-time query reformulation. Query suggestion techniques determine a list of relevant queries that may help to meet a user's search need [6, 18]. Query auto-completion techniques complete the query being formulated using queries that have previously been observed in search logs. During a search, a user often issues a sequence of queries with a similar information need; a query chain identifies this sequence. Query logs of earlier queries posed by users globally are required to compute the list of query suggestions. Query recommendation techniques track the user's querying behaviour, identify the area of interest in the available data space and recommend a set of queries that retrieve relevant information. Query steering [12, 26] is one process that navigates the user through complex data structures. Query recommendation and steering require an interactive query session to achieve the ultimate search goal [12].

**Figure 2.** Query transformations and various equivalent techniques.
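The log-based auto-completion described above can be sketched as follows. This is an illustrative sketch only: the `autocomplete` function, its frequency-based ranking and the sample log are assumptions made for the example, not the method of any cited system.

```python
from collections import Counter


def autocomplete(prefix, query_log, k=3):
    """Return up to k logged queries that start with `prefix`, most frequent first."""
    counts = Counter(query.strip().lower() for query in query_log)
    prefix = prefix.lower()
    matches = [(query, n) for query, n in counts.items() if query.startswith(prefix)]
    # Sort by descending frequency, then alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:k]]


# Hypothetical search log, for illustration only.
LOG = [
    "gene expression data",
    "gene expression data",
    "gene therapy trials",
    "genome sequencing cost",
    "health care records",
]

suggestions = autocomplete("gen", LOG)
```

Ranking by raw frequency is the simplest choice; a production system would typically also weigh recency or the user's own session history.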

that retrieves no results or a huge result set. Traditional database management tools and systems are built on the assumption that database semantics are well understood by users [39]. Therefore, current applications with huge and complex databases do not work well with these traditional database management techniques. Researchers have proposed and developed many interactive data exploration strategies that extract and uncover valuable knowledge from complex data via highly ad-hoc interaction.

The Automatic Interactive Data Exploration (AIDE) framework is explained in detail in [16]. In it, the user is directed towards the data area of interest by deliberately incorporating relevance feedback. Various machine learning and data mining techniques can be integrated into it to achieve the best performance. Similarly, the YAML framework suggested in [17] uses attribute-value pair frequencies to make exploration effective. An automatic exploration strategy formulates the user's queries and leads towards relevant information.

**2.2. Query approximation**

In exploratory querying where the user is satisfied with a 'close-enough' answer, approximation modules implemented in the search system help to achieve shorter response times. Such an approximation module is built without changing the underlying database architecture. For example, the Aqua approximate query answering system [4] rewrites queries using summary synopses to provide approximate answers. Approximate Query Processing (AQP) widely uses statistical techniques based on synopses [14] to analyze large amounts of data. Researchers use four main kinds of synopsis for approximation: random sample, histogram, wavelet and sketch synopses.

The most fundamental and commonly used synopsis is random sampling, in which a subset of data objects is fetched by a stochastic mechanism. It is easy to draw samples from a small data set, but making the sampling process scalable requires advanced sampling techniques, e.g. the BlinkDB [6] architecture. In this architecture, samples are selected based on the accuracy and response time of the query, which devises a dynamic sampling strategy. A histogram synopsis groups the data values into subsets by summarizing the attribute frequency distribution or the combined attribute frequency distribution. Advanced methods, such as aggregation over joins, are also used to approximate a more general class of queries. Another synopsis is the wavelet synopsis, which is similar to the histogram synopsis except that it transforms the data and expresses its most significant part in the frequency domain. A faster response is one characteristic of approximate query processing. Speedup with accuracy is the key objective of AQP; therefore, returned results must be verified. Interactive approximate query processing performs error estimation [5] and error diagnosis via closed forms or bootstrap, which guarantees runtime efficiency and resource usage.

**2.3. Assisted query formulation**

Because of the scale of big data and the complex schematic structure of the data, sensible query formalisms are required for complex information retrieval, and these are usually mastered by only a small group of users. Most users in real life apply brute-force approaches, manipulating data by hand, as they have little knowledge of query formulation. Assisted query formulation

With the growth of big data, traditional query transformation methods have repeatedly encountered challenges of relevance. To address these inherent challenges of transformation and relevance when exploring large data, a technique called 'Query morphing' is designed. Our proposal suggests additional relevant data objects for formulating a precise query by exploring the available data space and leveraging user feedback. We believe that query morphing will also acquire the properties of the traditional methodologies, on the observation that the search query and its respective results are analogous to the history log.
