**2. Knowledge representation: ontology design**

Ontologies are often regarded as special knowledge repositories that can be read and understood by both people and computers, separated from their developer, and reused. In the context of information technology, an ontology is a formal specification with a hierarchical structure that is designed to represent knowledge. Typically, an ontology includes descriptions of classes of entities (concepts) and their properties (roles) for a certain subject domain of knowledge, as well as relationships between entities and restrictions on how these relationships can be used. Ontologies that additionally include objects (instances of entity classes) and particular statements about these objects are also called knowledge graphs. The formal ontology model O is understood as an ordered triple of the form

O = <*X*, *R*, *F*>, where

*X* is a finite set of entity classes (concepts) for the domain represented by the ontology O; *R* is a finite set of properties (roles) that establish relationships between entities for the domain; and *F* is a finite set of interpretation functions defined on the entities and/or properties of the ontology O. It can be said that the interpretation functions map formal ontologies onto certain domains.
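The triple O = <*X*, *R*, *F*> can be rendered programmatically as a small data structure. The sketch below is only an illustration of the definition, not the project's actual implementation; the class names and the toy "training center" fragment are invented for the example, and the interpretation functions are modeled as plain callables.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Ontology:
    """Formal ontology model O = <X, R, F>."""
    concepts: frozenset        # X: finite set of entity classes (concepts)
    roles: frozenset           # R: finite set of properties (roles)
    # F: interpretation functions that map concepts/roles onto a concrete
    # domain; modeled here as {name: callable} purely for illustration.
    interpretations: dict = field(default_factory=dict)

# Toy example: a fragment of a "training center" domain.
O = Ontology(
    concepts=frozenset({"TrainingCourse", "TrainingVideo", "Professor"}),
    roles=frozenset({"teaches", "containsVideo"}),
    interpretations={
        # Interprets the concept "Professor" as a set of individuals.
        "Professor": lambda: {"I.M. Kapitonov", "B.S. Ishkhanov"},
    },
)
```

Applying an interpretation function, e.g. `O.interpretations["Professor"]()`, yields the individuals of the corresponding class in the chosen domain, which is exactly the mapping role that *F* plays in the formal model.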

As an illustration of the ontology creation process, **Figure 1** below shows a design pattern for an ontology of the "Nuclear Training Center" type, which is used in the project [4]. This model was created on the basis of an analysis of the educational programs of the following Russian and international training centers: National Research Nuclear University MEPhI, the Physics Department of Moscow State University, and the IAEA. The ontology design pattern is represented in UML notation according to the international standard [5]. The actual ontology in serialized form for the knowledge graph titled "Nuclear Physics at MSU and MEPhI" is available in Ref. [6]. Another approach to the development and refinement of the structure of ontologies is based on Terminological Decision Trees (TDT) [7].

One of the attractive features of the semantic web is that it becomes possible to extract (infer) new knowledge from the facts that already exist in the knowledge graph. For this purpose, intelligent software agents called reasoners are used. The way inference is carried out algorithmically is not specified in the ontology itself or in the corresponding OWL document, since OWL is a declarative language for describing ontologies. The correct answer to any question is determined by the semantics of the description logic that sets the language standard.

**Figure 1.** *Design pattern for an ontology of the "Nuclear Training Center" type in UML notation.*
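The kind of inference a reasoner performs can be illustrated with a toy example. The fragment below applies a single RDFS-style rule (if x has type C and C is a subclass of D, then x has type D) by naive forward chaining over a handful of invented triples; real reasoners implement far richer description-logic calculi, so this is only a sketch of the idea.

```python
def infer_types(triples):
    """Naive forward chaining for one rule:
       (x, "type", C) and (C, "subClassOf", D)  =>  (x, "type", D)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (x, p1, c) in list(facts):
            if p1 != "type":
                continue
            for (c2, p2, d) in list(facts):
                if p2 == "subClassOf" and c2 == c and (x, "type", d) not in facts:
                    facts.add((x, "type", d))   # deduced fact, not stated explicitly
                    changed = True
    return facts

# Toy knowledge graph: a training video is a kind of learning object.
kb = {
    ("Lecture1", "type", "TrainingVideo"),
    ("TrainingVideo", "subClassOf", "LearningObject"),
}
inferred = infer_types(kb)
# infer_types derives the new fact ("Lecture1", "type", "LearningObject").
```

The loop runs until no new facts appear, which is the essence of forward chaining; the declarative OWL document would state only the subclass axiom, leaving this procedure to the reasoner.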

The aim of the work presented in the chapter is to create first a working prototype and then a semantic web portal of knowledge in the domain of nuclear physics and nuclear power engineering, based on ontologies and using databases deployed on cloud platforms [3]. The task of the study was to create the following graphs of nuclear knowledge:

• World nuclear data centers

• Events and publications from CERN

• IAEA databases and network services

• Nuclear physics at MSU and MEPhI

• Integrated nuclear knowledge graph

• Nuclear research centers

• Nuclear physics journals

To ensure the effective use of the nuclear knowledge database in educational activities, additional software agents have been created for reconnaissance context-sensitive search for adequate network content and its semantic annotation based on existing knowledge graphs (for example, with the aim of authoring training materials), as well as public endpoints for easy navigation on the international knowledge databases DBpedia and Wikidata. In this way, the tasks of classification and clustering of new entities in the semantic database of nuclear knowledge are solved [3].

The potential beneficiaries of the information solutions and technologies that are proposed in the chapter are students, professors, experts, engineers, managers, and specialists in the domain of nuclear physics and nuclear power engineering (the target audience).

*Cloud Computing Security - Concepts and Practice*

In particular, the project under discussion is based on the description logic with the signature SROIQ (D), see [3].


*Semantic Web and Interactive Knowledge Graphs as an Educational Technology. DOI: http://dx.doi.org/10.5772/intechopen.92433*

The RDF browser is another significant attribute of the project [4], which distinguishes it from other well-known solutions in the field of semantic web. An example of inductive reasoning using the RDF browser is given below. Clusters of entities that are related to each other by a particular property or group of properties are examples of deduced facts (samples of new knowledge) that were not originally explicitly presented in the graph. The deduced facts in the RDF browser have the form of petals grouped around the nodes of the graph, are opened with a mouse click, and are very convenient for subsequent visual navigation in the graph.

Having reached the desired location in the desired knowledge graph using the search widget, the user can then perform visual navigation on the graph through the RDF browser, visiting its nodes in the desired order and extracting the metadata, hypertext links, full-text, and media content associated with each node; the neighborhood (environment, closure) of each node of the graph thereby becomes visible and available for navigation. This neighborhood includes the nodes of the graph through which the user initially entered the semantic web, as well as adjacent nodes of other graphs that are supported by the knowledge database [3].

The visual way of specifying inference rules on the graph makes the project stand out from more traditional reasoner interfaces, where inference rules are specified using the SWRL language, logical predicates, or a SPARQL-like syntax. It seems that the intuitive, interactive visual way of specifying inference rules is friendlier for unsophisticated users of knowledge graphs.
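The clusters of entities grouped by a shared property, which the RDF browser draws as petals around a node, amount to a simple grouping operation over the triples that touch the node. The sketch below shows the idea on invented triples; it is an illustration of the visualization principle, not the browser's actual code.

```python
from collections import defaultdict

def petals(node, triples):
    """Group the RDF triples that mention `node` by property name,
    mimicking the petals drawn around a node in the RDF browser:
    one small petal per triple, one large petal per triple group."""
    groups = defaultdict(list)
    for s, p, o in triples:
        if s == node or o == node:
            groups[p].append((s, p, o))
    return dict(groups)

# Toy graph: two courses typed by the same class, one "teaches" link.
graph = [
    ("CourseA", "type", "TrainingCourse"),
    ("CourseB", "type", "TrainingCourse"),
    ("Kapitonov", "teaches", "CourseA"),
]
g = petals("TrainingCourse", graph)
# The node "TrainingCourse" has one petal group, "type", with two triples.
```

A group with several triples corresponds to a large petal whose related-resource count is displayed on hover, and expanding it reveals the individual triples.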

## **3. Inductive reasoning in knowledge graphs**

Knowledge graphs may contain various kinds of uncertainty. For this reason, the representation of real domains of knowledge in the context of the semantic web may encounter difficulties if only classical logical formalisms are used. Alternative approaches sometimes assume the probabilistic nature of knowledge, which is hardly always appropriate and justified [8]. In addition, purely deductive exact logical reasoning may not be feasible for knowledge databases on the WWW, and such reasoning does not take into account statistical patterns in the data. In this regard, of particular interest is the ability of knowledge databases, as artificial intelligence systems, to evaluate cognitive hypotheses using, in addition to deduction, other methods of reasoning, such as inductive reasoning, argumentation, and reasoning based on precedents.

As an example of inductive reasoning in a knowledge graph, consider the following situation [3]. A student has to pass an exam in nuclear physics at the Physics Department of Moscow State University. Let the student know only the title of the training course, "Physics of the atomic nucleus and particles," and the name of the professor, "I.М. Kapitonov." Let us formulate the task.

Task 1. Using the semantic educational web portal [4], it is required to find and study all the video lectures for this training course.

Let us also assume that the student has found a video lecture on the WWW titled "Lecture 1. Physics of the atomic nucleus and particles." He supposes that this video lecture may be relevant to the training course being studied. Let us formulate a hypothesis.

Hypothesis 1. "Lecture 1. Physics of the atomic nucleus and particles" is taught by professor "I.М. Kapitonov" at the Faculty of Physics of the Moscow State University, and it is part of the training course titled "Physics of the atomic nucleus and particles."
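Hypothesis 1 amounts to checking a conjunctive pattern against the graph: the lecture is part of the course, and the professor teaches the course. In SPARQL this would be an ASK query; the sketch below emulates such a check over a toy triple set in plain Python. The property names ("partOf", "teaches", "includes") are assumptions made for the illustration, not the project's actual vocabulary.

```python
def holds(triples, *patterns):
    """ASK-style conjunctive query: True if every (s, p, o)
    pattern is present in the triple set."""
    facts = set(triples)
    return all(pat in facts for pat in patterns)

# Toy triples mirroring the example (property names are invented).
kb = {
    ("Lecture 1. Physics of the atomic nucleus and particles", "partOf",
     "Physics of the atomic nucleus and particles"),
    ("I.M. Kapitonov", "teaches",
     "Physics of the atomic nucleus and particles"),
    ("Faculty of Physics, MSU", "includes",
     "Physics of the atomic nucleus and particles"),
}

hypothesis_confirmed = holds(
    kb,
    ("Lecture 1. Physics of the atomic nucleus and particles", "partOf",
     "Physics of the atomic nucleus and particles"),
    ("I.M. Kapitonov", "teaches",
     "Physics of the atomic nucleus and particles"),
)
```

The step-by-step navigation described next performs essentially this check interactively, petal by petal, instead of as a single query.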


To solve Task 1 and to verify the validity of Hypothesis 1, the following obvious reasoning should be performed step by step on the knowledge graph.

Step 1. On the educational web portal [4], from the drop-down list, select the knowledge graph "Nuclear Physics at MSU, MEPhI" (the fourth from the top in the list of knowledge graphs). Further, to solve Task 1 and to verify the validity of Hypothesis 1, one can start reasoning either with the corresponding classes "Training course," "Training video," "Professor," etc., or with specific objects "Physics of the atomic nucleus and particles," "Lecture 1. Physics of the atomic nucleus and particles," "I.М. Kapitonov," etc. Let it be decided to start the reasoning with the class "Training course." One should type the first characters of the class name in the corresponding input field of the search widget, for example "Tr," and then in the drop-down list, select the line "Training course." To begin working with the knowledge graph, click the "Start" button, as shown in **Figure 2** below.

Step 2. Depending on the current setting of "Display of knowledge graphs," in a pop-up window, in a new tab of the web browser, or in the same window, the workspace of the RDF browser with the corresponding graph node named "Training course" will be opened, see **Figure 3**. Each graph node has the form of a colored circle equipped with a button for displaying a local pop-up menu and a button for displaying the metadata. In addition, around each node there are petals of various sizes, shapes, and colors, with which it is possible to start step-by-step inductive reasoning and navigation in the graph.

#### **Figure 2.**
*Widgets for quick diving into the knowledge graphs: in the knowledge graph "Nuclear Physics at MSU, MEPhI" select the class "Training course," then click the "Start" button.*

#### **Figure 3.**
*RDF browser: the first node when diving into the knowledge graph "Nuclear Physics at MSU, MEPhI," class "Training course."*

In the upper left part of the workspace of the RDF browser, there are options and help resources (legend, training videos, etc.). In the upper right part of the workspace, the metadata associated with a particular node can be displayed if desired. The petals located around a node correspond to single RDF triples in which this node is involved, or to groups of such RDF triples (large petals). When the user hovers over a petal, a tooltip appears in which it is possible to see the names of the components of the triples. For large petals (triple groups), the number of related resources is also displayed. Any group of triples can be expanded or collapsed with a simple mouse click on the corresponding petal.

Step 3. We are interested in objects which have the type (i.e., belong to the class) titled "Training course." There are three such objects, and they are linked to our node by the "type" property, see **Figure 3**. Click to expand this resource group. Then go to the pop-up local menu of the "Training course" node and click the "View related resources" button (the second one from the right in the row of buttons). The RDF browser will display all the nodes associated with our node by any kind of property, see **Figure 4**. Next, close the extra nodes and leave only those nodes that are associated with our node by the incoming "type" property, see **Figure 5**. In the course of practical work with the graph, it is advisable not to open extra nodes at all, clicking only on the obviously necessary petals. The right side of **Figure 5** shows the metadata for the object named "Physics of the atomic nucleus and particles," which belongs to the class titled "Training course." This object is an obvious candidate for further reasoning. However, **Figure 5** shows two other alternatives that can be left for further consideration in the inductive reasoning.

Step 4. The student is interested in the training course named "Physics of the atomic nucleus and particles" that is taught exactly at the Faculty of Physics of Moscow State University. At this step, it is possible to narrow down the number of alternatives under consideration by taking an interest in the "teaches" property. **Figure 6** shows how this is done. The two alternative training courses taught at the National Research Nuclear University MEPhI should, at this step, be excluded from further consideration.

**Figure 4.**
*RDF browser: Displaying related resources for the class titled "Training course."*

**Figure 5.**
*RDF browser: Displaying the nodes of the graph, essential for continuing the reasoning, and the metadata for the object titled "Physics of the atomic nucleus and particles."*

**Figure 6.**
*RDF browser: Using the "teaches" property to reduce the number of alternatives under consideration.*


Step 5. Continuing the inductive reasoning for the object "Faculty of Physics, MSU" by the property "includes" and/or the reasoning for the object "Physics of the atomic nucleus and particles" by the property "contains video," the student will be convinced of the validity of Hypothesis 1 and will get the solution for Task 1, see **Figure 7** below.

**Figure 7.**
*RDF browser: Solving Task 1 and confirming the validity of Hypothesis 1.*

It is possible to view the detected video lectures without leaving the workspace of the RDF browser, simply by clicking on the corresponding icon in the metadata area for the object named "Lecture 1. Physics of the atomic nucleus and particles."

The result obtained in Step 5 could have been achieved in the course of deductive reasoning, without considering possible alternatives. However, the use of inductive reasoning allows the user to naturally extract additional knowledge from the graph, which would not be easy to obtain with a simple deductive inference [3].

Using the above method, it is easy to discover that professor "B.S. Ishkhanov" also gives lectures on the training course "Physics of the atomic nucleus and particles" at the Faculty of Physics of Moscow State University, see **Figure 7**. All video lectures and other learning objects of both professors for this training course are available. Through the knowledge graph, the full content of any training course and all the existing relationships are clearly revealed.

As can be seen from the above example, the inductive reasoning process in knowledge graphs resembles a computer adventure game, does not require special skills, and is accessible to an inexperienced user. Knowledge graphs similar to those considered are used in real educational activity at the National Research Nuclear University MEPhI. Practice shows that university students master the methods of interactive work with knowledge graphs within a few minutes.

### **4. Knowledge acquisition: context-sensitive search**

As a toolkit that prepares data for ontology refinement and enrichment, a software agent (which is essentially a specialized meta-search engine) for the reconnaissance context-sensitive search for new knowledge in the WWW is provided. To begin with, several characteristic features of popular search engines that are well known to most users should be noted:

• The documents found are ranked by the public search engine in accordance with its internal algorithm, which does not always meet the interests of a particular user.

• It is not always convenient for users to manage the context of the search query, that is, to refine and direct the search.

• Links to commercial sites usually have a higher rating than other search results. This effect is achieved through the use of so-called search engine optimization (SEO) to artificially raise the positions of commercial network resources on the pages of popular search engines, in order to increase the flow of potential customers for the subsequent monetization of traffic.



It seems that the above circumstances and trends make public search engines an increasingly inadequate tool for extracting knowledge from the WWW for educational purposes. The context-sensitive search is based on a simple idea: to create an intermediary (software agent) between the knowledge engineer and the public search engines that helps to systematize the search results in accordance with the engineer's professional needs by effectively filtering out inappropriate content and information garbage. The goal is to harness the power of modern search engines to the maximum extent, including their built-in query languages and other search controls.

When the "Context-sensitive search" software agent is working, the global document search, as well as the search for specialized web resources, is initially performed by the regular search engines (Google Ajax Search, Yandex, Yahoo, and Mail.ru), the interaction with which occurs asynchronously via a dynamic pool of proxy servers, each of which is hosted on the Google Cloud Platform. The results of the work of the regular search engines are a kind of "raw material" for further processing. Specially designed proxy servers on the cloud platform parse these results and generate feeds, which are then sent to the client computer, where snippets are formed from the feeds. These snippets, which contain metadata, undergo additional processing, screening, and sorting before they appear on the monitor of the client computer, as described below. In particular, for each snippet its relevance, persistence, and a number of other indexes are calculated, which are further used to systematize and cluster the search results obtained.
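The post-processing stage can be pictured with a minimal sketch: each snippet is scored against the current search context by simple term overlap and the snippets are sorted by that score. The overlap measure here is a stand-in for the agent's relevance index, whose actual formula is not specified in the chapter, and the sample feed is invented.

```python
import re

def terms(text):
    """Lowercased word set of a text (Latin and Cyrillic letters)."""
    return set(re.findall(r"[a-zа-яё]+", text.lower()))

def relevance(snippet, context_terms):
    """Toy relevance index: fraction of context terms found in the snippet."""
    if not context_terms:
        return 0.0
    return len(terms(snippet) & context_terms) / len(context_terms)

def rank(snippets, context):
    """Sort snippets by descending relevance to the search context."""
    ctx = terms(context)
    return sorted(snippets, key=lambda s: relevance(s, ctx), reverse=True)

# Invented "raw material" from a regular search engine.
feed = [
    "Buy cheap reactors online",                    # SEO noise
    "Nuclear physics video lectures, MSU faculty",  # on-topic snippet
]
ranked = rank(feed, "nuclear physics lectures")
```

With the context "nuclear physics lectures," the on-topic snippet scores 1.0 and the SEO noise scores 0.0, so the sort pushes the commercial result down, which is the filtering effect the agent aims for.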

### **5. Search context**


The query language of some search engines may include so-called "search context" operators. These are special operators used directly in the text of the search query, which allow the user to specify the presence and relative location of specific tokens in the documents found. In this paper, a "search context" is understood in a slightly different way, namely, as a text of limited length that characterizes the domain that is currently of interest to the knowledge engineer.

When setting the search context, the following data sources are available: taxonomies, thesauri, keywords, ontologies, text files from the client computer, and arbitrary resources from the WWW. Any combination of the above methods for setting the search context is allowed. The resulting context is the union of the selected options. The context defined in this way makes it possible to select, sort, and organize the information that comes from the search engines through the proxy servers.
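As a minimal sketch (the term-extraction rule and the three-letter threshold are illustrative assumptions, not the project's actual implementation), the resulting context can be modeled as the union of term sets drawn from the selected sources:

```python
import re

def terms(text):
    """Extract a crude set of terms (words of three or more letters)."""
    return set(re.findall(r"[a-zA-Z]{3,}", text.lower()))

def build_context(*sources):
    """The resulting search context is the union of the selected
    sources: keywords, file contents, downloaded web pages, and so on."""
    ctx = set()
    for source_text in sources:
        ctx |= terms(source_text)
    return ctx

# Combining a keyword list with text taken from a client-side file:
context = build_context("nuclear physics education",
                        "physics of the atomic nucleus and particles")
```

Any further source (for example, the text of an arbitrary WWW resource) simply contributes its terms to the same union.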

**Figure 8** below shows the possible options for setting the search context.


**Figure 8.**

*Setting the context for the reconnaissance context-sensitive search: 1 – setting the context using a file from the client computer; 2 – setting the context using an arbitrary site; 3 – widgets to show the established context.*

### **6. Relevance, pertinence, and metrics**

For the purposes of this paper, the relevance of a snippet is the measure of the similarity between the snippet and the text of the search query. The pertinence of a snippet is the measure of the similarity between the snippet and the search context that was defined earlier. These and other measures are calculated by means of a fuzzy comparison of the corresponding texts. To quantify these measures, the "Context-sensitive search" software agent uses the Levenshtein metric [9]. The algorithm for calculating the relevance of one particular snippet is as follows.

Each lexical unit (token) from the snippet is sequentially compared with each token from the text of the search query. In the case of an exact match of tokens, the relevance of the snippet is increased by 3. If a complete match of the lexemes requires one of the Levenshtein operations (insertion, deletion, or substitution of one symbol), then the relevance of the snippet is increased by 2 rather than 3 (2 = 3 − 1), where 1 is the price of one Levenshtein operation. If a complete match of the lexemes requires two Levenshtein operations, then the relevance of the snippet is increased by 1 (1 = 3 − 2), where 2 is the price of the two Levenshtein operations. If more than two Levenshtein operations are required to match the lexemes, the relevance of the snippet does not increase at all. It is possible to fine-tune the prices (weights) of Levenshtein operations of each kind, which initially (by default) are all equal to one.
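This scoring rule can be sketched as follows (a simplified illustration; the whitespace tokenization and the unit operation weights are assumptions):

```python
def levenshtein(a, b):
    """Minimum number of single-symbol insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution
        prev = curr
    return prev[-1]

def relevance(snippet, query, exact_bonus=3):
    """Compare every snippet token with every query token: an exact
    match adds 3; each Levenshtein operation needed reduces the bonus
    by its price (1 by default); more than two operations add nothing."""
    score = 0
    for s_tok in snippet.lower().split():
        for q_tok in query.lower().split():
            dist = levenshtein(s_tok, q_tok)
            if dist <= 2:
                score += exact_bonus - dist
    return score
```

Pertinence would be computed with the same routine, with the search context passed in place of the query text.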

The algorithm for calculating the snippet's pertinence looks similar, the only difference being that each token from the snippet is successively compared to each token from the search context. As can be seen from the above description, the process of calculating the relevance and pertinence of snippets is a formal one, without analyzing the possible connections of individual tokens and their environment. It is assumed that such an analysis was already performed, to some extent, during the initial search of documents and their full-text indexing in the databases of the regular search engines.

Various options for sorting search results in the final output of the "Context-sensitive search" software agent are allowed. Sorting by the aspect named "dominance index" deserves special mention, as it provides a joint account of the values of many metrics that characterize the adequacy of the snippets. For example,


the dominance index, in addition to the relevance and pertinence of the snippets, can also take into account the measures of similarity between the snippet and the keywords, categories, and properties of the educational portal taken together. For the practical calculation of the values of the dominance index, it seems reasonable to use the formalism of the Pareto dominance relation [10], since Pareto's multicriteria ranking does not presuppose a priori knowledge of the relative importance of aspects (for example, what is more important, relevance or pertinence?).

Suppose we are given an initial set of snippets from which some optimal subset should be chosen, and the choice should be made on the basis of certain ideas about the adequacy of the snippets (the principle of optimality). The selection task is a simple one if there is only a single aspect by which it is possible to compare any two snippets and directly indicate which one is more adequate. The solution of such simple selection problems is obvious. In real situations, it is usually not possible to single out any one aspect. Moreover, it is often difficult to single out aspects at all. The selection and ranking of the aspects that are essential for subsequent selection is, in turn, itself a task of choice. If some aspects are more important (have higher priority) than others, this circumstance should be taken into account in the mathematical model of choice.

The selection task is the algebra $\langle \Omega, O \rangle$, where $\Omega$ is a set of alternatives (in our case, a set of snippets) and $O$ is the optimality principle. The task makes sense only if the set of alternatives is known. Usually, the principle of optimality is unknown.

For further discussion, suppose that each snippet $x \in \Omega$ is characterized by a finite set of aspects $x = \{x_1, x_2, \ldots, x_m\}$. Let $A = \{1, \ldots, m\}$ be the set of aspect numbers to consider when choosing, and let $\{A\}$ denote the set of all subsets of $A$.

It can be assumed that choosing between any two snippets $x$ and $y$ with only one aspect taken into account is a simple task. If this is not the case, the corresponding aspect can be decomposed and presented as a group of simpler aspects. For each pair of snippets $(x, y)$, we define a family of functions $\alpha_j(x, y)$ as follows:

$$\alpha_j(x, y) = \begin{cases} 1, & \text{if } x \text{ exceeds } y \text{ in aspect } j \\ 0, & \text{if } y \text{ exceeds } x \text{ in aspect } j \end{cases} \quad \text{where } j \in A; \ x, y \in \Omega; \tag{1}$$

If $x$ and $y$ are equal or not comparable in some aspect with number $j$, then for that number $j$ the function $\alpha_j(x, y)$ is not defined. Let us form the set $J$ of the numbers of those aspects in which $x$ and $y$ differ:

$$J = \{\, j : j \in A; \ \alpha_j(x, y) \text{ is defined} \,\}, \quad J \in \{A\}; \tag{2}$$

Next, we construct a metric that takes into account the number of aspects by which a particular snippet is inferior to all other snippets. Let there be two snippets $x, y \in \Omega$. Denote by

$$d(y, x) = \sum_{j \in J} \alpha_j(y, x) \tag{3}$$

the number of aspects in which *y* is better than *x*. Then, the value

$$D_{\Omega}(x) = \max_{y \in \Omega} d(y, x) \tag{4}$$

is called the dominance index of $x$ over the presented set $\Omega$. This value characterizes the number of aspects in which the snippet $x$ is not the best in comparison with all other snippets available in the set $\Omega$.
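Under these definitions, the dominance index and the resulting selection of best snippets can be sketched as follows (the two-aspect snippet vectors, relevance and pertinence, are illustrative assumptions):

```python
# Sketch of the dominance-index computation from Eqs. (1)-(4); each
# snippet is an aspect vector (relevance, pertinence), higher is better.

def alpha(x, y, j):
    """alpha_j(x, y): 1 if x exceeds y in aspect j, 0 if y exceeds x,
    and None (undefined) if they are equal in that aspect."""
    if x[j] > y[j]:
        return 1
    if x[j] < y[j]:
        return 0
    return None

def d(y, x):
    """Eq. (3): the number of aspects in which y is better than x."""
    return sum(1 for j in range(len(x)) if alpha(y, x, j) == 1)

def dominance_index(x, snippets):
    """Eq. (4): D_Omega(x) = max over y in Omega of d(y, x)."""
    return max(d(y, x) for y in snippets)

def pareto_set(snippets):
    """Snippets minimizing the dominance index form the Pareto set."""
    d_min = min(dominance_index(x, snippets) for x in snippets)
    return [x for x in snippets if dominance_index(x, snippets) == d_min]

# Four snippets scored on two aspects:
snippets = [(6, 2), (5, 5), (2, 6), (1, 1)]
best = pareto_set(snippets)  # (1, 1) is dominated in both aspects
```

Snippets minimizing this index are the ones selected by the function of Eq. (5) below.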


Let us define the function $C^D(\Omega)$ for selecting the best snippets as follows:

$$C^D(\Omega) = \left\{ x \in \Omega : D_{\Omega}(x) = \min_{x \in \Omega} D_{\Omega}(x) \right\} \tag{5}$$

Here, the value $D_{\Omega} = \min_{x \in \Omega} D_{\Omega}(x)$ is called the index of dominance of the whole set $\Omega$. Snippets with a minimum value of the dominance index form the Pareto set. The Pareto set includes snippets that are the best with respect to all the considered aspects, including relevance and pertinence.

In the project [4], an intuitively more acceptable value is used as the index of dominance, equal to the difference between the number of aspects taken into account and the dominance index determined by formula (Eq. (4)). Groups of snippets with the same value of the dominance index form clusters, which in the final output of the "Context-sensitive search" software agent are arranged in descending order of this index.

As an illustration of the previous computations, **Figure 9** in the next section shows a variant of sorting snippets by dominance index. Snippets are sorted in descending order of the dominance index value when six metrics are taken into account, including snippet relevance and pertinence. When snippets are ordered by the value of the dominance index, within groups of elements with the same value of the dominance index (that is, within a cluster), the snippets are ordered by each of the metrics taken into account in the calculations. Other ways to organize and systematize the content found are available for any combination of metrics that characterize the adequacy of the snippets.

### **7. Knowledge graphs enrichment: semantic annotation**

The database world is a place that is controlled by computers. Supercomputers have amazing computing capabilities, but they struggle when it comes to acquiring new knowledge and experience or putting knowledge into practice. While it is easy for a human to decide whether two or more things are related based on cognitive associations, a computer often fails to do it. Unlike traditional lexical search, where search engines look for literal matches of the query words and their variants, semantic annotation tries to interpret natural language close to how people do it. During semantic annotation, all references to instances related to entities in the ontology are recognized. Semantic annotation is the glue that ties ontologies into document spaces, via metadata.

The working panel for implementing the semantic annotation process is shown in **Figure 9** below. At the top of this panel is a workspace for entering and editing the network resource addresses (URLs) to be annotated. The data in this workspace can be entered from any source, including manually. However, a more technologically advanced approach is to first find on the WWW those network resources that are most adequate to a given domain using the "Context-sensitive search" software agent. The found adequate content can then be easily loaded using the "Download resources" button and included in the list for annotation with a single mouse click.

**Figure 9.**
*Selecting network resources for semantic annotation: 1 – workspace for entering and editing network addresses (URLs) to be annotated; 2 – setting options and loading results of the context-sensitive search; 3 – the most relevant results of the context-sensitive search.*

The settings panel for the semantic annotation process is shown in **Figure 10** below. For annotation, you can select any of the knowledge graphs that are presented in the semantic repository, as well as any combination of them. To calculate measures of similarity between the annotated resource and entities from the knowledge graphs, both text analysis methods and neural networks trained on existing knowledge graphs can be used.

**Figure 10.**
*Setting the options for the semantic annotation process: 1 – selecting and visualizing the knowledge graphs used; 2 – selecting the technology and setting semantic annotation parameters.*
