**3. The Semantic Web**

The World Wide Web (WWW, or the Web) is the single largest repository of information. Since its inception the Web has grown tremendously in both content and technology. First-generation Web sites were mainly presentation based: they delivered information through Web pages but did not allow users to interact with them. In short, they contained read-only information. Moreover, the early pages were text only and did not contain multimedia data. These Web sites depended heavily on presentation languages such as the Hypertext Markup Language (HTML) (Horrocks et al., 2004). With the introduction of the eXtensible Markup Language (XML), the information within pages became more structured. XML-based pages could hold content in a more structured manner but still lacked a proper definition of the semantics of that content (Berners-Lee T., 1998). The need for intelligent systems that can exploit the wide range of information available on the Web is widely felt, and the Semantic Web is envisaged to address this need. The term "Semantic Web" was coined by Tim Berners-Lee (Berners-Lee et al., 2001) to propose the inclusion of semantics to better enable machine-people cooperation in handling the huge amount of information that exists on the Web.

The term "Semantic Web" has been defined numerous times. Although there is no formal definition of the Semantic Web, some of its most used descriptions are as follows.

The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation. It is a source from which information can be retrieved from the Web (using Web spiders that read RDF files) and through which data can be accessed by Semantic Web agents or Semantic Web services. Put simply, the Semantic Web is data about data, or metadata (Berners-Lee et al., 2001).
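The idea of attaching well-defined meaning to information as metadata can be sketched with subject-predicate-object statements, the general shape that RDF uses. The sketch below is only an illustration: the identifiers (`ex:page1`, `dc:creator`, and so on) and the `match` helper are assumptions made for this example and do not come from the source or from any particular RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical metadata describing a hypothetical Web page.
TRIPLES = [
    Triple("ex:page1", "dc:title", "Excavation report"),
    Triple("ex:page1", "dc:creator", "ex:alice"),
    Triple("ex:alice", "rdf:type", "ex:Archaeologist"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Who created page1? A software agent can answer this from the metadata alone.
creators = [t.obj for t in match(TRIPLES, subject="ex:page1", predicate="dc:creator")]
print(creators)
```

Because every statement has the same three-part shape, an agent can query the metadata without knowing how the page itself is laid out.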


Spatialization of the Semantic Web 169

Any information system that has to interoperate with other information systems faces the problem of interoperability. The archaeological community has seen tremendous change in the way data are collected and manipulated. On the one hand, technological growth provides added functionality for handling information, which archaeologists cherish; on the other hand, it introduces heterogeneity in the information patterns. Differences in the manners and methods of individual communities within the archaeological domain have led to the development of independent systems, and this has contributed to data incompatibility. The need for a platform providing interoperability between different systems, and in particular between different sets of information, has been widely felt within the community. Indeed, data heterogeneity is the main issue when it comes to exchanging and managing information that describes the real world.

The issue of interoperability has been present in the field of Information Technology ever since computer systems started to communicate with each other. Factors such as data authority, system autonomy and data heterogeneity all affect how efficiently different information systems can interoperate. During the initial stages of the technology, when a system was restricted to a department or at most a company, the issue of interoperability was confined to the departments of that company. Hence data authority was not a major concern. However, the involvement of different departments, and with them different players, raised the issue of data heterogeneity. The evolution of database management systems (DBMSs) fuelled the need for data interoperability. Achieving data interoperability between database systems required addressing several underlying issues, such as structural differences, differences in constraints and differences in query languages. Because these information systems were based on DBMSs, the efficiency of system interoperability depended on tackling the heterogeneity of the underlying data models. As data models are represented through their schemas, the most common approach was to compare the schemas of the DBMSs and convert the schema of one DBMS into that of another. Other approaches, such as building a common model that acts as a broker to interchange data between different DBMSs, were also used. In short, the first-generation problem of data interoperability was mainly due to technical differences in structures, constraints and techniques. These are short-term problems, as they can be resolved with a broker technology mediating between the different technical approaches. The main problem of interoperability arises when there is a difference in understanding. Semantic differences between information sources intensify the interoperability issues as information becomes more accessible and easier to use.
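The broker approach can be illustrated with a minimal sketch in which two source schemas are each mapped to one common model. Everything here is hypothetical: the column names, the `CommonRecord` fields and the helper functions are assumptions chosen for illustration, not a description of any actual DBMS or of a system used by the authors.

```python
from dataclasses import dataclass


@dataclass
class CommonRecord:
    """Record in the shared (broker) model; the fields are hypothetical."""
    site_id: str
    name: str
    year: int


# Each source schema maps its own column names to the common model.
SCHEMA_A = {"id": "site_id", "site_name": "name", "excavated": "year"}
SCHEMA_B = {"code": "site_id", "label": "name", "dig_year": "year"}


def to_common(row: dict, mapping: dict) -> CommonRecord:
    """Rename a source row's columns and normalise the year to an integer."""
    renamed = {mapping[col]: value for col, value in row.items() if col in mapping}
    renamed["year"] = int(renamed["year"])
    return CommonRecord(**renamed)


def from_common(record: CommonRecord, mapping: dict) -> dict:
    """Express a common record using a target schema's column names."""
    return {col: getattr(record, field) for col, field in mapping.items()}


def translate(row: dict, source: dict, target: dict) -> dict:
    """Convert a row between schemas by passing through the common model."""
    return from_common(to_common(row, source), target)


row_a = {"id": "S-01", "site_name": "North trench", "excavated": "1998"}
row_b = translate(row_a, SCHEMA_A, SCHEMA_B)
print(row_b)
```

With a broker, each additional system needs only one mapping to the common model, whereas direct schema-to-schema conversion needs one mapping per pair of systems. The sketch resolves only structural differences; it does nothing about differences in meaning.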

The next generation of systems saw a gradual increase in data types that are not necessarily structured. Such data could be semi-structured data or digital data such as multimedia. During this period, data such as geospatial and temporal data gained wider acceptance within the structured-data community, expanding the horizon of structured data. The influx of tailor-made software applications for these kinds of data raised the question of interoperability much more strongly. Added to this were the rapid growth of Internet technology and the growing tendency to depend on the Internet for information. Information is thus distributed across various systems, each with its own methods of development and presentation. The issue of interoperability revolved around factors such as the technology for dealing with heterogeneous systems with different data structures and patterns, or handling semantic interoperability by handling differences in terminology (Sheth, 1999). The need for a common understanding of information led to the idea of including some form of semantics to represent it. Metadata provided such semantic representations. Metadata is data about data: it describes the data in terms of its creation, storage, management, authority and, to a certain extent, its intended purpose. Metadata became an essential part of any reliable information source and a medium for maintaining interoperability. Likewise, the trend towards standardization, or the adoption of ad hoc standards, made significant progress towards achieving system, syntactic and structural interoperability (Sheth, 1999).

168 Semantics – Advances in Theories and Mathematical Models

The current generation has followed the previous trend of heterogeneity in data sources and has carried it even further. Users have become more sophisticated in their use of information. They expect the system to help them not at the data level but at the information and, increasingly, the knowledge level (Sheth, 1999), and thus expect interoperability at the semantic level. Although metadata provides a certain level of semantics for the data, it is generally not enough for managing the ever-growing volume of information. The context of information needs to be taken into account in order to understand it, and these contexts are managed through ontologies, which are traditionally built to specify vocabularies and their relationships. The underlying semantics of an ontology provides the foundation for interpreting the knowledge within it. This has given a huge boost to interoperability between systems. Using knowledge to understand information across systems and to find common linkages between them provides a framework for interoperation. The issue of interoperability, which started with technical differences, has thus become one of differences in understanding. The technical differences have long been addressed, but the semantic differences have now come to the fore, and they have become an even bigger issue with the amount of information available today. The problem can be tackled by resolving the differences in the understanding of information, so a form of semantic mapping can address such issues of understanding.
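One very simple form of semantic mapping is to link the local terms of each system to shared concepts, so that a term can be translated through the concept it denotes. The sketch below assumes two hypothetical archaeological vocabularies and invented concept names; it illustrates the idea only and is not a mapping method described in the source.

```python
# Each system's local terms are linked to shared concept names.
# Both vocabularies and all concept names are hypothetical.
SYSTEM_A_TERMS = {"pot": "Vessel", "sherd": "Fragment", "dig": "Excavation"}
SYSTEM_B_TERMS = {"vessel": "Vessel", "shard": "Fragment", "campaign": "Excavation"}


def translate_term(term, source, target):
    """Translate a term via its shared concept; None if there is no counterpart."""
    concept = source.get(term)
    if concept is None:
        return None
    for candidate, target_concept in target.items():
        if target_concept == concept:
            return candidate
    return None


print(translate_term("sherd", SYSTEM_A_TERMS, SYSTEM_B_TERMS))
print(translate_term("coin", SYSTEM_A_TERMS, SYSTEM_B_TERMS))
```

The two systems never need to agree on wording, only on the concepts their words refer to. An ontology goes further than this flat lookup by also recording the relationships between concepts.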

Web 3.0 aims to make computers understand the semantics behind information. This would make them intelligent enough to process information and deliver the required knowledge. It could be argued that information, when encapsulated by semantics, provides knowledge. The relationship between Web 3.0 and the Semantic Web is a matter of debate: some suggest that they are the same, whereas others argue that the Semantic Web is a subset of Web 3.0. Sir Tim Berners-Lee has described the Semantic Web as a component of Web 3.0.

"People keep asking what Web 3.0 is. I think maybe when you've got an overlay of scalable vector graphics - everything rippling and folding and looking misty - on Web 2.0 and access to a Semantic Web integrated across a huge space of data, you'll have access to an unbelievable data resource"

Tim Berners-Lee, 2006, (Shannon, 2006)

This chapter covers different features of the Semantic Web.

**3.1 The knowledge base**

Description Logics (DLs) support serialization of real-world scenarios in human-readable form through the classification of concepts and individuals. Moreover, they support the hierarchical structure of concepts in the form of subconcept/superconcept relationships between the concepts of a given terminology. This hierarchical structure enables efficient inference through the relations between different concepts. The individual-concept relationship can be compared to the instantiation of an object from its class in the object-oriented paradigm. In this manner, the approach DLs take can be related to the classification of objects in a real-world scenario.

Description logics provide a formalization for the knowledge representation of real-world situations. This means that they should provide logical answers to queries about real-world situations. This is currently the most researched topic in this domain, and the results are highly sophisticated reasoning engines that exploit the expressiveness of DLs to manipulate knowledge. A Knowledge Representation (KR) system is a formal representation of knowledge described through different technologies. When knowledge is described through DLs, it forms a Knowledge Base (KB), the contents of which can be reasoned over, or inferred from, in order to manipulate them. A knowledge base can be considered a complete package of knowledge content. It is, however, only one part of a KR system, which contains additional components.

Figure 4 (Baader & Nutt, 2002) sketches the architecture of a KR system based on DLs. The central element of such a system is the Knowledge Base (KB), which consists of two components: the TBox and the ABox.

TBox statements are the terms, or terminology, used within the system domain. In general, they are statements that describe the domain through a controlled vocabulary. For example, in a social domain the TBox statements comprise a set of concepts such as People, Male, Female, Father and Daughter, and a set of roles such as marriedTo, siblingOf, sonOf and hasDaughter. The ABox, on the other hand, contains assertions that use the TBox vocabulary. For example, "Ashish is a Male" is an ABox statement. In object-oriented terms, ABox statements comply with TBox statements by instantiating what are equivalent to classes in the TBox and relating the roles (equivalent to methods or properties in the OO paradigm) to those instances. DLs are expressed through the concepts and roles of a particular domain, which matches well the way knowledge is generally expressed. Concepts are sets, or classes, of individual objects. Classes provide an abstraction mechanism for grouping resources with similar characteristics (Bechhofer et al., 2004).

Fig. 4. The Architecture of a knowledge representation system based on DLs.

The concepts can be organized into a superclass-subclass hierarchy, also known as a taxonomy. This shares the object-oriented approach to managing a superconcept-subconcept hierarchy. Subconcepts are specializations of their superconcepts, and superconcepts are generalizations of their subconcepts. The subsumption algorithm determines the superclass-subclass relationships. For example, all individuals of a class must also be individuals of its superclass. In general, every concept is subsumed by its superclasses. In a graphical representation of knowledge, concepts are represented by nodes. Roles are binary relationships between concepts and, consequently, between the individuals of those concepts; they are represented by links in the graphical representation. The description language has a model-theoretic semantics, as the language for building the descriptions is independent for each DL system. Thus, statements in the TBox and in the ABox can be identified with formulae in first-order logic or, in some cases, a slight extension of it (Baader & Nutt, 2002).

**3.2 The Semantic Web stack**

The Semantic Web stack, also called the Semantic Web cake, is a hierarchy of technologies composed of different layers. Each layer takes advantage of the capabilities of the layers below it. Figure 5 illustrates the Semantic Web cake.

Fig. 5. The Semantic Web Stack.
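The TBox, ABox and subsumption relationships described in Section 3.1 can be made concrete with a small sketch built on the social-domain vocabulary. Several things here are assumptions: the hierarchy (for example, that Father is a subconcept of Male), the individual `Asha` and the helper functions are invented for illustration. The hierarchy is simply asserted, so this is not a DL reasoner, which would derive subsumption from concept definitions.

```python
# TBox: terminology. The hierarchy below is an assumed, hand-written taxonomy
# (subconcept -> direct superconcepts) over the concepts named in the text.
TBOX_SUBCONCEPT_OF = {
    "Male": {"People"},
    "Female": {"People"},
    "Father": {"Male"},
    "Daughter": {"Female"},
}
TBOX_ROLES = {"marriedTo", "siblingOf", "sonOf", "hasDaughter"}

# ABox: assertions about individuals. "Ashish is a Male" comes from the text;
# the individual "Asha" and the role assertion are hypothetical.
ABOX_CONCEPTS = [("Ashish", "Male"), ("Asha", "Daughter")]
ABOX_ROLES = [("Ashish", "hasDaughter", "Asha")]


def superconcepts(concept):
    """Return every concept that subsumes `concept`, including itself."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in TBOX_SUBCONCEPT_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific):
    """True if `general` is `specific` or one of its (transitive) superconcepts."""
    return general in superconcepts(specific)


def instances_of(concept):
    """Individuals of `concept`, including those asserted only for a subconcept."""
    return sorted(ind for ind, asserted in ABOX_CONCEPTS if subsumes(concept, asserted))


print(instances_of("People"))
print(subsumes("People", "Father"), subsumes("Father", "People"))
```

The call `instances_of("People")` returns individuals that were asserted only as Male or Daughter, which reflects the rule that all individuals of a class are also individuals of its superclass.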
