**2. Related work**

Berners-Lee, Hendler, and Lassila [2] published the article "The Semantic Web" in 2001, marking the start of semantic web research. The World Wide Web Consortium (W3C) later established a series of technical specifications that promoted the further development of the semantic web; specifications such as RDF, OWL, and SPARQL have allowed the semantic web to be applied in many research fields and have laid a foundation for knowledge representation, knowledge organization, and information retrieval on the Internet. Ontology is one of the backbones of the semantic web and has been widely used to specify standard concept vocabularies for exchanging data between systems, help answer queries, publish reusable knowledge bases, and provide services that facilitate operations across heterogeneous systems and databases [3]. In 2006, Berners-Lee [4] first proposed the concept of "linked data", which has since become a widely studied research topic in the computer science (CS) and library and information science (LIS) fields. Linked data builds associations between objects through the Resource Description Framework (RDF) structure, ultimately revealing the relationships and implicitly shared knowledge between heterogeneous sets of data. After more than 10 years of development, linked data has seen numerous breakthroughs in both theory and technology. To date, the Linking Open Data project [5] has transformed billions of web data points (e.g., Wikipedia, geographic data, government data) into RDF triples, creating one massive data network.

In recent years, researchers have begun to introduce semantic web technology to citation analysis in an effort to exploit ontology, linked data, and other technologies to improve the description of citation behaviors and motivations. The most representative example is the set of semantic publishing and referencing (SPAR) ontologies created by Shotton, Portwin, Klyne, and Miles [6]. The Citation Typing Ontology (CiTO) is the SPAR ontology used to describe the relationship between citing papers and cited papers; it provides reference information such as background, method, citation type (e.g., journals, books, reports), peer review, and more. CiTO's citation types include factual relationships and rhetorical relationships. The current version (CiTO 2.4.6) allows authors to annotate their references with their citation motivations, thus helping to reveal indirect and implicit relationships at work in scholarly literature. Ciancarini et al. [7] presented an experiment investigating the main difficulties in using CiTO and how people understand and adopt it. Iorio et al. [8] proposed a tool called CiTalO, which automatically annotates the nature of citations with properties defined in CiTO by means of semantic web and NLP techniques. By contrast, Recupero et al. [9] created SHELDON to extract citation RDF data from text using a machine reader; CiTO was again used to describe the citation relationship.
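To make the idea of a typed citation concrete, the following minimal sketch writes a few citation statements as N-Triples lines using CiTO properties. It is an illustration only: the paper identifiers under `http://example.org/paper/` are hypothetical, the helper names `triple` and `cited_with` are ours, and the three properties shown (`usesMethodIn`, `obtainsBackgroundFrom`, `disputes`) are just a small sample of the CiTO vocabulary.

```python
# CiTO's namespace is the published SPAR one; the paper IRIs are made up.
CITO = "http://purl.org/spar/cito/"
EX = "http://example.org/paper/"  # hypothetical identifiers


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement as an N-Triples line (IRIs only)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Each citation carries a type instead of being a bare "cites" link.
citations = [
    (EX + "A", CITO + "usesMethodIn", EX + "B"),
    (EX + "A", CITO + "obtainsBackgroundFrom", EX + "C"),
    (EX + "A", CITO + "disputes", EX + "D"),
]

ntriples = [triple(s, p, o) for s, p, o in citations]


def cited_with(relation: str) -> list[str]:
    """Return the works cited with the given CiTO property."""
    return [o for s, p, o in citations if p == CITO + relation]


for line in ntriples:
    print(line)
```

Because the relation is part of each statement, a consumer can select, for instance, only the works that paper A disputes via `cited_with("disputes")`, which a plain citation count cannot express.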

Citation analysis methods and tools are overly dependent on citation databases, which have the following drawbacks:

**1.** All citation acts are treated as equally important.

**2.** All kinds of statistical indicators are based on specific instances of citation, which are annotated only by the author.

**3.** Citation databases can only reveal whether there is a reference shared between different papers but fail to reflect any deeper relationships among semantic citations.

Motivations and behaviors related to citation have been analyzed by researchers from various angles. In 2014, a content-based citation analysis method [1] was also proposed. In this chapter, we propose a new citation analysis framework based on ontology and linked data; our goal is to enhance the efficacy of citation analysis via semantic web technology.

Other researchers, for example, Ding, Konidena, Sun, and Chen [10], have also explored the idea of semantic citation, suggesting that individuals can use ontology and linked data to describe bibliographic data and publish it as RDF triples. Mahmood, Qadir, and Afzal [11] combined semantic web technology with credible citation analysis to establish a framework that provides openness and reliability validation for all stages of the citation behavior lifecycle. The framework requires the use of semantic metadata at all stages of academic publishing to annotate citation behavior and generate machine-readable RDF triples. This kind of annotation lets authors, publishers, database vendors, and citation analysis systems work together to build a set of reliable reference information while eliminating false or misleading citation actions in the literature. More recently, Peroni et al. [12] experimentally described references in a suitable machine-readable RDF format to make reference lists freely available to all academics. The Open Citation Corpus [13] was created to store citation data from open access databases.

It is difficult for researchers to move quickly into an unfamiliar field because of the mass of scientific articles [14] that must be reviewed without prior knowledge of their contents. In a traditional citation information service, search results are generated by matching keywords and other user-supplied information against specific knowledge resources. Such a method is simple, but it often ignores the semantic level of the knowledge resources and therefore misses a significant number of semantically relevant resources [15]. It may also return a large number of studies that still do not meet the user's personalized knowledge needs [16].

In 2001, Aronson [17] argued that ontology-based query refinement is more efficient than the other methods available at the time. From the perspective of information organization, ontology is a new method of knowledge organization and processing, and it is also the basis of the semantic web. It can systematize and organize a large amount of relevant information. Applying ontology to information retrieval requires applying ontological principles to the information resources, so that search reasoning is carried out by the logical rules contained in the ontology itself and a high-quality retrieval result is returned. Given the shortcomings of traditional citation information services, introducing ontology may help users improve searches aimed at retrieving multiple citations. In 2012, Kara, Alan, Sabuncu, Akpınar, Cicekli, and Alpaslan [18] found that while thesauri are concerned with meanings at the level of words, ontologies deal with meanings at the level of the real-world entities that those words denote.
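As a rough illustration of how an ontology's logical rules can drive retrieval, the sketch below expands a query term with its transitive subclasses before matching. It is not the method of any cited work: the class hierarchy in `SUBCLASSES`, the sample documents, and the function names `expand` and `search` are all hypothetical.

```python
# Hypothetical mini-ontology: each class maps to its direct subclasses.
SUBCLASSES = {
    "publication": ["journal article", "book", "report"],
    "journal article": ["review article", "research article"],
}


def expand(term: str) -> list[str]:
    """Return the term plus all of its transitive subclasses (breadth-first)."""
    result, queue = [term], [term]
    while queue:
        current = queue.pop(0)
        for child in SUBCLASSES.get(current, []):
            if child not in result:
                result.append(child)
                queue.append(child)
    return result


def search(term: str, documents: dict[str, str]) -> list[str]:
    """Return ids of documents whose type is the term or any subclass of it."""
    wanted = set(expand(term))
    return [doc_id for doc_id, doc_type in documents.items() if doc_type in wanted]


# Document id -> document type; an exact keyword match on
# "journal article" would find none of these.
docs = {"d1": "review article", "d2": "book", "d3": "dataset"}

print(search("journal article", docs))
print(search("publication", docs))
```

The subclass rule is the only reasoning step here; a real ontology would also carry properties, equivalences, and constraints that a reasoner could exploit.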

In recent years, as ontology research has advanced, ontology-based knowledge services have been developed in different areas, including personalized medicine [19], e-government [20], medicine [21, 22], smart homes [23], and digital libraries [24, 25].

The Impact on Citation Analysis Based on Ontology and Linked Data

http://dx.doi.org/10.5772/intechopen.76377

information, so we choose the seven-step method developed at Stanford University. The seven steps are (1) defining the domain and category of the ontology, (2) examining the possibility of reusing existing ontologies, (3) listing the important terms in the ontology, (4) defining the hierarchical system of classes, (5) defining the properties of the classes, (6) defining the facets of the properties, and (7) creating the instances. We also use Protégé, the most popular ontology editor, as our ontology development tool.

The construction of BCO is based on references. From the list of references, information such as the author, periodical, document type, year, volume and issue, and page number is extracted as the classes of BCO. In order to extend the dimensions of citation analysis, we extend the subclasses from the perspective of journal and author. The "reference number" class is also added to the article, and the importance of a reference is measured by the numbers of internal and external references. For property definitions, we reused existing ontology properties (e.g., "fabio:hasPublicationYear," "bibo:volume") and marked the newly added properties with the "bco" prefix. An example of the BCO ontology's classes and properties is shown in **Figure 1**.

**Figure 1.** Example classes and properties of bibliographic citation ontology.

The construction of FCO begins with three aspects: citation function, citation sentiment, and citation position. The citation function represents the role the cited work plays for the citing work, such as background development, data support, methodology support, extension, or refutation. Citation sentiment expresses the emotional attitude of the citing work toward the cited work, such as positive, neutral, or negative. Citation position indicates the section in which the citation occurs, such as the "Introduction" section of the document. An example of the FCO ontology's classes and properties is shown in **Figure 2**.
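The three FCO aspects (citation function, sentiment, and position) can be pictured as a small typed record attached to each citation act. The sketch below is an assumption-laden illustration rather than the chapter's actual ontology: the predicate names `fco:hasFunction`, `fco:hasSentiment`, and `fco:hasPosition`, the `ex:` identifiers, and the class names are all hypothetical; only the enumerated values come from the description of FCO.

```python
from dataclasses import dataclass
from enum import Enum


class Function(Enum):
    BACKGROUND = "background"
    DATA_SUPPORT = "data support"
    METHODOLOGY_SUPPORT = "methodology support"
    EXTENSION = "extension"
    REFUTATION = "refutation"


class Sentiment(Enum):
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


@dataclass(frozen=True)
class CitationAnnotation:
    citing: str
    cited: str
    function: Function
    sentiment: Sentiment
    position: str  # section in which the citation occurs

    def triples(self) -> list[tuple[str, str, str]]:
        """Describe this citation act as (subject, predicate, object) statements."""
        act = f"ex:{self.citing}-cites-{self.cited}"  # hypothetical identifier scheme
        return [
            (act, "fco:hasFunction", self.function.value),    # hypothetical predicate
            (act, "fco:hasSentiment", self.sentiment.value),  # hypothetical predicate
            (act, "fco:hasPosition", self.position),          # hypothetical predicate
        ]


annotation = CitationAnnotation(
    citing="paperA",
    cited="paperB",
    function=Function.METHODOLOGY_SUPPORT,
    sentiment=Sentiment.POSITIVE,
    position="Introduction",
)

for statement in annotation.triples():
    print(statement)
```

Using enumerations keeps the function and sentiment values to the closed sets listed in the text, while position stays a free string because section names vary between documents.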

The digital library is an important application area of ontology-based knowledge service research. In 2015, Patkar [26] indicated that ontology is one of the latest tools for information retrieval from libraries in the digital age. His paper discusses advances in information management tools and concludes by highlighting the applications of ontology in different fields.

Koutsomitropoulos, Solomo, and Papatheodorou [27] studied the semantic search service of the DSpace digital repository system. They argued that Semantic Search v2 introduces a structured query mechanism that makes querying easier and improves the system's design, performance, and scalability. Queries based on the DSpace ontology were created dynamically, and DSpace was able to obtain structured knowledge from the available metadata. Empirical and quantitative evaluation showed that such a system can conduct semantic searches that serve inexperienced users better, for example by offering new query dimensions.

In 2015, Iorio and Schaerf [28] proposed a semantic model defined by the Sapienza Digital Library to describe resource metadata. The semantic model is derived from the metadata object description model (a digital library descriptive standard). A top-level conceptual reference model supports the implementation of semantic web technologies for digital library metadata.
