**2.4 What is linked data and five-star open data**

The Semantic Web, though simple, is reportedly still not used extensively [5]. Linked Data is a set of guidelines for publishing, sharing, and connecting pieces of information or knowledge on the Semantic Web using URIs and RDF [6]. The Linked Open Data (LOD) project by Chris Bizer and Richard Cyganiak aims to expand the Web with shared data by publishing open datasets in RDF format on the Semantic Web and creating RDF links between these datasets [7]. Levels of open data sharing are rated by a number of stars (⋆) as follows:<sup>2</sup>


<sup>1</sup> https://www.ted.com/talks/tim\_berners\_lee\_the\_next\_web/transcript

<sup>2</sup> https://5stardata.info/en/

⋆⋆⋆ The three-star level requires that the data be in a structured form with an open standard format.
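To make the idea of RDF links between datasets concrete, the following minimal sketch writes two statements in N-Triples syntax: a label for a local resource and an `owl:sameAs` link to a DBpedia resource. The `example.org` URI and the helper names (`uri`, `literal`, `triple`) are assumptions introduced only for illustration, and the literal escaping covers just backslashes and double quotes.

```python
# Minimal sketch: describe a resource with URIs and link it to another dataset.
# The example.org URI below is hypothetical and used for illustration only.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Quote a plain literal; only backslashes and double quotes are escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def triple(subject, predicate, obj):
    """Join three already-formatted terms into one N-Triples statement."""
    return f"{subject} {predicate} {obj} ."


local_city = "http://example.org/city/Mumbai"  # hypothetical local URI

statements = [
    # A fact about the local resource.
    triple(uri(local_city), uri(RDFS_LABEL), literal("Mumbai", "en")),
    # An RDF link that connects the local resource to another dataset.
    triple(uri(local_city), uri(OWL_SAME_AS),
           uri("http://dbpedia.org/resource/Mumbai")),
]

print("\n".join(statements))
```

The second statement is what turns an isolated RDF file into linked data: it points from one dataset's URI to another's, so a client can follow the link to find more facts about the same thing.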

*TULIP: A Five-Star Table and List - From Machine-Readable to Machine-Understandable…*

*DOI: http://dx.doi.org/10.5772/intechopen.91406*

BabelNet [18] and the Multilingual Entity Taxonomy (MENTA) [19] extract facts from Wikipedia and WordNet, like YAGO, but BabelNet and MENTA aim to create a multilingual knowledge base.

*3.1.2 Manually recording facts into the knowledge base*

Freebase of Metaweb Technologies [20] is a Web-based knowledge base where users share structured information directly through a Webpage specifically designed for recording and verifying information [21], unlike DBpedia and YAGO, in which structured data was converted from Wikipedia. Metaweb was acquired by Google in 2010, and Freebase's data was transferred to Wikidata in 2014. Finally, in 2016, Freebase was closed and integrated into the Google Knowledge Graph. It is being developed further into Knowledge Vault, a Google research project that aims to create an automated process to build the knowledge base directly from the Web [22].

In addition to the knowledge that users create in the system via the Web, Freebase also collected much information from Wikipedia [23] as well as from the Notable Names Database (NNDB), the Fashion Model Directory (FMD), and MusicBrainz, in order to create a large amount of seed data. Before closing down, Freebase had accumulated 2.4 billion facts about 44 million topics.

Wikidata is a project of the Wikimedia Foundation [24]. It is an open knowledge base that allows users to record facts manually through a system designed to be easy to use, similar to Wikipedia. One interesting concept of Wikidata is its ability to keep conflicting facts side by side when it is not possible to conclude which one is more accurate [25]. In Wikidata, as in Wikipedia, the credibility of information rests on its provenance rather than on its accuracy. For example, the population of Mumbai is 12.5 million according to the Indian Bureau of Statistics but 20.5 million according to UN estimates. It is not the responsibility of the Wikidata community to find out which is true. Wikidata simply stores every statement along with its source, and the user chooses which one to use. Currently, Wikidata has 30 million facts about 14 million topics. Both DBpedia and Wikidata can be seen as turning Wikipedia data into structured data, though by different methods [26]. Some parts of Wikidata have also been converted and incorporated into DBpedia Wikidata [27] and the ProFusion dataset [28].

Cyc is an extensive knowledge base project by Douglas B. Lenat that started in 1984 [29]. Its goal is to store a large number of facts and organize them automatically. OpenCyc is a smaller, publicly available version of Cyc with a reduced knowledge base [30]. OpenCyc was shut down in 2017, but ResearchCyc is still open for research studies [31].

*3.1.3 Transform data from other formats to RDF*

RDF123 by Han et al. [32] is a tool for converting data in spreadsheet format to RDF format. Its concept can also be used to convert table data into RDF. A survey paper [33] by the W3C RDB2RDF Incubator Group discusses many research projects that involve converting data from relational databases to RDF. Although that survey does not cover conversion from generic tables, its approaches can be applied to table conversion.

The W3C CSV on the Web Working Group<sup>4</sup> has also published several recommendations that discuss converting record sets in CSV format to other formats such as RDF or JSON.

<sup>4</sup> https://www.w3.org/2013/csvw/wiki/Main\_Page CSV on the Web Working Group Wiki
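The conversions surveyed in Section 3.1.3 all map tabular records onto RDF statements. As a minimal sketch of that idea, the code below turns each CSV row into a subject and each remaining column into a predicate with a plain literal value. It is a naive direct mapping written for illustration, not the algorithm used by RDF123 or specified by the CSV on the Web recommendations. The `http://example.org/` namespace, the function name `csv_to_ntriples`, and the sample data are all assumptions.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace for illustration


def literal(value):
    """Quote a plain literal; only backslashes and double quotes are escaped."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def csv_to_ntriples(text, key_column, base=BASE):
    """Naively map each CSV row to N-Triples statements.

    The value in `key_column` identifies the row (the subject); every other
    non-empty cell becomes one statement whose predicate is derived from the
    column name.
    """
    statements = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{base}row/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            # Skip the key itself, surplus cells (column is None), and blanks.
            if column is None or column == key_column or not value:
                continue
            predicate = f"<{base}column/{quote(column, safe='')}>"
            statements.append(f"{subject} {predicate} {literal(value)} .")
    return statements


SAMPLE = "id,name,country\n1,Mumbai,India\n2,Bangkok,Thailand\n"

triples = csv_to_ntriples(SAMPLE, key_column="id")
print("\n".join(triples))
```

Every value here is emitted as an untyped string literal. A fuller conversion would also need datatypes, shared vocabularies for the predicates, and URIs instead of literals where a cell refers to another entity.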

