
*Tagging and Tag Recommendation*

*DOI: http://dx.doi.org/10.5772/intechopen.82242*

*Cyberspace*

to the improvement of the user experience: there is high potential for improving the quality of the generated tags by, for example, reducing the number of misspellings and nondescriptive keywords. Thus, the quality of the IR services that rely on tags as data sources can be indirectly improved by tag recommendation. Other examples of the benefits that tag recommendation can bring to IR services include the direct application of the recommended tags in search [5] and in query expansion [6]. In search, the recommended tags can be exploited to measure the similarity between queries and documents, improving the quality of the retrieved documents. Query expansion, in turn, aims at suggesting more specific and unambiguous queries to the user, which also leads to better search results. Further examples include researcher profile summarization [7] and search result summarization [8].

Tag recommendation brings specific challenges that other kinds of recommendation services do not: in the tag domain, we are interested not only in matching the interests of the target user but also in describing, summarizing, and organizing Web content. Thus, the design of tag recommenders demands specific solutions that differ greatly from methods proposed for item recommendation tasks in general. For instance, text mining, knowledge extraction, and semantics play a substantial role in the tag domain. In sum, the recommendation effectiveness affects not only user satisfaction but also the performance of various IR services that rely on tags as a data source.

The goal of this chapter is to present the concepts of tagging systems and to provide an overview of tag recommendation techniques, explaining the two main steps of these methods: candidate tag generation and candidate tag ranking. The rest of this chapter is organized as follows. In Section 2, we define tags, objects, folksonomies, and other basic concepts related to tagging systems. In Section 3, we state the tag recommendation problem, while we explain the main tag candidate extraction and ranking techniques in Sections 4 and 5, respectively.

**2. Tags and Web 2.0 objects**

A Web 2.0 *object* or *resource* (e.g., a textual document, audio, image, or video) is defined as the main content of a Web 2.0 page. There are various sources of data related to this object, here referred to as its *features*, which we can classify as content features, textual features, user profile features, and social features.

*Content features* are attributes that can be extracted from the main content of the Web 2.0 object, such as the color histogram of an image. *Textual features*, in turn, comprise the self-contained textual blocks that are associated with an object, usually with a well-defined functionality, such as title, description, categories, tags, and user comments [3]. Note that these two sets of features may not be disjoint (e.g., when the main object is a textual document).

In particular, *tags* are keywords freely created by users and associated with objects. Tags are not necessarily unigrams (unless the application automatically splits them on whitespace). Thus, tags may be composed of two or more words, sometimes separated by spaces, hyphenated, or joined.

**Figure 1** illustrates a MovieLens page containing textual features assigned to an object (a movie, in this case).

*User profile features* include characteristics of the users who created or interacted with the content, while *social features* refer to interactions among users. Social connections may be explicitly represented by friendship links or implicitly indicated by subscriptions (connections established among users who show interest in one another's content) and endorsements (e.g., "likes"). **Figure 2** illustrates examples of these features.

#### **Figure 1.**

*A Web 2.0 page and some of its textual features.*

#### **Figure 2.**

*Features commonly found in Web 2.0 pages. Friendship and subscription links are representative examples of social features. The set of tags a user assigned to objects in the application is taken as one of the user profile features. Features extracted from the content of the main object (e.g., color histogram) are examples of content features [9].*

Web 2.0 tags, objects, and users form the basic structure of *folksonomies*, which are defined as the categorization of objects by users through freely chosen keywords. Unlike a taxonomy, which provides a hierarchical categorization with well-defined classes, a folksonomy establishes categories (as tags) without imposing a hierarchical structure [10].

More formally, a folksonomy is defined as a tuple *F* = (*U*, *T*, *O*, *P*), where *U*, *T*, and *O* are finite sets of users, tags, and objects, respectively, and *P*, the set of postings, is a ternary relation among these elements, that is, *P* ⊆ *U* × *T* × *O* [11].

Thus, each element (*u*, *t*, *o*) ∈ *P* indicates that a user *u* associated a tag *t* with an object *o* (illustrated by the edges connecting users, tags, and objects in **Figure 2**). In [12], folksonomies are classified into *broad* and *narrow* folksonomies. A broad folksonomy occurs when multiple users can apply the same tag to an object, while a narrow folksonomy occurs when only one user (typically the target object's creator) can tag a given object.
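The definitions above can be made concrete with a minimal Python sketch that stores the posting relation *P* as a set of (user, tag, object) triples and derives *U*, *T*, and *O* from it. All names here (`Posting`, `build_folksonomy`, `looks_broad`) and the sample data are hypothetical and introduced only for illustration. Note also that whether a folksonomy is broad or narrow is a property of the application's tagging policy; `looks_broad` merely checks whether the observed postings contain an object tagged by more than one user, which is evidence of a broad folksonomy rather than a definition of it.

```python
from collections import defaultdict
from typing import NamedTuple


class Posting(NamedTuple):
    """One element (u, t, o) of the posting relation P."""
    user: str
    tag: str
    obj: str


def build_folksonomy(postings):
    """Return (U, T, O, P) derived from an iterable of Posting triples."""
    P = set(postings)
    U = {p.user for p in P}
    T = {p.tag for p in P}
    O = {p.obj for p in P}
    return U, T, O, P


def taggers_per_object(P):
    """Map each object to the set of distinct users who tagged it."""
    taggers = defaultdict(set)
    for p in P:
        taggers[p.obj].add(p.user)
    return dict(taggers)


def looks_broad(P):
    """True if at least one object was tagged by more than one user.

    Assumption: this is a data-driven heuristic, not the policy-level
    definition of a broad folksonomy.
    """
    return any(len(users) > 1 for users in taggers_per_object(P).values())


# Made-up postings for illustration only.
postings = [
    Posting("alice", "sci-fi", "movie:1"),
    Posting("bob", "sci-fi", "movie:1"),
    Posting("bob", "time travel", "movie:1"),
    Posting("carol", "drama", "movie:2"),
]
U, T, O, P = build_folksonomy(postings)
```

Storing *P* as a set mirrors its definition as a relation: a repeated (user, tag, object) triple is counted only once. Multi-word tags such as "time travel" are kept as single elements of *T*.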

Examples of broad folksonomies include the online radio station LastFM (http://www.last.fm/) and the publication sharing application Bibsonomy (http://www.bibsonomy.org). The photo sharing site Flickr (http://www.flickr.com/) is an example of a narrow folksonomy. While both broad and narrow folksonomies have common goals, a broad folksonomy can be further exploited to rank tags by their popularity and to visualize the most important tags by means of tag clouds, which also provide an easy way to navigate the tags, objects, and users of a folksonomy.
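As a rough illustration of ranking tags by popularity and sizing them for a tag cloud, the following Python sketch counts how many postings use each tag and maps the counts linearly to font sizes. The function names, the font-size range, and the sample postings are assumptions made for this example; real applications may measure popularity differently (e.g., by distinct users) or use logarithmic scaling.

```python
from collections import Counter


def tag_popularity(postings):
    """Count how many (user, tag, object) postings use each tag."""
    return Counter(tag for _user, tag, _obj in postings)


def tag_cloud_sizes(counts, min_size=10, max_size=32):
    """Linearly scale tag counts to font sizes in [min_size, max_size]."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # When all tags are equally popular, give them all the minimum size.
        frac = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_size + frac * (max_size - min_size))
    return sizes


# Made-up postings for illustration only.
postings = [
    ("alice", "rock", "track:1"),
    ("bob", "rock", "track:1"),
    ("carol", "rock", "track:2"),
    ("bob", "indie", "track:1"),
    ("carol", "live", "track:2"),
    ("alice", "indie", "track:3"),
]
counts = tag_popularity(postings)
ranked = [tag for tag, _n in counts.most_common()]
sizes = tag_cloud_sizes(counts)

print(ranked)  # ['rock', 'indie', 'live']
print(sizes)   # {'rock': 32, 'indie': 21, 'live': 10}
```

In a broad folksonomy, several users can contribute postings for the same tag and object, so these counts reflect collective agreement; in a narrow folksonomy each object has a single tagger, which makes per-object counts far less informative.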

Examples of tagging datasets available online for experimentation include MovieLens and Bibsonomy snapshots (https://grouplens.org/datasets/movielens and http://www.kde.cs.uni-kassel.de/bibsonomy/dumps, respectively) and our LastFM, YouTube, and YahooVideo crawled data (https://figshare.com/articles/data\_tar\_gz/2067183).
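To experiment with such data, a tag file can be read into (user, tag, object) postings. The sketch below assumes a MovieLens-style CSV with the columns `userId`, `movieId`, `tag`, and `timestamp`; this layout is an assumption that should be checked against the README of the snapshot actually downloaded, and the other datasets use different formats. The function name `read_postings` and the sample rows are made up for illustration.

```python
import csv
import io


def read_postings(csv_file):
    """Yield (user, tag, object) triples from a MovieLens-style tags file.

    Assumes a header row with at least userId, movieId, and tag columns.
    """
    for row in csv.DictReader(csv_file):
        tag = row["tag"].strip()
        if tag:  # skip rows whose tag is empty
            yield row["userId"], tag, row["movieId"]


# Tiny in-memory stand-in for a real tags file (made-up rows).
sample = io.StringIO(
    "userId,movieId,tag,timestamp\n"
    "2,60756,funny,1445714994\n"
    '2,60756,"dark, witty",1445714996\n'
    "7,89774,Boxing story,1445715207\n"
)
postings = list(read_postings(sample))
```

For a file on disk, the same function can be called on a handle opened with `open(path, newline="", encoding="utf-8")`. Using the `csv` module rather than splitting on commas keeps multi-word or comma-containing tags intact.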
