
*Tagging and Tag Recommendation*

*DOI: http://dx.doi.org/10.5772/intechopen.82242*



*Cyberspace*

Thus, each element *(u, t, o)* ∈ *P* indicates that a user *u* assigned a tag *t* to an object *o* (this is illustrated by the edges connecting users, tags, and objects in **Figure 2**). In [12], folksonomies are classified into *broad* and *narrow* folksonomies. A broad folksonomy occurs when multiple users can apply the same tag to an object, while a narrow folksonomy occurs when only one user (typically the target object's creator) can tag a given object.

Examples of broad folksonomies include the online radio station LastFM (http://www.last.fm/) and the publication sharing application Bibsonomy (http://www.bibsonomy.org). The photo sharing site Flickr (http://www.flickr.com/) is an example of a narrow folksonomy. While both broad and narrow folksonomies have common goals, a broad folksonomy can be further exploited to rank tags by their popularity and visualize the most important tags by means of tag clouds, which also provide an easy way to navigate the tags, objects, and users of a folksonomy.

Examples of tagging datasets available online for experimentation include MovieLens and Bibsonomy snapshots (https://grouplens.org/datasets/movielens and http://www.kde.cs.uni-kassel.de/bibsonomy/dumps, respectively) and our LastFM, YouTube, and YahooVideo crawled data (https://figshare.com/articles/data\_tar\_gz/2067183).

**3. The tag recommendation problem**

As in [13], we define two tag recommendation tasks: the *object-centered* problem and the *personalized* problem. In the former, the goal is to generate and rank candidate tags according to their relevance to the target object, that is, the extent to which the tag is related to or describes the target object. *Object-centered tag recommendations,* which do not vary according to the target user, aim at improving tag quality and indirectly improving the effectiveness of information retrieval services, such as searching, classification, and item recommendation, which exploit tags as data sources.

On the other hand, *personalized* tag recommendation takes not only the target object but also the target user into account, aiming at suggesting tags that are relevant to both the target object and the user. Thus, personalized tag recommenders might provide different results for different users, which may better capture the user's interests, profile, and background. According to [13], "in applications where multiple users can assign tags to the same object, such as Last.FM, a personalized tag recommender is not only useful for the individual user (e.g., for content organization) but also in a collective sense. This is because, jointly, the tags recommended to different users may provide a more complete description of the object, benefiting search and recommendation services."

In more formal terms, the tag recommendation tasks are defined in [13] as:

*"***Object-Centered Tag Recommendation***. Given a set of input tags Io associated with the target object o, generate a list of candidate tags Co, sorted according to their relevance to object o, and recommend the k candidates in the top positions of Co.*

**Personalized Tag Recommendation***. Given a set of input tags Io, associated with the target object o, generate a list of candidate tags Co,u, sorted according to their relevance to both user u and object o, and recommend the k candidates in the top positions of Co,u."*

Note that the target object may have no tags available, that is, *Io* = ∅. This is a variation of the "cold start" problem, a well-known problem in recommender systems, generally defined as a scenario in which there is insufficient information about the target user or object, making it difficult to provide effective recommendations.
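The two task definitions can be illustrated with a minimal sketch. The scoring used here is an assumption made only for illustration: candidates are ranked by how often they co-occur with the input tags across objects, and the personalized variant adds a bonus for tags the target user has applied before. The function names, the `alpha` weight, and the toy triples are hypothetical and are not taken from [13].

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(assignments):
    """Count, for each ordered tag pair, the objects in which both tags appear."""
    tags_by_object = defaultdict(set)
    for _user, tag, obj in assignments:
        tags_by_object[obj].add(tag)
    cooc = defaultdict(Counter)
    for tags in tags_by_object.values():
        for a, b in permutations(tags, 2):
            cooc[a][b] += 1
    return cooc


def build_user_profiles(assignments):
    """Count how often each user has applied each tag."""
    profiles = defaultdict(Counter)
    for user, tag, _obj in assignments:
        profiles[user][tag] += 1
    return profiles


def _top_k(scores, k):
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:k]


def _object_scores(input_tags, cooc):
    # Assumed relevance estimate: summed co-occurrence with the input tags I_o.
    scores = Counter()
    for tag in input_tags:
        for candidate, count in cooc.get(tag, {}).items():
            if candidate not in input_tags:
                scores[candidate] += count
    return scores


def recommend_object_centered(input_tags, cooc, k):
    """Return the top-k candidates of C_o; the result is the same for every user."""
    return _top_k(_object_scores(input_tags, cooc), k)


def recommend_personalized(input_tags, user, cooc, profiles, k, alpha=1.0):
    """Return the top-k candidates of C_{o,u}: object evidence plus user history."""
    scores = _object_scores(input_tags, cooc)
    for candidate, count in profiles.get(user, {}).items():
        if candidate not in input_tags:
            scores[candidate] += alpha * count  # alpha is a hypothetical weight
    return _top_k(scores, k)


# Toy folksonomy of (user, tag, object) triples, invented for this example.
assignments = [
    ("u1", "rock", "o1"), ("u1", "guitar", "o1"),
    ("u2", "rock", "o2"), ("u2", "guitar", "o2"), ("u2", "live", "o2"),
    ("u3", "rock", "o3"), ("u3", "favorites", "o3"), ("u3", "favorites", "o4"),
]
cooc = build_cooccurrence(assignments)
profiles = build_user_profiles(assignments)

object_centered = recommend_object_centered({"rock"}, cooc, k=2)
personalized_u3 = recommend_personalized({"rock"}, "u3", cooc, profiles, k=2)
cold_start = recommend_object_centered(set(), cooc, k=2)
```

In this sketch, an empty input set (*Io* = ∅) leaves the object-centered recommender with no candidates at all, which is the cold start situation described above, whereas the personalized variant can still fall back on the user's tagging history.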

These definitions focus on relevance as the only objective to be maximized. However, other aspects of the problem, such as novelty and diversity, have also been considered important, both in recommender systems in general and in the specific domain of tag recommendation [14].

According to the traditional definition of relevance or accuracy, the relevance of each tag in a recommendation list is independent of the relevance of the other tags in the list. However, in the general recommendation context, once a recommendation has satisfied the user's need, the usefulness of further similar recommendations is questionable. This occurs in the tagging context when, for example, only synonyms or strongly similar words are provided as recommendations. To deal with these issues, the concepts of novelty and diversity have been introduced.

In tag recommendation, the novelty of a tag has been defined from the perspective of its popularity in the application. In [14], tag novelty is calculated as the inverse of the frequency at which the tag is used in the collection. The rationale is that frequently used tags tend to be more "obvious" recommendations (if relevant), and are thus of little use for improving the description of the target object. We note that, according to this definition, noisy terms such as typos may be considered highly novel. Novelty and diversity must therefore be considered jointly with relevance in order to provide effective tag recommendations. It is worth mentioning that this definition of novelty is closely related to tag *specificity* [15], since rare words tend to be more specific (less general). For example, the word "feline" is less specific than "cat" or "tiger," and thus "feline" is expected to describe a larger number of objects than these more specific terms. Therefore, specificity can be interpreted as a statistical property of term use, estimated as an inverse function of the frequency of the tag in the collection [14].
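The inverse-frequency notion of novelty can be sketched as follows. The plain reciprocal `1 / frequency` is one possible reading of "inverse of the frequency"; the exact function used in [14] may differ. The linear combination with relevance, its `weight` parameter, the relevance values, and the toy data are all assumptions introduced only to show why novelty should not be used on its own.

```python
from collections import Counter


def tag_frequencies(assignments):
    """Count how many times each tag is used in the collection."""
    return Counter(tag for _user, tag, _obj in assignments)


def novelty(tag, freq):
    """Inverse collection frequency; rarer tags receive higher novelty."""
    count = freq.get(tag, 0)
    # Tags never seen in the collection get 0.0 here, a choice made for this sketch.
    return 1.0 / count if count else 0.0


def joint_score(tag, relevance, freq, weight=0.3):
    """Hypothetical linear mix of relevance and novelty (not a method from [14])."""
    return (1.0 - weight) * relevance.get(tag, 0.0) + weight * novelty(tag, freq)


# Toy collection in which "gutiar" is a typo that was used only once.
tagging_log = [
    ("u1", "rock", "o1"), ("u2", "rock", "o2"), ("u3", "rock", "o3"),
    ("u1", "guitar", "o1"), ("u2", "guitar", "o2"),
    ("u2", "gutiar", "o2"),
]
freq = tag_frequencies(tagging_log)

# Assumed relevance estimates for one target object; the values are invented.
relevance = {"rock": 0.9, "guitar": 0.8, "gutiar": 0.05}

by_novelty = sorted(relevance, key=lambda t: -novelty(t, freq))
by_joint = sorted(relevance, key=lambda t: -joint_score(t, relevance, freq))
```

Ranking by novelty alone places the typo first, since it is the rarest tag, while the joint score moves it to the end of the list because its relevance is low.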

The *diversity* of a list of recommended tags, in turn, can be interpreted as the *exhaustivity* of these tags, which is defined in [15] as the coverage they provide for the topics of the associated object. Two approaches to estimating diversity in tag recommendation have been proposed. The implicit approach exploits properties of the recommended items (tags in our case), estimating diversity as the average pairwise semantic dissimilarity between the top recommended tags. In this context, a list of synonyms or semantically related words presents low diversity [14]. The explicit diversification approach, on the other hand, exploits properties of the target of recommendations, such as a set of explicit topics (e.g., categories) related to the target object. The goal of explicit diversifiers is to cover as many topics related to the target object as possible, as early in the ranking as possible, minimizing redundancy, that is, an excessive focus on a single topic.
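Both views of diversity can be made concrete with a short sketch. For the implicit view, the semantic dissimilarity between two tags is approximated here by the Jaccard distance between the sets of objects they were assigned to; this proxy is an assumption, and other dissimilarity measures could be substituted. For the explicit view, the sketch measures the fraction of the target object's topics covered by the recommended tags. The mappings `objects_by_tag` and `topics_by_tag` and all data values are hypothetical.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def implicit_diversity(tags, objects_by_tag):
    """Average pairwise dissimilarity between the recommended tags."""
    pairs = list(combinations(tags, 2))
    if not pairs:
        return 0.0
    total = sum(
        jaccard_distance(objects_by_tag.get(a, set()), objects_by_tag.get(b, set()))
        for a, b in pairs
    )
    return total / len(pairs)


def topic_coverage(tags, topics_by_tag, object_topics):
    """Fraction of the target object's topics covered by the recommended tags."""
    if not object_topics:
        return 0.0
    covered = set()
    for tag in tags:
        covered |= topics_by_tag.get(tag, set()) & object_topics
    return len(covered) / len(object_topics)


# Invented data: which objects each tag was assigned to, and each tag's topics.
objects_by_tag = {
    "cat": {"o1", "o2", "o3"},
    "feline": {"o1", "o2", "o3"},
    "kitten": {"o1", "o2"},
    "funny": {"o3", "o4"},
    "video": {"o4", "o5"},
}
topics_by_tag = {
    "cat": {"animals"}, "feline": {"animals"}, "kitten": {"animals"},
    "funny": {"humor"}, "video": set(),
}
object_topics = {"animals", "humor"}

near_synonyms = ["cat", "feline", "kitten"]
mixed_list = ["cat", "funny", "video"]

synonym_diversity = implicit_diversity(near_synonyms, objects_by_tag)
mixed_diversity = implicit_diversity(mixed_list, objects_by_tag)
synonym_coverage = topic_coverage(near_synonyms, topics_by_tag, object_topics)
mixed_coverage = topic_coverage(mixed_list, topics_by_tag, object_topics)
```

Under these assumptions the list of near synonyms scores lower on both measures than the mixed list. The coverage function ignores rank positions, so it does not capture the requirement that topics be covered as early in the ranking as possible.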


**Table 1** summarizes the tag recommendation problem and its aspects.

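| | **Object-centered** | **Personalized** |
|---|---|---|
| **Input** | *Io*: set of input tags associated with the target object *o* | *Io*: set of input tags associated with the target object *o* |
| **Target** | Object *o* | Pair object-user <*o,u*> |
| **Output: ranked list of candidate tags** | *Co*: sorted according to relevance and other aspects related to *o* | *Co,u*: sorted according to relevance and other aspects related to the pair <*o,u*> |
| **Other aspects** | **Novelty/specificity**: capacity of recommending more rare tags. **Diversity/exhaustivity**: capacity of recommending tags related to the different topics of the target object or user | **Novelty/specificity**: capacity of recommending more rare tags. **Diversity/exhaustivity**: capacity of recommending tags related to the different topics of the target object or user |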

#### **Table 1.**

*Tag recommendation: problem statement.*
