and the value of information that can be retrieved from them have led social network researchers to take a closer look at methods to combat such bad actors, as well as to formulate network measures that can provide insight into the privacy of these networks. In this survey, we will look at one such measure known as $(k,\ell)$-anonymity [1] and will discuss some theoretical and empirical results regarding this measure.

*Security and Privacy From a Legal, Ethical, and Technical Perspective*

**1.1 Overview of the paper**

Given the irrefutable importance of social networks in our daily lives and the ever-increasing risk that privacy attacks against these networks will compromise valuable personal data, it is desirable to know how secure a given social network is against such attacks. This necessitates a deeper look into the types of privacy attacks and how to cope with them. There is an extensive literature on privacy-preserving computational models in a variety of application areas, such as multi-party communications and distributed computing settings [2–6]. In this chapter, we focus on a specific type of attack known as the *background-based active attack* and on one measure that reflects the resistance of any given network against such attacks. The rest of the chapter is organized as follows:

• In Section 2 we briefly discuss the notion of privacy in social networks and review some literature on privacy-violating attacks on social networks. We also introduce the $(k,\ell)$-anonymity privacy measure and some corresponding network measurements that form the basis for this measure.

• In Section 3 we review some basic terminologies and notations that will be used in the formulation of the three problems introduced in Section 4.

• Section 4 contains three problems that arise from the theoretical investigation of $(k,\ell)$-anonymity.

• Section 5 contains the results of an empirical study on the resistance of real-world social networks.

• Finally, we end this chapter with some concluding remarks in Section 6.

## **2. Privacy measures in social networks**

We begin by discussing the mathematical structure best suited to representing social networks. A social network is often portrayed as a graph [7, 8] $G = (V, E)$, where $V$ is a set of nodes representing the social members and $E$ is the set of edges portraying the relationships among these members. Both nodes and edges may have extra attributes, such as weights, that provide extra information about the nature of these social bonds (*e.g.*, trust or popularity); however, throughout this survey we will consider the simplest form of graphs, namely undirected and unweighted graphs, to model our social networks.

As we discussed in the previous section, the information that social networks provide is invaluable. Due to the very nature of many social network applications, the identity of the members or the nature of the relationships between members is quite sensitive and valuable. Thus, when releasing a social network we want to remove any attributes that may help identify these kinds of sensitive data. Assuming all members and their relationships are of high sensitivity, preventing *identity disclosure* or *link disclosure* becomes an important task. One popular method to prevent such disclosures is *anonymization*. In an anonymization process, we publish the network without identifying the corresponding nodes or potentially identifiable attributes. Even after anonymizing the network, we still release many informative attributes encoded by the network structure; for example, attributes such as node degree, connectivity, or other similar graph properties can still help adversaries compromise the privacy of the users of a published network.

### *A Review of Several Privacy Violation Measures for Large Networks under Active Attacks DOI: http://dx.doi.org/10.5772/intechopen.90909*

Adversaries usually rely on background knowledge to compromise the privacy of published anonymized social networks. To understand why current privacy preservation methods such as anonymization fail, we need a proper model of the adversary's background knowledge. Although it is challenging to build a comprehensive model of all possible types of adversary background knowledge, it is very useful to model this knowledge via structural properties of networks such as node degrees, embedded subgraphs, node neighbors, etc. [9]. Backstrom et al. [10] were the first to introduce a category of attacks on anonymized social graphs. The models introduced in [10] are background-based attacks and are widely used in privacy analysis of social networks. The two main types of attacks are *active attacks* and *passive attacks*.

The authors in [10] also showed that it is possible to compromise the privacy of any social network of $n$ nodes with high probability using only $O(\sqrt{\log n})$ attacker nodes. In a *passive attack*, the adversary's structural knowledge gives her/him a global view of the network, depending on the global structure of the network. It could pose a high privacy risk if an adversary were to combine this global view with the local structural knowledge obtained using an active attack. As an example, consider the network in **Figure 1**. If we only have global structural knowledge, it is not possible to differentiate the nodes $v_3$ and $v_4$ (*e.g.*, same node degrees, *etc.*). However, controlling just one extra node in the graph, such as the node $v_1$, provides local structural knowledge such as distances between nodes, and using the knowledge of the distance of $v_1$ from $v_3$ and $v_4$ ($d_{v_1,v_3} = 1$ and $d_{v_1,v_4} = 2$) one can easily differentiate node $v_3$ from node $v_4$.

There are several well-studied strategies for coping with active attacks on a social network [9, 11, 12] that address the anonymization process of the social network. However, in this chapter we focus on a measure that evaluates how resistant a social network is against this type of privacy attack. Introduced by Trujillo-Rasua et al. [1], $(k,\ell)$-anonymity is a novel and, to the best of our knowledge, the only privacy measure examining the structural resistance of a given graph against active attacks. It is based on the metric representation of nodes, where $k$ is a privacy threshold and $\ell$ is the maximum number of attacker nodes that may be inserted in the network.

**Definition 3** ($(k,\ell)$-anonymity). A graph $G$ under active attack satisfies $(k,\ell)$-anonymity if $k$ is the smallest positive integer so that the $k$-metric antidimension of $G$ is less than or equal to $\ell$.

In the above definition, $k$ is a parameter depicting the privacy threshold and $\ell$ represents the maximum number of attacker nodes. It is safe to assume that the number of attacker nodes $\ell$ is significantly smaller than the number of nodes present in the network, as injecting attacker nodes or gaining control of existing nodes is difficult without being detected [15].

## **3. Basic terminologies and notations**

For the exposition in the remainder of this chapter, we will need some notations and terminologies, which we introduce here. Consider the (undirected unweighted) graph $G$ in **Figure 2**. We will use this graph to illustrate the terminologies and notations that are introduced.

• The metric representation of node $v_i$ is denoted by $\mathbf{d}_{v_i} = \left(\mathrm{dist}_{v_1,v_i}, \mathrm{dist}_{v_2,v_i}, \dots, \mathrm{dist}_{v_n,v_i}\right)$.

◦ For example, in **Figure 2**, $\mathbf{d}_{v_1} = (0, 1, 2, 3, 3, 2)$.

• The diameter of $G$ is the length of the longest shortest path and is denoted by $\mathrm{diam}(G) = \max_{v_i, v_j \in V} \mathrm{dist}_{v_i,v_j}$.

◦ For example, in **Figure 2**, $\mathrm{diam}(G) = 3$.

• The open neighborhood of node $v_i$ is the set of all nodes directly connected to $v_i$ and is denoted by $\mathrm{Nbr}(v_i) = \left\{ v_j \mid \{v_i, v_j\} \in E \right\}$.

◦ For example, in **Figure 2**, $\mathrm{Nbr}(v_2) = \{v_1, v_3, v_6\}$.

• The metric representation of a node $v_i$ with respect to a subset $S \subset V$ is denoted by $\mathbf{d}_{v_i,-S}$.

◦ For example, in **Figure 2**, $\mathbf{d}_{v_1,-\{v_3,v_4\}} = (2, 3)$.

**Figure 2.**

*An example used in Section 3 for illustrating various notations.*
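The notations of Section 3 can be checked mechanically with breadth-first search. The sketch below is illustrative only: the image of Figure 2 is not reproduced in this text, so the edge list `EDGES` is an assumption, chosen because it reproduces the four example values quoted for Figure 2. The helper names (`build_adjacency`, `bfs_distances`, `metric_representation`, `diameter`) are hypothetical and not taken from the chapter.

```python
from collections import deque

# ASSUMPTION: hypothetical edge list, chosen only so that the computed values
# agree with the examples quoted for Figure 2. The real figure may differ.
NODES = ["v1", "v2", "v3", "v4", "v5", "v6"]
EDGES = [("v1", "v2"), ("v2", "v3"), ("v2", "v6"),
         ("v3", "v4"), ("v4", "v5"), ("v5", "v6")]


def build_adjacency(nodes, edges):
    """Adjacency sets of an undirected, unweighted graph (open neighborhoods)."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def metric_representation(adj, nodes, v, subset=None):
    """Distances between v and all nodes, or only the nodes in `subset`.

    Assumes a connected graph, so every distance is defined.
    """
    targets = nodes if subset is None else subset
    dist = bfs_distances(adj, v)
    return tuple(dist[t] for t in targets)


def diameter(adj, nodes):
    """Length of the longest shortest path (connected graph assumed)."""
    return max(max(bfs_distances(adj, v).values()) for v in nodes)


ADJ = build_adjacency(NODES, EDGES)
print(metric_representation(ADJ, NODES, "v1"))                # (0, 1, 2, 3, 3, 2)
print(diameter(ADJ, NODES))                                   # 3
print(sorted(ADJ["v2"]))                                      # ['v1', 'v3', 'v6']
print(metric_representation(ADJ, NODES, "v1", ["v3", "v4"]))  # (2, 3)
```

Because the graph is undirected, the distance from $v_j$ to $v_i$ equals the distance from $v_i$ to $v_j$, so a single search from $v_i$ yields its whole metric representation.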

**Figure 1.**

*A simple graph* G *used in Section 2 to illustrate the high risk posed by combining knowledge gained by active and passive attacks.*
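The argument illustrated by Figure 1 can be sketched in a few lines. Since the figure itself is not reproduced in this text, the graph below is a hypothetical 4-cycle and is not the graph of Figure 1; it is chosen only so that $v_3$ and $v_4$ have equal degree while their distances from an attacker-controlled node $v_1$ are 1 and 2, as in the example from Section 2.

```python
from collections import deque

# ASSUMPTION: hypothetical 4-cycle v1 - v3 - v4 - v2 - v1, NOT the graph of Figure 1.
ADJ = {
    "v1": {"v2", "v3"},
    "v2": {"v1", "v4"},
    "v3": {"v1", "v4"},
    "v4": {"v2", "v3"},
}


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` in an undirected, unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


# Global structural knowledge only: node degrees do not separate v3 from v4.
degree = {v: len(nbrs) for v, nbrs in ADJ.items()}
same_degree = degree["v3"] == degree["v4"]

# Local knowledge from controlling v1: distances from v1 do separate them.
dist_from_attacker = bfs_distances(ADJ, "v1")
told_apart = dist_from_attacker["v3"] != dist_from_attacker["v4"]

print(same_degree, dist_from_attacker["v3"], dist_from_attacker["v4"], told_apart)
# True 1 2 True
```

In this toy graph every node has degree 2, so degree alone reveals nothing, whereas the single distance vector from $v_1$ already separates $v_3$ from $v_4$.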

It was shown in [1] that graphs satisfying $(k,\ell)$-anonymity can successfully deter adversaries who control at most $\ell$ attacker nodes inserted in the network from re-identifying nodes with probability higher than $\frac{1}{k}$.
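For small graphs, the value of $k$ in $(k,\ell)$-anonymity can be found by exhaustive search. The sketch below rests on an assumption that is not restated in this part of the chapter: following [1], a set $S$ of attacker nodes is taken to be $k$-antiresolving when $k$ is the size of the smallest group of nodes outside $S$ that share the same vector of distances to $S$, and the $k$-metric antidimension is the smallest size of such a set. Under that reading, $k$ is the minimum of this group size over all attacker sets with at most $\ell$ nodes. The function name `anonymity_level` is hypothetical, and the search is exponential in $\ell$, so it is meant as an illustration rather than a practical algorithm.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path lengths from `source` in an undirected, unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def anonymity_level(adj, max_attackers):
    """Brute-force k such that the connected graph satisfies (k, max_attackers)-anonymity.

    ASSUMPTION: uses the k-antiresolving-set definition recalled in the lead-in.
    """
    nodes = sorted(adj)
    dist = {v: bfs_distances(adj, v) for v in nodes}
    best = None
    for size in range(1, max_attackers + 1):
        for attackers in combinations(nodes, size):
            # Group the remaining nodes by their distance vector to the attackers.
            class_sizes = {}
            for v in nodes:
                if v in attackers:
                    continue
                key = tuple(dist[a][v] for a in attackers)
                class_sizes[key] = class_sizes.get(key, 0) + 1
            if not class_sizes:  # attackers cover every node; nothing left to hide
                continue
            k_s = min(class_sizes.values())
            best = k_s if best is None else min(best, k_s)
    return best


# Complete graph on four nodes: every non-attacker is at distance 1 from every attacker.
K4 = {v: {u for u in "abcd" if u != v} for v in "abcd"}
print(anonymity_level(K4, 1))  # 3
print(anonymity_level(K4, 2))  # 2
```

With one attacker in this complete graph, the three remaining nodes are indistinguishable, so a node is re-identified with probability at most $\frac{1}{3}$; a second attacker lowers $k$ to 2.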
