**2.1 $(k, \ell)$-anonymity**

The $(k, \ell)$-anonymity measure is based on a concept known as the *k*-metric antidimension of graphs. To facilitate further discussion of the measure, we first introduce some notation and terminology. For a simple connected graph $G = (V, E)$, where $V$ is the set of nodes and $E$ is the set of edges, let $\mathrm{dist}_{v_i,v_j}$ denote the distance (*i.e.*, the number of edges in a shortest path) between the nodes $v_i$ and $v_j$. Given an ordered set of nodes $S = \{v_1, \dots, v_t\}$ and a node $u$, we define the metric representation of $u$ with respect to $S$ as the vector $\mathbf{d}_{u,-S} = (\mathrm{dist}_{u,v_1}, \dots, \mathrm{dist}_{u,v_t})$. Metric representations of nodes are closely related to the concept of a *resolving set* of a graph. Inspired by the problem of identifying an intruder in a network and introduced separately by Slater [13] and by Harary and Melter [14], a resolving set of a graph makes it possible to distinguish every pair of nodes in the graph.

**Definition 1** (resolving set). Given a graph $G = (V, E)$, a subset $S \subseteq V$ is called a resolving set for $G$ if, for each pair of distinct nodes $u, v \in V$, there exists a node $x \in S$ such that $\mathrm{dist}_{x,u} \neq \mathrm{dist}_{x,v}$. A smallest-cardinality resolving set is called a metric basis, and its cardinality is referred to as the metric dimension of $G$.

The concepts of metric representation and resolving set inspired the introduction of another network measure, the *k-antiresolving set*, which serves as the foundation for $(k, \ell)$-anonymity.

**Definition 2** (*k*-antiresolving set). Given a graph $G = (V, E)$, $S \subset V$ is called a *k*-antiresolving set of $G$ if $k$ is the largest integer such that, for every node $v \in V \setminus S$, there exist at least $k - 1$ nodes $u_1, u_2, \dots, u_{k-1} \in V \setminus S$ with the same metric representation with respect to $S$ as $v$.

A *k*-antiresolving set of *minimum* cardinality is called a *k-antiresolving basis*, and its cardinality is the *k-metric antidimension* $\mathrm{adim}_k(G)$ of $G$. Note that a *k*-antiresolving set may not exist for every $k$ in a given graph.

The $(k, \ell)$-anonymity measure is built upon the *k*-antiresolving set concept. Assume the adversary has gained control of a subset $S$ of nodes in the graph $G$, where $S$ is a *k*-antiresolving set for $G$. Then the adversary *cannot* uniquely re-identify any node based on the background knowledge (namely, the metric representation of a node $v$ with respect to $S$) with probability higher than $1/k$. $(k, \ell)$-anonymity is formally defined as follows [1].

*A Review of Several Privacy Violation Measures for Large Networks under Active Attacks DOI: http://dx.doi.org/10.5772/intechopen.90909*

**Definition 3** ($(k, \ell)$-anonymity). A graph $G$ under active attack satisfies $(k, \ell)$-anonymity if $k$ is the smallest positive integer such that the *k*-metric antidimension of $G$ is less than or equal to $\ell$.

In the above definition, $k$ is a parameter representing the privacy threshold and $\ell$ is the maximum number of attacker nodes. It is safe to assume that the number of attacker nodes $\ell$ is significantly smaller than the number of nodes in the network, since injecting attacker nodes or gaining control of existing nodes is difficult to do without being detected [15].
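Definitions 1 and 2 can be made concrete with a short computation. The following is a minimal sketch, assuming the graph is given as an adjacency dictionary; the function names and the star graph used as input are illustrative choices, not taken from [1] or [16].

```python
from collections import Counter, deque


def bfs_distances(adj, source):
    """Shortest-path lengths (number of edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def metric_representation(adj, u, S):
    """Distances from u to each node of the ordered set S, i.e. d_{u,-S}."""
    return tuple(bfs_distances(adj, s)[u] for s in S)


def is_resolving_set(adj, S):
    """Definition 1: every node must receive a distinct metric representation."""
    reps = [metric_representation(adj, u, S) for u in adj]
    return len(set(reps)) == len(reps)


def antiresolving_k(adj, S):
    """Definition 2: the k for which S is a k-antiresolving set."""
    outside = [v for v in adj if v not in S]
    if not outside:
        raise ValueError("S must be a proper subset of V")
    # Nodes outside S are grouped by metric representation; k is the
    # size of the smallest group.
    class_sizes = Counter(metric_representation(adj, v, S) for v in outside)
    return min(class_sizes.values())


# Hypothetical example: a star with centre "c" and leaves "x", "y", "z".
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}

print(metric_representation(star, "z", ["x", "y"]))  # (2, 2)
print(is_resolving_set(star, ["x", "y"]))            # True
print(antiresolving_k(star, ["c"]))                  # 3
```

In this example an adversary controlling only the centre sees the same metric representation for all three leaves, so no leaf can be re-identified with probability higher than $1/3$.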

### **3. Basic terminologies and notations**

For the exposition in the remainder of this chapter, we will need some notations and terminologies which we introduce here. Consider the (undirected unweighted) graph *G* in **Figure 2**. We will use this graph to illustrate the terminologies and notations that are introduced.

• The metric representation of node $v_i$ is denoted by $\mathbf{d}_{v_i} = (\mathrm{dist}_{v_1,v_i}, \mathrm{dist}_{v_2,v_i}, \dots, \mathrm{dist}_{v_n,v_i})$.

◦ For example, in **Figure 2**, $\mathbf{d}_{v_1} = (0, 1, 2, 3, 3, 2)$.

• The diameter of $G$ is the length of the longest shortest path and is denoted by $\mathrm{diam}(G) = \max_{v_i, v_j \in V} \{\mathrm{dist}_{v_i,v_j}\}$.

◦ For example, in **Figure 2**, $\mathrm{diam}(G) = 3$.

	- For example, in **Figure 2**, $\mathrm{Nbr}(v_2) = \{v_1, v_3, v_6\}$.
	- For example, in **Figure 2**, $\mathbf{d}_{v_1, -\{v_3, v_4\}} = (2, 3)$.
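The quantities in the examples above can be checked mechanically. The sketch below assumes a six-node edge list that was chosen only because it reproduces the values quoted for **Figure 2**; the actual figure may contain different edges, and all names in the code are illustrative.

```python
from collections import deque

# Hypothetical edge list, chosen only because it reproduces the values the
# text quotes for Figure 2; the real figure may contain different edges.
EDGES = [(1, 2), (2, 3), (2, 6), (3, 4), (3, 5), (4, 5)]
NODES = [1, 2, 3, 4, 5, 6]  # node i stands for v_i

ADJ = {v: set() for v in NODES}
for a, b in EDGES:
    ADJ[a].add(b)
    ADJ[b].add(a)


def distances_from(source):
    """Breadth-first search: number of edges on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in ADJ[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


DIST = {v: distances_from(v) for v in NODES}


def metric_rep(v, subset=None):
    """d_v over all nodes, or d_{v,-subset} when an ordered subset is given."""
    targets = NODES if subset is None else subset
    return tuple(DIST[t][v] for t in targets)


def diameter():
    """Length of the longest shortest path."""
    return max(DIST[u][v] for u in NODES for v in NODES)


def nbr(v):
    """Open neighbourhood of v."""
    return set(ADJ[v])


print(metric_rep(1))          # (0, 1, 2, 3, 3, 2)
print(diameter())             # 3
print(sorted(nbr(2)))         # [1, 3, 6]
print(metric_rep(1, [3, 4]))  # (2, 3)
```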

**Figure 2.** *An example used in Section 3 for illustrating various notations.*

attacker nodes that may be inserted in the network. It was shown in [1] that graphs satisfying $(k, \ell)$-anonymity can successfully deter adversaries controlling at most $\ell$ nodes in the graph from re-identifying nodes with probability higher than $1/k$.

**Figure 1.** *A simple graph* G *used in Section 2 to illustrate the high risk posed by combining knowledge gained by active and passive attacks.*

*Security and Privacy From a Legal, Ethical, and Technical Perspective*

	- For example, in **Figure 2**, $\mathcal{D}_{\{v_1, v_2\}, -\{v_3, v_4\}} = \{(2, 3), (1, 2)\}$. Note that the first pair $(2, 3)$ corresponds to $v_1$ and the second pair $(1, 2)$ corresponds to $v_2$.

### **4. Theoretical results**

To understand how well a graph resists privacy attacks, one needs to study $(k, \ell)$-anonymity in greater detail. Thus, we look into some computational problems related to this measure that were formalized and investigated in [16]. This section contains three problems from [16] and the respective algorithms to solve each problem efficiently. It is important to note that $(k, \ell)$-anonymity in its basic definition sets no limitation for the adversary, which means that an adversary can take control of as many nodes as she/he can. However, in the real world there are many mechanisms designed solely to prevent such attacks, and thus the chances of being caught are significantly high. This notion is the motivation behind several problems with respect to measuring the $(k, \ell)$-anonymity in a graph [17].

We now state the three problems for analyzing $(k, \ell)$-anonymity. Problem 1 simply seeks a *k*-antiresolving set for the largest possible value of $k$. Problem 2 restricts the number of nodes the adversary can control and attempts to find the largest possible value of $k$ while minimizing the number of nodes that are compromised. Problem 3 introduces a version of the problem that attempts to address the trade-off between the privacy threshold and the number of compromised nodes.

**Problem 1** (metric antidimension ($\mathrm{ADIM}$)). Find a *k*-antiresolving subset of nodes $S$ that maximizes $k$.

Problem 1 assumes there are *no* limitations on the number of attacker nodes, thus finding an absolute bound for privacy violation. Note that the solution to Problem 1, denoted by $k_{opt}$, shows that, given no bound on the number of nodes an adversary can control, it is feasible to uniquely re-identify $k_{opt}$ nodes with probability $1/k_{opt}$. The assumptions in Problem 1 are rarely plausible in practice; due to mechanisms present to counter such attacks, the more nodes the adversary controls, the higher the risk of being exposed. Thus, a limit on the number of attacker nodes is necessary, which leads us to Problem 2.

**Problem 2** ($k_{\geq}$-metric antidimension ($\mathrm{ADIM}_{\geq k}$)). Given $k$, find a $k'$-antiresolving set $S$ such that (i) $k' \geq k$ and (ii) $S$ is of minimum cardinality.

Problem 2 is an extension of Problem 1 that attempts to find the largest value of $k$ while minimizing the number of attacker nodes. A solution to this problem supports a few interesting statements. For example, an adversary controlling $\ell$ attacker nodes, where $\ell < |\mathcal{L}^{\geq k}_{opt}|$, cannot uniquely re-identify any node in the network with a probability better than $1/k$. However, using a sufficient number of nodes ($\geq |\mathcal{L}^{\geq k}_{opt}|$) one can re-establish such possibilities.

The third problem focuses on a trade-off between the number of attacker nodes and the privacy violation probability. Given two measures, $(k, \ell)$-anonymity and $(k', \ell')$-anonymity, where $k' > k$ and $\ell' < \ell$, it is easy to observe that the $(k', \ell')$-anonymity measure provides a smaller privacy violation probability but also has lower tolerance for attacker nodes. The trade-off leads us to the third problem.

**Problem 3** ($k_{=}$-metric antidimension ($\mathrm{ADIM}_{=k}$)). Given a positive integer $k$, find a *k*-antiresolving subset of nodes $S$ with minimum cardinality if such a subset exists.

Chatterjee et al. [16] investigated Problems 1–3 from a computational complexity perspective. The following theorems summarize their findings on Problems 1–3. The non-trivial mathematical proofs for these theorems are unfortunately outside the scope of this chapter; we strongly recommend that readers who are interested in the proofs read the original paper [16].
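To make the three problem statements concrete, the following sketch solves them by exhaustive search on a very small graph. It is only an illustration under stated assumptions: it enumerates every non-empty proper subset of nodes, so it takes exponential time and is not the approach analyzed in [16]; the function names and the star graph are hypothetical. Where several sets have the same minimum cardinality, it simply returns the first one found.

```python
from collections import Counter, deque
from itertools import combinations


def all_pairs_distances(adj):
    """BFS from every node; dist[s][v] is the number of edges between s and v."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in d:
                    d[w] = d[u] + 1
                    queue.append(w)
        dist[source] = d
    return dist


def candidates(adj):
    """Yield (k, S) for every non-empty proper subset S, smallest subsets first."""
    nodes = sorted(adj)
    dist = all_pairs_distances(adj)
    for size in range(1, len(nodes)):
        for S in combinations(nodes, size):
            # k is the size of the smallest group of nodes outside S that
            # share a metric representation with respect to S.
            sizes = Counter(tuple(dist[s][v] for s in S) for v in nodes if v not in S)
            yield min(sizes.values()), S


def adim(adj):
    """Problem 1: a k-antiresolving set with the largest k."""
    return max(candidates(adj), key=lambda pair: pair[0])


def adim_at_least(adj, k):
    """Problem 2: a minimum-cardinality k'-antiresolving set with k' >= k."""
    return next((pair for pair in candidates(adj) if pair[0] >= k), None)


def adim_exactly(adj, k):
    """Problem 3: a minimum-cardinality k-antiresolving set, or None."""
    return next((pair for pair in candidates(adj) if pair[0] == k), None)


# Hypothetical input: a star with centre "c" and leaves "x", "y", "z".
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}

print(adim(star))              # (3, ('c',))
print(adim_at_least(star, 2))  # (3, ('c',))
print(adim_exactly(star, 2))   # (2, ('c', 'x'))
print(adim_exactly(star, 4))   # None
```

On this input Problems 2 and 3 give different answers for $k = 2$: a single node already yields a 3-antiresolving set, whereas a set that is exactly 2-antiresolving needs two nodes.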

• We define a partition $\Pi = \{V_1, V_2, \dots, V_t\}$ of $V' \subseteq V$ as one with the following properties:

◦ $\bigcup_{i=1}^{t} V_i = V'$, and

	- For every node $v_j \in \left(\bigcup_{i=1}^{t} V_i\right) \setminus \left(\bigcup_{i=1}^{\ell} V'_i\right)$, remove $v_j$ from the set in $\Pi$ that contains it.
	- Optionally, for every set $V_\ell$ in $\Pi$, replace $V_\ell$ by a partition of $V_\ell$.
	- If there exists an empty set, remove it.
		- i. For example, in **Figure 2**, $\{\{v_1, v_2\}, \{v_3\}, \{v_5\}\} \prec_r \{\{v_1, v_2\}, \{v_3, v_4, v_5\}\}$.
	- The set of equivalence classes, which forms a partition of $\mathcal{D}_{V \setminus V', -V'}$, is denoted by $\Pi^{=}_{V \setminus V', -V'}$.
		- i. For example, in **Figure 2**, $\Pi^{=}_{\{v_1, v_2, v_6\}, -\{v_3, v_5\}} = \{(2, 3), (1, 2), (2, 3)\}$.
	- We declare two nodes $v_i, v_j \in V \setminus V'$ to be in the same equivalence class if $\mathbf{d}_{v_i, -V'}$ and $\mathbf{d}_{v_j, -V'}$ belong to the same equivalence class in $\Pi^{=}_{V \setminus V', -V'}$; thus $\Pi^{=}_{V \setminus V', -V'}$ also defines a partition into equivalence classes of $V \setminus V'$.
	- The *measure* of the equivalence relation is defined as

$$\mu\left(\mathcal{D}_{V \setminus V', -V'}\right) \stackrel{\text{def}}{=} \min_{\mathcal{V} \in \Pi^{=}_{V \setminus V', -V'}} \{|\mathcal{V}|\}.$$

	- i. For example, in **Figure 2**, $\mu\left(\mathcal{D}_{\{v_1, v_2, v_6\}, -\{v_3, v_5\}}\right) = 1$ and $\{v_3, v_5\}$ is a 1-antiresolving set.
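The equivalence classes and the measure $\mu$ can be computed by grouping nodes according to their metric representations. The sketch below assumes the same hypothetical six-node edge list as before, picked only because it agrees with the values quoted for **Figure 2** (the real figure may differ); the helper names are illustrative.

```python
from collections import defaultdict, deque

# Hypothetical edge list consistent with the examples quoted for Figure 2;
# node i stands for v_i. The real figure may contain different edges.
EDGES = [(1, 2), (2, 3), (2, 6), (3, 4), (3, 5), (4, 5)]
NODES = [1, 2, 3, 4, 5, 6]

ADJ = {v: set() for v in NODES}
for a, b in EDGES:
    ADJ[a].add(b)
    ADJ[b].add(a)


def distances_from(source):
    """Breadth-first search distances (edge counts) from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in ADJ[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


DIST = {v: distances_from(v) for v in NODES}


def equivalence_classes(nodes, v_prime):
    """Group nodes by their metric representation with respect to v_prime."""
    classes = defaultdict(list)
    for v in nodes:
        classes[tuple(DIST[s][v] for s in v_prime)].append(v)
    return dict(classes)


def measure(nodes, v_prime):
    """mu: size of the smallest equivalence class."""
    return min(len(members) for members in equivalence_classes(nodes, v_prime).values())


print(equivalence_classes([1, 2, 6], [3, 5]))  # {(2, 3): [1, 6], (1, 2): [2]}
print(measure([1, 2, 6], [3, 5]))              # 1
print(measure([1, 2, 4, 6], [3, 5]))           # 1
```

Under this assumed edge list, $v_1$ and $v_6$ share the representation $(2, 3)$ while $v_2$ is alone with $(1, 2)$, so the smallest class has size 1, in agreement with the statement that $\{v_3, v_5\}$ is a 1-antiresolving set.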

