
*Classification Model for Bullying Posts Detection DOI: http://dx.doi.org/10.5772/intechopen.88633*


the cases, the classifier performed similarly, scoring between 80 and 100%. On the MySpace dataset, recall is close to 1. However, precision varies between 76 and 87%, except at a feature value of 18,000, where it reaches 91%. Unlike on the other datasets, performance on Slashdot is very low: although recall is moderate, precision and the F-1 measure degrade when the feature set is small. Poor performance is also observed at a feature value of 18,000. From this discussion, the weighted B-TFIDF method shows the best performance (**Figure 9**).

| Method | Precision | Recall | F-Measure |
|---|---|---|---|
| DF + SVM | 0.8471 | 0.7770 | 0.8105 |
| PCA + SVM | 0.8397 | 0.7870 | 0.8125 |
| LDA + SVM | 0.8846 | 0.8554 | 0.8724 |
| B-LDA + SVM | 0.9121 | 0.8901 | 0.9003 |

**Table 4.** *Classifier performances based on different feature reduction methods.*

| Method | Measure | Kongregate | Slashdot | MySpace | Twitter |
|---|---|---|---|---|---|
| Baseline | Precision | 0.35 | 0.32 | 0.42 | 0.62 |
| Baseline | Recall | 0.60 | 0.28 | 0.25 | 0.53 |
| Baseline | F-1 measure | 0.44 | 0.30 | 0.31 | 0.57 |
| Weighted TFIDF | Precision | 0.87 | 0.78 | 0.86 | 0.87 |
| Weighted TFIDF | Recall | 0.97 | 0.99 | 0.98 | 0.75 |
| Weighted TFIDF | F-1 measure | 0.92 | 0.87 | 0.92 | 0.81 |
| Weighted B-TFIDF | Precision | 0.95 | 0.96 | 0.96 | 0.98 |
| Weighted B-TFIDF | Recall | 0.93 | 0.84 | 0.93 | 0.96 |
| Weighted B-TFIDF | F-1 measure | 0.94 | 0.90 | 0.95 | 0.97 |

**Table 5.** *Comparison of weighted B-TFIDF with baseline method on other datasets.*

**Figure 9.** *(a) Baseline method, (b) weighted TFIDF method, and (c) weighted B-TFIDF method.*

*Cyberspace*

**3.6 Victim and predator identification**

To identify cyberbullying predators and victims, the most active predators and the most attacked users must be determined. **Table 6** lists the most active predators and victims and the association of users in a bullying relationship; it shows that sometimes more than one user holds the same rank. Users with the same rank are therefore grouped together. Notably, predators at Rank I are also identified as victims at Rank II, and Rank II predators are also Rank VII victims (**Figure 10**).

*3.6.1 Graph representation*

The main purpose of analyzing the users' communication network is to identify predators and victims. Gephi [15], a graphical interface, is employed to trace each user's links through the harassing posts in the network. **Figure 11** shows the bullying network, in which groups of users are obtained from the bullying messages by means of modularity, a measure that quantifies the quality of a division of a network into sub-graphs or groups. Modularity is defined as the sum of the weights of all edges that fall within the given groups minus the expected share if edges were distributed at random in the graph.
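The modularity definition above can be illustrated with a short sketch. It assumes the common undirected, weighted form of modularity, Q = Σ_c [L_c / m − (d_c / 2m)²]; the chapter does not state which variant Gephi was configured with, and the function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected weighted graph (assumed variant).

    edges: iterable of (u, v, weight) tuples, each edge listed once.
    community: dict mapping every node to its group label.
    """
    total = 0.0                   # m: total edge weight in the graph
    inside = defaultdict(float)   # L_c: edge weight falling inside group c
    degree = defaultdict(float)   # d_c: summed weighted degree of group c
    for u, v, w in edges:
        total += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    if total == 0:
        return 0.0
    # Observed share of weight inside each group minus the share expected
    # if the same degrees were wired together at random.
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


# Toy graph: two triangles joined by a single edge.
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
             ("c", "d", 1)]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
```

Splitting the toy graph into its two triangles scores higher than placing all users in one group, which is the behavior a modularity-based grouping relies on.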



**Figure 10.** *Predators and victims identification.*

As shown in **Figure 11**, the modularity algorithm forms nine groups, or communities, each drawn in a different color; users within a group are more densely connected to one another than to users in other groups. The density of a post indicates how much abusive content it carries and is calculated for each post as the number of harassing words in the post divided by the total number of words in the post. The HITS algorithm is used to identify predators and their victims and to calculate their scores. The idea behind HITS is that, in a network, good hub pages point to good authority pages, and good authority pages are pointed to by good hub pages. For a search query, the web pages are examined to identify potential hub and authority pages according to their individual scores. In the same way, this concept is used to rank predators and victims in a communication network.
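The post density described above can be sketched as follows. The tokenization rule, the function name `post_density`, and the three-word lexicon are illustrative assumptions; the chapter does not specify how words are split or which bullying lexicon is used.

```python
import re


def post_density(post, bullying_words):
    """Share of words in a post that appear in a bullying lexicon."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", post.lower())
    if not tokens:
        return 0.0  # an empty post carries no bullying content
    hits = sum(1 for token in tokens if token in bullying_words)
    return hits / len(tokens)


# Hypothetical lexicon, for illustration only.
example_lexicon = {"stupid", "loser", "idiot"}
example_density = post_density("You are a stupid loser", example_lexicon)
```

In this example two of the five words are in the lexicon, so the density is 0.4.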

Assumption: One bullying message is considered for each user.

Predator: Person who has posted at least one bullying message.

Victim: User who has received at least one bullying message.

Objective: To identify and rank the most active users as predators and victims.

**Figure 11.** *Bullying network.*

A ranking method based on the HITS algorithm is then used to detect predators and victims. A user may be both a predator and a victim, depending on the harassing messages he or she sends or receives. Each user is therefore assigned a predator score as well as a victim score. The two scores are calculated with the following equations.

$$p(u) \leftarrow \sum_{u \rightarrow y} v(y) \tag{13}$$

$$v(u) \leftarrow \sum_{y \rightarrow u} p(y) \tag{14}$$

Here, p(u) and v(u) denote the predator and victim scores of user u, respectively; u → y indicates a harassing post sent from u to y, whereas y → u indicates a bullying post sent from y to u. The two equations are applied repeatedly as a set of iterative update rules. They rest on the assumption that the most active predators are linked to the most active victims by the harassing posts they send, and that the most active victims are linked to the most active predators by the bullying messages they receive. A user's predator score therefore increases when the user sends messages to users with high victim scores, and the user's victim score increases when the user receives bullying messages from users with high predator scores. Because the scores are recomputed in every iteration from the incoming and outgoing links and the associated scores, they can grow very large. The scores are therefore normalized to unit length, i.e., each predator score is divided by the sum of all predator scores, and each victim score by the sum of all victim scores.
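The iterative update of Eqs. (13) and (14) with sum normalization can be sketched as below. This is a minimal illustration, not the chapter's implementation: the function name, the stopping tolerance, the iteration cap, and the choice to update victim scores from the freshly updated predator scores are all assumptions.

```python
def predator_victim_scores(messages, max_iter=100, tol=1e-9):
    """HITS-style scores from (sender, recipient) pairs, one pair per bullying post."""
    users = sorted({user for pair in messages for user in pair})
    if not users:
        return {}, {}
    p = {u: 1.0 for u in users}  # predator scores, started at 1
    v = {u: 1.0 for u in users}  # victim scores, started at 1

    def normalize(scores):
        # Divide every score by the sum of all scores, as described in the text.
        total = sum(scores.values())
        return {u: s / total for u, s in scores.items()} if total else scores

    for _ in range(max_iter):
        # Eq. (13): add the victim score of every user that u has posted to.
        new_p = {u: 0.0 for u in users}
        for sender, recipient in messages:
            new_p[sender] += v[recipient]
        new_p = normalize(new_p)

        # Eq. (14): add the predator score of every user that has posted to u.
        # Assumption: the freshly updated predator scores are used here.
        new_v = {u: 0.0 for u in users}
        for sender, recipient in messages:
            new_v[recipient] += new_p[sender]
        new_v = normalize(new_v)

        change = max(abs(new_p[u] - p[u]) + abs(new_v[u] - v[u]) for u in users)
        p, v = new_p, new_v
        if change < tol:  # scores have stabilized
            break
    return p, v


# Hypothetical conversation: "a" harasses "b" and "c"; "d" harasses "b".
example_p, example_v = predator_victim_scores([("a", "b"), ("a", "c"), ("d", "b")])
```

In this toy conversation "a" receives the highest predator score and "b" the highest victim score, while users who never send a bullying post keep a predator score of zero.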

A ranking method must then be defined for the predators and victims in the network shown in **Figure 11**. To illustrate a real scenario simply, only five users are selected as an example in **Figure 12**, which shows how the most active predators and victims are identified in a bullying network. The network is a weighted directed graph G = (U, A), where U is the set of nodes and A is the set of arcs.

Each node ui ∈ U is a user involved in the bullying conversation. Each arc (ui, uj) ∈ A is a bullying message sent from ui to uj.

| Rank | I | II | III | IV | V | VI | VII | VIII |
|---|---|---|---|---|---|---|---|---|
| Number of users (predators) | 4 | 2 | 1 | 1 | 2 | 7 | 3 | 2 |
| Number of users (victims) | 8 | 4 | 7 | 2 | 2 | 1 | 9 | 8 |

**Table 6.** *Performance of graph model: Predators and victims identification.*

The weight of arc (ui, uj), denoted wij, is defined as a summation of in-degrees. Predators and victims are identified from the weighted directed graph G: a victim is recognized by the many incoming arcs of its node, and a predator by the many outgoing arcs. This method helps to identify the most active predator or victim.

### *3.6.2 Cyber bullying matrix*

A cyber bullying matrix W is constructed to identify predators and victims according to their individual scores; it is depicted in **Table 7**. W is the square adjacency matrix of the subnetwork (it captures the incoming and outgoing degrees of each node), with entries wij defined as

$$w_{ij} = \begin{cases} n & \text{if there are } n \text{ harassing posts from } u_i \text{ to } u_j, \\ 0 & \text{otherwise} \end{cases} \tag{15}$$
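Eq. (15) can be sketched as a small routine that counts harassing posts per sender and recipient pair. The function names and the three-user example are hypothetical; rows are taken to be senders and columns recipients, following the definition of wij.

```python
def bullying_matrix(messages, users):
    """Build W, where W[i][j] counts harassing posts from users[i] to users[j]."""
    index = {user: i for i, user in enumerate(users)}
    n = len(users)
    W = [[0] * n for _ in range(n)]
    for sender, recipient in messages:
        W[index[sender]][index[recipient]] += 1
    return W


def outgoing_weight(W):
    """Row sums: number of harassing posts each user has sent."""
    return [sum(row) for row in W]


def incoming_weight(W):
    """Column sums: number of harassing posts each user has received."""
    return [sum(column) for column in zip(*W)]


# Hypothetical example with three users.
example_users = ["u1", "u2", "u3"]
example_messages = [("u1", "u2"), ("u1", "u2"), ("u3", "u1")]
example_W = bullying_matrix(example_messages, example_users)
```

The row sums give each user's outgoing bullying posts and the column sums the incoming ones, which matches the observation that predators have many outgoing arcs and victims many incoming arcs.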

Since each user has both a victim and a predator score, the scores are represented as two vectors of dimension N × 1, whose i-th coordinates hold the predator score pi and the victim score vi of the i-th user. To calculate the scores, the update equations for p(u) and v(u) are written compactly as matrix-vector multiplications. In the first iteration, pi and vi are initialized to 1. For every user (i = 1 to N), the predator and victim scores are as follows:

$$p(u_i) = w_{i1} v_1 + w_{i2} v_2 + \dots + w_{iN} v_N \tag{16}$$

$$v(u_i) = w_{1i} p_1 + w_{2i} p_2 + \dots + w_{Ni} p_N \tag{17}$$

When these updates converge to a stable value (say, k), they yield the final predator and victim vectors. Finally, the eigenvectors are computed to obtain the predator and victim scores.

| Sender \ Recipient | U1 | U2 | U3 | U4 | U5 | … | UN |
|---|---|---|---|---|---|---|---|
| U1 | 0 | 3 | 0 | 1 | 3 | … | … |
| U2 | 1 | 0 | 0 | 0 | 0 | … | … |
| U3 | 1 | 2 | 0 | 1 | 0 | … | … |
| U4 | 0 | 1 | 0 | 0 | 1 | … | … |
| U5 | 0 | 1 | 1 | 1 | 0 | … | … |
| … | … | … | … | … | … | … | … |
| UN | … | … | … | … | … | … | … |

**Table 7.** *Cyber bullying matrix (W).*

**Figure 12.** *Communication paths between predator and casualty.*

Algorithm 1 gives a general framework for identifying the top-ranked, most active predators and victims. In the algorithm, N is the total number of users and Top is a threshold value, which is set manually.

Algorithm 1. Predators and casualty recognition.

Input: Set of users engaged in the conversation with harassing posts, N, Top.

Output: Set of top victims and top predators.

1. Extract senders and receivers from N;

2. Initialize the predator and victim vectors for each of the N users;

3. Generate the adjacency matrix w using formula (15);

4. Compute the predator and victim vectors with the iterative updating Eqs. (16) and (17), and normalize, until they converge to a stable value k;

5. Compute eigenvectors to obtain the predator and victim scores;

6. Return the top-ranked predators and victims.

**4. Summary**

The proposed system makes two contributions. The first is a novel statistical approach based on the new Bully-LDA with the weighted B-TFIDF scheme applied to bullying-like features. It efficiently and effectively finds latent bullying features, which improves classifier performance and reduces feature sparsity. The second is a graph model that helps to pinpoint the predators and victims in social networks. Such a system comprises the following functions: tweet crawling, tweet preprocessing and tokenization, feature extraction and frequency extraction, text representation model, text classification, category of texts, performance evaluation, and results.

The Twitter corpus consists of text communications together with metadata such as user ID, posting time, etc. Tweet crawling uses a number of classes and techniques to collect the users' connection data and the details of the tweets via the Twitter application programming interface (API), accessed through the Java library "Twitter4j-core-4.02.jar." Tweets are written in an entirely colloquial manner, with a large amount of noise and linguistic variation. For example, tweets contain a large number of novel words, interjections, repetitions, short words such

