conditions: the objective function is concave, the inequality constraints are continuously differentiable convex functions, and the equality constraint is an affine function. According to the KKT conditions, the optimal parameters $\alpha^*$, $\mathbf{w}^*$, and $b^*$ must satisfy:

$$\alpha_i^* \left[ y_i \left( \mathbf{w}^{*T} \phi(\mathbf{x}_i) + b^* \right) - 1 + \xi_i \right] = 0, \quad i = 1, 2, \ldots, n \tag{11}$$

In classification, only a small subset of the Lagrange multipliers $\alpha_i^*$ are usually nonzero. The training examples with nonzero $\alpha_i^*$ are defined as support vectors. They construct the optimal separating hyperplane as:

$$\mathbf{w}^{*T} \phi(\mathbf{x}) + b^* = \sum_{j=1}^{n} \alpha_j^* y_j k\left(\mathbf{x}, \mathbf{x}_j\right) + b^* = 0 \tag{12}$$

In the SVM framework, the task of multiple kernel learning is considered as a way of optimizing the kernel weights while training the SVM. For multiple kernels, Eq. (12) can be converted into the following equation to derive the dual form for MKL:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \sum_{l=1}^{m} \beta_l k_l\left(\mathbf{x}_i, \mathbf{x}_j\right) \tag{13}$$

subject to

$$0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, n$$

$$\sum_{i=1}^{n} y_i \alpha_i = 0$$

$$\beta_l \ge 0, \quad l = 1, 2, \ldots, m, \quad \sum_{l=1}^{m} \beta_l = 1$$

In Eq. (13), both the base kernel weights $\beta_l$ and the Lagrange coefficients $\alpha_j$ need to be optimized. A two-step procedure is considered to decompose the problem into two optimization problems.

In the first step, the best weights $\beta_l$ are derived through grid search and cross-validation by minimizing the 2-norm soft-margin error function using linear programming. The weights for text features and image features are changed according to the type of data. For example, for the wildfire data, the weight for text features was chosen as 0.70, and the weight for image features was chosen as 0.30. In the second step, the Lagrange coefficients $\alpha_j$ are obtained by maximizing Eq. (13) using quadratic programming. The proposed method solves the quadratic program with the interior point method, which achieves optimization by traversing the convex interior of the feasible region.

#### 2.5. Final event detection

As described above, the training process of multimedia data fusion builds the system by deriving the parameters $\alpha_j$, $b$, $\mathbf{x}_i$, $\beta_l$, and $k_l$. For a test input $\mathbf{x}$, the decision function for MKL, i.e., the event detection function $F(\mathbf{x})$, is a convex combination of basis kernels, computed as:

$$F(\mathbf{x}) = \operatorname{sign}\left( \sum_{l=1}^{m} \beta_l \left[ \sum_{j=1}^{n} \alpha_j^* y_j k_l\left(\mathbf{x}, \mathbf{x}_j\right) \right] + b^* \right) \tag{14}$$

Machine Learning - Advanced Techniques and Emerging Applications

3. Experiment design, result, and discussion

#### 3.1. Experiment design

Experiments have been done to build the event detection method and test its performance on real tweets. The algorithm is implemented in MATLAB. In the experiments, tweets that contain both text and an image are collected from the Twitter stream. The data collection covers two events: the Brisbane hailstorm and the California wildfire.

The data are separated into two sets: training and testing. The training data are divided into two manually labeled groups: the event has happened, or the event has not happened. Each group has the same number of tweets. The same process is applied to the testing data. The two sets contain the same number of samples. The reasons for using equal-sized sets are that larger training and testing sets allow the algorithm to be better trained and tested, and that the total number of samples is large enough to split the data into two equal sets. For each tweet set used for detecting whether an event has happened, its features are extracted for the fusing operation.
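To make the fusion step concrete, the sketch below shows how a two-kernel decision function of the form in Eq. (14) could look. This is an illustrative Python sketch, not the chapter's MATLAB implementation; the Gaussian base kernel, the feature dimensions, and the 0.70/0.30 weights (borrowed from the wildfire example in Section 2) are assumptions.

```python
import numpy as np

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian base kernel: k(u, v) = exp(-gamma * ||u - v||^2)."""
    return float(np.exp(-gamma * np.sum((u - v) ** 2)))

def combined_kernel(x, z, beta=(0.70, 0.30), n_text=4):
    """Convex combination of a text kernel and an image kernel.

    x and z are concatenated feature vectors [text | image]; beta holds
    the kernel weights (text, image), which are nonnegative and sum to 1
    as required by the constraints of Eq. (13)."""
    k_text = rbf_kernel(x[:n_text], z[:n_text])
    k_image = rbf_kernel(x[n_text:], z[n_text:])
    return beta[0] * k_text + beta[1] * k_image

def detect_event(x, support_vectors, alpha, y, b=0.0):
    """Decision function F(x) = sign(sum_j alpha_j * y_j * k(x, x_j) + b)."""
    s = sum(a_j * y_j * combined_kernel(x, x_j)
            for a_j, y_j, x_j in zip(alpha, y, support_vectors))
    return 1 if s + b >= 0 else -1
```

With equal multipliers and two toy support vectors, a test input whose features resemble the positive support vector is labeled +1, and one resembling the negative support vector is labeled -1.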

In order to validate the performance of the proposed MKL event detection using both text and image, two other methods are also built and tested. Both are based on single-kernel learning: one takes text only as input, and the other takes images only as input.
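One way to view these baselines inside the same MKL framework is as degenerate kernel-weight settings, with all weight placed on a single modality. The values below are illustrative (the 0.70/0.30 split is the wildfire example from Section 2, not a general recommendation):

```python
# Kernel weights (beta_text, beta_image) for the three compared methods.
# The single-kernel baselines put all weight on one modality; every
# setting still satisfies the constraints of Eq. (13):
# beta_l >= 0 and sum over l of beta_l == 1.
settings = {
    "text_only":  (1.0, 0.0),
    "image_only": (0.0, 1.0),
    "mkl_fusion": (0.70, 0.30),  # illustrative wildfire weights
}

for name, (b_text, b_image) in settings.items():
    assert b_text >= 0 and b_image >= 0
    assert abs((b_text + b_image) - 1.0) < 1e-12
```

Seen this way, the comparison in Section 3.3 measures how much the learned intermediate weighting improves over the two degenerate extremes.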

#### 3.2. Performance evaluation parameters

In order to measure the performance of the proposed method and those of the other methods more objectively and comprehensively, four performance parameters are used: accuracy (A), precision, recall, and F-score [25]. They are defined below.

The accuracy for the event detection method is defined as

$$A = \frac{\text{TP} + \text{TN}}{\text{TP} + \text{TN} + \text{FP} + \text{FN}} \tag{15}$$

where TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives, respectively. In classifying an event such as a wildfire, a true positive (TP) occurs when a wildfire happened and a tweet from the wildfire data is classified as wildfire. If a tweet from the wildfire data is classified as not wildfire, this is a false negative (FN). In contrast, when a tweet about a nonwildfire event is classified as wildfire, that is a false positive (FP). If a tweet about a nonwildfire event is classified as not wildfire, that is a true negative (TN). For other events such as the hailstorm, the classification is applied in the same way.
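These four counts map directly to the four metrics of this subsection, Eqs. (15)–(18). A minimal sketch, with illustrative label values rather than the chapter's actual data:

```python
def confusion_counts(y_true, y_pred, positive="wildfire"):
    """Tally TP, TN, FP, FN for one event class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):                # Eq. (15)
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):                       # Eq. (16)
    return tp / (tp + fp)

def recall(tp, fn):                          # Eq. (17)
    return tp / (tp + fn)

def f_score(p, r):                           # Eq. (18)
    return 2 * p * r / (p + r)
```

For example, with labels `["wildfire", "wildfire", "other", "other"]` and predictions `["wildfire", "other", "wildfire", "other"]`, each count equals 1 and all four metrics evaluate to 0.5.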

Precision refers to the fraction of retrieved tweets that are correctly retrieved. It is a function of true positives and false positives. It is defined as:

$$precision = \frac{TP}{TP + FP} \tag{16}$$

The term recall refers to the fraction of relevant tweets that are retrieved; it is also known as the true positive rate. It is a function of the correctly classified examples, i.e., true positives, and the false negatives. It is defined as:

$$recall = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{17}$$

F-score is introduced as the harmonic mean of precision and recall, in this way combining and balancing precision and recall. It is defined as:

$$F\text{-score} = 2 \cdot \frac{precision \cdot recall}{precision + recall} \tag{18}$$

F-score measures how well a learning algorithm performs on a class, summarizing precision and recall in a single number.

#### 3.3. Result and discussion

In order to validate the performance of the proposed event detection based on multiple kernel learning, two other single kernel-based methods are also built and tested. Both of the other two methods take a single medium as input, i.e., text or image. The performance metrics of the proposed method and those of the other two methods for the two events are given in Table 1.

| Event | Data | Accuracy | Precision | Recall | F-score |
| --- | --- | --- | --- | --- | --- |
| Brisbane hailstorm | Text only | 0.89463 | 0.90662 | 0.90171 | 0.90416 |
| | Image only | 0.85981 | 0.82759 | 0.90566 | 0.86486 |
| | The proposed method | 0.93434 | 0.93578 | 0.94444 | 0.94009 |
| California wildfire | Text only | 0.90981 | 0.91533 | 0.91116 | 0.91324 |
| | Image only | 0.86406 | 0.88971 | 0.84912 | 0.86894 |
| | The proposed method | 0.92736 | 0.9311 | 0.93721 | 0.93414 |

Table 1. Event detection performance of the proposed method in comparison with the performance of two methods that use text only or image only.

Multiple Kernel-Based Multimedia Fusion for Automated Event Detection from Tweets

http://dx.doi.org/10.5772/intechopen.77178

From the table, it can be seen that for both the Brisbane hailstorm event and the California wildfire event, the proposed method consistently achieved better performance in all four metrics than the methods using text only or image only. For example, the proposed method achieved an accuracy of 0.93 for the Brisbane hailstorm, whereas the method using text only achieved 0.89 and the method using image only achieved 0.85. For the California wildfire, the accuracy of the proposed method is 0.92, better than the 0.90 and 0.86 of the other two methods. Compared to the two single kernel-based methods, the proposed method has improved by about 5%, 6%, 5%, and 6% in accuracy, precision, recall, and F-score, respectively. The experiment results have proven that event detection from multimedia data in Twitter is enhanced and improved by using a combination of multiple features for both images and text.

4. Conclusion

In this chapter, a method for detecting hot events, in particular disasters such as hailstorms and wildfires, is proposed. The approach uses visual information as well as textual information to improve detection performance. It starts by monitoring a Twitter stream to pick up tweets having both text and images and storing them in a database. After that, the Twitter data are preprocessed to eliminate unwanted data and transform unstructured data into structured data. Then, features in both texts and images are extracted for event detection. For feature extraction from text, the term frequency-inverse document frequency technique is used. For images, the features extracted are: histogram of oriented gradients descriptors for object detection, gray-level co-occurrence matrix for texture description, color histogram, and scale-invariant feature transform. In the next step, the text features and image features are input to multiple kernel learning (MKL) for fusion. MKL can automatically combine both feature types in order to achieve the best performance.

The proposed method was tested on two datasets from two events: the Brisbane hailstorm 2014 and the California wildfires 2017. The method was compared with a method that used text only and another method that used images only. With the Brisbane hailstorm data, the proposed method achieved the best performance, with a fusion accuracy of 0.93, compared to 0.89 with text only and 0.85 with images only. With the California wildfires data, the proposed method achieved the best performance, with a fusion accuracy of 0.92, compared to 0.90 with text only and 0.86 with images only. This demonstrates that event detection from multimedia data in Twitter is enhanced and improved by our approach of using a combination of multiple features for both images and text. The proposed method also improves computational efficiency when handling large volumes of data and gives better performance than other fusion approaches. It delivers an accurate and effective method for detecting events, which can be used for spreading awareness and organizing responses.
