**4. Empirical evaluation of SSK**

In this section, we evaluate the performance of SSK classification. We compare it to the results of a classifier that possesses significantly more prior knowledge, demonstrate its performance with a few examples from different cities in Israel, analyze the process of the self-labeled technique, and discuss its overall accuracy and the accuracy in each land use separately.

We used the ground truth land-use labels for two purposes—for training the SSK classifier and for evaluating its performance. Five percent of the cells were randomly chosen at the beginning of the process, and the labels of these cells were treated as ground truth and were used for training the classifier. The performance of the classifier was estimated by the labels of the other 95% of the cells. We performed the classification in each hour separately, and in each hour, repeated the process five times, each with another randomly chosen 5% of the cells. Thus, using these permutations, we diminished the variance caused by the random aspect.
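The evaluation protocol described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `split_labeled`, `evaluate`, the seeding scheme, and the `classify` callback (a stand-in for the SSK classifier) are all hypothetical names introduced here.

```python
import random

def split_labeled(cell_ids, labeled_fraction=0.05, seed=0):
    """Randomly pick the cells whose ground-truth labels are revealed for training."""
    rng = random.Random(seed)
    n_labeled = max(1, round(labeled_fraction * len(cell_ids)))
    labeled = set(rng.sample(cell_ids, n_labeled))
    held_out = [c for c in cell_ids if c not in labeled]
    return sorted(labeled), held_out

def evaluate(cell_ids, true_label, classify, hours, repeats=5):
    """Average held-out accuracy over every hour and every random repetition."""
    accuracies = []
    for hour in hours:
        for rep in range(repeats):
            # Assumed seeding scheme: a different random 5% for each (hour, repetition).
            labeled, held_out = split_labeled(cell_ids, seed=hour * 1000 + rep)
            # `classify` is a hypothetical callback returning {cell: predicted label}.
            predicted = classify(hour, labeled, held_out)
            correct = sum(predicted[c] == true_label[c] for c in held_out)
            accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)
```

Averaging over the repetitions is what reduces the dependence of the reported accuracy on any single random choice of labeled cells.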

The accuracy rate of SSK, averaged over all permutations and hours and using labels for only 5% of the cells, is 74.4%. Compared to the works of Toole et al. [42] and Pei et al. [18], who also attempted to identify land use based on CDR, our accuracy rate is exceptionally high; Toole et al. [42] and Pei et al. [18] achieved 54% and 58% accuracy rates, respectively. However, no firm conclusions can be drawn from comparing these accuracy rates. The main reason is that these studies performed land-use mapping of a whole city (Boston in the work of Toole et al. [42] and Singapore in the work of Pei et al. [18]), whereas we deliberately chose areas with a relatively "pure" and clear land-use function from different cities in Israel. Identifying the land use of areas with a "pure" social function is an easier task.

**Tables 1** and **2** illustrate the classification results in greater detail and the quality of the classification of each land-use category separately. **Table 1** presents the confusion matrices of the results, predicted (columns) vs. true values (rows), in different day parts: (a) between 4 a.m. and 7 a.m., (b) between 8 a.m. and 5 p.m., (c) between 5 p.m. and 7 p.m., and (d) between 8 p.m. and 10 p.m. Notice that the set of social land uses changes throughout the day.

#### **Table 1.**

*Confusion matrices of the classification results in four day parts: (a) 4 a.m.–7 a.m., (b) 8 a.m.–5 p.m., (c) 5 p.m.–7 p.m., and (d) 8 p.m.–10 p.m.*

*Mapping of Social Functions in a Smart City When Considering Sparse Knowledge DOI: http://dx.doi.org/10.5772/intechopen.104901*

#### **Table 2.**

*Precision, recall, and F1 of each land use.*

Some of the social functions, such as Commercial, occur only in specific hours (**Table 1b–d**), while other social functions, such as Highway and No activity, occur all day long, but not necessarily in the areas we chose. For example, in our dataset, there is no cell labeled as No activity between 8 a.m. and 5 p.m. While **Table 1** provides detailed accuracies for the different land uses in different parts of the day, **Table 2** averages performance over the land uses and time parts and reports the precision, recall, and F1 score for the classification of each land use over all cells in the nine cities. Precision is the percentage of cells classified as a specific land use c that truly belong to c, recall is the percentage of cells that truly belong to c that are classified as c, and the F1 score combines recall and precision by calculating their harmonic mean:

$$F1 = 2\,\frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{7}$$

We therefore use the F1 score as the main indicator of the classification quality of each land use.
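To make the three measures concrete, the sketch below computes them per class from a confusion matrix whose rows are true classes and whose columns are predicted classes, the same orientation as **Table 1**. The counts are hypothetical and chosen only for illustration; they are not taken from the tables, and `per_class_scores` is a name introduced here.

```python
def per_class_scores(confusion, labels):
    """confusion[i][j] = number of cells with true class i predicted as class j."""
    scores = {}
    for k, name in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        actually_k = sum(confusion[k])                     # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0  # Eq. (7)
        scores[name] = (precision, recall, f1)
    return scores

# Hypothetical two-class example (not the paper's data).
labels = ["Residential", "Commercial"]
confusion = [[90, 10],   # true Residential
             [30, 20]]   # true Commercial
scores = per_class_scores(confusion, labels)
```

In this toy matrix, cells leaking from Commercial into Residential lower the precision of Residential and the recall of Commercial, which is the kind of asymmetry that the F1 score summarizes in a single number.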

Residential and Industrial are well identified (both have an F1 score of 0.82). Residential is the most common land use in urban areas, so identifying it correctly is important; in our work, 47% of the cells are Residential. All the land-use categories except Residential have higher precision than recall, which indicates that the classifier tends to over-assign cells to Residential and to under-assign them to all the other land uses. Residential has a high recall (0.92) and lower precision, while Industrial has a high precision (0.91) and lower recall. Commercial is relatively well identified (F1 is 0.59). Its identification suffers more than that of other land uses from inaccurate location estimation. As mentioned in Section 2, the location estimated from the CDR coordinates is inaccurate, and the error can reach 350 m. Commercial streets, because of their long and narrow shape, are vulnerable to such location errors. Because they are often surrounded by a "sea" of residential neighborhoods, transmissions originating from the neighborhoods are mixed with transmissions originating from the commercial street. The result is a mixed cellular communication behavior that makes correct identification harder. Indeed, Commercial is often confused with Residential, as shown in **Table 1b** and **c**. Later in the paper, we demonstrate an example of a Commercial street in the city of Ra'anana that is confused with its neighboring residential buildings. The same problem occurs in other narrow-shaped land uses, such as Street and Highway; both have a low identification rate.

Street is also frequently confused with Residential (see **Table 1a** and **d**), which is not surprising because streets are located in the heart of neighborhoods. No activity is relatively well identified (F1 is 0.64).

We compared the performance of SSK, which assumes that the social function is known for only 5% of the cells, to that of a supervised random forest (RF) [46] classifier that assumes significantly more labeled cells. The RF classifier was trained on the same dataset and the same areas, but with 8-fold cross-validation; thus, in each fold, RF classified 1/8 of the cells based on the other 7/8. That is, whereas SSK assumed labels for 5% of the cells, RF assumed labels for 87.5% (7/8) of the cells. As expected, RF achieved a higher accuracy rate of 84%; however, the accuracy rate of SSK (74.4%) is still high considering the scarcity of labeled samples.
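The difference in available labels follows directly from how k-fold cross-validation partitions the cells. The sketch below is a generic, pure-Python fold generator; the shuffling and the fold assignment are assumptions, since the text does not specify how the folds were drawn, and `k_fold_indices` is a hypothetical helper name.

```python
import random

def k_fold_indices(n_cells, k=8, seed=0):
    """Yield (train, test) index lists; each cell is tested exactly once."""
    order = list(range(n_cells))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# With 8 folds, every model is trained on 7/8 of the cells.
train, test = next(k_fold_indices(800, k=8))
train_share = len(train) / 800  # 0.875, compared with 0.05 for SSK
```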

In **Figure 5**, we visualize the results on a map we refer to as a geographical confusion map. It resembles a confusion matrix, but it displays the results on a geographical map with each cell (sample) placed where it is located. **Figure 5** compares the geographical confusion maps of RF (**Figure 5a**–**c**) and SSK (**Figure 5d**–**f**) classification during the work hours between 8 a.m. and 5 p.m. in three cities: Ra'anana (RF **Figure 5a** and SSK **Figure 5d**), Ramat-Gan (RF **Figure 5b** and SSK **Figure 5e**), and Tel Aviv (RF **Figure 5c** and SSK **Figure 5f**). The legend displays the colors representing the four land-use classes in these hours. The colored circle beside each batch of cells indicates the "real" land-use label of that batch. The color of each cell indicates the land use it was classified as. Notice that some of the cells have more than one color. This is because each map accumulates 45 classification results: 9 hours from 8 a.m. to 5 p.m. × 5 random training–testing permutations.

**Figure 6(left)** focuses on part of Ramat-Gan's RF classification results (**Figure 5b**). See the cell marked "1"; it has three colors: blue, yellow, and a thin line of red. Fifty-three percent of the cell is blue, indicating that it was classified as Residential in 53% of the runs (24 of the 45 runs). Almost half of the cell is yellow, indicating that it was frequently classified as Industrial, and the thin red line indicates that it was also classified as Commercial (in 2 of the 45 runs). In contrast, the cell marked "2" is completely yellow, indicating that it was classified as Industrial in all runs.
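The coloring of a cell is simply the distribution of its predicted classes over the 45 runs. The sketch below reproduces the shares for the cell marked "1"; the Residential and Commercial counts are those given above, while the Industrial count of 19 is inferred here from the remaining runs (45 − 24 − 2) under the assumption that the cell shows only these three colors. The helper `class_shares` is a hypothetical name.

```python
from collections import Counter

def class_shares(run_predictions):
    """Fraction of runs in which the cell was assigned to each class."""
    counts = Counter(run_predictions)
    total = len(run_predictions)
    return {label: n / total for label, n in counts.most_common()}

# Cell "1": 24 Residential and 2 Commercial runs (from the text);
# 19 Industrial runs is an inferred remainder, not a reported figure.
cell_1 = ["Residential"] * 24 + ["Industrial"] * 19 + ["Commercial"] * 2
shares = class_shares(cell_1)
```

Under this reading, roughly 53% of the cell is blue, 42% is yellow, and 4% is red.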

Comparing the visualized results, one can see that SSK, which relies on a small number of labeled cells, suffers from higher classification variance than RF. In SSK, more cells are not unanimously classified as the same class in all 45 runs, as indicated by more cells containing more than one color. For example, in **Figure 5c**, most of the cells of the commercial streets Ibn Gabirol and Dizengoff in Tel-Aviv classified by RF are uniformly red. This indicates that they were classified as Commercial in all runs.

However, the same streets classified by SSK (**Figure 5f**) are mostly red, indicating that in most runs they are correctly classified as Commercial, but blue is also prominent, indicating that in a non-negligible number of runs they were classified as Residential. Note, however, that in both streets the ground floor of the buildings holds stores and restaurants and should be labeled Commercial, whereas the remaining floors (usually three) are residential and should be labeled Residential. SSK relies heavily on the random selection of the 5% of cells used in the initial training set, in contrast to RF, which relies on a large and consistent training set. Ra'anana's commercial street, Ahuza St. (**Figure 5a** and **d**), is confused with Residential. This is mostly because of the location estimation inaccuracy described earlier in this section: the street is surrounded by neighborhoods and hence receives cellular transmissions from the neighboring Residential land use.


#### **Figure 5.**

*Geographical confusion map comparison of RF (a)–(c) and SSK (d)–(f) for three cities shown in Figure 2 (bottom): (a) and (d) Ra'anana, (b) and (e) Ramat Gan, and (c) and (f) Tel Aviv.*

Moreover, this geographical confusion may be caused by residential buildings on the street itself that mix the social use of the land (as in the two streets in Tel Aviv).

SSK classification is also more biased than RF classification. As an example, consider the commercial streets in Ramat-Gan marked with a red circle beside them (**Figure 5b** and **e**). Both algorithms classified the commercial streets inconsistently, sometimes as Commercial (correct) and sometimes as Residential (incorrect). However, RF classified the cells as Commercial in most runs (most cells are mostly red), whereas SSK classified some of the Commercial cells as Residential more often than as Commercial (cells that are mostly blue).

#### **Figure 6.**

*(left) "Zoom in" on part of Ramat Gan's geographical confusion map of the RF classification results (Figure 5b). (right) Accuracy rate (Acc) vs. the percentage of classified cells added in the self-labeled process.*

The accuracy of SSK differs across streets. Dizengoff St. (**Figure 5f**), for example, is correctly classified as Commercial in most runs. Another Commercial street in Tel Aviv, Ibn Gabirol St. (**Figure 5f**), is correctly classified at a lower rate than Dizengoff, while Jabotinsky St. in Ramat-Gan (**Figure 5e**) is mostly classified as Residential instead of Commercial. Analyzing the three streets indicates that they have different characteristics. Dizengoff and Ibn Gabirol have higher commercial densities than Jabotinsky, with many more shops, cafes, and bars. The automobile traffic on these streets is also different. All three have noticeable car traffic, but Ibn Gabirol is a wider road than Dizengoff, and Jabotinsky is much wider than Ibn Gabirol and serves as the main artery that connects several cities to Tel-Aviv. It may be that Jabotinsky is confused with Residential because more residents live there. On Jabotinsky, there are four-story residential buildings (and some 10–20-story ones as well), mainly inhabited by families. In comparison, on Dizengoff and Ibn Gabirol Streets, there are three-story buildings inhabited mostly by young single people. For all these reasons, it is not surprising that these streets are classified differently, as their social functions differ.

In **Figure 6(right)**, we illustrate the accuracy rate through the self-labeled iterations. The figure shows the accuracy rate (Acc) as a function of the percentage of cells that were labeled. After the first iteration, 10% of the cells are classified (5% labeled by ground truth knowledge + 5% classified in the first iteration), and the accuracy rate is high (89%). However, notice that at this stage of the process, 90% of the cells are yet to be classified. Through the process, as more cells are classified, the accuracy rate gradually declines, from 89% after the first iteration to 72% at the end of the process, when all cells are classified. There are two reasons for this. First, in each iteration, incorrect labels (due to erroneous labeling in previous iterations) are added to the training set, causing the quality of the training set to decline. Second, as the iterations go on, the samples added to the training set are those that the algorithm was the least confident of in previous iterations. Notice that we could have stopped the iterations before all the cells were classified. The accuracy rate drops more rapidly in the classification of the last 20% of the cells. Had we stopped the process when 80% of the cells were classified, the accuracy rate would have been 81%. However, in that case, 20% of the cells would have been left unclassified, so this is left as a trade-off for the user.
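One simple way to act on this trade-off is to fix a minimum acceptable accuracy and stop at the largest coverage that still meets it. The sketch below is an illustrative stopping rule, not part of SSK as described; `largest_coverage_meeting` is a hypothetical helper, and the curve contains only the three (coverage, accuracy) points quoted in the text rather than the full curve of **Figure 6(right)**.

```python
def largest_coverage_meeting(curve, min_accuracy):
    """curve: (fraction_classified, accuracy) pairs from the self-labeling run.

    Returns the pair with the largest coverage whose accuracy is at least
    min_accuracy, or None if no point qualifies.
    """
    feasible = [(cov, acc) for cov, acc in curve if acc >= min_accuracy]
    return max(feasible) if feasible else None

# Points quoted in the text: after the first iteration, at 80% coverage,
# and at the end of the process.
curve = [(0.10, 0.89), (0.80, 0.81), (1.00, 0.72)]

stop_at_80 = largest_coverage_meeting(curve, min_accuracy=0.80)
stop_at_70 = largest_coverage_meeting(curve, min_accuracy=0.70)
```

A user who requires at least 80% accuracy would stop at 80% coverage and leave the remaining cells unclassified, whereas a user who tolerates 70% accuracy can classify every cell.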
