### 4.4.3. Selection of T1 and T2

Based on the previous results, two fusion models are selected for the enhancement algorithm: MNL built on all four individual classifiers, and RF built on the combination between this model and MNL. To decide the threshold T1, the relationship between different values of T1 and the prediction rates of the fusion models is examined, as shown in Figure 5. For both models, when T1 is below the crossing point (0.72 in Figure 5(a) and 0.8 in Figure 5(b)), the number of false predictions is higher than the number of correct ones. Thus, 0.72 and 0.8 are selected as T1 for MNL and RF, respectively. T2 is set to 0.9, above which the prediction rate is 69.7% and 66.4% for the two models, respectively.
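The crossing-point rule used to choose T1 can be illustrated with a minimal sketch. It assumes that each prediction comes with its highest class probability and a flag saying whether it was correct, and that probabilities are grouped into equal-width bins; the function names, the bin width and the toy data are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def count_by_bin(
    probs: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> Tuple[List[int], List[int]]:
    """Count correct and false predictions per equal-width probability bin."""
    n_correct = [0] * n_bins
    n_false = [0] * n_bins
    for p, ok in zip(probs, correct):
        # Clamp the index so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        if ok:
            n_correct[idx] += 1
        else:
            n_false[idx] += 1
    return n_correct, n_false


def crossing_threshold(
    probs: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Walk from the highest bin downwards and return the upper edge of the
    first bin in which false predictions are at least as many as correct ones.

    Assumption: this edge is treated as the crossing point. Empty bins are
    skipped, and 0.0 is returned if correct predictions dominate everywhere.
    """
    n_correct, n_false = count_by_bin(probs, correct, n_bins)
    for idx in range(n_bins - 1, -1, -1):
        if n_false[idx] > 0 and n_false[idx] >= n_correct[idx]:
            return (idx + 1) / n_bins
    return 0.0


# Toy data (hypothetical): highest class probability and correctness flag.
toy_probs = [0.55, 0.58, 0.65, 0.68, 0.75, 0.78, 0.85, 0.88, 0.95, 0.97]
toy_correct = [False, False, False, True, True, False, True, True, True, True]
print(crossing_threshold(toy_probs, toy_correct))  # 0.8
```

On real data the bin width would need tuning, since narrow bins with few samples make the crossing point noisy.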

As expected, for the probability Tr(aj|ai), the highest values are dominated by the transitions to either home or work/school activities. With Qr(aj|ai), however, the dominance of these two activities is reduced by their high frequencies, and transitions to other less represented activities are exposed. This can be manifested by the high transitions from home to non-work activities and from social visit to second social visit locations.

Table 6. Transition matrix.<sup>a</sup>

| Transition probability | Activity type | Home | Work/school | Non-work | Social visit | Leisure |
|---|---|---|---|---|---|---|
| Tr | Home | 0.008 | **0.546** | **0.700** | 0.197 | **0.797** |
| | Work/school | **0.883** | 0.328 | 0.300 | **0.701** | 0.153 |
| | Non-work | 0.032 | 0.010 | 0.000 | 0.000 | 0.000 |
| | Social visit | 0.017 | 0.081 | 0.000 | 0.080 | 0.051 |
| | Leisure | 0.061 | 0.036 | 0.000 | 0.022 | 0.000 |
| Qr | Home | 0.002 | **0.159** | **0.204** | 0.057 | **0.232** |
| | Work/school | 0.060 | 0.022 | 0.020 | 0.047 | 0.010 |
| | Non-work | **0.066** | 0.019 | 0.000 | 0.000 | 0.000 |
| | Social visit | 0.023 | 0.114 | 0.000 | **0.113** | 0.072 |
| | Leisure | 0.059 | 0.035 | 0.000 | 0.021 | 0.000 |

<sup>a</sup> The row and column represent the current and previous activities respectively; the maximum probability for each column is in bold.

### 4.4.2. Activity distribution at different time

The activity distribution is also differentiated between weekdays, weekend and holidays. The weekday distribution at each hour P(aj|t) is shown in Figure 4(a) and the distribution of the

Figure 4. Absolute activity distribution (a) and relative activity distribution at each hour (b).

### 4.4.4. Enhancement results


Table 7 presents the prediction results of the enhancement algorithm (column 'After'), along with the results before the enhancement (column 'Before') and the difference between the two (column 'Difference'). An overall improvement of 4.4% and 7.6% is achieved for MNL and RF, respectively. Examining the results across activities shows that the enhancement algorithm brings particular gains for less represented activity types, e.g. non-work obligatory, social visit and leisure activities. This may stem from the fact that machine learning algorithms usually favour majority types when prediction accuracy is used as the evaluation criterion, whereas the enhancement algorithm puts equal weight on all activity types of the dependent variable (call locations).

The effectiveness of each of the two enhancement methods is also investigated, by running the RF fusion model using each of these methods independently to revise a weak prediction result.

Figure 5. Relation between the prediction and the probabilities from MNL (a) and RF (b) fusion models, respectively.


Table 7. Comparison of prediction results before and after the enhancement algorithm (%).

Prediction rates of 73.7% and 75.2% were obtained for the transition probability-based and prior probability-based enhancement methods, respectively. Owing to the small size of the training set, many labelled locations are the single known activity of a day, so sequential information is not available on these days. With a larger dataset, the transition matrix would better represent the typical activity and travel behaviour of users. This would allow the transition probability-based method, and the enhancement algorithm as a whole, to bring greater improvement over the current experimental results.
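The idea of revising a weak prediction with the transition matrix can be sketched as follows. The sketch makes two assumptions that are not stated in this section: a prediction counts as weak when its highest class probability is below T1, and a weak prediction is re-scored by multiplying each class probability by the transition probability from the previous activity. The function name and the class probabilities are hypothetical; the Tr values are the Home column of Table 6.

```python
from typing import Dict

# Tr(aj | ai = Home): the "Home" column of Table 6 (previous activity = Home).
TR_GIVEN_HOME: Dict[str, float] = {
    "Home": 0.008,
    "Work/school": 0.883,
    "Non-work": 0.032,
    "Social visit": 0.017,
    "Leisure": 0.061,
}


def revise_if_weak(
    class_probs: Dict[str, float],
    transition_probs: Dict[str, float],
    t1: float,
) -> str:
    """Return the predicted activity, revising it only when it is weak.

    Assumption (not from the source): "weak" means the highest class
    probability is below t1, and revision multiplies each class probability
    by the matching transition probability before taking the maximum.
    """
    best = max(class_probs, key=class_probs.get)
    if class_probs[best] >= t1:
        return best  # strong prediction: keep the classifier's choice
    scores = {a: p * transition_probs.get(a, 0.0) for a, p in class_probs.items()}
    return max(scores, key=scores.get)


# Hypothetical fusion-model output for a location visited after Home.
weak_probs = {
    "Home": 0.40,
    "Work/school": 0.35,
    "Non-work": 0.10,
    "Social visit": 0.10,
    "Leisure": 0.05,
}
print(revise_if_weak(weak_probs, TR_GIVEN_HOME, t1=0.8))  # Work/school
```

The same function could in principle be called with hourly prior probabilities P(aj|t) in place of the transition probabilities; whether the original algorithm combines the evidence in this multiplicative way is not stated here.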
