## **2.1 Improvements in activity-based travel demand modeling**

For more than half a century, transportation planners have tried to understand how individuals schedule their activities and travel in order to improve urban mobility and accessibility. The evolution of travel demand modeling from trip-based to activity-based approaches highlighted the need for high-resolution databases that include the sociodemographic and economic attributes of individuals as well as their travel characteristics. Today, with rapid advances in computation, technology, and applications, intelligent transportation systems (ITS) have revolutionized the analysis of travel behavior by providing more accurate data, removing human errors, and making use of the vast amount of available data [42]. Tools such as GPS devices, smartphones, smart cards, and social networking sites all have the potential to track the movements and activities of individuals by recording and retaining the relevant data continuously over time. Traditional travel survey data are mostly rich in detail; however, they can result in biased travel demand models because of incomplete self-reports and inaccurate scheduling patterns. Therefore, this section introduces the common tools used to collect big data and discusses the progress made in extracting information from big data sources.

### *2.1.1 Cell phone data*

A call detail record (CDR) is a data record produced by a telephone exchange; it contains spatiotemporal information on recent system usage [40] and can therefore be used to track people's movements. CDR data can be processed and applied in activity-based travel demand modeling to better understand human mobility and to obtain more accurate origin-destination (OD) tables [43]. The first attempt to use CDR data was the study of Caceres et al. [44], who applied mobile phone data to
generate OD matrices. Their concept was then formalized by Wang et al. [45], who obtained transient OD matrices by counting a trip for each pair of consecutive calls made from two different cell towers within the same hour. The OD trips were then assigned to the road network using a shortest path algorithm. In the area of urban activity recognition, Farrahi et al. [46] applied two probabilistic methods, Latent Dirichlet Allocation (LDA) and the Author Topic Model (ATM), to cluster CDR trajectories according to their temporal aspects and thereby discover home and work activities. Considering the spatial aspect of CDR data, Phithakkitnukoon et al. [47] used auxiliary land use data and a geographical information database to find possible activities around a given cell tower. Considering both the temporal and spatial aspects of CDR data, Widhalm et al. [48] used an undirected relational Markov network to infer urban activities. They extracted activity patterns for Boston and Vienna by analyzing cell phone data (activity time, duration, and land use). Their results showed that the trip sequence patterns and activity scheduling observed in the datasets were compatible with city surveys, and that the generated activity clusters were stable across time. In a more recent study [49], an unsupervised generative state-space model was applied to extract user activity patterns from CDR data. Furthermore, it has been shown that the method of CDR sampling is as significant as survey sampling. For example, one study [50] used CDR and survey data collected over a period of six months to investigate daily mobility in Paris and Chicago. The results showed that 90% of the travel patterns observed in both surveys were compatible with the phone data. In a similar study [51], a probabilistic induction was proposed that uses motifs (daily mobility networks), time-of-day activity sequences, and land use classification to produce activity types. Jiang et al. [52] used CDR data from Singapore to produce activity-based human mobility patterns.
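
The counting rule attributed to Wang et al. [45] can be illustrated with a minimal sketch. It assumes a hypothetical record layout of (user, timestamp, tower) and treats two consecutive calls by the same user as one trip when they come from different towers within the same clock hour. The field names, the sample data, and this reading of "same hour" are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter
from datetime import datetime

def transient_od(records):
    """Count trips between towers from consecutive calls of the same user.

    records: iterable of (user_id, timestamp, tower_id) tuples (hypothetical layout).
    Returns a Counter keyed by (hour_start, origin_tower, destination_tower).
    """
    od = Counter()
    last_call = {}  # user_id -> (timestamp, tower_id) of the previous call
    for user, ts, tower in sorted(records, key=lambda r: (r[0], r[1])):
        prev = last_call.get(user)
        if prev is not None:
            prev_ts, prev_tower = prev
            hour = ts.replace(minute=0, second=0, microsecond=0)
            same_hour = prev_ts.replace(minute=0, second=0, microsecond=0) == hour
            # A trip is counted only for a tower change within one clock hour.
            if same_hour and prev_tower != tower:
                od[(hour, prev_tower, tower)] += 1
        last_call[user] = (ts, tower)
    return od

calls = [
    ("u1", datetime(2020, 1, 6, 8, 5), "A"),
    ("u1", datetime(2020, 1, 6, 8, 40), "B"),  # A -> B within the 08:00 hour
    ("u1", datetime(2020, 1, 6, 9, 30), "C"),  # different hour, not counted
    ("u2", datetime(2020, 1, 6, 8, 10), "A"),
    ("u2", datetime(2020, 1, 6, 8, 50), "B"),  # A -> B within the 08:00 hour
    ("u2", datetime(2020, 1, 6, 8, 55), "B"),  # same tower, not a trip
]
od_matrix = transient_od(calls)
```

In this toy data set the only non-zero cell is the A to B pair for the 08:00 hour. Assigning the resulting trips to a road network with a shortest path algorithm would be a separate step that is not shown here.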

In the context of activity-based transport modeling, Zilske et al. [53] replaced travel diaries with CDRs as input data for agent-based traffic simulation. They first generated synthetic CDR data and then used the MATSim simulation software to treat every observed person as an agent and convert call information into activities. In their next paper [54], they fused the CDR data set with traffic counts to reduce the spatiotemporal uncertainty.

In summary, the findings reported in different studies indicate the major implications of mobile phone records for the estimation of travel demand variables, including travel time, mode and route choice, OD demand, and traffic flow. In practice, however, the information generated from CDR data has yet to be widely used in simulation models. This is mainly because of conflicts between model and data in either the level of resolution or the format and completeness [55].

### *2.1.2 Smart card data*

Smart card systems with boarding and alighting information have gained much popularity in large public transport systems all over the world and have become a new source of data for understanding and identifying the spatiotemporal travel patterns of individual passengers. Smart card data have been investigated in various studies on topics such as activity identification, scheduling, agent-based transport models, and simulation [56]. In other studies [57–59], smart card data were used as an analysis tool for investigating passenger movements, city structure, and city area functions. Similarly, a recent study [60] introduced a visual analysis system called PeopleVis to examine smart card data (SCD) and predict the travel behavior of each passenger. The authors used one week of SCD from the city of Beijing and found a group of "familiar strangers" who did not know each other but had many similarities in their trip choices. Zhao et al. [61] also investigated the group
behavior of metro passengers in Zhechen by applying a data mining procedure. After patterns were extracted from the smart card transaction data, statistical and clustering-based methods were applied to detect the passengers' travel patterns. The results showed that a temporally regular passenger is very likely to be a spatially regular passenger as well. The disaggregated nature of smart card data makes them a suitable input to multi-agent simulation frameworks. For example, smart card data were used to generate activity plans and to implement an agent-based microsimulation of public transport in the two cities of Amsterdam and Rotterdam [62]. An agent-based transport simulation was also developed for Singapore's public transport using the MATSim environment [63]; unlike Bouman's study, it considered the interaction of public transport with private vehicles. The study of Fourie et al. [64] was another work that demonstrated the possibility of integrating big data algorithms with agent-based transport models. Zhu [65] compared one week of smart card transaction data from Shanghai and Singapore and found it feasible to generate continuous transit use profiles for different types of cardholders. However, to better understand travel patterns and activity behavior, smart card data should be integrated with other data sets.
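
As one illustration of how disaggregate smart card transactions might feed an agent-based model, the sketch below turns a passenger's tap-in and tap-out records into a simple activity chain. It treats the time between alighting and the next boarding as an activity at the alighting stop. The record layout, the 30-minute minimum duration, and the stop names are assumptions made for this example; none of the cited studies is claimed to use this exact rule.

```python
from datetime import datetime, timedelta

def activity_chain(journeys, min_duration=timedelta(minutes=30)):
    """Infer activities from gaps between consecutive transit journeys.

    journeys: list of (board_time, board_stop, alight_time, alight_stop)
              for one cardholder (hypothetical layout).
    Returns a list of (stop, start, end) tuples, one per inferred activity.
    """
    ordered = sorted(journeys, key=lambda j: j[0])
    activities = []
    for current, following in zip(ordered, ordered[1:]):
        _, _, alight_time, alight_stop = current
        next_board_time = following[0]
        # A long enough stay between alighting and the next boarding is
        # treated as an activity; shorter gaps are treated as transfers.
        if next_board_time - alight_time >= min_duration:
            activities.append((alight_stop, alight_time, next_board_time))
    return activities

day = datetime(2020, 1, 6)
card_journeys = [
    (day.replace(hour=7, minute=30), "Home St",
     day.replace(hour=7, minute=50), "Interchange"),
    (day.replace(hour=7, minute=55), "Interchange",
     day.replace(hour=8, minute=20), "Office St"),
    (day.replace(hour=17, minute=30), "Office St",
     day.replace(hour=18, minute=15), "Home St"),
]
plan = activity_chain(card_journeys)
```

Here the five-minute gap at the interchange is treated as a transfer, and the long stay at "Office St" becomes the single inferred activity. Labeling the activity type (for example, work) would require additional data such as land use, in line with the point above about integrating smart card data with other data sets.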

### *2.1.3 GPS data*

In travel demand modeling, it is important to have accurate and complete travel survey data, including trip purpose, length, companions, travel demand, origin and destination, and time of day. Since the 1990s, the global positioning system (GPS) has become popular in civil engineering applications, especially in the field of transportation, as it provides a means of tracking some of these variables. In the literature, methods for processing GPS data and identifying activities can be classified by approach, such as rule-based and Bayesian models [66], fuzzy logic [67], multilayer perceptrons [68], and support vector machines [69]. Nevertheless, the disadvantages of using GPS data include the cost, the limited sample size, and the need to distribute GPS devices to participants and retrieve them afterward. Since smartphones have become an everyday accessory and are equipped with a GPS module, they can be considered a replacement for dedicated GPS devices in gathering travel data. In this regard, CDR data from smartphones have been used to estimate origin-destination matrices [70], and a smartphone-based application has been used to map the semiformal minibus services in Kampala (Uganda) [71] and to count passenger boarding and alighting [72]. In the Netherlands, the Mobidot application was developed for analyzing the mobility patterns of individuals. To deduce travel directions and modes, this application compares real-time data gathered by smartphone sensors, including GPS, accelerometer, and gyroscope sensors, with existing databases [73].
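
To make the rule-based family of methods more concrete, the following minimal sketch detects stops in a GPS trace using a distance threshold and a minimum dwell time. It is a generic illustration and not the procedure of [66]; the 100 m radius, the five-minute dwell time, the use of projected coordinates in metres, and the sample trace are all assumptions.

```python
from datetime import datetime, timedelta
from math import hypot

def detect_stops(points, radius_m=100.0, min_dwell=timedelta(minutes=5)):
    """Rule-based stop detection on a time-ordered GPS trace.

    points: list of (timestamp, x, y) with x, y in metres (projected coordinates).
    A stop is a run of points that stays within radius_m of the run's first
    point for at least min_dwell. Returns (start, end, x, y) tuples.
    """
    stops = []
    i = 0
    while i < len(points):
        t0, x0, y0 = points[i]
        j = i
        # Extend the run while the next point stays close to the anchor point.
        while j + 1 < len(points) and hypot(points[j + 1][1] - x0,
                                            points[j + 1][2] - y0) <= radius_m:
            j += 1
        if points[j][0] - t0 >= min_dwell:
            stops.append((t0, points[j][0], x0, y0))
        i = j + 1
    return stops

t = datetime(2020, 1, 6, 8, 0)
trace = [
    (t, 0, 0),
    (t + timedelta(minutes=2), 20, 10),
    (t + timedelta(minutes=6), 30, 5),      # still near the first point
    (t + timedelta(minutes=8), 600, 0),     # in motion
    (t + timedelta(minutes=10), 1200, 0),   # arrival at a second location
    (t + timedelta(minutes=12), 1210, 5),
    (t + timedelta(minutes=20), 1205, 0),
]
stops = detect_stops(trace)
```

With these thresholds the trace yields two stops, one at the origin and one near x = 1200 m. The detected stops would still need to be matched to land use or points of interest before an activity type can be assigned.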

Using smartphones as a replacement for GPS devices, however, has several restrictions, including the draining of the smartphone battery and the inability to record travel mode and purpose.

### *2.1.4 Social media data*

Today, transport modelers, planners, and managers have started to benefit from the popularity of social networking data. There are different kinds of social media data, such as Twitter, Instagram, and LinkedIn data, which consist of plain text, hashtags (#), and check-in data. Because hashtags and check-ins are related to an activity, location, or event, they can serve as meaningful resources for analyzing the origin or destination of an activity [74]. According to the literature, social media has a great influence on different aspects of travel demand modeling [75]. The use of social media instead of traditional data collection methods has been investigated in different studies [76]. Processing these data to extract useful information is challenging, as several studies have shown [77, 78]. Various studies [79–82] have also examined social media data to understand the mobility behavior of large groups of people. The possibility of estimating the origin-destination matrix from location-based social data was tested in [83], and similar studies [84, 85] used Twitter data to estimate OD matrices. A comparison between these new OD matrices and the traditional values produced by the four-step model showed the great potential of social media data for modeling aggregate travel behavior. Social media data can also be used in other areas, such as destination choice modeling [86], activity recognition [87], understanding activity choice patterns [80, 88, 89], and interpreting lifestyle behaviors by studying activity-location choice patterns [90].

*Recent Progress in Activity-Based Travel Demand Modeling: Rising Data and Applicability*

*DOI: http://dx.doi.org/10.5772/intechopen.93827*

## **2.2 Dynamic ABM using a multi-day travel data set**

Most travel demand modelers have used household survey data covering a single day to construct activity schedules. However, longer periods such as one week or one month have gained substantial importance in recent years. For simulating everyday travel behavior and generating schedules, a one-week period provides more comprehensive coverage because it includes both weekdays and weekends and represents the weekly routines of individuals in making trips. Periods longer than one week can provide further detail on personal behavior and on how the use of different modes varies. So far, only a few travel demand models have covered a typical week as the study period. For example, the rhythm in activity-travel behavior over one week was presented using a Kuhn-Tucker method [41]. Few works have concentrated on generating multi-day travel data sets. For example, using large data sets and surveys, Medina developed two discrete choice models for generating multi-day travel activity types based on the likelihood of the activity [91]. A sampling method based on clustering activity-travel pattern types [92] was proposed to derive multi-day activity-travel data from single-day household travel data; the results showed similar distributions of intrapersonal variability in the multi-day and single-day data. MATSim is a popular agent-based simulation for ABM research [93, 94]; however, it is not well suited to modeling multi-day scenarios because it uses a coevolutionary algorithm to reach user equilibrium, which is time consuming, particularly for multi-day plans. To address this problem, Ordonez [95] proposed differentiating between fixed and flexible activities. Based on different time scales, Lee examined three levels of travel behavior dynamics, namely micro-dynamics (24 hours), meso-dynamics (weekly, monthly, or yearly basis), and macro-dynamics (lifelong travel behavior), by applying different statistical models [96]. A day-by-day learning module has been proposed for another agent-based simulation software, SimMobility [97]. Furthermore, ADAPTS is one of the few activity-based travel demand models that relies on activity planning horizon data for a period longer than one day, for example one week or one month [33].

As the above literature review highlights, one-day observation data provide an inadequate basis for understanding complex travel behavior and for predicting the impact of travel demand management strategies, so multi-day data are needed to refine this process. Collecting multi-day data was previously difficult; today, however, thanks to advances in technology, data can be extracted from GPS devices, smartphones, smart cards, and similar sources with no burden on the respondent. Models built on GPS data have been found to be more accurate and precise because they have fewer measurement errors. Call detail records collected from mobile phones provide modelers with large trip samples and origin-destination matrices, while smart card data are more useful for validation.

*Models and Technologies for Smart, Sustainable and Safe Transportation Systems*
