**3. Iterative proportional updating approach**

The iterative proportional updating (IPU) approach is a heuristic developed by Ye et al. [19] to address the drawbacks of the IPF approach, specifically its lack of control for individual-level attributes and for joint distributions of personal characteristics. The IPU algorithm matches both household- and individual-level attributes in a computationally efficient manner by iteratively adjusting and reallocating weights among households of a specific type until both sets of attributes are matched. Another advantage of the IPU approach is its practicality in terms of implementation and computation: it has been described in 23 computational steps that can be coded in most, if not all, programming languages. Eq. (2) represents the mathematical optimization problem addressed by the IPU approach:

$$\text{Minimize} \quad \sum_{j} \left( \frac{\sum_{i} d_{ij} w_i - c_j}{c_j} \right)^2 \quad \text{or} \quad \sum_{j} \frac{\left( \sum_{i} d_{ij} w_i - c_j \right)^2}{c_j} \quad \text{or} \quad \sum_{j} \frac{\left| \sum_{i} d_{ij} w_i - c_j \right|}{c_j} \tag{2}$$

Subject to $w_i \geq 0$

where $i$ denotes a household ($i = 1, 2, \ldots, n$); $j$ denotes the constraint or population characteristic of interest ($j = 1, 2, \ldots, m$); $d_{ij}$ is the frequency of population characteristic (household or person type) $j$ in household $i$; $w_i$ is the weight attributed to the $i$th household; and $c_j$ is the value of population characteristic $j$.
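To make Eq. (2) concrete, the following Python sketch evaluates the three alternative objectives and runs a simplified weight-adjustment loop in the spirit of IPU. It is not the 23-step procedure of Ye et al. [19]: the function names `fit_statistics` and `ipu`, the toy data, the starting weights of one, the stopping rule, and the choice to skip a constraint whose weighted total is zero are all assumptions made for illustration. The sketch also assumes every $c_j$ is strictly positive.

```python
from typing import List, Sequence, Tuple


def fit_statistics(
    d: Sequence[Sequence[float]], w: Sequence[float], c: Sequence[float]
) -> Tuple[float, float, float]:
    """Evaluate the three alternative objectives of Eq. (2) for weights w.

    Assumes every c[j] > 0 (hypothetical restriction for this sketch).
    """
    totals = [sum(d[i][j] * w[i] for i in range(len(w))) for j in range(len(c))]
    squared_relative = sum(((t - cj) / cj) ** 2 for t, cj in zip(totals, c))
    chi_square_like = sum((t - cj) ** 2 / cj for t, cj in zip(totals, c))
    absolute_relative = sum(abs(t - cj) / cj for t, cj in zip(totals, c))
    return squared_relative, chi_square_like, absolute_relative


def ipu(
    d: Sequence[Sequence[float]],
    c: Sequence[float],
    max_sweeps: int = 2000,
    tol: float = 1e-10,
) -> List[float]:
    """Simplified IPU-style loop: rescale weights one constraint at a time."""
    n, m = len(d), len(c)
    w = [1.0] * n  # assumed starting weights
    for _ in range(max_sweeps):
        for j in range(m):
            weighted_total = sum(d[i][j] * w[i] for i in range(n))
            if weighted_total == 0:
                continue  # assumption: no household carries attribute j, so skip
            ratio = c[j] / weighted_total
            for i in range(n):
                if d[i][j] > 0:  # only households contributing to constraint j
                    w[i] *= ratio
        # Stop once the summed absolute relative error is negligible.
        if fit_statistics(d, w, c)[2] < tol:
            break
    return w


# Hypothetical toy data: 4 households; columns are household type A,
# household type B, person type 1, person type 2.
d = [
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 1],
]
c = [3, 5, 5, 6]  # hypothetical control totals

weights = ipu(d, c)
print([round(x, 4) for x in weights])
print(fit_statistics(d, weights, c))
```

In this toy system the four constraints admit a single weight vector, (2, 1, 3, 2), so the loop has only one target to approach. Real applications generally have many more households than constraints. As noted in the text, convergence is not guaranteed in general.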

Furthermore, Ye et al. [19] proposed an alternative method to address the zero-cell problem that undermined the practicality of IPF. Their method borrows the prior information for the zero cells from PUMS data for the entire region, where zero cells are unlikely to exist as long as the control variables of interest and their categories are defined appropriately. However, that method carries the inherent risk of overrepresenting the demographic group of interest. Despite their attempt to overcome the zero-cell problem, the researchers could not overcome the zero-marginal problem, which arises when a certain attribute does not exist among the households of a geographic area, for example, when a census block or tract has no low-income households. Furthermore, a review by Müller and Axhausen [3] pointed to the lack of a theoretical proof of convergence.
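The zero-cell borrowing idea can be illustrated with a minimal Python sketch. It is a deliberate simplification, not the procedure of Ye et al. [19]. The function name `borrow_zero_cells`, the sample counts, and the rule of filling each zero cell with the regional share and then renormalizing are all assumptions. The sketch also assumes both samples are non-empty.

```python
from typing import List, Sequence


def borrow_zero_cells(
    local_counts: Sequence[float], regional_counts: Sequence[float]
) -> List[float]:
    """Build a prior for a small area, borrowing regional shares for zero cells.

    Assumption: a cell observed locally keeps its local share; a zero cell
    takes the region-wide share; the result is renormalized to sum to one.
    """
    local_total = sum(local_counts)
    regional_total = sum(regional_counts)
    prior = []
    for local, regional in zip(local_counts, regional_counts):
        if local > 0:
            prior.append(local / local_total)
        else:
            prior.append(regional / regional_total)  # borrowed from the region
    scale = sum(prior)
    return [p / scale for p in prior]


# Hypothetical sample counts for three household categories.
local_sample = [4, 0, 6]       # the small area has no sampled household in cell 2
regional_sample = [30, 10, 60]  # the region-wide sample has no zero cells

print(borrow_zero_cells(local_sample, regional_sample))
```

In this toy case the category never observed locally receives roughly 9% of the prior mass. This illustrates how borrowing can overrepresent a group that may be rare or absent in the small area.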

*Transportation Systems Analysis and Assessment*

not account for individual-level controls. The study concluded that the improved fit comes at no cost to the fit against household-level controls. However, the developed methodology was never tested for synthesizing commercial- or business-related agents.

Lee and Fu [12] observed that the IPF-based population synthesis approaches, specifically the original synthetic reconstruction method [6] and the complementary combinatorial optimization method [13], are not generally applicable to all population synthesis scenarios. Based on a comparison by Ryan et al. [14], Lee and Fu [12] concluded that the combinatorial optimization method produces more accurate demographic information for populations over a small area and that the population synthesis problem should be evaluated from an optimization point of view. In addition, they explored how the estimation of a multiway demographic table can be formulated and solved as a constrained optimization problem in full consideration of both household- and individual-level attributes. Accordingly, that study tackled the inconsistency problem through an approach based on minimum cross-entropy theory. The validity of the model was confirmed through a case study in Singapore, in which results for a study area of 10,641 households were superior to those of conventional IPF approaches. However, Lee and Fu [12] did not provide a full-scale application, which limits the applicability of their model to theoretical applications only.

Zhu and Ferreira [15] were motivated by the inability of the standard IPF algorithm to fit marginal constraints on multiple agent types simultaneously. Hence, they developed a two-stage population synthesizer that used IPF in the first stage and then estimated the spatial pattern of household-level attributes through a second-stage IPF-based approach. Their two-stage algorithm consisted of four distinct steps. The first step involved developing an estimated joint distribution of household- and individual-level attributes. In the second step, households and individuals were drawn from microdata samples. The third step consisted of a conventional IPF with household type and parcel capacity marginal constraints. The fourth and last step estimated the marginal distribution of other attributes from the fitted model. To validate their approach, Zhu and Ferreira [15] generated a synthetic population for Singapore. Their evaluation involved four comparisons, namely, fitting only for household-level constraints, fitting for both household- and individual-level constraints, allocating households to buildings while constraining building capacity, and repeating the previous comparison with income level constrained. Validation results yielded realistic spatial heterogeneity while preserving some of the joint distribution of household and locational characteristics.

Choupani and Mamdoohi [16] addressed the integerization problem, in which IPF yields non-integer values rather than integers, for example, fractions of household- or individual-level attributes for zones. To do so, they proposed a binary linear programming model for tabular rounding in which the integerized table totals and marginals perfectly fit the input data obtained from the Census Bureau. The main advantages of tabular rounding were that it did not bias the joint or marginal distributions of socioeconomic attributes of minority demographic groups and that it minimized the distortion to the correlation structure of household- and individual-level non-integer tables. Furthermore, the tabular rounding approach outperformed all eight other rounding approaches. In addition, sensitivity analysis of tabular rounding demonstrated that small and large values are equally significant when it comes to integerization. Their findings were confirmed by a comprehensive literature review [17] that they performed 1 year later, which concluded that IPF is the most feasible approach for synthesizing populations for agent- and activity-based transportation models, once integer conversion and zero-cell issues were

