**3. Empirical analysis**

To estimate the impact of infrastructure on smart cities we consider the following econometric model:

$$\text{Urbanization} = \alpha_0 + \sum_{i=1}^{10} \beta_i \, \text{Infrastructure}_i + \varepsilon_i \tag{1}$$

where $\varepsilon_i$ represents a well-behaved error term and $\alpha_0$ is the constant. The ordinary least squares (OLS) method is used to analyze the impact of infrastructure on urbanization in India. Following Tripathi [17, 18], city population size, city population density, and city population growth rate are used to measure urbanization in this paper. City-level availability of infrastructure is measured by total road length, number of latrines, water supply capacity, number of electricity connections, and the numbers of hospitals, schools, colleges, universities, banks, and credit societies.

In the context of the positive impact of infrastructure on urbanization, Tiebout [19] indicated that the accessibility and quality of public facilities such as parking facilities, police protection, roads, parks, and municipal golf courses are very important in choosing a municipality. Consumer-voters would therefore migrate to a city that satisfies their demand for infrastructure. The Harris and Todaro [20] model explained that rural-to-urban migration depends on the expected rural–urban income differential rather than on rural–urban wages. This indicates that urban conditions are better where infrastructure facilities are greater, which attracts more rural people [18].

In the context of India, several studies (e.g., [21–24]) argued that India's urban areas lack adequate infrastructure, which requires urgent attention. Pradhan [25] investigated the impact of infrastructure on urbanization in India, using a composite infrastructure development index based on three sub-indexes: physical infrastructure, social infrastructure, and financial infrastructure. Using multivariate principal component analysis, the study confirmed that infrastructure has a significant positive impact on urbanization in India. In contrast, Tripathi [18] argued that improving infrastructure in large cities may not increase population concentration, but will improve living conditions and business activities, which increases economic growth potential. Based on these studies, we expect the effect of infrastructure on urbanization driven by smart city development to be either positive or negative.

Details of variable measurement and data sources are provided in Appendix A. **Table 1** presents the summary statistics of each variable used in the analysis. The coefficient of variation (CV) measures the dispersion of the data points in a series relative to its mean. The log of city population, city population density, and the city-wise total number of colleges have lower CV values, which indicates little variation around their means and implies a more symmetrical distribution. This is not the case for the city-wise total number of credit societies, total water supply capacity, total number of banks, and total number of latrines.
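The CV comparison can be sketched in a few lines. This is a minimal illustration, assuming the sample standard deviation (`ddof=1`) is used; the function name `coefficient_of_variation` and the two short data series are hypothetical and are not taken from **Table 1**.

```python
import numpy as np

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    x = np.asarray(values, dtype=float)
    return x.std(ddof=1) / x.mean()

# Hypothetical city-level values, invented purely for illustration.
log_population = [12.1, 12.8, 13.4, 14.0, 15.2]
credit_societies = [3, 10, 45, 260, 1200]

cv_log_pop = coefficient_of_variation(log_population)
cv_credit = coefficient_of_variation(credit_societies)

print(f"CV(log population)   = {cv_log_pop:.3f}")
print(f"CV(credit societies) = {cv_credit:.3f}")
```

A series measured in logs, like the first one, typically shows a much smaller CV than a heavily skewed count variable like the second.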

**Table 2** shows the raw correlations among the variables. The log of city population is positively associated with all the infrastructure variables. Most importantly, it is highly correlated with city-wise road length, the city-wise total number of latrines, the city-wise total number of electricity connections, and the city-wise total number of schools. In contrast, the correlation between city population density and the infrastructure variables is not strong. Similar results are obtained for the correlation between the city population growth rate and the infrastructure variables.

We now investigate the impact of infrastructure on urbanization. Following Tripathi [17, 18], we use city population, density, and growth rate to measure urbanization. A total of 10 variables measure infrastructure and serve as the independent variables. **Table 1** shows considerable variation between the minimum and maximum values of the variables. The correlation coefficients show that the data are more correlated as the values increase. Hence, factor analysis is used to reduce the number of independent variables and obtain an appropriate estimation.


#### *Does Smart City Development Promote Urbanization in India? DOI: http://dx.doi.org/10.5772/intechopen.94568*

#### **Table 1.**

*Description of data used for the analysis.*

To check the suitability of the data for factor analysis, the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity are used. The KMO test is performed using STATA version 14.1. The estimated results in **Table 3** show that factor analysis is highly recommended, as the KMO value is 0.851. Bartlett's test of sphericity is highly significant (p-value 0.000 < 0.01). Thus, factor analysis is desirable.
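Both diagnostics can be computed directly from the correlation matrix. The sketch below is illustrative only: it uses the textbook formulas rather than the STATA routine, the function names `bartlett_sphericity` and `kmo_overall` are hypothetical, and the data are synthetic (100 observations of six variables sharing one common factor), not the study's city data.

```python
import numpy as np

def bartlett_sphericity(data):
    """Bartlett's chi-square statistic and degrees of freedom for H0: R = I."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, logdet = np.linalg.slogdet(corr)  # log-determinant of R, always <= 0
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi_square, dof

def kmo_overall(data):
    """Overall KMO: squared correlations relative to squared correlations
    plus squared partial correlations (off-diagonal entries only)."""
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)  # partial correlations given all others
    off = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)

# Synthetic data: one common factor plus independent noise (assumption).
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
data = common + 0.5 * rng.normal(size=(100, 6))

chi_square, dof = bartlett_sphericity(data)
kmo = kmo_overall(data)
print(f"Bartlett chi-square = {chi_square:.1f} on {dof} degrees of freedom")
print(f"Overall KMO = {kmo:.3f}")
```

The p-value of Bartlett's test is the upper-tail probability of a chi-square distribution with `dof` degrees of freedom; if SciPy is available, `scipy.stats.chi2.sf(chi_square, dof)` returns it.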

The initial eigenvalues (i.e., the variance explained by each factor) are presented in **Table 4**. The first factor accounts for the most variance, the second factor for the next largest amount, and so on. The negative eigenvalues indicate that the matrix is not of full rank, suggesting that at most six factors can be considered for the analysis. The Kaiser criterion, however, recommends retaining only factors with eigenvalues ≥ 1. Therefore, only the first factor, which accounts for about 86% of the variance in the solution, is relevant for the study.
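The retention rule can be illustrated as follows. This sketch assumes principal-factor extraction, in which the diagonal of the correlation matrix is replaced by squared multiple correlations before the eigenvalues are taken, which is why some eigenvalues turn out negative. Whether the original analysis used exactly this extraction method is an assumption, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def principal_factor_eigenvalues(data):
    """Eigenvalues (descending) of the reduced correlation matrix, whose
    diagonal holds squared multiple correlations instead of ones."""
    corr = np.corrcoef(data, rowvar=False)
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))
    reduced = corr.copy()
    np.fill_diagonal(reduced, smc)
    return np.linalg.eigvalsh(reduced)[::-1]

# Synthetic data with a single common factor (assumption, not study data).
rng = np.random.default_rng(0)
common = rng.normal(size=(100, 1))
data = common + 0.5 * rng.normal(size=(100, 6))

eigenvalues = principal_factor_eigenvalues(data)
n_retained = int(np.sum(eigenvalues >= 1.0))  # factors with eigenvalue >= 1
share = eigenvalues / eigenvalues.sum()       # proportion of total variance

print("Eigenvalues:", np.round(eigenvalues, 3))
print("Factors retained:", n_retained)
print("Share of first factor:", round(float(share[0]), 3))
```

Because the negative eigenvalues enter the denominator, the share attributed to the first factor under this convention can differ noticeably from the share computed on an unreduced correlation matrix.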

#### **Table 2.**

*Correlation coefficient of the variables used for the analysis.*

#### **Table 3.**

*KMO and Bartlett's test.*

#### **Table 4.**

*Explanation of total variance.*

#### **Table 5.**

*Factor loadings (pattern matrix) and unique variances for one factor model.*

The factor loadings (pattern matrix) and the uniqueness of each variable, i.e., the share of a variable's variance that is exclusive to it and not shared with other variables, are presented in **Table 5**. Larger uniqueness values indicate that a variable is not well explained by the factors. For instance, 93.3% of the variance in 'total credit society' is not shared with the other variables in the overall factor model. In contrast, only a very small share (14.6%) of the variance in the 'total number of latrines' is not shared with other variables. As the factor loadings of nearly all variables are high (>0.3), we can conclude that factor 1 is defined by all six variables that are considered to produce an infrastructure index. Most importantly, factor 1 is most strongly related to the city-wise number of electricity connections and the city-wise number of latrines. Note also that, because only one factor is used, factor rotation, which helps reveal the underlying dimensions (scales) more clearly, is not applicable: there is nothing to rotate.
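The link between loading and uniqueness can be made explicit. As a worked sketch, assume standardized variables and the usual single-factor definition, in which uniqueness equals one minus the squared loading (the symbols $\lambda_j$ for the loading and $u_j$ for the uniqueness of variable $j$ are introduced here for illustration only):

$$u_j = 1 - \lambda_j^2 \quad\Longrightarrow\quad |\lambda_j| = \sqrt{1 - u_j}$$

Under that assumption, the two uniqueness values quoted above would imply

$$|\lambda_{\text{latrines}}| = \sqrt{1 - 0.146} \approx 0.924, \qquad |\lambda_{\text{credit society}}| = \sqrt{1 - 0.933} \approx 0.259$$

These magnitudes are derived arithmetic, not figures read from **Table 5**. They are consistent with the number of latrines being among the variables most strongly related to the factor and with the credit society variable falling below the 0.3 threshold.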

Linear regression analysis is used to investigate the impact of infrastructure on urbanization in India. **Table 6** presents the results of the regression analysis. The factor scores for the one selected factor are used as the independent variable. Regression models 1–5 present the estimated results for three dependent variables, i.e., the size, growth, and density of city populations. To address heteroscedasticity, we estimate robust standard errors.
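A regression of this form with robust standard errors can be sketched as follows. The example is a minimal illustration, assuming the HC1 variant of the heteroscedasticity-robust (sandwich) estimator; which variant the original estimates rely on is not stated. The function name `ols_robust`, the coefficient values, and the synthetic factor scores are all hypothetical.

```python
import numpy as np

def ols_robust(y, x):
    """OLS with an intercept and HC1 heteroscedasticity-robust standard errors."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich "meat": sum over observations of e_i^2 * x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)  # HC1 small-sample scaling
    return beta, np.sqrt(np.diag(cov))

# Synthetic data: the noise variance grows with the regressor (assumption).
rng = np.random.default_rng(1)
factor_score = rng.normal(size=100)
noise = 0.2 * (1 + np.abs(factor_score)) * rng.normal(size=100)
log_population = 13.0 + 0.7 * factor_score + noise

beta, robust_se = ols_robust(log_population, factor_score)
print("Intercept and slope:", np.round(beta, 3))
print("Robust standard errors:", np.round(robust_se, 3))
```

The point estimates are identical to plain OLS; only the standard errors, and hence the significance levels, change when the error variance is not constant.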

Regression 1 shows that the infrastructure index has a positive and statistically significant effect on the smart city population in 2011. A 10% increase in the infrastructure index increases the smart city population by 7.1%. This indicates that higher infrastructure investment increases the population of smart cities. A higher level of infrastructure also increases the population density of smart cities, as regression 4 shows: the coefficient of 0.123 indicates that a 10% increase in the infrastructure index increases smart city density by 1.2%. However, infrastructure may not increase the growth rate of the city population, as its effect is statistically insignificant in regression 5. This is quite evident, as most of the large cities considered for smart city development experienced a negative growth rate. For example, Thiruvananthapuram experienced a negative population growth rate of 14% over the period 2001 to 2011. Therefore, smart city development does not increase the population growth rate of smart cities.

To assess the robustness of the results, we consider smart city population data for 2020 and 2025 from World Urbanization Prospects (WUP): The 2018 Revision [26]. The WUP provides population data for urban agglomerations with 300,000 inhabitants or more in 2018. Although 11% of the total proposed work under the Smart City Mission was completed in 2019, we still have to wait for 2021 (i.e., the next Census data) to evaluate the impact of infrastructure on the population of smart cities. As some of the smart cities considered in our analysis have a population of less than 3 lakh (300,000), we could collect data for only 77 smart cities. Regressions 2 and 3 show that the infrastructure available in 2011 has a positive and statistically significant effect on the log of the smart city population in 2020 and 2025. This indicates that infrastructure plays a major role in promoting urbanization in India and that the Smart City Mission is very important in this regard.

*Robust standard errors in parentheses. Source: Estimated using Eq. (1). \*p < 0.1. \*\*\*p < 0.01.*
