**2. Data and methodology**

#### **2.1 Data and descriptive statistics**

The dataset covers 34 advanced OECD countries at annual frequency over the period 1997–2015. The following countries are included: Austria, Belgium, Finland, France, Germany, Greece, Ireland, Italy, Netherlands, Portugal, Slovak Rep., Spain, United Kingdom, Japan, United States, Australia, Canada, Denmark, Iceland, Luxembourg, New Zealand, Norway, Switzerland, Turkey, Chile, Czech Rep., Estonia, Hungary, Israel, Korea Rep., Mexico, Poland, Slovenia, and Sweden.

In the models, the dependent variable is the natural logarithm of CO2 emissions (metric tons per capita) for country *i* in year *t*, in line with prior literature (e.g., [31]). The vector of explanatory variables includes real GDP per capita as a measure of output (GDP), the ratio of exports plus imports to GDP as a measure of trade openness (TO), the urban population (% of total population, UR), foreign direct investment (% of GDP, FDI), renewable energy consumption (% of total final energy consumption, RE), and renewable electricity output (% of total electricity output, REL), which is the share of electricity generated by renewable power plants in total electricity generated by all types of plants. The first three variables could have either sign (they may be positively or negatively associated with CO2 emissions), while the last two variables are expected to be negatively associated with CO2 emissions. All data come from the World Bank's World Development Indicators database and are in natural logarithms (i.e., each estimated coefficient is a constant elasticity of the dependent variable with respect to the independent variable).

*The Environmental Kuznets Curve: Empirical Evidence from OECD Countries DOI: http://dx.doi.org/10.5772/intechopen.108631*

The descriptive statistics on the selected variables for each country are displayed in **Table 1**. The lowest level of renewable energy consumption is 0.7% of total final energy consumption (United Kingdom), while the highest is 77.8% (Iceland). GDP per capita ranges from \$6,075 (Poland) to \$112,418 (Luxembourg). CO2 emissions per capita are highest in Luxembourg (25.6 metric tons per capita) and lowest in Chile, Mexico, and Turkey.

To address multicollinearity concerns, **Table 2** reports the correlation matrix of the explanatory variables. The explanatory variables are not highly correlated, so they can safely be included in the model (except for energy use from fossil fuels, which is dropped from the final model).
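A pairwise screening of this kind can be sketched in a few lines of plain Python. The data below are made up for illustration, the column name `FOSSIL` is a hypothetical label for the fossil-fuel energy variable, and the cut-off of 0.8 is a common rule of thumb rather than a value taken from this chapter.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def flag_collinear(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, xa), (name_b, xb) in combinations(columns.items(), 2):
        r = pearson(xa, xb)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Toy data (not from Table 2): FOSSIL moves almost one-for-one with GDP.
toy = {
    "GDP": [1.0, 2.0, 3.0, 4.0, 5.0],
    "FOSSIL": [2.1, 3.9, 6.2, 8.1, 9.8],
    "RE": [3.0, 1.0, 4.0, 1.5, 3.5],
}
flagged = flag_collinear(toy)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.3f})")
```

A variable that appears in a flagged pair is a candidate for exclusion, which mirrors the treatment of fossil-fuel energy use described above.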

#### **2.2 PSTR model**

The R-EKC curve is examined using the panel smooth transition regression (PSTR) approach proposed by González et al. [36]. This method is designed for heterogeneous panels and allows the estimated coefficients to vary both across countries and over time. The specification allows for a continuum of intermediate regimes, and the coefficients depend on these regimes. Because the model is nonlinear, it can capture the case in which a rise in income affects pollution not linearly, but conditionally on the position in the income distribution.

Taking the level of income (GDP per capita) as the transition variable $q_{i,t}$, the PSTR model with two regimes and a single transition function can be written as follows:

$$\text{CO2}_{i,t} = \mu_i + \beta_0 \text{GDP}_{i,t} + \beta_1 \text{GDP}_{i,t} f\left(q_{i,t}; \gamma, c\right) + \zeta X_{i,t} + \varepsilon_{i,t} \tag{1}$$

where $\text{CO2}_{i,t}$ is the dependent variable (carbon dioxide emissions per capita for country $i$ at time $t$), $\mu_i$ denotes the individual fixed effects, $\text{GDP}_{i,t}$ is the GDP per capita of country $i$ at time $t$, $f(q_{i,t}; \gamma, c)$ is the transition function, and $\varepsilon_{i,t}$ is the error term, which is i.i.d. $(0, \sigma_{\varepsilon}^{2})$. $X_{i,t}$ is the vector of control variables, with coefficient vector $\zeta$; it includes renewable energy use or renewable electricity output ($\text{RE}_{i,t}$ or $\text{REL}_{i,t}$), urban population ($\text{UR}_{i,t}$), trade openness ($\text{TO}_{i,t}$), and foreign direct investment ($\text{FDI}_{i,t}$). The transition function is continuous and bounded between 0 and 1, and depends on three elements: the transition variable $q_{i,t}$, the slope parameter $\gamma$, and the vector of location parameters $c = (c_1, \ldots, c_m)'$, where $m$ is the dimension of the vector.
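The functional form of the transition function is not written out above. A common choice in the PSTR literature, and the one assumed in this sketch, is the logistic function

$$f\left(q_{i,t}; \gamma, c\right) = \left[1 + \exp\left(-\gamma \prod_{j=1}^{m} \left(q_{i,t} - c_j\right)\right)\right]^{-1}, \quad \gamma > 0, \; c_1 \le \dots \le c_m.$$

The Python function below evaluates it. The function name and the sample values are illustrative only and are not estimates from this chapter.

```python
import math

def logistic_transition(q, gamma, c):
    """Logistic transition function f(q; gamma, c) (assumed functional form).

    q     : value of the transition variable (e.g. log GDP per capita)
    gamma : slope parameter (> 0); larger values give a sharper transition
    c     : sequence of location parameters (c_1, ..., c_m)
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    product = 1.0
    for c_j in c:
        product *= (q - c_j)
    # Clip the exponent so that math.exp cannot overflow.
    z = max(min(-gamma * product, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(z))

# Illustrative values: one threshold (m = 1) and two thresholds (m = 2).
for q in (8.0, 10.0, 12.0):
    one = logistic_transition(q, gamma=2.0, c=[10.0])
    two = logistic_transition(q, gamma=2.0, c=[9.0, 11.0])
    print(f"q = {q:4.1f}  f(m=1) = {one:.3f}  f(m=2) = {two:.3f}")
```

With $m = 1$ the function rises monotonically from 0 to 1 as $q_{i,t}$ passes the single threshold. With $m = 2$ it reaches its minimum at $(c_1 + c_2)/2$ and approaches 1 for both low and high values of $q_{i,t}$.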

There is no specific rule regarding the optimal number of thresholds in studying the income-pollution nexus. To identify it, two tests are performed in the next section (the Lagrange Multiplier Wald test and the Lagrange Multiplier Fisher test), which indicate that $m = 2$<sup>1</sup>. This means that there are two "income" thresholds around which the effect of income on pollution is nonlinear. However, even in this case ($m = 2$), there is still a continuum of regimes between the extremes (high pollution/low income and low pollution/high income). Therefore, as the transition variable $q_{i,t}$ increases, the effect of "income" moves from $\beta_0$ in the first extreme regime, corresponding to $f(\cdot) = 0$, to $\beta_0 + \beta_1$ in the second extreme regime, corresponding to $f(\cdot) = 1$, and so on. Between the two extreme cases $f(\cdot) = 0$ and $f(\cdot) = 1$, the sensitivity (elasticity) of pollution to income, $e_{i,t}$, is obtained by differentiating CO2 emissions with respect to the level of income:

$$e_{i,t} = \frac{\partial \text{CO2}_{i,t}}{\partial \text{GDP}_{i,t}} = \beta_0 + \beta_1 f\left(q_{i,t}; \gamma, c\right) \tag{2}$$

<sup>1</sup> This means that the final PSTR model will have two thresholds, and it can be written as follows: $\text{CO2}_{i,t} = \mu_i + \beta_0 \text{GDP}_{i,t} + \beta_1 \text{GDP}_{i,t} f(q_{1i,t}; \gamma_1, c_1) + \beta_2 \text{GDP}_{i,t} f(q_{2i,t}; \gamma_2, c_2) + \zeta X_{i,t} + \varepsilon_{i,t}$.

Hence, the elasticity of CO2 emissions with respect to GDP per capita varies between the two extreme regimes, $\beta_0$ and $\beta_0 + \beta_1$, and is a weighted average of these two values, with weights given by the transition function. The literature notes that the values of these parameters are difficult to interpret directly; it is easier to interpret their signs, which indicate an increase or a decrease in the elasticity depending on the value of the transition variable. The elasticity for a given country and year follows from the previous equation.
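To make the weighted-average reading of Eq. (2) concrete, the following sketch evaluates the elasticity at a few values of the transition variable. It assumes a logistic transition function with a single location parameter, and all parameter values are hypothetical (they are not estimates reported in this chapter).

```python
import math

def transition(q, gamma, c):
    """Logistic transition function with one location parameter (assumed form)."""
    return 1.0 / (1.0 + math.exp(-gamma * (q - c)))

def income_elasticity(q, beta0, beta1, gamma, c):
    """Elasticity of CO2 with respect to GDP as in Eq. (2): beta0 + beta1 * f."""
    w = transition(q, gamma, c)
    # Equivalent form: (1 - w) * beta0 + w * (beta0 + beta1),
    # i.e. a weighted average of the two extreme-regime elasticities.
    return beta0 + beta1 * w

# Hypothetical parameters: positive elasticity at low income, negative at high income.
beta0, beta1, gamma, c = 0.8, -1.1, 3.0, 10.0
for q in (8.0, 10.0, 12.0):
    e = income_elasticity(q, beta0, beta1, gamma, c)
    print(f"log GDP per capita = {q:4.1f}  elasticity = {e:+.3f}")
```

With these illustrative values the elasticity starts close to $\beta_0$ at low income, equals $\beta_0 + \beta_1/2$ at the threshold, and approaches $\beta_0 + \beta_1$ at high income, which is the inverted-U pattern that the EKC hypothesis describes.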

#### **2.3 Pre-tests: test for linearity (homogeneity) hypothesis**

Before estimating the PSTR model, I test the null hypothesis of homogeneity (linearity) against the PSTR alternative. Lagrange multiplier (LM) tests of homogeneity based on the asymptotic $\chi^2$ distribution, together with their F-versions and HAC versions, are applied to each of the explanatory variables, since each is a potential "candidate" transition variable in the PSTR. The LM test examines the null hypothesis of linearity (homogeneity) against the alternative of a logistic ($m = 1$) or exponential ($m = 2$) PSTR model. The optimal number of transition functions is then determined by tests of no-remaining nonlinearity. Testing linearity amounts to testing $H_0: \gamma = 0$ or $H_0: \alpha = \beta$, but in both cases the test is nonstandard, since under $H_0$ the PSTR model contains unidentified nuisance parameters.

The usual solution is to replace the transition function $f(q_{i,t}; \gamma, c)$ by its first-order Taylor expansion around $\gamma = 0$ and to test an equivalent hypothesis in an auxiliary regression such as:

$$
\text{CO2}_{i,t} = \mu_i + \alpha\, \text{GDP}_{i,t} + \theta_1 \text{GDP}_{i,t}\, q_{i,t} + \zeta X_{i,t} + \varepsilon_{i,t} \tag{3}
$$

In this first-order Taylor expansion, the parameter $\theta_1$ is proportional to the slope parameter $\gamma$. Because the transition variable is GDP per capita itself, the interaction term $\text{GDP}_{i,t}\, q_{i,t}$ is simply the square of $\text{GDP}_{i,t}$. Thus, testing linearity against the PSTR model amounts to testing $H_0: \theta_1 = 0$ in this linear panel model, which can be done with standard tests such as the F-statistic.
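The LM and F statistics can be computed from the sum of squared residuals of the linear fixed-effects model ($SSR_0$) and of the auxiliary regression ($SSR_1$). The sketch below assumes the forms commonly used in the PSTR literature, $LM = TN\,(SSR_0 - SSR_1)/SSR_0$ and $LM_F = \left[(SSR_0 - SSR_1)/k\right] / \left[SSR_0/(TN - N - k)\right]$, where $N$ is the number of countries, $T$ the number of years, and $k$ the number of restrictions tested. The SSR values in the example are made up; only $N = 34$ and $T = 19$ come from the dataset described in Section 2.1.

```python
import math

def lm_tests(ssr0, ssr1, n_units, n_periods, k=1):
    """Return (LM, LM_F, denominator degrees of freedom) for a balanced panel.

    Under H0, LM is asymptotically chi-square(k) and LM_F is approximately
    F(k, TN - N - k). These forms are an assumption of this sketch.
    """
    if not 0.0 < ssr1 <= ssr0:
        raise ValueError("expected 0 < ssr1 <= ssr0")
    n_obs = n_units * n_periods      # balanced panel assumed
    dof = n_obs - n_units - k        # the fixed effects use up n_units d.o.f.
    lm = n_obs * (ssr0 - ssr1) / ssr0
    lm_f = ((ssr0 - ssr1) / k) / (ssr0 / dof)
    return lm, lm_f, dof

def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square(1) variable (valid for k = 1 only)."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical sums of squared residuals, for illustration only.
lm, lm_f, dof = lm_tests(ssr0=12.0, ssr1=11.4, n_units=34, n_periods=19, k=1)
print(f"LM = {lm:.2f}, LM_F = {lm_f:.2f} (denominator d.o.f. = {dof}), "
      f"p(chi2, 1 df) = {chi2_pvalue_1df(lm):.2e}")
```

The F-version is often preferred in small samples because it tends to have better size properties than the asymptotic $\chi^2$ version.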

#### **2.4 Pre-tests: selecting the number of transition functions**

The next step is to identify the optimal number of thresholds ($m$) of the logistic transition function, again using the LM homogeneity test. The procedure is similar to the one used to test the number of transition functions in the model. It starts from a PSTR model for which the linearity hypothesis has been rejected. The idea is then to test whether there is one transition function ($H_0: m = 1$) or whether there are at least two transition functions ($H_1: m = 2$). Let us now consider the model with $m = 2$:


$$\text{CO2}_{i,t} = \mu_i + \beta_0 \text{GDP}_{i,t} + \beta_1 \text{GDP}_{i,t} f\left(q_{1i,t}; \gamma_1, c_1\right) + \beta_2 \text{GDP}_{i,t} f\left(q_{2i,t}; \gamma_2, c_2\right) + \zeta X_{i,t} + \varepsilon_{i,t} \tag{4}$$

Replacing the second transition function $f(q_{2i,t}; \gamma_2, c_2)$ by its first-order Taylor expansion around $\gamma_2 = 0$ yields the following auxiliary model, in which linear constraints on the parameters can be tested:

$$\text{CO2}_{i,t} = \mu_i + \beta_0 \text{GDP}_{i,t} + \beta_1 \text{GDP}_{i,t} f\left(q_{1i,t}; \gamma_1, c_1\right) + \theta\, \text{GDP}_{i,t}\, q_{2i,t} + \zeta X_{i,t} + \varepsilon_{i,t} \tag{5}$$

The test of no-remaining nonlinearity is defined by $H_0: \theta = 0$. Let $SSR_0$ denote the panel sum of squared residuals under $H_0$, i.e., in a PSTR model with one transition function, and $SSR_1$ the sum of squared residuals of the transformed model in (5). The testing procedure is sequential: given a PSTR model with $m = m^*$, the null hypothesis $H_0: m = m^*$ is tested against $H_1: m = m^* + 1$. If $H_0$ is not rejected, the procedure ends. Otherwise, the null hypothesis $H_0: m = m^* + 1$ is tested against $H_1: m = m^* + 2$. The procedure continues until $H_0$ is not rejected for the first time.
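The sequential logic can be sketched as a short loop. The helper below is hypothetical: it takes a user-supplied function that returns the p-value of the test of $H_0: m = m^*$ against $H_1: m = m^* + 1$, and the significance level, the upper bound on $m$, and the p-values in the example are assumptions for illustration.

```python
def select_m(p_value_for, alpha=0.05, max_m=5):
    """Sequentially choose m, starting from m = 1 (linearity already rejected).

    p_value_for(m) must return the p-value of H0: m against H1: m + 1.
    The loop stops at the first m for which H0 is not rejected.
    """
    m = 1
    while m < max_m:
        if p_value_for(m) >= alpha:
            return m          # H0: m is not rejected, so the procedure ends
        m += 1                # H0 rejected: move on to testing m + 1
    return max_m              # safety cap, reached only if every H0 is rejected

# Hypothetical p-values: H0: m = 1 is rejected, H0: m = 2 is not.
hypothetical_p = {1: 0.001, 2: 0.31}
chosen = select_m(lambda m: hypothetical_p[m])
print("selected m =", chosen)
```

The cap `max_m` is a practical safeguard so that the loop always terminates; it is not part of the procedure described above.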
