**2. Literature review and methodology**


310 Current Air Quality Issues

problems for areas downwind, far from the actual source of air pollution. They have adverse effects on urban areas, agriculture, and the natural environment. High levels of PM2.5 can result in visibility problems, urban haze, and acid rain [3].

The U.S. Environmental Protection Agency has established standards requiring the annual average of PM2.5 to be no more than 15 micrograms per cubic meter [3]. The State of California monitors and reports on its air pollutants carefully, setting very high standards for its air quality (µg/m³). From 1999 to 2011, 113 station locations monitored PM2.5.

The originally planned site design was statistically well spread (see Figure 1). In reality, however, it is too costly in terms of time, finance, and manpower to keep all 113 sites monitoring and recording every single year. Each year, only some of the 113 sites were actually sampled, and at different locations each year.

**Figure 1.** Complete 113 PM2.5 Observational Sites in the State of California

Comparisons of PM2.5 between years are difficult because of "missing data" at sample sites [6, 9]. A site that does not have a recorded PM2.5 value is referred to as a "missing value" site; since the missing values follow no pattern, they seriously distort the construction of kriging maps.

The dataset in Figure 2 shows that at worst (in 1999) only 11 sites (9.73% of the 113 sites) were used, and at best (in 2009) 65 sites (57.52% of the 113 sites) were used. Over 13 years, 1469 annual arithmetic means (µg/m³) should have been recorded, but only 556 annual arithmetic means (µg/m³) were actually reported, i.e., 37.85%. Site by site, only one site, Site 2596 (Placer County APCD), collected data every year and had 13 recorded annual arithmetic means (µg/m³), while 16 sites had only one annual arithmetic mean (µg/m³). The comparisons of PM2.5 annual arithmetic means (µg/m³) between years for a given site or between sites for a given year, i.e., the investigations of PM2.5 annual arithmetic means (µg/m³) patterns will be an

#### **2.1. Fuzzy theory and membership kriging approach**

Zadeh's fuzzy theory [22, 23] pioneered a new branch of mathematics. His membership approach spread quickly into many other fields, for example, engineering, business, and economics, and has had a huge impact on mathematical theory and applications. However, alongside the achievements of Zadeh's fuzzy mathematics, researchers gradually discovered three fundamental issues: the self-duality dilemma, the variable dilemma, and the membership dilemma. Guo et al. [8] discussed these dilemmas in detail and pointed out that Liu's credibility measure theory [11] is a solid mathematical treatment for modelling fuzzy phenomena. The credibility measure, like the probability measure, assumes self-duality. Consequently, in parallel with probability theory, a fuzzy variable and its (credibility) distribution can be defined. Furthermore, the membership function of a fuzzy variable can also be specified by its credibility distribution. Credibility measure theory has been applied successfully to practical situations: Peng and Liu [14] considered parallel machine scheduling problems with fuzzy processing times, Zheng and Liu [24] studied a fuzzy vehicle routing optimization problem, Guo et al. [2008] proposed credibility distribution grade kriging for investigating California State PM10 spatial patterns, Wang et al. [20, 21] investigated a fuzzy inventory model without backordering, and Sampath and Deepa [16] developed sampling plans containing fuzziness and randomness, among others. Nevertheless, investigations of statistical estimation and hypothesis testing problems with fuzzy credibility distributions have progressed very slowly; examples include Li et al. [10] and Sampath and Ramya [17, 18], who worked with fuzzy normal distributions, and [1], which studied the exponential credibility distribution function.

With a statistically well-designed scheme, the collected spatial data can be analyzed and presented by kriging maps. If the spatial data are affected by "missing" values or scarcity, it is impossible to construct kriging maps directly. The air pollutants were measured at different sites each year, even though the originally planned site design was statistically well spread. Following Guo's research [4, 5], we call a site that does not have a recorded PM2.5 concentration a "missing value" site. To address the basic requirements for constructing kriging maps, Guo first proposed membership kriging in Zadeh's sense; see the MSc thesis [4], in which linear, quadratic, and hyperbolic tangent membership functions were used. Later, Guo [5, 7] developed membership kriging under credibility theory. Following the membership kriging route of treating uncertainty, Shada et al. [19] and Zoraghein et al. [25] made considerable contributions. In this chapter, we integrate exponential membership and kriging to fill in the "missing data" based on existing sample data and to compare PM2.5 concentrations in California from 1999 to 2011.

#### **2.2. Fuzzy exponential distribution**

It is necessary to introduce the basics of Liu's fuzzy credibility theory [11]. Let Θ be a nonempty set and M a *σ*-algebra over Θ. Elements of M are called events. Cr{*A*} denotes a number or grade associated with event *A*, called its credibility (measure or grade). The credibility measure Cr{ ⋅ } satisfies the axioms of normality, monotonicity, self-duality, and maximality:

$$\begin{aligned} \text{Cr}\left\{\Theta\right\} &= 1\\ \text{Cr}\left\{A\right\} &\le \text{Cr}\left\{B\right\} \text{ for } A \subset B\\ \text{Cr}\left\{A\right\} + \text{Cr}\left\{A^{c}\right\} &= 1 \text{ for any event } A\\ \text{Cr}\left\{\bigcup_{i} A_{i}\right\} &= \sup_{i} \text{Cr}\left\{A_{i}\right\} \text{ for any events } A_{i} \text{ with } \sup_{i} \text{Cr}\left\{A_{i}\right\} < 0.5 \end{aligned} \tag{1}$$

**Definition 1:** The set function Cr{ ⋅ } on M is called a credibility measure if it satisfies the axioms of normality, monotonicity, self-duality, and maximality shown in Equation 1. The triplet (Θ, M, Cr) is called a credibility space.

**Definition 2:** A fuzzy variable is a measurable function *η* mapping from a credibility space (Θ, M, Cr) to a set of real numbers contained in ℝ.

**Definition 3:** The credibility distribution Ψ of a fuzzy variable *η* on the credibility measure space (Θ, M, Cr) is Ψ: ℝ → [0,1], where Ψ(*x*) = Cr{θ ∈ Θ | *η*(θ) ≤ *x*}, *x* ∈ ℝ.

**Definition 4:** The function *μ* of a fuzzy variable *η* on the credibility measure space (Θ, M, Cr) given by *μ*(*x*) = (2Cr{*η* = *x*}) ∧ 1, *x* ∈ ℝ, is called the membership function.

**Theorem 5: (Credibility Inversion Theorem)** Let *η* be a fuzzy variable on the credibility measure space (Θ, M, Cr) with membership function *μ*. Then for any set *B* of real numbers,

$$\operatorname{Cr}\left\{\eta \in B\right\} = \frac{1}{2} \Big(\sup_{x \in B} \mu\left(x\right) + 1 - \sup_{x \notin B} \mu\left(x\right)\Big). \tag{2}$$

**Corollary 6:** Let *η* be a fuzzy variable on a credibility measure space (Θ, M,Cr) with membership function *μ*. Then the credibility distribution Ψ is

$$\Psi(x) = \frac{1}{2} \Big( \sup_{y \le x} \mu(y) + 1 - \sup_{y > x} \mu(y) \Big), \text{ for all } x \in \mathbb{R}. \tag{3}$$

The concept of a credibility measure is clearly very abstract, although it has the mathematical properties of normality, monotonicity, self-duality, and maximality. The credibility measure lacks the intuitive character of Zadeh's membership approach. The Credibility Inversion Theorem and its corollary reveal the deep link between the abstract measure and the intuitive membership function. This link paves the way for real-life applications. For example, the trapezoidal fuzzy variable has membership function *μ*:

$$\mu(x) = \begin{cases} 0 & x \le c - c_2 \\ \frac{c_2 + x - c}{c_2 - c_1} & c - c_2 < x \le c - c_1 \\ 1 & c - c_1 < x \le c + c_1 \\ \frac{c_2 - x + c}{c_2 - c_1} & c + c_1 < x \le c + c_2 \\ 0 & c + c_2 < x \end{cases} \tag{4}$$

where *c*, *c*<sub>1</sub>, *c*<sub>2</sub> are all positive, *c* > *c*<sub>1</sub>, *c* > *c*<sub>2</sub>, *c*<sub>2</sub> > *c*<sub>1</sub>.


With the help of Equation 3, the credibility distribution of the trapezoidal fuzzy variable is

$$\Psi(x) = \begin{cases} 0 & x \le c - c_2 \\ \frac{c_2 + x - c}{2\left(c_2 - c_1\right)} & c - c_2 < x \le c - c_1 \\ \frac{1}{2} & c - c_1 < x \le c + c_1 \\ \frac{c_2 - 2c_1 + x - c}{2\left(c_2 - c_1\right)} & c + c_1 < x \le c + c_2 \\ 1 & c + c_2 < x \end{cases} \tag{5}$$
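The step from a membership function to a credibility distribution through Equation 3 can be checked numerically. The following is a minimal Python sketch, not part of the original chapter: the function names, the parameter values `c = 10`, `c1 = 2`, `c2 = 5`, and the grid spacing are all illustrative assumptions, and the suprema in Equation 3 are approximated by maxima over a finite grid.

```python
def trapezoid_membership(x, c, c1, c2):
    """Trapezoidal membership: 1 within c1 of the centre c, 0 beyond c2, linear in between."""
    d = abs(x - c)
    if d <= c1:
        return 1.0
    if d >= c2:
        return 0.0
    return (c2 - d) / (c2 - c1)


def credibility_distribution(mu, x, grid):
    """Approximate Equation 3 by taking both suprema over a finite grid."""
    sup_left = max((mu(y) for y in grid if y <= x), default=0.0)
    sup_right = max((mu(y) for y in grid if y > x), default=0.0)
    return 0.5 * (sup_left + 1.0 - sup_right)


# Hypothetical parameters chosen only for illustration (0 < c1 < c2 < c).
c, c1, c2 = 10.0, 2.0, 5.0
grid = [i / 100 for i in range(2001)]  # 0.00, 0.01, ..., 20.00


def mu(y):
    return trapezoid_membership(y, c, c1, c2)


for x in (4.0, 6.5, 10.0, 13.5, 16.0):
    print(x, round(credibility_distribution(mu, x, grid), 3))
```

Up to the grid resolution, the printed values should be close to 0, 0.25, 0.5, 0.75, and 1: the distribution is flat at 1/2 over the plateau of the trapezoid and reaches 0 and 1 outside its support.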

Liu [12, 13], Wang and Tian [21] and [1] studied the exponential fuzzy distribution with a membership function, denoted as Exp(*m*):

$$\mu\left(x\right) = \frac{2}{1 + e^{\frac{\pi x}{\sqrt{6}\,m}}}, \quad x \ge 0,\ m > 0 \tag{6}$$

The support of the exponential membership function is the set [0,+∞), the nonnegative part of the real line ℝ. The expected value and second moment of the exponential fuzzy variable are

$$\mathbb{E}\left[\eta\right] = \frac{\sqrt{6}\ln 2}{\pi}m\tag{7}$$

and

$$\mathbb{E}\left[\left.\eta^{2}\right.\right] = m^{2}\tag{8}$$

where the parameter *m* > 0 determines the mean and variance of the exponential fuzzy variable. It is therefore instructive to examine how the shape of the membership curve is affected by the value of the parameter *m* > 0.
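Equation 7 can be checked by a short calculation. The sketch below assumes Liu's definition of the expected value of a nonnegative fuzzy variable, E[*η*] = ∫ Cr{*η* ≥ *r*} d*r* over [0, +∞), which is not restated in this section and is used here as an assumption. Because the exponential membership function is decreasing with *μ*(0) = 1, the Credibility Inversion Theorem gives Cr{*η* ≥ *r*} = *μ*(*r*)/2 for *r* > 0, so that

$$\operatorname{Cr}\left\{\eta \ge r\right\} = \frac{1}{2}\Big(\mu(r) + 1 - \sup_{x < r}\mu(x)\Big) = \frac{1}{1 + e^{\pi r/(\sqrt{6}\,m)}}, \qquad \mathbb{E}\left[\eta\right] = \int_{0}^{\infty} \frac{dr}{1 + e^{\pi r/(\sqrt{6}\,m)}} = \frac{\sqrt{6}\,m}{\pi}\ln 2,$$

where the last step uses the elementary integral of 1/(1 + e<sup>*ar*</sup>) over [0, +∞), which equals (ln 2)/*a* for *a* > 0.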


Modelling PM2.5 with Fuzzy Exponential Membership http://dx.doi.org/10.5772/59617 315


**Table 1.** Impacts of *m* on the Shape of the Exponential Membership Curve

| *x* | *m* = 12 | *m* = 15 | *m* = 22 | *m* = 50 |
|---|---|---|---|---|
| 0 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| 1 | 0.9466 | 0.9573 | 0.9709 | 0.9872 |
| 2 | 0.8935 | 0.9147 | 0.9418 | 0.9744 |
| 3 | 0.8410 | 0.8724 | 0.9128 | 0.9615 |
| 4 | 0.7894 | 0.8306 | 0.8839 | 0.9487 |
| 5 | 0.7390 | 0.7894 | 0.8553 | 0.9360 |
| 6 | 0.6899 | 0.7490 | 0.8269 | 0.9232 |
| 7 | 0.6424 | 0.7094 | 0.7987 | 0.9105 |
| 8 | 0.5968 | 0.6707 | 0.7709 | 0.8978 |
| 9 | 0.5530 | 0.6332 | 0.7435 | 0.8851 |
| 10 | 0.5113 | 0.5968 | 0.7165 | 0.8724 |
| 11 | 0.4717 | 0.5616 | 0.6899 | 0.8598 |
| 12 | 0.4342 | 0.5277 | 0.6638 | 0.8473 |
| 13 | 0.3990 | 0.4952 | 0.6382 | 0.8348 |
| 14 | 0.3660 | 0.4640 | 0.6132 | 0.8223 |
| 15 | 0.3351 | 0.4342 | 0.5887 | 0.8100 |
| 16 | 0.3063 | 0.4059 | 0.5647 | 0.7976 |
| 17 | 0.2796 | 0.3789 | 0.5414 | 0.7854 |

The choice of values for the parameter *m* is not arbitrary. *m* = 22 corresponds to the PM2.5 annual arithmetic mean 11.85 (µg/m³), while *m* = 50 corresponds to the PM2.5 annual arithmetic mean 25.89 (µg/m³). Therefore, the one-parameter exponential fuzzy variable can cope well with the modelling requirements of the California PM2.5 annual arithmetic mean dataset. Furthermore, the one-parameter exponential fuzzy variable allows a refined scheme for testing credibility hypotheses to be developed.
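The membership values in Table 1 and the link between *m* and the mean can be reproduced from Equations 6 and 7. The following minimal Python sketch is an illustration only: the function names are hypothetical, and the grid *x* = 0, 1, ..., 17 is an assumption about the spacing used in Table 1.

```python
import math


def exp_membership(x, m):
    """Exponential membership function of Equation 6, for x >= 0 and m > 0."""
    if x < 0 or m <= 0:
        raise ValueError("requires x >= 0 and m > 0")
    return 2.0 / (1.0 + math.exp(math.pi * x / (math.sqrt(6.0) * m)))


def exp_expected_value(m):
    """Expected value of the exponential fuzzy variable, Equation 7."""
    return math.sqrt(6.0) * math.log(2.0) / math.pi * m


# Membership values for the four parameter values shown in Table 1.
for x in range(18):
    row = [exp_membership(x, m) for m in (12, 15, 22, 50)]
    print(x, " ".join(f"{value:.4f}" for value in row))

# Expected value for m = 21.926, the value quoted in the caption of Table 2.
print(round(exp_expected_value(21.926), 2))
```

With *m* = 21.926, Equation 7 evaluates to about 11.85, which matches the annual arithmetic mean quoted for *m* ≈ 22.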

#### **2.3. Credibility hypothesis testing with exponential membership**

The well-known Neyman-Pearson Lemma in probability theory involves the likelihood ratio *L*<sub>0</sub>/*L*<sub>1</sub>, a constant *k*, and a critical region *C* of size *α* in testing the hypothesis *H*<sub>0</sub>: θ = θ<sub>0</sub> against the alternative hypothesis *H*<sub>1</sub>: θ = θ<sub>1</sub>. The likelihood is defined as the product of the densities for the given sample. Hence the Neyman-Pearson testing criterion can be called a likelihood ratio criterion. The concepts of Type I error and Type II error are also needed to describe the testing procedure. For hypothesis testing under credibility theory, Sampath and Ramya [17] proposed an analogous membership ratio criterion. The membership criterion applies to any form of credibility distribution, but the exponential credibility distribution has its unique advantages. Therefore, the remaining description focuses on credibility hypothesis testing under the exponential membership function [1].

**Definition 7:** A credibility hypothesis is a statement describing the possible rejection of a null hypothesis *H*<sub>0</sub>: *μ* = *μ*<sub>0</sub>, which specifies the credibility distribution of a fuzzy variable, against an alternative hypothesis *H*<sub>1</sub>: *μ* = *μ*<sub>1</sub>, which specifies another credibility distribution of a fuzzy variable.

**Definition 8:** Credibility hypothesis testing is the rule for deciding whether or not to reject a null hypothesis on the basis of values sampled from the fuzzy distribution defined by the null hypothesis.

**Definition 9:** The credibility rejection region, denoted *C*, is the subset of the support of a fuzzy distribution on which the null credibility hypothesis *H*<sub>0</sub>: *μ* = *μ*<sub>0</sub> is rejected, i.e., *C* = {*η* ∈ Θ | *H*<sub>0</sub> is rejected}.

**Definition 10:** A Type I error is the mistake of rejecting the null credibility hypothesis *H*<sub>0</sub>: *μ* = *μ*<sub>0</sub> when it is true, and a Type II error is the mistake of not rejecting the null credibility hypothesis *H*<sub>0</sub>: *μ* = *μ*<sub>0</sub> when it is false.

**Definition 11:** The level of credibility significance, denoted *α*, is the maximal credibility of making a Type I error in testing a credibility hypothesis *H*<sub>0</sub>: *μ* = *μ*<sub>0</sub>.

**Definition 12:** A credibility rejection region *C*\* of credibility significance level *α* is called the best credibility rejection region if it possesses the maximal power (measured by credibility) under the alternative hypothesis *K* among all possible credibility rejection regions of credibility significance level *α*, i.e.,

$$\operatorname{Cr}\left\{\eta \in C^{*} \mid K\right\} \ge \operatorname{Cr}\left\{\eta \in C \mid K\right\},\tag{9}$$

where *C* is any region satisfying the condition Cr{*η* ∈ *C* | *H*<sub>0</sub>} ≤ *α*. The power of the credibility hypothesis test is Cr{*η* ∈ *C*\* | *K*}.

Because the exponential membership function has a single parameter *m* > 0, the best credibility rejection region *C*\* of credibility significance level *α* is an interval, so we call it the best credibility rejection interval of credibility significance level *α*.

**Theorem 13:** For testing the null credibility hypothesis *H*<sub>0</sub>: *m* = *m*<sub>1</sub> against the alternative credibility hypothesis *K*: *m* = *m*<sub>2</sub> (*m*<sub>1</sub> < *m*<sub>2</sub>) under exponential fuzzy distributions *μ*(*x*) = Exp(*m*) as specified in Equation 6, the membership ratio criterion is used. The criterion states that, given a credibility significance level *α* < 0.5, the best credibility rejection interval *C*\* is

$$C^{*} = \left[\frac{\sqrt{6}\,m_{1}}{\pi}\ln\left(\frac{1-\alpha}{\alpha}\right), +\infty\right), \quad \alpha \in (0, 0.5).\tag{10}$$

The credibility of the credibility rejection interval *C*\* under the alternative hypothesis is greater than *α*.
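The form of Equation 10 can be motivated by a short calculation. It is a sketch that assumes the rejection region is a right tail [*x*<sub>0</sub>, +∞) with *x*<sub>0</sub> > 0, as Equation 10 suggests. Since the membership function in Equation 6 is decreasing with *μ*(0) = 1, the Credibility Inversion Theorem (Equation 2) gives, for Exp(*m*),

$$\operatorname{Cr}\left\{\eta \ge x_{0} \mid m\right\} = \frac{1}{2}\Big(\mu(x_{0}) + 1 - \sup_{x < x_{0}}\mu(x)\Big) = \frac{1}{1 + e^{\pi x_{0}/(\sqrt{6}\,m)}}.$$

Setting this equal to *α* at *m* = *m*<sub>1</sub> and solving for *x*<sub>0</sub> yields the boundary in Equation 10. Under the alternative *m* = *m*<sub>2</sub> > *m*<sub>1</sub>, the same boundary gives

$$\operatorname{Cr}\left\{\eta \ge x_{0} \mid m_{2}\right\} = \frac{1}{1 + \left(\frac{1-\alpha}{\alpha}\right)^{m_{1}/m_{2}}} > \frac{1}{1 + \frac{1-\alpha}{\alpha}} = \alpha,$$

because (1 − *α*)/*α* > 1 for *α* < 0.5 and *m*<sub>1</sub>/*m*<sub>2</sub> < 1.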


**Table 2.** *x*0 and *α* under *H*0: *m*1=21.926

Table 2 illustrates the relationship between the best credibility rejection interval boundary *x*<sub>0</sub> and selected credibility significance levels *α* < 0.5. For example, let *α* = 0.20 and *m*<sub>1</sub> = 21.93; then the best credibility rejection interval is *C*\* = [18.78, +∞). We have to point out that the choice of the credibility significance level *α* in credibility hypothesis testing should not follow the rule of thumb of using the significance level *α* = 0.05 in probability hypothesis testing. Although the two significance levels play the same role in hypothesis testing, the practical meanings of the credibility significance level *α* and the probabilistic significance level *α* are quite different. From Table 2 and the California PM2.5 distribution patterns, it is logical and practical to select the credibility significance level *α* = 0.25, which gives *x*<sub>0</sub> = 14.485.
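The boundary in Equation 10 is straightforward to evaluate directly. The following minimal Python sketch does so; the function names and the alternative parameter `m2 = 30.0` are illustrative assumptions and do not come from the chapter.

```python
import math


def rejection_boundary(m1, alpha):
    """Left end point x0 of the rejection interval in Equation 10."""
    if m1 <= 0:
        raise ValueError("m1 must be positive")
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie in (0, 0.5)")
    return math.sqrt(6.0) * m1 / math.pi * math.log((1.0 - alpha) / alpha)


def tail_credibility(x0, m):
    """Cr{eta >= x0} for Exp(m) with x0 > 0, i.e. mu(x0) / 2 by Equation 2."""
    return 1.0 / (1.0 + math.exp(math.pi * x0 / (math.sqrt(6.0) * m)))


m1 = 21.926  # value under H0 quoted in the caption of Table 2
m2 = 30.0    # hypothetical alternative with m2 > m1, for illustration only

for alpha in (0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45):
    x0 = rejection_boundary(m1, alpha)
    size = tail_credibility(x0, m1)    # recovers alpha
    power = tail_credibility(x0, m2)   # exceeds alpha because m2 > m1
    print(f"alpha={alpha:.2f}  x0={x0:7.3f}  size={size:.3f}  power={power:.3f}")
```

Evaluated as printed, Equation 10 with *m*<sub>1</sub> = 21.926 gives *x*<sub>0</sub> ≈ 23.70 at *α* = 0.20, ≈ 18.78 at *α* = 0.25, and ≈ 14.49 at *α* = 0.30, so the *α* values paired with the boundaries 18.78 and 14.485 in the text are worth checking against Table 2.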
