### 2. Tests for nonparametric function based on regression spline

In this section, the linearity of the function f(x) in model (1) is tested based on regression spline and the fiducial method. The proposed test procedure for model (1) is then extended to test the linearity of model (2) and the constancy of the functional coefficients in model (3), respectively.

Nonparametric regression model:

$$y = f(x) + \varepsilon. \tag{1}$$

Partial linear regression model:

$$y = Z'b + f(x) + \varepsilon. \tag{2}$$

Varying-coefficient model:

$$y = z_1 f_1(x_1) + \cdots + z_p f_p(x_p) + \varepsilon. \tag{3}$$

In models (1)–(3), y is the response variable, Z = (z_1, ⋯, z_p)′ is a p-dimensional regressor, x and x_1, ⋯, x_p are covariates taking values in a finite interval, ε is the error, b is a parameter vector, and f(x) and f_j(x_j), j = 1, 2, ⋯, p, are unknown smooth functions. Usually we suppose that (z, x) and ε are independent and ε ~ F(·/σ), where F is a known cumulative distribution function (cdf) with mean 0 and variance 1, and σ is unknown. Without loss of generality, we can suppose that x and x_1, ⋯, x_p take values in [0, 1]. We try to test the linearity of f(x) in models (1) and (2) and the constancy of f_j(x_j) in model (3) for some j ∈ {1, 2, ⋯, p}.

Hypothesis testing in nonparametric regression has been considered in many papers. Härdle and Mammen [1] developed a test based on the visible difference between a parametric and a nonparametric curve estimate. Based on smoothing techniques, many tests were constructed for the linearity of a regression model; see Hart [2], Cox et al. [3], and Cox and Koh [4] for a review. Recently, Fan et al. [5] studied a generalized likelihood ratio statistic, which behaves well in the large-sample case. Tests based on penalized criteria were developed by Eubank and Hart [6] and Baraud [7].

The linearity of the partial linear regression model (2) was studied by Bianco and Boente [8], Liang et al. [9], and Fan and Huang [10]. There are also many other papers concerning such testing problems (see [11–16], among others). The constancy of the functional coefficient f_j(x_j) in the varying-coefficient model (3) was studied in Fan and Zhang [17], Cai et al. [18], Fan and Huang [19], You and Zhou [20], and Tang and Cheng [21]. Local polynomial and smoothing spline methods for estimating the coefficients in model (3) can be seen in Hoover et al. [22], Wu et al. [23], and so on.

The critical values of most of the previous tests were obtained by the Wilks theorem or the bootstrap method, so such tests behave well only for relatively large sample sizes. This chapter gives testing procedures based on regression spline and the fiducial method [24]; they have good performance even when the sample size is small.

In using the regression spline, the key problem is the determination of the knots used in spline interpolation. For smoothing methods such as kernel-based methods and the smoothing spline, smoothness is controlled by smoothing parameters. For the well-known kernel estimate, a bandwidth that is extremely large or small leads to over-smoothing or under-smoothing, respectively. In order to avoid selecting an optimal smoothing parameter, a multi-scale smoothing method was introduced by Chaudhuri and Marron [25, 26].

#### 2.1. Test the linearity of nonparametric regression model

Without loss of generality, we suppose that x in model (1) takes values in [0, 1] and that the set of knots is T = {0 = t_1 < t_2 < ⋯ < t_m = 1}. In order to estimate model (1), the nonparametric function f(x) is fitted by kth-order splines with knots T. This means that

$$f(\mathbf{x}) \approx \sum\_{j=1}^{m+k-1} \beta\_j \mathbf{g}\_j(\mathbf{x}),\tag{4}$$

where β_j are the coefficients and g_j(x), j = 1, 2, ⋯, m + k − 1, are the basis functions for order-k splines over the knots t_1, t_2, ⋯, t_m.

With n independent observations Y = (y_1, y_2, ⋯, y_n)′ ∈ ℝ^n, the basis matrix G_{n×(m+k−1)} is defined by G = {g_j(x_i)}, where the x_i are the design points, i = 1, 2, ⋯, n, j = 1, 2, ⋯, m + k − 1. Hence, model (1) can be approximated as Y ≈ Gβ + ε. The least squares estimator of the coefficients is

$$\hat{\boldsymbol{\beta}} = \left(\boldsymbol{G}^{T}\boldsymbol{G}\right)^{-1}\boldsymbol{G}^{T}\boldsymbol{Y},\tag{5}$$


Model Testing Based on Regression Spline http://dx.doi.org/10.5772/intechopen.74858 87


and the estimator of f(x_i) can be expressed as

$$\widehat{Y} = \left\{ \widehat{f}(\mathbf{x}\_1), \widehat{f}(\mathbf{x}\_2), \dots, \widehat{f}(\mathbf{x}\_n) \right\}^T = G \left( \mathbf{G}^T \mathbf{G} \right)^{-1} \mathbf{G}^T \mathbf{Y}. \tag{6}$$
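As a concrete illustration, the fit of Eqs. (5) and (6) with the linear (hat-function) basis of the next paragraph can be sketched as follows; this is a minimal numpy sketch, and the function and variable names are ours, not from the chapter.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """n x m basis matrix G of Eq. (4): column j is the piecewise-linear
    "hat" function equal to 1 at knot t_j and 0 at every other knot,
    so G @ beta linearly interpolates the points (t_j, beta_j)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    m = len(t)
    G = np.zeros((len(x), m))
    for j in range(m):
        if j > 0:                                    # rising piece on [t_{j-1}, t_j]
            mask = (x >= t[j - 1]) & (x <= t[j])
            G[mask, j] = (x[mask] - t[j - 1]) / (t[j] - t[j - 1])
        if j < m - 1:                                # falling piece on [t_j, t_{j+1}]
            mask = (x >= t[j]) & (x <= t[j + 1])
            G[mask, j] = (t[j + 1] - x[mask]) / (t[j + 1] - t[j])
    return G

# Fit model (1) on simulated data whose f is truly linear.
rng = np.random.default_rng(0)
n, knots = 200, np.linspace(0.0, 1.0, 6)             # m = 6 knots on [0, 1]
x = rng.uniform(0.0, 1.0, n)
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(n)
G = linear_spline_basis(x, knots)
beta_hat = np.linalg.solve(G.T @ G, G.T @ y)         # Eq. (5)
f_hat = G @ beta_hat                                 # Eq. (6): fitted values
```

Because the hat functions form a partition of unity and reproduce linear functions, the estimated knot values β̂_j track f(t_j) closely here.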

For testing the linearity of model (1), a linear spline is used to approximate f(x). This means that the basis functions g_j(x) are piecewise linear:

$$\begin{aligned} g_1(x) &= \frac{-(x - t_2)}{t_2 - t_1}\,\mathbb{I}_2(x), \\ g_{k-1}(x) &= \frac{x - t_{k-2}}{t_{k-1} - t_{k-2}}\,\mathbb{I}_{k-1}(x) - \frac{x - t_k}{t_k - t_{k-1}}\,\mathbb{I}_k(x), \qquad 3 \le k \le m, \\ g_m(x) &= \frac{x - t_{m-1}}{t_m - t_{m-1}}\,\mathbb{I}_m(x), \end{aligned} \tag{7}$$

where $\mathbb{I}_k(x)$ denotes the indicator function of the interval $[t_{k-1}, t_k]$.

In this case, the approximating function in (4) is a linear interpolant with k = 1. The true values are β_j = f(t_j), j = 1, 2, ⋯, m. The linearity of the function f(x) can be written as

$$H_0: \frac{\beta_2 - \beta_1}{t_2 - t_1} = \frac{\beta_3 - \beta_2}{t_3 - t_2} = \cdots = \frac{\beta_m - \beta_{m-1}}{t_m - t_{m-1}}.$$

Null hypothesis H0 can be expressed in matrix form as L′β = 0,

where

$$L' = \begin{bmatrix} h_2 & -h_1 - h_2 & h_1 & 0 & \cdots & 0 & 0 & 0 \\ & & & & \ddots & & & \\ 0 & 0 & 0 & 0 & \cdots & h_{m-1} & -h_{m-1} - h_{m-2} & h_{m-2} \end{bmatrix},$$

where h_j = t_{j+1} − t_j, j = 1, 2, ⋯, m − 1. Null hypothesis H0 is equivalent to the following one:

$$\mathbf{H}\_0^\*:\ L'\boldsymbol{\beta} = \mathbf{0}.\tag{8}$$
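The constraint matrix L′ can be built mechanically from the knots: row j states that the slopes of the interpolant on two consecutive intervals agree, i.e. h_{j+1}β_j − (h_j + h_{j+1})β_{j+1} + h_jβ_{j+2} = 0. A short sketch (helper name ours):

```python
import numpy as np

def linearity_constraint(knots):
    """(m-2) x m matrix L' of Eq. (8). Row j (0-based) encodes equality
    of the interpolant's slopes on [t_j, t_{j+1}] and [t_{j+1}, t_{j+2}],
    cleared of denominators."""
    t = np.asarray(knots, dtype=float)
    h = np.diff(t)                       # h_j = t_{j+1} - t_j
    m = len(t)
    L = np.zeros((m - 2, m))
    for j in range(m - 2):
        L[j, j] = h[j + 1]
        L[j, j + 1] = -(h[j] + h[j + 1])
        L[j, j + 2] = h[j]
    return L

t = np.array([0.0, 0.2, 0.5, 0.7, 1.0])  # m = 5 (possibly unequal) knots
Lp = linearity_constraint(t)             # here L' is 3 x 5
```

By construction L′β = 0 whenever the knot values β_j lie on a straight line, even for unequally spaced knots.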

The p-value for testing hypothesis H_0^* will be derived by the fiducial method in what follows. Assume that the matrix G has full rank, and let ε ~ σN(0, 1). In the model Y = Gβ + ε, the sufficient statistic of (β, σ²) is (β̂, S²), where β̂ is defined in (5) and

$$S^2 = Y'(I - P_G)Y, \qquad P_G = G(G'G)^{-1}G'.$$

By Dawid and Stone [34], the sufficient statistic can be represented as a functional model:

$$
\widehat{\beta} = \beta + \sigma (G'G)^{-\frac{1}{2}} E_1, \qquad S = \sigma E_2, \qquad E = (E_1, E_2) \sim Q, \tag{9}
$$

where Q is the probability measure of E = (E_1, E_2), E_1 ~ N(0, I_m) and, independently, E_2² ~ χ²(n − m). From the linear regression model, the fiducial model of β can be obtained:

$$
\widehat{\beta} = \beta + \frac{S}{E_2} (G'G)^{-\frac{1}{2}} E_1, \qquad E = (E_1, E_2) \sim Q. \tag{10}
$$

Given (β̂, S²), the distribution of the right-hand side of the fiducial model is the fiducial distribution of β. That is, the fiducial distribution of β is the conditional distribution of R(E; β̂, S²) given (β̂, S²), where

$$R\left(E; \widehat{\beta}, S^2\right) = \widehat{\beta} - \frac{S}{E\_2} \left(G^\prime G\right)^{-\frac{1}{2}} E\_1. \tag{11}$$
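The fiducial model is straightforward to simulate: draw E and evaluate R(E; β̂, S²). A minimal sketch (names ours); such Monte Carlo draws are what the probability in the p-value definition below would be computed from when no closed form is available.

```python
import numpy as np

def fiducial_draws(beta_hat, S2, G, n_draws=10000, seed=None):
    """Monte Carlo draws from the fiducial distribution of beta, i.e.
    R(E; beta_hat, S^2) = beta_hat - (S/E2) (G'G)^{-1/2} E1 of Eq. (11),
    with E1 ~ N(0, I_m) and, independently, E2^2 ~ chi^2(n - m)."""
    rng = np.random.default_rng(seed)
    n, m = G.shape
    # symmetric square root of (G'G)^{-1} via the eigendecomposition
    w, V = np.linalg.eigh(G.T @ G)
    root_inv = V @ np.diag(w ** -0.5) @ V.T
    E1 = rng.standard_normal((n_draws, m))
    E2 = np.sqrt(rng.chisquare(n - m, size=n_draws))
    return beta_hat - (np.sqrt(S2) / E2)[:, None] * (E1 @ root_inv)
```

The draws are centered at β̂, so their empirical mean recovers the least squares estimate.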

For testing hypothesis H_0^*, the p-value is defined as

<sup>b</sup><sup>β</sup> <sup>¼</sup> <sup>G</sup>TG � ��<sup>1</sup>

For testing the linearity of model (1), linear spline is used to approximate f xð Þ. It means that

ð Þ <sup>t</sup><sup>2</sup> � <sup>t</sup><sup>1</sup> <sup>l</sup>2ð Þ<sup>t</sup> ,

x � tm�<sup>1</sup> ð Þ tm � tm�<sup>1</sup> <sup>l</sup>mð Þ<sup>t</sup> :

> <sup>¼</sup> <sup>⋯</sup> <sup>¼</sup> <sup>β</sup><sup>m</sup> � <sup>β</sup><sup>m</sup>�<sup>1</sup> tm � tm�<sup>1</sup>

> > β ¼ 0,

In this case, the approximated function in (4) is a linear interpolation with k =1. The true value

h<sup>2</sup> � h<sup>1</sup> � h<sup>2</sup> h<sup>1</sup> 0⋯ 0 00 ⋯ ⋯ ⋯ ⋯⋯ ⋯ ⋯ ⋯

0 0 00 ⋯ hm�<sup>1</sup> � hm�<sup>1</sup> � hm�<sup>2</sup> hm�<sup>2</sup>

where hj ¼ tjþ<sup>1</sup> � tj, j ¼ 1, 2, ⋯, m � 2. Null hypothesis H0 is equivalent to the following one:

context. Assume that matrix G has full rank, and let ε˜σNð Þ 0; 1 . In model Y ¼ Gβ þ ε, the

By Dawid and Stone [34], the sufficient statistic can be represented as a functional model:

ð Þ <sup>I</sup> � PG Y, PG <sup>¼</sup> G G<sup>0</sup> ð Þ <sup>G</sup> �<sup>1</sup>

, where bβ is defined in (5) and

H<sup>∗</sup> <sup>0</sup> : L<sup>0</sup>

<sup>g</sup>1ð Þ¼ <sup>x</sup> �<sup>x</sup> � <sup>t</sup><sup>2</sup>

ð Þ tk�<sup>1</sup> � tk�<sup>2</sup> <sup>l</sup><sup>k</sup>�<sup>1</sup>ð Þ<sup>t</sup> � <sup>x</sup> � tk

Yb ¼ bf xð Þ<sup>1</sup> ;bf xð Þ<sup>2</sup> ; ⋯;bf xð Þ<sup>n</sup> n o<sup>T</sup>

x � tk�<sup>2</sup>

gmð Þ¼ x

� �, j <sup>¼</sup> <sup>1</sup>, <sup>2</sup>, <sup>⋯</sup>, m. The linearity of function f xð Þ can be written as

<sup>¼</sup> <sup>β</sup><sup>3</sup> � <sup>β</sup><sup>2</sup> t<sup>3</sup> � t<sup>2</sup>

and the estimator of f xð Þ<sup>i</sup> can be expressed as

gk�<sup>1</sup>ð Þ¼ <sup>x</sup>

ð Þx is a linear function:

H0 :

Null hypothesis H0 can be expressed in matrix as L<sup>0</sup>

β<sup>2</sup> � β<sup>1</sup> t<sup>2</sup> � t<sup>1</sup>

basis function gj

86 Topics in Splines and Applications

is β<sup>j</sup> ¼ f tj

where

L<sup>0</sup> ¼

The p-value for testing hypothesis H<sup>∗</sup>

sufficient statistic of <sup>β</sup>; <sup>σ</sup><sup>2</sup> � � is <sup>b</sup>β; <sup>S</sup><sup>2</sup> � �

<sup>S</sup><sup>2</sup> <sup>¼</sup> <sup>Y</sup><sup>0</sup>

G<sup>T</sup> Y, (5)

ð Þ tk � tk�<sup>1</sup> <sup>l</sup>kð Þ<sup>t</sup> , <sup>3</sup> <sup>≤</sup> <sup>k</sup> <sup>≤</sup> m, (7)

:

β ¼ 0: (8)

<sup>0</sup> will be derived by the fiducial method in the following

G0 :

G<sup>T</sup> Y: (6)

<sup>¼</sup> G GTG � ��<sup>1</sup>

$$p\left(\widehat{\beta}, S^2\right) = Q\left(\left\|L'\left[R\left(E; \widehat{\beta}, S^2\right) - \mathrm{E}_Q R\left(E; \widehat{\beta}, S^2\right)\right]\right\|_{\Sigma}^2 \ge \left\|L'\,\mathrm{E}_Q R\left(E; \widehat{\beta}, S^2\right)\right\|_{\Sigma}^2\right), \tag{12}$$

where Q(·) and E_Q denote, respectively, the probability of an event and the expectation of a random variable under Q, Σ is the conditional covariance matrix of L′R(E; β̂, S²) given (β̂, S²), and ‖v‖²_Σ = v′Σ^{−1}v for a vector v.

According to the definition of a generalized pivotal quantity in [35], R(E; β̂, S²) is a generalized pivotal quantity and also a fiducial pivotal quantity for β. Naturally, L′R(E; β̂, S²) is the fiducial pivotal quantity for L′β. With the definition of Q in Eq. (10), we have

$$p\left(\widehat{\beta}, S^2\right) = 1 - F\_{m-2, n-m} \left( \frac{(n-m)\widehat{\beta}^\prime L \left(L^\prime (G^\prime G)^{-1} L\right)^{-1} L^\prime \widehat{\beta}}{(m-2)S^2} \right) \tag{13}$$

where F_{m−2,n−m} is the cdf of the F-distribution with m − 2 and n − m degrees of freedom.

Under model (1) and the hypothesis that f(x) is a linear function, null hypothesis H_0^* given in (8) is true. Suppose that the error is normally distributed; then the p-value given in Eq. (12) is uniformly distributed on the interval (0, 1). On the other hand, under some mild conditions, the test procedure based on p(β̂, S²) is consistent, which means that p(β̂, S²) tends to zero with probability 1 if H_0^* is false. The corresponding theoretical proofs of the large-sample and finite-sample properties of p(β̂, S²) are the same as the proofs given in Li et al. [36].

In applications, we need to check some hypotheses as follows:

$$H_{01}: f(x) = C \Leftrightarrow \beta_1 = \beta_2 = \cdots = \beta_m,$$

$$H_{02}: f(x) = Cx \Leftrightarrow \frac{\beta_2 - \beta_1}{t_2 - t_1} = \frac{\beta_3 - \beta_2}{t_3 - t_2} = \cdots = \frac{\beta_m - \beta_{m-1}}{t_m - t_{m-1}} \ \text{ and } \ \beta_1 = 0.$$

The p-values for testing H01 and H02 can be obtained by replacing L in (12) with L_{01} and L_{02}, respectively, where L_{02} = (e_1, L), e_1 = (1, 0, 0, ⋯, 0)′ and

$$L_{01} = \begin{bmatrix} h_2 & -h_1 & 0 & \cdots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \cdots & h_m & -h_{m-1} \end{bmatrix}. \tag{14}$$


#### 2.2. Test the linearity of partial linear model

To test the linearity of model (2), the p-value can be established analogously. With n independent observations Y = (y_1, y_2, ⋯, y_n)′ ∈ ℝ^n, model (2) can be represented as

$$y_i = Z_i'b + f(x_i) + \varepsilon_i, \qquad i = 1, 2, \cdots, n,$$

where Z_i = (z_{i1}, ⋯, z_{ip})′, b = (b_1, ⋯, b_p)′, and the x_i, i = 1, 2, ⋯, n, are fixed design points. With the approximation of f(x) given in (4), model (2) can be approximated by Y ≈ Xθ + ε, where X = (ℤ, G) is n × (p + m + 1), ℤ = (z_{ij}), i = 1, 2, ⋯, n, j = 1, 2, ⋯, p, G is the same as above, and θ = (b′, β′)′. Then the p-value for testing the linearity of model (2) can be defined by replacing G in (12) with X, β with θ, and L with L_{03}, where L_{03} = (0_{(m−2)×p}, L′)′.

The large-sample and finite-sample properties of the testing procedure for model (2) are the same as those of the test procedure for model (1).
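The construction for model (2) amounts to concatenating the parametric columns with the spline basis and padding the constraint with zero rows; a shape-level sketch (helper names ours, with random stand-ins for ℤ, G, and L):

```python
import numpy as np

def partial_linear_design(Z, G):
    """Design of Section 2.2: X = (Z, G), so theta = (b', beta')'."""
    return np.hstack([Z, G])

def pad_constraint(L, p):
    """L03 = (0_{(m-2) x p}, L')': the linearity constraint acts only on
    the beta block of theta, so p zero rows are prepended to L."""
    return np.vstack([np.zeros((p, L.shape[1])), L])

rng = np.random.default_rng(4)
Z = rng.standard_normal((50, 2))      # p = 2 parametric covariates
G = rng.standard_normal((50, 6))      # stand-in for the n x m spline basis
L = rng.standard_normal((6, 4))       # stand-in for the m x (m-2) constraint
X = partial_linear_design(Z, G)
L03 = pad_constraint(L, Z.shape[1])
```

With X and L03 in place, the p-value is computed exactly as in (12)-(13), only with θ in place of β.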

#### 2.3. Test the constancy of functional coefficient in varying-coefficient model

For model (3), investigators often want to know whether the coefficients really vary, that is, to test the constancy of the coefficient functions via the hypotheses:

$$H_{31}: f_j(x) = C_j \ \text{ for } j = 1, 2, \cdots, p \text{ and some constants } C_j, \tag{15}$$

$$H_{32}: f_{j_0}(x) = C_{j_0} \ \text{ for some } j = j_0 \text{ and some constant } C_{j_0}. \tag{16}$$

With the set of knots T = {0 = t_1 < t_2 < ⋯ < t_m = 1}, the coefficient f_j(x) can also be approximated by

$$f_j(x) = \sum_{k=1}^{m} \beta_{jk}\, g_k(x), \qquad j = 1, 2, \cdots, p,$$

where the true value of β_{jk} is f_j(t_k). The basis functions g_k, k = 1, 2, ⋯, m, were defined in (7). The varying-coefficient model (3) is approximately represented as


$$Y = X\beta + \varepsilon,\tag{17}$$

where X = (F_1, ⋯, F_p) is an n × mp matrix with F_j = {z_{ij} g_k(x_i)}, i = 1, 2, ⋯, n, k = 1, 2, ⋯, m, j = 1, 2, ⋯, p, and β = (β_1′, ⋯, β_p′)′ is an mp-dimensional parameter vector with β_j = (f_j(t_1), ⋯, f_j(t_m))′.
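Columnwise, X simply repeats the spline basis once per covariate, rescaled; a sketch (names ours, with random stand-ins for Z and G):

```python
import numpy as np

def vc_design(Z, G):
    """Design matrix of Eq. (17): X = (F_1, ..., F_p), where block
    F_j[i, k] = z_{ij} * g_k(x_i), i.e. the spline basis G with every
    column rescaled by the j-th covariate. Z is n x p, G is n x m."""
    n, p = Z.shape
    return np.hstack([Z[:, [j]] * G for j in range(p)])

rng = np.random.default_rng(5)
Z = rng.standard_normal((30, 2))      # p = 2 covariates
G = rng.standard_normal((30, 4))      # stand-in for the n x m hat basis
X = vc_design(Z, G)                   # 30 x 8 design of model (17)
```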

It is worth noting that under null hypothesis H31 defined in (15), regression model (3) is equivalent to model (17); this equivalence does not hold, however, under null hypothesis H32 defined in (16). Null hypotheses H31 and H32 can be expressed in matrix form as the following two, respectively:

$$\mathbf{H}\_{31}^\* : L\_1' \boldsymbol{\beta} = \mathbf{0},\tag{18}$$

$$\mathbf{H}\_{32}^{\*} : L\_{2}^{\prime} \boldsymbol{\beta} = \mathbf{0},\tag{19}$$

where L_1′ is a p(m − 1) × mp matrix and L_2′ is an (m − 1) × mp matrix:


$$L\_1' = \begin{pmatrix} L\_{01}' & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & L\_{01}' \end{pmatrix}, \boldsymbol{L}\_{01} = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots & 1 & -1 \end{bmatrix},$$

$$L_2' = \left( \mathbf{0}_{(m-1)\times m(j_0-1)},\ L_{01}',\ \mathbf{0}_{(m-1)\times (mp - mj_0)} \right)_{(m-1)\times mp}.$$
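Both constraint matrices can be assembled from the first-difference block; a sketch (helper names ours), assuming the middle block of L2′ is the (m − 1) × m difference matrix L01′ applied to the coefficient block j0:

```python
import numpy as np
from scipy.linalg import block_diag

def constancy_constraints(m, p, j0):
    """L1' and L2' of Eqs. (18)-(19). L01' is the (m-1) x m
    first-difference matrix, so L01' @ beta_j = 0 iff the values of f_j
    at all knots coincide, i.e. f_j is constant."""
    L01t = np.zeros((m - 1, m))
    idx = np.arange(m - 1)
    L01t[idx, idx], L01t[idx, idx + 1] = 1.0, -1.0
    L1t = block_diag(*([L01t] * p))           # p(m-1) x mp: every f_j constant
    L2t = np.zeros((m - 1, m * p))            # (m-1) x mp: only f_{j0} constant
    L2t[:, m * (j0 - 1): m * j0] = L01t       # j0 is 1-based
    return L1t, L2t

L1t, L2t = constancy_constraints(4, 3, 2)     # m = 4 knots, p = 3, j0 = 2
```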

In the same way as the p-value in (13), the p-values for testing hypotheses H_{31}^* and H_{32}^* can be defined as below when the error ε is normally distributed:

$$p\_{31}\left(\widehat{\boldsymbol{\beta}},\boldsymbol{S}^2\right) = 1 - F\_{p(m-1),n-mp}\left(\frac{(n-mp)\widehat{\boldsymbol{\beta}}^\prime L\_1 \left(L\_1^\prime (\boldsymbol{X}^\prime \boldsymbol{X})^{-1} L\_1\right)^{-1} L\_1^\prime \widehat{\boldsymbol{\beta}}}{p(m-1)\boldsymbol{S}^2}\right),\tag{20}$$

$$p_{32}\left(\widehat{\beta}, S^2\right) = 1 - F_{m-1,n-mp}\left(\frac{(n-mp)\,\widehat{\beta}' L_2 \left(L_2'(X'X)^{-1}L_2\right)^{-1} L_2'\widehat{\beta}}{(m-1)S^2}\right). \tag{21}$$

According to the above discussion, it can be seen that p_{31}(β̂, S²) is uniformly distributed over (0, 1) under hypothesis H_{31}^*. However, under null hypothesis H_{32}^*, varying-coefficient model (3) is not linear. Hence, there is a difference between the distribution function of p_{32}(β̂, S²) under H_{32}^* and the uniform distribution. This difference has an exact expression, which can be seen in Li et al. [37] (Theorem 3). On the other hand, p_{31}(β̂, S²) and p_{32}(β̂, S²) both tend to zero in probability if the null hypotheses are false as the sample size tends to infinity, under some mild conditions. The corresponding proof was also provided in Li et al. [37].
