**Statistical Tests Based on the Geometry of Second Phase Particles**

Viktor Beneš<sup>1</sup>, Lev Klebanov<sup>1</sup>, Radka Lechnerová<sup>2</sup> and Peter Sláma<sup>3</sup>

<sup>1</sup>*Charles University in Prague, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics* <sup>2</sup>*Private College of Economic Studies, Ltd., Prague* <sup>3</sup>*COMTES FHT a.s., Metallography, Dobřany Czech Republic*

### **1. Introduction**


Current trends in the development of new materials call for a thorough knowledge of the relationships between properties and microstructure. To meet the often very demanding performance requirements of some applications, the manufacturing process and its parameters must be tuned very finely. It is therefore indispensable to be able to distinguish small differences between microstructures produced by different variations of the processing parameters. Another important field of material characterization is the description of microstructure heterogeneities. Such heterogeneities are often related to the risk of premature damage nucleation and to the preferential occurrence and propagation of defects (voids, cracks, corrosion, etc.).

One of the most important elements of the microstructure of metallic materials is the set of second phase particles. Particle size and shape distributions and the type of spatial dispersion (homogeneous, long-range or short range ordered, clustered, etc.) are often the major attributes of a particular microstructure (Humphreys & Hatherly, 2004; Polmear, 2006). Thin foils made from aluminium-manganese based alloys, such as AA3003, are the material most frequently used as fins in automotive heat exchangers (Hirsch, 2006). This application imposes very strict requirements on properties and related foil microstructures. The development of an appropriate production technology is contingent on the perfect knowledge of the impact of processing parameters on microstructure transformation, including the changes of the set of particles (Hirsch, 2006; Slámová et al., 2006).

In a statistical setting, we deal with microstructures containing random objects in space or in a plane, which may be second phase particles, pores, grains and their sections or projections. The question frequently asked is whether two microstructures come from a material with the same geometrical characteristics of microstructure. This statement forms a null hypothesis *H*<sub>0</sub>, and the aim is to develop a statistical two-sample test of *H*<sub>0</sub> against the alternative hypothesis that the geometrical characteristics are different. In the literature, parametric models of microstructures as random sets are mostly used (Derr & Ji, 2000; Ohser & Mücklich, 2000), and the authors recommend Monte Carlo testing, which is based on the possibility of simulating a random set under the null hypothesis. The evaluation of the test is based on a comparison of the test statistics (describing some characteristics (Tewari & Gokhale, 2006a;b) of the random set) obtained from simulated models with that obtained from the microstructure.

In the present paper, we apply a nonparametric approach. We consider some geometrical characteristics which can be measured by means of image analysis or estimated from the observation in a window. Thus, we obtain a vector of data (typically of a large dimension) which does not form a random sample, since the vector components may be stochastically dependent. For statistical testing, such a dependency is a serious problem. For the data from a single window, such a problem should be considered (Kupczyk, 2006). Under the assumption that we can observe the microstructure in a few independent windows, we put all the data from the individual windows and the observations within windows together. This global information for both sets can be compared using N-distances from probability theory, see (Klebanov, 2005). The same holds in the functional data approach, where the data is first transformed to a function. When the number of independent windows is large, the test of *H*<sub>0</sub> can be transformed into a univariate distribution-free two-sample test. For a smaller number of measurements, the permutation test for *H*<sub>0</sub> based on N-distances can be used.

In the paper, we consider microstructures with dispersed particles. The statistical method is developed independently of whether we deal with 2D or 3D data. Therefore, in the metallographic application presented here, we call the features particles, although they are in fact 2D particle sections. We distinguish two basic sets of particle characteristics, namely the individual particle geometry and the spatial distribution of particles. Even if these are rather different descriptive tools, we try to construct the test so that it can be applied to both of them in an analogous way. We describe a variety of methods in Section 2, including a functional data analysis.
In the applied part of the paper in Section 3, we present data from thin foils demonstrating the use of the test for a comparison of metallographic samples of aluminium alloys, observed by means of optical microscopy. The numerical results are in Section 4. Finally, in Section 5 we present a general discussion of the methods and interpretations of results. The theoretical statistical background is given in the Appendix.

### **2. Methods**

Several statistical methods for testing differences between microstructures containing particles are suggested in this Section. In Subsections 2.1 and 2.2 we describe particle systems by a vector of parameters, and the testing is reduced to a test of the null hypothesis that the vector of these random parameters has the same distribution for two random sets *A* and *B*. In Subsection 2.3 the functional data approach is used: we compare functions which fit the observed data. Finally, in Subsection 2.4 a simulation study of the power of the test based on N-distances is performed in order to see its behavior with respect to different alternative hypotheses.

#### **2.1 Vector approach – individual particle parameters**

Here we describe method (I). Generally the individual particles observed in a window (metallographic sample) are not independent. We will assume that *n* windows of the same size and magnification are observed for two microstructures *A* and *B*. Let the windows be each sufficiently far from the other or taken from independent samples so that we can assume independence among windows. We measure the same number *k* of particles from each window. In our application, the image analyzer scans particles in a window in a meandering way so that we obtain a representative set of particles by taking the first *k* particles measured from each window. Assume that *m* geometrical parameters are measured for each particle (corresponding to microstructures *A*, *B*). Two independent samples *X*<sub>1</sub>, ..., *X*<sub>*n*</sub> from *A* and *Y*<sub>1</sub>, ..., *Y*<sub>*n*</sub> from *B* with matrices of size *k* × *m* are thus obtained. Now we transform each matrix *X*<sub>*i*</sub>, *Y*<sub>*i*</sub> to a vector $\tilde{x}$, $\tilde{y}$, respectively, of size *km*:

$$\begin{aligned} \tilde{x} &= (x\_{1,1}, \dots, x\_{k,1}, x\_{1,2}, \dots, x\_{k,2}, \dots, x\_{1,m}, \dots, x\_{k,m}), \\ \tilde{y} &= (y\_{1,1}, \dots, y\_{k,1}, y\_{1,2}, \dots, y\_{k,2}, \dots, y\_{1,m}, \dots, y\_{k,m}), \end{aligned} \tag{1}$$

and evaluate an empirical counterpart of the N-distance (11):

$$\hat{\mathfrak{N}} = \frac{1}{n} \left[ \sum\_{i=1}^{n} \sum\_{j=1}^{n} \left( 2\mathcal{L}(\tilde{x}\_{i}, \tilde{y}\_{j}) - \mathcal{L}(\tilde{x}\_{i}, \tilde{x}\_{j}) - \mathcal{L}(\tilde{y}\_{i}, \tilde{y}\_{j}) \right) \right]^{\frac{1}{2}},\tag{2}$$

where e.g.


$$\mathcal{L}(s, t) = \|s - t\|\tag{3}$$

is the Euclidean distance between vectors *s*, *t*, which is a strongly negative definite kernel, cf. Appendix.

Fig. 1. Histogram of N-distances from 5000 random permutations (*m* = 1, *k* = 500, *n* = 20). The value 0.697 corresponds to the non-permuted case and the *p*-value is the probability (relative frequency) of N-distance being larger than this value.

We now describe in more detail the permutation test (Lehmann & Romano, 2005) of the null hypothesis, which is used when *n* is small. Consider *K* random permutations of 1, ..., 2*n*. Apply each permutation to the pooled sequence of vectors ($\tilde{x}\_1, \dots, \tilde{x}\_n, \tilde{y}\_1, \dots, \tilde{y}\_n$), split the permuted sequence into its first *n* and last *n* vectors, and evaluate (2) to obtain *K* empirical N-distances (*K* is recommended to be about 1000). Under the null hypothesis, permutations do not modify the distribution of the random variable N. From the histogram of these distances, including the non-permuted case, we obtain the *p*-value of the test, which is the probability (under the validity of the null hypothesis) that the random N-distance is larger than its measured value. A typical example of the test is shown in Fig. 1; here we reject *H*<sub>0</sub> since the *p*-value is smaller than 0.05. If the *p*-value were greater than 0.05, *H*<sub>0</sub> would not be rejected. We use this rule in all tests throughout the chapter.
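The permutation procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `euclid`, `n_distance` and `permutation_p_value` are hypothetical, each window is assumed to be already flattened into a vector as in (1), and the *p*-value is taken as the relative frequency of permuted N-distances that are at least as large as the observed one, with the non-permuted case included in the count.

```python
import math
import random


def euclid(s, t):
    """Kernel (3): Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def n_distance(xs, ys):
    """Empirical N-distance (2) for two samples of n vectors each."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (2 * euclid(xs[i], ys[j])
                      - euclid(xs[i], xs[j])
                      - euclid(ys[i], ys[j]))
    # The double sum is non-negative in theory; clamp rounding noise.
    return math.sqrt(max(total, 0.0)) / n


def permutation_p_value(xs, ys, K=1000, seed=0):
    """Return the observed N-distance and a permutation p-value (assumed form)."""
    rng = random.Random(seed)
    n = len(xs)
    observed = n_distance(xs, ys)
    pooled = list(xs) + list(ys)
    exceed = 0
    for _ in range(K):
        rng.shuffle(pooled)  # random permutation of the 2n vectors
        if n_distance(pooled[:n], pooled[n:]) >= observed:
            exceed += 1
    # The non-permuted case is counted as one of the K + 1 configurations.
    return observed, (exceed + 1) / (K + 1)
```

The cost is of order *K n*<sup>2</sup> kernel evaluations, so for long vectors it may pay to precompute the matrix of pairwise distances once and only permute its indices.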

We also recommend a simpler test in which each sample is split randomly into three subsamples $\tilde{x}, \tilde{x}', \tilde{x}''$ ($\tilde{y}, \tilde{y}', \tilde{y}''$, respectively) of size *n*/3 (assuming it is an integer). Then put, cf. (12),

$$\begin{aligned} U\_{i} &= \mathcal{L}(\tilde{x}\_{i}, \tilde{y}\_{i}) - \mathcal{L}(\tilde{x}\_{i}, \tilde{x}\_{i}'), \\ V\_{i} &= \mathcal{L}(\tilde{y}\_{i}', \tilde{y}\_{i}'') - \mathcal{L}(\tilde{x}\_{i}'', \tilde{y}\_{i}''), \end{aligned} \tag{4}$$

*i* = 1, ..., *n*/3. The null hypothesis is now equivalent to the hypothesis that *U*<sub>*i*</sub>, *V*<sub>*i*</sub> come from the same distribution, which can be tested by an arbitrary univariate two-sample test, e.g. a Kolmogorov-Smirnov test (using the stats package in the R language (Ihaka & Gentleman, 1996)), whose statistic has the form

$$\max\_{x} \left| H\_n^{U}(x) - H\_n^{V}(x) \right|,$$

where *H*<sub>*n*</sub><sup>*U*</sup>(*x*) and *H*<sub>*n*</sub><sup>*V*</sup>(*x*) are the empirical distribution functions of *U*<sub>1</sub>, ..., *U*<sub>*n*</sub> and *V*<sub>1</sub>, ..., *V*<sub>*n*</sub>, respectively.
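As a rough illustration of the splitting test, the following Python sketch builds the variables *U*<sub>*i*</sub>, *V*<sub>*i*</sub> and the two-sample Kolmogorov-Smirnov statistic. It assumes the form *V*<sub>*i*</sub> = L(ỹ′<sub>*i*</sub>, ỹ″<sub>*i*</sub>) − L(x̃″<sub>*i*</sub>, ỹ″<sub>*i*</sub>), so that *U*<sub>*i*</sub> and *V*<sub>*i*</sub> share no observation; the names `split_statistics` and `ks_statistic` are hypothetical, and the *p*-value computation (done in R in the text) is deliberately left out.

```python
import math
import random


def euclid(s, t):
    """Kernel (3): Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def split_statistics(xs, ys, seed=0):
    """Randomly split each sample into three parts and form U_i, V_i as in (4)."""
    rng = random.Random(seed)
    xs, ys = list(xs), list(ys)
    rng.shuffle(xs)
    rng.shuffle(ys)
    r = len(xs) // 3  # subsample size n/3 (remainder, if any, is dropped)
    x, x1, x2 = xs[:r], xs[r:2 * r], xs[2 * r:3 * r]
    y, y1, y2 = ys[:r], ys[r:2 * r], ys[2 * r:3 * r]
    U = [euclid(x[i], y[i]) - euclid(x[i], x1[i]) for i in range(r)]
    V = [euclid(y1[i], y2[i]) - euclid(x2[i], y2[i]) for i in range(r)]
    return U, V


def ks_statistic(u, v):
    """Two-sample Kolmogorov-Smirnov statistic max_x |H^U(x) - H^V(x)|."""
    def ecdf(sample, t):
        return sum(1 for s in sample if s <= t) / len(sample)
    # The supremum is attained at one of the observed values.
    return max(abs(ecdf(u, t) - ecdf(v, t)) for t in set(u) | set(v))
```

The resulting statistic would then be compared with Kolmogorov-Smirnov critical values, for example via `ks.test` in R.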

For small values of *n*, i.e. *n* < 120, however, so much information is lost when the samples are split to a size of *n*/3 that the use of the asymptotic statistics for the Kolmogorov-Smirnov test is not recommended (Buening & Trenkler, 1978, p.135). The test based on splitting is distribution-free (independent of the underlying distribution of observations), but it has smaller power than the permutation test, see (Klebanov, 2005). Namely, simulations with samples from multivariate normal distributions and location alternatives showed that the splitting test has about the same power as the permutation test applied to a sample three times smaller. For the one-dimensional case with samples from the normal distribution and location alternatives, the power of the permutation N-test is very close to that of the optimal *t*-test. However, for samples from a mixture of normal distributions the N-test may be more powerful than the *t*-test. In all these situations the permutation N-test is more powerful than the Kolmogorov-Smirnov test.

#### **2.2 Vector approach – the spatial distribution of particles**

Here we do not evaluate the measurement directly but first the measured information is transformed.

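To test the difference in spatial distribution of particles we use *m* mutual characteristics of particle centroids, among them:

a) a distribution function of the nearest neighbour distance (*G*-function) (Tewari & Gokhale, 2006b),

b) a contact distribution function (*F*-function) (Tewari & Gokhale, 2006a),

c) a pair correlation function (*pcf*) (Ohser & Mücklich, 2000).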


Concerning the spatial distribution, we distinguish complete independence (CI), attraction (clustering) and repulsion (regularity). The functions *F* and *G* coincide under CI; for clustered patterns the graph of *G* lies to the left of *F*, while for regular patterns *F* lies to the left of *G*. These are distance characteristics, while *pcf* is a second-order characteristic, being identically equal to 1 under CI. Peaks of *pcf* correspond to typical distances between pairs of points.

The edge-corrected estimators of these functions (Kaplan-Meier estimators for *F*, *G*, Ripley's estimator for *pcf*, using the spatstat package in the R language) are obtained. We do the estimation in each of *n* windows for both microstructures *A*, *B*. From the estimated curves we construct vectors *G*<sub>*j*</sub> = *G*(*j*Δ), *F*<sub>*j*</sub> = *F*(*j*Δ), *pcf*<sub>*j*</sub> = *pcf*(*j*Δ), *j* = 1, ..., *k*, where Δ > 0 is a given step and *k*Δ the range considered. Choosing *m* of these three characteristics we construct 2*n* vectors of size *km*, cf. (2). If the number of independent windows *n* is smaller than 120, we evaluate N in (2). Using *K* random permutations of these 2*n* vectors we perform the permutation test of

*H*<sup>0</sup> : microstructures *A*, *B* have the same distribution of characteristics involved,

exactly as in method (I) above. If *n* > 120, we evaluate (4) and use the Kolmogorov-Smirnov test. The tests do not depend much on Δ if it is small. All this is called method (II).
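The construction of the data vectors in method (II) can be illustrated by a short Python sketch. It assumes that the estimated curves are available as callables (in practice they would come from an estimator such as those in spatstat); the name `curve_vector` and the toy curves in the usage note are hypothetical.

```python
def curve_vector(curves, delta, k):
    """Stack the values f(j * delta), j = 1..k, of the m chosen curves.

    The ordering follows (1): all k values of the first characteristic,
    then all k values of the second one, and so on (vector of size k * m).
    """
    return [f(j * delta) for f in curves for j in range(1, k + 1)]
```

For example, `curve_vector([lambda r: r, lambda r: 2 * r], 0.5, 2)` gives `[0.5, 1.0, 1.0, 2.0]`. One such vector per window then enters the N-distance (2) in the same way as the vectors of individual particle parameters.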

The characteristics of the spatial distribution depend on the intensities *λ*<sub>*A*</sub>, *λ*<sub>*B*</sub> of particle centroids of microstructures *A*, *B*, respectively. The intensity is the mean number of particle centroids per unit volume and it is estimated as

$$
\lambda\_A = \frac{n\_A}{|W|}, \ \lambda\_B = \frac{n\_B}{|W|},
$$

where |*W*| is the size of the window and *n*<sub>*A*</sub>, *n*<sub>*B*</sub> are the corresponding numbers of particle centroids observed in the window. Clearly, microstructures with different intensities have different nearest neighbour distances, etc. Consider the problem of investigating the difference between *A* and *B* purely in the spatial distribution, independently of the different intensities. For the case $n\_A \neq n\_B$, we can scale the image window *B* by $\sqrt{n\_B/n\_A}$. In the planar case this multiplies the window area by $n\_B/n\_A$, so the transformation leads to the same estimated density of particles of *A* and transformed *B* (in windows of different size). Then we evaluate the functions *F*, *G*, *pcf* and continue by testing based on N-distances as above; this is our method (III).
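The rescaling step of method (III) can be checked numerically with the following minimal Python sketch. It assumes planar data with centroids given as (x, y) pairs in a window of known area; the function name `rescale_pattern` is hypothetical.

```python
import math


def rescale_pattern(points_b, window_area, n_a):
    """Scale pattern B linearly by sqrt(n_B / n_A) (planar case assumed).

    Returns the scaled points, the scaled window area and the resulting
    estimated intensity of B, which should equal n_A / window_area.
    """
    n_b = len(points_b)
    c = math.sqrt(n_b / n_a)                  # linear scaling factor
    scaled = [(c * x, c * y) for x, y in points_b]
    scaled_area = c * c * window_area         # area grows by n_B / n_A
    return scaled, scaled_area, n_b / scaled_area
```

With `n_a = 100` centroids in a unit window for *A* and 400 centroids for *B*, the factor is 2, the window of *B* grows to area 4, and the estimated intensity of the rescaled *B* becomes 100, the same as for *A*.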

Consider finally the problem of testing the difference in the density of particles. Here we need a parametric model and we restrict it to a point process model with no interactions (Poisson process). Under the assumption that both microstructures can be modeled by a stationary Poisson process, there is a theoretical test of the hypothesis *H*<sub>0</sub> : *λ*<sub>*A*</sub> = *λ*<sub>*B*</sub>. We reject *H*<sub>0</sub> at the significance level *α* if the statistic

$$T = \frac{|n\_A - n\_B|}{\sqrt{n\_A + n\_B}} > u\_{1-\frac{\alpha}{2}}\tag{5}$$

(see (Ng et al., 2007)), where for 0 < *a* < 1, *u*<sub>*a*</sub> denotes the *a*-quantile of the standard Gaussian distribution.
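A minimal Python sketch of the test (5) follows, assuming equal window sizes for *A* and *B* as in the intensity estimates above. The function name `poisson_intensity_test` is hypothetical; the Gaussian quantile is taken from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist


def poisson_intensity_test(n_a, n_b, alpha=0.05):
    """Evaluate T from (5) and compare it with the quantile u_{1 - alpha/2}."""
    t = abs(n_a - n_b) / math.sqrt(n_a + n_b)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return t, critical, t > critical
```

For instance, with 400 and 500 centroids the statistic is 100/30 ≈ 3.33, which exceeds the 0.975-quantile of about 1.96, so equality of intensities would be rejected at the 5 % level.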

#### **2.3 Functional data approach**

In Subsection 2.2 we dealt in fact with functions (*F*, *G*, *pcf*) which describe the spatial distribution of particles. Since in the computer we always have discrete data, i.e. a finite number of values of a function, typically at equidistant argument points, we used the vector analysis for testing the null hypothesis by means of N-distances. However, within this theory it is also possible to deal with functions directly; this approach belongs to the field of statistical analysis of functional data.

In the functional data approach we used the Bernstein polynomials (Korovkin, 2001) as a suitable approximation for the corresponding functions *F*, *G*, *pcf*. Recall that the Bernstein polynomial of degree *n* for the function *f*(*x*) is defined as

$$B\_n(x; f) = \sum\_{j=0}^n f\left(\frac{j}{n}\right) \binom{n}{j} x^j (1-x)^{n-j}.$$
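The definition can be evaluated directly, as in the following small Python sketch. It assumes that *f* is given as a callable on [0, 1]; a function observed on an interval [0, *a*] would first have to be rescaled to the unit interval, which is an assumption of this sketch and is not spelled out in the text. The name `bernstein` is hypothetical.

```python
from math import comb


def bernstein(f, n, x):
    """Value at x of the Bernstein polynomial of degree n for f on [0, 1]."""
    return sum(f(j / n) * comb(n, j) * x ** j * (1 - x) ** (n - j)
               for j in range(n + 1))
```

Only the values *f*(*j*/*n*) at equidistant points enter the polynomial, which fits the discrete form in which the estimated curves are available. As a check, linear functions are reproduced exactly, while for *f*(*x*) = *x*<sup>2</sup> the polynomial equals *x*<sup>2</sup> + *x*(1 − *x*)/*n*.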


The test for the null hypothesis is constructed by means of N-distances, with the strongly negative definite kernel $\mathcal{L}$ for two functions $f(x)$ and $g(x)$ defined on an interval $[0, a]$ given by

$$\mathcal{L}(f, g) = \left( \int_0^a (f(x) - g(x))^2 \, dx \right)^{1/2}.$$

In this case the empirical analog of the N-distance is defined as

$$\mathcal{N} = \left(\frac{2}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} \mathcal{L}(F_i^A, F_j^B) - \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} \mathcal{L}(F_i^A, F_j^A) - \frac{1}{M^2} \sum_{i=1}^{M} \sum_{j=1}^{M} \mathcal{L}(F_i^B, F_j^B)\right)^{1/2}. \tag{6}$$

Here $F_j^A$ and $F_j^B$ are the Bernstein polynomials for the function *F* (correspondingly, *G* or *pcf*), constructed for the *j*th window of microstructures *A* and *B*.

To compare microstructure A with microstructure B we use a permutation test: we combine the functions $F_j^A$ and $F_j^B$ into one long vector of functions, permute it at random, split the permuted vector into two parts, and compute the N-distance between the two parts. This operation has to be repeated many times, which fast computers make feasible. This completes method (IV).
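The permutation procedure of method (IV) can be sketched in a few lines. The following is a minimal illustration, not the implementation used for the results in this chapter. It assumes that each function has been sampled on a common equidistant grid with step `dx` (so the integral in $\mathcal{L}$ becomes a Riemann sum), and it uses the common convention of reporting the *p*-value as the proportion of permuted statistics at least as large as the observed one. The function names and the `n_perm` and `seed` parameters are hypothetical.

```python
import numpy as np

def pairwise_l2(F, dx):
    """Matrix of kernel values L(F_i, F_j), approximated by a Riemann sum."""
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1) * dx)

def n_distance_from_matrix(D, ia, ib):
    """Empirical N-distance (6) between the groups indexed by ia and ib."""
    cross = D[np.ix_(ia, ib)].mean()
    within_a = D[np.ix_(ia, ia)].mean()
    within_b = D[np.ix_(ib, ib)].mean()
    # max(..., 0) only guards against tiny negative rounding errors
    return np.sqrt(max(2.0 * cross - within_a - within_b, 0.0))

def permutation_test(FA, FB, dx, n_perm=999, seed=0):
    """Return the observed N-distance and a permutation p-value.

    FA, FB: arrays of shape (M, grid points), one sampled function per window.
    """
    pooled = np.vstack([FA, FB])
    D = pairwise_l2(pooled, dx)          # computed once, reused for every permutation
    m = len(FA)
    idx = np.arange(len(pooled))
    observed = n_distance_from_matrix(D, idx[:m], idx[m:])
    rng = np.random.default_rng(seed)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(idx)      # random relabelling of the windows
        if n_distance_from_matrix(D, perm[:m], perm[m:]) >= observed:
            exceed += 1
    # assumed p-value convention: (exceedances + 1) / (permutations + 1)
    return observed, (exceed + 1) / (n_perm + 1)
```

Since the kernel values depend only on pairs of functions, the distance matrix is computed once and each permutation merely re-indexes it, which keeps the repeated evaluation cheap.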

Further, using the functional data approach, we suggest a new comparison technique, qualitatively different from the previous ones. Suppose again that we have two microstructures (A and B), observed in $M$ windows. Denote by $n_j$ the number of particles in the $j$th window from microstructure A, and by $k_j$ the corresponding number from B. The coordinates of the particle centroids are denoted by $(X_1^{(j)}, Y_1^{(j)}), \dots, (X_{n_j}^{(j)}, Y_{n_j}^{(j)})$ for the $j$th window from A, and by $(U_1^{(j)}, V_1^{(j)}), \dots, (U_{k_j}^{(j)}, V_{k_j}^{(j)})$ for the $j$th window from B ($j = 1, 2, \dots, M$).

Method (V) is based on a smoothing procedure in each window: a discrete two-dimensional distribution concentrated in the particle centroids is convolved with a two-dimensional Gaussian distribution with zero mean vector and standard deviations $\sigma_j(A) = 1/\sqrt[4]{n_j}$ and $\sigma_j(B) = 1/\sqrt[4]{k_j}$, i.e. we pass to the functions

$$\mu_j(x, y) = \frac{1}{\sigma_j^2(A)} \frac{1}{n_j} \sum_{s=1}^{n_j} K\left(\frac{x - X_s^{(j)}}{\sigma_j(A)}, \frac{y - Y_s^{(j)}}{\sigma_j(A)}\right) \tag{7}$$

for A, and

$$\nu_j(x, y) = \frac{1}{\sigma_j^2(B)} \frac{1}{k_j} \sum_{s=1}^{k_j} K\left(\frac{x - U_s^{(j)}}{\sigma_j(B)}, \frac{y - V_s^{(j)}}{\sigma_j(B)}\right) \tag{8}$$

for B. Without loss of generality, we may suppose that the window is the unit square $Q = \{(x, y) : 0 < x < 1,\ 0 < y < 1\}$. Define the strongly negative definite kernel $\mathcal{L}$ for two functions $f(x, y)$ and $g(x, y)$ given on $Q$ as

$$\mathcal{L}(f, g) = \left(\int_0^1 \int_0^1 (f(x, y) - g(x, y))^2 \, dx \, dy\right)^{1/2}.$$

In this case the empirical analog of the N-distance (6) is used with $\mu_i$, $\nu_j$ as arguments of $\mathcal{L}$. To compare the microstructures A and B we again use the permutation test: we combine the functions $\mu_j$ and $\nu_j$ into one long vector, permute it at random, split the permuted vector into two parts, and compute the N-distance between the two parts. This operation has to be repeated many times.
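The smoothing step (7) and the kernel $\mathcal{L}$ on the unit square can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the code behind the reported results: $K$ is taken to be the standard two-dimensional Gaussian density, the smoothed function is evaluated at the cell midpoints of a regular grid, and the double integral is replaced by a midpoint sum. The names `smooth_pattern`, `l2_distance` and the `grid_size` parameter are hypothetical.

```python
import numpy as np

def smooth_pattern(points, grid_size=32):
    """Evaluate the smoothed point pattern (7) on a grid over the unit square.

    points: array of shape (n, 2) with centroid coordinates in (0, 1)^2.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    sigma = n ** -0.25                                   # sigma_j = 1 / n_j^(1/4)
    axis = (np.arange(grid_size) + 0.5) / grid_size      # cell midpoints
    gx, gy = np.meshgrid(axis, axis, indexing="ij")
    # standardized offsets of every grid node from every centroid, shape (n, grid, grid)
    u = (gx[None] - points[:, 0, None, None]) / sigma
    v = (gy[None] - points[:, 1, None, None]) / sigma
    kernel = np.exp(-0.5 * (u ** 2 + v ** 2)) / (2.0 * np.pi)   # assumed form of K
    return kernel.sum(axis=0) / (n * sigma ** 2)

def l2_distance(f, g):
    """Midpoint-rule approximation of the kernel L(f, g) on the unit square."""
    cell_area = 1.0 / f.size
    return np.sqrt(np.sum((f - g) ** 2) * cell_area)
```

Evaluating `smooth_pattern` for every window of A and B gives the gridded functions $\mu_j$ and $\nu_j$, and `l2_distance` then supplies the kernel values entering (6).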

A more general situation can be studied as well. We need not consider only the coordinates (*X*, *Y*) of the particle centroids, but also individual characteristics of the particles. Then, instead of the two-dimensional vector (*X*, *Y*), we have vectors of higher dimension, and instead of the functions *μ* and *ν* of two arguments we have corresponding functions of three or more arguments. Theoretically, an arbitrary number of particle characteristics can be considered, but the calculations for more than three arguments are very time consuming. Therefore, as method (VI), we consider the case of three arguments, with the following three parameters as an example: the two coordinates of the particle centroid and the area of the particle (section).

Again, we have two microstructures (A and B), observed in $M$ windows. Denote by $n_j$ the number of particles in the $j$th window from microstructure A, and by $k_j$ the corresponding number from microstructure B. The coordinates of the particle centroids and the particle areas are now denoted by $(X_1^{(j)}, Y_1^{(j)}, Z_1^{(j)}), \dots, (X_{n_j}^{(j)}, Y_{n_j}^{(j)}, Z_{n_j}^{(j)})$ for the $j$th window from A, and by $(U_1^{(j)}, V_1^{(j)}, W_1^{(j)}), \dots, (U_{k_j}^{(j)}, V_{k_j}^{(j)}, W_{k_j}^{(j)})$ for the $j$th window from B ($j = 1, 2, \dots, M$).

We apply a smoothing procedure in each window: a discrete three-dimensional distribution concentrated in the particle centroids and their areas is convolved with a three-dimensional Gaussian distribution with zero mean vector and standard deviations $\sigma_j(A) = 1/\sqrt[4]{n_j}$ and $\sigma_j(B) = 1/\sqrt[4]{k_j}$, i.e. we pass to the functions

$$\mu_j(x, y, z) = \frac{1}{\sigma_j^3(A)} \frac{1}{n_j} \sum_{s=1}^{n_j} K\left(\frac{x - X_s^{(j)}}{\sigma_j(A)}, \frac{y - Y_s^{(j)}}{\sigma_j(A)}, \frac{z - Z_s^{(j)}}{\sigma_j(A)}\right)$$

for A, and

$$\nu_j(x, y, z) = \frac{1}{\sigma_j^3(B)} \frac{1}{k_j} \sum_{s=1}^{k_j} K\left(\frac{x - U_s^{(j)}}{\sigma_j(B)}, \frac{y - V_s^{(j)}}{\sigma_j(B)}, \frac{z - W_s^{(j)}}{\sigma_j(B)}\right)$$

for B.


Without loss of generality, we may suppose that the window is the unit cube $Q = \{(x, y, z) : 0 < x < 1,\ 0 < y < 1,\ 0 < z < 1\}$.

Define the strongly negative definite kernel $\mathcal{L}$ for two functions $f(x, y, z)$ and $g(x, y, z)$ given on $Q$ as

$$\mathcal{L}(f, g) = \left(\int_0^1 \int_0^1 \int_0^1 (f(x, y, z) - g(x, y, z))^2 \, dx \, dy \, dz\right)^{1/2}.$$

Then again the empirical analog of the N-distance (6) is used with $\mu_i$, $\nu_j$ as arguments of $\mathcal{L}$. Further, we apply the same testing procedure as in the two-dimensional case described above.
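For three arguments the smoothed function can be evaluated on a coarse grid over the unit cube, which also shows why the cost grows quickly with the number of arguments. The following sketch rests on assumptions not stated in the text: $K$ is taken as the standard three-dimensional Gaussian density, and the particle areas are mapped onto $[0, 1]$ by a linear min-max rescaling (the hypothetical helper `rescale_area`). Other rescalings are equally possible.

```python
import numpy as np

def rescale_area(area):
    """Map areas linearly onto [0, 1] (hypothetical choice; needs non-constant areas)."""
    area = np.asarray(area, dtype=float)
    return (area - area.min()) / (area.max() - area.min())

def smooth_marked_pattern(features, grid_size=16):
    """Evaluate the smoothed function mu_j(x, y, z) on a grid over the unit cube.

    features: array of shape (n, 3) with rows (x, y, z) in the unit cube,
    where z is the rescaled particle area.
    """
    features = np.asarray(features, dtype=float)
    n, d = features.shape
    sigma = n ** -0.25                                   # sigma_j = 1 / n_j^(1/4)
    axis = (np.arange(grid_size) + 0.5) / grid_size      # cell midpoints
    grids = np.meshgrid(*([axis] * d), indexing="ij")
    norm = (2.0 * np.pi) ** (-d / 2.0)                   # constant of the assumed Gaussian K
    density = np.zeros((grid_size,) * d)
    for point in features:                               # accumulate to keep memory small
        sq = sum(((g - c) / sigma) ** 2 for g, c in zip(grids, point))
        density += norm * np.exp(-0.5 * sq)
    return density / (n * sigma ** d)
```

The grid has `grid_size ** 3` nodes, so the work per window is proportional to the number of particles times the cube of the grid resolution.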

#### **2.4 The power of the test**

In order to understand and describe the properties of the suggested tests based on N-distances, it is necessary to study their power, i.e. the probability of correctly rejecting $H_0$. This can be done by simulating special cases. For the Kolmogorov-Smirnov test used to compare *U*, *V* in (12), a power study is presented in (Klebanov, 2005). We present here another power study, for our test based on permutation testing in the vector approach.


Fig. 2. Estimated probability that the null hypothesis is rejected by the permutation test at the 0.05 significance level, given the alternative of location (upper left), scale (upper right) and correlation (lower left and right).

Consider a $k$-dimensional Gaussian distribution with mean $(\mu, \dots, \mu)$ and variance matrix terms $\Sigma_{ii} = \sigma^2$, $\Sigma_{ij} = \rho\sigma^2$, $i \neq j$, $\rho \in [0, 1]$. This distribution can be simulated as

$$X_j = \sigma \sqrt{\rho}\, Z + \sigma \sqrt{1 - \rho}\, Y_j + \mu, \quad j = 1, \dots, k,$$

where $Z, Y_1, \dots, Y_k$ are independent identically distributed standard Gaussian random variables. Two independent random samples of size $n$, with parameters $(\mu_1, \sigma_1, \rho_1)$ and $(\mu_2, \sigma_2, \rho_2)$, respectively, are compared, with the null hypothesis of equal distributions and the alternatives of

(i) location: $\mu_1 \neq \mu_2$; (ii) scale: $\sigma_1 \neq \sigma_2$; (iii) correlation: $\rho_1 \neq \rho_2$.

Numerical results of the simulation and testing are shown in Fig. 2, where $\mu_1 = 0$ and $\mu_2$ is on the horizontal axis (upper left); $\sigma_1 = 1$ and $\sigma_2$ is on the horizontal axis (upper right); $\rho_1 = 0$ and $\rho_2$ is on the horizontal axis (lower left); $\rho_1 = 0.5$ and $\rho_2$ is on the horizontal axis (lower right). The number of windows is $n = 20$, with $k = 100$ grains and 100 permutations, and the results are averaged over 1000 simulations. In the lower right graph the number of windows is 40. The parameters not involved in the alternative are $\mu_1 = \mu_2 = 0$ in (ii), (iii), $\sigma_1 = \sigma_2 = 1$ in (i), (iii), and $\rho_1 = \rho_2 = 0$ in (i), (ii). We can observe that the power function increases most rapidly for the location alternative, less rapidly for the scale alternative, and most slowly for the correlation alternatives.
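The simulation formula for $X_j$ is easy to check numerically: since $Z$ and $Y_j$ are independent with unit variance, each $X_j$ has variance $\sigma^2\rho + \sigma^2(1 - \rho) = \sigma^2$, and two different components share only the term $\sigma\sqrt{\rho}\,Z$, which gives covariance $\rho\sigma^2$. A minimal sketch of the sampler follows. The function name and its `seed` argument are hypothetical, and the code illustrates the formula only, not the full power study.

```python
import numpy as np

def simulate_equicorrelated(n, k, mu=0.0, sigma=1.0, rho=0.0, seed=None):
    """Draw n vectors of dimension k with mean mu, variance sigma^2 and correlation rho."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, 1))      # shared component Z, one draw per vector
    y = rng.standard_normal((n, k))      # independent components Y_1, ..., Y_k
    return sigma * np.sqrt(rho) * z + sigma * np.sqrt(1.0 - rho) * y + mu
```

Two samples drawn with different values of `mu`, `sigma` or `rho` then correspond to the alternatives (i), (ii) and (iii), respectively.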

#### **3. Materials**

Further we present an application of suggested statistical methods. A Czech company AL INVEST Bˇridliˇcná, a.s. provided five Al-Mn alloys denoted *A*, *C*, *L*, *P*, *Z*, the composition of which is in Table 1. The alloys with high manganese contents are *A*, *C*, *L*, *P*, high silicon contents have *A*, *C* and *L*, they differ in the zinc contents present in *C* and not present in *A*, *L*. Considering the high solubility of Zn in Al, all Zn is dissolved in aluminium matrix. Therefore, 8 Will-be-set-by-IN-TECH

0.0 0.2 0.4 0.6 0.8 1.0

0.05 0.10 0.15 0.20 0.25

Fig. 2. Estimated probability, that the null hypothesis is rejected by the permutation test on 0.05 significance level, given the alternative of location (upper left), scale (upper right),

Consider a *k*−dimensional Gaussian distribution with mean (*μ*,..., *μ*), and variance matrix

where *Z*,*Y*1,...,*Yk* are independent identically distributed standard Gaussian random variables. Two independent random samples of size *n*, with parameters (*μ*1, *σ*1, *ρ*1),(*μ*2, *σ*2, *ρ*2), respectively are compared with null hypothesis of equal distributions

Numerical results of the simulation and testing are in Fig. 2, where it is *μ*<sup>1</sup> = 0, *μ*<sup>2</sup> horizontal axis (upper left); *σ*<sup>1</sup> = 1, *σ*<sup>2</sup> horizontal axis (upper right); *ρ*<sup>1</sup> = 0, *ρ*<sup>2</sup> horizontal axis (lower left); *ρ*<sup>1</sup> = 0.5, *ρ*<sup>2</sup> horizontal axis (lower right). The number of windows is *n* =20, *k* =100 grains, 100 permutations, averaged over 1000 simulations. In the lower right graph the number of windows is 40. The parameters not involved in the alternative are *μ*<sup>1</sup> = *μ*<sup>2</sup> = 0 in (ii), (iii), *σ*<sup>1</sup> = *σ*<sup>2</sup> = 1 in (i), (iii), *ρ*<sup>1</sup> = *ρ*<sup>2</sup> = 0 in (i), (ii). We can observe that for the location alternative the power function increases more rapidly than for the scale and correlation alternatives. For both location and scale alternatives the power function increases

Further we present an application of suggested statistical methods. A Czech company AL INVEST Bˇridliˇcná, a.s. provided five Al-Mn alloys denoted *A*, *C*, *L*, *P*, *Z*, the composition of which is in Table 1. The alloys with high manganese contents are *A*, *C*, *L*, *P*, high silicon contents have *A*, *C* and *L*, they differ in the zinc contents present in *C* and not present in *A*, *L*. Considering the high solubility of Zn in Al, all Zn is dissolved in aluminium matrix. Therefore,

<sup>1</sup> <sup>−</sup> *<sup>ρ</sup>Yj* <sup>+</sup> *<sup>μ</sup>*, *<sup>j</sup>* <sup>=</sup> 1, . . . , *<sup>k</sup>*,

terms <sup>Σ</sup>*ii* <sup>=</sup> *<sup>σ</sup>*2, <sup>Σ</sup>*ij* <sup>=</sup> *ρσ*2, *<sup>i</sup>* �<sup>=</sup> *<sup>j</sup>*, *<sup>ρ</sup>* <sup>∈</sup> [0, 1]. This distribution can be simulated as

√*ρ<sup>Z</sup>* + *<sup>σ</sup>*

(i) location: *μ*<sup>1</sup> �= *μ*2. (ii) scale: *σ*<sup>1</sup> �= *σ*2. (iii) correlation: *ρ*<sup>1</sup> �= *ρ*2.

1.0 1.5 2.0 2.5 3.0

0.0 0.2 0.4 0.6 0.8

0.0 0.5 1.0 1.5

0.0 0.2 0.4 0.6 0.8

*Xj* = *σ*

more rapidly than for the correlation alternatives.

0.0 0.2 0.4 0.6 0.8 1.0

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

and alternatives of

**3. Materials**

correlation (lower left and right).

Fig. 3. A metallographic sample of material (a) A, (b) C, (c) L, (d) P, (e) Z, respectively, a transverse section of the foil. In (a), (b), (c), (alloys with higher contents of Si) particles *α*-Al12(Mn, Fe)3Si prevail. In (d), (e), (alloys with lower contents of Si) particles Al6(Mn, Fe) prevail.

Zn does not participate in second-phase particles and its presence in alloy *C* does not affect its particle volume fraction and size distribution. Alloys *P* and *Z* have lower silicon contents, *P* has lower copper contents while *Z* has lower manganese contents. The most important factor influencing particles volume fraction and size distribution in the set of alloys considered is the content of silicon. Coarse particles are mostly primary, undissolved particles of *α*-Al12(Mn,

Alloy MeanArea Standard Deviation Density Total number A 0.696 0.864 0.073 19 773 C 0.683 0.873 0.070 18 834 L 0.689 0.796 0.080 21 611 P 1.080 1.287 0.047 12 605 Z 1.223 1.386 0.041 10 997 Table 2. The mean area of observed particle sections (in [*μ*m2]), their density (in [*μ*m−2]) and

Statistical Tests Based on the Geometry of Second Phase Particles 469

Fig. 5. Histograms of areas of particle sections in the first window of each material. In a) we

Alloy Mean Number Standard Deviation A 988,65 44.76 C 941,7 53.50 L 1080,55 41.36 P 630,25 46.55 Z 549,85 33.47

Table 3. The mean number of observed particle sections and their standard deviation in one

20 windows) of the functions *F*, *G* and *pcf* evaluated for all materials. We obtain the results that for the material *L*, *A*, *C* these are greatly different from the results for materials *P* and *Z*. The estimators of *pcf* of materials *P* and *Z* are practically the same and the differences of

have *A*, *C*, *L* and in b) there are *P* and *Z*.

window (evaluated from all 20 windows).

total number for materials *A*, *C*, *L*, *P*, *Z*, evaluated from all 20 windows.

Fe)3Si and Al6(Mn, Fe). Fine particles, present especially for alloys with a higher content of Si, are mainly precipitates *α*-Al12(Mn, Fe)3Si, cf. (Slámová et al., 2006).

All alloys were twin-roll cast in strips of 8.5 mm in thickness. All specimens were homogenized at high temperature after a 35% reduction in thickness and then cold rolled to thickness of 0.4 mm. The samples of 0.4 mm thickness were annealed again at 350 ◦C in order to increase the ductility of the material so as to facilitate the cold rolling up to the final foil thickness of 0.10 mm.


Table 1. Composition of experimental materials [wt. %].

Metallographic samples after grinding and polishing were observed by optical microscopy, see Fig. 3, where the vertical size of the window corresponds to the total thickness of the foil. Twenty windows along the foil, see Fig. 4, with distance 1 mm between neighbouring ones were measured by an image analyzer. The number of particles and individual parameters of each particle were measured in each window and for each material. The individual parameters are Area, EqDiam (i.e. diameter of the circle with the same area as parameter Area), Minimal Feret, Maximal Feret (extremes of the breadth, Ohser & Mücklich (2000), of a particle w.r.t. directions). The Shape Factor can be evaluated as a fraction of Minimal Feret and Maximal Feret. Parameters Area, EqDiam and Shape Factor were used to the input data to our test.


between neighbouring windows is 1 mm.

#### **4. Numerical results**

The preliminary analysis concerns the mean size of particle sections and their density, see Tables 2, 3, apparently *A*, *C*, *L* differ from *P* and *Z*, see also Fig.5. For the testing, since the number of windows *n* = 20 is small, we use the permutation test described in Section 2. The results of method (I) are in Table 4. The microstructures do not differ in the shape factor of particles. Concerning the particles size we observe again two separated groups {*A*,*C*, *L*} and {*P*, *Z*} (according to silicon contents). Between them there is a significant difference, while within the groups this is not the case. Nevertheless in Table 4 we observe a difference in pairs *A* − *L*, *P* − *Z*, while we cannot reject the null hypothesis in pairs *A* − *C*, *C* − *L*.

For the methods (II)-(IV) of the spatial distribution of particle section centroids first the *F*, *G* and *pcf* functions were estimated. The estimators of functions *F*, *G* and *pcf* for all windows of material *P* are presented in Fig. 6, i.e. in each figure a), b), c) there are 20 graphs. We observe a small variability of the estimators. Let us note that there is a similarly small variability of these estimators in all other materials. In Fig. 7 we compare the average estimators (from 10 Will-be-set-by-IN-TECH

Fe)3Si and Al6(Mn, Fe). Fine particles, present especially for alloys with a higher content of

All alloys were twin-roll cast in strips of 8.5 mm in thickness. All specimens were homogenized at high temperature after a 35% reduction in thickness and then cold rolled to thickness of 0.4 mm. The samples of 0.4 mm thickness were annealed again at 350 ◦C in order to increase the ductility of the material so as to facilitate the cold rolling up to the final

> Alloy Mn Si Fe Cu Zn Mn+Si+Fe (Mn+Si)/Fe Mn/Si A 1.09 0.54 0.47 0.16 0.004 2.1 3.5 2.02 C 1.02 0.56 0.53 0.15 1.02 2.1 3.1 1.8 L 1.02 0.59 0.48 0.11 0.01 2.1 3.3 1.7 P 1.01 0.20 0.61 0.04 - 1.8 2.0 5.1 Z 0.86 0.10 0.61 0.16 - 1.6 1.6 8.7

Metallographic samples after grinding and polishing were observed by optical microscopy, see Fig. 3, where the vertical size of the window corresponds to the total thickness of the foil. Twenty windows along the foil, see Fig. 4, with distance 1 mm between neighbouring ones were measured by an image analyzer. The number of particles and individual parameters of each particle were measured in each window and for each material. The individual parameters are Area, EqDiam (i.e. diameter of the circle with the same area as parameter Area), Minimal Feret, Maximal Feret (extremes of the breadth, Ohser & Mücklich (2000), of a particle w.r.t. directions). The Shape Factor can be evaluated as a fraction of Minimal Feret and Maximal Feret. Parameters Area, EqDiam and Shape Factor were used to the input data

Fig. 4. Scheme of sampling windows along the foil of thickness 0.1 mm. The distance

*A* − *L*, *P* − *Z*, while we cannot reject the null hypothesis in pairs *A* − *C*, *C* − *L*.

The preliminary analysis concerns the mean size of particle sections and their density, see Tables 2, 3, apparently *A*, *C*, *L* differ from *P* and *Z*, see also Fig.5. For the testing, since the number of windows *n* = 20 is small, we use the permutation test described in Section 2. The results of method (I) are in Table 4. The microstructures do not differ in the shape factor of particles. Concerning the particles size we observe again two separated groups {*A*,*C*, *L*} and {*P*, *Z*} (according to silicon contents). Between them there is a significant difference, while within the groups this is not the case. Nevertheless in Table 4 we observe a difference in pairs

For the methods (II)-(IV) of the spatial distribution of particle section centroids first the *F*, *G* and *pcf* functions were estimated. The estimators of functions *F*, *G* and *pcf* for all windows of material *P* are presented in Fig. 6, i.e. in each figure a), b), c) there are 20 graphs. We observe a small variability of the estimators. Let us note that there is a similarly small variability of these estimators in all other materials. In Fig. 7 we compare the average estimators (from

Si, are mainly precipitates *α*-Al12(Mn, Fe)3Si, cf. (Slámová et al., 2006).

Table 1. Composition of experimental materials [wt. %].

between neighbouring windows is 1 mm.

foil thickness of 0.10 mm.

to our test.

**4. Numerical results**


Table 2. The mean area of observed particle sections (in [*μ*m2]), their density (in [*μ*m−2]) and total number for materials *A*, *C*, *L*, *P*, *Z*, evaluated from all 20 windows.

Fig. 5. Histograms of areas of particle sections in the first window of each material. In a) we have *A*, *C*, *L* and in b) there are *P* and *Z*.


Table 3. The mean number of observed particle sections and their standard deviation in one window (evaluated from all 20 windows).

20 windows) of the functions *F*, *G* and *pcf* evaluated for all materials. We obtain the results that for the material *L*, *A*, *C* these are greatly different from the results for materials *P* and *Z*. The estimators of *pcf* of materials *P* and *Z* are practically the same and the differences of

Fig. 7. Graphs of (a) *F*-function, (b) *G*-function and (c) *pcf* for all three materials obtained by averaging the estimators from 20 windows. We observe that for materials *L*, *A*, *C* they look

Statistical Tests Based on the Geometry of Second Phase Particles 471

A-C 0.6 0.166 0.034 0.999 A-L 0.004 < 0.001 < 0.001 0.784 A-P,A-Z < 0.001 < 0.001 < 0.001 < 0.001 C-L < 0.001 < 0.001 < 0.001 0.584 C-P,C-Z < 0.001 < 0.001 < 0.001 < 0.001 L-P,L-Z < 0.001 < 0.001 < 0.001 < 0.001 P-Z < 0.001 < 0.001 < 0.001 0.545

Table 5. The *p*-values for two-sample tests of the spatial distribution (method II) evaluated for all three parameters together (*m* = 3, column All) and then for single parameters (*m* = 1)

Let us analyze the number of particles for the microstructures *P* and *Z*, we can test the difference between *nP* = 13866, *nZ* = 11945 from all 20 windows, thus in (5) we have *T* = 12 > 1.96 and we reject an *H*<sup>0</sup> of equal particle density at the significance level *α* = 0.05. Here the Poisson process model assumption is violated, as suggested by the shape of *pcf* in Fig. 6 we have a type of a regular model, i.e. mild repulsion since there are nonoverlapping particles around the centroids. Clearly, if we reject the null hypothesis *λ<sup>P</sup>* = *λ<sup>Z</sup>* for the Poisson process model using (5), we reject it for the regular model too, since it is less dispersed, i.e. the

Further a finer analysis of the spatial distribution of particles is applied using method (III). If we eliminate the effect of particle density on the spatial distribution by means of the scale change as suggested in Section 2, the results change as presented in Table 6. The pure effect of the spatial distribution of particle centroids is such that there is no significant difference between materials *P* − *Z*, *A* − *C*. But this is moreover the case also for individual functions *F*

with *n* = 20, *k* = 11, � = 1 *μm* and with the number of permutations equal to 5000.

numbers of particles observed in windows vary more slowly.

All F G pcf

differently from *P*, *Z*.


| Pair | All | Area | Eqdiam | ShapeFactor |
|---|---|---|---|---|
| A-C | 0.343 | 0.369 | 0.335 | 0.050 |
| A-L | 0.040 | 0.042 | 0.035 | 0.203 |
| A-P | < 0.001 | < 0.001 | < 0.001 | 0.565 |
| A-Z | < 0.001 | < 0.001 | < 0.001 | 0.529 |
| C-L | 0.302 | 0.355 | 0.288 | 0.046 |
| C-P | < 0.001 | < 0.001 | < 0.001 | 0.283 |
| C-Z | < 0.001 | < 0.001 | < 0.001 | 0.322 |
| L-P | < 0.001 | < 0.001 | < 0.001 | 0.422 |
| L-Z | < 0.001 | < 0.001 | < 0.001 | 0.488 |
| P-Z | 0.015 | 0.020 | 0.002 | 0.639 |

Table 4. The *p*-values for two-sample tests of individual particle parameters (method I) evaluated for all *m* = 3 parameters together (the column All) and then for single parameters (*m* = 1) with *n* = 20, *k* = 500 and with the number of permutations equal to 5000.

Fig. 6. Estimators of (a) *F*-function, (b) *G*-function and (c) *pcf* for material *P*. Graphs obtained from each of the 20 windows are drawn in the same figures in order to show the small variability of the estimators among the windows.

estimators of the functions *F* and *G* are small, while the corresponding functions of materials *L*, *A*, *C* are shifted to the left. This is caused mostly by the different particle density.

The results of method (II) of the two-sample tests for the spatial distribution in Table 5 lead to the interpretation that there are significant differences in spatial distribution between any two materials from the different groups {*A*, *C*, *L*} and {*P*, *Z*}. It is interesting to observe what happens within the groups. We can see that while the distribution functions *F*, *G* still yield differences, the pair correlation function does not reveal any. Only the pair *C* − *A*, which has the closest silicon contents, does not reveal a difference in any characteristic.

Fig. 7. Graphs of (a) *F*-function, (b) *G*-function and (c) *pcf* for all materials, obtained by averaging the estimators from the 20 windows. For materials *L*, *A*, *C* the graphs look different from those for *P*, *Z*.


Table 5. The *p*-values for two-sample tests of the spatial distribution (method II) evaluated for all three parameters together (*m* = 3, column All) and then for single parameters (*m* = 1) with *n* = 20, *k* = 11, � = 1 *μm* and with the number of permutations equal to 5000.

Let us analyze the number of particles for the microstructures *P* and *Z*. We test the difference between the counts *n<sub>P</sub>* = 13866 and *n<sub>Z</sub>* = 11945 from all 20 windows; in (5) we obtain *T* = 12 > 1.96 and reject the hypothesis *H*<sub>0</sub> of equal particle density at the significance level *α* = 0.05. Here the Poisson process model assumption is violated: as suggested by the shape of the *pcf* in Fig. 6, we have a type of regular model, i.e. mild repulsion, since the particles around the centroids do not overlap. Clearly, if we reject the null hypothesis *λ<sub>P</sub>* = *λ<sub>Z</sub>* for the Poisson process model using (5), we reject it for the regular model too, since the latter is less dispersed, i.e. the numbers of particles observed in the windows vary less.
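The value *T* = 12 can be reproduced with a short calculation. Formula (5) is not restated in this section, so the sketch below assumes it is the usual normal approximation for comparing two Poisson counts observed over equal total window areas, T = |n_P − n_Z| / sqrt(n_P + n_Z). The function name is illustrative only.

```python
import math


def poisson_count_statistic(n_first: int, n_second: int) -> float:
    """Normal-approximation statistic for equality of two Poisson intensities.

    Assumption: both counts come from the same total window area, so under H0
    the difference has mean 0 and variance approximately n_first + n_second.
    """
    return abs(n_first - n_second) / math.sqrt(n_first + n_second)


n_P, n_Z = 13866, 11945  # particle counts from all 20 windows (from the text)
T = poisson_count_statistic(n_P, n_Z)
reject_h0 = T > 1.96  # two-sided critical value at alpha = 0.05
print(f"T = {T:.2f}, reject H0: {reject_h0}")
```

Under this assumption the statistic evaluates to about 11.96, which rounds to the reported *T* = 12 and lies far above the critical value 1.96.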

Further, a finer analysis of the spatial distribution of particles is carried out using method (III). If we eliminate the effect of particle density on the spatial distribution by means of the scale change suggested in Section 2, the results change as presented in Table 6. The pure effect of the spatial distribution of particle centroids is such that there is no significant difference within the pairs *P* − *Z* and *A* − *C*. Moreover, this is also the case for the individual functions *F* (holds for *A* − *P*, *A* − *Z*, *C* − *Z*) and *G* (*A* − *L*, *A* − *Z*, *C* − *L*, *C* − *Z*, *L* − *Z*). That means that some differences between the two groups are removed.

| Pair | All | F | G | pcf |
|---|---|---|---|---|
| A-C | 0.483 | 0.132 | 0.527 | 0.540 |
| A-L | 0.003 | 0.002 | 0.509 | 0.004 |
| A-P | 0.019 | 0.108 | 0.022 | 0.028 |
| A-Z | 0.013 | 0.178 | 0.122 | 0.009 |
| C-L | < 0.001 | < 0.001 | 0.502 | < 0.001 |
| C-P | 0.001 | 0.060 | 0.057 | 0.001 |
| C-Z | 0.005 | 0.195 | 0.158 | 0.003 |
| L-P | < 0.001 | 0.010 | 0.037 | 0.001 |
| L-Z | < 0.001 | 0.005 | 0.256 | < 0.001 |
| P-Z | 0.628 | 0.611 | 0.416 | 0.479 |

Table 6. The *p*-values for two-sample tests of the spatial distribution (with the effect of particle density eliminated, method III). Evaluation for all three parameters together (first column) and then for single parameters (*m* = 1) was obtained with *n* = 20, *k* = 11, � = 1 *μm* and with the number of permutations equal to 5000.

Finally, we present the results of two-sample tests using the functional data approach of Subsection 2.3. First, for the comparison based on the functions *F*, *G*, *pcf* we use the versions with scale change to eliminate the effect of particle density. Results similar to those in Table 6 are expected with this method (IV), see Table 7. Although the individual *p*-values in the two tables differ, the decisions about *H*<sub>0</sub> are almost completely the same.

| Pair | F | G | pcf |
|---|---|---|---|
| A–C | 0.59 | 0.57 | 0.32 |
| A–L | 0.02 | 0.46 | < 0.01 |
| A–P | 0.17 | 0.02 | 0.02 |
| A–Z | 0.42 | 0.13 | < 0.01 |
| C–L | < 0.01 | 0.42 | < 0.01 |
| C–P | 0.07 | 0.04 | < 0.01 |
| C–Z | 0.43 | 0.14 | 0.01 |
| L–P | 0.02 | 0.04 | < 0.01 |
| L–Z | 0.01 | 0.26 | < 0.01 |
| P–Z | 0.58 | 0.44 | 0.46 |

Table 7. The *p*-values for two-sample tests of the spatial distribution (with the effect of particle density eliminated), using the functional data approach, method (IV). Evaluation was obtained with the number of permutations equal to 100.

Further, the tests based on multidimensional smoothing of particle characteristics are applied; these are qualitatively different methods. They are not sensitive to the particle density; on the other hand, they may reveal local inhomogeneities and differences. First we give the results of the comparison by method (V), that is, the *p*-values of the test based on N-distances of the functions (7), (8) for different pairs of microstructures in terms of the particle centroid coordinates only, see Table 8. In many cases, but not in all, the results are similar to those in Tables 6 and 7 (only the spatial distribution is investigated in both methods). Different results are obtained especially for the pairs *A* − *C* and *L* − *P*.

| Row | A-C | A-L | A-P | A-Z | C-L | C-P | C-Z | L-P | L-Z | P-Z |
|---|---|---|---|---|---|---|---|---|---|---|
| top | < 0.01 | 0.0297 | < 0.01 | 0.475 | < 0.01 | < 0.01 | < 0.01 | 0.257 | < 0.01 | 0.059 |
| bottom | < 0.01 | 0.07 | < 0.01 | 0.19 | < 0.01 | < 0.01 | < 0.01 | 0.11 | 0.02 | 0.04 |

Table 8. The *p*-values of the N-test for the comparisons of corresponding coordinates in the functional data approach, method (V). The top row corresponds to the choice $\sigma\_j(A) = 1/\sqrt[4]{n\_j}$ and $\sigma\_j(B) = 1/\sqrt[4]{k\_j}$, the bottom row to the choice $\sigma\_j(A) = \sigma\_j(B) = \frac{1}{3}$.

Finally, a simultaneous analysis of the spatial distribution and an individual particle parameter (area of the section) was performed using the functional data approach, method (VI). The results of the comparison are given in Table 9. We do not consider all pairs of microstructures, since for those from different groups (different silicon contents) the particle areas surely cause the rejection of the null hypothesis. As we can see from Table 9, this is the case also for the other pairs.

| A-L | A-P | P-Z |
|---|---|---|
| < 0.01 | < 0.01 | < 0.01 |

Table 9. The *p*-values of the N-test for the comparisons based on coordinates and areas of the particles in the functional data approach, method (VI).

Fig. 8. Average value of the difference *μ<sub>j</sub>* − *ν<sub>j</sub>* of functions (7), (8) taken from all 20 windows, for the microstructure pairs (a) A–C and (b) A–Z.
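The statement that methods (III) and (IV) lead to almost the same decisions about *H*<sub>0</sub> can be checked directly from the reported *p*-values. The sketch below transcribes the *F*, *G* and *pcf* columns of Tables 6 and 7 and compares the decisions at *α* = 0.05; entries reported as "< 0.001" or "< 0.01" are stored as their upper bounds, which does not change the decision. The variable names are illustrative.

```python
ALPHA = 0.05
FUNCTIONS = ("F", "G", "pcf")

# p-values for (F, G, pcf), transcribed from Table 6 (method III).
table6 = {
    "A-C": (0.132, 0.527, 0.540),
    "A-L": (0.002, 0.509, 0.004),
    "A-P": (0.108, 0.022, 0.028),
    "A-Z": (0.178, 0.122, 0.009),
    "C-L": (0.001, 0.502, 0.001),
    "C-P": (0.060, 0.057, 0.001),
    "C-Z": (0.195, 0.158, 0.003),
    "L-P": (0.010, 0.037, 0.001),
    "L-Z": (0.005, 0.256, 0.001),
    "P-Z": (0.611, 0.416, 0.479),
}

# p-values for (F, G, pcf), transcribed from Table 7 (method IV).
table7 = {
    "A-C": (0.59, 0.57, 0.32),
    "A-L": (0.02, 0.46, 0.01),
    "A-P": (0.17, 0.02, 0.02),
    "A-Z": (0.42, 0.13, 0.01),
    "C-L": (0.01, 0.42, 0.01),
    "C-P": (0.07, 0.04, 0.01),
    "C-Z": (0.43, 0.14, 0.01),
    "L-P": (0.02, 0.04, 0.01),
    "L-Z": (0.01, 0.26, 0.01),
    "P-Z": (0.58, 0.44, 0.46),
}

# A decision differs when exactly one of the two methods rejects H0.
disagreements = [
    (pair, name)
    for pair in table6
    for name, p6, p7 in zip(FUNCTIONS, table6[pair], table7[pair])
    if (p6 < ALPHA) != (p7 < ALPHA)
]
total = len(table6) * len(FUNCTIONS)
print(f"{total - len(disagreements)} of {total} decisions agree")
print("disagreements:", disagreements)
```

With these values, 29 of the 30 decisions coincide; the only difference is the function *G* for the pair C-P, where both *p*-values lie close to 0.05.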

### **5. Concluding remarks**

This chapter extends and continues the research started in (Beneš et al., 2009). New statistical methods are developed for the comparison of microstructural images of random objects in metallography and other applications. They are based on an appropriate interaction of approaches from mathematical statistics, image analysis and stochastic geometry. A proper two-sample test derived from N-distances enables one to evaluate a large amount of information from a few observed windows. The tests presented are easy to apply to metallographic images observed by light microscopy and image analysis. In comparison with the above-mentioned paper, we suggest here further methods based on functional data analysis and analyze a broader set of foils from aluminium-manganese based alloys.


The first group of methods, based on vectors of characteristics obtained from image analysis measurements and further transformation of the data, is well established and its results are easy to understand. From the practical point of view, it should be mentioned that N-distances are scale dependent, so that when evaluating qualitatively different information simultaneously (*m* > 1) one has to choose the scales carefully so that they are comparable for all parameters. This is guaranteed in the analysis of spatial distribution, since all three functions used fall within a similar range. In the analysis of particle characteristics, the size scale has to be modified to be comparable with the range of the shape factor. We conclude that the most recommended methods are (I) and (IV) (or (III)).
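The scale dependence mentioned above can be handled by bringing all parameters to a comparable range before the N-distance is computed. The chapter does not state which rescaling was used; the sketch below assumes a min-max rescaling over the pooled sample as one possible choice. The function name and the numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np


def rescale_to_unit_range(sample_a, sample_b):
    """Min-max rescale each parameter (column) using the pooled sample.

    Both inputs have shape (number of particles, m). The same affine map is
    applied to both samples, so every parameter of the pooled data ends up
    in the range [0, 1] and no single parameter dominates the distance.
    """
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    pooled = np.vstack([a, b])
    low = pooled.min(axis=0)
    span = pooled.max(axis=0) - low
    span[span == 0] = 1.0  # guard against constant columns
    return (a - low) / span, (b - low) / span


# Hypothetical values: columns are area, equivalent diameter, shape factor.
sample_a = np.array([[12.0, 3.9, 0.81], [30.0, 6.2, 0.64], [7.5, 3.1, 0.92]])
sample_b = np.array([[45.0, 7.6, 0.55], [18.0, 4.8, 0.77], [9.0, 3.4, 0.88]])
scaled_a, scaled_b = rescale_to_unit_range(sample_a, sample_b)
print(scaled_a)
print(scaled_b)
```

Using the pooled sample for the rescaling keeps the procedure symmetric in the two microstructures, which matters when the rescaled data are later permuted between the samples.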

On the other hand, the newly proposed methods (V) and (VI), based on multidimensional functional data analysis, need further investigation before their usefulness can be claimed. In comparison with method (I), they are able to evaluate different numbers of particles in the windows. Theoretically, they are independent of the dimensionality, and therefore they are of great potential use in multidimensional statistical analysis. At first sight, the corresponding tests seem to be strict and sensitive to local differences both in spatial distribution and in particle characteristics. One must be very careful when interpreting the results of a functional data analysis. It should also be added that functional data analysis combined with permutation testing is more time-consuming (especially in the multivariate case), but feasible when using fast computers and, especially, clusters.

Some of the results obtained by classical and functional data analysis may seem to be contradictory, but this is not the case: the various methods for the comparison of spatial distribution are different in their nature. Consider the pairs of microstructures A-C and A-Z. The functional data method using the functions *μ*, *ν* in (7), (8) rejects the null hypothesis for A-C and does not reject it for A-Z. This conclusion is related to Fig. 8, where we can see the average values of the difference *μ<sub>j</sub>* − *ν<sub>j</sub>* taken from all 20 windows; the range is two times smaller for A-Z than for A-C. The same observation holds for individual windows, too. We also investigated the sensitivity of method (V) with respect to the choice of the bandwidth *σ<sub>j</sub>*. The top row of *p*-values in Table 8 corresponds to the theoretically asymptotically optimal bandwidth. One can observe how the *p*-values change slightly with the broader bandwidth in the bottom row, but there is no evidence of a systematic change.

It is possible to study the statistical properties of the tests by simulations. Concerning the resolution of the test, the power of the variants of the test based on N-distances was compared in (Klebanov, 2005). Besides our study in Subsection 2.4, a large comparative study of the power of several two-sample tests (Kolmogorov-Smirnov, Cramer-von Mises, Anderson-Darling, Wilcoxon, Mann-Whitney, N-distances) was made in (Bakshaev, 2008). It appears that in the multidimensional case the test based on N-distances has the highest power. The results of the application of the two-sample test in metallography can be translated from conclusions about the geometry of the microstructure into conclusions relevant to materials research. Since the production of all materials was based on the same processing, the only difference is in the chemical composition of the alloys. Therefore, we can conclude that a differentiation between high and low silicon contents is apparent, while small differences in composition within the groups {*A*, *C*, *L*} and {*P*, *Z*} do not have a clearly apparent impact on the microstructure. We can observe that while *P* and *Z* have different particle densities, they do not differ in particle size and shape, nor in the pure spatial distribution (interactions).

### **6. Acknowledgement**

The research was supported by the Czech Science Foundation, project GAČR P201/10/0472, and by the Czech Ministry of Education, project MSM 0021620839. We remember Margarita Slámová, who was a coauthor of the paper (Beneš et al., 2009). She died prematurely in 2009 and we missed her greatly during the preparation of this chapter.

### **7. Appendix**


Here we give the mathematical background of N-distances and the related statistical testing; it comes from (Klebanov, 2005). Let $\lbrace\mathfrak{X}, \mathfrak{A}\rbrace$ be a measurable space. A function $\mathcal{L} : \mathfrak{X}^2 \to \mathbb{R}^1$ (with $\mathcal{L}(x, x) = 0$ and $\mathcal{L}(x, y) = \mathcal{L}(y, x)$) is a negative definite kernel on $\mathfrak{X}$ if and only if

$$\int\_{\mathfrak{X}} \int\_{\mathfrak{X}} \mathcal{L}(x, y) h(x) h(y) dQ(x) dQ(y) \le 0 \tag{9}$$

for an arbitrary probability measure $Q$ on $\lbrace\mathfrak{X}, \mathfrak{A}\rbrace$ and a measurable function $h$ on $\mathfrak{X}$ such that $\int\_{\mathfrak{X}} h(x) dQ(x) = 0$. We say that $\mathcal{L}$ is strongly negative definite if the equality in (9) implies $h = 0$ almost surely with respect to the measure $Q$.

Let $\mathcal{L}$ be a strongly negative definite kernel on $\mathfrak{X}$ and let $\mathcal{B}\_{\mathcal{L}}$ be the set of all probability measures $\mu$ on $\lbrace\mathfrak{X}, \mathfrak{A}\rbrace$ for which the integral $\int\_{\mathfrak{X}} \int\_{\mathfrak{X}} \mathcal{L}(x, y) d\mu(x) d\mu(y)$ exists and is finite. For $\mu, \nu \in \mathcal{B}\_{\mathcal{L}}$ put

$$\mathcal{N}(\mu, \nu) = 2\int\_{\mathfrak{X}}\int\_{\mathfrak{X}}\mathcal{L}(x, y) d\mu(x) d\nu(y) - \int\_{\mathfrak{X}}\int\_{\mathfrak{X}}\mathcal{L}(x, y) d\mu(x) d\mu(y) - \int\_{\mathfrak{X}}\int\_{\mathfrak{X}}\mathcal{L}(x, y) d\nu(x) d\nu(y). \tag{10}$$

Then

$$\mathfrak{N}(\mu, \nu) = \left(\mathcal{N}(\mu, \nu)\right)^{1/2} \tag{11}$$

is a distance on $\mathcal{B}\_{\mathcal{L}}$; it is called the N-distance. We use N-distances in two settings: first, in classical data analysis, where $\mathfrak{X} = \mathbb{R}^k$ is the Euclidean space of $k$-dimensional vectors; second, in functional data analysis, where $\mathfrak{X} = L^2$ is the space of square integrable functions.
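As a small worked illustration of (10) and (11), take the real line with the kernel $\mathcal{L}(x, y) = |x - y|$. This kernel is assumed here only for illustration (it is a standard example of a strongly negative definite kernel) and is not necessarily the one used in the chapter. For two point masses $\delta\_a$ and $\delta\_b$ the double integrals reduce to kernel values, so that

$$\mathcal{N}(\delta\_a, \delta\_b) = 2|a - b| - |a - a| - |b - b| = 2|a - b|, \qquad \mathfrak{N}(\delta\_a, \delta\_b) = \sqrt{2|a - b|}.$$

The distance vanishes exactly when $a = b$, in agreement with the requirement that an N-distance separates different probability measures.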

Let $\mathcal{L}(x, y)$ be a strongly negative definite kernel on $\mathbb{R}^k$ and let $X$, $Y$ be two independent random vectors in $\mathbb{R}^k$. Define one-dimensional independent random variables $U$, $V$ by

$$U = \mathcal{L}(X, Y) - \mathcal{L}(X, X'), \qquad V = \mathcal{L}(Y', Y'') - \mathcal{L}(X'', Y''). \tag{12}$$

Here all vectors $X$, $X'$, $X''$, $Y$, $Y'$, $Y''$ are mutually independent and the equalities in distribution $X \stackrel{d}{=} X' \stackrel{d}{=} X''$, $Y \stackrel{d}{=} Y' \stackrel{d}{=} Y''$ hold. We have $X \stackrel{d}{=} Y \iff U \stackrel{d}{=} V \iff \mathfrak{N}(X, Y) = 0$, where $\mathfrak{N}(X, Y)$ denotes the N-distance (11) between the distributions of $X$ and $Y$. Consider testing the hypothesis $H\_0 : X \stackrel{d}{=} Y$ for multivariate random vectors $X$, $Y$. This hypothesis is equivalent to $H\_0' : U \stackrel{d}{=} V$, where $U$, $V$ are random variables taking values in $\mathbb{R}^1$. Consider two independent samples $X\_1, \dots, X\_n$; $Y\_1, \dots, Y\_n$ from general multivariate populations $X$ and $Y$, respectively. A one-dimensional test applied to $U$ and $V$ can proceed in the following ways:

a) split each sample randomly into three equal parts $X$, $X'$, $X''$, $Y$, $Y'$, $Y''$ and use (12); this leads to a loss of information,

b) simulate the samples from $X'$ and $X''$ (as well as from $Y'$ and $Y''$) by independent choices from the observations $X\_1, \dots, X\_n$ (and from $Y\_1, \dots, Y\_n$, correspondingly); thus we do not test the hypothesis $X \stackrel{d}{=} Y$, but the one of the corresponding empirical distributions,

c) use a permutation test with Monte Carlo approximation.
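Variant c) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the implementation used for the tables in this chapter: the kernel is taken to be the Euclidean norm of the difference, the test statistic is the empirical version of (10), and the function names, the number of permutations and the simulated data are all hypothetical.

```python
import numpy as np


def n_distance_sq(x, y):
    """Empirical version of (10) for the assumed kernel L(u, v) = ||u - v||.

    x and y are arrays of shape (sample size, k).
    """
    def mean_kernel(a, b):
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_kernel(x, y) - mean_kernel(x, x) - mean_kernel(y, y)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for H0: X and Y have the same law."""
    rng = np.random.default_rng(seed)
    observed = n_distance_sq(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        # Randomly reassign the pooled observations to the two samples.
        perm = rng.permutation(len(pooled))
        stat = n_distance_sq(pooled[perm[:n]], pooled[perm[n:]])
        if stat >= observed:
            count += 1
    # The observed labelling is counted as one of the permutations.
    return (count + 1) / (n_perm + 1)


# Simulated example data (not measurements from the chapter).
rng = np.random.default_rng(42)
sample_x = rng.normal(0.0, 1.0, size=(40, 3))
sample_y = rng.normal(0.5, 1.0, size=(40, 3))
print(f"p-value: {permutation_p_value(sample_x, sample_y):.3f}")
```

Counting the observed labelling among the permutations keeps the *p*-value strictly positive, so the smallest attainable value is 1/(n_perm + 1). This is consistent with entries such as "< 0.01" in a table based on 100 permutations.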

**20**

**Microstructural Evolution During the Homogenization of Al-Zn-Mg Aluminum Alloys**

Ali Reza Eivani<sup>1,2</sup>, Jie Zhou<sup>2</sup> and Jurek Duszczyk<sup>2</sup>

<sup>1</sup>*Materials Innovation Institute (M2i), Mekelweg 2, 2628 CD Delft* <sup>2</sup>*Department of Materials Science and Engineering, Delft University of Technology, Mekelweg 2, 2628 CD Delft The Netherlands*

### **1. Introduction**

#### **1.1 Background**

Aluminum and aluminum alloys are probably the most ideal materials for extrusion, and they are the most commonly extruded. Most commercially available aluminum alloys can be extruded. Principal applications include parts for the aircraft and aerospace industries, pipes, wires, rods, bars, tubes, hollow shapes and cable sheathing for the building, automotive and electrical industries. Sections can be extruded from heat-treatable or non-heat-treatable low-, medium- and high-strength aluminum alloys [1].

In the last 30 years, the development of aluminum extrusion technology has, in the main, been focused on the billet metallurgy, die design and process control for low- and medium-strength aluminum alloys in the 6xxx series for architectural applications, in order to maximize extrusion speed and at the same time fulfill the requirements in product specifications in terms of dimensions, shape, surface and mechanical properties. As a result, there is a wealth of information available on the relationship between alloy chemistry, microstructure and extrudability of these alloys [2]. In comparison, the fundamental knowledge and extrusion technology, especially those for medium- and high-strength aluminum alloys in the 7xxx series, are rather scarce in the open literature [2].

7xxx series aluminum alloys, used almost exclusively for air transport applications in the past but now increasingly used in rail and road vehicles, must comply with much more stringent performance specifications than 6xxx series aluminum alloys for architectural applications. Although many investigations on the behavior of medium- and high-strength aluminum alloys at individual processing steps have been performed, systematic research linking all these processing steps is lacking, while the extrusion behavior is associated with alloy composition and a series of microstructural evolutions throughout the whole chain of material processing from casting through homogenization to extrusion. Such research is particularly needed for the aluminum extrusion companies that are currently shifting the application fields of extrusions from architecture to ground transport, where medium-strength alloys (7003, 7005, 7010, 7020, 2011, 2017 and 2618) and high-strength alloys (7049, 7050, 7075 and 2024) are increasingly used. This chapter concerns one of the most widely used medium-strength alloys, AA7020, as a representative of Cu-free 7xxx series aluminum alloys. Table 1 shows the nominal chemical composition of the AA7020 aluminum alloy.

### **8. References**

