**3. Results and discussion**

**3.1 Dry matter prediction**

The calibration and prediction statistics for each individual year (Table 1; model predictions in Figure 2) for both harvest locations indicate that FT-NIRS in diffuse reflectance mode has potential as a screening tool to predict %DM of whole 'Hass' avocado fruit. The 2006 and 2007 harvest seasons had lower standard deviations (SD) than the 2008 season for both the Bundaberg and Childers locations. For both harvest locations, the 2008 season gave the best calibration and prediction statistics in terms of the coefficient of determination (R2) and SDR. The RMSEP for each harvest season ranged from 1.29 to 1.49 %DM for Childers and from 1.41 to 1.94 %DM for Bundaberg. The weaker 2006 and 2007 statistics suggest that the fruit from those seasons possibly did not span sufficient variability in physiological attributes to develop a calibration model as suitable as that of the 2008 season, although other biological or environmental effects may have contributed. The number of latent variables is within an acceptable range for the number of samples for all models (Hruschka, 1987; Lammertyn et al., 2000).

| Location - Year | Set | Spectra n (OR) | %DM range | Mean | SD | LV | R2 | RMSECV (CAL) / RMSEP (PRE) | Bias | Slope | SDR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Bu-2006 | All | 629 | 18.2-35.0 | 27.5 | 3.2 | | | | | | |
| Bu-2006 | CAL | 222(2) | 18.2-35.0 | 27.2 | 3.5 | 7 | 0.75 | 1.76 | -0.159 | 0.759 | 2.0 |
| Bu-2006 | PRE | 407 | 20.3-34.2 | 27.6 | 3.0 | 7 | 0.75 | 1.50 | -0.582 | 0.818 | 2.0 |
| Bu-2007 | All | 609 | 19.0-34.4 | 25.7 | 2.7 | | | | | | |
| Bu-2007 | CAL | 211(0) | 19.0-34.4 | 25.7 | 2.8 | 8 | 0.76 | 1.39 | -0.0024 | 0.779 | 2.0 |
| Bu-2007 | PRE | 398(0) | 19.7-32.5 | 25.7 | 2.6 | 8 | 0.70 | 1.41 | 0.112 | 0.754 | 1.8 |
| Bu-2008 | All | 606 | 15.2-35.5 | 25.7 | 5.7 | | | | | | |
| Bu-2008 | CAL | 209(1) | 15.2-35.5 | 25.6 | 5.7 | 6 | 0.90 | 1.76 | -0.0036 | 0.910 | 3.2 |
| Bu-2008 | PRE | 397(0) | 15.6-35.1 | 25.8 | 5.7 | 6 | 0.88 | 1.94 | 0.1526 | 0.865 | 2.9 |
| Ch-2006 | All | 632 | 21.4-39.7 | 29.8 | 3.4 | | | | | | |
| Ch-2006 | CAL | 207(2) | 21.4-39.7 | 30.2 | 3.7 | 9 | 0.82 | 1.57 | 0.006 | 0.829 | 2.4 |
| Ch-2006 | PRE | 425(0) | 21.7-37.9 | 29.5 | 3.3 | 9 | 0.80 | 1.47 | 0.0761 | 0.850 | 2.2 |
| Ch-2007 | All | 609 | 21.9-36.8 | 29.2 | 3.1 | | | | | | |
| Ch-2007 | CAL | 209(0) | 21.9-36.8 | 29.1 | 3.3 | 8 | 0.83 | 1.36 | -0.0098 | 0.842 | 2.4 |
| Ch-2007 | PRE | 400(1) | 22.2-36.2 | 29.2 | 3.0 | 8 | 0.81 | 1.29 | -0.2867 | 0.835 | 2.3 |
| Ch-2008 | All | 608 | 16.1-36.2 | 25.8 | 5.3 | | | | | | |
| Ch-2008 | CAL | 209(2) | 16.1-36.2 | 25.6 | 5.2 | 7 | 0.93 | 1.39 | 0.0098 | 0.934 | 3.8 |
| Ch-2008 | PRE | 399(0) | 16.5-36.1 | 26.0 | 5.4 | 7 | 0.92 | 1.49 | -0.1594 | 0.858 | 3.6 |

Table 1. PLS calibration (CAL) and prediction (PRE) statistics for %DM for whole 'Hass' avocado fruit harvested from Bundaberg (Bu) and Childers (Ch) over the 2006, 2007 and 2008 seasons. *Note: OR = outliers removed; LV = latent variables; n = number of samples.*

Fig. 2. Model predictions plotted against actual constituent values for %DM for (a) Bundaberg 2006 season, (b) Childers 2006 season, (c) Bundaberg 2007 season, (d) Childers 2007 season, (e) Bundaberg 2008 season, and (f) Childers 2008 season.
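The prediction statistics reported for each season (RMSEP, bias, slope, R2 and SDR) can all be derived from paired reference and predicted %DM values. The following is a minimal sketch rather than the authors' implementation. It assumes that bias is the mean of predicted minus reference, that slope is the least-squares slope of predicted on reference, that R2 is the squared Pearson correlation, and that SDR is the SD of the reference values divided by RMSEP. The function name `validation_stats` and the four data points are hypothetical.

```python
import math
from statistics import mean, stdev


def validation_stats(reference, predicted):
    """Summary statistics for paired reference and predicted %DM values.

    Assumed conventions (not confirmed by the source):
    bias = mean(predicted - reference); slope = regression of predicted
    on reference; SDR = SD(reference) / RMSEP.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(reference)
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = mean(residuals)
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # Bias-corrected standard error of prediction
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    ref_mean, pred_mean = mean(reference), mean(predicted)
    s_rr = sum((r - ref_mean) ** 2 for r in reference)
    s_pp = sum((p - pred_mean) ** 2 for p in predicted)
    s_rp = sum((r - ref_mean) * (p - pred_mean)
               for r, p in zip(reference, predicted))
    return {
        "bias": bias,
        "rmsep": rmsep,
        "sep": sep,
        "slope": s_rp / s_rr,
        "r2": s_rp ** 2 / (s_rr * s_pp),
        "sdr": stdev(reference) / rmsep,
    }


# Hypothetical values for illustration only, not data from this study
reference = [20.0, 25.0, 30.0, 35.0]
predicted = [21.0, 24.0, 31.0, 34.0]
stats = validation_stats(reference, predicted)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Because SDR scales RMSEP by the spread of the reference values, a season with a wider %DM range can show a higher SDR even when its RMSEP is no smaller, which is the pattern seen for the 2008 season in Table 1.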

Large seasonal effects have a major consequence for calibration models for horticultural produce, since the spectral deviations due to biological variability of future samples cannot be predicted (Peirs et al., 2003). The influence of seasonal variability was investigated for the Bundaberg and Childers growing locations over three years. For both growing locations, the 2006 calibration model was used to predict the 2007 season population. A combined calibration set using spectra from the 2006 and 2007 seasons was then used to develop a calibration model to predict the 2008 season population. Finally, a combined calibration set of the 2006, 2007 and 2008 seasons was used to predict over all 3 years. Table 2 displays the summary statistics of the PLS calibration and prediction models for these combinations.
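The three seasonal combinations described above can be written down as a small data structure, which makes explicit which spectra calibrate and which are predicted in each case. This is a sketch under stated assumptions: the `Sample` record, the `subset` label ("CAL" or "PRE") and the rule that only "PRE" samples are predicted when calibration and prediction seasons overlap are illustrative choices, not details taken from the study.

```python
from typing import NamedTuple, Optional


class Sample(NamedTuple):
    season: int   # harvest season, e.g. 2006
    subset: str   # hypothetical within-season label: "CAL" or "PRE"
    dm: float     # reference %DM


def select(samples, seasons, subset: Optional[str] = None):
    """Return samples from the given seasons, optionally one subset only."""
    return [s for s in samples
            if s.season in seasons and (subset is None or s.subset == subset)]


def seasonal_splits(samples):
    """Yield (label, calibration, prediction) for the three combinations."""
    designs = [
        ("2006 -> 2007", (2006,), (2007,)),
        ("2006-07 -> 2008", (2006, 2007), (2008,)),
        ("2006-08 -> 2006-08", (2006, 2007, 2008), (2006, 2007, 2008)),
    ]
    for label, cal_seasons, pre_seasons in designs:
        calibration = select(samples, cal_seasons, "CAL")
        # Assumption: when seasons overlap, predict only held-out samples
        # so that no spectrum appears in both sets.
        overlap = set(cal_seasons) & set(pre_seasons)
        prediction = select(samples, pre_seasons, "PRE" if overlap else None)
        yield label, calibration, prediction


# Hypothetical samples: one CAL and one PRE fruit per season
samples = [Sample(year, subset, 25.0)
           for year in (2006, 2007, 2008)
           for subset in ("CAL", "PRE")]
splits = list(seasonal_splits(samples))
for label, calibration, prediction in splits:
    print(label, len(calibration), len(prediction))
```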


| Set | Location - Year | Spectra n (OR) | %DM range | SD | LV | R2 | RMSECV (CAL) / RMSEP (PRE) | Bias | SDR |
|---|---|---|---|---|---|---|---|---|---|
| CAL | Bu-2006 | 222(2) | 18.2-35.0 | 3.5 | 7 | 0.75 | 1.76 | -0.159 | 2.0 |
| PRE | Bu-2007 | 609 | 14.1-34.4 | 2.7 | | 0.09 | 5.07 | 4.358 | 0.5 |
| CAL | Bu-2006-07 | 426 | 14.1-35.0 | 3.1 | 9 | 0.75 | 1.60 | 0.112 | 1.9 |
| PRE | Bu-2008 | 606 | 15.2-35.5 | 5.7 | | 0.45 | 4.3 | 0.161 | 1.4 |
| CAL | Bu-2006-08 | 600(4) | 15.8-35.4 | 4.2 | 6 | 0.86 | 1.55 | -0.009 | 2.7 |
| PRE | Bu-2006-08 | 1244(1) | 14.1-35.6 | 4.1 | 6 | 0.87 | 1.48 | 0.0104 | 2.8 |
| CAL | Ch-2006 | 207(2) | 21.4-39.7 | 3.7 | 9 | 0.82 | 1.57 | 0.006 | 2.4 |
| PRE | Ch-2007 | 609(0) | 21.9-36.9 | 3.1 | 9 | 0.14 | 2.84 | 1.601 | 1.1 |
| CAL | Ch-2006-07 | 415(1) | 21.4-39.7 | 3.5 | 12 | 0.82 | 1.49 | 0.003 | 2.4 |
| PRE | Ch-2008 | 608(0) | 16.1-36.2 | 5.3 | 12 | 0.79 | 2.45 | -0.547 | 2.2 |
| CAL | Ch-2006-08 | 624(1) | 16.1-39.7 | 4.6 | 10 | 0.88 | 1.62 | -0.001 | 2.8 |
| PRE | Ch-2006-08 | 1224(0) | 16.5-37.9 | 4.3 | 10 | 0.89 | 1.43 | -0.021 | 3.0 |

Table 2. PLS calibration (CAL) and prediction (PRE) statistics for %DM for whole 'Hass' avocado fruit from both Bundaberg (Bu) and Childers (Ch) for 2006, 2006-07 and 2006-08 seasons predicting on 2007, 2008 and 2006-08 seasons respectively. *Note: OR = outliers removed; LV = latent variables; n = number of samples.*

As expected, the application of single-season calibrations to populations from other growing seasons was not very successful due to seasonal biological variation. For example, the 2006 calibration models for both Bundaberg and Childers could not be used to predict the 2007 season population for the corresponding harvest location (Rv2 of 0.09 and 0.14 respectively). Model predictive performance improved as more biological variability was included in the models, as seen when the combined 2006 and 2007 models were used to predict the 2008 season. The combined 2006, 2007 and 2008 calibration models (Figure 3) were sufficiently robust to predict %DM of whole 'Hass' avocado to within 1.48 %DM with an Rv2 = 0.87 and SDR of 2.8 for Bundaberg, and to within 1.43 %DM with an Rv2 = 0.89 and SDR of 3.0 for the Childers harvest location. This indicated an ability to sort the fruit into three categories with approximately 80% accuracy (Guthrie et al., 1998).

Fig. 3. Model prediction for the combined 2006-08 calibration model for both (a) Bundaberg and (b) Childers locations predicting on the combined 2006-08 prediction set plotted against actual constituent values for %dry matter.

This study demonstrated that including data from multiple growing seasons in the calibration model improves predictive performance compared with calibration models developed using an individual season. This agrees with previous studies on this topic (Peiris et al., 1998; Peirs et al., 2003; Miyanoto and Yoshinobu, 1995; Liu et al., 2005; Guthrie et al., 2005). As more biological variability is built into the model, the prediction accuracy becomes less sensitive to unknown changes in external factors (Bobelyn et al., 2010). However, in some cases, including more biological variability in the calibration set (at the risk of including atypical data) can significantly reduce the model's prediction accuracy (Bobelyn et al., 2010).

Geographic location (growing region) effects may also have a major consequence for model robustness, as fruit composition is subject to within-tree variability (i.e., tree age, crop load, position within the tree, light effects); within-orchard variability (i.e., location of tree, light effects); and inter-orchard variability (i.e., soil characteristics, nutrition, weather conditions, fruit age and season variability) (Marques et al., 2006; Peirs et al., 2003). The influence of geographic location on %DM prediction for whole avocado fruit was subsequently investigated by assessing calibration model performance using avocado fruit obtained from the Bundaberg and Childers locations over 3 years.

The PLS calibration and prediction model statistics for the Bundaberg and Childers harvest locations, and for the combination of both regions, are presented in Table 3. The Bundaberg data set of 1844 spectra was separated into a calibration set (n = 600) and a prediction set (n = 1244). The calibration model validated well, delivering an Rv2 = 0.87 with an RMSEP = 1.48 and SDR of 2.8 for %DM. An SDR value between 2.5 and 2.9 is regarded as adequate for screening (Nicolaï et al., 2007; Schimleck et al., 2003; Williams, 2008). The Bundaberg PLS model was then used to predict the entire Childers population. As expected, applying the Bundaberg model to a population from another growing region was not as successful, giving substantially reduced predictive performance with an Rv2 = 0.59, RMSEP = 2.84 and SDR of 1.55. Similarly, the Childers data set of 1848 spectra was separated into a calibration set (n = 624) and a prediction set (n = 1224).
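The SDR values quoted for the regional models can be checked against the SD and RMSEP values given in Table 3. The sketch below assumes SDR is the SD of the reference %DM values in the prediction set divided by RMSEP, which reproduces the reported figures to within rounding. The 2.5 screening threshold is the lower bound cited in the text; the helper names and the choice to treat every value at or above 2.5 as adequate are assumptions for illustration.

```python
SCREENING_SDR = 2.5  # lower bound of the "adequate for screening" band cited in the text


def sdr(sd, rmsep):
    """Standard deviation ratio: spread of reference values over prediction error."""
    return sd / rmsep


def adequate_for_screening(value):
    """Assumption: any SDR at or above the cited lower bound counts as adequate."""
    return value >= SCREENING_SDR


# (SD, RMSEP, reported SDR) as listed in Table 3
reported = {
    "Bundaberg model -> Bundaberg": (4.1, 1.48, 2.8),
    "Bundaberg model -> Childers": (4.4, 2.84, 1.55),
    "Childers model -> Childers": (4.3, 1.43, 3.0),
    "Childers model -> Bundaberg": (4.2, 2.14, 1.96),
}

computed = {name: sdr(sd, rmsep) for name, (sd, rmsep, _) in reported.items()}
for name, value in computed.items():
    print(f"{name}: SDR = {value:.2f}, screening = {adequate_for_screening(value)}")
```

Under these assumptions, both within-region predictions clear the screening threshold, while both cross-region predictions fall below it.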



| Set | Harvest location | Spectra n (OR) | %DM range | SD | LV | R2 | RMSECV (calibration) / RMSEP (prediction) | SDR |
|---|---|---|---|---|---|---|---|---|
| Calibration | Bundaberg | 600(4) | 15.8-35.4 | 4.2 | 6 | 0.86 | 1.55 | 2.7 |
| Prediction | Bundaberg | 1244(1) | 14.1-35.6 | 4.1 | 6 | 0.87 | 1.48 | 2.8 |
| Prediction | Childers | 1847 | 16.1-39.7 | 4.4 | 6 | 0.59 | 2.84 | 1.55 |
| Calibration | Childers | 624(1) | 16.1-39.7 | 4.6 | 10 | 0.88 | 1.62 | 2.8 |
| Prediction | Childers | 1224(0) | 16.5-37.9 | 4.3 | 10 | 0.89 | 1.43 | 3.0 |
| Prediction | Bundaberg | 1844(1) | 14.1-35.5 | 4.2 | 10 | 0.74 | 2.14 | 1.96 |
| Calibration | Bundaberg & Childers | 1224(4) | 15.8-39.7 | 4.5 | 9 | 0.88 | 1.55 | 2.9 |
| Prediction | Bundaberg & Childers | 2468(1) | 14.1-37.9 | 4.3 | 9 | 0.89 | 1.42 | 3.1 |

Table 3. PLS calibration and prediction statistics for %DM for whole 'Hass' avocado fruit harvested over three seasons for Bundaberg and Childers growing locations and combination of both regions. *Note: OR = outliers removed; LV = latent variables; n = number of samples.*

The Childers PLS model also produced reasonable validation statistics (Rv2 = 0.89 with an RMSEP = 1.43 and SDR of 3.0) when predicting fruit from within the Childers region. As with the Bundaberg model, the Childers model did not perform as well when used to predict %DM of fruit from a different geographic location, here the combined 2006-08 Bundaberg population (Rv2 = 0.74 with an RMSEP = 2.14 and SDR of 1.96).

A calibration model was developed by combining both Bundaberg and Childers populations, incorporating biological variability from both regions over three growing seasons. Model predictive performance of the combined population was comparable to the individual regional models of Bundaberg and Childers, with an Rv2 = 0.89, RMSEP = 1.42, and SDR of 3.1 (Figure 4). These results demonstrate that there are spectral differences between growing districts, and that an individual regional model does not incorporate the spectral information needed to predict samples from a different growing district without reduced predictive performance. It is therefore important that calibrations be developed on populations representative of those in which sorting is to be attempted.

Fig. 4. Model prediction for the Bundaberg and Childers combined 2006-08 calibration model predicting on the Bundaberg and Childers combined 2006-08 prediction set plotted against actual constituent values for %DM.

Interpreting NIR models in terms of the various fruit components is often difficult due to spectral collinearity, where information in a model may not necessarily be carried by just a few independent wavelengths, but is possibly a combined effect of many wavelengths with each contributing only relatively little information (McGlone & Kawano, 1998). For oil, strong electromagnetic absorption is reported around 2200 - 2400 nm (CH2 stretch bend and combinations), with weaker absorption around 1750, 1200 and 900 - 920 nm ranges, and 930 nm (overtones of CH2 stretching) (Clark et al., 2003; Guthrie et al., 2004; Osborne et al., 1993). Williams and Norris (1987) report that the 1300 - 1750 nm range is rich in absorbers useful for the determination of protein and oil. The 900 - 920 nm absorbance band is often cited as the most important band for %DM and/or sugar determination, as it is removed from the troublesome interferences from the water absorbance peaks that typically dominate spectra of fruit (Clark et al., 2003). However, light penetration depth is wavelength dependent (Lammertyn et al., 2000). The 700 - 1100 nm short-wavelength NIR region allows better penetration into biological material, while wavelengths above 1100 nm (long-wavelength region) have limited penetration, providing information only relatively close to the surface (Guthrie et al., 2004; Saranwong & Kawano, 2007). In some instances, there may be secondary correlations between skin properties and those of the bulk flesh, and in these circumstances the long-wavelength region can provide relevant information.

The regression coefficient vectors β for the dry matter calibration models across all years in this avocado study had many similar peak positions over the 850 - 2250 nm range. However, as expected, there were slight differences in wavelength selection from one year to another, which can be attributed to seasonal variability. Relevant spectral information for the calibration models was obtained primarily from oil, carbohydrate and water absorbance bands clustered in the 900 - 980 nm region (second and third overtone), with further contribution from absorbance bands for oil in the vicinity of 1360, 1703, 1722 and 1760 nm. However, these assignments can only be tentative because of other nearby peaks and troughs in the β vectors.

The results of this study are very encouraging and compare favourably to the results obtained by Clark et al. (2003) (RMSEP of 2.6 %DM over a 20 - 45 %DM range and an Rv2 of 0.75) and Walsh et al. (2004) (Rc2 = 0.79, RMSECV = 1.14, SDR = 2.2, for %DM of an unspecified cultivar) using a fixed PDA spectrometer in reflectance mode. The current FT-NIRS reflectance combined models for both Bundaberg and Childers compare well with the model accuracy obtained by Clark et al. (2003) (Rv2 of 0.88 and an RMSEP of 1.8 %DM) using a PDA spectrometer in interactance mode, indicating reflectance FT-NIRS may be a suitable alternative for in-line and at-line environments. Another comparative study was conducted by Schmilovitch et al. (2001) for two relatively thin-skinned cultivars, 'Ettinger' and 'Fuerte', during a single season. They used a dispersive NIR spectrophotometer in reflectance mode in the 1200 - 2400 nm range, reporting errors of prediction for 'Ettinger' and 'Fuerte' of 0.9% and 1.3% respectively, for fruit having a 14 - 24 %DM range. It is likely that the relatively smooth to medium textured, thin-skinned cultivars would not suffer to the same extent from the physiological limitations imposed by the thick, rough skin of 'Hass', so lower prediction errors would be expected. We must emphasize, however, that it is difficult to make a meaningful comparison of the various techniques, as there is insufficient detail presented in these papers to establish whether the differences are associated with the spectroscopic technique or with the geometry of the configurations used.

**3.2 Impact and rot assessment**

Classification statistics for the prediction of percentage rot development are presented in Table 4. The preliminary study found that by applying discriminative analysis techniques, 92.8% of the test population could be correctly classified into 2 categories, above and below 30% rot development for the area scanned. The percentage correctly classified decreased slightly to 86.8% when the classification threshold was reduced to 10% rot development for the scanned area.

Table 4. Classification statistics for prediction of percentage rot development (shelf life) of whole 'Hass' avocado fruit. *Note: LV = latent variables; n = number of samples.*

Table 5 presents the classification statistics for the prediction of percentage bruise development. The results indicate that 90.2% of the population could be correctly classified into 2 categories based on percentage bruise development in the scanned area (≤10%, ≥11%) using scans conducted 1 - 2 hours following impact. Of the 10 (9.8%) samples misclassified, 6 (5.9%) samples visually rated with bruising of 11% or more were placed into the ≤10% bruise category, and 4 (3.9%) samples with bruising visually rated at or below 10% were placed into the ≥11% bruise category. These 4 samples were all at the ambiguous changeover point between the two defined classification categories at 10% bruising.

These results improved to 95.1% correctly classified when the fruit were rescanned 24 hours after impact. It appears the 24 hour delay allowed more time for the bruising to develop, assisting classification. This would indicate that in a commercial situation it would be an advantage to hold the fruit for 24 hours prior to scanning. The 5 (4.9%) samples misclassified all had bruising visually rated at or below 10% and were placed into the ≥11% bruise category. Of these samples, 4 (3.9%) were at the ambiguous changeover point between the two defined classification categories at 10% bruising.

| Item assessed | Time after impact (hours) | Spectra (n) | Defined classification (%) | LV | Spectra misclassified % (n) | Spectra correctly classified % (n) |
|---|---|---|---|---|---|---|
| %Bruising of scanned area | 1-2 | 102 | (i) 0 - 10; (ii) 11 - 100 | 10 | 9.8 (n=10) | 90.2 (n=92) |
| | 24 | 102 | | 8 | 4.9 (n=5) | 95.1 (n=97) |

Table 5. Classification statistics for prediction of percentage bruise development in whole 'Hass' avocado fruit. *Note: LV = latent variables; n = number of samples.*

**4. Conclusion**

NIRS has come to be extensively used in many applications for the non-invasive rapid assessment of a wide variety of products. These include both quantitative compositional determinations and qualitative determinations. The present study indicates the potential of FT-NIRS in diffuse reflectance mode to be used as a non-invasive method to predict the %DM of whole 'Hass' avocado fruit, and the importance of incorporating seasonal and geographical variation in the calibration model. The results showed that calibration model robustness increased when data from more than one season, incorporating a greater range of seasonal variation, was included in the calibration set. The results also showed that there are spectral differences between geographical regions, and that region-specific models may have significantly reduced predictive performance when applied to samples containing biological variability from a different growing region. It is therefore important that calibrations be developed on populations representative of those in which sorting is to be attempted.

As shown, there is also great potential to use FT-NIRS as a tool to predict impact damage of whole avocados based on percentage bruise development, and to predict shelf-life based on rot development (susceptibility). The work presented here is a first step towards shelf-life prediction and bruise detection for avocado fruit. As this was only a preliminary study, the classification models require many more samples, incorporating seasonal and geographical biological variation, to enable the development of a robust model suitable for commercial use.

Overall, FT-NIR reflectance spectroscopy shows promise for application in a commercial, in-line setting for the non-destructive evaluation of %DM, bruises and rot susceptibility of whole avocado fruit, although optimisation of the technology is required to address speed of throughput and environmental issues. Incorporating fruit physiological variability over future seasons and growing regions will be essential to further increase model robustness and ensure predictive performance suitable for commercial use.

Unfortunately, the process of calibration development is a major impediment to the rapid adoption of NIRS in industry. The collection and precise analysis of the reference samples remains a time-consuming and potentially costly exercise, depending on the type of analysis. With this said, NIRS has an obvious place in agriculture and environmental applications with its core strength in the analysis of biological materials, plus low cost of
