In a few words, the basic principle of the support vector classifier (SVM) is to maximize the margin between two classes, which is defined by the so-called *support vectors*: the closest training examples to the decision boundary. The SVM is extended to nonlinear and multiclass problems by using strategies called the kernel trick and the one-against-rest approach. Further details can be found in some of the PR textbooks cited above as well as in the original work by Vapnik (1998).

**Combination of multiple classifiers**

The strategy of combining multiple classifiers aims to exploit (1) the availability of multiple sources of data from different sensors or representations, and (2) the possibilities of training several classifiers for the same training set and performing different tuning sessions for the same classifier. Data mentioned in item 1 may belong either to the same events or to different ones. Most seismic volcanic data sets are multiple in nature since they are acquired at multiple recording stations and across several months or years; thereby, multiple sources (stations) for the same events are often available and different sets of examples can be arranged by date of acquisition.

Several strategies for combining classifiers have been proposed. They are typically categorized according to their architecture into parallel, serial and hierarchical; or according to the combination rule into static and trainable (Kuncheva, 2004). PR systems that include these strategies are called *multiple classifier systems*. There has been a sustained interest in this field during the last decade, as evidenced by the series of workshops started by Kittler & Roli (2000) and recently organized by Gayar et al. (2010).

**2. A review of research on automated identification of volcanic earthquakes**

This section is meant to be a compact but comprehensive survey of research efforts, achievements and case studies on automatic classification of seismic volcanic signals. Reviewed studies are grouped into categories according to the various approaches and methods discussed in Secs. 1.3.2 and 1.3.3.

#### **2.1 Research teams and study sites**

A literature search was performed in the main technical databases. Most of the applications on the automated identification of volcanic earthquakes have been undertaken through the inter-institutional and international research collaboration of four teams composed of: (1) Departamento de Teoría de la Señal, Telemática y Comunicaciones, Universidad de Granada, Spain; and Instituto Andaluz de Geofísica, Universidad de Granada, Spain; (2) Dipartimento di Fisica, Università di Salerno, Italy; and Osservatorio Vesuviano, Istituto Nazionale di Geofisica e Vulcanologia, Italy; (3) Departamentos de Ingeniería Eléctrica y Física, Universidad de La Frontera, Temuco, Chile; and Observatorio Volcanológico de los Andes del Sur, Servicio Nacional de Geología y Minería, Chile; and (4) Departamento de Informática y Computación, Universidad Nacional de Colombia Sede Manizales, Colombia; Observatorio Vulcanológico y Sismológico de Manizales, INGEOMINAS, Colombia; and Pattern Recognition Lab, Delft University of Technology, The Netherlands. In these collaborative studies, it seems that spatial proximity between volcano observatories and at least one expert in DSP and/or PR encourages collaboration, probably due to the possibility of establishing informal communication, as pointed out by Katz & Martin (1997). Other active teams are composed of personnel from Istituto Nazionale di Geofisica e Vulcanologia, Catania, Italy; and Institut für Erd- und Umweltwissenschaften, Universität Potsdam, Germany.

The studies found were applied to data sets from the following volcanoes: Ambrym volcano, Vanuatu (AMV); Deception Island Volcano, Antarctica (DIV); Etna Volcano, Italy (ETV); Las Cañadas Volcano, Tenerife, Spain (LCV); Llaima Volcano, Chile (LLV); Mt. Merapi Volcano, Indonesia (MMV); Mt. Vesuvius Volcano, Italy (MVV); Nevado del Ruiz Volcano, Colombia (NRV); Phlegraean Fields, Italy (PFV); San Cristóbal Volcano, Nicaragua (SCV); Soufrière Hills Volcano, Montserrat (SHV); Stromboli Volcano, Italy (STV); and Villarrica Volcano, Chile (VRV). Other studies are not applied to signals of volcanic origin but to tectonic seismic events. Nevertheless, given the affinity between these two problems, such studies are also reviewed here. Data considered in those studies come from the European Broadband Network (EBN), the Mediterranean Seismic Network (MSN), the Hyblean Plateau network (HPN), the Marmara Region Network (MRN) and the Bavarian Earthquake Service Network (BEN). See Table 1 for associations between study sites and publications.


Publication Data set Classes Representation Classification

background noise

and background noise (STV). TR and tremor bursts (ETV)

STV Four classes of EX Auto-correlation

NRV VT, LP and IC 1-D spectra +

NRV VT, LP 1-D spectra + Band

NRV VT and LP 1-D spectra and

NRV VT and LP Spectrograms +

NRV VT and LP Spectrograms

sausage-like and

spike-like and

EBN Tectonic earthquakes and noise

spike-like

noise

(Romeo, 1994) MSN RE, TL, TS,

(Romeo et al., 1995) MSN TL, RE, TS,

and LP+ RF

SCV LP, EX, TR and

STV and ETV EX and tremor bursts

(Ibáñez et al., 2009) STV and ETV Strombolian EX

(Langer et al., 2006) SHV VT, RE, LP, HB, RF

(Ohrnberger, 2001) MMV VT, MP and RF Wavefield

STV Four classes of EX Auto-correlation

NRV LP, RE,TL and VT Spectrograms + PCA

function, envelope function and spectra

and Fisher mapping

function, envelope function and spectra

morphological and statistical attributes of the waveforms

Autocorrelation function;

parameters

selection

(multiway approach)

transform

Dissimilarities (multiway approach)

Morphological spectral attributes

Morphological spectral attributes

Dissimilarities

spectrograms + Dissimilarities (multiway approach)

and scalograms + Dissimilarities

Continuous wavelet

MFCCs HMM

MFCCs HMM

MFCCs HMM

ANN

ANN

ANN

HMM

space

classifier

classifier

networks

ANN

ANN

1-NN in feature space. LDC and QDC in dissimilarity

Regularized LDC

Fisher linear

Fisher linear

Regularized LDC

Dynamic Bayesian

Parzen classifier, NMC, HMM, 1-NN, SVM, ANN and combining rules

(Falsaperla et al., 1996)

(Gutiérrez et al., 2006)

(Gutiérrez et al., 2009)

(Hoogenboezem, 2010)

(Langer & Falsaperla, 2003)

(Orozco-Alzate et al., 2006)

(Orozco-Alzate et al., 2008)

(Porro-Muñoz et al., 2010a)

(Porro-Muñoz et al., 2010b)

(Porro-Muñoz et al., 2011)

(Riggelsen et al., 2007)

Table 1. Summary of reviewed studies and their associated experimental setups.

#### **2.2 Applications and representation approaches**

Raw seismic signals are the simplest and most straightforward representation to provide to a classifier. That option exempts designers from the need to find good features and may be convenient if sufficient training examples are available. However, building a vector space from the original time samples has the following drawbacks: (1) signals must be of equal length and aligned, which is often not possible due to the intrinsically variable duration of seismic events; and (2) the samples span a high-dimensional vector space and, therefore, large training sets are required in order to avoid the "curse of dimensionality". The second drawback can be mitigated by applying dimensionality reduction techniques such as principal component analysis (PCA) and feature selection methods. Avossa et al. (2003) adopted this approach, reducing the dimension from 240 to 15. Langer & Falsaperla (2003), Ursino et al. (2001) and Langer et al. (2006) used the autocorrelation function instead of the original waveforms in order to avoid the phase alignment problem.
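To illustrate why the autocorrelation function sidesteps the alignment problem, the following minimal sketch compares two copies of the same pulse placed at different positions in the recording window. The function name `autocorrelation`, the normalization by the zero-lag value and the toy signals are assumptions made for this illustration; they are not taken from the cited studies.

```python
def autocorrelation(signal, max_lag):
    """Normalized linear autocorrelation r[k] / r[0] for lags 0..max_lag."""
    n = len(signal)
    r = [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]
    if r[0] == 0:
        # An all-zero signal has no energy; return zeros instead of dividing by zero.
        return [0.0] * (max_lag + 1)
    return [value / r[0] for value in r]


# The same toy pulse, arriving early and late in the window (hypothetical data).
early = [0, 0, 1, 2, 1, 0, 0, 0]
late = [0, 0, 0, 0, 1, 2, 1, 0]

print(autocorrelation(early, 3))
print(autocorrelation(late, 3))
```

Both calls yield the same vector, although the raw sample vectors differ element by element; this holds here because the pulse lies entirely inside the window in both cases.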

Morphological features can be extracted directly from the examination of the waveforms. Curilem et al. (2009) measured the following values from the absolute value of the signals: standard deviation, mean, median and maximum value, as well as kurtosis and skewness from a histogram of the signal amplitudes. Scarpetta et al. (2005) and Esposito et al. (2006) extracted time-domain information by computing properly normalized differences between the maximum and minimum signal amplitudes. Similarly, Ezin et al. (2002) measured maximum and minimum signal amplitudes, Yıldırım et al. (2011) obtained peak S-to-P amplitude ratios and complexity values, and Rouland et al. (2009) detected the presence or absence of S-waves. Signal envelopes, which are smoothed versions of the original waveforms, were also tested for data representation by Falsaperla et al. (1996), Langer & Falsaperla (2003) and Beyreuther et al. (2008). A collection of morphological and statistical attributes of the waveforms was considered in the study by Langer et al. (2006). The most specialized representation is that reported in (Ohrnberger, 2001, Chap. 7) and (Beyreuther & Wassermann, 2008), which includes several wavefield parameters.
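As a rough illustration of such amplitude statistics, the sketch below computes the six quantities listed for Curilem et al. (2009) from the absolute value of a signal. It is only an approximation under stated assumptions: skewness and kurtosis are computed here directly from the sample moments rather than from a histogram, and the function name `morphological_features` is hypothetical.

```python
import statistics


def morphological_features(signal):
    """Amplitude statistics of |signal|; moment-based skewness and kurtosis (assumption)."""
    amplitudes = [abs(s) for s in signal]
    n = len(amplitudes)
    mean = statistics.fmean(amplitudes)
    std = statistics.pstdev(amplitudes)  # population standard deviation
    if std > 0:
        skewness = sum((a - mean) ** 3 for a in amplitudes) / n / std ** 3
        kurtosis = sum((a - mean) ** 4 for a in amplitudes) / n / std ** 4
    else:
        # Higher-order shape statistics are undefined for a constant signal.
        skewness = kurtosis = float("nan")
    return {
        "std": std,
        "mean": mean,
        "median": statistics.median(amplitudes),
        "max": max(amplitudes),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


features = morphological_features([0, 1, -2, 3, -4])
print(features)
```

The resulting six-element vector could be fed to any of the classifiers discussed later; in practice, features with very different ranges would normally be rescaled first.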

An alternative consists of computing intermediate representations, usually spectra and spectrograms, because differences in spectral content allow visual discrimination of different types of volcanic earthquakes (Zobin, 2003, Chap. 9). This approach was followed by Orozco-Alzate et al. (2008); Chu-Salgado et al. (2010); Duin et al. (2010); Hoogenboezem (2010); Orozco-Alzate et al. (2006); and Porro-Muñoz et al. (2010a;b; 2011). In the first four studies, the computation of spectra was followed by dimensionality reduction techniques such as sequential feature selection, PCA and Fisher mapping. In the remaining ones, dissimilarity representations were computed after transforms to the frequency or the time-frequency domain. Porro-Muñoz et al. (2010a;b; 2011) included multiway data analysis techniques; see Sec. 3.5.

Additional features can be extracted from spectral representations by measuring morphological attributes such as the mean frequencies of the five highest peaks, energies in given frequency bands (Curilem et al., 2009; Romeo, 1994; Romeo et al., 1995) and the instantaneous frequency (Beyreuther et al., 2008), or by computing variables such as the Mel-frequency cepstral coefficients (MFCCs), their associated log-energies and the so-called delta and delta-delta coefficients (Benítez et al., 2007; Gutiérrez et al., 2009; 2006). Spectra and spectrograms are typically computed by using the Fourier or the cosine transforms. Other transforms, such as the Hilbert and wavelet transforms, have also been applied for representation, e.g. by Riggelsen et al. (2007), San-Martín et al. (2010), and Porro-Muñoz et al. (2010b).
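The following sketch shows one possible way of measuring two of the spectral attributes mentioned above: the frequencies of the highest spectral peaks and the energy within a frequency band. It uses a direct discrete Fourier transform for self-containment (an FFT routine would be used in practice). The function names, the simple local-maximum peak criterion, the band limits and the synthetic test signal are all assumptions for illustration.

```python
import cmath
import math


def power_spectrum(signal, sampling_rate):
    """One-sided power spectrum computed with a direct (slow) DFT."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        freqs.append(k * sampling_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def band_energy(freqs, power, f_low, f_high):
    """Sum of spectral power for frequencies in [f_low, f_high]."""
    return sum(p for f, p in zip(freqs, power) if f_low <= f <= f_high)


def highest_peak_frequencies(freqs, power, n_peaks=5):
    """Frequencies of the n_peaks largest local maxima of the spectrum."""
    peaks = [
        k for k in range(1, len(power) - 1)
        if power[k - 1] < power[k] > power[k + 1]
    ]
    peaks.sort(key=lambda k: power[k], reverse=True)
    return [freqs[k] for k in peaks[:n_peaks]]


# Synthetic signal: 5 Hz and a weaker 12 Hz component, sampled at 64 Hz for 1 s.
fs = 64
x = [
    math.sin(2 * math.pi * 5 * i / fs) + 0.5 * math.sin(2 * math.pi * 12 * i / fs)
    for i in range(fs)
]
freqs, power = power_spectrum(x, fs)
print(highest_peak_frequencies(freqs, power, n_peaks=2))
print(band_energy(freqs, power, 4.0, 6.0) / sum(power))
```

On real seismograms the spectrum would usually be smoothed or averaged over windows before peak picking, since a plain local-maximum test is sensitive to noise.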

The linear predictive coding (LPC) coefficients have been widely used in speech recognition and, by extension, have also been chosen for representation in several seismic signal classification projects (Del Pezzo et al., 2003; Esposito et al., 2007; 2006; 2005; Ezin et al., 2002; Scarpetta et al., 2005). LPC predicts each sample as a linear combination of several previous ones, exploiting the correlation between successive samples in a seismic signal; the coefficients of that combination are used as features.
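As a minimal sketch of how such coefficients can be obtained, the code below estimates the predictor $\hat{x}[n] = \sum_{k=1}^{p} a_k\, x[n-k]$ with the autocorrelation method and the Levinson-Durbin recursion. This is one standard way of computing LPC coefficients; whether the cited studies used this particular estimator is not stated in the text, so it should be read as an assumption. The function names and the test signal are hypothetical.

```python
def autocorr(signal, max_lag):
    """Unnormalized linear autocorrelation for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(signal, order):
    """Return [a_1, ..., a_order] via the Levinson-Durbin recursion."""
    r = autocorr(signal, order)
    coeffs = []      # coeffs[k] holds a_(k+1) of the current model order
    error = r[0]     # prediction error energy of the current model
    for m in range(1, order + 1):
        if error == 0:
            # Signal is perfectly predicted (or silent); pad remaining coefficients.
            coeffs += [0.0] * (order - len(coeffs))
            break
        acc = r[m] - sum(coeffs[k] * r[m - 1 - k] for k in range(m - 1))
        reflection = acc / error
        coeffs = [
            coeffs[k] - reflection * coeffs[m - 2 - k] for k in range(m - 1)
        ] + [reflection]
        error *= 1.0 - reflection ** 2
    return coeffs


# A decaying exponential obeys x[n] = 0.9 * x[n-1], so a_1 should be close to 0.9.
signal = [0.9 ** n for n in range(200)]
print(lpc(signal, 2))
```

The order `p` plays the role of the feature dimension, so it would be chosen together with the classifier, for instance by cross-validation.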

#### **2.3 Applications and classification approaches**

In the majority of the reviewed applications, ANNs have been used for classification, particularly multilayer perceptrons (MLPs). Table 2 lists, for each publication, the input-hidden-output architecture (number of neurons per layer) and the training method. In almost all the studies, architecture and training method were selected either by trial and error or by agreement with a previous publication. An exception is the study by Curilem et al. (2009), who optimized the size of the hidden layer and selected the training process by means of a genetic algorithm, finding that 14 hidden neurons and the Levenberg-Marquardt training algorithm were the optimal choice.


| Publication | Architecture | Training method |
|---|---|---|
| (Avossa et al., 2003) | 15-3-1 | Quasi-Newton |
| (Curilem et al., 2009) | 8-14-1 | Levenberg-Marquardt |
| (Del Pezzo et al., 2003) | 105-6-1 | Quasi-Newton |
| (Esposito et al., 2006) | 71-5-3 | Quasi-Newton |
| (Ezin et al., 2002) | 174-6-1 | Quasi-Newton and scaled gradient descent |
| (Falsaperla et al., 1996) | 600-8-4 | Gradient descent |
| (Langer & Falsaperla, 2003) | 600-8-4 | Backpropagation algorithm |
| (Langer et al., 2006) | 103-20-6 | - |
| (Romeo, 1994) | 40-12-40 | - |
| (Romeo et al., 1995) | 10-9-9-5 | - |
| (Scarpetta et al., 2005) | [70,79]-[4,5]-1 | Scaled conjugate gradient and Quasi-Newton |
| (Ursino et al., 2001) | 100-5-2 | Backpropagation |

Table 2. MLP architecture and training methods used in several ANN-based applications.

HMMs have been widely used in the speech recognition framework. Given the analogous nature of speech and seismic signals, authors have also successfully applied them to the automated classification of volcanic earthquakes. As with ANNs, the performance of HMMs is controlled by several free parameters, namely: the topology of the models, the number of states, the number of multivariate Gaussian probability density functions and the number of iterations of the Baum-Welch training algorithm. The topology usually corresponds to a left-to-right configuration. Values used for the second parameter, the number of states, in the reviewed applications are listed in Table 3.


Table 3. Configurations of HMMs applied in the reviewed publications.

A conceptual discussion on the use of wavelet-based HMMs for the classification of seismic volcanic signals is presented in (Alasonati et al., 2006). Several reasons have motivated researchers to prefer a left-to-right HMM topology over an ergodic one; Ohrnberger (2001, Chap. 7) points out the following: (1) seismic signals are causal in time; (2) seismic signals are analogous to speech signals, for which left-to-right models are widely used; and (3) for an equal number of states, a left-to-right model has fewer degrees of freedom than an ergodic one. Readers are referred again to (Rabiner, 1989) for details on the difference between these two topologies. A generalization of HMMs is the so-called dynamic Bayesian network. Riggelsen et al. (2007) applied such networks to the real-time identification of seismic signals.
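The third reason can be made concrete by counting the free transition probabilities of each topology. The sketch below builds a mask of allowed transitions and counts, per state, the allowed transitions minus one (each row of the transition matrix must sum to one). The function names and the `max_jump` parameter, which restricts a left-to-right model to self-transitions and jumps of at most that many states forward, are assumptions for illustration; emission parameters are not counted.

```python
def transition_mask(n_states, topology, max_jump=1):
    """Boolean matrix: mask[i][j] is True if the transition i -> j is allowed."""
    mask = []
    for i in range(n_states):
        if topology == "ergodic":
            row = [True] * n_states
        elif topology == "left-to-right":
            # Stay in the same state or move forward by at most max_jump states.
            row = [i <= j <= i + max_jump for j in range(n_states)]
        else:
            raise ValueError("unknown topology: " + topology)
        mask.append(row)
    return mask


def free_transition_parameters(mask):
    """Each row sums to one, so it has (allowed transitions - 1) free values."""
    return sum(sum(row) - 1 for row in mask)


for topology in ("ergodic", "left-to-right"):
    mask = transition_mask(5, topology)
    print(topology, free_transition_parameters(mask))
```

For five states this gives 20 free transition probabilities for the ergodic model against 4 for the left-to-right one, which suggests why the latter is easier to train from a limited number of labeled events.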

Less complex classifiers were applied by San-Martín et al. (2010); Chu-Salgado et al. (2010); Orozco-Alzate et al. (2006); Porro-Muñoz et al. (2010a;b); and Duin et al. (2010). Authors of the first study built classifiers on top of a classical feature representation while the others employed simple ones, either in the dissimilarity space or to be combined in a second step of the classification process as explained at the end of Sec. 1.3.3. The reader is referred again to Table 1 to associate studies and classifiers.

Hoogenboezem (2010) presented a compendious survey of classifiers and representations applied to signals from NRV. However, a comprehensive study requires more rigorous experimentation and statistical comparisons. Recommendations such as those made by Demšar (2006), Duin (1996) and Salzberg (1997), as well as those in Sec. 2.4, should be taken into account. An additional concern is the methodological rigor in the evaluation of performance for multiclass problems: even though most of the studies report confusion matrices, others draw conclusions from overall accuracies, which are likely to be unreliable for multiclass and/or unbalanced data sets.
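A small numerical sketch shows why overall accuracy can mislead on unbalanced data. The counts below are invented for illustration (95 VT events and 5 LP events, with a degenerate classifier that labels everything as VT); the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_recall(matrix):
    """Fraction of each true class that was correctly recognized."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


# Invented, unbalanced example: the classifier always answers "VT".
true_labels = ["VT"] * 95 + ["LP"] * 5
predicted = ["VT"] * 100

cm = confusion_matrix(true_labels, predicted, classes=["VT", "LP"])
print(cm)
print(overall_accuracy(cm))
print(per_class_recall(cm))
```

The overall accuracy is 0.95 even though no LP event is ever recognized; the confusion matrix and the per-class recalls expose the failure immediately.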

This subsection is concluded with a mention of the following studies dealing with the unsupervised classification problem: (Ansari et al., 2009; Esposito et al., 2007; 2008; 2005; Orozco-Alzate & Castellanos-Domínguez, 2007). They are aimed at finding clusters in seismic volcanic data and understanding their structure. A separate chapter would be required to properly discuss them.

#### **2.4 The need for a benchmarking data set**

Classification accuracies and other performance measures reported in the literature are not comparable across the reviewed studies because, unfortunately, there are no standard and publicly-available data sets of seismic volcanic signals. Furthermore, authors have used different sets even when they performed studies for the same volcano. Thus, the need for a benchmarking data set is evident. Researchers in this field are encouraged to define such a reference set to be made available for rigorous comparative studies. Ultimately, it is the only reliable way of measuring relative system performance.

#### **3. Open issues and research opportunities**

The area of PR has developed into a mature engineering field (Duin & Pekalska, 2005). As a result, in practical applications and particularly in volcano seismology, a number of recent approaches and techniques have not yet been explored. This section is concerned with future directions for research, considering not just the state of the art in PR but also possibilities offered by the development of sensors and computer resources. Prospective projects are briefly outlined, considering novel approaches such as multiple instance learning, one-class classification, adaptive single and multiple classifiers, classifier optimization and multiway representations.

#### **3.1 Multiple instance learning**

A multiple instance problem occurs when training objects are naturally organized into bags of feature vectors, also known as *multisets*, instead of being composed of individually labeled ones (Ray & Craven, 2005). It happens, for instance, when objects are too rich and contain too many details to be easily represented by a single feature vector (Tax & Duin, 2008), e.g. images that depict several objects in the same picture in addition to the one of interest, which is known as the *concept*. Feature vectors in the bag (called *instances* in this framework) are assumed to be independent and are not individually labeled, since class labels are only assigned to complete bags. In a two-class case, with a positive class and a negative class to be distinguished, a negative bag only contains vectors that are not members of the concept, whereas a positive bag contains at least one vector that is a member of the concept and, consequently, may contain other vectors that are not.

A prospective application of multiple instance learning to the automated identification of volcanic earthquakes would consider waveforms and spectrograms as bags of feature vectors. In such a way, labels might be more accurately assigned to those segments in the signals or patches in the spectrograms clearly belonging to the concept class. Moreover, ill-defined classes might be more properly treated, e.g. the HB events.
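The bag-labeling rule of the two-class case can be sketched in a few lines. Here a bag stands for one seismic record and each instance for one segment or spectrogram patch; the instance scores would come from some instance-level classifier, which is not specified in the text. The function names, the threshold of 0.5 and the scores are therefore hypothetical.

```python
def bag_label(instance_is_concept):
    """A bag is positive if at least one of its instances belongs to the concept."""
    return any(instance_is_concept)


def predict_bag(instance_scores, threshold=0.5):
    """Predict a bag as positive when its best-scoring instance passes the threshold.

    Also returns the indices of the instances responsible for the decision,
    i.e. the segments or patches most likely to contain the concept.
    """
    responsible = [i for i, s in enumerate(instance_scores) if s >= threshold]
    return len(responsible) > 0, responsible


# Hypothetical per-patch scores for two records.
print(predict_bag([0.1, 0.2, 0.9, 0.3]))
print(predict_bag([0.1, 0.2, 0.3]))
```

Returning the responsible instances is what would allow labels to be attached to specific segments rather than to whole records.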

#### **3.2 One-class classification**


Seismic signal classification problems are unbalanced. Events of some classes are very common and, therefore, a lot of examples are available. In contrast, other classes are rare and just a few examples of them can be collected. Based on the given examples, only a boundary descriptor of the most frequent classes can be accurately built. Considering a rare type of seismic events as the outliers and the rest of the events as the target class clearly follows the definition of a one-class classification problem (Juszczak, 2006; Tax, 2001).

One-class classifiers are sound alternatives to multi-class ones when rare or abnormal states are very infrequent, costly to induce (e.g. faults in machinery) or impossible to obtain upon request: a person cannot be asked to get sick with particular symptoms, and a volcano cannot be artificially induced to exhibit particular rare seismic events. To the best of the authors' knowledge, this approach has not yet been applied to the automated identification of earthquakes.
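A minimal sketch of the idea follows: a descriptor is fitted on examples of the target class only, and anything falling too far from it is flagged as an outlier. The descriptor used here, a Gaussian with independent features and a distance threshold chosen so that a given fraction of the training targets is accepted, is just one simple choice assumed for illustration; the class name, the parameter `accept_fraction` and the toy feature vectors are hypothetical.

```python
import math
import statistics


class GaussianOneClass:
    """One-class descriptor: standardized Euclidean distance to the target mean."""

    def fit(self, targets, accept_fraction=0.95):
        columns = list(zip(*targets))
        self.means = [statistics.fmean(c) for c in columns]
        # Fall back to 1.0 for constant features to avoid division by zero.
        self.stds = [statistics.pstdev(c) or 1.0 for c in columns]
        distances = sorted(self.distance(x) for x in targets)
        cut = max(1, math.ceil(accept_fraction * len(distances)))
        self.threshold = distances[cut - 1]
        return self

    def distance(self, x):
        return math.sqrt(
            sum(((v - m) / s) ** 2 for v, m, s in zip(x, self.means, self.stds))
        )

    def is_target(self, x):
        return self.distance(x) <= self.threshold


# Toy two-dimensional feature vectors of the frequent (target) class.
frequent_events = [
    (-1, -1), (-1, 1), (1, -1), (1, 1), (0, 0),
    (0.5, -0.5), (-0.5, 0.5), (0, 1), (1, 0), (0, -1),
]
model = GaussianOneClass().fit(frequent_events)
print(model.is_target((0.1, 0.0)))
print(model.is_target((10.0, 10.0)))
```

No example of the rare class is needed for training, which is precisely the situation described above.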

#### **3.3 Adaptive single and multiple classifiers**

Seismic signals of the same events may look completely different across seismic stations, waveforms of the same classes of events differ among volcanoes and, moreover, volcano geophysical conditions change over time. This dynamic nature motivates the application of classifier adaptation strategies, either for single or multiple classifiers (Aksela, 2007), that allow learning from the test set in order to adapt or modify the decision regions.

Individually adaptive classifiers have been employed in optical character recognition (OCR) in order to prevent accuracy deterioration due to the statistical dissimilarity between the training and test data (Veeramachaneni & Nagy, 2003). Such a dissimilarity is introduced in OCR by the proliferation of fonts and typefaces. Similarly, in speech recognition, adaptation has been extensively applied to deal with unseen conditions or time-variant speakers (Herbig et al., 2011). In summary, an exploratory study on the application of adaptive single and multiple classifiers may provide a convenient solution for seismic signal classification under the varying conditions mentioned above. It might indeed be an alternative to re-training or entirely re-designing deployed PR systems.

#### **3.4 Classifier optimization**

The relative importance of different classification outcomes must be taken into account when optimizing and evaluating the design of a PR system. Such differences are reflected in a trade-off between the true positive rate and the false positive rate, which can be represented in receiver operating characteristic (ROC) curves; examining them gives the designer insights for tuning the classifiers. Classical ROC curves are restricted to two-class problems, in which one class is designated as positive (target) and the other one as negative.

The automated classification of seismic volcanic signals is a multiclass PR problem. Therefore, the application of classical ROC analysis is only possible under a one-against-rest approach. Nonetheless, recent research efforts have extended ROC analysis to multiclass cases while overcoming restrictive computational complexity issues that limit straightforward multiclass generalizations; see for instance (Landgrebe, 2007; Landgrebe & Paclík, 2010; Paclík et al., 2010). Optimal classification systems for the automated identification of volcanic earthquakes might be designed by using those novel ROC approaches.
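The one-against-rest use of classical ROC analysis can be sketched as follows: one class is designated as positive, all the others are merged into the negative class, and the decision threshold is swept over the classifier scores. This sketch covers only the classical two-class construction, not the multiclass extensions cited above; the function names, scores and labels are made up for illustration.

```python
def roc_points(scores, is_positive):
    """(false positive rate, true positive rate) pairs for decreasing thresholds.

    Assumes that at least one positive and one negative example are present.
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, p in zip(scores, is_positive) if p and s >= threshold)
        fp = sum(1 for s, p in zip(scores, is_positive) if not p and s >= threshold)
        points.append((fp / n_neg, tp / n_pos))
    return points


def one_against_rest_roc(scores_for_class, labels, target_class):
    """ROC curve of target_class against all remaining classes merged together."""
    return roc_points(scores_for_class, [label == target_class for label in labels])


def area_under_curve(points):
    """Trapezoidal area under a ROC curve given as ordered (fpr, tpr) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up scores for the class "LP" on four events of three classes.
labels = ["LP", "LP", "VT", "TR"]
lp_scores = [0.9, 0.8, 0.3, 0.1]
curve = one_against_rest_roc(lp_scores, labels, "LP")
print(curve)
print(area_under_curve(curve))
```

Repeating the procedure with each class as the target yields one curve per class, which is the restricted view that the multiclass ROC extensions aim to overcome.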

#### **3.5 Multiway representations**

Multiway data analysis has been extensively used in chemometrics and psychometrics. It extends classical multivariate statistical techniques such as component analysis, factor analysis, cluster analysis, correspondence analysis, and multidimensional scaling to multiway data (Kroonenberg, 2008). Multiway means that data are arranged in higher-order arrays instead of the usual two-dimensional matrices, in which each row represents an object and each column is associated with a feature or measurement. Data collected at different times, conditions or locations are suitable to be considered as multiway data sets (Porro-Muñoz et al., 2009).

Porro-Muñoz et al. (2010a;b; 2011) derived intuitive multiway representations for classifying seismic volcanic signals. Spectrograms and scalograms are computed for each segmented seismic signal and, afterwards, the whole set is arranged by stacking those initial two-dimensional representations. As a result, a so-called profile-data configuration is obtained, where the three dimensions are associated with signals, time and frequency, respectively. Further studies on the design of custom classifiers for multiway data sets are needed. Moreover, other multiway arrangements might be created by considering, for instance, the recording stations or the sensor components (vertical, North-South, and East-West) as additional ways, i.e. dimensions.
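The stacking step can be sketched with nested lists: each spectrogram is a time-by-frequency matrix, and stacking them produces a three-way array indexed by signal, time and frequency. For contrast, the sketch also shows the unfolding into an ordinary signals-by-features matrix that a conventional classifier would require, which discards the explicit time-frequency structure. The function names and the tiny matrices are assumptions for illustration; it is also assumed that all spectrograms share the same time-frequency grid.

```python
def stack_spectrograms(spectrograms):
    """Stack equally sized time x frequency matrices into a 3-way array."""
    n_time = len(spectrograms[0])
    n_freq = len(spectrograms[0][0])
    for s in spectrograms:
        if len(s) != n_time or any(len(row) != n_freq for row in s):
            raise ValueError("all spectrograms must share the same time-frequency grid")
    return [[list(row) for row in s] for s in spectrograms]


def shape(array3):
    """(signals, time frames, frequency bins) of a 3-way array."""
    return (len(array3), len(array3[0]), len(array3[0][0]))


def unfold_by_signal(array3):
    """Flatten each signal's spectrogram into one row: signals x (time * frequency)."""
    return [[value for row in s for value in row] for s in array3]


# Three toy spectrograms with 2 time frames and 4 frequency bins each.
spectrograms = [
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[0, 1, 0, 1], [1, 0, 1, 0]],
    [[9, 9, 9, 9], [0, 0, 0, 0]],
]
data = stack_spectrograms(spectrograms)
print(shape(data))
print(unfold_by_signal(data)[1])
```

Adding recording stations or sensor components as further ways would simply add one more level of nesting to the array.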

### **4. Challenges and constraints in deploying automated systems**

This section is devoted to a discussion on the difficulties and challenges for the design and deployment of custom solutions at the volcano observatories. Technical challenges and non-technical constraints are summarized. Lastly, a few remarks concerning industrial and commercial implementation alternatives are made.

#### **4.1 Technical challenges and non-technical constraints**

Technical challenges in the deployment of PR systems for the automated recognition of seismic volcanic signals are mainly related to the following issues: (1) computational aspects and (2) local conditions. The first issue depends on the actual computational requirements of classification algorithms and their associated demands for data storage. The latter is becoming less relevant since disk storage capacity has grown exponentially and hardware prices have declined. In spite of that, processing the stored data may still be cumbersome, especially when dealing with continuous recording as commented by Langer & Falsaperla (2003). Classification speed is of crucial importance for real-time applications. Computational complexities of all stages in the PR pipeline (see Fig. 5) must be carefully estimated in terms of orders or FLOPS<sup>3</sup> in order to guarantee fast execution. Such a condition implies a reasonable trade-off between complexity and classification performance.

The second issue, local conditions, includes the consideration of several volcano-specific factors such as those mentioned at the beginning of Sec. 3.3. In addition, the so-called source, path and local site effects require special attention. They cause waveforms of the same seismic event recorded at different stations to exhibit distinct characteristics; for instance, time delays introduced by the physical distance between stations, and amplifications or attenuations of signal components at certain frequencies due to geophysical properties that act as filters. See (Havskov & Alguacil, 2004, Chap. 9) and (McNutt, 2005) for further details about these effects, their characterization and their correction.

<sup>3</sup> Floating point operations per second.

Non-technical constraints are mainly related to budget limitations for undertaking R&D projects at volcano observatories. Even though the research stage can be carried out in association with universities and institutes, as reflected in the discussion in Sec. 2.1, the development and implementation of in-house solutions is subject to organizational practices and policies at the observatories. Therefore, high-level collaboration needs to be formalized, so that isolated partnerships between individuals become supported by inter-institutional cooperation agreements.
