very important capability of a learning system. In classification problems, novelty detection is particularly useful when a relevant class is under-represented in the data, so that a classifier cannot be trained to reliably recognize that class, or when hierarchical classifiers trained on different concept information disagree on the output. The goal of novelty detection is twofold: to be as accurate as possible in detecting inputs which do deviate from the nominal distribution (true positives) and to predict how many normal inputs will be erroneously flagged as positives (false positives). Novelty detection is also known as one-class classification [28] or learning from only positive (or only negative) examples. The standard approach has been to assume that novelties are outliers with respect to the nominal distribution and to build a novelty detector by estimating a level set of the nominal density. This approach allows fixing a threshold for acceptance of new data while having a degree of control over the number of false alarms raised. Using this framework, novelty detection can be interpreted as a binary classification problem. Several approaches have been used to tackle this problem: statistical methods, neural networks and support vector methods (see [29–32] for good reviews of these techniques). Bayesian methods have been used to provide a nonparametric estimation of the probability distribution [33], and content-based reasoning (CBR), based on Bayesian decision theory, has also been utilized. A common drawback of all these approaches is the assumption that novelties are uniformly distributed on the support of the nominal distribution, which is not true in most cases, mainly when the feature space dimension is high.

Novelty detection has already been applied successfully to single modalities of forensic multimedia data. For instance, the detection and classification of abnormal events in surveillance video has been studied in [34–36]. In [33], novelty detection is applied to online document clustering, and in [37], novelty detection is applied to image sequences.

A new and promising approach to novelty detection in audiovisual data has been proposed in [38]. In this approach, novelty is not signaled by the negative output of multiple classifiers but by the disagreement of several hierarchical concept classifiers trained on different but hierarchically related concepts. Here, the novelty is represented not by a fully new item but by relevant changes from previously seen items. In forensic multimedia data, this situation is very common when only one modality is affected.

According to [33], several factors make the novelty detection problem very challenging:

• The definition of a normal region that encompasses every possible normal behavior is very difficult. In addition, the boundary between normal and anomalous behavior is often not precise. Thus, an anomalous observation that lies close to the boundary can actually be normal and vice versa.

• When anomalies are the result of malicious actions, the malicious adversaries often adapt themselves to make the anomalous observations appear normal, thereby making the task of defining normal behavior more difficult.

• In many domains, normal behavior keeps evolving, and a current notion of normal behavior might not be sufficiently representative in the future.

• The exact notion of anomaly is different for different application domains. For example, the rate of change for novelty may be different in each application. Thus, applying a technique developed in one domain to another one is not straightforward.

*2.2.2 Beyond the state of the art*

We will consider novelty detection as a CBR problem [34]. The CBR-based novelty detection will consist of successively adapting or evolving the previously obtained solutions, taking into account the data properties, the user's needs and any other prior knowledge. We will use a combination of statistical and similarity-based methods to solve the problems underlying the CBR methodology. Our proposed scheme differs from existing methodologies for novelty detection [29, 30, 39–41] in that it can simultaneously perform novelty detection and handling and also considers the incremental nature of the data.
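As a point of reference for the standard approach discussed in this section (estimating a level set of the nominal density and fixing an acceptance threshold that controls false alarms), the following is a minimal sketch rather than the scheme proposed here. It makes several assumptions that do not come from the cited works: the data are one-dimensional, the density is estimated with a Gaussian kernel, the bandwidth is fixed by hand, and the threshold is calibrated on a held-out half of the nominal data. The names `gaussian_kde` and `fit_level_set_detector` are hypothetical.

```python
import math
import random


def gaussian_kde(train, bandwidth):
    """Return a function estimating the nominal density from 1-D samples."""
    norm = 1.0 / (len(train) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - t) / bandwidth) ** 2) for t in train
        )

    return density


def fit_level_set_detector(nominal, false_alarm_rate=0.05, bandwidth=0.3, seed=0):
    """Fit a detector that flags inputs whose estimated density is below a level.

    Assumption: the level is the empirical `false_alarm_rate` quantile of the
    densities of held-out nominal samples, so at most that fraction of the
    calibration samples is flagged as novel.
    """
    data = list(nominal)
    random.Random(seed).shuffle(data)
    split = len(data) // 2
    train, calib = data[:split], data[split:]

    density = gaussian_kde(train, bandwidth)
    scores = sorted(density(x) for x in calib)
    threshold = scores[int(false_alarm_rate * len(scores))]

    def is_novel(x):
        # Inputs outside the estimated level set are reported as novelties.
        return density(x) < threshold

    return is_novel, threshold


# Usage on synthetic nominal data (standard normal samples, an assumption).
rng = random.Random(42)
nominal = [rng.gauss(0.0, 1.0) for _ in range(400)]
is_novel, threshold = fit_level_set_detector(nominal, false_alarm_rate=0.05)

fresh = [rng.gauss(0.0, 1.0) for _ in range(1000)]
observed_false_alarms = sum(is_novel(x) for x in fresh) / len(fresh)
print("threshold:", threshold)
print("observed false alarm fraction:", observed_false_alarms)
print("is 8.0 novel?", is_novel(8.0))
```

Calibrating the threshold on held-out nominal samples is what gives the "degree of control over the number of false alarms": the chosen quantile bounds the fraction of calibration samples that are rejected, and the fraction observed on fresh nominal data should be close to it. The sketch also shows the drawback noted above, since a kernel density estimate of this kind becomes unreliable as the feature space dimension grows.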
