*Novel Methods for Forensic Multimedia Data Analysis: Part I*


*DOI: http://dx.doi.org/10.5772/intechopen.92167*


*Digital Forensic Science*

### *4.1.2 Beyond the state of the art*

ground-based imaging systems or extraterrestrial observations of the earth and the planets), commercial photography [3, 4], surveillance and forensics [5, 6], medical imaging [7] (e.g., X-rays, digital angiograms, autoradiographs, MRI, and SPECT), and security tasks where commercial photography and other image modalities like Synthetic Aperture Radar (SAR) [8] and Passive Millimeter (PMMW) [9] are frequently used.

Degradations in such images may appear in different forms. They may be due to a known or an unknown blurring function, which leads to the consideration of deconvolution [9–13] and blind deconvolution [3, 14] problems. They may also be due to the use of very low-resolution devices, which leads to the combination of several low-resolution images to obtain a high-resolution one (the so-called super-resolution problem [15, 16]), or to the use of highly compressed images, which suffer from compression artifacts [17]. These types of degradations must be removed before the images or video sequences are used for classification or decision making. Interestingly, all the problems described above can be formulated within the Bayesian framework [18–20]. A fundamental principle of the Bayesian philosophy is to regard all parameters and unobservable variables as unknown stochastic quantities, assigning probability distributions based on subjective beliefs. Thus, the original image(s), the observation noise, and even the function(s) defining the acquisition process are all treated as samples of random fields, with corresponding prior probability density functions that model our knowledge about the imaging process and the nature of images.

Once the problem is modeled, inference is needed. The recently developed variational Bayesian methods have attracted a lot of interest in Bayesian statistics, machine learning, and related areas [18–20]. A major disadvantage of traditional methods (such as expectation maximization (EM)) is that they generally require exact knowledge of the posterior distributions of the unknowns; otherwise, poor approximations of them are used. Variational Bayesian methods overcome this limitation by approximating the unknown posterior distributions with simpler, analytically tractable distributions, which allow the needed expectations to be computed and therefore extend the applicability of Bayesian inference to a much wider range of modeling options: more complex priors on the unknowns (which are very much needed in applications involving images) can be used with ease, resulting in improved estimation accuracy.
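As a hedged illustration of the inference step, the standard mean-field form of variational Bayes can be written out as follows. The notation is an assumption introduced here and is not taken from the chapter: $\mathbf{y}$ denotes the observed image(s), $\mathbf{x}$ the unknown original image, and $\Theta$ the remaining unknowns (for example, hyperparameters or the blur).

```latex
% Joint model (assumed factorization): prior on \Theta, image prior, observation model
p(\Theta, \mathbf{x}, \mathbf{y}) = p(\Theta)\, p(\mathbf{x} \mid \Theta)\, p(\mathbf{y} \mid \mathbf{x}, \Theta)

% Approximate the intractable posterior by the tractable q closest to it in the KL sense
\hat{q}(\Theta, \mathbf{x}) = \arg\min_{q} \int q(\Theta, \mathbf{x})
  \ln \frac{q(\Theta, \mathbf{x})}{p(\Theta, \mathbf{x} \mid \mathbf{y})}\, d\Theta\, d\mathbf{x}

% Mean-field assumption and the resulting alternating updates
q(\Theta, \mathbf{x}) = q(\Theta)\, q(\mathbf{x}), \qquad
q(\mathbf{x}) \propto \exp\!\big( \mathbb{E}_{q(\Theta)}[\ln p(\Theta, \mathbf{x}, \mathbf{y})] \big), \qquad
q(\Theta) \propto \exp\!\big( \mathbb{E}_{q(\mathbf{x})}[\ln p(\Theta, \mathbf{x}, \mathbf{y})] \big)
```

Only expectations under the simpler distributions $q(\mathbf{x})$ and $q(\Theta)$ are required, which is why priors that make the exact posterior intractable can still be used.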

Techniques for detecting artifacts in images and videos are of paramount importance. To trust the information extracted from images and videos, it is necessary to make sure that they have been recorded by a camera and that no artifact has been added. The detection of artifacts is key to using an image or a video in court: the integrity of images and videos used as evidence must be clearly assessed. The trustworthiness of images and videos clearly plays an essential role in many security areas, including forensic investigation, criminal investigation, surveillance systems, and intelligence services.

As stated by Mahdian and Saic [21], verifying the integrity of digital images and detecting traces of tampering without using any protecting pre-extracted or pre-embedded information has become an important research field of image processing. We will utilize and develop blind methods for detecting image forgery, that is, methods that use only the image function to perform the forgery detection task. These methods are based on the fact that forgeries introduce specific detectable changes (e.g., statistical changes) into the image. In high-quality forgeries, these changes cannot be found by visual inspection. Existing methods mostly try to identify various traces of tampering and detect them separately. The final decision about the forgery can then be made by fusing the results of the separate detectors.

Blind methods can be classified into several categories. In the detection of near-duplicated image regions, the forgery consists of a part of the image being copied and pasted into another part of the same image with the intention of hiding an object or a region. There are methods capable of detecting near-duplicated parts of an image, although they usually require human interpretation of the results; see Refs. [21–23]. A different category includes interpolation and geometric transformations, which are typically based on the resampling of a portion of an image onto a new sampling lattice; see, for example, Ref. [24]. In the photomontage detection problem, one of the fundamental tasks is the detection of image splicing, which can sometimes be based on analyzing the lighting conditions. Another category is related to compression methods. To alter an image, the image is typically loaded into photo-editing software and, once the changes are made, resaved. Methods capable of recovering the compression history of an image can therefore be helpful in forgery detection. Another important category is the study of noise characteristics and chromatic aberrations [25, 26]. Along the same lines, blur and sharpening can also be analyzed to detect the concealment of traces of tampering. When two or more images are spliced together, it is often difficult to keep the perspective of the resulting image correct. Applying principles from projective geometry can therefore also be a suitable way to detect traces of tampering. There are also other groups of forensic methods effective in forgery detection, for instance, single-view recaptured image detection, aliveness detection for face authentication, and device identification in digital image forensics; see Refs. [27–30].
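The copy-and-paste category can be illustrated with a deliberately simple sketch. It is not the method of Refs. [21–23]: it matches image blocks exactly, so it only finds verbatim copies, whereas near-duplicate detection would need block features that are robust to blur, noise, or recompression. The function name `find_duplicated_blocks` and the parameters `block` and `min_shift` are hypothetical names chosen for this example.

```python
from collections import Counter, defaultdict

import numpy as np


def find_duplicated_blocks(image, block=8, min_shift=16):
    """Count shift vectors (dy, dx) between pairs of identical blocks.

    Toy exact-match detector (an assumption of this sketch): a copied
    region shows up as many block pairs sharing the same shift vector.
    """
    h, w = image.shape
    seen = defaultdict(list)
    # Index every block position by the raw bytes of the block.
    for y in range(h - block + 1):
        for x in range(w - block + 1):
            key = image[y:y + block, x:x + block].tobytes()
            seen[key].append((y, x))
    shifts = Counter()
    for positions in seen.values():
        # Positions are stored in raster order, so each pair is counted once.
        for i in range(len(positions)):
            for j in range(i + 1, len(positions)):
                (y0, x0), (y1, x1) = positions[i], positions[j]
                dy, dx = y1 - y0, x1 - x0
                # Ignore pairs that are too close to be a meaningful copy.
                if max(abs(dy), abs(dx)) >= min_shift:
                    shifts[(dy, dx)] += 1
    return shifts


# Synthetic test image: random texture with one 16x16 region copied elsewhere.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
img[40:56, 36:52] = img[4:20, 8:24]

print(find_duplicated_blocks(img).most_common(1))  # -> [((36, 28), 81)]
```

A dominant shift vector supported by many block pairs is the cue a human examiner would then inspect. On images with large flat areas, exact matching produces many spurious pairs, which is one reason practical detectors still require interpretation of the results.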

### **4.2 Case-based reasoning**

### *4.2.1 State of the art*

Case-based reasoning (CBR) has been shown to be a successful problem-solving method in different applications where generalized knowledge is lacking. CBR has been used to interpret images [31, 32], 1-D signals [31, 33, 34], and text cases [35]. It has also been used for meta-learning of the best parameters of image segmentation [36] and classification methods [37], so that the best processing and classification results can be achieved even though domain knowledge is lacking. These systems succeed because cases can be collected more easily than rules or other domain data, and because their learning and maintenance mechanisms make them flexible and allow incremental improvement of system performance during usage of the system.

### *4.2.2 Beyond the state of the art*

The necessity of studying the taxonomy of similarity measures was pointed out, and a first attempt to construct such a taxonomy was made, by Perner [38]; the topic has been further studied by Cunningham [39]. More work is necessary, especially when more than one feature type and representation is used in a CBR system, as is the case for multimedia data. These multimedia cases will be more complex than the cases used in the systems described above, which focus on only one specific data type. Understanding the similarity between these multimedia cases will require the police investigator to have more complex knowledge of similarity for the different types of multimedia data. Developing novel similarity measures for text, videos, images, and audio and speech signals, and constructing a taxonomy that makes the relations between the different similarity measures understandable, will be a challenging task. Aggregating the different types of similarity measures is another challenging topic. Specific knowledge for the different types of data such as text [40, 41], images [42–44], video [45], 1-D signals, and meta-learning [36] is required in this work. New similarity measures for multimedia data types, as well as new data representations and ontologies, will be developed. A complex CBR system that can handle so many different data types, similarities, and data sources is a novelty.
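One simple way to picture similarity aggregation is a weighted average of per-modality similarities. The sketch below is an assumption made for illustration, not the aggregation scheme proposed in this work; the measures (`jaccard`, `histogram_intersection`), the dictionary-based case layout, and the weights are all hypothetical.

```python
def jaccard(a, b):
    """Token-set overlap for short text descriptions, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms (each sums to 1), in [0, 1]."""
    return sum(min(p, q) for p, q in zip(h1, h2))


def aggregate_similarity(query, case, measures, weights):
    """Weighted average over the modalities present in both query and case."""
    total, weight_sum = 0.0, 0.0
    for modality, measure in measures.items():
        if modality in query and modality in case:
            w = weights.get(modality, 1.0)
            total += w * measure(query[modality], case[modality])
            weight_sum += w
    # No shared modality means there is nothing to compare.
    return total / weight_sum if weight_sum > 0 else 0.0


measures = {"text": jaccard, "image": histogram_intersection}
weights = {"text": 1.0, "image": 3.0}

query = {"text": "stolen red car", "image": [0.5, 0.3, 0.2]}
case = {"text": "red car found", "image": [0.4, 0.4, 0.2]}

# Text similarity is 0.5 and image similarity is 0.9, so the result is about 0.8.
print(aggregate_similarity(query, case, measures, weights))
```

Skipping modalities that are missing from either case keeps the score comparable across a heterogeneous case base, but it also hides the fact that less evidence was used. A real system would likely need a more principled treatment of missing data.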

Retrieval of multimedia data from a case base can be refined by relevance feedback mechanisms [46–52]. The user is asked to mark retrieved results as "relevant" or not with respect to their interests. The feature weights and the similarity measures are then adapted to reflect the user's interests. Relevance feedback can be implemented in a number of ways, for example, as the solution of an optimization problem or as a classification problem. The most suitable formulation has to be devised according to the problem at hand. Thus, the main challenge will be to formulate the relevance feedback problem for forensic applications so that the search is driven toward the cases most relevant to the case at hand.
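The adaptation of feature weights can be sketched with a common re-weighting heuristic: features on which the cases marked as relevant agree receive a high weight, and features on which they vary receive a low one. This is only one possible formulation and is not the formulation this work proposes for forensic applications; the function names and the toy data are hypothetical.

```python
import numpy as np


def update_feature_weights(relevant, eps=1e-6):
    """Weight each feature by the inverse spread among relevant cases."""
    relevant = np.asarray(relevant, dtype=float)
    spread = relevant.std(axis=0)
    weights = 1.0 / (spread + eps)  # eps avoids division by zero
    return weights / weights.sum()  # normalize so the weights sum to 1


def rank_cases(query, cases, weights):
    """Return case indices sorted by weighted Euclidean distance to the query."""
    cases = np.asarray(cases, dtype=float)
    query = np.asarray(query, dtype=float)
    dist = np.sqrt((((cases - query) ** 2) * weights).sum(axis=1))
    return np.argsort(dist)


# Cases the user marked as relevant: feature 0 is stable, feature 1 varies widely.
relevant = [[1.0, 0.0], [1.1, 5.0], [0.9, 10.0]]
weights = update_feature_weights(relevant)

query = [1.0, 5.0]
cases = [[1.0, 9.0], [3.0, 5.0]]

print(rank_cases(query, cases, np.array([0.5, 0.5])))  # before feedback: case 1 first
print(rank_cases(query, cases, weights))               # after feedback: case 0 first
```

After feedback, the case that matches the query on the stable feature moves to the top, even though it is farther away in the unweighted sense.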

Research on learning feature weights and similarity measures has been described in Refs. [53–55]. Case mining from raw data in order to obtain more generalized cases has been described by Jaenichen and Perner [56]. Learning of generalized cases and of the hierarchy over the case base has been presented by the authors of Refs. [45, 57]. These works demonstrate that system performance can be significantly improved by these functions of a CBR system.

New techniques for learning feature weights and similarity measures and for case generalization across different multimedia types are necessary and will be developed for these tasks.

The question of the life cycle of a CBR system goes along with its learning capabilities, case base organization and maintenance mechanisms, standardization, and software engineering, for which new concepts should be developed. As a result, we should arrive at generic components for a CBR system for multimedia data analysis and interpretation that form a set of modules that can be easily integrated into, and updated within, the CBR architecture. The CBR system architecture should make it easy to configure modules for newly arising tasks.

The partner IBAI holds a number of national and international patents that protect its work on CBR for images and signals. It is to be expected that new methods will be developed that can be protected by patents and can strengthen the international competitiveness of European entities in CBR systems.
