### **4. Expected results**

As expected results, we will have the following:


The toolkit will be used as a benchmark for testing instruments and rules involving technical skills, operators and courts.

The scenario: achieving an effective set of technical solutions and of law-compliant procedural standards is a valuable goal for the 'global society', where cultural interaction, the de-territorialization of social behaviors and the interdependency of phenomena (e.g., environment, health, immigration, crime prevention, the fight against terrorism) require law makers and courts to face the evolution of social phenomena and their legal effects within a pan-European harmonized area of justice. European judges tend to adjust the national legal framework by referring to common and shared principles, invoking constitutional rules as higher principles to which case solutions can be anchored. The adoption of standards of conduct in judicial procedures no longer seems a questionable issue but rather an undeniable fact at the center of a lively discussion within the international community of jurists, scholars and courts. In a field of growing relevance, like digital evidence in the courts, the project will provide practices of use (cases, tests) and guidelines.


*Novel Methods for Forensic Multimedia Data Analysis: Part II*

*DOI: http://dx.doi.org/10.5772/intechopen.92548*



Citizens must know the procedures, understand the purposes and be able to assess the results. When State action touches sensitive 'public rights' such as freedom, security and equality, citizens must be assured that their fundamental rights are secured and that the State's actions are directed at protecting them against crime at the minimum cost to their freedom. In the field of data mining, the question has become, beyond the regulatory aspects, a matter of ethics: not simply what type of data is collected and whether it is relevant, but also how it is collected and by whom. That securely de-identified data can be collected without consent, provided there is a legitimate purpose, is a common argument; still, law enforcement agencies and courts have to legitimize that purpose in a way that citizens can understand. The project will therefore produce publicly available reports on the technologies, explaining their use and application by means of transparent guidelines.

The business-oriented benefits of this project take the form of new techniques and tools for the analysis of forensic multimedia data that can be marketed as toolboxes or as single software solutions for the specific tasks described in the proposal. The project will markedly improve on the solutions currently in use by the companies involved in the proposed work and will open new markets for those that are not currently involved in forensics. We also foresee the establishment of new enterprises that market and further develop the proposed software solutions in the high-technology field of security, as well as a special marketing and services entity that will advertise the tools in the security field and among police forces.

End-users will benefit from more effective analysis of forensic data, based on the proposed standards and methodology.

### **5. Conclusions**

With this chapter, we finish our work on multimedia forensic data analysis, which was started in Part I of Forensic Multimedia Data Analysis [1].

Forensic investigations on multimedia evidence usually develop along four different steps: analysis, selection, evaluation and comparison. During the analysis step, technicians typically look at huge amounts of different multimedia data (e.g., hours of video or audio recordings, pages and pages of text, hundreds and hundreds of pictures) to reconstruct the dynamics of the event and collect any piece of relevant information. This step obviously requires a lot of time, and many factors can make it difficult, among which data heterogeneity, quality and quantity are the most relevant. Afterwards, during the selection step, technicians select and acquire the most meaningful pieces of information from the different multimedia data (e.g., frames from videos, audio fragments and documents). Then, in the evaluation step, they look for relevant elements in the selected data, which will be further investigated in the comparison step. They can select heads, vehicles, license plates, guns, sentences, sounds and all other elements that can link a person to the event. The main problems are the low quality of media data due to high compression, adverse environmental conditions (e.g., noise and bad lighting conditions), camera/object position and facial expressions. Finally, during the comparison step, technicians place the extracted elements side by side with a known element of comparison. From the comparison of general and particular characteristics, the operators give a level of similarity. In forensic applications, automatic pattern recognition systems give poor performance because of the high variability of the recorded data. On the other hand, human perception is a great pattern recognition system, but it is characterized by high subjectivity and unknown reproducibility and performance.
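The four-step workflow described above can be pictured as a simple pipeline. The following is a minimal, purely illustrative sketch; the function names, scores and data records are invented, not part of any existing tool.

```python
# Illustrative sketch of the four forensic steps: analysis, selection,
# evaluation and comparison. All names and structures are hypothetical.

def analyze(evidence):
    """Analysis: scan all media items and index anything potentially relevant."""
    return [item for item in evidence if item["relevant"]]

def select(indexed, limit):
    """Selection: keep only the most meaningful pieces (here: top-scored)."""
    return sorted(indexed, key=lambda i: i["score"], reverse=True)[:limit]

def evaluate(selected):
    """Evaluation: extract elements (faces, plates, voices) for comparison."""
    return [i["element"] for i in selected]

def compare(elements, reference):
    """Comparison: rate each element's similarity to a known reference."""
    return {e: (1.0 if e == reference else 0.0) for e in elements}

evidence = [
    {"relevant": True, "score": 0.9, "element": "plate-AB123"},
    {"relevant": False, "score": 0.2, "element": "noise"},
    {"relevant": True, "score": 0.7, "element": "face-42"},
]
result = compare(evaluate(select(analyze(evidence), limit=2)), "plate-AB123")
```

In a real investigation each stage is, of course, a human-in-the-loop activity; the sketch only shows how the stages feed one another.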


*Digital Forensic Science*


In this chapter, we propose to develop a toolkit of methods and instruments that will be able to support analysts along all these steps, strongly reducing human intervention. First of all, it will include instruments to process different kinds of media data and, possibly, correlate them. This will obviously reduce the time spent finding the correct instruments for processing the medium at hand. Furthermore, it will comprise preprocessing tools that alleviate, by filtering and enhancement, the problem of low data quality. In particular, for image and video data, great help will come from super-resolution methods that maximize the information contained in low-resolution images or videos (e.g., fostering the process of face reconstruction and recognition from blurred images). This feature will greatly support all the subsequent steps.
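To make the preprocessing idea concrete, here is a minimal sketch of one classical enhancement step, unsharp masking, on a grayscale image stored as a list of rows. It stands in for the filtering and enhancement tools mentioned above; real forensic toolchains would use far more sophisticated methods (e.g., learned super-resolution), and the tiny images here are invented.

```python
# Unsharp masking: sharpen an image by adding back the difference between
# the image and a blurred copy of itself. Stdlib only, for illustration.

def mean_blur(img):
    """3x3 mean filter with edge clamping."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[yy][xx]
                    for yy in range(max(0, y - 1), min(h, y + 2))
                    for xx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out

def unsharp(img, amount=1.0):
    """Sharpened image, with pixel values clipped to the 0..255 range."""
    blur = mean_blur(img)
    return [[min(255.0, max(0.0, p + amount * (p - b)))
             for p, b in zip(prow, brow)]
            for prow, brow in zip(img, blur)]

row = unsharp([[0.0, 100.0, 200.0]])   # gradient row: contrast is boosted
flat = unsharp([[100.0] * 5 for _ in range(5)])  # uniform area: unchanged
```

Note the design point: enhancement increases local contrast where structure exists (the gradient row) while leaving uniform regions untouched, which is exactly why it helps subsequent recognition steps.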

Semi-automatic tools will be included to assist human experts in selecting the most meaningful pieces of data. In particular, methods for the selection of frames from videos, images from large databases, keywords from text documents and pieces from audio signals will be developed for a first skimming of huge amounts of data according to criteria specified by the users. To this end, a great advantage will come from organizing the feature extraction methods, which will also allow users to relate different types of media and operate on them simultaneously, when possible.
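A very simple form of such a "first skimming" for video can be sketched as follows: keep a frame only when it differs enough from the last kept frame, with the threshold acting as the user-specified criterion. The flat pixel lists and the threshold value are illustrative assumptions.

```python
# Frame skimming by change detection: retain only frames that differ
# from the previously kept frame by more than a user-chosen threshold.

def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two flat frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def skim_frames(frames, threshold):
    """Return indices of the frames to keep."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame as a starting point
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept

frames = [[0, 0, 0], [1, 0, 0], [90, 90, 90], [91, 92, 90]]
kept = skim_frames(frames, threshold=10)   # near-duplicates are dropped
```

The same thresholding pattern carries over to the other media types: a distance function plus a user criterion yields a first, coarse reduction of the material an expert must inspect.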

For evaluation and comparison, the toolkit will comprise advanced (semi-)automated instruments for processing the different media, so as to allow person and object retrieval, identification and recognition, writer identification, automatic handwriting recognition, and speech and speaker recognition. All these methods will address the problem of recognition in the wild, going strongly beyond the current state of the art. Using the toolkit will ensure the reproducibility of the analysis and foster operators' objectivity.
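At their core, such identification instruments compare feature vectors extracted from the media against enrolled references. The sketch below shows the bare pattern with cosine similarity; the enrolled "speakers" and their three-dimensional feature vectors are invented stand-ins for real embeddings.

```python
# Nearest-match identification over feature vectors, as used (in far more
# elaborate form) for speaker, writer or face identification.

import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(query, enrolled):
    """Return the enrolled identity whose feature vector is most similar."""
    return max(enrolled, key=lambda name: cosine(query, enrolled[name]))

enrolled = {"speaker_a": [1.0, 0.1, 0.0], "speaker_b": [0.0, 1.0, 0.9]}
match = identify([0.9, 0.2, 0.1], enrolled)
```

In a forensic setting the raw similarity score, not just the best match, would be reported, since the operator must translate it into a defensible level of support for an identification.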

Finally, the case-based approach will ensure that the knowledge acquired during each investigation will be suitably summarized, generalized and stored so as to be profitably reused in other investigations.
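The case-based idea can be illustrated with a toy case base: each solved case is stored with a set of descriptive tags and its solution, and a new investigation retrieves the most similar past case. The tags, solutions and Jaccard similarity choice are illustrative assumptions, not the project's actual design.

```python
# Case-based retrieval sketch: find the stored case most similar to a new
# one (Jaccard similarity over tag sets) and reuse its solution.

def jaccard(a, b):
    """Similarity of two tag sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 0.0

case_base = [
    {"tags": {"video", "low_light", "face"}, "solution": "enhance_then_match"},
    {"tags": {"audio", "speaker"}, "solution": "speaker_model"},
]

def retrieve(new_tags):
    """Return the stored solution of the most similar past case."""
    return max(case_base, key=lambda c: jaccard(c["tags"], new_tags))["solution"]

best = retrieve({"video", "face"})
```

Summarizing and generalizing a solved case into such a record is what lets the knowledge gained in one investigation be reused in the next.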

Taken as a whole, the toolkit will dramatically reduce the effort operators spend on tedious and time-consuming tasks, such as retrieving and selecting multimedia data, letting them focus on the more important parts of the investigation. Currently, there is no other toolkit on the market that addresses the analysis of different types of media and has such a broad range of applications; usually, separate, non-integrated solutions are developed to tackle specific problems on single-modality data.

A fundamental concern in forensic investigation is the legal validity of evidence. We will thoroughly survey the legal frameworks at the national and European levels, thus obtaining a clear picture of the legal hurdles governing data extraction, integration and use. Criteria and rules to evaluate digital evidence will be investigated; standards for analysis, production and usability will be acquired and, if necessary, extended. The results produced by the toolkit will then be appropriate for use in different national and international courts.
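One concrete ingredient of evidence usability and traceability can be shown in a few lines: fixing each item's content with a cryptographic hash and logging every processing step, so the analysis can later be audited and reproduced. The log format and item identifiers below are invented examples, not an established standard.

```python
# Evidence integrity sketch: SHA-256 fingerprints plus a step-by-step
# audit log, so every tool application on an item is traceable.

import hashlib
import json

def fingerprint(data: bytes) -> str:
    """Content hash fixing the exact bytes that were processed."""
    return hashlib.sha256(data).hexdigest()

audit_log = []

def log_step(item_id, tool, data: bytes):
    """Record which tool was applied to which item, and to which content."""
    audit_log.append({"item": item_id, "tool": tool, "sha256": fingerprint(data)})

frame = b"raw frame bytes"
log_step("case42/frame_0001", "ingest", frame)
log_step("case42/frame_0001", "denoise", frame + b" (denoised)")
record = json.dumps(audit_log[0], sort_keys=True)  # serializable for reports
```

Because the hash changes whenever the content does, a reviewer can verify that the logged chain of tools really produced the data presented in court.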


The toolkit will guarantee:

• Validity of data as proofs of evidence, guaranteed by the evidence usability criteria

• Objectivity of data analysis, ensured by the use of mathematical methods that are well documented and explained

• Traceability of the methods applied, obtained by logging all the tools selected and applied to each type of media

• Reproducibility of the investigation process

Standardization will be particularly promoted by two main outcomes:

• The case base, which will foster the spread of similar procedures and protocols, since successful solutions will be stored and efficiently reused to support novel cases

• The feature ontological model, which will enable reproducibility and shareability of the features that can be extracted from multimedia data in different scenarios

ICT solutions such as our proposed toolkit add great value to forensic activities, since they enable analysts to obtain a sound identification, preservation, recovery and presentation of facts and opinions pertinent to an investigation. Awareness of this capability has been spreading in recent years, and several research initiatives and industries have been focusing on forensic informatics.

**References**

[1] Perner P. Novel Methods for Forensic Multimedia Data Analysis: Part I

[2] Wang A. The Shazam music recognition service. Communications of the ACM. 2006;**49**(8):44-48

[3] Morris RN. Forensic Handwriting Identification: Fundamental Concepts and Principles. San Diego, USA: Academic Press; 2000

[4] Franke K, Schomaker L, Veenhuis C, Taubenheim C, Guyon I, Vuurpijl L, et al. WANDA: A generic framework applied in forensic handwriting analysis and writer identification. In: Proceedings of 3rd International Conference on Hybrid Intelligent Systems. 2003. pp. 927-938

[5] Srihari SN, Leedham G. A survey of computer methods in forensic document examination. In: Proceedings of the 11th Conference of the International Graphonomics Society. 2003. p. 279

[6] Schomaker L. Advances in writer identification and verification. In: Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), Vol. 2. Parana, Brazil: IEEE; 2007. pp. 1268-1273

[7] Franke K, Koeppen MK. A framework for document preprocessing in forensic handwriting analysis. In: IWFHR00; 2000. pp. 73-81

[8] Srihari SN, Zhang B, Tomai C, Lee S, Shi Z, Shin Y-C. A system for handwriting matching and recognition. In: Proceedings of Symposium on Document Image Understanding Technology. 2003. pp. 67-75

[9] Niels R, Vuurpijl L. Using dynamic time warping for intuitive handwriting recognition. In: Proceedings of IGS2005. 2005. pp. 217-221

[10] Niels R, Vuurpijl L, Schomaker L. Automatic allograph matching in forensic writer identification. International Journal of Pattern Recognition and Artificial Intelligence. 2007;**21**(01):61-81

[11] Pervouchine V, Leedham G. Extraction and analysis of forensic document examiner features used for writer identification. Pattern Recognition. 2007;**40**(3):1004-1013

[12] Madhvanath S, Govindaraju V. The role of holistic paradigms in handwritten word recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001;**23**(2):149-164

[13] Rath TM, Manmatha R. Word spotting for historical documents. International Journal on Document Analysis and Recognition. 2007;**9**:139-152

[14] Rodriguez-Serrano JA, Perronnin F. Handwritten word-spotting using hidden Markov models and universal vocabularies. Pattern Recognition. 2009;**42**(9):2106-2116

[15] Rothfeder J, Manmatha R, Rath T. Aligning transcripts to automatically segmented handwritten manuscripts. In: Proceedings of the Conference on Document Analysis Systems. Vol. 3872. 2006. pp. 84-95

[16] Leydier Y, Lebourgeois F, Emptoz H. Text search for medieval manuscript images. Pattern Recognition. 2007;**40**:3552-3567

[17] Wang K, Belongie S. Word spotting in the wild. In: European Conference on Computer Vision (ECCV), Heraklion, Crete. 2010

[18] Plamondon R, Srihari SN. On-line and off-line handwriting recognition: A comprehensive survey. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000;**22**(1):63-84
