**2. Analysis of audio and music signals**

#### **2.1 Properties of audio signal**

Many types of musical signals exist, such as rock, pop, classical, country, Latin, Arabic, disco, jazz, and electronic music [29]. The hierarchy of sound signal types is shown in **Figure 1** [30].

**Figure 1.** *Types of audio signals.*

Acoustically speaking, audio signals can be classified into the following classes:

1. A single talker at a specific time [34].

2. Singing without music.

3. A mixture of background music and single-talker audio.

4. Songs that are a mixture of music and a singer's voice.

5. A pure music signal without any audio component.

6. A complex sound mixture, such as multiple singers or multiple speakers with multiple music sources.

An audio signal changes randomly and continuously through time. As an example, music and audio signals have strong energy content in the low frequencies and weaker energy content in the high frequencies [31, 32]. **Figure 2** depicts generalized time and frequency spectra of audio signals [33]. The maximum frequency *fmax* varies according to the type of audio signal: *fmax* is equal to 4 kHz in telephone transmission, 5 kHz in mono-loudspeaker recording, 6 kHz in multi-loudspeaker (stereo) recording, 11 kHz in FM broadcasting, and 22 kHz in CD recording.

**Figure 2.** *Generalized frequency spectrum for audio signal [33].*

*Multimedia Information Retrieval*
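
The *fmax* values quoted in Section 2.1 imply a lower bound on the sampling rate needed to digitize each kind of signal. The following is a minimal sketch, assuming the standard Nyquist criterion (a sampling rate of at least twice *fmax*); the dictionary and function names are illustrative and do not come from the chapter.

```python
# Maximum frequencies quoted in Section 2.1, in Hz
FMAX_HZ = {
    "telephone transmission": 4_000,
    "mono-loudspeaker recording": 5_000,
    "multi-loudspeaker (stereo) recording": 6_000,
    "FM broadcasting": 11_000,
    "CD recording": 22_000,
}


def min_sampling_rate(fmax_hz):
    """Nyquist bound (assumption): sample at no less than twice fmax."""
    return 2 * fmax_hz


# Lower bound on the sampling rate for each signal type
rates = {name: min_sampling_rate(fmax) for name, fmax in FMAX_HZ.items()}

for name, rate in rates.items():
    print(f"{name}: fmax = {FMAX_HZ[name] / 1000:g} kHz -> fs >= {rate / 1000:g} kHz")
```

These bounds are consistent with common practice: telephone speech is usually sampled at 8 kHz, and CD audio at 44.1 kHz, slightly above the 44 kHz bound.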

#### *2.1.1 Representation of audio signal*

The letter symbols used for writing are not adequate for representing speech sounds, as the way they are pronounced varies; for example, the letter "o" in English is pronounced differently in the words "pot", "most", and "one". It is almost impossible to tackle the audio classification problem without first establishing some way of representing spoken utterances by a group of symbols that represent the sounds produced [39–43]. The phonemes in **Table 1** are divided into groups based on the way they are produced [44], forming a set of *allophones* [45]. In some tonal languages, such as Vietnamese and Mandarin, the intonation determines the meaning of each word [46–48].

#### *2.1.2 Production of audio signal*

The range of sounds that can be produced by any system is limited [39–44]. The pressure in the lungs is increased by the reverse process, and the lungs push the air up the *trachea*; the larynx is situated at the top of the trachea. By changing the shape of the vocal tract, different sounds are produced, so the fundamental frequency changes with time. The spectrogram (or sonogram) for the sentence "What can I have for dinner tonight?" is shown in **Figure 3**.
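
A spectrogram such as the one in **Figure 3** shows how frequency content changes with time. The following is a minimal sketch of the idea, not the method used to produce the figure: it cuts a signal into short Hann-windowed frames, takes a naive DFT of each frame, and reports the strongest frequency per frame. The test signal, sampling rate, frame length, and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a real-valued frame (naive O(N^2))."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectrogram(signal, frame_len, hop):
    """One magnitude spectrum per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames


def peak_frequencies(frames, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in each frame."""
    return [max(range(len(m)), key=m.__getitem__) * sample_rate / frame_len
            for m in frames]


# Hypothetical test signal: 200 Hz for 0.1 s, then 400 Hz for 0.1 s
SAMPLE_RATE = 8000
FRAME_LEN = 160  # 20 ms frames, so bins are 8000 / 160 = 50 Hz apart
low = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(800)]
high = [math.sin(2 * math.pi * 400 * t / SAMPLE_RATE) for t in range(800)]
signal = low + high

frames = spectrogram(signal, FRAME_LEN, hop=FRAME_LEN)
peaks = peak_frequencies(frames, SAMPLE_RATE, FRAME_LEN)
print(peaks)  # the dominant frequency steps from 200 Hz to 400 Hz halfway through
```

A practical implementation would use an FFT and overlapping frames; the naive DFT is kept here only so the sketch has no dependencies beyond the standard library.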


#### **Table 1.**

*Phoneme categories of British English and examples of words in which they are used [44].*

#### **Figure 3.**

*A sonogram for the sentence "What can I have for dinner tonight?" [43].*

The way that humans recognize and interpret audio signals has been considered by many researchers [1, 25, 39]. Many researchers have shown that the two lowest formants are necessary to produce a complete set of English vowels, and that the three lowest formants in frequency are necessary for good audio intelligibility. As the number of formants increases, more natural sounds are produced. However, when we deal with continuous audio, the problem becomes more complex. The history of audio signal identification can be found in [1, 25, 39–48].

#### **2.2 Properties of music signal**

#### *2.2.1 Representation of music signal*

There are two kinds of tone structures in a music signal. The first is a simple tone formed of a single sinusoidal waveform; the second is a more complex tone consisting of more than one harmonic [31, 49–52]. The spectrum of a music signal has twice the bandwidth of the audio spectrum, and most of the power of the audio signal is concentrated at lower frequencies. Melodists and musicians divide the musical minor into eight parts, each named an octave, where each octave is divided into seven parts called tones [30]. A tempered scale for different instruments is shown in **Table 2**. These tones are named (Do, Re, Mi, Fa, Sol, La, and Si) or simply (A, B, C, D, E, F, and G). The tone A1 in the first octave sets the fundamental frequency for the first tone of every octave: the first tone of each octave has twice the frequency of the first tone of the previous octave, i.e., A*n* = 2<sup>*n*−1</sup> A1, B*n* = 2<sup>*n*−1</sup> B1, and so on, where *n* ∈ {2, 3, 4, 5, 6, 7}.

From **Table 2**, the highest tone, C8, occurs at a frequency of 4186 Hz, which is the highest frequency produced by the human sound system. This leads musical instrument manufacturers to try their best to bound music frequencies to the limits of the human sound system to achieve strong concord [35, 53, 54]. In the real world, musical instruments cover more frequencies than the audible band, which is limited to 20 kHz.

#### **Table 2.**

*Frequencies of notes in the tempered scale [3].*

#### *2.2.2 Production of music signal*

The most common concept of tone quality depends on subjective acoustic properties, regardless of partials or formants, and the production of music depends mainly on the kind of musical instrument [53, 54]. These instruments can be summarized as follows:

1. **The string musical instrument.** Its tones are produced by vibrating strings made from horsetail hair or from other manufactured materials such as copper or plastic. Every vibrating string has its own fundamental frequency and produces complex tones, so the instrument covers most of the audible band. **Figure 4** shows string instruments.

2. **The brass musical instrument.** The brass musical instrument depends on blowing air, like the woodwind. Its shape looks like an animal horn, and it has manual valves to control the cavity size. The brass musical instrument has a huge number of nonharmonic signals in its spectrum. **Figure 5** shows brass instruments.

3. **The woodwind musical instrument.** A woodwind instrument consists of a cylindrical tube that is open at both ends. Some woodwind instruments may use a small vibrating piece of copper to produce tones. It produces a large number of harmonic tones. **Figure 6** shows woodwind instruments.

4. **The percussion musical instrument.** Examples of percussion instruments are the piano, snare drum, chimes, marimba, timpani, and xylophone. Most of the power of tones in percussion instruments produces non-harmonic components. **Figure 7** shows some percussion instruments.

5. **The electronic musical instrument.** The most qualified, robust, and accurate electronic musical instrument is the organ. It has a large keyboard and a memory that can store notes and use their frequencies as basic cadences or tones. Without the help of the organ, disco, pop, rock, and jazz cannot stand [29, 35–38]. The organ is not the only electronic musical producer. If the electronic musical

**Figure 4.** *String instruments.*

*Classification and Separation of Audio and Music Signals*

*DOI: http://dx.doi.org/10.5772/intechopen.94940*
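
As a worked check of the tempered scale discussed in Section 2.2.1, the sketch below computes equal-tempered note frequencies. It assumes the common concert-pitch reference A4 = 440 Hz, which the text does not state, and the names `A4_HZ`, `NOTE_OFFSETS`, and `tempered_frequency` are illustrative. Each octave doubles the frequency, so counting from A1 gives A*n* = 2<sup>*n*−1</sup> A1.

```python
A4_HZ = 440.0  # assumed reference pitch; not given in the text

# Semitone distance of each natural note from A within the same octave
NOTE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def tempered_frequency(note, octave):
    """Equal-tempered frequency in Hz of a natural note, e.g. ("C", 8)."""
    semitones_from_a4 = NOTE_OFFSETS[note] + 12 * (octave - 4)
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


a1 = tempered_frequency("A", 1)
c8 = tempered_frequency("C", 8)

print(f"A1 = {a1:.2f} Hz")
print(f"C8 = {c8:.2f} Hz")  # close to the 4186 Hz quoted for C8

# Octave doubling: A_n = 2**(n - 1) * A1
for n in range(2, 8):
    print(f"A{n} = {tempered_frequency('A', n):.2f} Hz")
```

Under this assumption, C8 comes out at about 4186 Hz, which matches the value quoted from **Table 2**.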
