*2.2.2 Production of music signal*

The way that humans recognize and interpret audio signals has been studied by many researchers [1, 25, 39]. Many of them have shown that the two lowest formants are necessary to produce a complete set of English vowels, and that the three lowest formants in frequency are necessary for good audio intelligibility. As the number of formants increases, more natural sounds are produced. However, when we deal with continuous audio, the problem becomes more complex.

The history of audio signal identification can be found in [1, 25, 39–48].

**Figure 3.** *A sonogram for the sentence "What can I have for dinner tonight?" [43].*

**2.2 Properties of music signal**

*2.2.1 Representation of music signal*

There are two kinds of tone structure in a music signal. The first is a simple tone formed of a single sinusoidal waveform; the second is a more complex tone consisting of more than one harmonic [31, 49–52]. The spectrum of a music signal has twice the bandwidth of the audio spectrum, and most of the power of an audio signal is concentrated at lower frequencies. Melodists and musicians divide the musical scale into eight parts, each called an octave, and each octave is divided into seven parts called tones [30]. A tempered scale for different instruments is shown in **Table 2**. These tones are named (Do, Re, Mi, Fa, Sol, La, and Si) or simply (A, B, C, D, E, F, and G). The tone A1 in the first octave sets the fundamental frequency of the first tone in each octave: every first tone in an octave has twice the frequency of the first tone of the previous one (i.e., A*n* = 2<sup>*n*−1</sup> A1, B*n* = 2<sup>*n*−1</sup> B1, and so on, where *n* ∈ {2, 3, 4, 5, 6, 7}). From **Table 2**, the highest tone, C8, occurs at a frequency of 4186 Hz, which is the highest frequency produced by the human sound system, which leads musical

| A | Hz | B | Hz | C | Hz | D | Hz | E | Hz | F | Hz | G | Hz |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A1 | 27.5 | B1 | 30.863 | C1 | 32.703 | D1 | 36.708 | E1 | 41.203 | F1 | 43.654 | G1 | 48.99 |
| A2 | 55 | B2 | 61.735 | C2 | 65.406 | D2 | 73.416 | E2 | 82.407 | F2 | 87.307 | G2 | 97.99 |
| A3 | 110 | B3 | 123.47 | C3 | 130.81 | D3 | 146.83 | E3 | 164.81 | F3 | 174.61 | G3 | 196 |
| A4 | 220 | B4 | 246.94 | C4 | 261.63 | D4 | 293.66 | E4 | 329.63 | F4 | 349.23 | G4 | 392 |
| A5 | 440 | B5 | 493.88 | C5 | 523.25 | D5 | 587.33 | E5 | 659.26 | F5 | 698.46 | G5 | 783.9 |
| A6 | 880 | B6 | 987.77 | C6 | 1046.5 | D6 | 1174.7 | E6 | 1318.5 | F6 | 1396.9 | G6 | 1568 |
| A7 | 1760 | B7 | 1975.5 | C7 | 2093 | D7 | 2349.3 | E7 | 2637 | F7 | 2793 | G7 | 3136 |
| A8 | 3520 | B8 | 3951.1 | C8 | 4186 | | | | | | | | |

**Table 2.** *Frequencies of notes in the tempered scale [3].*
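The octave relation behind Table 2 can be checked numerically. The sketch below is illustrative only and is not taken from the chapter: it assumes the standard equal-tempered semitone ratio of 2<sup>1/12</sup>, uses the table's own octave numbering (A1 = 27.5 Hz, each A twice the A of the previous octave), and introduces the hypothetical helpers `note_frequency` and `synthesize_tone`. The second helper illustrates the two tone structures described above: a simple tone is one sinusoid, while a complex tone is a sum of harmonics of the same fundamental.

```python
import math

# Semitone offsets above A inside one octave, following Table 2's layout
# (each row starts at A). These offsets are an assumption used to reproduce
# the table; the chapter does not state them explicitly.
SEMITONES_ABOVE_A = {"A": 0, "B": 2, "C": 3, "D": 5, "E": 7, "F": 8, "G": 10}
A1_HZ = 27.5  # frequency of A1 as listed in Table 2


def note_frequency(name, octave, a1_hz=A1_HZ):
    """Return the frequency in Hz of a note, using Table 2's octave numbering."""
    if name not in SEMITONES_ABOVE_A:
        raise ValueError(f"unknown note name: {name!r}")
    if octave < 1:
        raise ValueError("octave must be >= 1")
    # The first tone of each octave doubles the first tone of the previous one.
    octave_base = a1_hz * 2 ** (octave - 1)
    # Inside an octave, each semitone multiplies the frequency by 2**(1/12).
    return octave_base * 2 ** (SEMITONES_ABOVE_A[name] / 12)


def synthesize_tone(f0, harmonic_amplitudes, duration_s=0.01, sample_rate=44100):
    """Return samples of a sum of sinusoids at f0, 2*f0, 3*f0, ...

    A single amplitude gives a simple tone; several give a complex tone.
    """
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            amp * math.sin(2 * math.pi * (k + 1) * f0 * t)
            for k, amp in enumerate(harmonic_amplitudes)
        )
        samples.append(value)
    return samples


if __name__ == "__main__":
    for octave in range(1, 9):
        print(f"A{octave}: {note_frequency('A', octave):.1f} Hz")
    a5 = note_frequency("A", 5)
    simple_tone = synthesize_tone(a5, [1.0])
    complex_tone = synthesize_tone(a5, [1.0, 0.5, 0.25])
    print(len(simple_tone), len(complex_tone))
```

The harmonic amplitudes `[1.0, 0.5, 0.25]` are arbitrary values chosen for illustration; real instruments have their own harmonic profiles.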

The most common concept of tone quality depends on subjective acoustic properties, regardless of partials or formants, and the production of music depends mainly on the kind of musical instrument [53, 54]. These instruments can be summarized as follows:


**Figure 4.** *String instruments.*

**Figure 5.** *Brass instruments.*

**Figure 7.** *Percussion instruments.*

**Figure 8.** *Electronic organ.*

**Figure 9.** *An example of an audio signal of speaking the two-second-long phrase "Very good night": (a) time domain, (b) magnitude, (c) phase.*

*Classification and Separation of Audio and Music Signals*

*DOI: http://dx.doi.org/10.5772/intechopen.94940*

**Figure 6.** *Woodwind instruments.*

instruments are used for producing music, the tone quality measure of the fundamental frequency or harmonics is not needed. **Figure 8** shows an example of an electronic instrument, the organ.
