by replacing the five cepstral samples centered at the pitch peak with zeros, the audio segment may be attenuated or distorted completely. Typical cepstra of an audio signal and a music signal, each 5 seconds long, are depicted in **Figure 22**. The logarithm boosts low amplitudes and compresses high ones, and values near zero become very large in magnitude after the logarithm.

**Figure 22.**

*(a) A typical 5-second audio signal in the cepstrum domain; the pitch peak appears near zero. (b) A typical 5-second music signal in the cepstrum domain.*

| Approaches | Time domain | Frequency domain (Spectrum) | Frequency domain (Cepstrum) | Time-Frequency domain |
|---|---|---|---|---|
| Algorithms | ZCR | Spectral Centroid | Cepstral Residual | Spectrogram (Sonogram) |
| | STE | Spectral Flux | Variance of the Cepstral Residual | Evolutionary Spectrum |
| | Roll-Off Variance | Spectrum Roll-Off | Cepstral feature | Evolutionary Bispectrum |
| | Pulse Metric | Signal Bandwidth | Pitch | |
| | Number of Silence | Spectrum Amplitude | Delta Pitch | |
| | HMM | Delta Amplitude | | |
| | ANN | | | |

**Table 9.**

*Summary of the classification and separation algorithms.*

**5. Conclusions**

In this chapter, a general review of the common classification and separation algorithms used for speech and music was presented, and some of them were introduced and discussed thoroughly. The classification approaches were divided into three categories. The first category included most of the real-time approaches: the ZCR, the STE, the ZCR and the STE with positive derivative, some of their modified versions, and neural networks. The second category included most of the frequency-domain approaches, such as the spectral centroid and its variance, the spectral flux and its variance, the roll-off of the spectrum, the cepstral residual, and the delta pitch. The last category introduced two time-frequency approaches, namely the spectrogram and the evolutionary spectrum. The time-frequency classifiers were observed to provide excellent and robust discrimination of speech from music signals in digital audio. Which feature to choose depends on the application. The algorithms of the first category are faster since the processing is done in real time; however, those of the second
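The cepstral pitch-peak suppression described above alongside Figure 22 can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the chapter's implementation: the function names, the `eps` floor inside the logarithm, the naive DFT, the synthetic impulse-train frame, and the quefrency search range are all hypothetical choices made for the example. Only the cepstral-domain step is shown; resynthesis of the audio segment is not attempted.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT; fine for one short frame, too slow for real audio."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        # Reduce k*t modulo n so the phase argument stays small and accurate.
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * ((k * t) % n) / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log-magnitude spectrum of `frame`.

    `eps` is an assumed floor: it keeps log() finite where the spectrum is
    close to zero, where the logarithm would otherwise blow up.
    """
    log_mag = [math.log(abs(c) + eps) for c in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def find_pitch_peak(cepstrum, q_min, q_max):
    """Quefrency index of the largest cepstral value in [q_min, q_max)."""
    return max(range(q_min, q_max), key=lambda q: cepstrum[q])


def zero_pitch_peak(cepstrum, peak, width=5):
    """Copy of `cepstrum` with `width` samples centered at `peak` set to zero."""
    half = width // 2
    out = list(cepstrum)
    for q in range(max(0, peak - half), min(len(out), peak + half + 1)):
        out[q] = 0.0
    return out


# Synthetic "voiced" frame (assumption): impulse train, 32-sample pitch period.
N, PERIOD = 256, 32
frame = [1.0 if t % PERIOD == 0 else 0.0 for t in range(N)]

cep = real_cepstrum(frame)
peak = find_pitch_peak(cep, q_min=16, q_max=48)
suppressed = zero_pitch_peak(cep, peak)
print("pitch peak at quefrency index", peak)
```

In practice an FFT routine would replace the naive `dft`, and the search range would be derived from the expected pitch range and the sampling rate rather than fixed by hand.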

*Multimedia Information Retrieval*

**Author details**

Abdullah I. Al-Shoshan

Computer Engineering, Qassim University, Saudi Arabia

\*Address all correspondence to: ashoshan@qu.edu.sa

© 2020 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
