*Multimedia Information Retrieval*

For a long time, the heavy computational load caused by algorithmic complexity and by the enormous number of images handled during the indexing and retrieval steps has been an obstacle to building CBIR systems [3, 4]. Furthermore, conventional content-based image retrieval (CBIR) systems have focused on small databases of face images. It is therefore important to generalize these systems and train them on large-scale databases [5, 6]. In this chapter, we present the basics of CBIR systems for large-scale databases, Big Data, and Big Data processing platforms for large-scale image retrieval.

**2. The methods for extracting visual characteristics**

The extraction of features from images is the basis of any computer vision system that performs recognition. These features can include both textual characteristics (keywords, annotations, etc.) and visual characteristics (color, texture, shapes, faces, etc.). We focus here only on techniques for extracting visual features. Visual characteristics (descriptors) are classified into two categories: general descriptors and domain-specific descriptors [7, 8].

**2.1 General descriptors**

General descriptors are low-level descriptors that describe color, shape, regions, texture, and movement.

**Color:** Color is one of the most widely used visual characteristics in facial recognition and similar systems. It is relatively robust to background complexity and independent of the size and orientation of the image. The best-known representation of color is the histogram, which records the frequency of occurrence of the intensities in each of the three color channels. Many other representations of this characteristic exist, notably color moments. The mathematical basis of this approach is that any color distribution can be characterized by its color moments. Furthermore, most of the color information is concentrated in the low-order moments and related statistics: mean, standard deviation, color skewness, variance, median, etc.

**Texture:** A wide variety of texture descriptors have been proposed in the literature. These have traditionally been divided into statistical, spectral, structural, and hybrid approaches [9]. Among the most popular traditional methods are probably those based on histograms, Gabor filters [10], co-occurrence matrices [11], and local binary patterns (LBP) [12]. These descriptors have various strengths and weaknesses, in particular regarding their invariance to the acquisition conditions.

**Shape:** Over the past two decades, 2D shape descriptors have been actively used in 3D search engines and sketch-based modeling techniques. Some of the most popular 2D shape descriptors are curvature scale space (CSS) [13], SIFT [14], and SURF [15]. In the literature, 2D shape descriptors are classified into two main categories: contour-based and region-based. Contour-based shape descriptors extract shape features from the outline of a shape only. In contrast, region-based shape descriptors obtain shape features from the entire region of a shape. In addition, hybrid techniques that combine contour-based and region-based techniques have also been proposed [16].

**Movement:** Movement covers both the motion of objects in the sequence and the motion of the camera. Camera motion information is provided by the capture device, while object motion is obtained by means of image processing. The set of descriptors is the following [7]: Motion Activity Descriptor (MAD), Camera Motion Descriptor (CMD), Motion Trajectory Descriptor (MTD), and Warp and Parametric Motion Descriptor (WMD and PMD).

**3. Image classification**

Image classification is an important step in the image recognition process, and many image classification techniques have been proposed to date. Classification is considered to be one of the main types of machine learning. Various studies have been carried out in order to choose the best technique for classifying images [17].

**3.1 What is machine learning?**

Machine learning is a subdomain of artificial intelligence (AI) that uses a range of techniques to let computers learn from data (that is, gradually improve their performance on a specific task) without being explicitly programmed. Machine learning covers a vast range of tasks. The types of machine learning described in this section are the following [18]:

**Supervised learning (classification):** In this case, the inputs are labeled by an expert, and the algorithm must learn from these labels in order to predict the class of each new input. In other words, given a set of observations X and a corresponding set of measurements Y, we seek to estimate the dependencies between X and Y.

**Unsupervised learning (clustering):** In this case, the inputs are not labeled, no expert is available, and the algorithm must itself predict the class (cluster) of each input. The objective of this type of learning is to describe how the data are organized and to extract homogeneous subsets.

**Semi-supervised learning:** The algorithm combines labeled and unlabeled examples to generate an appropriate function or classifier.

**Learning to learn:** The algorithm learns its own inductive bias based on previous experience.
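
To make the contrast between the first two types concrete, the following is a minimal sketch in plain Python, not a method taken from the chapter. It assumes each image has already been reduced to a small feature vector; the toy vectors, the labels `dark` and `bright`, and all function names are hypothetical. A nearest-centroid rule stands in for supervised learning (it uses the labels Y), and a basic k-means loop stands in for unsupervised learning (it sees only X).

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def fit_nearest_centroid(X, y):
    """Supervised: learn one centroid per expert-provided label."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {label: mean_point(pts) for label, pts in groups.items()}


def predict(centroids, x):
    """Assign a new input to the label whose centroid is closest."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


def kmeans(X, k, iterations=10):
    """Unsupervised: group unlabeled inputs into k clusters.

    The initial centers are simply the first k points, an illustrative
    choice rather than a recommended initialization.
    """
    centers = [tuple(x) for x in X[:k]]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [min(range(k), key=lambda j: dist(centers[j], x)) for x in X]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [x for x, a in zip(X, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean_point(members)
    return assignment, centers


# Hypothetical 2-D feature vectors, one per image.
X = [(0.10, 0.20), (0.15, 0.25), (0.20, 0.10),
     (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
y = ["dark", "dark", "dark", "bright", "bright", "bright"]

model = fit_nearest_centroid(X, y)      # uses X and the labels y
predicted = predict(model, (0.7, 0.8))  # class of a new, unseen input

clusters, centers = kmeans(X, k=2)      # uses X only, no labels
```

In the supervised case the output is one of the expert's labels, whereas the cluster indices returned by `kmeans` carry no meaning beyond "these inputs belong together", which matches the goal of extracting homogeneous subsets.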
