**3. Measuring quality**

Quality evaluation of digital content is critical in all applications of information delivery. This is particularly true for digital images and video. Each stage of processing, storage, compression, and enhancement may introduce perceivable distortions. For example, in image and video compression, the use of lossy schemes to reduce the amount of data may introduce artifacts such as blurring and ringing, which lead to quality degradation. Similarly, during the transmission phase, due to the limited available bandwidth and to channel noise, data might be lost or modified, resulting in quality degradation of the received content.

The visibility and annoyance of these impairments are directly related to the quality of the received/processed data. Being able to measure the overall perceived quality, so as to maintain, control, or enhance the quality of the digital data, is therefore fundamental. Over the last two decades, the scientific community has devoted many efforts to the design of quality metrics. The choice of an adequate metric usually depends on the requirements of the considered application.

There are two main approaches to assessing media quality: subjective and objective. The first is carried out by human observers, while the second consists in defining models that predict the subjective evaluation.

### **3.1 Objective metrics**

In objective measurement of the performance of an imaging system, image quality and quality losses are determined by evaluating some parameters based on a given general mathematical, physical, or psychophysical model. That is, the goal is to obtain a measurable and verifiable aspect of a thing or phenomenon, expressed in numbers or quantities, such as lightness or heaviness, thickness or thinness, softness or hardness.

Objective quality metrics can be classified according to the amount of side information required to compute a given quality measurement. Using this criterion, three generic classes of objective metrics can be identified: Full Reference (FR), when both the original and the impaired data are available; Reduced Reference (RR), when some side information regarding the original media can be used; and No-Reference (NR), when only the impaired image is available.
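As a rough illustration of this classification, the Python sketch below (with hypothetical function names and toy measures, not drawn from the chapter) contrasts the inputs each class of metric consumes: an FR metric needs the whole original, an RR metric only a compact feature set of it, and an NR metric nothing but the impaired image.

```python
import numpy as np

def fr_metric(reference: np.ndarray, distorted: np.ndarray) -> float:
    """Full Reference: the pristine original is fully available (here, MSE)."""
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    return float(np.mean(diff ** 2))

def rr_metric(reference_features: np.ndarray, distorted: np.ndarray) -> float:
    """Reduced Reference: only a compact descriptor of the original
    (e.g., a histogram sent over a side channel) is available."""
    distorted_features, _ = np.histogram(distorted, bins=32,
                                         range=(0, 255), density=True)
    return float(np.abs(reference_features - distorted_features).sum())

def nr_metric(distorted: np.ndarray) -> float:
    """No Reference: quality is predicted from the impaired image alone
    (here, a crude sharpness proxy based on local differences)."""
    return float(np.mean(np.abs(np.diff(distorted.astype(np.float64), axis=0))))
```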

To make an objective assessment, one can use measuring devices to obtain numerical values; another approach is to use image or video quality metrics. These metrics are usually designed to take the human visual system into account, so as to better match the subjective assessment.

The FR quality metrics belong to the first class. Among the most widely adopted FR objective metrics are the Mean Squared Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR). Both are pixel-wise measures of the difference between the original and the impaired media. In particular, the PSNR is a measure of the peak error between the compressed image and the original image. It is given as $\mathrm{PSNR} = 20 \log_{10}\left(\mathrm{MAX}_I / \sqrt{\mathrm{MSE}}\right)$, where $\mathrm{MAX}_I$ represents the maximum possible value of the media. The higher the PSNR, the better the quality of the reproduction. PSNR has traditionally been used to measure the quality of a compressed or distorted image. It is also applied, frame by frame, to video as a first indication of video degradation. Other metrics are SSIM [4], MS-SSIM [5], VIF [6], MAD [7], FSIM [8], etc.
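For completeness, a minimal NumPy sketch of these two formulas (assuming 8-bit media, so that $\mathrm{MAX}_I = 255$):

```python
import numpy as np

def mse(original: np.ndarray, distorted: np.ndarray) -> float:
    """Pixel-wise Mean Squared Error between two images of equal shape."""
    diff = original.astype(np.float64) - distorted.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original: np.ndarray, distorted: np.ndarray, max_i: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB; max_i is the peak pixel value
    (255 for 8-bit media). Returns inf for identical images."""
    m = mse(original, distorted)
    return float("inf") if m == 0 else 20.0 * np.log10(max_i / np.sqrt(m))
```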

Objective metrics have low computational cost, clear physical meaning, and are mathematically easy to deal with for optimization purposes. However, they have been widely criticized for not correlating well with perceived quality.


**Figure 1.** *Additive Gaussian noise of increasing variance.*

**Figure 1** shows an original image and versions of it deteriorated by additive Gaussian noise of increasing intensity. **Figure 2** shows the same original image and three versions of it in which different distortions have been introduced. As can be noticed, in the first case the value of the objective metric agrees with the perceptual judgment. In the second case, the objective metric returns the same score, thus indicating an equal level of distortion; from a perceptual point of view, however, the images are perceived as being of different quality.
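This behavior is easy to reproduce. The following toy sketch (assuming SciPy is available, using an arbitrary blur width and a random array as a stand-in for a natural image) builds a blurred version and a noisy version calibrated to approximately the same MSE, so PSNR reports nearly identical scores for two visually very different impairments:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
# Stand-in image; in practice one would load a natural test image.
original = rng.integers(0, 256, size=(256, 256)).astype(np.float64)

# Distortion 1: Gaussian blur.
blurred = gaussian_filter(original, sigma=1.5)
target_mse = np.mean((original - blurred) ** 2)

# Distortion 2: additive Gaussian noise scaled to (approximately) the same MSE.
noisy = original + rng.normal(0.0, np.sqrt(target_mse), size=original.shape)

for name, img in [("blur", blurred), ("noise", noisy)]:
    m = np.mean((original - img) ** 2)
    print(f"{name}: MSE={m:.2f}  PSNR={20 * np.log10(255.0 / np.sqrt(m)):.2f} dB")
```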

To overcome such problems, HVS-inspired objective quality metrics have been introduced, e.g., PSNR-HVS and PSNR-HVS-M. The main difference between these metrics and the purely mathematical ones (MSE, PSNR) is that they are more heuristic, which makes a mathematical comparison of their performance more difficult. Thus, statistical experiments are needed to adequately evaluate the quality of such metrics [9, 10].

### **3.2 Subjective metrics**

In subjective tests, the quality of digital content is assessed by performing subjective psychological tests. In this case, the goal is to find attributes, characteristics, or properties that can be observed and interpreted, and perhaps approximated (quantified), but cannot be measured directly, such as beauty, feel, flavor, or taste. The quality score is generated by averaging the results of a set of standard subjective tests, and it can be considered an indicator of the perceived media quality. A pool of subjects evaluates a set of images (or videos), rating the perceived quality according to a specific scale [11]. **Table 1** reports the most used rating scale, in which the score 1 should be given to media perceived as 'bad', i.e., affected by a 'very annoying' artifact. Similarly, the score 5 should be given to media showing excellent quality, in which no impairments are perceivable.
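Computationally, the MOS is simply the per-stimulus average of the raw ratings. A minimal sketch (with synthetic ratings and an arbitrary pool size, used here only as placeholders) that also reports the 95% confidence interval commonly shown alongside the MOS:

```python
import numpy as np

# ratings[i, j]: score (1-5) given by subject i to stimulus j.
rng = np.random.default_rng(1)
ratings = rng.integers(1, 6, size=(24, 10)).astype(np.float64)

mos = ratings.mean(axis=0)                           # Mean Opinion Score per stimulus
sem = ratings.std(axis=0, ddof=1) / np.sqrt(ratings.shape[0])
ci95 = 1.96 * sem                                    # normal-approximation 95% CI

for j, (m, c) in enumerate(zip(mos, ci95)):
    print(f"stimulus {j}: MOS = {m:.2f} ± {c:.2f}")
```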

Contrary to what one might expect, the subjective evaluation methodology is complex and time-consuming since, to be reliable, it must be properly designed and requires a large number of subjects.

In more detail, the subjective test depends on the test environment (i.e., type of monitors/speakers and other test equipment, lighting/acoustic conditions, laboratory architecture, background, …), the test material (i.e., meaningful content for the envisaged scenario/application; best, typical, and worst cases, …), the test methodology (i.e., viewing distance/hearing position, subject selection, instruction phase, opinion or judgment collection, training, presentation, grading scale), and the analysis carried out on the collected data.

### **3.3 Test material**

To verify the performance of an objective metric, as well as to collect subjective scores, a large database of distorted test images is usually prepared, and the Mean Opinion Score (MOS) from many human observers is collected. Then, the subjective results are compared with the objective scores of the tested metrics to identify which metric shows the highest correlation with the subjective scores. However, some drawbacks have to be considered: usually, the size of the test database is not big enough, the number of different distortions is limited [12, 13], and methodological errors in the planning and execution of the experiments can occur. Since in most applications humans are the ultimate receivers of digital data, the most accurate way to determine its quality is to measure it directly using psychophysical experiments with human subjects. One of the most intensive studies in this field has been carried out by the Video Quality Experts Group (VQEG). In the image quality framework, many datasets have been created, such as LIVE [4, 14] or TID2013 [15]. Relevant efforts have also been devoted to the design and testing of video quality datasets. In this direction, among others, the LIVE Video Quality Assessment Database [16] and the EPFL-PoliMI [17] video databases have been extensively adopted.
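This comparison is typically quantified with the Pearson linear correlation coefficient (PLCC, prediction accuracy) and the Spearman rank-order correlation coefficient (SROCC, prediction monotonicity) between the metric scores and the MOS. A sketch with SciPy, using synthetic scores as placeholders for real data:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(2)
mos = rng.uniform(1.0, 5.0, size=50)                   # subjective scores
metric_scores = 0.8 * mos + rng.normal(0, 0.4, 50)     # one candidate objective metric

plcc, _ = pearsonr(metric_scores, mos)    # linear correlation (accuracy)
srocc, _ = spearmanr(metric_scores, mos)  # rank correlation (monotonicity)
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}")
```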


| Score | Quality   | Impairment                     |
|-------|-----------|--------------------------------|
| 5     | Excellent | Imperceptible                  |
| 4     | Good      | Perceptible, but not annoying  |
| 3     | Fair      | Slightly annoying              |
| 2     | Poor      | Annoying                       |
| 1     | Bad       | Very annoying                  |

**Table 1.** *Mean opinion score assessment table.*
