
### DWT-Based Data Hiding Technique for Videos Ownership Protection DOI: http://dx.doi.org/10.5772/intechopen.84963

Wavelet Transform and Complexity

reliable, secure, and robust data hiding methods [2, 3]. Various watermarking schemes that use different techniques have been proposed over the years [4–9]. To be effective, a watermark must be imperceptible within its host, easily extracted by the owner, and robust against both intentional and unintentional distortions [7, 10, 11]. In particular, the discrete wavelet transform (DWT) has wide applications in image and video processing, such as compression, noise reduction, and watermarking [12]; this is attributed to its space-frequency localization, multi-resolution representation, and superior human visual system (HVS) modeling [5]. Robustness is a very important aspect of data hiding and watermarking. To achieve the highest levels of robustness, new methods and techniques should be introduced and optimized at both the sender and receiver sides. Furthermore, the detection process should be enhanced to meet these requirements.

In this research, a video watermarking process that depends on discrete wavelet decompositions will be developed. Moreover, the detection process will be enhanced through statistical derivations. Security will be maintained through the adoption of random filter banks, the study of the motion and motionless scenes in the video frames, and the spread spectrum generation of the watermarks. The overall technique has to meet the requirements of visual quality, security, robustness, and computational complexity.

2. Proposed watermarking technique

In this section, we introduce our digital video watermarking technique for authentication and ownership protection. The proposed technique aims to achieve reasonable degrees of robustness, visual quality, and security. The embedding technique involves two stages: a decomposition process followed by a hiding process. The watermark can be any binary sequence; normally a binary image of a specific size is used. The encoded videos can be in any color space; in our case, the YUV space is used. The hiding process can be performed in any of the three components Y, U, and V. In this work, the luminance (Y) frames are used as host images; that is, the watermark is hidden in one or more of the sub-bands that result from the discrete wavelet analysis process. The choice of wavelet filters is an important factor in the efficiency of the reconstruction process; one special type is the randomly generated orthonormal filter bank [13]. These filter banks are derived from generating polynomials; hence, by generating random numbers for the polynomial coefficients, it is possible to build multiple filter banks for the different stages of the decomposition. The orthonormal analysis and synthesis filters can be generated in different ways; for our technique, filters with large side-lobes are preferred. This enables us to hide more energy in the medium frequencies of the image, which makes the scheme more robust against the image processing operations that occur, intentionally or unintentionally, while the video is handled. Each generated filter bank is used for one level of the DWT analysis and synthesis processes. Moreover, the number of levels and the decomposition structure followed during analysis are controlled by the owner.

It is well known in the image processing community that the medium-frequency bands are preferred for hiding. This avoids both the lower-frequency bands, where most of the energy is concentrated, and the higher-frequency bands, where the data is likely to be lost due to compression. Furthermore, more than one sub-band can be used rather than a single sub-band; this helps make the method robust against the nonlinear collusion attack.
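To make the sub-band terminology concrete, the following is a minimal sketch of a one-level 2D wavelet decomposition of a luminance frame. It is an illustration under stated assumptions, not the chapter's implementation: the Haar filter is used only as a simple orthonormal stand-in for the randomly generated filter banks of [13], the frame is synthetic, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The LH/HL labels follow one common convention; other texts swap them.

```python
import numpy as np

def haar_dwt2(frame):
    """One-level 2D Haar DWT of an even-sized array; returns (LL, LH, HL, HH)."""
    x = np.asarray(frame, dtype=np.float64)
    s = np.sqrt(2.0)
    # Filter and downsample along the horizontal direction (pairs of columns).
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Filter and downsample along the vertical direction (pairs of rows).
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2 (synthesis step)."""
    s = np.sqrt(2.0)
    lo = np.empty((ll.shape[0] * 2, ll.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2, :], lo[1::2, :] = (ll + lh) / s, (ll - lh) / s
    hi[0::2, :], hi[1::2, :] = (hl + hh) / s, (hl - hh) / s
    x = np.empty((lo.shape[0], lo.shape[1] * 2))
    x[:, 0::2], x[:, 1::2] = (lo + hi) / s, (lo - hi) / s
    return x

# Synthetic "Y frame" (assumption): a smooth ramp plus mild noise.
rng = np.random.default_rng(0)
frame = np.add.outer(np.arange(64.0), np.arange(64.0)) + rng.normal(0.0, 1.0, (64, 64))

bands = dict(zip(("LL", "LH", "HL", "HH"), haar_dwt2(frame)))
energies = {name: float(np.sum(b ** 2)) for name, b in bands.items()}
for name, e in energies.items():
    print(name, round(e, 1))
```

Because the transform is orthonormal, the sub-band energies add up to the frame energy, and for a smooth frame almost all of it falls in LL, which is why the detail bands are the natural candidates for hiding.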

There are many scenarios that can be followed for the embedding process; one of them is to embed the binary watermark using a generated pseudorandom sequence [14]. In this method, the original binary watermark image Q is converted into a binary sequence S of length M, in which the data pixels are given the value +1 and the background pixels the value −1. Furthermore, a pseudorandom sequence P of the same length M is generated using a secret key; likewise, its values are either +1 or −1. The DWT coefficients of the decomposed sub-bands used for hiding are represented as a matrix Q<sub>1</sub> of the same size as the watermark, which can be written as a vector T of length M. The binary watermark is hidden in this vector T, which results in a new vector T′ according to the rule shown in this equation:

$$t\_i' = t\_i + \alpha \ast p\_i \ast s\_i \quad \text{for } i = 1, 2, \dots, M \tag{1}$$

where α is a weighting constant that determines the strength of the embedded watermark. It is chosen to offer a trade-off between the required robustness and the acceptable visual quality. Moreover, the choice of this weighting factor should take into account several elements of the processing chain, such as the compression standard that is used and its intensity, the smooth regions or textures present in the image, and the algorithm used for detection. Furthermore, the energy content of the wavelet sub-bands must be considered at the hiding stage. One way to obtain this factor is to compare the energy of the original coefficients of the host DWT sub-band Q<sub>1</sub> with the energy of the elements of the original watermark image Q, according to this empirical formula:

$$\alpha = 2 \ast \sqrt{\frac{E(Q\_1)}{E(Q)}}\tag{2}$$

where E(Q<sub>1</sub>) represents the energy content of the original wavelet coefficients, while E(Q) represents the energy content of the watermark image Q; the energy is computed as the sum of the squared elements. The modified wavelet coefficients are then placed back in their respective locations to reconstruct the watermarked image frame. The overall hiding process of a binary watermark for a Y frame is shown in Figure 1. The figure shows that the low-low (LL) frequency area of the decomposed image is not used for embedding, although this depends on the decomposition structure that is followed. This band is normally called the decimated image, and it appears in both the pyramidal and DWT decompositions. It holds most of the information, or energy, of the original image frame; the images in the other bands are normally called the error images, and they have lower energy content. Depending on the analysis filters, these are the low-high (LH), high-low (HL), and high-high (HH) bands, and they offer better places for the hiding process.
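The embedding rule of Eq. (1) together with the strength factor of Eq. (2) can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, NumPy's seeded generator stands in for the unspecified key-driven pseudorandom generator, and E(Q) is computed on the ±1 sequence S (so it equals M), which is one possible reading of the text.

```python
import numpy as np

def energy(a):
    """Sum of squared elements, as used for E(.) in Eq. (2)."""
    return float(np.sum(np.asarray(a, dtype=np.float64) ** 2))

def to_bipolar(watermark_bits):
    """Map a binary watermark image (1 = data, 0 = background) to a +1/-1 sequence S."""
    return np.where(np.asarray(watermark_bits).ravel() > 0, 1.0, -1.0)

def pn_sequence(key, length):
    """Key-dependent +1/-1 sequence P (assumption: NumPy's default generator as the PRNG)."""
    rng = np.random.default_rng(key)
    return rng.integers(0, 2, size=length) * 2.0 - 1.0

def embed(subband, watermark_bits, key):
    """Apply t'_i = t_i + alpha * p_i * s_i to every coefficient of the sub-band."""
    t = np.asarray(subband, dtype=np.float64).ravel()
    s = to_bipolar(watermark_bits)
    p = pn_sequence(key, t.size)
    alpha = 2.0 * np.sqrt(energy(t) / energy(s))  # Eq. (2), with E(Q) taken on S
    marked = t + alpha * p * s                    # Eq. (1)
    return marked.reshape(np.shape(subband)), alpha

# Toy data (assumption): random numbers standing in for an 8x8 detail sub-band.
coeffs = np.random.default_rng(1).normal(0.0, 4.0, (8, 8))
bits = (np.arange(64).reshape(8, 8) % 3 == 0).astype(int)
marked, alpha = embed(coeffs, bits, key=1234)
print("alpha =", alpha)
```

With the original coefficients at hand, the difference `marked - coeffs` equals α·p·s, so multiplying it by the same key-generated sequence and taking the sign returns S. A practical detector would not have the original coefficients and would rely on correlation statistics instead.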

The watermark, which is primarily a binary image, can be embedded in any of the frames of the host video; moreover, the frames can be chosen in a fully controlled, selective way. The degree of randomness that is achieved is up to the user, who is the sole owner. Furthermore, the security of the system also depends partly on the degree of randomness of the pseudorandom sequence that is used in the encoding process. The Y components of the color space were chosen intentionally because they have higher resolution and therefore higher hiding capacity, although the U and V components can be used as well. As mentioned in the introduction, our techniques are used when the HEVC process is applied; Figure 2 shows the proposed hiding process when the HEVC (H.265) process is applied to the watermarked video.

Figure 1. The block diagram of the proposed watermarking method.

Figure 2. The block diagram of the watermarking process with the application of HEVC process.

[...] statistical invisibility, which is an important condition of every security system [15]. Moreover, applying independent watermarks to every frame also causes a security problem if the frames contain few or no moving areas; such motionless regions in successive video frames may be statistically compared or averaged to remove the independent watermarks. Attacks of this kind are normally called collusion attacks. Inter-frame collusion attacks, for instance, exploit the repetition in the video frames and their scenes, or in the watermarks themselves, to produce a false copy of the video that carries no watermark; these attacks can be divided into the watermark estimation remodulation (WER) attack and the frame temporal filtering (FTF) attack [16]. Classifying the video frames according to the amount of motion in them is useful in this regard. Motion in videos is relative, since most videos contain motion; what interests us here is the amount of motion, how fast it is, its relation to the surroundings, and its distribution across the frames. Most video compression techniques use inter-frame motion estimation to encode the frames; however, other methods can be used to detect static and dynamic scenes in videos. One such method is based on the 1D discrete Fourier transform (DFT). The 1D DFT in the temporal direction transforms a group of pictures (GOP) into the temporal frequency domain, where both the spatial and the temporal frequency information of the video frames exist in the same resulting frame. Higher frequencies reflect fast motion from one frame to the next [17]. The 1D DFT of a video f(x,y,t) of size M×N×T, in which M×N is the size of each video frame and T is the number of frames grouped in one GOP, is given by

$$F(u, v, \tau) = \sum\_{t=0}^{T-1} f(x, y, t)\, e^{-j 2\pi (t\tau / T)} \tag{3}$$

where u and v represent the spatial domain of the video frames, while τ represents the temporal domain of these frames. Normally, a GOP is taken as five frames or a similar number. On this basis, a group of so-called spatiotemporal frames can be constructed for the Foreman video. Twenty-five frames of the Foreman video were transformed using the 1D DFT, and since the DFT is symmetric within one GOP, it is logical to show only the first spatiotemporal frame of each group of pictures. Figure 3 shows the first frame of the Foreman video, while Figure 4 shows the 5 temporal frames of this

Figure 3. The first frame of the Foreman video.
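The temporal 1D DFT over a group of pictures, as in Eq. (3), can be sketched with NumPy's FFT, which uses the same e^(−j2πtτ/T) kernel. This is a minimal sketch under assumptions: the GOPs are synthetic rather than taken from the Foreman video, and `motion_ratio` is a hypothetical score introduced here only to show how energy at non-zero temporal frequencies separates static from moving scenes; it is not a measure defined in the chapter.

```python
import numpy as np

def temporal_dft(gop):
    """1D DFT along the time axis of a GOP stored as an array of shape (M, N, T)."""
    return np.fft.fft(np.asarray(gop, dtype=np.float64), axis=2)

def motion_ratio(gop):
    """Share of spectral energy at non-zero temporal frequencies (hypothetical score)."""
    power = np.abs(temporal_dft(gop)) ** 2
    total = power.sum()
    return float(power[:, :, 1:].sum() / total) if total > 0 else 0.0

# Synthetic GOPs of T = 5 frames (assumption): one static, one with horizontal motion.
rng = np.random.default_rng(0)
base = rng.uniform(0.0, 255.0, (16, 16))
static_gop = np.stack([base] * 5, axis=2)
moving_gop = np.stack([np.roll(base, shift=t, axis=1) for t in range(5)], axis=2)

print("static:", motion_ratio(static_gop))
print("moving:", motion_ratio(moving_gop))
```

A static GOP places all of its energy in the τ = 0 spatiotemporal frame, so its score is essentially zero, whereas motion moves energy into the higher temporal frequencies.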
