**2. Review of dedicated hardware for WT**

The objective of this work is to identify suitable hardware that can run WT-based image processing algorithms in real time. In applications such as edge detection, processing an image with a WT filter is computationally cheaper than standard methods: a single filter produces three types of edges, whereas standard methods require more than one filter mask to achieve the same result. In this section we review hardware dedicated to the WT, including DSPs, FPGAs and GPUs.
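
The point about one filter yielding three types of edges can be illustrated with a minimal sketch. It assumes the Haar wavelet and assumes that the three edge types are the horizontal, vertical and diagonal detail subbands of a single-level 2-D DWT; the function name `haar_dwt2` and the list-of-lists image format are illustrative choices, not part of any reviewed implementation.

```python
def haar_dwt2(image):
    """Single-level 2-D Haar DWT of a list-of-lists image with even dimensions.

    Returns (approx, horizontal, vertical, diagonal). The three detail
    subbands come out of the same pass over the image.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")

    approx, horizontal, vertical, diagonal = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # One 2x2 block: a b on the top row, d e on the bottom row.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2.0)  # low-pass in both directions
            h_row.append((a + b - d - e) / 2.0)  # top vs bottom: horizontal edges
            v_row.append((a - b + d - e) / 2.0)  # left vs right: vertical edges
            d_row.append((a - b - d + e) / 2.0)  # diagonal detail
        approx.append(a_row)
        horizontal.append(h_row)
        vertical.append(v_row)
        diagonal.append(d_row)
    return approx, horizontal, vertical, diagonal


# A 4x4 test image with a vertical edge inside the right-hand blocks.
image = [[0, 0, 0, 8] for _ in range(4)]
approx, horizontal, vertical, diagonal = haar_dwt2(image)
```

For this test image only the vertical subband is non-zero, whereas a mask-based detector would typically apply a separate mask for each edge orientation.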

GPUs provide programmable vertex and pixel engines that accelerate the mapping of algorithms such as image processing. A cost-effective SIMD algorithm that performs the convolution-based DWT entirely on the GPU of a normal PC (baseline processor) is reported by Wong et al. (2007). The algorithm unifies the forward and inverse WT into an almost identical process, which allows an efficient parallel implementation on the GPU (Wong et al., 2007). This demonstrates that GPUs can process WT algorithms cost-effectively; however, a GPU is not suitable for our application, which must be PC independent.

A scalable FPGA-based architecture for the separable 2-D biorthogonal discrete wavelet transform (DWT) decomposition is presented by Benkrid et al. (2001). The architecture, implemented on a Xilinx Virtex-E, is based on the Pyramid Algorithm Analysis and handles computation along the image border efficiently by means of symmetric extension (Benkrid et al., 2001). FPGAs are suitable for real-time embedded applications because of their parallel processing abilities.
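
Symmetric extension at the border can be sketched in one dimension as follows. This is only an illustration under stated assumptions: it uses whole-sample symmetry (the border sample is not repeated) and a hypothetical 3-tap smoothing filter, neither of which is taken from the cited architecture; the function names are likewise illustrative.

```python
def symmetric_extend(signal, pad):
    """Mirror `pad` samples about the first and last samples (whole-sample symmetry)."""
    n = len(signal)
    if pad >= n:
        raise ValueError("pad must be smaller than the signal length")
    left = [signal[i] for i in range(pad, 0, -1)]   # s[pad], ..., s[1]
    right = [signal[n - 2 - i] for i in range(pad)]  # s[n-2], ..., s[n-1-pad]
    return left + list(signal) + right


def filter_same(signal, taps):
    """Apply an odd-length symmetric filter; the output has the input's length."""
    if len(taps) % 2 == 0:
        raise ValueError("this sketch expects an odd number of taps")
    half = len(taps) // 2
    extended = symmetric_extend(signal, half)
    return [
        sum(tap * extended[i + k] for k, tap in enumerate(taps))
        for i in range(len(signal))
    ]


# Hypothetical 3-tap low-pass filter, used only for illustration.
smoothed = filter_same([1, 2, 3, 4], [0.25, 0.5, 0.25])
```

Because the mirrored samples continue the signal without a jump, the border outputs are computed with the same filter as the interior and no extra output samples are produced.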

DSPs are also reported to be powerful and portable enough for embedded systems. Desneux and Legat (Desneux & Legat, 2000) present a DSP with an architecture designed specifically for the DWT. Their design eliminates wait cycles during algorithm execution by using a bi-processor organization, and it performs a 3-stage multiresolution transform in real time. The DSP is fully programmable in terms of filters and picture format, and it is also capable of image edge processing.
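
The structure of a 3-stage multiresolution transform can be sketched as below. The sketch is one-dimensional and uses unnormalized Haar averages and differences for readability; the cited DSP operates on images with programmable filters, so the filter choice and the function names here are assumptions made only for illustration.

```python
def haar_step(signal):
    """One analysis stage: pairwise averages (approximation) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]
    return approx, detail


def multiresolution(signal, stages=3):
    """Apply `stages` analysis stages, each one to the previous approximation."""
    if len(signal) % (2 ** stages):
        raise ValueError("signal length must be a multiple of 2**stages")
    approx = list(signal)
    details = []
    for _ in range(stages):
        approx, detail = haar_step(approx)
        details.append(detail)  # details[0] is the finest scale
    return approx, details


approx, details = multiresolution([1, 3, 5, 7, 9, 11, 13, 15], stages=3)
```

Each stage halves the amount of data passed to the next one, so the second and third stages together cost less than the first.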

Using a floating-point DSP, Patil and Abel (Patil & Abel, 2006) applied the redundant wavelet transform (RWT) as a tool for the analysis of non-stationary signals and for the localization and characterization of singularities. Their work focused on an optimized method for implementing a B-spline-based RWT for integer scales on a DSP, which improves execution speed over the standard method.
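
What makes a transform redundant is that no subsampling takes place, so every scale keeps the length of the input. The sketch below shows this with the generic "à trous" scheme and a cubic B-spline smoothing kernel; it is not the optimized method of the cited work, and the kernel, the mirrored border handling and the function names are assumptions chosen for illustration.

```python
# Cubic B-spline smoothing kernel commonly used with the a trous scheme.
B3 = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def smooth_a_trous(signal, level):
    """Smooth with the B3 kernel whose taps are spaced 2**level samples apart."""
    n = len(signal)
    if n < 2:
        raise ValueError("signal must contain at least two samples")
    step = 2 ** level
    out = []
    for i in range(n):
        acc = 0.0
        for k, tap in enumerate(B3):
            j = i + (k - 2) * step
            while j < 0 or j >= n:  # mirror indices that fall outside the signal
                j = -j if j < 0 else 2 * (n - 1) - j
            acc += tap * signal[j]
        out.append(acc)
    return out


def redundant_wt(signal, levels):
    """Return (approx, details); every band has the same length as `signal`."""
    approx = [float(x) for x in signal]
    details = []
    for level in range(levels):
        smoother = smooth_a_trous(approx, level)
        details.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    return approx, details


approx, details = redundant_wt([0, 0, 0, 16, 0, 0, 0, 0], levels=2)
```

In this scheme the input is recovered by adding the final approximation to all detail bands, and a singularity stays at the same sample index at every scale, which is convenient for localization.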

A DSP-based comparison of edge detectors is presented in (Abdel-Qader & Maddix, 2005), where the performance of three edge detection algorithms on a DSP is compared: Canny, Prewitt and a Haar wavelet-based detector. The reported outcome is that the Haar wavelet-based edge detector performed best in terms of SNR on noisy images. The authors recommended post-processing the output edges to improve them further.

The review favours DSPs as a suitable choice for our ANPR application. In addition, following the successful LP detection results obtained with the WT on a DSP (Musoromy et al., 2010), this work extends the use of the WT to the investigation of LP character segmentation in SD and HD images, using a Texas Instruments C64plus DSP with a minimum clock speed of 600 MHz and 1 MB of RAM (TI, 2006).
