**3. VLSI architecture based on Massively Parallel Processor Arrays**

In this section, we present the design methodology for real time implementation of specialized arrays of processors in VLSI architectures based on massively parallel processor arrays (MPPAs) as coprocessors units that are integrated with a FPGA platform via the HW/SW co-design paradigm. This approach represents a real possibility for low-power high-speed reconstructive signal processing (SP) for the enhancement/reconstruction of RS imagery. In addition, the authors believe that FPGA-based reconfigurable systems in aggregation with custom VLSI architectures are emerging as newer solutions which offer enormous computation potential in RS systems.

A brief perspective on the state-of-the-art of high-performance computing (HPC) techniques in the context of remote sensing problems is provided. The wide range of computer architectures (including homogeneous and heterogeneous clusters and groups of clusters, large-scale distributed platforms and grid computing environments, specialized architectures based on reconfigurable computing, and commodity graphic hardware) and data processing techniques exemplifies a subject area that has drawn at the cutting edge of science and technology. The utilization of parallel and distributed computing paradigms anticipates ground-breaking perspectives for the exploitation of high-dimensional data processing sets in many RS applications. Parallel computing architectures made up of homogeneous and heterogeneous commodity computing resources have gained popularity in the last few years due to the chance of building a high-performance system at a reasonable cost. The scalability, code reusability, and load balance achieved by the proposed implementation in such low-cost systems offer an unprecedented opportunity to explore methodologies in other fields (e.g. data mining) that previously looked to be too computationally intensive for practical applications due to the immense files common to remote sensing problems (Plaza & Chang, 2008).

To address the required near-real-time computational mode by many RS applications, we propose a high-speed low-power VLSI co-processor architecture based on MPPAs that is aggregated with a FPGA via the HW/SW co-design paradigm. Experimental results demonstrate that the hardware VLSI-FPGA platform of the presented DEDR algorithms makes appropriate use of resources in the FPGA and provides a response in near-real-time that is acceptable for newer RS applications.

### **3.1 Design flow**

The all-software execution of the prescribed RS image formation and reconstructive signal processing (SP) operations in modern high-speed personal computers (PC) or any digital signal processors (DSP) platform may be intensively time consuming. These high computational complexities of the general-form DEDR-POCS algorithms make them definitely unacceptable for real time PC-aided implementation.

In this section, we describe a specific design flow of the proposed VLSI-FPGA architecture for the implementation of the DEDR method via the HW/SW co-design paradigm. The 140 Applications of Digital Signal Processing

Any other feasible adjustments of the DEDR degrees of freedom (the regularization parameters D, E, and the weight matrix **A**) provide other possible DEDR-related SSP

In this section, we present the design methodology for real time implementation of specialized arrays of processors in VLSI architectures based on massively parallel processor arrays (MPPAs) as coprocessors units that are integrated with a FPGA platform via the HW/SW co-design paradigm. This approach represents a real possibility for low-power high-speed reconstructive signal processing (SP) for the enhancement/reconstruction of RS imagery. In addition, the authors believe that FPGA-based reconfigurable systems in aggregation with custom VLSI architectures are emerging as newer solutions which offer

A brief perspective on the state-of-the-art of high-performance computing (HPC) techniques in the context of remote sensing problems is provided. The wide range of computer architectures (including homogeneous and heterogeneous clusters and groups of clusters, large-scale distributed platforms and grid computing environments, specialized architectures based on reconfigurable computing, and commodity graphic hardware) and data processing techniques exemplifies a subject area that has drawn at the cutting edge of science and technology. The utilization of parallel and distributed computing paradigms anticipates ground-breaking perspectives for the exploitation of high-dimensional data processing sets in many RS applications. Parallel computing architectures made up of homogeneous and heterogeneous commodity computing resources have gained popularity in the last few years due to the chance of building a high-performance system at a reasonable cost. The scalability, code reusability, and load balance achieved by the proposed implementation in such low-cost systems offer an unprecedented opportunity to explore methodologies in other fields (e.g. data mining) that previously looked to be too computationally intensive for practical applications due to the immense files common to

To address the required near-real-time computational mode by many RS applications, we propose a high-speed low-power VLSI co-processor architecture based on MPPAs that is aggregated with a FPGA via the HW/SW co-design paradigm. Experimental results demonstrate that the hardware VLSI-FPGA platform of the presented DEDR algorithms makes appropriate use of resources in the FPGA and provides a response in near-real-time

The all-software execution of the prescribed RS image formation and reconstructive signal processing (SP) operations in modern high-speed personal computers (PC) or any digital signal processors (DSP) platform may be intensively time consuming. These high computational complexities of the general-form DEDR-POCS algorithms make them

In this section, we describe a specific design flow of the proposed VLSI-FPGA architecture for the implementation of the DEDR method via the HW/SW co-design paradigm. The

6

,respectively.

with **F** (1) = **F***MSF*; **F**(2) = **F***RSF*, and **F**(3) = **F***RASF*, **F**(4) = **F***RASF*

enormous computation potential in RS systems.

remote sensing problems (Plaza & Chang, 2008).

that is acceptable for newer RS applications.

definitely unacceptable for real time PC-aided implementation.

**3.1 Design flow** 

reconstruction techniques, that we do not consider in this study.

**3. VLSI architecture based on Massively Parallel Processor Arrays** 

HW/SW co-design is a hybrid method aimed at increasing the flexibility of the implementation and improvement of the overall design process (Castillo Atoche et al., 2010a). When a co-processor-based solution is employed in the HW/SW co-design architecture, the computational time can be drastically reduced. Two opposite alternatives can be considered when exploring the HW/SW co-design of a complex SP system. One of them is the use of standard components whose functionality can be defined by means of programming. The other one is the implementation of this functionality via a microelectronic circuit specifically tailored for that application. It is well known that the first alternative (the software alternative) provides solutions that present a great flexibility in spite of high area requirements and long execution times, while the second one (the hardware alternative) optimizes the size aspects and the operation speed but limits the flexibility of the solution. Halfway between both, hardware/software co-design techniques try to obtain an appropriate trade-off between the advantages and drawbacks of these two approaches.

In (Castillo Atoche et al., 2010a), an initial version of the HW/SW- architecture was presented for implementing the digital processing of a large-scale RS imagery in the operational context. The architecture developed in (Castillo Atoche et al., 2010a) did not involve MPPAs and is considered here as a simply reference for the new pursued HW/SW co-design paradigm, where the corresponding blocks are to be designed to speed-up the digital SP operations of the DEDR-POCS-related algorithms developed at the previous SW stage of the overall HW/SW co-design to meet the real time imaging system requirements.

The proposed co-design flow encompasses the following general stages:

