**3.1.1 Algorithmic implementation**

In this sub-section, the procedures for the computational implementation of the DEDR-related robust space filter (RSF) and robust adaptive space filter (RASF) algorithms on the MATLAB and C++ platforms are developed. This reference implementation scheme will next be compared with the proposed architecture based on the use of a VLSI-FPGA platform.

Having established the optimal RSF/RASF estimator (20) and (21), let us now consider the way in which the processing of the data vector **u** that yields the optimum estimate **b̂** can be computationally performed. For this purpose, we treat the estimator (20) as a multi-stage computational procedure and partition the overall computations prescribed by the estimator (16) into the following four steps.

a. First Step: Data Innovations

At this stage, the a priori known value of the data mean ⟨**u**⟩ = **Sm**<sub>b</sub> is subtracted from the data vector **u**. The innovations vector **u**<sub>Δ</sub> = **u** – **Sm**<sub>b</sub> contains all new information regarding the unknown deviations **b**<sub>Δ</sub> = (**b** – **m**<sub>b</sub>) of the vector **b** from its prescribed (known) mean value **m**<sub>b</sub>.
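As a minimal reference sketch of this first step (assuming a dense, real-valued, row-major **S**; the function and variable names here are illustrative, not from the chapter), the innovations vector can be computed as follows:

```cpp
#include <vector>

// Dense matrix-vector product y = S * x, with S an n x m row-major matrix.
std::vector<double> matvec(const std::vector<double>& S,
                           const std::vector<double>& x,
                           int n, int m) {
    std::vector<double> y(n, 0.0);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i)
            y[j] += S[j * m + i] * x[i];
    return y;
}

// First step: innovations u_delta = u - S * m_b, i.e. the data vector
// with its a priori known mean S * m_b subtracted.
std::vector<double> innovations(const std::vector<double>& u,
                                const std::vector<double>& S,
                                const std::vector<double>& m_b,
                                int n, int m) {
    std::vector<double> u_delta = matvec(S, m_b, n, m);
    for (int j = 0; j < n; ++j)
        u_delta[j] = u[j] - u_delta[j];
    return u_delta;
}
```

Note that this step is a single matrix-vector product followed by a vector subtraction, which is why the matrix-vector kernel dominates the computational cost discussed later in this section.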

b. Second Step: Rough Signal Estimation





Fig. 1. VLSI-FPGA platform of the RSF/RASF algorithms via the HW/SW co-design paradigm.

The basic algebraic matrix operation (i.e., the selected matrix–vector multiplication) that constitutes the base of the most computationally consuming reconstructive SP applications is transformed into the required parallel algorithmic representation format. A manifold of different approaches can be used to represent parallel algorithms, e.g., (Moldovan & Fortes, 1986), (Kung, 1988). In this study, we consider a number of different loop optimization techniques used in high performance computing (HPC) in order to exploit the maximum possible parallelism in the design.

In addition, to achieve such maximum possible parallelism in an algorithm, the so-called data dependencies in the computations must be analyzed (Moldovan & Fortes, 1986), (Kung, 1988). Formally, these dependencies are expressed via the corresponding dependence graph (DG). Following (Kung, 1988), we define the dependence graph **G** = [**P**, **E**] as a composite set where **P** represents the nodes and **E** represents the arcs or edges, in which each *e* ∈ **E** connects a pair *p*<sub>1</sub>, *p*<sub>2</sub> ∈ **P** and is represented as *e*: *p*<sub>1</sub> → *p*<sub>2</sub>. Next, the data dependency analysis of the matrix–vector multiplication algorithms should be performed, aimed at their efficient parallelization.
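To make the dependence-graph notion concrete, the following sketch (illustrative only; the `Node` and `Edge` types are ours) enumerates **E** for the matrix–vector product, where node (*j*, *i*) computes the partial sum *s*<sub>*j*,*i*</sub> = *s*<sub>*j*,*i*–1</sub> + *a*<sub>*ji*</sub>*x*<sub>*i*</sub> and therefore depends only on its predecessor along *i*:

```cpp
#include <vector>
#include <utility>

// A node of the dependence graph: index point (j, i) of the
// matrix-vector multiplication loop nest.
struct Node { int j, i; };

// An edge p1 -> p2 means node p2 consumes a value produced by p1.
using Edge = std::pair<Node, Node>;

// Build the edge set E of G = [P, E] for y = A x with an n x m matrix A.
// Node (j, i) accumulates s(j, i) = s(j, i-1) + a(j, i) * x(i), so it
// depends on node (j, i-1): an accumulation chain along i, with no
// edges between different rows j.
std::vector<Edge> dependence_edges(int n, int m) {
    std::vector<Edge> E;
    for (int j = 0; j < n; ++j)
        for (int i = 1; i < m; ++i)
            E.push_back({Node{j, i - 1}, Node{j, i}});
    return E;
}
```

The absence of edges between rows is exactly what the parallelization below exploits: the *n* accumulation chains can be mapped onto independent processing elements.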

For example, the matrix–vector multiplication of an *n*×*m* matrix **A** with a vector **x** of dimension *m*, given by **y** = **Ax**, can be algorithmically computed as *y*<sub>*j*</sub> = Σ<sub>*i*=1</sub><sup>*m*</sup> *a*<sub>*ji*</sub>*x*<sub>*i*</sub>, for *j* = 1, ..., *n*, where **y** and *a*<sub>*ji*</sub> represent the *n*-dimensional (*n*-*D*) output vector and the corresponding element of **A**, respectively. The first SW-level transformation is the so-called single assignment algorithm (Kung, 1988), (Castillo Atoche et al., 2010b) that performs the computing of the matrix–vector product. Such a single assignment algorithm corresponds to a loop unrolling method whose primary benefit is to expose the independent operations of the loop body so that they can be executed in parallel.
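The single-assignment form can be sketched as follows (a sketch under our own naming conventions; the chapter's exact formulation may differ): each partial sum *s*[*i*+1] is written exactly once, so the only remaining dependence is the accumulation chain along *i*, while the iterations of the *j*-loop are fully independent:

```cpp
#include <vector>

// Single-assignment form of y = A x: every partial sum s[i+1] is
// assigned exactly once, removing the output dependence of the usual
// "y[j] += ..." accumulator and exposing the independence of the rows.
std::vector<double> matvec_single_assignment(
        const std::vector<std::vector<double>>& A,
        const std::vector<double>& x) {
    const int n = static_cast<int>(A.size());
    const int m = static_cast<int>(x.size());
    std::vector<double> y(n);
    for (int j = 0; j < n; ++j) {           // rows are fully independent
        std::vector<double> s(m + 1, 0.0);  // one instance per index point
        for (int i = 0; i < m; ++i)
            s[i + 1] = s[i] + A[j][i] * x[i];  // single assignment
        y[j] = s[m];
    }
    return y;
}
```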


At this stage, we obtain the vector **q** = **S**<sup>+</sup>**u**<sub>Δ</sub>; that is, the innovations vector **u**<sub>Δ</sub> is mapped by the operator **S**<sup>+</sup>. The result, **q**, can be interpreted as a rough estimate of **b**<sub>Δ</sub> = (**b** – **m**<sub>b</sub>), referred to as a degraded image.
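A minimal sketch of this second step, assuming real-valued data so that the adjoint **S**<sup>+</sup> reduces to the matrix transpose (function and variable names are illustrative):

```cpp
#include <vector>

// Second step: apply the adjoint operator, q = S^T * u_delta, with S an
// n x m row-major matrix. For real-valued S the adjoint S+ is simply the
// transpose, so (S^T)_{ij} = S_{ji}.
std::vector<double> rough_estimate(const std::vector<double>& S,
                                   const std::vector<double>& u_delta,
                                   int n, int m) {
    std::vector<double> q(m, 0.0);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            q[i] += S[j * m + i] * u_delta[j];  // column i of S dotted with u_delta
    return q;
}
```

Structurally this is again a matrix–vector product, so it parallelizes along the same lines as the first step.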

c. Third Step: Signal Reconstruction

At this stage, we obtain the estimate **b̂**<sub>RSF</sub> = **A**<sub>α</sub><sup>–1</sup>**q** = (**S**<sup>+</sup>**S** + α<sub>RSF</sub>**I**)<sup>–1</sup>**q** of the unknown signal, referred to as the reconstructed image frame. The matrix **A**<sub>α</sub><sup>–1</sup> = (**S**<sup>+</sup>**S** + α<sub>RSF</sub>**I**)<sup>–1</sup> operating on **q** produces some form of inversion of the degradations embedded in the operator **S**<sup>+</sup>**S**. It is important to note that in the case α = 0, we have **b̂** = **A**<sub>(α=0)</sub><sup>–1</sup>**q** = **S**<sup>#</sup>**u**<sub>Δ</sub>, where matrix
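The third step amounts to forming **S**<sup>+</sup>**S** + α<sub>RSF</sub>**I** and solving the resulting linear system. The Gaussian-elimination solver below is an illustrative software stand-in for that inversion (assuming a real-valued, precomputed **M** = **S**<sup>+</sup>**S**), not the chapter's VLSI realization:

```cpp
#include <vector>
#include <cmath>
#include <utility>

// Third step (sketch): solve (M + alpha*I) b = q for the m x m matrix
// M = S^T S, using Gaussian elimination with partial pivoting.
std::vector<double> reconstruct(std::vector<std::vector<double>> M,
                                std::vector<double> q, double alpha) {
    const int m = static_cast<int>(q.size());
    for (int i = 0; i < m; ++i) M[i][i] += alpha;   // regularization alpha*I
    for (int k = 0; k < m; ++k) {
        int p = k;                                  // partial pivoting
        for (int r = k + 1; r < m; ++r)
            if (std::fabs(M[r][k]) > std::fabs(M[p][k])) p = r;
        std::swap(M[k], M[p]);
        std::swap(q[k], q[p]);
        for (int r = k + 1; r < m; ++r) {           // eliminate column k
            double f = M[r][k] / M[k][k];
            for (int c = k; c < m; ++c) M[r][c] -= f * M[k][c];
            q[r] -= f * q[k];
        }
    }
    std::vector<double> b(m);
    for (int k = m - 1; k >= 0; --k) {              // back substitution
        double s = q[k];
        for (int c = k + 1; c < m; ++c) s -= M[k][c] * b[c];
        b[k] = s / M[k][k];
    }
    return b;
}
```

In the parallel architecture this dense solve is the most expensive stage, which motivates the systolic matrix-oriented formulation discussed in the following sections.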
