#### **3.3 Linear-phase directional combination filter**

Using transposition techniques, we now derive DF that are complementary (dual) to those presented in Subsection 3.2: they combine two complex-valued signals of identical sampling rate *f*<sub>d</sub>, each likewise oversampled by at least a factor of 2, into an FDM signal, where different oversampling factors allow for different bandwidths.

An example can be deduced from Fig. 21 by considering the signals *s<sub>o</sub>*(*mT*<sub>d</sub>) ←→ *S<sub>o</sub>*(e<sup>jΩ<sup>(d)</sup></sup>), *o* = 0, 2, of Fig. 21(c,d) as input signals. The multiplexing DF increases the sampling rates of both signals to *f*<sub>n</sub> = 2*f*<sub>d</sub> and performs the filtering operations shown in Fig. 21(b), *h<sub>o</sub>*(*kT*<sub>n</sub>) ←→ *H<sub>o</sub>*(e<sup>jΩ</sup>), *o* = 0, 2, to form the FDM output spectrum, which is composed exclusively of *S<sub>o</sub>*(e<sup>jΩ</sup>), *o* = 0, 2.

#### **3.3.1 Transposition of complex multirate systems**

The goal of transposition is to derive a system that is complementary or dual to the original one: The various filter transfer functions must be retained, demultiplexing and decimating operations must be replaced with the dual operations of multiplexing and interpolation, respectively [Göckler & Groth (2004)].

The types of systems we want to transpose, Figs. 22 and 24, represent complex-valued 4 × 2 multiple-input multiple-output (MIMO) multirate systems. Obviously, these systems are composed of *complex monorate* sub-systems (complex filtering of polyphase components) and *real multirate* sub-systems (down- and upsamplers), cf. [Göckler & Groth (2004)].

While the transposition of real MIMO monorate systems is well-known and unique [Göckler & Groth (2004); Mitra (1998)], in the context of *complex* MIMO monorate systems the *Invariant* (ITr) and the *Hermitian* (HTr) transposition must be distinguished, where the former retains the original transfer functions, *H<sub>o</sub>*<sup>T</sup>(*z*) = *H<sub>o</sub>*(*z*) ∀*o*, as desired in our application. As detailed in [Göckler & Groth (2004)], the ITr is performed by applying the transposition rules known for real MIMO monorate systems *provided that* all imaginary units "j", both of the complex input and output signals *and* of the complex coefficients, are conceptually considered and treated as multipliers within the SFG<sup>3</sup> (denoted as truly complex implementation), as can be seen from Figs. 22 and 24.
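The distinction between ITr and HTr can be checked numerically. The following is a minimal sketch, assuming a hypothetical 1 × 2 complex coefficient matrix `H` and a helper `real_form` (both introduced here for illustration only): eliminating the imaginary units turns `H` into an equivalent real matrix, and transposing that real matrix yields the Hermitian transpose of `H`, whereas the invariant transposition keeps the coefficients unconjugated.

```python
import numpy as np

def real_form(H):
    """Real block matrix equivalent to multiplication by the complex matrix H."""
    H = np.atleast_2d(H)
    return np.block([[H.real, -H.imag],
                     [H.imag,  H.real]])

# Hypothetical complex coefficients of a 1 x 2 monorate MIMO system
H = np.array([[1 + 2j, 3 - 1j]])

# Invariant transposition (ITr): coefficients are retained, only the ports swap
H_itr = H.T

# Transposing the equivalent real system conjugates the coefficients (HTr)
real_transposed = real_form(H).T
is_hermitian = np.array_equal(real_transposed, real_form(H.conj().T))
is_invariant = np.array_equal(real_transposed, real_form(H_itr))

print("real-form transpose equals HTr:", is_hermitian)
print("real-form transpose equals ITr:", is_invariant)
```

The sign flip arises because transposition swaps the two cross branches that carry the imaginary part of each coefficient.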

The transposition of an *M*-downsampler, representing a real single-input single-output (SISO) multirate system, uniquely leads to the corresponding *M*-upsampler, the complementary (dual) multirate system, and vice versa [Göckler & Groth (2004)].
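As an illustration of this duality, the sketch below writes an *M*-fold downsampler as a selection matrix acting on a finite block of samples (a simplifying assumption; the function name `downsampler_matrix` is hypothetical) and confirms that its transpose performs zero-stuffing, i.e. *M*-fold upsampling.

```python
import numpy as np

def downsampler_matrix(n_in, M):
    """Matrix of an M-fold downsampler (zero time offset) on a block of n_in samples."""
    n_out = n_in // M
    D = np.zeros((n_out, n_in))
    D[np.arange(n_out), np.arange(n_out) * M] = 1.0  # keep every M-th sample
    return D

M, n_in = 2, 8
D = downsampler_matrix(n_in, M)
U = D.T                          # transposition reverses all signal-flow arrows

x = np.array([1.0, 2.0, 3.0, 4.0])   # low-rate input block
y = U @ x                            # output of the transposed system

y_ref = np.zeros(n_in)
y_ref[::M] = x                       # direct M-fold upsampling (zero-stuffing)

print(y)
```

Downsampling the upsampled block again (`D @ y`) returns the original samples, consistent with the two operations being mutually dual.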

<sup>3</sup> The imaginary units of the input signals and the coefficients *must not* be eliminated by simple multiplication and consideration of the correct signs in subsequent adders; this approach would transform the original complex MIMO SFG to a corresponding real SFG, where the direct transposition of the latter would perform the HTr [Göckler & Groth (2004)].

Signal Processing 31

<sup>267</sup> Most Efficient Digital Filter Structures:

Fig. 26. COHBF approach to *multiplexing* DF implementation with selectable transfer functions derived by transposition from corresponding separating DF; *N* = 11, *b<sub>i</sub>* = (−1)<sup>*o<sub>i</sub>*</sup>, *d<sub>i</sub>* = (−1)<sup>*o<sub>i</sub>*/2</sup>; *o<sub>i</sub>* ∈ {0, 1, 2, 3}, *i* ∈ {I, II}

Fig. 27. DF combiner: Sign-setting for selection of desired channel transfer functions

Fig. 28. Generally permissible FDM input spectrum to separation DF

The Potential of Halfband Filters in Digital Signal Processing

Connecting all of the above considerations, the ITr transposition of a complex-valued MIMO multirate system is performed as follows [Göckler & Groth (2004)]:

• The system SFG to be transposed must be given as truly complex implementation.

• Reverse *all* arrows of the given SFG, both the arrows representing signal flows and those symbolic arrows of down- and upsamplers or rotating switches (commutators), respectively.

As a result of transposition [Göckler & Groth (2004)]

• all input (output) nodes become output (input) nodes, a 4 × 2 MIMO system is transformed to a 2 × 4 MIMO system,

• the number of delays and multipliers is retained,

• the overall number of branching and summation nodes is retained, and

• the overall number of down- and upsamplers is retained.

Obviously, the original *optimality* (minimality) is *transposition invariant*.

#### **3.3.2 Transposition of the SFG of the COHBF approach to DF**

As an example, we transpose the SFG of the COHBF approach to the implementation of a separating DF, as depicted in Fig. 24. Applying the transposition rules of the preceding Subsection 3.3.1 to the SFG of Fig. 24 results in the COHBF approach to a multiplexing DF shown in Fig. 26. The invariant properties are easily confirmed by comparing the original and the transposed SFG. Hence, the two mutually dual DF systems require identical numbers of delays and multipliers. As expected, the numbers of adders required differ, since only the *overall* number of branching *and* summation nodes is retained.

Moreover, it should be noted that the simplicity of the channel selection procedure is also retained. To this end, we have shifted the channel-dependent sign-setting operators *d<sub>i</sub>* = (−1)<sup>*o<sub>i</sub>*/2</sup>, *o<sub>i</sub>* ∈ {0, 1, 2, 3}, *i* ∈ {I, II}, to more suitable positions in front of the summation nodes G and H. Again, there is a total of 8 summation points where the signs of the respective input sequences must be adjusted: the 4 inner lattice output nodes A, B, C, and D, the 2 input summation nodes E and F immediately fed by the imaginary parts of the input sequences, and the 2 inner post-lattice summing nodes G and H. At all these summation nodes, the signs of some or all input sequences must be set in compliance with the desired channel transfer functions *H<sub>o</sub>*(*z*), *o<sub>i</sub>* ∈ {0, 1, 2, 3}, *i* ∈ {I, II}, cf. Fig. 26. The sign selection is again most easily performed as shown in Fig. 27.

#### **3.4 Conclusion: Halfband filter pair combined to directional filter**

In this Section 3, we have derived and analyzed two different approaches to linear-phase directional filters that separate two complex user signals from a complex-valued FDM input signal, where the FDM signal may be composed of up to four independent user signals: The FDMUX approach (Subsection 3.2.1) needs the least number of delays, whereas the synergetic COHBF approach (Subsection 3.2.2) requires minimum computation. Signal extraction is always combined with decimation by two.

While the four frequency slots of the user signals to be processed (corresponding to the four potential DF transfer functions *H<sub>o</sub>*(*z*), *o<sub>i</sub>* ∈ {0, 1, 2, 3}, *i* ∈ {I, II}, centred according to (38); cf. Fig. 21) are equally wide and uniformly allocated, as indicated in Fig. 28, the individual user signals may possess different bandwidths. However, each user signal must be completely contained in one of the four frequency slots, as exemplified in Fig. 28.

Furthermore, by applying the transposition rules of [Göckler & Groth (2004)], the corresponding complementary (dual) combining directional filters have been derived, where the multiplication rates and the delay counts of the original structures are always retained. Obviously, transposing a system allows for the derivation of an optimum dual system by applying the simple transposition rules, provided that the original system is optimal. Thus, a tedious re-derivation and optimization of the complementary system is circumvented. Nevertheless, it should be noted that by transposition always just one particular structure is obtained, rather than a variety of structures [Göckler & Groth (2004)].

Finally, to give an idea of the filter lengths required, we recall the design result reported in [Göckler & Eyssele (1992)] where, as depicted in Fig. 21(a,b) above, the passband, stopband and transition bands were assumed equally wide: With an HBF prototype filter length of *N* = 11 and 10 bit coefficients, a stopband attenuation of > 50 dB was achieved.

### **4. Parallelisation of tree-structured filter banks composed of directional filters** <sup>4</sup>

In the subsequent Section 4 of this chapter we consider the combination of multiple two-channel DF investigated in Section 3 to construct tree-structured filter banks. To this end, we cascade separating DF in a hierarchical manner to demultiplex (split) a frequency division multiplex (FDM) signal into its constituent user signals: this type of filter bank (FB) is denoted by FDMUX FB; Fig. 2. Its transposed counterpart (cf. Subsection 3.3.1), the FMUX FB, is a cascade connection of combining DF considered in Subsection 3.3 to form an FDM signal of independent user signals. Finally, we call an FDMUX FB followed by an FMUX FB an FDFMUX FB, which may contain a switching unit for channel routing between the two FB. Subsequently, we consider an application of FDFMUX FB for on-board processing in satellite communications. If the number of channels and/or the bandwidth requirements are high, efficient implementation of the high-end DF is crucial, since they are operated at (extremely) high sampling rates. To cope with this issue, we propose to parallelise at least the front-end (back-end) of the FDMUX (FMUX) filter bank. For this outlined application, we give the following introduction and motivation.

Digital signal processing on board communication satellites (on-board processing, OBP) is an active field of research where, in conjunction with frequency division multiple access (FDMA) systems, two trends and challenges are presently observed: *i*) The need for an ever-increasing number of user channels makes it necessary to digitally process, i.e. to demultiplex, cross-connect and remultiplex, ultra-wideband FDM signals requiring high-end sampling rates that range considerably beyond 1 GHz [Arbesser-Rastburg et al. (2002); Maufroid et al. (2004; 2003); Rio-Herrero & Maufroid (2003); Wittig (2000)], and *ii*) the desire for flexibility of channel bandwidth-to-user assignment, calling for simply reconfigurable OBP systems [Abdulazim & Göckler (2005); Göckler & Felbecker (2001); Johansson & Löwenborg (2005); Kopmann et al. (2003)]. Yet, overall power consumption must be minimal, which demands highly efficient FB for FDM demultiplexing (FDMUX) and remultiplexing (FMUX).

Two baseline approaches to most efficient uniform digital FB, as required for OBP, are known: *a*) The complex-modulated (DFT) polyphase (PP) FB applying single-step sample rate alteration [Vaidyanathan (1993)], and *b*) the multistage tree-structured FB as depicted in Fig. 2, where its directional filters (DF) are either based on the DFT PP method [Göckler & Groth (2004); Göckler & Eyssele (1992)] according to Subsection 3.2.1, or on the COHBF approach investigated in Subsection 3.2.2. For both approaches it has been shown that bandwidth-to-user assignment is feasible within reasonable constraints [Abdulazim et al. (2007); Johansson & Löwenborg (2005); Kopmann et al. (2003)]: A minimum user channel bandwidth, denoted by slot bandwidth *b*, can stepwise be extended by any integer number of additional slots up to a desired maximum overall bandwidth that shall be assigned to a single user.

<sup>4</sup> Underlying original publication: Göckler et al. (2006)

However, as to challenge *i*), the above two FB approaches fundamentally differ from each other: In a DFT PP FDMUX (*a*) the overall sample rate reduction is performed in compliance with the number of user channels in a single step: all arithmetic operations are carried out at the (lowest) output sampling rate [Vaidyanathan (1993)]. In contrast, in the multistage FDMUX (*b*) the sampling rate is reduced stepwise, in each stage by a factor of two [Göckler & Eyssele (1992)]. As a result, the polyphase approach (*a*) inherently represents a completely parallelised structure, immediately usable for extremely high front-end sampling frequencies, whereas the high-end stages of the tree-structured FDMUX (*b*) cannot be implemented with standard space-proved CMOS technology. Hence, the tree structure, FDMUX as well as FMUX, calls for a parallelisation of the high rate stages.
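The stepwise rate reduction of the tree structure can be made concrete with a short calculation. The sketch below is illustrative only: the front-end rate `f_in` and the clock limit `f_clk_max` are hypothetical values chosen for this example, not figures from the cited publications. It lists the input sampling rate of each stage when every stage decimates by two and flags the stages whose rate exceeds the assumed clock limit.

```python
def stage_input_rates(f_in_hz, num_stages):
    """Input sampling rate of each tree stage when every stage decimates by two."""
    return [f_in_hz / 2**nu for nu in range(num_stages)]

f_in = 1.6e9         # hypothetical front-end sampling rate (beyond 1 GHz)
f_clk_max = 250e6    # hypothetical maximum clock rate of the target technology

rates = stage_input_rates(f_in, 4)
# A stage needs parallelisation if its input rate exceeds the assumed clock limit
needs_parallelisation = [rate > f_clk_max for rate in rates]

for nu, (rate, flag) in enumerate(zip(rates, needs_parallelisation), start=1):
    print(f"stage {nu}: {rate / 1e6:.0f} MHz, parallelise: {flag}")
```

Under these assumed numbers only the first three stages run faster than the clock limit, which is why the parallelisation effort concentrates on the high-rate end of the tree.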

As motivated, this contribution deals with the parallelisation of multistage multirate systems. To this end, we recall a general systematic procedure for multirate system parallelisation [Groth (2003)], which is presented in detail in Subsection 4.1. For better understanding, in Subsection 4.2 this procedure is applied to the high rate front-end stages of the FDMUX part of the recently proposed tree-structured SBC-FDFMUX FB [Abdulazim & Göckler (2005); Abdulazim et al. (2007)], which uniformly demultiplexes an FDM signal always down to slot level (of bandwidth *b*) and, after on-board switching, recombines these independent slot signals to an FDM signal (FMUX) with different channel allocation – *FDFMUX functionality*. If a single user occupies a multiple slot channel, the corresponding parts of FDMUX and FMUX are matched for (nearly) perfect reconstruction of this wideband channel signal – *SBC functionality* [Vaidyanathan (1993)]. Finally, some conclusions are drawn.

#### **4.1 Sample-by-sample approach to parallelisation**

In this subsection, we introduce the novel sample-by-sample processing (SBSP) approach to parallelisation of digital multirate systems, as proposed by [Groth (2003)] where, without any additional delay, all incoming signal samples are directly fed into assigned units for immediate signal processing. Hence, in contrast to the widely used block processing (BP) approach, SBSP does not increase latency.
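The sample-by-sample distribution can be sketched as follows, under the assumption that incoming samples are assigned to *P* processing units in round-robin order, so that unit *p* receives the *p*-th polyphase component of the input. The function names `sp_commutator` and `ps_commutator` are hypothetical. The sketch models only the sample-to-unit mapping, not the timing: it shows that re-interleaving the *P* branch signals restores the input sequence without buffering a block first.

```python
import numpy as np

def sp_commutator(x, P):
    """Serial-to-parallel: branch p receives x[p], x[p+P], ... (polyphase component p)."""
    return [x[p::P] for p in range(P)]

def ps_commutator(branches):
    """Parallel-to-serial: time-interleave the P low-rate branch signals."""
    P = len(branches)
    y = np.empty(sum(len(b) for b in branches), dtype=branches[0].dtype)
    for p, b in enumerate(branches):
        y[p::P] = b
    return y

P = 4
x = np.arange(12)                 # test input; length is a multiple of P
branches = sp_commutator(x, P)    # each branch runs at a P-fold lower rate
y = ps_commutator(branches)       # recomposed high-rate signal

print(branches[1])
print(y)
```

Each branch holds every *P*-th input sample, so any processing placed between the two commutators can run at the *P*-fold lower rate.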

In order to systematically parallelise a (multirate) system, we distinguish four procedural steps [Groth (2003)]:

1. *Partition the original system* into (elementary SISO or MIMO) subsystems *E*(*z*) with single or multiple input and/or output ports, respectively, that still operate at the original high clock frequency *f*<sub>n</sub> = 1/*T* and are readily amenable to parallelisation. Examples include delays, multipliers, down- and up-samplers, summation and branching nodes, but also suitable compound subsystems such as SISO filters and FFT transform blocks.

2. *Parallelise each subsystem E*(*z*) in an SBSP manner according to the desired individual degree of parallelisation *P*, where *P* ∈ **N**. To this end, each subsystem is cascaded with a *P*-fold SBSP serial-to-parallel (SP) commutator for signal decomposition (demultiplexing) followed by a consistently connected *P*-fold parallel-to-serial (PS) commutator for recomposition (remultiplexing) of the original signal, as depicted in Fig. 29(a). Here, obviously *P* = *P*<sub>SP</sub> = *P*<sub>PS</sub>, and *p* ∈ [0, *P* − 1] denotes the relative time offsets of connected pairs of down- and up-samplers, respectively. Evidently, the *P* output signals of the SP interface comprise all polyphase components of its input signal in a time-interleaved (SBSP) manner at a *P*-fold lower sampling rate *f*<sub>d</sub> = *f*<sub>n</sub>/*P* [Göckler & Groth (2004); Vaidyanathan (1993)]. Since the subsequent PS interface is inverse to the preceding SP interface [Göckler & Groth (2004)], the SP-PS commutator cascade has unity transfer with zero delay in contrast to the (*P* − 1)-fold delay of the BP Delay-Chain Perfect-Reconstruction system [Göckler & Groth (2004); Vaidyanathan (1993)], as anticipated (cf. also Fig. 30).

After this preparation, *P*-fold parallelisation is readily achieved by shifting the (SISO) subsystem *E*(*z*) between the SP and PS interfaces by exploiting the noble identities [Göckler & Groth (2004); Vaidyanathan (1993)] and some novel generalized SBSP multirate identities [Groth (2003); Groth & Göckler (2001)]. Thus, as shown in Fig. 29(b), the two interfaces are interconnected by an equivalent *P* × *P* MIMO system **E**(*z*<sub>d</sub>), which represents the *P*-fold parallelisation of *E*(*z*), all operations of which are performed at the *P*-fold reduced operational clock frequency *f*<sub>d</sub>.

3. *Reconnect all parallelised subsystems* exactly in the same manner as in the original system. This is always possible, since parallelisation does not change the original numbers of input and output ports of SISO or MIMO subsystems, respectively.

4. *Eliminate all interfractional cascade connections of PS-SP interfaces* using the obvious multirate identity depicted in Fig. 30. Note that this elimination process requires identical up- and down-sampling factors, *P*<sub>PS</sub><sup>out,a</sup> = *P*<sub>SP</sub><sup>in,b</sup>, of each PS-SP interface cascade, which restricts the free choice of *P* for subsystem parallelisation. As a result of parallelisation, all input signals of the original (possibly MIMO) system are decomposed into *P* time-interleaved polyphase components by an SP demultiplexer for subsequent parallel processing at a *P*-fold lower rate, and all system output ports are provided with a PS commutator to interleave all low rate subsignals to form the high speed output signals.

For illustration, we present the parallelisation of a unit delay *z*<sup>−1</sup> := *z*<sub>d</sub><sup>−1/*P*</sup>, and of an *M*-fold down-sampler with zero time offset [Groth (2003)], as shown in Fig. 31. The unit delay (a) is realized by *P* parallel time-interleaved shimming delays to be implemented by suitable system control:

$$\mathbf{E}_{P \times P}(z_{\mathrm{d}}) = z_{\mathrm{d}}^{-1/P} \begin{pmatrix} \mathbf{0} & 1 \\ \mathbf{I}_{(P-1)\times(P-1)} & \mathbf{0} \end{pmatrix},$$

where permutation is introduced for straightforward elimination of interfractional PS-SP cascades according to Fig. 30 (**I**: identity matrix). In the case of down-sampling, Fig. 31(b), to increase efficiency, the *P* parallel down-samplers of the diagonal MIMO system **E**(*z*<sub>d</sub>) are merged with the *P* down-samplers of the SP interface. Hence, by using suitable multirate identities [Groth (2003)], the contiguous *PM*-fold down-samplers of the SP demultiplexer have a relative time offset of *M*.

Fig. 29. *P*-Parallelisation of SISO subsystem *E*(*z*) to *P* × *P* MIMO system **E**(*z*<sub>d</sub>)

Fig. 30. Identity for elimination of *P*-fold interfractional PS-SP cascades

Fig. 31. Parallelisation of unit delay (a) and *M*-fold down-sampler (b) with zero time offset (*p* = 0)

#### **4.2 Parallelisation of SBC-FDFMUX filter bank**

Subsequently, we present the parallelisation of the high rate FDMUX front-end section of the versatile tree-structured SBC-FDFMUX FB for flexible channel and bandwidth allocation [Abdulazim & Göckler (2005); Abdulazim et al. (2007)]. The first three hierarchically cascaded stages of the FDMUX are shown in Fig. 32 in block diagram form applying BP. In each stage, *ν* = 1, 2, 3, the respective input spectrum is split into two subbands of equal bandwidth in conjunction with decimation by two. For convenience of presentation, all DF have identical coefficients and, in contrast to Section 3, are assumed as critically sampling 2-channel DFT PP FB with zero frequency offset (cf. [Abdulazim et al. (2007)]). The branch filter transfer functions *H<sub>λ</sub>*(*z<sub>ν</sub>*), *λ* = 0, 1, represent the two PP components of the prototype filter [Göckler & Groth (2004); Vaidyanathan (1993)] where, by setting *z<sub>ν</sub>* := e<sup>jΩ<sup>(ν)</sup></sup> with Ω<sup>(ν)</sup> = 2π*f*/*f<sub>ν</sub>* and *ν* = 1, 2, 3, the respective frequency responses *H<sub>λ</sub>*(e<sup>jΩ<sup>(ν)</sup></sup>) are obtained, which are related to the operational sampling rate *f<sub>ν</sub>* of stage *ν*. The respective DF lowpass


system control:

Fig. 30. Identity for elimination of *P*-fold interfractional PS-SP cascades

Fig. 31. Parallelisation of unit delay (a) and *M*-fold down-sampler (b) with zero time offset (*p* = 0)

identities [Groth (2003)], the contiguous *PM*-fold down-samplers of the SP demultiplexer have a relative time offset of *M*.
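The two parallelised elements of Fig. 31 can be checked numerically with a minimal sketch. It assumes a block (list-based) polyphase model rather than the time-interleaved SBSP timing of the text, so the fractional shimming delay shows up only as one low-rate delay on the component that wraps around. All function names (`sp_demux`, `ps_mux`, `parallel_unit_delay`, `parallel_downsample`) are hypothetical and chosen for this illustration.

```python
def sp_demux(x, P):
    """SP interface: split x into P polyphase components x_p[m] = x[m*P + p]."""
    if len(x) % P:
        raise ValueError("len(x) must be a multiple of P")
    return [x[p::P] for p in range(P)]


def ps_mux(branches):
    """PS interface: interleave P low-rate branches into one high-rate sequence."""
    P = len(branches)
    y = [0] * (P * len(branches[0]))
    for p, branch in enumerate(branches):
        y[p::P] = branch
    return y


def parallel_unit_delay(branches):
    """Unit delay of the high-rate signal, carried out on its P components.

    Output component p is input component p-1; the component that wraps
    around (P-1 -> 0) additionally crosses a block boundary, i.e. it is
    delayed by one low-rate sample.
    """
    wrapped = [0] + branches[-1][:-1]
    return [wrapped] + [list(b) for b in branches[:-1]]


def parallel_downsample(x, P, M):
    """M-fold down-sampler with P output components.

    Component q is a P*M-fold down-sampler with time offset M*q, so that
    neighbouring down-samplers are M samples apart.
    """
    return [x[M * q::P * M] for q in range(P)]


P, M = 4, 2
x = list(range(1, 17))  # arbitrary test signal, 16 samples

delayed = ps_mux(parallel_unit_delay(sp_demux(x, P)))
decimated = ps_mux(parallel_downsample(x, P, M))

print(delayed == [0] + x[:-1])  # compares parallel and serial unit delay
print(decimated == x[::M])      # compares parallel and serial down-sampler
```

The cyclic shift in `parallel_unit_delay` corresponds to the permutation matrix above, and the offsets `M * q` in `parallel_downsample` correspond to the relative time offset of *M* between contiguous *PM*-fold down-samplers.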

#### **4.2 Parallelisation of SBC-FDFMUX filter bank**

Subsequently, we deploy the parallelisation of the high rate FDMUX front-end section of the versatile tree-structured SBC-FDFMUX FB for flexible channel and bandwidth allocation [Abdulazim & Göckler (2005); Abdulazim et al. (2007)]. The first three hierarchically cascaded stages of the FDMUX are shown in Fig. 32 in block diagram form applying BP. In each stage, *ν* = 1, 2, 3, the respective input spectrum is split into two subbands of equal bandwidth in conjunction with decimation by two. For convenience of presentation, all DF have identical coefficients and, in contrast to Section 3, are assumed as critically sampling 2-channel DFT PP FB with zero frequency offset (cf. [Abdulazim et al. (2007)]). The branch filter transfer functions *H*<sub>*λ*</sub>(*z*<sub>*ν*</sub>), *λ* = 0, 1, represent the two PP components of the prototype filter [Göckler & Groth (2004); Vaidyanathan (1993)] where, by setting *z*<sub>*ν*</sub> := *e*<sup>jΩ<sup>(*ν*)</sup></sup> with Ω<sup>(*ν*)</sup> = 2*πf*/*f*<sub>*ν*</sub> and *ν* = 1, 2, 3, the respective frequency responses *H*<sub>*λ*</sub>(*e*<sup>jΩ<sup>(*ν*)</sup></sup>) are obtained, which are related to the operational sampling rate *f*<sub>*ν*</sub> of stage *ν*. The respective DF lowpass and highpass filter transfer functions of stage *ν*, related to the original sampling rate 2*f*<sub>*ν*</sub>, are generated by the two branch filter transfer functions *H*<sub>*λ*</sub>(*z*<sub>*ν*</sub>), *λ* = 0, 1, in combination with the simple "butterfly" across the output ports of each DF: Summation produces the lowpass, subtraction the complementary highpass filter transfer function [Bellanger (1989); Kammeyer & Kroschel (2002); Mitra (1998); Schüssler (2008); Vaidyanathan (1993)].

Fig. 32. FDMUX front end of SBC-FDFMUX filter bank according to [Abdulazim et al. (2007)]; *z*<sub>*ν*</sub> := *e*<sup>jΩ<sup>(*ν*)</sup></sup>, Ω<sup>(*ν*)</sup> = 2*πf*/*f*<sub>*ν*</sub>, *ν* = 0, 1, 2, 3, *f*<sub>3</sub> = *f*<sub>d</sub> = *f*<sub>n</sub>/8
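The role of the two PP branch filters and the butterfly can be illustrated with a small sketch. It assumes a real FIR prototype with arbitrary integer coefficients (not a designed halfband filter) and compares the low-rate polyphase structure with direct filtering by *H*(*z*) and *H*(−*z*) followed by decimation by two. The function names are hypothetical.

```python
def fir(h, x):
    """Causal FIR filter; the output is truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def direct_split(h, x):
    """Reference: lowpass h and highpass (-1)^k h[k], each followed by decimation by two."""
    g = [(-1) ** k * c for k, c in enumerate(h)]
    return fir(h, x)[::2], fir(g, x)[::2]


def polyphase_split(h, x):
    """Same band split, computed at the low rate with PP branch filters and a butterfly."""
    h0, h1 = h[0::2], h[1::2]        # the two PP components of the prototype
    x_even = x[0::2]                 # upper branch: x[2m]
    x_odd = [0] + x[1::2][:-1]       # lower branch: unit delay, then decimation -> x[2m-1]
    u0, u1 = fir(h0, x_even), fir(h1, x_odd)
    low = [a + b for a, b in zip(u0, u1)]    # butterfly: summation
    high = [a - b for a, b in zip(u0, u1)]   # butterfly: subtraction
    return low, high


h = [1, 3, -2, 5, 4, -1]                         # illustrative coefficients only
x = [2, -1, 4, 0, 3, 7, -5, 1, 6, -2, 8, 3]      # arbitrary test signal

print(polyphase_split(h, x) == direct_split(h, x))
```

Integer data are used so that the comparison is exact; with floating-point coefficients a tolerance would be needed.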

Assuming, for instance, a high-end input sampling frequency of *f*<sub>n</sub> = *f*<sub>0</sub> = 2.4 GHz [Kopmann et al. (2003); Maufroid et al. (2003)], the operational clock rate of the third stage is *f*<sub>3</sub> = *f*<sub>n</sub>/2<sup>3</sup> = 300 MHz, which is deemed feasible using present-day CMOS technology. Hence, front-end parallelisation has to reduce the operational clock of all subsystems preceding the third stage down to *f*<sub>d</sub> = *f*<sub>3</sub> = 300 MHz. This is achieved by 8-fold parallelisation of the input branching and blocking (delay *z*<sub>0</sub><sup>−1</sup>), 4-fold parallelisation of the first stage of the FDMUX tree (comprising input decimation by two, the PP branch filters *H*<sub>*λ*</sub>(*z*<sub>1</sub>), *λ* = 0, 1, and the butterfly) and of the input branching and blocking (delay *z*<sub>1</sub><sup>−1</sup>) of the second stage and, finally, corresponding 2-fold parallelisation of the two parallel 2-channel FDMUX FB of the second stage of the tree, as indicated in Fig. 32.
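The clock rates and degrees of parallelisation quoted above follow from simple arithmetic, sketched here under the assumption that every stage decimates by two and that the target clock equals the third-stage rate. The variable names are hypothetical.

```python
F_N_HZ = 2_400_000_000   # input sampling rate f_n = f_0 from the text
STAGES = 3               # number of cascaded 2-channel stages considered

# Operational sampling rate f_nu of stage nu (nu = 0 is the input).
stage_rate = {nu: F_N_HZ // 2 ** nu for nu in range(STAGES + 1)}

# Target clock f_d and degree of parallelisation P_nu = f_nu / f_d.
f_d = stage_rate[STAGES]
degree = {nu: rate // f_d for nu, rate in stage_rate.items()}

for nu in stage_rate:
    print(f"nu={nu}: f_nu = {stage_rate[nu] / 1e6:.0f} MHz, P_nu = {degree[nu]}")
```

The resulting degrees agree with the values *P*<sub>0</sub> = 8, *P*<sub>1</sub> = 4, *P*<sub>2</sub> = 2 and *P*<sub>3</sub> = 1 used in this section.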

The result of parallelisation, as required above, is shown in Fig. 33, where all interfractional interfaces have been removed by straightforward application of the identity of Fig. 30. Subsequently, the parallelisation of the elementary subsystems is explained in detail:

1. *Down-Sampling by M* = 2: In compliance with Fig. 31(b), each 2-fold down-sampler is replaced with *P*<sub>*ν*</sub> units in parallel for 2*P*<sub>*ν*</sub>-fold down-sampling with even time offset 2*p*, where *p* = 0, 1, 2, 3 applies to the first tree stage (*P*<sub>1</sub> = 4), and *p* = 0, 1 to the second stage (*P*<sub>2</sub> = 2). The result of 4-fold parallelisation of the front end input down-sampler of the upper branch (*ν* = 1, *λ* = 0) is readily visible in Fig. 33 preceding the MIMO filter block **H**<sup>1</sup><sub>0</sub>(*z*<sub>d</sub>): In fact, it represents an 8-to-4 parallelisation, where all odd PP components are removed according to Fig. 31(b) [Groth (2003)].

2. *Cascade of unit blocking delay and 2-fold down-sampler*: For proper explanation, we first focus on the input section of the first tree stage, lower branch (*ν* = *λ* = 1), in front of filter block *H*<sub>1</sub>(*z*<sub>1</sub>). To this end, as required by Fig. 32, the unit delay *z*<sub>0</sub><sup>−1</sup> is parallelised by *P*<sub>0</sub> = 8, as shown in Fig. 31(a), while the subsequent down-sampler applies *P*<sub>1</sub> = 4, as described above w.r.t. Fig. 31(b). Immediate cascading of the parallelised unit delay (*P*<sub>0</sub> = 8) and down-sampling (*P*<sub>1</sub> = 4, *M* = 2) (as induced by Fig. 31) shows that only those four PP components of the parallelised delay with *even* time offset (*p* = 0, 2, 4, 6) are transferred via the 4-branch SP-input interface of down-sampling (2*P*<sub>1</sub> = 8) to its PS-output interface with naturally ordered time offsets *p* = 0, 1, 2, 3 w.r.t. *P*<sub>1</sub> = 4. Hence, the retained 4 out of 8 PP components are those of odd time index *p* = 7, 1, 3, 5 at the unit delay's SP-input interface: delayed by *z*<sub>0</sub><sup>−1</sup> = *z*<sub>d</sub><sup>−1/8</sup>, they are transferred (mapped) to the *P*<sub>1</sub> = 4 up-samplers with timing offsets *p* = 0, 1, 2, 3 of the 4-branch PS-output interface of the down-sampler. Fig. 33 shows the correspondingly rearranged signal flow graph representation of the stage 1 input section (*ν* = *λ* = 1).

Fig. 33. Complete parallelisation of FDMUX front-end of SBC-FDFMUX filter bank (Fig. 32); *z*<sub>d</sub> := *e*<sup>jΩ<sup>(d)</sup></sup>, Ω<sup>(d)</sup> = 2*πf*/*f*<sub>d</sub>, *f*<sub>d</sub> = *f*<sub>n</sub>/8

As a result, the upper branch of stage 1, *H*<sub>0</sub>(*z*<sub>1</sub>) → **H**<sup>1</sup><sub>0</sub>(*z*<sub>d</sub>), is fed by the even-indexed PP components of the high rate FDMUX input signal, whereas the lower branch, *H*<sub>1</sub>(*z*<sub>1</sub>) → **H**<sup>1</sup><sub>1</sub>(*z*<sub>d</sub>), is provided with the delayed versions of the PP components of odd index, as depicted in Fig. 33. Hence, as in the original system of Fig. 32, the input sequence is completely fed into the parallelised system.

This procedure is repeated with the input branching and blocking sections of the subsequent stages *ν* = 2, 3: The PP branch filters *H*<sub>0</sub>(*z*<sub>*ν*</sub>) → **H**<sup>*ν*</sup><sub>0</sub>(*z*<sub>d</sub>), parallelised by *P*<sub>*ν*</sub>, where *P*<sub>2</sub> = 2 and *P*<sub>3</sub> = 1 (*P*<sub>1</sub> = 4), are provided with the even-numbered PP components of the respective input signals with timing offsets in natural order. In contrast, the set of PP components of odd index is always delayed by *z*<sub>d</sub><sup>−1/*P*<sub>*ν*−1</sub></sup> and fed into the filter blocks *H*<sub>1</sub>(*z*<sub>*ν*</sub>) → **H**<sup>*ν*</sup><sub>1</sub>(*z*<sub>d</sub>) in crossed manner (cf. input section *λ* = 1).

3. *P<sub>ν</sub>-fold parallelisation of PP branch filters* *H*<sub>*λ*</sub>(*z*<sub>*ν*</sub>) → **H**<sup>*ν*</sup><sub>*λ*</sub>(*z*<sub>d</sub>), *λ* = 0, 1; *ν* = 1, 2, is achieved by systematic application of the procedure condensed in Fig. 29 (for details cf. [Göckler & Groth (2004); Groth (2003)]). To this end, *H*<sub>*λ*</sub>(*z*<sub>*ν*</sub>) is decomposed into *P*<sub>*ν*</sub> PP components of correspondingly reduced order, which are arranged to a MIMO system by exploiting a multitude of multirate identities [Groth (2003); Groth & Göckler (2001)]. The resulting *P*<sub>*ν*</sub> × *P*<sub>*ν*</sub> MIMO filter transfer matrix **H**<sup>*ν*</sup><sub>*λ*</sub>(*z*<sub>d</sub>) contains each PP component of *H*<sub>*λ*</sub>(*z*<sub>*ν*</sub>) *P*<sub>*ν*</sub> times: Thus, the amount of hardware is increased *P*<sub>*ν*</sub> times whereas, as desired for feasibility, the operational clock rate is concurrently reduced by *P*<sub>*ν*</sub>. Hence, the overall expenditure, i.e. the number of operations times the respective operational clock rate [Göckler & Groth (2004)], is not changed.

4. *Parallelisation of butterflies* combining the output signals of associated PP filter blocks is straightforward: For each (time-interleaved) PP component of the respective signals, a butterfly has to be provided, as shown in Fig. 33.
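The mapping of PP components described in steps 1 and 2 can be verified with a short sketch. It assumes a block (list-based) model of the SP interface, in which the fractional delay *z*<sub>d</sub><sup>−1/8</sup> appears as a full low-rate delay only on the wrapped component *p* = 7. The helper `sp_demux` and all variable names are hypothetical.

```python
def sp_demux(x, P):
    """SP interface: polyphase components x_p[m] = x[m*P + p], p = 0..P-1."""
    return [x[p::P] for p in range(P)]


P0, P1, M = 8, 4, 2
x = list(range(100, 100 + 5 * P0))     # arbitrary test samples (5 blocks of 8)
phases = sp_demux(x, P0)               # the 8 PP components of the input

# Reference at the high rate (Fig. 32): branching, blocking delay, 2-fold down-sampling.
upper_ref = sp_demux(x[::M], P1)                   # upper branch, no delay
lower_ref = sp_demux(([0] + x[:-1])[::M], P1)      # lower branch, unit delay first

# Parallelised form (Fig. 33): select PP components of the 8-phase input directly.
upper_par = [phases[p] for p in (0, 2, 4, 6)]      # even-indexed components
lower_par = [[0] + phases[7][:-1]] + [phases[p] for p in (1, 3, 5)]  # odd-indexed

print(upper_par == upper_ref)
print(lower_par == lower_ref)
```

The index order 7, 1, 3, 5 of the lower branch is the one stated in step 2; the upper branch simply discards the odd components, as in step 1.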

#### **4.3 Conclusion: Parallelisation of multirate systems**

In this Section 4, a general and systematic procedure for the parallelisation of multirate systems, for instance as investigated in Sections 2 and 3, has been presented. Its application to the high rate decimating FDMUX front end of the tree-structured SBC-FDFMUX FB [Abdulazim & Göckler (2005); Abdulazim et al. (2007)] has been shown in detail. The degree of parallelisation *P*<sub>*ν*</sub> of stage *ν*, *ν* = 0, 1, 2, 3, is diminished proportionally to the operational clock frequency *f*<sub>*ν*</sub> of stage *ν* and is, thus, adapted to the actual sampling rate. As a result, after suitable decomposition of the high rate front end input signal by an input commutator into *P*<sub>0</sub> = *P*<sub>max</sub> polyphase components (as depicted for *P*<sub>max</sub> = 8 in Fig. 33), all subsequent processing units are likewise operated at the same operational clock rate *f*<sub>d</sub> = *f*<sub>n</sub>/*P*<sub>0</sub> = *f*<sub>0</sub>/*P*<sub>0</sub>. Since the inherent parallelism of the original tree-structured FDMUX (Fig. 32) has attained *P*<sub>max</sub> = 8 in the third stage, and the output signals of this stage represent the desired eight demultiplexed FDM subsignals, interleaving PS-output commutators are no longer required, as can be seen in Fig. 33. Finally, it should be noted that parallelisation does not change the overall expenditure; yet, by multiplying the stage *ν* hardware by *P*<sub>*ν*</sub>, the operational clock rates are reduced by a factor of *P*<sub>*ν*</sub> to a feasible order of magnitude, as desired.
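The statement on unchanged expenditure can be made concrete for a single FIR filter. The following sketch assumes a standard block-polyphase parallelisation, which is not necessarily identical to the SFG-based procedure of [Groth (2003)]: the *P* × *P* MIMO system uses each PP component of the filter *P* times, so the multiplier count grows by *P* while each multiplier runs at a *P*-fold lower clock. Function names and coefficient values are hypothetical.

```python
def fir(h, x):
    """Causal FIR filter; the output is truncated to len(x) samples."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def parallel_fir(h, x, P):
    """P-fold block parallelisation of fir(h, x) as a P x P MIMO system."""
    comps = [h[r::P] for r in range(P)]   # PP components of h
    xin = [x[p::P] for p in range(P)]     # SP interface (len(x) multiple of P)
    y = [0] * len(x)
    for q in range(P):                    # output component q
        acc = [0] * len(xin[0])
        for p in range(P):                # input component p
            r = (q - p) % P               # PP component used in entry (q, p)
            # components that wrap around a block boundary need one low-rate delay
            src = xin[p] if p <= q else [0] + xin[p][:-1]
            acc = [a + b for a, b in zip(acc, fir(comps[r], src))]
        y[q::P] = acc                     # PS interface
    return y


def multiplier_count(h, P):
    """Number of coefficient multipliers in the P x P MIMO system."""
    return sum(len(h[(q - p) % P::P]) for q in range(P) for p in range(P))


h = [1, 3, -2, 5, 4, -1]                  # illustrative coefficients only
x = [2, -1, 4, 0, 3, 7, -5, 1, 6, -2, 8, 3, 0, 5, -4, 9]
P = 4

print(parallel_fir(h, x, P) == fir(h, x))
print(multiplier_count(h, P), "multipliers at 1/P of the clock vs.", len(h), "at full clock")
```

With `P * len(h)` multipliers clocked at *f*/*P*, the product of operations and clock rate equals that of the original filter, `len(h)` multipliers at *f*.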

Applying the rules of multirate transposition (cf. Subsection 3.3.1 or [Göckler & Groth (2004)]) to the parallelised FDMUX front end, the high rate interpolating back end of the tree-structured SBC-FDFMUX FB is obtained likewise and exhibits the same properties as to expenditure and feasibility [Groth (2003)]. Hence, the versatile and efficient tree-structured filter bank (FDMUX, FMUX, SBC, wavelet, or any combination thereof) can be used in any (ultra) wide-band application without any restriction.

#### **5. Summary and conclusion**

In Section 2 we have introduced and investigated a special class of real and complex FIR and IIR halfband bandpass filters with the particular set of centre frequencies defined by (1). As a result of the constraint (1), almost all filter coefficients are either real-valued or purely imaginary-valued, as opposed to fully complex-valued coefficients. Hence, this class of halfband filters requires only a small amount of computation.

In Section 3, two different options to combine two of the above FIR halfband filters with different centre frequencies to form a directional filter (DF) have been investigated. As a result, one of these DF approaches is optimum w.r.t. computation (most efficient), whereas the other requires the least number of delay elements (minimum McMillan degree). The relation between separating DF and DF that combine two independent signals to an FDM signal via multirate transposition rules has been shown extensively.

Finally, in Section 4, the above FIR directional filters (DF) have been combined to tree-structured multiplexing and demultiplexing filter banks. While this procedure is straightforward, the operating clock rates within the front- or back-ends may be too high for implementation. To this end, we have introduced and described to some extent the systematic graphically induced procedure to parallelise multirate systems according to [Groth (2003)]. It has been applied to a three-stage demultiplexing tree-structured filter bank in such a manner that all operations throughout the overall system are performed at the operational output clock. As a result, parallelisation makes the system feasible but retains the computational load.

## **6. References**

Abdulazim, M. N. & Göckler, H. G. (2007). Tree-structured MIMO FIR filter banks for flexible frequency reallocation, *Proc. of the 5th Int. Symposium on Image and Signal Processing and Analysis (ISPA 2007)*, Istanbul, Turkey, pp. 69–74.

Abdulazim, M. N. & Göckler, H. G. (2005). Efficient digital on-board de- and remultiplexing of FDM signals allowing for flexible bandwidth allocation, *Proc. Int. Comm. Satellite Systems Conf.*, Rome, Italy.

Abdulazim, M. N., Kurbiel, T. & Göckler, H. G. (2007). Modified DFT SBC-FDFMUX filter bank systems for flexible frequency reallocation, *Proc. EUSIPCO'07*, Poznan, Poland, pp. 60–64.

Ansari, R. (1985). Elliptic filter design for a class of generalized halfband filters, *IEEE Trans. Acoust., Speech, Sign. Proc.* ASSP-33(4): 1146–1150.

Ansari, R. & Liu, B. (1983). Efficient sampling rate alteration using recursive IIR digital filters, *IEEE Trans. Acoustics, Speech, and Signal Processing* ASSP-31(6): 1366–1373.

Arbesser-Rastburg, B., Bellini, R., Coromina, F., Gaudenzi, R. D., del Rio, O., Hollreiser, M., Rinaldo, R., Rinous, P. & Roederer, A. (2002). R&D directions for next generation broadband multimedia systems: An ESA perspective, *Proc. Int. Comm. Satellite Systems Conf.*, Montreal, Canada.

Bellanger, M. (1989). *Digital Processing of Signals - Theory and Practice*, 2nd edn, John Wiley & Sons, New York.

Bellanger, M. G., Daguet, J. L. & Lepagnol, G. P. (1974). Interpolation, extrapolation, and reduction of computation speed in digital filters, *IEEE Trans. Acoust., Speech, and Sign. Process.* ASSP-22(4): 231–235.

Damjanovic, S. & Milic, L. (2005). Examples of orthonormal wavelet transform implemented with IIR filter pairs, *Proc. SMMSP*, ICSP Series No.30, Riga, Latvia, pp. 19–27.

Damjanovic, S., Milic, L. & Saramäki, T. (2005). Frequency transformations in two-band wavelet IIR filter banks, *Proc. EUROCON*, Belgrade, Serbia and Montenegro, pp. 87–90.

Danesfahani, G. R., Jeans, T. G. & Evans, B. G. (1994). Low-delay distortion recursive (IIR) transmultiplexer, *Electron. Lett.* 30(7): 542–543.

Eghbali, A., Johansson, H., Löwenborg, P. & Göckler, H. G. (2009). Dynamic frequency-band reallocation and allocation: From satellite-based communication systems to cognitive radios, *Journal of Signal Processing Systems* (10.1007/s11265-009-0348-1, Springer NY).

Evangelista, G. (2001). *Zum Entwurf digitaler Systeme zur asynchronen Abtastratenumsetzung*, PhD thesis, Ruhr-Universität Bochum, Bochum, Germany.

Evangelista, G. (2002). Design of optimum high-order finite-wordlength digital FIR filters with linear phase, *EURASIP Signal Processing* 82(2): 187–194.

Fliege, N. (1993). *Multiraten-Signalverarbeitung: Theorie und Anwendungen*, B. G. Teubner, Stuttgart.

Gazsi, L. (1986). Quasi-bireciprocal and multirate wave digital lattice filters, *Frequenz* 40(11/12): 289–296.

Göckler, H. G. (1996a). Digitale Filterweiche. German patent P 19 627 784.

Göckler, H. G. & Alfsmann, D. (2010). Efficient linear-phase directional filters with selectable centre frequencies, *Proc. 1st Int. Conf. Green Circuits and Systems (ICGCS 2010)*,

Göckler, H. G. & Damjanovic, S. (2006a). Efficient implementation of real and complex linear-phase FIR and minimum-phase IIR halfband filters for sample rate alteration,

Göckler, H. G. & Damjanovic, S. (2006b). A family of efficient complex halfband filters, *Proc. 4th Karlsruhe Workshop on Software Radios*, Karlsruhe, Germany, pp. 79–88.

Göckler, H. G. & Felbecker, B. (2001). Digital on-board FDM-demultiplexing without restrictions on channel allocation and bandwidth, *Proc. 7th Int. Workshop on Dig. Sign.*

Göckler, H. G. & Groth, A. (2004). *Multiratensysteme: Abtastratenumsetzung und digitale Filterbänke*, J. Schlembach Fachverlag, Wilburgstetten, Germany, ISBN 3-935340-29-X

Göckler, H. G., Groth, A. & Abdulazim, M. N. (2006). Parallelisation of digital signal processing in uniform and reconfigurable filter banks for satellite communications, *Proc. IEEE Asia Pacific Conf. Circuits and Systems (APCCAS 2006)*, Singapore,

Göckler, H. G. & Grotz, K. (1994). DIAMANT: All digital frequency division multiplexing for 10 Gbit/s fibre-optic CATV distribution system, *Proc. EUSIPCO'94*, Edinburgh, UK,

Göckler, H. G. & Eyssele, H. (1992). Study of on-board digital FDM-demultiplexing for mobile SCPC satellite communications (Part I & II), *Europ. Trans. Telecommunic.* ETT-3: 7–30.

Gold, B. & Rader, C. M. (1969). *Digital Processing of Signals*, McGraw-Hill, New York.

Groth, A. (2003). *Effiziente Parallelisierung digitaler Systeme mittels äquivalenter Signalflussgraph-Transformationen*, PhD thesis, Ruhr-Universität Bochum, Bochum,

Groth, A. & Göckler, H. G. (2001). Signal-flow-graph identities for structural transformations in multirate systems, *Proc. Europ. Conf. Circuit Theory Design*, Vol. II, Espoo, Finland,

Johansson, H. & Löwenborg, P. (2005). Flexible frequency-band reallocation networks based on variable oversampled complex-modulated filter banks, *Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing*, Philadelphia, USA.

Kammeyer, K. D. & Kroschel, K. (2002). *Digitale Signalverarbeitung*, Teubner, Stuttgart.

Kollar, I., Pintelon, R. & Schoukens, J. (1990). Optimal FIR and IIR Hilbert Transformer

Kopmann, H., Göckler, H. G. & Abdulazim, M. N. (2003). Analogue-to-digital conversion

Kumar, B., Roy, S. C. D. & Sabharwal, S. (1994). Interrelations between the coefficients of FIR digital differentiators and other FIR filters and a versatile multifunction configuration, *EURASIP Signal Processing* 39(1/2): 247–262.

Lutovac, M. D. & Milic, L. D. (1997). Design of computationally efficient elliptic IIR filters with a reduced number of shift-and-add operations in multipliers, *IEEE Trans. Sign. Process.* 45(10): 2422–2430.

Lutovac, M. D. & Milic, L. D. (2000). Approximate linear phase multiplierless IIR halfband filter, *IEEE Trans. Sign. Process. Lett.* 7(3): 52–53.

Lutovac, M. D., Tosic, D. V. & Evans, B. L. (2001). *Filter Design for Signal Processing Using MATLAB and Mathematica*, Prentice Hall, NJ.

Man, E. D. & Kleine, U. (1988). Linear phase decimation and interpolation filters for high-speed application, *Electron. Lett.* 24(12): 757–759.

Maufroid, X., Coromina, F., Folio, B., Hughes, R., Couchman, A., Stirland, S. & Joly, F. (2004). Next generation of transparent processors for broadband satellite access networks, *Proc. Int. Comm. Satellite Systems Conf.*, Monterey, USA.

Maufroid, X., Coromina, F., Folio, B.-M., Göckler, H. G., Kopmann, H. & Abdulazim, M. N. (2003). High throughput bent-pipe processor for future broadband satellite access networks, *Proc. 8th Int. Workshop on Signal Processing for Space Commun.*, Catania, Italy, pp. 259–275.

McClellan, H. J., Parks, T. W. & Rabiner, L. R. (1973). A computer program for designing optimum FIR linear phase digital filters, *IEEE Trans. Audio and Electroacoustics* AU(21): 506–526.

Meerkötter, K. & Ochs, K. (1998). A new digital equalizer based on complex signal processing, *in* Z. Ghassemlooy & R. Saatchi (eds), *Proc. CSDSP98*, Vol. 1, pp. 113–116.

Milic, L. (2009). *Multirate Filtering for Digital Signal Processing*, Information Science Reference, Hershey, NY, ISBN 978-1-60566-178-0.

Mintzer, F. (1982). On half-band, third-band, and Nth-band FIR-filters and their design, *IEEE Trans. Acoustics, Speech, and Signal Processing* ASSP-30(5): 734–738.

Mitra, S. K. (1998). *Digital Signal Processing: A Computer Based Approach*, McGraw-Hill, New York.

Mitra, S. K. & Kaiser, J. F. (eds) (1993). *Handbook for Digital Signal Processing*, John Wiley & Sons, New York.

Oppenheim, A. V. & Schafer, R. W. (1989). *Discrete-Time Signal Processing*, Signal Processing Series, Prentice Hall, NJ.

Parks, T. W. & Burrus, C. S. (1987). *Digital Filter Design*, John Wiley & Sons, New York.

Regalia, P. A., Mitra, S. K. & Vaidyanathan, P. P. (1988). The digital all-pass filter: A versatile signal processing building block, *Proc. of the IEEE* 76(1): 19–37.

Renfors, M. & Kupianen, T. (1998). Versatile building blocks for multirate processing of bandpass signals, *Proc. EUSIPCO '98*, Rhodos, Greece, pp. 273–276.

Rio-Herrero, O. & Maufroid, X. (2003). A new ultra-fast burst switched processor architecture for meshed satellite networks, *Proc. 8th Int. Workshop on Signal Processing for Space Commun.*, Catania, Italy.

Schüssler, H. W. (2008). *Digitale Signalverarbeitung 1: Analyse diskreter Signale und Systeme*, 5th edn, Springer, Heidelberg.

Schüssler, H. W. & Steffen, P. (1998). Halfband filters and Hilbert Transformers, *Circuits Systems Signal Processing* 17(2): 137–164.

Schüssler, H. W. & Steffen, P. (2001). Recursive halfband-filters, *AEÜ* 55(6): 377–388.

design via LS and minimax fitting, *IEEE Trans. Instrumentation and Measurement*

and flexible FDM demultiplexing algorithms for digital on-board processing of ultra-wideband FDM signals, *Proc. 8th Int. Workshop on Signal Processing for Space*

Göckler, H. G. (1996b). Nichtrekursives Halb-Band-Filter. German patent P 19 627 787. Göckler, H. G. (1996c). Umschaltbare Frequenzweiche. German patent P 19 627 788.

Göckler, H. G. (1996a). Digitale Filterweiche. German patent P 19 627 784.

*Proc. Techn. for Space Communications*, Sesimbra, Portugal.

(Chinese Edition: ISBN 978-7-121-08464-5).

40(11/12): 289–296.

Shanghai, China, pp. 293–298.

*Frequenz* 60(9/10): 176–185.

pp. 1061–1064.

pp. 999–1002.

Germany.

pp. 305–308.

39(6): 847–852.

*Commun.*, Catania, Italy, pp. 277–292.


**13**

**Applications of Interval-Based Simulations to the Analysis and Design of Digital LTI Systems**

Juan A. López<sup>1</sup>, Enrique Sedano<sup>1</sup>, Luis Esteban<sup>2</sup>, Gabriel Caffarena<sup>3</sup>, Angel Fernández-Herrero<sup>1</sup> and Carlos Carreras<sup>1</sup>
*<sup>1</sup>Departamento de Ingeniería Electrónica, Universidad Politécnica de Madrid, <sup>2</sup>Laboratorio Nacional de Fusión, Centro de Investigaciones Energéticas Medioambientales y Tecnológicas (CIEMAT), <sup>3</sup>Departamento de Ingeniería de Sistemas de Información y de Telecomunicación, Universidad CEU-San Pablo, Spain*

**1. Introduction**

As the complexity of digital systems increases, the existing simulation-based quantization approaches soon become unaffordable due to the exceedingly long simulation times. Thus, it is necessary to develop optimized strategies aimed at significantly reducing the computation times required by the algorithms to find a valid solution (Clark et al., 2005; Hill, 2006). In this sense, interval-based computations are particularly well suited to reducing the number of simulations required to quantize a digital system, since they are capable of evaluating a large number of numerical samples in a single interval-based simulation (Caffarena et al., 2009, 2010; López, 2004; López et al., 2007, 2008).

This chapter presents a review of the most common interval-based computation techniques, as well as some experiments that show their application to the analysis and design of digital Linear Time Invariant (LTI) systems. One of the main features of these computations is that they are capable of significantly reducing the number of simulations needed to characterize a digital system, at the expense of some additional complexity in the processing of each operation. On the other hand, one of the most important problems associated with these computations is interval oversizing (i.e., the computed bounds of the intervals are wider than required), so new descriptions and methods are continuously being proposed. In this sense, each description has its own features and drawbacks, making it suitable for a different type of processing.

The chapter is structured as follows: Section 2 presents a general review of the main interval-based computation methods that have been proposed in the literature to perform fast evaluation of system descriptions. For each technique, the representation of the different types of computing elements is given, as well as the main advantages and disadvantages of each approach. Section 3 presents three groups of interval-based experiments: (i) a comparison of the results provided by two different interval-based approaches to show the main problem
