### 2.1 Cellular neural network technology

The cellular neural network (CNN) was introduced by Chua and Yang at the University of California, Berkeley (USA) in 1988; it combines analog spatiotemporal dynamics with logic [1–3]. The CNN paradigm is a natural framework for describing the behavior of locally interconnected dynamic systems with an array structure, so it is very useful for solving partial differential equations (PDEs) [3–7]. Today, visual microprocessors based on this type of processing can reach TeraOPs computing power at approximately 50,000 fps. The possibility of developing algorithms and programs based on CNN was quickly exploited worldwide. To date, CNN models have been developed for image processing, solving PDEs, pattern recognition, gene analysis, and other tasks. Depending on the problem, the designer can build a CNN chip with millions of cells. The common CNN architectures are 1D, 2D, and 3D.

The standard 2D CNN is a dynamic system of autonomous cells, each connected locally to its neighbors, forming a two-dimensional array [2, 18]. Each cell C(i,j) in the array contains one independent voltage source, one independent current source, a linear capacitor, resistors, and linear voltage-controlled current sources, which are coupled to its neighbor cells via the controlling input voltage and the feedback from the output voltage of each neighbor cell C(k,l). The templates A(i,j;k,l) and B(i,j;k,l) are the parameters linking cell C(i,j) to its neighbor C(k,l). The effective range Sr(i,j) of radius r of cell C(i,j) is the set of neighbor cells that satisfy (Figure 1):

$$S_r(i,j) = \{\, C(k,l) \mid \max\{|k-i|,\ |l-j|\} \le r \,\}, \quad 1 \le k \le M,\ 1 \le l \le N.$$
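The neighborhood definition above can be sketched in a few lines of Python. This is only an illustration: the function name `neighborhood` and the 5 × 5 grid are assumptions made for the example, and indices are 1-based to match the text.

```python
def neighborhood(i, j, r, M, N):
    """Return the cells C(k, l) in S_r(i, j) on an M x N grid (1-based indices).

    The condition max(|k - i|, |l - j|) <= r holds exactly when both index
    differences are at most r, so two clipped ranges are enough.
    """
    return [
        (k, l)
        for k in range(max(1, i - r), min(M, i + r) + 1)
        for l in range(max(1, j - r), min(N, j + r) + 1)
    ]


interior = neighborhood(3, 3, 1, 5, 5)
corner = neighborhood(1, 1, 1, 5, 5)
print(len(interior))  # 9: the cell itself plus its 8 neighbors
print(len(corner))    # 4: the range is clipped at the array boundary
```

Note that, by the definition, the set contains the cell C(i,j) itself, and cells near the boundary have fewer neighbors than interior cells.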

The state equation of cell C(i,j) is given by the following equation:

$$C\frac{\partial x_{ij}}{\partial t} = -\frac{1}{R}x_{ij} + \sum_{C(k,l)\in S_r(i,j)} A(i,j;k,l)\,y_{kl} + \sum_{C(k,l)\in S_r(i,j)} B(i,j;k,l)\,u_{kl} + z_{ij} \tag{1}$$

Here R and C are the linear resistor and capacitor; A(i,j;k,l) is the feedback operator parameter; B(i,j;k,l) is the control parameter; and zij is the bias value of the cell C(i,j). On the CNN chip, (A, B, z) are the local connective weight values of each cell C(i,j) to its neighbors. The output yij of the cell C(i,j) is given by:

$$y_{ij} = f(x_{ij}) = \frac{1}{2}\left|x_{ij} + 1\right| - \frac{1}{2}\left|x_{ij} - 1\right| \tag{2}$$
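As a rough software illustration of the state equation and the output function, the following sketch advances a small 2D CNN by one forward-Euler step. Several things here are assumptions that do not come from the text: the output is taken as the standard saturating function f(x) = ½(|x + 1| − |x − 1|), the templates are space-invariant 3 × 3 arrays (r = 1), cells outside the array contribute zero, and the names `cnn_step`, `dt`, and the sample template values are chosen only for the example.

```python
def f(x):
    """Piecewise-linear saturating output: -1 for x <= -1, x in between, +1 for x >= 1."""
    return 0.5 * (abs(x + 1) - abs(x - 1))


def cnn_step(x, u, A, B, z, dt=0.01, R=1.0, C=1.0):
    """One forward-Euler step of the 2D state equation with 3x3 templates.

    x, u : M x N lists of states and inputs
    A, B : 3x3 feedback and control templates (assumed space-invariant)
    z    : bias, assumed identical for every cell
    Cells outside the array are treated as zero (an assumed boundary condition).
    """
    M, N = len(x), len(x[0])
    y = [[f(v) for v in row] for row in x]
    new_x = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            total = -x[i][j] / R + z
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    k, l = i + di, j + dj
                    if 0 <= k < M and 0 <= l < N:
                        total += A[di + 1][dj + 1] * y[k][l]
                        total += B[di + 1][dj + 1] * u[k][l]
            new_x[i][j] = x[i][j] + dt * total / C
    return new_x


# Illustrative templates: self-feedback only, no input coupling.
A = [[0, 0, 0], [0, 2.0, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
x0 = [[0.1] * 3 for _ in range(3)]
u0 = [[0.0] * 3 for _ in range(3)]
x1 = cnn_step(x0, u0, A, B, z=0.0)
print(x1[1][1])
```

On a CNN chip all cells evolve simultaneously in hardware; the nested loops here only emulate that behavior sequentially.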

Figure 2. CNN circuit output function.

Solving Partial Differential Equation Using FPGA Technology

DOI: http://dx.doi.org/10.5772/intechopen.84588


The characteristic of the CNN output function yij = f(xij) is shown in Figure 2.

In the 3D CNN, besides the connections to its neighbors in the same layer, each cell also has connections to the upper and lower layers in three-dimensional space [18], as shown in Figure 3.


Figure 1. The architecture of a CNN chip

the FPGA technology, users can use hardware description languages, such as Verilog and VHDL, to configure the logic elements in the FPGA to produce the electronic circuit of a CNN chip. Recent FPGA architectures (Virtex 7; Stratix 10) come with many tools that support testing, optimizing, and coordinating data exchange. The CNN designer should use FPGA for making a CNN chip.

Boundary Layer Flows - Theory, Applications and Numerical Methods



Figure 3. The 3D CNN with r = 1 (26 neighbors) in the three-dimensional coordinates x, y, z.

Thus, for radius r = 1, the cell C(i,j,k) has 26 neighbors; hence, the templates A and B carry a third index, A(i,j,k) and B(i,j,k).

The state equation of CNN 3D takes the form:

$$C\frac{\partial x_{ijk}}{\partial t} = -\frac{1}{R}x_{ijk} + \sum_{C(l,m,n)\in S_r(i,j,k)} A(i,j,k;l,m,n)\,y_{lmn} + \sum_{C(l,m,n)\in S_r(i,j,k)} B(i,j,k;l,m,n)\,u_{lmn} + z_{ijk} \tag{3}$$

The output function is similar to CNN 2D:

$$y_{ijk} = f\left(x_{ijk}\right) = \frac{1}{2}\left( |x_{ijk} + 1| - |x_{ijk} - 1| \right)$$

To solve a three-dimensional PDE, the 3D CNN must be used. The original PDE is discretized by differencing, and from the result the appropriate templates (A, B, z) of the 3D CNN are generated.
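The neighbor count quoted for the 3D CNN can be checked with a short sketch. The helper name `neighbors_3d` is hypothetical; the code simply enumerates every offset whose largest component is at most r and leaves out the cell itself.

```python
from itertools import product


def neighbors_3d(i, j, k, r=1):
    """Cells in the radius-r neighborhood of C(i, j, k), excluding the cell itself."""
    return [
        (i + di, j + dj, k + dk)
        for di, dj, dk in product(range(-r, r + 1), repeat=3)
        if (di, dj, dk) != (0, 0, 0)
    ]


print(len(neighbors_3d(0, 0, 0)))       # 26 for r = 1
print(len(neighbors_3d(0, 0, 0, r=2)))  # 124 for r = 2
```

In general an interior cell has (2r + 1)³ − 1 neighbors, so each space-invariant 3D template for r = 1 holds 27 coefficients, the 26 neighbor weights plus the cell's own.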

### 2.2 Field-programmable gate array technology

A field-programmable gate array (FPGA) is a device whose initially blank blocks provide logic gates and RAM blocks that can be configured to implement complex digital computations. FPGAs can be used to implement any logical function. Their advantages include the ability to update the functionality after shipping, partial reconfiguration of a portion of the design, and low nonrecurring engineering costs relative to an ASIC design [13–16].


A recent trend has been to take the coarse-grained architectural approach by combining the logic blocks and interconnects of traditional FPGAs with embedded chips and related peripherals to form a complete "system on a programmable chip" [17–19].

Users such as teachers and students can use an FPGA to build prototypes for testing application systems; with VHDL or Verilog, users can easily design, test, and then reconfigure the system until it gives the desired results.

### 2.3 Using FPGA to make CNN chip for solving PDE

Because the CNN architecture is not the same for every application, the designer develops a particular chip for each problem, starting from the standard model. An FPGA is well suited to this: a blank chip is configured into a CNN chip using a hardware description language such as Verilog or VHDL. To solve a PDE, one first analyzes (differences) the original system of partial differential equations to find appropriate templates, then designs the architecture of the CNN chip based on the templates found, and finally uses VHDL to configure the FPGA according to the designed hardware, producing the CNN chip.

Some PDEs have been solved using the CNN technology:

Burgers equation [3]:

$$\frac{\partial u(x,t)}{\partial t} = \frac{1}{R}\frac{\partial^2 u(x,t)}{\partial x^2} - u(x,t)\frac{\partial u(x,t)}{\partial x} + F(x,t)$$

Klein-Gordon equation [19]:

$$\frac{\partial^2 u(x,t)}{\partial t^2} = \nabla^2 u(x,t) - \sin u(x,t)$$

Heat diffusion equation [3]:

$$\frac{\partial u\left(\mathbf{x},\ \mathbf{y},t\right)}{\partial t} = c\nabla^2 u\left(\mathbf{x},\ \mathbf{y},t\right)$$

Black-Scholes equation [9]:

$$\frac{\partial V(\mathbf{x},t)}{\partial t} = rV(\mathbf{x},t) - \frac{1}{2}\sigma^2 \mathbf{S}^2 \frac{\partial^2 V(\mathbf{x},t)}{\partial \mathbf{S}^2} - \text{rS} \frac{\partial V(\mathbf{x},t)}{\partial \mathbf{S}}$$

Air pollution equation [4]:

$$\frac{\partial \varphi}{\partial t} + \operatorname{div}(\mathbf{v}\varphi) + \sigma\varphi - \gamma\frac{\partial^2 \varphi}{\partial z^2} - \mu\nabla^2\varphi = f(x,y,z)$$


Saint-Venant 2D equations [5]:


$$
\frac{\partial H}{\partial t} + \frac{\partial u}{\partial \mathbf{x}} + \frac{\partial v}{\partial \mathbf{y}} = \mathbf{0}
$$

$$
\frac{\partial u}{\partial t} + \frac{\partial u^2}{\partial \mathbf{x}} + \mathbf{g}\frac{\partial H}{\partial \mathbf{x}} + \frac{\partial uv}{\partial \mathbf{y}} = -\mathbf{g}u\frac{\left(u^2 + v^2\right)^{1/2}}{K\_x^2 H^2}
$$

$$
\frac{\partial v}{\partial t} + \frac{\partial v^2}{\partial \mathbf{y}} + \mathbf{g}\frac{\partial H}{\partial \mathbf{y}} + \frac{\partial uv}{\partial \mathbf{x}} = -\mathbf{g}v\frac{\left(u^2 + v^2\right)^{1/2}}{K\_y^2 H^2}
$$

Saint-Venant 1D equations [6]:

$$b\frac{\partial h(\mathbf{x},t)}{\partial t} + \frac{\partial Q(\mathbf{x},t)}{\partial \mathbf{x}} = q \tag{4}$$

$$\frac{\partial Q(x,t)}{\partial t} + \frac{\partial}{\partial x}\left[\frac{Q(x,t)^2}{b\,h(x,t)}\right] + gbh(x,t)\frac{\partial h(x,t)}{\partial x} - gIbh(x,t) + gJbh(x,t) = k_q q \tag{5}$$

#### Example of making a CNN chip for solving the Saint-Venant 1D equations:

• Designing the templates

First, rewrite the original equation (4):

$$b\frac{\partial h(\mathbf{x},t)}{\partial t} + \frac{\partial Q(\mathbf{x},t)}{\partial \mathbf{x}} = q$$

$$\Leftrightarrow \frac{\partial h(\mathbf{x},t)}{\partial t} = \frac{-\partial Q(\mathbf{x},t)}{b\partial \mathbf{x}} + \frac{q}{b} \tag{6}$$

Next, discretize the space variable x with step Δx on the right-hand side of (6). Differencing only the right-hand side of (6) in x, using the Taylor expansion, gives the equation for the cell at position i:

$$\frac{\partial h}{\partial t} = -\frac{1}{2b\Delta x} \left( Q\_{i+1} - Q\_{i-1} \right) + \frac{q}{b} \tag{7}$$

Note that, following the CNN algorithm, the time derivative ∂h/∂t on the left-hand side is kept as is. From (7), the templates are found:

$$A^{hQ} = \begin{bmatrix} \dfrac{1}{2b\Delta x} & \dfrac{1}{R^h} & \dfrac{-1}{2b\Delta x} \end{bmatrix};\quad B^h = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix};\quad z^h = 0;$$

where R<sup>h</sup> is the linear resistance in the cell circuit of h. Equation (5), rewritten slightly under the assumptions above, becomes:

$$\frac{\partial Q(x,t)}{\partial t} + \frac{\partial}{\partial x}\left[\frac{Q(x,t)^2}{b\,h(x,t)}\right] + gbh(x,t)\frac{\partial h(x,t)}{\partial x} - gIbh(x,t) + gJbh(x,t) = k_q q \tag{8}$$

Assume that q > 0; then kq = 0. After differencing and applying the CNN template design algorithm, one obtains the templates for (8):

$$A^{Q} = \begin{bmatrix} \dfrac{Q_{i-1}}{2b\Delta x\,h_{i-1}} & \dfrac{1}{R^{Q}} & -\dfrac{Q_{i+1}}{2b\Delta x\,h_{i+1}} \end{bmatrix};$$

$$A^{Qh} = \begin{bmatrix} \dfrac{gbh_i}{2\Delta x} & gb(I - J) & -\dfrac{gbh_i}{2\Delta x} \end{bmatrix};\quad B^{Q} = 0;\quad z^{Q} = 0;$$
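To make the role of the differenced equations concrete, the sketch below evaluates the right-hand sides of the two cell equations for one interior position i. It is a software illustration under stated assumptions rather than the chip design itself: it works directly from Eq. (7) and from Eq. (8) with k_q = 0 differenced in the same central way, and the function name `saint_venant_rhs` and the sample numbers are hypothetical.

```python
def saint_venant_rhs(h, Q, i, b, dx, g, I, J, q):
    """Return (dh/dt, dQ/dt) at interior cell i using central differences.

    dh/dt follows Eq. (7); dQ/dt is Eq. (8) with k_q = 0, with each
    x-derivative replaced by (value[i+1] - value[i-1]) / (2 * dx).
    """
    dh_dt = -(Q[i + 1] - Q[i - 1]) / (2 * b * dx) + q / b
    # d/dx [Q^2 / (b h)]
    flux = (Q[i + 1] ** 2 / h[i + 1] - Q[i - 1] ** 2 / h[i - 1]) / (2 * b * dx)
    # g b h dh/dx
    pressure = g * b * h[i] * (h[i + 1] - h[i - 1]) / (2 * dx)
    # g I b h - g J b h, moved to the right-hand side
    slope = g * b * (I - J) * h[i]
    dQ_dt = -flux - pressure + slope
    return dh_dt, dQ_dt


# Uniform flow with I = J and no lateral inflow is a steady state.
steady = saint_venant_rhs([2.0] * 3, [3.0] * 3, 1,
                          b=1.0, dx=1.0, g=9.81, I=0.001, J=0.001, q=0.0)
print(steady)  # (0.0, 0.0)
```

Each neighbor value enters with exactly one weight, which is why the differenced equations can be read off as three-element templates.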

From the templates found, we can design the CNN architecture for the problem as (1) a two-layer 1D CNN chip (Figure 4) and (2) the h, Q processing block (Figure 5).

Each cell combines the h and Q processing in one block to form the physical architecture of a CNN cell.

Figure 4. Logical architecture of a CNN cell.

Figure 5. Logical architecture of a h, Q cell.

In general, each calculation needs basic computing blocks such as ADD, SUBTRACT, MULTIPLY, and DIVIDE. When designing a CNN cell on an FPGA, one has to design many separate instances of these blocks to perform the arithmetic processing for each input. To save computing resources in the FPGA, the basic blocks can be shared within one cell, which leads to sequential calculation (Figure 6). In this case, the processing time of each cell is high. To reduce the processing time of each cell, we can use the pipeline mechanism shown in Figure 7, but it needs more computing resources for each cell. Finally, the cells in a CNN chip are processed in parallel, as in Figure 8.

Figure 6. Physical architecture of CNN cell.


Figure 7. Solution for physical architecture CNN chip.


#### 3.2 Description equations in Navier-Stokes equations

Assume that the height of water is taken from the bottom of the flow, which is regarded as the origin of the coordinate system, so z<sub>w</sub> has no negative values.

• Equation describing the water level:

$$\frac{\partial \rho z_w}{\partial t} + \frac{\partial \rho q_x}{\partial x} + \frac{\partial \rho q_y}{\partial y} = \rho q_A \tag{9}$$

• Momentum equation in x-direction:

$$\frac{\partial \rho q_x}{\partial t} + \frac{\partial}{\partial x}\left(\rho\beta\frac{q_x^2}{d}\right) + \frac{\partial}{\partial y}\left(\rho\beta\frac{q_x q_y}{d}\right) + \rho g d\frac{\partial z_w}{\partial x} + \rho g d S_{fx} - \tau_{wx} - \frac{\partial}{\partial x}\left(\rho K_L\frac{\partial q_x}{\partial x}\right) - \frac{\partial}{\partial y}\left(\rho K_T\frac{\partial q_x}{\partial y}\right) = 0 \tag{10}$$

• Momentum equation in y-direction:

$$\frac{\partial \rho q_y}{\partial t} + \frac{\partial}{\partial y}\left(\rho\beta\frac{q_y^2}{d}\right) + \frac{\partial}{\partial x}\left(\rho\beta\frac{q_y q_x}{d}\right) + \rho g d\frac{\partial z_w}{\partial y} + \rho g d S_{fy} - \tau_{wy} - \frac{\partial}{\partial y}\left(\rho K_L\frac{\partial q_y}{\partial y}\right) - \frac{\partial}{\partial x}\left(\rho K_T\frac{\partial q_y}{\partial x}\right) = 0 \tag{11}$$

The meanings of the quantities in the equations are as follows:

• $\frac{\partial \rho q_x}{\partial t}$ and $\frac{\partial \rho q_y}{\partial t}$: quantities characterizing the momentum variation over time in x-axis and y-axis, respectively.

• $\frac{\partial}{\partial x}\left(\rho\beta\frac{q_x^2}{d}\right)$ and $\frac{\partial}{\partial y}\left(\rho\beta\frac{q_y^2}{d}\right)$: kinetic energy variations of flow in x- and y-directions.

• $\rho g d\frac{\partial z_w}{\partial x}$ and $\rho g d\frac{\partial z_w}{\partial y}$: potential energy variations of flow in x- and y-directions.

• $\rho g d S_{fx}$ and $\rho g d S_{fy}$: influence of friction by bottom and walls of channel on flow in x- and y-directions. Values of S<sub>fx</sub> and S<sub>fy</sub> are determined based on physical properties of bottom and walls of hydraulic channels according to the following formulas (n is the Manning coefficient):

$$S_{fx} = q_x\frac{n^2\left(q_x^2 + q_y^2\right)^{1/2}}{d^{1/3}};\quad S_{fy} = q_y\frac{n^2\left(q_y^2 + q_x^2\right)^{1/2}}{d^{1/3}}$$

• $\tau_{wx}$ and $\tau_{wy}$: wind pressure on free surface of hydraulic flow in x- and y-directions, calculated as follows:

$$\tau_{wx} = c_s\rho_a W^2\cos(\Psi);\quad \tau_{wy} = c_s\rho_a W^2\sin(\Psi),$$

where:

$$c_s = \begin{cases} 10^{-3}, & \text{when } W \le W_{\min} \\ \left[c_{s1} + c_{s2}\left(W - W_{\min}\right)\right]\cdot 10^{-3}, & \text{when } W > W_{\min} \end{cases}$$

C1, …, C4 are the coefficients shown in Figure 7:

$$C_1 = \frac{1}{2b\Delta x}\Delta t;\quad C_2 = \frac{gb}{2\Delta x}\Delta t;\quad C_3 = gb(I - J)\Delta t;\quad C_4 = \frac{q}{b}\Delta t.$$

Suppose each cell uses the pipeline mechanism shown in Figure 7. With a pipeline length of 6, the first calculation takes 6 clock pulses (clk), and each subsequent calculation needs only 1 clk.
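The clock counts above can be turned into a small back-of-the-envelope model. The sketch assumes an idealized pipeline with no stalls, and it assumes that the sequential (shared-block) variant needs the full six clocks for every calculation; the text does not give that figure, and both function names are hypothetical.

```python
def pipelined_clocks(n_calcs, depth=6):
    """Clock pulses for n_calcs back-to-back calculations in an ideal pipeline."""
    if n_calcs <= 0:
        return 0
    # The first result appears after `depth` clocks, then one result per clock.
    return depth + (n_calcs - 1)


def sequential_clocks(n_calcs, depth=6):
    """Clock pulses if each calculation must finish before the next starts (assumed)."""
    return depth * max(n_calcs, 0)


print(pipelined_clocks(1))      # 6
print(pipelined_clocks(1000))   # 1005
print(sequential_clocks(1000))  # 6000
```

Under these assumptions the pipelined cell approaches one result per clock for long runs, which is the trade-off against the extra computing resources it requires.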

## 3. Solving Navier-Stokes equations

#### 3.1 Physico-mathematical model of Navier-Stokes equations

In hydraulics, many flow models have been studied, such as flows in channels, streams, or rivers, with the aims of controlling the flow to prevent disasters, saving water, and exploiting the energy of the flow. Most mathematical models of these phenomena are partial differential equations such as the Saint-Venant equations and the Navier-Stokes equations [8, 9]. Some types of Navier-Stokes equations have various parameters and constraints. Using CNN technology, we can solve those that have clearly defined boundary conditions; that is, we do not study boundary problems in depth. The benefit of the CNN technology is that it yields a physically parallel computing chip, which increases the computing speed enough to satisfy a real-time system.

Navier-Stokes equations here consist of three partial differential equations, with functional variables representing water height, and flow velocity in x- and y-directions. The empirical model is a flow through a small port, which diffuses in two directions Ox and Oy.

Solving the Navier-Stokes equations with a CNN requires discretizing the continuous model by the finite difference method: the smaller the difference intervals, the higher the accuracy. However, if the difference intervals are too small, the calculation complexity and time increase. With the physically parallel processing ability of the CNN chip, these difficulties can be overcome.
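The link between the difference interval and accuracy can be illustrated with a generic central difference. The test function sin(x) and the step sizes are arbitrary choices made for this sketch and are unrelated to the flow model.

```python
import math


def central_diff(func, x, dx):
    """Second-order central difference approximation of func'(x)."""
    return (func(x + dx) - func(x - dx)) / (2 * dx)


steps = (0.1, 0.05, 0.025)
errors = [abs(central_diff(math.sin, 1.0, dx) - math.cos(1.0)) for dx in steps]
for dx, err in zip(steps, errors):
    print(f"dx={dx:<6} error={err:.2e}")
```

Halving the interval reduces the error by roughly a factor of four for this scheme, while the number of grid cells needed to cover the same domain doubles in each direction; this is the trade-off between accuracy and computational cost described above.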
