4. Conclusion

This chapter presents a solution for configuring a CNN chip to solve the Navier-Stokes equations, with particular attention to the temporary boundary problem.

Boundary Layer Flows - Theory, Applications and Numerical Methods

3.6 Simulation results

Figure 24. An example of an h.core file used to initialize the data of the Input memory h.

The ISE design software reports the device utilization summarized in Table 1. Figures 25–27 show the schematics synthesized by the ISE design software. Comparing the new values of h in Figure 28i, k (doutH) with Figure 29, we can see that the 3x4 CNN system worked well.

The simulation results show that the implementation method is correct and effective. Calculating the first three 1xN blocks taken from memory units h, u, and v costs 10 clock pulses: 1 for the initial read of the Input memory, 3 for the initial buffer update to the CNN, and 6 for the initial calculation. Each successive 1xN unit takes only 1 clock pulse, because a pipeline mechanism is used to update the buffer to the CNN and to calculate in the CNN arithmetic unit. After each column of blocks of data has been read from the Input memory, 2 more clocks are needed to initialize the buffer again. It also takes 1 clock for the initial write to Temp memory, 1 clock for the initial read from Temp memory, and 1 clock for the initial write of the result back to Input memory.

Table 1. Device utilization summary (estimated values).

| Logic utilization | Used | Available | Utilization |
|---|---|---|---|
| Number of slice registers | 3952 | 301,440 | 1% |
| Number of slice LUTs | 16,365 | 150,720 | 10% |
| Number of fully used LUT-FF pairs | 1770 | 18,547 | 9% |
| Number of bonded IOBs | 3112 | 600 | 518% |
| Number of Block RAM/FIFO | 12 | 416 | 2% |
| Number of BUFG/BUFGCTRLs | 1 | 32 | 3% |
| Number of DSP48E1s | 132 | 768 | 17% |
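The percentages in Table 1 can be recomputed from the Used and Available columns. The following is a minimal sketch, not part of the original design flow; the function names are hypothetical, and rounding down is an assumption that happens to reproduce every percentage listed in the table.

```python
# Values copied from Table 1: resource -> (used, available).
TABLE_1 = {
    "Slice registers": (3952, 301_440),
    "Slice LUTs": (16_365, 150_720),
    "Fully used LUT-FF pairs": (1770, 18_547),
    "Bonded IOBs": (3112, 600),
    "Block RAM/FIFO": (12, 416),
    "BUFG/BUFGCTRLs": (1, 32),
    "DSP48E1s": (132, 768),
}


def utilization_percent(used: int, available: int) -> int:
    """Utilization as a truncated percentage (assumed rounding rule)."""
    return used * 100 // available


def over_budget(table):
    """Names of resources whose estimated demand exceeds the device supply."""
    return [name for name, (used, avail) in table.items() if used > avail]


if __name__ == "__main__":
    for name, (used, avail) in TABLE_1.items():
        pct = utilization_percent(used, avail)
        print(f"{name:26s} {used:>6d} / {avail:>7d} = {pct}%")
    print("Over 100%:", over_budget(TABLE_1))
```

With the values as reported, the only entry above 100% is the bonded IOB count; every other resource stays at or below 17% of the device.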

Figure 27. Inside electronic circuit for h.

Solving Partial Differential Equation Using FPGA Technology

DOI: http://dx.doi.org/10.5772/intechopen.84588

Temporary boundaries are introduced when they are required; the purpose is to divide the large data space into many subspaces, so that the whole space is processed by computing each subspace. With 32-bit floating-point input data and a Virtex-6 XC6VLX240T-1FFG1156 FPGA chip, a CNN of 1x12 cells has been installed successfully. The installation results show that the effectiveness of this solution lies mainly in the expansion of the calculation space and in resource savings, with acceptable calculation accuracy as well.

Figure 26. The architecture of one CNN cell.

Acknowledgements

We gratefully acknowledge Professor Roska Tamas, Head of the Analogic and Neural Computing Research Laboratory and Chairman of the Scientific Council, Institute of the Hungarian Academy of Sciences, and Associate Professor Pham Thuong Cat, Head of the Automation Laboratory, Institute of Information Technology, Vietnam Academy of Science and Technology, for giving us many important instructions.

Author details

Vu Duc Thai\* and Bui Van Tung

Thai Nguyen University, Thai Nguyen, Vietnam

\*Address all correspondence to: vdthai@ictu.edu.vn

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Figure 28. Signals operating inside the 3x4 CNN system, m = 8, Q = 2. (a) Starting a computing cycle by setting start = 1. (b) The output of Input memory (doutH). (c) The data output from the Buffer after 4 clks. (d) The results from the CNN core after 10 clks; writing of the results to Temp memory starts. (e) The CNN core finishes computing the first column of blocks of data at 16 clks; writing of the results to Temp memory pauses at 16 clks. (f) The results from the CNN core after 18 clks; Temp memory is read, boundary updating starts, and the results are written to Input memory. (g) Boundary updating pauses from 24 clks. (h) The CNN core finishes computing; the last column of blocks of data is read from Temp memory and written to Input memory. (h) All results of the first computing cycle have been written to Input memory. (i) The controller sets finish = 1 at 33 clks. (k) The output of Input memory shows the results computed in the previous computing cycle. (l) The overview of signals.
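The clock marks in this caption can be cross-checked against the per-stage costs quoted in the simulation discussion (1 clock to read the Input memory, 3 to fill the buffer, 6 for the first calculation, 1 per successive 1xN unit, and 2 to restart the buffer after each column of blocks). The sketch below is only an illustrative model: it assumes that m is the number of rows in a column of blocks, that Q is the number of columns, and that a 3-row window produces m - 2 result rows. The name `column_windows` is hypothetical, and the final write-back phase ending at 33 clks is not modeled.

```python
# Per-stage costs quoted in the text, in clock pulses.
READ_INPUT = 1      # initial read of the Input memory
FILL_BUFFER = 3     # initial buffer update to the CNN
FIRST_COMPUTE = 6   # initial calculation
REINIT_BUFFER = 2   # buffer restart after each column of blocks


def column_windows(m: int, q: int):
    """Return (first_result_clk, done_clk) for each of the q columns.

    Assumption: each column holds m rows and yields m - 2 result rows,
    one per clock once the pipeline is full.
    """
    windows = []
    start = READ_INPUT + FILL_BUFFER + FIRST_COMPUTE  # first results at 10 clks
    for _ in range(q):
        done = start + (m - 2)
        windows.append((start, done))
        start = done + REINIT_BUFFER
    return windows


if __name__ == "__main__":
    print(column_windows(m=8, q=2))  # [(10, 16), (18, 24)]
```

Under these assumptions the model gives results from 10 clks, the first column finished at 16 clks, the second column starting at 18 clks, and computation ending at 24 clks, which agrees with panels (d) through (g).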



Figure 29. The new values of h computed by Excel for the first computing cycle.

This model can be developed further to solve similar problems in larger computing spaces, and it could also be developed for some types of complicated (mixed) boundaries.
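The idea of dividing a large data space into subspaces with temporary boundaries can be illustrated in software. The following is a minimal sketch under stated assumptions, not the chapter's hardware design: the 3x3 averaging template in `relax` is a hypothetical stand-in for the Navier-Stokes templates, `width` plays the role of N in the 1xN CNN, and the separate `result` grid loosely mirrors the role of the Temp memory.

```python
def relax(grid):
    """One update with a hypothetical 3x3 template: each interior cell
    becomes the mean of its four neighbours; edge cells are kept."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return out


def relax_by_blocks(grid, width):
    """Same update, computed one column block at a time.

    Each block is cut out together with one extra column on either side
    (the temporary boundary), updated on its own, and only its interior
    columns are written to a separate result grid.
    """
    cols = len(grid[0])
    result = [row[:] for row in grid]
    for left in range(1, cols - 1, width):
        right = min(left + width, cols - 1)  # interior columns [left, right)
        block = [row[left - 1:right + 1] for row in grid]  # add halo columns
        updated = relax(block)
        for i, row in enumerate(updated):
            result[i][left:right] = row[1:-1]  # drop the halo columns
    return result


if __name__ == "__main__":
    # Arbitrary test data: 5 rows, 14 columns (12 interior columns).
    grid = [[float((3 * i + 5 * j) % 7) for j in range(14)] for i in range(5)]
    print(relax_by_blocks(grid, 4) == relax(grid))
```

Because every block reads its halo columns from the unchanged input grid and writes to a separate grid, the blockwise result matches the whole-grid update regardless of the block width chosen.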
