*End-to-End Benchmarking of Chiplet-Based In-Memory Computing DOI: http://dx.doi.org/10.5772/intechopen.111926*

chiplets required to map the DNN ($N_{Chiplet}$) with the maximum number of chiplets available in the architecture ($C$). If the number of required chiplets exceeds the maximum available, the engine throws an error and requests an increase in the number of available chiplets. Otherwise, the engine continues with the partition and mapping process for the subsequent layers in the DNN.
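The check described above can be sketched in a few lines of Python. This is only an illustration, not SIAM's actual code: the names `ChipletBudgetError`, `check_chiplet_budget`, and `map_layers` are hypothetical, and the sketch assumes that each layer's chiplet requirement is already known and that the requirements simply add up as layers are mapped in order.

```python
from typing import Sequence


class ChipletBudgetError(ValueError):
    """Raised when the DNN needs more chiplets than the architecture offers."""


def check_chiplet_budget(n_chiplet: int, c_max: int) -> None:
    """Compare the required chiplet count with the maximum available count."""
    if n_chiplet > c_max:
        raise ChipletBudgetError(
            f"mapping needs {n_chiplet} chiplets but only {c_max} are available; "
            "increase the number of available chiplets"
        )


def map_layers(chiplets_per_layer: Sequence[int], c_max: int) -> int:
    """Map layers in order, re-checking the budget after each layer.

    Assumption (not from the source): per-layer requirements are summed,
    i.e. chiplet sharing between layers is not modeled here.
    """
    n_chiplet = 0
    for needed in chiplets_per_layer:
        n_chiplet += needed
        check_chiplet_budget(n_chiplet, c_max)  # raises if the budget is exceeded
    return n_chiplet
```

For example, `map_layers([1, 2, 1], c_max=4)` completes and returns the total number of chiplets used, whereas `map_layers([3, 2], c_max=4)` raises `ChipletBudgetError` while mapping the second layer.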

The custom partition scheme in SIAM creates a chiplet-based IMC architecture tailored to the specific DNN under consideration, with no upper limit on the number of chiplets used. Each chiplet in this scheme has an identical structure, with a fixed number of IMC tiles containing IMC crossbar arrays and peripheral circuitry. In contrast, the homogeneous partition scheme maps the DNN onto a fixed, user-specified number of chiplets in a generic manner. SIAM thus allows both architectures to be compared on a single platform.

After partitioning and mapping the layers onto the IMC chiplets, the engine calculates the total volume of data communicated within and across the chiplets, accounting for layers that are partitioned across multiple chiplets. In such cases, the global accumulator generates the layer output. The engine also determines the number of additions performed by the global accumulator and the number of global buffer accesses required. The engine output includes the layer partition across chiplets, the required number of chiplets and IMC crossbars, the IMC crossbar utilization, the intra- and inter-chiplet data movement volume, and the number of global accumulator and global buffer accesses. This information is then used by the circuit, NoC, and NoP engines to evaluate the performance of the chiplet-based IMC architecture.
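To make the engine output concrete, the following sketch collects the reported quantities in one record and shows one plausible way to count global accumulator additions. All names (`PartitionReport`, `accumulator_additions`, `summarize`) are hypothetical and not part of SIAM. The addition count rests on an assumption not stated in the text: each chiplet holding part of a layer produces one partial sum per output element, so merging k partial sums costs k - 1 additions per element.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PartitionReport:
    """Hypothetical container for the quantities the engine reports."""
    layer_to_chiplets: Dict[int, List[int]] = field(default_factory=dict)
    num_chiplets: int = 0
    num_crossbars: int = 0
    crossbar_utilization: float = 0.0  # fraction in [0, 1]
    intra_chiplet_volume: int = 0      # data moved within chiplets
    inter_chiplet_volume: int = 0      # data moved across chiplets
    global_accumulator_additions: int = 0
    global_buffer_accesses: int = 0


def accumulator_additions(num_partitions: int, num_outputs: int) -> int:
    """Additions needed to merge partial sums from `num_partitions` chiplets.

    Assumed model: k partial sums per output element need k - 1 additions.
    A layer that fits on one chiplet therefore needs none.
    """
    if num_partitions < 1:
        raise ValueError("a mapped layer occupies at least one chiplet")
    return (num_partitions - 1) * num_outputs


def summarize(layer_to_chiplets: Dict[int, List[int]],
              outputs_per_layer: Dict[int, int]) -> PartitionReport:
    """Fill in the fields that follow directly from the layer partition."""
    used = {c for chiplets in layer_to_chiplets.values() for c in chiplets}
    additions = sum(
        accumulator_additions(len(chiplets), outputs_per_layer[layer])
        for layer, chiplets in layer_to_chiplets.items()
    )
    return PartitionReport(
        layer_to_chiplets=dict(layer_to_chiplets),
        num_chiplets=len(used),
        global_accumulator_additions=additions,
    )
```

The remaining fields (crossbar count, utilization, data movement volumes, and buffer accesses) are left at their defaults because the text does not give the formulas used to compute them.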
