**4. C3S microarchitecture model**

#### **4.1 Mini-columns and TNNs (completed work)**

We have developed a microarchitecture model for building highly efficient TNN designs [11]. In this model, values are encoded as the timings of events, or spikes. Spikes are implemented as logic pulses whose timings are calibrated in units of the hardware clock. We refer to this as a "direct" implementation: the hardware clock itself defines the time unit for temporal processing, and spikes arrive directly at precisely timed clock cycles; spike timings are not encoded as binary values propagated in the form of packets. The proposed TNN microarchitecture model consists of the following key modules: (1) a counter-based synapse that performs temporal processing of input spikes to generate specialized responses and carries out unsupervised/supervised STDP learning of its synaptic weight; (2) a multi-synapse neuron that accumulates the synaptic responses and fires an output spike when the accumulated response crosses a threshold; and (3) a multi-neuron TNN column (referred to henceforth as a "mini-column" to distinguish it from a cortical column) that applies winner-take-all (WTA) inhibition across its neurons. **Figure 6** illustrates the implementation of a mini-column.
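The behavior of these three modules can be summarized in a minimal sketch. The code below is illustrative only, not the authors' RTL: it assumes step-no-leak synaptic responses (each synapse adds its weight at its input's spike time), threshold-crossing neuron firing, and 1-WTA inhibition that keeps the earliest output spike. The function name `minicolumn_step` and the use of infinity to denote "no spike" are modeling conventions introduced here.

```python
import math

def minicolumn_step(spike_times, weights, theta):
    """Behavioral sketch of a (p x q) temporal mini-column.

    spike_times : length-p list of input spike times (math.inf = no spike)
    weights     : p x q synaptic weight matrix (response step heights)
    theta       : firing threshold

    Each synapse adds its weight to the neuron's running sum at the input's
    spike time (step-no-leak response).  A neuron fires at the earliest time
    its accumulated response crosses theta; WTA inhibition then keeps only
    the earliest-firing neuron's output spike.
    """
    p, q = len(spike_times), len(weights[0])
    order = sorted(range(p), key=lambda i: spike_times[i])  # time order
    fire_times = [math.inf] * q
    for j in range(q):
        acc = 0.0
        for i in order:
            if math.isinf(spike_times[i]):      # remaining inputs never spike
                break
            acc += weights[i][j]
            if acc >= theta:                    # threshold crossing -> spike
                fire_times[j] = spike_times[i]
                break
    winner = min(range(q), key=lambda j: fire_times[j])
    out = [math.inf] * q                        # WTA: suppress all but winner
    if not math.isinf(fire_times[winner]):
        out[winner] = fire_times[winner]
    return out, winner
```

For example, with three inputs spiking at times 0, 1, 2 and a threshold of 2, a neuron whose first two synapses have weight 1 fires at time 1 and wins over a neuron relying on later inputs.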

A mini-column with *p* synaptic inputs feeding *q* neurons via a *pxq* synaptic crossbar is referred to as a *(pxq) mini-column* in **Figure 6**. The figure illustrates the actual implementation of the major components of a (*pxq*) mini-column, including the synapses, neuron body, STDP local learning, and WTA lateral inhibition. In our prior work [11], we took the RTL designs of various configurations of the mini-column through both synthesis and physical design tools, and obtained PPA (power, performance, area) results by scaling both *p* and *q*. We observed that power and area scale linearly with the total number of synapses (*pxq*), whereas performance, limited by the critical path delay, scales logarithmically with the number of synaptic inputs (*p*). We also derived a set of characteristic equations from the gate-level designs that provide qualitative estimates of gate-level PPA complexity for arbitrary mini-column designs.
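These scaling trends can be captured in a first-order model. The sketch below follows only the trends stated above, not the authors' actual characteristic equations from [11]; the `c_*` coefficients are hypothetical technology-dependent constants introduced here for illustration.

```python
import math

def ppa_estimate(p, q, c_area=1.0, c_power=1.0, c_delay=1.0):
    """Illustrative first-order PPA model for a (p x q) mini-column.

    Area and power grow linearly with the synapse count p*q, while the
    critical-path delay grows logarithmically with the number of synaptic
    inputs p, matching the reported scaling trends.
    """
    area = c_area * p * q          # linear in total synapse count
    power = c_power * p * q        # linear in total synapse count
    delay = c_delay * math.log2(p) # logarithmic in fan-in
    return area, power, delay
```

Under this model, doubling both *p* and *q* quadruples area and power but adds only one gate-delay increment to the critical path, which is consistent with the log-depth reduction trees implied by the scaling observation.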

*Cortical Columns Computing Systems: Microarchitecture Model, Functional Building Blocks… DOI: http://dx.doi.org/10.5772/intechopen.110252*

#### **Figure 6.**

*Microarchitecture model for implementing multi-column and multi-layer TNNs [11]. This figure highlights details of a generic TNN column (or mini-column). Each mini-column consists of p inputs feeding a stack of q neurons via a pxq synaptic crossbar. Each cross point in the pxq crossbar stores a weight value that is updated based on spike timing dependent plasticity (STDP) updating rules. Each neuron performs the weighted sum of its inputs. Each mini-column is also supported by winner-take-all (WTA) inhibition across its neuron outputs that selects the winner neuron. Gate-level implementations of these mini-column components are illustrated.*

Multiple mini-columns can be grouped and organized into a hierarchy of layers to form multi-layer temporal neural networks (TNNs), as shown at the bottom of **Figure 6**. Multiple multi-neuron mini-columns are stacked to form a single layer, and multiple multi-column layers are cascaded to form a multi-layer TNN. A TNN is typically bookended by input-encode and output-decode layers, and is effectively a feed-forward network. The mini-column is the fundamental building block for TNNs and can learn to distinguish distinct input patterns. Hence, a single mini-column can be viewed as a fully functioning TNN. **Table 1** in Section 6 demonstrates the efficacy of single mini-column designs in performing unsupervised clustering with minimal PPA complexity.
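The stack-then-cascade organization can be sketched as a simple feed-forward composition. In this sketch, which is a modeling abstraction rather than the chapter's hardware design, each column is any function mapping a spike-time vector to a spike-time vector (e.g., a mini-column with WTA); columns within a layer share the same input, and their outputs are concatenated to feed the next layer. The function name `tnn_forward` is illustrative.

```python
def tnn_forward(layers, in_times):
    """Feed-forward composition sketch for a multi-layer TNN.

    layers   : list of layers; each layer is a list of column functions,
               each taking a spike-time vector and returning one
    in_times : spike-time vector produced by the input-encode layer

    Columns in the same layer see the same input vector (a stacked layer);
    layer outputs are concatenated and cascaded to the next layer.
    """
    times = in_times
    for layer in layers:
        times = [t for column in layer for t in column(times)]
    return times
```

A usage example: two single-output columns stacked in layer 1 feeding one column in layer 2, with placeholder lambdas standing in for mini-columns.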

**Table 1.**

*Design space exploration for UCR time-series clustering: using TNN7 macros, PPA [12] for seven sample TNN prototype designs (mini-columns) across diverse synapse counts and application benchmarks from [13].*

*All mini-columns are specifically trained for their corresponding benchmarks and achieve competitive performance relative to the state of the art. Even the largest mini-column design consumes only 39 μW.*
