Brain-Inspired Spiking Neural Networks

*Khadeer Ahmed*

### **Abstract**

The brain is a very efficient computing system. It performs very complex tasks while occupying about 2 liters of volume and consuming very little energy. The computational tasks are performed by special cells in the brain called neurons. They compute using electrical pulses and exchange information between them through chemicals called neurotransmitters. With this as inspiration, several compute models exist today that try to exploit the inherent efficiencies demonstrated by nature. The compute models representing spiking neural networks (SNNs) are biologically plausible and are hence used to study and understand the workings of the brain and nervous system. More importantly, they are used to solve a wide variety of problems in the field of artificial intelligence (AI). They are uniquely suited to model temporal and spatio-temporal data paradigms. This chapter explores the fundamental concepts of SNNs, a few of the popular neuron models, how information is represented, learning methodologies, and state-of-the-art platforms for implementing and evaluating SNNs, along with a discussion of their applications and broader role in the field of AI and data networks.

**Keywords:** spiking neural networks, spike timing dependent plasticity, neuromorphic computing, artificial intelligence, low power, supervised learning, unsupervised learning, spatio-temporal learning, neuron models, spike encoding, winner take all, stigmergy

### **1. Introduction**

Nature has provided innumerable examples of very efficient solutions to complex problems with seemingly simple rules. With these as inspiration, many engineering problems are tackled using bioinspired techniques. A few bioinspired techniques are evolutionary and genetic algorithms, stigmergy, hidden Markov models, belief networks, neural networks, etc. These are applicable in a wide variety of domains, from robotics [1], communication systems, and routing [2] to building construction [3], scheduling, optimization, and machine intelligence. The brain is a very efficient computing element capable of performing complex tasks. This is possible due to massively parallel computation performed by the vast number of cells called neurons in the brain while consuming very little energy. This has inspired a domain of algorithms and techniques called artificial intelligence (AI), where machines are programmed to learn and then solve complex tasks. The recent advances in high-performance computing and theoretical advances in statistical learning methodologies have enabled widespread use of AI techniques for tasks such as pattern recognition, natural language understanding, speech recognition, computer vision, odor recognition, machine translation, medical diagnosis, gaming, autonomous driving, path planning, autonomous robots, financial market modeling, and more. Solving these kinds of problems efficiently is not possible with traditional computing paradigms. These algorithms mimic biology, or are inspired by it, to tackle the above problems. For example, it is not humanly possible to hand-code a traditional software program to classify an image of a simple object such as a cup with reasonable accuracy, considering the innumerable variations possible in terms of shape, size, color, etc. However, this is a trivial task for a human being, as our brains learn to identify the salient features of an object. The inner workings of the brain, especially the way it processes information, are the inspiration behind a class of AI techniques called neural networks.


*Brain-Inspired Spiking Neural Networks DOI: http://dx.doi.org/10.5772/intechopen.93435*


AI requires a large amount of compute power while churning through massive amounts of data. Today's real-world tasks require different sets of AI models with different modalities to interact with each other, hence needing a large pipeline with complex data dependencies. Training is time-consuming while needing efficient multi-accelerator parallelization. Even with such advances, we are nowhere close to the compute power or the efficiency of a human brain. The human brain is still a mystery and is a very actively researched topic. Several neuron models have been proposed to mimic various aspects of how the brain works with the limited understanding we have thus far.

Spiking neural networks (SNNs) are networks made up of interconnected computing elements called neurons. SNNs try to mimic biology to incorporate the efficiencies found in nature. These neurons use spikes to communicate with each other. SNNs are the third generation of neural networks [4] and are gaining popularity due to their potential for very low energy dissipation thanks to their event-driven and asynchronous operation. SNNs are also interesting because of their ability to learn in a distributed way using a technique called Spike Timing Dependent Plasticity (STDP) learning [5]. STDP relies on sparsely encoded spiking information among local neurons. SNNs are capable of learning rich spatio-temporal information [6]. In principle, SNNs can be fault tolerant due to their ability to re-learn and adapt their connections with other neurons, akin to how the brain learns. SNNs can also natively interface with specialized hardware sensors which mimic biological vision (Dynamic Vision Sensor) and hearing (Dynamic Audio Sensor) [7], as these directly transduce sensory information to spikes.

In the rest of the chapter, a brief introduction to neuron biology and artificial neuron models is presented, followed by a discussion of information representation as spikes, different learning methodologies, tools and platforms available for simulating and implementing SNNs, and finally a few case studies as examples of SNN usage.

#### **2. Neuron models**

In this section, a brief overview of the biological neuron processes is provided to understand the inference and learning dynamics of SNNs. A few popular neuron models are discussed at a high level to make the reader aware of the diversity of such research and its use in SNNs.

#### **2.1 Biological neuron**

Complex living organisms have specialized cells called neurons, which are the fundamental unit of the central nervous system. Neurons can transmit and receive signals in the form of electrical impulses. In a human brain, there are an estimated 200 billion neurons, and there are several different types of neurons in the body. In general, a neuron consists of a cell body, or soma, containing the cell machinery and nucleus, along with dendrites and an axon, as shown in **Figure 1**.

The dendrites receive information from other neurons, and this causes a voltage buildup on the cell body. When this membrane potential reaches a certain threshold, an electrical impulse is generated, and the axon transmits this spike away from the cell body to other neurons. After a spike is generated, the neuron returns to a lower potential called the resting potential. Also, immediately after a spike is generated, the neuron cannot generate another spike for a short duration called the refractory period. The axon terminates at axon terminals, which interface with dendrites of other neurons; this junction is called a synapse. A synapse is a connection between a presynaptic neuron (which generates the electrical impulse) and a postsynaptic neuron (which receives the spike information), as shown in **Figure 1**. The synapse is not a direct connection; instead, it contains a gap called the synaptic cleft, as shown in **Figure 2**. A discussion of astrocyte cells is presented later in Section 4.5.

When an electrical impulse reaches the synapse, the presynaptic neuron releases certain chemicals called neurotransmitters into the synaptic cleft. The postsynaptic neuron picks up these neurotransmitters, eventually causing the postsynaptic neuron's membrane potential to either increase or decrease. The brain learns by strengthening or weakening existing synaptic connections, by making new synaptic connections, or by dissolving those which are no longer needed. In this way, the synapses make the brain plastic and provide the ability to learn. The strength of a synapse also matters for learning, as it modulates the amount of neurotransmitters released into the synaptic cleft, resulting in a stronger or weaker synapse. Depending on the type of neurotransmitters released, the synapse can be excitatory or inhibitory. An excitatory synapse is one which increases the membrane potential of the postsynaptic neuron; conversely, an inhibitory synapse decreases the membrane potential. Based on these fundamental concepts, several researchers have proposed various neuron models over the decades. We do not yet fully understand the inner workings of the brain, and it remains an active field of research. New neuron models are being proposed frequently as our understanding of biology increases. A few neuron models are listed below, followed by an overview of select models.

**Figure 1.** *Neurons (by unknown author, licensed under CC BY-SA https://creativecommons.org/licenses/by-sa/3.0/).*

**Figure 2.** *Neuronal synapse along with astrocyte cells (author created).*

#### **2.2 Artificial neuron models**

Some of the proposed models try to mimic biology for the purpose of understanding and modeling neuro-physiological processes, while other models are more oriented toward computing purposes. A few neuron models to consider are McCulloch and Pitts [8], Hodgkin-Huxley [9], Perceptron [10], Izhikevich [11], Integrate-and-fire [12], Leaky integrate-and-fire [13], Quadratic integrate-and-fire [14], Exponential integrate-and-fire [15], Generalized integrate-and-fire [16], Time-varying integrate-and-fire [17], Integrate-and-fire or burst [18], Resonate-and-fire [19], and the Bayesian neuron model [20].

#### **2.3 Hodgkin and Huxley neuron model**

Hodgkin and Huxley [9] studied the giant axon of the squid and found currents induced by different types of ions, namely sodium ions, potassium ions, and a leakage current due to calcium and other ions. The cell contains voltage-dependent ion channels which regulate the concentration of these ions across the cell membrane. For the sake of simplicity, at a high level, the total membrane current is the sum of the current induced by the membrane capacitance and the ion channel currents, as shown in Eq. (1), where *Ii* is the ionic current density, *V* is the membrane potential, *CM* is the membrane capacitance per unit area, *t* is time, and *INa*, *IK*, *Il* are the sodium, potassium, and leakage currents.

$$I = I\_c + I\_i \tag{1}$$


$$I\_c = C\_M \frac{dV}{dt} \tag{2}$$

$$I\_i = I\_{Na} + I\_K + I\_l \tag{3}$$

They also describe gating variables to control the ion channels and the resting potential of the cell. When the membrane potential increases significantly above the resting potential, the gating variable activates and then deactivates the channels resulting in a spike. This is a very simplified model and has several limitations [21].
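As an illustration, the membrane equation above can be integrated numerically. The sketch below is a minimal forward-Euler simulation; the conductances, reversal potentials, gating rate functions, and initial conditions are the commonly used squid-axon values from the literature, not constants given in this chapter:

```python
import math

def hh_simulate(i_ext, t_max=50.0, dt=0.01):
    """Forward-Euler Hodgkin-Huxley simulation; returns the spike count.

    Units: mV, ms, uF/cm^2, mS/cm^2, uA/cm^2 (standard squid-axon set)."""
    c_m = 1.0
    g_na, g_k, g_l = 120.0, 36.0, 0.3
    e_na, e_k, e_l = 50.0, -77.0, -54.4
    v, m, h, n = -65.0, 0.05, 0.6, 0.32   # approximate resting state
    spikes, above = 0, False
    for _ in range(int(t_max / dt)):
        # gating-variable rate functions alpha/beta (1/ms)
        a_m = 0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0))
        b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
        a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
        b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
        a_n = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
        b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
        # ionic currents: I_i = I_Na + I_K + I_l (Eq. 3)
        i_na = g_na * m ** 3 * h * (v - e_na)
        i_k = g_k * n ** 4 * (v - e_k)
        i_l = g_l * (v - e_l)
        # membrane equation: C_M dV/dt = I - I_i (Eqs. 1-2)
        v += dt * (i_ext - i_na - i_k - i_l) / c_m
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        # count upward zero crossings of v as spikes
        if v > 0.0 and not above:
            spikes += 1
        above = v > 0.0
    return spikes
```

With no input current the model sits at rest, while a sustained current of around 10 µA/cm² drives repetitive firing.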

#### **2.4 Izhikevich neuron model**

The Izhikevich neuron model [11] is more biologically plausible, as shown in the equations below.

$$\begin{aligned} v' &= 0.04v^2 + 5v + 140 - u + I \\ u' &= a(bv - u) \\ \text{If } v &\ge 30 \, \text{mV}, \text{ then } \begin{cases} v = c \\ u = u + d \end{cases} \end{aligned} \tag{4}$$


Where *v* is the membrane potential, *u* is the recovery variable, *I* is the current, and *a*, *b*, *c*, and *d* are neuron parameters. Various biologically plausible firing patterns can be modeled using this model, as shown in **Figure 3**.
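A minimal sketch of these dynamics, integrated with the forward-Euler method, is given below. The parameter defaults (*a* = 0.02, *b* = 0.2, *c* = −65, *d* = 8) are the regular-spiking values reported in [11]; the input current, time step, and function name are illustrative assumptions:

```python
def izhikevich(i_ext, t_max=300.0, dt=0.25, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Forward-Euler integration of Eq. (4); times are in ms."""
    v = -65.0
    u = b * v                # recovery variable starts on its nullcline
    spike_times = []
    steps = int(t_max / dt)
    for k in range(steps):
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + i_ext)
        u += dt * a * (b * v - u)
        if v >= 30.0:        # spike: reset v and bump the recovery variable
            spike_times.append(k * dt)
            v = c
            u += d
    return spike_times
```

With zero input the state settles to a stable resting point, while a sustained current (e.g., *I* = 10) removes the resting equilibrium and produces tonic spiking.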

Over time, if a biological neuron does not spike, any built-up potential dissipates. This phenomenon is modeled by several variations of the Leaky Integrate and Fire (LIF) model. The LIF neuron model is very popular due to its ease of implementation as a software model and for developing dedicated hardware models. Digital hardware implementations are more popular than analog variants, again due to their simplicity of design, fabrication, and scalability.

#### **2.5 Discrete leaky integrate and fire**

A typical generic LIF model adapted for discrete implementation [22] is represented as:

Synaptic integration


$$V(t) = V(t-1) + \sum\_{i=0}^{N-1} x\_i(t) \, s\_i \tag{5}$$

Leak integration

$$V(t) = V(t) - \lambda \tag{6}$$

Threshold, fire and reset

$$\text{If } V(t) \ge \alpha \text{ then Spike and } V(t) = R \tag{7}$$

**Figure 3.** *Izhikevich neuron model [11].*

Where *V(t)* is the membrane potential, *t* is the discrete time step, *N* is the number of synapses, *x_i(t)* is the input of the *i*th synapse, *s_i* is the synaptic weight of the *i*th synapse, *λ* is the leak, *α* is the spiking threshold, and *R* is the resting potential. A spike value is 1, otherwise 0. Whenever a spike occurs on a synapse *x_i(t)*, the synaptic weight is accumulated, increasing the membrane potential. Every time step a leak is applied, and finally, when the membrane potential reaches the threshold *α*, the neuron spikes and the membrane potential is reset to the resting value *R*.
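The three update steps of Eqs. (5)–(7) can be sketched directly in code. The function name and the particular leak, threshold, and weight values used below are illustrative assumptions, not values from the chapter:

```python
def lif_step(v, inputs, weights, leak=1.0, threshold=20.0, rest=0.0):
    """One time step of the discrete LIF neuron of Eqs. (5)-(7).

    inputs  -- 0/1 spike values x_i(t), one per synapse
    weights -- synaptic weights s_i
    Returns (updated membrane potential, spike value)."""
    # Eq. (5): synaptic integration
    v += sum(x * s for x, s in zip(inputs, weights))
    # Eq. (6): leak integration
    v -= leak
    # Eq. (7): threshold, fire, and reset to the resting potential
    if v >= threshold:
        return rest, 1
    return v, 0
```

For example, driving two synapses of weights 3.0 and 2.0 with constant input spikes gives a net gain of 4 per step, so with a threshold of 20 the neuron fires on every fifth step.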

#### **2.6 Bayesian neuron model**

The Bayesian neuron (BN) model is proposed in [20]. The BN model is a stochastic neuron model: when the membrane potential reaches the threshold, a BN fires a spike stochastically. It generates a spike based on a Poisson process where neuron *Z* fires at time *t* with a probability proportional to its membrane potential at time *t*. The membrane potential *u(t)* is computed as:

$$u(t) = w\_0 + \sum\_{i=1}^{n} w\_i y\_i(t) \tag{8}$$


Where the weight of the synapse between the *i*th presynaptic neuron *y_i* and *Z* is *w_i*. If *y_i* fires a spike at time *t*, then *y_i(t)* is 1. The intrinsic excitability is *w_0*. The firing probability of this stochastic neuron model depends exponentially on the membrane potential *u(t)* as:

$$\text{probability}(Z \text{ fires at time } t) \propto \exp(u(t)) \tag{9}$$

To generate a Poisson process with time-varying rate *λ(t)*, the *Time-Rescaling Theorem* is used. According to this theorem, when spike arrival times *v_k* follow a Poisson process of instantaneous rate *λ(t)*, the time-rescaled random variable *Λ_k*, obtained by integrating *λ(v)* from 0 to *v_k*, follows a homogeneous Poisson process with unit rate. The interarrival time *τ_k* then follows an exponential distribution with unit rate.

$$
\tau\_k = \Lambda\_k - \Lambda\_{k-1} = \int\_{v\_{k-1}}^{v\_k} \lambda(v) dv \tag{10}
$$

*τ_k* represents a generated random variable satisfying an exponential distribution with unit rate, and *v_k* is the next time to spike. As shown in Eq. (10), the instantaneous rate from Eq. (8) is accumulated until the integral value is greater than or equal to *τ_k*. At this point in time, a spike is generated, as it implies that the interspike interval has passed. Poisson spiking behavior is achieved in this way, reflecting the state of the neuron. Other stochastic neuron behaviors can easily be constructed by stochastically varying different parameters of the model.
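The generation procedure described above can be sketched as follows; the bin width, function names, and example rate function are illustrative assumptions:

```python
import math
import random

def poisson_spikes(rate_fn, t_max, dt=0.001, rng=random):
    """Spike times of an inhomogeneous Poisson process via time rescaling.

    rate_fn -- instantaneous rate lambda(t), in spikes per second
    dt      -- integration bin width, in seconds"""
    spikes = []
    acc = 0.0                            # running integral of lambda(v)
    tau = -math.log(1.0 - rng.random())  # Exp(1) target interval, per Eq. (10)
    t = 0.0
    while t < t_max:
        acc += rate_fn(t) * dt           # accumulate rescaled time
        if acc >= tau:                   # the interspike interval has elapsed
            spikes.append(t)
            acc = 0.0
            tau = -math.log(1.0 - rng.random())
        t += dt
    return spikes
```

For a constant rate this reduces to a homogeneous Poisson generator; substituting a rate proportional to exp(*u(t)*) from Eq. (9) yields the stochastic firing of the Bayesian neuron.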

#### **3. Information representation**

SNNs understand the language of spikes, so it is necessary to decide the best way to represent real-world data to achieve the best possible training of the network and efficient inference. Different coding techniques model different aspects of the input spectrum. Some of the spike coding techniques are described below to give an intuition of signal representation using spikes.

#### **3.1 Rate coding**


*Biomimetics*


With rate coded spike trains, the information is encoded in the number of spikes over a specified temporal window. The firing rate *ν_k* over trial *k* is shown in Eq. (11), where *n_k^sp* is the number of spikes in trial *k* over a temporal window *T*, and *K* is the number of trials [23].

$$
\nu\_k = \frac{n\_k^{sp}}{T} \tag{11}
$$

Evidence of rate coding is experimentally shown in sensory and motor systems [24]. The number of spikes emitted by the receptor neuron increases with the force applied to the muscle.

If the rate *ν* is defined via a spike count over a temporal window of duration *T*, the exact firing time of a spike does not matter [23]. We can define it as a Poisson process where spike events are stochastic and independent of each other, with an instantaneous firing rate *ν*. In a homogeneous Poisson process, the probability of finding a spike in a short interval *Δt* is

$$P\_F(t; t + \Delta t) = \nu \Delta t \tag{12}$$

Therefore, the instantaneous firing rate is

$$\nu = \lim\_{\Delta t \to 0} \frac{P\_F(t; t + \Delta t)}{\Delta t} \tag{13}$$

The expected number of spikes for the temporal window *T* is

$$
\langle n^{sp} \rangle = \nu T \tag{14}
$$
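A homogeneous Poisson spike train can be approximated directly from Eq. (12): each bin of width *Δt* contains a spike with probability *νΔt*. The rate, window, and seed below are illustrative assumptions for a quick sketch.

```python
import random

def homogeneous_poisson(rate, t_window, dt=1e-3, seed=1):
    # Bernoulli approximation of Eq. (12): each bin of width dt
    # contains a spike with probability rate * dt.
    rng = random.Random(seed)
    n_bins = int(t_window / dt)
    return [i * dt for i in range(n_bins) if rng.random() < rate * dt]

spikes = homogeneous_poisson(rate=50.0, t_window=2.0)
empirical_rate = len(spikes) / 2.0  # n_sp / T, an estimate of nu (Eq. 14)
```

Counting the spikes and dividing by *T* recovers an empirical estimate of *ν*, exactly the procedure summarized in the text.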

To summarize, the experimental procedure of counting spikes over a time *T* and dividing by *T* gives an empirical estimate of the rate *ν* of the Poisson process. When recording an experiment over several trials, the spike response can be represented via a Peri-Stimulus-Time Histogram (PSTH) with bin width *Δt* as shown in **Figure 4**. The number of spikes *nk*(*t*; *t* + *Δt*) summed over all repetitions *K* of the experiment is a measure of the typical activity of the neuron between time *t* and *t* + *Δt*. Therefore, the spike density can be represented as shown in Eq. (15).

$$\rho = \frac{1}{\Delta t} \frac{n\_k(t; t + \Delta t)}{K} \tag{15}$$

**Figure 4.** *The Peri-stimulus-time histogram and the average time-dependent firing rate [23].*

A spike train *S*(*t*) is a sum of *δ* functions with a spike occurring at *ts*.

$$S(t) = \sum\_{s} \delta(t - t\_s) \tag{16}$$


*Brain-Inspired Spiking Neural Networks DOI: http://dx.doi.org/10.5772/intechopen.93435*


The instantaneous firing rate is the expectation over trials.

$$\nu(t) = \langle S(t) \rangle \tag{17}$$

An empirical estimate of the instantaneous firing rate can be deduced as shown in Eq. (18). It implies that the PSTH as described above represents the instantaneous firing rate.

$$\nu(t) = \frac{1}{K\Delta t} \sum\_{k=1}^{K} n\_k^{sp}(t) \tag{18}$$
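Eqs. (15) and (18) can be sketched as a small PSTH routine. The spike times, bin width, and millisecond units below are illustrative assumptions, not data from the chapter.

```python
def psth(trials_ms, t_end_ms, bin_ms):
    # Spike density rho = n_k(t; t + dt) / (K * dt): counts are summed
    # over all K repetitions (Eq. 15), estimating nu(t) (Eq. 18).
    n_bins = t_end_ms // bin_ms
    counts = [0] * n_bins
    for spike_times in trials_ms:
        for ts in spike_times:
            b = ts // bin_ms
            if b < n_bins:
                counts[b] += 1
    k = len(trials_ms)
    dt_seconds = bin_ms / 1000.0
    return [c / (k * dt_seconds) for c in counts]

# Three hypothetical repetitions of a 1 s recording, spike times in ms.
trials = [[50, 420, 430], [60, 440], [50, 410, 900]]
rates = psth(trials, t_end_ms=1000, bin_ms=100)  # Hz, one value per bin
```

Each returned value is the trial-averaged firing rate in that bin, i.e., the height of one PSTH bar.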

The average firing rate can be computed for a single neuron, or for a population of neurons representing a class, over a single run or over several trials. Rate coding over a time window is suitable for representing the strength of stimulation. On the other hand, population-based rate coding could convey the same information by employing several neurons in a shorter temporal window; the latter trades a larger number of neurons for a quicker response. There is evidence of Purkinje neurons demonstrating information coding not just in the firing rate but also in the timing and duration of nonfiring, quiescent periods [25, 26].

#### **3.2 Temporal coding**

If the time of spike occurrence in a temporal window carries information, then such coding is referred to as temporal coding. In such coding schemes the quiescent periods and the spiking times both carry information. There is considerable evidence in biology demonstrating this behavior [27, 28]. A typical temporal code is shown in **Figure 5A**, where the time interval from the start of the stimulus to the spike carries information. These are sometimes referred to as pulse codes. Another variation is Rank Order Coding, which uses the relative timing of spikes across a population of cells. Rank order codes look at the time to spike across the neuron population, and a rank order can be implied from the firing order among the neurons in the population as described in **Figure 5B**.

#### **Figure 5.**

*Different strategies for information coding with spikes (refer to [29] for details). (A) Time to first spike coding (B) rank order coding (C) latency coding (D) resonant burst coding (E) synchrony coding (F) phase coding.*
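A rank order code can be sketched as follows: only the ordering of first-spike times across the population is kept, the exact latencies are discarded. The neuron names and latencies are hypothetical.

```python
def rank_order(first_spike_ms):
    # The code is the neuron ordering by time-to-first-spike;
    # exact latencies are dropped, only the rank survives.
    return sorted(first_spike_ms, key=first_spike_ms.get)

# Hypothetical first-spike latencies (ms) for a four-neuron population.
latencies = {"n0": 12.0, "n1": 3.5, "n2": 7.1, "n3": 9.9}
order = rank_order(latencies)  # earliest spiker first
```

Two stimuli that produce different latencies but the same ordering map to the same code, which is what makes rank order coding robust to global latency shifts.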

There is evidence suggesting that simple temporal averaging of the firing rate is too simplistic to model neuronal circuits in the brain [30]. To address some of the shortcomings, several derived coding schemes based on different combinations of the above concepts are widely used. A few of the common schemes and some task-specific coding schemes are: Rate code, Time to spike code, Time-to-first-spike (latency) code [31], Reverse time to spike code, Weighted spike code [32], Burst code [33], Population code, Population rate, Rank order code [34], Phase-of-firing code [35, 36], Place code [37], etc. **Figure 5** summarizes a few coding strategies. These coding schemes require appropriate algorithms for converting real-world data to spikes and vice versa. A few common conversion techniques are discussed in the next section.

#### **3.3 Spike transduction**


SNNs understand the language of spikes; therefore, we must transform real-world data to an appropriate spike representation and subsequently transform the output spikes back to real-world formats for human consumption. There are several encoding and decoding algorithms available to achieve this goal, and several heuristics are also employed. Some of the coding techniques mentioned above imply a specific coding/decoding scheme. Based on the nature of the application (such as images, audio, video, financial data, or user activity data), one must choose the best approach.

Image pixel values are binned and proportional firing rates are assigned to different neurons in the receptive fields for each pixel neuron, hence generating a random process with rate coding [38]. Since spikes have no polarity, positive and negative spike subchannels can be used to represent richer encodings of data. In threshold-based schemes, a spike is generated when the input signal intensity crosses a threshold. Real numbers are compared against different thresholds, and positive and negative spikes are produced accordingly, which are rate coded [39]. The BSA algorithm for encoding and decoding [40] is used for modeling brain-machine interfaces and neurological processes in the brain. The work presented by the authors of [41] provides details on the step-forward (SF) and moving-window (MW) encoding schemes. In the SF scheme, a baseline intensity *B*(*t*) for the input signal is set, and a positive spike is generated if the intensity rises above the baseline by the threshold amount, *B*(*t* − 1) + *Th*; the baseline is then updated to this new value, *B*(*t*) = *B*(*t* − 1) + *Th*. Conversely, a negative spike is generated if the signal intensity falls below *B*(*t* − 1) − *Th*, and the new baseline is adjusted as *B*(*t*) = *B*(*t* − 1) − *Th*. The MW scheme is like the SF scheme except that the baseline is set based on the mean of the signal intensities. These schemes are suitable for encoding continuous-valued signals. The above examples are only a limited set of algorithms out of a vast number of methods to convert diverse signal formats to spikes.
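The SF scheme above can be sketched as a short function. The sample signal and threshold are illustrative assumptions; spikes are returned as +1, −1, or 0 per time step.

```python
def step_forward_encode(signal, th):
    # SF scheme: emit +1 when the signal exceeds baseline + th,
    # -1 when it drops below baseline - th; the baseline moves by th
    # on every spike, tracking the signal.
    baseline = signal[0]
    spikes = []
    for x in signal[1:]:
        if x > baseline + th:
            spikes.append(+1)
            baseline += th
        elif x < baseline - th:
            spikes.append(-1)
            baseline -= th
        else:
            spikes.append(0)
    return spikes

spikes = step_forward_encode([0.0, 0.6, 1.3, 1.1, 0.2], th=0.5)
```

Decoding is the mirror operation: starting from the initial baseline, add or subtract *Th* per spike to reconstruct an approximation of the signal.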

## **4. Learning principles for SNN**

Hebb postulated that the synaptic efficacy of a presynaptic neuron increases if it repeatedly assists the postsynaptic neuron in firing [42]. This forms the foundation of the STDP learning rule. STDP mimics biology, where a synapse is strengthened when a presynaptic spike occurs shortly before a postsynaptic spike; this is called Long-Term Potentiation (LTP). On the other hand, the synapse is weakened if the postsynaptic neuron fires shortly before the presynaptic neuron; this is called Long-Term Depression (LTD). In biology, neurons are highly selective due to lateral inhibition. This allows them to learn discriminatory and unique features in an unsupervised manner, leading to an emergent Winner Take All (WTA) behavior. Apart from this, the biological system demonstrates homeostasis to maintain overall stability. These are key principles in SNN modeling. There are several ways to achieve WTA and homeostasis behavior: some directly modify the neuron state, others use neural circuits. One such example with a scalable neural circuit [43] is shown in **Figure 6**. A WTA network consists of inhibitor neurons suppressing the activation of other lateral symbol neurons as shown in **Figure 6(a)**. To assist in homeostasis, a normalization of the excitations of one neural circuit compared to others can be achieved using a Normalized Winner Take All (NWTA) network as shown in **Figure 6(b)**, where an upper limit (UL) neuron uniformly inhibits all symbol neurons if they are firing beyond a desired high threshold. On the contrary, if the symbol neurons are firing below a desired low threshold, then the lower limit (LL) neuron triggers an excitor (Ex) neuron to uniformly boost the firing rate of all symbol neurons. In this manner, all independent neural circuits within an SNN fire in the dynamic range of excitations of the overall network. Both hard and soft WTA behavior can be achieved based on the amount of inhibition generated. In hard WTA only one symbol neuron is active, whereas in soft WTA more than one symbol neuron is active, providing richer context.
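The hard/soft WTA distinction can be sketched abstractly as follows. This is a hypothetical functional view, not the spiking circuit of Figure 6: the winner's excitation sets a uniform inhibition level, and the `inhibition` fraction is an assumed parameter.

```python
def wta(excitations, inhibition=0.5, hard=False):
    # Lateral-inhibition sketch: the most excited symbol neuron
    # suppresses the others. Hard WTA keeps only the winner; soft WTA
    # keeps every neuron whose excitation survives the winner-scaled
    # inhibition, providing richer context.
    winner = max(excitations)
    if hard:
        return [e if e == winner else 0.0 for e in excitations]
    return [e if e >= winner * inhibition else 0.0 for e in excitations]

soft = wta([0.2, 0.9, 0.6])             # 0.9 and 0.6 survive
hard = wta([0.2, 0.9, 0.6], hard=True)  # only 0.9 survives
```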


SNNs can learn in both unsupervised and supervised modes. WTA concepts are an essential part of unsupervised learning, as the neuron with the highest excitation inhibits the lateral neurons the strongest, enabling it to preferentially pick up unique features. Supervised learning is possible by employing a teacher signal which excites specific neurons to fire, thereby allowing them to learn the features represented by the input signal. STDP-based learning has the advantage of being able to model spatiotemporal dynamics, where the spatial component refers to localized activity/learning and the temporal component refers to the additional information represented by the spike intervals along the time axis. With the constant advances in SNN research, native STDP-based rules are catching up to the more popular backpropagation-based learning methods used in Artificial Neural Networks (ANNs). STDP lends itself to efficient localized and distributed learning, which is a huge advantage over other learning methods. SNNs can also be adapted to model memories in the form of Long Short-Term Memory networks [39], which shows that recurrent learning behavior is also possible. The following sub-sections discuss a few learning rules used in training SNNs, along with a brief introduction to backpropagation-based learning.

#### **4.1 Classic STDP rule**

A classic STDP rule [44] is shown in **Figure 7**. The STDP curve tries to approximate experimentally observed behavior.

**Figure 6.**

*(a) Winner take all network (b) normalized winner take all network [43].*


Here *ΔW* is the weight update plotted against *Δt* = *tpre* − *tpost*, representing the interval between the presynaptic and postsynaptic spikes. This approximation is represented in Eq. (19)

$$\Delta W = \begin{cases} a^+ \exp\left(\frac{t\_{pre} - t\_{post}}{\tau^+}\right) \text{if } t\_{pre} \le t\_{post} \ (LTP) \\\\ -a^- \exp\left(-\frac{t\_{pre} - t\_{post}}{\tau^-}\right) \text{if } t\_{pre} > t\_{post} \ (LTD) \end{cases} \tag{19}$$

where *a*<sup>+</sup> and *a*<sup>−</sup> are the learning rates and *τ*<sup>+</sup> and *τ*<sup>−</sup> are the time constants for LTP and LTD, respectively. There are several variations of the STDP curves available in the literature, and the reader is encouraged to explore this topic further.
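The STDP window of Eq. (19) can be sketched as a single function, using the conventional sign convention (LTP positive, LTD negative). The numeric parameter values are illustrative assumptions.

```python
import math

def stdp_dw(t_pre, t_post, a_plus=0.1, a_minus=0.12,
            tau_plus=20.0, tau_minus=20.0):
    # Eq. (19): LTP when the presynaptic spike precedes (or coincides
    # with) the postsynaptic spike, LTD otherwise.
    dt = t_pre - t_post
    if dt <= 0:
        return a_plus * math.exp(dt / tau_plus)    # LTP, positive update
    return -a_minus * math.exp(-dt / tau_minus)    # LTD, negative update

dw_ltp = stdp_dw(t_pre=10.0, t_post=15.0)  # pre before post -> dw > 0
dw_ltd = stdp_dw(t_pre=15.0, t_post=10.0)  # post before pre -> dw < 0
```

Sweeping `t_pre - t_post` over a range reproduces the classic two-branch exponential curve of Figure 7.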

#### **4.2 Simplified stable STDP rule with efficient hardware model**


There are two broad categorizations of STDP rules: additive and multiplicative STDP [38]. The multiplicative rule tends to be more stable than the additive rule. In additive rules the weight changes are independent of the current weight and require additional constraints to keep the values in operating bounds. These weight changes, however, produce a bimodal distribution resulting in strong competition. In the multiplicative rule presented in [38], the weight change is inversely proportional to the current weight, making it inherently stable and resulting in a unimodal distribution. This distribution lacks the synaptic competition which is desirable for learning discriminative features; for such rules, competition must be introduced by other means. The stable multiplicative rule is further explored below and simplified for efficient implementation. Here the STDP rule is modeled such that the weight change of a synapse has an exponential dependence on its current weight, as shown in **Figure 8(a)**. The update for the weight *wi* of the *i*th synapse of the neuron is calculated as below.

If


$$t\_{post} - t\_{pre} < \tau\_{LTP} \tag{20}$$

then,

$$
\Delta w\_i = \eta\_{LTP} e^{-w\_i}, \quad w\_i = w\_i + \Delta w\_i
$$

**Figure 7.** *Classic STDP curve [44].*

If

$$t\_{post} - t\_{pre} > \tau\_{LTP} \text{ or } t\_{pre} - t\_{post} < \tau\_{LTD}$$

then,

$$
\Delta w\_i = \eta\_{LTD} e^{w\_i}, \quad w\_i = w\_i - \Delta w\_i \tag{21}
$$
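The Exp rule updates in Eqs. (20) and (21) can be sketched as one step function. The window check below is simplified (a single LTP window, LTD otherwise) and the time constants and learning rates are illustrative assumptions.

```python
import math

def exp_rule_step(w, t_pre, t_post, tau_ltp=5, eta_ltp=0.08, eta_ltd=0.08):
    # Simplified window: potentiate when the postsynaptic spike follows
    # the presynaptic one within tau_LTP, otherwise depress.
    if 0 <= t_post - t_pre < tau_ltp:
        return w + eta_ltp * math.exp(-w)   # LTP: dw = eta_LTP * e^{-w}
    return w - eta_ltd * math.exp(w)        # LTD: dw = eta_LTD * e^{w}

w_up = exp_rule_step(0.5, t_pre=3, t_post=5)     # inside LTP window
w_down = exp_rule_step(0.5, t_pre=3, t_post=20)  # outside -> LTD
```

Note the stabilizing effect: the larger the current weight, the smaller the potentiation and the larger the depression, which is what yields the unimodal weight distribution discussed above.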


where *tpre* and *tpost* are the pre- and post-synaptic neuron spiking time steps, *τLTP* and *τLTD* are the LTP and LTD windows, and *ηLTP* and *ηLTD* are the LTP and LTD learning rates, respectively. Plasticity is implemented with LTP and LTD windows as shown in **Figure 8(b)**. This rule is called the Exp rule.

The Exp STDP rule requires an exponential and a multiplication operation for both LTP and LTD for each synapse. From the perspective of efficient digital hardware implementation, these are expensive operations in terms of circuit area and computation time. The quantized 2-power shift rule (Q2PS) approximates the Exp rule in Eq. (20) and Eq. (21) by removing both the multiplication and the exponential. The approximation is summarized in Eq. (22) and Eq. (23).

If

$$t\_{\rm post} - t\_{\rm pre} < \tau\_{\rm LTP}$$

$$\Delta w\_i = \eta\_{\rm LTP} 2^{-w\_i} = 2^{\eta'\_{\rm LTP} - w\_i} \tag{22}$$

If

$$t\_{\rm post} - t\_{\rm pre} > \tau\_{\rm LTP} \text{ or } t\_{\rm pre} - t\_{\rm post} < \tau\_{\rm LTD}$$

$$\Delta w\_i = \eta\_{\rm LTD} 2^{w\_i} = 2^{\eta'\_{\rm LTD} + w\_i} \tag{23}$$

where *η*′*LTP* = log<sub>2</sub> *ηLTP* and *η*′*LTD* = log<sub>2</sub> *ηLTD*. Let *Q* = *η*′*LTP* − *wi* for LTP and *Q* = *η*′*LTD* + *wi* for LTD. Let $\overline{Q}$ be the quantization of *Q* through priority encoding. Priority encoding compresses the binary representation of a number to a value with only the most significant bit set, as the rest of the active bits have no priority. For example, the binary representation of *Q* = 13 is 1101 and the priority encoded value is 1000, hence $\overline{Q}$ = 8. Based on this quantization method, the synaptic weight change can be easily computed by left shifting 1 by $\overline{Q}$, or right shifting if negative, as shown in Eq. (24).

$$
\Delta w\_i = \begin{cases}
1 \ll |\overline{Q}|, \text{if } \overline{Q} > 0 \\
1 \gg |\overline{Q}|, \text{if } \overline{Q} < 0
\end{cases}
\tag{24}
$$

**Figure 8.**

*(a) Current weight vs weight change for learning rates (b) STDP windows (c) Comparison of Exp, 2P and Q2PS STDP rules [38].*


where ≪ and ≫ represent binary shift left and shift right operations, respectively. This approximation allows implementation of the STDP rule presented in Eq. (20) and Eq. (21) on digital hardware using a priority encoder, a negligibly small lookup to determine $|\overline{Q}|$ from the encoded value, a barrel shifter, and an adder circuit. Please note that, based on Eq. (22) and Eq. (23), *Δwi* should be calculated as $2^{\overline{Q}}$, which can be obtained by shifting the value 1 by $\overline{Q}$. **Figure 8(c)** compares the *Δwi* calculated using the Exp, 2P and Q2PS rules, with a learning rate of 0.08 in all cases. Here the 2P rule is the same as the Q2PS rule except that 2 is raised to the power of *Q* directly, without quantization. As can be seen, the Q2PS rule provides multi-level quantization, which enables similar quality of trained weights, even with approximations, when compared to the Exp rule.
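The priority-encode-and-shift computation of Eq. (24) can be sketched as follows. Representing the right shift of 1 as a fractional value is an assumption standing in for a fixed-point hardware shift.

```python
def priority_encode(q):
    # Keep only the most significant set bit: 13 (0b1101) -> 8 (0b1000).
    return 0 if q == 0 else 1 << (q.bit_length() - 1)

def q2ps_dw(q):
    # Eq. (24): shift 1 left by the priority-encoded |Q| when Q > 0,
    # right when Q < 0 (shown here as the fractional value 2**-Q_bar,
    # standing in for a fixed-point right shift).
    q_bar = priority_encode(abs(q))
    return float(1 << q_bar) if q >= 0 else 1.0 / (1 << q_bar)

dw_pos = q2ps_dw(13)   # Q_bar = 8 -> 1 << 8
dw_neg = q2ps_dw(-13)  # Q_bar = 8 -> 2**-8
```

`bit_length() - 1` plays the role of the hardware priority encoder: it locates the most significant active bit, so the shift amount needs no multiplier or exponential unit.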

#### **4.3 Overview of learning in artificial neural networks**

With the tremendous advances in the field of ANNs, a growing body of research is available on various statistical learning algorithms. ANNs are inspired by biology but do not mimic it: they are made up of artificial neuron models specifically tuned for computation, which model a biological neuron at a very abstract level. An artificial neuron computes a weighted sum of its input signals, and an activation function then computes the neuron's output. In these networks, neurons transmit signals as real numbers. ANNs compute inference by propagating the neuron signals in the forward direction. Learning usually happens via a method called backpropagation. This algorithm computes gradients based on the error signal produced by a cost function and propagates them back through each layer of neurons in the network. The weight updates are usually made using gradient descent algorithms, of which many flavors are available in the literature. For backpropagation to work, the activation function must be differentiable; this is a challenge for SNNs, where a spike is not differentiable. In general, ANNs have proven to be very effective in tackling a wide variety of problems. Using these algorithms as inspiration, several modified STDP rules have been researched, one of which is discussed below. This overview is a very high-level introduction to the terminology required for the following section; the reader is encouraged to explore the topic further.
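The weighted sum, differentiable activation and gradient step described above can be illustrated for a single sigmoid neuron trained on squared error. This is a minimal sketch for intuition only; the function names, learning rate and values are illustrative.

```python
import math


def sigmoid(x):
    """Differentiable activation function, a requirement for backpropagation."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_forward(w, b, x):
    """Artificial neuron: weighted sum of inputs plus bias, then activation."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def backprop_step(w, b, x, target, lr=0.5):
    """One gradient-descent step on squared error E = (y - target)**2 / 2."""
    y = neuron_forward(w, b, x)
    # Chain rule: dE/dnet = dE/dy * dy/dnet, with sigmoid'(net) = y * (1 - y)
    delta = (y - target) * y * (1.0 - y)
    w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    b = b - lr * delta
    return w, b
```

Repeating `backprop_step` drives the neuron's real-valued output toward the target; a spiking neuron offers no such differentiable path, which is why the STDP-based alternatives below are needed.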

#### **4.4 Backpropagation-STDP**

The Backpropagation-STDP (BP-STDP) [45] algorithm uses the number of spikes in a spike train as an approximation of the real-valued activation of an artificial neuron. It also divides the time interval into sub-intervals such that each sub-interval contains zero or one spike.

In supervised training, the weight adjustment is governed by the STDP model shown in Eq. (25) and Eq. (26), in conjunction with a teacher signal. When the teacher signal is applied, target neurons undergo weight changes based on STDP, while non-target neurons undergo weight changes based on anti-STDP. Anti-STDP is the opposite of STDP, with the LTP and LTD equations swapped. Target neurons are identified by spike trains with the maximum spike frequency (*β*), while non-target neurons remain silent. The expected output spike trains, *z*, are tagged with their input labels. Eq. (25) represents the weight change for a desired spike pattern $z\_i(t)$ for the output layer neurons.

$$
\Delta w\_{ih}(t) = \mu \xi\_i(t) \sum\_{t'=t-\epsilon}^{t} s\_h(t') \tag{25}
$$

$$\xi\_i(t) = \begin{cases} 1, & z\_i(t) = 1,\; r\_i \neq 1 \text{ in } [t - \epsilon, t] \\ -1, & z\_i(t) = 0,\; r\_i = 1 \text{ in } [t - \epsilon, t] \\ 0, & \text{otherwise} \end{cases} \tag{26}$$

A target neuron should generate a spike, $z\_i(t) = 1$, and non-target neurons should remain silent, $z\_i(t) = 0$. Based on the expected output spike train, a target neuron should fire within the short STDP window $[t - \epsilon, t]$. The synaptic weights are increased proportionally to the presynaptic activity, which is usually zero or one spike within the STDP window. The presynaptic activity is the count of spikes in the $[t - \epsilon, t]$ interval, denoted $\sum\_{t'=t-\epsilon}^{t} s\_h(t')$. On the other hand, non-target neurons that fire undergo weight depression in the same way. The difference between the desired spike pattern and the output spike pattern serves as the guide for identifying target and non-target neurons, acting as the backpropagation rule. The same methodology is used for each layer while backpropagating. Among the several learning methods inspired by ANN algorithms, a few use strategies where the ANN is trained in its native form and tuned based on a shadow SNN, and the adapted weights are finally used on the SNN for inference.
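The per-timestep output-layer update of Eq. (25) and Eq. (26) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; spike trains are assumed to be 0/1 lists indexed by timestep, and all names are hypothetical.

```python
def bp_stdp_update_output(w, s_h, z, r, t, eps, mu):
    """One BP-STDP weight update for the output layer at time t.

    w[i][h]  : weight from hidden neuron h to output neuron i
    s_h[h]   : hidden-layer spike train (0/1 per timestep)
    z[i]     : desired output spike train, r[i]: actual output spike train
    eps      : STDP window length, mu: learning rate
    """
    window = range(t - eps, t + 1)
    # Presynaptic activity: spike count of each hidden neuron in [t-eps, t]
    pre = [sum(train[tp] for tp in window) for train in s_h]
    for i in range(len(w)):
        fired = any(r[i][tp] for tp in window)
        if z[i][t] == 1 and not fired:
            xi = 1    # target neuron missed its spike -> potentiate (STDP)
        elif z[i][t] == 0 and fired:
            xi = -1   # non-target neuron fired -> depress (anti-STDP)
        else:
            xi = 0    # desired and actual activity agree -> no change
        for h in range(len(pre)):
            w[i][h] += mu * xi * pre[h]
    return w
```

Each weight thus moves by the teacher-derived sign ξ scaled by the presynaptic spike count, mirroring the error term of backpropagation without requiring a differentiable activation.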

changes are monitored as traces for indirect communication by astrocytes. The astrocytes themselves behave like an environment, with calcium-ion concentration gradients within the cell acting as a medium through which other neuron agents indirectly infer these changes. This interaction creates a feedback mechanism in an asynchronous and distributed manner [47]. **Figure 9** shows the emergent stigmergy pattern in the brain. Short-term and long-term activity is communicated over a distance to other synapses across a spatial domain; the greater the distance, the lower the influence. The details of stigmergy-based brain plasticity are presented in [47], and interested readers are encouraged to explore further. This is a relatively new discovery, and extensive research is underway to understand the role of astrocytes in overall brain mechanics.

### **5. SNN simulation tools and hardware accelerators**

There are several spiking neural network simulation tools available which support biologically realistic neuron models for large-scale networks. Some of the popular ones are:

Brian [48] is a free, open-source simulator for spiking neural networks. It is capable of running on several different platforms and is implemented in Python, making it extendable and easy to use.

NEST [49] is another simulator, focusing on the dynamics, size and structure of neural systems both large and small. This tool is not intended for modeling the intricate biological details of a neuron.

NEURON [50] is a simulation environment best suited for modeling individual neurons and their networks. It is popular among neuroscientists for its ability to handle complex models in a computationally efficient manner. Unlike the simulators above, NEURON can handle the morphological details of a neuron and is used to validate theoretical models with experimental data.

The above tools are commonly used for modeling biologically realistic neuron models. They each have their own unique interfaces and low-level semantics. An effort has been made to smooth things out with a simulator-independent API package, developed in the Python programming language, called PyNN [51]. The PyNN framework provides API support to model SNNs at a high level of abstraction across all aspects of neuron modeling and SNN representation, including populations of neurons, connections, layers, etc. Although it provides this high-level abstraction, it also allows programming at a low level, such as adjusting individual parameters at the neuron and synapse level. To make things easy, PyNN provides a set of library implementations for neurons, synapses, STDP models, etc. It also provides convenient interfaces to model various connectivity patterns among neurons, such as all-to-all, small-world and random distance-dependent connectivity. These APIs are simulator independent, making the code portable across the supported simulation tools and neuromorphic hardware platforms, and it is relatively straightforward to add support for any custom simulation tool. PyNN officially supports the Brian, NEST and NEURON SNN simulation tools. It is also supported on the SpiNNaker [52] and BrainScaleS-2 [53] neuromorphic hardware systems. There are several more simulation tools which work with PyNN. Cypress [54] is a C++ based SNN simulation tool that provides a C++ wrapper around the PyNN APIs, hence extending its multi-platform reach through a C++ interface. It is also capable of executing networks remotely on neuromorphic compute platforms.

The BrainScaleS-2 [53] is a mixed-signal accelerated neuromorphic system with an analog neural core and digital connectivity, along with an embedded SIMD microprocessor. It is efficient for emulations of neurons, synapses, plasticity models, etc. This
