**2. Resistive RAM memory technologies and compact modeling**

RRAM is considered a promising nonvolatile memory technology for next-generation embedded systems applications, thanks to its low fabrication cost, small feature size, back-end-of-line compatibility, and performance. Specifically, RRAMs typically provide fast switching speed (i.e., < 10 ns [13]), low programming energy (i.e., < 1 pJ [13]), long retention (i.e., > 10 years [13]), high endurance (i.e., 10<sup>6</sup>–10<sup>12</sup> cycles [13]), and a relatively large memory window (i.e., > 10<sup>2</sup> [13]) when used as binary memory elements. These devices can be electrically switched between a nonvolatile low resistive state (LRS) and a nonvolatile high resistive state (HRS), which can be used to encode a logic 1 and a logic 0, respectively. Depending on the materials used for their fabrication, different switching mechanisms can occur [14, 15]. In this chapter, bipolar metal-oxide-based RRAM cells that exhibit filamentary switching are considered, but the methods described also apply to other resistive switching memory technologies.

*Study of RRAM-Based Binarized Neural Networks Inference Accelerators Using an RRAM… DOI: http://dx.doi.org/10.5772/intechopen.110340*

A bipolar metal-oxide-based RRAM is programmed into the LRS (set operation) by biasing it with a sufficiently large positive voltage, which causes bond breakage and the drift of oxygen ions in the metal oxide layer toward the top electrode (TE). This results in the formation of an oxygen-deficient conductive filament (CF) that supports ohmic-like drift conduction [14, 16]. During a device set, a current compliance (IC) must be provided to prevent dielectric breakdown. The device is reset into the HRS by biasing it with a sufficiently large negative voltage (i.e., VRESET), which induces the drift of oxygen ions and their recombination with oxygen vacancies in the dielectric, partially dissolving the CF. The current conduction in HRS is associated with trap-assisted tunneling in the dielectric [14, 16]. Due to the physical mechanisms behind their operation, these devices present intrinsic nonidealities. Specifically, cycle-to-cycle (C2C) variations of the CF morphology and the dielectric composition during programming [17] result in stochastically distributed resistive states, which can affect the available memory window and the circuit reliability. Also, random telegraph noise (RTN), which is caused by the effect of interstitial oxygen ions and vacancies [11, 18], is typically observed in low-voltage reads.
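The set/reset behavior described above can be illustrated with a deliberately simplified behavioral sketch. It is not the compact model used in this chapter: switching is abrupt, and variability, RTN, and self-heating are ignored. The class name `ToyBipolarRRAM`, the thresholds, and the resistance values are hypothetical, chosen only for illustration; only the 100 μA compliance echoes a value quoted later in the text.

```python
from dataclasses import dataclass


@dataclass
class ToyBipolarRRAM:
    """Threshold-based toy cell. All numeric defaults are illustrative assumptions."""
    v_set: float = 1.0      # assumed set threshold (V)
    v_reset: float = -1.2   # assumed reset threshold (V)
    r_lrs: float = 10e3     # assumed LRS resistance (ohm)
    r_hrs: float = 1e6      # assumed HRS resistance (ohm)
    i_c: float = 100e-6     # current compliance (A)
    state: str = "HRS"

    def apply(self, voltage: float) -> float:
        """Apply a voltage, update the state, and return the current (A)."""
        if voltage >= self.v_set:
            self.state = "LRS"   # set: the filament forms
        elif voltage <= self.v_reset:
            self.state = "HRS"   # reset: the filament partially dissolves
        resistance = self.r_lrs if self.state == "LRS" else self.r_hrs
        current = voltage / resistance
        if voltage > 0:
            # The compliance limits the current in the set polarity only.
            current = min(current, self.i_c)
        return current

    def read_bit(self, v_read: float = 0.1) -> int:
        """Low-voltage read: LRS encodes logic 1, HRS encodes logic 0."""
        current = self.apply(v_read)
        # Decision threshold placed at the geometric mean of the two resistances.
        i_threshold = v_read / (self.r_lrs * self.r_hrs) ** 0.5
        return 1 if current > i_threshold else 0


cell = ToyBipolarRRAM()
print("initial bit:", cell.read_bit())
print("set current (A):", cell.apply(1.5))
print("bit after set:", cell.read_bit())
cell.apply(-1.5)
print("bit after reset:", cell.read_bit())
```

The read voltage is kept below both thresholds so that reading does not disturb the stored state, which mirrors the low-voltage reads mentioned in the text.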

#### **2.1 The UNIMORE RRAM physics-based compact model**

When designing novel RRAM-based circuits, the nonideal effects and complex switching mechanism of RRAM devices result in nontrivial design constraints that require appropriate analysis tools. Thus, compact models of RRAM devices have been developed to enable circuit simulations [19]. Different compact models adopt different approximations, and two main categories can be identified, i.e., general-purpose and physics-based compact models. The former typically adopt simpler equations that enable the simulation of larger circuits, however at the expense of lower accuracy when the device is simulated outside the operating range considered for parameter calibration, and of the lack of a clear mapping between model parameters and technology-related parameters. Conversely, in physics-based compact models, parameters and equations are linked to the device physics, which enables reproducing the device characteristics in multiple operating conditions [11] and simplifies the parameter extraction [20]. Thus, as a first step when studying the performance and reliability tradeoffs of novel RRAM-based circuits, physics-based compact models should be employed [21].

For the analysis and use-case examples reported in this chapter, the UNIMORE RRAM physics-based compact model available in [10, 11] is used. Other physics-based compact models could also be used to perform the analysis [21–23]; however, clear parameter extraction procedures are currently not available for them [20]. The UNIMORE RRAM compact model is implemented in Verilog-A, reproduces the effects of self-heating, C2C variability, and multilevel RTN, and is complemented by an automated parameter extraction procedure [20]. The RRAM device approximated by the compact model is shown in **Figure 1a** and **b** for a device in LRS and HRS, respectively. The compact model considers a CF and a dielectric barrier whose thickness *x* is modulated by a system of differential equations, which take into account thermal effects, the field-driven oxygen ion drift and recombination during the device reset, and the field-accelerated bond breaking during the device set [10, 11]. Internally, the compact model includes two submodules that reproduce the effects of C2C variability and RTN, as shown in **Figure 1b**. A detailed description of the modeling of C2C variability and RTN is available in [11, 18]. The compact model was calibrated in [11] on four bipolar metal-oxide-based RRAM technologies. In the rest of the chapter, data and simulation results are reported for the model calibrated on the *TiN/HfOx/AlOx/Pt* RRAM technology from [12], which has an IC of 100 μA. The calibrated compact model reproduces well the experimental data in different operating conditions (**Figure 1c** and **d**), the experimental probability distributions of the resistive states (**Figure 2a**), and multilevel RTN signals (**Figure 2b** and **c**). Simulations are performed with Cadence Virtuoso®.

#### **Figure 1.**

*Representation of an RRAM device as approximated in the compact model for a device in (a) LRS and (b) HRS, respectively. The defects causing RTN are also shown. (b) Functional block diagram of the compact model. The internal models for the device nonidealities are shown. Experimental and simulated (c) quasi-static IV and (d) pulsed reset characteristics of a TiN/HfOx/AlOx/Pt device. Data from [12].*

#### **Figure 2.**

*(a) Experimental and simulated cycle-to-cycle variability at different reset voltages. (b) and (c) Simulated RTN at different read voltages for a device in (b) LRS and (c) HRS.*
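To make the link between C2C variability and the usable memory window concrete, the following is a minimal Monte Carlo sketch. It does not reproduce the variability model of the UNIMORE compact model: it simply assumes, for illustration only, that the LRS and HRS resistances are lognormally distributed across cycles. The medians, spreads, and function names are hypothetical.

```python
import math
import random


def sample_resistances(median_ohm, sigma_log, n_cycles, rng):
    """Draw one resistance per programming cycle (assumed lognormal spread)."""
    mu = math.log(median_ohm)
    return [rng.lognormvariate(mu, sigma_log) for _ in range(n_cycles)]


def worst_case_window(lrs_samples, hrs_samples):
    """Ratio between the lowest HRS and the highest LRS resistance observed."""
    return min(hrs_samples) / max(lrs_samples)


rng = random.Random(0)  # fixed seed so that the run is repeatable

# Hypothetical medians and spreads, not taken from the calibrated model.
lrs = sample_resistances(median_ohm=10e3, sigma_log=0.1, n_cycles=1000, rng=rng)
hrs = sample_resistances(median_ohm=1e6, sigma_log=0.5, n_cycles=1000, rng=rng)

nominal_window = 1e6 / 10e3
print("nominal window:", nominal_window)
print("worst-case window over 1000 cycles:", worst_case_window(lrs, hrs))
```

Under these assumptions, the worst-case window is smaller than the ratio of the median resistances, which is why the distribution tails, rather than the typical values, constrain the circuit reliability.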

#### **2.2 Using compact models to study RRAM-based IMC architectures**

As discussed in the following sections, IMC accelerators can be quite complex and composed of a large number of RRAM devices and components. As the size of the accelerators increases, circuit simulation can quickly require computing resources that are often not available. A possible approach is to study the problem at different abstraction layers. Specifically, the three-step approach shown in **Figure 3** can be followed. The first step of this methodology requires the electrical characterization of novel emerging nonvolatile memory devices and the application of parameter extraction strategies to calibrate compact models on the specific technology. In the second step, the compact model is used to perform circuit simulations of smaller portions of the entire architecture. Specifically, the functionality of the core operations of the in-memory computing framework of interest is first simulated using the compact model without including nonideal effects. The results of these simulations indicate valid circuit operating points and the existing design tradeoffs. Parametric simulations including the device nonideal effects are then used to identify operating points that satisfy the desired reliability vs. performance tradeoffs, and to estimate the performance in the worst-case (WC in **Figure 3**) corners of the circuit. Finally, in the third step, the results from the previous analysis are used to estimate the performance of a more complex design based on the core operations studied with the compact model. Specifically, the worst-case performance estimates for the core operations are mapped to each operation executed at the architectural level. Although this step introduces several approximations, the use of the worst-case performance for each operation results in an underestimation of the energy efficiency, thereby providing sufficient headroom to account, to first order, for additional energy overheads that may exist in the complete architecture.

#### **Figure 3.**

*Flow-chart of the methodology enabled by the RRAM physics-based compact model that can be used to design reliable RRAM-based in-memory computing architectures and to estimate their performance.*
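The mapping of worst-case core-operation estimates to the architectural level can be sketched as follows. This is only an illustration of the bookkeeping, under the assumption that the architecture-level energy is the sum over operation types of the operation count times the worst-case energy per operation. The operation names, corner energies, and counts are hypothetical placeholders, not results from this chapter.

```python
def worst_case_energy(corner_energies_j):
    """For each core operation, keep the largest energy across simulated corners."""
    return {op: max(values) for op, values in corner_energies_j.items()}


def architecture_energy(op_counts, wc_energy_j):
    """Sum of (operation count) x (worst-case energy per operation), in joules."""
    return sum(count * wc_energy_j[op] for op, count in op_counts.items())


# Hypothetical per-corner energies (J) from circuit simulations of two core operations.
corner_energies_j = {
    "op_a": [1.0e-12, 1.4e-12, 1.2e-12],
    "op_b": [0.5e-12, 0.8e-12],
}

# Hypothetical number of times each core operation runs at the architectural level.
op_counts = {"op_a": 1000, "op_b": 250}

wc_energy_j = worst_case_energy(corner_energies_j)
total_energy_j = architecture_energy(op_counts, wc_energy_j)
print("worst-case energy per operation (J):", wc_energy_j)
print("estimated architecture energy (J):", total_energy_j)
```

Because every operation is charged its worst-case energy, the total is an upper bound on the energy under this simple model, which corresponds to the conservative efficiency estimate described above.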

A use-case example of this methodology is discussed in the following sections of this chapter. In Section 3, the compact model is used to study the performance and reliability of the core operations of a LIM architecture, while in Section 5 the results of the analysis enabled by the compact model are used to estimate the performance, at a higher abstraction level, of an inference accelerator based on BNNs.
