**4. Write and read peripheral assist techniques for low-VMIN applications**

In moderate- to high-speed IoT applications (100 MHz–1 GHz), an alternative bitcell may not be the choice of the SRAM designer due to its high timing and area penalties. Thus, the lowest-area 6T bitcell remains a popular choice for mid- to high-speed IoT applications. However, the 6T VMIN is heavily guardbanded against process and temperature variation, so lowering VMIN requires write and read peripheral assist (PA) techniques [17]. Moreover, alternative bitcells in ULV applications also require PAs to achieve correct write and read functionality across process variation. We define PAs as a class of circuit techniques used in the SRAM periphery that improve the write-ability, readability, and read stability of SRAM bitcells. Typically, a PA technique boosts or lowers the wordline or bitline control voltages of the SRAM to make the write or read operation successful, as shown in **Figure 6(a)**. A PA can also decrease the SRAM cycle time by shortening the write or read operations. PAs are transient in nature and can be classified into write-ability, readability, and read-stability PAs. For the conventional 6T SRAM bitcell, the control signals are mainly the wordline and bitline; thus, examples of write-ability PAs are wordline boosting (WLB) [17] and negative bitline (NBL) [17]. Although the VDD and ground (VSS) signals are usually static, they can also serve as control signals for the SRAM write operation, so VDD lowering and VSS raising [17] are write-assist techniques as well. On the other hand, to improve readability or shorten the differential development time, one can apply a small percentage of WLB (larger percentages could induce read-stability issues in half-selected bitcells in column-mux scenarios) or a negative VSS (NVSS). Applying a suppressed wordline during write improves the read stability of half-selected bitcells, which then have better RSNM; however, it degrades the write-ability of the selected bitcells.
Additionally, column-wise boosting the VDD or making the VSS negative during the write operation in half-selected bitcells also improves the read stability. Note that increasing the assist percentage usually further enhances the write-ability, readability, and read stability, but only up to a limit called the assist line contour [17], which is set by the VMAX of the process technology. The list of possible PAs for write-ability, readability, and read stability can be found in [17]. PAs can affect the VMIN and yield of SRAMs differently in different technologies; thus, PAs must be re-evaluated for each newly scaled technology, as past technology trends may not hold in newer ones.

Ultra-Low-Power Embedded SRAM Design for Battery-Operated and Energy-Harvested IoT… http://dx.doi.org/10.5772/intechopen.76765 135

**Figure 6.** (a) Example of write assists using wordline boost and negative bitline techniques and (b) measured CDF of 256 kb SRAM VMIN showing 90th percentile VMIN improvement of 240 mV using combined assists [VDD boosting (VDDB), WL boosting (WLB), negative bitline (NBL)] [20].
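These voltage-level knobs and the VMAX-limited assist line contour can be sketched in a toy calculation. The model below is purely illustrative; the supply, VMAX, and assist percentages are hypothetical values, not figures from [17].

```python
# Hypothetical sketch of peripheral-assist (PA) voltage levels for a 6T SRAM.
# All numbers (VDD, VMAX, assist percentages) are illustrative, not from [17].

VDD = 0.6    # nominal supply (V), illustrative
VMAX = 0.8   # process reliability limit (V), illustrative

def wordline_boost(vdd, pct):
    """Wordline boosting (WLB): raise the WL above VDD by pct% of VDD."""
    return vdd * (1.0 + pct / 100.0)

def negative_bitline(vdd, pct):
    """Negative bitline (NBL): drive the 'write 0' bitline below VSS by pct% of VDD."""
    return -vdd * pct / 100.0

def within_assist_contour(boosted_level, vmax):
    """Assist line contour check: boosted levels must not exceed VMAX."""
    return boosted_level <= vmax

wl = wordline_boost(VDD, 20)     # 20% WLB -> 0.72 V
bl = negative_bitline(VDD, 10)   # 10% NBL -> -0.06 V
print(wl, bl, within_assist_contour(wl, VMAX))
```

A 30%+ boost at this hypothetical VDD would cross VMAX and land outside the assist line contour, which is exactly the limit the text describes.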

**Figure 5.** Minimum energy point (MEP) vs. (a) word width and (b) size of SRAM [16].

134 Green Electronics

More than a decade ago, when bulk CMOS technology scaling at 65 nm and below was facing higher process variation, single write or read PAs showed enormous promise in improving the VMIN and yield of 6T SRAMs. However, with the introduction of scaled 28 nm technology, the process variation was so high that the HD 6T bitcell was not write-able in all process corners, especially the worst case. Post-28 nm bulk, FinFETs became a device fabrication option, and the trend of write-ability issues in the HD 6T bitcell persisted due to large process variation. Thus, from 28 nm onward, applying a single write or read assist may no longer lower the SRAM VMIN across process variation. The authors in [18] show that using dual write and read PAs reduces the VMIN and improves the yield. Moreover, the authors in [19] discuss appropriate combinations of PAs (CPAs) that can lower the VMIN further for FinFETs at near-/subthreshold supplies, such as combining a negative bitline with VDD boosting. One could employ different CPAs based on the VMIN-lowering needs of the application. Because write-ability and read stability are the more important metrics in FinFET SRAM design, and they often contradict the use of certain assists, such as wordline boosting for write improvement, the SRAM designer must select the CPA carefully. A widely used CPA for FinFETs nowadays is the VDD-underdrive with wordline-underdrive scheme [18].
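The careful-selection step can be pictured as a small rule check over a proposed assist combination. The helper below is a hypothetical sketch (the rule set and the 10% WLB threshold are invented for illustration, not taken from [18] or [19]); it only encodes the conflicts discussed in the text.

```python
# Illustrative bookkeeping for combined peripheral assists (CPAs).
# The conflict rules (strong WLB harming half-selected read stability in
# column-muxed arrays; boosting vs. lowering the same rail) follow the
# text; the data structure and thresholds are a hypothetical sketch.

KNOWN_ASSISTS = {"WLB", "NBL", "VDD_lowering", "VSS_raising", "VDDB"}

def check_cpa(assists, column_muxed=True, wlb_pct=0):
    """Return a list of warnings for a proposed assist combination."""
    warnings = []
    unknown = set(assists) - KNOWN_ASSISTS
    if unknown:
        warnings.append(f"unknown assists: {sorted(unknown)}")
    if "WLB" in assists and column_muxed and wlb_pct > 10:
        warnings.append("large WLB may upset half-selected bitcells")
    if {"VDDB", "VDD_lowering"} <= set(assists):
        warnings.append("VDD boosting and VDD lowering conflict")
    return warnings

print(check_cpa({"NBL", "VDDB"}))             # []
print(check_cpa({"WLB", "NBL"}, wlb_pct=20))  # flags the WLB risk
```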

Moreover, with technology scaling, the metal width and pitch scale down. This creates electromigration, IR drop, and crosstalk challenges, which can restrict the use of a specific assist or limit the size of an SRAM bank. With the explosion of IoT application needs, ULP SoCs are expected to run ultra-low-energy as well as high-speed applications from time to time, so voltage scaling down to near-subthreshold or deep-subthreshold supplies is now a necessity for SoCs. Logic VMIN scales down easily to lower VDDs, but SRAM VMIN is comparatively higher due to process and temperature variation; hence, the overall VMIN of an IoT SoC whose logic core and SRAM share the same power rail increases. Note that splitting the logic and SRAM power rails, called dual-rail technology, incurs additional silicon area and power-routing costs due to the additional DC-DC converters. Thus, lowering SRAM VMIN is essential for wide-range ULP IoT applications, depending on their speed requirements. The authors in [20] show for the first time the use of three combined PAs (NBL + VDDB + WLB), which lowered the conventional 6T 90th-percentile VMIN from 0.71 V to 0.47 V, as shown in the measured cumulative distribution function of **Figure 6(b)**. The work reports a total VMIN improvement of 240 mV in a commercial bulk 130 nm technology and more than 300X active power lowering using the triple CPA technique.
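The percentile arithmetic behind such a CDF plot can be sketched with a normal approximation. The means and sigma below are hypothetical, chosen only so that the 90th-percentile values reproduce the reported 0.71 V and 0.47 V; they are not the measured distribution parameters from [20].

```python
# Toy normal-approximation of the measured VMIN CDFs in Figure 6(b).
# mu/sigma are hypothetical, chosen only so the 90th-percentile values
# match the reported 0.71 V (no assist) and 0.47 V (NBL + VDDB + WLB).
from statistics import NormalDist

z90 = NormalDist().inv_cdf(0.90)  # standard-normal 90th-percentile z-score

no_assist = NormalDist(mu=0.71 - z90 * 0.030, sigma=0.030)
triple_cpa = NormalDist(mu=0.47 - z90 * 0.030, sigma=0.030)

vmin_90_before = no_assist.inv_cdf(0.90)   # 0.71 V
vmin_90_after = triple_cpa.inv_cdf(0.90)   # 0.47 V
improvement_mv = (vmin_90_before - vmin_90_after) * 1e3
print(round(improvement_mv))               # 240 (mV)
```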

earlier than the population of the core SRAM bitcells [22]. A bias generator circuit generates the bias voltage for the canaries in a row, and a failure detector senses the canary retention failures in a closed loop. Thus, the canaries reach their failure point before the SRAM DRV in each fabricated die, which lowers the SRAM DRV and the leakage current. This technique avoids the design for the worst case by using canary-based DRV tracking.

In 2014, the authors in [23] demonstrated a theory of dynamic write VMIN tracking for the conventional 6T SRAM. This work introduces the term reverse assist (RA) as one of the canary design knobs. As discussed earlier, peripheral assists (PAs) improve the write-ability or readability of SRAMs; on the contrary, an RA (**Figure 7(a)**) degrades the write-ability or readability of the canaries so that they fail earlier than the population of the core SRAM, as shown in **Figure 7(b)**. Thus, as the RA percentage increases, the canary write-VMIN distribution shifts to the right, from distribution A to B to C. A user can tune the failure point of the canaries by selecting the proper reverse assist percentage or setting [23]. Another input design knob that helps tune the canaries to a desired VMIN failure point is the failure threshold condition (Fth) [22], which defines the number of canary failures that corresponds to a threshold failure point. The work also derives a mathematical formulation for dynamic write VMIN tracking, shown in Eqs. (1) and (2) [23]; the meanings of the variables are described in [23]. These two equations relate the input and output SRAM design knobs and metrics to the canary design knobs and metrics, and the work explains how to calculate the output metric, the canary chip failure probability. First, the intended SRAM bit failure rate vs. VMIN data is calculated. Then, from the canary failure rate vs. VMIN data, the corresponding canary bit failure rate *pf* is calculated; this serves as the input data to Eq. (2) [23] for the calculation of the output metric, the canary chip failure probability *Pfc*. The authors show that one can achieve the desired canary chip failure probability either by selecting a smaller number of canaries with a larger reverse assist voltage strength or vice versa, as shown in **Figure 8** [23]. The work further shows that, for a fixed reverse assist voltage, tracking the VMIN of a bigger SRAM requires more canaries; moreover, at the same reverse assist voltage, increasing the SRAM yield requires more canary bitcells to track the corresponding SRAM's VMIN, and so on.

**Figure 7.** (a) SRAM write operation using a bitline-type reverse assist and (b) write VMIN distributions with a reverse assist (A, B, and C are canary VMIN distributions) [23].
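The chip-level failure condition built from *pf* and Fth can be illustrated with a simple binomial model, assuming independent canary failures. The symbols *pf*, Fth, and *Pfc* follow the text, but this is a hedged sketch: the closed form and all numeric values are illustrative, not the actual formulation or data of [23].

```python
# Hypothetical binomial sketch of canary chip failure probability.
# Assumes N independent canary bitcells, each failing with probability pf
# at a given VDD; the chip flags failure when at least Fth canaries fail.
# N, pf, and Fth values are illustrative, not taken from [23].
from math import comb

def canary_chip_failure_prob(n_canaries, pf, fth):
    """Pfc = P(number of failing canaries >= Fth), X ~ Binomial(N, pf)."""
    return sum(comb(n_canaries, k) * pf**k * (1 - pf)**(n_canaries - k)
               for k in range(fth, n_canaries + 1))

# Fewer canaries with a stronger reverse assist (higher pf) can reach a
# Pfc comparable to more canaries with a weaker reverse assist (lower pf):
p_small_strong = canary_chip_failure_prob(16, 0.30, 4)
p_large_weak = canary_chip_failure_prob(64, 0.10, 4)
print(p_small_strong, p_large_weak)
```

Raising the reverse assist strength raises *pf* and therefore *Pfc* at a given VDD, which is the knob-versus-count trade-off the text attributes to **Figure 8**.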

However, lowering the VMIN does not always reduce the SRAM energy consumption. Lowering VMIN incurs an energy penalty due to the use of assists, so the overall SRAM energy can actually increase in some cases; at certain higher assist percentages, the total SRAM energy grows, and a lower VMIN is not always a requirement for low-energy applications. However, if the SRAM shares the same power rail with the logic core, as shown in **Figure 1(b)**, the energy savings from voltage scaling in the logic core can far exceed the energy increase in the SRAM, and only in this scenario does aggressive VMIN lowering help.
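This trade-off reduces to first-order arithmetic: dynamic energy scales roughly as V². The numbers below (per-cycle energies, scaled voltage, 30% assist overhead) are hypothetical and only illustrate the shared-rail reasoning, not measured data.

```python
# First-order sketch of the shared-rail energy trade-off. Dynamic energy
# scales roughly as V^2; all energy/voltage/overhead numbers here are
# hypothetical and only illustrate the reasoning in the text.

def dynamic_energy(e_nominal, v, v_nominal):
    """Scale a nominal switching energy by (V / Vnom)^2."""
    return e_nominal * (v / v_nominal) ** 2

V_NOM = 1.0                      # nominal shared rail (V), illustrative
E_LOGIC, E_SRAM = 100.0, 20.0    # per-cycle energies at V_NOM (a.u.)

# Scale the shared rail to 0.6 V; assume the assists add a 30% energy
# overhead to the SRAM (hypothetical figure).
v_scaled, assist_overhead = 0.6, 1.30

e_before = E_LOGIC + E_SRAM
e_logic = dynamic_energy(E_LOGIC, v_scaled, V_NOM)
e_sram = dynamic_energy(E_SRAM, v_scaled, V_NOM) * assist_overhead
e_after = e_logic + e_sram

print(e_before, round(e_after, 2))  # logic savings dominate the SRAM assist overhead
```

With these illustrative numbers the SoC total still drops sharply despite the SRAM's assist overhead; with a split (dual-rail) supply, only the SRAM term would change, which is why the shared-rail case is where VMIN lowering pays off.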

Although write and read PAs usually reduce the VMIN guardband, they do not remove it entirely. Moreover, designing and applying PAs trades off additional silicon area and energy, both for the SRAM and for an SoC sharing the same power rail with it. Additionally, with the design-for-the-worst-case approach, the nominal and best-case corner dies suffer unnecessary area and energy overhead. Thus, an important research question emerges: how can this additional VMIN guardbanding of SRAMs be minimized across process and temperature variation? The answer lies in tracking the SRAM VMIN using in situ canary sensor SRAMs, which allows CPAs to be applied differently to individual dies across process and temperature variation. The next section describes canary SRAM techniques.
