comparatively higher due to process and temperature variation, the overall VMIN increases for IoT SoCs in which the logic core and the SRAM share the same power rail. Note that splitting the logic and SRAM power rails, called dual-rail technology, incurs additional silicon area and power routing costs due to the additional DC-DC converters. Thus, lowering the SRAM VMIN is essential for wide-range ULP IoT applications, depending on the speed requirements of the applications. The authors in [20] show for the first time the use of three combined PAs (NBL + VDDB + WLB), which lowers the conventional 6T VMIN from 0.71 V (90%) to 0.47 V, as shown in **Figure 6(b)** using a measured cumulative distribution function. The work reports a total VMIN improvement of 240 mV in a commercial bulk 130 nm technology. This work shows more than 300X active power lowering using the triple CPA technique.

However, lowering the VMIN does not always reduce the SRAM energy consumption. Lowering the VMIN carries an energy penalty from the assists, so the total SRAM energy can increase at higher assist percentages, and a lower VMIN is not always desirable for low-energy applications. If the SRAM shares the same power rail with the logic core, as shown in **Figure 1(b)**, the energy savings from voltage scaling in the logic core could be much higher than the energy increase in the SRAM; only in this scenario might lowering the VMIN help.

Although write and read PAs usually reduce the VMIN guardband, they do not remove it entirely. Moreover, the design and application of PAs cost additional silicon area and energy for the SRAM and for the overall SoC sharing the same power rail with the SRAM. Additionally, with the design-for-the-worst-case approach, the nominal and best-case corner dies suffer from additional area and energy overhead. Thus, an important research question emerges: how to minimize this additional VMIN guardbanding of SRAMs across process and temperature variation? The answer lies in tracking the SRAM VMIN using in situ canary sensor SRAMs, which help to apply CPA differently for individual dies across process and temperature variation. The next section describes the canary SRAM techniques.

**5. Canary sensor SRAMs for VMIN tracking and guardband lowering**

The story of canary SRAMs ties to the story of the canary in a coalmine. Coal miners used to carry yellow canaries to detect dangerous gases such as methane. A moderate presence of such gases could be fatal to human beings. When a significant level of gas was present in the mine, the canaries became sick first. By observing the canaries, the miners gained enough time to evacuate the coalmine. From the standpoint of a circuit, a canary is a weak circuit that fails earlier than the main circuit. Canary circuits were first introduced as canary flip-flops [21]. Later, in 2007, the authors in [22] showed that canary techniques could reduce the data retention voltage (DRV) of an SRAM, thus saving a large amount of leakage power for ULP applications. In that work, canary SRAM bitcells are modified SRAM cells with an additional bias control knob that degrades the DRV of sets of canaries, so that they can be tuned to fail earlier than the population of core SRAM bitcells [22]. A bias generator circuit generates the bias voltage for the canaries in a row. A failure detector senses the canary retention failures in a closed loop. Thus, the canaries reach a failure point before the SRAM DRV in each fabricated die, which allows the SRAM retention voltage and leakage current to be lowered per die. Canary-based DRV tracking thereby avoids designing for the worst case.

136 Green Electronics

In 2014, the authors in [23] presented a theory for dynamic write VMIN tracking for the conventional 6T SRAM. This work introduces the term reverse assist (RA) as one of the canary design knobs. As discussed earlier, peripheral assists (PAs) improve the writeability or readability of SRAMs. In contrast, the RA (**Figure 7(a)**) degrades the writeability or readability of the canaries so that they fail earlier than the population of the core SRAM, as shown in **Figure 7(b)**. Thus, as the RA percentage increases, the canary write VMIN distribution shifts to the right, from distribution A to B to C. A user can tune the failure point of the canaries by selecting the proper reverse assist percentage or setting [23]. Another input design knob that helps to tune the canaries to a desired VMIN failure point is the failure threshold condition (Fth) [22], which defines the number of canary failures that corresponds to a threshold failure point.

The work also derives a mathematical formulation for dynamic write VMIN tracking, given as Eqs. (1) and (2) in [23], where the variables of the equations are also described. The two equations relate the input and output SRAM design knobs and metrics to the canary design knobs and metrics. The work explains how to calculate the output metric, the canary chip failure probability. The intended SRAM bit failure rate vs. VMIN data is calculated first. Then, from the canary failure rate vs. VMIN data, the corresponding canary bit failure rate *pf* is calculated. This serves as the input to Eq. (2) [23] for calculating the canary chip failure probability *Pfc*. The authors show that one can achieve the desired canary chip failure probability either by selecting a smaller number of canaries with a larger reverse assist voltage or vice versa, as shown in **Figure 8** [23]. The work further shows that, for a fixed reverse assist voltage, tracking the VMIN of a bigger SRAM requires more canaries. Moreover, with the same reverse assist voltage, increasing the SRAM yield requires more canary bitcells to track the corresponding SRAM's VMIN.
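The trade-off between the number of canaries and the reverse assist strength can be illustrated with a simple binomial model. This is a minimal sketch and not the formulation of Eqs. (1) and (2) in [23]: it assumes that canary bits fail independently with the same bit failure rate `pf`, and that the canary chip counts as failed when at least `fth` of the `n_canaries` bits fail. The function name and all numeric values are hypothetical.

```python
from math import comb


def canary_chip_failure_probability(n_canaries, pf, fth=1):
    """Probability that at least `fth` of `n_canaries` canary bits fail.

    Assumption (not taken from [23]): every canary bit fails independently
    with the same probability `pf` at the supply voltage of interest.
    """
    if not 0.0 <= pf <= 1.0:
        raise ValueError("pf must be a probability")
    if fth < 1 or fth > n_canaries:
        raise ValueError("fth must be between 1 and n_canaries")
    # Lower binomial tail: probability of fewer than fth failures.
    p_below = sum(
        comb(n_canaries, k) * pf**k * (1.0 - pf) ** (n_canaries - k)
        for k in range(fth)
    )
    return 1.0 - p_below


# Hypothetical numbers: a stronger reverse assist raises pf, so fewer
# canaries reach a similar chip-level failure probability.
few_strong = canary_chip_failure_probability(n_canaries=64, pf=0.05)
many_weak = canary_chip_failure_probability(n_canaries=640, pf=0.005)
print(f"64 canaries,  pf=0.05  -> {few_strong:.3f}")
print(f"640 canaries, pf=0.005 -> {many_weak:.3f}")
```

Under these assumptions, the two configurations give nearly the same chip-level failure probability, which mirrors the qualitative trend described for **Figure 8**.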

**Figure 7.** (a) SRAM write operation using bitline-type reverse assist and (b) write VMIN distributions with a reverse assist (A, B, and C are canary VMIN distributions) [23].

**Figure 8.** Canary chip failure probability vs. reverse assist voltage for 1 million SRAM bitcells with 95% yield @ TT\_85C [23].

The work [23] proposes the bitline-type peripheral reverse assist circuit shown in **Figure 9**. The peripheral RA circuit uses a configurable NMOS-NMOS voltage divider and passes the generated voltage through an analog demultiplexer to the BL or BLB line, as selected by data D and data-bar Dbar. The reverse assist voltage generation can be disabled for normal write mode by setting the AON signal to logic "0." The block diagram of the integrated canary SRAM architecture is shown in [23]; the canary SRAM is physically adjacent to the SRAM and shares its power rails. However, the bitlines are disconnected at the canary-SRAM boundary to allow independent write and read operations. Because the canary is an independent memory, it can operate simultaneously with the SRAM to track voltage droops on the shared SRAM-canary power rails, so that action can be taken if the canary SRAM fails. Such actions include stopping the SRAM operation, lowering the SRAM clock frequency and preventing further voltage scaling, or selecting an apt PA to lower the VMIN further.

The algorithm proposed in this work starts with an initial VRA voltage, then writes and reads the canary rows to check whether the written data are correct. If the canary write operation is successful, the VRA is increased; otherwise, it is lowered gradually toward the minimum VRA setting. Dynamic voltage and frequency scaling (DVFS) is allowed as long as the minimum VRA setting has not been reached; once it is reached, DVFS has to be stopped, as reaching the minimum VRA indicates that the SRAM VMIN is reached. The minimum VRA setting varies with the SRAM and canary input design knobs. The work also shows the area and power tradeoffs for the SRAM and canary design knobs: for a given increase in the number of canary bits, the normalized canary area and power overhead is amortized in a bigger SRAM and grows for smaller capacities.

The work [23] further reports that, owing to write VMIN tracking, the canaries can save a minimum of 31% in SS corner dies and a maximum of 51.5% in FS corner dies compared to the worst-case SF corner dies.
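The VRA adjustment described above can be sketched as a single update step. This is only one reading of the prose, not the algorithm as published in [23]: the step size, the VRA limits, and the names `VraState` and `vra_tracking_step` are hypothetical, and the canary write outcome is passed in rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VraState:
    vra_mv: int         # current reverse assist voltage setting, in mV
    dvfs_allowed: bool  # False once the minimum VRA setting is reached


def vra_tracking_step(state, canary_write_ok, step_mv=10,
                      vra_min_mv=0, vra_max_mv=200):
    """Apply the result of one canary write-then-read check to VRA.

    A successful canary write raises VRA; a failed write lowers it toward
    the minimum setting. DVFS stays allowed only while VRA is above that
    minimum. All limits are illustrative assumptions.
    """
    if canary_write_ok:
        new_vra = min(state.vra_mv + step_mv, vra_max_mv)
    else:
        new_vra = max(state.vra_mv - step_mv, vra_min_mv)
    return VraState(vra_mv=new_vra, dvfs_allowed=new_vra > vra_min_mv)


# Hypothetical sequence of canary write outcomes while VDD is scaled down.
state = VraState(vra_mv=30, dvfs_allowed=True)
for ok in (True, False, False, False, False):
    state = vra_tracking_step(state, ok)
print(state)
```

In this sketch, the repeated canary write failures drive VRA down to its minimum setting, at which point `dvfs_allowed` becomes false and further voltage scaling would be stopped.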


Ultra-Low-Power Embedded SRAM Design for Battery-Operated and Energy-Harvested IoT…

http://dx.doi.org/10.5772/intechopen.76765

139

**Figure 9.** (a) Canary SRAM reverse assist circuit. (b) Canary write driver. (c) Reverse assist waveforms [23].


The authors in [24] first show a working prototype of the canary SRAM in a commercial 130 nm technology that reveals the properties a canary SRAM needs in order to track the SRAM VMIN. The work further shows a proof of concept in which VMIN-tracking canaries fail before the SRAM starts to fail; the failure point is controllable post-fabrication using the canary design knobs (Fth and RAS). The architecture of the SRAM is shown in **Figure 10** and is similar to that of [23]. The testchip includes an 8 kb core SRAM, a 512 kb canary SRAM, a memory BIST (MBIST), a canary BIST (CBIST), and boundary scan chain blocks. The canary and core SRAMs use the same 6T bitcells; an external BLVRA voltage is applied as the reverse assist to the canary SRAM. Both the MBIST and CBIST architectures are similar to a traditional MBIST [25]; however, they are specialized to measure the number of bit failures in the core and canary SRAMs. This work characterizes some important properties of the canary SRAM that help to track the core SRAM write VMIN (WVMIN). The authors show that, using BLVRA and WLVRA reverse assists, the canary failure curves shift distinctly from each other across different voltages, frequencies, and temperatures (VFT). Without this distinct shifting of the failure curves, the canary SRAM would not work, as there would be no way to tell whether the operating conditions, such as VFT, have changed. As discussed earlier, this work shows the first silicon proof that canaries can be tuned to fail earlier than the core SRAMs.
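The bit-failure measurement performed by a BIST of this kind can be sketched in a few lines. This is an illustrative model only, not the CBIST of [24]: it assumes that the written and read-back words are available as integers, that a bit failure is any bit that differs, and that the canary is declared failed when the count reaches Fth. The function names and the 32-bit word width are hypothetical.

```python
def count_bit_failures(written_words, read_words, word_bits=32):
    """Count bits that differ between written and read-back words."""
    if len(written_words) != len(read_words):
        raise ValueError("word lists must have the same length")
    mask = (1 << word_bits) - 1
    return sum(
        bin((w ^ r) & mask).count("1")
        for w, r in zip(written_words, read_words)
    )


def canary_failed(written_words, read_words, fth, word_bits=32):
    """True when the canary failure count reaches the threshold Fth."""
    return count_bit_failures(written_words, read_words, word_bits) >= fth


# Hypothetical check: one bit of the second word did not get written.
pattern = [0xAAAAAAAA, 0x55555555]
read_back = [0xAAAAAAAA, 0x55555554]
print(count_bit_failures(pattern, read_back))
print(canary_failed(pattern, read_back, fth=1))
print(canary_failed(pattern, read_back, fth=2))
```

Raising `fth` makes the canary tolerate more failing bits before it flags a failure, which is one way to see how Fth acts as a post-fabrication tuning knob alongside the reverse assist setting.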

**Figure 10.** (a) Block diagram (not to scale) of the memory block and (b) block diagram (not to scale) of the canary SRAM column periphery (I/O) and BL-type reverse assist [24].

Building on the intuition presented in [23, 24], the authors in [20] show a closed-loop 256 kb self-tuning SRAM that automatically tracks the SRAM VMIN using canaries and applies apt write and read PAs to improve the VMIN based on the frequency needs of ULP IoT applications. This work shows a 67% extension of the operating voltage range, from 1.2 down to 0.38 V, deep into subthreshold supplies. Reverse assists are used to track the core SRAM VMIN using canaries, which allows closed-loop control of the system supply voltage at an intended operating frequency. The system uses combined write and read PAs (CPA) along with in situ canary sensor SRAM-based VMIN tracking to maximize the operating range of the SRAM into subthreshold supplies. This work meets the design needs of SRAMs for highly variable IoT applications while retaining the density of the conventional 6T bitcell. As battery-operated or energy-harvesting IoT devices have an operating range of 10 kHz to 10 MHz [26, 27], extending the 6T SRAM operating range to ULV supply voltages for low-power operation is a highly valuable feature. PAs can lower the SRAM VMIN; however, selecting the best CPA depends on the supply voltage, which influences the power-performance tradeoff.


**Figure 11.** System-level block diagram for the 256 kb 6T self-tuning SRAM subsystem showing subcomponents [20].


This work uses the write assists NBL and WLB along with the read stability assist VDDB to achieve a 90% VMIN of 0.47 V, which is lower than that of the other assist combinations and of the no-assist case (shown in the CDF plot in **Figure 6** [20]). However, CPA alone requires a VMIN guardband that ensures that all chips function across PVT variation, which limits the potential power savings. Canaries extend the power savings achieved using CPA by determining the VMIN at runtime, which allows the guardband to be reduced. The block diagram of the proposed system is shown in **Figure 11**. The SRAM testchip comprises a 256 kb SRAM with 2 kb of integrated canaries, a PA controller (ASC), a frequency-to-digital converter (FDC), an MBIST, and a CBIST. In this architecture, the canary sensors share the SRAM periphery, such as the write drivers, sense amplifiers, and pre-charge circuits. The RA circuit uses a programmable wordline slope-degrading control for the canaries.
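The notion of a 90% VMIN read from a measured CDF can be made concrete with a short sketch. The per-die VMIN values below are invented for illustration and are not the measurements of [20]; the function name `percentile_vmin` is likewise hypothetical.

```python
import math


def percentile_vmin(die_vmins, yield_fraction=0.90):
    """Lowest supply voltage at which `yield_fraction` of the dies work.

    A die works at supply V when V is at least its own VMIN, so the result
    is the k-th smallest measured VMIN with k = ceil(yield_fraction * n).
    """
    if not die_vmins:
        raise ValueError("need at least one measurement")
    if not 0.0 < yield_fraction <= 1.0:
        raise ValueError("yield_fraction must be in (0, 1]")
    ordered = sorted(die_vmins)
    # Small epsilon guards against floating-point round-up in the product.
    k = max(1, math.ceil(yield_fraction * len(ordered) - 1e-9))
    return ordered[k - 1]


# Hypothetical per-die VMIN measurements in volts.
die_vmins = [0.41, 0.43, 0.38, 0.45, 0.47, 0.40, 0.44, 0.42, 0.39, 0.52]
vmin_90 = percentile_vmin(die_vmins)
print(f"90% VMIN = {vmin_90:.2f} V")
```

Every die whose own VMIN lies below this value carries unused voltage margin when the supply is fixed at the 90% VMIN; that margin is what runtime canary tracking aims to recover.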

The work [20] employs a self-tuning algorithm that tracks the SRAM VMIN dynamically and adjusts both the supply voltage and the selection of PAs. The algorithm uses the FDC to measure the input clock frequency and convert it to a digitized output, and it initializes the off-chip low-dropout (LDO) regulator to an initial VDD given by a look-up table (LUT). Using the ASC, the algorithm chooses the required PAs based on the LUT, and the CBIST iterates the canary write and read operations across all canary addresses to determine whether the canary failure count (Fc) crosses the canary failure threshold condition (Fth). Based on the comparison, the CBIST generates a control signal for the ASC to increase or decrease the LDO supply voltage accordingly. The closed-loop self-tuning completes once the canary failure point is reached, which indicates that the SRAM VMIN is approaching. Once the canary VMIN is tuned to the SRAM VMIN using the Fth and RA settings, the worst-case SRAM bitcells are mapped onto the canaries, and the canary sensors track the properties of the worst-case SRAM bitcells across a range of voltage, frequency, and temperature (VFT) variations. The authors show measured tracking of the SRAM VMIN across VFT variation in **Figure 12** [20].

The canary sensors, the system components without the BISTs, and the CPA have reported overheads of 0.77, 1.8, and 2.8%, respectively. The work allows VDD scaling using CPA down to the guardbanded 90th-percentile worst-case VMIN of 0.47 V, which reduces active power by 337X. Moreover, enabling canary-based VMIN tracking provides a further 4.3X power savings by removing the VMIN guardband, for up to 1444X active power savings at 0.38 V [20]. The authors also show that using CPA and in situ canary-based tracking down to 0.38 V gives 12.4X leakage savings. Canary-based VMIN tracking is scalable to more advanced technologies such as 45 and 32 nm, and it shows promise for reducing the effect of process variation in FinFET SRAMs at the highly variable 7 nm and beyond technology nodes for a wide range of IoT applications.
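The self-tuning loop described above can be sketched as follows. This is a simplified model under stated assumptions, not the algorithm of [20]: the LUT contents, the 10 mV step, the voltage limits, and the function names are hypothetical, and the CBIST run is replaced by a callable that returns the canary failure count Fc for a given supply voltage and assist setting.

```python
def self_tune_vdd(freq_mhz, lut, canary_failures_at, fth,
                  step_mv=10, vdd_min_mv=300, vdd_max_mv=1200):
    """Return (vdd_mv, assist): the lowest visited VDD with Fc below Fth.

    lut: list of (max_freq_mhz, initial_vdd_mv, assist), sorted by frequency.
    canary_failures_at: callable(vdd_mv, assist) -> Fc, standing in for a
    CBIST pass over all canary addresses (hypothetical interface).
    """
    # The FDC output selects the LUT entry for the measured clock frequency.
    for max_freq, vdd_mv, assist in lut:
        if freq_mhz <= max_freq:
            break
    else:
        raise ValueError("frequency above the LUT range")

    # Canaries already fail at the initial VDD: raise the supply.
    while canary_failures_at(vdd_mv, assist) >= fth:
        vdd_mv += step_mv
        if vdd_mv > vdd_max_mv:
            raise RuntimeError("canaries fail across the whole VDD range")

    # Lower the supply until the next step would reach the failure point.
    while (vdd_mv - step_mv >= vdd_min_mv
           and canary_failures_at(vdd_mv - step_mv, assist) < fth):
        vdd_mv -= step_mv
    return vdd_mv, assist


def toy_canaries(vdd_mv, assist):
    """Hypothetical canary model: failures grow below a per-assist VMIN."""
    canary_vmin_mv = {"none": 700, "NBL+WLB+VDDB": 450}[assist]
    return 0 if vdd_mv >= canary_vmin_mv else (canary_vmin_mv - vdd_mv) // 5


# Hypothetical LUT: (max frequency in MHz, initial VDD in mV, assist setting).
lut = [(1, 600, "NBL+WLB+VDDB"), (10, 800, "NBL+WLB+VDDB"), (150, 1200, "none")]
print(self_tune_vdd(1, lut, toy_canaries, fth=2))
```

With the toy model, the loop settles just above the point where the canary failure count reaches Fth, so the supply stops before the modeled canaries fail. A real controller would also have to re-run the loop as temperature or frequency changes.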

**Figure 12.** Measured canary VMIN tracking across clock frequencies [1 or 10, 50, 100, and 150] MHz and temperatures (a) 27°C, (b) 85°C, and (c) −20°C, showing VMIN tuning range, and (d) the distribution of overall VMIN reduction using assist and tracking [20].

consumption in greener IoT applications, such as alternative bitcell topologies, a combination of peripheral assists, and in situ canary-based VMIN tracking for guardband lowering.

**7. Conclusions**

Technology scaling in FinFET devices at the 7 nm node and beyond is going to experience a higher degree of process variation, which could affect the design and production of the lowest-area 6T SRAM memory cells used in modern IoT systems on chip. Based on the latest published works, there are three key directions to address this issue. One direction is to use appropriate alternative bitcells for SRAMs, trading off core array area to enable ultra-low-energy and lower-leakage memory operation that sustains a longer battery life for portable home automation, wearable, and biomedical IoT applications. For low-cost systems on chip using 6T SRAMs that support low-power and mid- to high-speed applications, the use of appropriate combined peripheral assists is essential for low-VMIN operation. Although the combined assists lower the VMIN and improve the SRAM yield, they do not eliminate the costly VMIN guardbanding due to process and temperature variation. To remove or minimize this VMIN guardbanding, the in situ canary sensor SRAM shows great promise for VMIN tracking across voltage, frequency, and temperature variation. Combined peripheral assists along with canary sensor SRAMs show promise for improving the power consumption of IoT systems by more than 1000X, supporting a wide range of IoT applications in a single SoC. Hence, to support a wide range of greener IoT applications, SRAM designers need to choose appropriate design techniques, such as alternative bitcells, combined peripheral assists, and in situ canary sensor SRAMs, to enable technology scaling for SRAMs at the 7 nm node and beyond.

**Conflict of interest**

The author has no conflict of interest.

**Author details**

Arijit Banerjee

Address all correspondence to: ab9ca@virginia.edu

University of Virginia, Charlottesville, Virginia, USA

**References**

[1] Evans D. The Internet of Things, How the Next Evolution of the Internet Is Changing Everything. CISCO, IBSG. April 2011. http://www.cisco.com/c/dam/en\_us/about/ac79/docs/innov/IoT\_IBSG\_0411FINAL.pdf
