**4. Techniques adopted to reduce power consumption in SRAM‐based FPGA**

### **4.1. Leakage power reduction**

An important way to control leakage current is to switch off transistors that are not in use. One approach is the dual‐threshold‐voltage FPGA routing design [24–26]. In this technique, one subset of the multiplexer transistors uses a high threshold voltage and the rest use a low threshold voltage. The high threshold voltage suppresses leakage current effectively at the cost of performance degradation, and the technique increases complexity at the router level. Alternatively, the body‐bias effect can be used to raise the threshold voltage of a multiplexer transistor that is not part of the selected path [27]; this method increases fabrication complexity and cost. Leakage current can also be controlled by applying a negative bias voltage to the gate of an OFF multiplexer transistor, which drastically reduces the subthreshold current at the cost of additional hardware [28].

The stack effect is another effective method of reducing leakage current in any circuit [29–31]. It arises when two series‐connected transistors in the same path are both OFF, so that together they present a highly resistive path to current flow. To exploit this in FPGA design, researchers [32, 33] have introduced extra configuration SRAM cells (redundant cells) that allow multiple OFF transistors on an unselected path. With the redundant cells, the unselected path contains two OFF transistors, which limits the subthreshold current along that path.

Calhoun et al. [34] proposed creating fine‐grained "sleep regions" to control leakage current. With this technique, unused LUTs and flip‐flops can be put into sleep mode independently. Gayasen et al. [35] proposed a coarse‐grained sleep strategy in which the FPGA is partitioned into regions of logic blocks, so that each region can be put into sleep mode independently whenever it is not used.
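The difference between the two granularities can be illustrated with a minimal sketch. It is not the method of [34] or [35]; the occupancy list, the `region_size` parameter, and the function name are hypothetical, and the rule that a coarse region may sleep only when none of its blocks is used is an assumption made for illustration.

```python
from typing import List, Tuple


def sleepable_blocks(used: List[bool], region_size: int) -> Tuple[int, int]:
    """Count blocks that can sleep under fine- and coarse-grained control.

    used[i] is True when logic block i is occupied by the mapped design.
    region_size is a hypothetical number of blocks sharing one sleep control.
    """
    # Fine-grained: every unused block has its own sleep control.
    fine = sum(1 for u in used if not u)

    # Coarse-grained (assumed rule): a region sleeps only if all its blocks are unused.
    coarse = 0
    for start in range(0, len(used), region_size):
        region = used[start:start + region_size]
        if not any(region):
            coarse += len(region)
    return fine, coarse


# Hypothetical occupancy of eight blocks, grouped into two regions of four.
occupancy = [True, False, False, False, False, False, False, False]
fine, coarse = sleepable_blocks(occupancy, region_size=4)
print(f"fine-grained: {fine} blocks asleep, coarse-grained: {coarse} blocks asleep")
```

Under these assumptions, a single used block keeps its whole region awake in the coarse‐grained case, which is why placement that clusters used blocks matters for a coarse‐grained strategy.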

Several methods have been proposed to reduce leakage (static) power consumption in FPGA designs at the architectural level [36–39]. Tran et al. [40] proposed a low‐power FPGA architecture based on a fine‐grained *V*dd control scheme called micro‐*V*dd‐hopping. Four CLBs are grouped into one block that shares a *V*dd. In the micro‐*V*dd‐hopping scheme, the *V*dd of each block is switched between a high and a low level to save power without sacrificing performance. The design introduces a level shifter and incorporates a zigzag power‐gating scheme to control the sneak leakage path problem. The authors observed experimentally that dynamic power can be reduced by 86% when the required speed is half of the highest speed. Simulating the proposed design in a 90 nm technology, they observed a 95% static power saving at the cost of a 2% area overhead. In the zigzag power‐gating scheme, the wake‐up time is shorter than in other gating techniques because the inverters and 2‐input NAND gates are always held between *V*dd and *V*ss during standby mode. Since they have an off‐off stacking structure, leakage current is suppressed by an order of magnitude even if the overdrive voltage is zero.
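As a rough consistency check on the 86% figure, the sketch below assumes the first‐order model in which dynamic power is proportional to C·*V*dd²·f. This model and the function names are assumptions made here, not taken from [40]. Under that model, the quoted saving at half speed implies a particular ratio between the low and high supply levels.

```python
import math


def dynamic_power_ratio(freq_ratio: float, vdd_ratio: float) -> float:
    """Relative dynamic power, assuming P is proportional to C * Vdd^2 * f."""
    return freq_ratio * vdd_ratio ** 2


def vdd_ratio_for_saving(freq_ratio: float, saving: float) -> float:
    """Supply ratio VddL/VddH that gives the stated fractional saving."""
    return math.sqrt((1.0 - saving) / freq_ratio)


# Figures quoted in the text: half speed, 86% dynamic power reduction.
ratio = vdd_ratio_for_saving(freq_ratio=0.5, saving=0.86)
print(f"implied VddL/VddH = {ratio:.2f}")
```

The result is only as good as the first‐order model; measured savings also depend on level‐shifter overhead and short‐circuit power, which the sketch ignores.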

Srinivasan et al. [41] proposed a technique to reduce the leakage current of the interconnect fabric. Every multiplexer is placed in its least‐leakage state by setting its undriven inputs to the desired values, using a circuit‐level modification of the routing multiplexer. The main advantage of this technique is that it has a negligible impact on design performance and a small area penalty.

Hasan et al. [42] reduced the leakage current in the multiplexer‐based interconnect matrix by controlling the inputs of unused FPGA routing multiplexers. Their simulation results for different sizes and topologies of routing multiplexers show that the minimum‐leakage vector at the 22 nm node differs significantly from that at the 65 nm node because of higher gate leakage current and output‐stage loading effects. The proposed technique reduces static power significantly without imposing any area overhead, because most of the routing multiplexers in an FPGA are unused.

Hoo et al. [43] proposed a directional coarse‐grained power‐gated FPGA switch box and a power‐gating‐aware routing algorithm to address leakage current in FPGAs. After considering the trade‐offs among different power‐gating (PG) designs, the authors presented (1) a novel directional coarse‐grained power‐gated FPGA switch box and (2) a power‐aware routing algorithm that takes advantage of the new PG architecture. In the proposed architecture, the multiple buffers in each direction of the switch box are power gated independently of the buffers in the other directions. Because the switch box has a homogeneous structure, proper sizing of the sleep transistors is not an issue. To maximize the leakage reduction of the coarse‐grained PG architecture, they also adapted the routing algorithm, proposing a new cost function for the VPR routing algorithm to support the new routing architecture.

### **4.2. Dynamic power reduction**

224 Field - Programmable Gate Array

Dynamic power is consumed during normal operation whenever circuit nodes toggle. As Eq. (1) shows, it depends on the operating frequency, the load capacitance, and the square of the supply voltage. The total dynamic power consumed by a device is the sum of the dynamic power of each resource. Because the FPGA is programmable, its dynamic power is design dependent. The main contributors to dynamic power are the effective parasitic capacitance of the resources, resource utilization, and the switching activity of the resources [44]. The effective capacitance comes from the parasitic capacitance of the interconnect wires and transistors. Dynamic power can be reduced by addressing each of the parameters in Eq. (1). Various methods have been proposed to reduce dynamic power consumption [37, 45–47]. Commonly adopted methods include clocking schemes, reducing the toggling activity of the logic, and reducing RAM and I/O power.
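The per‐resource summation described above can be sketched as follows. The sketch assumes the usual first‐order form, activity × capacitance × *V*dd² × frequency, for each resource. The `Resource` fields and the numeric values are hypothetical and chosen only to make the example run.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Resource:
    name: str
    capacitance_f: float  # effective switched capacitance per instance (farads)
    activity: float       # average toggles per clock cycle
    utilization: int      # number of instances used by the design


def dynamic_power(resources: Iterable[Resource], vdd: float, freq_hz: float) -> float:
    """Sum activity * C * Vdd^2 * f over all used resources (first-order model)."""
    return sum(
        r.utilization * r.activity * r.capacitance_f * vdd ** 2 * freq_hz
        for r in resources
    )


# Hypothetical design: values are illustrative, not measured.
design = [
    Resource("lut", capacitance_f=10e-15, activity=0.1, utilization=1000),
    Resource("wire", capacitance_f=50e-15, activity=0.2, utilization=2000),
]
p_nominal = dynamic_power(design, vdd=1.2, freq_hz=100e6)
p_half_vdd = dynamic_power(design, vdd=0.6, freq_hz=100e6)
print(f"nominal: {p_nominal * 1e3:.3f} mW, at half Vdd: {p_half_vdd * 1e3:.3f} mW")
```

Because the supply voltage enters quadratically, halving it in this model cuts dynamic power to a quarter, whereas halving frequency or activity only halves it.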

Since faster‐switching logic consumes more dynamic power than slower‐switching logic, the clock should be partitioned so that a fast clock is assigned only to those portions of the logic that require it, while a slow clock is assigned to the portions that can run at a lower speed. In this way, the switching activity of the various logic portions can be controlled to reduce overall dynamic power [9, 10, 15].

Dynamic voltage scaling is another power‐saving design technique because supply voltage significantly impacts power efficiency. The power supply scaling technique can be utilized in the design of power‐efficient FPGA by considering devices like tunnel‐FET, FinFET, etc. [48–51] because these devices can operate at ultra‐low voltage.

Dual‐ or multi‐*V*dd techniques [52–54] are another important means of saving dynamic power. In a dual‐*V*dd scheme, circuits that are not delay critical are connected to a low supply voltage, whereas delay‐critical circuits are powered by a high supply voltage. This concept has also been applied to FPGA design [55–57]. In a heterogeneous architecture, some logic blocks are fixed to operate at the high supply voltage and others (not limited by speed) are fixed to operate at the low supply voltage. This heterogeneous scheme yields only a small power saving because of the rigidity of the fixed fabric and the loss associated with the mandatory use of low *V*dd in certain cases. Moreover, the dual‐*V*dd technique cannot be applied to the interconnect wires, which are the main source of power consumption. To overcome this problem, Li et al. [58] proposed a *V*dd‐programmability technique to reduce the power consumption of the interconnect. They selectively applied low *V*dd to interconnect circuits such as routing and connection switches. The *V*dd selection for different applications is obtained by applying the programmable dual‐*V*dd technique to both logic blocks and interconnect. On average, they observed a total power reduction of 50–55%.
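The basic dual‐*V*dd idea, low supply for blocks with spare timing and high supply for delay‐critical ones, can be sketched as a simple slack test. This is an illustrative simplification, not the assignment algorithm of [55–58]: it treats each block's slack independently, whereas real slack is shared along timing paths, and the `delay_penalty` factor and block data are hypothetical.

```python
from typing import Dict, Tuple


def assign_vdd(blocks: Dict[str, Tuple[float, float]],
               delay_penalty: float) -> Dict[str, str]:
    """Assign 'VddL' to blocks whose slack absorbs the low-Vdd slowdown.

    blocks maps a block name to (delay_ns, slack_ns).
    delay_penalty is the assumed fractional delay increase at low Vdd.
    """
    assignment = {}
    for name, (delay_ns, slack_ns) in blocks.items():
        extra_delay = delay_ns * delay_penalty
        # Simplification: each block's slack is considered in isolation.
        assignment[name] = "VddL" if slack_ns >= extra_delay else "VddH"
    return assignment


# Hypothetical blocks: (delay in ns, timing slack in ns).
blocks = {"a": (2.0, 0.0), "b": (1.0, 0.5), "c": (3.0, 0.5)}
print(assign_vdd(blocks, delay_penalty=0.3))
```

A production flow would update path slacks after each assignment, since moving one block to low *V*dd consumes slack that other blocks on the same path could otherwise have used.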

Although voltage scaling is the most effective way to reduce power consumption in an FPGA array, it sacrifices circuit performance. To improve the power efficiency of FPGAs without sacrificing performance, Li et al. [59] explored the use of different supply voltage (*V*dd) levels. According to the authors, a predefined dual‐*V*dd FPGA fabric cannot, in general, achieve a better power‐performance trade‐off than *V*dd scaling, because the predefined fabric is not flexible enough for a variety of applications. To address this issue, they introduced field programmability of the *V*dd level by proposing three types of logic blocks: H‐block, L‐block, and P‐block, as shown in **Figure 1**. The H‐block and L‐block are connected to the supply voltages VDDH and VDDL, respectively. The H‐block provides higher speed because of its high supply voltage, whereas the L‐block has reduced power consumption at the cost of increased delay. The P‐block is implemented by inserting PMOS transistors (called power switches) between the power supply rails and the logic block. Configuration bits control these switches so that an appropriate supply voltage can be chosen for the P‐block. To avoid short‐circuit current, they introduced a level converter between VDDH and VDDL.

**Figure 1.** Logic blocks in dual‐*V*dd and *V*dd‐programmable FPGAs [59].

Selective power‐down is another method of saving power in FPGAs. This technique, known as power gating, shuts down the power supply of portions of a chip that are idle for long periods, which reduces static power considerably. It can be achieved by implementing a multi‐supply strategy in which the power grid of some blocks is decoupled from that of others to allow selective shutdown. Sleep modes within the FPGA architecture can also be used to selectively reduce the supply voltage of blocks that are not in use [60, 61].

Interconnect power dominates dynamic power in FPGAs [62–64] because of the interconnect structure, which consists of prefabricated wire segments, each attached to both used and unused switches. Wire lengths in FPGAs are generally longer than in ASICs because of the larger area consumed by SRAM cells and circuitry. The large power consumption of the interconnect makes it a high‐priority target for power optimization. Anderson et al. [65] presented a novel FPGA routing switch design that reduces both leakage and dynamic power consumption. The switch can be programmed to operate in any one of three modes: high‐speed, low‐power, or sleep. In high‐speed mode, the power and performance characteristics are similar to those of current FPGA routing switches. Low‐power mode offers reduced leakage and dynamic power at the cost of degraded performance. Sleep mode, which is suitable for unused switches, reduces static power drastically. The design rests on three key observations, which hold for the majority of Xilinx Spartan‐3 commercial FPGAs and are specific to FPGA interconnect: (1) routing switch inputs are tolerant of "weak‐1" signals; (2) typical FPGA designs have enough timing slack to allow a considerable fraction of routing switches to be slowed down without affecting overall design performance; and (3) most routing switches simply feed other routing switches. Based on these observations, the authors proposed the new switch design shown in **Figure 2**. The switch includes a parallel combination of NMOS and PMOS sleep transistors, whose states select the operating mode. In high‐speed mode, the PMOS is turned ON, which results in a full rail‐to‐rail output swing. The gate terminal of the NMOS is held at *V*dd in high‐speed mode; during a 0–1 logic transition, the virtual *V*dd may temporarily drop below *V*dd − VTH, causing the NMOS to leave cut‐off and assist with charging the switch's output load. In low‐power mode, the PMOS is turned OFF and the NMOS is turned ON, so the buffer is powered by the reduced virtual supply voltage, VVD ≈ *V*dd − VTH.
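How a CAD flow might choose among the three modes can be sketched with a simple rule: unused switches sleep, used switches with enough timing slack run in low‐power mode, and the rest stay in high‐speed mode. The rule, the `slowdown_ns` parameter, and the function name are assumptions for illustration and are not taken from [65].

```python
from enum import Enum


class Mode(Enum):
    HIGH_SPEED = "high-speed"
    LOW_POWER = "low-power"
    SLEEP = "sleep"


def select_mode(used: bool, slack_ns: float, slowdown_ns: float) -> Mode:
    """Pick a routing-switch mode from usage and timing slack (assumed rule).

    slowdown_ns is a hypothetical extra delay incurred in low-power mode.
    """
    if not used:
        return Mode.SLEEP       # unused switches can be put to sleep
    if slack_ns >= slowdown_ns:
        return Mode.LOW_POWER   # slack absorbs the slower switch
    return Mode.HIGH_SPEED      # timing-critical switch


# Hypothetical switches: (used, slack in ns).
switches = [(False, 0.0), (True, 0.4), (True, 0.05)]
modes = [select_mode(used, slack, slowdown_ns=0.1) for used, slack in switches]
print([m.value for m in modes])
```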

Clock gating is an effective and widely used method of reducing dynamic power. It is based on the principle that only the active portions of the system should be connected to the clock tree; inactive portions should not be clocked. Additional logic must be included in the design to select which portions are clocked and which are blocked. This reduces switching activity and therefore saves dynamic power. Clock gating can be applied at the chip level as well as at the design level. The technique has been used successfully in ASICs, but it is less effective in SRAM‐based FPGAs because a large component of FPGA power consumption is due to the switching activity of the clock signals along the routing switches. For this reason, researchers have investigated modifying the way a circuit is mapped onto the FPGA array by acting on the synthesis, technology mapping, or placement and routing algorithms [66, 67]. Since the clock is distributed through the global FPGA routing network, the placement of clock loads has a considerable impact on clock wire usage. Clock loads should therefore be placed so as to minimize clock capacitance, which lowers dynamic power consumption.
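The principle behind clock gating can be illustrated by counting how many clock cycles actually reach a block when its clock is gated by an enable signal. The sketch is a behavioural model only: it ignores the power of the gating logic and of the clock network upstream of the gate, and the enable trace is hypothetical.

```python
from typing import Sequence


def clocked_cycles(enable: Sequence[bool], gated: bool) -> int:
    """Number of clock cycles delivered to a block over the trace."""
    if not gated:
        return len(enable)                    # free-running clock
    return sum(1 for en in enable if en)      # clock passes only when enabled


def clock_activity_saving(enable: Sequence[bool]) -> float:
    """Fraction of clock cycles removed at the block by gating."""
    total = clocked_cycles(enable, gated=False)
    if total == 0:
        return 0.0
    return 1.0 - clocked_cycles(enable, gated=True) / total


# Hypothetical enable trace: the block is active in 2 of 4 cycles.
trace = [True, False, False, True]
print(f"clock activity removed at the block: {clock_activity_saving(trace):.0%}")
```

The ungated portion of the clock network is exactly what the sketch leaves out, which is consistent with the observation above that clock gating alone is less effective in FPGAs, where clock routing accounts for much of the clock power.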

**Figure 2.** Proposed new programmable low‐power FPGA routing switch [65].

Placement and routing (P&R) also affects dynamic power consumption because it determines the total parasitic capacitance of the design. To minimize the parasitic capacitance, the P&R strategy must be optimized. It is advisable to place two connected functional instances close together, because this reduces the interconnect wire length, which in turn reduces the capacitive loading of the net and thus the dynamic power. Modern FPGA development software typically supports power‐driven layout to accomplish this task automatically. Power‐driven layout tools examine the connections between functional instances for optimization [68–70]. Power‐analysis tools, which examine each subcomponent in a design hierarchy to highlight its power consumption, can then be used to optimize power further. Careful examination of this information and subsequent manipulation of the design can result in significant power savings.
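The link between placement and net capacitance can be sketched with the half‐perimeter wirelength (HPWL) estimate, a common placement metric that is used here as a stand‐in and is not taken from [68–70]. The capacitance‐per‐unit‐length value and the pin coordinates are hypothetical.

```python
from typing import List, Tuple

Pin = Tuple[int, int]  # (x, y) grid location of a pin


def hpwl(pins: List[Pin]) -> int:
    """Half-perimeter wirelength: width plus height of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def net_capacitance(pins: List[Pin], cap_per_unit_f: float) -> float:
    """Estimated wire capacitance, assuming it scales linearly with HPWL."""
    return hpwl(pins) * cap_per_unit_f


# The same two-pin net under two hypothetical placements.
far_apart = [(0, 0), (10, 8)]
close_by = [(0, 0), (1, 1)]
cap_per_unit = 2e-15  # farads per grid unit (illustrative value)
print(f"far: {net_capacitance(far_apart, cap_per_unit):.2e} F, "
      f"close: {net_capacitance(close_by, cap_per_unit):.2e} F")
```

In this model, moving the two instances next to each other shrinks the estimated wire capacitance in direct proportion to the wirelength, which is the effect a power‐driven placer tries to exploit, usually weighting high‐activity nets more heavily.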

Reducing the I/O supply voltage can save up to 80% of the dynamic power. The switching activity of the I/O can be controlled by techniques such as time multiplexing, design partitioning for minimum I/O count [71–73], and reducing I/O drive strength and slew rates. A considerable amount of dynamic power can also be saved by adopting differential and resistively terminated I/O standards for the highest toggling frequencies and single‐ended I/O standards for low toggling frequencies.

Tsang et al. [74] studied the effectiveness of precomputation in reducing dynamic power consumption in commercial off‐the‐shelf (COTS) FPGAs. Precomputation is a high‐level logic optimization technique that lowers the power consumption of a design by disabling part of the circuit based on a few relatively simple precomputation conditions. With careful design considerations and increased logic utilization, power consumption can be reduced by disabling a much larger part of the design with a negligible increase in resource overhead.
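A textbook illustration of precomputation is an unsigned comparator: when the most significant bits of the two operands differ, the result is already known and the logic for the remaining bits can be disabled. The sketch below models that behaviour in software. It is not the design evaluated in [74], and the function name and bit width are chosen only for illustration.

```python
from typing import Tuple


def compare_with_precomputation(a: int, b: int, width: int) -> Tuple[bool, bool]:
    """Return (a > b, lower_logic_enabled) for unsigned width-bit operands."""
    msb_a = (a >> (width - 1)) & 1
    msb_b = (b >> (width - 1)) & 1
    if msb_a != msb_b:
        # Precomputation condition met: the MSBs alone decide the result,
        # so the lower-bit comparator would stay disabled.
        return msb_a == 1, False
    mask = (1 << (width - 1)) - 1
    return (a & mask) > (b & mask), True


# Exhaustive check for 4-bit operands: count how often the lower logic runs.
width = 4
enabled = 0
for a in range(1 << width):
    for b in range(1 << width):
        result, lower_enabled = compare_with_precomputation(a, b, width)
        assert result == (a > b)
        enabled += lower_enabled
print(f"lower comparator enabled in {enabled} of {(1 << width) ** 2} cases")
```

For uniformly distributed inputs the lower comparator is needed in only half of the cases; actual savings depend on the input statistics and on the cost of the precomputation logic itself.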

In the literature, several techniques/methods are presented in detail to address the issue of dynamic power consumption in FPGA [10, 75–77].
