## Enabling Energy-Efficient Nonvolatile Computing With Negative Capacitance FET

Xueqing Li, *Member, IEEE*, John Sampson, *Member, IEEE*, Asif Khan, *Member, IEEE*, Kaisheng Ma, Sumitha George, Ahmedullah Aziz, *Student Member, IEEE*, Sumeet Kumar Gupta, *Member, IEEE*, Sayeef Salahuddin, *Senior Member, IEEE*, Meng-Fan Chang, *Senior Member, IEEE*, Suman Datta, *Fellow, IEEE*, and Vijaykrishnan Narayanan, *Fellow, IEEE* 

Abstract-Negative capacitance FETs (NCFETs) have attracted significant interest due to their steep-switching capability at a low voltage and the associated benefits for implementing energy-efficient Boolean logic. While most existing works aim to avoid the $I_D - V_G$ hysteresis in NCFETs, this paper exploits this hysteresis feature for logic-memory synergy and presents a custom-designed nonvolatile NCFET D flip-flop (DFF) that maintains its state during power outages. This paper also presents an NCFET fabricated for this purpose, showing steep hysteresis edges of <10 mV/decade and a high R<sub>DS</sub> ratio, up to seven orders of magnitude, between the two polarization states. With a device-circuit codesign that takes advantage of the embedded nonvolatility and the high R<sub>DS</sub> ratio, the proposed DFF consumes negligible static current in backup and restore operations, and remains robust even with significant global and local ferroelectric material variations across a wide 0.3–0.8 V supply voltage range. Therefore, the proposed DFF achieves energy-efficient and low-latency backup and restore operations. Furthermore, it has an ultralow energy-delay overhead, below 2.1% in normal operations, and operates using the same voltage supply as the Boolean logic elements with which it connects. This promises energy-efficient nonvolatile computing in energy-harvesting and power-gating applications.

*Index Terms*-Ferroelectric FET, hysteresis, negative capacitance FET (NCFET), negative capacitance, nonvolatility, nonvolatile computing, nonvolatile D flip-flop (DFF).

Manuscript received January 18, 2017; revised April 21, 2017; accepted June 9, 2017. Date of publication June 27, 2017; date of current version July 21, 2017. This work was supported in part by GRC under Grant 2657.001, in part by the Center for Low Energy Systems Technology, one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. The work of K. Ma was supported by NSF ASSIST. The review of this paper was arranged by Editor H. Shang. (*Corresponding authors: X. Li; A. Khan.*)

X. Li, J. Sampson, K. Ma, S. George, A. Aziz, S. K. Gupta, and V. Narayanan are with Penn State University, University Park, PA 16802 USA (e-mail: lixueq@cse.psu.edu; sampson@cse.psu.edu; kxm505@cse.psu.edu; sug241@cse.psu.edu; afa5191@psu.edu; skg157@engr.psu.edu; vijay@cse.psu.edu).

A. Khan is with the Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: asif.khan@ece.gatech.edu).

S. Salahuddin is with the University of California, Berkeley, CA 94720 USA (e-mail: sayeef@eecs.berkeley.edu).

M. F. Chang is with the Department of Electrical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan (e-mail: mfchang@mx.nthu.edu.tw).

S. Datta is with the Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556 USA (e-mail: sdatta@nd.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TED.2017.2716338

#### I. INTRODUCTION

Nonvolatile computing has been an effective emerging solution to prevent computation progress loss due to either an unexpected or a scheduled power outage. This is achieved by backing up memory and D flip-flop (DFF) states to on-chip nonvolatile memory (NVM) elements [1]–[6]. This technique is particularly useful for energy-harvesting Internet-of-things (IoT) applications where frequent check-pointing is required under the notoriously intermittent supply provided by energy-harvesting mechanisms [2]–[4]. Similarly, *in situ* backup also provides more energy savings in power-gating applications [1].

Several embedded NVM options, such as phase-change memory (PCM), spin-transfer torque magnetic random access memory (STT-MRAM), and resistive random access memory (ReRAM), have been proposed as nonvolatile replacements for existing volatile embedded memories [7]. However, even with significant recent improvements [8]–[14], existing nonvolatile DFFs (NV-DFFs) need a significant amount of energy and time for one backup and restore operation. One factor in these high overheads is the duplicated backup and restore interface circuitry that each NV-DFF needs to convert the capacitance or resistance into a Boolean voltage [8], [11]. Another factor is the limited resistance ratio between different states and the device variations in existing NVM devices, which result in over-design to obtain a satisfactory yield. For example, when a global write control is applied, the write time of ReRAM devices must be no shorter than that of the slowest ReRAM device, resulting in unnecessary power consumption by faster devices [34]. The need for a high voltage or current to access the nonvolatile devices also brings additional overheads in power supply management, and may cause a peak power problem that limits the plausible number of parallel backup and restore operations [31]–[33].

The recent advent of negative capacitance FETs (NCFETs), also known as ferroelectric FETs, heralds a new era of logic-NVM synergy. Being fundamentally different from existing NVM devices, an NCFET can behave concurrently as an NVM and a logic device with inherent compatibility with Boolean signaling, which greatly reduces the complexity and energy consumption of the interface with logic gates. More importantly, unlike magnetic tunnel junction (MTJ) and ReRAM devices, NCFET memory can be written with no static current. NCFET devices are being actively developed

0018-9383 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.





Fig. 1. Proposed NCFET NV-DFF for nonvolatile computing.



Fig. 2. Required NCFET characteristics. (a)  $I_D - V_G$  hysteresis with two stable nonvolatile states at  $V_{\rm GS} = 0$ . (b) Polarization switching by the gate voltage. The *I*-*V* curve is obtained based on the LK-equation modeling method in [23], with 10-nm PTM CMOS FinFET as the integrated MOSFET. The LK-equation coefficients are:  $\alpha = -1.05e9$  m/F,  $\beta = 1e7$  m<sup>5</sup>/F/coul<sup>2</sup>,  $\gamma = 6e11$  m<sup>9</sup>/F/coul<sup>4</sup>.

and explored for low-power Boolean logic [16]–[22], [27], [28]. Our recent NCFETs exhibit a steep hysteresis edge and a high ratio between the two  $I_{\text{DS}}$  states at  $V_G = 0$ , and confirm the potential of synergizing NVM and efficient logic with NCFET.

In this paper, we propose the NCFET NV-DFF shown in Fig. 1, with ultralow energy and latency in backup and restore operations and negligible energy-delay overhead in normal operations. Device-circuit codesign efforts have been made to harness the unique NCFET device features for performance optimizations.

#### II. PREFERRED NCFETS AND EXPERIMENTS

For the purpose of backup, restore, and nonvolatile state storage, the NCFETs in the NV-DFF are designed to have two locally stable states in the hysteresis around $V_G = 0$, as shown in the N-type I-V curve example in Fig. 2(a). One state can switch to the other by applying a sufficiently high-amplitude positive or negative $V_{GS}$ that exceeds the coercive voltage [22], as shown with an N-type NCFET in Fig. 2(b). The state transition speed will be discussed in Sections III and IV.

Capacitance matching is critical to obtain the device characteristics in Fig. 2. In recent reports, NCFETs exhibit hysteresis when  $|C_{\rm FE}| < C_{\rm MOS}$ , where  $C_{\rm FE}$  is the negative ferroelectric capacitance and  $C_{\rm MOS}$  is the MOSFET gate capacitance [17], [18], [20]. In [19], hysteresis is found around  $V_G = 0$  with about  $100 \times I_{\rm DS}$  ratio between the two hysteresis states. While these works try to avoid



Fig. 3. P-type NCFET with externally connected BiFeO<sub>3</sub> ferroelectric material. (a) Device structure. (b) Measured  $I_D - V_G$  hysteresis around  $V_G = 0$ .

the hysteresis for Boolean logic operations, in this paper, we present an NCFET device with wide hysteresis around  $V_G = 0$ , as shown in Fig. 3(a). BiFeO<sub>3</sub> ferroelectric material was externally connected to a fin-structure field-effect transistor (FinFET) of 100-nm gate length. Fabrication and measurement follow the process in [17] but differ in the additional  $V_{\text{TH}}$  shifting in the baseline MOSFET to locate the hysteresis window around  $V_G = 0$ . Further work on more  $V_{\text{TH}}$  shifting could achieve the optimum goal of centering the hysteresis at  $V_G = 0$ .

Nevertheless, the key contribution of this device is the first experimental verification of an NCFET that shows: 1) a hysteresis curve shifted by $V_{\text{TH}}$ engineering to provide extra nonvolatility, as compared with its predecessor in [17], and 2) steep hysteresis edges with a slope below 10 mV/decade and a ratio above seven orders of magnitude between the two $I_{\text{DS}}$ hysteresis states at $V_G = 0$. As will be shown in this paper, a steep-slope transition edge provides a wide gate voltage range in which the polarization stays stable against noise, while a higher ON-state current and a lower OFF-state current provide faster restore and immunity to device variations. These characteristics are preferred and well exploited in the proposed NV-DFF design.

#### III. PROPOSED NV-DFF THEORIES AND OPTIMIZATIONS

#### A. Normal and Backup/Restore Operations

The proposed NV-DFF in Fig. 1 has a main body (the same as a conventional DFF consisting of a master latch and a slave latch) and accessory circuitry connected to the slave latch for backup and restore purposes. All backup and restore operations are associated with the slave latch only. In Fig. 1, *Bkp* is the backup operation control signal, and *Rstr* is the restore operation control signal. When both *Bkp* and *Rstr* are low, the interface transistors between the main body and the accessory circuitry are turned OFF by the gate signals *Bkp* and *Rstr*, leaving the main body functioning the same as a conventional positive-edge triggered DFF.

Fig. 4(a) shows the circuit state transition in the slave latch during a backup operation. When a supply outage is about to come, the backup control signal Bkp becomes high and turns ON the interface transistors M1-M4. Note that the pull-down transistors M7 and M8 gated by the restore control signal *Rstr* are turned OFF. Assuming Q is "1" (corresponding to a high voltage,  $V_{DD}$ ) and QN is "0" (corresponding to GND), the feedback network quickly biases the two NCFETs



Fig. 4. (a) Circuit theory for backup operation and (b) restore operation.

to switch to (or maintain) a positive polarization for M5 and a negative polarization for M6, respectively. The polarization switching is straightforward as their gates are biased at voltage levels opposite to their drain and source terminals: $V_{DD}-V_{TH}$ for a high voltage level and GND at a low level. After the polarization switching is accomplished, removing $V_{DD}$ will not affect the stored polarization states, regardless of the *Bkp* and *Rstr* control signal levels. Note that the backup operation does not need to change the polarization if the state of the previous backup is the same. Operations with Q equal to "0" are similar.

Fig. 4(b) shows the circuit state transition for a restore operation. During the entire restore operation, the input clock signal CLK and the backup control signal *Bkp* are set low (C = "0" and CN = "1"), and *Rstr* is set high. This guarantees that the slave latch is isolated from the master latch. As a result, the sensed resistance from Q and QN to GND and $V_{DD}$ determines the final settled Q and QN voltage levels. The drain-to-source resistance of the positively polarized NCFET is orders of magnitude lower than that of the negatively polarized NCFET, which leads to a much stronger pull-down effect on the settling of its branch. The positive feedback network further enhances the difference, and finally leads to full settling as $V_{DD}$ recovers.
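The backup and restore sequence described above can be summarized in a small behavioral sketch. It is only an illustration, not the SPICE setup used in this paper: the class name `NvSlaveLatch`, the two resistance constants, and the assumption that a positively polarized M5 pulls down the QN side are all hypothetical choices made so that the model reproduces the described behavior.

```python
from dataclasses import dataclass

# Hypothetical resistances that only mimic a large ON/OFF ratio;
# they are not values taken from the paper.
R_ON_OHM = 1e4    # positively polarized NCFET (low resistance)
R_OFF_OHM = 1e11  # negatively polarized NCFET (high resistance)


@dataclass
class NvSlaveLatch:
    """Behavioral stand-in for the slave latch plus NCFETs M5 and M6."""
    q: int = 0                 # volatile output Q (QN is its complement)
    m5_positive: bool = False  # nonvolatile polarization of M5
    m6_positive: bool = True   # nonvolatile polarization of M6

    def backup(self) -> None:
        # Bkp high: Q = 1 polarizes M5 positively and M6 negatively,
        # Q = 0 does the opposite.
        self.m5_positive = (self.q == 1)
        self.m6_positive = (self.q == 0)

    def power_off(self) -> None:
        # Volatile nodes discharge; the polarization is retained.
        self.q = 0

    def restore(self) -> None:
        # Rstr high: the branch with the lower-resistance NCFET is pulled
        # down harder and the cross-coupled feedback resolves the latch.
        r_m5 = R_ON_OHM if self.m5_positive else R_OFF_OHM
        r_m6 = R_ON_OHM if self.m6_positive else R_OFF_OHM
        # Assumption: a low-resistance M5 pulls QN low, so Q settles to 1.
        self.q = 1 if r_m5 < r_m6 else 0


latch = NvSlaveLatch(q=1)
latch.backup()
latch.power_off()
latch.restore()
print("restored Q =", latch.q)
```

The sketch deliberately ignores timing, supply ramping, and noise; it only captures that the state survives the power-off interval because it is held in the NCFET polarization rather than on the latch nodes.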

Fig. 5 is a transient waveform snapshot, showing operations with a steady  $V_{DD}$  and backup and restore operations due to power failures. In the snapshot, the clock frequency is 0.25 GHz when the power supply is stable. It could be much faster as the isolated accessory backup and restore circuitry has negligible impact on the normal operation. To prevent the



Fig. 5. Transient waveform snapshots of proposed NV-DFF with the CLK period equal to $\sim$4 ns during normal "power ON" operations. A few nanoseconds after the power supply goes OFF, the internal nodes like *Q* and QN are manually pulled down to ground to mimic real scenarios (this leads to faster settling down of the polarization as shown).

backup and restore operations from being interrupted, CLK is kept low. The polarization state, as shown in Fig. 5, remains stable during the power-off periods. Thanks to the simple timing requirement and the small control load, there is no need for a second supply network to deliver power for the control signals.

Note that the proposed NV-DFF backup and restore circuitry could also be built with P-type transistors connecting to  $V_{DD}$  with effective control signals at a low voltage.

#### B. Device-Circuit Codesign

From the device's perspective, given the ferroelectric material, the NCFET parameters can be tuned by the ferroelectric layer thickness  $T_{\text{FE}}$  and the ferroelectric layer area  $A_{\text{FE}}$ . From the perspective of circuits and applications, the desired performance features mainly include energy-delay overhead during normal operations, backup and restore energy and delay, retention time, yield and reliability, etc.

The backup and restore energy is one key specification of an NV-DFF. For energy-harvesting systems with an intermittent power supply, it limits the overall energy efficiency, as more energy spent on backup and restore operations leaves less energy for computation. For check-pointing applications, the backup and restore energy determines a certain period of power-off time, i.e., the break-even time (BET), below which no energy savings can be achieved. The backup and restore time is also critical for applications in which a fast response is preferred, e.g., fine-time-granularity power-gating scenarios and fast-response processors.
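As a rough illustration of the BET concept, the sketch below uses a first-order model in which the BET is the backup-plus-restore energy divided by the leakage power that power gating removes. Both the model and the numbers (1 fJ, 20 nW) are assumptions made for illustration; they are not taken from this paper's simulations.

```python
def break_even_time_s(e_backup_restore_j: float, p_leak_w: float) -> float:
    """Power-off time at which the saved leakage energy equals the
    backup + restore energy (first-order model, assumed here)."""
    if p_leak_w <= 0:
        raise ValueError("leakage power must be positive")
    return e_backup_restore_j / p_leak_w


def net_energy_saved_j(t_off_s: float, e_backup_restore_j: float,
                       p_leak_w: float) -> float:
    """Energy saved by gating for t_off_s; negative below the BET."""
    return p_leak_w * t_off_s - e_backup_restore_j


# Hypothetical example values, not results from the paper.
E_BR = 1.0e-15    # 1 fJ for one backup + restore
P_LEAK = 20e-9    # 20 nW idle leakage

bet = break_even_time_s(E_BR, P_LEAK)
print(f"BET = {bet * 1e9:.1f} ns")
print("saving at 10 ns off:", net_energy_saved_j(10e-9, E_BR, P_LEAK))
print("saving at 1 us off: ", net_energy_saved_j(1e-6, E_BR, P_LEAK))
```

Under this model, halving the backup and restore energy halves the BET, which is why a low-energy NV-DFF widens the range of idle intervals worth gating.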

Fig. 6 shows how  $T_{\text{FE}}$  and  $A_{\text{FE}}$  affect the NCFET device I-V characteristics and the NV-DFF performance. Increasing  $T_{\text{FE}}$  increases the coercive voltage or energy barrier to flip the polarization, leading to longer retention time and more time to switch the polarization. A larger  $T_{\text{FE}}$  reduces the restore time with a higher ON/OFF resistance ratio and a lower



Fig. 6. Impact of varying  $T_{FE}$  (baseline 6 nm) and  $A_{FE}$  (baseline 100% of 378 nm<sup>2</sup>) on NCFET  $I_{DS}-V_{GS}$  in (a) and (b), NV-DFF backup and restore time in (c) and (d), NV-DFF backup and restore energy in (e) and (f).  $V_{DD}$  is 0.5 V. The kinetic coefficient  $\rho$  is 0.25. The device parameters are the same as shown in Fig. 2. More simulation settings are in the next section.

ON-state resistance. The variation of  $T_{\text{FE}}$  from 5 to 7 nm also slightly affects the backup and restore energy by less than 10%. Meanwhile, decreasing  $A_{\text{FE}}$  also changes the capacitance matching between  $C_{\text{FE}}$  and  $C_{\text{MOS}}$ , leading to a different I-V with a higher coercive voltage, a lower ON-state current, and longer polarization switching time.
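The dependence of the coercive voltage on $T_{\text{FE}}$ can be sketched with the static Landau-Khalatnikov (LK) polynomial. The block below assumes the commonly used form $E = 2\alpha P + 4\beta P^3 + 6\gamma P^5$ with the coefficients quoted in the Fig. 2 caption, and it treats an isolated ferroelectric layer. The series MOSFET capacitance of a real NCFET is ignored, so the absolute voltages are not those of the devices in Fig. 6; only the linear trend with $T_{\text{FE}}$ is the point.

```python
import math

# LK coefficients quoted in the Fig. 2 caption.
ALPHA = -1.05e9  # m/F
BETA = 1e7       # m^5/F/C^2
GAMMA = 6e11     # m^9/F/C^4


def static_field(p: float) -> float:
    """Static LK field (V/m) at polarization p (C/m^2); assumed form."""
    return 2 * ALPHA * p + 4 * BETA * p**3 + 6 * GAMMA * p**5


def coercive_field() -> float:
    """Field magnitude at the turning point dE/dP = 0."""
    # dE/dP = 2*ALPHA + 12*BETA*x + 30*GAMMA*x^2, with x = P^2
    a, b, c = 30 * GAMMA, 12 * BETA, 2 * ALPHA
    x = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    return abs(static_field(math.sqrt(x)))


def coercive_voltage(t_fe_m: float) -> float:
    """Coercive voltage (V) of an isolated layer of thickness t_fe_m."""
    return coercive_field() * t_fe_m


for t_nm in (5, 6, 7):
    v_c = coercive_voltage(t_nm * 1e-9)
    print(f"T_FE = {t_nm} nm -> isolated-layer V_c = {v_c:.2f} V")
```

Because the coercive field is fixed by the material coefficients, the isolated-layer coercive voltage scales in proportion to the thickness, which matches the qualitative trend described for Fig. 6.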

Considering the inevitable $T_{\text{FE}}$ and $A_{\text{FE}}$ variations in the fabrication process, $T_{\text{FE}}$ and $A_{\text{FE}}$ should be carefully optimized for the supply voltage range, the retention model, and the application requirements on retention time, backup and restore energy, and latency. While the optimum varies from case to case, in this design evaluation, $T_{\text{FE}}$ and $A_{\text{FE}}$ are set to 6 nm and 378 nm<sup>2</sup> (equal to 3 × fin_width × channel_length) for one fin, respectively, as the optimized tradeoff. These parameter values, unless otherwise stated, are used in the performance evaluation in Section IV.
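To connect these geometry choices with the capacitance-matching condition $|C_{\rm FE}| < C_{\rm MOS}$ from Section II, the sketch below estimates the small-signal ferroelectric capacitance near zero polarization. It assumes the linear LK term only, so that $C_{\rm FE} \approx A_{\rm FE} / (2\alpha T_{\rm FE})$. The helper names and the $C_{\rm MOS}$ value used in the check are hypothetical, since the gate capacitance is not stated in the text.

```python
ALPHA = -1.05e9  # m/F, LK coefficient from the Fig. 2 caption


def c_fe_small_signal(a_fe_m2: float, t_fe_m: float) -> float:
    """Ferroelectric capacitance (F) near P = 0.

    From V = T_FE * 2*ALPHA*P and Q = A_FE * P, dQ/dV = A_FE / (2*ALPHA*T_FE).
    The result is negative because ALPHA is negative.
    """
    return a_fe_m2 / (2 * ALPHA * t_fe_m)


def expects_hysteresis(c_fe_f: float, c_mos_f: float) -> bool:
    """Capacitance-matching condition for hysteresis: |C_FE| < C_MOS."""
    return abs(c_fe_f) < c_mos_f


A_FE = 378e-18  # m^2, baseline area from the text
T_FE = 6e-9     # m, baseline thickness from the text
C_MOS = 50e-18  # F, hypothetical gate capacitance for illustration only

c_fe = c_fe_small_signal(A_FE, T_FE)
print(f"C_FE = {c_fe * 1e18:.1f} aF")
print("hysteresis expected:", expects_hysteresis(c_fe, C_MOS))

# A smaller area or a thicker layer shrinks |C_FE|.
print(f"80% area: {c_fe_small_signal(0.8 * A_FE, T_FE) * 1e18:.1f} aF")
print(f"7 nm:     {c_fe_small_signal(A_FE, 7e-9) * 1e18:.1f} aF")
```

In this simplified picture, both a thicker layer and a smaller area reduce $|C_{\rm FE}|$, pushing the device further into the hysteretic regime.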

#### **IV. NV-DFF PERFORMANCE**

This section describes SPICE simulations and performance comparisons. More variation and yield analysis, and future work are also covered.

#### A. Simulation Settings and Baseline Designs

The physics-based ferroelectric capacitance model in [23] is employed to build NCFETs with 10-nm PTM CMOS FinFET as the integrated MOSFET. In this model, the ferroelectric material has been calibrated by experimental results of lead zirconium titanate (PZT) films on hafnium oxide (HfO<sub>2</sub>)



Fig. 7. Energy-delay performance overhead of the proposed NCFET NV-DFF under a supply voltage from 0.30 to 0.80 V. The kinetic coefficient  $\rho$  is 0.25.

buffer. To reflect different polarization switching speeds, the kinetic coefficient $\rho$ in the NCFET model is varied from 0.04 to 0.25. A typical value of $\rho = 0.25$ has been adopted as in [15], [23], and [26]. The baseline CMOS volatile DFF is optimized for minimum area and a similar clock-to-Q delay between "0" and "1" outputs (the number of fins for nMOS and pMOS is 1 and 2, respectively). The DFFs are simulated with a 2 fF load and 20 ps rising and falling times for the D and CLK inputs.

#### B. Energy-Delay Overhead

For the NV-DFF in many applications, energy-delay performance remains critical because the DFF still operates with a steady supply for a large portion of the time. Therefore, it is important that the additionally acquired nonvolatility does not cause high energy-delay overheads. Fig. 7 shows these overheads over the baseline CMOS volatile DFF design. Due to the normally-OFF configuration of the backup and restore circuitry, the energy-delay product (EDP) overhead is lower than 2.1% for $V_{DD}$ values above 0.4 V. If a larger-size baseline DFF is used, this EDP overhead becomes even more negligible, because the backup and restore circuitry need not be scaled up by the same ratio due to the high (low) OFF-state (ON-state) NCFET resistance. In addition, thanks to the normally-OFF configuration, the backup and restore operations have little impact on the DFF setup time and hold time requirements.
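For readers who want to reproduce this kind of overhead figure from their own simulations, a minimal sketch of the EDP overhead calculation follows. The function names and the energy and delay numbers are hypothetical placeholders, not data from Fig. 7.

```python
def edp(energy_j: float, delay_s: float) -> float:
    """Energy-delay product in J*s."""
    return energy_j * delay_s


def edp_overhead(e_nv_j: float, d_nv_s: float,
                 e_base_j: float, d_base_s: float) -> float:
    """Relative EDP overhead of the NV-DFF over the volatile baseline."""
    return edp(e_nv_j, d_nv_s) / edp(e_base_j, d_base_s) - 1.0


# Hypothetical per-cycle energy and clock-to-Q delay values.
E_BASE, D_BASE = 1.00e-15, 20.0e-12  # baseline volatile DFF
E_NV, D_NV = 1.01e-15, 20.2e-12      # NV-DFF: 1% more energy, 1% more delay

overhead = edp_overhead(E_NV, D_NV, E_BASE, D_BASE)
print(f"EDP overhead = {overhead * 100:.2f}%")
```

Because EDP is a product, a 1% increase in both energy and delay compounds to slightly more than 2%.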

#### C. Variation and Noise Performance

Existing NVM and NV-DFF designs suffer from the nonidealities of the nonvolatile storage devices inside, especially the variations and low resistance ratio between different states of resistive memory devices, such as MTJ and ReRAM [11], [30], [31], [34]. In such approaches, the worst-corner devices often greatly limit the overall system performance. For example, as mentioned in Section I, the ReRAM write pulse duration must be much longer than average to ensure yield, resulting in high energy consumption. Therefore, it is important to analyze how the NV-DFF performs with NCFET variations.

Fig. 6 has actually shown how the NV-DFF behaves with global  $T_{\text{FE}}$  and  $A_{\text{FE}}$  variations (all devices vary from the design



Fig. 8. Performance of backup and restore energy and latency evaluations, in comparison with previous NV-DFF in [15] based on ferroelectric capacitor (black) and a prior NCFET circuitry (red). Three sets of data in green are included, considering NCFET  $T_{FE}$  mismatches in one NV-DFF in three combinations: 6 nm/6 nm (no mismatch), 6 nm/5.5 nm, 6 nm/6.3 nm. The kinetic coefficient  $\rho$  is 0.25.

target by the same amount). Here more scenarios are provided, considering local mismatches, i.e., the two NCFETs in one NV-DFF having different $T_{\text{FE}}$. In Fig. 8, the green curves are for three sets of the proposed NCFET NV-DFF results operating over a range of $V_{\text{DD}}$ with $T_{\text{FE}}$ deviating from 6 nm by -10% to +5%. These simulation results show that the major impact is on the backup latency in low $V_{\text{DD}}$ scenarios, while the impact on other metrics is much less significant than that of a different supply voltage.

The impact of additional noise that causes a nonzero initial Q/QN voltage opposite to the desired value is also analyzed through comprehensive simulations. We found that noise up to 200 mV is fully tolerable for correct operation, even with the above-mentioned local NCFET mismatches. Such unwanted initial charge at Q or QN is quickly discharged by the restore branches. Proper timing, together with the low ON-state and high OFF-state resistance of the NCFETs at different polarization states, enables this feature.

As indicated by [23] and [29], the kinetic coefficient  $\rho$  affects the polarization switching time significantly. Different practical kinetic coefficient values are adopted in simulations to reflect different polarization switching time, as shown in Table I. This also indicates the value of material and device research in making faster NCFETs.
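The role of $\rho$ can be illustrated with a forward-Euler integration of the dynamic LK equation, assumed here in the form $\rho\, dP/dt = E_{\text{app}} - (2\alpha P + 4\beta P^3 + 6\gamma P^5)$ with the coefficients from the Fig. 2 caption and $\rho$ in SI units. The sketch models an isolated ferroelectric layer driven by a constant hypothetical field of $2.5 \times 10^8$ V/m, not the full NCFET inside the NV-DFF, so its absolute times are not comparable to Table I. It only shows that, in this model, the switching time scales in proportion to $\rho$.

```python
ALPHA, BETA, GAMMA = -1.05e9, 1e7, 6e11  # LK coefficients, Fig. 2 caption


def static_field(p: float) -> float:
    """Static LK field (V/m) at polarization p (C/m^2); assumed form."""
    return 2 * ALPHA * p + 4 * BETA * p**3 + 6 * GAMMA * p**5


def remnant_polarization() -> float:
    """Nonzero root of static_field(p) = 0."""
    # 2*ALPHA + 4*BETA*x + 6*GAMMA*x^2 = 0, with x = p^2
    a, b, c = 6 * GAMMA, 4 * BETA, 2 * ALPHA
    x = (-b + (b * b - 4 * a * c) ** 0.5) / (2 * a)
    return x ** 0.5


def switching_time_s(rho: float, e_applied: float = 2.5e8,
                     dt: float = 1e-13, max_steps: int = 1_000_000) -> float:
    """Time for P to go from -P_r to 0 under a constant applied field.

    e_applied is a hypothetical drive above the coercive field.
    """
    p = -remnant_polarization()
    for step in range(1, max_steps + 1):
        p += dt * (e_applied - static_field(p)) / rho
        if p >= 0.0:
            return step * dt
    raise RuntimeError("no switching; applied field may be below coercive")


for rho in (0.04, 0.10, 0.25):
    t_sw = switching_time_s(rho)
    print(f"rho = {rho:.2f} -> t_switch = {t_sw * 1e12:.0f} ps")
```

In the full circuit the scaling is not exactly linear, because the MOSFET and latch dynamics add delay that does not depend on $\rho$.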

#### D. Performance Comparisons

Fig. 8 compares the backup and restore performance with existing NV-DFF designs in [15]. The proposed NCFET NV-DFF exhibits more than  $6 \times$  reduction in the restore energy and  $30 \times$  reduction in the backup energy. It also performs better in backup latency, and significantly outperforms the others with more than 50% reduction in restore latency. In addition, the proposed NV-DFF works with general DFFs based on master-slave latches, which is superior to the design in [15] that only works when the baseline DFF has set/reset ports. During restore operations, the design in [15] could not fully

TABLE I. Performance Comparisons Among Recent NV-DFF Designs

|                        | [10]<br>measured | [9]<br>simulated | [11]<br>simulated<sup>&</sup> | This Work<br>simulated\*   |          |          |
|------------------------|------------------|------------------|-------------------------------|----------------------------|----------|----------|
| Tech. size             | 130 nm           | 70 nm            | 180 nm                        | 10 nm                      |          |          |
| Voltage                | 1.5 V            | 1.0 V            | 1.8 V                         | 0.3–0.8 V                  |          |          |
| Material               | PZT Cap          | MTJ              | ReRAM                         | 6 nm HfO<sub>2</sub>, PZT  |          |          |
|                        |                  |                  |                               | ρ = 0.04                   | ρ = 0.10 | ρ = 0.25 |
| $T_{Backup+Restore}$   | 2.67 µs          | >10 µs           | 1.3 µs                        | 277 ps                     | 583 ps   | 1.29 ns  |
| $E_{Backup+Restore}$   | 2.40 pJ          | 382 fJ           | 735 fJ                        | 1.38 fJ                    |          |          |
| Break-Even Time        | /                | 0.83 µs @ 25 °C  | 1.47 ms                       | 55.9 ns                    |          |          |

<sup>&</sup>: The results are for the topology of NVFF-I in [11] operating at 0.8 V supply (rise to 2.4 V for ReRAM write) for the shortest break-even time.

\*: Backup and restore performance in this table is simulated at 0.5 V supply.

eliminate the static current of low-resistance-state NCFETs. The gains over the design in [15] stem from the deeply embedded logic-in-memory operation with the proposed simple circuit structure, which can carry out the backup and restore operations in only one step without static current consumption.

Table I also summarizes the overall NV-DFF performance in comparison with some other reported designs in different technologies. One of the strongest advantages of the proposed NV-DFF over the others is the orders-of-magnitude lower energy for backup and restore operations. Such energy savings partly come from the capability of NCFETs to operate effectively at a lower voltage. Two more important factors are: 1) the fundamentally different three-terminal NCFET device operating in a novel cross-coupled circuit that avoids static NCFET drain-source current during backup and restore operations and 2) NCFETs of a small size that can still ensure fast and robust operations with a high ON/OFF state resistance ratio even in the presence of significant local and global variations. In contrast, existing resistive memory elements in reported NV-DFF designs, such as MTJ and ReRAM, continuously draw current (because of their inherent two-terminal device nature) for a long period of time to ensure yield (because of variations in the required write time combined with a much lower resistance ratio). For capacitive NVM devices, such as the ferroelectric capacitors in [10], the power inefficiency arises from a complex access interface and a large capacitance value.
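The size of this energy gap can be checked directly from the $E_{Backup+Restore}$ row of Table I. The short script below only divides the reported numbers. The designs differ in technology node, supply voltage, and measured versus simulated status, so the ratios are indicative rather than a like-for-like comparison.

```python
import math

# E_Backup+Restore values as listed in Table I (joules).
E_BACKUP_RESTORE_J = {
    "[10] PZT capacitor": 2.40e-12,
    "[9] MTJ": 382e-15,
    "[11] ReRAM": 735e-15,
}
E_THIS_WORK_J = 1.38e-15


def energy_ratios() -> dict:
    """Ratio of each prior design's energy to this work's energy."""
    return {name: e / E_THIS_WORK_J for name, e in E_BACKUP_RESTORE_J.items()}


for name, ratio in energy_ratios().items():
    print(f"{name}: {ratio:.0f}x ({math.log10(ratio):.1f} orders of magnitude)")
```

All three ratios exceed two orders of magnitude, consistent with the statement above.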

The low-energy backup and restore operations in the proposed NV-DFF enable higher-efficiency nonvolatile computing applications. For energy harvesting systems with an intermittent supply, the saved backup and restore energy could be used for computing, leading to more forward progress, and higher quality of service. For general power-gating systems, the lower backup and restore energy leads to a shorter BET versus the leakage energy of an idle unit, indicating significant expansion of opportunity for energy savings from fine-grained power-gating.

#### E. Future Work

Beyond existing NCFET models in [23] and [24], future models that characterize the device variation, temperature dependence, retention time, endurance and aging, etc., are required for more comprehensive evaluations. For example, retention time can be a key specification in some applications. In NCFETs, this is mainly determined by the coercive voltage, the remnant polarization, and the area of the ferroelectric layer. Aging may or may not be a limiter, depending on how frequently the backup operation occurs (usually orders of magnitude less often than the clock toggles).

Meanwhile, more NCFET device fabrication efforts that focus on better shaping the hysteresis could provide solid support for feature verification, model calibration, and circuit design. Future circuit optimizations and experiments for nonvolatile computing are also warranted.

Furthermore, while the proposed NCFET NV-DFF provides strong synergies for nonvolatile computing, it will be important to optimize the existing computing architectures accordingly [35], [36]. This is because the tradeoffs between retention time, operation energy, area, and yield will have shifted significantly from prior approaches.

#### V. CONCLUSION

This paper has proposed an approach to nonvolatile computing with NCFET nonvolatile DFFs by harnessing the logic-NVM synergy with a novel circuit. With negligible energy-delay overhead in normal operation, and low energy and low latency in backup and restore operations, it enables a new paradigm for future IoT and power-gating applications.

#### ACKNOWLEDGMENT

The authors would like to thank Dr. F. Catthoor from IMEC, Prof. S. Hu and Prof. A. Seabaugh from the University of Notre Dame, and Prof. P. Asbeck from the University of California, San Diego for their suggestions.

#### REFERENCES

- [1] A. B. Kahng, S. Kang, T. S. Rosing, and R. Strong, "Many-core token-based adaptive power gating," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 32, no. 8, pp. 1288–1292, Aug. 2013.
- [2] K. Ma et al., "Architecture exploration for ambient energy harvesting nonvolatile processors," in Proc. IEEE 21st Int. Symp. High Perform. Comput. Archit. (HPCA), Burlingame, CA, USA, Feb. 2015, pp. 526–537.
- [3] S. Kim *et al.*, "Ambient RF energy-harvesting technologies for self-sustainable standalone wireless sensor platforms," *Proc. IEEE*, vol. 102, no. 11, pp. 1649–1666, Nov. 2014.
- [4] Y. Liu et al., "Ambient energy harvesting nonvolatile processors: From circuit to system," in Proc. IEEE 21st Int. Symp. High Perform. Comput. Archit. (HPCA), San Francisco, CA, Feb. 2015, pp. 526–537.
- [5] Z. Wang *et al.*, "A 130nm FeRAM-based parallel recovery nonvolatile SoC for normally-OFF operations with 3.9x faster running speed and 11x higher energy efficiency using fast power-on detection and nonvolatile radio controller," in *Proc. Symp. VLSI Circuits (VLSI Circuits)*, Kyoto, Japan, Jun. 2017, pp. C336–C337.
- [6] Y. Liu *et al.*, "A 65 nm ReRAM-enabled nonvolatile processor with 6x reduction in restore time and 4x higher clock frequency using adaptive data retention and self-write-termination nonvolatile logic," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, San Francisco, CA, USA, Jan./Feb. 2016, pp. 84–86.
- [7] Y. Xie, Emerging Memory Technologies: Design, Architecture, and Applications. New York, NY, USA: Springer, 2014.
- [8] M. Qazi, A. Amerasekera, and A. P. Chandrakasan, "A 3.4-pJ FeRAM-enabled D flip-flop in 0.13-μm CMOS for nonvolatile processing in digital systems," *IEEE J. Solid-State Circuits*, vol. 49, no. 1, pp. 202–211, Jan. 2014.

- [9] S. Yamamoto and S. Sugahara, "Nonvolatile delay flip-flop based on spin-transistor architecture and its power-gating applications," *Jpn. J. Appl. Phys.*, vol. 49, no. 9R, p. 090204, Sep. 2010.
- [10] H. Kimura et al., "A 2.4 pJ ferroelectric-based non-volatile flip-flop with 10-year data retention capability," in Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC), Kaohsiung, Taiwan, Nov. 2014, pp. 21–24.
- [11] I. Kazi et al., "Energy/reliability trade-offs in low-voltage ReRAM-based non-volatile flip-flop design," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 11, pp. 3155–3164, Nov. 2014.
- [12] K.-W. Kwon, S. H. Choday, Y. Kim, X. Fong, S. P. Park, and K. Roy, "SHE-NVFF: Spin Hall effect-based nonvolatile flip-flop for power gating architecture," *IEEE Electron Device Lett.*, vol. 35, no. 4, pp. 488–490, Apr. 2014.
- [13] R. Bishnoi, F. Oboril, and M. B. Tahoori, "Non-volatile non-shadow flip-flop using spin orbit torque for efficient normally-off computing," in *Proc. 21st Asia South Pacific Design Autom. Conf. (ASP-DAC)*, Macau, China, Jan. 2016, pp. 769–774.
- [14] C.-P. Lo *et al.*, "A ReRAM-based single-NVM nonvolatile flip-flop with reduced stress-time and write-power against wide distribution in write-time by using self-write-termination scheme for nonvolatile processors in IoT era," in *IEDM Tech. Dig.*, Dec. 2016, pp. 16.3.1–16.3.4.
- [15] D. Wang, S. George, A. Aziz, S. Datta, V. Narayanan, and S. K. Gupta, "Ferroelectric transistor based non-volatile flip-flop," in *Proc. Int. Symp. Low Power Electron. Design*, San Francisco, CA, USA, Aug. 2016, pp. 10–15.
- [16] A. I. Khan, C. W. Yeung, C. Hu, and S. Salahuddin, "Ferroelectric negative capacitance MOSFET: Capacitance tuning & antiferroelectric operation," in *IEDM Tech. Dig.*, Washington, DC, USA, Dec. 2011, pp. 11.3.1–11.3.4.
- [17] A. I. Khan *et al.*, "Negative capacitance in short-channel FinFETs externally connected to an epitaxial ferroelectric capacitor," *IEEE Electron Device Lett.*, vol. 37, no. 1, pp. 111–114, Jan. 2016.
- [18] J. Jo and C. Shin, "Negative capacitance field effect transistor with hysteresis-free sub-60-mV/decade switching," *IEEE Electron Device Lett.*, vol. 37, no. 3, pp. 245–248, Mar. 2016.
- [19] K.-S. Li *et al.*, "Sub-60 mV-swing negative-capacitance FinFET without hysteresis," in *IEDM Tech. Dig.*, Washington, DC, USA, Dec. 2015, pp. 22.6.1–22.6.4.
- [20] M. H. Lee *et al.*, "Steep slope and near non-hysteresis of FETs with antiferroelectric-like HfZrO for low-power electronics," *IEEE Electron Device Lett.*, vol. 36, no. 4, pp. 294–296, Apr. 2015.
- [21] M. H. Lee *et al.*, "Prospects for ferroelectric HfZrOx FETs with experimentally CET=0.98 nm, SS<sub>for</sub> =42 mV/dec, SS<sub>rev</sub> =28 mV/dec, switch-off <0.2 V, and hysteresis-free strategies," in *IEDM Tech. Dig.*, Washington, DC, USA, Dec. 2015, pp. 22.5.1–22.5.4.
- [22] A. I. Khan *et al.*, "Negative capacitance in a ferroelectric capacitor," *Nature Mater.*, vol. 14, no. 2, pp. 182–186, Feb. 2015.
- [23] A. Aziz, S. Ghosh, S. Datta, and S. K. Gupta, "Physics-based circuit-compatible SPICE model for ferroelectric transistors," *IEEE Electron Device Lett.*, vol. 37, no. 6, pp. 805–808, Jun. 2016.
- [24] C. Hu, S. Salahuddin, C.-I. Lin, and A. Khan, "0.2 V adiabatic NC-FinFET with 0.6 mA/µm I<sub>ON</sub> and 0.1 nA/µm I<sub>OFF</sub>," in *Proc. 73rd Annu. Device Res. Conf. (DRC)*, Columbus, OH, USA, Jun. 2015, pp. 39–40.
- [25] S. Sakai and R. Ilangovan, "Metal-ferroelectric-insulator-semiconductor memory FET with long retention and high endurance," *IEEE Electron Device Lett.*, vol. 25, no. 6, pp. 369–371, Jun. 2004.
- [26] S. George *et al.*, "Nonvolatile memory design based on ferroelectric FETs," in *Proc. 53rd ACM/EDAC/IEEE Design Autom. Conf. (DAC)*, Austin, TX, USA, Jun. 2016, p. 118.
- [27] S. George *et al.*, "Device circuit Co design of FEFET based logic for low voltage processors," in *Proc. IEEE Comput. Soc. Annu. Symp. VLSI (ISVLSI)*, Pittsburgh, PA, USA, Jul. 2016, pp. 649–654.
- [28] D. E. Nikonov and I. A. Young, "Overview of beyond-CMOS devices and a uniform methodology for their benchmarking," *Proc. IEEE*, vol. 101, no. 12, pp. 2498–2533, Dec. 2013.
- [29] J. Li, B. Nagaraj, H. Liang, W. Cao, H. C. Lee, and R. Ramesh, "Ultrafast polarization switching in thin-film ferroelectrics," *Appl. Phys. Lett.*, vol. 84, no. 7, pp. 1174–1176, Feb. 2004.
- [30] L. Zhang, S. Cosemans, D. J. Wouters, G. Groeseneken, M. Jurczak, and B. Govoreanu, "On the optimal ON/OFF resistance ratio for resistive switching element in one-selector one-resistor crosspoint arrays," *IEEE Electron Device Lett.*, vol. 36, no. 6, pp. 570–572, Jun. 2015.
- [31] K. Tsunoda *et al.*, "Highly manufacturable multi-level perpendicular MTJ with a single top-pinned layer and multiple barrier/free layers," in *IEDM Tech. Dig.*, Washington, DC, USA, Dec. 2013, pp. 3.3.1–3.3.4.

- [32] X. Wu, J. Li, L. Zhang, E. Speight, R. Rajamony, and Y. Xie, "Hybrid cache architecture with disparate memory technologies," in *Proc. 36th Annu. Int. Symp. Comput. Archit.*, Jun. 2009, pp. 34–45.
- [33] S. Lee, J. Jung, and C.-M. Kyung, "Hybrid cache architecture replacing SRAM cache with future memory technology," in *Proc. IEEE Int. Symp. Circuits Syst.*, Seoul, South Korea, May 2012, pp. 2481–2484.
- [34] M. F. Chang *et al.*, "Embedded 1 Mb ReRAM in 28 nm CMOS with 0.27-to-1 V read using swing-sample-and-couple sense amplifier and self-boost-write-termination scheme," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2014, pp. 332–333.
- [35] K. Ma *et al.*, "Nonvolatile processor architectures: Efficient, reliable progress with unstable power," *IEEE Micro*, vol. 36, no. 3, pp. 72–83, May/Jun. 2016.
- [36] K. Ma *et al.*, "Spendthrift: Machine learning based resource and frequency scaling for ambient energy harvesting nonvolatile processors," in *Proc. 22nd Asia South Pacific Design Autom. Conf. (ASP-DAC)*, Chiba, Japan, Jan. 2017, pp. 678–683.

Xueqing Li (M'13) received the B.S. and Ph.D. degrees in electronics engineering from Tsinghua University, Beijing, China.

He is currently a Post-Doctoral Research Associate with The Pennsylvania State University, University Park, PA, USA. His current research interests include CMOS and emerging circuits and systems.

John Sampson (M'04) received the Ph.D. degree from the University of California, San Diego, CA, USA.

He is currently an Assistant Professor with The Pennsylvania State University, University Park, PA, USA.

Asif Khan (M'15) received the Ph.D. degree in electrical engineering and computer sciences from the University of California, Berkeley, CA, USA, in 2015, and the B.S. degree in electrical and electronic engineering from the Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 2007.

He is currently an Assistant Professor with the Georgia Institute of Technology, Atlanta, GA, USA.

Kaisheng Ma received the bachelor's (with Hons.) degree in E.E. from Hangzhou Electronic University, China, and the master's degree from Peking University, Beijing, China. He is currently pursuing the Ph.D. degree with The Pennsylvania State University, University Park, PA, USA.

Sumitha George received the B.Tech. degree in electronics and communication from The University of Kerala, Thiruvananthapuram, India, and the M.Tech. degree from IIT Delhi, New Delhi, India. She is currently pursuing the Ph.D. degree with The Pennsylvania State University, University Park, PA, USA.

Ahmedullah Aziz (S'10) received the B.Sc. degree from the Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, in 2013, and the M.S. degree from The Pennsylvania State University, University Park, PA, USA, in 2016, where he is currently pursuing the Ph.D. degree.

Sumeet Kumar Gupta (M'12) received the B.Tech. degree in electrical engineering from IIT Delhi, New Delhi, India, and the M.S. and Ph.D. degrees from Purdue University, West Lafayette, IN, USA.

He is currently an Assistant Professor with The Pennsylvania State University, University Park, PA, USA.

Sayeef Salahuddin (SM'14) received the B.Sc. degree from the Bangladesh University of Engineering and Technology, Dhaka, Bangladesh, and the Ph.D. degree from Purdue University, West Lafayette, IN, USA.

He is currently an Associate Professor with the University of California, Berkeley, CA, USA.

Meng-Fan Chang (M'05–SM'14) received the M.S. degree from The Pennsylvania State University, University Park, PA, USA, and the Ph.D. degree from National Chiao Tung University, Hsinchu, Taiwan.

He is currently a Professor with National Tsing Hua University, Hsinchu, Taiwan.

Suman Datta (F'13) was a Professor with The Pennsylvania State University, University Park, PA, USA, from 2007 to 2011. He is currently the Chang Family Chair Professor with the University of Notre Dame, Notre Dame, IN, USA.

Vijaykrishnan Narayanan (F'11) received the B.S. degree from the University of Madras, Chennai, India, and the Ph.D. degree from the University of South Florida, Tampa, FL, USA.

He is currently a Professor with The Pennsylvania State University, University Park, PA, USA.