A forward body bias generator for digital CMOS circuits with supply voltage scaling

Citation for published version (APA):

DOI:
10.1109/ISCAS.2010.5537129

Document status and date:
Published: 01/01/2010

Document Version:
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

A Forward Body Bias Generator for Digital CMOS Circuits with Supply Voltage Scaling

Maurice Meijer1), José Pineda de Gyvez2), Ben Kup1), Bert van Uden1), Peter Bastiaansen1), Marco Lammers1), and Maarten Vertregt1)
1) NXP Semiconductors, Eindhoven, The Netherlands
2) Technical University of Eindhoven, Eindhoven, The Netherlands
Contact: maurice.meijer@nxp.com

Abstract—We propose a new fully-integrated forward body bias (FBB) generator that holds its voltage constant relative to the (scalable) power supply of a digital IP. The generator is modular and can drive distinct digital IP block sizes in multiples of up to 1mm². The design has been implemented in 90nm low-power CMOS. Our basic unit for driving digital IP blocks up to 1mm² occupies a silicon area of 0.03mm² only. The generator completes a 500mV FBB voltage step within 4μs. The bandwidth of the design is 570kHz. The active current of the FBB generator alone is about 177μA for a nominal process, 1.2V supply and 85°C. The standby current is as low as 72nA at 27°C.

I. INTRODUCTION

Modern digital integrated circuits are sensitive to process variability, which impacts circuit performance and power consumption. In recent years, post-silicon tuning has been shown to be effective in counteracting process variability and in trading off power against performance [1-2]. Well-known post-silicon tuning techniques are supply voltage scaling (VS) and body biasing (BB). VS is primarily used to reduce active power at the expense of lower circuit performance [1]. BB is typically used for leakage reduction or performance tuning [1-2]. Two BB approaches exist, namely reverse body bias (RBB) and forward body bias (FBB). RBB increases the threshold voltage, \(V_{th}\), which lowers leakage at the cost of a gate delay penalty. Conversely, FBB reduces \(V_{th}\), which lowers gate delay at the cost of higher leakage. VS and BB can also be combined to achieve collective benefits [1].
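For reference, the effect of body bias on the threshold voltage is commonly approximated by the textbook long-channel body-effect expression below. It is not taken from this paper; \(V_{th0}\) (zero-bias threshold), \(\gamma\) (body-effect coefficient), \(\phi_F\) (Fermi potential) and \(V_{SB}\) (source-to-body voltage of an NMOS device) are the usual textbook symbols and are introduced here only for illustration.

```latex
V_{th} = V_{th0} + \gamma \left( \sqrt{2\phi_F + V_{SB}} - \sqrt{2\phi_F} \right)
```

In this approximation, RBB corresponds to \(V_{SB} > 0\), which raises \(V_{th}\), and FBB corresponds to \(V_{SB} < 0\), which lowers \(V_{th}\).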

The convergence of multiple applications into a single device drives integrated circuit solutions that are both high performance and power efficient. Post-silicon tuning techniques enable the definition of new operating modes, where each mode targets a different power-performance trade-off. Our focus is on the application of FBB for improving circuit performance. When a circuit is active, FBB is preferred over VS to enhance performance due to its lower dynamic power penalty. The joint use of FBB and VS is preferred over VS alone for achieving low-power operation. When a circuit is in standby, the leakage power is dominant and FBB should not be applied. This motivates the application of FBB dynamically at runtime [3].

FBB requires a voltage generator circuit to generate the required N-well and P-well bias voltages. The trend towards higher integration densities in modern chips favors a fully integrated solution to enable more cost-effective system solutions. From an industrial perspective, the generator should comply with the following requirements: i) It should be digitally controllable to simplify system integration. ii) The FBB voltage generation should be transparent to any voltage scaling approach, i.e., the amount of applied FBB should be constant relative to the supply voltage (\(V_{DD}\)) of the circuit. iii) The FBB generator should be powered from the always-on nominal core supply. Finally, iv) an FBB generator should have low power consumption and occupy a small area.

Several FBB generators have been proposed in the literature, but none of them meets all of the aforementioned requirements. Tschanz et al. presented an adaptive body bias (ABB) voltage generator [2]. The main drawback of their implementation is that FBB is applied only to PMOS transistors to avoid the use of a triple-well technology. In addition, the FBB voltage is \(V_{DD}\)-dependent, and the generator needs a voltage level higher than \(V_{DD}\). Choi and Shin proposed a more sophisticated solution for providing body bias voltages to multiple macros in the design [4]. However, their solution also requires voltage levels higher than the core \(V_{DD}\) and lower than \(V_{SS}\), mainly for generating RBB, and the FBB voltage of this design is \(V_{DD}\)-dependent as well. Sumita et al. presented another ABB generator [5]. However, it has constraints similar to those of the one proposed in [4]. Komatsu et al. proposed an FBB generator for enabling self-adjusted FBB [6]. Their solution cannot dynamically control the FBB voltage, and the generated FBB voltage is highly sensitive to \(V_{DD}\) and strongly temperature dependent. Other publications imply the use of an FBB generator without discussing its implementation in detail [7-9].

In contrast to prior art, our solution can meet all of the aforementioned requirements.

The remainder of this paper is organized as follows. Section II introduces the proposed FBB generation concept. Section III presents the FBB generator design. Section IV shows the circuit layout. Finally, Section V presents the results obtained from circuit simulations.

II. PROPOSED FBB GENERATION CONCEPT

Fig. 1 shows the general block diagram of the proposed fully-integrated FBB generator circuit. The circuit provides independent FBB voltages to the PMOS (N-well) and NMOS (P-well) transistors of the digital IP block under control via $V_{\text{nwell}}$ and $V_{\text{pwell}}$, respectively. We distinguish between two supply pairs: $(V_{\text{DD}}, V_{\text{SS}})$ and $(V_{\text{DDIP}}, V_{\text{SSIP}})$. $V_{\text{DD}}$ and $V_{\text{SS}}$ are the nominal supply voltage and ground of the system, respectively. $V_{\text{DDIP}}$ and $V_{\text{SSIP}}$ are the supply voltage and ground of the digital IP block, respectively. The circuit architecture is based on a dual 6-bit resistive digital-to-analog converter (RDAC) approach to generate the P-well and N-well voltages. Since the RDAC is not able to drive the wells of the digital IP block directly, it is buffered to ensure a low output impedance. The digital $BB_{nw}$ and $BB_{pw}$ input signals are decoded to match the RDAC control signals. The reference circuit creates a constant current through the RDACs. This is essential to generate an FBB voltage that follows the supply voltage of the digital IP block. The generator has two control signals, $ENB$ and $MODE$. Together they follow the standby or active mode of the digital IP block, select the internal or external reference voltage, and enable the bypass switches when the circuit is in standby. The details of the circuit implementation are discussed in Section III.

We achieve transparent use of the FBB generator in a voltage-scaled digital IP by ensuring a constant current through the RDAC and by powering the RDAC from $V_{\text{DDIP}}$ and $V_{\text{SSIP}}$.

III. FBB GENERATOR DESIGN

In this section we present details of the circuits that constitute the FBB generator. We have implemented this design in a 90nm low-power CMOS technology.

A. Reference circuit

Fig. 2 shows a simplified circuit diagram of the reference circuit. A feedback circuit derives the reference current, $I_{ref}$, from the reference voltage, $V_{ref}$, of 700mV. $V_{ref}$ can be internally generated by a resistor tree, or it can be externally generated by, e.g., a bandgap circuit. The $MODE$ signal selects the internal or external reference voltage. The reference resistor, $R_{ref}$, is matched to the RDAC resistors. The reference current, $I_{ref}$, is mirrored to create the current reference for the RDACs. The $ENB$ signal can turn off the resistor tree and the amplifier to minimize the static current consumption when the FBB generator is in standby.
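The benefit of matching $R_{ref}$ to the RDAC resistors can be illustrated with a short numerical sketch. It assumes the ideal relation $I_{ref} = V_{ref}/R_{ref}$ and an ideal current mirror; the resistor values and the `mirror_ratio` parameter are hypothetical and chosen only so that the nominal drop across the resistor tree is 500mV. They are not taken from the paper.

```python
# Sketch of the reference-current idea; resistor values are hypothetical.
V_REF = 0.700  # V, reference voltage stated in the text


def tree_drop(r_ref_ohm, r_tree_ohm, mirror_ratio=1.0):
    """Voltage drop across the RDAC resistor tree for an ideal feedback loop.

    Assumes I_ref = V_ref / R_ref and an ideal mirror with gain mirror_ratio.
    """
    i_ref = V_REF / r_ref_ohm
    return mirror_ratio * i_ref * r_tree_ohm


# Hypothetical nominal values giving a 500 mV drop (ratio 5/7).
R_REF = 70e3
R_TREE = 50e3

nominal = tree_drop(R_REF, R_TREE)
# Matched resistors shift together with process (here: +20 % on both).
shifted = tree_drop(R_REF * 1.2, R_TREE * 1.2)

print(f"nominal drop: {nominal * 1e3:.1f} mV")
print(f"drop after a matched +20 % resistor shift: {shifted * 1e3:.1f} mV")
```

Because only the ratio of the two resistances enters the result, a common shift of the absolute resistance leaves the drop across the tree unchanged in this idealized model.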

B. RDAC and Decoder

The required FBB voltage is generated by the resistor tree of the RDAC. We have realized a voltage drop of 500mV across the resistor tree, which corresponds to a maximum FBB of 500mV. Each resistor tree consists of 64 poly resistors, so the smallest possible FBB step is about 8mV. The resistor tree of the N-well RDAC is referenced to $V_{\text{DDIP}}$ and that of the P-well RDAC to $V_{\text{SSIP}}$. The reference circuit supplies a constant bias current through the resistor tree. The resistor tree has been implemented as an array of resistor elements, with 8 horizontal and 8 vertical bit lines to select a given node in the tree. The decoder converts the 6-bit input of the FBB generator ($BB_{nw}$ or $BB_{pw}$) into the selection of a single horizontal-vertical bit line pair.
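The decoding and step size described above can be sketched as follows. The split of the 6-bit code into a row index (upper three bits) and a column index (lower three bits), and the convention that code 0 selects zero FBB, are assumptions made for illustration; the paper does not specify the bit ordering or the tap numbering.

```python
# Sketch of an 8x8 tap selection for a 64-resistor tree.
# Bit ordering and tap numbering are assumptions, not taken from the paper.
N_RESISTORS = 64
FULL_SCALE_MV = 500.0
STEP_MV = FULL_SCALE_MV / N_RESISTORS  # 7.8125 mV, i.e. "about 8 mV"


def decode(code):
    """Map a 6-bit code to a (row, column) bit-line pair of the 8x8 array."""
    if not 0 <= code < N_RESISTORS:
        raise ValueError("code must be a 6-bit value (0..63)")
    row, col = divmod(code, 8)  # assumed: upper 3 bits = row, lower 3 = column
    return row, col


def fbb_mv(code):
    """FBB magnitude in mV, assuming tap k sits k resistor steps from the rail."""
    row, col = decode(code)
    return (row * 8 + col) * STEP_MV


for code in (0, 1, 26, 63):
    print(code, decode(code), f"{fbb_mv(code):.1f} mV")
```

With this numbering the 64 codes cover 0mV to about 492mV; a numbering that starts one step above the rail would instead cover about 8mV to 500mV.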

C. Buffer

The voltage buffer is implemented by an operational amplifier configured as a unity-gain buffer with rail-to-rail output. It is powered from $V_{\text{DD}}$ and $V_{\text{SS}}$. The buffer consists of two stages: the pre-driver and an expandable output stage. The pre-driver contains the input stage and a gain stage.
Fig. 3 shows the circuit diagram of the pre-driver. The input stage is implemented by a double input pair to cover the wide input voltage range provided by the RDAC, especially when the digital IP has a voltage-scalable supply. A cascaded gain stage is used to achieve high gain. The pre-driver can be turned off by the $ENB$ signal. In this case, the outputs outp and outn are clamped to \( V_{DD} \) and \( V_{SS} \), respectively.

The expandable drive unit is implemented by a rail-to-rail class AB output stage, which maintains a small current in steady state and can deliver a large current during a transient. Such an output stage is well suited to driving large capacitive loads because it can both source and sink current. Fig. 4 (left) shows a circuit diagram of a drive stage for providing the P-well bias to the digital IP. The output stage for providing the N-well bias is similar, except that the switches are connected to \( V_{DD} \) and \( V_{DDIP} \), respectively. Circuit stability is accomplished using a Miller compensation scheme embedded in the output stage unit. When multiple drive units are used, the output stages are placed in parallel, which keeps the ratio between maximum load capacitance and Miller capacitance constant and thereby ensures stability.

![Figure 4](image)

The switches are indicated along with their control signals. The switches are used to clamp the output to fixed potentials when the voltage buffer is turned off. This ensures that the digital IP block is always properly body biased.

Multiple output stages can be connected to the pre-driver. The number of output stages to be used depends on the size of the digital IP block. The pre-driver with one output stage is suitable for driving a digital IP block of up to 1mm\(^2\). Two output stages can drive a digital IP block of up to 2mm\(^2\), and so on. In this way we have created an expandable output stage and offer a reusable FBB generator solution to drive digital IP blocks of different sizes. The combination of one output stage for the P-well and one for the N-well is referred to as a drive unit. Fig. 4 (right) shows the relative bandwidth of the generator as a function of the digital IP block dimension and the number of drive units. Observe that the bandwidth reduces for larger digital IP blocks that require more drive units.

IV. CIRCUIT LAYOUT

Fig. 5 shows the layout of the proposed FBB generator design in 90nm low-power CMOS technology. The base unit contains the reference circuit, the RDAC and decoders, and the pre-driver. The drive unit is connected to the base unit by abutment. The total area of the base unit and drive unit is 250\(\mu \)m by 125\(\mu \)m. The reference circuit, RDAC and decoders, pre-driver and output stage consume 24\%, 30\%, 14\%, and 32\% of the total area, respectively. The area of the drive unit alone is 80\(\mu \)m by 125\(\mu \)m. Additional drive units can be connected to each other by abutment. Alternatively, they can be spatially distributed in the overall chip layout while hooked up to the base unit.
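The area figures above, together with the sizing rule of one drive unit per 1mm² of digital IP, can be combined into a small estimate. The sketch assumes that every additional drive unit adds exactly one 80µm by 125µm tile and ignores any routing overhead for distributed placement; the function names are hypothetical.

```python
import math

# Dimensions stated in the text (micrometres).
TOTAL_WIDTH_UM = 250.0   # base unit plus one drive unit
HEIGHT_UM = 125.0
DRIVE_WIDTH_UM = 80.0    # one drive unit
UM2_PER_MM2 = 1e6


def drive_units_needed(ip_area_mm2):
    """One drive unit per started 1 mm^2 of digital IP (rule from the text)."""
    return max(1, math.ceil(ip_area_mm2))


def generator_area_mm2(n_drive_units):
    """Base unit plus n abutted drive units; assumes no extra routing area."""
    base_width_um = TOTAL_WIDTH_UM - DRIVE_WIDTH_UM
    width_um = base_width_um + n_drive_units * DRIVE_WIDTH_UM
    return width_um * HEIGHT_UM / UM2_PER_MM2


for ip_area in (0.5, 1.0, 2.5):
    n = drive_units_needed(ip_area)
    print(f"{ip_area} mm^2 IP: {n} drive unit(s), "
          f"generator area {generator_area_mm2(n):.5f} mm^2")
```

For a single drive unit the estimate gives 0.03125mm², consistent with the quoted 0.03mm², and the drive unit accounts for 80/250 = 32\% of that area, matching the output-stage share listed above.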

![Figure 5](image)

V. CIRCUIT SIMULATION RESULTS

Spectre circuit simulations have been performed for the base unit with a single drive unit while driving a digital IP block of 1mm\(^2\). Such a digital IP block contains approximately 300K equivalent gates. The total well capacitance and current for both the N-well and the P-well have been extracted as a function of FBB and of process-voltage-temperature conditions. We account for contributions from transistors and junction diodes. For a 1mm\(^2\) digital IP block, we obtained well capacitances of 1nF and 1.8nF, and well currents of 3.5mA (P-well) and 2mA in magnitude (N-well), at 0.5V FBB. The respective process and operating conditions are: nominal process, \( V_{DDIP} = 1.2V \), and \( T = 85^{\circ}\mathrm{C} \). Table 1 summarizes the main design characteristics of the FBB generator.

### Table 1. FBB Generator Design Characteristics

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Unit</th>
<th>Base Unit + 1 Drive Unit</th>
</tr>
</thead>
<tbody>
<tr>
<td>Circuit area</td>
<td>mm\(^2\)</td>
<td>0.03</td>
</tr>
<tr>
<td>Idd\(^{1}\)</td>
<td>\(\mu\)A</td>
<td>177</td>
</tr>
<tr>
<td>Iddq</td>
<td>nA</td>
<td>356 (72 at 27°C)</td>
</tr>
<tr>
<td>Bandwidth\(^{3}\)</td>
<td>kHz</td>
<td>570</td>
</tr>
<tr>
<td>Slew rate – P-well Rise</td>
<td>mV/\(\mu\)s</td>
<td>235</td>
</tr>
<tr>
<td>Slew rate – N-well Rise</td>
<td>mV/\(\mu\)s</td>
<td>256</td>
</tr>
<tr>
<td>Slew rate – P-well Fall</td>
<td>mV/\(\mu\)s</td>
<td>138</td>
</tr>
<tr>
<td>Slew rate – N-well Fall</td>
<td>mV/\(\mu\)s</td>
<td>152</td>
</tr>
</tbody>
</table>

\(^{1}\) Idd at nominal BB, \(^{3}\) Bandwidth at 0.5V FBB

Observe that the circuit area of the FBB generator is only a small fraction (~2\%) of the digital IP block area. The considered configuration consumes about 177\(\mu\)A in active mode. In standby, it leaks about 72nA at 27°C. Every additional drive unit increases the active and standby current by about 90\(\mu\)A and 54nA, respectively. The FBB generator has a bandwidth of 570kHz and a worst-case slew rate of 132 mV/\(\mu\)s. From a digital systems perspective, the FBB bandwidth indicates how often the IP block can change its FBB voltage, while the slew rate indicates how quickly a new FBB voltage becomes available. This makes the circuit suitable for both dynamic and adaptive body biasing applications.
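As a rough cross-check of these figures, the time for a full 500mV FBB step can be estimated from the slew rates in Table 1, and the active current can be extrapolated to several drive units. The sketch assumes a purely slew-limited transition (settling time is ignored) and a linear current increase of 90µA per additional drive unit; the function names are hypothetical.

```python
# Slew rates from Table 1, in mV/us.
SLEW_MV_PER_US = {
    "pwell_rise": 235.0,
    "nwell_rise": 256.0,
    "pwell_fall": 138.0,
    "nwell_fall": 152.0,
}


def step_time_us(step_mv, slew_mv_per_us):
    """Slew-limited transition time; ignores settling after the ramp."""
    return step_mv / slew_mv_per_us


def active_current_ua(n_drive_units):
    """177 uA for one drive unit, plus about 90 uA per additional unit."""
    return 177.0 + 90.0 * (n_drive_units - 1)


slowest = min(SLEW_MV_PER_US.values())
t_full_step = step_time_us(500.0, slowest)

print(f"slowest tabulated slew rate: {slowest:.0f} mV/us")
print(f"slew-limited 500 mV step: {t_full_step:.2f} us")
print(f"active current with 3 drive units: {active_current_ua(3):.0f} uA")
```

Under these assumptions the slowest tabulated edge completes a 500mV step in under 4µs, in line with the step time quoted in the abstract.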

Fig. 6 shows the simulation traces of the N-well and P-well voltage for the same conditions as before.

Fig. 7 demonstrates the operation of the FBB voltage generator along with a digital IP with voltage scaling (i.e. $V_{DDIP}$=scaled, $V_{SSIP}$=$V_{SS}$). In this example, the voltage scaling starts shortly after the FBB voltage generator is enabled at $t=5\mu$s. Observe that the N-well voltage follows $V_{DDIP}$ to maintain 0.2V FBB when reducing $V_{DDIP}$ from 1.2V down to 0.8V. This shows that the proposed voltage generator is suitable for use in power-managed digital circuit designs.
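The tracking behaviour in this example can be expressed as a simple ideal relation: the N-well sits one FBB amount below $V_{DDIP}$ and the P-well one FBB amount above $V_{SSIP}$. The sketch below uses that idealization and ignores buffer offset and transient error; the function names are hypothetical.

```python
def nwell_voltage(v_ddip, fbb):
    """Ideal N-well voltage: forward bias lowers the N-well below V_DDIP."""
    return v_ddip - fbb


def pwell_voltage(v_ssip, fbb):
    """Ideal P-well voltage: forward bias raises the P-well above V_SSIP."""
    return v_ssip + fbb


FBB = 0.2  # V, as in the voltage-scaling example

# Scale V_DDIP from 1.2 V down to 0.8 V with V_SSIP held at 0 V.
for v_ddip in (1.2, 1.1, 1.0, 0.9, 0.8):
    v_nw = nwell_voltage(v_ddip, FBB)
    v_pw = pwell_voltage(0.0, FBB)
    print(f"V_DDIP={v_ddip:.1f} V  N-well={v_nw:.1f} V  P-well={v_pw:.1f} V")
```

In this idealized picture the N-well moves from 1.0V to 0.6V as $V_{DDIP}$ is scaled from 1.2V to 0.8V, so the applied FBB stays at 0.2V throughout.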

The magnitude of the well currents depends on the size of the digital IP block under control and on the temperature. We have analyzed the dependence between the N-well/P-well voltage and the N-well/P-well current. For this purpose, we have used the base unit with one drive unit for FBB generation. Fig. 8 plots the obtained well voltage and current trends, which indicate the operational range of the FBB generator. Observe that both the N-well and P-well FBB voltages remain constant at 0.5V for well currents up to about 10mA in magnitude. Such well currents are about 3x and 5x larger than the expected maximum P-well and N-well currents for a 1mm² digital IP block, respectively (P-well: 3.5mA and N-well: -2mA at 85°C).

VI. CONCLUSION

In this paper we proposed a new fully-integrated forward body bias (FBB) generator that holds its voltage constant relative to the (scalable) power supply of a digital IP. The generator is modular and can drive distinct digital IP block sizes in multiples of up to 1mm².

The design has been implemented in 90nm low-power CMOS. Our basic unit for driving digital IP blocks up to 1mm² occupies a silicon area of 0.03mm². The generator completes a 500mV FBB voltage step within 4$\mu$s. The bandwidth of the design is 570kHz. Finally, the active current is about 177$\mu$A for a nominal process, 1.2V supply and 85°C. The standby current is as low as 72nA at 27°C.

REFERENCES


