VLSI Digital Design

# MODULE III PHYSICAL DESIGN ISSUES

- 3.1 Low-power design
- 3.2 Power-supply and clock distribution

# **3.1 Low-power design**

# **3.1.1 Power dissipation in CMOS gates**

#### **Power dissipation importance**

- Package cost and the cost of dissipating the generated heat.
- Supply lines design.
- Digital noise immunity.
- Battery autonomy (in portable systems).
- Environmental implications.

#### **Dissipation sources in CMOS**

- Switching activity in gates.
  - Parasitic capacitance charging and discharging.
  - Glitches.
- Short-circuit currents.
  - Direct electrical path between the supply lines during switching.
- Leakage current.
  - Diode (junction) leakage and subthreshold current in transistors.
- Static current.
  - Design styles (e.g., pseudo-NMOS).

## **3.1.2 Figures of merit**

- Mean power dissipation  $\overline{P}$  (watts).
  - It determines the battery autonomy (in hours).
  - It establishes the package limits.
- Peak power (watts).
  - It determines power supply and ground line widths.
  - It influences signal noise margins and reliability (electromigration).
- Energy efficiency (joules).
  - Power dissipation accumulated over time.
  - Energy = power × delay (joule = watt × second).
  - A smaller value means less power is required to perform a computation at the same operating frequency.

# **Figures of merit (2)**

- Power-delay product:  $PDP = \overline{P} \cdot t_p$ 
  - PDP is the average energy dissipated per switching event.
  - A design with low PDP could simply be a slower one (e.g., obtained by reducing  $V_{DD}$ ).
- Energy-delay product:  $EDP = PDP \cdot t_p$ 
  - Accounts for the tradeoff between delay and energy per operation.
  - Gives a better understanding of the tradeoff: a higher supply voltage reduces delay but increases energy.
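The tradeoff captured by the EDP can be explored numerically. The sketch below is only illustrative: it assumes a simple long-channel delay model, $t_p = C_L V_{DD} / (k (V_{DD} - V_T)^2)$, which is not given in these notes, and the values of `C_LOAD`, `VT`, and `K` are made-up placeholders. Under that assumed model the EDP is proportional to $V_{DD}^3 / (V_{DD} - V_T)^2$, whose minimum lies at $V_{DD} = 3 V_T$.

```python
def energy_per_switch(c_load, vdd):
    """Energy drawn from the supply for one 0->1 transition: C_L * V_DD^2."""
    return c_load * vdd ** 2


def delay(c_load, vdd, vt, k):
    """Assumed delay model (not from the notes): C_L*V_DD / (k*(V_DD-V_T)^2)."""
    return c_load * vdd / (k * (vdd - vt) ** 2)


def edp(c_load, vdd, vt, k):
    """Energy-delay product: energy per switching event times delay."""
    return energy_per_switch(c_load, vdd) * delay(c_load, vdd, vt, k)


# Hypothetical device and load parameters, chosen only for illustration.
C_LOAD = 10e-15  # load capacitance in farads
VT = 0.5         # threshold voltage in volts
K = 1e-4         # transconductance-like constant in A/V^2

# Sweep the supply voltage from 0.60 V to 2.99 V in 10 mV steps.
supply_sweep = [0.6 + 0.01 * i for i in range(240)]
best_vdd = min(supply_sweep, key=lambda v: edp(C_LOAD, v, VT, K))
print(f"EDP minimum near V_DD = {best_vdd:.2f} V")
```

Raising $V_{DD}$ above the minimum buys speed at a growing energy cost, while lowering it toward $V_T$ makes the delay term dominate.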



Figure: energy-delay product (EDP) as a function of supply voltage.







### Logic gate switching activity (1)

Logic gate average dynamic power consumption:

$$\overline{P}_{din} = C_L V_{DD}^2 f_{0 \to 1} = C_L V_{DD}^2 p_{0 \to 1} f$$

where  $f_{0\rightarrow 1}$  is the *switching activity* (i.e., the transition frequency),  $f$  is the clock frequency, and  $p_{0\rightarrow 1}$  is the *transition probability*, i.e., the probability that an input event produces an output event.

It depends on:

- Input signals statistics
- Circuit style (dynamic or static)
- Logic function
- Network topology
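As a quick numerical check of the dynamic power expression above, the following sketch evaluates $C_L V_{DD}^2 p_{0\to 1} f$ directly. The load capacitance, supply voltage, transition probability, and clock frequency used in the call are assumed example values, not figures from these notes.

```python
def dynamic_power(c_load, vdd, p_transition, f_clock):
    """Average dynamic power in watts: C_L * V_DD^2 * p_{0->1} * f."""
    if not 0.0 <= p_transition <= 1.0:
        raise ValueError("p_transition must be a probability in [0, 1]")
    return c_load * vdd ** 2 * p_transition * f_clock


# Assumed example: 10 fF load, 2.5 V supply, p_{0->1} = 0.25, 100 MHz clock.
p_avg = dynamic_power(c_load=10e-15, vdd=2.5, p_transition=0.25, f_clock=100e6)
print(f"{p_avg * 1e6:.4f} uW")
```

Halving $V_{DD}$ in this call cuts the result by four, while halving the transition probability or the clock frequency only cuts it by two.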

### **Example: 2-input NOR gate**

Hypothesis: equiprobable and independent inputs 00, 01, 10, 11.

 $p_0 = p(Y = 0) = 3/4$ ,  $p_1 = p(Y = 1) = 1/4$ 

| A | B | Y |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 0 |

 $p_{0\to 1} = p_0 p_1 = 3/16$ 

State transition probability diagram. Observation: The output signal probability is not uniform anymore.



Non-equiprobable inputs

$$\begin{cases} p_1 = (1 - p_A) \cdot (1 - p_B) \\ p_0 = 1 - p_1 \end{cases}$$

where  $p_A = p(A = 1)$  and  $p_B = p(B = 1)$ . Thus,

$$p_{0\to 1} = p_0 p_1 = (1 - p_1) \cdot p_1 = [1 - (1 - p_A) \cdot (1 - p_B)] \cdot (1 - p_A) \cdot (1 - p_B)$$
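The NOR results above can be reproduced by enumerating the truth table and weighting each row by its probability. This is a minimal sketch assuming independent inputs; the helper names (`output_one_probability`, `transition_probability`, `nor2`) are illustrative and do not come from the notes.

```python
from fractions import Fraction
from itertools import product


def output_one_probability(gate, input_probs):
    """P(output = 1) for independent inputs, given P(input = 1) for each."""
    p_one = Fraction(0)
    for bits in product((0, 1), repeat=len(input_probs)):
        weight = Fraction(1)
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1 - p
        if gate(*bits):
            p_one += weight
    return p_one


def transition_probability(gate, input_probs):
    """p_{0->1} = p_0 * p_1 for a static gate."""
    p_one = output_one_probability(gate, input_probs)
    return (1 - p_one) * p_one


def nor2(a, b):
    """2-input NOR truth function."""
    return int(not (a or b))


half = Fraction(1, 2)
print(transition_probability(nor2, [half, half]))  # 3/16

# Non-equiprobable case (assumed values p_A = 1/4, p_B = 1/2).
p_a, p_b = Fraction(1, 4), Fraction(1, 2)
closed_form = (1 - (1 - p_a) * (1 - p_b)) * (1 - p_a) * (1 - p_b)
print(transition_probability(nor2, [p_a, p_b]) == closed_form)  # True
```

Using `Fraction` keeps the results exact, so they can be compared directly against the hand-derived values.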



Logic levels change signal statistics. Assume **a**, **b**, **c** are equiprobable and independent.





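Circuit without reconvergent fanout,  $y = x \cdot c$  with  $p_1(x) = 3/4$  (e.g.,  $x = a + b$ ):

$$p_{0\to 1}(y) = p_0(y)p_1(y) = (1-p_1(y)) p_1(y)$$

$$p_1(y) = p_1(x) p_1(c) = 3/8$$

Thus,  $p_{0\to 1}(y) = 15/64$ .

Reconvergent fanout circuit,  $y = (a+b) \cdot b$ : here  $x = a + b$  and  $b$  are not independent, so a conditional probability is required:

$$p_1(y) = p(x = 1 \mid b = 1) \, p_1(b) = p_1(b) = 1/2$$

This is the expected result, since minimizing the Boolean expression gives  $y = (a+b)b = b$ .

Therefore,  $p_{0\to 1}(y) = 1/4$ .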
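The effect of reconvergent fanout can be checked by comparing a naive calculation, which wrongly multiplies probabilities as if $x = a + b$ and $b$ were independent, against exhaustive enumeration of $y = (a+b)b$. This is an illustrative sketch for equiprobable, independent primary inputs; the function name `exact_p1` is hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)


def exact_p1(func, n_inputs):
    """Exact P(output = 1) for equiprobable, independent primary inputs."""
    ones = sum(func(*bits) for bits in product((0, 1), repeat=n_inputs))
    return Fraction(ones, 2 ** n_inputs)


# x = a + b feeds an AND gate together with b (reconvergent fanout on b).
p1_x = exact_p1(lambda a, b: a | b, 2)

# Naive estimate: treats x and b as independent, which they are not.
naive_p1_y = p1_x * HALF

# Exact value: enumerate y = (a + b) * b over the primary inputs.
exact_p1_y = exact_p1(lambda a, b: (a | b) & b, 2)

naive_p01 = (1 - naive_p1_y) * naive_p1_y
exact_p01 = (1 - exact_p1_y) * exact_p1_y
print(naive_p01, exact_p01)  # 15/64 1/4
```

The naive value coincides with the result for the circuit whose second AND input is an independent signal, which is why the independence assumption must be dropped when a signal reconverges.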

#### **Dynamic implementations**

- The output node is precharged every clock cycle.
- Every time the evaluation network discharges the precharged node, energy is dissipated (i.e., whenever the output evaluates to zero in an NMOS network).
- Therefore,  $p_{1\to 0}(y) = p_0(y) \ge p_0(y) p_1(y)$ .
- More switching activity: the signal probability is higher than the transition probability of the static gate.
- Power may be consumed even if the inputs do not change (unlike static logic).
- Example: 2-input dynamic NOR gate.

 $p_{1\to 0}(y) = p_0(y) = p_A + p_B - p_A p_B = 3/4$  (for the equiprobable input case).

This is 4 times higher than the switching activity of the static NOR gate:

 $p_{0\to 1}(y) = p_1(y) p_0(y) = (1-p_0(y)) p_0(y) = 3/16.$ 

- This effect is reduced by the smaller input capacitance of dynamic logic.
- On the other hand, power is increased by the high-speed switching of the clock signal, which static gates do not require.
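The factor of 4 between the dynamic and static NOR gates follows from dividing $p_0$ by $p_0 p_1$, which gives $1/p_1$. The short sketch below evaluates both activities for a 2-input NOR gate with independent inputs; the function name `nor2_activity` is illustrative only, and capacitance differences and clock power are deliberately left out.

```python
from fractions import Fraction


def nor2_activity(p_a, p_b):
    """Return (static p_{0->1}, dynamic p_{1->0}) for a 2-input NOR gate."""
    p0 = p_a + p_b - p_a * p_b  # P(Y = 0)
    p1 = 1 - p0                 # P(Y = 1)
    return p0 * p1, p0


static_act, dynamic_act = nor2_activity(Fraction(1, 2), Fraction(1, 2))
print(static_act, dynamic_act, dynamic_act / static_act)  # 3/16 3/4 4
```

Because the ratio equals $1/p_1$, the penalty of the dynamic implementation grows as the output is more likely to be zero.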



Dynamic logic does not produce glitches

# **3.1.5 Short circuit currents in static CMOS circuits**

- When the input switches, both the PMOS and NMOS transistors are on simultaneously for a short time.
- This does not happen in dynamic circuits.



Short-circuit current is globally minimized by **equalizing** the rise and fall times of the input and output signals.

#### Short circuit currents in static CMOS circuits (2)

- If the output signal rise/fall time is equal to or greater than that of the input signal, less than 10% of the total power dissipation is due to short-circuit current.
- It decreases as  $V_{DD}$  is reduced.
- For  $V_{DD} < V_{TN} + |V_{TP}|$  it disappears, since both devices are never simultaneously on.

## 3.1.6 Leakage current



### 3.1.7 Low-power design

Power reduction levels:

- Gates and transistors.
- Functional units and architecture.

**Gate-level low-power design principles** 

$$\overline{P}_{din} = C_L V_{DD}^2 f_{0 \to 1}$$

- Reducing the supply voltage ( $V_{DD}$ )
  - Quadratic effect  $\rightarrow$  very important effect.
  - Negative effect on speed, especially when  $V_{DD} \le V_{TN} + |V_{TP}|$ .
  - If  $V_{TN}$ ,  $|V_{TP}|$  are reduced  $\rightarrow$  leakage currents rise.

  - Option 1: Dual  $V_{DD}$  (e.g.,  $V_{DDH} = 2.5$  V and  $V_{DDL} = 1.5$  V)
    - Use  $V_{DDH}$  in critical-path gates and  $V_{DDL}$  for all other gates.
    - Reduces energy without performance loss.
    - Slight increase in area and design time.
    - Level converters are required to interface gates with different supply voltages, to prevent static currents.

  - Option 2: Dual  $V_{TH}$  (e.g.,  $V_{THH} = 0.6$  V and  $V_{THL} = 0.3$  V, with  $V_{DD} = 2.5$  V)
    - Use  $V_{THL}$  in critical-path gates and  $V_{THH}$  for all other gates.
    - Improves performance without increasing power.
    - Higher manufacturing complexity (higher cost).
    - Design time increases.
    - Leakage current consumption has to be evaluated in the  $V_{THL}$  transistors.

- Reducing the capacitance (e.g., pass-transistor logic instead of static logic).
- Reducing switching frequency.
  - Clock frequency.
  - Switching activity. Example.



- Reducing glitches.
- Reducing short-circuit current (slope engineering).
- Reducing leakage current.
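For the dual-$V_{DD}$ option listed above, a first-order estimate of the switching-energy saving follows from the quadratic dependence on supply voltage. The sketch below assumes that a given fraction of the switched capacitance (`frac_low`, a hypothetical parameter) can be moved to $V_{DDL}$ without touching the critical path, and it ignores the energy of the level converters, so it is an optimistic bound rather than a design figure.

```python
def dual_vdd_energy_ratio(frac_low, vdd_high, vdd_low):
    """Switching energy relative to running every gate at vdd_high.

    frac_low is the fraction of switched capacitance supplied from vdd_low.
    Level-converter overhead is ignored (simplifying assumption).
    """
    if not 0.0 <= frac_low <= 1.0:
        raise ValueError("frac_low must be in [0, 1]")
    return (1.0 - frac_low) + frac_low * (vdd_low / vdd_high) ** 2


# Supply values from the example above; the 80% fraction is assumed.
ratio = dual_vdd_energy_ratio(frac_low=0.8, vdd_high=2.5, vdd_low=1.5)
print(f"{ratio:.3f}")  # 0.488
```

With these numbers the switching energy drops to roughly half, which shows why it pays to keep as few gates as possible on the critical path.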

#### Functional- and architecture-level low-power design

From global to local buses (area-power tradeoff). Example: the AMBA bus (divided into AHB and APB).



#### Functional- and architecture-level low-power design (2)



# **3.1.8 Dynamic Voltage Scaling (DVS)**

DVS consists of dynamically adapting the supply voltage and clock frequency to the computational requirements of the system over time.



If  $V_{DD}$  is reduced, the circuit reacts more slowly and switching frequency has to be reduced.

For the same computation,

- The number of transitions is kept constant.
- Power scales with  $\alpha_i^2$  (not accounting for short-circuit current).
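A DVS policy can be sketched as picking the lowest supply voltage whose maximum clock frequency still meets the deadline of the current workload. Everything in this example is assumed for illustration: the table of operating points, the workload, and the reading of $\alpha$ as the ratio between the scaled and the nominal $V_{DD}$, which these notes do not define explicitly.

```python
# Hypothetical operating points: (V_DD in volts, maximum clock in Hz).
LEVELS = [
    (1.0, 100e6),
    (1.5, 200e6),
    (2.0, 300e6),
    (2.5, 400e6),
]


def pick_operating_point(cycles, deadline_s, levels=LEVELS):
    """Lowest-voltage point that still finishes `cycles` before the deadline."""
    for vdd, f_max in sorted(levels):
        if cycles / f_max <= deadline_s:
            return vdd, f_max
    raise ValueError("deadline cannot be met at any operating point")


def relative_switching_energy(vdd, vdd_nominal=2.5):
    """Ratio alpha^2 for a fixed number of transitions (no short-circuit term)."""
    return (vdd / vdd_nominal) ** 2


# Assumed workload: 1.5 million cycles that must finish within 10 ms.
vdd, f_clk = pick_operating_point(cycles=1_500_000, deadline_s=10e-3)
print(vdd, f_clk, round(relative_switching_energy(vdd), 2))  # 1.5 200000000.0 0.36
```

Since the number of transitions is fixed by the computation, the saving comes entirely from the lower voltage; the lower clock frequency only stretches the same work over more time.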

# **3.2 Power-supply and clock distribution**

Supply circuits

- High current levels.
- Ohmic and reactive voltage drops.
- Parallelism in power supply pads.

Clock circuits

- Synchronous systems and dynamic logic  $\rightarrow$  large fanout.
- Need for:
  - Buffering.
  - Skew reduction: geometrical distribution.
  - Tuning techniques: PLLs, DLLs.

# **3.2.1 Power supply circuits**

Interconnect line widths have to be adjusted accounting for:

- Metal sheet resistance (resistance per square,  $R_\square$ ).
- Current required by each logic block.
- Acceptable voltage drop at the least favorable circuit position.

Using multiple supply pins, the line width can be reduced.
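The three sizing factors above combine into a simple ohmic estimate: a line of length $L$ and width $W$ has resistance $R_\square L / W$, so the width must satisfy $W \ge R_\square L I / \Delta V$. The sketch below is a first-order approximation under stated assumptions: the block current is lumped at the far end of the line, reactive drops and electromigration limits are ignored, and the current is assumed to split evenly across `n_pins` supply pins. All numeric values are made up.

```python
def min_line_width(r_sheet, length, current, max_drop, n_pins=1):
    """Minimum metal width (m) keeping the ohmic drop within max_drop.

    R = r_sheet * length / width and drop = (current / n_pins) * R,
    with the block current lumped at the far end of the line.
    """
    if n_pins < 1:
        raise ValueError("n_pins must be at least 1")
    return r_sheet * length * (current / n_pins) / max_drop


# Assumed values: 0.05 ohm/square, 1 mm line, 10 mA block, 0.1 V allowed drop.
w_single = min_line_width(r_sheet=0.05, length=1000e-6, current=10e-3, max_drop=0.1)
w_four = min_line_width(0.05, 1000e-6, 10e-3, 0.1, n_pins=4)
print(f"{w_single * 1e6:.2f} um, {w_four * 1e6:.2f} um")  # 5.00 um, 1.25 um
```

Under the even-split idealization, four pins allow each line to be four times narrower for the same voltage drop.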



# **3.2.2 Clock circuits**

- Geometrical distribution: H-tree (H-shaped) circuit.
- Buffer insertion.
- PLL/DLL usage.
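The idea behind the H-shaped geometrical distribution is that every clock sink sits at the same wire distance from the root, so nominal skew is zero when wires and loads are identical. The sketch below builds an idealized binary H-tree recursively and checks that property; the function name, the coordinates, and the `half_span` parameter are illustrative assumptions, and buffers, wire resistance, and load mismatch are not modeled.

```python
def h_tree_leaves(x, y, half_span, levels, horizontal=True, path_len=0.0):
    """Return (leaf_x, leaf_y, wire_length_from_root) for a binary H-tree."""
    if levels == 0:
        return [(x, y, path_len)]
    leaves = []
    for sign in (-1, 1):
        # Branch left/right on horizontal levels, up/down on vertical ones.
        nx = x + sign * half_span if horizontal else x
        ny = y if horizontal else y + sign * half_span
        # Halve the span after each vertical level so the H shapes nest.
        next_span = half_span if horizontal else half_span / 2
        leaves += h_tree_leaves(nx, ny, next_span, levels - 1,
                                not horizontal, path_len + half_span)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half_span=4.0, levels=4)
lengths = {round(length, 9) for _, _, length in leaves}
print(len(leaves), lengths)  # 16 {12.0}
```

Four levels yield 16 sinks on a regular 4 × 4 grid, all at the same path length from the clock source.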

