



# Fundamental Limits on the Computational Accuracy of Resistive Crossbar-based In-memory Architectures

Saion K. Roy<sup>1</sup>, Ameya Patil<sup>2</sup>, and Naresh R. Shanbhag<sup>1</sup>

1: University of Illinois at Urbana-Champaign 2: Amazon Lab126

2022 IEEE International Symposium on Circuits and Systems, May 28 - June 1, 2022, Hybrid Conference

# Outline

- Introduction
- Resistive Crossbar Architecture
- Behavioral Modeling
- Simulation Results
  - Model validation
  - Compute SNR analysis for MRAM, ReRAM, and FeFET crossbars
  - System level accuracy of ResNet-20 on CIFAR-10
- Conclusion

# In-memory Computing (IMC)

#### compute memory



#### first IMC concept paper (ICASSP 2014)

#### AN ENERGY-EFFICIENT VLSI ARCHITECTURE FOR PATTERN RECOGNITION VIA DEEP EMBEDDING OF COMPUTATION IN SRAM

Mingu Kang\*, Min-Sun Keel\*, Naresh R. Shanbhag\*, Sean Eilert<sup>†</sup>, and Ken Curewitz<sup>†</sup>

\*Dept. Electrical and Computer Engineering, University of Illinois at Urbana-Champaign †Micron Technology, Inc

- computes an M×N matrix-vector multiply (MVM)
- SRAM-based IMC banks are mature → 20× lower energy + 9× higher compute density than digital<sup>1</sup>
- eNVM-based (MRAM/ReRAM) IMCs have potential for high compute density but lag digital due to low compute SNR → this work explains why

## **Resistive Crossbar Architecture**

#### voltage-drive current-sensing crossbar

[Figure: crossbar schematic. Differential inputs drive the bit-cell conductances $G_{0,0}, \ldots, G_{M-1,2N-1}$; each SL $m$ has a $G_s$ element, a TIA sensing $I_{\mathrm{SL},m}$, and an ADC producing output $a_m$.]

- computes an *M*×*N* matrix-vector multiply (MVM)
- V-DACs provide differential inputs on BLs ( $V_{2k} = -V_{2k-1}$ )
- two BCs  $(G_{2k-1}, G_{2k})$  store 1 bit
- current summing & sensing on SLs
- device resistive contrast

$$o = \frac{R_{\rm off}}{R_{\rm on}}$$

2 (MRAM); 12 (ReRAM); $10^3$ (FeFET)
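The differential scheme on this slide can be sketched in a few lines of Python. This is a minimal, idealized illustration and not the authors' model: the mapping of a stored bit to the conductance pair (`1` → $(G_{\rm on}, G_{\rm off})$, `0` → $(G_{\rm off}, G_{\rm on})$), the function names, and the absence of any sensing resistance or noise are all assumptions made here.

```python
def sl_current(volts, weight_bits, r_on, r_off):
    """Ideal SL current (A) for one SL, ignoring all non-idealities."""
    g_on, g_off = 1.0 / r_on, 1.0 / r_off
    total = 0.0
    for v, w in zip(volts, weight_bits):
        # Assumed mapping: bit 1 -> (G_on, G_off), bit 0 -> (G_off, G_on).
        g_pos, g_neg = (g_on, g_off) if w == 1 else (g_off, g_on)
        # The two BLs of a pair carry +v and -v, so the two currents subtract.
        total += v * g_pos + (-v) * g_neg
    return total


def differential_gain(contrast):
    """Fraction of G_on left after the subtraction: (G_on - G_off) / G_on."""
    return 1.0 - 1.0 / contrast
```

Under these assumptions each pair contributes $\pm V (G_{\rm on} - G_{\rm off})$, so the signal scales with `differential_gain`: 0.5 for a contrast of 2, about 0.917 for 12, and 0.999 for $10^3$.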

### **Behavioral Modeling**



signal current in SL



total current in SL

$$I_{\text{SL}} = I_{\text{sig}} + I_{\text{nb}} + I_{\text{nd}} + I_{\text{nc}} + I_{\text{nq}}$$

noise sources annotated on the slide: DAC mismatch noise, conductance variation noise, clipping, and quantization
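One way to see the total SL current as a signal term plus additive noise terms is to pass the ideal current through each non-ideality in turn and record the change at every stage. The sketch below does this under assumptions that are not from the slides: multiplicative Gaussian errors, one DAC mismatch draw per differential pair, a mid-rise quantizer, and descriptive dictionary keys in place of the slide's noise symbols. All numeric values in the usage lines are illustrative.

```python
import math
import random


def sl_current_with_noise(volts, weight_bits, g_on, g_off,
                          sigma_dac, sigma_g, i_clip, adc_bits, rng):
    """Return (i_sig, noise): ideal current and per-source noise currents."""
    pairs = [(g_on, g_off) if w == 1 else (g_off, g_on) for w in weight_bits]

    # Stage 0: ideal signal current from differential inputs and cell pairs.
    i_sig = sum(v * (gp - gn) for v, (gp, gn) in zip(volts, pairs))

    # Stage 1: multiplicative Gaussian mismatch on each DAC output voltage
    # (one draw per differential pair, a simplifying assumption).
    v_real = [v * (1.0 + rng.gauss(0.0, sigma_dac)) for v in volts]
    i_dac = sum(v * (gp - gn) for v, (gp, gn) in zip(v_real, pairs))

    # Stage 2: independent multiplicative variation on every cell conductance.
    i_var = 0.0
    for v, (gp, gn) in zip(v_real, pairs):
        gp_real = gp * (1.0 + rng.gauss(0.0, sigma_g))
        gn_real = gn * (1.0 + rng.gauss(0.0, sigma_g))
        i_var += v * (gp_real - gn_real)

    # Stage 3: the ADC input range clips the current to [-i_clip, +i_clip].
    i_clipped = max(-i_clip, min(i_clip, i_var))

    # Stage 4: uniform mid-rise quantization to 2**adc_bits levels.
    levels = 2 ** adc_bits
    step = 2.0 * i_clip / levels
    index = min(levels - 1, math.floor((i_clipped + i_clip) / step))
    i_out = -i_clip + (index + 0.5) * step

    noise = {
        "dac_mismatch": i_dac - i_sig,
        "conductance_variation": i_var - i_dac,
        "clipping": i_clipped - i_var,
        "quantization": i_out - i_clipped,
    }
    return i_sig, noise


# Illustrative usage with hypothetical device values (G_on = 10 uS, contrast 12).
rng = random.Random(0)
volts = [0.003 * rng.randint(-16, 15) for _ in range(64)]
weight_bits = [rng.choice([0, 1]) for _ in range(64)]
i_sig, noise = sl_current_with_noise(volts, weight_bits, 1e-5, 1e-5 / 12,
                                     0.04, 0.04, 2e-6, 6, rng)
```

By construction the differences telescope, so `i_sig + sum(noise.values())` equals the quantized ADC-referred current, mirroring the additive form of the model.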

### **Results – Model Validation**

#### **SPICE** simulations in a 22 nm process

$N = 512$, $R_s = 316\,\Omega$, 6b ADC



- DAC input: signed 5b with  $V_{\rm lsb} = 3 {\rm mV}$
- DAC mismatch: 4%
- Conductance variation: 4%
- ADC clipping range: [-2 μA, +2 μA]

 SL current varies due to analog non-idealities

### **Results – Compute SNR Analysis**



$$SNR = \frac{\mathbb{E}[I_{sig}^2]}{\mathbb{E}[I_{nb}^2] + \mathbb{E}[I_{nd}^2] + \mathbb{E}[I_{nc}^2] + \mathbb{E}[I_{nq}^2]},$$

- model and simulations match → further validates model
- ADC clipping vs. quantization noise trade-off:

$$S_I = \left[\frac{R_{\rm arr}}{R_{\rm arr} + R_s}\right]$$

compute SNR is maximized if $R_s = R_s^*$ → clipping noise & quantization noise are equal
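The clipping versus quantization trade-off can be illustrated numerically. In the sketch below, `scale` is a hypothetical stand-in for how the sensing resistance scales the SL current into a fixed ADC range; the Gaussian current distribution, the unit clipping level, the mid-rise quantizer, and the function names are assumptions, not the authors' setup. Noise powers are summed without cross terms, following the form of the SNR expression above.

```python
import math
import random


def adc(i, i_clip, bits):
    """Return (clipped, quantized) values for a uniform mid-rise ADC."""
    levels = 2 ** bits
    step = 2.0 * i_clip / levels
    clipped = max(-i_clip, min(i_clip, i))
    index = min(levels - 1, math.floor((clipped + i_clip) / step))
    return clipped, -i_clip + (index + 0.5) * step


def noise_powers(scale, samples, i_clip=1.0, bits=6):
    """Return mean-square (signal, clipping noise, quantization noise)."""
    sig = clip_n = quant_n = 0.0
    for x in samples:
        i = scale * x
        clipped, out = adc(i, i_clip, bits)
        sig += i * i
        clip_n += (clipped - i) ** 2
        quant_n += (out - clipped) ** 2
    n = len(samples)
    return sig / n, clip_n / n, quant_n / n


def snr_db(scale, samples, i_clip=1.0, bits=6):
    """SNR in dB counting only clipping and quantization noise."""
    sig, clip_n, quant_n = noise_powers(scale, samples, i_clip, bits)
    return 10.0 * math.log10(sig / (clip_n + quant_n))


# Sweep three scales: too small, intermediate, too large.
rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]
for scale in (0.05, 0.3, 2.0):
    print(scale, round(snr_db(scale, samples), 1))
```

A small scale wastes ADC range, so quantization noise dominates; a large scale drives the current into the rails, so clipping noise dominates. The SNR peaks at an intermediate scale, consistent with the slide's claim that an optimal $R_s^*$ exists.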

## **Results – SNR Dependence**



- high dot-product (DP) dimension → small $R_{\rm arr}$ → small $R_s = R_{s,\min}\ (= 1\,\mathrm{k\Omega}) \neq R_s^*$
- higher absolute  $R_{on}$ ,  $R_{off}$  critical for high DP dimension

- SNR<sub>max</sub> rolls off with higher DP dimension
- minimum ADC precision $B_{\rm ADC}^*$ increases with SNR<sub>max</sub>
  - SNR<sub>max</sub> saturates for $B_{\rm ADC} > B_{\rm ADC}^*$ → DAC mismatch and G variations dominate
- SNR<sub>max</sub> improves with higher resistive contrast, but increasing device resistive contrast beyond ~12 to 15 is futile
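The saturation of SNR with ADC precision follows from noise powers adding: once quantization noise falls below the analog floor set by DAC mismatch and conductance variations, extra bits no longer help. The sketch below illustrates this with the standard uniform-quantizer noise power of step²/12. The 0.5 dB margin used to define a "minimum" precision, the function names, and the numeric values are hypothetical choices made here, not the authors' definition of $B_{\rm ADC}^*$.

```python
import math


def total_snr_db(bits, sig_power, analog_noise_power, i_clip):
    """SNR in dB with an analog noise floor plus uniform quantization noise."""
    step = 2.0 * i_clip / (2 ** bits)
    quant_power = step ** 2 / 12.0  # standard uniform quantization noise power
    return 10.0 * math.log10(sig_power / (analog_noise_power + quant_power))


def min_adc_bits(sig_power, analog_noise_power, i_clip,
                 margin_db=0.5, max_bits=16):
    """Smallest precision whose SNR is within margin_db of the analog ceiling."""
    ceiling_db = 10.0 * math.log10(sig_power / analog_noise_power)
    for bits in range(1, max_bits + 1):
        snr = total_snr_db(bits, sig_power, analog_noise_power, i_clip)
        if ceiling_db - snr <= margin_db:
            return bits
    return None  # ceiling not reached within max_bits


# Illustrative values: the same signal power with a 20 dB and a 30 dB ceiling.
bits_20db = min_adc_bits(0.09, 9e-4, 1.0)
bits_30db = min_adc_bits(0.09, 9e-5, 1.0)
```

In this sketch a higher analog SNR ceiling calls for more ADC bits before the curve flattens, which matches the slide's observation that the minimum ADC precision increases with SNR<sub>max</sub>.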

## **System Level Accuracy Prediction Set-up**



#### **Results – System Level Accuracy Prediction**



- Baseline ResNet-20 on CIFAR-10: 5b input, ternary weights, accuracy = 84.94% (3-layer network)
- 3 Crossbars: N = 144, 288, 576;  $R_{s1} = R_{s1}^*$ ; sweep  $(R_{s2}, R_{s3})$
- crossbar design predicted by SNR analysis achieves system-level accuracy within 1% of the exhaustive-search result and within 2% of the digital baseline (84.94%)
- bank-level SNR is a good proxy for network level accuracy → SNR analysis bypasses trial & error

#### Conclusion

- proposed an analytical framework to obtain SNR-optimal resistive crossbar parameters → avoids expensive trial and error
- insights provided by the framework:
  - SNR-optimal sensing resistance  $R_s^*$  exists which equalizes the clipping and quantization noise in the column ADCs
  - system level inference accuracy is maximized when bank-level compute SNR is maximized
  - increasing device resistive contrast improves SNR only up to a point (~12-15); beyond that, returns diminish due to mismatch (input DACs) and variations (device conductance)
- proposed framework can be extended to other resistive IMC and devices

#### **Thank You**

saionkr2@Illinois.edu

#### Acknowledgement

Work supported by the Defense Advanced Research Projects Agency (DARPA) and the Semiconductor Research Corporation (SRC) via the FRANC Program, and the Center for Brain-inspired Computing (C-BRIC).