Leakage Mitigation Techniques in Smartphone SoCs

Similar documents
Jae Wook Lee. SIC R&D Lab. LG Electronics

Adaptive Voltage Scaling (AVS) Alex Vainberg October 13, 2010

Comprehensive Place-and-Route Platform Olympus-SoC

Design Compiler Graphical Create a Better Starting Point for Faster Physical Implementation

Low-Power Technology for Image-Processing LSIs

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Synopsys Design Platform

Low Power Emulation for Power Intensive Designs

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

Reducing DRAM Latency at Low Cost by Exploiting Heterogeneity. Donghyuk Lee Carnegie Mellon University

Process and Design Solutions for Exploiting FD SOI Technology Towards Energy Efficient SOCs

Next-generation Power Aware CDC Verification What have we learned?

Power Solutions for Leading-Edge FPGAs. Vaughn Betz & Paul Ekas

8D-3. Experiences of Low Power Design Implementation and Verification. Shi-Hao Chen. Jiing-Yuan Lin

Efficient Evaluation and Management of Temperature and Reliability for Multiprocessor Systems

Apache s Power Noise Simulation Technologies

Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs.

Stepping into UPF 2.1 world: Easy solution to complex Power Aware Verification. Amit Srivastava Madhur Bhargava

Building Energy-Efficient ICs from the Ground Up

Evolution of UPF: Getting Better All the Time

PA GLS: The Power Aware Gate-level Simulation

Cutting Power Consumption in HDD Electronics. Duncan Furness Senior Product Manager

PowerAware RTL Verification of USB 3.0 IPs by Gayathri SN and Badrinath Ramachandra, L&T Technology Services Limited

System-on-Chip Architecture for Mobile Applications. Sabyasachi Dey

An FPGA Architecture Supporting Dynamically-Controlled Power Gating

A 50% Lower Power ARM Cortex CPU using DDC Technology with Body Bias. David Kidd August 26, 2013

Power Aware Architecture Design for Multicore SoCs

ARM big.little Technology Unleashed An Improved User Experience Delivered

Minimization of NBTI Performance Degradation Using Internal Node Control

Blackfin Optimizations for Performance and Power Consumption

Chapter 5: ASICs Vs. PLDs

Complex Low Power Verification Challenges in NextGen SoCs: Taming the Beast!

Unleashing the Power of Embedded DRAM

Low Power SRAM Techniques for Handheld Products

Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool

Managing Power and Performance for System-on-Chip Designs using Voltage Islands

Power, Performance and Area Implementation Analysis.

SSO Noise And Conducted EMI: Modeling, Analysis, And Design Solutions

System Level Design with IBM PowerPC Models

TINY System Ultra-Low Power Sensor Hub for Always-on Context Features

Normally-Off MCU Architecture for Low-power Sensor Node

Low Power System-on-Chip Design Chapters 3-4

Building Ultra-Low Power Wearable SoCs

Frequency and Voltage Scaling Design. Ruixing Yang

SH-Mobile3: Application Processor for 3G Cellular Phones on a Low-Power SoC Design Platform

SSO Noise And Conducted EMI: Modeling, Analysis, And Design Solutions

Last Time. Making correct concurrent programs. Maintaining invariants Avoiding deadlocks

Low-Power Verification Methodology using UPF Query functions and Bind checkers

UPF GENERIC REFERENCES: UNLEASHING THE FULL POTENTIAL

Power: What s the problem?

08 - Address Generator Unit (AGU)

EE-382M VLSI II. Early Design Planning: Front End

Retention based low power DV challenges in DDR Systems

CPU-GPU Heterogeneous Computing

James Lin Vice President, Technology Infrastructure Group National Semiconductor Corporation CODES + ISSS 2003 October 3rd, 2003

Advanced FPGA Design Methodologies with Xilinx Vivado

A 65nm LEVEL-1 CACHE FOR MOBILE APPLICATIONS

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1

Low-Power Processor Solutions for Always-on Devices

Seahawk Power-optimized implementation of High Performance Quad-core Cortex-A15 Processor

Beyond Soft IP Quality to Predictable Soft IP Reuse TSMC 2013 Open Innovation Platform Presented at Ecosystem Forum, 2013

Novel Nonvolatile Memory Hierarchies to Realize "Normally-Off Mobile Processors" ASP-DAC 2014

Design and Implementation of Low Leakage Power SRAM System Using Full Stack Asymmetric SRAM

A Novel Methodology to Debug Leakage Power Issues in Silicon- A Mobile SoC Ramp Production Case Study

Reduce Verification Complexity In Low/Multi-Power Designs. Matthew Hogan, Mentor Graphics

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004

ECE260B CSE241A Winter Tapeout. Website:

Enhanced Multi-Threshold (MTCMOS) Circuits Using Variable Well Bias

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 3 Technology

A Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache

PPP (PRICE, POWER AND PACKAGE) OPPORTUNITIES FOR INNOVATION IN MOBILE COMPUTING AND IOT

Adding SRAMs to Your Accelerator

New Challenges in Verification of Mixed-Signal IP and SoC Design

Verification of Power Management Protocols through Abstract Functional Modeling

PrimeTime: Introduction to Static Timing Analysis Workshop

Power Reduction Techniques in the Memory System. Typical Memory Hierarchy

Energy Efficiency and Resilience in Future ICs

Embedded Linux Conference San Diego 2016

Mixed Signal Verification Transistor to SoC

Ultra Low Power (ULP) Challenge in System Architecture Level

Evaluating and Exploiting Impacts of Dynamic Power Management Schemes on System Reliability

ASIC Design of Shared Vector Accelerators for Multicore Processors

Ultra-low power, Single-chip SRAM FPGA Targets Handheld Consumer Applications

Understanding the tradeoffs and Tuning the methodology

MediaTek CorePilot. Heterogeneous Multi-Processing Technology. Delivering extreme compute performance with maximum power efficiency

Crosstalk Noise Avoidance in Asynchronous Circuits

Extending Digital Verification Techniques for Mixed-Signal SoCs with VCS AMS September 2014

TIMA Lab. Research Reports

Cluster-based approach eases clock tree synthesis

Realtek Ameba-1 Power Modes

envm in Automotive Modules MINATEC Workshop Grenoble, June 21, 2010 May Marco 28, 2009 OLIVO, ST Automotive Group

Research Challenges for FPGAs

Asynchronous on-chip Communication: Explorations on the Intel PXA27x Peripheral Bus

Computing to the Energy and Performance Limits with Heterogeneous CPU-FPGA Devices. Dr Jose Luis Nunez-Yanez University of Bristol

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013

Helio X20: The First Tri-Gear Mobile SoC with CorePilot 3.0 Technology

CS 250 VLSI Design Lecture 11 Design Verification

Strato and Strato OS. Justin Zhang Senior Applications Engineering Manager. Your new weapon for verification challenge. Nov 2017

CHAPTER-IV IMPLEMENTATION AND ANALYSIS OF FPGA-BASED DESIGN OF 32-BIT FPAU

Design of Advanced Applications Processors in FD-SOI 高端应用处理器设计中的 FDSOI 使用

Transcription:

Leakage Mitigation Techniques in Smartphone SoCs 1 John Redmond 1 Broadcom International Symposium on Low Power Electronics and Design

Smartphone Use Cases Power Device Convergence Diverse Use Cases Camera Music player Navigation Gaming System TV Laptop Access Point Wallet 50mW 500uW Generalized Use Cases 5W Idle Low- Med MIPS High MIPS Time Idle: Paging for call, text, & email Low-Medium MIPS: MP3, voice call Video playback High MIPS: Gaming, concurrent use-cases, benchmarks 2

Smartphone SoC SoC Highlights 2G/3G/4G MODEM VIDEO ISP AP Application Processor Modem Graphics Image Signal Processor Video Audio LPDDRx GPU 3

Smartphone Use Cases 1. Idle 2. Low-Medium MIPS 3. High MIPS 4

Idle Mode Critical for standby time. Majority of the phone s life is in idle mode. Leakage dominates overall power. Two phases: 1. Deep-sleep leakage. 2. Wake up to check for call. Key Challenge: minimize leakage at all costs. Key Focus Area: architecture. 5

Idle Leakage from 40nm->28nm 40nm Gate off highperformance, high-leakage block. 28nm Gate off everything that does not need to be on. Note: FastLogic was power gated in 65nm->40nm transition 6

Power Island Terminology (1/2) VDD Power Switch VDD_Switched Combinatorial logic x Retention Flip Flop (RFF) SRAM ret 1 Bit cell array Switched Domain

Power Island Terminology (2/2) Power Switch AON buffer ISO x ISOHIGH 1 0/1 ISOLAT 0 Switched Domain ISOLOW Unswitched Domain OFF to ON domain crossing needs isolation for electrical and functional correctness 8

Power Intent Methodology Flow Power Management Cells Insertion Power Architecture Change Power/Ground Connections Power Verification Legacy Flow Technology dependent cells instantiated in RTL RTL Change Front-end specifies in spreadsheet or connected in RTL and communicated via e-mail. Final PG Netlist Power Intent Flow UPF file specifies Power Intent and cells are inferred in flow UPF Change UPF Starting at RTL through PG Netlist Benefit RTL is power architecture and technology UPF easier to change Lingua franca to ensure clean handoffs Early and often verification

Power Intent Verification OFF DOMAIN ISOHIGH ISOLOW resetb ON DOMAIN UPF RTL Static Checking Static LP Checker (VC-LP/CLP) ISO_DEVICE_MISSING : The crossover needs isolation, but has no isolation device present. Power aware libraries Dynamic Checking Verilog Simulator (NLP/Incisive-LP /Questa PA) Simulation Fail 10

Power Intent Implementation UPF RTL Power - aware libraries Synthesis Insert isolation cells and levelshifters. Insert RFF. Physical Design Insert power-switches. Route always-on signals. Implementation Verification Structural LP checks. Custom power switch and RFF rules PG gate-level simulations. 11

Example Power Island Cost/Benefit Block has partial state retention with 10% of flops being retained. Cost is an additional 3.2% area for power switches and retention flip flops. Retention state leakage went from 5.1 mw to 45 uw achieving >100x leakage reduction. Retention State Leakage Breakdown: 12

Other Considerations Decap leakage Decap in unswitched routing channels. Minimize SoC memory that needs to be retained. >50% of leakage budget. Ultra-low leakage considerations: NWELL leakage. RFF minimization retain minimum set of state. 13

Idle Mode Leakage Mitigation Aggressive power islands. UPF is required for implementation and verification of sophisticated power-island architecture. Reduce voltage for logic remaining on. This is limited by the retention voltage of the memory bit-cells. Leakage is the top priority in implementation. The trade-offs are in area, performance, and dynamic power for leakage. Always-on logic should be low frequency. 14

Smartphone Use Cases 1. Idle 2. Low-Medium MIPS 3. High MIPS 15

Low-Medium MIPS Leakage Mitigation Active power is a combination of dynamic and leakage power. Active power is optimized to extend usage time. User expectations: Task Voice Call MP3 Playback Video Playback Time >10 hours >50 hours >10 hours Key Challenge: minimizing active power Key Area: design and implementation 16

Multi-Vth and Gate Length Biasing Vth is a big knob trading off performance and leakage. 2.5 Gate length is a small knob trading performance and leakage. Graphs are at the standard cell level. 17

# of paths # of paths Leakage Optimized Multi-Vt Flow Initial design is 100% High Vt cells. Low Vt cells are swapped in on critical paths to meet timing. This may not be the best for area and dynamic power. Histograms of endpoint slack (red paths are failing) as shown below: 100% High Vt Cells Mixed High and Low Vt Cells Vth Swap Slack Slack 18

AVS and DVFS Adaptive Voltage Scaling (AVS) Adjusting voltage based the phyiscal environment. Process variation, temperature, silicon aging, and PMU inaccuracy. Dynamic Voltage and Frequency Scaling (DVFS) Adjusting voltage and frequency based on workload. 19

Active Power Across Process and Temp Dynamic power decreases with faster silicon due to AVS. Leakage power increases with faster silicon. Active Power at 85C Leakage power increases significantly with fast silicon and hot temperature. 20

Optimizing Active Power (1/3) 10T design: Larger, faster cells. Larger percentage of higher Vth cells. Which design is better? 8T design: Smaller, slower cells. Larger percentage of lower Vth to meet performance. 21

Optimizing Active Power (2/3) The bulk of the parts are typical silicon. Operating temperature is typically around 55 C for low-medium MIPS use-cases. Optimize for typical process and temperature or worst case process and temperature? Optimize for typical while ensuring that the worst case works. Design 10T 8T Typical/55C 630 600 Fast/85C 710 810 22

Optimizing Active Power (3/3) 10T design: Smaller leakage. Higher active power at typical conditions. Graphs at 55C 8T design: Larger leakage. Lower active power at typical conditions. Higher leakage can lead to overall lower active power. 23

Other Design Techniques Guardband reduction. Using low-power cells and macros in implementation. Retain until access memories: Put memories in low-leakage retention state when not in use. Adaptive read/write assist techniques: Boost voltage on memories when writing. Power down logic and memories when not in use. 24

Smartphone Use Cases 1. Idle 2. Low-Medium MIPS 3. High MIPS 25

High MIPS Leakage Minimization High-performance cores use Low Vth devices extensively and leakage is generally >40% of the power budget in high leakage corner. Due to high frequencies, dynamic power is also high. This can lead to leakage-induced thermal runaway. Higher leakage current causes a temperature increase that further increases leakage. Key Challenge: thermal considerations Key Area: runtime 26

RFTS vs. DVFS (1/3) Run Fast Then Stop (RFTS) is a technique where the processor runs at the highest frequency until the job is finished, then it stops. DVFS runs low and slow to reduce dynamic power by V^2. RFTS: 1. Clock Gating core continues to leak. 2. Power Gating core is powered off and doesn t leak. Clock Gate Active Power Gate 27

Power Power Power RFTS vs. DVFS (2/3) What is best? 1. DVFS Active Mode Overhead - PLL 2. RFTS: clock gate Active Mode Eleak Clock Gated Active Mode 3. RFTS: power gate Active Mode Save Esave Power Gated Restore Erestore Active Mode Time 28

Break-even Time (Normalized) Run Fast Then Stop vs. DVFS (3/3) Break-even time is defined as the time that the core needs to be powered off to compensate for save and restore energy. In high leakage situations, the power gating benefit is realized in a shorter time. 1.00 0.80 0.60 0.40 0.20 Si Process - Typical Si Process - Fast 0.00 30 40 50 60 70 Temperature (C) Lowest power technique is a run-time decision. 29

Other Considerations Co-adjacent heating: Leakage increases in victim blocks due to aggressor block dynamic power. Thermal/leakage aware floorplan. Core hopping to put loads on cool cores. Leakage-aware power and thermal management algorithms. 30

Conclusions Leakage mitigation in idle mode: Architect for leakage mitigation by aggressive power gating. Leakage is the top design priority. Use UPF to realize sophisticated power architectures. Leakage mitigation in low-mid MIPS modes: Design for active power minimization. Power down logic and memory when not in use. Leakage mitigation in high MIPS modes: Runtime power and thermal management decisions that consider leakage. 31

Thank You! Questions? 32