Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Similar documents
Power Solutions for Leading-Edge FPGAs. Vaughn Betz & Paul Ekas

Reduce FPGA Power With Automatic Optimization & Power-Efficient Design. Vaughn Betz & Sanjay Rajput

Power Optimization in FPGA Designs

FPGA Power Management and Modeling Techniques

PowerPlay Early Power Estimator User Guide for Cyclone III FPGAs

Research Challenges for FPGAs

Low Power Design Techniques

PowerPlay Early Power Estimator User Guide

Field Programmable Gate Array (FPGA) Devices

Power Consumption in 65 nm FPGAs

8. Migrating Stratix II Device Resources to HardCopy II Devices

Implementing Bus LVDS Interface in Cyclone III, Stratix III, and Stratix IV Devices

Early Power Estimator for Intel Stratix 10 FPGAs User Guide

8. Selectable I/O Standards in Arria GX Devices

Power Considerations in High Performance FPGAs. Abu Eghan, Principal Engineer Xilinx Inc.

Stratix II vs. Virtex-4 Power Comparison & Estimation Accuracy

Signal Integrity Comparisons Between Stratix II and Virtex-4 FPGAs

Stratix II vs. Virtex-4 Performance Comparison

An FPGA Architecture Supporting Dynamically-Controlled Power Gating

Interfacing RLDRAM II with Stratix II, Stratix,& Stratix GX Devices

Unleashing the Power of Embedded DRAM

Cyclone III LS FPGAs Altera Corporation Public

The Next Generation 65-nm FPGA. Steve Douglass, Kees Vissers, Peter Alfke Xilinx August 21, 2006

An Introduction to Programmable Logic

4. Selectable I/O Standards in Stratix II and Stratix II GX Devices

Automatic Peak Detection System Power Analysis Using System on a Programmable Chip (SoPC) Methodology

CS310 Embedded Computer Systems. Maeng

FPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.

Intel Quartus Prime Pro Edition User Guide

13. Power Optimization

13. Power Management in Stratix IV Devices

Stratix vs. Virtex-II Pro FPGA Performance Analysis

Understanding and Meeting FPGA Power Requirements

DINI Group. FPGA-based Cluster computing with Spartan-6. Mike Dini Sept 2010

Implementing OCT Calibration in Stratix III Devices

ALTERA FPGAs Architecture & Design

Implementing LVDS in Cyclone Devices

Quartus II Software Device Support Release Notes

Interfacing FPGAs with High Speed Memory Devices

Speedster22i FPGA Family Power Estimator Tool User Guide. UG054- July 28, 2015

White Paper Performing Equivalent Timing Analysis Between Altera Classic Timing Analyzer and Xilinx Trace

Stacked Silicon Interconnect Technology (SSIT)

Stratix II FPGA Family

XPERTS CORNER. Changing Utilization Rates in Real Time to Investigate FPGA Power Behavior

A Dual-Core Multi-Threaded Xeon Processor with 16MB L3 Cache

Multi-Core Microprocessor Chips: Motivation & Challenges

6. I/O Features in Arria II Devices

Power vs. Performance: The 90 nm Inflection Point

Physical Implementation

8. Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Lecture 20: Package, Power, and I/O

BCT Channel LED Driver (Common-Anode & Common-Cathode) Low Dropout Current Source & Current Sink. General Description. Features.

Achieving Lowest System Cost with Midrange 28-nm FPGAs

Early Models in Silicon with SystemC synthesis

6. I/O Features in Stratix IV Devices

Intel Cyclone 10 LP Device Design Guidelines

Qsys and IP Core Integration

SUBMITTED FOR PUBLICATION TO: IEEE TRANSACTIONS ON VLSI, DECEMBER 5, A Low-Power Field-Programmable Gate Array Routing Fabric.

Cyclone III low-cost FPGAs

Summary. Introduction. Application Note: Virtex, Virtex-E, Spartan-IIE, Spartan-3, Virtex-II, Virtex-II Pro. XAPP152 (v2.1) September 17, 2003

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004

A Low-Power Wave Union TDC Implemented in FPGA

High Capacity and High Performance 20nm FPGAs. Steve Young, Dinesh Gaitonde August Copyright 2014 Xilinx

2. Recommended Design Flow

Intel Agilex Power Management User Guide

11. Analyzing Timing of Memory IP

Intel Stratix 10 General Purpose I/O User Guide

System Power Savings Using Dynamic Voltage Scaling. Scot Lester Texas Instruments

Leakage Mitigation Techniques in Smartphone SoCs

Cutting Power Consumption in HDD Electronics. Duncan Furness Senior Product Manager

Power Estimation and Management for LatticeECP/EC and LatticeXP Devices

Jae Wook Lee. SIC R&D Lab. LG Electronics

Quad-Serial Configuration (EPCQ) Devices Datasheet

ELE 455/555 Computer System Engineering. Section 1 Review and Foundations Class 3 Technology

White Paper Compromises of Using a 10-Gbps Transceiver at Other Data Rates

Stratix FPGA Family. Table 1 shows these issues and which Stratix devices each issue affects. Table 1. Stratix Family Issues (Part 1 of 2)

Power Management Using FPGA Architectural Features. Abu Eghan, Principal Engineer Xilinx Inc.

Power and Thermal Estimation and Management for MachXO3 Devices

Intel Stratix 10 Power Management User Guide

Power Estimation and Management for LatticeXP2 Devices

Chapter 5: ASICs Vs. PLDs

Adaptive Voltage Scaling (AVS) Alex Vainberg October 13, 2010

EE219A Spring 2008 Special Topics in Circuits and Signal Processing. Lecture 9. FPGA Architecture. Ranier Yap, Mohamed Ali.

Quad-Serial Configuration (EPCQ) Devices Datasheet

2 AA Cell to 3.3V USB On-The-Go Devices White LED Drivers Handheld Devices. The HM3200B is available in the 6-pin SOT23-6.

Intel Stratix 10 Power Management User Guide

Comparative study of power reduction of various solutions of an image processing IP on modern FPGAs

AN 447: Interfacing Intel FPGA Devices with 3.3/3.0/2.5 V LVTTL/ LVCMOS I/O Systems

Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Designing for Low Power with Programmable System Solutions Dr. Yankin Tanurhan, Vice President, System Solutions and Advanced Applications

Digital Integrated Circuits

Low-Power Technology for Image-Processing LSIs

Qualification Strategies of Field Programmable Gate Arrays (FPGAs) for Space Application October 26, 2005

High-Performance Memory Interfaces Made Easy

ECE 486/586. Computer Architecture. Lecture # 2

Power. Management. Solution Guide. Solving Today's Hot Challenge INSIDE

Intel Arria 10 FPGA Performance Benchmarking Methodology and Results

QDR II SRAM Board Design Guidelines

4. Hot Socketing and Power-On Reset in MAX V Devices

LOW POWER DESIGN IMPLEMENTATION OF A SIGNAL ACQUISITION MODULE RAVI BHUSHAN THAKUR. B.Tech, Jawaharlal Nehru Technological University, INDIA 2007

Transcription:

Reduce Your System Power Consumption with Altera FPGAs

Agenda Benefits of lower power in systems Stratix III power technology Cyclone III power Quartus II power optimization and estimation tools Summary 2

Benefits of Lower Power Stay within a fixed power budget Chassis limits (heat, space, current) Battery life considerations Outside power budget not an option Reduce system cost Fewer/smaller heat sinks and fans Smaller power supplies Increase reliability No fans no moving parts Lower system temperature Reduce design time and effort to meet power and thermal constraints Reduce ongoing operating costs 3

Altera Power Strategy at 65 nm and Beyond Innovative architecture and advanced process technologies Best-in-class FPGA power modeling Lowest power solution in the industry 4

Power Components and Terminology Core static power Power consumed even when there are no clocks active; power due to leakage currents Static power is function of device selected and junction temperature Dynamic power I/O power Charging and discharging external load capacitance connected to device pins, I/O drivers, and external termination network (if present) Block power Power for individual device resources such as the logic elements (LEs), phase-locked loops (PLLs), or memory blocks Total power = core static power + dynamic power (I/O + block) 5

Stratix III Power Technology

Meeting the Power Challenge Power Increased Performance 65 nm (Increased Leakage) Increased Density Power Budget Stratix III FPGA Stratix II FPGA 1.5M Gates 3.0M Gates Density Stratix III FPGAs Cut Power by 50% vs. 90 nm 7

Lowest Power FPGA in the Industry Stratix III FPGA Power Reduction Technique Lower Static Power Lower Dynamic Power Silicon Process Optimizations Selectable Core Voltage (0.9 V or 1.1 V) Programmable Power Technology Power Optimized DDR Memory Interface Quartus II Software PowerPlay Power Analysis and Optimization 8

Leading Edge Process Technology Increased Performance, Reduced Power Advanced 65-nm process 15% capacitance reduction reduces dynamic power 15% Lower voltage reduces dynamic power another 16% Multiple-gate oxide thicknesses (triple oxide) Trade-off static power vs. speed per transistor Multiple-threshold voltages Trade-off static power vs. speed per transistor Low-k inter-metal dielectric Reduces dynamic power, increases performance Strained silicon Increased performance Copper interconnect Increased performance, reduced IR drop 9

Selectable Core Voltage

Selectable Core Voltage Customer selects the FPGA core voltage 1.1 V for maximum performance 0.9 V for minimum power I/O and PLL voltages unaffected Still get maximum I/O interface speed Crucial, since I/O bandwidth limits many systems Nominal Voltage Min. Regulator v OUT Max. Regulator v OUT Slow Timing Model Fast Timing Model Power Model 1.1 V 1.05 V 1.15 V 0.9 V 0.86 V 0.94 V 11

Power and Timing Impact Core Voltage Dynamic Power Reduction From Stratix II FPGAs Static Power Reduction From Stratix II FPGAs Performance Gain Over Stratix II FPGAs 1.1 V (1) 33% 52% 35% 0.9 V (2) 55% 64% 8% Notes: (1) Fastest vs. fastest speed grade comparison (2) Slowest vs. slowest speed grade comparison More Choice to Meet Your Power and Performance Budgets 12

60% Lower Static Power (85 C) 3.5 Stratix II FPGA Stratix III FPGA Core Static Power (W) 3.0 2.5 2.0 1.5 1.0 Typical high-performance design (1.1V) All low-power tiles (1.1 V) Typical design (0.9 V) 0.5 0.0 0 50K 100K 150K 200K 250K 300K Number of LEs 350K 13

Programmable Power Technology

Design-Specific Power Optimization Only a small fraction of logic is performance critical Slack Histogram 8,000 7,000 Not Performance Critical Performance Critical Number of Connections 6,000 5,000 4,000 3,000 2,000 1,000 0 90-100% 80-90% 70-80% 60-70% 50-60% 40-50% 30-40% 20-30% 10-20% 0-10% Slack % 15

Programmable Power Technology Logic Array Timing Critical Path High-Speed Logic 16

Programmable Power Technology Logic Array Timing Critical Path High-Speed Logic Low-Power Logic 17

Programmable Power Technology Logic Array Timing Critical Path * Power mapping fully automated by Quartus II software based on timing constraints High-Speed Logic Low-Power Logic Unused Low-Power Logic High Performance Where You Need It, Lowest Power Everywhere Else 18

High-Resolution Power Control Stratix III FPGA (EP3SL340) has 8,050 tiles for very highresolution power/performance optimization Only a small percentage of highspeed tiles required to maintain design performance Speed of the Fastest LABs, Power of the Slowest 19

Stratix III: Highest Performance f MAX Ratio Stratix III / Stratix II Stratix III is on average 35% Faster than Stratix II Stratix III Advantage Stratix II Advantage Real customer designs Note: Benchmarking data is based on comparing an Altera device to an equivalent Xilinx device using Quartus II V7.2 and ISE 9.2iSP1 software 20 Quartus 7.2 will be available in October of 2007

Using Programmable Power Circuit board requirements: none! Stratix III FPGAs create low-power tiles using on-chip circuitry No extra power supplies, no extra board components Design changes: none! Quartus II software automatically uses high-speed tiles where needed for timing All unused tiles set to low power All tiles with timing margin set to low power Failed timing constraints: all tiles not on critical paths set to low power 21

Power-Optimized DDR Memory Interface

Stratix III On-Chip Termination (OCT) Both parallel (Rt=50Ω) and series (Rs=50Ω) termination VTT VTT 50 Ω Zo=50 Ω 50 Ω 50 Ω OE 25 Ω OE Stratix III FPGA Memory Chip* (*) DDR 1/2/3, RLDRAMII, QDR II/II+ support 23

Stratix III FPGA Dynamic OCT Write: Rs on, Rt off Matching line impedance Read: Rs off, Rt on Terminating far end Write Read VTT VTT VTT VTT Zo=50 Ω 50 Ω 50 Ω Zo=50 Ω 50 Ω 50 Ω OE OE OE OE Stratix III FPGA (TX) Memory Chip Stratix III FPGA (RX) Memory Chip Saves 1.6 W of DC Power on 72-Bit DDR2 Bus 24

Benefits of Dynamic OCT 1. Power significantly reduced vs. traditional parallel OCT Saves 1.6 W of DC power on 72-bit DDR2 bus 2. Proper line termination and impedance matching on bidirectional busses Enhanced signal integrity 3. No need for on-board termination resistors 25

Stratix III FPGAs Support DDR3 Stratix III FPGA: the only FPGA that supports DDR3 DDR3 consumes 30% lower power than DDR2 DDR2: 1.8 V DDR3: 1.5 V Example system: 72-pin, 200-MHz memory interface, with on-chip termination Conventional FPGA DDR2 power: 3.9 W Stratix III (dynamic OCT) DDR2 power: 2.3W Stratix III (dynamic OCT) DDR3 power: 1.6W Total savings of 2.3 W 26

Cyclone III Power Advantages

Cyclone III Silicon Design Strategy Maintain performance while reducing power Utilize latest process technology and features available for low-power design Process or Technology Benefit Multi-threshold transistors Balance need for performance and low power consumption Variable gate-length transistors Balance need for performance and low power consumption TSMC 65-nm low-power (LP) process Reduced FPGA power consumption using the same low-power process technology employed for mobile phone chip sets 29

Cyclone III Power Consumption Advantage Power Component Typical Static Power Core Dynamic Power I/O Power Cyclone III Devices Power Reduction Over Cyclone II Devices (1) Up to 54% Up to 24% Comparable Notes: (1) Based on Cyclone III EPE v7.0 beta and Cyclone II EPE v.6.1 30

Core Static Power Comparison (85C Tj) Cyclone III FPGA benefits grow as Tj and density increase 0.3 Static Power (W) 0.25 0.2 0.15 0.1 0.05 0 ~54% 0 20,000 40,000 60,000 80,000 100,000 120,000 140,000 Logic Elements (LEs) Tj = 85C Cyclone II FPGA Cyclone III FPGA 31

Core Dynamic Power Does not vary with temperature Function of frequency, number of circuits switching, and capacitance All of the Cyclone III device blocks consume less dynamic power than Cyclone II device blocks except for the PLL blocks 1.200 Core Dynamic Power (W) 1.000 0.800 0.600 0.400 0.200 0.000 ~24% 0 10,000 20,000 30,000 40,000 50,000 60,000 Cyclone II FPGA Cyclone III FPGA Logic Elements (LEs) 32

Quartus II Power Optimization and Estimation Tools

Automatic Programmable Power Synthesis Placement and Routing Unused Tiles Low Power All High-Speed Tiles Timing Analysis Tiles with Timing Slack Low Power No Done? Yes Mostly Low-Power Tiles 34

Example: Power-Optimized RAM Mapping 2K X 32 RAM Default Option Power-Efficient Option 32 32 2:4 Decoder Four 2Kx8 M9K RAMs Four 512x32 M9K RAMs 35

Logic and Clock Power Optimization Clock power reduction Stratix III hardware can shut down clock at 3 levels of tree Automatic placement to reduce clock power Power-driven placement and routing Minimize capacitance of high-toggling signals Without violating timing constraints Clocking Legal, Timing Optimized Power Optimize 20 Million Toggle/s 100 Million Toggle/s Power Optimize Group Clocks for Maximum Shutdown 36

Dynamic Power Optimization Dynamic Power Reduction vs. Minimum Effort 40% 35% 30% 25% 20% 15% 10% 5% 0% Up to 35% lower dynamic power Extra Effort Normal Effort 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61 65 69 73 77 Design 37

Design Data Accuracy vs. Estimation Accuracy Higher Early Power Estimator Spreadsheets Quartus II PowerPlay Power Analyzer Estimation Accuracy Manual entry Quartus II design profile Placement and routing results Simulation results + vectorless estimation Lower Design Data Accuracy Higher 38

Early Power Estimator Spreadsheets Enter in design data manually if design files are not available Quartus II software automatically generates design data inputs Supports quick what-if analysis Download from Altera website Input Number of Registers Transition Rates Clocks and Frequency I/O and Memory DSP and PLLs Temp, Air Flow, Heat Sinking Early Power Estimator Reports By Resource By Supply Thermal Analysis Estimate Power Before Design Files Available 39

Quartus II Power Analyzer Provides highest estimation accuracy by taking advantage of Quartus II software design data and simulation stimuli Vectorless estimation technology improves accuracy by plugging in holes not covered by test bench stimuli Input Number of Registers Transition Rates Clocks and Frequency I/O and Memory DSP and PLLs Temp, Airflow, Heat Sinking Simulation Results Place-and-Route Results Node Activity Power Analyzer Reports By Resource By Supply Thermal Analysis Design Hierarchy Power Confidence Metrics Signal Activity Data Available in All Versions of Quartus II Software 40

How Good Are Estimation Results? Altera s rigorous modeling methodology Use detailed circuit HSPICE models Logic element, routing, I/Os, clocks Around 9000 simulation elements All models tested and calibrated against FPGA Customer designs tested as final check of quality Quality of user inputs and assumptions affects estimation accuracy Are the toggle rate assumptions accurate? Are the simulations reflective of real-world stimuli? How far off is LE count from the final design? 41

Model Correlation Example: RAM 10 Dynamic Power (mw / MHz) 9 8 7 6 5 4 3 2 1 Stratix II EP2S60 Measurement Quartus II Estimate 0 M4K RAM M512 RAM MRAM RAM Configuration 8700+ Designs Covering All FPGA Elements 42

Altera 65-nm Power Advantage Innovative architecture and advanced process technologies Programmable Power Technology and selectable core voltage Stratix III FPGA consumes less than half the total power of Stratix II FPGA Cyclone III FPGA consumes less than half the static power of Cyclone II FPGA Best-in-class FPGA power modeling Allows power optimization Accurate power estimator Lowest power solution in the industry Stratix III FPGA consumes 45% lower power than Virtex-5 Cyclone III FPGA consumes up to 80% lower power than Spartan-3 43

Thank You!