Attack Your SoC Power Challenges with Virtual Prototyping

Similar documents
Power Aware Architecture Design for Multicore SoCs

Optimizing Cache Coherent Subsystem Architecture for Heterogeneous Multicore SoCs

Enabling Arm DynamIQ support. Dan Handley (Arm) Ionela Voinescu (Arm) Vincent Guittot (Linaro)

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

Software Driven Verification at SoC Level. Perspec System Verifier Overview

Optimizing ARM SoC s with Carbon Performance Analysis Kits. ARM Technical Symposia, Fall 2014 Andy Ladd

Software Development Using Full System Simulation with Freescale QorIQ Communications Processors

Veloce2 the Enterprise Verification Platform. Simon Chen Emulation Business Development Director Mentor Graphics

Verification Futures Nick Heaton, Distinguished Engineer, Cadence Design Systems

Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces

On-chip Networks Enable the Dark Silicon Advantage. Drew Wingard CTO & Co-founder Sonics, Inc.

Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink. Robert Kaye

ARM big.little Technology Unleashed An Improved User Experience Delivered

PowerAware RTL Verification of USB 3.0 IPs by Gayathri SN and Badrinath Ramachandra, L&T Technology Services Limited

Validation Strategies with pre-silicon platforms

Power management for in-vehicle infotainment systems

Hardware Software Bring-Up Solutions for ARM v7/v8-based Designs. August 2015

Power Management as I knew it. Jim Kardach

Simplifying the Development and Debug of 8572-Based SMP Embedded Systems. Wind River Workbench Development Tools

Simulation Based Analysis and Debug of Heterogeneous Platforms

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses

POWER MANAGEMENT AND ENERGY EFFICIENCY

Benchmarking of Dynamic Power Management Solutions. Frank Dols CELF Embedded Linux Conference Santa Clara, California (USA) April 19, 2007

Embedded Software Dynamic Analysis. A new life for the Virtual Platform

Test and Verification Solutions. ARM Based SOC Design and Verification

Software Quality is Directly Proportional to Simulation Speed

ECE 571 Advanced Microprocessor-Based Design Lecture 24

Embedded Linux kernel and driver development training 5-day session

Last Time. Making correct concurrent programs. Maintaining invariants Avoiding deadlocks

A Study on C-group controlled big.little Architecture

Lesson 6 Intel Galileo and Edison Prototype Development Platforms. Chapter-8 L06: "Internet of Things ", Raj Kamal, Publs.: McGraw-Hill Education

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Extending Fixed Subsystems at the TLM Level: Experiences from the FPGA World

FPGA Adaptive Software Debug and Performance Analysis

Charles Lefurgy IBM Research, Austin

How to get realistic C-states latency and residency? Vincent Guittot

Power Management for Embedded Systems

Frequency and Voltage Scaling Design. Ruixing Yang

Age nda. Intel PXA27x Processor Family: An Applications Processor for Phone and PDA applications

NS115 System Emulation Based on Cadence Palladium XP

Does FPGA-based prototyping really have to be this difficult?

Product specification

Software Design Challenges for heterogenic SOC's

IMPROVES. Initial Investment is Low Compared to SoC Performance and Cost Benefits

Getting the Most out of Advanced ARM IP. ARM Technology Symposia November 2013

10 Steps to Virtualization

Software Verification for Low Power, Safety Critical Systems

A Next Generation Home Access Point and Router

Projects on the Intel Single-chip Cloud Computer (SCC)

Architectural Support for Operating Systems

How to Power Tune a Device Running on a Linux Kernel for Better Suspend Battery Life ELC2011

ARM64 Server RAS Solutions. Jonathan (Zhixiong) Zhang Cavium Inc.

Embedded Hardware and Software

Techniques for Optimizing Performance and Energy Consumption: Results of a Case Study on an ARM9 Platform

Placement de processus (MPI) sur architecture multi-cœur NUMA

Freescale i.mx6 Architecture

Using Virtual Platforms To Improve Software Verification and Validation Efficiency

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems

Hello, and welcome to this presentation of the STM32L4 power controller. The STM32L4 s power management functions and all power modes will also be

Embedded Linux Conference San Diego 2016

MYC-C437X CPU Module

Wind River. All Rights Reserved.

The mobile computing evolution. The Griffin architecture. Memory enhancements. Power management. Thermal management

Apache s Power Noise Simulation Technologies

The Challenges of System Design. Raising Performance and Reducing Power Consumption

Next Generation Verification Process for Automotive and Mobile Designs with MIPI CSI-2 SM Interface

System-on-Chip Architecture for Mobile Applications. Sabyasachi Dey

19: I/O Devices: Clocks, Power Management

Tim Kogel. June 13, 2010

Parallel Simulation Accelerates Embedded Software Development, Debug and Test

Copyright 2016 Xilinx

Developing deterministic networking technology for railway applications using TTEthernet software-based end systems

MYD-C437X-PRU Development Board

Tile Processor (TILEPro64)

QoS Handling with DVFS (CPUfreq & Devfreq)

EEM870 Embedded System and Experiment Lecture 4: SoC Design Flow and Tools

Pactron FPGA Accelerated Computing Solutions

Cortex-A15 MPCore Software Development

Evaluation of Real-time Performance in Embedded Linux. Hiraku Toyooka, Hitachi. LinuxCon Europe Hitachi, Ltd All rights reserved.

Exploring System Coherency and Maximizing Performance of Mobile Memory Systems

S2C K7 Prodigy Logic Module Series

Combining Arm & RISC-V in Heterogeneous Designs

Verification Futures The next three years. February 2015 Nick Heaton, Distinguished Engineer

Creating hybrid FPGA/virtual platform prototypes

Introduction to gem5. Nizamudheen Ahmed Texas Instruments

Will Everything Start To Look Like An SoC?

Square Pegs in Round holes. Paweł Moll

Designing with NXP i.mx8m SoC

Virtual Hardware ECU How to Significantly Increase Your Testing Throughput!

Runtime Power Management on SuperH Mobile

Embedded Systems Architecture

Porting Linux to a new SoC

Product Technical Brief S3C2416 May 2008

Intel Research mote. Ralph Kling Intel Corporation Research Santa Clara, CA

Interrupt response times on Arduino and Raspberry Pi. Tomaž Šolc

The World Leader in High Performance Signal Processing Solutions. DSP Processors

ODP Relationship to NFV. Bill Fischofer, LNG 31 October 2013

Maximizing heterogeneous system performance with ARM interconnect and CCIX

Modeling Software with SystemC 3.0

SoC FPGAs. Your User-Customizable System on Chip Altera Corporation Public

Transcription:

Attack Your SoC Power Challenges with Virtual Prototyping Stefan Thiel Gunnar Braun Accellera Systems Initiative 1

Agenda Part #1: Power-aware Architecture Definition Part #2: Power-aware Software Development Questions Accellera Systems Initiative 2

Power-aware Architecture Definition Top three goals for today Reduce risk of wrong design decision due to late power analysis Define the right HW/SW architecture to meet power and performance goals Define the right Power Management strategy Accellera Systems Initiative 3

Cache Coherent Interconnect How does architecture affect power? Power domain per core or entire CPU? Best cache size and organization? Use coherent interconnect? Separate power domain per cluster? HW/SW Partitioning? Hardware Architecture Questions Multicore CPU DMA SATA PCIe L2 Multicore CPU GMAC L2 Accelerators DDR Memory SRAM Memory SRAM Memory Run SW in parallel or power down idle cores? How to avoid page misses? Run fast and stop or DVFS? Turn off when idle? Timeout for moving to low power states? SW Architecture and Configuration Questions Accellera Systems Initiative 4

consume bus use Power and Performance Duality Workload Platform Processing Element Memory utilization power Power time Accellera Systems Initiative 5

measure consume bus use Power and Performance Duality Workload Platform Power Manager speed utilization Impact of power management on power and performance? Processing Element Memory Impact of architecture utilizationdecisions on power and performance? power Power f V dd f V dd f V dd time Accellera Systems Initiative 6

Power a n d b y f n n i n g s p e n d a n d b y f n n i n g s p e n d triggers a n d b y f n n i n g s p e n d a n d b y bus f n n i n g s p e n d Virtual Prototyping Approach Workload Virtual Prototype PMIC PLL Power Manager V dd f load CPU Power Domain CPU Core 0 Core 1 V dd f load Bus Power Domain V dd f load Memory Power Domain Memory O f O f O f O f S t S u S t S u S t S u S t S u records R u R u R u R u Power Overlay Model Energy/Power recording Accellera Systems Initiative 7

Fine-grain DVFS Coarse-grain DVFS Example: DVFS What-If Analysis Work load Freq. trace Power stats Work load Freq. trace Power stats Accellera Systems Initiative 8

What we just learned Reduce risk of late power analysis The Power and Performance duality Virtual Prototype for early power and performance analysis Create workload model to capture processing and communication requirements Create architecture model to represent processing and communication resources Map workload onto architecture Measure resulting system performance and power analysis map measure Accellera Systems Initiative 9

IEEE 1801 UPF System Level Power New IEEE 1801 Sub-committee on System Level Power Participation from EDA, IP providers and users Active participation from AGGIOS, ARM, Broadcom, Cadence, CSR, Intel, Mentor, Microsoft, ST, Synopsys Goal: Standardize format for system level power analysis Enable exchange of power models between groups, companies and EDA tools and across abstraction levels Visit http://standards.ieee.org/develop/wg/upf.html http://www.p1801.org/ http://semiengineering.com/system-level-power-modeling-activities-get-rolling/ Accellera Systems Initiative 10

Power-aware Architecture Definition Top three goals for today Reduce risk of wrong design decision due to late power analysis Define the right HW/SW architecture to meet power and performance goals Define the right Power Management strategy Accellera Systems Initiative 11 Case Study

Micro Server in Platform Architect MCO Workload model: ETH processing Ethernet Bandwidth: 512 MB/s, 1GB/s CPU-cycles per packet: 1k (avg), 2k (high), 4k (stress) Level 2 cache Cache size in KB: 32, 64, 128 Models SoC Platform Number of CPUs: 1, 2 Number of NPUs: 0, 1 Number of SRAMs: 0, 1 Design hierarchy Memory Map Accellera Systems Initiative 12

Micro Server in Platform Architect MCO Power Manager DVFS-levels: 1, 2, 3, 4, Thresholds: when to scale up and down Response delays: slow, med, fast Accellera Systems Initiative 13

Power Model Instrumentation set pm [SNPS_PAM create_monitor {Power Analysis} Cpu0Pam CPU0] $pm set_frequency_input signal CPU0/PLL/clk_in $pm set_voltage_input signal CPU0/PMIC/vdd_out # define states $pm add_state CORE0 SLEEP 5mW 1.1V 1.5GHz $pm add_state CORE0 IDLE 70mW 1.1V 1.5GHz $pm add_state CORE0 ACTIVE 120mW 1.1V 1.5GHz... # state transition triggers: signals $pm link_signal_to_state CORE0 CPU.CORE0.p_notify 0 IDLE $pm link_signal_to_state CORE0 CPU.CORE0.p_notify 1 ACTIVE... # state transition triggers: timeout $pm link_timeout_to_state 2ms CORE0 SLEEP IDLE... sleep [mw] Power Model notify sleep -> active notify idle -> active active [mw] timeout idle [mw] notify active -> idle Energy/Power recording Accellera Systems Initiative 14

Power Model Instrumentation - Details Power State Definition set pm [SNPS_PAM create_monitor {Power Analysis} modem_pam TOP.modem] $pm add_state modem_fsm Off 0W $pm add_state modem_fsm Standby 5mW $pm add_state modem_fsm Transmit 200mW $pm add_state modem_fsm Receive 150mW Off 0.000W Transmit 0.200W Standby 0.005W Receive 0.150W Accellera Systems Initiative 15

Power Model Instrumentation - Details Dynamic Voltage and Frequency Scaling For each power state we define: Reference dynamic power P dyn,ref for a reference frequency F dyn,ref and reference voltage V dyn,ref Reference leakage power P leak,ref for a reference voltage V leak,ref Trigger frequency and voltage from Virtual Prototype $pm set_frequency_input signal TOP/MODULE_3/clk_in $pm set_voltage_input signal TOP/PMIC/vdd_out $pm add_state modem_fsm Off 0W 0W 100MHz 1.1V 1.1V $pm add_state modem_fsm Standby 5mW 5mW 100MHz 1.1V 1.1V $pm add_state modem_fsm Transmit 200mW 5mW 100MHz 1.1V 1.1V $pm add_state modem_fsm Receive 150mW 5mW 100MHz 1.1V 1.1V Accellera Systems Initiative 16 Fdyn,ref Vdyn,ref Vleak,ref

Power Model Instrumentation - Details Driving Power States from a (HW) signal $pm link_signal_to_state TOP/MODULE_1/pState 0 0xf modem_fsm Off $pm link_signal_to_state TOP/MODULE_1/pState 1 0xf modem_fsm Standby $pm link_signal_to_state TOP/MODULE_1/pState 2 0xf modem_fsm Transmit $pm link_signal_to_state TOP/MODULE_1/pState 3 0xf modem_fsm Receive pstate==0 Off pstate==1 Standby pstate==2 Transmit pstate==3 Receive Accellera Systems Initiative 17

Power Model Instrumentation - Details Driving Power States from Software $pm link_instruction_address_to_state modem_fsm driver_power_off Off $pm link_instruction_address_to_state modem_fsm driver_power_on Standby $pm link_instruction_address_to_state modem_fsm send Transmit $pm link_instruction_address_to_state modem_fsm receive Receive SW function (e.g. Linux device driver) driver_power_off(); driver_power_on(); driver_suspend(); receive(int &bytes); send(int bytes, ) pstate==0 pstate==1 pstate==2 Off Standby Transmit pstate==3 Receive Accellera Systems Initiative 18

Video: Analyze Power Management Accellera Systems Initiative 19

Parameters to Explore Architecture and workload parameters Which combination will work best? Workload Ethernet Bandwidth: CPU-cycles per packet: Platform Number of CPUs: 1, 2 Number of NPUs: 0, 1 Number of SRAMs: 0, 1 L2 cache Size in KB: 32, 64, 128 512 MB/s, 1GB/s 1k (avg), 2k (high), 4k (stress) DVFS power management related parameters 6 parameters 144 combinations DVFS-levels: 1, 2, 3, 4, 5 Thresholds (up/down): 2/1, 4/1, 8/1, 4/2, 8/2, 8/4 Response delays in us:1, 2, 3, 4, 5 Total of 21600 combinations??? Sensitivity Analysis + Divide and Conquer! 3 parameters 150 combinations Accellera Systems Initiative 20

scenarios... results Sensitivity Analysis 1. Design Configuration 2. Simulation sweep 3. Metric Extraction parameters parameters metrics Spreadsheet In 4. Sensitivity Analysis Spreadsheet Out Accellera Systems Initiative 21

Video: Explore Architecture Accellera Systems Initiative 22

Parameters to Explore Best architecture configuration: Workload Ethernet Bandwidth: CPU-cycles per packet: Platform Number of CPUs: 2 Number of NPUs: 1 Number of SRAMs: 1 L2 cache Size in KB: 64 512 MB/s, 1GB/s 1k (avg), 2k (high), 4k (stress) DVFS power management related parameters Results from first sweep DVFS-levels: 1, 2, 3, 4, 5 Thresholds (up/down): 2/1, 4/1, 8/1, 4/2, 8/2, 8/4 Response delays in us:1, 2, 3, 4, 5 3 parameters 150 combinations Accellera Systems Initiative 23

Summary Power aware Architecture Definition Reduce risk of wrong design decision due to late power analysis Power and Performance duality: Performance impacts power, power (management) impacts performance Early power analysis important for taking the right design decisions Define the right HW/SW architecture to meet power and performance goals Power aware HW/SW partitioning, cache and cache coherency analysis, interconnect/memory optimization Define the right power management strategy Grouping of components into power domains Run-fast-then-stop vs. DVFS Power management thresholds, time-outs and delays Accellera Systems Initiative 24

Agenda Part #1: Power-aware Architecture Definition Part #2: Power-aware Software Development Questions Accellera Systems Initiative 25

Power-aware Software Development Top three goals for today Reduce risk of late power management software development Improve robustness of power management software Reveal software bugs that can drain the battery Accellera Systems Initiative 26

How can software affect power? Software controls the activity of the power consumers Power management application & use-case level Selection of best effort service Example: Turn off WiFi and use 3G when user idle Based on user s performance/power needs Power management operating system level (OSPM) Runtime control of (sub-) system power modes Example: Drive WiFi subsystem into standby Steered by application level performance/power needs Power control firmware level Control clocks and voltages Example: set voltage regulator to 1.1V, set clock to 1GHz Initiated by operating system power management Accellera Systems Initiative 27

Power Management complexity Example: Mobile Application Processor 200 pages of clock programming Specification for each register: 410 static struct clksrc_clk exynos5_clk_dout_armclk = { 411.clk = { 412.name = "dout_armclk", 413.parent = &exynos5_clk_mout_cpu.clk, 414 }, 415.reg_div = {.reg = EXYNOS5_CLKDIV_CPU0,.shift = 0,.size = 3 }, 416 }; Clock definition in Linux Source: http://www.samsung.com/global/business/semiconductor/file/product/exynos_5_dual_user_manaul_public_rev100-0.pdf Accellera Systems Initiative 28

Power Management complexity Example: Mobile Application Processor Programmer s Manual The devil is in the detail: Typical software problems that can cause days, or often weeks of debugging! Source: http://www.samsung.com/global/business/semiconductor/file/product/exynos_5_dual_user_manaul_public_rev100-0.pdf Accellera Systems Initiative 29

Fighting bugs with power impact Typical scenario Here Linux Kernel Mailing List Source: Feb. 2013, https://lkml.org/lkml/2013/2/26/608 Accellera Systems Initiative 30

Fighting bugs with power impact Typical scenario Here Linux Kernel Mailing List Oops! Works fine, but Source: Feb. 2013, https://lkml.org/lkml/2013/2/26/608 Accellera Systems Initiative 31

Shift-Left with Virtual Prototyping Start earlier Late Power Management software can have catastrophic consequences on project schedules Accellera Systems Initiative 32

Virtual Prototyping approach Goals Enable most critical software development tasks As early as possible Aligned with software project schedule Needs Develop and deploy virtual prototypes (VP) incrementally Enable different software teams to develop in parallel Multiple VPs are created with different focus How Model subsystems to support most critical software tasks Leverage existing or generic models for simulation of subsystems outside the software development focus Its here not a goal of the virtual prototype to represent all the hardware to develop all the software (availability would be too late to make any impact) Accellera Systems Initiative 33

Case Study: SoC Power Management USB subsystem with our specific PMIC and Clock Controller I2C Bus Soc Bus Clock Controller Voltage Controller (PMIC) Clk VDD Soc Bus Interrupt USB CORE USB PHY Accellera Systems Initiative 34

Case Study: SoC Power Management Combine with available VDK for ARM Versatile Express and software CPU Motherboard Focus: USB subsystem ARM CPUs CCI UARTs I2C I2C Bus Soc Bus Clock Controller Voltage Controller (PMIC) Timers CLCD Clk VDD GIC RAMs KMIs KMIs Soc Bus Interrupt USB CORE USB PHY Readily available pre-assembled VP building blocks. Part of VDK for the ARM Versatile Express prototyping system Allows running stock Linaro Linux kernel and filesystem images Accellera Systems Initiative 35

Case Study: SoC Power Management Add power domain information for the USB subsystem IN No Clk Vdd & Clk valid? VDD Yes OUT Power domains configuration file I2C Bus MHz Volt Soc Bus Clock 125 Controller 1.1 250 1.4 500 1.8 Voltage Controller (PMIC) Output= TLM_ERROR Output= Input Clk VDD Raise assertion A d a p t e r Bus USB CORE IRQ Power Domain: Core A d a p t e r USB PHY Power Domain: PHY Accellera Systems Initiative 36

What we just learned The scope of a Virtual Prototype need to match the requirements of the software team What is needed to enable the key critical software tasks? A mix of generic and specific HW components Assembly & modeling tools assist the specific HW model creation Enables earliest availability A VP can accurately model power management hardware Clock & voltage trees Power Managament IC (PMIC) Clock controller Power domain isolation Accellera Systems Initiative 37

Power-aware Software Development Top three goals for today Reduce risk of late power management software development Improve robustness of power management software Reveal software bugs that can drain the battery Accellera Systems Initiative 38

Case Study: Normal OS & driver operation Booting and using USB for a file storage gadget Accellera Systems Initiative 39

Case Study: Driver faults from power bug Booting and using USB for a file storage gadget Unpowered core Unpowered PHY Abort exception! No response! Soc Bus Interrupt USB CORE USB PHY Power Domain: Core Power Domain: PHY Accellera Systems Initiative 40

Case Study: Root cause analysis Using a VDK, there is more to see! Standard Linux Kernel Messaging System Debug message: Exception fault! Accellera Systems Initiative 41

Case Study: Root cause analysis USB PHY is powered on Ok PMIC drives voltages: V4 and V6 USB CORE in Power Down State! Why? At that time: Access to USB CORE PMIC controlled by SW: OK Debug message: Exception fault! VDD connectivity: USB CORE connected to V3 and not V4! Accellera Systems Initiative 42 Need to correct USB driver to driver V3 regulator!

What we just learned Virtual prototypes do accurately simulate power management fault scenarios Software developer can reproduce and observe same defects like on hardware Deterministic repetition for debug and testing Virtual prototypes help accelerating root cause analysis Visibility and traceability of any HW or SW property Cross correlation of HW and SW power management Accellera Systems Initiative 43

Power-aware Software Development Top three goals for today Reduce risk of late power management software development Improve robustness of power management software Reveal software bugs that can drain the battery Accellera Systems Initiative 44

Software bugs impacting power consumption the reality today Small software bugs can have big impact on power Invisible to the developer Only revealed in long term scenario measurements Impact is usecase dependent Talk time might not be impacted at all Standby time might be reduced by multiple hours Software developers are often unaware No means to test and use-case analyze during development Power bugs continuously slip into production firmware Accellera Systems Initiative 45

How to analyze power bugs? Soldering skills required The hardware way The Virtualizer way Dynamic power analysis Instrumentation overlay Source: How to measure SoC power, Andy Green, TI Landing Team lead, Linaro Accellera Systems Initiative 46

Dynamic power analysis Using instrumentation scripts in Virtualizer VDK Accellera Systems Initiative 47

USB Power Model Abstract power model Approximate power per component at a reference frequency & voltage Good enough to find bug mentioned in introduction! Detailed power model Approximate power for each power mode in a component Needed for run-time power management SW USB PHY Power USB Core Power Other From K Park: > It would be best to keep these regulators always on. No, it's just workaround patch. It should be handled at USB drivers. We usually used this scheme enable USB power always. but it consumes lots of power. There's no need to enable usb power when there's no usb connection. U0 Active U3 Suspend Normal Operation Accellera Systems Initiative 48

Dynamic power analysis big.little processing Task migration with DVFS A15 1.1V/1.359Ghz A15 0.8V/755MHz A7 0.8V/140Mhz Frequencies Voltages Leakage Energy Dynamic Energy Power State State Utilization A15 active A15 idle A7 off A7 idle A15 off A7 active Accellera Systems Initiative 49

What we just learned Power awareness has higher priority than power accuracy for software developers Even a simple on/off power model can reveal severe defects Power models can be realized at multiple levels of abstraction Power analysis can be added as an overly to a virtual prototype Less intrusive than cutting rails and soldering shunts Accuracy in the same ballpark as HW based measurement Accellera Systems Initiative 50

Power-aware Software Development Top three goals for today Reduce risk of later power management software Complete software development earlier with Virtual prototypes Simulate power domain control (PMIC and clock controller) Improve robustness of power management software Simulate fault scenarios such as unpowered hardware Efficient root cause analysis from HW though SW stack Reveal software bugs that can drain the battery Simulate approximate power consumption Expose power consumption defects to the SW developer Accellera Systems Initiative 51

Questions? Accellera Systems Initiative 52

Thank You Accellera Systems Initiative 53