The Challenges of System Design. Raising Performance and Reducing Power Consumption
- Crystal Hill
- 6 years ago
Transcription
1 The Challenges of System Design: Raising Performance and Reducing Power Consumption
2 Agenda
- The key challenges
- Visibility for software optimisation
- Efficiency for improved PPA (power, performance, area)
3 Product Challenge - Software
- For software engineers
  - Good visibility
  - A standard easy-to-program h/w platform
  - ARM Profiler to optimise performance
[Block diagram: Power Management; Osprey application processors (CPU 1 to CPU 4, CPU L2 cache; coherency, virtualisation); media processors (graphics processor, video engine); coherent interconnect and AXI interconnects; CoreSight on-chip debug & trace; DMA controller; HD LCD controller; DDR3/LPDDR2 memory controller; static memory controller; PCIe; APB peripherals; SRAM]
4 Design Challenge - PPA
[Block diagram: the same SoC as on slide 3, with "Performance" callouts on the application processors, CPU L2 cache and media processors, and "Power" callouts on the interconnects, memory controllers and peripherals]
5 VISIBILITY FOR OPTIMISATION
How to optimise your software and understand what your design
6 On Chip Visibility: a key requirement
[Block diagram: the same SoC as on slide 3, highlighting the CoreSight debug & trace block]
7 Typical CoreSight System
- Cross triggering between cores
- Single debug access port
- Cost effective debug
- Trace collection strategies
[Diagram: example ARM SoC with Cortex-A9 (PTM), Cortex-R4 (ETM-R4) and DSP (DSP ETM), each with a CoreSight interface and cross trigger; cross trigger matrix; AMBA AXI bus trace and system trace; DAP with SWD; APB bridge; debug bus (APB); trace bus (ATB) with funnel; Trace Port Interface Unit; Embedded Trace Buffer; RealView ICE; RealView Trace]
8 Software profiling using CPU Trace
- Top-down insight into the analyzed software
- Starting with overview screen, containing top 5 functions by Self Time, Delay and Memory access
- Detailed information on the source code and its derived assembly code, annotated with performance information
  - Code coverage
  - Source associated instructions
  - Cycles per instruction
  - Interlock information
9 System Trace Macrocell - STM
- System level visibility required by application development up to final product
  - Debug and tuning of s/w applications running on OS
  - Tracing of system events and system performance
- System Trace Macrocell enables
  - High level application software view
  - Tuning of system performance
  - Tracing of SoC internal signals
- Benefits
  - Flexible and affordable hardware based debug for applications and system level developers
  - Complements CPU trace, MIPI STPv2 compliant
[Diagram labels: PMU Counts, OS Trace]
10 System Level Code Instrumentation
- System level debug information can be sent through trace to debug your system

    static inline void stm_emit(unsigned int port, unsigned int value)
    {
        stm_addr[port] = value;
    }

    static inline void stm_emit_blocking(unsigned int port, unsigned int value)
    {
        // Reading from an stm port returns 1 if the FIFO can
        // accept data, 0 if it is full.
        while (!stm_addr[port]);
        stm_addr[port] = value;
    }

- Export debug information
- Visualise system level data
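As an illustration of how the non-blocking emit call from this slide might be used, the sketch below wraps a unit of work in enter and exit markers. It is a host-side mock under stated assumptions: `stm_mock_ports`, the port numbers `PORT_TASK_ENTER` and `PORT_TASK_EXIT`, and `traced_sum` are hypothetical names introduced here, and `stm_addr` is pointed at an ordinary array rather than at real STM stimulus ports.

```c
/* Assumption: on real hardware stm_addr would point at the STM's
 * memory-mapped stimulus ports. Here it points at an ordinary array
 * so that the sketch can run on a host machine. */
#define STM_NUM_PORTS 4u
static volatile unsigned int stm_mock_ports[STM_NUM_PORTS];
static volatile unsigned int *const stm_addr = stm_mock_ports;

/* Non-blocking emit, as shown on the slide. */
static inline void stm_emit(unsigned int port, unsigned int value)
{
    stm_addr[port] = value;
}

/* Hypothetical port assignments for this example only. */
enum { PORT_TASK_ENTER = 0, PORT_TASK_EXIT = 1 };

/* Sum an array, emitting the task id on entry and on exit so that a
 * trace viewer could measure how long the task took. */
static unsigned int traced_sum(unsigned int task_id,
                               const unsigned int *data,
                               unsigned int count)
{
    unsigned int sum = 0;
    unsigned int i;

    stm_emit(PORT_TASK_ENTER, task_id);
    for (i = 0; i < count; i++) {
        sum += data[i];
    }
    stm_emit(PORT_TASK_EXIT, task_id);
    return sum;
}
```

Using one port per event type keeps each write to a single store, which is why this style of instrumentation tends to be cheap enough to leave in place during system tuning.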
11 Event Profiling using STM
[Diagram labels: Cortex-A9, Cortex-A9, L2 cache]
12 Trace Memory Controller
- Single solution for cost effective and flexible trace collection
- SoC visibility in final product with only 2 pins
- Storage of trace using low cost system memory
- Routing to Gigabit links such as HSSTP or Ethernet
- Reduce trace overflows and trace port size by averaging out trace bandwidth
- Existing modes with ETB (SRAM) & Trace Port (TPIU)
[Chart axis: Bits / cycle]
13 EFFICIENT SOC DESIGN
Getting the highest performance at the lowest power consumption
14 Introduction
- Systems use external memory
  - Large address space
  - Low cost-per-bit
  - Large interface bandwidth
- Challenge: Manage the flow of data to and from external memory to present the best bandwidth and latency characteristics to each processing element
- Access to memory depends on accesses from other processing elements
[Diagram, physical view: CPU, GPU (geometry processor, tiling, renderer), apps processor, comms control, network interface, DMA controller, display controller, audio CODEC and video (image transform, motion estimation, motion compensate) connected through the interconnect to the dynamic memory controller (frame buffer, texture, primitives, tile lists, application memory) and the static memory controller (media source, NAND Flash)]
15 QoS Contracts
[Diagram: the system from slide 14, with each master annotated with its contract: Minimum Bandwidth, Minimize Latency, Maximum Latency, or Maximum Latency or Minimum Bandwidth]
16 QoS Objectives
- Allocate system capacity (latency and bandwidth) to each master to meet the contract
- Dynamically vary the priority to react to changes in bus traffic
- If there is excess capacity
  - Allocate excess to where it can offer the most improvement
    - Usually reducing the CPU latency
  - Allocate excess to masters that can reduce performance later
- If there is insufficient capacity
  - Remove capacity from masters that have the least impact on system performance
17 System Latency
- Latency is added throughout the system in two forms:
  - Static latency: the delay through pipeline stages
    - Constant and specific to the path from master to slave
  - Queuing latency: the delay at arbitration points in the system
    - The delay for each transaction depends on the number of transactions ahead of it in the queue and the rate at which they are processed
    - The queue length depends on the capacity of the slave (memory type and efficiency), and the desired throughput
- Efficiency of the Memory Controller is a function of: Queue length, Burst length, Read-write mix, Address distribution
[Charts: System Latency, population against latency/clocks, split into static latency and queuing latency; AMBA DMC-341 efficiency against average burst length]
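The two forms of latency on this slide can be combined in a small worked model. This is a simplified sketch, not a description of any particular interconnect: it assumes every queued transaction takes the same number of clocks to service, and the function and parameter names are hypothetical.

```c
/* Queuing latency: every transaction already ahead in the queue must be
 * serviced first, each one occupying the slave for service_clocks.
 * (Assumes a uniform service time, which real memory traffic rarely has.) */
static unsigned int queuing_latency_clocks(unsigned int transactions_ahead,
                                           unsigned int service_clocks)
{
    return transactions_ahead * service_clocks;
}

/* Total latency seen by a master: the fixed pipeline delay of the path
 * from master to slave plus the load-dependent queuing delay. */
static unsigned int total_latency_clocks(unsigned int static_clocks,
                                         unsigned int transactions_ahead,
                                         unsigned int service_clocks)
{
    return static_clocks
         + queuing_latency_clocks(transactions_ahead, service_clocks);
}
```

For example, with a 20-clock path and four 8-clock transactions queued ahead, the model gives 20 + 4 × 8 = 52 clocks; with an empty queue only the 20 static clocks remain.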
18 System Interface Characteristics
- The performance on the CPU master is determined by the latency characteristics that it sees from the system
  - Determined by the bandwidth from the other masters
  - Also by the efficiency of the memory controller
    - Depends on the burst characteristics of the traffic
- Other masters can be replaced with traffic profile generators (VPE)
  - Calibrated to generate the same traffic behaviour
[Diagram: the system from slide 14]
19 VPE: Verification and Performance Exploration
- The AMBA VPE design tool is for verification of the system performance:
  - A graphical profiling toolkit to generate & view traffic profiles
  - 3 verification components: AXI Monitor, AXI Master, AXI Slave
  - Runs on all of the big 3 RTL simulation tools
- Speeds up RTL simulation by
  - Giving up execution of functions (e.g. CPU, GPU) in favour of emulating their traffic
    - No need to model their cycle-accurate behaviour as a result
  - Replacing real data with constrained random data
- Can test typical and worst case scenarios
20 System Interface Characteristics
- The performance on the master is determined by the latency characteristics that it sees from the system
  - Determined by the bandwidth from the other masters
  - Also by the efficiency of the memory controller
    - Depends on the burst characteristics of the traffic
- Other masters can be replaced with traffic profile generators (VPE)
  - Calibrated to generate the same traffic behaviour
[Diagram: the system from slide 14, with a VPE Master in place of a bus master and a VPE Slave in place of the memory]
21 Calibrating VPE Master Behaviour
- Run benchmark applications on the bus master
- VPE Monitor captures the traffic profile
- VPE Slave varies the latency seen by the master
- System Architect selects a representative set of benchmarks
- Benchmark results provide bandwidth and latency contracts
- Traffic profile and latency sensitivity results are used to generate a VPE model of the bus master
[Diagram: Benchmarks, Bus Master, VPE Monitor, VPE Slave]
22 Better designs more quickly
- Iteration time of a spreadsheet with the accuracy approaching RTL simulation
[Chart: realistic behaviour (LOW to HIGH) against cycle time (LOW to HIGH; minutes/hours, days/weeks, months/years) for four approaches: Spreadsheet Analysis (mathematical formula, not dynamic); RTL simulation with VPE, user VIP and industry standards VIP (statistical or recorded traffic profiles); Acceleration/Emulation with VIP, Logic Tiles and SW (adding S/W, external I/F with realistic scenarios); Silicon/Applications (observe actual behaviour)]
23 Reducing Visible System Latency
- Write data can be buffered
  - The latency for write traffic seen by the system is significantly reduced
  - Can be used to reduce read latency
    - Prioritize reads
    - As long as coherency is managed
- Cache memory reduces latency seen by the master
  - Also reduces system bandwidth which reduces latency to other masters
  - Diminishing returns from increases in cache size
[Chart: burst latency against system utilization (%) for unbuffered, read and write traffic]
24 Increasing Latency Tolerance
- Masters that generate transactions that are weakly dependent on the completion of previous transactions
  - Can issue multiple outstanding transactions
- Multiple outstanding transactions can eliminate the effects of static (pipeline) latency
  - They have no impact on the effects from dynamic latency
  - Additional outstanding transactions will increase the queue length
[Diagram labels: Static latency, Processing rate]
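A rough way to see why outstanding transactions hide static latency is to ask how many must be in flight to keep the slave continuously busy. The sketch below uses a simplified model chosen for illustration, not one taken from the slides: each transaction spends `static_clocks` in the pipeline (round trip) and `service_clocks` at the slave, and both names are assumptions.

```c
/* Smallest number of outstanding transactions that keeps the slave busy
 * under the simplified model described above.
 * service_clocks must be non-zero. */
static unsigned int outstanding_to_hide_static_latency(unsigned int static_clocks,
                                                       unsigned int service_clocks)
{
    /* Time from issuing a transaction to seeing its completion when
     * nothing else is queued. */
    unsigned int round_trip = static_clocks + service_clocks;

    /* The slave can accept a new transaction every service_clocks, so
     * round_trip / service_clocks must be in flight, rounded up. */
    return (round_trip + service_clocks - 1u) / service_clocks;
}
```

With a 40-clock pipeline and an 8-clock service time the model asks for 6 outstanding transactions. Issuing more than that does not raise throughput in this model; the extra transactions only wait in the queue, which matches the slide's point that additional outstanding transactions increase the queue length.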
25 Queue location
- A queue is implemented in the memory controller
  - Allows re-ordering of transactions to maximize efficiency
- If the queue fills it extends through the interconnect
  - Interconnect arbitration only operates when the queue extends through the interconnect
  - For effective QoS, the arbitration policy should be consistent throughout the system
- System topology influences the performance
  - CPU and LCD Ctrl placed close to the memory controller
    - Lower latency
  - Mali and DMA on a separate interconnect level
  - Hierarchy improves performance
[Diagram: Cortex-A9, LCD Ctrl, Mali-400, Mali-VE6, DMA Ctrl and peripherals connected through the interconnect to the Memory Ctrl]
26 Stream Processing Masters
- Adding latency does not affect performance
  - While latency is less than the maximum
- Entry priority set third highest
  - Only higher priority than Best-effort
- If the transaction is still waiting after a time-out period
  - Promoted to highest priority
  - Reduces the latency to the stream processing masters when necessary
[Diagram: priority levels labelled Time-out, Batch processing, Stream processing and Best-effort. Chart: survival (0% to 100%) against latency/clocks, with the time-out and maximum latency marked]
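The time-out promotion described on this slide can be sketched as a small priority function. The numeric levels and all names below are hypothetical and chosen only so that the entry level is third highest of four; the slide does not give an encoding.

```c
/* Hypothetical priority encoding for this sketch: a larger value wins
 * arbitration. Level 2 is left free for other masters. */
enum qos_priority {
    QOS_BEST_EFFORT  = 0,
    QOS_STREAM_ENTRY = 1,  /* third highest: only above best-effort */
    QOS_TIMEOUT      = 3   /* highest: reserved for timed-out transactions */
};

/* Priority of a stream-processing transaction that has been waiting for
 * waited_clocks. It stays at its low entry priority until the time-out
 * expires, then it is promoted so that it completes before the master's
 * maximum latency is exceeded. */
static enum qos_priority stream_priority(unsigned int waited_clocks,
                                         unsigned int timeout_clocks)
{
    return (waited_clocks >= timeout_clocks) ? QOS_TIMEOUT : QOS_STREAM_ENTRY;
}
```

In such a scheme the time-out would be set somewhat below the master's maximum tolerable latency, leaving enough clocks for the promoted transaction to be serviced.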
27 Batch Processing Masters
- Average latency, bandwidth and queue length related by Little's Law: E(L) = λ·E(S)
- Hold queue length constant
  - Measure average latency and control priority
  - Priority controls latency which controls bandwidth
- Excess bandwidth is used
- Priority only exceeds other masters when insufficient bandwidth obtained
  - Minimizes transactions prioritized over Best-effort masters
[Diagram: priority levels labelled Time-out, Batch processing, Stream processing and Best-effort. Chart: bandwidth (MB/s) against latency/clocks]
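Little's Law, E(L) = λ·E(S), can be rearranged to estimate the bandwidth a batch master obtains when its queue length is held constant: λ = E(L) / E(S). The sketch below is a worked example under assumptions; the function name, the fixed transaction size and the clock frequency parameter are introduced here for illustration.

```c
/* Bandwidth obtained by a master that keeps `outstanding` transactions
 * in flight (E(L)), each of bytes_per_transaction bytes, when the
 * average latency is latency_clocks (E(S)).
 * Bytes per clock multiplied by the clock in MHz gives MB/s, taking
 * 1 MB as 10^6 bytes. latency_clocks must be greater than zero. */
static double bandwidth_mb_per_s(unsigned int outstanding,
                                 unsigned int bytes_per_transaction,
                                 double latency_clocks,
                                 double clock_mhz)
{
    double bytes_in_flight = (double)outstanding * (double)bytes_per_transaction;
    double bytes_per_clock = bytes_in_flight / latency_clocks;

    return bytes_per_clock * clock_mhz;
}
```

With 4 outstanding 64-byte transactions at 400 MHz, an average latency of 128 clocks gives 800 MB/s; halving the latency to 64 clocks doubles that to 1600 MB/s. This is the sense in which priority controls latency, and latency in turn controls bandwidth.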
28 QoS with Existing Memory Controllers
- Existing memory controllers have only 3 priority levels
- CPU given high priority but demoted if there is insufficient minimum bandwidth available for the batch processors
- Excess bandwidth partitioned between other masters
  - Batch processing bandwidth increases to use any available bandwidth
  - Batch processor bandwidth is partitioned by varying the number of outstanding transactions
    - Increase proportional to outstanding transactions
- Hard regulation can set a maximum bandwidth for a batch processing master
[Diagram: priority levels labelled Time-out, Best-effort, Batch processing and Stream processing]
29 30% Browsing Boost with QoS
- Lower Mali requirements are exceeded: Cortex-A9 is highest priority most of the time
  - Cortex performs >30% better and soaks up Mali spare system bandwidth
- Reduced Mali requirement allows Cortex-A9 to be higher priority more often
  - Cortex performs 22% better
  - Mali meets its target
[Chart: Cortex-A9 performance improvement (relative to base case performance) and Mali bandwidth (MB/s) against targeted Mali bandwidth (MB/s). Series: Mali BW (No QoS), Mali BW (QoS-301), Target Mali BW, CPU Performance (No QoS), CPU Performance (QoS-301)]
30 Optimizing Efficiency
- The performance of a system depends on
  - Maximizing the efficiency from the memory controller
  - Using cache to minimize the system bandwidth
    - And reduce latency to the masters
  - Using write buffering to minimize the latency from the system
- Performance is optimized by
  - Implementing a consistent arbitration policy throughout the system
  - Exploiting the different latency sensitivities of masters
- Roadmap to QoS
  - Consistent, system-wide, priority-based arbitration policy
  - Priority controllers for the system masters
  - Time-out mechanism in the system queue
  - High efficiency memory controller with write buffering
  - Regulation from QoS-301
[Diagram: NIC-301 interconnect linking Cortex-A9, LCD Ctrl, Mali-400, Mali-VE6, DMA Ctrl and peripherals to the Memory Ctrl]
More informationA Next Generation Home Access Point and Router
A Next Generation Home Access Point and Router Product Marketing Manager Network Communication Technology and Application of the New Generation Points of Discussion Why Do We Need a Next Gen Home Router?
More information3D Graphics in Future Mobile Devices. Steve Steele, ARM
3D Graphics in Future Mobile Devices Steve Steele, ARM Market Trends Mobile Computing Market Growth Volume in millions Mobile Computing Market Trends 1600 Smart Mobile Device Shipments (Smartphones and
More informationKeyStone C665x Multicore SoC
KeyStone Multicore SoC Architecture KeyStone C6655/57: Device Features C66x C6655: One C66x DSP Core at 1.0 or 1.25 GHz C6657: Two C66x DSP Cores at 0.85, 1.0, or 1.25 GHz Fixed and Floating Point Operations
More informationVerification Futures Nick Heaton, Distinguished Engineer, Cadence Design Systems
Verification Futures 2016 Nick Heaton, Distinguished Engineer, Cadence Systems Agenda Update on Challenges presented in 2015, namely Scalability of the verification engines The rise of Use-Case Driven
More informationAnalyze system performance using IWB. Interconnect Workbench Dave Huang
Analyze system performance using IWB Interconnect Workbench Dave Huang Perf_analysis@126.com 1 Information Personal peech of personal experience I am on behalf on myself Interconnects Are at the Heart
More informationSimplify System Complexity
1 2 Simplify System Complexity With the new high-performance CompactRIO controller Arun Veeramani Senior Program Manager National Instruments NI CompactRIO The Worlds Only Software Designed Controller
More informationImplementing Flexible Interconnect Topologies for Machine Learning Acceleration
Implementing Flexible Interconnect for Machine Learning Acceleration A R M T E C H S Y M P O S I A O C T 2 0 1 8 WILLIAM TSENG Mem Controller 20 mm Mem Controller Machine Learning / AI SoC New Challenges
More informationARM Debug and Trace. Configuration and Usage Models. Document number: ARM DEN 0034A Copyright ARM Limited
ARM Debug and Trace Configuration and Usage Models Document number: ARM DEN 0034A Copyright ARM Limited 2012-2013 ARM Debug and Trace Configuration and Usage Models Release information The following table
More informationThe Evolution of the ARM Architecture Towards Big Data and the Data-Centre
The Evolution of the ARM Architecture Towards Big Data and the Data-Centre 8th Workshop on Virtualization in High-Performance Cloud Computing (VHPC'13) held in conjunction with SC 13, Denver, Colorado
More informationChapter 15 ARM Architecture, Programming and Development Tools
Chapter 15 ARM Architecture, Programming and Development Tools Lesson 07 ARM Cortex CPU and Microcontrollers 2 Microcontroller CORTEX M3 Core 32-bit RALU, single cycle MUL, 2-12 divide, ETM interface,
More informationVLSI Design of Multichannel AMBA AHB
RESEARCH ARTICLE OPEN ACCESS VLSI Design of Multichannel AMBA AHB Shraddha Divekar,Archana Tiwari M-Tech, Department Of Electronics, Assistant professor, Department Of Electronics RKNEC Nagpur,RKNEC Nagpur
More informationGrowth outside Cell Phone Applications
ARM Introduction Growth outside Cell Phone Applications ~1B units shipped into non-mobile applications Embedded segment now accounts for 13% of ARM shipments Automotive, microcontroller and smartcards
More informationMultimedia in Mobile Phones. Architectures and Trends Lund
Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson
More informationAHB monitor. Monitor. AHB bridge. Expansion AHB ports M1, M2, and S. AHB bridge. AHB bridge. Configuration. Smart card reader SSP (PL022)
The ARM RealView Versatile family of development boards provide a feature rich prototyping system for system-on-chip designs. This family includes the first development board to support both the ARM926EJ-S
More informationSEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20 th Oct,2010
SEMICON Solutions Bus Structure Created by: Duong Dang Date: 20 th Oct,2010 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single
More informationSQLoC: Using SQL database for performance analysis of an ARM v8 SoC
SQLoC: Using SQL database for performance analysis of an ARM v8 SoC Gordon Allan and Avidan Efody Mentor Graphics Agenda The performance analysis problem Run time & design time configuration Use cases
More informationPlace Your Logo Here. K. Charles Janac
Place Your Logo Here K. Charles Janac President and CEO Arteris is the Leading Network on Chip IP Provider Multiple Traffic Classes Low Low cost cost Control Control CPU DSP DMA Multiple Interconnect Types
More informationARM Connected Community Technical Symposium Reaching High Performance System Design Using AMBA Fabric IP
ARM Connected Community Technical Symposium Reaching High Performance System Design Using AMBA Fabric IP Tim Mace Senior Technical Marketing Manager Fabric IP BU, ARM 1 What is Fabric IP? Fabric IP is:
More informationCortex-A75 and Cortex-A55 DynamIQ processors Powering applications from mobile to autonomous driving
Cortex-A75 and Cortex- DynamIQ processors Powering applications from mobile to autonomous driving Lionel Belnet Sr. Product Manager Arm Arm Tech Symposia 2017 Agenda Market growth and trends DynamIQ technology
More informationECE 551 System on Chip Design
ECE 551 System on Chip Design Introducing Bus Communications Garrett S. Rose Fall 2018 Emerging Applications Requirements Data Flow vs. Processing µp µp Mem Bus DRAMC Core 2 Core N Main Bus µp Core 1 SoCs
More informationPower Aware Architecture Design for Multicore SoCs
Power Aware Architecture Design for Multicore SoCs EDPS Monterey Patrick Sheridan Synopsys Virtual Prototyping April 2015 Low Power SoC Design Multi-disciplinary system problem Must manage energy consumption
More informationMobile & IoT Market Trends and Memory Requirements
Mobile & IoT Market Trends and Memory Requirements JEDEC Mobile & IOT Forum Daniel Heo ARM Segment Marketing Copyright ARM 2016 Outline Wearable & IoT Market Opportunities Challenges in Wearables & IoT
More informationBuses. Maurizio Palesi. Maurizio Palesi 1
Buses Maurizio Palesi Maurizio Palesi 1 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single shared channel Microcontroller Microcontroller
More informationMapping applications into MPSoC
Mapping applications into MPSoC concurrency & communication Jos van Eijndhoven jos@vectorfabrics.com March 12, 2011 MPSoC mapping: exploiting concurrency 2 March 12, 2012 Computation on general purpose
More informationModeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano
Modeling and Simulation of System-on on-chip Platorms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market
More informationThe Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
More informationEvolving IP configurability and the need for intelligent IP configuration
Evolving IP configurability and the need for intelligent IP configuration Mayank Sharma Product Manager ARM Tech Symposia India December 7 th 2016 Increasing IP integration costs per node $140 $120 $M
More informationApplying the Benefits of Network on a Chip Architecture to FPGA System Design
white paper Intel FPGA Applying the Benefits of on a Chip Architecture to FPGA System Design Authors Kent Orthner Senior Manager, Software and IP Intel Corporation Table of Contents Abstract...1 Introduction...1
More informationAn Efficient AXI Read and Write Channel for Memory Interface in System-on-Chip
An Efficient AXI Read and Write Channel for Memory Interface in System-on-Chip Abhinav Tiwari M. Tech. Scholar, Embedded System and VLSI Design Acropolis Institute of Technology and Research, Indore (India)
More informationAsynchronous on-chip Communication: Explorations on the Intel PXA27x Peripheral Bus
Asynchronous on-chip Communication: Explorations on the Intel PXA27x Peripheral Bus Andrew M. Scott, Mark E. Schuelein, Marly Roncken, Jin-Jer Hwan John Bainbridge, John R. Mawer, David L. Jackson, Andrew
More information