Embedded Systems 1: Course Presentation

1 October 2017 Embedded Systems 1: Course Presentation Davide Zoni PhD webpage: home.deib.polimi.it/zoni

2 About me
Education: PostDoc (March 2014 to now); PhD Student (Jan 2011 to Dec 2013); M.S. Degree, Politecnico di Milano (2010)
Visiting: ARM Cambridge (UK) 2015; UCY (Cyprus) 2015; UPV (Valencia, Spain) 2013 and 2014
Research Topics: On-chip interconnect; Cache coherence and hierarchy; RTL design, simulation and verification; Hardware-side side-channel countermeasures

3 Contacts & Places
Prof. William Fornaciari (Professor in charge)
webpage: home.deib.polimi.it/fornacia
Office: Building 20, first floor
Davide Zoni PhD
phone:
webpage: home.deib.polimi.it/zoni
Office: Campus Bassini (eg building), first floor, room 004

4 A short list of Embedded Systems

5 Some Embedded Systems Topics
Multi-cores
Analysis: Power/performance optimization; Thermal/performance optimization; Application-oriented design
Simulation: On-chip interconnect; Cache hierarchy
Design: FPGA prototyping; Run-time policies; Application-oriented design
Embedded devices
Microcontroller peripherals
Wireless sensor networks (WSN)

6 What do you expect from this course?

7 Course Outline
On-chip Communication (3 hours): Bus; Networks on Chip
Architectural Simulation (5 hours): Event-driven execution model; Cycle-accurate simulators: GEM5; Design Space Exploration (DSE)
RTL Design, Verification and Simulation (12 hours): Design organization; RTL simulation model
SystemVerilog for Synthesis (Verilog 2001 subset): Combinatorial and sequential logic, state machines
SystemVerilog for Verification (Verilog 2001 subset): Writing testbenches; Testbench-DUT model
Invited talks: GPU architecture, ongoing MS thesis, STMicroelectronics

8 Objectives of this part of the course
Hands-on with the most common issues in the design of high-end embedded multi-cores: Design, simulation and verification aspects; On-chip communication; Architectural exploration (DSE)
Research/Company approach: Project presentation; Get involved in realistic design and evaluation problems; Team working

9 Tentative schedule
Introduction + On-chip Bus/Networks: 3 hours
Architectural Simulation with GEM5: 3 hours
Design Space Exploration using GEM5: 2 hours
SystemVerilog for Design: 3 hours
SystemVerilog for Verification: 3 hours
SystemVerilog with Vivado: 3 hours
SystemVerilog FSM from Past Exams: 3 hours
Project Presentation: 2 hours ? (Before Xmas?)
Written Exam: 2 hours

10 The Multi-Core Revolution

11 The Power Performance Gap
[Figure: number of cores and performance plotted over time for heterogeneous processors, with the gap between the two trends highlighted. Labelled data points: Tile-GX, Tilera-64, Intel Polaris (80-core), Intel Single-chip Cloud Computer, IBM Cell, Larrabee, Sun, Intel Core i5.]

12 Goal: Better performance
Design Issues and Constraints: Power; Thermal; Reliability

13 Market: More Performance and Energy Efficiency
Users: Same experience with any device; Multi-tasking; Demanding tasks (graphic apps)
Multiple Running Applications: Compete for resources; Quality of Service (QoS); Different priority levels
Embedded Multi-Cores: Limited resources; Optimized power consumption

14 Market: The Smartphone/Tablet Scenario
The raw computational power of the embedded GPU increases at each generation
The market asks for even higher resolution
Portable devices should be able to reproduce media content smoothly, at native resolution
FHD resolution: 32 bit * 1920 * 1080 = about 8 MB/frame; at 25 fps, about 200 MB/s
[Figure: block diagram with CPU0 ... CPUn and a GPU, their L1 caches (L1 0 ... L1 n, L1), an L2, the on-chip interconnect and main memory]
CPU: Prepares the vertices and texture addresses; Signals the GPU
GPU: Elaborates the scene and stores it back to memory; Signals the CPU
LCD Controller: Gets the frame and displays it at a fixed rate
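The frame-buffer arithmetic on this slide can be checked with a short script. This is only a sketch: the function name and parameter names are illustrative, and it assumes decimal megabytes (10^6 bytes), which is why the exact figures come out slightly above the rounded 8 MB/frame and 200 MB/s quoted on the slide.

```python
def frame_bandwidth(width=1920, height=1080, bits_per_pixel=32, fps=25):
    """Return (bytes per frame, bytes per second) for an uncompressed frame buffer."""
    # One frame holds width * height pixels, each bits_per_pixel / 8 bytes wide.
    bytes_per_frame = width * height * bits_per_pixel // 8
    # The display controller reads one full frame per refresh.
    bytes_per_second = bytes_per_frame * fps
    return bytes_per_frame, bytes_per_second


frame_bytes, bytes_per_s = frame_bandwidth()
print(f"{frame_bytes / 1e6:.2f} MB/frame, {bytes_per_s / 1e6:.1f} MB/s")
```

This counts only the LCD controller reading the finished frame; the GPU writing that frame to memory adds further traffic on the same interconnect.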

15 Power Trends and Issues
Desktop/Server: "Today some of the most powerful microprocessor chips can dissipate [...] Watts, for an average power density of [...] Watts per square centimeter. Local hot spots on the die can be several times higher than this number." (Low Power Methodology Manual, 2007)
Smartphone/Tablet: "For battery powered, hand held devices, the numbers are smaller but the problem just as serious. Battery life peaked in [...]. Since then the battery life has decreased, as features have been added faster than power has been reduced." (ITRS)
Datacenter/HPC: "The total power consumption of microprocessor chips presents a significant problem for server farms, where infrastructure costs can equal the cost of the computers themselves." (Low Power Methodology Manual, 2007)

16 CMOS Technology Evolution
Thermal and Cooling Issues
Michael Keating, David Flynn et al., Low Power Methodology Manual, ARM & Synopsys, 2007

17 Reliability and Process Variability Issues
Smaller transistors = more transistors per chip
Thermal issues: Increasing heat; High leakage; High power dissipation
Design issues: Increasing complexity; Escaped bugs
Technology issues: Parametric variations; Soft/hard errors

18 Technology vs. Design Paradox
Technology scaling leads to fragile transistors
Reliable system design: providing digital design solutions that deliver reliable systems built on an unreliable silicon substrate
Electronic devices must be reliable to be successful

19 Reliability: Hard Faults vs. Escaped Bugs
Hard faults: Irreversible physical change; Latent manufacturing defects; Thermal issues and over-voltage can induce them
Escaped bugs: Design errors; Increase with the device complexity; Dozens in current designs; Intel named their solution a "specification update"
Verification
Formal methods: No simulation required; Specific design properties are evaluated
RTL simulation: Ad hoc tests to assess the design semantics; Current HDLs support asserts and checkers
Check here: Intel P4

20 RTL Simulation: Bug Inspection
Static view (design time): Escaped bugs seem unavoidable in complex designs (500M transistors and more than 5M flip-flops). Verification is time-consuming and constrained by the time to market.
[Figure: the design split into a verified (bug-free) region and a region of possible escaped bugs]
Dynamic view (run time): End-user experience: the design works.
[Figure: the same verified (bug-free) and possible-escaped-bugs regions at run time]
Escaped bugs are subtle and not easily triggered. They may go unnoticed for the entire life of the device.

21 The Multi-cores: Rethink the Processor Design
Key concepts: Split the complexity (fewer bugs); Split the power consumption; Simplify the design of reliable methodologies to face HW faults; Ease performance/power prediction
Explicit parallelism: from core to multi-core, from FU to core
Design issues, a fresh view is required: Memory/cache hierarchy; Coherence protocols; Interconnect
[Figure: a large core with its cache compared with small cores (C1, C2, C3, ...) sharing a cache, on power and performance axes; one small core has Power = 1/4 and Performance = 1/2 of the large core]

22 The design problem: On-chip interconnect and Cache hierarchy
Bandwidth, arbitration, Quality of Service (QoS), latency
Low power, small footprint
From low-end (single task/dedicated) to high-end (multi-core) embedded systems

23 Microcontrollers: STM32F4Discovery, ARM M4

24 Embedded CPUs: mor1kx SoC
Open source, OpenRISC 1000 compliant
3-6 stage pipeline
Wishbone bus
Configurable FPU
Configurable I-cache/D-cache
Multi-core available: OptimSoC (NoC based); PolimiMCore (SCA design, coherence, bus based)
Both memory and interconnect strongly affect the SoC performance: +20% using a split instruction/data bus; +30% with caches vs. without caches

25 High-End Multi-core: Sun Niagara T
Open source (Verilog)
8 in-order CPUs
4-way HW multithreading
Up to 32 parallel threads of execution
The interconnect forces the floorplan
Enhanced versions: T2 (2007), T3 (2010), T4 (2011), T5 (2013)

26 High-End Many-Core: Intel SCC
48 Pentium CPUs
2D mesh
State-of-the-art VC-based routers
Message passing / no HW coherence
Multiple voltage islands: 1 Vdd per tile (2 CPUs per tile); 1 NoC Vdd island
The interconnect shapes the floorplan
The NoC power consumption is within 20% of the SoC power

27 Multi-core: On-chip Architectural View
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, an on-chip interconnect, L2 0 ... L2 n and a second on-chip interconnect; off chip, the main memory]
On-chip interconnect: Point-to-point; Bus/Crossbar; NoC
Traffic type: Coherence traffic; Message passing; Additional control traffic from user space, if supported
Heterogeneous multi-cores: GPUs, HW accelerators; How they are (inter)connected; Coherence support
NOTE: The memory controller can be on chip in current and future solutions

28 Available Solutions to the On-chip Interconnect Design Problem

29 On-chip Interconnect Architecture: Point-to-Point
A scheduler component is required: Requests/responses routing
No master-to-master communication
Data sharing: No data sharing on the interconnect
Scalability/Performance issues: The link count grows as n*(n-1); Full throughput between each master/slave pair
Reliability issues: A single faulty link prevents the communication
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, one scheduler (Sched) per L1 and per L2, L2 0 ... L2 n and the on-chip interconnect; off chip, the main memory]
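To get a feel for the scalability issue, the link-count growth can be tabulated with a few lines of code. This is a sketch under one assumption: that the slide's formula reads n*(n-1), i.e. one dedicated link per ordered pair of n endpoints. The function name `p2p_links` is illustrative.

```python
def p2p_links(n: int) -> int:
    """Dedicated links for n endpoints, one per ordered (source, destination) pair."""
    return n * (n - 1)


# Doubling the endpoint count roughly quadruples the number of links.
for n in (2, 4, 8, 16, 64):
    print(f"n={n:3d}: {p2p_links(n):5d} links")
```

The quadratic growth is what makes a fully point-to-point interconnect impractical beyond a handful of endpoints, despite the full throughput it offers to each pair.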

30 On-chip Interconnect Architecture: Bus
Single shared communication medium: Arbitration request
Data sharing: Any attached master can observe the data on the bus; Property exploited by the snooping coherence protocol family
Scalability/Performance issues: Contention grows with the number of masters attached to the bus
Reliability issues: Single point of failure
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, L2 0 ... L2 n and the on-chip interconnect; off chip, the main memory]

31 On-chip Interconnect Architecture: Xbar
Single shared communication medium: Arbitration request; More complex than bus arbiters
Data sharing: Any attached master can observe the data on the bus; Property exploited by the snooping coherence protocol family
Scalability/Performance issues: Better performance, with higher power and area
Reliability issues: Redundancy
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, L2 0 ... L2 n and the on-chip interconnect; off chip, the main memory]

32 On-chip Interconnect Architecture: NoC
Distributed interconnect: Reduced resource contention; Still requires the [...]
Data sharing: No data sharing, since no assumptions can be made on the path taken from source to destination; Directory-based coherence protocols
Scalability/Performance issues: Higher latencies than an on-chip bus; Fully scalable
Reliability issues: Fully reliable (multiple paths for the same source-destination pair)
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, a network of routers (R), L2 0 ... L2 n and the on-chip interconnect; off chip, the main memory]

33 On-chip Data Types
Coherence traffic
Message-passing data
User data

34 Multi-core: Interconnect Traffic
[Figure: on chip, CPU0 ... CPUn with L1 0 ... L1 n, an on-chip interconnect, L2 0 ... L2 n and a second on-chip interconnect; off chip, the main memory]
The interconnect always routes data/instructions
HW/SW coherence only shapes the traffic volume and intensity
A load/store is translated into a packet suitable for the interconnect
Both the coherent and the message-passing architectures share the same high-level data protocol: Request, response; Request, forward, response

35 Multi-core: On-Chip Bus Scalability Example
[Figure: CPU0 ... CPUn, L2 0 ... L2 n and the main memory attached to a shared bus]
Some parameters: 2 GHz CPUs, 2 IPC; 33% memory operations, 2% of which miss in L2; 50% of evictions are dirty
Required bandwidth? The coherence events generated at the L2 are misses and dirty replacements.
Some results:
(0.33 * 0.02) + (0.33 * 0.02 * 0.50) = 0.0099, about 0.01 events/insn
0.01 events/insn * 2 insn/cycle * 2 cycles/ns = 0.04 events/ns
Request: 0.04 events/ns * 4 B/event = 0.16 GB/s = 160 MB/s
Data response: 0.04 events/ns * 64 B/event = 2.56 GB/s
What about scalability? That's about 2.5 GB/s per processor. With 16 processors, that's 40 GB/s! With 128 processors, that's 320 GB/s!!
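The same estimate can be reproduced in a few lines, which makes it easy to try other parameters. This is a minimal sketch: the function and parameter names are illustrative, and the code keeps the unrounded intermediate values, so it gives about 2.53 GB/s per processor where the slide, rounding to 0.01 events/insn first, gets 2.56 GB/s.

```python
def bus_bandwidth(freq_ghz=2.0, ipc=2.0, mem_frac=0.33, l2_miss=0.02,
                  dirty_frac=0.50, req_bytes=4, line_bytes=64):
    """Return per-processor (request, data response) bus bandwidth in GB/s."""
    # Every L2 miss is one event; a dirty eviction adds a second one.
    events_per_insn = mem_frac * l2_miss * (1.0 + dirty_frac)
    # insn/cycle * cycles/ns = instructions per nanosecond.
    events_per_ns = events_per_insn * ipc * freq_ghz
    # Bytes per nanosecond is numerically the same as GB/s (10^9 bytes/s).
    return events_per_ns * req_bytes, events_per_ns * line_bytes


req_gbs, data_gbs = bus_bandwidth()
print(f"per processor: request {req_gbs:.3f} GB/s, data {data_gbs:.3f} GB/s")

# Aggregate data bandwidth a single shared bus would have to sustain.
for cores in (1, 16, 128):
    print(f"{cores:4d} processors: {cores * data_gbs:7.1f} GB/s")
```

The demand scales linearly with the processor count while a shared bus offers a fixed bandwidth, which is the scalability problem the slide points at.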

36 Wrapping up: Interconnect and Memory
Real case scenario: Adapteva Parallella board, a dual-core ARM A9 coupled with a 16-core coprocessor [Epiphany III 16-core 65nm (E16G301)] (~100 online)
E16G301 TECHNICAL SPECS
16 high-performance RISC CPU cores
1 GHz operating frequency
32 GFLOPS peak performance (2 float op/core/cycle)
((4 B * 2) * 2) * 16 = 256 B/cycle; 256 B/cycle * 10^9 cycles/s = 256 GB/s (interconnect)
512 GB/s local memory bandwidth
64 GB/s Network-on-Chip bisection bandwidth
8 GB/s off-chip bandwidth
0.5 MB on-chip distributed shared memory (32 KB per core)
No data coherence
Additional Technical Specs
2 Watt maximum chip power consumption
IEEE floating-point instruction set
Fully featured ANSI C/C++ programmable GNU/Eclipse-based tool chain
Source-synchronous LVDS off-chip links for host or direct chip-to-chip interfacing
Chip-to-chip links for integrating up to 64 chips on a single board
324-ball 15x15mm flip-chip BGA
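The headline E16G301 figures can be cross-checked against each other with a short script. This is a sketch, not vendor data: the constant names are illustrative, and it assumes one reading of the slide's ((4B*2)*2)*16 term, namely 4-byte floats, 2 operands per operation and 2 operations per core per cycle across 16 cores.

```python
CORES = 16
FREQ_GHZ = 1.0            # 1 GHz operating frequency
FLOPS_PER_CYCLE = 2       # float operations per core per cycle
BYTES_PER_FLOAT = 4
OPERANDS_PER_OP = 2       # assumed meaning of the inner "* 2" on the slide
LOCAL_MEM_KB = 32         # local memory per core

# Peak compute: cores * operations per cycle * cycles per nanosecond.
peak_gflops = CORES * FLOPS_PER_CYCLE * FREQ_GHZ

# Operand traffic needed to keep every core busy, in bytes per cycle.
bytes_per_cycle = BYTES_PER_FLOAT * OPERANDS_PER_OP * FLOPS_PER_CYCLE * CORES
demand_gbs = bytes_per_cycle * FREQ_GHZ   # bytes/ns equals GB/s

total_mem_kb = CORES * LOCAL_MEM_KB

print(f"peak: {peak_gflops:.0f} GFLOPS, operand demand: {demand_gbs:.0f} GB/s")
print(f"on-chip memory: {total_mem_kb} KB")

# Compare the demand with the bandwidth quoted for each memory level.
for name, supply_gbs in (("local memory", 512), ("NoC bisection", 64), ("off-chip", 8)):
    print(f"{name:14s}: {supply_gbs:4d} GB/s, demand/supply = {demand_gbs / supply_gbs:.1f}")
```

Under these assumptions only the local memories can feed the cores at peak rate; the NoC bisection and the off-chip links fall short by factors of 4 and 32, so data placement matters.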

37 The Final Goal of this Course: Top-Down View
[Figure: design flow from the system level down to the physical level, with the achievable savings along one axis]
System: System-level optimization and simulation; Tools: cycle-accurate simulators (GEM5); System-level trade-off analysis and DSE: cache hierarchy, CPU/interconnect frequency
RTL Behavioral: Tools: Icarus Verilog, Xsim; RTL module design and verification; Pre-synthesis, testbench design
RTL Post-Synthesis: Tools: Cadence/Xilinx front end; Optimize clock, critical path, module fan-out; Power, area, timing estimates
Physical: Place & route and SPICE level; Tools: Cadence/Xilinx back end; Optimize clock, critical path, module fan-out; Get final power, area, timing estimates

Embedded Systems: Projects

Embedded Systems: Projects November 2016 Embedded Systems: Projects Davide Zoni PhD email: davide.zoni@polimi.it webpage: home.dei.polimi.it/zoni Contacts & Places Prof. William Fornaciari (Professor in charge) email: william.fornaciari@polimi.it

More information

The Multi-core revolution: Design Issues and Bus-based Interconnect

The Multi-core revolution: Design Issues and Bus-based Interconnect Friday, October, 2013 The Multi-core revolution: Design Issues and Bus-based Interconnect Davide Zoni PhD Student email: zoni@elet.polimi.it webpage: home.dei.polimi.it/zoni Outline Power wall and singlecore

More information

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013 A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company

More information

Embedded Systems: Projects

Embedded Systems: Projects December 2015 Embedded Systems: Projects Davide Zoni PhD email: davide.zoni@polimi.it webpage: home.dei.polimi.it/zoni Research Activities Interconnect: bus, NoC Simulation (component design, evaluation)

More information

Embedded Systems 1: On Chip Bus

Embedded Systems 1: On Chip Bus October 2016 Embedded Systems 1: On Chip Bus Davide Zoni PhD email: davide.zoni@polimi.it webpage: home.deib.polimi.it/zoni Additional Material and Reference Book 2 Reference Book Chapter Principles and

More information

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013 NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching

More information

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers William Stallings Computer Organization and Architecture 8 th Edition Chapter 18 Multicore Computers Hardware Performance Issues Microprocessors have seen an exponential increase in performance Improved

More information

There s STILL plenty of room at the bottom! Andreas Olofsson

There s STILL plenty of room at the bottom! Andreas Olofsson There s STILL plenty of room at the bottom! Andreas Olofsson 1 Richard Feynman s Lecture (1959) There's Plenty of Room at the Bottom An Invitation to Enter a New Field of Physics Why cannot we write the

More information

EE5780 Advanced VLSI CAD

EE5780 Advanced VLSI CAD EE5780 Advanced VLSI CAD Lecture 1 Introduction Zhuo Feng 1.1 Prof. Zhuo Feng Office: EERC 513 Phone: 487-3116 Email: zhuofeng@mtu.edu Class Website http://www.ece.mtu.edu/~zhuofeng/ee5780fall2013.html

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

Lecture 1: Introduction

Lecture 1: Introduction Contemporary Computer Architecture Instruction set architecture Lecture 1: Introduction CprE 581 Computer Systems Architecture, Fall 2016 Reading: Textbook, Ch. 1.1-1.7 Microarchitecture; examples: Pipeline

More information

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection

An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection An Evaluation of an Energy Efficient Many-Core SoC with Parallelized Face Detection Hiroyuki Usui, Jun Tanabe, Toru Sano, Hui Xu, and Takashi Miyamori Toshiba Corporation, Kawasaki, Japan Copyright 2013,

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD Leveraging OpenSPARC ESA Round Table 2006 on Next Generation Microprocessors for Space Applications G.Furano, L.Messina TEC- OpenSPARC T1 The T1 is a new-from-the-ground-up SPARC microprocessor implementation

More information

VLSI Design Automation. Maurizio Palesi

VLSI Design Automation. Maurizio Palesi VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 Outline Technology trends VLSI Design flow (an overview) 3 IC Products Processors CPU, DSP, Controllers Memory chips

More information

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

ECE 2162 Intro & Trends. Jun Yang Fall 2009

ECE 2162 Intro & Trends. Jun Yang Fall 2009 ECE 2162 Intro & Trends Jun Yang Fall 2009 Prerequisites CoE/ECE 0142: Computer Organization; or CoE/CS 1541: Introduction to Computer Architecture I will assume you have detailed knowledge of Pipelining

More information

System-on-Chip Architecture for Mobile Applications. Sabyasachi Dey

System-on-Chip Architecture for Mobile Applications. Sabyasachi Dey System-on-Chip Architecture for Mobile Applications Sabyasachi Dey Email: sabyasachi.dey@gmail.com Agenda What is Mobile Application Platform Challenges Key Architecture Focus Areas Conclusion Mobile Revolution

More information

OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel

OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel OpenSMART: Single-cycle Multi-hop NoC Generator in BSV and Chisel Hyoukjun Kwon and Tushar Krishna Georgia Institute of Technology Synergy Lab (http://synergy.ece.gatech.edu) hyoukjun@gatech.edu April

More information

How to build a Megacore microprocessor. by Andreas Olofsson (MULTIPROG WORKSHOP 2017)

How to build a Megacore microprocessor. by Andreas Olofsson (MULTIPROG WORKSHOP 2017) How to build a Megacore microprocessor by Andreas Olofsson (MULTIPROG WORKSHOP 2017) 1 Disclaimers 2 This presentation summarizes work done by Adapteva from 2008-2016. Statements and opinions are my own

More information

Microprocessor Trends and Implications for the Future

Microprocessor Trends and Implications for the Future Microprocessor Trends and Implications for the Future John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 522 Lecture 4 1 September 2016 Context Last two classes: from

More information

EITF35: Introduction to Structured VLSI Design

EITF35: Introduction to Structured VLSI Design EITF35: Introduction to Structured VLSI Design Part 1.1.2: Introduction (Digital VLSI Systems) Liang Liu liang.liu@eit.lth.se 1 Outline Why Digital? History & Roadmap Device Technology & Platforms System

More information

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS Embedded System System Set of components needed to perform a function Hardware + software +. Embedded Main function not computing Usually not autonomous

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Computer Architecture

Computer Architecture Computer Architecture Slide Sets WS 2013/2014 Prof. Dr. Uwe Brinkschulte M.Sc. Benjamin Betting Part 10 Thread and Task Level Parallelism Computer Architecture Part 10 page 1 of 36 Prof. Dr. Uwe Brinkschulte,

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

Low-Power Interconnection Networks

Low-Power Interconnection Networks Low-Power Interconnection Networks Li-Shiuan Peh Associate Professor EECS, CSAIL & MTL MIT 1 Moore s Law: Double the number of transistors on chip every 2 years 1970: Clock speed: 108kHz No. transistors:

More information

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS Who am I? Education Master of Technology, NTNU, 2007 PhD, NTNU, 2010. Title: «Managing Shared Resources in Chip Multiprocessor Memory

More information

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group

Aim High. Intel Technical Update Teratec 07 Symposium. June 20, Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Aim High Intel Technical Update Teratec 07 Symposium June 20, 2007 Stephen R. Wheat, Ph.D. Director, HPC Digital Enterprise Group Risk Factors Today s s presentations contain forward-looking statements.

More information

Modeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano

Modeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano Modeling and Simulation of System-on on-chip Platorms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market

More information

Distributed systems: paradigms and models Motivations

Distributed systems: paradigms and models Motivations Distributed systems: paradigms and models Motivations Prof. Marco Danelutto Dept. Computer Science University of Pisa Master Degree (Laurea Magistrale) in Computer Science and Networking Academic Year

More information

Parallelism in Hardware

Parallelism in Hardware Parallelism in Hardware Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3 Moore s Law

More information

ECE/CS 757: Advanced Computer Architecture II Interconnects

ECE/CS 757: Advanced Computer Architecture II Interconnects ECE/CS 757: Advanced Computer Architecture II Interconnects Instructor:Mikko H Lipasti Spring 2017 University of Wisconsin-Madison Lecture notes created by Natalie Enright Jerger Lecture Outline Introduction

More information

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola 1. Microprocessor Architectures 1.1 Intel 1.2 Motorola 1.1 Intel The Early Intel Microprocessors The first microprocessor to appear in the market was the Intel 4004, a 4-bit data bus device. This device

More information

Multicore Hardware and Parallelism

Multicore Hardware and Parallelism Multicore Hardware and Parallelism Minsoo Ryu Department of Computer Science and Engineering 2 1 Advent of Multicore Hardware 2 Multicore Processors 3 Amdahl s Law 4 Parallelism in Hardware 5 Q & A 2 3

More information

Computer Architecture!

Computer Architecture! Informatics 3 Computer Architecture! Dr. Vijay Nagarajan and Prof. Nigel Topham! Institute for Computing Systems Architecture, School of Informatics! University of Edinburgh! General Information! Instructors

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

Lecture 1: Gentle Introduction to GPUs

Lecture 1: Gentle Introduction to GPUs CSCI-GA.3033-004 Graphics Processing Units (GPUs): Architecture and Programming Lecture 1: Gentle Introduction to GPUs Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Who Am I? Mohamed

More information

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins

Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Intel Many Integrated Core (MIC) Matt Kelly & Ryan Rawlins Outline History & Motivation Architecture Core architecture Network Topology Memory hierarchy Brief comparison to GPU & Tilera Programming Applications

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION Rapid advances in integrated circuit technology have made it possible to fabricate digital circuits with large number of devices on a single chip. The advantages of integrated circuits

More information

Multimedia in Mobile Phones. Architectures and Trends Lund

Multimedia in Mobile Phones. Architectures and Trends Lund Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson

More information

A 1.5GHz Third Generation Itanium Processor

A 1.5GHz Third Generation Itanium Processor A 1.5GHz Third Generation Itanium Processor Jason Stinson, Stefan Rusu Intel Corporation, Santa Clara, CA 1 Outline Processor highlights Process technology details Itanium processor evolution Block diagram

More information

Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming ( )

Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming ( ) Systems Group Department of Computer Science ETH Zürich Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Today Non-Uniform

More information

CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers

CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers CCNoC: Specializing On-Chip Interconnects for Energy Efficiency in Cache-Coherent Servers Stavros Volos, Ciprian Seiculescu, Boris Grot, Naser Khosro Pour, Babak Falsafi, and Giovanni De Micheli Toward

More information

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components By William Orme, Strategic Marketing Manager, ARM Ltd. and Nick Heaton, Senior Solutions Architect, Cadence Finding

More information

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics Computer and Hardware Architecture II Benny Thörnberg Associate Professor in Electronics Parallelism Microscopic vs Macroscopic Microscopic parallelism hardware solutions inside system components providing

More information

Parallel Computing: Parallel Architectures Jin, Hai

Parallel Computing: Parallel Architectures Jin, Hai Parallel Computing: Parallel Architectures Jin, Hai School of Computer Science and Technology Huazhong University of Science and Technology Peripherals Computer Central Processing Unit Main Memory Computer

More information

Chapter 0 Introduction

Chapter 0 Introduction Chapter 0 Introduction Jin-Fu Li Laboratory Department of Electrical Engineering National Central University Jhongli, Taiwan Applications of ICs Consumer Electronics Automotive Electronics Green Power

More information

SoC Platforms and CPU Cores

SoC Platforms and CPU Cores SoC Platforms and CPU Cores COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

More information

Performance of computer systems

Performance of computer systems Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type

More information

An Alternative to GPU Acceleration For Mobile Platforms

An Alternative to GPU Acceleration For Mobile Platforms Inventing the Future of Computing An Alternative to GPU Acceleration For Mobile Platforms Andreas Olofsson andreas@adapteva.com 50 th DAC June 5th, Austin, TX Adapteva Achieves 3 World Firsts 1. First

More information

Multi-Core Microprocessor Chips: Motivation & Challenges

Multi-Core Microprocessor Chips: Motivation & Challenges Multi-Core Microprocessor Chips: Motivation & Challenges Dileep Bhandarkar, Ph. D. Architect at Large DEG Architecture & Planning Digital Enterprise Group Intel Corporation October 2005 Copyright 2005

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

More Course Information

More Course Information More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well

More information

NoC Simulation in Heterogeneous Architectures for PGAS Programming Model

NoC Simulation in Heterogeneous Architectures for PGAS Programming Model NoC Simulation in Heterogeneous Architectures for PGAS Programming Model Sascha Roloff, Andreas Weichslgartner, Frank Hannig, Jürgen Teich University of Erlangen-Nuremberg, Germany Jan Heißwolf Karlsruhe

Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni

Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni Emerging Platforms, Emerging Technologies, and the Need for Crosscutting Tools Luca Carloni Department of Computer Science Columbia University in the City of New York NSF Workshop on Emerging Technologies

What are Clusters? Why Clusters? - a Short History

What are Clusters? Why Clusters? - a Short History What are Clusters? Our definition : A parallel machine built of commodity components and running commodity software Cluster consists of nodes with one or more processors (CPUs), memory that is shared by

Parallelism Marco Serafini

Parallelism Marco Serafini Parallelism Marco Serafini COMPSCI 590S Lecture 3 Announcements Reviews First paper posted on website Review due by this Wednesday 11 PM (hard deadline) Data Science Career Mixer (save the date!) November

Digital Design Methodology

Digital Design Methodology Digital Design Methodology Prof. Soo-Ik Chae Digital System Designs and Practices Using Verilog HDL and FPGAs @ 2008, John Wiley 1-1 Digital Design Methodology (Added) Design Methodology Design Specification

DEVELOPMENT AND VERIFICATION OF AHB2APB BRIDGE PROTOCOL USING UVM TECHNIQUE

DEVELOPMENT AND VERIFICATION OF AHB2APB BRIDGE PROTOCOL USING UVM TECHNIQUE DEVELOPMENT AND VERIFICATION OF AHB2APB BRIDGE PROTOCOL USING UVM TECHNIQUE N.G.N.PRASAD Assistant Professor K.I.E.T College, Korangi Abstract: The AMBA AHB is for high-performance, high clock frequency

HW Trends and Architectures

HW Trends and Architectures Pavel Tvrdík, Jiří Kašpar (ČVUT FIT) HW Trends and Architectures MI-POA, 2011, Lecture 1 1/29 HW Trends and Architectures prof. Ing. Pavel Tvrdík CSc. Ing. Jiří Kašpar Department of Computer Systems Faculty

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004

Gigascale Integration Design Challenges & Opportunities. Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Gigascale Integration Design Challenges & Opportunities Shekhar Borkar Circuit Research, Intel Labs October 24, 2004 Outline CMOS technology challenges Technology, circuit and μarchitecture solutions Integration

Computer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley

Computer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley Computer Systems Architecture I CSE 560M Lecture 19 Prof. Patrick Crowley Plan for Today Announcement No lecture next Wednesday (Thanksgiving holiday) Take Home Final Exam Available Dec 7 Due via email

Design Verification Lecture 01

Design Verification Lecture 01 M. Hsiao 1 Design Verification Lecture 01 Course Title: Verification of Digital Systems Professor: Michael Hsiao (355 Durham) Prerequisites: Digital Logic Design, C/C++ Programming, Data Structures, Computer

Design and Technology Trends

Design and Technology Trends Lecture 1 Design and Technology Trends R. Saleh Dept. of ECE University of British Columbia res@ece.ubc.ca 1 Recently Designed Chips Itanium chip (Intel), 2B tx, 700 mm², 8 layer 65nm CMOS (4 processors)

Embedded Systems: Hardware Components (part II) Todor Stefanov

Embedded Systems: Hardware Components (part II) Todor Stefanov Embedded Systems: Hardware Components (part II) Todor Stefanov Leiden Embedded Research Center, Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Outline Generic Embedded

Design Methodologies. Kai Huang

Design Methodologies. Kai Huang Design Methodologies Kai Huang News Is that real? In such a thermally constrained environment, going quad-core only makes sense if you can properly power gate/turbo up when some cores are idle. I have

Today. SMP architecture. SMP architecture. Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming ( )

Today. SMP architecture. SMP architecture. Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming ( ) Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Systems Group Department of Computer Science ETH Zürich SMP architecture

Fundamentals of Quantitative Design and Analysis

Fundamentals of Quantitative Design and Analysis Fundamentals of Quantitative Design and Analysis Dr. Jiang Li Adapted from the slides provided by the authors Computer Technology Performance improvements: Improvements in semiconductor technology Feature

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES

COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES COMPUTING ELEMENT EVOLUTION AND ITS IMPACT ON SIMULATION CODES P(ND) 2-2 2014 Guillaume Colin de Verdière OCTOBER 14TH, 2014 P(ND)^2-2 PAGE 1 CEA, DAM, DIF, F-91297 Arpajon, France October 14th, 2014 Abstract:

Introduction to System-on-Chip

Introduction to System-on-Chip Introduction to System-on-Chip COE838: Systems-on-Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

Networks for Multi-core Chips: A Contrarian View. Shekhar Borkar Aug 27, 2007 Intel Corp.

Networks for Multi-core Chips: A Contrarian View. Shekhar Borkar Aug 27, 2007 Intel Corp. Networks for Multi-core Chips: A Contrarian View Shekhar Borkar Aug 27, 2007 Intel Corp. 1 Outline Multi-core system outlook On die network challenges A simple contrarian proposal Benefits Summary 2 A Sample

BREAKING THE MEMORY WALL

BREAKING THE MEMORY WALL BREAKING THE MEMORY WALL CS433 Fall 2015 Dimitrios Skarlatos OUTLINE Introduction Current Trends in Computer Architecture 3D Die Stacking The memory Wall Conclusion INTRODUCTION Ideal Scaling of power

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 The Use Of Virtual Platforms In MP-SoC Design Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 1 MPSoC Is MP SoC design happening? Why? Consumer Electronics Complexity Cost of ASIC Increased SW Content

Design methodology for multi processor systems design on regular platforms

Design methodology for multi processor systems design on regular platforms Design methodology for multi processor systems design on regular platforms Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline

High-Level Simulations of On-Chip Networks

High-Level Simulations of On-Chip Networks High-Level Simulations of On-Chip Networks Claas Cornelius, Frank Sill, Dirk Timmermann 9th EUROMICRO Conference on Digital System Design (DSD) - Architectures, Methods and Tools - University of Rostock

Interconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp

Interconnect Challenges in a Many Core Compute Environment. Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Interconnect Challenges in a Many Core Compute Environment Jerry Bautista, PhD Gen Mgr, New Business Initiatives Intel, Tech and Manuf Grp Agenda Microprocessor general trends Implications Tradeoffs Summary

Reference. T1 Architecture. T1 ( Niagara ) Case Study of a Multi-core, Multithreaded

Reference. T1 Architecture. T1 ( Niagara ) Case Study of a Multi-core, Multithreaded Reference Case Study of a Multi-core, Multithreaded Processor The Sun T1 (Niagara) Computer Architecture, A Quantitative Approach, Fourth Edition, by John Hennessy and David Patterson, chapter.

Getting to Work with OpenPiton. Princeton University. OpenPit

Getting to Work with OpenPiton. Princeton University.   OpenPit Getting to Work with OpenPiton Princeton University http://openpiton.org OpenPit Princeton Parallel Research Group Redesigning the Data Center of the Future Chip Architecture Operating Systems and Runtimes

Power dissipation! The VLSI Interconnect Challenge. Interconnect is the crux of the problem. Interconnect is the crux of the problem.

Power dissipation! The VLSI Interconnect Challenge. Interconnect is the crux of the problem. Interconnect is the crux of the problem. The VLSI Interconnect Challenge Avinoam Kolodny Electrical Engineering Department Technion Israel Institute of Technology VLSI Challenges System complexity Performance Tolerance to digital noise and faults

System Level Design with IBM PowerPC Models

System Level Design with IBM PowerPC Models September 2005 System Level Design with IBM PowerPC Models A view of system level design SLE-m3 The System-Level Challenges Verification escapes cost design success There is a 45% chance of committing

Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems

Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems 1 Swizzle Switch: A Self-Arbitrating High-Radix Crossbar for NoC Systems Ronald Dreslinski, Korey Sewell, Thomas Manville, Sudhir Satpathy, Nathaniel Pinckney, Geoff Blake, Michael Cieslak, Reetuparna

Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor.

Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. CS 320 Ch. 18 Multicore Computers Multicore computer: Combines two or more processors (cores) on a single die. Also called a chip-multiprocessor. Definitions: Hyper-threading Intel's proprietary simultaneous

Digital Design Methodology (Revisited) Design Methodology: Big Picture

Digital Design Methodology (Revisited) Design Methodology: Big Picture Digital Design Methodology (Revisited) Design Methodology Design Specification Verification Synthesis Technology Options Full Custom VLSI Standard Cell ASIC FPGA CS 150 Fall 2005 - Lec #25 Design Methodology

Embedded Systems 1: Hardware Description Languages (HDLs) for Verification

Embedded Systems 1: Hardware Description Languages (HDLs) for Verification November 2017 Embedded Systems 1: Hardware Description Languages (HDLs) for Verification Davide Zoni PhD email: davide.zoni@polimi.it webpage: home.deib.polimi.it/zoni Outline 2 How to test an RTL design

VERIFICATION OF RISC-V PROCESSOR USING UVM TESTBENCH

VERIFICATION OF RISC-V PROCESSOR USING UVM TESTBENCH VERIFICATION OF RISC-V PROCESSOR USING UVM TESTBENCH Chevella Anilkumar 1, K Venkateswarlu 2 1.2 ECE Department, JNTU HYDERABAD(INDIA) ABSTRACT RISC-V (pronounced "risk-five") is a new, open, and completely

Copyright 2012, Elsevier Inc. All rights reserved.

Copyright 2012, Elsevier Inc. All rights reserved. Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis 1 Computer Technology Performance improvements: Improvements in semiconductor technology

Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces

Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces Li Chen, Staff AE Cadence China Agenda Performance Challenges Current Approaches Traffic Profiles Intro Traffic Profiles Implementation

ADVANCED COMPUTER ARCHITECTURES

ADVANCED COMPUTER ARCHITECTURES 088949 ADVANCED COMPUTER ARCHITECTURES AA 2014/2015 Second Semester http://home.deib.polimi.it/silvano/aca-milano.htm Prof. Cristina Silvano email: cristina.silvano@polimi.it Dipartimento di Elettronica,

ECE 486/586. Computer Architecture. Lecture # 2

ECE 486/586. Computer Architecture. Lecture # 2 ECE 486/586 Computer Architecture Lecture # 2 Spring 2015 Portland State University Recap of Last Lecture Old view of computer architecture: Instruction Set Architecture (ISA) design Real computer architecture:

L'évolution des architectures et des technologies d'intégration des circuits intégrés dans les Data centers

L'évolution des architectures et des technologies d'intégration des circuits intégrés dans les Data centers INSTITUT DE RECHERCHE TECHNOLOGIQUE L'évolution des architectures et des technologies d'intégration des circuits intégrés dans les Data centers 10/04/2017 Les Rendez-vous de

INF5063: Programming heterogeneous multi-core processors Introduction

INF5063: Programming heterogeneous multi-core processors Introduction INF5063: Programming heterogeneous multi-core processors Introduction Håkon Kvale Stensland August 19th, 2012 INF5063 Overview Course topic and scope Background for the use and parallel processing using

EECS4201 Computer Architecture

EECS4201 Computer Architecture Computer Architecture A Quantitative Approach, Fifth Edition Chapter 1 Fundamentals of Quantitative Design and Analysis These slides are based on the slides provided by the publisher. The slides will be

Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, Dennis Abts Sr. Principal Engineer

Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, Dennis Abts Sr. Principal Engineer Challenges for Future Interconnection Networks Hot Interconnects Panel August 24, 2006 Sr. Principal Engineer Panel Questions How do we build scalable networks that balance power, reliability and performance

Meet in the Middle: Leveraging Optical Interconnection Opportunities in Chip Multi Processors

Meet in the Middle: Leveraging Optical Interconnection Opportunities in Chip Multi Processors Meet in the Middle: Leveraging Optical Interconnection Opportunities in Chip Multi Processors Sandro Bartolini* Department of Information Engineering, University of Siena, Italy bartolini@dii.unisi.it

NoC Round Table / ESA Sep Asynchronous Three Dimensional Networks on Chip. Abbas Sheibanyrad

NoC Round Table / ESA Sep Asynchronous Three Dimensional Networks on Chip. Abbas Sheibanyrad NoC Round Table / ESA Sep. 2009 Asynchronous Three Dimensional Networks on Chip Frédéric Pétrot Outline Three Dimensional Integration Clock Distribution and GALS Paradigm Contribution of the Third

Multiprocessors & Thread Level Parallelism

Multiprocessors & Thread Level Parallelism Multiprocessors & Thread Level Parallelism COE 403 Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Presentation Outline Introduction

ECE 485/585 Microprocessor System Design

ECE 485/585 Microprocessor System Design Microprocessor System Design Lecture 4: Memory Hierarchy Memory Taxonomy SRAM Basics Memory Organization DRAM Basics Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering

Hardware/Software Co-design

Hardware/Software Co-design Hardware/Software Co-design Zebo Peng, Department of Computer and Information Science (IDA) Linköping University Course page: http://www.ida.liu.se/~petel/codesign/ 1 of 52 Lecture 1/2: Outline : an Introduction
