On mapping to multi/manycores


1 On mapping to multi/manycores
Jeronimo Castrillon
Chair for Compiler Construction (CCC), TU Dresden, Germany
MULTIPROG, HiPEAC Conference, Stockholm

2 Mapping for dataflow programming models
[Figure: platform block diagrams (MEM subsystem; DMAs, semaphores; PMU; L2; VLIW DSP, L2; NoC; peripherals; communication support; HW queues; network processor; packet DMA)]
Dataflow: a successful abstraction across different domains
Embedded: resource and real-time constraints, heterogeneous platforms
Traditionally static models (to reduce run-time overhead), but dynamic models are needed; these require profiling information
Prof. J. Castrillon. HiPEAC, 2017

3 Mapping processes/actors & channels (well studied)
Trace-based: analyze unrolled executions to compute compile-time mappings [Castrill13, Castrill14]
Use meta-heuristics (popular: genetic algorithms) [Erbas06, Thiele07]
Need for:
Better architecture models (standardization?)
Convergence of dataflow models (common IR?)
Compiler/run-time interaction
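The meta-heuristic approach named on this slide can be illustrated with a minimal sketch. It is not the method of [Erbas06] or [Thiele07]: the actor names, compute costs, channel traffic, core count and cost function below are all hypothetical, chosen only to show the shape of a genetic search over actor-to-core mappings.

```python
import random

# Hypothetical toy application: actor -> compute cost, channel -> traffic.
ACTORS = {"src": 2, "fft": 8, "eq": 6, "demap": 4, "sink": 1}
CHANNELS = {("src", "fft"): 3, ("fft", "eq"): 3, ("eq", "demap"): 2, ("demap", "sink"): 1}
NUM_CORES = 3  # assumed homogeneous cores, for simplicity


def cost(mapping):
    """Load of the busiest core plus traffic of channels that cross cores."""
    load = [0] * NUM_CORES
    for actor, core in mapping.items():
        load[core] += ACTORS[actor]
    comm = sum(t for (a, b), t in CHANNELS.items() if mapping[a] != mapping[b])
    return max(load) + comm


def evolve(generations=50, pop_size=20, seed=0):
    """Tiny genetic algorithm: elitist selection, uniform crossover, mutation."""
    rng = random.Random(seed)
    names = list(ACTORS)
    pop = [{a: rng.randrange(NUM_CORES) for a in names} for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover: each actor inherits its core from one parent.
            child = {a: (p1 if rng.random() < 0.5 else p2)[a] for a in names}
            # Mutation: reassign one randomly chosen actor to a random core.
            child[rng.choice(names)] = rng.randrange(NUM_CORES)
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)


best = evolve()
print(best, cost(best))
```

A trace-based flow would replace the static `cost` function with an estimate obtained by replaying recorded execution traces on an architecture model; the search loop itself would stay the same.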

4 From dataflow to data life-time
[Figure: platform block diagrams (MEM subsystem; DMAs, semaphores; PMU; L2; VLIW DSP, L2; NoC; peripherals; communication support; HW queues; network processor; packet DMA)]
Buffer life-times
Large applications with multiple constraints (e.g., communication standard)
Memory reuse is extremely important
ILP formulation for the temporal mapping of logical buffers to memories under bandwidth and memory-size constraints
Benchmark: 1 ms of LTE execution (1000 logical buffers) mapped onto an 80-core machine [Goens16]
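To make the buffer life-time idea concrete, here is a minimal sketch that replaces the ILP of [Goens16] with exhaustive search over a toy instance. The buffers, memories, capacities and access costs are hypothetical, and bandwidth constraints are left out. The point it shows is that two buffers can share a memory when their life-times do not overlap.

```python
from itertools import product

# Hypothetical logical buffers: name -> (start, end, size, accesses).
# A buffer is alive in the half-open interval [start, end).
BUFFERS = {
    "rx":   (0, 4, 4, 10),
    "fft":  (2, 6, 4, 20),
    "chan": (5, 9, 4, 5),
    "out":  (7, 10, 2, 8),
}
# Hypothetical memories: name -> (capacity, cost per access).
MEMORIES = {"L1": (4, 1), "L2": (8, 4)}


def feasible(assign):
    """True if live buffers never exceed the capacity of any memory."""
    # Occupancy only rises when a buffer starts, so checking start times suffices.
    starts = sorted({b[0] for b in BUFFERS.values()})
    for t in starts:
        for mem, (capacity, _) in MEMORIES.items():
            used = sum(size for name, (s, e, size, _) in BUFFERS.items()
                       if assign[name] == mem and s <= t < e)
            if used > capacity:
                return False
    return True


def access_cost(assign):
    """Total access cost: accesses times the per-access cost of the chosen memory."""
    return sum(acc * MEMORIES[assign[name]][1]
               for name, (_, _, _, acc) in BUFFERS.items())


def best_allocation():
    """Enumerate every buffer-to-memory assignment and keep the cheapest feasible one."""
    names = list(BUFFERS)
    candidates = (dict(zip(names, choice))
                  for choice in product(MEMORIES, repeat=len(names)))
    return min((a for a in candidates if feasible(a)), key=access_cost)


alloc = best_allocation()
print(alloc, access_cost(alloc))
```

In this instance `fft` and `out` both fit into the small `L1` because their life-times are disjoint. Exhaustive search is only viable for a handful of buffers; at the scale quoted on the slide (1000 logical buffers) an ILP solver or a heuristic would be needed.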

5 Mapping data to data structures
Changing graphs (e.g., in social networks)
With some changes, better data-representations become more efficient
Profiling and monitoring [Schiller16]

6 Mapping data to data structures (continued)
Result: x speedup (averaged over seven different graph metrics; molecular dynamics benchmark) [Schiller16]
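The profiling-driven selection of data structures can be sketched as follows. This is not the approach of [Schiller16]: the cost table, the class `ProfiledGraph` and its methods are hypothetical, and real per-operation costs would have to come from measurements.

```python
from collections import Counter

# Hypothetical per-operation cost estimates (arbitrary units) for two
# adjacency representations.
COSTS = {
    "list": {"add": 1, "contains": 20, "iterate": 1},
    "set":  {"add": 3, "contains": 1,  "iterate": 2},
}


def choose_structure(profile):
    """Pick the representation with the lowest estimated total cost."""
    return min(COSTS, key=lambda ds: sum(COSTS[ds][op] * n for op, n in profile.items()))


class ProfiledGraph:
    """Adjacency store that counts operations and can switch representation."""

    def __init__(self):
        self.kind = "list"
        self.adj = {}
        self.profile = Counter()

    def add_edge(self, u, v):
        self.profile["add"] += 1
        neighbours = self.adj.setdefault(u, [] if self.kind == "list" else set())
        if self.kind == "list":
            neighbours.append(v)  # note: the list variant does not deduplicate edges
        else:
            neighbours.add(v)

    def has_edge(self, u, v):
        self.profile["contains"] += 1
        return v in self.adj.get(u, ())

    def adapt(self):
        """Re-select the representation from the profile collected so far."""
        target = choose_structure(self.profile)
        if target != self.kind:
            convert = list if target == "list" else set
            self.adj = {u: convert(ns) for u, ns in self.adj.items()}
            self.kind = target
        self.profile.clear()


g = ProfiledGraph()
for i in range(10):
    g.add_edge(0, i)
for i in range(100):
    g.has_edge(0, i)  # a lookup-heavy phase
g.adapt()
print(g.kind)
```

After the lookup-heavy phase the estimated cost of the list representation exceeds that of the set representation, so `adapt` converts the stored neighbours. A compile-time variant would make the same choice once, from a profile gathered in an earlier run.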

7 More mapping to come
HAEC project: wireless and optical system interconnect [Fettweis12]
Map/configure the communication architecture
cfaed Orchestration project on post-CMOS technologies [Voelp16]
Mapping data to heterogeneous memories (reliability, retention, ...)
Map computation to truly heterogeneous resources

8 Summary
Described several mapping problem formulations:
Dataflow to heterogeneous multi-cores
Buffer lifetimes to interconnect and memories
Data to data structures
Outlook: interesting mapping problems coming up (e.g., in HAEC and cfaed)

9 References
[Castrill13] J. Castrillon, R. Leupers, and G. Ascheid, "MAPS: Mapping concurrent dataflow applications to heterogeneous MPSoCs," IEEE Transactions on Industrial Informatics, vol. 9, no. 1, 2013.
[Castrill14] J. Castrillon and R. Leupers, Programming Heterogeneous MPSoCs: Tool Flows to Close the Software Productivity Gap. Springer, 2014.
[Erbas06] C. Erbas et al., "Multiobjective Optimization and Evolutionary Algorithms for the Application Mapping Problem in Multiprocessor System-on-Chip Design," IEEE Transactions on Evolutionary Computation, vol. 10, 2006.
[Fettweis12] G. Fettweis, W. Nagel, and W. Lehner, "Pathways to servers of the future: highly adaptive energy efficient computing (HAEC)," Proceedings of the Conference on Design, Automation and Test in Europe, 2012.
[Goens16] A. Goens et al., "An Optimal Allocation of Memory Buffers for Complex Multicore Platforms," Journal of Systems Architecture, Elsevier, vol. 66-67, 2016.
[Schiller16] B. Schiller et al., "Compile- and Run-time Approaches for the Selection of Efficient Data Structures for Dynamic Graph Analysis," Journal of Applied Network Science, vol. 1, pp. 1-22, 2016.
[Thiele07] L. Thiele et al., "Mapping Applications to Tiled Multiprocessor Embedded Systems," Proceedings of the Seventh International Conference on Application of Concurrency to System Design, IEEE Computer Society, 2007.
[Voelp16] M. Voelp et al., "The Orchestration Stack: The Impossible Task of Designing Software for Unknown Future Post-CMOS Hardware," Proceedings of the 1st International Workshop on Post-Moore's Era Supercomputing (PMES), co-located with the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16).


More information

Dirk Tetzlaff Technical University of Berlin

Dirk Tetzlaff Technical University of Berlin Software Engineering g for Embedded Systems Intelligent Task Mapping for MPSoCs using Machine Learning Dirk Tetzlaff Technical University of Berlin 3rd Workshop on Mapping of Applications to MPSoCs June

More information

Dr. Yassine Hariri CMC Microsystems

Dr. Yassine Hariri CMC Microsystems Dr. Yassine Hariri Hariri@cmc.ca CMC Microsystems 03-26-2013 Agenda MCES Workshop Agenda and Topics Canada s National Design Network and CMC Microsystems Processor Eras: Background and History Single core

More information

Compiler Optimizations and Auto-tuning. Amir H. Ashouri Politecnico Di Milano -2014

Compiler Optimizations and Auto-tuning. Amir H. Ashouri Politecnico Di Milano -2014 Compiler Optimizations and Auto-tuning Amir H. Ashouri Politecnico Di Milano -2014 Compilation Compilation = Translation One piece of code has : Around 10 ^ 80 different translations Different platforms

More information

Analyzing Methodologies of Irregular NoC Topology Synthesis

Analyzing Methodologies of Irregular NoC Topology Synthesis Analyzing Methodologies of Irregular NoC Topology Synthesis Naveen Choudhary Dharm Singh Surbhi Jain ABSTRACT Network-On-Chip (NoC) provides a structured way of realizing communication for System on Chip

More information

ENERGY-EFFICIENT NOC FOR BEST-EFFORT COMMUNICATION

ENERGY-EFFICIENT NOC FOR BEST-EFFORT COMMUNICATION ENERGY-EFFICIENT NOC FOR BEST-EFFORT COMMUNICATION Pascal T. Wolkotte, Gerard J.M. Smit University of Twente, Department of EEMCS P.O. Box 217, 7500 AE Enschede The Netherlands E-mail: P.T.Wolkotte@utwente.nl

More information

Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors. Overview

Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors. Overview Multi-Level Cache Hierarchy Evaluation for Programmable Media Processors Jason Fritts Assistant Professor Department of Computer Science Co-Author: Prof. Wayne Wolf Overview Why Programmable Media Processors?

More information

Why do we care about parallel?

Why do we care about parallel? Threads 11/15/16 CS31 teaches you How a computer runs a program. How the hardware performs computations How the compiler translates your code How the operating system connects hardware and software The

More information

Resource-Efficient Scheduling for Partially-Reconfigurable FPGAbased

Resource-Efficient Scheduling for Partially-Reconfigurable FPGAbased Resource-Efficient Scheduling for Partially-Reconfigurable FPGAbased Systems Andrea Purgato: andrea.purgato@mail.polimi.it Davide Tantillo: davide.tantillo@mail.polimi.it Marco Rabozzi: marco.rabozzi@polimi.it

More information

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton Hardware/Software Partitioning for SoCs EECE 579 - Advanced Topics in VLSI Design Spring 2009 Brad Quinton Goals of this Lecture Automatic hardware/software partitioning is big topic... In this lecture,

More information

MPSoC Design Space Exploration Framework

MPSoC Design Space Exploration Framework MPSoC Design Space Exploration Framework Gerd Ascheid RWTH Aachen University, Germany Outline Motivation: MPSoC requirements in wireless and multimedia MPSoC design space exploration framework Summary

More information

MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS

MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS MEMORY/RESOURCE MANAGEMENT IN MULTICORE SYSTEMS INSTRUCTOR: Dr. MUHAMMAD SHAABAN PRESENTED BY: MOHIT SATHAWANE AKSHAY YEMBARWAR WHAT IS MULTICORE SYSTEMS? Multi-core processor architecture means placing

More information

Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling

Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling Morteza Damavandpeyma 1, Sander Stuijk 1, Twan Basten 1,2, Marc Geilen 1 and Henk Corporaal 1 1 Department of Electrical Engineering,

More information

SDSoC: Session 1

SDSoC: Session 1 SDSoC: Session 1 ADAM@ADIUVOENGINEERING.COM What is SDSoC SDSoC is a system optimising compiler which allows us to optimise Zynq PS / PL Zynq MPSoC PS / PL MicroBlaze What does this mean? Following the

More information

Mapping and Physical Planning of Networks-on-Chip Architectures with Quality-of-Service Guarantees

Mapping and Physical Planning of Networks-on-Chip Architectures with Quality-of-Service Guarantees Mapping and Physical Planning of Networks-on-Chip Architectures with Quality-of-Service Guarantees Srinivasan Murali 1, Prof. Luca Benini 2, Prof. Giovanni De Micheli 1 1 Computer Systems Lab, Stanford

More information

NED: A Novel Synthetic Traffic Pattern for Power/Performance Analysis of Network-on-chips Using Negative Exponential Distribution

NED: A Novel Synthetic Traffic Pattern for Power/Performance Analysis of Network-on-chips Using Negative Exponential Distribution To appear in Int l Journal of Low Power Electronics, American Scientific Publishers, 2009 NED: A Novel Synthetic Traffic Pattern for Power/Performance Analysis of Network-on-chips Using Negative Exponential

More information