PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming
1 PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming. Maxime Pelcat, Karol Desnos, Julien Heulot, Clément Guy, Jean-François Nezan, Slaheddine Aridhi. EDERC 2014 Conference, Milan, September 11th.
2-5 Software Productivity Gap (chart built up over four slides). Transistors/chip: x2 every 18 months. Lines of code/chip: x3.5 every 18 months. Lines of code/day: +25% every 18 months. Source: Hardware-dependent Software, Ecker et al.
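The growth rates quoted on this slide can be compounded to see how quickly the gap widens. The following is a minimal sketch, assuming each rate compounds once per 18-month period and using a hypothetical effort metric (lines of code needed per chip divided by lines of code written per day) that is an illustration and not taken from the slides:

```python
# Growth factors per 18-month period, as quoted on the slide.
LOC_PER_CHIP = 3.5   # lines of code per chip: x3.5
LOC_PER_DAY = 1.25   # lines of code per day: +25%


def compounded(factor: float, periods: int) -> float:
    """Growth after `periods` consecutive 18-month periods."""
    return factor ** periods


def effort_growth(periods: int) -> float:
    """Hypothetical metric: code needed per chip / code written per day."""
    return compounded(LOC_PER_CHIP, periods) / compounded(LOC_PER_DAY, periods)


# Print how the hypothetical effort metric grows over six years.
for p in range(1, 5):
    print(f"{p * 18:3d} months: x{effort_growth(p):.2f}")
```

Under these assumptions the effort per chip grows by a factor of 3.5 / 1.25 = 2.8 every 18 months, which is the gap the rest of the talk sets out to reduce.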
6 Typical Single DSP Environment. INSTITUT D'ÉLECTRONIQUE ET DE TÉLÉCOMMUNICATIONS DE RENNES. Diagram labels: C/C++ algorithm code; command-line options; compiler; program; simulator + debugger + profiler; OS; core(s).
7 Multicore DSP Rapid Prototyping. Diagram labels: functional algorithm model + code; deployment constraints + options; architecture model; rapid prototyping; four programs; simulator + debugger + profiler; OS on core 1; OS on core 2.
8 Reduce Software Productivity Gap. In early design phases: metrics; design parallel algorithms; automatic mapping and scheduling; predictable time and memory; choose the right algorithm and hardware.
9 Reduce Software Productivity Gap. In late design phases: rapid prototyping; automatic multi-core speedup; inter-core communication; guaranteed deadlock-freeness.
10 Reduce Software Productivity Gap. For migration to new hardware: seamless porting to a new architecture; legacy code reusability; portable performance. Dataflow modelling can help.
11 PREESM for C6678. Diagram labels: algo dataflow + C code; scenario; archi model; PREESM; multiple C programs (four shown); PREESM simulator + CCS debugger and profiler; SYS/BIOS on C66 cores; C6678.
12-15 Algo dataflow: PiSDF. Diagram (built up over four slides): actors Read, Filter and Display, each with C code attached, connected by edges whose rates are written with the parameter Size (plus a constant rate 1). Filter is refined by a subgraph with interfaces in, out, feed and back, a parameter N, and a Kernel actor (C code) whose ports use the rate Size/N. Reference: K. Desnos, M. Pelcat, J.-F. Nezan, S. S. Bhattacharyya, S. Aridhi, "PiMM: Parameterized and Interfaced Dataflow Meta-Model for MPSoCs Runtime Reconfiguration", SAMOS XIII.
16 Algo dataflow: PiSDF. The PiSDF MoC is: hierarchical and compositional; statically parameterizable; dynamically reconfigurable. PiSDF fosters: predictability; parallelism; lightweight runtime overhead; developer-friendliness.
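The predictability claimed for PiSDF comes from the fact that, once parameters such as Size and N are fixed, firing counts follow from the SDF balance equations (tokens produced on an edge must equal tokens consumed). A minimal sketch of that computation is given below. It is not PREESM code: the function name `repetition_vector`, the flattened three-actor graph and the values Size = 1024, N = 4 are all assumptions chosen for illustration, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve q[src] * prod == q[dst] * cons for a connected SDF graph.

    edges: list of (src, prod_rate, dst, cons_rate).
    Returns the smallest positive integer firing count per actor.
    """
    rate = {actors[0]: Fraction(1)}  # fix one actor, propagate to the others
    pending = list(edges)
    progress = True
    while pending and progress:
        progress = False
        for edge in list(pending):
            src, prod, dst, cons = edge
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent rates: no valid schedule")
            else:
                continue  # neither end reached yet, retry on a later pass
            pending.remove(edge)
            progress = True
    if pending or len(rate) != len(actors):
        raise ValueError("graph is not connected")
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(rate[a] * scale) for a in actors}


# Assumed parameter values and a flattened version of the slide's graph.
SIZE, N = 1024, 4
graph = [
    ("Read", SIZE, "Kernel", SIZE // N),
    ("Kernel", SIZE // N, "Display", SIZE),
]
print(repetition_vector(["Read", "Kernel", "Display"], graph))
```

With these assumed values Kernel fires N times for each firing of Read and Display, which is the data parallelism a scheduler can then distribute over cores.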
17 Archi: System-Level Archi. Model. Contentions are represented as TDMA. Diagram: TMS320C6678 with core1 to core8; MSMC at 16 GB/s; DDR3 at 5.3 GB/s.
18-20 PREESM: Multicore Scheduling. Scheduling based on latency and load balancing. Diagram (built up over three slides): schedule over core1, core2, core3 and core4.
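The slides do not say which scheduling algorithm PREESM uses. As a generic illustration of mapping a dataflow graph onto cores while balancing load and keeping latency low, here is a minimal greedy list-scheduling sketch. The function `list_schedule`, the actor durations and the dependency graph are hypothetical, cores are assumed identical, and communication costs are ignored.

```python
def list_schedule(durations, deps, n_cores):
    """Greedy list scheduling of a DAG of actors onto identical cores.

    durations: {actor: execution time}; deps: {actor: [predecessors]}.
    Returns {actor: (core, start, end)}.
    """
    schedule = {}
    core_free = [0] * n_cores  # time at which each core becomes idle
    remaining = set(durations)
    while remaining:
        ready = [a for a in remaining
                 if all(p in schedule for p in deps.get(a, []))]
        if not ready:
            raise ValueError("dependency cycle")
        # Priority: longest actor first, name as a deterministic tie-break.
        actor = min(ready, key=lambda a: (-durations[a], a))
        data_ready = max((schedule[p][2] for p in deps.get(actor, [])), default=0)
        # Pick the core on which the actor can start earliest.
        core = min(range(n_cores), key=lambda c: (max(core_free[c], data_ready), c))
        start = max(core_free[core], data_ready)
        end = start + durations[actor]
        schedule[actor] = (core, start, end)
        core_free[core] = end
        remaining.remove(actor)
    return schedule


# Hypothetical five-actor graph: A feeds B and C, B feeds D, C feeds E.
durations = {"A": 2, "B": 3, "C": 4, "D": 2, "E": 1}
deps = {"B": ["A"], "C": ["A"], "D": ["B"], "E": ["C"]}
result = list_schedule(durations, deps, n_cores=2)
for name in sorted(result):
    print(name, result[name])
print("latency:", max(end for _, _, end in result.values()))
```

The printed latency is the end time of the last actor; rerunning with a different `n_cores` gives a quick feel for how the schedule length reacts to the number of cores.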
21 PREESM: Memory Bounds. Bounding the memory needs of an application graph to: evaluate the memory requirements; adjust the size of architecture memory; assess the optimality of a memory allocation. Diagram: a memory axis running from 0 through the lower bound and the upper bound to the available memory; below the lower bound is insufficient memory, between the bounds is possible allocated memory, above the upper bound is wasted memory.
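The slide does not state how the bounds are computed. One simple way to picture them, assuming a fixed schedule in which every buffer has a known lifetime, is sketched below: the upper bound gives each buffer its own space, and the lower bound is the largest total size of buffers alive at the same time. The functions `memory_bounds` and `classify`, and the buffer list, are hypothetical illustrations rather than PREESM's method.

```python
def memory_bounds(buffers):
    """buffers: list of (size, first_use, last_use), time steps inclusive.

    Returns (lower, upper):
      upper = sum of all sizes (no memory reuse at all),
      lower = peak total size of simultaneously live buffers.
    """
    upper = sum(size for size, _, _ in buffers)
    # The peak of overlapping intervals is always reached at some start time.
    starts = {first for _, first, _ in buffers}
    lower = max(
        (sum(s for s, a, b in buffers if a <= t <= b) for t in starts),
        default=0,
    )
    return lower, upper


def classify(allocated, lower, upper):
    """Place an allocation size in one of the three regions of the slide."""
    if allocated < lower:
        return "insufficient"
    if allocated <= upper:
        return "possible"
    return "wasted"


# Hypothetical buffers: (size in bytes, first use, last use).
buffers = [(100, 0, 2), (50, 1, 3), (80, 3, 5)]
lower, upper = memory_bounds(buffers)
print("lower bound:", lower, "upper bound:", upper)
print(classify(200, lower, upper))
```

An allocator that reuses memory between buffers with disjoint lifetimes lands somewhere between the two values, and the distance to the lower bound indicates how close to optimal it is.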
22 PREESM: Prototype Code Generation. Diagram labels: a graph of actors A, B, C, D and E; labels o1 and o2; a time axis listing Actor A, Actor B and Actor D under o1, and Actor C and Actor E under o2.
23 PREESM Features. Open-source tool: available on GitHub. Research-oriented tool: new models, optimizations, scheduling. Eclipse-based integrated tool: several plug-ins, metamodels. Extended web tutorials.
24 Other Tools. OpenMP, OpenEM; adding rapid prototyping; MAPS Compiler, Polycore Polymapper, SynDEx; open-source code.
25 PREESM Features (figure only).
26 Some Results on Stereo Matching. Plot labels: theoretical speedup and measured performance versus number of cores; allocated memory and lower memory bound versus number of cores.
27 Conclusion. Reduce the software productivity gap: design space exploration; rapid prototyping; extract coarse-grain parallelism; portable performance. PREESM: dataflow modelling can help! Good decisions necessitate extensive information on both computation and data flow.
28 Thanks! M. Pelcat, K. Desnos, J. Heulot, C. Guy, J.-F. Nezan, S. Aridhi, "PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming", EDERC. PREESM Tutorial: 16:00 to 17:00, Room: Oro Plenaria. M. Pelcat, S. Aridhi, J. Piat, J.-F. Nezan, "Physical Layer Multicore Prototyping: A Dataflow-Based Approach for LTE eNodeB", Springer,