Efficient Application Mapping on CGRAs Based on Backward Simultaneous Scheduling / Binding and Dynamic Graph Transformations


1 Efficient Application Mapping on CGRAs Based on Backward Simultaneous Scheduling / Binding and Dynamic Graph Transformations
T. Peyret (1), G. Corre (1), M. Thevenin (1), K. Martin (2), P. Coussy (2)
(1) CEA, LIST, Electronic Architectures and Sensors Laboratory (LCAE), F Gif-sur-Yvette, France
(2) Université de Bretagne-Sud, Lab-STICC, Lorient, France
ASAP 2014 Conference

2 COARSE-GRAINED RECONFIGURABLE ARCHITECTURE (CGRA)
- Processing Elements (PE) / Tiles
  - Homogeneous/heterogeneous
  - Register Files (RF)
  - Operators
- Interconnection network
  - Mesh 1D, 2D, Torus, Segmented
- Example: 4×4 CGRA, torus 2D mesh, local RF
[Figure: 4×4 array of PEs. Each PE contains a functional unit (FU) and a register file (RF), with inputs from neighbours & memory and outputs to neighbours & memory.]
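A minimal sketch of the 4×4 torus 2D mesh example: it computes the neighbours of each PE. The function name `torus_neighbours` and the (row, col) coordinates are assumptions made for this illustration; they do not come from the presented tool.

```python
def torus_neighbours(row, col, rows=4, cols=4):
    """Return the four neighbours of PE (row, col) in a 2D torus mesh.

    Indices wrap around at the edges, which is what distinguishes a torus
    from a plain 2D mesh.
    """
    return [
        ((row - 1) % rows, col),  # north
        ((row + 1) % rows, col),  # south
        (row, (col - 1) % cols),  # west
        (row, (col + 1) % cols),  # east
    ]


# Adjacency list for the whole 4x4 array (hypothetical representation).
cgra = {(r, c): torus_neighbours(r, c) for r in range(4) for c in range(4)}
```

With this layout a corner PE such as (0, 0) still has four neighbours, because the links wrap around to row 3 and column 3.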

3 MAPPING ON CGRA
- Scheduling & binding are two NP-complete problems
- Separate resolution
  - Heuristic and meta-heuristic (e.g. EMS, VPR)
  - Heuristic and exact method (e.g. EPIMap, REGIMap)
- Merged resolution
  - Exact methods (e.g. ILP-based)
  - Meta-heuristic (e.g. DRESC)
- Purpose: a mapping flow that deeply explores the solution space for the entire application code

4 MAPPING FLOW
[Flowchart. Inputs (application & CGRA models): C Code -> Compilation -> CDFG, and CGRA Model. Mapping tool: Schedule & Binding of highest Priority Node; Solutions? (Yes/No); Graph Transformation; Changes? (Yes/No); Last Node? (Yes/No); Mapping Pruning. Output: List of Mappings.]

5 APPLICATION & CGRA MODELS
- Compilation: C -> Control Data Flow Graph (CDFG) with GCC
- The CDFG is composed of basic blocks and a control part
- Basic blocks are represented by Data Flow Graphs (DFG)
- New kind of node: memorization operation nodes

6 APPLICATION & CGRA MODELS
- Example of a 2-tile CGRA with RF
- Memorization operators are introduced to cope with the RF
[Figure: DFG and 2-tile CGRA model (tiles A and B, each with an RF) unrolled over cycles i, i+1 and i+2, with binding labels such as 1/A, 2/B, 3/B and 4/B.]

7 APPLICATION & CGRA MODELS
- Homomorphic CGRA and DFG models
- Memorization nodes: to keep data dependencies
- Equivalence between nodes:
  - Operators <-> Operations
  - Registers <-> Data
- Binding: finding the DFG in the CGRA model
[Figure: DFG bound onto the CGRA model over cycles i, i+1 and i+2, with binding labels such as 1/A, 2/B, 3/B and 4/B.]
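One possible way to picture the CGRA model from these two slides is a graph unrolled over cycles, with one operator node and one register node per tile and per cycle. The sketch below is an assumption about that structure, not the authors' data model; `time_extended_cgra`, the `("op", tile, cycle)` / `("rf", tile, cycle)` node encoding and the `links` parameter are all hypothetical.

```python
def time_extended_cgra(tiles, links, cycles):
    """Build a time-extended graph with operator and register nodes.

    tiles:  tile names, e.g. ["A", "B"]
    links:  set of (src_tile, dst_tile) pairs a register can feed
    cycles: number of cycles to unroll
    Nodes are ("op", tile, cycle) or ("rf", tile, cycle).
    """
    edges = set()
    for c in range(cycles):
        for t in tiles:
            # An operator writes its result into the RF of its own tile.
            edges.add((("op", t, c), ("rf", t, c)))
            if c + 1 < cycles:
                # Memorization: the value stays in the RF for one more cycle.
                edges.add((("rf", t, c), ("rf", t, c + 1)))
        if c + 1 < cycles:
            for src, dst in links:
                # A stored value can be read by an operator in the next cycle.
                edges.add((("rf", src, c), ("op", dst, c + 1)))
    return edges


# 2-tile CGRA where every tile can read both register files (assumed).
all_links = {("A", "A"), ("A", "B"), ("B", "A"), ("B", "B")}
model = time_extended_cgra(["A", "B"], all_links, cycles=3)
```

In such a model, binding a DFG (operations and data) amounts to finding it inside this graph (operators and registers), which is the equivalence listed on the slide.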

8 MAPPING FLOW
[Flowchart. C Code -> Compilation -> CDFG, and CGRA Model; Schedule & Binding of highest Priority Node; Solutions? (Yes/No); Graph Transformation; Changes? (Yes/No); Fail; Last Node? (Yes/No); Mapping Pruning; List of Mappings.]
- Application & CGRA models
- Simultaneous Scheduling and Binding
  - Binding method
  - Backward list-scheduling-based scheduling
- Formal graph transformations
- Pruning step

9 SIMULTANEOUS SCHEDULING/BINDING
- Purpose: check whether at least one binding solution exists for each node schedule
  - Avoids dead-ends due to the dependence between these two problems
  - Allows transforming the graph only when needed, and with the right transformation
- Based on Levi's algorithm
  - Solves the maximum subgraph problem for homomorphic graphs
- Incremental version
  - Relies on previously found partial bindings
  - Adds the newly scheduled node (and its data node) to the previously considered subgraph
  - Finds every possible partial mapping
  - Exhaustive method
- If there is no binding solution => a graph transformation is required
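The incremental idea above can be sketched as follows. This is a simplified illustration, not Levi's algorithm: a plain exhaustive enumeration stands in for it, and the names `extend_bindings`, `connected` and the `node -> (cycle, tile)` encoding are assumptions made for the example.

```python
def extend_bindings(partial_bindings, node, cycle, successors, tiles, connected):
    """Extend every partial binding with one newly scheduled node.

    partial_bindings: list of dicts  node -> (cycle, tile)
    successors:       already-bound nodes that consume the result of `node`
    connected:        function (tile_a, tile_b) -> bool, True if data can go
                      from tile_a to tile_b
    Returns all extended bindings; an empty list means that no binding
    exists for this schedule, i.e. a graph transformation is required.
    """
    extended = []
    for binding in partial_bindings:
        busy = {t for (c, t) in binding.values() if c == cycle}
        for tile in tiles:
            if tile in busy:
                continue  # one operation per tile and per cycle
            if all(connected(tile, binding[s][1]) for s in successors):
                extended.append({**binding, node: (cycle, tile)})
    return extended


# Toy run (assumed connectivity: a tile can only feed itself).
same_tile = lambda a, b: a == b
step1 = extend_bindings([{}], 4, 2, [], ["A", "B"], same_tile)
step2 = extend_bindings(step1, 3, 1, [4], ["A", "B"], same_tile)
step3 = extend_bindings(step2, 2, 1, [4], ["A", "B"], same_tile)
```

In this toy run nodes 3 and 2 both feed node 4 and are scheduled in the same cycle, so `step3` is empty: the only tile that can reach node 4 is already taken. That is the situation in which the flow would fall back on a graph transformation.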

10 GRAPH TRANSFORMATIONS
Three dynamic transformations are proposed:
- Operation splitting
- Simple routing
- Memorization splitting

11 PRUNING STEP
- Idea: remove mappings with the same operator utilization to limit the number of partial mappings
- Executed at the end of each scheduling cycle
- Removes redundant partial mappings
- Still exhaustive
- Example: on a 2-tile CGRA
[Figure: two partial mappings of the same DFG on a 2-tile CGRA (tiles A and B) over cycles i, i+1 and i+2.]
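A minimal sketch of such a pruning step, assuming that "same operator utilization" means that two partial mappings occupy exactly the same (cycle, tile) slots. That reading, the function name `prune` and the mapping encoding are assumptions; the slide does not give the exact criterion.

```python
def prune(partial_mappings):
    """Keep one partial mapping per operator-utilization signature.

    Each mapping is a dict  node -> (cycle, tile).  Two mappings that occupy
    exactly the same (cycle, tile) slots are treated as redundant here.
    """
    kept = {}
    for mapping in partial_mappings:
        signature = frozenset(mapping.values())
        kept.setdefault(signature, mapping)  # first one wins
    return list(kept.values())


m1 = {1: (0, "A"), 2: (0, "B")}
m2 = {1: (0, "B"), 2: (0, "A")}  # same slots as m1, nodes swapped
m3 = {1: (0, "A"), 2: (1, "A")}  # different slots
pruned = prune([m1, m2, m3])
```

Here `m2` is dropped because it uses the same slots as `m1`, while `m3` survives.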

12 EXPERIMENTS & RESULTS
Compared with two other methods:
- Method 1: a forward list scheduling with only the routing transformation, using Levi's algorithm to bind.
- Method 2: the heuristic described in EPIMap, which applies static a priori transformations (routing & splitting) to schedule, using Levi's algorithm to bind.
4 metrics:
- Success Rate
- Latency
- Exploration Quality
- Exploration Efficiency
9 application codes (FFT, DCT, ...)
16 constraint sets per code

13 EXPERIMENTS & RESULTS
Success Rate
[Bar chart: success rate (0 to 1) of Method 1, Method 2 and the Proposed Approach for DC Filter, DCT 2D, Elliptic Filter, EMA Filter, FFT, Manhattan Distance, Matrix Product, MWD Filter, Unsharp Mask, and Average.]
99% for Proposed Approach (vs 37% and 62%)

14 EXPERIMENTS & RESULTS
Best Latency Rate: percentage of time a mapping has the best latency
[Bar chart: best latency rate (0 to 1) of Method 1, Method 2 and the Proposed Approach for DC Filter, DCT 2D, Elliptic Filter, EMA Filter, FFT, Manhattan Distance, Matrix Product, MWD Filter, Unsharp Mask, and Average.]
90% for Proposed Approach (vs 31% and 42%)

15 CONCLUSION & PROSPECTS
- Mapping flow: C -> DFGs -> CGRA
  - Simultaneous scheduling / exhaustive-based binding
  - Dynamic graph transformations
- Very promising results
  - Success rate
  - Latency
  - Exploration quality and efficiency
- Future work
  - Improve pruning step
  - Improve scalability

16 Thank you for your attention
Commissariat à l'énergie atomique et aux énergies alternatives
Institut Carnot CEA LIST
Sensors And Electronic Architectures Laboratory
Centre de Saclay, bâtiment PC 72 | Gif-sur-Yvette Cedex
T. +33 (0)
Thomas.peyret@cea.fr
Public establishment of an industrial and commercial nature | RCS Paris B

17 INTRODUCTION
Performance vs Flexibility vs Design Cost
[Figure from: Raffin E., Déploiement d'applications multimédia sur architecture reconfigurable à gros grain : modélisation avec la programmation par contraintes, 2011]

18 INTRODUCTION
- Many architectures
  - Morphosys
  - DART
  - MORA
  - ADRES
  - Etc.
- Less automated compilation flow
  - Dedicated to an architecture
  - Not scalable (e.g. ILP-based)
  - Not versatile or with limitations (e.g. no RF or manual partitioning)
  - Only for kernel loop acceleration

19 SIMULTANEOUS SCHEDULING/BINDING
- Purpose: check whether at least one binding solution exists for each node schedule
  - Avoids dead-ends due to the dependence between these two problems
  - Allows transforming the graph only when needed, and with the right transformation
- Example: map this DFG on this CGRA
[Figure: example DFG and CGRA; contents not recoverable.]

20 SIMULTANEOUS SCHEDULING/BINDING
Schedule example:
[Table: Cycle / Operation; contents not recoverable.]

21 SIMULTANEOUS SCHEDULING/BINDING
Binding is impossible:
[Table: Cycle vs tiles A, B, C, D, E; contents not recoverable.]
? & 14 are conflicting on tile C

22 SIMULTANEOUS SCHEDULING/BINDING
Other example:
[Table: Cycle vs tiles A, B, C, D, E; contents not recoverable.]
? & 14 are conflicting on tile C

23 BACKWARD TRAVERSING
- Makes it possible to know whether a transformation is relevant, and which one
- Schedule and binding of successor nodes are already done
- So it is possible to know the real needs of the current node
- Example: Forward (non a priori transformations) vs Backward
[Tables: Cycle / Operations for the forward and backward schedules; contents not recoverable.]
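A minimal sketch of a backward list scheduling, in which a node is only placed once all of its successors are placed. It covers scheduling only (no binding, no transformations), and the alphabetical/numeric ordering of ready nodes is a stand-in for the priority function, which the slides do not specify. The name `backward_list_schedule` and its parameters are assumptions.

```python
def backward_list_schedule(successors, num_tiles):
    """Schedule a DFG from its outputs towards its inputs.

    successors: dict  node -> list of nodes consuming its result
    num_tiles:  maximum number of operations per cycle
    Returns a dict  node -> cycle, where cycle 0 is the LAST cycle and
    larger numbers are earlier cycles (distance from the end).
    """
    schedule = {}
    load = {}  # cycle -> number of operations already placed there
    remaining = set(successors)
    while remaining:
        # A node is ready once all of its successors are scheduled.
        ready = sorted(n for n in remaining
                       if all(s in schedule for s in successors[n]))
        if not ready:
            raise ValueError("cyclic graph: no node is ready")
        for node in ready:
            # Must execute strictly before all of its successors.
            cycle = max((schedule[s] + 1 for s in successors[node]), default=0)
            while load.get(cycle, 0) >= num_tiles:
                cycle += 1  # tile budget exhausted, move one cycle earlier
            schedule[node] = cycle
            load[cycle] = load.get(cycle, 0) + 1
            remaining.discard(node)
    return schedule


dfg = {1: [3], 2: [3], 3: [4], 4: []}
two_tiles = backward_list_schedule(dfg, num_tiles=2)
one_tile = backward_list_schedule(dfg, num_tiles=1)
```

Because successors are placed first, the position of every consumer of a node is known when that node is handled, which is the property the slide relies on to decide whether a transformation is needed.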

24 BACKWARD / FORWARD TRAVERSING
Example: Forward (a priori transformations) vs Backward
[Tables: Cycle / Operations for the forward and backward schedules; contents not recoverable.]

25 LEVI'S ALGORITHM
- Determining the maximum common subgraph of 2 graphs is NP-complete
- Based on characteristic matrices of the graphs
  - Adjacency matrix, compatibility matrix
- Example of adjacency matrix
[Figure: example adjacency matrix; contents not recoverable.]
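To make the two matrices concrete, here is a small sketch. The adjacency matrix is the standard one for a directed graph. The compatibility matrix shown is a simplified node-level version that only checks node kinds (operation vs operator, data vs register); it is an assumption for illustration and not necessarily the matrix used in Levi's algorithm. Function and variable names are hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """Adjacency matrix of a directed graph: m[i][j] == 1 if nodes[i] -> nodes[j]."""
    index = {n: i for i, n in enumerate(nodes)}
    m = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        m[index[src]][index[dst]] = 1
    return m


def compatibility_matrix(dfg_kinds, cgra_kinds):
    """c[i][j] == 1 if DFG node i may be bound to CGRA node j (same kind)."""
    return [[int(dk == ck) for ck in cgra_kinds] for dk in dfg_kinds]


# Tiny DFG: operation op1 produces data d1, consumed by operation op2.
adj = adjacency_matrix(["op1", "d1", "op2"], [("op1", "d1"), ("d1", "op2")])
# DFG node kinds against a CGRA with one operator node and one register node.
compat = compatibility_matrix(["op", "data", "op"], ["op", "data"])
```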

26 LEVI'S ALGORITHM
Complete example: adjacency matrix

27 LEVI'S ALGORITHM
Complete example: reduced compatibility matrix

28 LEVI'S ALGORITHM
Complete example: maximum compatibility classes

29 LEVI'S ALGORITHM
Complete example: connected maximum subgraphs

30 LEVI'S ALGORITHM
Complete example: result

31 EXPERIMENTS & RESULTS
Number of different mappings found
[Bar chart: number of different mappings found by Method 1, Method 2 and the Proposed Approach for DC Filter, DCT 2D, Elliptic Filter, EMA Filter, FFT, Manhattan Distance, Matrix Product, MWD Filter, Unsharp Mask, and Average.]
3.7 and 2.4 times higher for Proposed Approach

32 EXPERIMENTS & RESULTS
Number of different mappings found per second
[Bar chart: number of different mappings generated per second (0 to 1.6) by Method 1, Method 2 and the Proposed Approach for DC Filter, DCT 2D, Elliptic Filter, EMA Filter, FFT, Manhattan Distance, Matrix Product, MWD Filter, Unsharp Mask, and Average.]
2.6 and 2.2 times higher for Proposed Approach


Instruction scheduling. Advanced Compiler Construction Michel Schinz Instruction scheduling Advanced Compiler Construction Michel Schinz 2015 05 21 Instruction ordering When a compiler emits the instructions corresponding to a program, it imposes a total order on them.

More information

Unit 2: High-Level Synthesis

Unit 2: High-Level Synthesis Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

High Level Synthesis

High Level Synthesis High Level Synthesis Design Representation Intermediate representation essential for efficient processing. Input HDL behavioral descriptions translated into some canonical intermediate representation.

More information

MT-ADRES: Multithreading on Coarse-Grained Reconfigurable Architecture

MT-ADRES: Multithreading on Coarse-Grained Reconfigurable Architecture MT-ADRES: Multithreading on Coarse-Grained Reconfigurable Architecture Kehuai Wu, Jan Madsen Dept. of Informatics and Mathematic Modelling Technical University of Denmark {kw, jan}@imm.dtu.dk Andreas Kanstein

More information

IBM IBM Storage Networking Solutions Version 1.

IBM IBM Storage Networking Solutions Version 1. IBM 000-740 IBM Storage Networking Solutions Version 1 http://killexams.com/exam-detail/000-740 - disk storage subsystem with four (4) total ports - two (2) LTO3 tape drives to be attached Assuming best

More information

Parallel graph traversal for FPGA

Parallel graph traversal for FPGA LETTER IEICE Electronics Express, Vol.11, No.7, 1 6 Parallel graph traversal for FPGA Shice Ni a), Yong Dou, Dan Zou, Rongchun Li, and Qiang Wang National Laboratory for Parallel and Distributed Processing,

More information

Center for Scalable Application Development Software (CScADS): Automatic Performance Tuning Workshop

Center for Scalable Application Development Software (CScADS): Automatic Performance Tuning Workshop Center for Scalable Application Development Software (CScADS): Automatic Performance Tuning Workshop http://cscads.rice.edu/ Discussion and Feedback CScADS Autotuning 07 Top Priority Questions for Discussion

More information

Introduction VLSI PHYSICAL DESIGN AUTOMATION

Introduction VLSI PHYSICAL DESIGN AUTOMATION VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.

More information

SPARK: A Parallelizing High-Level Synthesis Framework

SPARK: A Parallelizing High-Level Synthesis Framework SPARK: A Parallelizing High-Level Synthesis Framework Sumit Gupta Rajesh Gupta, Nikil Dutt, Alex Nicolau Center for Embedded Computer Systems University of California, Irvine and San Diego http://www.cecs.uci.edu/~spark

More information

Network Calculus: A Comparison

Network Calculus: A Comparison Time-Division Multiplexing vs Network Calculus: A Comparison Wolfgang Puffitsch, Rasmus Bo Sørensen, Martin Schoeberl RTNS 15, Lille, France Motivation Modern multiprocessors use networks-on-chip Congestion

More information

Maximum Clique Problem. Team Bushido bit.ly/parallel-computing-fall-2014

Maximum Clique Problem. Team Bushido bit.ly/parallel-computing-fall-2014 Maximum Clique Problem Team Bushido bit.ly/parallel-computing-fall-2014 Agenda Problem summary Research Paper 1 Research Paper 2 Research Paper 3 Software Design Demo of Sequential Program Summary Of the

More information

Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams

Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams Retiming Arithmetic Datapaths using Timed Taylor Expansion Diagrams Daniel Gomez-Prado Dusung Kim Maciej Ciesielski Emmanuel Boutillon 2 University of Massachusetts Amherst, USA. {dgomezpr,ciesiel,dukim}@ecs.umass.edu

More information

Electromagnetic Transient Fault Injection on AES

Electromagnetic Transient Fault Injection on AES Electromagnetic Transient Fault Injection on AES Amine DEHBAOUI ¹, Jean-Max DUTERTRE ², Bruno ROBISSON ¹, Assia TRIA ¹ Fault Diagnosis and Tolerance in Cryptography Leuven, Belgium Sunday, September 9,

More information

SELF-TUNING HTM. Paolo Romano

SELF-TUNING HTM. Paolo Romano SELF-TUNING HTM Paolo Romano 2 Based on ICAC 14 paper N. Diegues and Paolo Romano Self-Tuning Intel Transactional Synchronization Extensions 11 th USENIX International Conference on Autonomic Computing

More information

Hardware/Software Partitioning of Digital Systems

Hardware/Software Partitioning of Digital Systems Hardware/Software Partitioning of Digital Systems F. Dufour Advisor: M. Radetzki Department of Technical Computer Science University of Stuttgart Seminar Embedded Systems Outline 1 Partitioning and digital

More information

Supporting information

Supporting information Electronic Supplementary Material (ESI) for Journal of Materials Chemistry C. This journal is The Royal Society of Chemistry 2018 Supporting information for Ligand-Free Synthesis of Gold Nanoparticles

More information

Mapping MPEG Video Decoders on the ADRES Reconfigurable Array Processor for Next Generation Multi-Mode Mobile Terminals

Mapping MPEG Video Decoders on the ADRES Reconfigurable Array Processor for Next Generation Multi-Mode Mobile Terminals Mapping MPEG Video Decoders on the ADRES Reconfigurable Array Processor for Next Generation Multi-Mode Mobile Terminals Mladen Berekovic IMEC Kapeldreef 75 B-301 Leuven, Belgium 0032-16-28-8162 Mladen.Berekovic@imec.be

More information

Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications

Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications Fast Stereoscopic Rendering on Mobile Ray Tracing GPU for Virtual Reality Applications SAMSUNG Advanced Institute of Technology Won-Jong Lee, Seok Joong Hwang, Youngsam Shin, Jeong-Joon Yoo, Soojung Ryu

More information

Memory Access Optimization in Compilation for Coarse-Grained Reconfigurable Architectures

Memory Access Optimization in Compilation for Coarse-Grained Reconfigurable Architectures Memory Access Optimization in Compilation for Coarse-Grained Reconfigurable Architectures YONGJOO KIM, Seoul National University JONGEUN LEE, Ulsan National Institute of Science and Technology AVIRAL SHRIVASTAVA,

More information

COARSE GRAINED RECONFIGURABLE ARCHITECTURES FOR MOTION ESTIMATION IN H.264/AVC

COARSE GRAINED RECONFIGURABLE ARCHITECTURES FOR MOTION ESTIMATION IN H.264/AVC COARSE GRAINED RECONFIGURABLE ARCHITECTURES FOR MOTION ESTIMATION IN H.264/AVC 1 D.RUKMANI DEVI, 2 P.RANGARAJAN ^, 3 J.RAJA PAUL PERINBAM* 1 Research Scholar, Department of Electronics and Communication

More information

POLYMORPHIC PIPELINE ARRAY: A FLEXIBLE MULTICORE ACCELERATOR FOR MOBILE MULTIMEDIA APPLICATIONS. Hyunchul Park

POLYMORPHIC PIPELINE ARRAY: A FLEXIBLE MULTICORE ACCELERATOR FOR MOBILE MULTIMEDIA APPLICATIONS. Hyunchul Park POLYMORPHIC PIPELINE ARRAY: A FLEXIBLE MULTICORE ACCELERATOR FOR MOBILE MULTIMEDIA APPLICATIONS by Hyunchul Park A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor

More information

Claude TADONKI. MINES ParisTech PSL Research University Centre de Recherche Informatique

Claude TADONKI. MINES ParisTech PSL Research University Centre de Recherche Informatique Got 2 seconds Sequential 84 seconds Expected 84/84 = 1 second!?! Got 25 seconds MINES ParisTech PSL Research University Centre de Recherche Informatique claude.tadonki@mines-paristech.fr Séminaire MATHEMATIQUES

More information

HYRISE In-Memory Storage Engine

HYRISE In-Memory Storage Engine HYRISE In-Memory Storage Engine Martin Grund 1, Jens Krueger 1, Philippe Cudre-Mauroux 3, Samuel Madden 2 Alexander Zeier 1, Hasso Plattner 1 1 Hasso-Plattner-Institute, Germany 2 MIT CSAIL, USA 3 University

More information

High-Level Synthesis (HLS)

High-Level Synthesis (HLS) Course contents Unit 11: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 11 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

Extraction of tiled top-down irregular pyramids from large images

Extraction of tiled top-down irregular pyramids from large images Extraction of tiled top-down irregular pyramids from large images Romain Goffe 1 Guillaume Damiand 2 Luc Brun 3 1 SIC-XLIM, Université de Poitiers, CNRS, UMR6172, Bâtiment SP2MI, F-86962, Futuroscope Chasseneuil,

More information

ECE 669 Parallel Computer Architecture

ECE 669 Parallel Computer Architecture ECE 669 Parallel Computer Architecture Lecture 23 Parallel Compilation Parallel Compilation Two approaches to compilation Parallelize a program manually Sequential code converted to parallel code Develop

More information

Chronological Backtracking Conflict Directed Backjumping Dynamic Backtracking Branching Strategies Branching Heuristics Heavy Tail Behavior

Chronological Backtracking Conflict Directed Backjumping Dynamic Backtracking Branching Strategies Branching Heuristics Heavy Tail Behavior PART III: Search Outline Depth-first Search Chronological Backtracking Conflict Directed Backjumping Dynamic Backtracking Branching Strategies Branching Heuristics Heavy Tail Behavior Best-First Search

More information

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K.

Fundamentals of. Parallel Computing. Sanjay Razdan. Alpha Science International Ltd. Oxford, U.K. Fundamentals of Parallel Computing Sanjay Razdan Alpha Science International Ltd. Oxford, U.K. CONTENTS Preface Acknowledgements vii ix 1. Introduction to Parallel Computing 1.1-1.37 1.1 Parallel Computing

More information

Memory Management Algorithms on Distributed Systems. Katie Becker and David Rodgers CS425 April 15, 2005

Memory Management Algorithms on Distributed Systems. Katie Becker and David Rodgers CS425 April 15, 2005 Memory Management Algorithms on Distributed Systems Katie Becker and David Rodgers CS425 April 15, 2005 Table of Contents 1. Introduction 2. Coarse Grained Memory 2.1. Bottlenecks 2.2. Simulations 2.3.

More information

Coarse-Grained Reconfigurable Array Architectures

Coarse-Grained Reconfigurable Array Architectures Coarse-Grained Reconfigurable Array Architectures Bjorn De Sutter, Praveen Raghavan, Andy Lambrechts Abstract Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that

More information