Lecture 4: Synchronous Data Flow Graphs - HJ94 goal: Skiing down a mountain

Lecture 4: Synchronous Data Flow Graphs
I. Verbauwhede, K.U.Leuven, HJ94

Goal: Skiing down a mountain
[Figure: design flow from specification down to implementation. Labels on the slide: Specification; SPW, Matlab, C; Algorithm Transformations; pipelining, unrolling; loop merging, compaction; Memory Transformations and Optimizations; Floating-point to Fixed-point; 40 bit accumulator; ASIC; Special Purpose; Retargetable coprocessor; DSP processor; DSP-RISC; RISC]

Overview
Lecture 1: what is a system-on-chip
Lecture 2: terminology for the different steps
Lecture 3: models of computation
Lecture 4 (today): two MOCs
- Synchronous data flow graphs
- Control flow

Time Representations
Tag t is an abstraction of time (temporal order).
- Absolute time = global ordering = overspecification
- Cumbersome and harmful because it reduces the degrees of freedom
- Order in t is order in events (t1 < t2 <=> e1 < e2)
3 representations:
- Absolute time: T = R (T is a totally ordered, closed, connected set)
- Discrete time: T is a totally ordered discrete set, i.e. the order < on T is such that for any t ≠ t', either t < t' or t' < t
- Precedences: T is a partially ordered discrete set

Models for Time
Timed Models of Computation = total order
- Continuous time
- Discrete event (simulation with zero delay??)
- Synchronous, clocked discrete time = most used
- Discrete time (synchronous/reactive)
Untimed MOC = partial order
- Sequential Processes with Rendez-Vous
- Kahn Networks
- Data-flow networks
Reality = mixture of MOCs

Today
Reference: E. Lee, D. Messerschmitt, "Synchronous data flow," Proceedings of the IEEE, Vol. 75, No. 9, September 1987.
Other reference: E. Lee, D. Messerschmitt, "Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing," IEEE Transactions on Computers, Vol. C-36, No. 1, January 1987. (This reference includes the proofs for the first reference.)
For multi-dimensional signal processing: stream scheduling, which is very effective for video and image processing applications. Example: Phideo [Philips].

Data flow
A data flow representation of an algorithm is a directed graph:
- nodes are computations (actors)
- arcs (or edges) are paths over which the data (samples) travel
The data flow graph shows which computations to perform, not their sequence. The sequence is determined only by the data dependencies. Hence it exposes concurrency.

Data flow (cont.)
- Assume an infinite stream of input samples, so nodes perform their computations an infinite number of times.
- A node fires (starts its computation) when its inputs are available. A node with no inputs can fire at any time.
- Numbers on the arcs indicate the number of samples (tokens) produced or consumed by one firing.
- Because nodes fire when input data is available, the model is called data-driven. Hence it exposes concurrency.
- Nodes must be free of side effects: e.g. a write to a memory location followed by a read is only allowed if there is an arc between the two nodes.

Data flow (cont.)
True data flow: the overhead for checking the availability of input tokens is too large.
BUT, in synchronous data flow the number of tokens produced and consumed is known beforehand (a priori)! Hence the scheduling can be done a priori, at compile time. Thus there is NO runtime overhead!
For signal processing applications, the number of tokens produced and consumed is independent of the data and known beforehand (= relative sample rates).

Synchronous Data Flow - definition
A synchronous data flow (SDF) graph is a network of synchronous nodes (also called blocks).
A node is a function that is invoked whenever enough inputs are available. The inputs are consumed.
For a synchronous node, the consumptions and productions are known a priori.
Homogeneous SDF graph: all production and consumption numbers on the graph are 1.

Delay
Delay in the signal processing sense: a unit delay on the arc between A and B means that the n-th sample consumed by B is the (n-1)-th sample produced by A.
A delay of d is initialized with d zero samples.

A synchronous compiler
Translation from an SDF graph to a sequential program on a processor. Two tasks:
- Allocation of shared memory between blocks, or setting up communication between blocks
- Scheduling blocks onto processors such that all input data is available when a block is invoked
Goal: create a Periodic Admissible Parallel Schedule (PAPS)
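The delay semantics can be illustrated with a FIFO that starts out holding d zero samples. This is only a minimal sketch: the function name `run_with_delay` and the one-sample-per-firing (homogeneous) arc are assumptions made for the illustration, not part of the lecture.

```python
from collections import deque

def run_with_delay(samples, d):
    """Send samples from A to B over an arc carrying d unit delays.

    Hypothetical helper: A produces one sample per firing and
    B consumes one sample per firing (homogeneous arc).
    """
    fifo = deque([0] * d)      # the arc is initialized with d zero samples
    consumed = []
    for x in samples:
        fifo.append(x)                    # A fires: one sample produced
        consumed.append(fifo.popleft())   # B fires: one sample consumed
    return consumed

# With one unit delay, the n-th sample seen by B is the (n-1)-th sample of A.
print(run_with_delay([10, 20, 30, 40], d=1))  # [0, 10, 20, 30]
```

With d = 0 the consumer simply sees the producer's samples unchanged.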

Precedence graph - Schedule
The precedence graph indicates the sequence of operations.
[Figure: precedence graph over nodes A, B and C]
The schedule determines when and where (on which processor or which data path unit) each node fires.
Valid schedules: A B C Invalid schedule: C A B B A C

Blocked Schedule
Blocked: one cycle terminates before the next one starts.
[Figure: graph with nodes A, B, C, E, F, G]
Static schedule on 3 processors/units, valid blocked schedule:
P1: A C G
P2: B F
P3: E
With pipeline (not blocked):
P1: A C
P2: B F
P3: G E

Small grain - large grain
Iteration period = length of one cycle = 1/throughput
Goal: minimize the iteration period.
Iteration period bound = minimum achievable iteration period (assuming pipelining); it is bounded by the total number of operations in a loop divided by the number of delays in that loop.
Atomic SDF graph: the nodes are primitive operations.
Large grain SDF graph: the nodes are larger functions.
Example: IIR filter = small grain, JPEG = large grain.

SDF graph implementation
Implementation requires:
- buffering of the data samples passing between nodes
- scheduling nodes when their inputs are available
A dynamic implementation (= runtime) requires a runtime scheduler that checks when inputs are available and schedules nodes when a processor is free. This is usually expensive because of the overhead.
Contribution of Lee-87: SDF graphs can be scheduled at compile time, with no runtime overhead.
The compiler will:
- determine the execution order of the nodes on one or multiple processors or data path units
- determine the communication buffers between nodes
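As a rough illustration of the iteration period bound, the sketch below takes each directed loop as a pair (total execution time, number of delays) and returns the largest ratio. The function name and the example loops are hypothetical; with unit-time operations the first number is just the operation count used on the slide.

```python
from fractions import Fraction

def iteration_period_bound(loops):
    """Largest (execution time / delays) ratio over all directed loops.

    loops: list of (total_execution_time, number_of_delays) pairs,
    one pair per directed loop of the graph (hypothetical input format).
    """
    for _, delays in loops:
        if delays == 0:
            # A loop without any delay can never start firing.
            raise ValueError("every directed loop needs at least one delay")
    return max(Fraction(time, delays) for time, delays in loops)

# Two made-up loops: 4 unit-time operations with 2 delays, 3 with 1 delay.
print(iteration_period_bound([(4, 2), (3, 1)]))  # 3
```

The loop with the largest ratio is the one that limits the throughput, however many processors are available.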

Periodic schedule for SDF graph
Assumptions:
- infinite stream of input data (the case for signal processing applications)
- periodic schedule: the same schedule is applied repetitively on the input stream
Goal: check whether a schedule can be found:
- Periodic admissible sequential schedule (PASS) for a single processor or data path unit
- Periodic admissible parallel schedule (PAPS) for multiple processors
[Figure: example SDF graph with a PASS]

Formal approach
[Figure: two example graphs, one with a rate inconsistency and one with a consistent solution]
Construct the topology matrix Γ:
- each node is a column
- each arc is a row
- entry (i,j) = data produced by node j on arc i
- consumption is a negative entry
[Figure: example graph with labeled nodes (n) and arcs (e), and its topology matrix]
Self loop entry?
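The construction of the topology matrix can be sketched as follows. The arc format (source, destination, produced, consumed) and the three-node example graph are assumptions chosen for illustration; they are not the graph drawn on the slide. For a self loop this sketch stores the net difference between production and consumption in the single entry, which is one possible answer to the question above.

```python
def topology_matrix(num_nodes, arcs):
    """Build the topology matrix: one row per arc, one column per node.

    arcs: list of (src, dst, produced, consumed) tuples (hypothetical format).
    Production is a positive entry, consumption a negative entry.
    """
    gamma = []
    for src, dst, produced, consumed in arcs:
        row = [0] * num_nodes
        row[src] += produced     # tokens put on this arc per firing of src
        row[dst] -= consumed     # tokens taken from this arc per firing of dst
        gamma.append(row)
    return gamma

# Made-up example: node 0 feeds nodes 1 and 2, node 1 feeds node 2.
arcs = [(0, 1, 1, 2), (1, 2, 1, 1), (0, 2, 1, 2)]
for row in topology_matrix(3, arcs):
    print(row)
# [1, -2, 0]
# [0, 1, -1]
# [1, 0, -2]
```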

FIFO queues
b(n) = size of the queues on each arc
v(n) = unit vector indicating the firing node, e.g. [1 0 0]^T, [0 1 0]^T or [0 0 1]^T
b(n+1) = b(n) + Γ v(n)
[Figure: example graph with labeled nodes (n) and arcs (e), with the resulting b(0) and b(1)]

FIFO queues & delays
Delays are handled by initializing b(0) with the delay values.
[Figure: two-node example with delays and its b(0)]
So at start-up, one node can fire two times before the other node has to fire again.
So every directed loop must have at least one delay to be able to start.
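The buffer recursion b(n+1) = b(n) + Γ v(n) amounts to adding the column of Γ that belongs to the firing node. The sketch below applies it to a made-up three-node topology matrix (an assumption, not the slide's example); the helper name `fire` is hypothetical.

```python
def fire(gamma, b, node):
    """Return b(n+1) = b(n) + gamma * v(n), where v(n) selects `node`."""
    return [b_i + row[node] for b_i, row in zip(b, gamma)]

# Hypothetical topology matrix: rows are arcs, columns are nodes.
gamma = [[1, -2, 0],
         [0, 1, -1],
         [1, 0, -2]]

b = [0, 0, 0]                   # b(0): no delays on any arc
for node in [0, 0, 1, 2]:       # one period of a sequential schedule
    b = fire(gamma, b, node)
    print(node, b)
# 0 [1, 0, 1]
# 0 [2, 0, 2]
# 1 [0, 1, 2]
# 2 [0, 0, 0]
```

After one period the buffers are back at b(0), which is what makes the schedule repeatable with bounded memory.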

Identifying inconsistent sample rates
Necessary condition for the existence of a periodic schedule with bounded memory: the rank of Γ is s-1 (s is the number of nodes).
[Figure: two example graphs; what is the rank of Γ for each?]

Relative firing frequency
A topology matrix with the correct rank has a strictly positive (element-wise) integer vector q in its right nullspace. Thus: Γq = 0
[Figure: example graph with its rank and q]
q determines the number of times each node is invoked!
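One way to check the rank condition and obtain q is exact Gaussian elimination over rationals. The sketch below is an illustration under assumptions: the function names and both example matrices are made up, and a production tool would more likely propagate rates along the arcs instead.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def rref(matrix):
    """Reduced row echelon form over exact fractions; returns (rows, pivot columns)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots, r = [], 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def repetition_vector(gamma):
    """Smallest positive integer q with gamma q = 0, or None if rank != s-1."""
    s = len(gamma[0])
    rows, pivots = rref(gamma)
    if len(pivots) != s - 1:
        return None                       # sample rates are inconsistent
    free = next(c for c in range(s) if c not in pivots)
    q = [Fraction(0)] * s
    q[free] = Fraction(1)
    for r, c in enumerate(pivots):
        q[c] = -rows[r][free]             # solve each pivot row for its variable
    scale = reduce(lambda a, b: a * b // gcd(a, b), (x.denominator for x in q))
    ints = [int(x * scale) for x in q]
    g = reduce(gcd, ints)
    ints = [x // g for x in ints]
    return ints if all(x > 0 for x in ints) else None

print(repetition_vector([[1, -2, 0], [0, 1, -1], [1, 0, -2]]))  # [2, 1, 1]
print(repetition_vector([[1, -1, 0], [0, 1, -1], [2, 0, -1]]))  # None
```

The second matrix has full rank 3, so only q = 0 satisfies Γq = 0 and no periodic schedule with bounded memory exists.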

Insufficient delays
Rank s-1 is a necessary but not a sufficient condition.
[Figure: example graph with the correct rank but insufficient delays]

Scheduling for single processor
Given: a positive integer vector q such that Γq = 0, and b(0).
The i-th node is runnable if
- it has not been run q_i times, and
- running it will not cause a buffer size to become negative.
A class S (sequential) algorithm creates a static schedule. It is an algorithm that
- schedules a node if it is runnable,
- updates b(n),
- stops when no more nodes are runnable.
If the class S algorithm terminates before it has scheduled each node the number of times specified in the q vector, it is said to be deadlocked.

Example Class S algorithm
1. Solve for the smallest positive integer vector q.
2. Form a list of all nodes in the system.
3. For each node in the list, schedule it if it is runnable, trying each node once.
4. If each node has been scheduled q_i times, STOP.
5. If no node can be scheduled, indicate deadlock; else continue with the next node.
[Figure: example graph with three candidate schedules; one is a PASS, two are not]
(Complexity: traverse the graph once, visiting each edge once.)
Optimization: minimize buffer (= memory) requirements.

Schedule for parallel processors
Assumptions: homogeneous processors, no overhead in communication.
If a PASS exists, then a PAPS also exists (because we could run all nodes on one processor).
A blocked periodic admissible parallel schedule is a set of lists {Xi; i = 1, ..., M}:
- M is the number of processors
- Xi = periodic schedule for processor i
p is the smallest positive integer vector such that Γp = 0. A cycle of the schedule then invokes every node q = Jp times.
J is called the blocking factor (it can be different from 1).
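The class S steps listed above can be sketched as a short loop over the nodes. This is a minimal sketch rather than the lecture's own code: the function name, the matrices and the initial buffer vectors below are made-up examples.

```python
def class_s_schedule(gamma, q, b0):
    """Try to build a PASS; return the firing order, or None on deadlock.

    gamma: topology matrix (rows = arcs, columns = nodes)
    q:     number of firings required per node, with gamma q = 0
    b0:    initial buffer sizes b(0), i.e. the delays on each arc
    """
    b = list(b0)
    remaining = list(q)
    schedule = []
    while any(remaining):
        progressed = False
        for node in range(len(q)):            # try each node once per pass
            if remaining[node] == 0:
                continue                      # already run q_i times
            nxt = [b_i + row[node] for b_i, row in zip(b, gamma)]
            if min(nxt) >= 0:                 # runnable: no buffer goes negative
                b = nxt
                remaining[node] -= 1
                schedule.append(node)
                progressed = True
        if not progressed:
            return None                       # deadlock: nothing is runnable
    return schedule

# Acyclic three-node example (hypothetical), no delays needed.
print(class_s_schedule([[1, -2, 0], [0, 1, -1], [1, 0, -2]], [2, 1, 1], [0, 0, 0]))
# [0, 0, 1, 2]

# Two-node directed loop: deadlocks without a delay, runs with one.
loop = [[1, -1], [-1, 1]]
print(class_s_schedule(loop, [1, 1], [0, 0]))  # None
print(class_s_schedule(loop, [1, 1], [0, 1]))  # [0, 1]
```

The loop example shows why the rank condition alone is not sufficient: the rates are consistent, yet without an initial token on the feedback arc neither node can ever fire.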

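Precedence graph
[Figure: example graph with labeled nodes (n) and arcs (e), with Γp = 0, its rank and p; is there a PASS?]
Precedence graph for unity blocking factor:
[Figure: precedence graph of the node firings for J = 1]

Schedule on two processors, J=1
Assumptions: node 1 takes 1 time unit, node 2 takes 2, node 3 takes 3.
X1 = {3}; X2 holds three firings of nodes 1 and 2.
[Figure: timeline for processor 1 and processor 2]
Iteration period = 4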

Schedule on two processors, J=2
Assumptions: node 1 takes 1 time unit, node 2 takes 2, node 3 takes 3. Nodes have self loops (so a node cannot overlap with itself).
[Figure: precedence graph of the node firings for J = 2]
X1 starts with node 3, followed by three firings of nodes 1 and 2; X2 holds three firings of nodes 1 and 2, followed by node 3.
[Figure: timeline for processor 1 and processor 2]
Iteration period is 7/2 = 3.5

Why are we doing this?
- The principle of synchronous data flow is used in many simulators.
- Based on this, multi-dimensional data flow representations have been developed.
- Reality is always more complicated. Issues in practice:
  - choose the schedule to minimize memory requirements
  - include non data flow nodes: if-then-else, data dependent calculations
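The iteration periods of the two-processor schedules can be reproduced with a little arithmetic. In this sketch the function names are hypothetical, and the firing counts p = (2, 1, 1) are an assumption inferred from the J = 1 schedule (node 3 once on one processor, three firings of nodes 1 and 2 filling the other within 4 time units).

```python
from fractions import Fraction

def iteration_period(cycle_length, blocking_factor):
    """Iteration period = length of one schedule cycle / blocking factor J."""
    return Fraction(cycle_length, blocking_factor)

def work_bound(p, run_times, processors):
    """Lower bound on the iteration period: work per iteration / processors."""
    work = sum(count * time for count, time in zip(p, run_times))
    return Fraction(work, processors)

p = [2, 1, 1]            # assumed firings of nodes 1, 2, 3 per iteration
run_times = [1, 2, 3]    # run times of nodes 1, 2, 3 from the slides

print(iteration_period(4, 1))        # 4    (blocked schedule, J = 1)
print(iteration_period(7, 2))        # 7/2  (cycle of length 7 covers J = 2 iterations)
print(work_bound(p, run_times, 2))   # 7/2
```

Under these assumptions the J = 2 schedule meets the work bound for two processors, while the J = 1 schedule leaves one processor idle for part of each cycle.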


More information

Parameterized Modeling and Scheduling for Dataflow Graphs 1

Parameterized Modeling and Scheduling for Dataflow Graphs 1 Technical Report #UMIACS-TR-99-73, Institute for Advanced Computer Studies, University of Maryland at College Park, December 2, 999 Parameterized Modeling and Scheduling for Dataflow Graphs Bishnupriya

More information

Lecture 16: Recapitulations. Lecture 16: Recapitulations p. 1

Lecture 16: Recapitulations. Lecture 16: Recapitulations p. 1 Lecture 16: Recapitulations Lecture 16: Recapitulations p. 1 Parallel computing and programming in general Parallel computing a form of parallel processing by utilizing multiple computing units concurrently

More information

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis

A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis A Lost Cycles Analysis for Performance Prediction using High-Level Synthesis Bruno da Silva, Jan Lemeire, An Braeken, and Abdellah Touhafi Vrije Universiteit Brussel (VUB), INDI and ETRO department, Brussels,

More information

An algorithm for Performance Analysis of Single-Source Acyclic graphs

An algorithm for Performance Analysis of Single-Source Acyclic graphs An algorithm for Performance Analysis of Single-Source Acyclic graphs Gabriele Mencagli September 26, 2011 In this document we face with the problem of exploiting the performance analysis of acyclic graphs

More information

Like scalar processor Processes individual data items Item may be single integer or floating point number. - 1 of 15 - Superscalar Architectures

Like scalar processor Processes individual data items Item may be single integer or floating point number. - 1 of 15 - Superscalar Architectures Superscalar Architectures Have looked at examined basic architecture concepts Starting with simple machines Introduced concepts underlying RISC machines From characteristics of RISC instructions Found

More information

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu

More information

[6] E. A. Lee and D. G. Messerschmitt, Static Scheduling of Synchronous Dataflow Programs for Digital Signal Processing, IEEE Trans.

[6] E. A. Lee and D. G. Messerschmitt, Static Scheduling of Synchronous Dataflow Programs for Digital Signal Processing, IEEE Trans. [6] E A Lee and G Messerschmitt, Static Scheduling of Synchronous ataflow Programs for igital Signal Processing, IEEE Trans on Computers, February, 1987 [7] E A Lee, and S Ha, Scheduling Strategies for

More information

ESE532: System-on-a-Chip Architecture. Today. Process. Message FIFO. Thread. Dataflow Process Model Motivation Issues Abstraction Recommended Approach

ESE532: System-on-a-Chip Architecture. Today. Process. Message FIFO. Thread. Dataflow Process Model Motivation Issues Abstraction Recommended Approach ESE53: System-on-a-Chip Architecture Day 5: January 30, 07 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Recommended Approach Message Parallelism can be natural Discipline

More information

Dynamic Response Time Optimization for SDF Graphs

Dynamic Response Time Optimization for SDF Graphs Dynamic Response Time Optimization for SDF Graphs Dirk Ziegenbein, Jan Uerpmann, Rolf Ernst TU Braunschweig ziegenbein, uerpmann, ernst @ida.ing.tu-bs.de Abstract Synchronous Data Flow (SDF) is a well-known

More information

Static Dataflow Graphs

Static Dataflow Graphs Static Dataflow Graphs Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of echnology L20-1 Motivation: Dataflow Graphs A common Base Language - to serve as target representation

More information

Learning Lab 3: Parallel Methods of Solving the Linear Equation Systems

Learning Lab 3: Parallel Methods of Solving the Linear Equation Systems Learning Lab 3: Parallel Methods of Solving the Linear Equation Systems Lab Objective... Eercise State the Problem of Solving the Linear Equation Systems... 2 Eercise 2 - Studying the Gauss Algorithm for

More information

CS370 Operating Systems

CS370 Operating Systems CS370 Operating Systems Colorado State University Yashwant K Malaiya Spring 2018 L17 Main Memory Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ Was Great Dijkstra a magician?

More information

High-level idea. Abstractions for algorithms and parallel machines. Computation DAG s. Abstractions introduced in lecture

High-level idea. Abstractions for algorithms and parallel machines. Computation DAG s. Abstractions introduced in lecture Abstractions for algorithms and parallel machines Keshav Pingali University of eas, Austin High-level idea Difficult to work directly with tetual programs Where is the parallelism in the program? Solution:

More information

Parallel Computer Architecture and Programming Written Assignment 3

Parallel Computer Architecture and Programming Written Assignment 3 Parallel Computer Architecture and Programming Written Assignment 3 50 points total. Due Monday, July 17 at the start of class. Problem 1: Message Passing (6 pts) A. (3 pts) You and your friend liked the

More information

Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs

Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Anan Bouakaz Pascal Fradet Alain Girault Real-Time and Embedded Technology and Applications Symposium, Vienna April 14th, 2016

More information

ECE 450:DIGITAL SIGNAL. Lecture 10: DSP Arithmetic

ECE 450:DIGITAL SIGNAL. Lecture 10: DSP Arithmetic ECE 450:DIGITAL SIGNAL PROCESSORS AND APPLICATIONS Lecture 10: DSP Arithmetic Last Session Floating Point Arithmetic Addition Block Floating Point format Dynamic Range and Precision 2 Today s Session Guard

More information

High Performance Computing. University questions with solution

High Performance Computing. University questions with solution High Performance Computing University questions with solution Q1) Explain the basic working principle of VLIW processor. (6 marks) The following points are basic working principle of VLIW processor. The

More information

Concurrent Models of Computation

Concurrent Models of Computation Concurrent Models of Computation Edward A. Lee Robert S. Pepper Distinguished Professor, UC Berkeley EECS 219D: Concurrent Models of Computation Fall 2011 Copyright 2011, Edward A. Lee, All rights reserved

More information

CA441 BPM - Modelling Workflow with Petri Nets. Modelling Workflow with Petri Nets. Workflow Management Issues. Workflow. Process.

CA441 BPM - Modelling Workflow with Petri Nets. Modelling Workflow with Petri Nets. Workflow Management Issues. Workflow. Process. Modelling Workflow with Petri Nets 1 Workflow Management Issues Georgakopoulos,Hornick, Sheth Process Workflow specification Workflow Implementation =workflow application Business Process Modelling/ Workflow

More information

Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms

Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms Małgorzata Michalska, Endri Bezati, Simone Casale-Brunet, Marco Mattavelli EPFL

More information

HIGH-LEVEL SYNTHESIS

HIGH-LEVEL SYNTHESIS HIGH-LEVEL SYNTHESIS Page 1 HIGH-LEVEL SYNTHESIS High-level synthesis: the automatic addition of structural information to a design described by an algorithm. BEHAVIORAL D. STRUCTURAL D. Systems Algorithms

More information

MODELING OF BLOCK-BASED DSP SYSTEMS

MODELING OF BLOCK-BASED DSP SYSTEMS MODELING OF BLOCK-BASED DSP SYSTEMS Dong-Ik Ko and Shuvra S. Bhattacharyya Department of Electrical and Computer Engineering, and Institute for Advanced Computer Studies University of Maryland, College

More information

Interfacing a High Speed Crypto Accelerator to an Embedded CPU

Interfacing a High Speed Crypto Accelerator to an Embedded CPU Interfacing a High Speed Crypto Accelerator to an Embedded CPU Alireza Hodjat ahodjat @ee.ucla.edu Electrical Engineering Department University of California, Los Angeles Ingrid Verbauwhede ingrid @ee.ucla.edu

More information

Functional modeling style for efficient SW code generation of video codec applications

Functional modeling style for efficient SW code generation of video codec applications Functional modeling style for efficient SW code generation of video codec applications Sang-Il Han 1)2) Soo-Ik Chae 1) Ahmed. A. Jerraya 2) SD Group 1) SLS Group 2) Seoul National Univ., Korea TIMA laboratory,

More information

A Predictable RTOS. Mantis Cheng Department of Computer Science University of Victoria

A Predictable RTOS. Mantis Cheng Department of Computer Science University of Victoria A Predictable RTOS Mantis Cheng Department of Computer Science University of Victoria Outline I. Analysis of Timeliness Requirements II. Analysis of IO Requirements III. Time in Scheduling IV. IO in Scheduling

More information

Applying Models of Computation to OpenCL Pipes for FPGA Computing. Nachiket Kapre + Hiren Patel

Applying Models of Computation to OpenCL Pipes for FPGA Computing. Nachiket Kapre + Hiren Patel Applying Models of Computation to OpenCL Pipes for FPGA Computing Nachiket Kapre + Hiren Patel nachiket@uwaterloo.ca Outline Models of Computation and Parallelism OpenCL code samples Synchronous Dataflow

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

SysteMoC. Verification and Refinement of Actor-Based Models of Computation

SysteMoC. Verification and Refinement of Actor-Based Models of Computation SysteMoC Verification and Refinement of Actor-Based Models of Computation Joachim Falk, Jens Gladigau, Christian Haubelt, Joachim Keinert, Martin Streubühr, and Jürgen Teich {falk, haubelt}@cs.fau.de Hardware-Software-Co-Design

More information

SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION

SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION CHAPTER 5 SEQUENCES, MATHEMATICAL INDUCTION, AND RECURSION Alessandro Artale UniBZ - http://www.inf.unibz.it/ artale/ SECTION 5.5 Application: Correctness of Algorithms Copyright Cengage Learning. All

More information

COMP Parallel Computing. CC-NUMA (2) Memory Consistency

COMP Parallel Computing. CC-NUMA (2) Memory Consistency COMP 633 - Parallel Computing Lecture 11 September 26, 2017 Memory Consistency Reading Patterson & Hennesey, Computer Architecture (2 nd Ed.) secn 8.6 a condensed treatment of consistency models Coherence

More information

Concurrent Reading and Writing of Clocks

Concurrent Reading and Writing of Clocks Concurrent Reading and Writing of Clocks LESLIE LAMPORT Digital Equipment Corporation As an exercise in synchronization without mutual exclusion, algorithms are developed to implement both a monotonic

More information

A Framework for Space and Time Efficient Scheduling of Parallelism

A Framework for Space and Time Efficient Scheduling of Parallelism A Framework for Space and Time Efficient Scheduling of Parallelism Girija J. Narlikar Guy E. Blelloch December 996 CMU-CS-96-97 School of Computer Science Carnegie Mellon University Pittsburgh, PA 523

More information

GRAph Parallel Actor Language A Programming Language for Parallel Graph Algorithms

GRAph Parallel Actor Language A Programming Language for Parallel Graph Algorithms GRAph Parallel Actor Language A Programming Language for Parallel Graph Algorithms Thesis by Michael delorimier In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy California

More information

Fall 2011 Prof. Hyesoon Kim. Thanks to Prof. Loh & Prof. Prvulovic

Fall 2011 Prof. Hyesoon Kim. Thanks to Prof. Loh & Prof. Prvulovic Fall 2011 Prof. Hyesoon Kim Thanks to Prof. Loh & Prof. Prvulovic Reading: Data prefetch mechanisms, Steven P. Vanderwiel, David J. Lilja, ACM Computing Surveys, Vol. 32, Issue 2 (June 2000) If memory

More information

Computer Architecture: Dataflow/Systolic Arrays

Computer Architecture: Dataflow/Systolic Arrays Data Flow Computer Architecture: Dataflow/Systolic Arrays he models we have examined all assumed Instructions are fetched and retired in sequential, control flow order his is part of the Von-Neumann model

More information