Reliable Dynamic Embedded Data Processing Systems

Twan Basten
Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, Yang Yang and others
Funding: NWO PROMES; EC FP6 Betsy, FP7 MNEMEE; SenterNovem Octopus

"Knowing is not understanding." Charles Kettering

Embedded Data Processing Systems
[Figure: example products labelled Sony, Thales, ASML, Océ, Apple and Philips]

Trend: Dynamic Behavior
- Multiprocessor systems / networking / concurrency
- Application convergence / application evolution

What is the optimal configuration?
- given applications starting/stopping over time
- given data-dependent execution times
- given varying communication bandwidth
- given battery status
- ...
Two kinds of dynamism: intra-application dynamism and inter-application dynamism.

Intra-application dynamism: Dataflow analysis

Dataflow Analysis: Challenges
- Predictability and efficiency are contradictory

Synchronous Data Flow Graphs (SDFGs [Lee 1986])
[Figure: example SDFG annotated with the terms rate, token, actor, channel and execution time; actors with execution times A,1 B,2 C,2]

Scenario-aware MPEG-4 SP Decoder Model
(FSM-based Scenario-Aware DataFlow (SADF))
[Figure: SADF model of an MPEG-4 SP decoder with processes VLD, IDCT, MC, RC and FD, connected by channels a to e. Legend: kernel, detector, data channel, control channel, fixed rate, parameterized rate. Scenarios are I and P x with x = {30, 40, 50, 60, 70, 80, 99}. The per-scenario rate and execution-time tables did not survive extraction. References: MEMOCODE 2006, ACSD 2006]

Throughput analysis for SDF

SDF Throughput Analysis
- Self-timed execution: a transient phase followed by a periodic phase
  [Figure: self-timed schedule of the firings of A, B and C]
- Throughput: average number of actor firings over time
- Iteration: smallest non-empty set of actor firings that does not change the token distribution (example: A:3, B:2, C:1)
- Throughput can be calculated from the periodic phase
- Efficient implementation: consider only one designated firing per iteration to detect a recurrent state
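
The iteration defined above follows from the balance equations of the graph: on every channel, the producer's firing count times its production rate must equal the consumer's firing count times its consumption rate. The following is a minimal illustrative sketch in Python, not code from the SDF3 toolset. The function name `repetition_vector` and the channel rates used in the example (2 and 3 on channel alpha, 1 and 2 on channel beta) are assumptions read off the example graph on the slides.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, channels):
    """Smallest positive firing counts that leave all token counts unchanged.

    channels: list of (producer, production_rate, consumer, consumption_rate).
    """
    ratio = {actors[0]: Fraction(1)}  # firing rates relative to the first actor
    pending = list(channels)
    while pending:
        progressed = False
        for ch in list(pending):
            src, prod, dst, cons = ch
            if src in ratio and dst in ratio:
                # Both ends known: the balance equation must already hold.
                if ratio[src] * prod != ratio[dst] * cons:
                    raise ValueError("inconsistent rates: no iteration exists")
            elif src in ratio:
                ratio[dst] = ratio[src] * prod / cons
            elif dst in ratio:
                ratio[src] = ratio[dst] * cons / prod
            else:
                continue  # neither end reached yet, retry in a later pass
            pending.remove(ch)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the fractions to the smallest integer solution.
    denominators = [r.denominator for r in ratio.values()]
    scale = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    counts = {a: int(r * scale) for a, r in ratio.items()}
    common = reduce(gcd, counts.values())
    return {a: n // common for a, n in counts.items()}


# Assumed example graph: A --2--> alpha --3--> B --1--> beta --2--> C
EXAMPLE = [("A", 2, "B", 3), ("B", 1, "C", 2)]
print(repetition_vector(["A", "B", "C"], EXAMPLE))
```

Under the assumed rates this reproduces the iteration A:3, B:2, C:1 quoted on the slide.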

Throughput Analysis: Our Result
[Table: runtimes of various throughput analysis methods (in seconds), comparing the state-space method with traditional methods (Dasdan and Gupta; Howard; Young, Tarjan and Orlin) on the models MP3 decoder, Modem, Sample Rate, Satellite and H.263 decoder. Most entries did not survive extraction; three entries in each of the Sample Rate, Satellite and H.263 decoder rows read >800. Reference: IEEE T Comp]

Storage-throughput trade-offs

Lessons to be Learned
- Repeated application of throughput analysis
- Causal dependencies potentially reducing performance

Trade-off: Storage vs. Throughput
[Figure: example SDFG A -> α -> B -> β -> C with rates 2 and 3 on channel α and 1 and 2 on channel β; execution times A,1 B,2 C,2]
- Tokens on channels must be stored in memory.
- Separate memory for each channel.
- Storage distribution, for example, ⟨α, β⟩ = ⟨4, 2⟩
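
How a storage distribution limits throughput can be explored by simulating self-timed execution with bounded channels until a state recurs. The sketch below is an illustration under stated assumptions, not the analysis implemented in SDF3: the rates and execution times are those assumed for the example graph, a firing claims output space when it starts and frees input space when it ends, and each actor runs at most one firing at a time. The names `throughput`, `EXEC_TIME` and `CHANNELS` are hypothetical.

```python
from fractions import Fraction

# Assumed example: A --2--> alpha --3--> B --1--> beta --2--> C
EXEC_TIME = {"A": 1, "B": 2, "C": 2}
CHANNELS = {"alpha": ("A", 2, "B", 3), "beta": ("B", 1, "C", 2)}


def inputs(actor):
    return [(ch, c) for ch, (_, _, dst, c) in CHANNELS.items() if dst == actor]


def outputs(actor):
    return [(ch, p) for ch, (src, p, _, _) in CHANNELS.items() if src == actor]


def throughput(capacity, ref="C", max_steps=100_000):
    """Firings of `ref` per time unit under the given channel capacities."""
    tokens = {ch: 0 for ch in CHANNELS}
    space = {ch: capacity[ch] for ch in CHANNELS}
    busy = {a: 0 for a in EXEC_TIME}  # remaining time of the current firing
    fired = 0
    seen = {}
    for t in range(max_steps):
        # Start every idle actor that has enough input tokens and output space.
        for a in EXEC_TIME:
            if busy[a]:
                continue
            if all(tokens[ch] >= c for ch, c in inputs(a)) and \
               all(space[ch] >= p for ch, p in outputs(a)):
                for ch, c in inputs(a):
                    tokens[ch] -= c
                for ch, p in outputs(a):
                    space[ch] -= p
                busy[a] = EXEC_TIME[a]
        # A recurrent state closes the periodic phase. A deadlock shows up
        # as a state that repeats without any firing in between.
        state = (tuple(tokens.values()), tuple(space.values()), tuple(busy.values()))
        if state in seen:
            t0, f0 = seen[state]
            return Fraction(fired - f0, t - t0)
        seen[state] = (t, fired)
        # Advance one time unit and complete firings that end now.
        for a in EXEC_TIME:
            if busy[a]:
                busy[a] -= 1
                if busy[a] == 0:
                    for ch, p in outputs(a):
                        tokens[ch] += p
                    for ch, c in inputs(a):
                        space[ch] += c
                    if a == ref:
                        fired += 1
    raise RuntimeError("no recurrent state found within max_steps")


print(throughput({"alpha": 4, "beta": 2}))
print(throughput({"alpha": 3, "beta": 2}))
```

Under these assumptions the distribution ⟨4, 2⟩ yields a throughput of 1/7, while shrinking either channel (for example ⟨3, 2⟩) deadlocks the graph and gives throughput 0. Because C fires once per iteration, its firing rate equals the iteration rate.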

Trade-off: Storage vs. Throughput
[Figure: throughput plotted against distribution size for the example SDFG; the storage distributions labelled in the plot include ⟨5,3⟩, ⟨6,3⟩ and ⟨6,2⟩, other labels are illegible]
- Find all minimal storage distributions for any possible throughput

Dependency Graph
[Figure: dependency graph of the firings in one period of the example SDFG under storage distribution ⟨α, β⟩ = ⟨4, 2⟩; edges are labelled sp (space) or tk (token): sp(α), tk(α), sp(β), tk(β)]
- Dependency: an actor firing start may depend on an actor firing end
- Cycles denote storage dependencies
- Resolving storage dependencies may increase throughput
- The dependency graph may be very large

Abstract Dependency Graph
[Figure: dependencies between actors A, B and C, labelled tk(α), tk(β), sp(α) and sp(β)]
- The abstract dependency graph contains all storage dependencies
- It is easily constructed from state-space exploration

Experimental Results
[Table: results for the models Example, Satellite, MP3 and H.263, with rows #actors, #channels, min throughput > 0, distribution size, max throughput, distribution size, #Pareto points, #distributions and execution time. The numeric entries did not survive extraction intact.]
- Approximation: increase buffers with n times the minimal step size

Trade-off Analysis
- Start small, recursively resolve bottlenecks
- Fast in general, approximation possible
- Generalizable to other resources, computational models

Scenarios

Scenario-aware Throughput Analysis
- SDF throughput analysis considers worst-case actor execution times
- Scenario-aware throughput analysis considers worst-case execution times per scenario, but it must consider scenario transitions
- Analyzing all scenario transitions separately can be avoided
- Separate scenario transitions with an invariant reference schedule
Reference: ACM TECS 2010

MP3 Result
- SDF model: 9.28 Mcycles per frame
- SADF model: 5.83 Mcycles per frame
[Figure: scenarios L-L, L-S, S-L, S-S and M]

Pointers
- Yang Yang: shared resources, trade-off analysis
- Sander Stuijk: SDF3 toolset
- Future: shared resources + scenarios + trade-off analysis

Inter-application dynamism

Run-time Application Management
- Applications starting/stopping over time
- 6 procs, slow (ck) and fast clocks (ck2)
- Minimize energy under time constraints
[Figure: energy versus completion time for configurations written as (number of processors, clock), such as (5,ck) and (6,ck2), with a Gantt chart. Run-time adaptation steps shown: Join, Const (enforce the same clock) and Min. Labels: JCM, DAC 2009]

Pareto Algebra
Goal: Compositional computation of trade-offs

Pareto-Algebraic Characterization of run-time adaptation
- Component trade-off spaces are combined into the system trade-off space (at the time of a change) by the operations product, derive, constrain, abstract and minimize
- Adapt: select and configure

Complexity
- n components with p Pareto points each
- Take a product and on-the-fly derive, constrain, abstract, minimize
- O(p^{2n})
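
The operations named above can be sketched as plain set operations on tuples of costs. This is a minimal illustration, not the Pareto calculator's API: the function names, the convention that smaller is better in every dimension, and the two component trade-off spaces `APP1` and `APP2` with (energy, processors) points are all assumptions made for the example.

```python
from itertools import product as cartesian


def dominates(a, b):
    """a dominates b if it is no worse in any dimension and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def minimize(points):
    """Keep only the Pareto points (those not dominated by any other)."""
    pts = set(points)
    return {p for p in pts if not any(dominates(q, p) for q in pts)}


def product(c1, c2):
    """Free product: every combination of one point from each component."""
    return {p + q for p, q in cartesian(c1, c2)}


def derive(points, f):
    """Append a derived quantity f(point) as a new dimension."""
    return {p + (f(p),) for p in points}


def constrain(points, predicate):
    """Drop configurations that violate a constraint."""
    return {p for p in points if predicate(p)}


def abstract(points, keep):
    """Project onto the dimensions listed in `keep`."""
    return {tuple(p[i] for i in keep) for p in points}


def combine(app1, app2, max_procs):
    space = product(app1, app2)                    # (e1, n1, e2, n2)
    space = derive(space, lambda p: p[0] + p[2])   # total energy, index 4
    space = derive(space, lambda p: p[1] + p[3])   # total processors, index 5
    space = constrain(space, lambda p: p[5] <= max_procs)
    space = abstract(space, (4, 5))                # keep only the totals
    return minimize(space)


# Hypothetical component trade-off spaces: (energy, processors used).
APP1 = {(5, 1), (3, 2)}
APP2 = {(4, 1), (2, 3)}
print(sorted(combine(APP1, APP2, max_procs=4)))
```

With a limit of 4 processors the combined space reduces to the two Pareto points (7, 3) and (9, 2). The naive `minimize` compares all pairs, so applying it to a product of n components with p points each is consistent with the O(p^{2n}) bound on the slide.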

Complexity Control
- Keep n and p small
- Compositional reasoning, among others
- Approximate Pareto sets
[Figure: two components combined by product, constrain, derive-abstract and minimize; a Pareto set approximated using lines of the form y/x = c, constants illegible]

Multi-dimensional Multiple-choice Knapsack Problem (MMKP)
- NP hard
- One optimization objective (value)
- Multiple resource dimensions with capacity constraints
- Multiple independent applications
- Multiple independent configurations per application
- Pick one configuration per application optimizing value within constraints

A Parameterized Compositional Heuristic
- Project all resource dimensions into one dimension
- Approximate Pareto set in 2-dimensional space
- product-constrain-derive-abstract-minimize
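
A compositional MMKP heuristic in this spirit can be sketched as follows: add one application at a time, discard infeasible combinations, project resource usage onto a single cost, keep the 2-dimensional Pareto front of (cost, value), and cap the front at a bound `p`. This is an illustrative sketch only. The projection (sum of usage normalised by capacity) and the pruning rule (evenly spaced points of the sorted front) are assumptions, not the choices made in the presented heuristic, and all names and data are hypothetical.

```python
def project(usage, capacity):
    """Assumed projection: sum of per-dimension usage relative to capacity."""
    return sum(u / c for u, c in zip(usage, capacity))


def pareto_2d(points):
    """Front of (cost, value, ...) tuples: lower cost and higher value win."""
    front = []
    for pt in sorted(points, key=lambda q: (q[0], -q[1])):
        if not front or pt[1] > front[-1][1]:
            front.append(pt)
    return front


def prune(front, p):
    """Assumed pruning rule: keep at most p evenly spaced front points."""
    if len(front) <= p:
        return front
    if p == 1:
        return [front[-1]]
    n = len(front)
    keep = sorted({round(i * (n - 1) / (p - 1)) for i in range(p)})
    return [front[i] for i in keep]


def solve(apps, capacity, p):
    """apps: one list per application of (value, usage-tuple) configurations.

    Returns (total value, chosen configuration index per application),
    or None if no feasible combination survives.
    """
    front = [(0.0, 0, tuple(0 for _ in capacity), ())]
    for configs in apps:
        candidates = []
        for _, value, usage, choice in front:
            for k, (v, u) in enumerate(configs):
                total = tuple(a + b for a, b in zip(usage, u))
                if all(t <= c for t, c in zip(total, capacity)):  # constrain
                    candidates.append(
                        (project(total, capacity), value + v, total, choice + (k,))
                    )
        front = prune(pareto_2d(candidates), p)
        if not front:
            return None
    best = max(front, key=lambda q: q[1])
    return best[1], best[3]


# Hypothetical instance: two resource dimensions, two applications.
CAPACITY = (4, 4)
APPS = [
    [(10, (3, 1)), (6, (1, 1))],
    [(8, (2, 2)), (3, (1, 1))],
]
print(solve(APPS, CAPACITY, p=4))
print(solve(APPS, CAPACITY, p=1))
```

On this small instance `p=4` finds the optimum value 14, while `p=1` settles for 13: a smaller `p` trades solution quality for less work per composition step, since each step examines at most `p` partial solutions times the number of configurations of the next application.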

A Parameterized Compositional Heuristic
- Parameter p: maintain (at most) p Pareto points
- Allows budgeting of analysis time
- Analysis time: c · n · p^2 (n: number of applications; c: platform constant)

Controlling Time Budgets
[Table: parameter p, analysis time and obtained values for MMKP benchmark instances I1 to I6 under different time budgets; the numeric entries did not survive extraction]
- MMKP benchmark: always within budget, good values
- Similar results to the IMEC, SOC 2006, non-compositional state-of-the-art heuristic
- SimIt-ARM simulator, 206 MHz StrongARM

Compositionality
- Applications start at various times: 5 at a time (a), 1 at a time (b)
- Values always at least as good as the IMEC heuristic
- What about applications stopping?
[Figure: time (ms) and values versus number of active applications for cases (a) and (b); values: IMEC 2353, Pareto 3224]

Conclusions
- Goal: reliable, resource efficient data processing systems
- Dynamic behavior complicates analysis
- Intra-application dynamism: scenario-aware dataflow analysis and design-space exploration
- Inter-application dynamism: Pareto-algebra-based run-time adaptation

Thank you! Questions?
More info:
SDF3 toolkit:
Pareto calculator:

"An understanding of the natural world and what's in it is a source of not only a great curiosity but great fulfillment." David Attenborough

Reliable Embedded Multimedia Systems?

Reliable Embedded Multimedia Systems? 2 Overview Reliable Embedded Multimedia Systems? Twan Basten Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others Embedded Multi-media Analysis of

More information

Pareto Algebra. Reliable Run-time Adaptation in Resource-constrained Embedded Systems. Twan Basten

Pareto Algebra. Reliable Run-time Adaptation in Resource-constrained Embedded Systems. Twan Basten Reliable Run-time Adaptation in Resource-constrained Embedded Systems 2 Run-time Adaptation Encoding qualities Bandwidth requirements Decoding streams of different quality Computational effort required

More information

Computational Models for Concurrent Streaming Applications

Computational Models for Concurrent Streaming Applications 2 Computational Models for Concurrent Streaming Applications The challenges of today Twan Basten Based on joint work with Marc Geilen, Sander Stuijk, and many others Department of Electrical Engineering

More information

Modelling, Analysis and Scheduling with Dataflow Models

Modelling, Analysis and Scheduling with Dataflow Models technische universiteit eindhoven Modelling, Analysis and Scheduling with Dataflow Models Marc Geilen, Bart Theelen, Twan Basten, Sander Stuijk, AmirHossein Ghamarian, Jeroen Voeten Eindhoven University

More information

A Parameterized Compositional Multi-dimensional Multiple-choice Knapsack Heuristic for CMP Run-time Management

A Parameterized Compositional Multi-dimensional Multiple-choice Knapsack Heuristic for CMP Run-time Management A Parameterized Compositional Multi-dimensional Multiple-choice Knapsack Heuristic for CMP Run-time Management Hamid Shojaei 1,2, AmirHossein Ghamarian 2, Twan Basten 2,3, Marc Geilen 2, Sander Stuijk

More information

Overview of Dataflow Languages. Waheed Ahmad

Overview of Dataflow Languages. Waheed Ahmad Overview of Dataflow Languages Waheed Ahmad w.ahmad@utwente.nl The purpose of models is not to fit the data but to sharpen the questions. Samuel Karlins 11 th R.A Fisher Memorial Lecture Royal Society

More information

QoS Trade-off Analysis for Wireless Sensor Networks

QoS Trade-off Analysis for Wireless Sensor Networks QoS Trade-off Analysis for Wireless Sensor Networks Rob Hoes, Twan Basten Joint work with Phillip Stanley-Marbell, Marc Geilen, Chen Kong Tham, Henk Corporaal Department of Electrical Engineering Electronic

More information

HETEROGENEOUS MULTIPROCESSOR MAPPING FOR REAL-TIME STREAMING SYSTEMS

HETEROGENEOUS MULTIPROCESSOR MAPPING FOR REAL-TIME STREAMING SYSTEMS HETEROGENEOUS MULTIPROCESSOR MAPPING FOR REAL-TIME STREAMING SYSTEMS Jing Lin, Akshaya Srivasta, Prof. Andreas Gerstlauer, and Prof. Brian L. Evans Department of Electrical and Computer Engineering The

More information

Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs

Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Anan Bouakaz Pascal Fradet Alain Girault Real-Time and Embedded Technology and Applications Symposium, Vienna April 14th, 2016

More information

Towards Translating FSM-SADF to Timed Automata

Towards Translating FSM-SADF to Timed Automata Towards Translating FSM-SADF to Timed Automata Mladen Skelin Department of Engineering Cybernetics, Norwegian University of Science and Technology mladen.skelin@itk.ntnu.no Erik Ramsgaard Wognsen, Mads

More information

Synchronous dataflow scenarios

Synchronous dataflow scenarios Synchronous dataflow scenarios Geilen, M.C.W. Published in: ACM Transactions on Embedded Computing Systems DOI: 10.1145/1880050.1880052 Published: 01/01/2010 Document Version Accepted manuscript including

More information

MULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA

MULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA MULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA Akash Kumar,, Shakith Fernando, Yajun Ha, Bart Mesman and Henk Corporaal Eindhoven University of Technology, Eindhoven,

More information

Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs

Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs Marc Geilen 1, Joachim Falk 2, Christian Haubelt 3, Twan Basten 1,4, Bart Theelen 4 and Sander Stuijk 1 1 Electrical Engineering

More information

Main application of SDF: DSP hardware modeling

Main application of SDF: DSP hardware modeling EE 144/244: Fundamental lgorithms for System Modeling, nalysis, and Optimization Fall 2014 Dataflow Timed SDF, Throughput nalysis Stavros Tripakis University of California, erkeley Stavros Tripakis (UC

More information

Worst-case Performance Analysis of Synchronous Dataflow Scenarios

Worst-case Performance Analysis of Synchronous Dataflow Scenarios Worst-case Performance Analysis of Synchronous Dataflow Scenarios ABSTRACT Marc Geilen Eindhoven University of Technology P.O. Box 5 Den Dolech Eindhoven, The Netherlands m.c.w.geilen@tue.nl Synchronous

More information

Predictable Mapping of Streaming Applications on Multiprocessors

Predictable Mapping of Streaming Applications on Multiprocessors Predictable Mapping of Streaming Applications on Multiprocessors PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Technische Universiteit Eindhoven, op gezag van de Rector Magnificus prof.dr.ir.

More information

Tight Temporal Bounds for Dataflow Applications Mapped onto Shared Resources

Tight Temporal Bounds for Dataflow Applications Mapped onto Shared Resources Tight Temporal Bounds for Dataflow Applications Mapped onto Shared Resources Hadi Alizadeh Ara, Marc Geilen, Twan Basten, Amir Behrouzian, Martijn Hendriks and Dip Goswami Eindhoven University of Technology,

More information

Resource-efficient Routing and Scheduling of Time-constrained Network-on-Chip Communication

Resource-efficient Routing and Scheduling of Time-constrained Network-on-Chip Communication Resource-efficient Routing and Scheduling of Time-constrained Network-on-Chip Communication Sander Stuijk, Twan Basten, Marc Geilen, Amir Hossein Ghamarian and Bart Theelen Eindhoven University of Technology,

More information

Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.

Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable

More information

Process-Variation Aware Mapping of Real-Time Streaming Applications to MPSoCs for Improved Yield

Process-Variation Aware Mapping of Real-Time Streaming Applications to MPSoCs for Improved Yield Process-Variation Aware Mapping of Real-Time Streaming Applications to MPSoCs for Improved Yield Davit Mirzoyan 1, Benny Akesson 2, Kees Goossens 2 1 Delft University of Technology, 2 Eindhoven University

More information

Dynamic Dataflow Graphs

Dynamic Dataflow Graphs Dynamic Dataflow Graphs Bart D. Theelen, Ed F. Deprettere, and Shuvra S. Bhattacharyya Abstract Much of the work to date on dataflow models for signal processing system design has focused on decidable

More information

kickoff 15 oct 2004 Project Overview Henk Corporaal

kickoff 15 oct 2004 Project Overview Henk Corporaal PreMaDoNA kickoff 15 oct 2004 Project Overview Henk Corporaal Agenda 15.00 Opening and Overview 15.30 Implementation and Demonstrator 15.40 Project Management 15.55 Application track 16.05 Simulation track

More information

Functional modeling style for efficient SW code generation of video codec applications

Functional modeling style for efficient SW code generation of video codec applications Functional modeling style for efficient SW code generation of video codec applications Sang-Il Han 1)2) Soo-Ik Chae 1) Ahmed. A. Jerraya 2) SD Group 1) SLS Group 2) Seoul National Univ., Korea TIMA laboratory,

More information

Energy-Aware Dynamic Reconfiguration of Communication-Centric Applications for Reliable MPSoCs

Energy-Aware Dynamic Reconfiguration of Communication-Centric Applications for Reliable MPSoCs Energy-Aware Dynamic Reconfiguration of Communication-Centric for Reliable SoCs Anup Das, Amit Kumar Singh and Akash Kumar Department of Electrical and Computer Engineering National University of Singapore

More information

Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms

Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms Yuankai Chen and Hai Zhou Electrical Engineering and Computer Science, Northwestern University, U.S.A. Abstract With the increasing

More information

Compositionality in system design: interfaces everywhere! UC Berkeley

Compositionality in system design: interfaces everywhere! UC Berkeley Compositionality in system design: interfaces everywhere! Stavros Tripakis UC Berkeley DREAMS Seminar, Mar 2013 Computers as parts of cyber physical systems cyber-physical ~98% of the world s processors

More information

A New Hybrid Algorithm for the Multiple-Choice Multi-Dimensional Knapsack Problem

A New Hybrid Algorithm for the Multiple-Choice Multi-Dimensional Knapsack Problem A New Hybrid Algorithm for the Multiple-Choice Multi-Dimensional Knapsack Problem MAHMOUD ZENNAKI Computer science department, faculty of mathematics and computer science University of Science and Technology

More information

Profiling Driven Scenario Detection and Prediction for Multimedia Applications

Profiling Driven Scenario Detection and Prediction for Multimedia Applications Profiling Driven Scenario Detection and Prediction for Multimedia Applications Stefan Valentin Gheorghita, Twan Basten and Henk Corporaal EE Department, Electronic Systems Group Eindhoven University of

More information

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Dataflow Lecture: SDF, Kahn Process Networks Stavros Tripakis University of California, Berkeley Stavros Tripakis: EECS

More information

Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation

Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Hamid Shojaei, and Azadeh Davoodi University of Wisconsin 1415 Engineering Drive, Madison WI 53706 Email: {shojaei,

More information

Resource-Efficient Routing and Scheduling of Time-Constrained Streaming Communication on Networks-on-Chip

Resource-Efficient Routing and Scheduling of Time-Constrained Streaming Communication on Networks-on-Chip Resource-Efficient Routing and Scheduling of Time-Constrained Streaming Communication on Networks-on-Chip Sander Stuijk, Twan Basten, Marc Geilen, Amir Hossein Ghamarian and Bart Theelen Eindhoven University

More information

Throughput-optimizing Compilation of Dataflow Applications for Multi-Cores using Quasi-Static Scheduling

Throughput-optimizing Compilation of Dataflow Applications for Multi-Cores using Quasi-Static Scheduling Throughput-optimizing Compilation of Dataflow Applications for Multi-Cores using Quasi-Static Scheduling Tobias Schwarzer 1, Joachim Falk 1, Michael Glaß 1, Jürgen Teich 1, Christian Zebelein 2, Christian

More information

A Accelerating Throughput-aware Run-time Mapping for Heterogeneous MPSoCs

A Accelerating Throughput-aware Run-time Mapping for Heterogeneous MPSoCs A Accelerating Throughput-aware Run-time Mapping for Heterogeneous MPSoCs AMIT KUMAR SINGH, Nanyang Technological University and National University of Singapore AKASH KUMAR, National University of Singapore

More information

Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs DREAM seminar

Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs DREAM seminar Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs DREAM seminar Mickaël Dardaillon Research Intern with NOKIA Technologies January 27th, 2015 2 / 33 What we know

More information

trend: embedded systems Composable Timing and Energy in CompSOC trend: multiple applications on one device problem: design time 3 composability

trend: embedded systems Composable Timing and Energy in CompSOC trend: multiple applications on one device problem: design time 3 composability Eindhoven University of Technology This research is supported by EU grants T-CTEST, Cobra and NL grant NEST. Parts of the platform were developed in COMCAS, Scalopes, TSA, NEVA,

More information

Fundamental Algorithms for System Modeling, Analysis, and Optimization

Fundamental Algorithms for System Modeling, Analysis, and Optimization Fundamental Algorithms for System Modeling, Analysis, and Optimization Stavros Tripakis, Edward A. Lee UC Berkeley EECS 144/244 Fall 2014 Copyright 2014, E. A. Lee, J. Roydhowdhury, S. A. Seshia, S. Tripakis

More information

Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs

Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs J Sign Process Syst (2017) 87:157 175 DOI 10.1007/s11265-016-1193-7 Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs Marc Geilen 1 Joachim Falk 2 Christian Haubelt 3 Twan Basten

More information

Pareto Efficient Design for Reconfigurable Streaming Applications on CPU/FPGAs

Pareto Efficient Design for Reconfigurable Streaming Applications on CPU/FPGAs Pareto Efficient Design for Reconfigurable Streaming Applications on CPU/FPGAs Jun Zhu, Ingo Sander, Axel Jantsch Royal Institute of Technology, Stockholm, Sweden {junz, ingo, axel}@kth.se Abstract We

More information

Resource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems

Resource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems Resource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems Insup Lee 1, Oleg Sokolsky 1, Anna Philippou 2 1 RTG (Real-Time Systems Group) Department of

More information

Pareto Optimal Scheduling for Synchronous Data Flow Graphs on Heterogeneous Multiprocessor

Pareto Optimal Scheduling for Synchronous Data Flow Graphs on Heterogeneous Multiprocessor Pareto Optimal Scheduling for Synchronous Data Flow Graphs on Heterogeneous Multiprocessor Yu-Lei Gu, Xue-Yang Zhu, Guangquan Zhang, Yifan He State Key Laboratory of Computer Science, Institute of Software,

More information

Efficient Static Buffering to Guarantee Throughput-Optimal FPGA Implementation of Synchronous Dataflow Graphs

Efficient Static Buffering to Guarantee Throughput-Optimal FPGA Implementation of Synchronous Dataflow Graphs In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, Simulation, pages 136-143, 2010. Efficient Static Buffering to Guarantee hroughput-optimal FPGA Implementation

More information

Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system

Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system A.J.M. Moonen Information and Communication Systems Department of Electrical Engineering Eindhoven University

More information

Mapping C code on MPSoC for Nomadic Embedded Systems

Mapping C code on MPSoC for Nomadic Embedded Systems -1 - ARTIST2 Summer School 2008 in Europe Autrans (near Grenoble), France September 8-12, 8 2008 Mapping C code on MPSoC for Nomadic Embedded Systems http://www.artist-embedded.org/ Lecturer: Diederik

More information

Hierarchical FSMs with Multiple CMs

Hierarchical FSMs with Multiple CMs Hierarchical FSMs with Multiple CMs Manaloor Govindarajan Balasubramanian Manikantan Bharathwaj Muthuswamy (aka Bharath) Reference: Hierarchical FSMs with Multiple Concurrency Models. Alain Girault, Bilung

More information

Dynamic Dataflow. Seminar on embedded systems

Dynamic Dataflow. Seminar on embedded systems Dynamic Dataflow Seminar on embedded systems Dataflow Dataflow programming, Dataflow architecture Dataflow Models of Computation Computation is divided into nodes that can be executed concurrently Dataflow

More information

Extensions of Daedalus Todor Stefanov

Extensions of Daedalus Todor Stefanov Extensions of Daedalus Todor Stefanov Leiden Embedded Research Center, Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Overview of Extensions in Daedalus DSE limited to

More information

Task-FIFO Co-Scheduling of Streaming Applications on MPSoCs with Predictable Memory Hierarchy

Task-FIFO Co-Scheduling of Streaming Applications on MPSoCs with Predictable Memory Hierarchy Task-FIFO Co-Scheduling of Streaming Applications on MPSoCs with Predictable Memory Hierarchy Qi Tang, Twan Basten, Marc Geilen, Sander Stuijk and Ji-Bo Wei Department of Electronic Science and Engineering

More information

ENERGY MODELS FOR NETWORK-ON-CHIP COMPONENTS

ENERGY MODELS FOR NETWORK-ON-CHIP COMPONENTS TECHNISCHE UNIVERSITEIT EINDHOVEN Department of Mathematics and Computer Science ENERGY MODELS FOR NETWORK-ON-CHIP COMPONENTS By Shubha Bhat Supervisors: Dr. Ir. Twan Basten Dr. Ir. Marc Geilen Ir. Sander

More information

Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms

Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms Małgorzata Michalska, Endri Bezati, Simone Casale-Brunet, Marco Mattavelli EPFL

More information

Cover Page. The handle holds various files of this Leiden University dissertation

Cover Page. The handle   holds various files of this Leiden University dissertation Cover Page The handle http://hdl.handle.net/1887/32963 holds various files of this Leiden University dissertation Author: Zhai, Jiali Teddy Title: Adaptive streaming applications : analysis and implementation

More information

Multicore DSP Software Synthesis using Partial Expansion of Dataflow Graphs

Multicore DSP Software Synthesis using Partial Expansion of Dataflow Graphs Multicore DSP Software Synthesis using Partial Expansion of Dataflow Graphs George F. Zaki, William Plishker, Shuvra S. Bhattacharyya University of Maryland, College Park, MD, USA & Frank Fruth Texas Instruments

More information

Computational Process Networks a model and framework for high-throughput signal processing

Computational Process Networks a model and framework for high-throughput signal processing Computational Process Networks a model and framework for high-throughput signal processing Gregory E. Allen Ph.D. Defense 25 April 2011 Committee Members: James C. Browne Craig M. Chase Brian L. Evans

More information

Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL Multi-Standard Decoder Use-Case

Automated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL Multi-Standard Decoder Use-Case XIV International Conference on Embedded Computer and Systems: Architectures, MOdeling and Simulation SAMOS XIV - 2014 July 14 th - Samos Island (Greece) Carlo Sau, Luigi Raffo DIEE Università degli Studi

More information

Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling

Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling Morteza Damavandpeyma 1, Sander Stuijk 1, Twan Basten 1,2, Marc Geilen 1 and Henk Corporaal 1 1 Department of Electrical Engineering,

More information

Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip

Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip Akash Kumar, Bart Mesman, Bart Theelen and Henk Corporaal Eindhoven University of Technology 5600MB Eindhoven, The Netherlands

More information

PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming

PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming Maxime Pelcat, Karol Desnos, Julien Heulot, Clément Guy, Jean François Nezan, Slaheddine Aridhi To cite this

More information

STATIC SCHEDULING FOR CYCLO STATIC DATA FLOW GRAPHS

STATIC SCHEDULING FOR CYCLO STATIC DATA FLOW GRAPHS STATIC SCHEDULING FOR CYCLO STATIC DATA FLOW GRAPHS Sukumar Reddy Anapalli Krishna Chaithanya Chakilam Timothy W. O Neil Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science The

More information

SDL. Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 年 10 月 18 日. technische universität dortmund

SDL. Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 年 10 月 18 日. technische universität dortmund 12 SDL Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 2017 年 10 月 18 日 Springer, 2010 These slides use Microsoft clip arts. Microsoft copyright restrictions apply. Models

More information
