Reliable Dynamic Embedded Data Processing Systems
|
|
- Tyler Wiggins
- 5 years ago
- Views:
Transcription
1 2 Embedded Data Processing Systems Reliable Dynamic Embedded Data Processing Systems sony Twan Basten thales Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, Yang Yang and others Funding: NWO PROMES EC FP6 Betsy, FP7 MNEMEE SenterNovem Octopus 3 sony Embedded Data Processing Systems asml océ Knowing is not understanding. Charles Kettering apple 4 Embedded Data Processing Systems philips 2
2 5 Trend: Dynamic Behavior 6 Multiprocessor systems / networking / concurrency Application convergence / application evolution Intra-application dynamism Dataflow analysis What is the optimal configuration? - given applications starting/stopping over time - given data-dependent execution times - given varying communication bandwidth - given battery status - Intra-application dynamism Inter-application dynamism 7 Dataflow Analysis: Challenges 8 Synchronous Data Flow Graphs (SDFGs [Lee 986]) predictability and efficiency are contradictory rate token actor channel execution time A, B,2 C,2
3 9 Scenario-aware MPEG-4 SP Decoder Model 0 (FSM-based Scenario-Aware DataFlow (SADF)) Control channel Fixed rate Kernel Data channel VLD d d d IDCT Parameterized rate a c c b e FD MC RC Detector... Tokens 3 x={30,40,50,60,70,80,99} Rate I P x P 0 VLD IDCT MC Execution time P 0 0 others 40 P 0 0 others 7 I, P 0 0 P P P P P P P I 350 Throughput analysis for SDF I P 99 P 99 a 0 0 b 0 x 0 RC P 0 0 P 30, P 40, P c 99 x P P 0 MEMOCODE 2006 d 0 e 99 x 0 P 70, P 80, P FD All 0 ACSD 2006 Synchronous Data Flow Graphs 2 SDF Throughput Analysis (SDFGs [Lee 986]) rate token actor channel execution time A, B,2 C,2 Self-timed execution: A Transient Phase A, C B A, C A Periodic Phase B B A B Throughput: average number of actor firings over time Iteration: smallest non-empty set of actor firings that does not change the token distribution (example: A:3, B:2, C:) Throughput can be calculated from the periodic phase Efficient implementation Consider one designated firing per iteration only to detect a recurrent state
4 3 Throughput Analysis: Our Result 4 Traditional methods State Space Dasdan Gupta Howard Young Tarjan Orlin MP3 dec Modem Storage-throughput trade-offs Sample Rate >800 >800 >800 Satellite >800 >800 >800 H.263 decoder >800 >800 >800 Runtimes of various throughput analysis methods (in seconds) IEEE T Comp Lessons to be Learned Repeated application of throughput analysis Causal dependencies potentially reducing performance 6 Trade-off: Storage vs. Throughput 2 α 3 β 2 A, B,2 C,2 Tokens on channels must be stored in memory. Separate memory for each channel. Storage distribution, for example, α,β, 4,2
5 7 Trade-off: Storage vs. Throughput put through 2 α 3 β 2 A, B,2 C, ,2 5,3 6,3 6,2 7, distribution size Find all minimal storage distributions for any possible throughput 8 Dependency Graph firings in one period sp - space tk - token B 0 2 α 3 β 2 A, B,2 C,2 sp(β) Storage distribution α,β 4,2 sp(α) tk(α) tk(β) C 0 A 2 tk(α) B sp(α) A 3 A tk(a) 4 dependency: an actor firing start may depend on an actor firing end cycles denote storage dependencies 9 Dependency Graph 20 Abstract Dependency Graph resolving storage dependencies may increase throughput firings in one period sp(β) tk(β) C 0 dependency: an actor firing start may depend on an actor firing end dependencies between actors A B C tk(α) tk(β) tk(a) sp(α) sp(β) tk(β) sp(β) C 0 abstract dependency graph contains all storage dependencies easily constructed from state-space exploration B 0 sp(α) A 2 tk(α) B B 0 sp(α) A 2 tk(α) B sp - space tk - token tk(α) sp(α) A 3 A tk(a) 4 cycles denote storage dependencies dependency graph may be very large tk(α) sp(α) A 3 A tk(a) 4
6 2 Experimental Results 22 Trade-off Analysis Example Satellite MP3 H.263 #actors #channels min throughput > 0 /7 0.8 /37 /2876 Start small, recursively resolve bottlenecks Fast in general, approximation possible Generalizable to other resources, computational models Distr. size Max throughput Approximation / /32 /0000 Distr. size increase 0buffers with 544 n times the minimal step size #Pareto points #Distributions Exec. time ms 7ms 2ms 53min Scenario-aware Throughput Analysis SDF throughput analysis considers worst-case actor execution times Scenarios Scenario-aware throughput analysis considers worst-case execution times per scenario, but it must consider scenario transitions Analyzing all scenario transitions separately can be avoided Separate scenario transitions with an invariant reference schedule ACM TECS 200
7 MP3 Result 26 Pointers SDF model: 9.28 Mcycles per frame SADF model: 5.83 Mcycles per frame Yang Yang: shared resources, trade-off analysis Sander Stuijk: SDF3 toolset, Future: shared resources + scenarios + trade-off analysis L-L L-S S-L S-S M Run-time Application Management - Applications starting/stopping over time Inter-application dynamism - 6 procs, slow (ck) and fast clocks (ck2) - Minimize energy under time constraints Energy (3,ck2) (2,ck2) (,ck2) (9,ck2) (8,ck2) (7,ck2) Run-time adaptation (3,ck2) (2,ck2) Const (,ck2) (5,ck) (4,ck) (3,ck) (6,ck2) (5,ck) (4,ck) (3,ck) (6,ck2) (5,ck2) (4,ck2) JCM Completion time (0,ck) (9,ck) (3,ck2) (2,ck2) (,ck2) (8,ck) Gantt chart (5,ck) DAC 2009 (4,ck) (3,ck) Join Const Min enforce the same clock
8 29 Pareto Algebra 30 Pareto-Algebraic Characterization of run-time adaptation Goal: Compositional computation of trade-offs product - derive constrain abstract - minimize component trade-o off spaces adapt select and configure system trade-off space (at the time of a change) 3 Pareto-Algebraic Characterization of run-time adaptation 32 Complexity n components with p Pareto points each components product - derive constrain abstract - minimize select and configure X derive constrain abstract min component trade-o off spaces take a product and on-the-fly derive constrain abstract - minimize adapt O(p 2n ) system trade-off space (at the time of a change)
9 33 Complexity n components with p Pareto points each take a product and on-the-fly derive constrain abstract - minimize 34 Complexity Control keep n and p small Compositional reasoning, among others Approximate Pareto sets O(p 2n ) y/x = c 2 components X constrain derive-abstract min y = y/x = c 4 x = 35 Multi-dimensional Multiple-choice Knapsack Problem 36 A Parameterized Compositional Heuristic MMKP NP hard - One optimization objective (value) - Multiple resource dimensions with capacity constraints - Multiple independent applications - Multiple independent configurations per application Pick one configuration per application optimizing value within constraints Project all resource dimensions into one dimension Approximate Pareto set in 2-dimensional space y/x = c 2 y/x = c 4 y = x = product-constrain-derive-abstract-minimize
10 37 A Parameterized Compositional Heuristic Project all resource dimensions into one dimension Approximate Pareto set in 2-dimensional space parameter p: maintain (at most) p Pareto points Allows to budget analysis time analysis time c n p 2 (n number of applications; c platform constant) product-constrain-derive-abstract-minimize 38 par p time Controlling Time Budgets budgets T=ms t=5ms T=20ms t=00ms values I I2 I3 I4 I5 I6 MMKP benchmark: Always within budget Good values Similar results to the IMEC, SOC 2006, non-compositional state-of-the-art heuristic SimIt-ARM simulator 206 Mhz StrongARM 39 Compositionality 40 Conclusions time(ms) time(ms) Applications start at various times 5 at a time (a), at a time (b) IMEC values 2353 Pareto 3224 what about applications stopping? # of active applications values always at least as good as IMEC heuristic # of active applications Goal: reliable, resource efficient data processing systems Dynamic behavior complicates analysis Intra-application dynamism: scenario-aware dataflow analysis and design-space exploration Inter-application dynamism: Pareto-algebra-based run-time adaptation
11 4 Thank you! Questions? More info: SDF3 toolkit: Pareto calculator: An understanding of the natural world and what's in it is a source of not only a great curiosity but great fulfillment. David Attenborough
Reliable Embedded Multimedia Systems?
2 Overview Reliable Embedded Multimedia Systems? Twan Basten Joint work with Marc Geilen, AmirHossein Ghamarian, Hamid Shojaei, Sander Stuijk, Bart Theelen, and others Embedded Multi-media Analysis of
More informationPareto Algebra. Reliable Run-time Adaptation in Resource-constrained Embedded Systems. Twan Basten
Reliable Run-time Adaptation in Resource-constrained Embedded Systems 2 Run-time Adaptation Encoding qualities Bandwidth requirements Decoding streams of different quality Computational effort required
More informationComputational Models for Concurrent Streaming Applications
2 Computational Models for Concurrent Streaming Applications The challenges of today Twan Basten Based on joint work with Marc Geilen, Sander Stuijk, and many others Department of Electrical Engineering
More informationModelling, Analysis and Scheduling with Dataflow Models
technische universiteit eindhoven Modelling, Analysis and Scheduling with Dataflow Models Marc Geilen, Bart Theelen, Twan Basten, Sander Stuijk, AmirHossein Ghamarian, Jeroen Voeten Eindhoven University
More informationA Parameterized Compositional Multi-dimensional Multiple-choice Knapsack Heuristic for CMP Run-time Management
A Parameterized Compositional Multi-dimensional Multiple-choice Knapsack Heuristic for CMP Run-time Management Hamid Shojaei 1,2, AmirHossein Ghamarian 2, Twan Basten 2,3, Marc Geilen 2, Sander Stuijk
More informationOverview of Dataflow Languages. Waheed Ahmad
Overview of Dataflow Languages Waheed Ahmad w.ahmad@utwente.nl The purpose of models is not to fit the data but to sharpen the questions. Samuel Karlins 11 th R.A Fisher Memorial Lecture Royal Society
More informationQoS Trade-off Analysis for Wireless Sensor Networks
QoS Trade-off Analysis for Wireless Sensor Networks Rob Hoes, Twan Basten Joint work with Phillip Stanley-Marbell, Marc Geilen, Chen Kong Tham, Henk Corporaal Department of Electrical Engineering Electronic
More informationHETEROGENEOUS MULTIPROCESSOR MAPPING FOR REAL-TIME STREAMING SYSTEMS
HETEROGENEOUS MULTIPROCESSOR MAPPING FOR REAL-TIME STREAMING SYSTEMS Jing Lin, Akshaya Srivasta, Prof. Andreas Gerstlauer, and Prof. Brian L. Evans Department of Electrical and Computer Engineering The
More informationSymbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs
Symbolic Buffer Sizing for Throughput-Optimal Scheduling of Dataflow Graphs Anan Bouakaz Pascal Fradet Alain Girault Real-Time and Embedded Technology and Applications Symposium, Vienna April 14th, 2016
More informationTowards Translating FSM-SADF to Timed Automata
Towards Translating FSM-SADF to Timed Automata Mladen Skelin Department of Engineering Cybernetics, Norwegian University of Science and Technology mladen.skelin@itk.ntnu.no Erik Ramsgaard Wognsen, Mads
More informationSynchronous dataflow scenarios
Synchronous dataflow scenarios Geilen, M.C.W. Published in: ACM Transactions on Embedded Computing Systems DOI: 10.1145/1880050.1880052 Published: 01/01/2010 Document Version Accepted manuscript including
More informationMULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA
MULTI-PROCESSOR SYSTEM-LEVEL SYNTHESIS FOR MULTIPLE APPLICATIONS ON PLATFORM FPGA Akash Kumar,, Shakith Fernando, Yajun Ha, Bart Mesman and Henk Corporaal Eindhoven University of Technology, Eindhoven,
More informationPerformance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs
Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs Marc Geilen 1, Joachim Falk 2, Christian Haubelt 3, Twan Basten 1,4, Bart Theelen 4 and Sander Stuijk 1 1 Electrical Engineering
More informationMain application of SDF: DSP hardware modeling
EE 144/244: Fundamental lgorithms for System Modeling, nalysis, and Optimization Fall 2014 Dataflow Timed SDF, Throughput nalysis Stavros Tripakis University of California, erkeley Stavros Tripakis (UC
More informationWorst-case Performance Analysis of Synchronous Dataflow Scenarios
Worst-case Performance Analysis of Synchronous Dataflow Scenarios ABSTRACT Marc Geilen Eindhoven University of Technology P.O. Box 5 Den Dolech Eindhoven, The Netherlands m.c.w.geilen@tue.nl Synchronous
More informationPredictable Mapping of Streaming Applications on Multiprocessors
Predictable Mapping of Streaming Applications on Multiprocessors PROEFSCHRIFT ter verkrijging van de graad van doctor aan de Technische Universiteit Eindhoven, op gezag van de Rector Magnificus prof.dr.ir.
More informationTight Temporal Bounds for Dataflow Applications Mapped onto Shared Resources
Tight Temporal Bounds for Dataflow Applications Mapped onto Shared Resources Hadi Alizadeh Ara, Marc Geilen, Twan Basten, Amir Behrouzian, Martijn Hendriks and Dip Goswami Eindhoven University of Technology,
More informationResource-efficient Routing and Scheduling of Time-constrained Network-on-Chip Communication
Resource-efficient Routing and Scheduling of Time-constrained Network-on-Chip Communication Sander Stuijk, Twan Basten, Marc Geilen, Amir Hossein Ghamarian and Bart Theelen Eindhoven University of Technology,
More informationMapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.
Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable
More informationProcess-Variation Aware Mapping of Real-Time Streaming Applications to MPSoCs for Improved Yield
Process-Variation Aware Mapping of Real-Time Streaming Applications to MPSoCs for Improved Yield Davit Mirzoyan 1, Benny Akesson 2, Kees Goossens 2 1 Delft University of Technology, 2 Eindhoven University
More informationDynamic Dataflow Graphs
Dynamic Dataflow Graphs Bart D. Theelen, Ed F. Deprettere, and Shuvra S. Bhattacharyya Abstract Much of the work to date on dataflow models for signal processing system design has focused on decidable
More informationkickoff 15 oct 2004 Project Overview Henk Corporaal
PreMaDoNA kickoff 15 oct 2004 Project Overview Henk Corporaal Agenda 15.00 Opening and Overview 15.30 Implementation and Demonstrator 15.40 Project Management 15.55 Application track 16.05 Simulation track
More informationFunctional modeling style for efficient SW code generation of video codec applications
Functional modeling style for efficient SW code generation of video codec applications Sang-Il Han 1)2) Soo-Ik Chae 1) Ahmed. A. Jerraya 2) SD Group 1) SLS Group 2) Seoul National Univ., Korea TIMA laboratory,
More informationEnergy-Aware Dynamic Reconfiguration of Communication-Centric Applications for Reliable MPSoCs
Energy-Aware Dynamic Reconfiguration of Communication-Centric for Reliable SoCs Anup Das, Amit Kumar Singh and Akash Kumar Department of Electrical and Computer Engineering National University of Singapore
More informationBuffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms
Buffer Minimization in Pipelined SDF Scheduling on Multi-Core Platforms Yuankai Chen and Hai Zhou Electrical Engineering and Computer Science, Northwestern University, U.S.A. Abstract With the increasing
More informationCompositionality in system design: interfaces everywhere! UC Berkeley
Compositionality in system design: interfaces everywhere! Stavros Tripakis UC Berkeley DREAMS Seminar, Mar 2013 Computers as parts of cyber physical systems cyber-physical ~98% of the world s processors
More informationA New Hybrid Algorithm for the Multiple-Choice Multi-Dimensional Knapsack Problem
A New Hybrid Algorithm for the Multiple-Choice Multi-Dimensional Knapsack Problem MAHMOUD ZENNAKI Computer science department, faculty of mathematics and computer science University of Science and Technology
More informationProfiling Driven Scenario Detection and Prediction for Multimedia Applications
Profiling Driven Scenario Detection and Prediction for Multimedia Applications Stefan Valentin Gheorghita, Twan Basten and Henk Corporaal EE Department, Electronic Systems Group Eindhoven University of
More informationEECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization
EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Dataflow Lecture: SDF, Kahn Process Networks Stavros Tripakis University of California, Berkeley Stavros Tripakis: EECS
More informationTrace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation
Trace Signal Selection to Enhance Timing and Logic Visibility in Post-Silicon Validation Hamid Shojaei, and Azadeh Davoodi University of Wisconsin 1415 Engineering Drive, Madison WI 53706 Email: {shojaei,
More informationResource-Efficient Routing and Scheduling of Time-Constrained Streaming Communication on Networks-on-Chip
Resource-Efficient Routing and Scheduling of Time-Constrained Streaming Communication on Networks-on-Chip Sander Stuijk, Twan Basten, Marc Geilen, Amir Hossein Ghamarian and Bart Theelen Eindhoven University
More informationThroughput-optimizing Compilation of Dataflow Applications for Multi-Cores using Quasi-Static Scheduling
Throughput-optimizing Compilation of Dataflow Applications for Multi-Cores using Quasi-Static Scheduling Tobias Schwarzer 1, Joachim Falk 1, Michael Glaß 1, Jürgen Teich 1, Christian Zebelein 2, Christian
More informationA Accelerating Throughput-aware Run-time Mapping for Heterogeneous MPSoCs
A Accelerating Throughput-aware Run-time Mapping for Heterogeneous MPSoCs AMIT KUMAR SINGH, Nanyang Technological University and National University of Singapore AKASH KUMAR, National University of Singapore
More informationCompilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs DREAM seminar
Compilation of Parametric Dataflow Applications for Software-Defined-Radio-Dedicated MPSoCs DREAM seminar Mickaël Dardaillon Research Intern with NOKIA Technologies January 27th, 2015 2 / 33 What we know
More informationtrend: embedded systems Composable Timing and Energy in CompSOC trend: multiple applications on one device problem: design time 3 composability
Eindhoven University of Technology This research is supported by EU grants T-CTEST, Cobra and NL grant NEST. Parts of the platform were developed in COMCAS, Scalopes, TSA, NEVA,
More informationFundamental Algorithms for System Modeling, Analysis, and Optimization
Fundamental Algorithms for System Modeling, Analysis, and Optimization Stavros Tripakis, Edward A. Lee UC Berkeley EECS 144/244 Fall 2014 Copyright 2014, E. A. Lee, J. Roydhowdhury, S. A. Seshia, S. Tripakis
More informationPerformance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs
J Sign Process Syst (2017) 87:157 175 DOI 10.1007/s11265-016-1193-7 Performance Analysis of Weakly-Consistent Scenario-Aware Dataflow Graphs Marc Geilen 1 Joachim Falk 2 Christian Haubelt 3 Twan Basten
More informationPareto Efficient Design for Reconfigurable Streaming Applications on CPU/FPGAs
Pareto Efficient Design for Reconfigurable Streaming Applications on CPU/FPGAs Jun Zhu, Ingo Sander, Axel Jantsch Royal Institute of Technology, Stockholm, Sweden {junz, ingo, axel}@kth.se Abstract We
More informationResource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems
Resource-bound process algebras for Schedulability and Performance Analysis of Real-Time and Embedded Systems Insup Lee 1, Oleg Sokolsky 1, Anna Philippou 2 1 RTG (Real-Time Systems Group) Department of
More informationPareto Optimal Scheduling for Synchronous Data Flow Graphs on Heterogeneous Multiprocessor
Pareto Optimal Scheduling for Synchronous Data Flow Graphs on Heterogeneous Multiprocessor Yu-Lei Gu, Xue-Yang Zhu, Guangquan Zhang, Yifan He State Key Laboratory of Computer Science, Institute of Software,
More informationEfficient Static Buffering to Guarantee Throughput-Optimal FPGA Implementation of Synchronous Dataflow Graphs
In Proceedings of the International Conference on Embedded Computer Systems: Architectures, Modeling, Simulation, pages 136-143, 2010. Efficient Static Buffering to Guarantee hroughput-optimal FPGA Implementation
More informationModelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system
Modelling and simulation of guaranteed throughput channels of a hard real-time multiprocessor system A.J.M. Moonen Information and Communication Systems Department of Electrical Engineering Eindhoven University
More informationMapping C code on MPSoC for Nomadic Embedded Systems
-1 - ARTIST2 Summer School 2008 in Europe Autrans (near Grenoble), France September 8-12, 8 2008 Mapping C code on MPSoC for Nomadic Embedded Systems http://www.artist-embedded.org/ Lecturer: Diederik
More informationHierarchical FSMs with Multiple CMs
Hierarchical FSMs with Multiple CMs Manaloor Govindarajan Balasubramanian Manikantan Bharathwaj Muthuswamy (aka Bharath) Reference: Hierarchical FSMs with Multiple Concurrency Models. Alain Girault, Bilung
More informationDynamic Dataflow. Seminar on embedded systems
Dynamic Dataflow Seminar on embedded systems Dataflow Dataflow programming, Dataflow architecture Dataflow Models of Computation Computation is divided into nodes that can be executed concurrently Dataflow
More informationExtensions of Daedalus Todor Stefanov
Extensions of Daedalus Todor Stefanov Leiden Embedded Research Center, Leiden Institute of Advanced Computer Science Leiden University, The Netherlands Overview of Extensions in Daedalus DSE limited to
More informationTask-FIFO Co-Scheduling of Streaming Applications on MPSoCs with Predictable Memory Hierarchy
Task-FIFO Co-Scheduling of Streaming Applications on MPSoCs with Predictable Memory Hierarchy Qi Tang, Twan Basten, Marc Geilen, Sander Stuijk and Ji-Bo Wei Department of Electronic Science and Engineering
More informationENERGY MODELS FOR NETWORK-ON-CHIP COMPONENTS
TECHNISCHE UNIVERSITEIT EINDHOVEN Department of Mathematics and Computer Science ENERGY MODELS FOR NETWORK-ON-CHIP COMPONENTS By Shubha Bhat Supervisors: Dr. Ir. Twan Basten Dr. Ir. Marc Geilen Ir. Sander
More informationBuffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms
Buffer Dimensioning for Throughput Improvement of Dynamic Dataflow Signal Processing Applications on Multi-Core Platforms Małgorzata Michalska, Endri Bezati, Simone Casale-Brunet, Marco Mattavelli EPFL
More informationCover Page. The handle holds various files of this Leiden University dissertation
Cover Page The handle http://hdl.handle.net/1887/32963 holds various files of this Leiden University dissertation Author: Zhai, Jiali Teddy Title: Adaptive streaming applications : analysis and implementation
More informationMulticore DSP Software Synthesis using Partial Expansion of Dataflow Graphs
Multicore DSP Software Synthesis using Partial Expansion of Dataflow Graphs George F. Zaki, William Plishker, Shuvra S. Bhattacharyya University of Maryland, College Park, MD, USA & Frank Fruth Texas Instruments
More informationComputational Process Networks a model and framework for high-throughput signal processing
Computational Process Networks a model and framework for high-throughput signal processing Gregory E. Allen Ph.D. Defense 25 April 2011 Committee Members: James C. Browne Craig M. Chase Brian L. Evans
More informationAutomated Design Flow for Coarse-Grained Reconfigurable Platforms: an RVC-CAL Multi-Standard Decoder Use-Case
XIV International Conference on Embedded Computer and Systems: Architectures, MOdeling and Simulation SAMOS XIV - 2014 July 14 th - Samos Island (Greece) Carlo Sau, Luigi Raffo DIEE Università degli Studi
More informationHybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling
Hybrid Code-Data Prefetch-Aware Multiprocessor Task Graph Scheduling Morteza Damavandpeyma 1, Sander Stuijk 1, Twan Basten 1,2, Marc Geilen 1 and Henk Corporaal 1 1 Department of Electrical Engineering,
More informationResource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip
Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip Akash Kumar, Bart Mesman, Bart Theelen and Henk Corporaal Eindhoven University of Technology 5600MB Eindhoven, The Netherlands
More informationPREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming
PREESM: A Dataflow-Based Rapid Prototyping Framework for Simplifying Multicore DSP Programming Maxime Pelcat, Karol Desnos, Julien Heulot, Clément Guy, Jean François Nezan, Slaheddine Aridhi To cite this
More informationSTATIC SCHEDULING FOR CYCLO STATIC DATA FLOW GRAPHS
STATIC SCHEDULING FOR CYCLO STATIC DATA FLOW GRAPHS Sukumar Reddy Anapalli Krishna Chaithanya Chakilam Timothy W. O Neil Dept. of Computer Science Dept. of Computer Science Dept. of Computer Science The
More informationSDL. Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 年 10 月 18 日. technische universität dortmund
12 SDL Jian-Jia Chen (slides are based on Peter Marwedel) TU Dortmund, Informatik 12 2017 年 10 月 18 日 Springer, 2010 These slides use Microsoft clip arts. Microsoft copyright restrictions apply. Models
More informationPerformance of Multihop Communications Using Logical Topologies on Optical Torus Networks
Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks X. Yuan, R. Melhem and R. Gupta Department of Computer Science University of Pittsburgh Pittsburgh, PA 156 fxyuan,
More informationSynthesizing Energy-Optimal Controllers for Multiprocessor Dataflow Applications with UPPAAL STRATEGO
Synthesizing Energy-Optimal Controllers for Multiprocessor Dataflow Applications with UPPAAL STRATEGO Waheed Ahmad and Jaco van de Pol University of Twente, The Netherlands, {w.ahmad, j.c.vandepol}@utwente.nl
More informationMapping-Aware Constrained Scheduling for LUT-Based FPGAs
Mapping-Aware Constrained Scheduling for LUT-Based FPGAs Mingxing Tan, Steve Dai, Udit Gupta, Zhiru Zhang School of Electrical and Computer Engineering Cornell University High-Level Synthesis (HLS) for
More informationStatic Dataflow with Access Patterns: Semantics and Analysis
Static Dataflow with Access Patterns: Semantics and Analysis Arkadeb Ghosal*, Rhishikesh Limaye*, Kaushik Ravindran*, Stavros Tripakis** Ankita Prasad*, Guoqiang Wang*, Trung N Tran*, Hugo Andrade* * National
More informationParallel Implementation of Arbitrary-Shaped MPEG-4 Decoder for Multiprocessor Systems
Parallel Implementation of Arbitrary-Shaped MPEG-4 oder for Multiprocessor Systems Milan Pastrnak *,a,c, Peter H.N. de With a,c, Sander Stuijk c and Jef van Meerbergen b,c a LogicaCMG Nederland B.V., RTSE
More informationMinimising Buffer Requirements of Synchronous Dataflow Graphs with Model Checking
Minimising Buffer Requirements of Synchronous Dataflow Graphs with Model Checking Marc Geilen, Twan Basten and Sander Stuijk Eindhoven University of Technology, Department of Electrical Engineering {m.c.w.geilen,a.a.basten,s.stuijk}@tue.nl
More informationEvaluating Inter-cluster Communication in Clustered VLIW Architectures
Evaluating Inter-cluster Communication in Clustered VLIW Architectures Anup Gangwar Embedded Systems Group, Department of Computer Science and Engineering, Indian Institute of Technology Delhi September
More informationPreliminary design and validation of a modular framework for predictable composition of medical imaging applications
Preliminary design and validation of a modular framework for predictable composition of medical imaging applications 7 th July 2015 Martijn van den Heuvel S.C. Cracana H. Salunkhe J.J. Lukkien A. Lele
More informationFSMs & message passing: SDL
12 FSMs & message passing: SDL Peter Marwedel TU Dortmund, Informatik 12 Springer, 2010 2012 年 10 月 30 日 These slides use Microsoft clip arts. Microsoft copyright restrictions apply. Models of computation
More informationUnit 2: High-Level Synthesis
Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis
More information15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University
15-740/18-740 Computer Architecture Lecture 20: Main Memory II Prof. Onur Mutlu Carnegie Mellon University Today SRAM vs. DRAM Interleaving/Banking DRAM Microarchitecture Memory controller Memory buses
More informationR-Storm: A Resource-Aware Scheduler for STORM. Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell
R-Storm: A Resource-Aware Scheduler for STORM Mohammad Hosseini Boyang Peng Zhihao Hong Reza Farivar Roy Campbell Introduction STORM is an open source distributed real-time data stream processing system
More informationOptimal Architectures for Massively Parallel Implementation of Hard. Real-time Beamformers
Optimal Architectures for Massively Parallel Implementation of Hard Real-time Beamformers Final Report Thomas Holme and Karen P. Watkins 8 May 1998 EE 382C Embedded Software Systems Prof. Brian Evans 1
More informationECE 5775 (Fall 17) High-Level Digital Design Automation. More Binding Pipelining
ECE 5775 (Fall 17) High-Level Digital Design Automation More Binding Pipelining Logistics Lab 3 due Friday 10/6 No late penalty for this assignment (up to 3 days late) HW 2 will be posted tomorrow 1 Agenda
More informationDesign Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures
Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures Cristina Silvano Politecnico di Milano cristina.silvano@polimi.it Outline Research challenges in multicore
More informationMultimedia Decoder Using the Nios II Processor
Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra
More informationProcess-Variation-Aware Mapping of Best-Effort and Real-Time Streaming Applications to MPSoCs
Process-Variation-Aware Mapping of Best-Effort and Real-Time Streaming Applications to MPSoCs DAVIT MIRZOYAN, Delft University of Technology BENNY AKESSON and KEES GOOSSENS, Eindhoven University of Technology
More informationA Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms
A Design Framework for Mapping Vectorized Synchronous Dataflow Graphs onto CPU-GPU Platforms Shuoxin Lin, Yanzhou Liu, William Plishker, Shuvra Bhattacharyya Maryland DSPCAD Research Group Department of
More informationA Transactional Model and Platform for Designing and Implementing Reactive Systems
A Transactional Model and Platform for Designing and Implementing Reactive Systems Justin R. Wilson A dissertation presented to the Graduate School of Arts and Sciences of Washington University in partial
More informationUni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing
Uni-Address Threads: Scalable Thread Management for RDMA-based Work Stealing Shigeki Akiyama, Kenjiro Taura The University of Tokyo June 17, 2015 HPDC 15 Lightweight Threads Lightweight threads enable
More informationA Multi-Modal Composability Framework for Cyber-Physical Systems
S5 Symposium June 12, 2012 A Multi-Modal Composability Framework for Cyber-Physical Systems Linh Thi Xuan Phan Insup Lee PRECISE Center University of Pennsylvania Avionics, Automotive Medical Devices Cyber-physical
More informationMAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES. Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015
MAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015 Video Codecs 70% of internet traffic will be video in 2018 [CISCO] Video
More informationData Storage Exploration and Bandwidth Analysis for Distributed MPEG-4 Decoding
Data Storage Exploration and Bandwidth Analysis for Distributed MPEG-4 oding Milan Pastrnak, Peter H. N. de With, Senior Member, IEEE Abstract The low bit-rate profiles of the MPEG-4 standard enable video-streaming
More informationReliable Distributed System Approaches
Reliable Distributed System Approaches Manuel Graber Seminar of Distributed Computing WS 03/04 The Papers The Process Group Approach to Reliable Distributed Computing K. Birman; Communications of the ACM,
More informationHomework 2 COP The total number of paths required to reach the global state is 20 edges.
Homework 2 COP 5611 Problem 1: 1.a Global state lattice 1. The total number of paths required to reach the global state is 20 edges. 2. In the global lattice each and every edge (downwards) leads to a
More informationLecture 9: Load Balancing & Resource Allocation
Lecture 9: Load Balancing & Resource Allocation Introduction Moler s law, Sullivan s theorem give upper bounds on the speed-up that can be achieved using multiple processors. But to get these need to efficiently
More informationA Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors
A Low Energy Clustered Instruction Memory Hierarchy for Long Instruction Word Processors Murali Jayapala 1, Francisco Barat 1, Pieter Op de Beeck 1, Francky Catthoor 2, Geert Deconinck 1 and Henk Corporaal
More informationSystem-on-Chip Architecture for Mobile Applications. Sabyasachi Dey
System-on-Chip Architecture for Mobile Applications Sabyasachi Dey Email: sabyasachi.dey@gmail.com Agenda What is Mobile Application Platform Challenges Key Architecture Focus Areas Conclusion Mobile Revolution
More informationA Many-Core Machine Model for Designing Algorithms with Minimum Parallelism Overheads
A Many-Core Machine Model for Designing Algorithms with Minimum Parallelism Overheads Sardar Anisul Haque Marc Moreno Maza Ning Xie University of Western Ontario, Canada IBM CASCON, November 4, 2014 ardar
More informationRate-Optimal Unfolding of Balanced Synchronous Data-Flow Graphs
Rate-Optimal Unfolding of Balanced Synchronous Data-Flow Graphs Timothy W. O Neil Dept. of Computer Science, The University of Akron, Akron OH 44325 USA toneil@uakron.edu Abstract Many common iterative
More informationProgramming Heterogeneous Embedded Systems for IoT
Programming Heterogeneous Embedded Systems for IoT Jeronimo Castrillon Chair for Compiler Construction TU Dresden jeronimo.castrillon@tu-dresden.de Get-together toward a sustainable collaboration in IoT
More informationEvolutionary Algorithm for Embedded System Topology Optimization. Supervisor: Prof. Dr. Martin Radetzki Author: Haowei Wang
Evolutionary Algorithm for Embedded System Topology Optimization Supervisor: Prof. Dr. Martin Radetzki Author: Haowei Wang Agenda Introduction to the problem Principle of evolutionary algorithm Model specification
More informationEvaluating Run-time Resource Management Policies for Multi-core Embedded Platforms with
Evaluating Run-time Resource Management Policies for Multi-core Embedded Platforms with the EMME Evaluation Framework Giovanni Mariani ALaRI - University of Lugano Lugano, Switzerland E-mail: giovanni.mariani@lu.unisi.ch
More informationEE382N.23: Embedded System Design and Modeling
EE38N.3: Embedded System Design and Modeling Lecture 5 Process-Based MoCs Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu Lecture 5: Outline Process-based
More informationComputer Engineering Mekelweg 4, 2628 CD Delft The Netherlands MSc THESIS
Computer Engineering Mekelweg 4, 2628 CD Delft The Netherlands http://ce.et.tudelft.nl/ 2012 MSc THESIS Dynamic loading and task migration for streaming applications on a composable system-on-chip Adriaan
More informationCaching video contents in IPTV systems with hierarchical architecture
Caching video contents in IPTV systems with hierarchical architecture Lydia Chen 1, Michela Meo 2 and Alessandra Scicchitano 1 1. IBM Zurich Research Lab email: {yic,als}@zurich.ibm.com 2. Politecnico
More informationModel Checking of Finite-state Machine-based Scenario-aware Dataflow Using Timed Automata
Model Checking of Finite-state Machine-based Scenario-aware Dataflow Using Timed Automata Mladen Skelin Department of Engineering Cybernetics, Norwegian University of Science and Technology mladen.skelin@itk.ntnu.no
More informationBest Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs.
Best Practices for Implementing ARM Cortex -A12 Processor and Mali TM -T6XX GPUs for Mid-Range Mobile SoCs. Cortex-A12: ARM-Cadence collaboration Joint team working on ARM Cortex -A12 irm flow irm content:
More informationDataflow Processing. A.R. Hurson Computer Science Department Missouri Science & Technology
A.R. Hurson Computer Science Department Missouri Science & Technology hurson@mst.edu 1 Control Flow Computation Operands are accessed by their addresses. Shared memory cells are the means by which data
More informationMode-Controlled Dataflow Modeling of Real-Time Memory Controllers
Mode-Controlled Dataflow Modeling of eal-time Memory Controllers Yonghui Li, Hrishikesh alunkhe, Joao Bastos, Orlando Moreira 2, Benny Akesson 3 and Kees Goossens Eindhoven University of Technology, the
More informationMultiprocessor Systems. Chapter 8, 8.1
Multiprocessor Systems Chapter 8, 8.1 1 Learning Outcomes An understanding of the structure and limits of multiprocessor hardware. An appreciation of approaches to operating system support for multiprocessor
More informationMAMPSx: A Design Framework for Rapid Synthesis of Predictable Heterogeneous MPSoCs
MAMPSx: A Design Framework for Rapid Synthesis of Predictable Heterogeneous MPSoCs Shakith Fernando, Firew Siyoum, Yifan He, Akash Kumar 2 and Henk Corporaal Department of Electrical Engineering, Eindhoven
More information