From Temporal Partitioning and Temporal Placement to Algorithmic Skeletons

1 From Temporal Partitioning and Temporal Placement to Algorithmic Skeletons
Florian Dittmann, Franz J. Rammig
Heinz Nixdorf Institute, University of Paderborn, Germany

2 Motivation
Making reconfigurable computing mature
- Industrialization
Capabilities
- Processing in parallel
- Runtime reconfiguration
- Partial reconfiguration
- Space and time
- Etc.
Abstraction
- Layers
- Beneficial methods
Anschlusskolloquium (follow-up colloquium), Lübeck, 24 and 25 May 2007

3 Overview
Motivation
Partitioning methods and their application
- Lessons learned
Layered approach
- Specification Graph Approach
- Reconfiguration Port Scheduling
- Algorithmic Skeletons
Cooperation
Part-E
Conclusion

4 Partitioning
Applying the spectral method on coarse-grained systems
- Mesh-based nearest-neighbor communication
- 2D topology
Mapping
- Data flow graphs
- Resource efficient
- Communication optimized
[Figure: processing elements (PE) arranged along x and y axes]

5 Partitioning: Temporal + Spatial Partitioning/Placement
Basis
- ASAP scheduling
- Spectral placement
Combination
- Focusing on one level
- Location of the nodes in the spectral placement
- Placement of the extracted nodes on PE
Benefit
- ASAP: precedence constraints
- Spectral method: overall closeness respected
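The ASAP half of this combination can be illustrated with a minimal Python sketch. It assigns every node of a data flow graph to its earliest possible level under the precedence constraints, so that each level can be treated as one temporal partition. The function names `asap_levels` and `partitions` and the example graph are hypothetical, and the spectral placement of the nodes within a level is not shown.

```python
from collections import defaultdict, deque


def asap_levels(nodes, edges):
    """Return {node: level}; nodes without predecessors get level 0.

    edges is a list of (u, v) pairs meaning "u must finish before v".
    """
    successors = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for u, v in edges:
        successors[u].append(v)
        indegree[v] += 1

    level = {n: 0 for n in nodes}
    ready = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while ready:
        u = ready.popleft()
        visited += 1
        for v in successors[u]:
            # A node is scheduled one level after its latest predecessor.
            level[v] = max(level[v], level[u] + 1)
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle")
    return level


def partitions(level):
    """Group nodes by ASAP level, one list per temporal partition."""
    groups = defaultdict(list)
    for node, lvl in level.items():
        groups[lvl].append(node)
    return [sorted(groups[lvl]) for lvl in sorted(groups)]


if __name__ == "__main__":
    lv = asap_levels(["a", "b", "c", "d"],
                     [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")])
    print(partitions(lv))
```

Each returned group contains only nodes whose predecessors lie in earlier groups, which is the precedence property the slide attributes to ASAP.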

6 Partitioning: Two Slot Model
Execution environment of exactly two slots
- Alternating execution of tasks
- Hiding of the reconfiguration overhead
Exploiting partial run-time reconfiguration
Challenges
- Architecture demands for communication infrastructure
- Partial bitstream generation
- Task mapping
- Partitioning
[Figure: FPGA with Slot A, Controller, and Slot B; frequency over time]

7 Two Slot Model
Partitioning of the input algorithms
[Figure: d) Partitioning, e) Inclusion of buffers]
Scheduling
- Simple dispatching
- Single server for two machines
[Figure: occupation of Slot A and Slot B over time]
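How alternating execution hides the reconfiguration overhead can be sketched with a small timing model. The model below is an assumption made for illustration, not the authors' implementation: partitions run strictly in sequence, partition k is loaded into slot k mod 2, a single reconfiguration port serves both slots, and a slot can be reconfigured as soon as its previous partition has finished. The function names are hypothetical.

```python
def two_slot_makespan(tasks):
    """tasks: list of (reconfig_time, exec_time), executed in list order.

    Returns the completion time when the two slots are used alternately
    and one slot is reconfigured while the other one executes.
    """
    port_free = 0          # time at which the reconfiguration port is idle
    slot_free = [0, 0]     # time at which each slot may be overwritten
    prev_exec_end = 0      # completion time of the preceding partition
    for k, (reconf, execute) in enumerate(tasks):
        slot = k % 2
        cfg_start = max(port_free, slot_free[slot])
        cfg_end = cfg_start + reconf
        port_free = cfg_end
        # Execution needs the bitstream loaded and the predecessor finished.
        start = max(cfg_end, prev_exec_end)
        prev_exec_end = start + execute
        slot_free[slot] = prev_exec_end
    return prev_exec_end


def single_slot_makespan(tasks):
    """Reference case: every reconfiguration stalls the computation."""
    return sum(reconf + execute for reconf, execute in tasks)


if __name__ == "__main__":
    workload = [(2, 5), (2, 5), (2, 5)]
    print(two_slot_makespan(workload), single_slot_makespan(workload))
```

With three partitions of 2 time units reconfiguration and 5 time units execution each, only the first reconfiguration remains visible in the two-slot case (17 time units against 21 for a single slot).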

8 Lessons Learned
Partitioning as reasonable/fundamental step
Challenges
- Placement: fragmentation
- Reconfiguration overhead
- Etc.
Valuable concept
- Two phases: reconfiguration phase and execution phase
Derivable concepts
- Specification graph approach (domain of platform-based design)
- Reconfiguration Port Scheduling
- Algorithmic Skeletons

9 Layered Model for Design Methods
Area and phase
- varies
- varies
Challenging design, scheduling, etc.
Layer model/approach
- Abstraction
- Specification example
[Figure: Tasks, Layer 1, Layer 2, ..., Layer n on the Reconfigurable Fabric over time]
[Figure: layer stack Tasks, Partitioning, Dispatching, Execution Environment, FPGA]

10 Specification Graph Approach
Tasks: problem graph
- Integration of the reconfiguration phase
Processing units: architecture graph
- Also heterogeneous
Mapping
- Links tasks, communication and reconfiguration with architectural resources
- Hierarchical mapping edges
Synthesis
- Scheduling, allocation and binding
- By evolutionary computing
[Figure: Tasks, Problem Graph, Mapping, Architecture Graph (FPGA, CPU, FPGA)]

11 Task graph G; add reconfiguration phases to obtain G*; add communication vertices to obtain G_P (problem graph)
[Figure: problem graph with nodes 1, 2, 3, tasks T1 to T3, and communication vertices C0 to C4, mapped onto an architecture graph (FPGA, CPU, FPGA)]

12 Specification Graph Approach: Mapping of Resources
Problem and architecture graph + mapping edges = specification graph
[Figure: specification graph mapping T1 to T3 and C0 to C4 onto Slot 1, Slot 2, the ports, and the bus; mapping edges annotated with times between 1 ms and 15 ms]

13 Specification Graph Approach: Extensions
- Multiple devices
- Multiple reconfiguration ports
- Platform-based design
- All within the domain of synthesis
[Figure: problem graph G_P mapped onto architecture graphs G_A1 and G_A2 (FPGA1, FPGA2, Slot_A, Slot_B, Slot_C, ports, Gigabit links), with the resulting schedule of T1, T2, T3 on Slots A, B, C over time]

14 Reconfiguration Port Scheduling
Partially run-time reconfigurable FPGAs for real-time processing
- Task set executed on FPGA
Area assignment?
- Prevent fragmentation
- Offer communication
Scheduling?
- Execution time of tasks
- Reconfiguration process
- Overhead: time + single port
- At the pace of the environment
Reconfiguration port scheduling
[Figure: FPGA with Slot A, Slot B, Slot C on a bus; layer stack Task Set, Reconfiguration Port Scheduling, Execution Environment, FPGA]

15 Reconfiguration Port Scheduling
Slotted execution environment
- Set of n partial bitstreams
- Single reconfiguration port
Real-time processing on slotted FPGA architecture
- Guarantee meeting of deadlines
- Constant reconfiguration phase
- Deadline d*
- One reconfiguration port: mono-processor scheduling algorithms
[Figure: FPGA with Slot A, Slot B, Slot C on a bus, timeline marking d* and d; layer stack Task Sets, OS Real-Time Scheduling, Execution Environment, FPGA]
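The idea of treating the single reconfiguration port as a mono-processor can be sketched in Python. The slide does not define d*, so the sketch assumes that d* is the latest instant at which a task's reconfiguration may finish, taken here as the task deadline d minus its execution time. It further assumes that all tasks are released at time 0, that every reconfiguration takes the same constant time, and that a free slot is always available. `Task`, `d_star`, and `port_schedule` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    exec_time: int
    deadline: int  # d: instant by which execution must be complete


def d_star(task):
    """Assumed definition: reconfiguration must be finished by d - exec_time."""
    return task.deadline - task.exec_time


def port_schedule(tasks, t_reconf, key):
    """Serialize the reconfigurations on the single port, ordered by key.

    Returns (order, feasible), where feasible means that every
    reconfiguration finishes no later than the task's d*.
    """
    now = 0
    order = []
    feasible = True
    for task in sorted(tasks, key=key):
        now += t_reconf  # the port loads one bitstream at a time
        order.append(task.name)
        if now > d_star(task):
            feasible = False
    return order, feasible


if __name__ == "__main__":
    tasks = [Task("A", exec_time=1, deadline=10),
             Task("B", exec_time=8, deadline=11)]
    print(port_schedule(tasks, 2, key=lambda t: t.deadline))  # order by d
    print(port_schedule(tasks, 2, key=d_star))                # order by d*
```

In this example, ordering by d loads A first and B misses its d* of 3, while ordering by d* loads B first and both tasks are served in time.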

16 Performance of d*
For aperiodic task sets
- d* outperforms d
- Of 100 feasible task sets, d* finds approx. 90 and d finds approx. 70
Performance depends on
- ratio to t = l
- # of slots
[Figure: timeline marking d and d*]

17 Fixed Priority Example
Periodic task set scheduling
- Static priorities
- Preemption
Characteristics
- Deadlines (D*) shorter than periods
Apply deadline monotonic scheduling (DM)

18 Fixed Priority Scheduling: Schedulability Analysis
Parameters
Response time analysis
- DM with D*
- Schedulable if: $\forall \tau_i : R_i < D^*_i$
- Calculate $R_i$ by $R_i = t_{,i} + \sum_{j=1}^{i-1} \left\lceil \frac{R_i}{T_j} \right\rceil t_{,j}$
- Critical instance: all tasks are released simultaneously
Sufficient and necessary for DM with D
- Challenging abnormalities for DM with D*
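The response time equation is recursive in R_i, and it is commonly solved by fixed-point iteration starting from the task's own time term. The following Python sketch shows that iteration under assumptions: `t` holds the per-task time term from the formula (interpreted here as the time a task occupies the resource), `T` the periods, and `D_star` the deadlines D*, all as integers and already sorted by deadline monotonic priority (shortest D* first). The function names are hypothetical.

```python
def response_time(i, t, T, bound):
    """Iterate R = t[i] + sum(ceil(R / T[j]) * t[j] for j < i).

    Returns the fixed point R_i if it stays strictly below bound,
    otherwise None (the test R_i < D*_i from the slide fails).
    """
    R = t[i]
    while R < bound:
        # -(-a // b) is integer ceiling division for positive integers.
        nxt = t[i] + sum(-(-R // T[j]) * t[j] for j in range(i))
        if nxt == R:
            return R
        R = nxt
    return None


def dm_schedulable(t, T, D_star):
    """True if every task meets R_i < D*_i; index 0 is the highest priority."""
    return all(response_time(i, t, T, D_star[i]) is not None
               for i in range(len(t)))


if __name__ == "__main__":
    t = [1, 2, 3]
    T = [5, 10, 20]
    D_star = [4, 8, 15]
    print([response_time(i, t, T, D_star[i]) for i in range(3)])
    print(dm_schedulable(t, T, D_star))
```

The iteration starts at the critical instance, where all tasks are released simultaneously, and stops either at the fixed point or as soon as the bound is reached.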

19 Algorithmic Skeletons: Motivation
Tasks
- Programmability and portability
- Structure and behavior of the tasks
- Application level
Use of algorithmic skeletons
- Wrapping of tasks
- Programming templates
Runtime environment
- Partial reconfiguration capabilities
- Dispatcher
[Figure: layer stack Applications, Tasks, Algorithmic Skeletons, Skeleton Dispatching, Runtime Environment, FPGA]

20 Algorithmic Skeletons: Background
Invented for the parallel processing domain
- First discussed by Murray Cole in the mid-1980s
Objectives
- Separate the structure of a computation from the computation itself
- Free the programmer from the implementation details of the structure
- Implementation guideline for activities and their interactions
Related: design patterns
- Differences: design patterns act on the design level, and the final implementation is left to the freedom of the designer
Algorithmic skeletons force the applications to be well-formed
- Enable the extraction of valuable information
- Design space exploration on a high level of abstraction
- Static and dynamic optimization of implementations

21 Dynamic Reconfiguration
Multi-threading on an FPGA
- Hosting of more than one task
- Share processing resource
Challenges
- Dispatch newly arriving tasks during run-time
- Tasks are not known at design time
- Architecture to facilitate dispatching must exist
- Area assignment
- Prevent fragmentation
- Communication assignment
Algorithmic Skeletons
[Figure: layer stack Applications, Tasks, Algorithmic Skeletons, Skeleton Dispatching, Runtime Environment, FPGA]

22 Dynamic Reconfiguration: Example
Combination of Pipeline and Farm Skeleton
- Slotted architecture
- Bus and direct communication
[Figure: Pipeline: in, W1, W2, ..., Wp, out. Farm: in, E, W1, W2, ..., Wp, C, out. FPGA slots holding W1, W3, W2, W1, CE]
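The separation of structure from computation that skeletons provide can be illustrated in software. The sketch below is a plain Python analogy and not the FPGA implementation: `stage`, `pipeline`, and `farm` are hypothetical names, the workers run sequentially rather than in parallel slots, and E and C from the figure are read as an emitter that distributes items and a collector that gathers results, which is the common convention for farm skeletons.

```python
def stage(f):
    """Wrap a plain function as a stream-to-stream pipeline stage."""
    return lambda stream: (f(x) for x in stream)


def pipeline(*skeletons):
    """Pipeline skeleton: feed the output stream of each element to the next."""
    def run(stream):
        for skeleton in skeletons:
            stream = skeleton(stream)
        return stream
    return run


def farm(worker, p):
    """Farm skeleton with p replicas of the same worker."""
    def run(stream):
        items = list(stream)
        # Emitter: distribute the items round-robin over the p workers.
        buckets = [items[k::p] for k in range(p)]
        # Workers: modeled sequentially here; on hardware they run in parallel.
        results = [[worker(x) for x in bucket] for bucket in buckets]
        # Collector: put the results back into the input order.
        out = [None] * len(items)
        for k, partial in enumerate(results):
            out[k::p] = partial
        return iter(out)
    return run


if __name__ == "__main__":
    app = pipeline(stage(lambda x: x + 1),
                   farm(lambda x: x * x, 3),
                   stage(lambda x: -x))
    print(list(app(range(5))))
```

The application only supplies the worker functions; how items travel between workers is fixed by the skeleton, which is the information a dispatcher could exploit when assigning workers to slots.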

23 Cooperation
PadErOl (Erlangen [Prof. Teich] and Oldenburg [Prof. Nebel])
- Integrated design flow, see it journal
Braunschweig (Prof. Fekete)
- Reconfiguration phase scheduling / single server scheduling
- Comparison of methods
Erlangen (Prof. Teich)
- Bus-based architecture for reconfiguration port scheduling
Paderborn (Prof. Platzner)
- Local cooperation
- Information exchange
[Figure: Erlangen Slot Machine]

24 Publications 2007
- Dittmann, Florian; Frank, Stefan: Hard Real-Time Reconfiguration Port Scheduling. In: Proceedings of the Design, Automation and Test in Europe, Nice, France, 2007
- Dittmann, Florian; Götz, Marcelo; Rettberg, Achim: Model and Methodology for the Synthesis of Heterogeneous and Partially Reconfigurable Systems. In: Proceedings of the Reconfigurable Architecture Workshop, Long Beach, CA, USA, 2007
- Dittmann, Florian: Algorithmic Skeletons for the Programming of Reconfigurable Systems. In: Proceedings of the SEUS 2007, Santorini, Greece, May 2007
- Dittmann, Florian; Rammig, Franz Josef; Streubühr, Martin; Haubelt, C.; Schallenberg, Andreas; Nebel, Wolfgang: Exploration, Partitioning and Simulation of Reconfigurable Systems. it - Information Technology (formerly it+ti), 3(7), 1 Jan
- Dittmann, Florian; Rettberg, Achim; Weber, Raphael: Optimization Techniques for a Reconfigurable Self-Timed and Bit-Serial Architecture. In: Proceedings of the SBCCI 2007, Rio de Janeiro, Brazil, Sep
- Dittmann, Florian; Frank, S.: Caching in Real-time Reconfiguration Port Scheduling. In: Proceedings of the FPL 2007, Amsterdam, The Netherlands, Aug
- Dittmann, Florian; Rettberg, Achim; Weber, Raphael: Latency Optimization for a Reconfigurable, Self-Timed and Bit-Serial Architecture. In: Proceedings of the ERSA 2007, Las Vegas, USA, Jun
- Dittmann, Florian; Heimfarth, Tales: Clock Frequency Variation of Partially Reconfigurable Systems. In: Proceedings of the 19th International Conference on Architecture of Computing Systems: Workshop Proceedings, Frankfurt, Germany, 1 Jan
- Warkentin, Alexander; Dittmann, Florian: Data Transfer Protocols for a Two Slot Based Reconfigurable Platform. In: Proceedings of the Reconfigurable Communication-centric SoCs (ReCoSoC), Montpellier, France, 2006
- Götz, Marcelo; Dittmann, Florian: Reconfigurable Microkernel-based OS: Mechanisms and Methods for Run-Time Reconfiguration. In: Proceedings of the 3rd International Conference on ReConFigurable Computing and FPGAs 2006 (ReConFig'06)
- Dittmann, Florian; Götz, M.: Applying Single Processor Algorithms to Schedule Tasks on Reconfigurable Devices Respecting Reconfiguration Times. In: Proceedings of the 13th Reconfigurable Architectures Workshop (RAW 2006), Rhodes Island, Greece, 2006
- Dittmann, Florian; Rettberg, Achim: Design of Partially Reconfigurable Systems: From Abstract Modeling to Practical Realization. In: Proceedings of the 1st International Workshop on Reconfigurable Computing Education, Karlsruhe, Germany, 1 Jan
- Götz, Marcelo; Dittmann, Florian; Pereira, Carlos E.: Deterministic Mechanism for Run-Time Reconfiguration Activities in an OS. In: Proceedings of the 4th International IEEE Conference on Industrial Informatics (INDIN 2006), Singapore, 2006
- Götz, Marcelo; Dittmann, Florian: Scheduling Reconfiguration Activities of Run-time Reconfigurable OS Using an Aperiodic Task Server. In: Proceedings of the ARC 2006, Delft, The Netherlands, Mar
- Dittmann, Florian; Rettberg, Achim; Weber, Raphael: Towards the Implementation of Path Concepts for a Reconfigurable Bit-Serial Synchronous Architecture. In: Proceedings of the 3rd International Conference on ReConFigurable Computing and FPGA's, San Luis Potosi, Mexico, 2006
- Dittmann, Florian; Götz, Marcelo: Reconfiguration Time Aware Processing on FPGAs. In: Proceedings of the Dagstuhl Seminar Nº on Dynamically Reconfigurable Architectures, Dagstuhl, Germany

25 Part-E
- Eclipse-based development environment for partial bitstream generation
- Open source: parte.sf.net
- Tutorial available
- Coffee break
- Bitstreams in 2 min

26 Thank you for your attention.
Florian Dittmann, Franz Rammig
Heinz Nixdorf Institute, University of Paderborn
Fuerstenallee Paderborn Germany
Phone: +49 (0) 52 51/ Fax: +49 (0) 52 51/
Thanks to E. Weber, S. Frank, A. Warkentin

COMMUNICATION-AWARE COMPONENT ALLOCATION ALGORITHM FOR A HYBRID ARCHITECTURE

Marcelo Götz (1), Achim Rettberg (2) and Carlos Eduardo Pereira (3)
(1) Heinz Nixdorf Institute, University of Paderborn, Germany, mgoetz@uni-paderborn.de