High Level Synthesis. Shankar Balachandran Assistant Professor, Dept. of CSE IIT Madras
2 References and Copyright. Textbooks referred (none required): [Mic94] G. De Micheli, Synthesis and Optimization of Digital Circuits, McGraw-Hill, 1994. Slides used: Giovanni De Micheli's slides on synthesis; Kia Bazargan's material on high-level synthesis; [Gupta] Rajesh Gupta, UC Irvine.
3 High Level Synthesis (HLS): the process of converting a high-level description of a design to a netlist. Input: high-level languages (e.g., C); behavioral hardware description languages (e.g., VHDL); structural HDLs (e.g., VHDL); state diagrams / logic networks. Tools: parser; library of modules. Constraints: area constraints (e.g., number of modules of a certain type); delay constraints (e.g., a set of operations should finish in λ clock cycles). Output: operation scheduling (time) and binding (resource); control generation and detailed interconnections.
4 Architectural-level synthesis motivation. Raise input abstraction level: reduce specification of details; extend designer base; self-documenting design specifications; ease modifications and extensions. Reduce design time. Explore and optimize macroscopic structure: series/parallel execution of operations.
5 Synthesis: transform behavioral into structural view. Architectural-level synthesis: architectural abstraction level; determine macroscopic structure; example: major building blocks. Logic-level synthesis: logic abstraction level; determine microscopic structure; example: logic gate interconnection.
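6 High-Level Synthesis Compilation Flow. [Figure: flow diagram with the stages Lex and Parse (compilation front end), intermediate form, behavioral optimization, architectural synthesis, binding, library, and logic synthesis (HLS back end), illustrated on an example expression over a, b, c, d.]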
7 Example
diffeq {
  read (x, y, u, dx, a);
  repeat {
    xl = x + dx;
    ul = u - (3 · x · u · dx) - (3 · y · dx);
    yl = y + u · dx;
    c = xl < a;
    x = xl; u = ul; y = yl;
  } until (c);
  write (y);
}
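To make the behavior of the diffeq loop concrete, here is a minimal Python transcription. It assumes the operators of the standard diffeq benchmark in [Mic94], since the transcript drops them; the function name `diffeq` and the sample inputs are chosen for illustration only.

```python
def diffeq(x, y, u, dx, a):
    """Behavioral model of the diffeq loop (repeat ... until c)."""
    while True:
        xl = x + dx
        ul = u - (3 * x * u * dx) - (3 * y * dx)
        yl = y + u * dx
        c = xl < a              # exit condition, evaluated on the new x
        x, u, y = xl, ul, yl    # commit the new state
        if c:                   # "until (c)": leave the loop once c holds
            return y

# Illustrative inputs (not from the slides).
result = diffeq(2.0, 1.0, 1.0, -0.5, 1.2)
print(result)
```

Each iteration performs six multiplications, two additions, two subtractions, and one comparison. These eleven operations are the vertices v1 to v11 that the scheduling slides work with.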
8 Compilation and behavioral optimization. Software compilation: compile program into intermediate form; optimize intermediate form; generate target code for an architecture. Hardware compilation: compile programs/HDL into a sequencing graph; optimize the sequencing graph; generate gate-level interconnection for a cell library.
9 Behavioral-level optimization. Semantic-preserving transformations aiming at simplifying the model. Applied to parse trees or during their generation. Taxonomy: data-flow-based transformations; control-flow-based transformations.
10 Architectural synthesis and optimization. Synthesize macroscopic structure in terms of building blocks. Explore the area/performance trade-off: maximize performance of implementations subject to area constraints; minimize area of implementations subject to performance constraints. Determine an optimal implementation. Create a logic model for datapath and control.
11 Design space and objectives. Design space: set of all feasible implementations. Implementation parameters: area; performance (cycle time, latency, throughput for pipelined implementations); power consumption.
12 Design evaluation space. [Figure: plots of area against latency and cycle time, with the maximum area, maximum latency, and maximum cycle-time bounds marked.]
13 Dependency Graph (Sequencing Graph). [Figure: dependency graph of the diffeq loop body from slide 7, with inputs x, y, u, dx, a and results xl, ul, yl, c.]
14 Hardware modeling. Circuit behavior: sequencing graphs. Building blocks: resources. Constraints: timing and resource usage.
15 Resources. Functional resources: perform operations on data; example: arithmetic and logic blocks. Storage resources: store data; example: memory and registers. Interface resources: example: buses and ports.
16 Resources and circuit families. Resource-dominated circuits: area and performance depend on a few well-characterized blocks; most common in DSP circuits. Non-resource-dominated circuits: area and performance are strongly influenced by sparse logic, control and wiring; example: some ASIC circuits.
17 Implementation constraints. Timing constraints: cycle time; latency of a set of operations; time spacing between operation pairs. Resource constraints: resource usage (or allocation); partial binding.
18 Sequencing Graph. Remove all nodes corresponding to constants (with respect to the loop), and the edges from those constants as well. Remove nodes that are being written in the loop, and the corresponding edges. Add NOP nodes at both ends to represent the reads and writes of the loop. Such a graph is much simpler to operate on. [Figure: sequencing graph of the diffeq loop with source and sink NOP nodes.]
19 Synthesis in the temporal domain. Scheduling: associate a start time with each operation; determine latency and parallelism of the implementation; the schedule is called φ. Scheduled sequencing graph: sequencing graph with start-time annotation.
20 Synthesis in Temporal Domain. Schedule: mapping of operations to time slots (cycles). A scheduled sequencing graph is a labeled graph. [Figure: two alternative schedules, Schedule 1 and Schedule 2, of the same sequencing graph.]
21 Operation Types. For each operation, define its type. For each resource, define a resource type and a delay (in number of cycles). T is a relation that maps an operation to a resource type that can implement it: $T : V \to \{1, 2, \ldots, n_{res}\}$. More general case: a resource type may implement more than one operation type (e.g., ALU). Resource binding: map each operation to a resource of the same type; there might be multiple options.
22 Binding: synthesis in the spatial domain. Associate a resource with each operation of the same type. Determine the area of the implementation. Sharing: bind a resource to more than one operation; operations that share a resource must not execute concurrently. Bound sequencing graph: sequencing graph with resource annotation.
23 Schedule in Spatial Domain: resource sharing. More than one operation is bound to the same resource; such operations have to be serialized. Sharing can be represented using hyperedges (which define a vertex partition). [Figure: bound sequencing graph.]
24 Binding Should Change with Schedules. [Figure: two schedules of the same graph with their bindings; the resource counts shown are "4 Multipliers, 2 Adders" and "2 Multipliers, 2 Adders".]
25 Scheduling and Binding. Resource constraints: number of resource instances of each type, $\{a_k : k = 1, 2, \ldots, n_{res}\}$. Scheduling: labeled vertices (each vertex $v_i$ is labeled with its start time $\varphi(v_i)$). Binding: hyperedges (or vertex partitions), e.g. $\beta(v_2) = \text{adder}$. Cost: in resource-dominated designs the number of resources determines the area; in control-dominated designs registers, steering logic (muxes, buses), wiring, and the control unit also matter. Delay: start time of the sink node; might be affected by steering logic and schedule (control logic).
26 Architectural Optimization. Optimization in view of design space flexibility. A multi-criteria optimization problem: determine schedule φ and binding β under area A, latency λ and cycle time τ objectives. Find non-dominated points in the solution space. Solution space trade-off curves: nonlinear, discontinuous; area / latency / cycle time (more?).
27 Area/latency trade-off. [Figure: area plotted against latency and cycle time; points are labeled with resource-count pairs such as (2,2); the remaining labels are not recoverable from the transcript.]
28 Scheduling and Binding. Cost: λ and A are determined by both φ and β; they are also affected by floorplan and detailed routing. β is affected by φ: resources cannot be shared among concurrent operations. φ is affected by β: operations bound to the same resource cannot be scheduled concurrently. When register and steering-logic delays are added to execution delays, the cycle time might be violated. Order? Apply either one (scheduling, binding) first.
29 How Is the Datapath Implemented? In the following schedule and binding, every operation has two inputs. If an input is not shown explicitly, it comes from a unique register. [Figure: scheduled and bound sequencing graph.]
30 Operation Scheduling. Input: sequencing graph G(V, E) with n vertices; cycle time τ; operation delays $D = \{d_i : i = 0..n\}$. Output: schedule φ determines the start time $t_i$ of operation $v_i$; latency $\lambda = t_n - t_0$. Goal: determine the area/latency trade-off. Classes: non-hierarchical and unconstrained; latency-constrained; resource-constrained; hierarchical.
31 Min Latency Unconstrained Scheduling. Simplest case: no constraints, find the minimum latency. Given a set of vertices V, delays D and a partial order > on operations E, find an integer labeling of operations $\varphi : V \to Z$ such that: $t_i = \varphi(v_i)$; $t_i \ge t_j + d_j$ for all $(v_j, v_i) \in E$; and $\lambda = t_n - t_0$ is minimum. Solvable in polynomial time. Gives bounds on latency for resource-constrained problems. ASAP algorithm used: topological order.
32 ASAP Schedules. Schedule $v_0$ at $t_0 = 0$. While ($v_n$ not scheduled): select $v_i$ with all predecessors scheduled; schedule $v_i$ at $t_i = \max \{t_j + d_j\}$, $v_j$ being a predecessor of $v_i$. Return $t_n$. [Figure: ASAP schedule of the example graph.]
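The ASAP procedure above can be sketched in Python as a topological traversal. This is an illustrative sketch, not the slides' implementation: the edge list is assumed to be the diffeq sequencing graph from [Mic94] (the figure is missing from the transcript), NOP vertices are assumed to have zero delay, and `t0=1` is passed only so that steps are numbered from 1 as in the figures.

```python
from collections import deque

def asap(succ, delay, source, t0=0):
    """Earliest start time of every vertex; the source starts at t0."""
    unscheduled_preds = {v: 0 for v in succ}
    for v in succ:
        for w in succ[v]:
            unscheduled_preds[w] += 1
    start = {v: t0 for v in succ}
    ready = deque([source])
    while ready:
        v = ready.popleft()
        for w in succ[v]:
            start[w] = max(start[w], start[v] + delay[v])  # t_i = max(t_j + d_j)
            unscheduled_preds[w] -= 1
            if unscheduled_preds[w] == 0:   # all predecessors are scheduled
                ready.append(w)
    return start

# Assumed edge list of the diffeq sequencing graph (v0 = source, vn = sink).
succ = {
    "v0": ["v1", "v2", "v6", "v8", "v10"],
    "v1": ["v3"], "v2": ["v3"], "v3": ["v4"], "v4": ["v5"],
    "v6": ["v7"], "v7": ["v5"], "v8": ["v9"], "v10": ["v11"],
    "v5": ["vn"], "v9": ["vn"], "v11": ["vn"], "vn": [],
}
delay = {v: 1 for v in succ}
delay["v0"] = delay["vn"] = 0          # NOPs take no time (assumption)

t = asap(succ, delay, "v0", t0=1)
print(t["vn"] - t["v0"])               # latency of the ASAP schedule
```

With unit delays the sketch places v1, v2, v6, v8, v10 in step 1 and reports a latency of 4, the value the ILP example later assumes.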
33 ALAP Schedules. Schedule $v_n$ at $t_n = \lambda$. While ($v_0$ not scheduled): select $v_i$ with all successors scheduled; schedule $v_i$ at $t_i = \min \{t_j\} - d_i$, $v_j$ being a successor of $v_i$. [Figure: ALAP schedule of the example graph.]
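A matching sketch of ALAP, written recursively rather than with the slide's while loop. The edge list is again assumed to be the diffeq graph from [Mic94], NOPs are assumed to have zero delay, and the sink time 5 corresponds to a latency bound of 4 with steps numbered from 1; all of these are illustration choices.

```python
def alap(succ, delay, t_sink):
    """Latest start time of every vertex, given the start time of the sink."""
    memo = {}

    def latest(v):
        if v not in memo:
            if not succ[v]:                 # the sink has no successors
                memo[v] = t_sink
            else:                           # t_i = min(t_j) - d_i over successors
                memo[v] = min(latest(w) for w in succ[v]) - delay[v]
        return memo[v]

    return {v: latest(v) for v in succ}

# Assumed edge list of the diffeq sequencing graph (v0 = source, vn = sink).
succ = {
    "v0": ["v1", "v2", "v6", "v8", "v10"],
    "v1": ["v3"], "v2": ["v3"], "v3": ["v4"], "v4": ["v5"],
    "v6": ["v7"], "v7": ["v5"], "v8": ["v9"], "v10": ["v11"],
    "v5": ["vn"], "v9": ["vn"], "v11": ["vn"], "vn": [],
}
delay = {v: 1 for v in succ}
delay["v0"] = delay["vn"] = 0               # NOPs take no time (assumption)

t_late = alap(succ, delay, t_sink=5)
print(t_late)
```

If the computed time of the source falls below its intended start time, the latency bound is too tight for the graph.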
34 Remarks. ALAP solves a latency-constrained problem. The latency bound can be set to the latency computed by the ASAP algorithm. Mobility: defined for each operation; it is the difference between the ALAP and ASAP start times, i.e. the slack on the start time.
35 Example. Operations with zero mobility: {v1, v2, v3, v4, v5} (the critical path). Operations with mobility one: {v6, v7}. Operations with mobility two: {v8, v9, v10, v11}. [Figure: ASAP and ALAP schedules of the example graph over time steps 1 to 4.]
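The mobility sets listed above can be reproduced with a few lines of Python. The start times below are the ASAP and ALAP values for the diffeq graph under unit delays and a latency bound of 4; they are hard-coded here as an assumption so that the sketch stays self-contained.

```python
# ASAP and ALAP start times (unit delays, latency bound 4), assumed as input.
asap_t = {"v1": 1, "v2": 1, "v3": 2, "v4": 3, "v5": 4, "v6": 1,
          "v7": 2, "v8": 1, "v9": 2, "v10": 1, "v11": 2}
alap_t = {"v1": 1, "v2": 1, "v3": 2, "v4": 3, "v5": 4, "v6": 2,
          "v7": 3, "v8": 3, "v9": 4, "v10": 3, "v11": 4}

# Mobility = ALAP start time minus ASAP start time.
mobility = {v: alap_t[v] - asap_t[v] for v in asap_t}

# Group the operations by mobility value.
by_mobility = {}
for v, mu in mobility.items():
    by_mobility.setdefault(mu, []).append(v)

for mu in sorted(by_mobility):
    print(mu, by_mobility[mu])
```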
36 Scheduling under Relative Timing Constraints. Motivation: deadlines and release times are absolute; it also makes sense to have relative constraints, e.g. a memory fetch must be done within 6 cycles and takes a minimum of 2 cycles. Constraints: upper/lower bounds on the start-time difference of any operation pair; a minimum timing constraint $l_{ij} \ge 0$ for specified (i, j) pairs; a maximum timing constraint $u_{ij} \ge 0$ for specified (i, j) pairs. Feasibility of a solution.
37 Constraint graph model. Start from the sequencing graph. Model delays as weights on edges. Add forward edges for minimum constraints: edge $(v_i, v_j)$ with weight $l_{ij}$, so that $t_j \ge t_i + l_{ij}$. Add backward edges for maximum constraints: for a constraint from $v_i$ to $v_j$, add the backward edge $(v_j, v_i)$ with weight $-u_{ij}$, because $t_j \le t_i + u_{ij} \Rightarrow t_i \ge t_j - u_{ij}$.
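38 Example. [Figure: sequencing graph with a maximum and a minimum timing constraint, and the corresponding constraint graph.] 1. Ensure that there are no positive cycles in the graph. 2. Find the longest paths in the graph between two nodes i and j and use them as delay separations. [Table: start time per vertex v0, v1, v2, v3, v4, vn; only the entries 5 for v4 and 6 for vn survive in the transcript.]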
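Both steps, the positive-cycle check and the longest-path computation, can be sketched with a Bellman-Ford style relaxation. The small graph below is hypothetical and is not the one from the slide; each edge `(u, v, w)` encodes the inequality t_v >= t_u + w, so a maximum constraint u_ij becomes a backward edge with weight -u_ij.

```python
def longest_path_schedule(n_vertices, edges, source=0):
    """Return start times satisfying t_v >= t_u + w for every edge,
    or None when the constraint graph contains a positive cycle."""
    t = [float("-inf")] * n_vertices
    t[source] = 0
    for _ in range(n_vertices - 1):
        changed = False
        for u, v, w in edges:
            if t[u] + w > t[v]:        # relax towards the longest path
                t[v] = t[u] + w
                changed = True
        if not changed:
            break
    for u, v, w in edges:              # still relaxable: positive cycle
        if t[u] + w > t[v]:
            return None
    return t

# Hypothetical chain 0 -> 1 -> 2 -> 3 with delays 0, 2, 1 as edge weights.
sequencing = [(0, 1, 0), (1, 2, 2), (2, 3, 1)]

# Maximum constraint t_2 - t_1 <= 3 gives the backward edge (2, 1, -3).
feasible = longest_path_schedule(4, sequencing + [(2, 1, -3)])

# Maximum constraint t_2 - t_1 <= 1 conflicts with the delay of 2.
infeasible = longest_path_schedule(4, sequencing + [(2, 1, -1)])

print(feasible, infeasible)
```

In the second case the cycle 1 -> 2 -> 1 has weight 2 - 1 = +1, so no schedule can satisfy both the delay and the maximum constraint.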
39 Resource-Constrained Scheduling. Constrained scheduling: the general case is NP-complete. Minimize latency given constraints on area or resources (MLRCS). Minimize resources subject to a bound on latency (MRLCS). Exact solution methods: ILP (integer linear programming); Hu's heuristic algorithm for identical processors. Heuristics: list scheduling; force-directed scheduling.
40 ILP Formulation of MLRCS. Use binary decision variables $x_{il}$, with $i = 0, 1, \ldots, n$ and $l = 1, 2, \ldots, \lambda + 1$, where λ is a given upper bound on latency. $x_{il} = 1$ if operation i starts at step l, 0 otherwise. The formulation is a set of linear inequalities (constraints) and an objective function (min latency). Observations: $x_{il} = 0$ for $l < t_i^S$ and for $l > t_i^L$, where $t_i^S = \mathrm{ASAP}(v_i)$ and $t_i^L = \mathrm{ALAP}(v_i)$; $t_i = \sum_l l \cdot x_{il}$ is the start time of operation i; operation $v_i$ is (still) executing at step l exactly when $\sum_{m = l - d_i + 1}^{l} x_{im} = 1$. [Mic94] p.98
41 Constraints. Operations start only once: $\sum_l x_{il} = 1$, $i = 1, 2, \ldots, n$. Sequencing relations must be satisfied: $t_i \ge t_j + d_j \Rightarrow t_i - t_j - d_j \ge 0$ for all $(v_j, v_i) \in E$. Resource bounds must be satisfied; in the simple case (unit delay): $\sum_{i : T(v_i) = k} x_{il} \le a_k$, $k = 1, 2, \ldots, n_{res}$, for all l. Equation 2 can be rewritten as $\sum_l l \cdot x_{il} - \sum_l l \cdot x_{jl} - d_j \ge 0$ for all $(v_j, v_i) \in E$.
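The three constraint classes can be checked directly on a candidate schedule, which is a useful sanity test before handing a model to an ILP solver. The checker below is a sketch with hypothetical names; the dependency list is assumed to be the diffeq graph from [Mic94], and two multipliers plus two ALUs are assumed as bounds. Because `start` maps every operation to exactly one step, the unique-start constraint holds by construction.

```python
def violations(start, pred, delay, rtype, bound):
    """List the sequencing and resource constraints that a schedule violates."""
    problems = []
    # Sequencing: t_i >= t_j + d_j for every edge (v_j, v_i).
    for i, ps in pred.items():
        for j in ps:
            if start[i] < start[j] + delay[j]:
                problems.append(("sequencing", j, i))
    # Resources: ops of type k running at step l must not exceed a_k.
    last = max(start[v] + delay[v] for v in start)
    for l in range(min(start.values()), last):
        for k, a_k in bound.items():
            running = [v for v in start
                       if rtype[v] == k and start[v] <= l < start[v] + delay[v]]
            if len(running) > a_k:
                problems.append(("resource", k, l))
    return problems

# Assumed dependencies of the diffeq graph (NOPs omitted), unit delays.
pred = {"v1": [], "v2": [], "v3": ["v1", "v2"], "v4": ["v3"], "v5": ["v4", "v7"],
        "v6": [], "v7": ["v6"], "v8": [], "v9": ["v8"], "v10": [], "v11": ["v10"]}
delay = {v: 1 for v in pred}
rtype = {v: "mult" for v in ("v1", "v2", "v3", "v6", "v7", "v8")}
rtype.update({v: "alu" for v in ("v4", "v5", "v9", "v10", "v11")})
bound = {"mult": 2, "alu": 2}

good = {"v1": 1, "v2": 1, "v10": 1, "v3": 2, "v6": 2, "v11": 2,
        "v7": 3, "v8": 3, "v4": 3, "v5": 4, "v9": 4}
asap_t = {"v1": 1, "v2": 1, "v6": 1, "v8": 1, "v10": 1, "v3": 2,
          "v7": 2, "v9": 2, "v11": 2, "v4": 3, "v5": 4}

print(violations(good, pred, delay, rtype, bound))
print(violations(asap_t, pred, delay, rtype, bound))
```

The ASAP schedule fails only the resource bound: it starts four multiplications in step 1 while two multipliers are available.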
42 Start Time vs. Execution Time. For each operation $v_i$ there is only one start time. If $d_i = 1$, then the following questions are the same: does operation $v_i$ start at step l? Is operation $v_i$ running at step l? But if $d_i > 1$, the two questions should be formulated as: does operation $v_i$ start at step l (does $x_{il} = 1$ hold)? Is operation $v_i$ running at step l (does $\sum_{m = l - d_i + 1}^{l} x_{im} = 1$ hold)?
43 Operation v_i Still Running at Step l? Is $v_9$ running at step 6? Is $x_{9,6} + x_{9,5} + x_{9,4} = 1$? [Figure: the three cases $x_{9,6} = 1$, $x_{9,5} = 1$, and $x_{9,4} = 1$.] Note: only one (if any) of the above three cases can happen. To meet resource constraints, we have to ask the same question for ALL steps and ALL operations of that type.
44 Operation v_i Still Running at Step l? Is $v_i$ running at step l? Is $x_{i,l} + x_{i,l-1} + \cdots + x_{i,l-d_i+1} = 1$? [Figure: the cases $x_{i,l} = 1$, $x_{i,l-1} = 1$, ..., $x_{i,l-d_i+1} = 1$.]
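The "still running" test is a windowed sum over the start indicators of one operation. A minimal sketch, assuming the indicators are stored in a dictionary from step to 0/1 (a representation chosen here for illustration), with a hypothetical operation v9 of delay 3 that starts at step 4:

```python
def is_running(x_i, l, d_i):
    """True if the operation with start indicators x_i and delay d_i
    is executing at step l: sum of x_i[m] for m = l-d_i+1 .. l equals 1."""
    return sum(x_i.get(m, 0) for m in range(l - d_i + 1, l + 1)) == 1

x_9 = {4: 1}    # hypothetical: v9 starts at step 4
busy_steps = [l for l in range(1, 9) if is_running(x_9, l, 3)]
print(busy_steps)   # [4, 5, 6]
```

Step 6 is covered because the window {4, 5, 6} contains the start step, which is the case $x_{9,4} = 1$ from the slide.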
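45 ILP Formulation of MLRCS (cont.). Constraints: unique start times: $\sum_l x_{il} = 1$, $i = 0, 1, \ldots, n$. Sequencing (dependency) relations must be satisfied: $t_i \ge t_j + d_j$ for all $(v_j, v_i) \in E$, that is, $\sum_l l \cdot x_{il} \ge \sum_l l \cdot x_{jl} + d_j$. Resource constraints: $\sum_{i : T(v_i) = k} \sum_{m = l - d_i + 1}^{l} x_{im} \le a_k$, $k = 1, \ldots, n_{res}$, $l = 1, \ldots, \lambda + 1$. Objective: $\min c^T t$, where t is the vector of start times and c is the vector of cost weights. When c selects only the sink (zero everywhere except a 1 in the entry for $v_n$), $c^T t = t_n = \sum_l l \cdot x_{nl}$.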
46 ILP Example. Assume λ = 4. First, perform ASAP and ALAP (we can write the ILP without ASAP and ALAP, but using them simplifies the inequalities). [Figure: ASAP and ALAP schedules of the example graph.]
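47 ILP Example: Unique Start Times Constraint. Without using ASAP and ALAP values: $x_{1,1} + x_{1,2} + x_{1,3} + x_{1,4} = 1$; $x_{2,1} + x_{2,2} + x_{2,3} + x_{2,4} = 1$; and so on for every operation. Using ASAP and ALAP: $x_{1,1} = 1$; $x_{2,1} = 1$; $x_{3,2} = 1$; $x_{4,3} = 1$; $x_{5,4} = 1$; $x_{6,1} + x_{6,2} = 1$; $x_{7,2} + x_{7,3} = 1$; $x_{8,1} + x_{8,2} + x_{8,3} = 1$; $x_{9,2} + x_{9,3} + x_{9,4} = 1$; ...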
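48 ILP Example: Dependency Constraints. Using ASAP and ALAP, the non-trivial inequalities are (assuming unit delay for + and *): $2 x_{7,2} + 3 x_{7,3} - x_{6,1} - 2 x_{6,2} - 1 \ge 0$; $2 x_{9,2} + 3 x_{9,3} + 4 x_{9,4} - x_{8,1} - 2 x_{8,2} - 3 x_{8,3} - 1 \ge 0$; $2 x_{11,2} + 3 x_{11,3} + 4 x_{11,4} - x_{10,1} - 2 x_{10,2} - 3 x_{10,3} - 1 \ge 0$; $4 x_{5,4} - 2 x_{7,2} - 3 x_{7,3} - 1 \ge 0$; $5 x_{n,5} - 2 x_{9,2} - 3 x_{9,3} - 4 x_{9,4} - 1 \ge 0$; $5 x_{n,5} - 2 x_{11,2} - 3 x_{11,3} - 4 x_{11,4} - 1 \ge 0$.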
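49 ILP Example: Resource Constraints. Resource constraints (assuming 2 adders and 2 multipliers): $x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} \le 2$; $x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} \le 2$; $x_{7,3} + x_{8,3} \le 2$; $x_{10,1} \le 2$; $x_{9,2} + x_{10,2} + x_{11,2} \le 2$; $x_{4,3} + x_{9,3} + x_{10,3} + x_{11,3} \le 2$; $x_{5,4} + x_{9,4} + x_{11,4} \le 2$. Objective: since λ = 4 and the sink has no mobility, any feasible solution is optimum, but we can use the following anyway: $\min\ x_{n,1} + 2 x_{n,2} + 3 x_{n,3} + 4 x_{n,4}$.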
50 Example. [Figure: schedule obtained from the ILP over time steps 1 to 4.] Note that the schedule is different from both the ALAP and ASAP schedules.
51 ILP Formulation of MRLCS. Dual problem to MLRCS. Objective: the goal is to optimize total resource usage, a; the objective function is $c^T a$, where the entries in c are the respective area costs of the resources. Constraints: same as the MLRCS constraints, plus a latency constraint: $\sum_l l \cdot x_{nl} \le \lambda + 1$. Note: the unknowns $a_k$ appear in the constraints; resource usage is unknown in the constraints and is the objective to minimize.
52 ILP Solution. Use standard ILP packages. Transform into an LP problem. Advantages: exact method; other constraints can be incorporated. Disadvantages: works well only up to a few thousand variables.
53 Hu's Algorithm. Simple case of the scheduling problem: operations of unit delay; operations (and resources) of the same type. Hu's algorithm: greedy; polynomial AND optimal; computes a lower bound on the number of resources for a given latency, OR computes a lower bound on latency subject to resource constraints. Basic idea: label operations based on their distances from the sink; try to schedule nodes with higher labels first (i.e., the most critical operations have priority).
54 Hu's algorithm. Assumptions: the graph is a forest; all operations have unit delay; all operations have the same type. Algorithm: greedy strategy; exact solution.
55 Example. Assumptions: one resource type only; all operations have unit delay. Labels: distance to sink. [Figure: example graph with vertex labels.]
56 Hu's Algorithm
HU (G(V,E), a) {
  Label the vertices    // label = length of the longest path from the vertex to the sink
  l = 1
  repeat {
    U = unscheduled vertices in V whose predecessors have been scheduled (or that have no predecessors)
    Select S ⊆ U such that |S| ≤ a and the labels in S are maximal
    Schedule the S operations at step l by setting t_i = l, ∀ i : v_i ∈ S
    l = l + 1
  } until v_n is scheduled
}
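A compact Python sketch of HU under the stated assumptions (unit delays, one resource type, and an in-tree, so every operation has exactly one successor). The seven-operation tree and the bound a = 2 are hypothetical and are not the example from the slides.

```python
def hu_schedule(succ, sink, a):
    """succ maps each operation to its single successor (in-tree assumed)."""
    label = {sink: 0}

    def dist(v):                       # label = number of edges to the sink
        if v not in label:
            label[v] = 1 + dist(succ[v])
        return label[v]

    ops = list(succ)
    for v in ops:
        dist(v)

    start, step = {}, 1
    while len(start) < len(ops):
        # U: unscheduled ops whose predecessors were scheduled in earlier steps
        ready = [v for v in ops if v not in start
                 and all(u in start for u in ops if succ[u] == v)]
        ready.sort(key=lambda v: -label[v])     # highest labels first
        for v in ready[:a]:                     # at most a ops per step
            start[v] = step
        step += 1
    return start, label

# Hypothetical in-tree: a, b -> d; c -> e; d, e, g -> f; f -> sink.
succ = {"a": "d", "b": "d", "c": "e", "d": "f", "e": "f", "g": "f", "f": "sink"}
start, label = hu_schedule(succ, "sink", a=2)
print(start)
```

Seven unit-delay operations on two resources need at least four steps, and the greedy labeling reaches exactly that bound here.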
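57 Example, a = 3. Step 1: Op 1, 2, 6. Step 2: Op 3, 7, 8. Step 3: Op 4, 9, 10. Step 4: Op 5, 11. [Figure: labeled example graph with the resulting schedule.]
58 Hu's Algorithm: Example (a = 3). [Figure: step-by-step schedule of the example graph.]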
59 List Scheduling. Greedy algorithm for MLRCS and MRLCS. Does NOT guarantee an optimum solution. Similar to Hu's algorithm: operation selection is decided by criticality; O(n) time complexity. More general input: resource constraints on different resource types.
60 List Scheduling Algorithm: MLRCS
LIST_L (G(V,E), a) {
  l = 1
  repeat {
    for each resource type k {
      U_{l,k} = available vertices in V
      T_{l,k} = operations in progress
      Select S_k ⊆ U_{l,k} such that |S_k| + |T_{l,k}| ≤ a_k
      Schedule the S_k operations at step l
    }
    l = l + 1
  } until v_n is scheduled
}
61 Example. Resource bounds: 3 multipliers with delay 2; 1 ALU with delay 1. [Figure: sequencing graph and the resulting list schedule over time steps 1 to 7.]
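LIST_L can be sketched in Python as follows. The dependency list is assumed to be the diffeq graph from [Mic94], the priority function (delay-weighted longest path to the sink) is one common choice of criticality and is an assumption here, and the resource bounds are the ones stated in the example: three multipliers with delay 2 and one ALU with delay 1.

```python
def longest_path_to_sink(pred, delay):
    """Priority of each op: delay-weighted longest path from the op to the sink."""
    succ = {v: [] for v in pred}
    for v, ps in pred.items():
        for p in ps:
            succ[p].append(v)
    memo = {}

    def lp(v):
        if v not in memo:
            memo[v] = delay[v] + max((lp(w) for w in succ[v]), default=0)
        return memo[v]

    return {v: lp(v) for v in pred}

def list_schedule(pred, delay, rtype, bound):
    priority = longest_path_to_sink(pred, delay)
    start, step = {}, 1
    while len(start) < len(pred):
        for k, a_k in bound.items():
            # T_{l,k}: ops of type k still executing at this step
            busy = sum(1 for v, t in start.items()
                       if rtype[v] == k and t <= step < t + delay[v])
            # U_{l,k}: unscheduled ops of type k whose predecessors have finished
            ready = [v for v in pred if rtype[v] == k and v not in start
                     and all(p in start and start[p] + delay[p] <= step
                             for p in pred[v])]
            ready.sort(key=lambda v: -priority[v])
            for v in ready[:max(0, a_k - busy)]:
                start[v] = step
        step += 1
    return start

# Assumed dependencies of the diffeq graph (NOPs omitted).
pred = {"v1": [], "v2": [], "v3": ["v1", "v2"], "v4": ["v3"], "v5": ["v4", "v7"],
        "v6": [], "v7": ["v6"], "v8": [], "v9": ["v8"], "v10": [], "v11": ["v10"]}
rtype = {v: "mult" for v in ("v1", "v2", "v3", "v6", "v7", "v8")}
rtype.update({v: "alu" for v in ("v4", "v5", "v9", "v10", "v11")})
delay = {v: 2 if rtype[v] == "mult" else 1 for v in pred}

start = list_schedule(pred, delay, rtype, {"mult": 3, "alu": 1})
latency = max(start[v] + delay[v] for v in start) - 1
print(start, latency)
```

Under these assumptions the sketch uses seven time steps, in line with the seven time rows of the example figure.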
62 List Scheduling Algorithm: MRLCS
LIST_R (G(V,E), λ) {
  a = 1, l = 1
  Compute the ALAP times t^L
  if t_0^L < 0 return (not feasible)
  repeat {
    for each resource type k {
      U_{l,k} = available vertices in V
      Compute the slacks { s_i = t_i^L − l, ∀ v_i ∈ U_{l,k} }
      Schedule operations with zero slack, update a
      Schedule additional S_k ⊆ U_{l,k} under a constraints
    }
    l = l + 1
  } until v_n is scheduled
}
63 Example. Assumptions: unit-delay resources; maximum latency = 4; start with $a_1 = 1$ multiplier and $a_2 = 1$ ALU. Step 1: two multiplications are on the critical path, so set $a_1 = 2$; schedule Mult 1, 2; schedule ALU 10. Step 2: schedule Mult 3, 6; schedule ALU 11. Step 3: schedule Mult 7, 8; schedule ALU 4. Step 4: set $a_2 = 2$; schedule ALU 5, 9. [Figure: sequencing graph and the resulting schedule over time steps 1 to 4.]
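The steps of this example can be replayed with a small unit-delay sketch of LIST_R. The dependency list is assumed to be the diffeq graph from [Mic94], the ALAP times are hard-coded for a latency bound of 4, and the feasibility test on t_0^L is omitted for brevity; the function name is hypothetical.

```python
def list_schedule_min_resources(pred, rtype, alap_t):
    """Unit-delay MRLCS sketch; alap_t already encodes the latency bound."""
    a = {k: 1 for k in sorted(set(rtype.values()))}   # one unit per type to start
    start, step = {}, 1
    while len(start) < len(pred):
        for k in a:
            # U_{l,k}: ops of type k whose predecessors ran in earlier steps
            ready = [v for v in pred if rtype[v] == k and v not in start
                     and all(p in start and start[p] < step for p in pred[v])]
            urgent = [v for v in ready if alap_t[v] - step == 0]   # zero slack
            a[k] = max(a[k], len(urgent))                          # update a
            rest = sorted((v for v in ready if v not in urgent),
                          key=lambda v: alap_t[v] - step)
            # zero-slack ops first, then fill the remaining units by slack
            for v in urgent + rest[:a[k] - len(urgent)]:
                start[v] = step
        step += 1
    return start, a

# Assumed dependencies of the diffeq graph (NOPs omitted).
pred = {"v1": [], "v2": [], "v3": ["v1", "v2"], "v4": ["v3"], "v5": ["v4", "v7"],
        "v6": [], "v7": ["v6"], "v8": [], "v9": ["v8"], "v10": [], "v11": ["v10"]}
rtype = {v: "mult" for v in ("v1", "v2", "v3", "v6", "v7", "v8")}
rtype.update({v: "alu" for v in ("v4", "v5", "v9", "v10", "v11")})
alap_t = {"v1": 1, "v2": 1, "v3": 2, "v4": 3, "v5": 4, "v6": 2,
          "v7": 3, "v8": 3, "v9": 4, "v10": 3, "v11": 4}

start, a = list_schedule_min_resources(pred, rtype, alap_t)
print(start, a)
```

The sketch ends with two multipliers and two ALUs, and it places the operations in the same steps as the walk-through above.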
64 Summary. Scheduling algorithms are used by tools: compilers use them when you write code for DSP processors; tools like Xilinx ISE, Synopsys DC, etc. use them when you compile HDL models. A good understanding of the under-the-hood operations of tools is useful. The constraint-solving techniques can be used directly for your custom designs, e.g. in DSP software, if you know the resources, you can write assembly code to minimize latency.
65 Force-Directed Scheduling. Similar to list scheduling: can handle MLRCS and MRLCS; for MLRCS, schedules step by step, BUT the selection of the operations tries to find the globally best set of operations. Idea: find the mobility $\mu_i = t_i^L - t_i^S$ of operations; look at the operation-type probability distributions; try to flatten the operation-type distributions. Definition: operation probability density $p_i(l) = \Pr\{v_i \text{ starts at step } l\}$. Assume a uniform distribution: $p_i(l) = \frac{1}{\mu_i + 1}$ for $l \in [t_i^S, t_i^L]$.
66 Force-Directed Scheduling: Definitions. Operation-type distribution (NOT normalized to 1): $q_k(l) = \sum_{i : T(v_i) = k} p_i(l)$. Operation probabilities over control steps: $p_i = \{p_i(0), p_i(1), \ldots, p_i(n)\}$. Distribution graph of type k over all steps: $\{q_k(0), q_k(1), \ldots, q_k(n)\}$. $q_k(l)$ can be thought of as the expected operator cost for implementing operations of type k at step l.
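These definitions translate directly into code. The sketch below computes the type distributions $q_k(l)$ from the time frames $[t_i^S, t_i^L]$, using exact fractions; the time frames are the ASAP/ALAP values of the diffeq graph under unit delays and a latency bound of 4, assumed here as input, and the function name is hypothetical.

```python
from fractions import Fraction

def type_distributions(frames, rtype, steps):
    """q[k][l] = sum of p_i(l) over the operations i of type k."""
    q = {k: {l: Fraction(0) for l in steps} for k in set(rtype.values())}
    for v, (t_s, t_l) in frames.items():
        p = Fraction(1, t_l - t_s + 1)      # uniform: 1 / (mobility + 1)
        for l in range(t_s, t_l + 1):
            q[rtype[v]][l] += p
    return q

# (ASAP, ALAP) time frames, assumed as input.
frames = {"v1": (1, 1), "v2": (1, 1), "v3": (2, 2), "v4": (3, 3), "v5": (4, 4),
          "v6": (1, 2), "v7": (2, 3), "v8": (1, 3), "v9": (2, 4),
          "v10": (1, 3), "v11": (2, 4)}
rtype = {v: "mult" for v in ("v1", "v2", "v3", "v6", "v7", "v8")}
rtype.update({v: "alu" for v in ("v4", "v5", "v9", "v10", "v11")})

q = type_distributions(frames, rtype, range(1, 5))
for k in sorted(q):
    print(k, [float(q[k][l]) for l in range(1, 5)])
```

Each distribution sums to the number of operations of its type (6 multiplications, 5 ALU operations), which is why it is described as not normalized to 1.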
67 Example. [Figure: sequencing graph with time frames and the resulting distribution graphs $q_{mult}(l)$ and $q_{add}(l)$ for l = 1, ..., 4; the numeric values are not recoverable from the transcript.]
68 Force-Directed Scheduling Algorithm: Idea. Very similar to LIST_L(G(V,E), a): compute the mobility of operations using ASAP and ALAP; compute operation probabilities and type distributions; select and schedule operations; update operation probabilities and type distributions; go to the next control step. Difference from list scheduling in selecting operations: select operations with the least force; consider the effect on the type distribution; consider the effect on successor nodes and their type distributions.
More informationDesign Space Exploration in System Level Synthesis under Memory Constraints
Design Space Exploration in System Level Synthesis under Constraints Radoslaw Szymanek and Krzysztof Kuchcinski Dept. of Computer and Information Science Linköping University Sweden radsz@ida.liu.se Abstract
More informationEE595. Part VIII Overall Concept on VHDL. EE 595 EDA / ASIC Design Lab
EE595 Part VIII Overall Concept on VHDL VHDL is a Standard Language Standard in the electronic design community. VHDL will virtually guarantee that you will not have to throw away and re-capture design
More informationEN2911X: Reconfigurable Computing Topic 03: Reconfigurable Computing Design Methodologies
EN2911X: Reconfigurable Computing Topic 03: Reconfigurable Computing Design Methodologies Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering
More information10. Network dimensioning
Partly based on slide material by Samuli Aalto and Jorma Virtamo ELEC-C7210 Modeling and analysis of communication networks 1 Contents Introduction Parameters: topology, routing and traffic Dimensioning
More informationEE382V: System-on-a-Chip (SoC) Design
EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu
More informationPrevious Exam Questions System-on-a-Chip (SoC) Design
This image cannot currently be displayed. EE382V Problem: System Analysis (20 Points) This is a simple single microprocessor core platform with a video coprocessor, which is configured to process 32 bytes
More informationOPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION
OPTIMIZATION OF FIR FILTER USING MULTIPLE CONSTANT MULTIPLICATION 1 S.Ateeb Ahmed, 2 Mr.S.Yuvaraj 1 Student, Department of Electronics and Communication/ VLSI Design SRM University, Chennai, India 2 Assistant
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume VII /Issue 2 / OCT 2016
NEW VLSI ARCHITECTURE FOR EXPLOITING CARRY- SAVE ARITHMETIC USING VERILOG HDL B.Anusha 1 Ch.Ramesh 2 shivajeehul@gmail.com 1 chintala12271@rediffmail.com 2 1 PG Scholar, Dept of ECE, Ganapathy Engineering
More informationAn Ant-Based Routing Algorithm to Achieve the Lifetime Bound for Target Tracking Sensor Networks
An Ant-Based Routing Algorithm to Achieve the Lifetime Bound for Target Tracking Sensor Networks Peng Zeng Cuanzhi Zang Haibin Yu Shenyang Institute of Automation Chinese Academy of Sciences Target Tracking
More informationAn overview of standard cell based digital VLSI design
An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased
More informationFPGA Latency Optimization Using System-level Transformations and DFG Restructuring
PGA Latency Optimization Using System-level Transformations and DG Restructuring Daniel Gomez-Prado, Maciej Ciesielski, and Russell Tessier Department of Electrical and Computer Engineering University
More informationRS-FDRA: A Register-Sensitive Software Pipelining Algorithm for Embedded VLIW Processors
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 21, NO. 12, DECEMBER 2002 1395 RS-FDRA: A Register-Sensitive Software Pipelining Algorithm for Embedded VLIW Processors
More information244 Lecture Notes 11-1
RTL and Behavioral Synthesis Srinivas evadas MIT Kurt Keutzer Richard Newton UC Berkeley 1 RTL Synthesis Flow HL RTL Synthesis Library netlist logic optimization a b 0 1 s d clk q netlist a b 0 1 s d clk
More informationBased on slides/material by. Topic Design Methodologies and Tools. Outline. Digital IC Implementation Approaches
Based on slides/material by Topic 11 Peter Y. K. Cheung Department of Electrical & Electronic Engineering Imperial College London K. Masselos http://cas.ee.ic.ac.uk/~kostas J. Rabaey http://bwrc.eecs.berkeley.edu/classes/icbook/instructors.html
More informationDesign of a Low Density Parity Check Iterative Decoder
1 Design of a Low Density Parity Check Iterative Decoder Jean Nguyen, Computer Engineer, University of Wisconsin Madison Dr. Borivoje Nikolic, Faculty Advisor, Electrical Engineer, University of California,
More informationSungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park. Design Technology Infrastructure Design Center System-LSI Business Division
Sungmin Bae, Hyung-Ock Kim, Jungyun Choi, and Jaehong Park Design Technology Infrastructure Design Center System-LSI Business Division 1. Motivation 2. Design flow 3. Parallel multiplier 4. Coarse-grained
More informationHardware/Software Partitioning using Integer Programming. Ralf Niemann, Peter Marwedel. University of Dortmund. D Dortmund, Germany
Hardware/Software using Integer Programming Ralf Niemann, Peter Marwedel Dept. of Computer Science II University of Dortmund D-44221 Dortmund, Germany Abstract One of the key problems in hardware/software
More informationA Process Model suitable for defining and programming MpSoCs
A Process Model suitable for defining and programming MpSoCs MpSoC-Workshop at Rheinfels, 29-30.6.2010 F. Mayer-Lindenberg, TU Hamburg-Harburg 1. Motivation 2. The Process Model 3. Mapping to MpSoC 4.
More informationScheduling with Integer Time Budgeting for Low-Power Optimization
Scheduling with Integer Time Budgeting for Low-Power Optimization Wei Jiang, Zhiru Zhang, Miodrag Potkonjak and Jason Cong Computer Science Department University of California, Los Angeles, CA 995, USA
More informationAn Exact Algorithm for the Statistical Shortest Path Problem
An Exact Algorithm for the Statistical Shortest Path Problem Liang Deng and Martin D. F. Wong Dept. of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Outline Motivation
More informationArchitecture-Level Synthesis for Automatic Interconnect Pipelining
Architecture-Level Synthesis for Automatic Interconnect Pipelining Jason Cong, Yiping Fan, Zhiru Zhang Computer Science Department University of California, Los Angeles, CA 90095 {cong, fanyp, zhiruz}@cs.ucla.edu
More informationBehavioural Transformation to Improve Circuit Performance in High-Level Synthesis*
Behavioural Transformation to Improve Circuit Performance in High-Level Synthesis* R. Ruiz-Sautua, M. C. Molina, J.M. Mendías, R. Hermida Dpto. Arquitectura de Computadores y Automática Universidad Complutense
More informationPerformance of computer systems
Performance of computer systems Many different factors among which: Technology Raw speed of the circuits (clock, switching time) Process technology (how many transistors on a chip) Organization What type
More informationOptimization Method for Broadband Modem FIR Filter Design using Common Subexpression Elimination
Optimization Method for Broadband Modem FIR Filter Design using Common Subepression Elimination Robert Pasko *, Patrick Schaumont **, Veerle Derudder **, Daniela Durackova * * Faculty of Electrical Engineering
More informationTowards Optimal Custom Instruction Processors
Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT CHIPS 18 Overview 1. background: extensible processors
More informationBuilt-in Chaining: Introducing Complex Components into Architectural Synthesis
Built-in Chaining: Introducing Complex Components into Architectural Synthesis Peter Marwedel, Birger Landwehr Rainer Dömer Dept. of Computer Science II Dept. of Information and Computer Science University
More informationEE 466/586 VLSI Design. Partha Pande School of EECS Washington State University
EE 466/586 VLSI Design Partha Pande School of EECS Washington State University pande@eecs.wsu.edu Lecture 18 Implementation Methods The Design Productivity Challenge Logic Transistors per Chip (K) 10,000,000.10m
More informationVLSI Signal Processing
VLSI Signal Processing Programmable DSP Architectures Chih-Wei Liu VLSI Signal Processing Lab Department of Electronics Engineering National Chiao Tung University Outline DSP Arithmetic Stream Interface
More informationFloorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence
Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,
More informationHead, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India
Mapping Signal Processing Algorithms to Architecture Sumam David S Head, Dept of Electronics & Communication National Institute of Technology Karnataka, Surathkal, India sumam@ieee.org Objectives At the
More informationRIZALAFANDE CHE ISMAIL TKT. 3, BLOK A, PPK MIKRO-e KOMPLEKS PENGAJIAN KUKUM. SYNTHESIS OF COMBINATIONAL LOGIC (Chapter 8)
RIZALAFANDE CHE ISMAIL TKT. 3, BLOK A, PPK MIKRO-e KOMPLEKS PENGAJIAN KUKUM SYNTHESIS OF COMBINATIONAL LOGIC (Chapter 8) HDL-BASED SYNTHESIS Modern ASIC design use HDL together with synthesis tool to create
More informationC-based Interactive RTL Design Methodology
C-based Interactive RTL Design Methodology Dongwan Shin, Andreas Gerstlauer, Rainer Dömer and Daniel Gajski Technical Report CECS-03-42 December 1, 2003 Center for Embedded Computer Systems University
More informationProcessor Design CSCE Instructor: Saraju P. Mohanty, Ph. D. NOTE: The figures, text etc included in slides are borrowed
Lecture 3: General Purpose Processor Design CSCE 665 Advanced VLSI Systems Instructor: Saraju P. ohanty, Ph. D. NOTE: The figures, tet etc included in slides are borrowed from various books, websites,
More informationTopic 14: Scheduling COS 320. Compiling Techniques. Princeton University Spring Lennart Beringer
Topic 14: Scheduling COS 320 Compiling Techniques Princeton University Spring 2016 Lennart Beringer 1 The Back End Well, let s see Motivating example Starting point Motivating example Starting point Multiplication
More informationMixed-Integer Optimization for the Combined capacitated Facility Location-Routing Problem
Mixed-Integer Optimization for the Combined capacitated Facility Location-Routing Problem Dimitri Papadimitriou 1, Didier Colle 2, Piet Demeester 2 dimitri.papadimitriou@nokia.com, didier.colle@ugent.be,
More informationAdministration CS 412/413. Instruction ordering issues. Simplified architecture model. Examples. Impact of instruction ordering
dministration CS 1/13 Introduction to Compilers and Translators ndrew Myers Cornell University P due in 1 week Optional reading: Muchnick 17 Lecture 30: Instruction scheduling 1 pril 00 1 Impact of instruction
More informationICS 252 Introduction to Computer Design
ICS 252 Introduction to Computer Design Lecture 3 Fall 2006 Eli Bozorgzadeh Computer Science Department-UCI System Model According to Abstraction level Architectural, logic and geometrical View Behavioral,
More informationAS THE fabrication technology advances and transistors
1010 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 26, NO. 6, JUNE 2007 Ant Colony Optimizations for Resource- and Timing-Constrained Operation Scheduling Gang Wang,
More informationThe ILP approach to the layered graph drawing. Ago Kuusik
The ILP approach to the layered graph drawing Ago Kuusik Veskisilla Teooriapäevad 1-3.10.2004 1 Outline Introduction Hierarchical drawing & Sugiyama algorithm Linear Programming (LP) and Integer Linear
More informationCo-Optimization of Memory Access and Task Scheduling on MPSoC Architectures with Multi-Level Memory
Co-Optimization of Memory Access and Task Scheduling on MPSoC Architectures with Multi-Level Memory Assistant Professor Computer Science City University of Hong Kong http://www.cs.cityu.edu.hk/~jasonxue/
More informationLecture 2 Hardware Description Language (HDL): VHSIC HDL (VHDL)
Lecture 2 Hardware Description Language (HDL): VHSIC HDL (VHDL) Pinit Kumhom VLSI Laboratory Dept. of Electronic and Telecommunication Engineering (KMUTT) Faculty of Engineering King Mongkut s University
More informationEECS150 - Digital Design Lecture 5 - Verilog Logic Synthesis
EECS150 - Digital Design Lecture 5 - Verilog Logic Synthesis Jan 31, 2012 John Wawrzynek Spring 2012 EECS150 - Lec05-verilog_synth Page 1 Outline Quick review of essentials of state elements Finite State
More informationCircuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk
Circuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk Andrew A. Kennings, Univ. of Waterloo, Canada, http://gibbon.uwaterloo.ca/ akenning/ Igor L. Markov, Univ. of
More informationBRNO UNIVERSITY OF TECHNOLOGY Faculty of Information Technology Department of Computer Systems. Ing. Azeddien M. Sllame
BRNO UNIVERSITY OF TECHNOLOGY Faculty of Information Technology Department of Computer Systems Ing. Azeddien M. Sllame DESIGN SPACE EXPLORATION OF HIGH-PERFORMANCE DIGITAL SYSTEMS PROZKOUMÁVÁNÍ PROSTORU
More informationii Copyright c Harikrishna Samala 2009 All Rights Reserved
ii Copyright c Harikrishna Samala 2009 All Rights Reserved iii Abstract Methodology to Derive Resource Aware Context Adaptable Architectures for Field Programmable Gate Arrays by Harikrishna Samala, Master
More informationPublished in: Proceedings of the Practical Application of Constraint Logic Programming (PACLP) Conference
Digital Systems Design Using Constraint Logic Programming Szymanek, Radoslaw; Gruian, Flavius; Kuchcinski, Krzysztof Published in: Proceedings of the Practical Application of Constraint Logic Programming
More informationFILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS. Waqas Akram, Cirrus Logic Inc., Austin, Texas
FILTER SYNTHESIS USING FINE-GRAIN DATA-FLOW GRAPHS Waqas Akram, Cirrus Logic Inc., Austin, Texas Abstract: This project is concerned with finding ways to synthesize hardware-efficient digital filters given
More information