IMPLEMENTATION DESIGN FLOW


Hà Minh Trần Hạnh Nguyễn Duy Thái
Course: Reconfigurable Computing

Outline
- Overview
- Integration
- Design flow
  - Design entry
  - Functional simulation
  - Synthesis
  - Place & Route
  - Configuration bitstream
- Logic synthesis
- Node manipulation
- LUT-based mapping
- Chortle algorithm
- Flowmap algorithm
- Conclusion

Overview
- This part covers the steps required to implement an application on a reconfigurable device.
- The FPGA design flow does not aim to generate a set of instructions. It generates hardware components that are loaded onto the available resources at different times.
- Technology mapping targets look-up tables (LUTs) rather than NAND gates.

Integration: Usage
- Rapid prototyping: the device serves as an emulator for another digital device, such as an ASIC.
- Non-frequently reconfigurable systems: reconfiguration is used for testing and initialization at start-up and for upgrades. No reconfiguration happens during operation.
- Frequently reconfigurable systems: the device is usually coupled with a host processor that reconfigures it and controls the complete system.

Integration: Reconfiguration time
- Compile-time reconfiguration: the configuration never changes during computation.
- Run-time reconfiguration: the configuration is not known at compile time, because a given task becomes known only at run time. This mode is more interesting only for fully reconfigurable devices.
- The reconfiguration process exchanges parts of the device to adapt the system to changing operational and environmental conditions.

Integration: Management of the reconfigurable device
Figure: Common architecture.

Integration: Management of the reconfigurable device
Figure: Common operation flow.

Integration: Design flow of a dynamically reconfigurable system
The hardware/software partitioning process has three parts:
- The part of the code to be executed on the host processor is determined.
- The parts of the code to be executed on the reconfigurable device are identified.
- The interface between the processor and the reconfigurable device is implemented.

Design flow
- Design entry: describe the function. Tools include HDLs, schematic editors, and FSM editors.
- Functional simulation: simulate the function to check its correctness.
- Synthesis: compile and optimize the description. The function is implemented with the modules available in the function library of the target device, which produces a netlist.
- Place & route: place the operators (such as multiplexers, LUTs, and flip-flops) on the FPGA device and connect them through its routing resources. This is normally done by CAD tools, which generate a bitstream to be downloaded to the device later.

Design flow
Figure: Design tools.

Logic synthesis: Overview
The function is described as a structured digital system (register-transfer description):
- Combinational logic blocks, which are the nodes.
- Registers, which store the results of the combinational logic blocks.
The goal is to produce an optimal implementation of the system on the given hardware platform. For an FPGA, this means generating the configuration data.

Logic synthesis
Figure: Structured digital system.

Logic synthesis: Synthesis Approaches
Two-level logic synthesis
- A natural and straightforward method to implement Boolean functions.
- Every function can be represented as a sum of product terms.
- The longest path has length two (two logic levels).
- Most appropriate for PALs and PLAs.
Multi-level logic synthesis
- The function is represented as multi-level logic.
- The longest path is longer than two.
- Smaller, faster, and lower in power consumption in most cases.
- Used for standard cells and for mask-programmable or field-programmable devices.
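The sum-of-products idea behind two-level synthesis can be illustrated with a minimal Python sketch. It enumerates the truth table and emits one product term per input combination that yields 1 (the canonical, unminimized form). The helper name `sop_from_truth_table` and the majority-function example are assumptions made for illustration; they are not from the slides.

```python
from itertools import product

def sop_from_truth_table(names, func):
    """Return the canonical sum-of-products string of func over the named inputs."""
    terms = []
    for bits in product([0, 1], repeat=len(names)):
        if func(*bits):
            # One product term (minterm) per input combination that yields 1.
            literals = [n if b else "~" + n for n, b in zip(names, bits)]
            terms.append("".join(literals))
    return " + ".join(terms) if terms else "0"

# Hypothetical example: 3-input majority function.
majority = lambda a, b, c: (a + b + c) >= 2
sop = sop_from_truth_table(["a", "b", "c"], majority)
print(sop)
```

A two-level minimizer would then merge these minterms into fewer, larger product terms; that step is not shown here.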

Synthesis Approaches
Technology dependent
- Only valid gates chosen from the target library are used.
- The final implementation matches the node representation.
Technology independent
- The design is not tied to any library.
- Uses memory efficiently and is easy to manipulate.
- Technology-independent synthesis is the most widely used method because optimization is easier.
In an FPGA flow, synthesis follows two steps:
1. Minimize the Boolean equations.
2. Map the Boolean network to a set of LUTs.
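To make the mapping target concrete, here is a minimal sketch of a K-input LUT modeled as a table of 2^K configuration bits. Any Boolean function of K inputs fits in one LUT, which is why mapping counts inputs rather than gates. The function names and the example function are illustrative assumptions, not part of any vendor tool.

```python
from itertools import product

def lut_config(k, func):
    """Return the 2**k configuration bits of a k-input LUT implementing func."""
    return [int(bool(func(*bits))) for bits in product([0, 1], repeat=k)]

def lut_eval(config, inputs):
    """Read the LUT: the inputs form the address of the stored bit (first input is the MSB)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return config[address]

# Hypothetical 4-input function stored in a single 4-LUT.
f = lambda a, b, c, d: (a and b) or (c ^ d)
config = lut_config(4, f)
print(config)
```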

Representation approaches
SOP form
- Represents the function as a Sum Of Products.
- Well understood and easy to manipulate.
- Not representative of the logic complexity.
Factored form
- Example: c[a + b(d + e)].
- Representative of the logic complexity, so the complexity of the logic implementation is easy to estimate.
- Lacks manipulation algorithms.
BDD form
- Represents the function as a Binary Decision Diagram.
- Manipulations of BDDs are well defined and efficient.

Binary Decision Diagram
F(c) = c.f(a) + (~c).f(d)
Variable node: index = {0.. N}
Constant node: value = {0; 1}
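The BDD form can be sketched in a few lines of Python. The sketch below builds a reduced ordered BDD by expanding the function one variable at a time and sharing identical nodes. It enumerates every assignment, so it is only suitable for small examples, and all names (`build_bdd`, `bdd_eval`) are assumptions made for illustration. The example function is the factored form c[a + b(d + e)] from the slide.

```python
def build_bdd(func, n):
    """Build a reduced ordered BDD of func over variables 0..n-1.

    Returns (root, nodes). nodes maps a node id to (index, low, high).
    Ids 0 and 1 are the constant nodes.
    """
    unique = {}   # (index, low, high) -> node id, so equal nodes are shared
    nodes = {}

    def make(index, low, high):
        if low == high:          # both branches agree: the test is redundant
            return low
        key = (index, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    def expand(index, assignment):
        if index == n:
            return int(bool(func(*assignment)))
        low = expand(index + 1, assignment + (0,))
        high = expand(index + 1, assignment + (1,))
        return make(index, low, high)

    return expand(0, ()), nodes

def bdd_eval(root, nodes, assignment):
    """Follow low/high edges from the root until a constant node is reached."""
    node = root
    while node > 1:
        index, low, high = nodes[node]
        node = high if assignment[index] else low
    return node

# Variable order a, b, c, d, e (an arbitrary choice for this sketch).
factored = lambda a, b, c, d, e: c and (a or (b and (d or e)))
root, nodes = build_bdd(factored, 5)
print(len(nodes), "variable nodes")
```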

Node manipulation
Transformations used to obtain an optimized form of the representations in the previous section (~x denotes the complement of x):
- Decompose: f = abc + abd + ~a~c~d + ~b~c~d = ab(c + d) + (~a + ~b)~c~d = ab(c + d) + ~(ab)~(c + d) = XY + ~X~Y, with X = ab and Y = c + d.
- Extract: f = (a + bc)d + e = xd + e and g = (a + bc)e = xe, with x = a + bc.
- Factor: f = ac + ad + bc + bd + e = (a + b)(c + d) + e.
- Substitute: f = (a + bc)(d + e) = g(d + e), with g = a + bc.
- Collapse: f = ga + ~gb = ac + ad + b~c~d, with g = c + d.
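The transformations can be checked exhaustively with a short Python sketch. It assumes that the Decompose and Collapse examples contain complemented literals, that is, f = abc + abd + ~a~c~d + ~b~c~d = XY + ~X~Y with X = ab, Y = c + d, and f = ga + ~gb = ac + ad + b~c~d with g = c + d. The helper `equivalent` is a hypothetical name.

```python
from itertools import product

def equivalent(f, g, n):
    """True when f and g agree on all 2**n input combinations."""
    return all(bool(f(*v)) == bool(g(*v)) for v in product([False, True], repeat=n))

def decompose_before(a, b, c, d):
    return ((a and b and c) or (a and b and d)
            or (not a and not c and not d) or (not b and not c and not d))

def decompose_after(a, b, c, d):
    x, y = a and b, c or d          # X = ab, Y = c + d
    return (x and y) or (not x and not y)

def factor_before(a, b, c, d, e):
    return (a and c) or (a and d) or (b and c) or (b and d) or e

def factor_after(a, b, c, d, e):
    return ((a or b) and (c or d)) or e

def collapse_before(a, b, c, d):
    g = c or d                      # intermediate node g = c + d
    return (g and a) or (not g and b)

def collapse_after(a, b, c, d):
    return (a and c) or (a and d) or (b and not c and not d)

checks = {
    "decompose": equivalent(decompose_before, decompose_after, 4),
    "factor": equivalent(factor_before, factor_after, 5),
    "collapse": equivalent(collapse_before, collapse_after, 4),
}
print(checks)
```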

LUT-based mapping
- Algorithms for minimizing the area: Chortle-crf, MIS-pga, Xmap.
- Algorithms for minimizing the delay: FlowMap, Chortle-d, DAG-map, MIS-pga-delay.
- Algorithms for maximizing the routability: Bhat and Hill; Schlag, Kong, and Chan.

LUT-based mapping: Prerequisite definitions
- Primary input (PI) node: a node without any predecessor.
- Primary output (PO) node: a node without any successor.
- The level l(v) of a node v: the length of the longest path from the primary inputs to v.
- The depth of a network G: the largest level of a node in G.
- The fan-in of a node v: the set of gates whose outputs are inputs of v.
- The fan-out of a node v: the set of gates that use the output of v as an input.
- Given a node v in G, input(v) is the set of nodes of G that are fan-ins of v, i.e. the set of predecessors of v.

LUT-based mapping: Modified tree-covering approaches
1. Begin with an AND/OR representation of the optimized Boolean network.
2. Decompose the network into a forest of trees.
3. Map each tree optimally into LUTs.
4. Assemble the mapped trees.
Figure: An example of a Boolean network.
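The level, depth, and fan-out definitions translate directly into code. The sketch below is a minimal illustration on a made-up network; the dictionary `inputs_of` (node to list of fan-in nodes) and the node names are assumptions, not taken from the slides.

```python
def levels(inputs_of):
    """Level of each node: 0 for primary inputs, else 1 + the largest fan-in level."""
    level = {}

    def visit(v):
        if v not in level:
            preds = inputs_of.get(v, [])
            level[v] = 0 if not preds else 1 + max(visit(p) for p in preds)
        return level[v]

    for v in inputs_of:
        visit(v)
    return level

def fanouts(inputs_of):
    """Fan-out of each node: the set of nodes that use it as an input."""
    out = {}
    for v, preds in inputs_of.items():
        for p in preds:
            out.setdefault(p, set()).add(v)
    return out

# Hypothetical network: a, b, c are primary inputs; n3 is the primary output.
network = {"n1": ["a", "b"], "n2": ["n1", "c"], "n3": ["n1", "n2"]}
level = levels(network)
depth = max(level.values())
print(level, depth, fanouts(network)["n1"])
```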

Chortle Algorithm: Concepts
- Specifically designed for TLU-based FPGAs.
- Input: a Boolean network given as a forest of directed acyclic graphs (DAGs):
  - Leaves: inputs.
  - Root: output.
  - Internal nodes: AND/OR logic functions.
  - Edges: inverting or non-inverting signal paths.
- Goal: implement the circuit with the fewest K-input CLBs in minimal running time.

Chortle Algorithm: Concepts
Figure: Boolean network and DAG representation.

Chortle Algorithm: Tree mapping, operation flow
- Convert the input graph to a forest of fan-out-free trees, where each logic function has exactly one output.
- Evaluate each sub-tree independently.
Figure: Forest of fan-out-free trees.
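One common way to obtain a forest of fan-out-free trees is to cut the graph at every gate whose output is used more than once, so that each such gate becomes the root of its own tree. The Python sketch below shows this under stated assumptions: the function name, the `inputs_of` dictionary format, and the example network are hypothetical, and the slides do not say that Chortle partitions the graph in exactly this way.

```python
def fanout_free_trees(inputs_of, outputs):
    """Split a DAG into fan-out-free trees.

    inputs_of maps each gate to its list of fan-in nodes; names that are not
    keys are primary inputs. Returns {root: (gates_in_tree, tree_leaves)}.
    """
    fanout = {}
    for preds in inputs_of.values():
        for p in preds:
            fanout[p] = fanout.get(p, 0) + 1

    def is_gate(v):
        return bool(inputs_of.get(v))

    # A gate roots a tree if it is a primary output or feeds more than one gate.
    roots = {v for v in inputs_of if is_gate(v) and (v in outputs or fanout.get(v, 0) > 1)}

    trees = {}
    for root in roots:
        members, leaves, stack = [], [], [root]
        while stack:
            v = stack.pop()
            members.append(v)
            for p in inputs_of[v]:
                if is_gate(p) and p not in roots:
                    stack.append(p)      # single-fan-out gate: stays in this tree
                else:
                    leaves.append(p)     # primary input or root of another tree
        trees[root] = (sorted(members), sorted(leaves))
    return trees

# Hypothetical network: n1 feeds both n2 and n3, so it becomes its own tree.
dag = {"n1": ["a", "b"], "n0": ["c", "e"], "n2": ["n1", "n0"], "n3": ["n1", "d"]}
forest = fanout_free_trees(dag, outputs={"n2", "n3"})
print(forest)
```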

Chortle Algorithm: Operation flow, decomposing
Decomposition is required if a node of the original circuit has a fan-in greater than K. It takes two steps:
1. Two-level decomposition.
2. Conversion into a multi-level decomposition.
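Why decomposition is needed can be seen in a small sketch: a node with more than K inputs cannot fit in one K-input block, so it is split into several nodes of at most K inputs. The greedy grouping below is only an illustration for associative operations such as AND and OR. It is not the two-step procedure used by Chortle, and the function name and node naming scheme are assumptions.

```python
def decompose_wide_node(name, inputs, k):
    """Split an AND/OR node into nodes with at most k inputs each.

    Returns a list of (node_name, input_list); the last entry is the root.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    nodes = []
    pending = list(inputs)
    counter = 0
    while len(pending) > k:
        group, pending = pending[:k], pending[k:]
        counter += 1
        sub = f"{name}_{counter}"      # hypothetical name for the new sub-node
        nodes.append((sub, group))
        pending.append(sub)            # the sub-node output replaces its k inputs
    nodes.append((name, pending))
    return nodes

# A 7-input node decomposed for K = 4.
parts = decompose_wide_node("n", ["a", "b", "c", "d", "e", "f", "g"], 4)
print(parts)
```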

Chortle Algorithm: Operation flow, decomposition (figure)

Chortle Algorithm: Example
For this example, we assume that each CLB may have as many as four inputs (i.e., K = 4). The circuit computes the logic function (A * B) + (C * D) * E + F of the inputs {A, B, C, D, E, F}. In the postorder traversal, n1 is visited first, followed by n2 through n5. For n1, there is only one possible mapping function, namely U = 2, u = {1,1}. The same is true for n2.
Figure: Chortle mapping example.

Chortle Algorithm: Example
When n3 is evaluated, there are two possibilities, as illustrated. First, the function could be implemented as a new CLB with two inputs (U = 2), driven from the outputs of n2 and E. This sub-graph would use two CLBs; thus, it would have a cost of 2. For U = 3, only one utilization vector is possible, namely u = {2,1}. All three primary inputs C, D, and E are grouped into one CLB, which gives a cost of 1.
Figure: Mapping of node 3.

Chortle Algorithm: Example
Figure: Mapping of node 4.

Chortle Algorithm: Example
Figure: Mapping of node 5.
Finally, we evaluate the output node, n5, as illustrated. There are four possible mappings, of which two are minimal. Chortle may return either of the mappings in which two CLBs implement n5 = (A * B) + n3 + F and n3 = (C * D) * E.

Improvement of the Chortle
1. Exploiting the reconvergent paths.

Improvement of the Chortle
2. Logic replication at fan-out LUTs.
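The Chortle example (K = 4, function (A * B) + (C * D) * E + F) can be reproduced with a simplified tree-covering sketch in Python. This is not the full Chortle algorithm: it does not decompose nodes or enumerate utilization vectors explicitly. It only decides, for each internal child, whether the child keeps its own CLB or is absorbed into its parent's CLB. The tree shape used here (n4 = n1 + n3, n5 = n4 + F) is an assumption, because the figure is not available in the transcript.

```python
def min_luts(tree, root, k):
    """Minimum number of k-input LUTs covering the tree rooted at root.

    tree maps each internal node to its list of children; any other name is
    a leaf input. Each node is assumed to have at most k children.
    """
    infinity = float("inf")

    def best(v):
        # table[u] = fewest LUTs below v when u inputs of v's root LUT are used
        table = {0: 0}
        for child in tree[v]:
            if child in tree:
                child_table = best(child)
                # Cut: the child keeps its own root LUT and feeds v through one wire.
                options = [(1, min(child_table.values()))]
                # Merge: the child's root LUT is absorbed into v's root LUT.
                options += [(u, cost - 1) for u, cost in child_table.items()]
            else:
                options = [(1, 0)]      # a leaf input uses one LUT input
            merged = {}
            for used, cost in table.items():
                for u, c in options:
                    if used + u <= k:
                        merged[used + u] = min(merged.get(used + u, infinity), cost + c)
            table = merged
        return {u: cost + 1 for u, cost in table.items()}   # count v's own root LUT

    return min(best(root).values())

# Assumed tree for (A * B) + (C * D) * E + F.
example = {
    "n1": ["A", "B"],
    "n2": ["C", "D"],
    "n3": ["n2", "E"],
    "n4": ["n1", "n3"],
    "n5": ["n4", "F"],
}
print(min_luts(example, "n5", 4))
```

Under these assumptions the sketch reports two LUTs for n5 with K = 4, which agrees with the two-CLB mapping described in the example, and one LUT for n3 when C, D, and E share a block.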

Flowmap: Concepts
FlowMap computes a delay-optimal mapping in polynomial time. It works in two steps:
1. Decomposing: for a given FPGA device with k-input TLUs, all nodes of the network with more than k inputs must be decomposed. Different methods decompose the network in different ways.
2. Node mapping.
Example: local elimination. The operation used for local elimination is collapsing, which merges node ni into node nj whenever ni is a fan-in node of nj and the new node obtained is feasible.

Flowmap: Decomposing (figure)
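The collapsing step of local elimination can be sketched as a simple feasibility test: ni is merged into nj only if ni is a fan-in of nj and the merged node still has at most k distinct inputs. The sketch tracks only the input sets, not the logic functions, and the function name and `inputs_of` format are assumptions for illustration.

```python
def collapse_if_feasible(inputs_of, ni, nj, k):
    """Merge ni into nj when ni feeds nj and the merged node has at most k inputs.

    inputs_of maps each node to its list of fan-in nodes and is updated in place.
    Returns True if the collapse was applied.
    """
    if ni not in inputs_of[nj]:
        return False
    # Inputs of the merged node: nj's inputs without ni, plus ni's own inputs.
    merged_inputs = [p for p in inputs_of[nj] if p != ni]
    for p in inputs_of[ni]:
        if p not in merged_inputs:
            merged_inputs.append(p)
    if len(merged_inputs) > k:
        return False                    # not k-feasible: leave the network unchanged
    inputs_of[nj] = merged_inputs
    return True

# Hypothetical network: ni = f(a, b), nj = g(ni, b, c).
net = {"ni": ["a", "b"], "nj": ["ni", "b", "c"]}
applied = collapse_if_feasible(net, "ni", "nj", 3)
print(applied, net["nj"])
```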

Flowmap: Node mapping (figure)

Questions/Discussions
Any questions?

Thank you!