An Introduction to FPGA Placement. Yonghong Xu Supervisor: Dr. Khalid

Similar documents
Place and Route for FPGAs

L14 - Placement and Routing

CAD Flow for FPGAs Introduction

Placement Algorithm for FPGA Circuits

Very Large Scale Integration (VLSI)

COMPARATIVE STUDY OF CIRCUIT PARTITIONING ALGORITHMS

EN2911X: Reconfigurable Computing Lecture 13: Design Flow: Physical Synthesis (5)

CAD Algorithms. Placement and Floorplanning

How Much Logic Should Go in an FPGA Logic Block?

Introduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.

Introduction VLSI PHYSICAL DESIGN AUTOMATION

Can Recursive Bisection Alone Produce Routable Placements?

Unit 5A: Circuit Partitioning

MODULAR PARTITIONING FOR INCREMENTAL COMPILATION

CAD Algorithms. Circuit Partitioning

ECE260B CSE241A Winter Placement

Cluster-Based Architecture, Timing-Driven Packing and Timing-Driven Placement for FPGAs

Hardware/Software Codesign

Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence

Partitioning. Course contents: Readings. Kernighang-Lin partitioning heuristic Fiduccia-Mattheyses heuristic. Chapter 7.5.

VLSI Physical Design: From Graph Partitioning to Timing Closure

Genetic Algorithm for Circuit Partitioning

Toward More Efficient Annealing-Based Placement for Heterogeneous FPGAs. Yingxuan Liu

Estimation of Wirelength

Timing-Driven Placement for FPGAs

Preclass Warmup. ESE535: Electronic Design Automation. Motivation (1) Today. Bisection Width. Motivation (2)

Academic Clustering and Placement Tools for Modern Field-Programmable Gate Array Architectures

ESE535: Electronic Design Automation. Today. LUT Mapping. Simplifying Structure. Preclass: Cover in 4-LUT? Preclass: Cover in 4-LUT?

A Path Based Algorithm for Timing Driven. Logic Replication in FPGA

GENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T. R.

Term Paper for EE 680 Computer Aided Design of Digital Systems I Timber Wolf Algorithm for Placement. Imran M. Rizvi John Antony K.

Genetic Algorithm for FPGA Placement

(Lec 14) Placement & Partitioning: Part III

Runtime and Quality Tradeoffs in FPGA Placement and Routing

Reference. Wayne Wolf, FPGA-Based System Design Pearson Education, N Krishna Prakash,, Amrita School of Engineering

AN 567: Quartus II Design Separation Flow

ICS 252 Introduction to Computer Design

An Interconnect-Centric Design Flow for Nanometer Technologies

Designing 3D Tree-based FPGA TSV Count Minimization. V. Pangracious, Z. Marrakchi, H. Mehrez UPMC Sorbonne University Paris VI, France

Reducing Power in an FPGA via Computer-Aided Design

PRESENTATION SLIDES FOR PUN-LOSP William Bricken July Interface as Overview

Hardware-Software Codesign

UNIVERSITY OF CALIFORNIA College of Engineering Department of Electrical Engineering and Computer Sciences Lab #2: Layout and Simulation

BACKEND DESIGN. Circuit Partitioning

Multilevel Global Placement With Congestion Control

RTL Coding General Concepts

160 M. Nadjarbashi, S.M. Fakhraie and A. Kaviani Figure 2. LUTB structure. each block-level track can be arbitrarily connected to each of 16 4-LUT inp

Parallel Simulated Annealing for VLSI Cell Placement Problem

Spiral 2-8. Cell Layout

AN ACCELERATOR FOR FPGA PLACEMENT

CS 331: Artificial Intelligence Local Search 1. Tough real-world problems

ASIC Physical Design Top-Level Chip Layout

SUBMITTED FOR PUBLICATION TO: IEEE TRANSACTIONS ON VLSI, DECEMBER 5, A Low-Power Field-Programmable Gate Array Routing Fabric.

Design Space Exploration Using Parameterized Cores

Unification of Partitioning, Placement and Floorplanning. Saurabh N. Adya, Shubhyant Chaturvedi, Jarrod A. Roy, David A. Papa, and Igor L.

EE582 Physical Design Automation of VLSI Circuits and Systems

TECHNOLOGY MAPPING FOR THE ATMEL FPGA CIRCUITS

Retiming & Pipelining over Global Interconnects

Digital VLSI Design. Lecture 7: Placement

Challenges of FPGA Physical Design

5. Quartus II Design Separation Flow

Large Scale Circuit Placement: Gap and Promise

Research Incubator: Combinatorial Optimization. Dr. Lixin Tao December 9, 2003

Automating Reconfiguration Chain Generation for SRL-Based Run-Time Reconfiguration

Physical Design Closure

Chapter 5 Global Routing

Routability-Driven Bump Assignment for Chip-Package Co-Design

ECE 5745 Complex Digital ASIC Design Topic 13: Physical Design Automation Algorithms

CS137: Electronic Design Automation

8. Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Review of the Robust K-means Algorithm and Comparison with Other Clustering Methods

SPEED AND AREA TRADE-OFFS IN CLUSTER-BASED FPGA ARCHITECTURES

CprE 583 Reconfigurable Computing

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University

Iterative-Constructive Standard Cell Placer for High Speed and Low Power

Best Practices for Incremental Compilation Partitions and Floorplan Assignments

Prog. Logic Devices Schematic-Based Design Flows CMPE 415. Designer could access symbols for logic gates and functions from a library.

UNIVERSITY OF CALGARY. Force-Directed Partitioning Technique for 3D IC. Aysa Fakheri Tabrizi A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

Hardware describing languages, high level tools and Synthesis

Automated Extraction of Physical Hierarchies for Performance Improvement on Programmable Logic Devices

Optimality, Scalability and Stability Study of Partitioning and Placement Algorithms

Faster Placer for Island-style FPGAs

A Routing Approach to Reduce Glitches in Low Power FPGAs

Mesh. Mesh Channels. Straight-forward Switching Requirements. Switch Delay. Total Switches. Total Switches. Total Switches?

Routing Path Reuse Maximization for Efficient NV-FPGA Reconfiguration

Fast Timing-driven Partitioning-based Placement for Island Style FPGAs

Standard Optimization Techniques

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Placement techniques for the physical synthesis of nanometer-scale integrated circuits

Graph Models for Global Routing: Grid Graph

TROUTE: A Reconfigurability-aware FPGA Router

Circuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk

CAD for a 3-Dimensional FPGA. Vikram Massachusetts Institute of Technology All rights reserved. AUG

Column Generation: Cutting Stock

Outline. Column Generation: Cutting Stock A very applied method. Introduction to Column Generation. Given an LP problem

An FPGA Design And Implementation Framework Combined With Commercial VLSI CADs

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Overview. CSE372 Digital Systems Organization and Design Lab. Hardware CAD. Two Types of Chips

OPTIMIZATION OF TRANSISTOR-LEVEL FLOORPLANS FOR FIELD-PROGRAMMABLE GATE ARRAYS. Ryan Fung. Supervisor: Jonathan Rose. April 2002

Exploiting Signal Flow and Logic Dependency in Standard Cell Placement

Transcription:

RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR An Introduction to FPGA Placement Yonghong Xu Supervisor: Dr. Khalid

RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Island FPGA Architecture Abstract logic block X Channel Segment Y Channel Segment Input Pin I/O block Output Pin

RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR An Abstract Logic Block Inputs 4-input LUT Clock D-FF Out I_3 I_2 I_1 I_4 O_3

IC Design Flow Specification Architectural design Logic design Physical design Partitioning Circuit design Placement Physical design Test/Fabrication Routing

Placement Definition Input: Netlist of logic blocks I/O pads Inter connections Output: Coordinates (x i, y i ) for each block. The total wire length is minimized. Also need to consider delay and routability.

An Example MCNC Benchmark circuit e64 (contains 230 4-LUT). Placed to a FPGA Random placement Final placement

Placement is NP-Complete NP (non-deterministic polynomial-time) the class NP consists of all those decision problems whose positive solutions can be verified in polynomial time given the right information, or equivalently, whose solution can be found in polynomial time on a non-deterministic machine NP-Complete basically means that there is no efficient way to compute a solution using a computer program Approximate solutions for NP-complete problems can be found using heuristic method.

Major Placement Techniques Partitioning-Based Placement Simulated Annealing Placement Quadratic Placement Hybrid and Hierarchical Placement

Partitioning-Based Placement Partitioning: Decomposition of a complex system into smaller subsystems Partitioning-based placement recursively apply min-cut partitioning to map the netlist into the layout region Cutsize: the number of nets not contained in just one side of the partition e.g. FM partitioning algorithm

Partitioning-Based Placement Pros: Open Cost Function(partitioning cost) Minimize net cut, edge cut etc. It is Move Based, Suitable to Timing Driven Placement Cons: Lots of indifferent moves May not work well with some cost functions Multi partitioning

Simulated Annealing Placement Simulated annealing algorithm mimics the annealing process used to gradually cool molten metal to produce high quality metal structures Initial placement improved by iterative swaps and moves Accept swaps if they improve the cost Accept swaps that degrade the cost under some probability conditions to prevent the algorithm from being trapped in a local minimum

Simulated Annealing Placement Pros: Open Cost Function Wire length cost Timing cost Can Reach Globally Optimal Solution given enough time Cons: Slow

Quadratic Placement Use squared wire length as objective function Minimize the objective function by solving linear equations Expand the design to entire chip area

Quadratic Placement x3 x1 x2 x4 Min [(x1-x3) x3) 2 + (x1-x2) x2) 2 + (x2-x4) x4) 2 ] : F δf/δx1 = 0; δf/δx2 = 0; Ax = B A = 2-1 -11 2 B = x3 x4 x = x1 x2

Quadratic Placement Pros: Very fast Can handle large design Cons: Minimize squared wire length while not the wire length Not Suitable for Simultaneous Optimization of Other Aspects of Physical Design Gives Trivial Solutions without Pads

Hybrid and Hierarchical Placement The scale of logic circuits grows fast Compile time grows exponential with the number of logic blocks Hierarchical methods are introduced to reduce the compile time in different hierarchies use different algorithms E.g. GORDIAN quadratic + partitioning Dragon: partitioning + Simulated Annealing

Timing Optimization To minimize the longest path delay or maximize the minimum slack Existing Algorithms Path based algorithms Net weighting algorithms

Path Based Algorithms Directly minimize the longest path delay Accurate timing view during optimization High computational cost Popular approaches: Explicitly reduce the maximum length of a set of paths; the set could be pre-computed or dynamically adjusted Mathematical programming by introducing auxiliary variables(arrival time, required time)

Net Weighting Algorithms Timing criticalities are translated into net weights; then compute a placement which minimized total weighted delay or wire length Edges have smaller slack are assigned higher weight Edges shared by more path are assigned higher weight Less accurate, less computational cost

Dominating Placement algorithm: VPR Simulated Annealing based Adaptive annealing schedule Use delay profile of target FPGA to evaluate the delay of each connection Trades off between circuit delay and wiring cost optimization

VPR Cost Function C = λ Time_ Cost Pr evious _ Time_ Cost + (1 λ) Wiring _ Cost Pr evious _ Wiring _ Cost Wiring _ Cost N = nets i= 1 q( i) [ bb i ( x) + bb i ( y)] Time _ cost = i, j circuit Time _ Cost( i, j) Time _ cost( i, j) = Delay( i, j) Criticality( i, j) criticality _ exponent

VPR Stopping Criteria Inner loop Outer loop InnerNum N blocks 4 / 3 T < 0.005* Cost / N nets

VPR Temperature Update T = α new T old Fraction of Moves Accepted (R accept ) Temperature Update Factor (α) R accept >0.96 0.5 0.8<R accept <=0.96 0.9 0.15<R accept <=0.8 0.95 R accept <=0.15 0.8

Thank you