AN ACCELERATOR FOR FPGA PLACEMENT
Pritha Banerjee and Susmita Sur-Kolay *

* Advanced Computing and Microelectronics Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, India. {pritha_r, ssk}@isical.ac.in. This work was funded by the Indo-French Centre for the Promotion of Advanced Research.

Abstract

In this paper, we propose a constructive heuristic for the initial placement of a given netlist of CLBs on an FPGA, in order to accelerate the iterative phase of placement in the context of reconfigurable computing. Experimental results show that our method yields a significantly lower cost than the initial placement of the popular tool VPR. We also observe that simulated annealing converges much faster when started from the proposed initial placement.

Keywords: FPGA, Placement

1. Introduction

Placement in an FPGA is the process of mapping a netlist of circuit blocks, each either an I/O block or a Configurable Logic Block (CLB), onto physical locations of what is essentially a two-dimensional array. Placement algorithms are broadly classified into constructive methods and iterative improvement techniques. Constructive methods build up a solution step by step, starting with a single unit. Iterative improvement algorithms start with a valid initial placement and repeatedly modify the configuration with the objective of reducing a cost such as delay, wire-length or area. Iterative algorithms produce good placements but require enormous computation time, which may depend on the initial configuration of the placement. Typically, many trials are performed with various initial solutions. The iterative phase may, however, be accelerated by starting with a good initial configuration.

In the context of reconfigurable co-processors, it is essential to reduce the time taken by the mapping and place-and-route stages without sacrificing the quality of the solution. From earlier works, we find that most FPGA placement algorithms are iterative. With this motivation, we propose in this paper a fast constructive placement algorithm for a given technology-mapped netlist of CLBs, realizing a given digital circuit, on an island-style FPGA.

2. Earlier Works on FPGA Placement

FPGA placement techniques can be categorized as force-directed methods, Tabu search, successive bi-partitioning or quad-partitioning, and clustering-based algorithms. There are many works on bi-partitioning and quad-partitioning techniques for placement [Takahashi (1995), Krupnova (1997)]. Work on clustering-based techniques is extensive in the literature [Lou (1998), Senouci (1998), Fang (1997), Tsay (1995)]. Quinn and Breuer proposed a force-directed constructive method by formulating a set of force equations [Quinn (1979)]. Eisenmann et al. used additional force equations to reduce cell overlaps [Eisenmann (1998)]. The algorithm by Raman et al. takes circuit performance into account as well [Raman (199)]. Betz et al. have developed a tool, VPR, that starts with a random placement and optimizes it using a linear congestion cost function in the framework of simulated annealing [Betz (1997)]. There are several genetic-algorithm-based techniques for FPGA placement [Cohoon (1986), Saab (1991), King (1989)]. Emmert and Bhatia developed a Tabu search approach for speeding up the placement and floorplanning steps of FPGA design [Emmert (1999)]. Mathur and Liu have developed a timing-driven iterative algorithm with alternating compression and relaxation phases for placement on regular architectures [Mathur (1997)].

In summary, most of the effective placement algorithms for FPGAs are based on stochastic iterative methods, which pay no heed to the quality of the initial solution or to its impact on the convergence time.

3. Problem Formulation

The FPGA placement problem can be formally defined as follows. Given a set of modules $M = \{m_1, m_2, \ldots, m_n\}$ and a set of signals $S = \{s_1, s_2, \ldots, s_q\}$, we associate each module $m_i \in M$ with a set of signals $S_{m_i}$, where $S_{m_i} \subseteq S$. Similarly, with each signal $s_i \in S$ we associate a set of modules $M_{s_i} = \{m_j \mid s_i \in S_{m_j}\}$, where $M_{s_i}$ is said to be a signal net. We are also given a set of locations $L = \{l_1, l_2, \ldots, l_p\}$, where $p \geq |M|$. The placement problem is to assign each module $m_i \in M$ to a unique location $l_j \in L$ such that the chosen objective function is optimized. For the case of mapping $m_i \in M$ to a regular two-dimensional array, each $l_j \in L$ is represented by a unique location $(x_j, y_j)$ on the surface of the array, where $x_j$ and $y_j$ are integers [Emmert (1999)].

4. Proposed Initial Placement Method

Our proposed approach is to place the netlist of CLBs on the FPGA using a novel constructive method, and then to use a simulated annealing framework to improve the configuration.
This paper presents our fast yet cost-effective constructive heuristic for placing the CLB netlist. The placement it produces also lets the iterative phase improve the configuration in fewer iterations.

4.1. Overview

Instead of randomly placing the entire netlist on the minimum-size square array that can hold all CLBs and primary I/Os, we place only the primary outputs randomly, on the boundary of that array. For the given circuit specified as a netlist, let us define a directed graph $D = \langle V, E \rangle$, where $V = \{v \mid v \text{ is either a CLB, a primary input (PI) or a primary output (PO)}\}$ and $E = \{\langle v_i, v_j \rangle \mid v_i \in \mathrm{fanin}(v_j) \text{ and } v_j \in \mathrm{fanout}(v_i)\}$. We define a cone for each primary output present in the netlist as follows. The cone of a primary output $O_i$, denoted by $f_i$, is the set consisting of $O_i$ and all its predecessors [Mathur (1997)]. In other words, $f_i = \mathrm{cone}(O_i) = \{u \mid \exists \text{ a simple directed path from } u \text{ to } O_i \text{ in } D\}$. The root of the cone is the primary output itself.

Let $C_l$ be a CLB at level $l$ (in breadth-first order) of a cone in $D$. The predecessors of $C_l$ within the cone are the CLBs and primary inputs in the fan-in of $C_l$. These blocks form the next level of the cone $f_i$. We continue tracing the input list of every block in breadth-first manner until we find no new CLBs or primary inputs for the cone $f_i$. Breadth-first traversal of the cone results in a tree structure. The leaves of this tree are primary inputs and blocks that have already been visited at one of the previous levels.

Having defined the output cone as above, we place the CLBs and primary inputs on the minimum-size two-dimensional square matrix by processing one output cone at a time. Starting from the root of the output cone $f_i$, we select for placement all the blocks $b_i \in B_l$, where $B_l$ is the set of blocks at level $l$. This process is continued until the base of the cone is reached. The CLB or primary input $b_i$ selected for placement is placed at the optimal position with respect to the current configuration. In order to determine this position, we define a bounding box for the block $b_i$ as a rectangular region containing all $b_j \in \mathrm{fanout}(b_i) \cup \mathrm{fanin}(b_i)$ that are placed at that instant, i.e., the fan-out CLBs of $b_i$ as well as the CLBs and primary inputs in the input list of $b_i$. We assign $b_i$ to the slot within the bounding box that results in the minimum wire-length of the nets in the bounding box. If there are no empty slots left within the bounding box, we extend the bounding box by a row or a column and place $b_i$ in the extended bounding box. This process continues for each of the CLBs and primary inputs present in the cone $f_i$. Placing the blocks of the output cone $f_i$ in level order keeps each block close to its already placed neighbours, so the total estimated wire-length is kept as close to the minimum as possible. We process all the output cones of the given netlist in this way, one by one.
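The level-order tracing of an output cone described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `trace_cone`, the dictionary-based `fanin` representation and the toy netlist are hypothetical and do not come from the paper.

```python
from collections import deque

def trace_cone(fanin, root):
    """Return the blocks of the cone rooted at `root`, grouped by breadth-first level.

    fanin -- dict mapping a block to the list of blocks driving it (assumed format)
    root  -- the primary output whose cone is traced
    """
    levels = []
    visited = {root}
    frontier = deque([root])
    while frontier:
        levels.append(list(frontier))
        next_frontier = deque()
        for block in frontier:
            # Primary inputs have no fan-in entry, so they are never expanded.
            for pred in fanin.get(block, ()):
                if pred not in visited:  # already visited blocks become leaves
                    visited.add(pred)
                    next_frontier.append(pred)
        frontier = next_frontier
    return levels

# Hypothetical toy netlist: one primary output O1, three CLBs, two primary inputs.
fanin = {
    "O1": ["c1"],
    "c1": ["c2", "c3"],
    "c2": ["i1", "i2"],
    "c3": ["c2", "i2"],
}
levels = trace_cone(fanin, "O1")
for depth, blocks in enumerate(levels):
    print(depth, blocks)
```

Each inner list corresponds to one level of the cone, so a placer would hand the blocks to its placement step in exactly this order. Blocks reached a second time (here `c2` and `i2` through `c3`) are not expanded again.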
This gives the initial placement configuration for the specified technology-mapped netlist, which is then passed as input to an iterative procedure for further improvement. The cost of our initial placement is measured as the sum of the semi-perimeter wire-lengths over all nets in the given netlist:

$$\text{Total wirelength} = \sum_{i=1}^{nets} \left( bbspan_x(net_i) + bbspan_y(net_i) \right)$$

where $bbspan_x(net_i)$ and $bbspan_y(net_i)$ are the horizontal and vertical spans of the bounding box of $net_i$, respectively. We have compared our result with the bounding box cost of VPR [Betz (1997)].

4.2. Initial Placement Algorithm

Structure Used: netlist { primary input list, primary output list, clb list, number of primary inputs, number of primary outputs, number of CLBs }

Initial Placement(technology-mapped netlist):
begin
    netlist ← read the netlist file
    clbmatrix ← generate a two dimensional array such that all CLBs and primary I/O can be placed in the square array
    configuration ← place primary outputs randomly on the boundary of clbmatrix
    for all O_i ∈ primary output do
        Q_1 ← φ
        Q_2 ← φ
        b_i ← fanin(O_i)
        configuration ← Place Block(configuration, b_i)
        for all b_j ∈ fanin(b_i): enqueue(Q_1, b_j)
        repeat
            b_i ← Trace Cone()
            configuration ← Place Block(configuration, b_i)
        until Q_1 is empty
    end for
end

Trace Cone():
begin
    b_i ← dequeue(Q_1)
    enqueue(Q_2, b_j) such that b_j ∈ fanin(b_i) and b_j not placed
    if (Q_1 is empty) then
        Q_1 ← Q_2    /* Start tracing next level in the cone */
        Q_2 ← φ
    end if
    return b_i
end

Place Block(configuration, b_i):
begin
    bbfanin(b_i) ← a square region in clbmatrix containing fanin(b_i) that are placed
    bbfanout(b_i) ← a square region in clbmatrix containing fanout(b_i) that are placed
    bb(b_i) ← a square region in clbmatrix defined by bbfanin(b_i) ∪ bbfanout(b_i)
    while b_i is not placed do
        p_i ∈ bb(b_i) such that p_i is an empty slot in the bounding box and
            wlength(fanin(b_i)) + wlength(fanout(b_i)) is minimum amongst all slots in bb(b_i), where
            wlength(fanin(b_i)) ← Σ over b_j ∈ fanin(b_i), b_j placed at p_j, of the Manhattan distance between p_i and p_j
            wlength(fanout(b_i)) ← semi-perimeter length of bb(b_k) such that b_k ∈ fanout(b_i)
        if no empty slot p_i found then
            bb(b_i) ← expanded bounding box by one row and one column on all sides
        else
            configuration ← b_i placed in clbmatrix at p_i
        end if
    end while
    return configuration
end

4.3. A Running Example

Let us consider an example netlist with 10 CLBs, 4 primary inputs and 3 primary outputs, given in Table 1. The cone $f_i$ for primary output o_1 is shown in Figure 1. A strikethrough index in the output cone indicates that the tree need not be grown any further from that block. For example, all the children of 16 in the cone, i.e., 9, 1, 3 and 2, are already placed in the previous levels. Indices that are primary inputs are not expanded at any level. Tracing of the algorithm is shown in Figure 2. The random placement of all primary outputs, as in VPR, is shown in Figure 2a. By tracing the cone of primary output o_1 we find 7 in the tree, whose bounding box and placement are shown in Figure 2b. Next we find 2 in the tree, and so on. Figure 2c shows the placement of all other blocks in the cone of o_1. Figure 2d shows the initial placement configuration after all the output cones of the given netlist are placed.
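The slot selection performed by Place Block in Section 4.2 can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it scores every free slot by the sum of Manhattan distances to all already placed fan-in and fan-out neighbours (the paper uses a semi-perimeter term for the fan-out side), and the names `place_block`, `grid_size`, `occupied` and `neighbours` are assumptions introduced here.

```python
def place_block(grid_size, occupied, neighbours):
    """Pick a free slot for a block, given the slots of its placed neighbours.

    grid_size  -- side length of the square array; slots are (x, y), 0-based
    occupied   -- set of slots that are already taken
    neighbours -- non-empty list of slots of placed fan-in/fan-out blocks
    """
    xs = [x for x, _ in neighbours]
    ys = [y for _, y in neighbours]
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)

    def cost(slot):
        # Simplification: Manhattan distance to every placed neighbour.
        return sum(abs(slot[0] - nx) + abs(slot[1] - ny) for nx, ny in neighbours)

    while True:
        candidates = [
            (x, y)
            for x in range(x_lo, x_hi + 1)
            for y in range(y_lo, y_hi + 1)
            if (x, y) not in occupied
        ]
        if candidates:
            return min(candidates, key=cost)  # ties are broken arbitrarily
        if (x_lo, y_lo, x_hi, y_hi) == (0, 0, grid_size - 1, grid_size - 1):
            raise ValueError("no free slot left in the array")
        # No empty slot: grow the bounding box by one row/column on every side,
        # clipped to the array boundary.
        x_lo, y_lo = max(x_lo - 1, 0), max(y_lo - 1, 0)
        x_hi, y_hi = min(x_hi + 1, grid_size - 1), min(y_hi + 1, grid_size - 1)

# Two placed neighbours on the bottom row of a hypothetical 4x4 array.
slot = place_block(4, {(0, 0), (2, 0)}, [(0, 0), (2, 0)])
print(slot)
```

In this toy call the bounding box of the two neighbours contains a single free slot between them, so that slot is chosen without any expansion. When the box is full, the loop widens it until a free slot appears or the whole array has been searched.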
Table 1: An Example Netlist

Primary Input
Name    Index   Fanout
i_9     0       12, 15, 8, 13, 10
i_7     2       12, 9, 15, 16, 7
i_              , 11, 1
i_8     3       9, 16, 10

Primary Output
Name    Index   Input
o_1     7
o_
o_

CLB
Name    Index   Input           Fanout
o_1_    7       2, 8, 9
n3      8       11, 16, 0       12, 15, 1, 7
n6      9       15, 3, 2        16, 1, 7
o_2_    10      3, 11, 12, 0    5
n2      11      1, 1            12, 8, 10
n               , 2, 0, 8       10
o_0_    13      0, 1, 15        6
n1      1       1, 9, 16, 8     11, 13
n5      15      8, 0, 2         9, 13
n       16      9, 1, 3, 2      8,

Figure 1: Tracing the cone for a primary output o_1

5. Experimental Results

We have compared the experimental results of our initial placement algorithm with those of VPR, which places the CLBs and I/Os randomly on the minimum-size square array of slots. We placed the output pins randomly, as is done in VPR, and then followed the procedure described in Section 4 to place the remaining CLBs and primary inputs. We tested the initial placement algorithm on the same MCNC benchmark circuits used by VPR; the results obtained are shown in Table 2. For all the benchmark circuits, our initial placement method gives a better result with respect to the bounding box cost, as defined in Section 4.1, at the initial placement stage.
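The bounding box cost used in this comparison, i.e. the sum of semi-perimeter wire-lengths over all nets from Section 4.1, can be computed as in the following Python sketch. The function names and the toy placement are hypothetical, and the sketch implements only the plain sum given in the paper.

```python
def semi_perimeter(pins):
    """Half-perimeter of the bounding box enclosing a list of (x, y) pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def total_wirelength(nets, position):
    """Sum of bbspan_x + bbspan_y over all nets.

    nets     -- list of nets, each a list of block names (assumed format)
    position -- dict mapping a block name to its (x, y) slot
    """
    return sum(semi_perimeter([position[b] for b in net]) for net in nets)

# Hypothetical placement of three blocks and two nets.
position = {"a": (0, 0), "b": (2, 1), "c": (1, 3)}
nets = [["a", "b"], ["a", "b", "c"]]
cost = total_wirelength(nets, position)
print(cost)  # (2 + 1) + (2 + 3) = 8
```

The first net spans 2 columns and 1 row, the second 2 columns and 3 rows, which gives the total of 8 printed above.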
[Figure 2: Steps of our algorithm. Panels: (a) after the random placement of all the primary outputs; (b) after placing block 7; (c) after placing the remaining blocks of the cone; (d) after placing all output cones. Legend: slots reserved for primary I/Os.]

Table 2: Experimental Results of Initial Placement

Columns: MCNC Benchmark; Matrix Dimension; Our Initial Placement (Total Wire-length, BB Cost (VPR)); Initial Placement of VPR * (Initial BB Cost)

Benchmarks: alu4.net, apex2.net, apex4.net, ex1010.net, ex5p.net, pdc.net, seq.net, spla.net

* Calculation of Total Wire-length for Initial Placement of VPR can be obtained easily
Table 3: Experimental Results of Final Placement

Columns: MCNC Benchmark; Our Initial Placement + VPR (Final BB Cost, # of Iterations, CPU Time (min:sec)); Placement of VPR (Final BB Cost, # of Iterations, CPU Time (min:sec))

alu4.net      : :19
apex2.net     : :21
apex4.net     181 1: :35
ex1010.net    : :20
ex5p.net      : :00
pdc.net       : :27
seq.net       : :35
spla.net      : :15

Table 3 compares the final placement cost, the CPU time and the number of iterations in the iterative phase. For all the circuits, we achieve a reduction in both the number of iterations and the CPU time, with a bounding box cost very close to optimal.

6. Conclusion and Future Directions

We have presented a constructive initial placement algorithm for placing a netlist on an FPGA, in order to accelerate the final placement phase. The netlist is placed by tracing the output cones present in it. At every step, a block is assigned to an optimal position with respect to the current configuration. Our results show an improvement at the initial placement stage when compared with the initial placement of VPR. We also observed that VPR converges faster when given our initial placement configuration.

In future work, we will consider the effect of ordering the output cones according to critical paths. We also expect further acceleration from taking the overlap of cones into account. Another area of improvement is eliminating the re-calculation of the best position by reusing information. Last but not least, an appropriate iterative technique that focuses on moves based on neighbourhood parameters, so that it converges very quickly to an optimal solution, is being developed.

References

Emmert, J. M., Blanacha, S., and Bhatia, D. K. (1999), Physical Layout Techniques for Field Programmable Gate Arrays, invited paper, IEEE, ACM, SIGDA Design and Test Workshop.

Quinn, Jr., N. R., and Breuer, M. A. (1979), A Force Directed Component Placement Procedure for Printed Circuit Boards, IEEE Trans. on Circuits and Systems, Vol. CAS-26, No. 6, June.

Eisenmann, H., and Johannes, F. M. (1998), Generic Global Placement and Floorplanning, In Proceedings of the 35th Design Automation Conference.

Raman, S., Liu, C. L., and Jones, L. G. (199), Timing-Constrained FPGA Placement: A Force-Directed Formulation & Its Performance Evaluation, VLSI Design: An International Journal of Custom Chip Design, Simulation, and Testing.

Betz, V., and Rose, J. (1997), VPR: A New Packing, Placement and Routing Tool for FPGA Research, In 7th International Workshop on Field-Programmable Logic and Applications.

Cohoon, J. P., and Parris, W. D. (199), Genetic Placement, In Proceedings of the International Conference on Computer-Aided Design.

Saab, Y. G., and Rao, V. B. (1991), Combinatorial Optimization by Stochastic Evolution, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 10, April.

King, R. M., and Banerjee, P. (1989), ESP: Placement by Simulated Evolution, IEEE Trans. on Computer-Aided Design, Vol. 8, March.

Emmert, J. M., and Bhatia, D. K. (1999), Fast Timing Driven Placement Using Tabu Search, In IEEE International Symposium on Circuits and Systems, May.

Takahashi, K., Nakajima, K., Terai, M., and Sato, K. (1995), Min-Cut Placement with Global Objective Functions for Large Scale Sea-of-Gates Arrays, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 14, April.

Krupnova, H., Rabedaoro, R., and Saucier, G. (1997), Synthesis and Floorplanning for Large Hierarchical FPGAs, In ACM/SIGDA International Symposium on Field-Programmable Gate Arrays.

Lou, J., Salek, A. H., and Pedram, M. (1998), An Integrated Flow for Technology Remapping and Placement of Subhalf-micron Circuits, In Asia-South Pacific Design Automation Conference Proceedings.

Senouci, S. A., Amoura, A., Krupnova, H., and Saucier, G. (1995), Timing Driven Floorplanning on Programmable Hierarchical Targets, International Symposium on Field Programmable Gate Arrays.

Fang, W. J., and Wu, A. C-H. (1997), Multi-Way FPGA Partitioning by Fully Exploiting Design Hierarchy, In Proceedings of the 34th Design Automation Conference.

Tsay, Y., and Lin, Y. (1995), A Row-Based Cell Placement Method that Utilizes Circuit Structural Properties, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 14, March.

Mathur, A., and Liu, C. L. (1997), Compression-Relaxation: A New Approach to Timing-Driven Placement for Regular Architectures, IEEE Trans. on CAD of Integrated Circuits and Systems, Vol. 16, No. 6, June.
Faster Placer for Island-style FPGAs
Faster Placer for Island-style FPGAs Pritha Banerjee and Susmita Sur-Kolay Advanced Computing and Microelectronics Unit Indian Statistical Institute 0 B. T. Road, Kolkata, India email:{pritha r, ssk}@isical.ac.in
More informationIntroduction. A very important step in physical design cycle. It is the process of arranging a set of modules on the layout surface.
Placement Introduction A very important step in physical design cycle. A poor placement requires larger area. Also results in performance degradation. It is the process of arranging a set of modules on
More informationPlacement Algorithm for FPGA Circuits
Placement Algorithm for FPGA Circuits ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,
More informationGenetic Algorithm for FPGA Placement
Genetic Algorithm for FPGA Placement Zoltan Baruch, Octavian Creţ, and Horia Giurgiu Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,
More informationGENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T. R.
GENETIC ALGORITHM BASED FPGA PLACEMENT ON GPU SUNDAR SRINIVASAN SENTHILKUMAR T R FPGA PLACEMENT PROBLEM Input A technology mapped netlist of Configurable Logic Blocks (CLB) realizing a given circuit Output
More informationBasic Idea. The routing problem is typically solved using a twostep
Global Routing Basic Idea The routing problem is typically solved using a twostep approach: Global Routing Define the routing regions. Generate a tentative route for each net. Each net is assigned to a
More informationEstimation of Wirelength
Placement The process of arranging the circuit components on a layout surface. Inputs: A set of fixed modules, a netlist. Goal: Find the best position for each module on the chip according to appropriate
More informationCAD Algorithms. Placement and Floorplanning
CAD Algorithms Placement Mohammad Tehranipoor ECE Department 4 November 2008 1 Placement and Floorplanning Layout maps the structural representation of circuit into a physical representation Physical representation:
More informationVery Large Scale Integration (VLSI)
Very Large Scale Integration (VLSI) Lecture 6 Dr. Ahmed H. Madian Ah_madian@hotmail.com Dr. Ahmed H. Madian-VLSI 1 Contents FPGA Technology Programmable logic Cell (PLC) Mux-based cells Look up table PLA
More informationAn Introduction to FPGA Placement. Yonghong Xu Supervisor: Dr. Khalid
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR An Introduction to FPGA Placement Yonghong Xu Supervisor: Dr. Khalid RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
More informationAn Enhanced Perturbing Algorithm for Floorplan Design Using the O-tree Representation*
An Enhanced Perturbing Algorithm for Floorplan Design Using the O-tree Representation* Yingxin Pang Dept.ofCSE Univ. of California, San Diego La Jolla, CA 92093 ypang@cs.ucsd.edu Chung-Kuan Cheng Dept.ofCSE
More informationNon-Rectangular Shaping and Sizing of Soft Modules for Floorplan Design Improvement
Non-Rectangular Shaping and Sizing of Soft Modules for Floorplan Design Improvement Chris C.N. Chu and Evangeline F.Y. Young Abstract Many previous works on floorplanning with non-rectangular modules [,,,,,,,,,,,
More informationHow Much Logic Should Go in an FPGA Logic Block?
How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca
More informationParallel Simulated Annealing for VLSI Cell Placement Problem
Parallel Simulated Annealing for VLSI Cell Placement Problem Atanu Roy Karthik Ganesan Pillai Department Computer Science Montana State University Bozeman {atanu.roy, k.ganeshanpillai}@cs.montana.edu VLSI
More informationA Novel Net Weighting Algorithm for Timing-Driven Placement
A Novel Net Weighting Algorithm for Timing-Driven Placement Tim (Tianming) Kong Aplus Design Technologies, Inc. 10850 Wilshire Blvd., Suite #370 Los Angeles, CA 90024 Abstract Net weighting for timing-driven
More informationExploiting Signal Flow and Logic Dependency in Standard Cell Placement
Exploiting Signal Flow and Logic Dependency in Standard Cell Placement Jason Cong and Dongmin Xu Computer Sci. Dept., UCLA, Los Angeles, CA 90024 Abstract -- Most existing placement algorithms consider
More informationA Path Based Algorithm for Timing Driven. Logic Replication in FPGA
A Path Based Algorithm for Timing Driven Logic Replication in FPGA By Giancarlo Beraudo B.S., Politecnico di Torino, Torino, 2001 THESIS Submitted as partial fulfillment of the requirements for the degree
More informationAbstraction and Optimization of Consistent Floorplanning with Pillar Block Constraints
Abstraction and Optimization of Consistent Floorplanning with Pillar Block Constraints Ning FU, Shigetoshi NAKATAKE, Yasuhiro TAKASHIMA, Yoji KAJITANI School of Environmental Engineering, University of
More informationICS 252 Introduction to Computer Design
ICS 252 Introduction to Computer Design Placement Fall 2007 Eli Bozorgzadeh Computer Science Department-UCI References and Copyright Textbooks referred (none required) [Mic94] G. De Micheli Synthesis and
More informationGenetic Placement: Genie Algorithm Way Sern Shong ECE556 Final Project Fall 2004
Genetic Placement: Genie Algorithm Way Sern Shong ECE556 Final Project Fall 2004 Introduction Overview One of the principle problems in VLSI chip design is the layout problem. The layout problem is complex
More informationL14 - Placement and Routing
L14 - Placement and Routing Ajay Joshi Massachusetts Institute of Technology RTL design flow HDL RTL Synthesis manual design Library/ module generators netlist Logic optimization a b 0 1 s d clk q netlist
More informationSynthesizable FPGA Fabrics Targetable by the VTR CAD Tool
Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design
More informationIntroduction VLSI PHYSICAL DESIGN AUTOMATION
VLSI PHYSICAL DESIGN AUTOMATION PROF. INDRANIL SENGUPTA DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING Introduction Main steps in VLSI physical design 1. Partitioning and Floorplanning l 2. Placement 3.
More informationAn LP-based Methodology for Improved Timing-Driven Placement
An LP-based Methodology for Improved Timing-Driven Placement Qingzhou (Ben) Wang, John Lillis and Shubhankar Sanyal Department of Computer Science University of Illinois at Chicago Chicago, IL 60607 {qwang,
More informationFloorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence
Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,
More informationFast Timing-driven Partitioning-based Placement for Island Style FPGAs
.1 Fast Timing-driven Partitioning-based Placement for Island Style FPGAs Pongstorn Maidee Cristinel Ababei Kia Bazargan Electrical and Computer Engineering Department University of Minnesota, Minneapolis,
More informationConstraint-Driven Floorplanning based on Genetic Algorithm
Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 147 Constraint-Driven Floorplanning based on Genetic Algorithm
More informationSatisfiability Modulo Theory based Methodology for Floorplanning in VLSI Circuits
Satisfiability Modulo Theory based Methodology for Floorplanning in VLSI Circuits Suchandra Banerjee Anand Ratna Suchismita Roy mailnmeetsuchandra@gmail.com pacific.anand17@hotmail.com suchismita27@yahoo.com
More informationFloorplanning in Modern FPGAs
Floorplanning in Modern FPGAs Pritha Banerjee, Susmita Sur-Kolay Advanced Computing and Microelectronics Unit Indian Statistical Institute 23 B. T. Road, Kolkata, India {pritha r,ssk}@isical.ac.in Arijit
More informationCircuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk
Circuit Placement: 2000-Caldwell,Kahng,Markov; 2002-Kennings,Markov; 2006-Kennings,Vorwerk Andrew A. Kennings, Univ. of Waterloo, Canada, http://gibbon.uwaterloo.ca/ akenning/ Igor L. Markov, Univ. of
More informationToward More Efficient Annealing-Based Placement for Heterogeneous FPGAs. Yingxuan Liu
Toward More Efficient Annealing-Based Placement for Heterogeneous FPGAs by Yingxuan Liu A thesis submitted in conformity with the requirements for the degree of Master of Applied Science Graduate Department
More informationCAD Flow for FPGAs Introduction
CAD Flow for FPGAs Introduction What is EDA? o EDA Electronic Design Automation or (CAD) o Methodologies, algorithms and tools, which assist and automatethe design, verification, and testing of electronic
More informationGraph Models for Global Routing: Grid Graph
Graph Models for Global Routing: Grid Graph Each cell is represented by a vertex. Two vertices are joined by an edge if the corresponding cells are adjacent to each other. The occupied cells are represented
More informationARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs
ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE ARCHITECTURE AND CAD FOR DEEP-SUBMICRON FPGAs Vaughn Betz Jonathan Rose Alexander Marquardt
More informationRALP:Reconvergence-Aware Layer Partitioning For 3D FPGAs*
RALP:Reconvergence-Aware Layer Partitioning For 3D s* Qingyu Liu 1, Yuchun Ma 1, Yu Wang 2, Wayne Luk 3, Jinian Bian 1 1 Department of Computer Science and Technology, Tsinghua University, Beijing, China
More informationOn Improving Recursive Bipartitioning-Based Placement
Purdue University Purdue e-pubs ECE Technical Reports Electrical and Computer Engineering 12-1-2003 On Improving Recursive Bipartitioning-Based Placement Chen Li Cheng-Kok Koh Follow this and additional
More informationGenetic Algorithm for Circuit Partitioning
Genetic Algorithm for Circuit Partitioning ZOLTAN BARUCH, OCTAVIAN CREŢ, KALMAN PUSZTAI Computer Science Department, Technical University of Cluj-Napoca, 26, Bariţiu St., 3400 Cluj-Napoca, Romania {Zoltan.Baruch,
More informationPlace and Route for FPGAs
Place and Route for FPGAs 1 FPGA CAD Flow Circuit description (VHDL, schematic,...) Synthesize to logic blocks Place logic blocks in FPGA Physical design Route connections between logic blocks FPGA programming
More informationCAD Algorithms. Circuit Partitioning
CAD Algorithms Partitioning Mohammad Tehranipoor ECE Department 13 October 2008 1 Circuit Partitioning Partitioning: The process of decomposing a circuit/system into smaller subcircuits/subsystems, which
More informationCongestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction
Congestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction David Yeager Darius Chiu Guy Lemieux Dept of Electrical and Computer Engineering The University of British Columbia
More informationMultilayer Routing on Multichip Modules
Multilayer Routing on Multichip Modules ECE 1387F CAD for Digital Circuit Synthesis and Layout Professor Rose Friday, December 24, 1999. David Tam (2332 words, not counting title page and reference section)
More informationFast FPGA Routing Approach Using Stochestic Architecture
. Fast FPGA Routing Approach Using Stochestic Architecture MITESH GURJAR 1, NAYAN PATEL 2 1 M.E. Student, VLSI and Embedded System Design, GTU PG School, Ahmedabad, Gujarat, India. 2 Professor, Sabar Institute
More informationA Routing Method based on Nearest Via Assignment for 2-Layer Ball Grid Array Packages
A Routing Method based on Nearest Via Assignment for 2-Layer Ball Grid Array Packages Yoshiaki KURATA Yoichi TOMIOKA Yukihide KOHIRA Atsushi TAKAHASHI Tokyo Institute of Technology Dept. of Communications
More informationDetailed Router for 3D FPGA using Sequential and Simultaneous Approach
Detailed Router for 3D FPGA using Sequential and Simultaneous Approach Ashokkumar A, Dr. Niranjan N Chiplunkar, Vinay S Abstract The Auction Based methodology for routing of 3D FPGA (Field Programmable
More informationCorolla: GPU-Accelerated FPGA Routing Based on Subgraph Dynamic Expansion