Delay-Bounded Routing for Shadow Registers
|
|
- Shanon Mosley
- 5 years ago
- Views:
Transcription
1 Delay-Bounded Routing for Shadow isters Eddie Hung, Josh Leine, Ed Stott, George Constantinides, Wayne Luk {e.hung,josh.leine,ed.stott, Department of Computing Department of Electrical and Electronic Engineering Imperial College London, UK FPGA15 :: 23 Feb '15
2 In a nutshell, this is about... More accurate on-chip timing measurements By inserting instruments precisely clk User Logic & Routing User 2
3 In a nutshell, this is about... More accurate on-chip timing measurements By inserting instruments precisely clk User Logic & Routing User Instrument 3
4 In a nutshell, this is about... More accurate on-chip timing measurements By inserting instruments precisely clk User Logic & Routing User Instrument In particular, we want to bound this routing delay 4
5 In a nutshell, this is about... More accurate on-chip timing measurements By inserting instruments precisely User Logic & Routing User In particular, we want to bound this routing delay clk Instrument across 1000s of instruments 5
6 In a nutshell, this is about... More accurate on-chip timing measurements By inserting instruments precisely User Logic & Routing User In particular, we want to bound this routing delay clk Instrument across 1000s of instruments Most CAD focusses on minimising (or maximising) We place and route instruments with bounded delay Key result: Insertion within 200ps ag. error 6
7 Introduction Why do we care about on-chip timing? Because of ariation, endors forced to proision for all possible scenarios Fmax is worst-case path, at worst-case timing corner Can we safely operate better than worst-case? Why operate at the worst corner if silicon is rarely that slow, and if critical-path is rarely (neer) exercised? 7
8 Main Contributions Proposal for inserting shadow registers post place-and-route, using spare resources Does not disrupt original circuit timing Modification to Dijkstra's algorithm for minimum cost (delay) bounds Experimental ealuation on commercial architecture Achiee aerage bound error of 200ps 8
9 Introduction How can on-chip timing instruments help? Failure prediction Is my FPGA showing signs of being close to failing? Error detection Safe oerclocking (e.g. Razor) Slack measurement How much leeway do I hae before failing? All possible using shadow registers 9
10 Background: Shadow isters clk User Critical Path Logic & Routing User User circuit (compiled using regular tools, and locked down) 10
11 Background: Shadow isters clk User Critical Path Logic & Routing User User circuit (compiled using regular tools, and locked down) φ sclk Shad. Timing instruments (built out of spare resources only) Duplicates a critical path endpoint, but with a phase-shifted clock 11
12 Background: Shadow isters clk User Critical Path Logic & Routing User Mismatch indicates a timing failure φ sclk Shad. Duplicates a critical path endpoint, but with a phase-shifted clock 12
13 Background: Shadow isters Critical Path User Logic & Routing User clk φ sclk Shad. Focus: controlling this delay Duplicates a critical path endpoint, but with a phase-shifted clock 13
14 Challenge: Consistent Skew Tuser Tskew User Logic & Routing User clk sclk Tshadow Shad. For one shadow reg, we can cancel out any path skew using a phase offset 14
15 Challenge: Consistent Skew Tskew1 Tskew2 clk User User Logic & Routing Logic & Routing User User Shad. Shad. sclk φskew = {Tskew1,Tskew2} 15
16 Challenge: Consistent Skew Tskew Tskew clk User User Logic & Routing Logic & Routing User User Shad. sclk Shad. φskew = Tskew By placing and routing these shadow registers carefully, target consistent skew 16
17 Existing Work Existing CAD tools already do bounded routing Minimum delay: Hold time minus Clock-to-Q Maximum delay: Clock period minus Setup time Algorithms target a relatiely big landing window: User Signal Tcrit clk Hold Setup Acceptable 17
18 Challenge: Consistent Skew We desire a ery narrow window (ideally, 1ps) Hard problem, but aided by two characteristics: Freedom to place shadow register on any spare site Freedom to route to any site using any spare resources Soled simultaneously by routing to all sites clk User Logic & Routing User Free Site Free Site Free Site Free Site 18
19 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra Delay cost of edge (wire) 2 s 3 2 t Edge cost = 1 unless labelled 19
20 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra 2 s 3 2 t u Visited Vertex Queued Vertex Unseen Vertex 20
21 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra 2 2 s 3 t u Visited Vertex Queued Vertex Unseen Vertex 21
22 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra 2 2 s 3 t u Visited Vertex Queued Vertex Unseen Vertex 22
23 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra u s t u Visited Vertex Queued Vertex Unseen Vertex Cost = 4 23
24 Problem FPGA routing is a case of graph search Finding shortest path has long been soled: Dijkstra But how can we find not-so-short paths? 24
25 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist u s t u Visited Vertex Queued Vertex Unseen Vertex Cost = 4 25
26 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist Heuristic decides to erase this edge u s t u Visited Vertex Queued Vertex Unseen Vertex Cost = 4 26
27 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist 1. Inalidate all children of this edge u s t u Visited Vertex Queued Vertex Unseen Vertex Cost = 4 27
28 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist 1. Inalidate all children of this edge 2 s 3 2 t u Visited Vertex Queued Vertex Unseen Vertex 28
29 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist 2 s 3 2 t u Visited Vertex Queued Vertex Unseen Vertex 2. Repair priority ueue (fix costs of those nodes that would hae been isited by another edge) 29
30 Proposal: Dijkstra with Rollback Summary: if path found is too short, undo search and rerun as if a prior edge did not exist 2 2 s 3 t u Visited Vertex Queued Vertex Unseen Vertex Old Cost = 4 New Cost = 6 30
31 Proposal: Dijkstra with Rollback Big picture: disjoint paths from each s to any t 2 t ttt s s s 3 Millions 2 of nodes and edges t ttt t t t t 1000s of endpoints 10s of thousands of spare registers 31
32 Proposal: Dijkstra with Rollback Not dissimilar to Yen's K-shortest path algorithm Guaranteed to find the shortest, and K-1 next shortest, paths Essence: remoe edge(s) of prior shortest paths, restart Dijkstra, then replace edge(s) and retry Our heuristic: Remoes each edge permanently Repair and continue, instead of restarting Dijkstra No guarantee of finding next shortest, but much faster 32
33 Proposal: Dijkstra with Rollback We propose three different algorithms: For each net For each net For all nets Route to closest site Route to any site Route to any site No Meets min bound? Yes No Meets min bound? Yes No Meets min bound? Yes One-Closest One-All All-All 33
34 Proposal: Dijkstra with Rollback We propose three different algorithms: For each net For each net For all nets Route to closest site Route to any site Route to any site No Meets min bound? Yes No Meets min bound? Yes No Meets min bound? Yes One-Closest One-All All-All Full details are in our paper! 34
35 Experimental Flow Ealuated on Xilinx, but applicable to others Place and route using ISE, apply our shadow register tool, then re-analyse using endor STA Experimented on three large benchmarks: LEON3: an 8-core system-on-chip AES-x3: a chain of 3 encoders then 3 decoders JPEG-x2: two parallel instances of CHStone JPEG synthesised using Viado HLS All occupying ~90% of mid-range Virtex6 35
36 Results LEON3 Critical endpoints with <10% slack (1424) Baseline (STA) Delay (ns) Critical Paths (by ascending slack) 36
37 Results LEON3 Critical endpoints with <10% slack (1424) With 1417 shadow registers at >1ns skew target Shadow (STA) Baseline (STA) Delay (ns) Critical Paths (by ascending slack) 37
38 Results LEON3 Critical endpoints with <10% slack (1424) With 1417 shadow registers at >1ns skew target Shadow (STA) Baseline (STA) Delay (ns) Skew (ns) Critical Paths (by ascending slack) Skew (STA) 38
39 Results LEON3 Critical endpoints with <10% slack (1424) With 1417 shadow registers at >1ns skew target Shadow (STA) Baseline (STA) Delay (ns) Skew (ns) Critical Paths (by ascending slack) Skew (STA) Ag. Error 39
40 Results LEON3 Aerage skew error of 200ps (from Xilinx STA) Due to incomplete timing information Based on our delay model, we achiee <1ps error! 40
41 Results LEON3 Aerage skew error of 200ps (from Xilinx STA) Due to incomplete timing information Based on our delay model, we achiee <1ps error! Why so good? Many spare registers (total: 300K, used: 62K, spare: 78K) Millions of wires (nodes), 10s of millions of switches (edges) compounding with up to 7 ways (with differing delays) to get to each: 6LUT A1:A6 AX 5LUT 5LUT FF FF 41
42 More in the paper... Experimental results for: Three different algorithms on three benchmarks Different number of steps to roll back Different alues for skew target Analysis on why not all nets could be shadowed 42
43 More in the paper... Experimental results for: Three different algorithms on three benchmarks Different number of steps to roll back Different alues for skew target Analysis on why not all nets could be shadowed Future work: insert downstream infrastructure Comparators, counters, recoery logic, etc. Opportunities: latency-insensitie, reduction operation 43
44 Conclusion Shadow registers can help detect, measure and react to physical imperfections But need to be inserted precisely: consistent skew 44
45 Conclusion Shadow registers can help detect, measure and react to physical imperfections But need to be inserted precisely: consistent skew Our proposal: insert post place and route, with a minimum delay bound Using just leftoer resources no timing impact Bounding achieed by Dijkstra with Rollback Key result: achiee 200ps aerage skew error With more timing information, may do een better! 45
46 Backup 46
47 Introduction Shadow register clock phase φ Failure prediction: User Signal clk sclk1 φ<0 Shorter period Tcrit 47
48 Introduction Shadow register clock phase φ User Signal Tcrit clk Failure prediction: Error detection: sclk1 φ<0 sclk2 φ>0 Shorter period Longer period 48
49 Introduction Shadow register clock phase φ Failure prediction: Error detection: Slack measurement: User Signal clk sclk1 φ<0 sclk2 φ>0 sclk3 sweep φ Shorter period Tcrit Longer period 49
50 Case for Post Place and Route We insert shadow instruments entirely post P&R By locking down all parts of existing user circuit... and using spare, leftoer, resources only This preseres timing of original user circuit Barring one extra switch load: ~3ps 50
51 Case for Post Place and Route Compile Timing Analysis List of critical endpoints 51
52 Case for Post Place and Route Compile Timing Analysis List of critical endpoints Instrument at RTL 52
53 Case for Post Place and Route Re-compile Timing Analysis Different list of critical endpoints Instrument at RTL 53
54 Case for Post Place and Route For LEON3, shadowing top 10% critical nets: 1500 Critical Signals Baseline Source Post P&R Total Shadowed Instrument critical nets at source-leel, then recompile Of what was critical initially, only ~50% remains critical after recompiling! 54
55 Case for Post Place and Route For LEON3, shadowing top 10% critical nets: 1500 Critical Signals Baseline Source Post P&R Total Shadowed Instrument critical nets post place and route, using spare resources only 55
56 Dijkstra with Rollback Profile ns target 800 Skew Rollback Count
57 Yen's K-shortest Profile ns target 800 Skew K-th Shortest Path 57
58 Yen's K-shortest Profile ns target 800 Skew K-th Shortest Path 58
59 Benchmark Summary 59
60 Experimental Flow Ealuated on Xilinx, but applicable to any endor HDL Xilinx ISE xst Synthesis map Pack & Place par Route.ncd trce STA bitgen.bit ular FPGA flow 60
61 Experimental Flow Ealuated on Xilinx, but applicable to any endor HDL * Xilinx ISE xst Synthesis map Pack & Place par Route.ncd trce STA Timing Report.twr.xdl Placed-and-routed netlist ncd2xdl Delay-Bounded Routing Tool Simple Wire Delay Model.xdl bitgen.bit ular FPGA flow Shadow ister Extension 61
62 Experimental Flow Ealuated on Xilinx, but applicable to any endor HDL * Xilinx ISE xst Synthesis map Pack & Place par Route.ncd bitgen trce STA Timing Report.twr.xdl Placed-and-routed netlist ncd2xdl trce STA Delay-Bounded Routing Tool Simple Wire Delay Model.xdl xdl2ncd (with DRC).ncd par Clock Route Only.bit ular FPGA flow Shadow ister Extension 62
63 Experimental Results Unbounded (place closest reg, Xilinx PAR): Shadow (STA) Baseline (STA) Delay (ns) Critical Paths (by ascending slack) Clock skew (STA) Data skew (STA) Skew (ns)
64 Experimental Results >1ns on LEON3: 64
65 Experimental Results >1ns on LEON3: 65
66 Experimental Results >1ns on AES-x3: 66
67 Experimental Results >1ns on JPEG-x2: 67
68 Experimental Results Reasons behind failure i) and ii) reuire user circuit to be ripped up Only iii) could be soled by better algorithms 68
IDEA! Avnet SpeedWay Design Workshop
The essence of FPGA technology IDEA! 2 ISE Tool Flow Overview Design Entry Synthesis Constraints Synthesis Simulation Implementation Constraints Floor-Planning Translate Map Place & Route Timing Analysis
More informationGraduate Institute of Electronics Engineering, NTU FPGA Design with Xilinx ISE
FPGA Design with Xilinx ISE Presenter: Shu-yen Lin Advisor: Prof. An-Yeu Wu 2005/6/6 ACCESS IC LAB Outline Concepts of Xilinx FPGA Xilinx FPGA Architecture Introduction to ISE Code Generator Constraints
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationDRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric
DRAF: A Low-Power DRAM-based Reconfigurable Acceleration Fabric Mingyu Gao, Christina Delimitrou, Dimin Niu, Krishna Malladi, Hongzhong Zheng, Bob Brennan, Christos Kozyrakis ISCA June 22, 2016 FPGA-Based
More informationVerilog for High Performance
Verilog for High Performance Course Description This course provides all necessary theoretical and practical know-how to write synthesizable HDL code through Verilog standard language. The course goes
More informationLab 3 Sequential Logic for Synthesis. FPGA Design Flow.
Lab 3 Sequential Logic for Synthesis. FPGA Design Flow. Task 1 Part 1 Develop a VHDL description of a Debouncer specified below. The following diagram shows the interface of the Debouncer. The following
More informationECE 4514 Digital Design II. Spring Lecture 20: Timing Analysis and Timed Simulation
ECE 4514 Digital Design II Lecture 20: Timing Analysis and Timed Simulation A Tools/Methods Lecture Topics Static and Dynamic Timing Analysis Static Timing Analysis Delay Model Path Delay False Paths Timing
More informationTutorial on FPGA Design Flow based on Aldec Active HDL. Ver 1.5
Tutorial on FPGA Design Flow based on Aldec Active HDL Ver 1.5 1 Prepared by Ekawat (Ice) Homsirikamol, Marcin Rogawski, Jeremy Kelly, John Pham, and Dr. Kris Gaj This tutorial assumes that you have basic
More informationMapping-Aware Constrained Scheduling for LUT-Based FPGAs
Mapping-Aware Constrained Scheduling for LUT-Based FPGAs Mingxing Tan, Steve Dai, Udit Gupta, Zhiru Zhang School of Electrical and Computer Engineering Cornell University High-Level Synthesis (HLS) for
More informationTowards Optimal Custom Instruction Processors
Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT CHIPS 18 Overview 1. background: extensible processors
More informationReduce FPGA Power With Automatic Optimization & Power-Efficient Design. Vaughn Betz & Sanjay Rajput
Reduce FPGA Power With Automatic Optimization & Power-Efficient Design Vaughn Betz & Sanjay Rajput Previous Power Net Seminar Silicon vs. Software Comparison 100% 80% 60% 40% 20% 0% 20% -40% Percent Error
More informationVerilog Sequential Logic. Verilog for Synthesis Rev C (module 3 and 4)
Verilog Sequential Logic Verilog for Synthesis Rev C (module 3 and 4) Jim Duckworth, WPI 1 Sequential Logic Module 3 Latches and Flip-Flops Implemented by using signals in always statements with edge-triggered
More informationAn easy to read reference is:
1. Synopsis: Timing Analysis and Timing Constraints The objective of this lab is to make you familiar with two critical reports produced by the Xilinx ISE during your design synthesis and implementation.
More informationWhat is Xilinx Design Language?
Bill Jason P. Tomas University of Nevada Las Vegas Dept. of Electrical and Computer Engineering What is Xilinx Design Language? XDL is a human readable ASCII format compatible with the more widely used
More informationPrototyping Architectural Support for Program Rollback Using FPGAs
Prototyping Architectural Support for Program Rollback Using FPGAs Radu Teodorescu and Josep Torrellas http://iacoma.cs.uiuc.edu University of Illinois at Urbana-Champaign Motivation Problem: Software
More informationIntroduction. In this exercise you will:
Introduction In a lot of digital designs (DAQ, Trigger,..) the FPGAs are used. The aim of this exercise is to show you a way to logic design in a FPGA. You will learn all the steps from the idea to the
More informationSynthesis of VHDL Code for FPGA Design Flow Using Xilinx PlanAhead Tool
Synthesis of VHDL Code for FPGA Design Flow Using Xilinx PlanAhead Tool Md. Abdul Latif Sarker, Moon Ho Lee Division of Electronics & Information Engineering Chonbuk National University 664-14 1GA Dekjin-Dong
More informationWhite Paper Performing Equivalent Timing Analysis Between Altera Classic Timing Analyzer and Xilinx Trace
Introduction White Paper Between Altera Classic Timing Analyzer and Xilinx Trace Most hardware designers who are qualifying FPGA performance normally run bake-off -style software benchmark comparisons
More informationECE 491 Laboratory 1 Introducing FPGA Design with Verilog September 6, 2004
Goals ECE 491 Laboratory 1 Introducing FPGA Design with Verilog September 6, 2004 1. To review the use of Verilog for combinational logic design. 2. To become familiar with using the Xilinx ISE software
More informationCOE 561 Digital System Design & Synthesis Introduction
1 COE 561 Digital System Design & Synthesis Introduction Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Outline Course Topics Microelectronics Design
More informationAdvanced FPGA Design Methodologies with Xilinx Vivado
Advanced FPGA Design Methodologies with Xilinx Vivado Alexander Jäger Computer Architecture Group Heidelberg University, Germany Abstract With shrinking feature sizes in the ASIC manufacturing technology,
More informationUsing Synplify Pro, ISE and ModelSim
Using Synplify Pro, ISE and ModelSim VLSI Systems on Chip ET4 351 Rene van Leuken Huib Lincklaen Arriëns Rev. 1.2 The EDA programs that will be used are: For RTL synthesis: Synplicity Synplify Pro For
More informationChapter 9: Integration of Full ASIP and its FPGA Implementation
Chapter 9: Integration of Full ASIP and its FPGA Implementation 9.1 Introduction A top-level module has been created for the ASIP in VHDL in which all the blocks have been instantiated at the Register
More informationand 32 bit for 32 bit. If you don t pay attention to this, there will be unexpected behavior in the ISE software and thing may not work properly!
This tutorial will show you how to: Part I: Set up a new project in ISE 14.7 Part II: Implement a function using Schematics Part III: Simulate the schematic circuit using ISim Part IV: Constraint, Synthesize,
More informationXilinx ASMBL Architecture
FPGA Structure Xilinx ASMBL Architecture Design Flow Synthesis: HDL to FPGA primitives Translate: FPGA Primitives to FPGA Slice components Map: Packing of Slice components into Slices, placement of Slices
More informationOverview. Design flow. Principles of logic synthesis. Logic Synthesis with the common tools. Conclusions
Logic Synthesis Overview Design flow Principles of logic synthesis Logic Synthesis with the common tools Conclusions 2 System Design Flow Electronic System Level (ESL) flow System C TLM, Verification,
More informationGraduate Institute of Electronics Engineering, NTU. FPGA Lab. Speaker : 鍾明翰 (CMH) Advisor: Prof. An-Yeu Wu Date: 2010/12/14 ACCESS IC LAB
FPGA Lab Speaker : 鍾明翰 (CMH) Advisor: Prof. An-Yeu Wu Date: 2010/12/14 ACCESS IC LAB Objective In this Lab, you will learn the basic set-up and design methods of implementing your design by ISE 10.1. Create
More informationSynthesizable FPGA Fabrics Targetable by the VTR CAD Tool
Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design
More informationDESIGN AND IMPLEMENTATION OF 32-BIT CONTROLLER FOR INTERACTIVE INTERFACING WITH RECONFIGURABLE COMPUTING SYSTEMS
DESIGN AND IMPLEMENTATION OF 32-BIT CONTROLLER FOR INTERACTIVE INTERFACING WITH RECONFIGURABLE COMPUTING SYSTEMS Ashutosh Gupta and Kota Solomon Raju Digital System Group, Central Electronics Engineering
More informationLab 1: FPGA Physical Layout
Lab 1: FPGA Physical Layout University of California, Berkeley Department of Electrical Engineering and Computer Sciences EECS150 Components and Design Techniques for Digital Systems John Wawrzynek, James
More informationTutorial on FPGA Design Flow based on Aldec Active HDL. Ver 1.5
Tutorial on FPGA Design Flow based on Aldec Active HDL Ver 1.5 1 Prepared by Ekawat (Ice) Homsirikamol, Marcin Rogawski, Jeremy Kelly, Kishore Kumar Surapathi and Dr. Kris Gaj This tutorial assumes that
More informationA Deterministic Flow Combining Virtual Platforms, Emulation, and Hardware Prototypes
A Deterministic Flow Combining Virtual Platforms, Emulation, and Hardware Prototypes Presented at Design Automation Conference (DAC) San Francisco, CA, June 4, 2012. Presented by Chuck Cruse FPGA Hardware
More informationINTRODUCTION TO FPGA ARCHITECTURE
3/3/25 INTRODUCTION TO FPGA ARCHITECTURE DIGITAL LOGIC DESIGN (BASIC TECHNIQUES) a b a y 2input Black Box y b Functional Schematic a b y a b y a b y 2 Truth Table (AND) Truth Table (OR) Truth Table (XOR)
More informationAn Introduction to FPGA Placement. Yonghong Xu Supervisor: Dr. Khalid
RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR An Introduction to FPGA Placement Yonghong Xu Supervisor: Dr. Khalid RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR
More informationTransparent In-Circuit Assertions for FPGAs
1 Transparent In-Circuit Assertions for FPGAs Eddie Hung, Tim Todman, Wayne Luk Department of Computing Imperial College London, UK {e.hung,timothy.todman,w.luk}@imperial.ac.uk Abstract Commonly used in
More informationSynthesis Options FPGA and ASIC Technology Comparison - 1
Synthesis Options Comparison - 1 2009 Xilinx, Inc. All Rights Reserved Welcome If you are new to FPGA design, this module will help you synthesize your design properly These synthesis techniques promote
More informationUsing the ChipScope Pro for Testing HDL Designs on FPGAs
Using the ChipScope Pro for Testing HDL Designs on FPGAs Compiled by OmkarCK CAD Lab, Dept of E&ECE, IIT Kharagpur. Introduction: Simulation based method is widely used for debugging the FPGA design on
More informationStructured Datapaths. Preclass 1. Throughput Yield. Preclass 1
ESE534: Computer Organization Day 23: November 21, 2016 Time Multiplexing Tabula March 1, 2010 Announced new architecture We would say w=1, c=8 arch. March, 2015 Tabula closed doors 1 [src: www.tabula.com]
More informationHierarchical Design Using Synopsys and Xilinx FPGAs
White Paper: FPGA Design Tools WP386 (v1.0) February 15, 2011 Hierarchical Design Using Synopsys and Xilinx FPGAs By: Kate Kelley Xilinx FPGAs offer up to two million logic cells currently, and they continue
More informationTutorial on FPGA Design Flow based on Aldec Active HDL. ver 1.6
Tutorial on FPGA Design Flow based on Aldec Active HDL ver 1.6 Fall 2011 1 Prepared by Ekawat (Ice) Homsirikamol, Marcin Rogawski, Jeremy Kelly, Kishore Kumar Surapathi, Ambarish Vyas, Umar Sharif and
More informationHardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University
Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis
More informationVdd Programmable and Variation Tolerant FPGA Circuits and Architectures
Vdd Programmable and Variation Tolerant FPGA Circuits and Architectures Prof. Lei He EE Department, UCLA LHE@ee.ucla.edu Partially supported by NSF. Pathway to Power Efficiency and Variation Tolerance
More informationESE534: Computer Organization. Tabula. Previously. Today. How often is reuse of the same operation applicable?
ESE534: Computer Organization Day 22: April 9, 2012 Time Multiplexing Tabula March 1, 2010 Announced new architecture We would say w=1, c=8 arch. 1 [src: www.tabula.com] 2 Previously Today Saw how to pipeline
More informationFT-UNSHADES credits. UNiversity of Sevilla HArdware DEbugging System.
FT-UNSHADES Microelectronic Presentation Day February, 4th, 2004 J. Tombs & M.A. Aguirre jon@gte.esi.us.es, aguirre@gte.esi.us.es AICIA-GTE of The University of Sevilla (SPAIN) FT-UNSHADES credits UNiversity
More informationCCE 3202 Advanced Digital System Design
CCE 3202 Advanced Digital System Design Lab Exercise #2 Introduction You will use Xilinx Webpack v9.1 to allow the synthesis and creation of VHDLbased designs. This lab will outline the steps necessary
More informationDesign & Implementation of 64 bit ALU for Instruction Set Architecture & Comparison between Speed/Power Consumption on FPGA.
Design & Implementation of 64 bit ALU for Instruction Set Architecture & Comparison between Speed/Power Consumption on FPGA 1 Rajeev Kumar Coordinator M.Tech ECE, Deptt of ECE, IITT College, Punjab rajeevpundir@hotmail.com
More informationEECS150 - Digital Design Lecture 5 - Verilog Logic Synthesis
EECS150 - Digital Design Lecture 5 - Verilog Logic Synthesis Jan 31, 2012 John Wawrzynek Spring 2012 EECS150 - Lec05-verilog_synth Page 1 Outline Quick review of essentials of state elements Finite State
More informationOptimizing latency in Xilinx FPGA Implementations of the GBT. Jean-Pierre CACHEMICHE
Optimizing latency in Xilinx FPGA Implementations of the GBT Steffen MUSCHTER Christian BOHM Sophie BARON Jean-Pierre CACHEMICHE Csaba SOOS (Stockholm University) (Stockholm University) (CERN) (CPPM) (CERN)
More informationA GENERATION AHEAD SEMINAR SERIES
A GENERATION AHEAD SEMINAR SERIES Constraints &Tcl Scripting Design Methodology Guidelines for Faster Timing Convergence Agenda Vivado Tcl Overview XDC Management Design Methodology for Faster Timing Closure
More informationUniversity of Hawaii EE 361L. Getting Started with Spartan 3E Digilent Basys2 Board. Lab 4.1
University of Hawaii EE 361L Getting Started with Spartan 3E Digilent Basys2 Board Lab 4.1 I. Test Basys2 Board Attach the Basys2 board to the PC or laptop with the USB connector. Make sure the blue jumper
More informationFPGA Design Flow 1. All About FPGA
FPGA Design Flow 1 In this part of tutorial we are going to have a short intro on FPGA design flow. A simplified version of FPGA design flow is given in the flowing diagram. FPGA Design Flow 2 FPGA_Design_FLOW
More informationFPGA. Logic Block. Plessey FPGA: basic building block here is 2-input NAND gate which is connected to each other to implement desired function.
FPGA Logic block of an FPGA can be configured in such a way that it can provide functionality as simple as that of transistor or as complex as that of a microprocessor. It can used to implement different
More informationKyoung Hwan Lim and Taewhan Kim Seoul National University
Kyoung Hwan Lim and Taewhan Kim Seoul National University Table of Contents Introduction Motivational Example The Proposed Algorithm Experimental Results Conclusion In synchronous circuit design, all sequential
More informationToday. Comments about assignment Max 1/T (skew = 0) Max clock skew? Comments about assignment 3 ASICs and Programmable logic Others courses
Today Comments about assignment 3-43 Comments about assignment 3 ASICs and Programmable logic Others courses octor Per should show up in the end of the lecture Mealy machines can not be coded in a single
More informationPushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO. IRIS Lab National Chiao Tung University
PushPull: Short Path Padding for Timing Error Resilient Circuits YU-MING YANG IRIS HUI-RU JIANG SUNG-TING HO IRIS Lab National Chiao Tung University Outline Introduction Problem Formulation Algorithm -
More informationLSN 1 Digital Design Flow for PLDs
LSN 1 Digital Design Flow for PLDs ECT357 Microprocessors I Department of Engineering Technology LSN 1 Programmable Logic Devices Functionless devices in base form Require programming to operate The logic
More informationDesigning Safe Verilog State Machines with Synplify
Designing Safe Verilog State Machines with Synplify Introduction One of the strengths of Synplify is the Finite State Machine compiler. This is a powerful feature that not only has the ability to automatically
More informationReference. Wayne Wolf, FPGA-Based System Design Pearson Education, N Krishna Prakash,, Amrita School of Engineering
FPGA Fabrics Reference Wayne Wolf, FPGA-Based System Design Pearson Education, 2004 Logic Design Process Combinational logic networks Functionality. Other requirements: Size. Power. Primary inputs Performance.
More informationTutorial on FPGA Design Flow based on Aldec Active HDL. ver 1.7
Tutorial on FPGA Design Flow based on Aldec Active HDL ver 1.7 Fall 2012 1 Prepared by Ekawat (Ice) Homsirikamol, Marcin Rogawski, Jeremy Kelly, Kishore Kumar Surapathi, Ambarish Vyas, Malik Umar Sharif
More informationIntroduction Warp Processors Dynamic HW/SW Partitioning. Introduction Standard binary - Separating Function and Architecture
Roman Lysecky Department of Electrical and Computer Engineering University of Arizona Dynamic HW/SW Partitioning Initially execute application in software only 5 Partitioned application executes faster
More informationCAD for VLSI Design - I. Lecture 21 V. Kamakoti and Shankar Balachandran
CAD for VLSI Design - I Lecture 21 V. Kamakoti and Shankar Balachandran Overview of this Lecture Understanding the process of Logic synthesis Logic Synthesis of HDL constructs Logic Synthesis What is this?
More informationFPGA Design Flow. - from HDL to physical implementation - Victor Andrei. Kirchhoff-Institut für Physik (KIP) Ruprecht-Karls-Universität Heidelberg
FPGA Design Flow - from HDL to physical implementation - Victor Andrei Kirchhoff-Institut für Physik (KIP) Ruprecht-Karls-Universität Heidelberg 6th Detector Workshop of the Helmholtz Alliance Physics
More informationPARLGRAN: Parallelism granularity selection for scheduling task chains on dynamically reconfigurable architectures *
PARLGRAN: Parallelism granularity selection for scheduling task chains on dynamically reconfigurable architectures * Sudarshan Banerjee, Elaheh Bozorgzadeh, Nikil Dutt Center for Embedded Computer Systems
More informationTSEA44 - Design for FPGAs
2015-11-24 Now for something else... Adapting designs to FPGAs Why? Clock frequency Area Power Target FPGA architecture: Xilinx FPGAs with 4 input LUTs (such as Virtex-II) Determining the maximum frequency
More informationCOMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor
COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle
More informationCAD SUBSYSTEM FOR DESIGN OF EFFECTIVE DIGITAL FILTERS IN FPGA
CAD SUBSYSTEM FOR DESIGN OF EFFECTIVE DIGITAL FILTERS IN FPGA Pavel Plotnikov Vladimir State University, Russia, Gorky str., 87, 600000, plotnikov_pv@inbox.ru In given article analyze of DF design flows,
More informationFPGA Implementation of MIPS RISC Processor
FPGA Implementation of MIPS RISC Processor S. Suresh 1 and R. Ganesh 2 1 CVR College of Engineering/PG Student, Hyderabad, India 2 CVR College of Engineering/ECE Department, Hyderabad, India Abstract The
More informationEECS150: Lab 3, Verilog Synthesis & FSMs
EECS150: Lab 3, Verilog Synthesis & FSMs UC Berkeley College of Engineering Department of Electrical Engineering and Computer Science February 11, 2009 1 Time Table ASSIGNED DUE Friday, Febuary 6 th Week
More informationTLL5000 Electronic System Design Base Module
TLL5000 Electronic System Design Base Module The Learning Labs, Inc. Copyright 2007 Manual Revision 2007.12.28 1 Copyright 2007 The Learning Labs, Inc. Copyright Notice The Learning Labs, Inc. ( TLL )
More informationXilinx ISE Synthesis Tutorial
Xilinx ISE Synthesis Tutorial The following tutorial provides a basic description of how to use Xilinx ISE to create a simple 2-input AND gate and synthesize the design onto the Spartan-3E Starter Board
More informationTutorial: Working with the Xilinx tools 14.4
Tutorial: Working with the Xilinx tools 14.4 This tutorial will show you how to: Part I: Set up a new project in ISE Part II: Implement a function using Schematics Part III: Implement a function using
More informationUnlocking FPGAs Using High- Level Synthesis Compiler Technologies
Unlocking FPGAs Using High- Leel Synthesis Compiler Technologies Fernando Mar*nez Vallina, Henry Styles Xilinx Feb 22, 2015 Why are FPGAs Good Scalable, highly parallel and customizable compute 10s to
More informationStochastic Processors (or processors that do not always compute correctly by design)
Stochastic Processors (or processors that do not always compute correctly by design) Rakesh Kumar Department of Electrical and Computer Engineering University of Illinois, Urbana-Champaign Insisting on
More informationSpiral 2-8. Cell Layout
2-8.1 Spiral 2-8 Cell Layout 2-8.2 Learning Outcomes I understand how a digital circuit is composed of layers of materials forming transistors and wires I understand how each layer is expressed as geometric
More informationGraphics: Alexandra Nolte, Gesine Marwedel, Universität Dortmund. RTL Synthesis
Graphics: Alexandra Nolte, Gesine Marwedel, 2003 Universität Dortmund RTL Synthesis Purpose of HDLs Purpose of Hardware Description Languages: Capture design in Register Transfer Language form i.e. All
More informationEEL 4783: HDL in Digital System Design
EEL 4783: HDL in Digital System Design Lecture 3: Architeching Speed Prof. Mingjie Lin 1 Flowchart of CAD 2 Digital Circuits: Definition of Speed Throughput Latency The amount of data that is processed
More informationSP3Q.3. What makes it a good idea to put CRC computation and error-correcting code computation into custom hardware?
Part II CST: SoC D/M: Quick exercises S3-S4 (examples sheet) Feb 2018 (rev a). This sheet contains short exercises for quick revision. Please also look at past exam questions and/or try some of the longer
More informationFPGA Matrix Multiplier
FPGA Matrix Multiplier In Hwan Baek Henri Samueli School of Engineering and Applied Science University of California Los Angeles Los Angeles, California Email: chris.inhwan.baek@gmail.com David Boeck Henri
More informationVHDL for Synthesis. Course Description. Course Duration. Goals
VHDL for Synthesis Course Description This course provides all necessary theoretical and practical know how to write an efficient synthesizable HDL code through VHDL standard language. The course goes
More informationTutorial on FPGA Design Flow based on Xilinx ISE WebPack and ModelSim. ver. 2.0
Tutorial on FPGA Design Flow based on Xilinx ISE WebPack and ModelSim ver. 2.0 Updated: Fall 2013 1 Preparing the Input: Download examples associated with this tutorial posted at http://ece.gmu.edu/tutorials-and-lab-manuals
More informationVIVADO TUTORIAL- TIMING AND POWER ANALYSIS
VIVADO TUTORIAL- TIMING AND POWER ANALYSIS IMPORTING THE PROJECT FROM ISE TO VIVADO Initially for migrating the same project which we did in ISE 14.7 to Vivado 2016.1 you will need to follow the steps
More informationPlanAhead Release Notes
PlanAhead Release Notes What s New in the 11.1 Release UG656(v 11.1.0) April 27, 2009 PlanAhead 11.1 Release Notes Page 1 Table of Contents What s New in the PlanAhead 11.1 Release... 4 Device Support...
More informationVirtex-II Architecture
Virtex-II Architecture Block SelectRAM resource I/O Blocks (IOBs) edicated multipliers Programmable interconnect Configurable Logic Blocks (CLBs) Virtex -II architecture s core voltage operates at 1.5V
More informationOverview. CSE372 Digital Systems Organization and Design Lab. Hardware CAD. Two Types of Chips
Overview CSE372 Digital Systems Organization and Design Lab Prof. Milo Martin Unit 5: Hardware Synthesis CAD (Computer Aided Design) Use computers to design computers Virtuous cycle Architectural-level,
More informationDesign Process. Design : specify and enter the design intent. Verify: Implement: verify the correctness of design and implementation
Design Verification 1 Design Process Design : specify and enter the design intent Verify: verify the correctness of design and implementation Implement: refine the design through all phases Kurt Keutzer
More informationAL8253 Core Application Note
AL8253 Core Application Note 6-15-2012 Table of Contents General Information... 3 Features... 3 Block Diagram... 3 Contents... 4 Behavioral... 4 Synthesizable... 4 Test Vectors... 4 Interface... 5 Implementation
More informationEfficient Hardware Context- Switch for Task Migration between Heterogeneous FPGAs
Efficient Hardware Context- Switch for Task Migration between Heterogeneous FPGAs Frédéric Rousseau TIMA lab University of Grenoble Alpes A join work with Alban Bourge, Arief Wicaksana and Olivier Muller
More informationPINE TRAINING ACADEMY
PINE TRAINING ACADEMY Course Module A d d r e s s D - 5 5 7, G o v i n d p u r a m, G h a z i a b a d, U. P., 2 0 1 0 1 3, I n d i a Digital Logic System Design using Gates/Verilog or VHDL and Implementation
More informationEECS150: Lab 3, Verilog Synthesis & FSMs
EECS5: Lab 3, Verilog Synthesis & FSMs UC Berkeley College of Engineering Department of Electrical Engineering and Computer Science September 3, 2 Time Table ASSIGNED DUE Friday, September th September
More informationCHAPTER-IV IMPLEMENTATION AND ANALYSIS OF FPGA-BASED DESIGN OF 32-BIT FPAU
CHAPTER-IV IMPLEMENTATION AND ANALYSIS OF FPGA-BASED DESIGN OF 32-BIT FPAU The design of 32 bit FPAU using VHDL presented in chapter III needs to be further implemented and tested on FPGA platform and
More informationVHDL-MODELING OF A GAS LASER S GAS DISCHARGE CIRCUIT Nataliya Golian, Vera Golian, Olga Kalynychenko
136 VHDL-MODELING OF A GAS LASER S GAS DISCHARGE CIRCUIT Nataliya Golian, Vera Golian, Olga Kalynychenko Abstract: Usage of modeling for construction of laser installations today is actual in connection
More informationAdvanced FPGA Design Methodologies with Xilinx Vivado
Advanced FPGA Design Methodologies with Xilinx Vivado Lecturer: Alexander Jäger Course of studies: Technische Informatik Student number: 3158849 Date: 30.01.2015 30/01/15 Advanced FPGA Design Methodologies
More informationRecommended Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto
Recommed Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto DISCLAIMER: The information contained in this document does NOT contain
More informationHardware Design with VHDL PLDs IV ECE 443
Embedded Processor Cores (Hard and Soft) Electronic design can be realized in hardware (logic gates/registers) or software (instructions executed on a microprocessor). The trade-off is determined by how
More informationBlock RAM. Size. Ports. Virtex-4 and older: 18Kb Virtex-5 and newer: 36Kb, can function as two 18Kb blocks
Block RAM Dedicated FPGA resource, separate columns from CLBs Designed to implement large (Kb) memories Multi-port capabilities Multi-clock capabilities FIFO capabilities Built-in error detection and correction
More informationstructure syntax different levels of abstraction
This and the next lectures are about Verilog HDL, which, together with another language VHDL, are the most popular hardware languages used in industry. Verilog is only a tool; this course is about digital
More informationHere is a list of lecture objectives. They are provided for you to reflect on what you are supposed to learn, rather than an introduction to this
This and the next lectures are about Verilog HDL, which, together with another language VHDL, are the most popular hardware languages used in industry. Verilog is only a tool; this course is about digital
More informationAdvanced Synthesis Techniques
Advanced Synthesis Techniques Reminder From Last Year Use UltraFast Design Methodology for Vivado www.xilinx.com/ultrafast Recommendations for Rapid Closure HDL: use HDL Language Templates & DRC Constraints:
More informationISim Hardware Co-Simulation Tutorial: Accelerating Floating Point Fast Fourier Transform Simulation
ISim Hardware Co-Simulation Tutorial: Accelerating Floating Point Fast Fourier Transform Simulation UG817 (v 13.2) July 28, 2011 Xilinx is disclosing this user guide, manual, release note, and/or specification
More informationIntroduction to FPGA Design with Vivado High-Level Synthesis. UG998 (v1.0) July 2, 2013
Introduction to FPGA Design with Vivado High-Level Synthesis Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products.
More information