A GENERATION AHEAD SEMINAR SERIES
Constraints & Tcl Scripting
Design Methodology Guidelines for Faster Timing Convergence

Agenda
- Vivado Tcl Overview
- XDC Management
- Design Methodology for Faster Timing Closure
- Available Resources & Next Steps
Vivado Tcl Overview
Vivado Tcl
- XDC, a superset of SDC*, includes:
  - Industry standard SDC timing constraints
  - Xilinx physical constraints (LOC, IO std, ...)
- Vivado Tcl includes:
  - XDC
  - Xilinx flow commands: project management, synthesize, place and route, ...
  - Objects and interactive query: netlist, device, timing, project, ... (get_*, create_*, connect_*)
  - General Tcl (8.5) language
*SDC: Synopsys Design Constraints
Vivado XDC/Tcl Benefits
- Reduces the learning curve
  - A common constraint language from synthesis to P&R
  - Sign-off static timing analysis (STA)
- Helps to quickly identify and fix design challenges
  - Powerful debug and analysis environment
  - Fast custom reports & DRC
  - Design transformation
  - What-if analysis with incremental STA
  - Extendable
- Accelerates design migration to Xilinx FPGAs
  - Plug & play with 3rd-party EDA tools
  - Industry standard tool control
Vivado XDC/Tcl Benefits: Faster Runtime
- Foundation for fast STA
  - UCF: not scaled to handle large* designs
  - XDC: well suited to large* designs
  *large = 2M+ instances
- Example: apply a constraint to all instances in block A
  - UCF: INST A/* (searches the full design)
  - XDC: get_cells A/* (lets you control the search space)
[Figure: hierarchy Top with blocks A (a1, a2) and B (b1, b2), shown once for UCF and once for XDC]
Vivado XDC/Tcl Benefits: Constraints Lifetime
- Predictable names: primary ports, FFs, nets connected to FF outputs
- Non-predictable names: Boolean logic instances and nets
- Example: apply KEEP to a net connected to the D input of mreg
[Figure: net neta driving the D input of flip-flop mreg]
- UCF:
    NET neta KEEP;
  Drawbacks:
  - neta must be known in advance
  - neta is a non-predictable name
  - Changes in logic invalidate the UCF constraint
- XDC:
    set_property DONT_TOUCH 1 [get_nets -of [get_pins mreg/d]]
  Advantages:
  - The name of the net is not necessary
  - Based on the predictable FF name: mreg
  - Changes in logic do not impact the constraint
Vivado XDC/Tcl Benefits: Constraining Exclusive Clocks
[Figure: registers REGA and REGB (D, CE, Q) clocked from CLK0 or CLK1]
- UCF: no easy solution to constrain and analyze the design
  - Sol. 1: use the most critical clock (CLK0 or CLK1)
  - Sol. 2: use two UCF files (one for CLK0, one for CLK1)
- XDC: natively supported
  - Just specify that CLK0 and CLK1 are exclusive:
    set_clock_groups -physically_exclusive -group CLK0 -group CLK1
Vivado XDC/Tcl Benefits: Custom Scripts to Solve Timing Issues
- Accelerate a DSP by pushing a register from fabric into the DSP48
- Replicate a high-fanout FF to improve placement and reduce routing delays
- Generate a custom BUFG report
- Commands: set_property, create_cell, create_net, connect_net, disconnect_net, ...
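As an illustration of the custom-report idea, the following is a minimal sketch of a report that lists flip-flops driving high-fanout nets, which are the usual candidates for replication. The procedure name `report_ff_fanout` and the default threshold are hypothetical and not from the slides; the sketch assumes an open synthesized or implemented design and that flip-flop primitives have a REF_NAME starting with FD (FDRE, FDCE, ...).

```tcl
# Hypothetical helper (name and threshold are assumptions, not from the slides).
# Lists flip-flops whose output net has at least min_fanout pins.
proc report_ff_fanout {{min_fanout 1000}} {
    # FLAT_PIN_COUNT counts all leaf pins on the net, driver included.
    set nets [get_nets -quiet -hierarchical -filter "FLAT_PIN_COUNT >= $min_fanout"]
    foreach net $nets {
        # Find the leaf pin that drives this net.
        set drv [get_pins -quiet -leaf -of_objects $net -filter {DIRECTION == OUT}]
        if {[llength $drv] == 0} { continue }
        set cell [get_cells -of_objects $drv]
        set ref  [get_property REF_NAME $cell]
        # Keep only nets driven by a flip-flop primitive.
        if {[string match FD* $ref]} {
            set fanout [expr {[get_property FLAT_PIN_COUNT $net] - 1}]
            puts [format "%-60s %-6s fanout=%d" $cell $ref $fanout]
        }
    }
}
```

A net that crosses hierarchy levels has one segment per level, so the same flip-flop may be listed more than once. For a quick look without scripting, the built-in `report_high_fanout_nets` command covers the common case.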
Vivado Tcl Benefits: Example of ECO in Tcl
- Goal: connect an internal net to an output pin
- Assumption: the design is fully routed
[Figure: internal net p_sig, clock CLK, new output port prb]
Five-step process:
1. Create a port:               create_port -direction OUT prb
2. Select an I/O standard:      set_property IOSTANDARD <standard> prb
3. Select an unused pin:        set_property PACKAGE_PIN <pin> prb
4. Connect the net to the port: connect_net -net p_sig prb
5. Route the net:               route_design
Vivado XDC/Tcl Benefits: Example of ECO in Tcl
- Example: modifying the MMCM duty cycle after routing
    set_property CLKOUT0_DUTY_CYCLE 0.3 [get_cells mmcm_adv_inst]
    write_bitstream
- Note: the MMCM duty cycle can also be modified directly in the Attributes window (GUI)
Integrate Custom Commands in IDE
- Create a menu for custom Tcl procedures
    # my_drc.tcl
    proc my_drc1 { } { }
    proc my_drc2 { } { }
- Parameters can be:
  - Entered manually, or
  - Passed as a selected object (e.g. in a schematic viewer)
- Examples:
  - Replicate a selected FF
  - Move a register to fabric for a selected RAM
  - Report timing through a selected net
  - Traverse the clock network to look for LUTs
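A minimal sketch of one of the examples above, "report timing through a selected net", might look as follows. The procedure name and its `max_paths` parameter are hypothetical; the sketch assumes it runs inside the Vivado IDE with a design open, so that `get_selected_objects` returns the current selection.

```tcl
# Hypothetical custom command (name and parameter are assumptions).
# Reports timing through every net currently selected in the IDE.
proc report_timing_through_selection {{max_paths 5}} {
    set nets {}
    # get_selected_objects returns whatever is selected, so keep nets only.
    foreach obj [get_selected_objects] {
        if {[get_property CLASS $obj] eq "net"} {
            lappend nets $obj
        }
    }
    if {[llength $nets] == 0} {
        puts "No net selected: select a net in the schematic or netlist view first."
        return
    }
    foreach net $nets {
        report_timing -through $net -max_paths $max_paths
    }
}
```

Filtering on the CLASS property keeps the procedure safe to call when cells or pins are selected by mistake.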
XDC Management
Constraint Flow: Vivado vs. ISE
- ISE
  - Two entry points: XCF (XST), UCF (Impl.)
  - No post-implementation constraints tuning
  - Flow: HDL, XST (XCF), NGDBuild (UCF), MAP, PAR, Bitgen, Silicon
- Vivado
  - Constraints tuning at each flow step
  - Supports various constraint scenarios
  - Flow: HDL, synth_design, opt_design, ..., route_design, write_bitstream, Silicon (XDC available at every step)
XDC vs. UCF Differences: Constraint Order
- UCF is a constraint file: constraint order does not matter
    TIMESPEC TS_clk = PERIOD ...
    OFFSET = IN 15 ns BEFORE "clk";
  is equivalent to
    OFFSET = IN 15 ns BEFORE "clk";
    TIMESPEC TS_clk = PERIOD ...
- XDC is a Tcl program: constraint order matters (in general)
    create_clock -name clk...
    set_input_delay -clock [get_clocks clk] ...
  is not equivalent to
    set_input_delay -clock [get_clocks clk] ...
    create_clock -name clk...
- Note: SDC exception constraints have specific (non-order-based) rules
- Both the order of constraints within an XDC file and the order of the XDC files matter
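The ordering rule can be sketched with a complete pair of constraints. The port names `clk_in` and `data_in`, the 10 ns period and the 2 ns delay are placeholders chosen for illustration, not values from the slides.

```tcl
# Define the clock first so that later constraints can reference it.
create_clock -name clk -period 10.000 [get_ports clk_in]

# get_clocks clk now finds the clock created above. If this line came
# first, get_clocks would return nothing and the constraint would not apply.
set_input_delay -clock [get_clocks clk] 2.000 [get_ports data_in]
```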
XDC Constraints can be Specified in Different Ways
- .XDC files
  - Added to project sources
  - Entered via the Constraint Editor
  - Manually edited (use XDC templates)
- Tcl scripts
  - Specified before & after each step of the flow
- Tcl Console
  - Interactively added
  - Sourced from Tcl scripts
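Scope of XDC files
- XDC can apply to the whole flow (synthesis and implementation)
- XDC can apply to part of the flow (synthesis only, or implementation only)
- Example split:
  - Main (synthesis and implementation): primary clocks, I/O delays, exceptions on clocks
  - Impl (implementation only): physical constraints, exceptions based on the physical netlist
[Figure: constraint file scenarios, including separate timing and IOs files, Main plus Impl files (ISE style), and Synth plus Impl files]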
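In project mode, one way to restrict a constraint file to part of the flow is through the USED_IN_SYNTHESIS and USED_IN_IMPLEMENTATION file properties. The sketch below assumes two files named `main.xdc` and `impl.xdc` already exist in the active constraint set; both file names are placeholders.

```tcl
# main.xdc: clocks, I/O delays and clock exceptions, used by the whole flow.
set_property USED_IN_SYNTHESIS      true [get_files main.xdc]
set_property USED_IN_IMPLEMENTATION true [get_files main.xdc]

# impl.xdc: physical constraints and netlist-based exceptions,
# skipped during synthesis and read only by implementation.
set_property USED_IN_SYNTHESIS      false [get_files impl.xdc]
set_property USED_IN_IMPLEMENTATION true  [get_files impl.xdc]
```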
Customizing the Push Button Flow
- Use tcl.pre & tcl.post to execute Tcl scripts before & after flow steps
[Figure: Main and Impl constraint files feeding Synthesis and Implementation (opt_design, place_design, route_design)]
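As a sketch of how the hooks are attached in project mode, the run properties below register scripts around individual implementation steps. The run name `impl_1` is the usual default, and the script paths are placeholders.

```tcl
# Run a script just before place_design in the implementation run.
set_property STEPS.PLACE_DESIGN.TCL.PRE \
    [file normalize ./scripts/pre_place.tcl] [get_runs impl_1]

# Run a script right after route_design, e.g. for custom reports.
set_property STEPS.ROUTE_DESIGN.TCL.POST \
    [file normalize ./scripts/post_route.tcl] [get_runs impl_1]
```

Absolute paths (here via `file normalize`) avoid surprises, since the run executes in its own working directory.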
Design Space Exploration: Launching Different Strategies
- Constraint set & tool options are part of the implementation strategy
- Vivado supports multiple constraint sets
- Create multiple runs with different constraint sets
Leverage IP XDC
- IP might create their own XDC file (example: clocking wizard)
  - In general, the IP XDC is read after the user XDC
  - For some IP (clocking wizard), the IP XDC is read before the user XDC, so user constraints can override IP-defined clocks by default
- The order of constraint files matters!
  - To report the order of XDC files: report_compile_order -constraints
  - To change the default processing order: set_property PROCESSING_ORDER {EARLY | LATE} [get_files IP_XDC_File]
- Always verify the clocks: report_clocks
- If necessary, IP XDC files can be enabled/disabled
Using XDC Templates
- Timing & physical templates
- Accessing templates in the IDE: Windows > Language Templates
- Templates contain: clocks, input & output, exceptions, physical
- DDR templates: DDR input, DDR output
  - Inputs and outputs
  - Source synchronous
  - Center aligned
Design Methodology for Faster Timing Closure
Fix Design Issues Earlier in the Flow
- Fix timing issues at early stages
  - C & RTL stages have a bigger impact on QoR
  - Iterations at these levels are much faster
- The worst path is a moving target
  - Synthesis reduces the longest RTL paths
  - Place optimizes placement for the worst path in the netlist
  - Route uses preferred routing for the next set of paths
[Figure: path distribution in RTL design (#paths vs. levels of logic), showing paths optimized in synthesis, in placement and in routing]
[Figure: impact of change on performance by flow stage (HLS (C, C++), RTL, Synthesis, opt, Place, physopt, Route), with labels 1000x, 10x, 1.2x, 1.1x]
Critical Path could be a Moving Target: Example from a Real Design
- Post-synthesis
  - Worst path: 13 levels of logic
  - Worst path: 4.3 ns
- Post-place
  - Worst path: 7 levels
  - Paths with 7-13 levels got placed locally
  - Worst path: 4.2 ns
- Post-route
  - Worst path: 4 levels of logic
  - Paths with 5-13 levels got preferred routing
  - Worst path: 4.1 ns
Analyze & fix timing issues at early stages for faster timing convergence
Follow HDL Coding & Synthesis Recommendations
- HDL coding
  - Follow RAM and DSP templates
  - Pipeline to reduce levels of logic
  - Avoid resets (or prefer synchronous ones)
- Synthesis: do not hinder synthesis
  - Avoid bottom-up flows
  - Avoid KEEP, syn_preserve attributes
- Use synthesis controls
  - Control LUT combining
  - Limit max fanout
  - Reduce the number of control sets
- Review and resolve critical warnings
[Figure: adder tree (performance bottleneck) vs. pipelined adder chain across DSP48s (optimal performance)]
Timing Constraints Must Be Pristine
- Missing constraints
  - The corresponding paths are not optimized
  - Violations are not reported, but the design may not work
- Path is incorrectly constrained
  - Optimization effort is spent on the wrong paths
  - Reported timing violations may not result in any issues on HW
- Constraints create wrong HOLD violations
  - Possible: long runtime and setup violations
- Note: Vivado fixes HOLD violations as #1 priority
  - Designs with HOLD violations won't work
  - Designs with SETUP violations will work, but slower
Method to Create Good Constraints
Create constraints: four key steps
1. Create clocks
2. Define clock interactions
3. Set input and output delays
4. Set timing exceptions
Validate constraints at each step. Monitor unconstrained objects. Validate timing.
- report_timing_summary
- check_timing
- report_clocks (Note: Tcl only)
- report_clock_networks
- report_clock_interaction
- report_timing
Note: available via GUI & Tcl.
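The four steps, each followed by a validation command, can be sketched as one short XDC/Tcl session. All port names, clock names and numeric values below are placeholders chosen for illustration, and the pairing of each report with a step is one reasonable choice rather than a prescribed one.

```tcl
# Step 1: create clocks, then check what exists.
create_clock -name clk_a -period 10.000 [get_ports clk_a_in]
create_clock -name clk_b -period 8.000  [get_ports clk_b_in]
report_clocks

# Step 2: define clock interactions, then review them.
set_clock_groups -asynchronous \
    -group [get_clocks -include_generated_clocks clk_a] \
    -group [get_clocks -include_generated_clocks clk_b]
report_clock_interaction

# Step 3: set input and output delays, then look for unconstrained I/Os.
set_input_delay  -clock clk_a 2.000 [get_ports din]
set_output_delay -clock clk_a 1.500 [get_ports dout]
check_timing

# Step 4: add timing exceptions sparingly, then validate overall timing.
set_false_path -from [get_ports rst_in]
report_timing_summary
```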
Constraints Creation Helpers
- First review unconstrained objects (helps to monitor constraining progress)
  - report_timing_summary: Check Timing section
  - check_timing
- Avoid clock skew
  - Verify clock network topology: report_clock_networks
- Beware of:
  - Gated clocks
  - Unconstrained clocks
  - Related clocks from different MMCMs
Clock Creation Ground Rules
- Clocks only exist after you create them
    create_clock -name CLK1 -period 20
  - CLK2 domain (no clock created): no analysis, no optimization
- Generated clocks
  - Clocks are automatically propagated through clocking modules (MMCM, PLL)
  - Remaining clocks must be defined manually: create_generated_clock
[Figure: CLK1 drives an MMCM (don't create clocks on its outputs); CLK2 drives a counter CNT whose output CLK_OUT is used as clock C_CLK. Annotations: create_clock -name CLK1, create_clock -name CLK2, create_generated_clock -name C_CLK]
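A minimal sketch of the situation in the figure follows: two primary clocks plus one manually defined generated clock on a fabric divider. The CLK2 period, the port names, the divider register name `CNT/clk_div_reg` and the divide-by-2 ratio are all assumptions made for illustration.

```tcl
# Primary clocks on the input ports (the CLK2 period is a placeholder).
create_clock -name CLK1 -period 20.000 [get_ports CLK1]
create_clock -name CLK2 -period 10.000 [get_ports CLK2]

# MMCM outputs driven by CLK1 get their clocks automatically,
# so nothing is created for them here.

# The counter output is a clock the tool cannot derive by itself:
# declare it relative to the clock pin of the dividing register.
create_generated_clock -name C_CLK \
    -source [get_pins CNT/clk_div_reg/C] \
    -divide_by 2 \
    [get_pins CNT/clk_div_reg/Q]

# Confirm that all three clocks, plus the MMCM outputs, are listed.
report_clocks
```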
Creating Clocks
- Define primary clocks: create_clock
- Verify specified and automatically generated clocks: report_clocks

    Attributes: P = Propagated, G = Generated

    Clock         Period  Waveform       Attributes  Sources
    sys_clk       10.000  {0.000 5.000}  P           {sys_clk}
    pll0/clkout0   2.500  {0.000 1.250}  P,G         {pll0/plle2_adv_inst/clkout0}
    pll0/clkout1  10.000  {0.000 5.000}  P,G         {pll0/plle2_adv_inst/clkout1}

- Update constraining process status: check_timing
- Define remaining internal clocks: create_generated_clock
  - Find them in the Check Timing & Report Clock Networks reports
- Check progress: report_clocks, report_clock_networks
Clock Interaction Ground Rules
- All inter-clock paths are evaluated by default
[Figure: CDC path between the CLK1 and CLK2 domains]

                   UCF            SDC
    CLK1 & CLK2    asynchronous   synchronous
    CDC            ignored        analyzed/optimized

- Use set_clock_groups to make CLK1 & CLK2 asynchronous (ignore the CDC)
    # primary clocks
    create_clock -name clk_oxo ...
    create_clock -name clk_core ...
    # set asynchronous clock groups
    set_clock_groups -asynchronous \
        -group [get_clocks -include_generated_clocks clk_oxo] \
        -group [get_clocks -include_generated_clocks clk_core]
Clock Interaction Helper
- Evaluate clock interaction: report_clock_interaction
- Unconstrain inter-clock paths (CDC) as needed: set_clock_groups
  - Can be done directly from the Clock Interaction Report
- Regenerate the Clock Interaction report and observe the changes
- Check progress: check_timing
Constraining I/Os
- Specify realistic I/O delays: set_input_delay, set_output_delay
  - A wrong delay value (e.g. 0 ns) can cause wrong HOLD violations
- SDC: the delay value is the external delay
- UCF: internal delay (default)
- Example: input delay with a 10 ns period
    # UCF
    OFFSET = IN 6ns BEFORE ClkIn;
    # XDC
    set_input_delay 4 -clock clk_in
[Figure: ClkIn and Din waveforms with the data valid window; Input Delay = 4ns, Offset In = 6ns. At FPGA: Tsu = 6ns, Th = 4ns]
- Check progress: check_timing
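The relationship between the two values in the example is that the XDC input delay (external) equals the clock period minus the UCF OFFSET IN (internal): 10 ns - 6 ns = 4 ns. A small sketch of that arithmetic in Tcl follows; the variable names are hypothetical, and the port names `ClkIn` and `Din` are taken from the waveform labels and assumed to be the actual port names.

```tcl
# Values from the slide example.
set period_ns    10.0
set offset_in_ns  6.0   ;# UCF: OFFSET = IN 6ns BEFORE ClkIn

# External delay seen by SDC/XDC = period - internal setup window.
set input_delay_ns [expr {$period_ns - $offset_in_ns}]   ;# 4.0

# The clock must exist before the input delay can reference it.
create_clock -name clk_in -period $period_ns [get_ports ClkIn]
set_input_delay -clock clk_in $input_delay_ns [get_ports Din]
```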
Timing Exceptions: Less is More!
- Goal: to help timing closure
  - Adjust unrealistic timing requirements
  - Avoid higher implementation runtimes
  - Commands: set_false_path, set_multicycle_path, set_max_delay
- Exceptions can HURT timing closure
  - Syntax related
    - set_multicycle_path: beware of hold (avoid wrong hold violations)
    - regexp: check that you only cover the expected paths
  - Runtime
    - set_false_path -from: no impact
    - set_false_path -from -to: big impact! (due to shared paths)
  - Conflicts resolution
      set_multicycle_path 3 -from REGA/Q
      set_multicycle_path 2 -to REGB/D
    - -from wins: it has higher priority than -to
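The hold caveat for set_multicycle_path can be sketched as follows. Relaxing setup to N cycles also moves the hold check, so a matching hold multicycle of N - 1 is commonly added to bring the hold check back to the launch edge. The register names REGA and REGB are reused from the slide, and the multiplier of 3 is only an example.

```tcl
# Allow 3 clock cycles for setup on the REGA -> REGB path.
set_multicycle_path 3 -setup -from [get_cells REGA] -to [get_cells REGB]

# Without this line the hold check moves 2 cycles later as well,
# which creates artificial hold violations on the same path.
set_multicycle_path 2 -hold -from [get_cells REGA] -to [get_cells REGB]

# Check both analyses on the constrained path.
report_timing -setup -from [get_cells REGA] -to [get_cells REGB]
report_timing -hold  -from [get_cells REGA] -to [get_cells REGB]
```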
Timing Analysis, Reading Reports
- report_timing_summary: a complete view of the design timing
  - Stores results from various commands: check_timing, report_timing, ...
- report_timing: interactive STA
  - Lets you focus on a specific part of the design:
    - One clock domain
    - All paths between two registers
    - All paths going through a specific net
  - Customize timing analysis
- Use them for constraints tuning at each constraints definition step
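The three ways of focusing report_timing listed above can be sketched as follows. The clock, register and net names are placeholders borrowed from other slides, and the path counts are arbitrary.

```tcl
# One clock domain: paths that start and end in sys_clk.
report_timing -from [get_clocks sys_clk] -to [get_clocks sys_clk] -max_paths 10

# Paths between two registers.
report_timing -from [get_cells REGA] -to [get_cells REGB] -max_paths 10

# Paths going through a specific net.
report_timing -through [get_nets neta] -max_paths 10
```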
Design Migration Helper: UCF to XDC Conversion
- Vivado: XDC based system only
- UCF to XDC conversion helper in PlanAhead: write_xdc
- SDC and UCF assumptions / engines are very different
  - The translation process must understand the intention of each UCF constraint
  - Timing constraints translation is almost never 100% correct
- Example: pin locking (port A, net a1)
    UCF: NET a1 loc=t19
    XDC: set_property LOC T19 [get_ports A]
- Use write_xdc for physical constraints
- Write XDC timing constraints from scratch
Available Resources & Next Steps
Vivado Tcl Help
Help from the Tcl prompt:
    % help                     list categories (see next slide for details)
    % help -category <name>    list commands in that category
    % help *                   list all commands with a brief description
    % help get_cells
    % get_cells -help          gives details of the get_cells command
    % help get_*               lists all commands starting with get_
Available Resources on www.xilinx.com
- Documentation
  - Vivado Design Suite User Guide: Using Constraints
  - Vivado Design Suite Tcl Command Reference Guide
  - Vivado Design Suite Properties Reference Guide
- Vivado Video Tutorials
  - Design Constraints Overview
  - Creating Basic Clock Constraints
- Training Classes
Take the Next Step
- facebook.com/xilinxinc
- twitter.com/#!/xilinxinc
- youtube.com/xilinxinc
Summary
- Advanced users can analyze and fix design issues in the IDE or in Tcl
  - Tcl & IDE interact on the same data model
  - Tcl allows custom reports and ECO changes
- Fix problems in HDL first
  - Be mindful of BRAM, LUTRAM, DSP, SRL inference needs
  - Avoid asynchronous resets and wired resets in general
- Provide clean timing constraints
  - Bad constraints result in bad runtime, poor performance and HW failures