June 2003, ver. 1.2 Application Note 198


Timing Closure with the Quartus II Software

Introduction

With FPGA designs surpassing the multimillion-gate mark, designers need advanced tools to better address timing closure issues and meet system performance. The Altera Quartus II software offers a fully integrated timing closure flow that allows more control over how a design is synthesized and placed-and-routed. Advanced optimization options can help you meet your performance goals.

This application note explains the timing closure flow and the features that help achieve timing closure with the Quartus II software version 3.0. It also describes the Design Space Explorer, a Tcl script that allows you to automate compilation with different sets of options activated, so that you may compare results.

This application note is intended for designers who have a basic understanding of the Quartus II software and the LogicLock design methodology. The following topics will be covered:

- Timing closure flow
- Netlist optimizations (including physical synthesis)
- Design analysis using the Timing Closure Floorplan
- Timing closure assignments
- The Design Space Explorer

Timing Closure Flow

A traditional flow for designs using FPGA tools is to enter constraints, synthesize your design, and then place-and-route it. The Quartus II software introduces features that allow you to more effectively close timing, including netlist optimizations, a new timing closure floorplan, and more powerful user assignments. The Quartus II timing closure flow also offers more control over the synthesis and place-and-route fitting steps. Also, fitter information can be used for more efficient synthesis. Figure 1 shows the Quartus II timing closure flow diagram.

Altera Corporation, AN-198-1.2

Figure 1. Timing Closure Flow Diagram

[Flow diagram: Compile Design; if performance is met, success. Otherwise apply Netlist Optimizations; if performance is met, success. Otherwise perform Analysis Using Timing Closure Floorplan, then Make Assignments and Compile; if performance is met, success.]

Note: It is important to understand the design and apply appropriate assignments so that performance increases. Applying assignments without a full understanding of the design can decrease performance.

The timing closure flow can be applied to an overall design or to modules of a design that can be integrated later. For more information on a block-based design approach, refer to AN 161: Using the LogicLock Methodology in the Quartus II Design Software.

Netlist Optimizations

The Quartus II software includes netlist optimization options that optimize your design beyond the optimization performed during the standard compilation flow. These options can be applied regardless of the synthesis tool used. Depending on your design, some options may have more of an effect than others, and the options can be applied in combination to provide optimal results.

The Quartus II netlist optimizations are applied at different stages of the design flow, either during synthesis or during fitting.

The synthesis netlist optimizations occur during the synthesis stage of the Quartus II compilation flow. Operating on output from either a third-party synthesis tool or the Quartus II standard integrated synthesis, these optimizations make changes to the synthesis netlist that are beneficial in terms of area or speed, depending on your selected optimization technique.

The fitter netlist optimizations take place during the fitter stage of the Quartus II compilation flow, and are referred to as physical synthesis. Traditionally, the Quartus II design flow has involved separate steps of synthesis and fitting. The synthesis step optimizes the logical structure of a circuit for area, speed, or both. The fitter then places and routes the logic elements to ensure critical portions of logic are close together and use the fastest possible routing resources.

While this flow produces excellent push-button results, anticipating the routing delays seen in the fitter during the synthesis stage allows synthesis tools to intelligently restructure a circuit to compensate for these delays. Since routing delays are a significant part of the typical critical path delay, a synthesis tool can target its timing-driven optimizations at these parts of the design. This tight integration of the fitting and synthesis processes is known as physical synthesis. Physical synthesis netlist optimizations make device-specific changes to the netlist that improve results for a specific Altera device.

See "Applying Netlist Optimization Options" for details on preserving your results through back-annotation.

Note: When netlist optimization options are turned on, the node names for primitives in the design can change. The fact that nodes may be renamed must be considered if you are using a LogicLock or verification flow that requires fixed node names. Primitive node names are specified during synthesis and are contained in atom netlists from third-party synthesis tools. When netlist optimizations are applied, node names may change as primitives are removed and created. HDL attributes applied to preserve logic in third-party synthesis tools cannot be honored because those attributes are not written into the atom netlist read by the Quartus II software.

If you are synthesizing directly in the Quartus II software, you may use the Preserve Register (preserve) and Keep Combinational Logic (keep) attributes to maintain certain nodes in the design.

For more information on synthesis using the Quartus II software, see AN 238: Using Quartus II Verilog HDL & VHDL Integrated Synthesis.

The following synthesis netlist optimizations are available:

- WYSIWYG primitive resynthesis
- Gate-level register retiming

The following fitter netlist optimizations are available:

- Physical synthesis for combinational logic
- Physical synthesis for registers: register duplication
- Physical synthesis for registers: register retiming

View and modify the netlist optimization options on the Netlist Optimizations page of the Settings dialog box (Assignments menu). See Figure 2.

Figure 2. Netlist Optimizations Page of the Settings Dialog Box

A two-pass optimization is also available from the command line. For information on using the Quartus II software from the command line, see AN 309: Command-Line Scripting in the Quartus II Software.

The following sections describe the available options.

Synthesis Netlist Optimization Options

This section describes the functionality of the synthesis netlist optimizations available in the Settings dialog box (Assignments menu).

WYSIWYG Primitive Resynthesis

The Perform WYSIWYG primitive resynthesis (using optimization technique specified in default logic option settings) synthesis option on the Netlist Optimizations section of the Settings dialog box (Assignments menu) can be used when you have an atom netlist that specifies a design as Altera-specific primitives. An example of an atom netlist file is an EDIF Input File (.edf) or a Verilog Quartus Mapping (.vqm) file generated by a third-party synthesis tool.

The WYSIWYG primitive resynthesis option directs the Quartus II software to un-map the logic elements (LEs) in an atom netlist to gates and then re-map the gates back to Altera-specific primitives. This feature allows the Quartus II software to use different techniques specific to the device architecture during the re-mapping process. The Quartus II technology mapper will optimize the design for either speed or area according to the specification in the Optimization Technique logic option in the Default Logic Option Settings page of the Settings dialog box (Assignments menu). Figure 3 shows the Quartus II software flow for this feature.

Figure 3. WYSIWYG Primitive Resynthesis

[Flow diagram: ATOM Netlist, Un-Map, Re-Map, Place & Route.]

The WYSIWYG primitive resynthesis option can be used with the Stratix, Stratix GX, Cyclone, or APEX device families.

This option is not applicable if you are using integrated Quartus II synthesis. With Quartus II synthesis, you do not need to un-map Altera primitives; they are already mapped during the synthesis step using the techniques that are used with the WYSIWYG primitive resynthesis option.

The WYSIWYG primitive resynthesis option will only un-map and re-map logic cell (also referred to as LCELL or LE) primitives. Memory primitives, DSP primitives, and logic cells in carry chains will not be touched. Logic specified in an encrypted VQM or EDIF file will also not be touched.

Any nodes or entities that have the logic option Netlist Optimizations set to Never Allow will not be affected during WYSIWYG primitive resynthesis. This option can be applied through Assignment Organizer > Options for Individual Nodes & Entities (Assignments menu).

Gate-Level Register Retiming

The Perform gate-level register retiming option on the Netlist Optimizations page of the Settings dialog box (Assignments menu) enables movement of registers across combinational logic to balance timing, allowing the Quartus II software to trade off the delay between critical path and non-critical path.

The functionality of your design will not change when the Perform gate-level register retiming option is turned on. However, if any registers in your design have the Power-Up Don't Care logic option assigned, the values of registers during power-up may change due to this register and logic movement. (The Power-Up Don't Care logic option is turned on globally by default.) Registers that are explicitly assigned power-up values will not be combined with registers that have been explicitly assigned other values.

Figure 4 shows an example of gate-level register retiming where the 10 ns critical delay is reduced by moving the register relative to the combinatorial logic.

Figure 4. Gate-Level Register Retiming Diagram

[Diagram: before retiming, the two register-to-register delays are 10 ns and 5 ns; after retiming, they are 7 ns and 8 ns.]

Register retiming makes changes at the gate level. If you are using an atom netlist from a third-party synthesis tool, you must also use the Perform WYSIWYG primitive resynthesis option to un-map atom primitives to gates (so that register retiming can be performed) and then to re-map gates to Altera primitives. Megafunctions instantiated in a design will always be synthesized using the Quartus II software. If your design uses Quartus II integrated synthesis, retiming occurs during synthesis before the design is mapped to Altera primitives.
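The benefit shown in Figure 4 can be checked with simple arithmetic. The following Python sketch is illustrative only: it uses the register-to-register delays from the figure and assumes that the slowest stage alone limits the clock, ignoring clock skew, setup time, and clock-to-output delay, which a real timing analysis includes. The function name fmax_mhz is hypothetical.

```python
def fmax_mhz(stage_delays_ns):
    """Return the clock frequency limit in MHz set by the slowest stage."""
    critical_ns = max(stage_delays_ns)
    return 1000.0 / critical_ns

before = [10.0, 5.0]  # Figure 4 delays before retiming
after = [7.0, 8.0]    # Figure 4 delays after retiming

print(fmax_mhz(before))  # 100.0
print(fmax_mhz(after))   # 125.0
```

The total delay across both stages is 15 ns in each case; retiming only redistributes it so that the worst stage is shorter.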

The design flows for the case of integrated Quartus II synthesis and a third-party atom netlist are shown in Figure 5.

Figure 5. Flows for Gate-Level Register Retiming

[Flow diagrams: Quartus II Integrated Synthesis: Gate Synthesis, Retiming, Technology Map, Place & Route. Third-Party ATOM Netlist: Unmap, Retiming, Remap, Place & Route.]

The gate-level register retiming options will only move registers across combinational gates. Registers will not be moved across LCELL primitives instantiated by the user, memory blocks, DSP blocks, or carry/cascade chains that you have instantiated. Carry/cascade chains are always left intact when using register retiming.

One of the benefits of register retiming is the ability to move registers from the inputs of a combinational logic block to the output, potentially combining the registers. In this case, some registers are removed, and one is created at the output. This case is shown in Figure 6.

Figure 6. Combining Registers with Register Retiming

You can only move and combine registers in this type of situation if the following conditions are met:

- All registers have the same clock domain signal
- All registers have the same clock enable signal
- All registers have asynchronous control signals that are active under the same conditions
- Only one register has an asynchronous load other than VCC or GND
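The combining conditions listed above can be expressed as a simple predicate. The Python sketch below is a toy model, not the Quartus II implementation: the Reg record, its field names, and the can_combine function are hypothetical, and the last condition is read here as "at most one register has an asynchronous load other than VCC or GND", which is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reg:
    clock: str
    clock_enable: str
    async_condition: str      # condition under which the async controls are active
    async_load: str = "GND"   # "VCC", "GND", or the name of a load signal

def can_combine(regs):
    """Check the four combining conditions for a group of input registers."""
    if len(regs) < 2:
        return False
    first = regs[0]
    same_clock = all(r.clock == first.clock for r in regs)
    same_enable = all(r.clock_enable == first.clock_enable for r in regs)
    same_async = all(r.async_condition == first.async_condition for r in regs)
    # Count registers whose asynchronous load is a real signal (assumed reading).
    signal_loads = sum(1 for r in regs if r.async_load not in ("VCC", "GND"))
    return same_clock and same_enable and same_async and signal_loads <= 1

a = Reg("clk", "en", "aclr")
b = Reg("clk", "en", "aclr", async_load="load0")
c = Reg("clk2", "en", "aclr")
print(can_combine([a, b]))  # True
print(can_combine([a, c]))  # False
```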

You can always create multiple registers at the input of a combinational block from a register at the output of a combinational block. In this case, the new registers have the same clock and clock enable. The asynchronous control signals and power-up level are derived from previous registers to provide equivalent functionality.

The Gate-level Retiming report in the Netlist Optimizations section of the Analysis & Synthesis Compilation Report (Processing menu) provides a list of registers that were removed and created during register retiming. See Figure 7.

Figure 7. Gate-Level Retiming Report

You can set the Netlist Optimizations logic option to Never Allow for registers to prevent movement during register retiming. This option can be applied either to individual registers or entities in the design and is applied through the Assignment Editor (Assignments menu).

The following registers are never moved during gate-level register retiming:

- Registers that have any timing constraint other than global fMAX, tSU, or tCO
- Registers that feed asynchronous control signals on another register
- Registers feeding the clock of another register
- Registers feeding a register in another clock domain
- Registers that are fed by a register in another clock domain
- Registers connected to serializer/deserializer (SERDES)
- Registers that have the Netlist Optimizations logic option set to Never Allow

Allow Register Retiming to Trade-Off Tsu/Tco with Fmax

The Allow register retiming to trade off Tsu/Tco with Fmax option in the Netlist Optimizations section of the Settings dialog box (Assignments menu) determines whether the Quartus II compiler should attempt to increase fMAX at the expense of tSU or tCO times. This option affects the gate-level register retiming option.

When both the Perform gate-level register retiming and the Allow register retiming to trade off Tsu/Tco with Fmax options are turned on, retiming could affect registers that feed and are fed by I/O pins. If the Allow register retiming to trade off Tsu/Tco with Fmax option is not turned on, the retiming option will not touch any registers that directly connect to I/O pins.

Fitter Optimizations (Physical Synthesis) Options

This section describes the fitter netlist optimizations, or physical synthesis optimizations, available in the Settings dialog box (Assignments menu).

The physical synthesis optimizations are split into two groups: those that affect only combinational logic and not registers (because some designers may want to keep their registers intact for formal verification or other reasons), and those that can affect registers.

Physical Synthesis for Combinational Logic

The Perform physical synthesis for combinational logic fitter option on the Netlist Optimizations page of the Settings dialog box (Assignments menu) allows the Quartus II fitter to resynthesize the design to reduce delay along the critical path. This type of optimization is achieved by swapping the LUT ports within LEs so that the critical path has fewer layers through which to travel. See Figure 8 for an example. This option also allows the duplication of LUTs to enable further optimizations on the critical path.
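A small truth-table check can illustrate why swapping LUT inputs preserves functionality while shortening the critical path. The Python sketch below is a toy example and not the Quartus II algorithm: it assumes two cascaded LUTs that together implement a four-input AND, with input a taken to be the late-arriving critical signal. All names are hypothetical.

```python
from itertools import product

def before(a, b, c, d):
    lut1 = a and b            # critical input a enters the first LUT
    return lut1 and c and d   # second LUT: a has passed through two LUTs

def after(a, b, c, d):
    lut1 = b and c            # non-critical input c swapped into the first LUT
    return a and lut1 and d   # second LUT: a now passes through one LUT only

# Compare both structures over all 16 input combinations.
equivalent = all(
    before(*bits) == after(*bits) for bits in product([False, True], repeat=4)
)
print(equivalent)  # True
```

Both versions compute the same function, but after the swap the critical input reaches the output through a single LUT, and the contents of each LUT have changed to compensate.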

Figure 8. Physical Synthesis for Combinatorial Logic

In this case, the critical input feeds through the first look-up table (LUT) to the second LUT. The Quartus II software swaps the input to the first LUT with an input feeding the second LUT to reduce the number of LUTs contained in the critical path. The synthesis information for each LUT is altered to maintain design functionality.

The Physical Synthesis for Combinational Logic option will only affect combinational logic in the form of LUTs. The registers contained in the affected LEs will not be modified. Inputs into memory blocks, DSP blocks, and I/O elements will not be swapped. LEs that are in carry or cascade chains will not be affected.

Note: Nodes or entities that have the Netlist Optimizations logic option set to Never Allow will not be affected by the Physical Synthesis algorithms.

The Physical Synthesis report in the Netlist Optimizations section of the Compilation Report (Processing menu) provides a list of combinational atoms that were modified and created during physical synthesis. See Figure 10.

Physical Synthesis for Registers: Register Duplication

The Perform register duplication fitter option on the Netlist Optimizations page of the Settings dialog box (Assignments menu) allows the Quartus II fitter to duplicate registers based on fitter placement information. Combinational logic may also be duplicated when this option is enabled. An LE that fans out to multiple locations can be duplicated to reduce the delay of one path without degrading the delay of another. The new LE may be placed closer to critical logic without affecting the other fan-out paths of the original LE. Figure 9 shows an example of register duplication.

Figure 9. Register Duplication

The Quartus II software does not perform register duplication on LEs that:

- Are part of a carry/cascade chain
- Contain registers that feed asynchronous control signals on another register
- Contain registers feeding the clock of another register
- Contain registers that drive global signals
- Contain registers constrained to a single LAB location
- Contain registers that are fed by input pins without a tSU constraint
- Contain registers that are fed by a register in another clock domain
- Are considered virtual I/O pins (for more information on virtual I/O pins, refer to AN 161: Using the LogicLock Methodology in the Quartus II Design Software)
- Have the Netlist Optimizations option set to Never Allow

The Physical Synthesis report in the Netlist Optimizations section of the Compilation Report (Processing menu) provides a list of atoms that were modified and created during register duplication. (See Figure 10.)

Physical Synthesis for Registers: Register Retiming

The Perform register retiming fitter option in the Netlist Optimizations section of the Settings dialog box allows the Quartus II fitter to move registers across combinational logic to balance timing. This option enables algorithms similar to the Perform gate-level register retiming option (see "Gate-Level Register Retiming"). This option applies to the atom level (registers and combinational logic placed into LEs), and it complements the synthesis gate-level option.

The following registers are never moved during register retiming:

- Registers that feed asynchronous control signals on another register
- Registers feeding the clock of another register
- Registers feeding a register in another clock domain
- Registers that are fed by a register in another clock domain
- Registers connected to serializer/deserializer (SERDES)
- Registers that have the Netlist Optimizations logic option set to Never Allow
- Registers constrained to a single LAB location

The Physical Synthesis report in the Netlist Optimizations section of the Compilation Report (Processing menu) provides a list of atoms that were modified during register retiming. (See Figure 10.)

Physical Synthesis Report

All the Physical Synthesis netlist optimizations write results to the Physical Synthesis report in the Netlist Optimizations section of the Compilation Report (Processing menu). This report provides a list of atoms that were modified and created during physical synthesis (Figure 10).

Figure 10. Physical Synthesis Report

Two-Pass Optimization Flow from the Quartus II Command Line

The Quartus II software supports netlist optimizations in the form of a two-pass optimization flow. This flow uses detailed fitter timing information during synthesis to attempt to improve resource utilization and performance. After standard synthesis, a Fast Fit is performed to obtain timing information and identify potential critical paths. The design is re-synthesized using the knowledge obtained from the Fast Fit by performing resource utilization improvement techniques on non-critical parts of the design and performance improvement techniques on critical parts of the design. After the re-synthesis step, logic optimization and technology mapping occur and the design then proceeds through the standard fitting process.

For more information on the Fast Fit option, see AN 297: Optimizing FPGA Performance Using the Quartus II Software.

The two-pass optimization flow is supported for Stratix, Stratix GX, and Cyclone device families. The option is only supported from the command line. It can be called directly using the following syntax:

    quartus_sh --flow two_pass_optimization <project> [-c <csf/ssf>]

It can also be embedded in a Tcl script, which can be run using the following command:

    quartus_sh -t <my_script>.tcl

where the Tcl script <my_script>.tcl should contain the following commands:

    package require ::quartus::flow
    project_open <project>
    execute_flow -two_pass_optimization
    project_close

If you are using an atom netlist from a third-party synthesis tool, turn on the Perform WYSIWYG Primitive Resynthesis netlist optimization option on the Netlist Optimizations page of the Compiler Settings section of the Settings dialog box (Assignments menu). This allows the software to unmap the atom primitives to gates after the Fast Fit. If the WYSIWYG Primitive Resynthesis netlist optimization option is not turned on, only modules that are synthesized within the Quartus II software are affected.

Megafunctions instantiated in a design will always be synthesized using the Quartus II software. Nodes or entities that have the Netlist Optimizations logic option set to Never Allow will not be affected by re-synthesis.
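When the two-pass flow is launched from a build script, it can help to assemble the command line programmatically. The Python sketch below is a minimal illustration that only builds the argument list from the syntax given in this section; the function names and the project name my_design are hypothetical, and the command itself applies to the software version described here.

```python
import subprocess

def two_pass_command(project, settings=None):
    """Build the quartus_sh argument list for the two-pass optimization flow."""
    cmd = ["quartus_sh", "--flow", "two_pass_optimization", project]
    if settings is not None:
        # Optional settings name, passed with -c as in the syntax shown in the text.
        cmd += ["-c", settings]
    return cmd

def run_two_pass(project, settings=None):
    """Run the flow; requires quartus_sh on the PATH and raises on failure."""
    return subprocess.run(two_pass_command(project, settings), check=True)

print(" ".join(two_pass_command("my_design")))
```

Keeping the command construction separate from the call that runs it makes the argument list easy to inspect or log without a Quartus II installation.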

Applying Netlist Optimization Options

To obtain optimal results when using netlist optimization options, vary the options applied to find the best results. By default, all options are off. In general, applying all of the options produces the best results. A Design Space Explorer Tcl/Tk script is provided with the Quartus II software to automate the application of various sets of netlist optimization options. See "Design Space Explorer" for more information.

For information on how to effectively use combinations of netlist optimization options and performance results, refer to AN 297: Optimizing Performance using the Quartus II Software.

Note: When using a third-party atom netlist, the WYSIWYG Primitive Resynthesis option must be turned on in order to use the Gate-level Register Retiming option.

Netlist optimization options can have various effects on your designs. Designs that are very well coded or have already been restructured to balance critical path delays may not see a noticeable difference in performance.

If you are using any Quartus II netlist optimization options, you can save your optimized results using the option Save a node-level netlist into a persistent source file on the Synthesis page under Compiler Settings in the Settings dialog box (Assignments menu). This option will save your final results as an atom-based netlist in Verilog Quartus Mapping File (.vqm) format. By default, the Quartus II software places the VQM File in the atom_netlists directory under the current project directory.

If you are using the synthesis netlist optimizations, generating a VQM file is optional. If you create a VQM file and wish to recompile the design, use the new VQM file as the input source file and turn off the synthesis netlist optimizations for the new compile.

If you are using the fitter physical synthesis netlist optimizations and you wish to back-annotate your design, a VQM netlist is required to preserve the changes that were made to your original netlist. Since the netlist optimizations depend on the placement of the nodes in the design, back-annotating the placement will change the results from physical synthesis. Changing the results means that node names will be different, and your back-annotated locations will no longer be valid.

To back-annotate a design that was compiled with fitter netlist optimizations, first create a VQM file. When recompiling the design, use the new VQM file as the input source file and turn off the fitter netlist optimizations for the new compilation.

Design Analysis Using the Timing Closure Floorplan

The Quartus II software has introduced a timing closure floorplan to help you better analyze designs. This new floorplan, used in conjunction with traditional Quartus II timing analysis features, provides a powerful method to perform design analysis.

Floorplan Views

The new timing closure floorplan allows you to customize how to view your design. The Field View is a color-coded, high-level view of resources. Figure 11 shows a Field View of a Stratix device.

Figure 11. Field View of a Stratix Device

[Figure labels: DSP Blocks, M4K Blocks, M512 Blocks, M-RAM, I/O Blocks.]

You can also view your design in the timing closure floorplan with the traditional Interior Cells, Package Top, and Package Bottom views. Use the View menu to change the floorplan view.

When in the Field View, you can view the details of a resource by selecting the resource, right-clicking, then clicking Show Details. To hide the details, select all the resources, right-click, and click Hide Details. See Figure 12.

Figure 12. Show Details & Hide Details of an LAB in Field View

Viewing Assignments

With the timing closure floorplan, you can view both user assignments and fitter placements at the same time. (User assignments are location and LogicLock assignments that you make.) To see user assignments, click the user assignments icon from the floorplan toolbar or choose Assignments (View menu) and select Show User Assignments. See Figure 13.

Figure 13. User Assignments

Fitter placements refer to where the Quartus II software placed all nodes after the last compilation. To see fitter placements, click the fitter assignments icon from the floorplan toolbar or choose Assignments (View menu) and select Show Fitter Placements. See Figure 14.

Figure 14. Fitter Placements

Viewing Critical Paths

The View Critical Paths feature displays paths in the floorplan and ranks their importance as shown in Figure 15. The criticality of a path is determined either by delay or by slack. You can also view a percentage of critical paths or specify how many paths you wish to see. You can choose to see paths for all clock domains or for a specific clock domain. The paths that can be displayed are:

- Pin-to-pin (tPD)
- Pin-to-register (tSU)
- Register-to-pin (tCO)
- Register-to-register (fMAX)

To view critical paths in the floorplan, select the Show Critical Paths icon or go to Routing > Show Critical Paths (View menu). To set the criteria for the critical path you wish to view, select the Critical Paths Settings icon or choose Routing > Critical Paths Settings (View menu). See Figure 15.

Figure 15. Critical Paths

When viewing critical paths by slack, specify the settings on the By Slack tab of the Critical Paths Settings window shown in Figure 16. You determine which paths to view and specify the slack threshold beyond which a path is displayed in the floorplan. For example, you can view all paths with a slack of -1 ns or worse.

Figure 16. Critical Paths Settings, by Slack
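The slack threshold described above can be pictured as a simple filter over a list of timing paths. The following Python sketch is illustrative only: the `TimingPath` record, the node names, and the slack values are hypothetical, and it does not reproduce how the Quartus II software selects paths internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingPath:
    """Hypothetical record for one reported path (not a Quartus II data type)."""
    source: str
    destination: str
    slack_ns: float


def paths_at_or_below_slack(paths, threshold_ns):
    """Return the paths whose slack is threshold_ns or worse, worst first."""
    selected = [p for p in paths if p.slack_ns <= threshold_ns]
    return sorted(selected, key=lambda p: p.slack_ns)


# Made-up example data standing in for a timing report.
example_paths = [
    TimingPath("reg_a", "reg_b", -1.8),
    TimingPath("reg_c", "reg_d", -0.4),
    TimingPath("reg_e", "reg_f", 0.9),
    TimingPath("reg_g", "reg_h", -1.0),
]

# "All paths with a slack of -1 ns or worse"
critical = paths_at_or_below_slack(example_paths, threshold_ns=-1.0)
for p in critical:
    print(f"{p.source} -> {p.destination}: {p.slack_ns:+.1f} ns")
```

Here "or worse" is read as "less than or equal to the threshold", so a path with exactly -1.0 ns of slack is included.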

When viewing critical paths by delay, specify the settings on the By Delay tab of the Critical Paths Settings window shown in Figure 17. This view displays the critical paths with the largest delay.

Figure 17. Critical Paths Settings, by Delay

The critical path feature is extremely useful in determining the criticality of nodes based on node placement. There are a number of options for viewing the details of a critical path. To see the delay of a critical path, click the Show Routing Delays icon or choose Routing > Show Routing Delays (View menu). See Figure 18.

Figure 18. Routing Delays for Critical Paths

The default view shows the register-to-register path. You can also view all the combinational nodes for the worst-case path between the source and destination nodes. To view the full path, select the path by clicking on the delay label, right-click, and select Show Path Edges. Figure 19 shows a critical path through combinational nodes. To hide the combinational nodes, select the path, right-click, and select Hide Path Edges.

Note: The routing delays must be shown in order to select a path.

Figure 19. Worst-Case Combinational Paths of Critical Paths

You can also assign the path to a LogicLock region through the Paths window. Select the path, right-click, and select Properties.

After using the critical path feature at least once, you can determine the maximum routing delay between two nodes within a LogicLock region. To use this feature, select the Show Intra-region Delay icon or choose Routing > Show Intra-region Delay (View menu). Place your mouse over a fitter-placed LogicLock region to see the maximum delay. Figure 20 shows the maximum routing delay of a LogicLock region.

Figure 20. Maximum Intra-Region Delay

For more information on making path assignments through the Paths window, refer to "Path-Based Assignments."

Physical Timing Estimates

In the timing closure floorplan, you can select a resource and see an approximate delay to any other resource on the chip. Once a resource is selected, the delay is visually represented by the color of potential destination resources. The darker the resource, the longer the delay, as shown in Figure 21.

Figure 21. Physical Timing Estimates for Large Floorplan

An approximate delay in nanoseconds can also be determined by selecting a source and then holding your mouse over a potential destination resource, as shown in Figure 22.

Figure 22. Delay for Physical Timing Estimate

The delays represented are an estimate based on probable best-case routing. The actual delay can be greater than what is shown, depending on the availability of routing resources. In general, there is a strong correlation between the probable and actual delay.

To view the physical timing estimates, select the Show Physical Timing Estimates icon or choose Routing > Show Physical Timing Estimates (View menu).

LogicLock Region Connectivity

You can also see how logic in LogicLock regions interfaces by viewing the connectivity between assigned LogicLock regions. This capability is extremely valuable when entities are assigned to LogicLock regions. It is also possible to see the fan-in and fan-out of selected LogicLock regions. Figure 23 shows standard LogicLock region connections.

To view the connections in the timing closure floorplan, select the Show LogicLock Regions Connectivity icon from the toolbar or choose Routing > Show LogicLock Regions Connectivity (View menu).

Figure 23. LogicLock Region Connections with Connection Count

The connection line thickness indicates how many connections exist between regions. To view the number of connections between regions, select the Show Connection Count icon or choose Routing > Show Connection Count (View menu).

LogicLock region connectivity is applicable only when the user assignments are viewed in the floorplan. When floating LogicLock regions are used, the origin of the user-assigned region is not necessarily the same as that of the fitter-placed region. This is so that you can unlock a region and then lock it down again at a later time. You can change the origin of your floating LogicLock regions to the origin from the last compilation in the LogicLock Regions window (Assignments menu), or by selecting Back-Annotate Origin and Lock in the Back-Annotate Assignments dialog box (Assignments menu).

To see the fan-in or fan-out of a LogicLock region, select the user-assigned LogicLock region while the fan-in and/or fan-out option is turned on. To set the fan-in option, select the Show Node Fan-In icon or choose Routing > Show Node Fan-In (View menu). To set the fan-out option, select the Show Node Fan-Out icon or choose Routing > Show Node Fan-Out (View menu). Only the nodes that have user assignments are shown when viewing the fan-in or fan-out of LogicLock regions. Figure 24 shows the fan-out of a selected LogicLock region.

Figure 24. Fan-In or Fan-Out

Viewing Routing Congestion

The View Routing Congestion feature allows you to determine the percentage of routing resources used after a compilation. This feature can help identify where there is a lack of routing resources. The congestion is visually represented by the color and shading of logic resources. The darker the shading, the greater the routing resource utilization. Logic resources shown in red have a routing resource utilization greater than the specified threshold.

To view routing congestion in the floorplan, select the Show Routing Congestion icon or choose Routing > Show Routing Congestion (View menu). To set the criteria for the routing congestion you wish to view, click the Routing Congestion Settings icon or choose Routing > Routing Congestion Settings (View menu). Figure 25 shows the Routing Congestion Settings window.
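This application note defines routing congestion as total resource usage divided by total available resources, with resources above a chosen threshold flagged in red. The Python sketch below illustrates that comparison under stated assumptions: the resource names and counts are hypothetical, and the actual per-resource accounting in the Quartus II software may differ.

```python
def routing_utilization(used, available):
    """Utilization ratio: resources used divided by resources available."""
    if available <= 0:
        raise ValueError("available must be positive")
    return used / available


def congested_resources(usage, threshold):
    """Return, in alphabetical order, the resource names whose utilization
    exceeds the threshold.

    usage maps a resource name to a (used, available) pair.
    """
    return sorted(
        name
        for name, (used, available) in usage.items()
        if routing_utilization(used, available) > threshold
    )


# Hypothetical resource names and counts, for illustration only.
example_usage = {
    "row_interconnect": (930, 1000),
    "column_interconnect": (410, 1000),
    "local_interconnect": (780, 800),
}

flagged = congested_resources(example_usage, threshold=0.90)
print(flagged)
```

Examining each routing resource separately, as in this sketch, mirrors the recommendation to look at resources individually when searching for the ones close to 100%.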

Figure 25. Routing Congestion Settings Window

You can choose the routing resource you wish to examine and set the congestion threshold for viewing. Routing congestion is calculated as the total resource usage divided by the total available resources.

If you are using the routing congestion viewer to determine where there is a lack of routing resources, examine each routing resource individually to see which ones are close to 100% utilization.

Timing Closure Assignments

To achieve timing closure once a design has been analyzed, you can make a number of assignments, and there are a number of methods for making them. This section covers LogicLock and location assignments as a means of setting the placement of nodes. You can make assignments to nodes, modules, or paths in the design.

Quartus II Assignments

With the Quartus II software, you can make LogicLock region assignments, soft LogicLock region assignments, or location assignments. These assignments are described below.

LogicLock Regions

LogicLock regions are contiguous, rectangular blocks of device resources to which you can assign logic. LogicLock assignments are made to nodes, paths, or entities of a design. Anything assigned to a LogicLock region is contained within the region's boundaries.

For information on LogicLock regions and the LogicLock design methodology, refer to AN 161: Using the LogicLock Methodology in the Quartus II Design Software.

Soft LogicLock Regions

The Quartus II Fitter treats soft LogicLock regions the same as LogicLock regions, except that nodes or entities assigned to a soft LogicLock region do not necessarily have to stay within the boundaries of the region. The Quartus II software places a node outside the region if placing it within the region is likely to cause a critical path to fail. You can make LogicLock assignments to nodes, paths, or entities of a design.

Note: Soft LogicLock regions are the recommended type of region to use in conjunction with fitter netlist optimizations.

For information on soft LogicLock regions, refer to AN 161: Using the LogicLock Methodology in the Quartus II Design Software.

Location Assignments

Location assignments are hard assignments that the software must interpret strictly. Assignments are made to nodes or entities and determine the specific resources to which they are assigned. Nodes and entities can be assigned to LEs, memory blocks, DSP blocks, and pins.

Location assignments can be made using the Assignment Editor or the timing closure floorplan. When using the Assignment Editor, you can select the nodes through the Node Finder and then assign them to resources that are specified under Locations. Figure 26 shows the Assignment Editor window.

Figure 26. Assignment Editor

When using the timing closure floorplan, nodes can be dragged from their fitter-placed location to a resource to make a location assignment. You can see all location assignments as long as the User Assignments view is turned on.

Applying Timing Closure Assignments

The Quartus II software provides a methodology for making node, entity, or path-based assignments.

Node Assignments

You can make node assignments through the Assignment Editor, through the timing closure floorplan, or by back-annotating your design. When using the Assignment Editor, nodes can be selected using the Node Finder, as shown in Figure 27. Nodes selected using the Assignment Editor can be assigned to LogicLock regions, soft LogicLock regions, or locations.

Figure 27. Node Finder

You can make wildcard assignments (* and ?) using the Assignment Editor. The Quartus II software specifies the assignment in the constraint file (.csf) as a wildcard instead of writing out assignments for each node.
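To get a feel for which nodes a wildcard assignment might cover, the `*` and `?` wildcards can be approximated with shell-style matching. The Python sketch below uses the standard `fnmatch` module as a stand-in, so it is an approximation rather than the Quartus II matching rules (for example, `fnmatch` also gives `[` and `]` a special meaning in patterns). The node names are hypothetical.

```python
from fnmatch import fnmatchcase


def match_nodes(pattern, node_names):
    """Return the node names matched by a shell-style wildcard pattern.

    '*' matches any run of characters and '?' matches exactly one character.
    This only approximates how a wildcard assignment selects nodes.
    """
    return [name for name in node_names if fnmatchcase(name, pattern)]


# Hypothetical hierarchical node names, for illustration only.
nodes = [
    "cpu|alu|sum_0",
    "cpu|alu|sum_1",
    "cpu|alu|carry",
    "cpu|regfile|r0",
]

print(match_nodes("cpu|alu|sum_?", nodes))  # the two sum_ nodes
print(match_nodes("cpu|alu|*", nodes))      # every node under cpu|alu
```

Listing the matches before committing to a wildcard is a quick check that the pattern is neither too broad nor too narrow.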

You can also select nodes in the timing closure floorplan. To make assignments, either right-click one of the highlighted nodes and select Assignment Editor, or drag the nodes to a LogicLock region or to a resource to make a location assignment.

A common way to make location assignments to all nodes is to back-annotate your design. This effectively locks down the placement of each node to the resource in which it was placed during your last compilation.

Entity Assignments

You can assign entities using the Assignment Editor, the timing closure floorplan, or by dragging and dropping from the Hierarchies window. Entities are most often assigned to LogicLock regions but can be assigned directly to resources using location assignments.

When using the Assignment Editor, you can specify the entity in the Edit specific entity & node settings for box, or you can select your entity in the Hierarchies window, right-click, and select Assignment Editor. After you select your entity, it can be assigned to a LogicLock region, or to a resource through a location assignment.

You can also make assignments by dragging and dropping from the Hierarchies window. To make a location assignment to a resource (e.g., a memory block), select the entity in the Hierarchies window and drag it to the specific resource in the timing closure floorplan.

LogicLock assignments can be made to entities by dragging an entity from the Hierarchies window to one of the LogicLock regions in the LogicLock Regions window. If a LogicLock region does not exist, the entity can be dragged to the <<new>> line to create a new LogicLock region. This creates a LogicLock region with the instance name of the entity and assigns the entity to it.

Figure 28. Drag-Drop from Hierarchies Window to LogicLock Region

You can also create a new LogicLock region for an entity by right-clicking the entity and selecting Create New LogicLock Region.

Path-Based Assignments

Path assignments can only be made to LogicLock regions, and only through the following methods:

- Using the new Paths window
- Dragging and dropping paths from the Timing Analysis section of the Compilation Report
- Dragging and dropping using the critical path utility in the timing closure floorplan

The new path assignments assign every node on every path between the source and destination nodes. For the situation shown in Figure 29, assume that the path Source > N1 > N2 > N4 > Destination is the worst-case path. If a path assignment is made from node Source to node Destination, nodes Source, Destination, N1, N2, N3, and N4 are all assigned to the LogicLock region.

Figure 29. Worst-Case vs. All Paths

If you prefer to assign only the worst-case path to a region, you can do so by making assignments to the individual nodes. For information on how to make assignments to nodes, refer to "Node Assignments."

Paths Window

The Paths window allows you to specify a path by identifying a source and destination node. You can use wildcards to specify source and destination nodes. See Figure 30.
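The all-paths behavior described under Path-Based Assignments (every node on any path between the source and the destination is assigned, not only the worst-case path) can be sketched as a graph computation: a node qualifies if it is reachable from the source and can itself reach the destination. The Python sketch below is illustrative only. The netlist connectivity is an assumption, since Figure 29 is not reproduced here, and the `N5` and `Other` nodes are hypothetical additions that show what is excluded.

```python
def reachable(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reverse(graph):
    """Return a copy of the graph with every edge direction flipped."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def nodes_on_any_path(graph, source, destination):
    """Nodes lying on at least one path from source to destination."""
    forward = reachable(graph, source)
    backward = reachable(reverse(graph), destination)
    return forward & backward


# Assumed connectivity: N3 sits on a second, non-worst-case path, while
# N5 and Other (hypothetical) branch away and never reach Destination.
example_graph = {
    "Source": ["N1"],
    "N1": ["N2", "N3", "N5"],
    "N2": ["N4"],
    "N3": ["N4"],
    "N4": ["Destination"],
    "N5": ["Other"],
}

assigned = nodes_on_any_path(example_graph, "Source", "Destination")
print(sorted(assigned))
```

With this assumed connectivity, N3 is included even though it is not on the worst-case path, which matches the node list given in the text.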

Figure 30. Paths Window with List Nodes & Wildcard

You have the option of excluding the source nodes, the destination nodes, and any nodes that match a name or a wildcard. It is also possible to change the LogicLock region to which the path will be assigned. Before making an assignment, you can select List Nodes to determine how many nodes will be assigned to the LogicLock region. A list of nodes is provided along with a node count.

Access the Paths window through the LogicLock Region Properties window (shown in Figure 31) by clicking Add Paths. The LogicLock Region Properties window can be opened for a LogicLock region by right-clicking the region in the LogicLock Regions window and selecting Properties.

Figure 31. LogicLock Regions Properties Window

The Paths window can also be accessed when using the Critical Paths utility by selecting the critical path, right-clicking, and selecting Properties (see Figure 32).

Figure 32. Right Click on Critical Paths

Dragging and Dropping

Paths can be selected from the timing analysis reports and assigned to LogicLock regions. To do this, open the Compilation Report and select one of the reports under the Timing Analyses section. Select the paths you would like to assign and drag them from the timing analysis report to an existing LogicLock region, or to the <<new>> line to create a region. See Figure 33.

Figure 33. Compilation Report & LogicLock Region Window

You can also assign a critical path displayed using the critical path utility to a LogicLock region. To do this, select the critical paths you would like to assign and drag them from the timing closure floorplan to an existing LogicLock region, or to the <<new>> line to create a region. See Figure 34.

Figure 34. Timing Closure Floorplan & LogicLock Region Window

Design Space Explorer

The Quartus II software provides many advanced options that help you achieve timing closure within your Altera device. When switched on, these options invoke optimization algorithms that can be customized by setting certain variables and parameters. These options provide you with complete control over Quartus II optimization techniques.

Because each FPGA design is unique, there is no standard set of options and settings that will always achieve optimal performance gains when turned on or off. Each design requires a unique set of options to achieve optimal performance. This section describes a new tool that explores a range of Quartus II options to provide the best possible result for your design.

The design space explorer (DSE) is a Tcl/Tk script that automates the process of finding the optimal set of options for your design. DSE does this by applying the various optimization techniques to your design, exploring its design space, and determining the optimal set.

DSE Basics

The DSE Tcl/Tk script can be found in default Quartus II software installations at <Quartus II Install Directory>\bin\tcl_scripts\dse\dse.tcl. However, it is recommended that you run DSE using the quartus_sh --dse command from the command prompt. DSE can be used in GUI or command-line mode. Figure 35 shows the main user interface of DSE.

DSE must be run with the Quartus II shell. Table 1 shows some examples of invoking DSE in the Quartus II shell.

For more information, type quartus_sh --help=dse at the command prompt.

Figure 35. DSE User Interface

Table 1 shows the syntax to use when starting DSE in command-line or GUI mode.

Table 1. DSE Command Modes

Mode                 Shell Command
GUI Mode             quartus_sh --dse
Command-line Mode    quartus_sh --dse [project=simple seeds=1,3,5 exploration=custom gain=15...]

Exploration Modes

DSE extends the concept of fitter seed sweeping in the Quartus II software, providing a method for sweeping general compilation and fitter parameters to find the best options for your design. You can run DSE in an exhaustive, try-all-options-and-values mode called Parameter Sweep mode, or focus on one parameter by running DSE in Signature mode.

Parameter Sweep Mode

In this mode you can set a specific target that must be met for the design project, such as a timing requirement. In Parameter Sweep mode, DSE offers a range of exploration types from Type 0, a seed sweep exploration, to Type 3, an exhaustive exploration. Compilation time increases with the breadth of the exploration, because the design space grows as more optimization options and parameters are explored. Parameter Sweep mode provides four predefined exploration types:

- Type 0: Seed Sweep
- Type 1: Basic Exploration
- Type 2: Advanced Exploration
- Type 3: Exhaustive Exploration

Type 0: Seed Sweep

You can change the initial placement configuration used by the PowerFitter by varying the Fitter seed value. The seed value can be set on the Fitter page of the Settings dialog box (Assignments menu). For typical circuits, varying the seed value changes fMAX by about ±5%. For example, for a typical design compiled with three different seeds, 1/3 of the time fMAX does not improve over the initial compilation, 1/3 of the time fMAX gets 5% better, and 1/3 of the time fMAX gets 10% better.
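The idea behind a seed sweep is simply to compile with several seeds and keep the best result. The Python sketch below illustrates that bookkeeping with made-up fMAX numbers. It does not invoke the Quartus II software, and the function names are hypothetical.

```python
def best_seed(fmax_by_seed):
    """Return the seed whose compilation reported the highest fMAX."""
    return max(fmax_by_seed, key=fmax_by_seed.get)


def gain_percent(baseline_mhz, value_mhz):
    """Percentage improvement of value_mhz over baseline_mhz."""
    return 100.0 * (value_mhz - baseline_mhz) / baseline_mhz


# Made-up fMAX results in MHz, keyed by fitter seed; seed 1 is treated
# as the initial compilation.
example_results = {1: 150.0, 3: 157.5, 5: 148.2}

winner = best_seed(example_results)
improvement = gain_percent(example_results[1], example_results[winner])
print(f"best seed: {winner}, gain over seed 1: {improvement:.1f}%")
```

Because the variation between seeds is random, a result like this holds only for the exact netlist that was compiled.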

Selecting the DSE exploration type Type 0: Seed Sweep leverages this concept and automates the process. You specify the seed values in the Seeds to Sweep field of the DSE GUI. There are no magic seeds: any integer value is as likely to produce good results as another, since the variation between seeds is truly random. By default, DSE selects 5 seeds: 1, 5, 7, 11, and 23. This type of exploration makes no changes to your netlist. There is a 1× increase in compilation time for every seed value specified. For example, if you select 5 seeds, the compilation time is 5× the initial compilation time.

Type 1: Basic Exploration

Selecting Type 1: Basic Exploration adds the Register Packing option to the exploration performed by Type 0. This exploration type also increases PowerFitter effort during place and route. This type of exploration makes no changes to your netlist. There is a 5× increase in compilation time for every seed value specified.

Type 2: Advanced Exploration

Selecting Type 2: Advanced Exploration adds LE duplication, LUT resynthesis, and the unmapping flow netlist options to the exploration included in Type 1. These netlist optimizations do not affect registers; only LUTs are modified. These options do not affect your design's behavior, which remains unchanged. There is a 10× increase in compilation time for every seed value specified.

Type 3: Exhaustive Exploration

Selecting Type 3: Exhaustive Exploration adds the Gate-Level Retiming, Atom-Level Retiming, and Register Retiming netlist optimizations to the exploration included in Type 2. There is a 13× increase in compilation time for every seed value specified.

Custom Exploration

Selecting Custom Exploration allows you to selectively explore the effects of various optimization options on your design. This exploration type gives you complete control over which options are chosen and in what mode. In Custom Exploration mode you can explore all of the optimization options available in the predefined exploration types.
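The per-seed figures quoted for each exploration type give a rough way to budget exploration time before starting a run. The Python sketch below reads those figures as multipliers of the initial compilation time (1, 5, 10, and 13 per seed for Types 0 through 3). That reading is an assumption, and the function name and the 0.5-hour base compile time are hypothetical.

```python
# Per-seed compile-time multipliers for exploration Types 0-3, read as
# multiples of the initial compilation time (an interpretation of the text).
COMPILE_TIME_MULTIPLIER = {0: 1, 1: 5, 2: 10, 3: 13}


def estimated_exploration_hours(base_hours, exploration_type, seeds):
    """Rough total compile time for one exploration, in hours."""
    return base_hours * COMPILE_TIME_MULTIPLIER[exploration_type] * len(seeds)


default_seeds = [1, 5, 7, 11, 23]  # the default seeds named in the text

for exploration_type in sorted(COMPILE_TIME_MULTIPLIER):
    hours = estimated_exploration_hours(0.5, exploration_type, default_seeds)
    print(f"Type {exploration_type}: about {hours:.1f} h")
```

An estimate like this is mainly useful for deciding whether a broader exploration type fits in the time available.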

Table 2 summarizes the exploration types.

Table 2. DSE Exploration Types

Type 0: Seed Sweep
  Optimization options explored: Fitter seed
  Increase in compile time: 1× per seed

Type 1: Basic Exploration
  Optimization options explored: Fitter seed; register packing; increase PowerFitter effort
  Increase in compile time: 5× per seed

Type 2: Advanced Exploration
  Optimization options explored: Fitter seed; register packing; increase PowerFitter effort; logic element duplication; LUT resynthesis; unmapping
  Increase in compile time: 10× per seed

Type 3: Exhaustive Exploration
  Optimization options explored: Fitter seed; register packing; logic element duplication; LUT resynthesis; unmapping; physical synthesis for combinatorial logic; WYSIWYG primitive resynthesis; register retiming; use fitter timing information; retime core and I/O; increase PowerFitter effort
  Increase in compile time: 13× per seed

Custom Exploration
  Optimization options explored: User dependent
  Increase in compile time: N/A

Signature Mode

In Signature mode, DSE analyzes the fMAX, slack, compile time, and area tradeoffs of a single parameter. DSE runs the single parameter over multiple seeds and reports the average of these values. With this information you gain a better understanding of how that single parameter interacts with your design. For example, in Signature mode, DSE can explore the Auto Packed Registers logic option with its four settings (OFF, Normal, Minimize Area, and Minimize Area with Chains) and report the effects of each on your design.

DSE Exploration Options

DSE also contains many options that allow you to control the runtime of the design exploration.

The Stop exploration after this much time has elapsed option allows you to halt further exploration after a specified number of hours has elapsed.

Note: Exploration time might exceed the specified value, because a compilation that is in progress is not halted.

The Stop exploration if this slack is achieved option allows you to halt further exploration after a specified slack value is achieved.

Other options allow you to archive each exploration done by DSE into a Quartus II Archive file, or to create archives with all optimization options set without compiling them. Choose Show Documentation (Help menu) for more information.

DSE Support for Altera Device Families

The following device families are supported by DSE exploration types 0, 1, 2, 3, and Custom:

- Stratix device family
- Stratix GX device family
- Cyclone device family

The following devices are supported by DSE exploration types 0, 1, and Custom:

- APEX20K devices
- APEX20KC devices
- APEX20KE devices
- APEX II devices
- FLEX10K devices
- FLEX10KA devices
- FLEX10KE devices

DSE Flow

DSE can be run at any point in the design process. However, it is recommended that you run DSE very late in your design cycle, when attempting to push the performance of the design. The results gained from different combinations of optimization options may not persist over large changes in a design. In Signature mode, DSE can be run at the midpoint of your design cycle to see the effect of various parameters, such as the register packing logic options.
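Signature mode, as described earlier, reports values averaged over several seeds for each setting of a single parameter. The Python sketch below shows that kind of averaging on made-up data. The result layout and numbers are hypothetical, only two of the Auto Packed Registers settings are shown, and compile time and area are omitted for brevity.

```python
from statistics import mean


def signature_summary(results):
    """Average fMAX and slack per setting across seeds.

    results maps a setting name to a list of (fmax_mhz, slack_ns) pairs,
    one pair per seed.
    """
    summary = {}
    for setting, runs in results.items():
        summary[setting] = {
            "avg_fmax_mhz": mean([fmax for fmax, _ in runs]),
            "avg_slack_ns": mean([slack for _, slack in runs]),
        }
    return summary


# Made-up per-seed results for two settings of one logic option.
example_runs = {
    "OFF": [(100.0, -0.5), (102.0, -0.25)],
    "Normal": [(104.0, -0.125), (106.0, 0.125)],
}

summary = signature_summary(example_runs)
for setting, stats in summary.items():
    print(setting, stats)
```

Averaging over seeds separates the effect of the parameter from seed-to-seed noise, which is why a mid-cycle Signature run can still be informative.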