ECE 5745 ASIC Tutorial


This tutorial explains how to use a set of Synopsys tools to push an RTL design through synthesis, place-and-route, and power analysis. It assumes you have already completed the "basic" ECE 5745 tutorials on Linux, Git, PyMTL, and Verilog.

Overview of ECE 5745 ASIC Flow

The following diagram illustrates the PyMTL-based ECE 5745 ASIC toolflow. There are four main steps.

1. We use the PyMTL framework to test, verify, and evaluate the execution time (in cycles) of our design. This part of the flow is exactly the same as ECE [...]. Note that we can write our RTL models in either PyMTL or Verilog. Once we are sure our design is working correctly, we can start to push the design through the flow. The ASIC flow requires Verilog RTL as an input, so we use PyMTL's automatic translation tool to translate PyMTL RTL models into Verilog RTL.

2. We use Synopsys Design Compiler (DC) to synthesize our design, which means transforming the Verilog RTL model into a Verilog gate-level netlist where all of the gates are selected from a standard cell library. We need to provide Synopsys DC with higher-level characterization information about our standard cell library. The primary file containing this characterization is a `.lib` file, which contains information about the logical functionality, timing, and power of each cell.

3. We use Synopsys IC Compiler (ICC) to place-and-route our design, which means placing all of the gates in the gate-level netlist into rows on the chip and then generating the metal wires that connect all of the gates together. We need to provide Synopsys ICC with lower-level characterization information about our standard cell library. The primary file containing this characterization is a `.lef` file, which contains information about the dimensions, pin placement, and metal blockages of each cell. Synopsys ICC generates a Milkyway Database which contains the actual layout as well as additional characterization information. Synopsys ICC also generates reports that can be used to more accurately characterize area and timing.

4. We use Synopsys PrimeTime (PT) to perform power analysis of our design. This requires switching activity information for every net in the design (which comes from our Verilog RTL VCD file) and capacitance information for every net in the design (which comes from the Milkyway Database generated by Synopsys ICC). Synopsys PT puts the switching activity, capacitance, clock frequency, and voltage together to estimate the power consumption of every net and thus every module in the design.

Extensive documentation is provided by Synopsys for Design Compiler, IC Compiler, and PrimeTime. We have organized this documentation and made it available to you on the public course webpage. The username/password was distributed during lecture.

PyMTL-Based Testing, Simulation, Translation

The first step is to source the setup script and clone the tutorial repository from GitHub. We create a bash variable to keep track of the tutorial directory.

    % source setup-ece5745.sh
    % mkdir $HOME/ece5745
    % cd $HOME/ece5745
    % git clone git@github.com:cornell-ece5745/ece5745-tut-asic
    % cd ece5745-tut-asic
    % TOPDIR=$PWD

We will be pushing the sort unit from the PyMTL tutorial through the ASIC flow. As a reminder, the sort unit takes as input four integers and a valid bit and outputs those same four integers in increasing order with the valid bit. The sort unit is implemented using a three-stage pipelined, bitonic sorting network and the datapath is shown below.

Run the tests for the sort unit and note that the tests for the SortUnitStructRTL will fail. You can just copy over your implementation of the MinMaxUnit from when you completed the PyMTL tutorial. If you have not completed the PyMTL tutorial then go back and do that now.
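As a reminder of what the datapath computes, here is a minimal behavioral sketch of a four-input, three-stage sorting network built from five min/max comparators. The function names and the exact wiring between stages are assumptions made for illustration (this is plain Python, not the PyMTL model from the tutorial repository), and the pipeline registers and valid bit are left out.

```python
def min_max(a, b):
    """Behavioral stand-in for one MinMaxUnit: returns (min, max)."""
    return (a, b) if a <= b else (b, a)


def sort4(values):
    """Sort four integers with a three-stage, five-comparator network."""
    in0, in1, in2, in3 = values

    # Stage 1: order each adjacent pair.
    lo01, hi01 = min_max(in0, in1)
    lo23, hi23 = min_max(in2, in3)

    # Stage 2: the smaller of the two minima is the overall minimum,
    # and the larger of the two maxima is the overall maximum.
    out0, mid_a = min_max(lo01, lo23)
    mid_b, out3 = min_max(hi01, hi23)

    # Stage 3: order the two remaining middle values.
    out1, out2 = min_max(mid_a, mid_b)

    return [out0, out1, out2, out3]


print(sort4([4, 2, 3, 1]))  # [1, 2, 3, 4]
```

Each stage only depends on the outputs of the previous one, which is what lets the hardware version place a pipeline register between stages.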
After running the tests we use the sort unit simulator to translate the PyMTL RTL model into Verilog and to dump the VCD file that we want to use for power analysis.

    % mkdir $TOPDIR/pymtl/build
    % cd $TOPDIR/pymtl/build
    % py.test ../tut3_pymtl/sort
    % ../tut3_pymtl/sort/sort-sim --impl rtl-struct --translate --dump-vcd

Take a moment to open up the translated Verilog, which should be in a file named SortUnitStructRTL_0x73ab8da9cdd886de.v. PyMTL uses the complicated hash suffix to make the filename unique even for parameterized modules that are instantiated for a specific set of parameters. The hash might be different for your design. Try to see how both the structural composition and the behavioral modeling translate into Verilog. Here is an example of the translation for the MinMaxUnit. Notice how PyMTL outputs the source Python embedded as a comment in the corresponding translated Verilog.

    module MinMaxUnit_0x152ab97dfd22b898
    (
      input  wire [   0:0] clk,
      input  wire [   7:0] in0,
      input  wire [   7:0] in1,
      output reg  [   7:0] out_max,
      output reg  [   7:0] out_min,
      input  wire [   0:0] reset
    );

      // PYMTL SOURCE:
      //
      // def block():
      //
      //   if s.in0 >= s.in1:
      //     s.out_max.value = s.in0
      //     s.out_min.value = s.in1
      //   else:
      //     s.out_max.value = s.in1
      //     s.out_min.value = s.in0

      // logic for block()
      always @ (*) begin
        if ((in0 >= in1)) begin
          out_max = in0;
          out_min = in1;
        end
        else begin
          out_max = in1;
          out_min = in0;
        end
      end

    endmodule // MinMaxUnit_0x152ab97dfd22b898

Although we hope students will not need to open up this translated Verilog, it is occasionally necessary. For example, PyMTL is not perfect and can translate incorrectly, which might require looking at the Verilog to see where it went wrong. Other steps in the ASIC flow might report an error in the translated Verilog, which will also require looking at the Verilog to figure out why those steps are going wrong. While we try to make things as automated as possible, students will eventually need to dig in and debug some of these steps themselves.

Using Synopsys Design Compiler Manually

We use Synopsys Design Compiler (DC) to synthesize Verilog RTL models into a gate-level netlist where all of the gates are from the standard cell library. For example, Synopsys DC will synthesize the Verilog `+` operator into a specific arithmetic block at the gate level. Based on various constraints it may synthesize a ripple-carry adder, a carry-look-ahead adder, or even more advanced parallel-prefix adders.

We will start by manually entering a sequence of commands into Synopsys DC, and in the next section we will see how to automate this process. Create a directory to work in and launch Synopsys DC.
    % mkdir $TOPDIR/asic/dc-syn/manual-dc
    % cd $TOPDIR/asic/dc-syn/manual-dc
    % dc_shell-xg-t

To make it easier to copy-and-paste commands from this document, we tell Synopsys DC to ignore the prefix `dc_shell>` using the following:

    dc_shell> alias "dc_shell>" ""

Before we can really start synthesizing the design we need to set up a number of variables and options. We need to point Synopsys DC to where the standard cells are installed, where the Verilog we want to synthesize is located, where the standard cell characterization files are located, and what the names for logic 0 and logic 1 are in the standard cell library.

    dc_shell> set stdcells_home /classes/ece5745/install/bare-pkgs/noarch/saed-90nm-synopsys-cells
    dc_shell> set_app_var search_path "$stdcells_home ../../../pymtl/build"
    dc_shell> set_app_var target_library "cells.db"
    dc_shell> set_app_var link_library "* $target_library"
    dc_shell> set_app_var alib_library_analysis_path \
                "/classes/ece5745/install/bare-pkgs/noarch/saed-90nm-synopsys-cells"
    dc_shell> set_app_var mw_logic1_net "VDD"
    dc_shell> set_app_var mw_logic0_net "VSS"

Now we create a new Milkyway Database. Milkyway is Synopsys' proprietary database format which is used to hold all kinds of design data (RTL models, gate-level models, standard-cell models, timing information, layout, etc). We open the new database and also create a directory for Synopsys DC to work in.

    dc_shell> create_mw_lib -technology $stdcells_home/cells.tf \
                -mw_reference_library $stdcells_home/cells.fr "LIB"
    dc_shell> open_mw_lib "LIB"
    dc_shell> define_design_lib WORK -path "./work"

We are now ready to synthesize the design. We first read in the Verilog file which contains the top-level design and all referenced modules.

    dc_shell> analyze -format verilog "SortUnitStructRTL_0x73ab8da9cdd886de.v"

We use the `elaborate` command to convert the Verilog models into a unified in-memory model format that Synopsys can analyze. This is also when Synopsys starts to do some analysis on the design, and the command output can sometimes display useful information about inferred latches and such. Notice that you need to give the `elaborate` command the name of the Verilog module which is the top of the design.

    dc_shell> elaborate "SortUnitStructRTL_0x73ab8da9cdd886de"

We use the `link` command to resolve all module references and then we use the `check_design` command to check for any warnings or errors. Always be sure to explicitly look for errors; they can get buried in the large amount of output that the Synopsys tools produce. Synopsys DC does not usually stop if there is an error but instead just keeps going.

    dc_shell> link
    dc_shell> check_design

We need to create a clock constraint to tell Synopsys DC what our target cycle time is. Synopsys DC will not synthesize a design to run "as fast as possible".
Instead, the designer gives Synopsys DC a target cycle time and the tool will try to meet this constraint while minimizing area and power. The `create_clock` command takes the name of the clock signal in the Verilog (which in this course will always be `clk`), the label to give this clock (here `ideal_clock1`), and the target clock period in nanoseconds. So in this example, we are asking Synopsys DC to see if it can synthesize the design to run at 1GHz (i.e., a cycle time of 1ns).

    dc_shell> create_clock clk -name ideal_clock1 -period 1

Finally, the `compile_ultra` command will do the synthesis. Without any options, the `compile_ultra` command will sometimes flatten parts of the design. Flattening means removing module hierarchy boundaries; so instead of having module A and module B within module C, Synopsys DC will take all of the logic in module A and module B and put it directly in module C. Without these extra hierarchy boundaries, Synopsys DC is able to perform more optimizations and potentially achieve better area, energy, and timing. The `-no_autoungroup` option prevents Synopsys DC from flattening any part of the design and thus preserves the module hierarchy. This makes it much easier to interpret the reports, since if there is a module A in your RTL design that same module will always be in the synthesized gate-level netlist.

    dc_shell> compile_ultra -no_autoungroup

As `compile_ultra` runs it will display how it is trying to optimize your design. Synopsys DC will use sophisticated CAD algorithms to try to meet the clock cycle constraint, then to reduce the area/power overhead, and then to again improve the timing. It will iterate many times as it works hard to optimize the design.

Now that we have synthesized the design, we output the resulting gate-level netlist in two different file formats: Verilog and DDC (which we will use with Design Vision).

    dc_shell> write -f verilog -hierarchy -output SortUnitStructRTL_0x73ab8da9cdd886de.mapped.v
    dc_shell> write -format ddc -hierarchy -output SortUnitStructRTL_0x73ab8da9cdd886de.mapped.ddc

We can use various commands to generate reports about area, energy, and timing. The `report_timing` command will show the critical path through the design. Part of the report is displayed below.

    dc_shell> report_timing -transition_time -nets -attributes -nosplit
    ...
    Point                                                     Fanout  Trans  Incr  Path
    clock network delay (ideal)
    elm_s1s2$001/out_reg[1]/clk (DFFARX1)                     r
    elm_s1s2$001/out_reg[1]/q (DFFARX1)                       r
    elm_s1s2$001/out[1] (net)                                 r
    elm_s1s2$001/out[1] (Reg_0x45f1552f10c5f05d_10)           r
    elm_s1s2$001$out[1] (net)                                 r
    minmax1_s2/in0[1] (MinMaxUnit_0x152ab97dfd22b898_1)       r
    minmax1_s2/in0[1] (net)                                   r
    minmax1_s2/u28/zn (INVX0)                                 f
    minmax1_s2/n32 (net)                                      f
    minmax1_s2/u31/q (OA22X1)                                 f
    minmax1_s2/n15 (net)                                      f
    minmax1_s2/u32/q (OA22X1)                                 f
    minmax1_s2/n18 (net)                                      f
    minmax1_s2/u33/q (OA22X1)                                 f
    minmax1_s2/n21 (net)                                      f
    minmax1_s2/u34/q (OA22X1)                                 f
    minmax1_s2/n24 (net)                                      f
    minmax1_s2/u35/q (OA22X1)                                 f
    minmax1_s2/n27 (net)                                      f
    minmax1_s2/u4/q (OA22X1)                                  f
    minmax1_s2/n2 (net)                                       f
    minmax1_s2/u6/q (OA22X1)                                  f
    minmax1_s2/n4 (net)                                       f
    minmax1_s2/u40/q (MUX21X1)                                r
    minmax1_s2/out_max[0] (net)                               r
    minmax1_s2/out_max[0] (MinMaxUnit_0x152ab97dfd22b898_1)   r
    minmax1_s2$out_max[0] (net)                               r
    elm_s2s3$003/in_[0] (Reg_0x45f1552f10c5f05d_0)            r
    elm_s2s3$003/in_[0] (net)                                 r
    elm_s2s3$003/out_reg[0]/d (DFFX2)                         r
    data arrival time                                         1.31

    clock ideal_clock1 (rise edge)
    clock network delay (ideal)
    elm_s2s3$003/out_reg[0]/clk (DFFX2)                       r
    library setup time
    data required time

    data required time                                        0.92
    data arrival time
    slack (VIOLATED)

This timing report uses static timing analysis to find the critical path.
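To make the idea concrete, static timing analysis can be pictured as a longest-path computation over a graph of timing arcs, followed by a comparison against the required arrival time. The sketch below is only an illustration under simplifying assumptions: the toy graph, its node names, and its delays (in picoseconds) are hypothetical, and a real tool also models slew, rise/fall differences, and load. The last line reuses the three numbers visible in the report above (1ns period, 0.92ns required time implying an 80ps setup time, and 1.31ns arrival).

```python
def arrival_times(arcs, launch_times):
    """Longest-path arrival time (ps) at every node of an acyclic timing graph.

    arcs maps (source, sink) -> delay in ps.
    launch_times maps start nodes -> launch time in ps.
    """
    fanout = {}
    pending = {}  # number of fan-in arcs not yet processed, per node
    for (src, dst) in arcs:
        fanout.setdefault(src, []).append(dst)
        pending[dst] = pending.get(dst, 0) + 1
        pending.setdefault(src, 0)

    arrival = dict(launch_times)
    ready = [node for node, count in pending.items() if count == 0]
    while ready:
        node = ready.pop()
        for sink in fanout.get(node, []):
            candidate = arrival.get(node, 0) + arcs[(node, sink)]
            arrival[sink] = max(arrival.get(sink, 0), candidate)
            pending[sink] -= 1
            if pending[sink] == 0:
                ready.append(sink)
    return arrival


def slack_ps(clock_period_ps, setup_time_ps, arrival_ps):
    """Slack = required arrival time minus actual arrival time."""
    return (clock_period_ps - setup_time_ps) - arrival_ps


# Hypothetical toy netlist: a register output feeding two gates in series,
# with a shorter bypass path that the analysis must ignore.
toy_arcs = {
    ("clk", "q"): 180,
    ("q", "g1"): 100,
    ("g1", "g2"): 130,
    ("q", "g2"): 50,
    ("g2", "d"): 120,
}
print(arrival_times(toy_arcs, {"clk": 0})["d"])  # 530
print(slack_ps(1000, 80, 1310))                  # -390
```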
Static timing analysis checks the timing across all paths in the design (regardless of whether these paths can actually be used in practice) and finds the longest path. You can learn more about static timing analysis in Chapter 1 of the Synopsys Timing Constraints and Optimization User Guide. The report clearly shows that the critical path starts at a pipeline register in between the S1 and S2 stages, goes into the first input of the bottom MinMaxUnit, comes out the out_max port of the MinMaxUnit, and ends at a pipeline register in between the S2 and S3 stages. The report shows the delay through each logic gate (e.g., the clk-to-q delay of the initial DFF is 180ps, the propagation delay of an OA22X1 gate is 130ps) and the total delay for the critical path, which in this case is 1.31ns. Notice how the OA22X1 gates do not all have the same propagation delay; this is because the static timing analysis also factors in input slew rates, rise vs fall time, and output load when calculating the delay of each gate. We set the clock constraint to be 1ns, but also notice that the report factors in the setup time required at the final register. The setup time is 80ps, so in order to operate the sort unit at 1ns and meet the setup time we would need the critical path to arrive in 0.92ns.

The difference between the required arrival time and the actual arrival time is called the slack. Positive slack means the path arrived before it needed to, while negative slack means the path arrived after it needed to. If you end up with positive slack, you probably want to decrease your clock constraint to push the tools harder and produce a faster design. Even if you have no slack you still probably want to decrease your clock constraint. This is because the tools rarely leave positive slack, preferring instead to take an overly fast design and resynthesize smaller logic to save area and power. In the above example, we have 390ps of negative slack. Note that this does not mean the sort unit will not work. It just means the cycle time would have to be 1.39ns in order for the sort unit to operate correctly. Because in this course we are primarily interested in design-space exploration (as opposed to meeting some kind of arbitrary timing constraint), we suggest adjusting the clock constraint until you end up with about 5-10% negative slack. This will result in a well-optimized design and help identify the "fundamental" performance of the design.

The `report_area` command can show how much area each module uses and can enable detailed area breakdown analysis.

    dc_shell> report_area -nosplit -hierarchy
    ...
                         Global Cell Area    Local Cell Area
    Hierarchical cell    Abs Total    %      Comb    Non Comb    Black-boxes
    SortUnitStructRTL
    elm_s0s1$            Reg_0x45f1552f10c5f05d_7
    elm_s0s1$            Reg_0x45f1552f10c5f05d_6
    elm_s0s1$            Reg_0x45f1552f10c5f05d_5
    elm_s0s1$            Reg_0x45f1552f10c5f05d_4
    elm_s1s2$            Reg_0x45f1552f10c5f05d_11
    elm_s1s2$            Reg_0x45f1552f10c5f05d_10
    elm_s1s2$            Reg_0x45f1552f10c5f05d_9
    elm_s1s2$            Reg_0x45f1552f10c5f05d_8
    elm_s2s3$            Reg_0x45f1552f10c5f05d_3
    elm_s2s3$            Reg_0x45f1552f10c5f05d_2
    elm_s2s3$            Reg_0x45f1552f10c5f05d_1
    elm_s2s3$            Reg_0x45f1552f10c5f05d_0
    minmax0_s            MinMaxUnit_0x152ab97dfd22b898_2
    minmax0_s            MinMaxUnit_0x152ab97dfd22b898_3
    minmax1_s            MinMaxUnit_0x152ab97dfd22b898_4
    minmax1_s            MinMaxUnit_0x152ab97dfd22b898_1
    minmax_s             MinMaxUnit_0x152ab97dfd22b898_0
    val_s0s              RegRst_0x2ce052f8c32c5c39_0
    val_s1s              RegRst_0x2ce052f8c32c5c39_1
    val_s2s              RegRst_0x2ce052f8c32c5c39_
    Total

The units are square microns. From the above report, we can see that each pipeline register consumes about 4-5% of the area, while the MinMaxUnits consume a total of 43% of the area. This is one reason we try not to flatten our designs, since the module hierarchy helps us understand the area breakdowns. If we completely flattened the design there would only be one line in the above table.

The `report_power` command can show how much power each module consumes. Note that this power analysis is not that useful yet, since at this stage of the flow the power analysis is based purely on statistical activity factor estimation. Basically, Synopsys DC assumes every net toggles 10% of the time. This is a pretty poor estimate.

    dc_shell> report_power -nosplit -hier

Finally, we go ahead and exit Synopsys DC.

    dc_shell> exit

Take a few minutes to examine the resulting Verilog gate-level netlist. Notice that the module hierarchy is preserved and also notice that the MinMaxUnit synthesizes into a large number of basic logic gates.

    % cd $TOPDIR/asic/dc-syn/manual-dc
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.v

We can use the Synopsys Design Vision tool for browsing the resulting gate-level netlist, plotting critical path histograms, and generally analyzing our design. Start Synopsys Design Vision and set up the various variables and options as follows:

    % design_vision-xg
    design_vision> alias "design_vision>" ""
    design_vision> set stdcells_home /classes/ece5745/install/bare-pkgs/noarch/saed-90nm-synopsys-cells
    design_vision> set_app_var search_path "$stdcells_home ../../../pymtl/build"
    design_vision> set_app_var target_library "cells.db"
    design_vision> set_app_var link_library "* $target_library"
    design_vision> set_app_var alib_library_analysis_path \
                     "/classes/ece5745/install/bare-pkgs/noarch/saed-90nm-synopsys-cells/alib"

Now choose File > Read from the menu and open the SortUnitStructRTL_0x73ab8da9cdd886de.mapped.ddc file. To view a schematic of the gate-level netlist, right click on the module in the module hierarchy browser and choose Schematic View. Open the top-level module as a gate-level netlist and then double click on the box to see the MinMaxUnits. To see a histogram of path slack choose Timing > Paths Slack from the menu, and then click OK. To see a schematic of the critical path, first click on one of the bars in the path slack histogram, and then right click on a specific path from the list that will appear to the right of the histogram. Choose Path Inspector.

Using Synopsys Design Compiler with Makefile

Obviously entering all of the above commands is tedious and error prone. We could also potentially drive synthesis directly using the Design Vision GUI, but that is just as tedious and error prone. To enable an agile hardware design methodology, we must script as much of the ASIC flow as possible.
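As a rough illustration of what scripting the manual session could look like, the sketch below collects the dc_shell commands entered above into a single block of TCL text that could be saved to a file. The function name `dc_script` and its parameters are hypothetical, and this is not how the course scripts are organized; it only replays the commands already shown in this tutorial with the design-specific names filled in.

```python
def dc_script(top_module, verilog_file, clock_period_ns, stdcells_home):
    """Return the manual dc_shell commands from this tutorial as TCL text."""
    lines = [
        # Library and search path setup
        f"set stdcells_home {stdcells_home}",
        'set_app_var search_path "$stdcells_home ../../../pymtl/build"',
        'set_app_var target_library "cells.db"',
        'set_app_var link_library "* $target_library"',
        f'set_app_var alib_library_analysis_path "{stdcells_home}"',
        'set_app_var mw_logic1_net "VDD"',
        'set_app_var mw_logic0_net "VSS"',
        # Milkyway database and working directory
        "create_mw_lib -technology $stdcells_home/cells.tf "
        '-mw_reference_library $stdcells_home/cells.fr "LIB"',
        'open_mw_lib "LIB"',
        'define_design_lib WORK -path "./work"',
        # Read, elaborate, and check the design
        f'analyze -format verilog "{verilog_file}"',
        f'elaborate "{top_module}"',
        "link",
        "check_design",
        # Constrain and synthesize
        f"create_clock clk -name ideal_clock1 -period {clock_period_ns}",
        "compile_ultra -no_autoungroup",
        # Write results and reports
        f"write -f verilog -hierarchy -output {top_module}.mapped.v",
        f"write -format ddc -hierarchy -output {top_module}.mapped.ddc",
        "report_timing -transition_time -nets -attributes -nosplit",
        "report_area -nosplit -hierarchy",
        "report_power -nosplit -hier",
        "exit",
    ]
    return "\n".join(lines) + "\n"


script = dc_script(
    top_module="SortUnitStructRTL_0x73ab8da9cdd886de",
    verilog_file="SortUnitStructRTL_0x73ab8da9cdd886de.v",
    clock_period_ns=1.0,
    stdcells_home="/classes/ece5745/install/bare-pkgs/noarch/saed-90nm-synopsys-cells",
)
print(script)
```

Keeping the design name, Verilog file, and clock period as parameters is the same separation the Makefrag described below provides.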
Luckily, Synopsys tools can be easily scripted using TCL, and even better, the ECE 5745 staff have already created these TCL scripts. The ECE 5745 TCL scripts were based on the Synopsys reference methodology, which is copyrighted by Synopsys. This means you cannot take this repo and/or the scripts and make them public. Please keep this in mind.

We use make to drive the ASIC flow. A special Makefrag describes the details of the specific design you want to push through the flow. Go into the asic subdirectory and take a look at the Makefrag.

    % cd $TOPDIR/asic
    % more Makefrag

The Makefrag has one entry for each design. Each entry looks like this:

    ifeq ($(design),pymtl-sort)
      flow          = pymtl
      clock_period  = 1.0
      sim_build_dir = pymtl/build
      vsrc          = SortUnitStructRTL_0x73ab8da9cdd886de.v
      vmname        = SortUnitStructRTL_0x73ab8da9cdd886de
      viname        = TOP/v
      vcd           = sort-rtl-struct-random.verilator1.vcd
    endif

Every design has a name, and in this case the design name is pymtl-sort. For now the flow variable will always be pymtl, the sim_build_dir variable will always be pymtl/build, and the viname variable will always be TOP/v. The clock_period variable is where you set the target clock period constraint for this design. The vsrc variable is the name of the Verilog file you want to push through the flow. The vmname variable is the name of the Verilog module which is the top of the design. For now it will always be the Verilog file name without the .v suffix. Finally, the vcd variable is the name of the VCD file you want to use for power analysis. We set the following line in the Makefrag to choose which design we want to push through the flow:

    design = pymtl-sort

Since this is already set to push our sort unit through the flow, we are all set. Now all we need to do is use make like this:

    % cd $TOPDIR/asic/dc-syn
    % make

You will see make run some commands, start Synopsys DC, run some TCL scripts, and then finish up. Essentially, the automated system is doing something very similar to what we did manually in the previous section. If Synopsys DC exits with a non-zero status code then something went wrong. You will need to carefully look through the log to search for errors or warnings that might hint at what went wrong. You may have used incorrect file/module names in the Makefrag, or there might be code in your Verilog RTL that is not synthesizable. This is not easy and there is no simple way to figure out these issues. You just need to poke through the log file:

    % cd $TOPDIR/asic/dc-syn/current-dc/log
    % more dc.log

When the synthesis is completed you can take a look at the resulting Verilog gate-level netlist here:

    % cd $TOPDIR/asic/dc-syn/current-dc/results
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.v

The automated system is also set up to output a number of reports. Here are the key ones:

    % cd $TOPDIR/asic/dc-syn/current-dc/reports
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.qor.rpt
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.timing.rpt
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.area.rpt
    % more SortUnitStructRTL_0x73ab8da9cdd886de.mapped.power.rpt

The quality-of-results (QOR) report is a particularly useful summary. If you take a look at that report you will see something like this:

    Timing Path Group 'REGIN'
      Levels of Logic:             2.00
      Critical Path Length:        0.06
      Critical Path Slack:         0.87
      Critical Path Clk Period:    1.00
      Total Negative Slack:        0.00
      No. of Violating Paths:      0.00
      Worst Hold Violation:        0.00
      Total Hold Violation:        0.00
      No. of Hold Violations:

    Timing Path Group 'REGOUT'
      Levels of Logic:             9.00
      Critical Path Length:        0.85
      Critical Path Slack:         0.15
      Critical Path Clk Period:    1.00
      Total Negative Slack:        0.00
      No. of Violating Paths:      0.00
      Worst Hold Violation:        0.00
      Total Hold Violation:        0.00
      No. of Hold Violations:

    Timing Path Group 'ideal_clock1'
      Levels of Logic:             9.00
      Critical Path Length:        0.88
      Critical Path Slack:         0.05
      Critical Path Clk Period:    1.00
      Total Negative Slack:        0.00
      No. of Violating Paths:      0.00
      Worst Hold Violation:        0.00
      Total Hold Violation:        0.00
      No. of Hold Violations:

Paths are organized into four groups: REGIN, REGOUT, INOUT, and CLK. REGIN paths start at an input port and end at a register; REGOUT paths start at a register and end at an output port; INOUT paths start at an input port and end at an output port; and CLK paths start at a register and end at a register. In the report above, the register-to-register group appears to be listed under the clock's name, ideal_clock1. The following diagram is from Chapter 1 of the Synopsys Timing Constraints and Optimization User Guide.

We have set up the flow so that the tools have to fit all four of these paths in a single cycle. The QOR report shows the worst path within each path group. The overall critical path for your design will be the worst critical path across all four groups, and the actual cycle time is calculated as the "Critical Path Clk Period" (this is the target clock constraint) minus the "Critical Path Slack". So in this example the cycle time would be 0.95ns.

Recall that when we manually entered the commands for synthesis the cycle time was 1.39ns. What changed? The automated flow takes advantage of what is known as "topographical mode"; this is an advanced feature in Synopsys DC which involves more complex algorithms that do synthesis, preliminary placement, more synthesis, and more preliminary placement. By incorporating some preliminary placement algorithms into the synthesis part of the flow, Synopsys DC is able to achieve much higher QOR. Keep in mind that the post-synthesis area, energy, and timing results will not be as accurate as the post-place-and-route results. While it is fine to iterate quickly just using synthesis, you will eventually need to use Synopsys IC Compiler for more accurate area and timing analysis, and use Synopsys PrimeTime for more accurate power analysis.
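The cycle-time rule described above (clock period minus slack, taking the worst value across path groups) is simple enough to check by hand, but a small sketch makes the arithmetic explicit. The function name and the dictionary layout are assumptions for illustration; the numbers are the period and slack values from the QOR report above, plus the 0.39ns of negative slack from the manual synthesis run.

```python
def cycle_time_ns(path_groups):
    """Worst (largest) value of clock period minus slack across path groups.

    path_groups maps a group name to (clock period, critical path slack) in ns.
    """
    return max(round(period - slack, 3) for period, slack in path_groups.values())


# Values copied from the QOR report of the automated flow.
qor = {
    "REGIN": (1.00, 0.87),
    "REGOUT": (1.00, 0.15),
    "ideal_clock1": (1.00, 0.05),
}
print(cycle_time_ns(qor))  # 0.95

# Negative slack lengthens the cycle time, as in the manual run.
print(cycle_time_ns({"ideal_clock1": (1.00, -0.39)}))  # 1.39
```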

Using IC Compiler with Makefile

We use Synopsys IC Compiler (ICC) for placing and routing standard cells, but also for power routing and clock tree synthesis. The Verilog gate-level netlist generated by Synopsys DC has no physical information: it is just a netlist, so Synopsys ICC will first try to do a rough placement of all of the gates into rows on the chip. Synopsys ICC will then do some preliminary routing, and iterate between more and more detailed placement and routing until it reaches the target cycle time (or gives up). Synopsys ICC will also route all of the power and ground rails in a grid and connect this grid to the power and ground pins of each standard cell, and it will automatically generate a clock tree to distribute the clock to all sequential state elements with hopefully low skew.

We can use make to run Synopsys ICC like this:

    % cd $TOPDIR/asic/icc-par
    % make

Place-and-route can take significantly longer than synthesis, so be prepared to wait a while with larger designs. If you look at the output scrolling by you will see some of the optimization passes as Synopsys ICC attempts to iteratively improve the design. The automated system is also set up to output a number of reports. Here are the key ones:

    % cd $TOPDIR/asic/icc-par/current-icc/reports
    % more chip_finish_icc.qor.rpt
    % more chip_finish_icc.timing.rpt
    % more chip_finish_icc.area.rpt
    % more chip_finish_icc.power.rpt
    % more summary.txt
    vsrc       = SortUnitStructRTL_0x73ab8da9cdd886de.v
    area       = 5072 # um^2
    constraint = 1.0 # ns
    slack      = 0.01 # ns
    cycle_time = 0.99 # ns

If Synopsys ICC exits with an error or the reports look very odd, you will need to carefully look through the log to search for errors or warnings that might hint at what went wrong. Usually we catch errors in Synopsys DC and after that we are all set, so you might want to go back and see if there were any errors in Synopsys DC. The Synopsys ICC log files are here:

    % cd $TOPDIR/asic/dc-syn/current-iccdp/log
    % cd $TOPDIR/asic/dc-syn/current-icc/log

We have written a little script to parse the reports and generate a summary.txt file. This script takes care of looking across all four path groups to find the true cycle time that you should use in your analysis. The general format of the area, energy, and timing reports is similar in spirit to what we saw earlier when working with Synopsys DC. From the summary.txt file, we can see that the cycle time is now estimated to be 0.99ns, but recall that our post-synthesis estimate was 0.95ns. The key difference, of course, is that these results are based on post-place-and-route analysis, so they factor in routing congestion and interconnect overheads.

While we do not use GUIs to drive our flow, we often use GUIs to analyze the results. You can start the Synopsys ICC GUI to visualize the final layout like this:

    % cd $TOPDIR/asic/icc-par/current-icc
    % icc_shell -gui

Once the GUI has finished loading you will be viewing a "MainWindow". Use the following steps to open up the most recently placed-and-routed design in a "LayoutWindow":

- Enter source icc_setup.tcl at the icc_shell> prompt
- Choose File > Open Design... from the menu
- Click the folder button to the right of Library Name
- Select the orange folder with L in the file browser
- Select chip_finish_icc in the list
- Click Okay

We call the resulting plot an "amoeba plot" because the tool often generates blocks that look like amoebas. You can now zoom in to see how the standard cells were placed and how the routing was done. You can turn the visibility of metal layers on and off using the panel on the left. One very useful feature is to view the hierarchy and area breakdown. This will be critical for producing high-quality amoeba plots. You can use the following steps to highlight various modules on the amoeba plot:

- Choose Placement > Color By Hierarchy from the menu
- In the sidebar menu on the right, select Reload
- In the pop-up window, select Color hierarchical cells at level
- Click OK in the pop-up
- Click the checkmark and apply to show just one component

Another very useful feature is to highlight the critical path on the amoeba plot using the following steps:

- Choose Timing > New Timing Analysis Window from the menu
- Focus on the Select Paths window, click OK
- A list of paths should appear
- Click on a path to see it highlighted in the layout view

You can see an example amoeba plot below. Note that you will need to use some kind of "screen-capture" software to capture the plot, and by default it will have a black background. We strongly recommend inverting the colors so that the amoeba plot you include in your reports is dark on white (instead of white on dark). This makes the chip plot easier to read. You will also need to play with the colors to make the various parts of your design easy to see. In this example, we have chosen to highlight the five MinMaxUnits (brown, blue, green, red, gray) and one of the critical paths, which goes through the red MinMaxUnit. Note how the tool has actually spread the red MinMaxUnit apart a bit. Keep in mind that these tools use incredibly sophisticated heuristics, so it can sometimes be difficult to understand every detail about why they place cells in specific places.

Using PrimeTime with Makefile

We use Synopsys PrimeTime (PT) for power analysis. There are many ways to perform power analysis. The post-synthesis and post-place-and-route power reports use statistical power analysis, where we simply assume some toggle probability on each net. For more accurate power analysis we need to find out the actual activity for every net for a given experiment. One way to do this is to perform post-place-and-route gate-level simulation; in other words, we can do a simulation of the gate-level netlist generated by synthesis and place-and-route. These kinds of gate-level simulations can be very, very slow and are tedious to set up correctly. So in this course we will use a slightly less accurate yet much simpler approach. We will use the VCD from an RTL simulation instead of the VCD from a gate-level simulation.
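To give a feel for the kind of activity information a VCD file carries, here is a minimal sketch that counts value changes per signal in a small, hand-written VCD fragment. It is a simplified reader written for illustration only (it assumes one declaration or value change per line and counts a multi-bit change as a single event); it is not how Synopsys PT processes VCD files, and the sample signals are hypothetical.

```python
def count_toggles(vcd_text):
    """Count value changes per signal name in a simplified VCD dump."""
    names = {}    # identifier code -> signal name
    last = {}     # identifier code -> most recent value
    toggles = {}  # signal name -> number of value changes
    in_header = True
    for raw in vcd_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_header:
            fields = line.split()
            if fields[0] == "$var":
                # $var <type> <width> <identifier code> <name> $end
                names[fields[3]] = fields[4]
                toggles[fields[4]] = 0
            elif fields[0] == "$enddefinitions":
                in_header = False
            continue
        if line[0] in "#$":
            continue  # timestamps and $dumpvars/$end markers
        if line[0] in "bB":
            value, code = line[1:].split()  # vector change: b<bits> <code>
        else:
            value, code = line[0], line[1:]  # scalar change: <bit><code>
        if code in last and last[code] != value:
            toggles[names[code]] += 1
        last[code] = value
    return toggles


SAMPLE = """\
$timescale 1ns $end
$scope module top $end
$var wire 1 ! clk $end
$var wire 8 % in0 $end
$upscope $end
$enddefinitions $end
#0
0!
b00000000 %
#1
1!
#2
0!
b00000011 %
#3
1!
"""
print(count_toggles(SAMPLE))  # {'clk': 3, 'in0': 1}
```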
The challenge is that not all of the nets in the gate-level simulation are actually in the RTL, so we will only have activity information for the subset of nets that are in both the RTL and gate-level models (e.g., module ports, state elements). This is not as bad as it seems, because Synopsys PT will use sophisticated algorithms, including many tiny gate-level simulations of just a few gates, in order to estimate the activity factor of all nets downstream from those nets we already know.

We can use make to run Synopsys PT like this:

    % cd $TOPDIR/asic/pt-pwr
    % make
    vsrc       = SortUnitStructRTL_0x73ab8da9cdd886de.v
    input      = sort-rtl-struct-random
    area       = 5072 # um^2
    constraint = 1.0 # ns
    slack      = 0.01 # ns
    cycle_time = 0.99 # ns
    exec_time  = 104 # cycles
    power      = # mw
    energy     = # nj

We have set up the flow to display the final summary information after this step. You can see the total area, cycle time, power, and energy for your design when running the given input (i.e., when using the VCD file specified in the Makefrag). You can see a more detailed power breakdown by module here:

    % cd $TOPDIR/asic/pt-pwr/current-pt/reports
    % more pt-pwr.power.avg.max.report
                                 Int    Switch  Leak   Total
    Hierarchy                    Power  Power   Power  Power  %
    SortUnitStructRTL            4.97e e e e
    elm_s2s3_000 (Reg_3)         2.26e e e e
    elm_s2s3_001 (Reg_2)         2.63e e e e
    elm_s2s3_002 (Reg_1)         2.65e e e e
    elm_s2s3_003 (Reg_0)         2.36e e e e
    elm_s1s2_000 (Reg_11)        2.58e e e e
    elm_s1s2_001 (Reg_10)        2.66e e e e
    elm_s1s2_002 (Reg_9)         2.55e e e e
    elm_s1s2_003 (Reg_8)         2.64e e e e
    elm_s0s1_000 (Reg_7)         2.66e e e e
    elm_s0s1_001 (Reg_6)         2.66e e e e
    elm_s0s1_002 (Reg_5)         2.68e e e e
    val_s2s3 (RegRst_2)          5.55e e e e
    elm_s0s1_003 (Reg_4)         2.65e e e e
    minmax_s3 (MinMaxUnit_0)     1.94e e e e
    minmax0_s1 (MinMaxUnit_2)    2.09e e e e
    minmax0_s2 (MinMaxUnit_3)    2.01e e e e
    minmax1_s1 (MinMaxUnit_4)    2.04e e e e
    minmax1_s2 (MinMaxUnit_1)    2.01e e e e
    val_s0s1 (RegRst_0)          3.30e e e e
    val_s1s2 (RegRst_1)          2.19e e e e

These estimates are in Watts. The power of each module is broken down into internal power, switching power, and leakage power. Internal power and switching power are both forms of dynamic power. Internal power is the "dynamic power dissipated within the boundary of a cell". According to Synopsys documentation it includes power due to charging/discharging internal nodes within the cell, but also short-circuit power. Switching power is the dynamic "power dissipated by the charging and discharging of the load capacitance at the output of the cell".
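For intuition about the switching component, the textbook first-order model says that each transition on a net dissipates roughly half of C·V², so average switching power scales with toggle rate, load capacitance, and the square of the supply voltage. The sketch below applies that model; it is not a description of the algorithm Synopsys PT uses, and the capacitance and voltage values are hypothetical numbers chosen only to show the arithmetic.

```python
def switching_power_w(toggle_rate_hz, load_capacitance_f, supply_voltage_v):
    """First-order switching power: each toggle dissipates about C * V^2 / 2."""
    return 0.5 * load_capacitance_f * supply_voltage_v ** 2 * toggle_rate_hz


# Hypothetical net: 2 fF of load at 1.2 V, toggling on 10% of the cycles
# of a 1 GHz clock (1e8 toggles per second).
clock_hz = 1e9
toggle_rate_hz = 0.10 * clock_hz
power_w = switching_power_w(toggle_rate_hz, 2e-15, 1.2)
print(f"{power_w * 1e6:.3f} uW")  # 0.144 uW
```

Under this model a net that never toggles, such as a data bit in a stream of all zeros, contributes no switching power at all.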
To learn more about how Synopsys PT does power analysis, see the PrimeTime PX User Guide. From the breakdown you can see a relatively even distribution of power across the modules, and that dynamic power is much more significant than leakage power.

Let's do a quick experiment to compare the energy for sorting a stream of all zeros to the energy for sorting a stream of random values (which we just found to be about 650pJ). We do not need to re-synthesize or re-place-and-route the design. We just need to generate a new VCD file and re-run Synopsys PT. So first we re-run the sort unit simulator with a different input:

    % cd $TOPDIR/pymtl/build
    % ../tut3_pymtl/sort/sort-sim --impl rtl-struct --input zeros --translate --dump-vcd

Now we need to change the vcd entry in the Makefrag to point to the new VCD file. The entry in the Makefrag should look like this:

    ifeq ($(design),pymtl-sort)
      flow = pymtl
      clock_period = 1.0
      sim_build_dir = pymtl/build
      vsrc = SortUnitStructRTL_0x73ab8da9cdd886de.v
      vmname = SortUnitStructRTL_0x73ab8da9cdd886de

      viname = TOP/v
      vcd = sort-rtl-struct-zeros.verilator1.vcd
    endif

Now we re-run Synopsys PT:

    % cd $TOPDIR/asic/pt-pwr && make

    vsrc = SortUnitStructRTL_0x73ab8da9cdd886de.v
    input = sort-rtl-struct-zeros
    area = 72 # um^2
    constraint = 1.0 # ns
    slack = 0.01 # ns
    cycle_time = 0.99 # ns
    exec_time = 104 # cycles
    power = # mw
    energy = # nj

Not surprisingly, sorting a stream of zeros consumes significantly less energy than sorting a stream of random values: 275pJ vs 653pJ. One might ask why the sort unit consumes any energy at all if it is just sorting a stream of zeros. We can dig into the report to find the answer:

    % cd $TOPDIR/asic/pt-pwr/current-pt/reports
    % more pt-pwr.power.avg.max.report

    Hierarchy                     Int Power  Switch Power  Leak Power  Total Power  %
    SortUnitStructRTL             2.19e e e e
    elm_s2s3_000 (Reg_3)          1.63e e e
    elm_s2s3_001 (Reg_2)          1.63e e e
    elm_s2s3_002 (Reg_1)          1.63e e e
    elm_s2s3_003 (Reg_0)          1.63e e e
    elm_s1s2_000 (Reg_11)         1.63e e e
    elm_s1s2_001 (Reg_10)         1.63e e e
    elm_s1s2_002 (Reg_9)          1.63e e e
    elm_s1s2_003 (Reg_8)          1.63e e e
    elm_s0s1_000 (Reg_7)          1.63e e e
    elm_s0s1_001 (Reg_6)          1.63e e e
    elm_s0s1_002 (Reg_5)          1.63e e e
    val_s2s3 (RegRst_2)           5.54e e e e
    elm_s0s1_003 (Reg_4)          1.63e e e
    minmax_s3 (MinMaxUnit_0)      e e
    minmax0_s1 (MinMaxUnit_2)     e e
    minmax0_s2 (MinMaxUnit_3)     e e
    minmax1_s1 (MinMaxUnit_4)     e e
    minmax1_s2 (MinMaxUnit_1)     e e
    val_s0s1 (RegRst_0)           3.31e e e e
    val_s1s2 (RegRst_1)           2.18e e e e

Notice that the switching power is indeed zero for the pipeline registers, but not for the valid bits. This is probably because the valid bit does toggle at the beginning and end of the simulation; the absolute switching power of the valid bit is very small. Notice that there is also still leakage power, but neither of these accounts for the majority of the 275pJ. The key is the internal power of the pipeline registers.
Internal power also includes the clock power for sequential state elements, so although sorting a stream of zeros spends very little energy on the data bits, we still spend energy toggling the clock across all of the pipeline registers. In this design there are bit pipeline registers which is quite a bit of state. The key point is that you should always try small experiments to verify that things are working as expected, and that you will almost certainly need to dig into the detailed reports to understand what is going on.

Using Verilog RTL Models

Students are welcome to use Verilog instead of PyMTL to design their RTL models. Having said this, we will still exclusively use PyMTL for all test harnesses, FL/CL models, and simulation drivers. This really simplifies managing the course, and PyMTL is actually a very productive way to test and evaluate your Verilog RTL designs. We use PyMTL's Verilog import feature, described in the Verilog tutorial, to make all of this work. The following commands will run all of the tests on the Verilog

implementation of the sort unit.

    % cd $TOPDIR/pymtl/build
    % rm -rf *
    % py.test ../tut4_verilog/sort

As before, the tests for the SortUnitStructRTL will fail. You can just copy over your implementation of the MinMaxUnit from when you completed the Verilog tutorial. If you have not completed the Verilog tutorial, go back and do that now. After running the tests, we use the sort unit simulator to translate the PyMTL RTL model into Verilog and to dump the VCD file that we want to use for power analysis.

    % cd $TOPDIR/pymtl/build
    % ../tut4_verilog/sort/sort-sim --impl rtl-struct --translate --dump-vcd

Take a moment to open up the translated Verilog, which should be in a file named SortUnitStructRTL_0x73ab8da9cdd886de.v. You might ask, "Why do we need to use PyMTL to translate the Verilog if we already have the Verilog?" PyMTL takes care of preprocessing all of your Verilog RTL code to ensure it is in a single Verilog file. This greatly simplifies getting your design into the ASIC flow. It also ensures a one-to-one match between the Verilog that was used to generate the VCD file and the Verilog that is used in the ASIC flow. Once you have tested your design and generated the single Verilog file and the VCD file, you can push the design through the ASIC flow using the exact same steps we used above.

    % cd $TOPDIR/asic/dc-syn && make
    % cd $TOPDIR/asic/icc-par && make
    % cd $TOPDIR/asic/pt-pwr && make

    vsrc = SortUnitStructRTL_0x73ab8da9cdd886de.v
    input = sort-rtl-struct-random
    area = 5695 # um^2
    constraint = 1.0 # ns
    slack = 0.0 # ns
    cycle_time = 1.0 # ns
    exec_time = 104 # cycles
    power = # mw
    energy = # nj

On Your Own

Now that you have gone through the entire ECE 5745 ASIC flow for both the PyMTL and Verilog implementations of the sort unit, you should try the same approach for the GCD unit, which is included in the tutorial. Explore the area, energy, and timing of the GCD unit. Where is the critical path?
How is the area allocated across the various submodules? How does the energy of the GCD unit vary based on the input pattern?
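When comparing several designs or input patterns, it can help to collect the final summaries programmatically. The following is a minimal Python sketch, not part of the course flow: `parse_summary` and `energy_pj` are hypothetical helpers, the power value is made up for illustration, and the formula assumes that energy is simply average power multiplied by cycle time and cycle count.

```python
def parse_summary(text: str) -> dict:
    """Parse 'key = value # unit' lines; numeric values become floats."""
    result = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing unit comment
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        try:
            result[key] = float(value)
        except ValueError:
            result[key] = value  # keep non-numeric fields (e.g., vsrc) as text
    return result


def energy_pj(power_mw: float, cycle_time_ns: float, exec_time_cycles: float) -> float:
    """Energy in pJ, assuming energy = average power x cycle time x cycles.

    The units work out directly because mW x ns = pJ.
    """
    return power_mw * cycle_time_ns * exec_time_cycles


summary = parse_summary("""
area = 5072 # um^2
cycle_time = 0.99 # ns
exec_time = 104 # cycles
power = 6.0 # mW (hypothetical value for illustration)
""")

energy = energy_pj(summary["power"], summary["cycle_time"], summary["exec_time"])
print(f"energy = {energy:.2f} pJ")
```

Running the same helpers over the summaries for different inputs makes comparisons such as the 275pJ vs 653pJ result above (a ratio of roughly 0.42) easy to tabulate.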

EECS 151/251A ASIC Lab 6: Power and Timing Verification

EECS 151/251A ASIC Lab 6: Power and Timing Verification EECS 151/251A ASIC Lab 6: Power and Timing Verification Written by Nathan Narevsky (2014,2017) and Brian Zimmer (2014) Modified by John Wright (2015,2016), Ali Moin (2017) and Taehwan Kim (2018) Overview

More information

Part B. Dengxue Yan Washington University in St. Louis

Part B. Dengxue Yan Washington University in St. Louis Tools Tutorials Part B Dengxue Yan Washington University in St. Louis Tools mainly used in this class Synopsys VCS Simulation Synopsys Design Compiler Generate gate-level netlist Cadence Encounter placing

More information

EECS 151/251A ASIC Lab 7: SRAM Integration

EECS 151/251A ASIC Lab 7: SRAM Integration EECS 151/251A ASIC Lab 7: SRAM Integration Written by Nathan Narevsky (2014,2017) and Brian Zimmer (2014) Modified by John Wright (2015,2016) and Taehwan Kim (2018) Overview In this lab, we will go over

More information

RTL Synthesis using Design Compiler. Dr Basel Halak

RTL Synthesis using Design Compiler. Dr Basel Halak RTL Synthesis using Design Compiler Dr Basel Halak Learning Outcomes: After completing this unit, you should be able to: 1. Set up the DC RTL Synthesis Software and run synthesis tasks 2. Synthesize a

More information

ECE 5745 Complex Digital ASIC Design, Spring 2017 Lab 2: Sorting Accelerator

ECE 5745 Complex Digital ASIC Design, Spring 2017 Lab 2: Sorting Accelerator School of Electrical and Computer Engineering Cornell University revision: 2017-03-16-23-56 In this lab, you will explore a medium-grain hardware accelerator for sorting an array of integer values of unknown

More information

CS/EE 6710 Digital VLSI Design Tutorial on Cadence to Synopsys Interface (CSI)

CS/EE 6710 Digital VLSI Design Tutorial on Cadence to Synopsys Interface (CSI) CS/EE 6710 Digital VLSI Design Tutorial on Cadence to Synopsys Interface (CSI) This tutorial walks you through the Cadence to Synopsys Interface (CSI). This interface lets you take a schematic from composer

More information

Design Space Exploration: Implementing a Convolution Filter

Design Space Exploration: Implementing a Convolution Filter Design Space Exploration: Implementing a Convolution Filter CS250 Laboratory 3 (Version 101012) Written by Rimas Avizienis (2012) Overview This goal of this assignment is to give you some experience doing

More information

EE4415 Integrated Digital Design Project Report. Name: Phang Swee King Matric Number: U066584J

EE4415 Integrated Digital Design Project Report. Name: Phang Swee King Matric Number: U066584J EE4415 Integrated Digital Design Project Report Name: Phang Swee King Matric Number: U066584J April 10, 2010 Contents 1 Lab Unit 1 2 2 Lab Unit 2 3 3 Lab Unit 3 6 4 Lab Unit 4 8 5 Lab Unit 5 9 6 Lab Unit

More information

Tutorial 2.(b) : Synthesizing your design using the Synopsys Design Compiler ( For DFT Flow)

Tutorial 2.(b) : Synthesizing your design using the Synopsys Design Compiler ( For DFT Flow) Tutorial 2.(b) : Synthesizing your design using the Synopsys Design Compiler ( For DFT Flow) Objectives: In this tutorial you will learrn to use Synopsys Design Compiler (DC) to perform hardware synthesis

More information

ECE 4750 Computer Architecture, Fall 2017 Lab 1: Iterative Integer Multiplier

ECE 4750 Computer Architecture, Fall 2017 Lab 1: Iterative Integer Multiplier School of Electrical and Computer Engineering Cornell University revision: 2017-08-31-12-21 The first lab assignment is a warmup lab where you will design two implementations of an integer iterative multiplier:

More information

Pipelined MIPS CPU Synthesis and On-Die Representation ECE472 Joseph Crop Stewart Myers

Pipelined MIPS CPU Synthesis and On-Die Representation ECE472 Joseph Crop Stewart Myers Pipelined MIPS CPU Synthesis and On-Die Representation ECE472 Joseph Crop Stewart Myers 2008 Table of Contents Introduction... 3 Steps Taken and Simulation... 3 Pitfalls... 8 Simulated Delay... 9 APPENDIX

More information

Tutorial for Verilog Synthesis Lab (Part 2)

Tutorial for Verilog Synthesis Lab (Part 2) Tutorial for Verilog Synthesis Lab (Part 2) Before you synthesize your code, you must absolutely make sure that your verilog code is working properly. You will waste your time if you synthesize a wrong

More information

Laboratory 5. - Using Design Compiler for Synthesis. By Mulong Li, 2013

Laboratory 5. - Using Design Compiler for Synthesis. By Mulong Li, 2013 CME 342 (VLSI Circuit Design) Laboratory 5 - Using Design Compiler for Synthesis By Mulong Li, 2013 Reference: http://www.tkt.cs.tut.fi/tools/public/tutorials/synopsys/design_compiler/gsdc.html Background

More information

A. Setting Up the Environment a. ~/ece394 % mkdir synopsys b.

A. Setting Up the Environment a. ~/ece394 % mkdir synopsys b. ECE 394 ASIC & FPGA Design Synopsys Design Compiler and Design Analyzer Tutorial A. Setting Up the Environment a. Create a new folder (i.e. synopsys) under your ece394 directory ~/ece394 % mkdir synopsys

More information

Introduction to STA using PT

Introduction to STA using PT Introduction to STA using PT Learning Objectives Given the design, library and script files, your task will be to successfully perform STA using the PrimeTime GUI and generate reports. After completing

More information

Lecture 11 Logic Synthesis, Part 2

Lecture 11 Logic Synthesis, Part 2 Lecture 11 Logic Synthesis, Part 2 Xuan Silvia Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese461/ Write Synthesizable Code Use meaningful names for signals and variables

More information

Adding SRAMs to Your Accelerator

Adding SRAMs to Your Accelerator Adding SRAMs to Your Accelerator CS250 Laboratory 3 (Version 100913) Written by Colin Schmidt Adpated from Ben Keller Overview In this lab, you will use the CAD tools and jackhammer to explore tradeoffs

More information

Hardware Verification Group. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada. CAD Tool Tutorial.

Hardware Verification Group. Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada. CAD Tool Tutorial. Digital Logic Synthesis and Equivalence Checking Tools Hardware Verification Group Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada CAD Tool Tutorial May, 2010

More information

ECE425: Introduction to VLSI System Design Machine Problem 3 Due: 11:59pm Friday, Dec. 15 th 2017

ECE425: Introduction to VLSI System Design Machine Problem 3 Due: 11:59pm Friday, Dec. 15 th 2017 ECE425: Introduction to VLSI System Design Machine Problem 3 Due: 11:59pm Friday, Dec. 15 th 2017 In this MP, you will use automated tools to synthesize the controller module from your MP2 project into

More information

Bits and Pieces of CS250 s Toolflow

Bits and Pieces of CS250 s Toolflow Bits and Pieces of CS250 s Toolflow CS250 Tutorial 2 (Version 092509a) September 25, 2009 Yunsup Lee In this tutorial you will learn what each VLSI tools used in class are meant to do, how they flow, file

More information

ECE 4514 Digital Design II. Spring Lecture 20: Timing Analysis and Timed Simulation

ECE 4514 Digital Design II. Spring Lecture 20: Timing Analysis and Timed Simulation ECE 4514 Digital Design II Lecture 20: Timing Analysis and Timed Simulation A Tools/Methods Lecture Topics Static and Dynamic Timing Analysis Static Timing Analysis Delay Model Path Delay False Paths Timing

More information

Bits and Pieces of CS250 s Toolflow

Bits and Pieces of CS250 s Toolflow Bits and Pieces of CS250 s Toolflow CS250 Tutorial 2 (Version 091210a) September 12, 2010 Yunsup Lee In this tutorial you will learn what each VLSI tools used in class are meant to do, how they flow, file

More information

Getting a Quick Start 2

Getting a Quick Start 2 2 Getting a Quick Start 2 This chapter walks you through the basic synthesis design flow (shown in Figure 2-1). You use the same basic flow for both design exploration and design implementation. The following

More information

Building your First Image Processing ASIC

Building your First Image Processing ASIC Building your First Image Processing ASIC CS250 Laboratory 2 (Version 092312) Written by Rimas Avizienis (2012) Overview The goal of this assignment is to give you some experience implementing an image

More information

EECS 151/251A ASIC Lab 2: Simulation

EECS 151/251A ASIC Lab 2: Simulation EECS 151/251A ASIC Lab 2: Simulation Written by Nathan Narevsky (2014, 2017) and Brian Zimmer (2014) Modified by John Wright (2015,2016) and Taehwan Kim (2018) Overview In lecture, you have learned how

More information

University of California, Davis Department of Electrical and Computer Engineering. EEC180B DIGITAL SYSTEMS Spring Quarter 2018

University of California, Davis Department of Electrical and Computer Engineering. EEC180B DIGITAL SYSTEMS Spring Quarter 2018 University of California, Davis Department of Electrical and Computer Engineering EEC180B DIGITAL SYSTEMS Spring Quarter 2018 LAB 2: FPGA Synthesis and Combinational Logic Design Objective: This lab covers

More information

EE 5327 VLSI Design Laboratory Lab 8 (1 week) Formal Verification

EE 5327 VLSI Design Laboratory Lab 8 (1 week) Formal Verification EE 5327 VLSI Design Laboratory Lab 8 (1 week) Formal Verification PURPOSE: To use Formality and its formal techniques to prove or disprove the functional equivalence of two designs. Formality can be used

More information

Recommended Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto

Recommended Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto Recommed Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto DISCLAIMER: The information contained in this document does NOT contain

More information

Place & Route Tutorial #1

Place & Route Tutorial #1 Place & Route Tutorial #1 In this tutorial you will use Synopsys IC Compiler (ICC) to place, route, and analyze the timing and wirelength of two simple designs. This tutorial assumes that you have worked

More information

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013

Laboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013 CME 342 (VLSI Circuit Design) Laboratory 6 - Using Encounter for Automatic Place and Route By Mulong Li, 2013 Reference: Digital VLSI Chip Design with Cadence and Synopsys CAD Tools, Erik Brunvand Background

More information

EECS 151/251A ASIC Lab 3: Logic Synthesis

EECS 151/251A ASIC Lab 3: Logic Synthesis EECS 151/251A ASIC Lab 3: Logic Synthesis Written by Nathan Narevsky (2014, 2017) and Brian Zimmer (2014) Modified by John Wright (2015,2016) and Taehwan Kim (2018) Overview For this lab, you will learn

More information

ECE 5745 PyMTL CL Modeling

ECE 5745 PyMTL CL Modeling README.md - Grip ECE 5745 PyMTL CL Modeling Most students are quite familiar with functional-level (FL) and register-transfer-level (RTL) modeling from ECE 4750, but students are often less familiar with

More information

CPE/EE 427, CPE 527, VLSI Design I: Tutorial #4, Standard cell design flow (from verilog to layout, 8-bit accumulator)

CPE/EE 427, CPE 527, VLSI Design I: Tutorial #4, Standard cell design flow (from verilog to layout, 8-bit accumulator) CPE/EE 427, CPE 527, VLSI Design I: Tutorial #4, Standard cell design flow (from verilog to layout, 8-bit accumulator) Joel Wilder, Aleksandar Milenkovic, ECE Dept., The University of Alabama in Huntsville

More information

Pushing SRAM Blocks through CS250 s Toolflow

Pushing SRAM Blocks through CS250 s Toolflow Pushing SRAM Blocks through CS250 s Toolflow CS250 Tutorial 8 (Version 093009a) September 30, 2009 Yunsup Lee In this tutorial you will gain experience pushing SRAM blocks through the toolflow. You will

More information

PlanAhead Software Tutorial

PlanAhead Software Tutorial PlanAhead Software Tutorial RTL Design and IP Generation The information disclosed to you hereunder (the Information ) is provided AS-IS with no warranty of any kind, express or implied. Xilinx does not

More information

ECE 551 Design Vision Tutorial

ECE 551 Design Vision Tutorial ECE 551 Design Vision Tutorial ECE 551 Staff Dept of Electrical & Computer Engineering, UW-Madison Lesson 0 Tutorial Setup... 2 Lesson 1 Code Input (Analyze and Elaborate)... 4 Lesson 2 - Simple Synthesis...

More information

SHA3: Pipelining and interfaces with Exploration

SHA3: Pipelining and interfaces with Exploration SHA3: Pipelining and interfaces with Exploration Overview CS250 Laboratory 2 (Version 091014) Written by Colin Scmidt Portions based on previous work by Yunsup Lee Updated by Brian Zimmer, Rimas Avizienis,

More information

Altera Quartus II Synopsys Design Vision Tutorial

Altera Quartus II Synopsys Design Vision Tutorial Altera Quartus II Synopsys Design Vision Tutorial Part III ECE 465 (Digital Systems Design) ECE Department, UIC Instructor: Prof. Shantanu Dutt Prepared by: Xiuyan Zhang, Ouwen Shi In tutorial part II,

More information

and 32 bit for 32 bit. If you don t pay attention to this, there will be unexpected behavior in the ISE software and thing may not work properly!

and 32 bit for 32 bit. If you don t pay attention to this, there will be unexpected behavior in the ISE software and thing may not work properly! This tutorial will show you how to: Part I: Set up a new project in ISE 14.7 Part II: Implement a function using Schematics Part III: Simulate the schematic circuit using ISim Part IV: Constraint, Synthesize,

More information

Partitioning for Better Synthesis Results

Partitioning for Better Synthesis Results 3 Partitioning for Better Synthesis Results Learning Objectives After completing this lab, you should be able to: Use the group and ungroup commands to repartition a design within Design Analyzer Analyze

More information

SHA3: Pipelining and interfaces with Exploration

SHA3: Pipelining and interfaces with Exploration SHA3: Pipelining and interfaces with Exploration Overview CS250 Laboratory 2 (Version 020416) Written by Colin Scmidt Modified by Christopher Yarp Portions based on previous work by Yunsup Lee Updated

More information

Performing STA. Learning Objectives

Performing STA. Learning Objectives Performing STA Learning Objectives UNIT 45 minutes Unit 8 You are provided with a design netlist that does not meet timing. You are also provided with another set of sub blocks that were improved for timing

More information

Synthesis and APR Tools Tutorial

Synthesis and APR Tools Tutorial Synthesis and APR Tools Tutorial (Last updated: Oct. 26, 2008) Introduction This tutorial will get you familiarized with the design flow of synthesizing and place and routing a Verilog module. All the

More information

ECE 5745 PARCv2 Accelerator Tutorial

ECE 5745 PARCv2 Accelerator Tutorial README.md - Grip ECE 5745 PARCv2 Accelerator Tutorial The infrastructure for the ECE 5745 lab assignments and projects has support for implementing medium-grain accelerators. Fine-grain accelerators are

More information

EE 330 Laboratory Experiment Number 11

EE 330 Laboratory Experiment Number 11 EE 330 Laboratory Experiment Number 11 Design and Simulation of Digital Circuits using Hardware Description Languages Fall 2017 Contents Purpose:... 3 Background... 3 Part 1: Inverter... 4 1.1 Simulating

More information

Logic Synthesis. Logic Synthesis. Gate-Level Optimization. Logic Synthesis Flow. Logic Synthesis. = Translation+ Optimization+ Mapping

Logic Synthesis. Logic Synthesis. Gate-Level Optimization. Logic Synthesis Flow. Logic Synthesis. = Translation+ Optimization+ Mapping Logic Synthesis Logic Synthesis = Translation+ Optimization+ Mapping Logic Synthesis 2 Gate-Level Optimization Logic Synthesis Flow 3 4 Design Compiler Procedure Logic Synthesis Input/Output 5 6 Design

More information

1 Design Process HOME CONTENTS INDEX. For further assistance, or call your local support center

1 Design Process HOME CONTENTS INDEX. For further assistance,  or call your local support center 1 Design Process VHDL Compiler, a member of the Synopsys HDL Compiler family, translates and optimizes a VHDL description to an internal gate-level equivalent. This representation is then compiled with

More information

CPE/EE 427, CPE 527, VLSI Design I: Tutorial #2, Schematic Capture, DC Analysis, Transient Analysis (Inverter, NAND2)

CPE/EE 427, CPE 527, VLSI Design I: Tutorial #2, Schematic Capture, DC Analysis, Transient Analysis (Inverter, NAND2) CPE/EE 427, CPE 527, VLSI Design I: Tutorial #2, Schematic Capture, DC Analysis, Transient Analysis (Inverter, NAND2) Joel Wilder, Aleksandar Milenkovic, ECE Dept., The University of Alabama in Huntsville

More information

CS250 DISCUSSION #2. Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller

CS250 DISCUSSION #2. Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller CS250 DISCUSSION #2 Colin Schmidt 9/18/2014 Std. Cell Slides adapted from Ben Keller LAST TIME... Overview of course structure Class tools/unix basics THIS TIME... Synthesis report overview for Lab 2 Lab

More information

TOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis

TOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis TOPIC : Verilog Synthesis examples Module 4.3 : Verilog synthesis Example : 4-bit magnitude comptarator Discuss synthesis of a 4-bit magnitude comparator to understand each step in the synthesis flow.

More information

Image Courtesy CS250 Section 3. Yunsup Lee 9/11/09

Image Courtesy  CS250 Section 3. Yunsup Lee 9/11/09 CS250 Section 3 Image Courtesy www.ibm.com Yunsup Lee 9/11/09 Announcements Lab 2: Write and Synthesize a Two-Stage SMIPSv2 Processor is out Lab 2 due on September 24th (Thursday) before class Four late

More information

Tutorial for Cadence SOC Encounter Place & Route

Tutorial for Cadence SOC Encounter Place & Route Tutorial for Cadence SOC Encounter Place & Route For Encounter RTL-to-GDSII System 13.15 T. Manikas, Southern Methodist University, 3/9/15 Contents 1 Preliminary Setup... 1 1.1 Helpful Hints... 1 2 Starting

More information

Tutorial: GNU Radio Companion

Tutorial: GNU Radio Companion Tutorials» Guided Tutorials» Previous: Introduction Next: Programming GNU Radio in Python Tutorial: GNU Radio Companion Objectives Create flowgraphs using the standard block libraries Learn how to debug

More information

EE115C Digital Electronic Circuits. Tutorial 2: Hierarchical Schematic and Simulation

EE115C Digital Electronic Circuits. Tutorial 2: Hierarchical Schematic and Simulation EE115C Digital Electronic Circuits Tutorial 2: Hierarchical Schematic and Simulation The objectives are to become familiar with Virtuoso schematic editor, learn how to create the symbol view of basic primitives,

More information

EE 330 Laboratory Experiment Number 11 Design, Simulation and Layout of Digital Circuits using Hardware Description Languages

EE 330 Laboratory Experiment Number 11 Design, Simulation and Layout of Digital Circuits using Hardware Description Languages EE 330 Laboratory Experiment Number 11 Design, Simulation and Layout of Digital Circuits using Hardware Description Languages Purpose: The purpose of this experiment is to develop methods for using Hardware

More information

EE 330 Spring Laboratory 2: Basic Boolean Circuits

EE 330 Spring Laboratory 2: Basic Boolean Circuits EE 330 Spring 2013 Laboratory 2: Basic Boolean Circuits Objective: The objective of this experiment is to investigate methods for evaluating the performance of Boolean circuits. Emphasis will be placed

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

Physical Placement with Cadence SoCEncounter 7.1

Physical Placement with Cadence SoCEncounter 7.1 Physical Placement with Cadence SoCEncounter 7.1 Joachim Rodrigues Department of Electrical and Information Technology Lund University Lund, Sweden November 2008 Address for correspondence: Joachim Rodrigues

More information

PrimeTime: Introduction to Static Timing Analysis Workshop

PrimeTime: Introduction to Static Timing Analysis Workshop i-1 PrimeTime: Introduction to Static Timing Analysis Workshop Synopsys Customer Education Services 2002 Synopsys, Inc. All Rights Reserved PrimeTime: Introduction to Static 34000-000-S16 Timing Analysis

More information

EE 101 Lab 5 Fast Adders

EE 101 Lab 5 Fast Adders EE 0 Lab 5 Fast Adders Introduction In this lab you will compare the performance of a 6-bit ripple-carry adder (RCA) with a 6-bit carry-lookahead adder (CLA). The 6-bit CLA will be implemented hierarchically

More information

EE-382M VLSI II. Early Design Planning: Front End

EE-382M VLSI II. Early Design Planning: Front End EE-382M VLSI II Early Design Planning: Front End Mark McDermott EE 382M-8 VLSI-2 Page Foil # 1 1 EDP Objectives Get designers thinking about physical implementation while doing the architecture design.

More information

An easy to read reference is:

An easy to read reference is: 1. Synopsis: Timing Analysis and Timing Constraints The objective of this lab is to make you familiar with two critical reports produced by the Xilinx ISE during your design synthesis and implementation.

More information

SystemC-to-Layout ASIC Flow Walkthrough

SystemC-to-Layout ASIC Flow Walkthrough SystemC-to-Layout ASIC Flow Walkthrough 20.6.2015 Running the Demo You can execute the flow automatically by executing the csh shell script: csh run_asic_demo.csh The script runs all tools in a sequence.

More information

18. Synopsys Formality Support

18. Synopsys Formality Support 18. Synopsys Formality Support QII53015-7.2.0 Introduction Formal verification of FPGA designs is gaining momentum as multi-million System-on-a-Chip (SoC) designs are targeted at FPGAs. Use the Formality

More information

Setup file.synopsys_dc.setup

Setup file.synopsys_dc.setup Setup file.synopsys_dc.setup The.synopsys_dc.setup file is the setup file for Synopsys' Design Compiler. Setup file is used for initializing design parameters and variables, declare design libraries, and

More information

SmartTime for Libero SoC v11.5

SmartTime for Libero SoC v11.5 SmartTime for Libero SoC v11.5 User s Guide NOTE: PDF files are intended to be viewed on the printed page; links and cross-references in this PDF file may point to external files and generate an error

More information

CSE P567 - Winter 2010 Lab 1 Introduction to FGPA CAD Tools

CSE P567 - Winter 2010 Lab 1 Introduction to FGPA CAD Tools CSE P567 - Winter 2010 Lab 1 Introduction to FGPA CAD Tools This is a tutorial introduction to the process of designing circuits using a set of modern design tools. While the tools we will be using (Altera

More information

Synthesis. Other key files. Standard cell (NAND, NOR, Flip-Flop, etc.) FPGA CLB

Synthesis. Other key files. Standard cell (NAND, NOR, Flip-Flop, etc.) FPGA CLB SYNTHESIS Synthesis Involves synthesizing a gate netlist from verilog source code We use Design Compiler (DC) by Synopsys which is the most popular synthesis tool used in industry Target library examples:

More information

HOW TO SYNTHESIZE VERILOG CODE USING RTL COMPILER
