Mixed Signal Verification of an FPGA-Embedded DDR3 SDRAM Memory Controller using ADMS

Similar documents
ZBT SRAM Controller Reference Design

DDR and DDR2 SDRAM Controller Compiler User Guide

Introduction. Design Hierarchy. FPGA Compiler II BLIS & the Quartus II LogicLock Design Flow

RLDRAM II Controller MegaCore Function

DDR & DDR2 SDRAM Controller Compiler

DDR & DDR2 SDRAM Controller Compiler

Practical Hardware Debugging: Quick Notes On How to Simulate Altera s Nios II Multiprocessor Systems Using Mentor Graphics ModelSim

DDR & DDR2 SDRAM Controller Compiler

FPGAs Provide Reconfigurable DSP Solutions

Stratix vs. Virtex-II Pro FPGA Performance Analysis

Simulating the PCI MegaCore Function Behavioral Models

Stratix FPGA Family. Table 1 shows these issues and which Stratix devices each issue affects. Table 1. Stratix Family Issues (Part 1 of 2)

Design Tools for 100,000 Gate Programmable Logic Devices

Enhanced Configuration Devices

DDR & DDR2 SDRAM Controller

White Paper The Need for a High-Bandwidth Memory Architecture in Programmable Logic Devices

Signal Integrity Comparisons Between Stratix II and Virtex-4 FPGAs

DDR & DDR2 SDRAM Controller

FPGA Design Security Solution Using MAX II Devices

Implementing LVDS in Cyclone Devices

System Debugging Tools Overview

White Paper. Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core. Introduction. Parameters & Ports

Logic Optimization Techniques for Multiplexers

7. External Memory Interfaces in Stratix IV Devices

Using MAX II & MAX 3000A Devices as a Microcontroller I/O Expander

Design Guidelines for Optimal Results in High-Density FPGAs

Simulating the PCI MegaCore Function Behavioral Models

SERDES Transmitter/Receiver (ALTLVDS) Megafunction User Guide

White Paper Configuring the MicroBlaster Passive Serial Software Driver

External Memory Interfaces Intel Arria 10 FPGA IP Design Example User Guide

White Paper Low-Cost FPGA Solution for PCI Express Implementation

AN 549: Managing Designs with Multiple FPGAs

CORDIC Reference Design. Introduction. Background

Interfacing RLDRAM II with Stratix II, Stratix,& Stratix GX Devices

Table 1 shows the issues that affect the FIR Compiler v7.1.

Error Correction Code (ALTECC_ENCODER and ALTECC_DECODER) Megafunctions User Guide

Intel Cyclone 10 External Memory Interfaces IP Design Example User Guide

Enhanced Configuration Devices

Stratix II FPGA Family

FPGA Co-Processing Architectures for Video Compression

White Paper Performing Equivalent Timing Analysis Between Altera Classic Timing Analyzer and Xilinx Trace

Arria II GX FPGA Development Board

ALTDQ_DQS2 Megafunction User Guide

Implementing Bus LVDS Interface in Cyclone III, Stratix III, and Stratix IV Devices

Intel Stratix 10 External Memory Interfaces IP Design Example User Guide

Cyclone II FPGA Family

Power Optimization in FPGA Designs

POS-PHY Level 4 MegaCore Function

9. SEU Mitigation in Cyclone IV Devices

Increasing Productivity with Altera Quartus II to I/O Designer/DxDesigner Interface

Intel Stratix 10 H-tile Hard IP for Ethernet Design Example User Guide

Simulating Nios II Embedded Processor Designs

High-Performance Memory Interfaces Made Easy

ByteBlaster II Parallel Port Download Cable

Matrices in MAX II & MAX 3000A Devices

Table 1 shows the issues that affect the FIR Compiler, v6.1. Table 1. FIR Compiler, v6.1 Issues.

AN 610: Implementing Deterministic Latency for CPRI and OBSAI Protocols in Altera Devices

White Paper AHB to Avalon & Avalon to AHB Bridges

Simple Excalibur System

DSP Builder. DSP Builder v6.1 Issues. Error When Directory Pathname is a Network UNC Path

FPGA Power Management and Modeling Techniques

Simulating the ASMI Block in Your Design

UTOPIA Level 2 Slave MegaCore Function

Exercise 1 In this exercise you will review the DSSS modem design using the Quartus II software.

DSP Development Kit, Stratix II Edition

Using MAX 3000A Devices as a Microcontroller I/O Expander

White Paper Compromises of Using a 10-Gbps Transceiver at Other Data Rates

9. Functional Description Example Designs

Designing RGMII Interface with FPGA and HardCopy Devices

Using the SDRAM on Altera s DE1 Board with Verilog Designs. 1 Introduction. For Quartus II 13.0

Interfacing Cyclone III Devices with 3.3/3.0/2.5-V LVTTL/LVCMOS I/O Systems

QDR II SRAM Board Design Guidelines

Simulating Visual IP Models with the NC-Verilog, Verilog-XL, VCS, or ModelSim (UNIX) Simulators

7. External Memory Interfaces in Arria II Devices

Stratix II vs. Virtex-4 Performance Comparison

Introduction to the Altera SOPC Builder Using Verilog Design

Using Flexible-LVDS I/O Pins in

Using the Serial FlashLoader With the Quartus II Software

Legacy SDRAM Controller with Avalon Interface

Introduction to Simulation of VHDL Designs Using ModelSim Graphical Waveform Editor. 1 Introduction. For Quartus Prime 16.1

Using the Transceiver Reconfiguration Controller for Dynamic Reconfiguration in Arria V and Cyclone V Devices

Transient Voltage Protection for Stratix GX Devices

Arria II GX FPGA Development Board

Design Verification Using the SignalTap II Embedded

Errata Sheet for Cyclone V Devices

Arria II GX FPGA Development Kit HSMC Loopback Tests Rev 0.1

Video and Image Processing Suite

University Program 3 Kit

Excalibur Solutions Using the Expansion Bus Interface. Introduction. EBI Characteristics

High-Performance FPGA PLL Analysis with TimeQuest

Implementing LED Drivers in MAX Devices

White Paper Assessing FPGA DSP Benchmarks at 40 nm

High-Speed Data Interface for Stratix Devices & Fujitsu MB86064 DACs

Implementing 9.8G CPRI in Arria V GT and ST FPGAs

Creating High-Speed Memory Interfaces with Virtex-II and Virtex-II Pro FPGAs Author: Nagesh Gupta, Maria George

AN 547: Putting the MAX II CPLD in Hibernation Mode to Achieve Zero Standby Current

A Reset Control Apparatus for PLL Power-Up Sequence and Auto-Synchronization

White Paper Using the MAX II altufm Megafunction I 2 C Interface

Implementing LED Drivers in MAX and MAX II Devices. Introduction. Commercial LED Driver Chips

24K FFT for 3GPP LTE RACH Detection

Transcription:

Mixed Signal Verification of an FPGA-Embedded DDR3 SDRAM Memory Controller using ADMS Arch Zaliznyak 1, Malik Kabani 1, John Lam 1, Chong Lee 1, Jay Madiraju 2 1. Altera Corporation 2. Mentor Graphics Corporation Abstract System-level verification of a high-end FPGA with an embedded DDR3 SDRAM external memory controller presents challenges to designers that exceed those encountered in ASIC-like verification flows due to the mixed-signal nature of the design. The Stratix III FPGA with a DDR3 Hard IP is fabricated on a 65nm process. The memory interface is comprised of programmablewidth data groups running at 800-Mbps per data bit. The memory controller consists of custom/analog Hard IP (Read, Write, Command/Address paths and the Clock Management) and the Soft IP (Autocalibration IP and Memory Controller) and with the External Memory device model comprise a mixed-signal system. See Figure 1. SOFT IP HARD IP Traditional verification methodology, while guaranteeing functional aspect of the verification flow, is not capable of verifying the adherence to the system-level timing specifications, nor in accurately predicting system timing budget. This paper describes a strategy devised and successfully utilized for the verification of the embedded DDR3 SDRAM external memory controller in the FPGA. The mixed-signal verification methodology is based on the use of Mentor Graphics ADMS mixed-mode simulator with the Mentor Graphics Fast SPICE engine ADiT enabled. It presents steps taken in the verification flow dealing with the simulation data preparation, the usage of the specific features of ADMS, and the verification process automation. Calibration Auto- Calibration IP + Memory Controller IO REGISTER +IO BUFFER IO REGISTER +IO BUFFER DELAY CHAINS IO REGISTER +IO BUFFER IO REGISTER +IO BUFFER EXTERNAL EXTERNAL MEMORY MEMORY (SDRAM + UDIMM) PLL DLL FPGA Figure 1. Memory Interface System

Motivation for Mixed Signal Verification Traditionally, ASIC design verification flows have always been content with using Digital Simulators to do their system-level verification. The entire system was described in a Hardware description language like and simulated using digital simulators like Mentor Graphics Modelsim. This type of verification is termed functional verification since it mainly targeted verifying the high level functionality of the chip. Hence it only guaranteed the high-level functionality of the chip matched its specifications. With this approach the designers were not verifying if the overall system timing met the specifications. The individual circuit blocks were verified using SPICE/Fast SPICE tools in a block level verification environment. However, for designs mixed-signal in nature this approach is not sufficient for various reasons. Using only purely digital simulations, it is very hard to verify the timing specifications, or accurately predict system timing budget that would be exhibited in silicon-package-board environment. For our DDR3 solution, the system timing budget depends on proper configuration of the dynamically programmed delays which is accomplished by the Auto-calibration IP. The Auto-calibration accuracy depends on accurate modeling of silicon-package-board, preferably in SPICE. The functional robustness of the Auto-calibration IP needs to be checked with the realistic duty-cycle distortions of the data and strobe signals as they travel through the circuits of the Hard IP. Also, the drive strength, termination scheme, data pattern effects, process voltage and temperature (PVT) effects can be investigated with the SPICE models to find the areas critical to the system s bandwidth. To accomplish that, one needs to incorporate transistor-level models into these simulations. However running the entire system at the transistor-level is impossible due to the capacity/speed limitation of current Analog simulators. Hence a mixed-signal simulation flow with the ability to run various test benches with the capability to simulate both and SPICE is required to address the above needs. Also, the flow needs to be scalable with increasing SPICE content in the simulations. Specification RTL DESIGN BLOCK DESIGN unit verification Figure 2. Bottom up Verification Flow Bottom Up Verification Strategy using Advance MS In order to efficiently and exhaustively validate the design, verification flow using Mentor Graphics mixed signal simulation tool Advance MS (ADMS) simulator was adopted. In this flow, each block of the Hard IP has been verified at transistor-level on its own. The system-level validation is targeted to check the adherence to the system-level timing specifications, and to accurately predict system timing budget, taking into account duty-cycle distortions, which will guarantee that the design is functionally and timing-wise correct in the first silicon run. Simulation configuration Concept block abstraction system verification checkerboard verification Here the concept of a Simulation configuration is introduced. Each Simulation configuration runs the same design except that different blocks in the design are at different levels of abstraction, i.e. checkerboard pattern. For example, in configuration # 1 a Write Path design block (see Figure. 3) may be in SPICE, whereas in configuration # 2 the same design block may Bottom-Up Verification

be in and the Read Path design block may be in SPICE, depending on the nature of the test and a particular timing parameter being tested. It is the verification team s job then to come up with a series of simulation configurations to test various system characteristics. One advantage of using such a flow is the ability to use the same set of test benches used in digital simulations, since the simulation is run on the complete design. This alleviates the need to create separate test benches for stand alone portions of the design. Also, now the designer can verify each SPICE block in the context of the full design. Critical system characteristics can be measured by keeping only the required portions of the chip in SPICE and the rest in. Important constraints such as data path timing, critical path delay can be verified with the SPICE implementation of the design blocks rather than HDL models as with digital verification flow. This enables designers to accurately predict the system budget and evaluate the risk accordingly. scalable with the increasing amount of SPICE in these simulations. The solution was to use Mentor Graphics ADMS-ADiT Simulator which integrates the Mentor Graphics Fast SPICE engine ADiT into the ADMS mixed signal simulator. The difference between ADMS and ADMS-ADiT is that the former uses the highly accurate but slow Eldo SPICE engine whereas the latter uses the slightly less accurate but high speed ADiT fast SPICE engine. Incorporating ADMS-ADiT enables running of larger amount of SPICE with reasonable turn around times making the flow scalable and more efficient. Testbench VHDL/ VHDL VHDL SPICE SPICE VHDL VHDL SPICE VHDL SPICE VHDL Description of the DDR3 SDRAM external memory controller Figure 3. Analog inside Digital Centric Flow However, when trying to run these systemlevel simulations using a fairly large amount of SPICE, the simulations take too long to be useful. Compounded with the requirement of running these simulations at multiple process corners, that makes the turn-around time even longer. This renders the flow difficult to use in situations where it is absolutely a must to run such simulations to verify certain system characteristics. Hence using ADVance MS by itself was not System-level test bench contains both the Soft IP RTL modules as well as the mixture of SPICE and netlist representing the custom/analog block, along with the necessary and VHDL-AMS models embedded in both SPICE and netlists. DDR3 memory interface is sourcesynchronous, where a bidirectional data strobe DQS is used for both Read and Write operations. However, handling timing is

DQ DQS Read: Edge-aligned DQS sent by memory Write: Center-aligned DQS expected by memory Figure 4. Data and DQS Relationship in DDR Memories different in each case. The data strobe is edge aligned with the data during a Read operation and center aligned during a Write operation (see Figure 4). One data strobe signal is associated with a number of data bits, usually eight. Within the memory, the data strobe path closely matches the data path; hence the access times with respect to data strobe are significantly better than the access times with respect to clock.if data strobe is used from the memory to capture the data sent by memory, maximum performance is realized. The challenge for the Memory controller is to transfer the data captured in the memory domain to the system domain. Need for Auto-calibration Techniques As applications move to higher frequencies and newer technology generations, memory and board uncertainties are increasing as a percentage of the bit period. See Figure 5. The need to place the strobe in the center of the data valid window is most important, so there is a need to reduce some of these uncertainties and improve the margin on the controller side to achieve higher performance. With Auto-calibration, the internal timing is set up in real time in a real system environment to achieve optimum timing, highest performance, and a robust design. 66 MHz, 66 Mbps Single Data Rate 15.0ns 400 MHz, 800 Mbps Double Data Rate 1.25ns Figure 5. Memory and board uncertainties increase as a percentage of the bit period

Reconfigurable PLL Swept Resynchronization Phase DQ Capture Resync DQS Known Training Pattern Comparator 0 15 30 45 60 315 330 345 360 DQ0 DQ1 DQ2 DQ3 DQ4 DQ5 DQ6 DQ7 Valid Data Window Ideal Resync Phase: Maximum Setup & Hold margin Figure 6. Auto-calibration System Block Diagram The aim of data Auto-calibration is to determine and set up the optimum sampling phase of the capture and resynchronization clock. The capture register shown in Figure 6 captures the data from external memory. The resync register is used to transfer the data from the memory domain to the FPGA system domain. In such a mixed-signal system, with the need to account for Vt, temperature, load and other variations, the simulation approach chosen is a must. To verify the Auto-calibration is working correctly to produce the correct timing, the system-level testbench goes through the Auto-calibration process on the read path, write path on groups of pins and measures various delay parameters. From these measured delays, the Auto-calibration algorithm decides the best timing fit. If this entire system was simulated in using a Digital Simulator it would necessitate the manual backannotating of the various delay parameters measured by the Auto-calibration process which can be time-consuming and errorprone. In order to make sure the timing requirements on the Data Strobe signal are within the specs, it is necessary to simulate the entire Memory subsystem with the address path, data path, strobe, clock, etc in transistor-level implementations.

Top-level Simulation Strategy using ADMS-ADiT Configuration Hard IP Soft IP Read Path Write Path Clock Management Calibration Controller External Memory Model 1 SPICE /VHDL 2 /VHDL SPICE /VHDL 3 SPICE SPICE /VHDL ADMS-ADiT allows the Auto-calibration Soft IP to run in tandem with the transistor-level implementation of the Hard IP blocks, which results in very precise delay parameter measurements. Not only does this result in a more accurate prediction of the system timing budget, also reduces the overhead of having to do manual back-annotation of the delay parameters. The idea is to simulate portions of the hard IP in SPICE and keep the soft IP blocks (Calibration system, Memory controller, etc) in (including the RTL Memory model) and run the system-level test bench in ADMS-ADiT. Then, multiple simulations need to be run with varying process corners, temperature and some varying parameters such as the DQ-DQS trace/dimm skew, etc. Also, since it is imperative to validate all the portions of the Hard IP to establish the required timing, keeping the entire Read/Write/Address paths in SPICE may not result in quick turnaround time. Hence multiple transistorlevel simulations need to be run with different portions of Hard IP in SPICE and the rest in. The table above summarizes the simulations run on the design to validate the system timing. Simulation Data Preparation 1. Test bench and Netlist Preparation The top-level block of the design is the test bench which instantiates the top-level design blocks. The Hard IP is instantiated from this design hierarchy and replaced with SPICE without modifying the netlist using ADMS compilation commands. This allows the original -only flow to be extended with minimal modifications. To make the customization of netlist for each simulation configuration easy, the language feature generate statements is used. Each instance inside the that could potentially be replaced with SPICE is surrounded by the construct generate. endgenerate. Each such Generate statement uses a genvar variable to statically determine if SPICE or is to be used.

Example: The instance c6480 looks like the following in the -only Flow: c6480 x6480b0 (.datovr(datovrb[0]),.din(dinb[0]),.octrt(roct[0]),.oeb(oebb[0]), vccp(vccp),..vss(vss)); Now, to extend to the Mixed-Signal flow so it can potentially be replaced with SPICE, it would now look like generate if (gspice_c6480 = = 1) \\spicelib. c6480a x6480b0 x6480b0 datovr(datovrb[0]),.din(dinb[0]),.octrt(roct[0]),.oeb(oebb[0]), vccp(vccp),..vss(vss)); Else c6480a x6480b0 x6480b0 datovr(datovrb[0]),.din(dinb[0]),.octrt(roct[0]),.oeb(oebb[0]), vccp(vccp),..vss(vss)); endgenerate 2. ADMS Simulation setup ADMS requires all of the modules to be compiled into a Library. All the. Simulation Results modules in the design are compiled into a work library and the SPICE subckts associated with each module that could potentially be replaced with SPICE is compiled into a separate Library. The ADMS command vaspi is used to compile the SPICE subckts. Then, when a certain simulation configuration is run, the appropriate Generate Variable is passed in from the command line to the simulator which uses the appropriate abstraction for the block (SPICE or ) as specified by the value of the variable. This avoids recompilation of modules every time the user wants to run a new configuration, because now only the value of the genvar flag being passed needs to be changed on the Simulation Command line invocation.. In the above example, the variable gspice_c6480 is passed in from the command line with the value 1, so the SPICE implementation for the block c6480 is used. All of the above commands have been automated inside custom scripts written to create customized netlists, appropriate compilation and simulation commands for each configuration CLK PIN T HP T CL T CH DQ PIN T DQSCK T DQSQ DQS PIN T DATA_DELAY DQ INT. T CLOCK_DELAY DQS INT. Figure 7. ADMS-ADiT Simulated Read Timing Margin

Figure 7 shows a typical ADMS waveform snapshot of the Read Path. Here, it is demonstrated the successful internal placement of the DQS strobe relative to the internal data window. Figure 8. Measured Eye Diagram During Write to DIMM Figure 8 shows lab measurements of the Eye during Write to DIMM. Conclusion Verification flow based on ADMS-ADiT Mixed signal FastSPICE simulator applied to a DDR3 Memory controller design is demonstrated. The functional robustness of the Auto-calibration IP was checked with the realistic duty-cycle distortions of the data and strobe signals as they travel through the circuits of the Hard IP. The adherence to the system-level timing specifications was verified, as well as system timing budget was accurately predicted.

101 Innovation Drive San Jose, CA 95134 (408) 544-7000 http://www.altera.com Copyright 2007 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S. and foreign patents and pending applications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.