DDR4: Designing for Power and Performance

Agenda
- Comparison between DDR3 and DDR4
- Designing for power
  - DDR4 power savings
- Designing for performance
  - Creating a data valid window
  - Good layout practices for DDR4
  - Board debug tools to minimize issues
- Looking ahead and conclusion

Comparison Between DDR3 and DDR4

DRAM Technology Comparison
- Voltage: DDR3 1.5 V / 1.35 V; DDR4 1.2 V; GDDR5 1.5 V / 1.35 V
- Strobe: DDR3 bi-directional differential; DDR4 bi-directional differential; GDDR5 free-running differential WRITE clock
- Strobe configuration: DDR3 per byte; DDR4 per byte; GDDR5 per word
- READ data capture: DDR3 strobe based; DDR4 strobe based; GDDR5 clock data recovery
- Data termination: DDR3 VDDQ/2; DDR4 VDDQ; GDDR5 VDDQ
- Address/command termination: DDR3 VDDQ/2; DDR4 VDDQ/2; GDDR5 VDDQ
- Burst length: DDR3 BC4, 8; DDR4 BC4, 8; GDDR5 8
- Bank grouping: DDR3 no; DDR4 4; GDDR5 4
- On-chip error detection: DDR3 no; DDR4 command/address parity and CRC for data bus; GDDR5 CRC for data bus
- Configuration: DDR3 x4, x8, x16; DDR4 x4, x8, x16; GDDR5 x16, x32
- Package: DDR3 78-ball / 96-ball FBGA; DDR4 78-ball / 96-ball FBGA; GDDR5 170-ball FBGA
- Data rate (Mbps/pin): DDR3 800 to 2,133; DDR4 1,600 to 3,200+; GDDR5 4,000 to 7,000
- Component density: DDR3 1 Gb to 8 Gb; DDR4 2 Gb to 16 Gb; GDDR5 512 Mb to 2 Gb
- Stacking options: DDR3 DDP, QDP; DDR4 up to 8H (128-Gb stack), single load; GDDR5 no

DDR4 Power Savings

DDR4 Power Savings Features
- DDR4 voltage is 1.2 V (up to 40% savings)
  - Lower voltage than DDR3 (1.5 V)
  - On-die VREF
  - Pseudo-open drain I/Os
- Manages refreshes (up to 20% savings)
  - Based on temperature
  - New DDR4 low-power auto self-refresh (LPASR) capability
    - Changes refresh rate based on temperature
  - Only refreshes the parts of the array that are in use
    - Controller must allow fine-granularity refresh based on memory utilization
- Supports data bus inversion
  - Limits the number of signals transitioning, reducing simultaneous switching output (SSO) and saving power
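The data bus inversion bullet can be illustrated with a minimal Python sketch. It assumes the DC-style rule usually described for DDR4 pseudo-open-drain I/O, where a byte is sent inverted whenever more than four of its eight bits would otherwise be driven low. The function names `dbi_encode` and `dbi_decode` are hypothetical and do not come from the slides or from any controller API.

```python
def dbi_encode(byte):
    """Return (dbi_flag, byte_on_bus) for one 8-bit data lane.

    Assumption: invert when more than four bits would be driven low (0),
    so that at most four DQ pins per byte are low at once.
    """
    byte &= 0xFF
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return True, (~byte) & 0xFF
    return False, byte


def dbi_decode(dbi_flag, byte_on_bus):
    """Undo the inversion at the receiver using the DBI flag."""
    byte_on_bus &= 0xFF
    return (~byte_on_bus) & 0xFF if dbi_flag else byte_on_bus


if __name__ == "__main__":
    for value in (0x00, 0x0F, 0x13, 0xFF):
        flag, bus = dbi_encode(value)
        print(f"{value:#04x} -> dbi={flag} bus={bus:#04x}")
```

With this rule, no byte on the bus ever carries more than four low bits, which is where the termination-current saving would come from.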

Creating a Data Valid Window

Timing Margins Are Shrinking
Timing margins in picoseconds, from DDR1 at 400 Mbps to DDR4 at 3,200 Mbps:
- DDR1: data valid window 2,500; DRAM margin 900; package/board margin 800; chip margin 800
- DDR2: data valid window 938; DRAM margin 425; package/board margin 256; chip margin 256
- DDR3: data valid window 469; DRAM margin 188; package/board margin 140; chip margin 140
- DDR4: data valid window 313; DRAM margin 125; package/board margin 93; chip margin 93
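The data valid window figures above follow directly from the per-pin data rate: one bit time is the reciprocal of the rate. A minimal sketch, with hypothetical function names, that reproduces the two endpoints quoted on the slide and the DDR1 budget split:

```python
def unit_interval_ps(data_rate_mbps):
    """Bit time (unit interval) in picoseconds for a per-pin rate in Mbps."""
    return 1e6 / data_rate_mbps


def budget_total_ps(dram_ps, board_ps, chip_ps):
    """Sum of the three margin allocations that share the data valid window."""
    return dram_ps + board_ps + chip_ps


if __name__ == "__main__":
    print(unit_interval_ps(400))    # DDR1 at 400 Mbps
    print(unit_interval_ps(3200))   # DDR4 at 3,200 Mbps
    print(budget_total_ps(900, 800, 800))
```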

Shrinking the Window Even More: DDR4 VREF Training (1/2)
- DDR4 VREF training
  - Training: sweep the VREF setting, find the maximum passing window
  - Lump sum of DCD, RX offset, etc.
  - Resolution error is the combination of (VREF, PI, or delay chain)
- Margin loss calculation
  - VREF step size: from 0.5% VDDQ to 0.8% VDDQ
  - VREF set tolerance: 1.625% or 0.15%
  - Calibration error: 1 step size
    - 0.8% * VDDQ = 0.8% * 1.2 V = 9.6 mV
  - Margin loss (due to VREF calibration error)
    - 9.6 mV * 2 / slew_rate = 4.8 ps (assume slew rate = 4 V/ns)
  - Calibration error = half step size
VREF parameters (units of VDDQ):
- VREF step size (Vref step): min 0.50%, typ 0.65%, max 0.80% (note 2)
- VREF set tolerance (Vref_set_tol): min -1.625%, typ 0.00%, max 1.625% (notes 3, 4, 6); or min -0.15%, typ 0.00%, max 0.15% (notes 3, 5, 7)
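The margin loss arithmetic on this slide can be checked with a short sketch. The function name and parameter names are hypothetical; the numbers (0.8% step, 1.2 V VDDQ, 4 V/ns slew rate) are the slide's own assumptions.

```python
def vref_margin_loss_ps(step_fraction, vddq_v, slew_v_per_ns):
    """Timing margin lost to a VREF calibration error of one step.

    The voltage error is converted to time by dividing by the edge slew
    rate; the factor of 2 follows the slide's formula.
    """
    error_v = step_fraction * vddq_v          # e.g. 0.008 * 1.2 V = 9.6 mV
    loss_ns = 2 * error_v / slew_v_per_ns     # volts / (volts per ns) = ns
    return loss_ns * 1000.0                   # ns -> ps


if __name__ == "__main__":
    print(vref_margin_loss_ps(0.008, 1.2, 4.0))
```

A slower edge makes the same voltage error cost more time, which is why the slew-rate assumption matters for this budget line.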

Shrinking the Window Even More: DDR4 VREF Training (2/2)
Discussion with JEDEC members:
- DDR4 specification section 13.4: any DRAM component-level variation must be accounted for within the DRAM RX mask.
  - This means that the VREF calibration error is included in VdIVW_total.
- VREF_DQ internally aligns to VCENT_DQs with training.
  - VCENT_DQs has variation.
  - VREF_DQ training error should increase with this variation, internal voltage noise, etc.

Shrinking the Window Even More: Duty Cycle Error
- DDR4 specification is +/-2% tCK = +/-0.04 UI
- IPD current budget: +/-3% tCK
- Margin loss is 4% tCK (+/-2% on DQS and +/-2% on DQ)
  - With proper link timing calibration: 2% tCK margin loss
- Assume the same for read
Timing Parameters by Speed Bin for DDR4-2400 to DDR4-3200 (clock timing; the same MIN/MAX values apply to DDR4-2400, DDR4-2666, and DDR4-3200):
- Minimum clock cycle time, DLL off mode, tCK (DLL_OFF): MIN 8, MAX -, units ns, note 22
- Average clock period, tCK (avg): TBD, units ps
- Average high pulse width, tCH (avg): MIN 0.48, MAX 0.52, units tCK (avg)
- Average low pulse width, tCL (avg): MIN 0.48, MAX 0.52, units tCK (avg)
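To put the duty cycle percentages in absolute terms, the sketch below converts a fraction of tCK into picoseconds, assuming two bits per clock per pin (double data rate). The function names are hypothetical, and the 3,200 Mbps rate is chosen only as an example from the speed bins named on the slide.

```python
def tck_ps(data_rate_mbps):
    """Clock period in picoseconds; DDR transfers two bits per clock."""
    return 2e6 / data_rate_mbps


def duty_cycle_loss_ps(data_rate_mbps, fraction_of_tck):
    """Margin loss in picoseconds for a loss given as a fraction of tCK."""
    return fraction_of_tck * tck_ps(data_rate_mbps)


if __name__ == "__main__":
    rate = 3200
    print(tck_ps(rate))                    # clock period
    print(duty_cycle_loss_ps(rate, 0.04))  # uncalibrated: 4% tCK
    print(duty_cycle_loss_ps(rate, 0.02))  # with link timing calibration: 2% tCK
```

Because one UI is half of tCK, 2% of tCK is the same as 4% of a UI, matching the slide's "+/-2% tCK = +/-0.04 UI".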

Shrinking the Window Even More: Calculating the PLL Jitter
- Current profile: I(f)
- PDN impedance: Z(f)
- Jitter sensitivity: S(f)
- PSRR of PLL: P(f)
- Jitter spectrum: $J(f) = I(f)\,Z(f)\,S(f)\,P(f)$
- TIE jitter: $j(t) = \mathrm{IFFT}\{J(f)\}$; the peak-to-peak jitter is read from $j(t)$
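The jitter calculation can be sketched in a few lines of pure Python: multiply the four spectra bin by bin, transform back to the time domain, and take the peak-to-peak value. This is a toy illustration only. The spectra below are made-up values with arbitrary units, the function names are hypothetical, and a naive inverse DFT stands in for the slide's ifft so that the example needs no external libraries.

```python
import cmath


def jitter_spectrum(current, impedance, sensitivity, psrr):
    """J(f) = I(f) * Z(f) * S(f) * P(f), evaluated per frequency bin."""
    return [i * z * s * p
            for i, z, s, p in zip(current, impedance, sensitivity, psrr)]


def inverse_dft(spectrum):
    """Naive O(n^2) inverse DFT returning complex time samples."""
    n = len(spectrum)
    return [sum(x * cmath.exp(2j * cmath.pi * k * t / n)
                for k, x in enumerate(spectrum)) / n
            for t in range(n)]


def peak_to_peak(samples):
    """Peak-to-peak value of the real part of the time-domain samples."""
    real = [s.real for s in samples]
    return max(real) - min(real)


if __name__ == "__main__":
    n = 8
    # Hypothetical single-tone current profile (bins 1 and n-1 are the
    # conjugate pair of one real sinusoid); flat Z, S, and P for simplicity.
    current = [0, 2, 0, 0, 0, 0, 0, 2]
    impedance = [0.5] * n
    sensitivity = [4.0] * n
    psrr = [1.0] * n
    j_t = inverse_dft(jitter_spectrum(current, impedance, sensitivity, psrr))
    print(peak_to_peak(j_t))
```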

DDR4 Bank Group Timing
- Different timing within a group and between groups (tCCD, tWTR, tRRD)
  - Long timing: bank-to-bank within a group
  - Short timing: access to different bank groups
- Maintain array timing requirements within a bank group
- Maintain speed between different bank groups
[Figure: four bank groups (Bank Group 0 to Bank Group 3), each containing Bank 0 to Bank 3; short timings apply between groups, long timings within a group.]
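The effect of short and long timings on command spacing can be sketched as a small scheduler. The clock counts below (`T_CCD_S = 4`, `T_CCD_L = 6`) are placeholder values chosen for illustration; real values depend on the speed bin and must come from the device datasheet. The function name `schedule` is hypothetical.

```python
T_CCD_S = 4  # clocks between column commands to different bank groups (assumed)
T_CCD_L = 6  # clocks between column commands to the same bank group (assumed)


def schedule(bank_groups):
    """Earliest issue clock for each column command, in the given order.

    Each command must be at least T_CCD_S after the previous command and
    at least T_CCD_L after the previous command to the same bank group.
    """
    times = []
    last_any = None
    last_in_group = {}
    for group in bank_groups:
        t = 0 if last_any is None else last_any + T_CCD_S
        if group in last_in_group:
            t = max(t, last_in_group[group] + T_CCD_L)
        times.append(t)
        last_any = t
        last_in_group[group] = t
    return times


if __name__ == "__main__":
    print(schedule([0, 0, 0, 0]))  # all accesses in one bank group
    print(schedule([0, 1, 2, 3]))  # accesses rotated across bank groups
```

Rotating accesses across bank groups keeps every gap at the short timing, which is the reason a controller would interleave bank groups when it can.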

Calibration Is Critical to Shrinking Margins
[Chart: margin in ns (axis from -0.1 to 0.5), broken down into FPGA effects, external effects, calibration effects, and calibration uncertainty. There is no margin without calibration.]

What Is Calibration?
- Capture calibration (de-skew)
  - Before de-skew: small valid capture window
  - After de-skew: maximized valid capture window
  - Benefit: reduced skew within a data group, more capture margin
- Resync calibration
  - Benefit: accurate strobe placement, more resync margin
- VT compensation
  - Data shifts due to VT variations
  - Voltage and temperature tracking
  - Benefit: dynamic phase adjustment to match the shifting data valid window, robust over VT
[Figures: DQ0 to DQ7 valid windows against DQS phase (0 to 180 degrees in 15-degree steps) before and after de-skew; DQ0 to DQ71 valid data window against resync phase (0 to 360 degrees).]
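The benefit of capture de-skew can be shown with a small sketch: the usable capture window is the overlap of all per-bit windows, and aligning the windows enlarges that overlap. The model below is a simplification that assumes each bit can only be delayed (never advanced) and that windows are given as (start, end) phases in degrees; the function names and the example windows are hypothetical.

```python
def common_window(windows):
    """Overlap shared by all (start, end) windows, or None if there is none."""
    start = max(w[0] for w in windows)
    end = min(w[1] for w in windows)
    return (start, end) if end > start else None


def deskew(windows):
    """Delay each bit so that its window center lines up with the latest center.

    Returns (delays, shifted_windows). Delays are never negative because
    this model assumes a delay chain can only add delay.
    """
    centers = [(s + e) / 2 for s, e in windows]
    target = max(centers)
    delays = [target - c for c in centers]
    shifted = [(s + d, e + d) for (s, e), d in zip(windows, delays)]
    return delays, shifted


if __name__ == "__main__":
    bits = [(30, 120), (45, 135), (60, 150)]  # hypothetical per-bit windows
    print(common_window(bits))                # window before de-skew
    delays, aligned = deskew(bits)
    print(delays)
    print(common_window(aligned))             # window after de-skew
```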

High-Level Output Topology
[Block diagram: CLK with DQS ptap control selecting X and X+90 phases; DQS OUT1 and DQS OUT2 delays (DQS-out dtap1 and dtap2 controls) driving DQS; DQ OUT1 and DQ OUT2 delays (DQ-out dtap1 and dtap2 controls) driving DQ.]
Calibration knobs:
- DQ-out1 and DQ-out2 delay: control the delay applied to outgoing DQ pins
- DQS-out1 and DQS-out2 delay: control the delay applied to outgoing DQS pins
- Write leveling output: changes the delay on both DQ and DQS relative to the memory clock, in phase taps

High-Level Input Topology
[Block diagram: VFIFO (vfifo control), DQS enable delay (dqs_en ptap control at X phase, DQS-en dtap control), DQS enable, DQS IN delay and DQS delay chain (DQS-in dtap control), DDIO in, DQ IN delay (DQ-in dtap control), and LFIFO (lfifo control).]
Calibration knobs:
- DQ-in delay: controls the delay applied to incoming DQ pins
- DQS-in delay: controls the delay applied to incoming DQS pins
- LFIFO: controls the number of cycles after a read command that data is read out of the LFIFO
- DQS-En phase: controls the delay on DQS enable in phase taps
- DQS-En delay: controls the delay on DQS enable in dtaps
- VFIFO: adjusts the delay, in cycles, applied to the controller-provided DQS burst signal to generate DQS enable

Calibration Stages
- DQS-enable calibration
  - Calibrate DQS enable (delayed read data valid) relative to DQS
- Post-amble tracking
  - Track DQS enable across temperature variation
- Read data deskew
  - Calibrate DQS relative to read command (read leveling)
  - Calibrate DQ versus DQS (per-bit deskew) for reads
- LFIFO training
  - Calibrate LFIFO delay cycles (read latency)
- Write leveling
  - Calibrate DQS and DM to write command (write leveling)
- Write data deskew
  - Calibrate DQ versus DQS (per-bit deskew) for writes
- Address/command training (leveling and deskew)
  - Calibrate CS, CAS, RAS, and ODT versus memory clock
- VREF training (FPGA and memory)
  - Calibrate receiver voltage threshold (for DDR4 with pseudo-open-drain DQs)
[Flowchart: Start; wait for PLL/DLL locking. Calibration loop: initialize INST/AC ROM for all pins on this memory interface, initialize the memory (mode registers etc.), calibrate the memory interface, and repeat until all memory interfaces are calibrated. User mode loop: if a user command is found in DPRIO, process the DPRIO user command; if a user command is found in RAM, process the RAM user command.]
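Several of the stages above, and the VREF training described earlier, come down to the same primitive: sweep a setting, record pass or fail at each step, and park in the middle of the widest passing region. A minimal sketch of that primitive follows; it is a generic illustration, not the calibration algorithm of any particular controller, and the function name `best_setting` is hypothetical.

```python
def best_setting(results):
    """Find the center of the longest run of passing settings.

    results: sequence of booleans, one per swept setting (True = pass).
    Returns (center_index, run_length), or None if nothing passed.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False closes any run that reaches the end of the sweep.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            length = i - run_start
            if length > best_len:
                best_start, best_len = run_start, length
            run_start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2, best_len


if __name__ == "__main__":
    sweep = [False, True, True, False, True, True, True, True, True, False]
    print(best_setting(sweep))
```

Centering in the widest window leaves equal margin on both sides, so later voltage and temperature drift has the most room before a failure.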

Good Layout Practices for DDR4

DDR4 Output Driver
[Figure: DDR3 push-pull driver compared with DDR4 pseudo-open-drain driver.]
(Content courtesy of Micron)

Unadjusted, Non-Terminated Data Eye
[Figure: eye diagram showing VDD overshoot, VSS undershoot, and jitter.]
(Content courtesy of Micron)

Terminated Data Eye
[Figure: eye diagram labeled with overshoot, VIH(ac), VIH(dc), high ringback, VREF, low ringback, VIL(dc), VIL(ac), and undershoot.]
(Content courtesy of Micron)

OCT from the Controller Standpoint
DQ and CA pins are terminated differently in DDR4.
Interface specification:
- Density / speed: DDR3 512 Mb to 8 Gb, 1.6 to 2.1 Gbps; DDR4 2 Gb to 16 Gb, 1.6 to 3.2 Gbps
- Voltage (VDD / VDDQ / VPP): DDR3 1.5 V / 1.5 V / NA (1.35 V / 1.35 V / NA); DDR4 1.2 V / 1.2 V / 2.5 V
- VREF: DDR3 external VREF (VDD / 2); DDR4 internal VREF (needs training)
- Data I/Os: DDR3 CTT (34 ohm); DDR4 POD (34 ohm)
- CMD/ADDR I/Os: DDR3 CTT; DDR4 CTT
- Strobe: DDR3 bi-directional / differential; DDR4 bi-directional / differential
Core architecture:
- Number of banks: DDR3 8; DDR4 16 (4 bank groups)
- Page size (x4 / x8 / x16): DDR3 1 KB / 1 KB / 2 KB; DDR4 512 B / 1 KB / 2 KB
- Number of prefetch: DDR3 8 bits; DDR4 8 bits
- Added function: DDR3 RESET / ZQ / dynamic ODT; DDR4 adds CRC / DBI / multi preamble
Physical:
- Package type / balls (x4, x8 / x16): DDR3 78 / 96 BGA; DDR4 78 / 96 BGA
- DIMM type: DDR3 R, LR, U, SoDIMM; DDR4 adds ECC SoDIMM
- DIMM pins: DDR3 240 (R, LR, U) / 204 (So); DDR4 284 (R, LR, U) / 256 (So)

OCT Calibration Scheme to Support DDR4
- OCT can calibrate 2 times with 2 sets of pins (DQ/CA)
- DQ and CA pins will have 2 different sets of codes in DDR4
[Figure: DDR4 and DDR3 OCT calibration schemes.]

General Layout Concerns
- Avoid crossing splits in the power plane
- SSO on controller collapsed strobes/clocks
  - Separate supplies and/or flip-chip packaging helps
- Low-pass VREF filtering on controller helps
- Minimize VREF noise
- Minimize intersymbol interference (ISI)
- Minimize crosstalk
(Content courtesy of Micron)

Layout and Termination (1/12)
Signal integrity review:
- Importance of transmission line theory
  - Today's clock rates are too fast to ignore it
- A matched-impedance line is important for good signaling
  - Mismatched-impedance lines result in reflections
  - Termination schemes are used to reduce or eliminate reflections
- Good power bussing is paramount to reducing SSO
  - SSO reduces voltage and timing margins
- Decoupling capacitor needs and requirements
(Content courtesy of Micron)
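The link between impedance mismatch and reflections can be made concrete with the standard transmission-line reflection coefficient, (Z_load - Z0) / (Z_load + Z0). The sketch below uses hypothetical impedance values; the function name is illustrative only.

```python
def reflection_coefficient(z_load_ohm, z0_ohm):
    """Fraction of the incident voltage wave reflected at a termination.

    0 means a perfect match; positive values reflect in phase, negative
    values reflect inverted.
    """
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


if __name__ == "__main__":
    z0 = 50.0  # hypothetical trace impedance
    for z_load in (50.0, 75.0, 25.0):
        gamma = reflection_coefficient(z_load, z0)
        print(f"Z_load={z_load} ohm -> reflection coefficient {gamma:+.3f}")
```

A termination equal to the line impedance gives zero reflection, which is why the later slides ask for drive and termination impedances that match Zo.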

Layout and Termination (2/12)
- Signal integrity analysis is paramount to developing cost-effective high-speed memory systems
  - Develop a timing budget for proof of concept
  - Use models to simulate
- Board skews are important and should be accounted for
  - ISI, crosstalk, VREF noise, path length matching, Cin and RTT mismatch: employ industry practices and assumptions
  - Model vias too
- Eliminate return path discontinuities (RPDs)
- Minimize SSO effects
  - Difficult to model
(Content courtesy of Micron)

Layout and Termination (3/12)
- DRAM and controller package parasitics are fixed
  - SSO effects are already contained in their specified timings
  - However, these apply to test conditions with specific decoupling
- The power delivery network (PDN) for the controller and DRAM needs to be properly designed
  - Lowering power supply inductance minimizes signaling variations between devices
  - Use power and ground planes wherever possible
  - Make all power and ground traces as wide as possible
  - Couple power and ground as much as possible
    - Lowers inductance (mutual effects)
(Content courtesy of Micron)

Layout and Termination (4/12)
- SSO: timing and noise issues generated by rapid changes in voltage and current when multiple circuits switch simultaneously in the same direction
- Problems caused by SSO
  - False triggers due to power/ground bounce
  - Reduced timing margin due to SSO-induced skew
  - Reduced voltage margin due to power/ground noise
  - Slew rate variation
(Content courtesy of Micron)

Layout and Termination (5/12)
Good power bussing is paramount to reducing SSO. From $V = L \, \frac{di}{dt}$:
- Reduce L (power delivery effective inductance)
  - Use planes for power and ground distribution
  - Proper routing of power and ground traces to devices
  - Proper use of decoupling capacitance
    - Locate as close as possible to the component pins
- Reduce di/dt (switching current slew rate)
  - Use the slowest drive edge that will work
  - Use reduced drive strength instead of full drive where possible
(Content courtesy of Micron)
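A quick calculation shows why both L and di/dt matter in V = L di/dt. The numbers below (1 nH of effective inductance, eight drivers switching 20 mA each) are hypothetical values picked only to illustrate the scaling, and the function name is illustrative.

```python
def inductive_noise_v(inductance_nh, delta_i_ma, delta_t_ns):
    """Supply bounce V = L * di/dt, with L in nH, di in mA, and dt in ns.

    nH * (mA / ns) = 1e-9 H * 1e-3 A / 1e-9 s = 1e-3 V, hence the final scale.
    """
    return inductance_nh * (delta_i_ma / delta_t_ns) * 1e-3


if __name__ == "__main__":
    total_ma = 8 * 20.0  # eight outputs switching 20 mA each (assumed)
    print(inductive_noise_v(1.0, total_ma, 0.2))  # fast edge
    print(inductive_noise_v(1.0, total_ma, 0.4))  # slower edge
    print(inductive_noise_v(0.5, total_ma, 0.2))  # lower inductance
```

Halving either the inductance or the edge rate halves the bounce, which matches the two levers listed on the slide.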

Layout and Termination (6/12)
- RPDs induce board noise and are difficult to model
  - Splits/holes in reference planes
  - Connector discontinuities
  - Layer changes
- Avoid RPDs if at all possible
  - Avoid crossing holes/splits in the reference plane
  - Route signals so they reference the proper domain
  - Add power/ground vias to the board
    - Especially in dense layer-change areas
  - Place decoupling capacitors near connectors
[Figure: split return path compared with solid return path.]
(Content courtesy of Micron)

Layout and Termination (7/12)
- VREF noise
  - Induces strobe-to-data skews and reduces voltage margins
  - Power/ground plane noise
  - Crosstalk
- Minimize VREF noise
  - Use the widest trace practical to route
    - From chip to decoupling capacitor
  - Use large spacing between VREF and neighboring traces
(Content courtesy of Micron)

Layout and Termination (8/12)
- ISI
  - Occurs when data is random
    - Clocks do not have ISI
  - Multiple bits on the bus at the same time
    - The bus cannot settle from bit #1 before bit #2, etc.
    - Signal edges jitter due to the previous bit's energy still on the bus
  - Ringing due to impedance mismatches
  - Low-pass structures can cause ISI
- Minimize ISI
  - Optimize layout
    - Keep board/DIMM impedances matched
    - Drive impedance should be the same as Zo of the transmission line
  - Terminate nets
    - Termination values should be the same as Zo of the transmission line
  - Select a high-quality connector
    - Matched to board/DIMM impedance
    - Low mutual coupling
(Content courtesy of Micron)

Layout and Termination (9/12)
Crosstalk:
- Coupling on board, package, and connector from other signals, including RPDs
- Inductive coupling is typically stronger than capacitive coupling
- When aggressors fire at the same time as the victim (e.g. data-to-data coupling)
  - Victim edge speeds up or slows down, causing jitter
- When aggressors do not fire at the same time as the victim (e.g. data-to-command/address coupling)
  - Noise couples onto the victim at the time of aggressor switching
(Content courtesy of Micron)

Layout and Termination (10/12)
Minimize crosstalk:
- Keep bits that switch on the same clock edge routed together
  - Route data bits next to other data bits; never next to CMD/ADDR bits
- Isolate sensitive bits (strobes)
  - If need be, route next to signals that rarely switch
- Separate traces by at least two, preferably three, conductor widths (more accurately, one would define this by trace pitch and height above the reference plane)
  - Example: a 5-mil trace located 5 mils from a reference plane should have a 15-mil gap to its nearest neighbors to minimize crosstalk
- Choose a high-quality connector
- Run traces as stripline (as opposed to microstrip)
  - Not at the cost of additional vias
- Maintain good references for signals and their return paths
  - Avoid RPDs
- Keep driver, board Zo, and ODT selections well matched
(Content courtesy of Micron)

Layout and Termination (11/12)
- Cin mismatch
  - Differing input capacitances on receiver pins
  - Adds skew to input timings
- RTT mismatch
  - Termination resistors not at nominal value
  - Internal ODT on data pins has smaller variation than on DDR2
    - They are calibrated (so is the DRAM's Ron)
  - External termination resistor variation must be accounted for
    - Consider one-percent resistors
(Content courtesy of Micron)

Layout and Termination (12/12)
- High-speed signals must maintain a solid reference plane
  - The reference plane may be either VDD or ground
  - For DDR3 UDIMM systems, the DQ busses are referenced to ground while the ADDR/CMD and clock are referenced to VDD
  - All signals may be referenced to ground if the layout allows
- Best signaling is obtained when a constant reference plane is maintained
  - If this is not possible, try to make the transitions near decoupling capacitors
[Figure labels: signal, power plane, cap, ground plane.]
(Content courtesy of Micron)

Board Debug Tools to Minimize Issues

TimeQuest DDR Timing: Read Capture
- Before calibration is the standard timing analysis
- Calibrating out some of the process variations in the memory
- Calibrating to the FPGA variation (deskew + pessimism removal)
- Errors in the calibration algorithm
- Effects of temperature and voltage changes on the calibration
- Total margin after calibration

EMIF Debug Toolkit Features
- Reports results of the last calibration to the user
  - Reports interface details, margins observed before calibration, settings made during calibration, and post-calibration margins
  - In the case of a calibration failure, the toolkit reports the stage at which calibration failed and the group
- Provides eye monitor support
- Provides loopback support
- Allows user interaction with the memory interface
  - Send commands to the memory interface to recalibrate, mask groups and ranks
  - Eye monitor support of the data valid window
  - Loopback support for bit error rate (BER) testing

TimeQuest-Like GUI
[Screenshot: GUI with a reports section and a tasks section; commands run are shown in the console.]

On-Chip EMIF Debug Toolkit
- Core access to calibration data
  - Access the same calibration data as the EMIF toolkit, now via FPGA logic
  - Via Avalon Memory-Mapped (Avalon-MM) interface

Looking Ahead and Conclusion

Will There Be a DDR5?
- Very unlikely
  - SI for a parallel bus of 2 GHz and above would be very difficult
  - Timing budget would be consumed in the package
    - PDN noise
    - Package skew
- Transition to stacked memory
  - Hybrid Memory Cube and serialized memory
  - 3D memories integrated into ASICs

Conclusion
- DDR4 has many ways to reduce overall system power
  - ~50% lower power than DDR3 at 1.5 V
- DDR4 is 33% faster than DDR3-2133
- But there are challenges
  - Shrinking data valid window
  - Increased signal integrity and power integrity concerns
- These can be overcome by good controller design
  - Innovative calibration
  - Good ODT
  - Careful board design
  - Good board debug tools

Thank You