White Paper The Need for a High-Bandwidth Memory Architecture in Programmable Logic Devices

Similar documents
POS-PHY Level 4 POS-PHY Level 3 Bridge Reference Design

How FPGAs Enable Automotive Systems

White Paper Understanding 40-nm FPGA Solutions for SATA/SAS

Introduction. Synchronous vs. Asynchronous Memory. Converting Memory from Asynchronous to Synchronous for Stratix & Stratix GX Designs

4. TriMatrix Embedded Memory Blocks in HardCopy IV Devices

FPGAs Provide Reconfigurable DSP Solutions

Stratix vs. Virtex-II Pro FPGA Performance Analysis

APEX II The Complete I/O Solution

Nios Soft Core Embedded Processor

Using TriMatrix Embedded Memory Blocks

White Paper Configuring the MicroBlaster Passive Serial Software Driver

White Paper Low-Cost FPGA Solution for PCI Express Implementation

White Paper Enabling Quality of Service With Customizable Traffic Managers

POS-PHY Level 4 MegaCore Function

POS-PHY Level 4 MegaCore Function (POSPHY4)

Active Serial Memory Interface

White Paper Compromises of Using a 10-Gbps Transceiver at Other Data Rates

FPGA Co-Processing Architectures for Video Compression

Benefits of Embedded RAM in FLEX 10K Devices

Using the Nios Development Board Configuration Controller Reference Designs

On-Chip Memory Implementations

FPGA Design Security Solution Using MAX II Devices

Enhanced Configuration Devices

White Paper Taking Advantage of Advances in FPGA Floating-Point IP Cores

Nios II Performance Benchmarks

Using Flexible-LVDS Circuitry in Mercury Devices

Implementing LED Drivers in MAX Devices

DSP Development Kit, Stratix II Edition

DDR & DDR2 SDRAM Controller Compiler

SPI-4.2 Interoperability with the Intel IXF1110 in Stratix GX Devices

Practical Hardware Debugging: Quick Notes On How to Simulate Altera s Nios II Multiprocessor Systems Using Mentor Graphics ModelSim

Legacy SDRAM Controller with Avalon Interface

PCI Express Multi-Channel DMA Interface

EFEC20 IP Core. Features

Estimating Nios Resource Usage & Performance

Nios II Embedded Design Suite 6.1 Release Notes

Simple Excalibur System

Excalibur Solutions DPRAM Reference Design

E3 Mapper MegaCore Function (E3MAP)

Enhanced Configuration Devices

Simultaneous Multi-Mastering with the Avalon Bus

Using Flexible-LVDS I/O Pins in

Using MAX 3000A Devices as a Microcontroller I/O Expander

Using MAX II & MAX 3000A Devices as a Microcontroller I/O Expander

ZBT SRAM Controller Reference Design

Recommended Protocol Configurations for Stratix IV GX FPGAs

APEX Devices APEX 20KC. High-Density Embedded Programmable Logic Devices for System-Level Integration. Featuring. All-Layer Copper.

White Paper. Floating-Point FFT Processor (IEEE 754 Single Precision) Radix 2 Core. Introduction. Parameters & Ports

Altera Product Overview. Altera Product Overview

Stratix. High-Density, High-Performance FPGAs. Available in Production Quantities

FFT/IFFT Block Floating Point Scaling

Stratix II vs. Virtex-4 Performance Comparison

RapidIO Physical Layer MegaCore Function

White Paper AHB to Avalon & Avalon to AHB Bridges

Stratix FPGA Family. Table 1 shows these issues and which Stratix devices each issue affects. Table 1. Stratix Family Issues (Part 1 of 2)

SONET/SDH STS-12c/STM-4 Framer MegaCore Function (STS12CFRM)

Implementing LVDS in Cyclone Devices

Arria II GX FPGA Development Board

Excalibur Solutions Using the Expansion Bus Interface. Introduction. EBI Characteristics

System-on-a-Programmable-Chip (SOPC) Development Board

RLDRAM II Controller MegaCore Function

Stratix II FPGA Family

Introduction. Design Hierarchy. FPGA Compiler II BLIS & the Quartus II LogicLock Design Flow

Implementing Bus LVDS Interface in Cyclone III, Stratix III, and Stratix IV Devices

White Paper Using the MAX II altufm Megafunction I 2 C Interface

11. SEU Mitigation in Stratix IV Devices

Converting.srec Files to.flash Files for Nios Embedded Processor Applications

SONET/SDH Compiler. Introduction. SONET/SDH Compiler v2.3.0 Issues

FFT MegaCore Function User Guide

Signal Integrity Comparisons Between Stratix II and Virtex-4 FPGAs

MILITARY ANTI-TAMPERING SOLUTIONS USING PROGRAMMABLE LOGIC

Power Optimization in FPGA Designs

Table 1 shows the issues that affect the FIR Compiler v7.1.

Matrices in MAX II & MAX 3000A Devices

DDR & DDR2 SDRAM Controller Compiler

Mixed Signal Verification of an FPGA-Embedded DDR3 SDRAM Memory Controller using ADMS

Design Guidelines for Optimal Results in High-Density FPGAs

Implementing LED Drivers in MAX and MAX II Devices. Introduction. Commercial LED Driver Chips

DDR and DDR2 SDRAM Controller Compiler User Guide

Nios DMA. General Description. Functional Description

SPI-4.2 Interoperability with. in Stratix GX Devices. Introduction. PMC-Sierra XENON Family

T3 Framer MegaCore Function (T3FRM)

Design Verification Using the SignalTap II Embedded

Video and Image Processing Suite

Using DCFIFO for Data Transfer between Asynchronous Clock Domains

DDR & DDR2 SDRAM Controller Compiler

Cyclone II FPGA Family

System Debugging Tools Overview

10. Introduction to UniPHY IP

DDR & DDR2 SDRAM Controller

AN 547: Putting the MAX II CPLD in Hibernation Mode to Achieve Zero Standby Current

Introduction to TCP/IP Offload Engine (TOE)

Implementing the Top Five Control-Path Applications with Low-Cost, Low-Power CPLDs

DDR & DDR2 SDRAM Controller

100G Interlaken MegaCore Function User Guide

RapidIO MegaCore Function

Nios II Embedded Design Suite 7.1 Release Notes

AN 549: Managing Designs with Multiple FPGAs

AN423: Configuring the MicroBlaster Passive Serial Software Driver

Transient Voltage Protection for Stratix GX Devices

Transcription:

Introduction White Paper The Need for a High-Bandwidth Memory Architecture in Programmable Logic Devices One of the challenges faced by engineers designing communications equipment is that memory devices were adopted from PC computer platforms. These devices, although inexpensive and easy to source, are suboptimal for communication systems because they were designed with a different end system in mind. SDRAM memory is a good example of a borrowed PC technology. Fundamentally, SDRAM memory was designed for microprocessor storage, not to accept TCP/IP packets travelling at 10 gigabits per second (Gbps) through very high-performance switching equipment. Although discrete memory suppliers introduced new architectures with increased bandwidth such as double data rate (DDR), zero-bus turnaround (ZBT), and quad data rate (QDR), these architectures have not kept pace with the performance demands of transmission media standards such as OC-48 and OC-192. Since these discrete devices formerly were the only source of system memory, they defined the bottleneck in the system. The industry adapted by developing sophisticated memory architectures embedded within programmable logic devices (PLDs). Over the last several years, the memory architectures in PLDs have become as complex as many ASIC system designs, including application-optimized features such as dual-port, mixed-width, and parity bits. The latest generation of high-bandwidth, multi-functional memory architectures available in programmable logic supports higher system speeds than the discrete memory devices currently available. Furthermore, significant performance, time-to-market, and cost advantages are associated with embedded memory in a multi-million-gate programmable logic. Figure 1 shows how embedded memory architectures on PLDs can increase system performance by supporting various functions. Figure 1 also illustrates that although system bandwidth may be 40 Gbps, a significantly higher memory bandwidth may be required to support clock domain translation because each clock domain translation requires two memory transactions. WP-TriMatrix-2.0 October 2003, ver. 2.0 1

Figure 1. The Role of PLD Embedded Memory Architectures in Communications Equipment Gigabit Ethernet MAC (125 MHz) POS-PHY Transmitter (104 MHz) POS-PHY Transmitter (104 MHz) Nios Embedded Processor (120 MHz) Nios Embedded Processor (120 MHz) Dual-Port RAM (1) Note to Figure 1: (1) M4k,, and TM blocks support dual-port RAM mode. In the new Stratix TM family of high-density PLDs, Altera offers the TriMatrix TM memory architecture, a technology that meets the memory bandwidth requirements of emerging transmission and processing technologies unavailable with discrete memory devices. The TriMatrix memory architecture in Stratix devices offers up to 10 Mbits of RAM and up to 12 terabits per second peak device memory bandwidth, which makes this family an ideal choice for memory-intensive applications. The TriMatrix memory architecture consists of an array of three embedded RAM block types that can be configured to support a wide range of applications: blocks (32 18): you can use the smaller blocks of RAM for first-in first-out (FIFO) applications blocks (128 36): you can use the medium size for channelized functions (a channelized function consolidates smaller channels into a larger channel) blocks (4,096 144): you can use the large blocks with 512K bits of RAM for network processor cache The different memory block sizes are the foundation of the Stratix high-bandwidth memory architecture. The organization of these resources enables you to customize these abundant memory resources distributed throughout the programmable logic at a lower overall system cost. Communications systems contain devices which use multiple standards at multiple clock frequencies Communications systems require memory to perform a variety of functions in the system, including buffering data between clock domains. As various telecommunications and/or networking standards intersect in hardware, memory resources are required to translate or shake hands between their unique clock domains, and slow down or speed up the rate of data transfer. 2

For example, the OC-48 and OC-192 telecommunications standards perform the same function, but at different frequencies. However, many systems must support both standards, requiring translation between them. Memory on a PLD effectively supports this translation (see Figure 1). Data packet sizes are growing, and the speed at which we can transport them between devices is accelerating As applications such as video-on-demand become mainstream, the packaging of transmitted data is changing to accommodate increasing packet sizes. At the same time, wide area networks (WANs) are beginning to adopt multiplewavelength fibers with each wavelength running at, and have started planning for 40-Gbps networks. Increasing data sizes and rates create greater demands for larger storage buffers on end systems and devices. With the TriMatrix memory architecture, you can bring this data into a Stratix device, store (in large packet buffers), synchronize (with small wide FIFO buffers), process (with multiple Nios embedded processor cores using blocks for data storage), and then send the data back out (see Figure 2). Figure 2. Major Elements in Communications Design that Define the Role of Memory Discrete Memory SDRAM (maximum = 266 MHz) MPU MPU MPU MPU Processor performance is growing at a slower rate than transmission media I/O speeds Since 1999, the roadmaps for microprocessor and network processor devices have been one step behind the requirements of current networking research and development. This lag in development results in system implementations in which the number of system processors is a function of the system bandwidth, leading to more overhead, less efficiency, additional board space, additional devices, higher costs, increased board complexity, and higher power consumption. Network designers are forced to accept PC technologies that never quite do exactly what they want. Network processor suppliers current roadmaps show the ability to process, approximately 25% of the current bandwidth of optical transport technologies under development. Therefore, to achieve bandwidth objectives, the system must incorporate multiple processors to support the anticipated network packet loads. The additional complication caused by more processors is typically addressed through complex memory-management techniques. Multiple ports, byte masking, and large buffers are all integral parts of these techniques. With every additional 3

processor, additional data storage requirements are introduced, including standard CPU cache and main CPU memory. The TriMatrix memory architecture incorporates memories of multiple sizes that can be configured to serve a multitude of system-level functions, reducing the total number of memory solutions required in a given system. The result is a reduction in system complexity and associated unit costs, and a cleaner design. Off-chip memory speeds for storing those packets are also growing at a slower rate than PLD speeds Further exacerbating the multi-processor memory complexity, the dedicated-memory devices such as SDRAM, quad data rate QDR SRAM, and DDR SDRAM have not accelerated at a pace comparable to the bandwidth capacity of the transmission media. Once again, a PLD is required to accept packets of data at very high speeds and store it quickly in wide memories, reducing the transmission frequency of the data packets to a level at which discrete memory devices can successfully receive and store them. For example, a 40-Gbps input data rate has to be translated to a 256-bit bus running at 156 MHz. This 256-bit bus needs to communicate with a back-end interface, a microprocessor, and an external memory on the input and output side (see Figure 2). Therefore, the memory architecture must support four (interfaces) two (input/output) 256 bits (bus width) for basic interfacing. Very high memory bandwidth is required at the intersection of microprocessor, memory, and transmission technologies, which is the PLD One important metric for evaluating a memory architecture is its peak memory bandwidth. The peak memory bandwidth defines the upper limit of the device s ability to move data through memory. To calculate the peak memory bandwidth of a Stratix device, each TriMatrix memory block is assumed to be at maximum clock frequency in simple dual-port mode (i.e., the devices can simultaneously read and write data). The calculation used to determine the peak memory bandwidth in the Stratix family is as follows: Peak Bandwidth of Each Type Peak Bandwidth: (maximum achievable frequency maximum configurable bit width)/number of clock cycles to write a word) number of write ports Peak Bandwidth: (maximum achievable frequency maximum configurable bit width)/number of clock cycles to write a word) number of write ports Peak Bandwidth: (maximum achievable frequency maximum configurable bit width)/number of clock cycles to write a word) number of write ports Peak Bandwidth of Each Stratix Device The calculation used to determine the peak memory bandwidth in each Stratix device is as follows: (Peak bandwidth of number of blocks on Device) + (peak bandwidth of number of blocks on device) + (peak bandwidth of block number of blocks on device) 4

The peak bandwidth of each Stratix device equation calculates the memory architecture s bandwidth within the device. The calculation also assumes that each block type contains the standard number of ports (i.e., no special techniques for increasing the number of ports are utilized). See Tables 1 and 2. Table 1. Peak Memory Bandwidth by Stratix Type File s s s Peak Memory Bandwidth 6 Gbps 11 Gbps 43 Gbps Table 2. Peak Memory Bandwidth by Stratix Device Device Resources Resources Resources Peak Memory Bandwidth EP1S10 95 60 1 1 terabit per second EP1S20 194 82 2 2 terabits per second EP1S25 224 138 2 3 terabits per second EP1S30 295 171 4 4 terabits per second EP1S40 384 183 4 4 terabits per second EP1S60 574 292 6 7 terabits per second EP1S80 767 364 8 9 terabits per second EP1S120 1,118 520 12 12 terabits per second The PLD must manage packet speeds and buffer them while the microprocessor and off-chip memories prepare themselves for more data. For more information on Stratix I/O capabilities, see Application Note 202 (Using High- Speed Differential I/O Interfaces in Stratix Devices). Conclusion Emerging technologies in PLDs such as the TriMatrix memory architecture, address the need of next-generation communications systems. By incorporating high-density memory architectures into devices with the I/O capabilities to receive and transmit data at high rates, Stratix devices are well suited for communications system designs. 101 Innovation Drive San Jose, CA 95134 (408) 544-7000 http://www.altera.com Copyright 2002 Altera Corporation. All rights reserved. Altera, The Programmable Solutions Company, the stylized Altera logo, specific device designations, and all other words and logos that are identified as trademarks and/or service marks are, unless noted otherwise, the trademarks and service marks of Altera Corporation in the U.S. and other countries. All other product or service names are the property of their respective holders. Altera products are protected under numerous U.S. and foreign patents and pending applications, maskwork rights, and copyrights. Altera warrants performance of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. 5