Designing with the Xilinx 7 Series PCIe Embedded Block Follow @avnetxfest Tweet this event: #avtxfest www.facebook.com/xfest2012
Why Would This Presentation Matter to You?
  If you are designing a PCIe-based system and you need:
    Up to PCIe Gen 3 data rate
    Optimal system cost/performance
    PCIe DMA solution
    Low-cost FPGA configuration solution
  Then you need to know about the Xilinx 7 series PCIe solutions
Objectives
  Become familiar with the Xilinx 7 series PCIe solutions
  Know what Alliance Partner DMA controller solutions are available for the 7 series integrated PCIe block
  Know how to configure the 7 series FPGA in order to meet the configuration time requirement in a PCIe system
Agenda
  7 series PCIe Solutions
  7 series PCIe User Interface
  Key PCIe Gen 3 Specifications
  7 series FPGA Configuration in a PCIe System
  PCIe Multi-Function and Single-Root I/O Virtualization (SR-IOV)
  PCIe DMA Solutions
  Development Tools
  Closing Comments
Xilinx 7 Series Integrated PCIe Block
  The 7 series PCIe block contains the functionality defined in the specifications maintained by the PCI-SIG
    Compliant with the PCI Express base 2.1/3.0 specification
    Configurable for Gen 1 (2.5Gbps), Gen 2 (5Gbps) or Gen 3 (8Gbps) data rates
    x8, x4, x2, or x1 lane width
    Configurable for Endpoint or Root Port applications
  7 series transceivers implement a fully compliant PCIe PHY
  Maximum Payload Size (MPS) of 128/256/512/1024 bytes
  Up to 6 x 32-bit or 3 x 64-bit BARs
7 Series PCI Express Solutions
  7 series PCIe solutions provide optimal system cost/performance
    Artix-7, optimized for lowest cost and power
    Kintex-7, optimized for best price/performance
    Virtex-7, optimized for highest system performance and capacity

  7 series PCIe Solutions   Gen 2 Hard IP   Gen 3 Hard IP   Gen 3 Soft IP
  Artix-7                   x4*             No              No
  Kintex-7                  x8              No              x8
  Virtex-7, T Devices       x8              No              x8
  Virtex-7, XT Devices      x8              x8**            x8
  Virtex-7, HT Devices      x8              x8              x8

  *  Artix-7 bandwidth will be limited to Gen 2 x4
  ** Excluding the XC7VX485T device
7 Series PCIe Gen 2 Integrated Block
  Features
    Compliant to PCIe revision 2.1
    Endpoint or Root Port support
    AXI4 user interface
  Configurations
    x1-x8 lane widths
    Gen 1/Gen 2 data rates (2.5/5Gbps)
  Summary
    Virtually the same as Virtex-6
    Enhanced performance
    Gen 2 x8 Root Port
    Lower latency
  [Block diagram labels: PCIe Integrated Block, Transaction Interface, Configuration Interface, Transaction Layer, Data Link Layer, Physical Layer, Configuration Module, TX BRAM, RX BRAM, DRP, Phy Layer, GTXs]
7 Series PCIe Gen 3 Integrated Block
  Features
    Compliant to PCIe revision 3.0
    Endpoint or Root Port support
    4 individual AXI4 user interfaces
      2 AXI4 Completer interfaces
      2 AXI4 Requester interfaces
    Integrated SR-IOV (6 channels)
    Integrated Multi-Function (2 functions)
  Configurations
    x1-x8 lane widths
    Gen 1/Gen 2/Gen 3 data rates (2.5/5/8Gbps)
  [Block diagram labels: PCIe Integrated Block, Transaction Interface, Configuration Interface, Transaction Layer, Data Link Layer, Physical Layer, Configuration Module, TX BRAM, RX BRAM, DRP, Phy Layer, GTHs]
7 Series PCIe Gen 3 Soft IP Solutions
  Supported in Kintex-7 and Virtex-7
    Utilizes -2 or -3 speed grade part depending on the lane width
  Northwest Logic or PLDA soft IP for Gen 3
  Xilinx supplied Gen 3 PCS and PMA
    Physical Coding Sublayer (PCS) soft IP with PIPE 3.0 connection to the Gen 3 soft IP
    Physical Media Attachment (PMA) hard IP via GTH transceivers
  [Diagram labels: PIPE 3.0 (PHY Interface for PCI Express), AXI, Transaction Layer, Data Link Layer, PCS, PMA, PCIe Gen 3 Alliance Partner IP, Soft IP, Hard IP]
Alliance Partner PCIe Gen 3 Soft IP Solutions
  Northwest Logic and PLDA PCIe Gen 3 Soft IP features
    PCIe revision 3.0 compliant
    x1, x2, x4, x8, x16 (NW Logic only) lane support
    8, 5, and 2.5Gbps support
    Endpoint and Root Port support
    Optional multi-channel DMA controller
    Linux and Windows device driver
  www.nwlogic.com
  www.plda.com
Why Use PCIe Gen 3 Soft IP Solution?
  Gen 3 early adopters can use the soft IP for initial prototyping
  Using Gen 3 soft IP in a Kintex-7 device could be less expensive than using a Virtex-7 device with the integrated Gen 3 hard IP
  Some applications might need more PCIe Gen 3 blocks than available in a given 7 series device
    Virtex-7 XT devices have 2-4 PCIe Gen 3 hard IP blocks
    Virtex-7 HT devices have 1-3 PCIe Gen 3 hard IP blocks
7 Series PCIe Clocking
  7 series PCIe block requires a 100MHz or 250MHz system clock input
    The clock frequency used must match the clock frequency selection in the CORE Generator GUI
  In a typical PCIe system, the Endpoint device PCIe reference clock input is a 100MHz clock provided by the PCIe edge connector
    Some Endpoint devices require an external PLL
    Avnet K7 MMP baseboard uses ICS874003-05 external PLL

  Reference Clock Input   Artix-7    Kintex-7   Virtex-7
  Gen 1                   100 MHz    100 MHz    100 MHz
  Gen 2                   100 MHz    100 MHz    100 MHz
  Gen 3 Hard IP           N/A        N/A        250 MHz
  Gen 3 Soft IP           N/A        250 MHz    250 MHz
7 Series PCIe Unified AXI Interfaces
  The user interface of the 7 series PCIe block is designed to the AXI4 specification
  Three variations of AXI4 interfaces will be provided, each tailored for a different customer use case

  AXI4 Type: Description
  Basic AXI4-Stream: This interface is analogous to the legacy Local Link interface found in previous Xilinx FPGA families.
  Enhanced AXI4-Stream: This interface is similar to the Basic AXI4-Stream interface but expands on it by splitting/combining the data stream into Completer and Requester streams.
  AXI4 Memory Mapped: This is a memory-mapped interface for use in processor based systems (EDK PCIe bridge).
7 Series PCIe Gen 2/Gen 3 AXI4 Interfaces
  Basic AXI4-Stream
    PCIe Gen 2 user interface
    Easy migration from Local Link
  Enhanced AXI4-Stream
    PCIe Gen 3 user interface
    Separate Requester R/W and Completer R/W interfaces
  AXI4 Memory Mapped
    PCIe Gen 2/Gen 3 user interface
    Used in embedded designs
    EDK IP core
    Migration from PLB46
7 Series PCIe AXI4 Bus Width and Clock Frequency
  The clock frequency of the AXI4 user interface can be selected via the CORE Generator GUI
  Each PCIe lane width provides a default frequency along with alternative frequencies
    Where possible, Xilinx recommends using the default frequency
    Non-default frequencies can make timing closure difficult

  Data Rate   Lanes   Bus Width (Bits)   Bus Speed (MHz)
  Gen 1       1       64                 62.5 (default), 31.25, 125, or 250
  Gen 1       4       64                 125 (default) or 250
  Gen 1       8       64/128             250/125 (default) or 250
  Gen 2       1       64                 62.5 (default), 125, or 250
  Gen 2       4       64/128             250/125 (default) or 250
  Gen 2       8       128                250
  Gen 3       1       64                 250 (default) or 125
  Gen 3       4       128/256            250/125
  Gen 3       8       256                250
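One way to read the bus width and clock table is that each width/frequency pair gives the user interface at least as much throughput as the link can deliver. The Python sketch below checks that for a selection of rows. It is an illustration only: the pairing of "64/128" with "250/125" (narrow bus at the faster clock) is an assumption, the per-lane effective rates (2, 4 and 7.88 Gbps) are the rounded figures quoted elsewhere in this deck, and the function names are hypothetical. TLP header and flow-control overhead are ignored.

```python
# Sketch: compare AXI4 user-interface throughput (bus width x clock)
# against the effective PCIe link bandwidth for rows of the table.

# Effective per-lane bandwidth in Gbps after line encoding
# (rounded values as quoted in this deck, keyed by PCIe generation).
EFFECTIVE_GBPS_PER_LANE = {1: 2.0, 2: 4.0, 3: 7.88}


def link_gbps(gen, lanes):
    """Effective link bandwidth in Gbps for a generation and lane count."""
    return EFFECTIVE_GBPS_PER_LANE[gen] * lanes


def axi_gbps(width_bits, clock_mhz):
    """Peak user-interface throughput in Gbps (one beat per clock)."""
    return width_bits * clock_mhz / 1000.0


def interface_keeps_up(gen, lanes, width_bits, clock_mhz):
    """True when the user interface can carry the full link bandwidth."""
    return axi_gbps(width_bits, clock_mhz) >= link_gbps(gen, lanes)


# (gen, lanes, bus width in bits, clock in MHz).
# Assumption: "64/128" pairs with "250/125" in that order.
ROWS = [
    (1, 1, 64, 62.5),
    (1, 4, 64, 125),
    (1, 8, 64, 250),
    (1, 8, 128, 125),
    (2, 1, 64, 62.5),
    (2, 4, 64, 250),
    (2, 4, 128, 125),
    (2, 8, 128, 250),
    (3, 1, 64, 250),
    (3, 4, 128, 250),
    (3, 4, 256, 125),
    (3, 8, 256, 250),
]

if __name__ == "__main__":
    for gen, lanes, width, clk in ROWS:
        print(
            f"Gen {gen} x{lanes}: {width}-bit @ {clk} MHz -> "
            f"{axi_gbps(width, clk):.2f} Gbps user, "
            f"{link_gbps(gen, lanes):.2f} Gbps link, "
            f"ok={interface_keeps_up(gen, lanes, width, clk)}"
        )
```

Under these assumptions every listed row keeps up with its link, which is consistent with the recommendation to stay on the default frequency unless there is a reason to trade bus width against clock rate.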
Gen 3 Goals From PCI-SIG
  Double the effective bandwidth
    8Gbps (4Gbps for Gen 2)
  Two different approaches
    10G with 8b/10b coding
    8G with scrambling
  Analysis clearly showed 8G as the preferred approach
    Channel loss and distortion much worse at 10G than 8G
  Remain cost effective
    FR4 for PCB
    Similar reference clock architectures as Gen 2
    Similar power budget and physical connections as Gen 2
  Compatibility
    Gen 1/Gen 2 cards must operate at Gen 1/Gen 2 rate in Gen 3 slots
Gen 3 Enhancements and Challenges
  New PHY layer enhancements
    8Gbps raw line rate per lane
      Supported by the 7 series Serdes
    128b/130b encoding/scrambling
      128b/130b & Scrambler use a 23-bit LFSR
      1.5% encoding overhead
      Implemented in the 7 series FPGA fabric
      Multi-lane Gen 3 designs will use a custom PCS soft IP

  Encoding           Raw Line Rate   Effective BW
  Gen1 8b10b         2.5Gbps         2Gbps
  Gen2 8b10b         5Gbps           4Gbps
  Gen3 Scrambling    8Gbps           7.88Gbps

  Tightened reference clock input specifications (user responsibility)
    Reference clock RMS jitter of 1ps (3.1ps for Gen 2)
  Protocol enhancements
    v2.1 ECNs
    http://www.pcisig.com/specifications/pciexpress/specifications/#ecn2
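The effective bandwidth figures follow directly from the line encoding: 8b/10b carries 8 payload bits in every 10 line bits, and 128b/130b carries 128 in every 130. A short Python sketch of that arithmetic is shown here; the function name is hypothetical and only per-lane encoding overhead is modelled (no packet or protocol overhead).

```python
# Sketch: per-lane effective bandwidth from raw line rate and encoding.

def effective_gbps(raw_gbps, payload_bits, block_bits):
    """Raw line rate scaled by the encoding efficiency payload/block."""
    return raw_gbps * payload_bits / block_bits


gen1 = effective_gbps(2.5, 8, 10)      # 8b/10b
gen2 = effective_gbps(5.0, 8, 10)      # 8b/10b
gen3 = effective_gbps(8.0, 128, 130)   # 128b/130b

# Fraction of line bits spent on the 2-bit sync header in 128b/130b.
gen3_overhead = 1 - 128 / 130

if __name__ == "__main__":
    print(f"Gen 1: {gen1:.2f} Gbps")
    print(f"Gen 2: {gen2:.2f} Gbps")
    print(f"Gen 3: {gen3:.2f} Gbps ({gen3_overhead * 100:.1f}% overhead)")
```

The same calculation shows why 8G with scrambling was enough to meet the bandwidth-doubling goal: 7.88 Gbps is just under twice the 4 Gbps that Gen 2 delivers per lane, without moving to a 10G line rate.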
Board Design and Simulation Tools
  PCB design guidelines chapter in Xilinx PCIe user's guide
    Updated to include Gen 3 considerations
  PCB simulation for Gen 3 designs
    Rule-of-thumb PCB design may have worked for 5Gbps, but will be difficult at 8Gbps on FR4 PCB material
    Simulation will be essential for PCB design at 8Gbps
    Mentor Hyperlynx GHz and Sigrity SystemSI simulators
    Xilinx provides transceiver IBIS-AMI model
      http://www.xilinx.com/member/ibis_ami/
  Free statistical eye simulators
    PCI-SIG Seasim
    Intel Channel Test Tool (ICTT)
Free 3rd Party Statistical Eye Simulators
  Seasim
    Open source statistical eye simulator from PCI-SIG (requires membership)
    Input: Step response of channel
    Output: Eye diagram after LE and DFE (uses behavioral DFE and LE)
    Version 0.46 supports PCIe Gen 3
    http://www.pcisig.com/specifications/pciexpress/base2/seasim_package/
  ICTT (Intel Channel Test Tool)
    Free closed source statistical eye simulator from Intel
    Input: Step response of channel
    Output: Eye diagram after LE and DFE
    Version 1.0.0 supports PCIe Gen 3
    http://www.intel.com/technology/pciexpress/devnet/resources.htm
Xilinx Chipscope Pro IBERT
  Integrated Bit Error Ratio Tester (IBERT)
    Allows hardware evaluation of high-speed links
  IBERT GUI key features
    Hardware PRBS generator and checker
    Transmitter and receiver parameter sweeping
    RX margin analysis
      Horizontal and vertical scan
      Eye diagram plot
FPGA Configuration Requirements in PCIe Systems
  Open system specification requirements
    FPGA must be configured and ready for PCIe enumeration in 100ms
      Host CPU begins PCIe enumeration upon de-assertion of the PCIe reset signal (PERST#)
      PCIe reset signal is de-asserted 100ms after the power supply PWR_OK signal is asserted (12V supply reaches 95%)
    In systems with ATX power supply, the FPGA configuration time is increased to 200ms
      ATX power supply PWR_OK signal is asserted minimum of 100ms after the 12V supply reaches 95%
  User cost reduction requirements
    User wants to use inexpensive Flash devices (SPI, QSPI, BPI, etc.) for configuration
    User wants to use existing Flash or hard drive present in the system attached to the CPU
Meeting 7 Series Configuration Requirements in PCIe Systems
  PCIe interface needs to be ready in 100ms after stable power condition
  Most 7 series devices cannot meet the 100ms timing using the popular single chip solutions (SPI, QSPI, BPI, etc.)
  The following three solutions are available to meet the 7 series 100ms configuration time requirement

  Solution: Description
  Solution 1 (Tandem PROM) (IDS 14.1): Split the configuration into two stages (Tandem)
    1st Stage: Configure just the PCIe interface (PCIe IP, Serdes, CMT, and BRAM)
    2nd Stage: Configure the remainder of the FPGA
  Solution 2 (Tandem PCIe) (IDS 14.2): Tandem configuration over PCIe
    1st Stage will use a small Flash device
  Solution 3 (Tandem PCIe with PR) (IDS 14.2): Tandem configuration over PCIe with Partial Reconfiguration (PR)

  Refer to the 7 series PCIe User's Guide for more info on Tandem PROM configuration.
Solution 1: Tandem PROM
  Both initial PCIe link configuration and user application bitstreams are stored in the same Flash device
  Initial PCIe configuration has the BitGen Persist option enabled
    This ensures the configuration IO pins continue to load the 2nd stage from the Flash after the 1st stage has completed
  BitGen reports the number of configuration frames in the 1st stage
    Used to calculate the 1st stage bitstream size
  [Diagram: Flash (Initial PCIe Configuration, Padding, User Application) feeding the FPGA and its PCIe Link]
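As a rough illustration of how a frame count turns into a bitstream size, the Python sketch below multiplies the number of frames by the 7 series frame length of 101 32-bit words. The frame count used in the example is hypothetical (chosen to land near 9 Mb), the function name is an assumption, and command/header overhead in the real bitstream is left as an optional parameter rather than modelled.

```python
# Sketch: estimate 1st stage bitstream size from a configuration frame count.

WORDS_PER_FRAME = 101   # 7 series configuration frame length in 32-bit words
BITS_PER_WORD = 32


def stage_size_bits(frame_count, overhead_bits=0):
    """Frame data size in bits, plus any caller-supplied overhead estimate."""
    if frame_count < 0 or overhead_bits < 0:
        raise ValueError("frame_count and overhead_bits must be non-negative")
    return frame_count * WORDS_PER_FRAME * BITS_PER_WORD + overhead_bits


if __name__ == "__main__":
    frames = 2785  # hypothetical value, not taken from a real BitGen report
    bits = stage_size_bits(frames)
    print(f"{frames} frames -> {bits} bits ({bits / 1e6:.2f} Mb)")
```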
Tandem PROM 1st Stage - Initial PCIe Link Configuration
  Flash configures the PCIe Block (PCIe IP, Serdes, CMT, and BRAM)
  At the end of the initial configuration, a configuration startup command is issued to bring up the FPGA and the PCIe link

  Configuration Flash   Frequency   1st Stage Config. Time
  SPI                   100 MHz     90 ms
  QSPI                  66 MHz      34 ms
  BPI x8 Synch          33 MHz      34 ms
  BPI x16 Synch         33 MHz      17 ms

  Estimates based on 9 Mb 1st stage bitstream size (7K325T)
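The estimates in the table can be reproduced by dividing the 1st stage bitstream size by the Flash interface throughput (data width times clock). The Python sketch below does this for the four listed options and compares each against the 100 ms requirement. It assumes 1 Mb = 10^6 bits, one transfer per clock, and data widths of 1, 4, 8 and 16 bits; power-on initialization and other start-up delays are not modelled, and the names are hypothetical.

```python
# Sketch: 1st stage configuration time = bitstream bits / (width * clock).

FIRST_STAGE_BITS = 9_000_000   # 9 Mb, assuming 1 Mb = 10**6 bits
BUDGET_MS = 100.0              # PCIe readiness requirement from the deck

# name -> (data width in bits, configuration clock in Hz)
FLASH_OPTIONS = {
    "SPI": (1, 100e6),
    "QSPI": (4, 66e6),
    "BPI x8 Synch": (8, 33e6),
    "BPI x16 Synch": (16, 33e6),
}


def config_time_ms(bitstream_bits, width_bits, clock_hz):
    """Time in milliseconds to shift the bitstream through the interface."""
    return bitstream_bits / (width_bits * clock_hz) * 1000.0


if __name__ == "__main__":
    for name, (width, clock) in FLASH_OPTIONS.items():
        t = config_time_ms(FIRST_STAGE_BITS, width, clock)
        verdict = "within" if t < BUDGET_MS else "exceeds"
        print(f"{name:14s} {t:6.1f} ms ({verdict} {BUDGET_MS:.0f} ms budget)")
```

Running the same function with a full-device bitstream size instead of the 1st stage size shows why the single-stage approach misses the budget on most devices with these Flash options.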
Tandem PROM 2nd Stage - Remainder of the FPGA
  PCIe enumeration/configuration occurs as normal
  The remainder of the FPGA configuration is then loaded while the PCIe enumeration/configuration is taking place
  [Diagram labels: Configuration Port, PCIe Block]
Tandem PROM Configuration Design Flow
  Coregen
    Check the option box for Fast Boot
    Generated UCF will contain Fast Boot Area Group constraint with floor planned PCIe core and partition
  Synthesis & PAR
    Normal flow, designers integrate PCIe core RTL and constraints into user application design
    No Partial Reconfiguration (License) necessary
  BitGen
    Run BitGen with Persist option
    BitGen creates single Tandem bitstream
PCIe Multi-Function
  Enables PCIe Endpoints to have multiple functions
    Each function has its own PCIe configuration space
    From a host CPU perspective, each function appears as an individual PCIe device
    Fully integrated in the Virtex-7 XT PCIe hard IP
  Enables easy software driver implementation and portability
    Driver developer can create a single driver and replicate it for each hardware function
  [Diagram: Host CPU + Chipset running Windows OS with a Gigabit Ethernet Driver and a USB 3.0 Driver, connected over 1 physical PCI Express link to the FPGA, which exposes Gigabit Ethernet as Function 0 and USB 3.0 as Function 1, each with its own Configuration Space]
PCIe Single Root I/O Virtualization (SR-IOV)
  Typically, multiple virtual machines (OS) running on a physical machine share a physical device via software emulation
    Significant impact on I/O performance
    Limits the number of virtual machines
  SR-IOV defines a method to share a physical device without software emulation
    Creates a number of virtual functions (configuration spaces) per physical device
    VMM configures the physical device to appear in the PCIe configuration space as multiple virtual functions
    Each virtual function is directly assigned to a virtual machine
  SR-IOV is fully integrated in the Virtex-7 XT PCIe hard IP
  [Diagram labels: Host CPU + Chipset, Windows OS, Linux OS, Virtual Machine Manager (VMM), PCIe, FPGA, Physical Function (Gigabit Ethernet), Virtual Function, Virtual Function]
Typical System Data Flow
  CPU performs all data transfers to/from FPGA
  Transfer rate is limited by CPU's ability to service the FPGA
  CPU is tied up managing I/O data transfers
System Data Flow Using a DMA Controller
  CPU programs the DMA controller for data transfer
  Data is transferred by the DMA controller when system bus is not used by CPU (useful work can still be done by CPU while DMA is active)
  Can achieve full bandwidth for large data transfers
DMA Controller Setup and Operation (PCIe System)
  CPU configures the DMA controller using PIO read/write operations
    Manages and allocates buffer descriptors
      Source base address
      Destination base address
      Length of the block
    Starts the DMA controller
  PIO transactions to set up descriptors happen concurrently with the DMA transfers to maximize the data flow
  A given descriptor is released when the TLP is transferred to the buffer in memory (so the OS and application know that the memory has been used)
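To make the descriptor fields concrete, the Python sketch below models a buffer descriptor with the three fields listed above and splits one large transfer into a chain of descriptors. It is a host-side illustration only: the class, field names, block size and addresses are hypothetical and do not correspond to the register layout of any particular Alliance Partner DMA controller.

```python
# Sketch: building a chain of DMA buffer descriptors for one large transfer.

from dataclasses import dataclass
from typing import List


@dataclass
class Descriptor:
    src_addr: int   # source base address
    dst_addr: int   # destination base address
    length: int     # length of the block in bytes


def build_descriptors(src_base: int, dst_base: int,
                      total_len: int, max_block: int) -> List[Descriptor]:
    """Split a transfer of total_len bytes into blocks of at most max_block."""
    if max_block <= 0:
        raise ValueError("max_block must be positive")
    if total_len < 0:
        raise ValueError("total_len must be non-negative")
    descriptors = []
    offset = 0
    while offset < total_len:
        chunk = min(max_block, total_len - offset)
        descriptors.append(
            Descriptor(src_base + offset, dst_base + offset, chunk)
        )
        offset += chunk
    return descriptors


if __name__ == "__main__":
    # Hypothetical addresses and sizes, for illustration only.
    for d in build_descriptors(0x1000, 0x8000_0000, 10_000, 4096):
        print(f"src=0x{d.src_addr:08X} dst=0x{d.dst_addr:08X} len={d.length}")
```

In a real driver, each descriptor would be written to the controller with PIO accesses and recycled once the controller reports it complete, which is what lets descriptor setup overlap with transfers already in flight.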
Alliance Partner DMA Controllers
  The Alliance Partner DMA IP features (PLDA and NW Logic)
    Full-featured, high performance, block-based or packet-based DMA controller
    DMA transfers can begin/end on any byte address without restriction
    Supports scatter-gather as well as multiple DMA channels
    Delivered with Windows or Linux driver
CORE Generator Simplifies Design Task
  Configures and connects PCIe hard IP, BRAM, CMT, and Serdes
    Unencrypted RTL wrapper source code
    Automatically inserts pipeline registers between the PCIe block and BRAM if necessary
  Provides Programmed I/O (PIO) example design
    Implementation scripts for synthesis, map and par
    User Constraints File (UCF)
  Simulation
    Simulation support for Modelsim, ISIM, VCS, NC-Sim
    Root Port and Endpoint Bus Functional Model (BFM)
Embedded Solutions for PCIe
  AXI Memory Mapped Root Port/Endpoint bridge for PCIe
    Supports up to Gen 2 x4
  AXI-CDMA
    High performance central DMA controller (generic DMA controller)
  Software drivers and example application code
    Standalone
    Embedded Linux
Key Takeaways
  Xilinx offers a PCIe solution in every 7 series family
    Gen 2 hard IP in Artix-7, Kintex-7 and Virtex-7
    Gen 3 hard IP in most Virtex-7 XT and HT devices
    Gen 3 soft IP in Kintex-7 and Virtex-7 devices
  Alliance Partners (PLDA and NW Logic) offer DMA controller solutions for the 7 series integrated PCIe block
  Xilinx provides innovative and low cost 7 series FPGA configuration schemes for PCIe applications
    Tandem PROM (IDS 14.1 release)
    Tandem PCIe (IDS 14.2 release)
    Tandem PCIe with Partial Reconfiguration (IDS 14.2 release)
Next Steps
  Learn more about the 7 series PCIe solutions
    Visit www.xilinx.com/pcie
  Purchase a Kintex-7 development kit
    Kintex-7 MMP Development Kit
      www.em.avnet.com/k7mmp
      P/N: AES-MMP-7K325T-G
      P/N: AES-MMP-BB2-G
      P/N: Selected Power Module
      Price: $1,695
      Available: May 2012
    KC705 Evaluation Kit
      www.xilinx.com/kc705
      P/N: EK-K7-KC705-CES-G
      Price: $1,695
      Available: Now
Next Steps
  See the Kintex-7 demos in the exhibit area
    K7 MMP Exhibits: Avnet
    KC705 Exhibits: Avnet and Xilinx
  Contact your local Avnet FAE
    Application and architecture reviews
    Tools demo
  Attend additional 7 series PCIe training courses
    Avnet SpeedWay hands-on workshops
    Xilinx Authorized Training Partner courses
    Visit www.xilinx.com/training for more details
Thank You
  Please Visit the Demo Area
Appendix
TE PCIe Card Edge Connectors
  TE provides PCIe Card Edge Connectors in standard sizes
    x1 (36 pins)
    x4 (64 pins)
    x8 (98 pins)
    x16 (164 pins)
  PCIe 3.0 data rates supported (8.0Gbps)
  With plastic locating posts or metal hold downs
  Straddle mount and right-angle configurations available
  Multiple tail lengths and plating options available