I ntel. QuickPath I nterconnect Overview. presented by Robert Safranek. Contributors: Gurbir Singh, Robert Maddox

Similar documents
Intelligent servers- Lower TCO, Rapid ROI and More Performance

Intel Desktop Board DZ68DB

Intel 945(GM/GME)/915(GM/GME)/ 855(GM/GME)/852(GM/GME) Chipsets VGA Port Always Enabled Hardware Workaround

Intel QuickPath Interconnect Architectural Features Supporting Scalable System Architectures

Poulson: An 8 Core 32 nm Next Generation I ntel I tanium. Processor. Steve Undy I ntel I NTEL CONFI DENTI AL

A Reconfigurable Computing System Based on a Cache-Coherent Fabric

Intel Desktop Board DH55TC

Intel Cache Acceleration Software for Windows* Workstation

Intel Desktop Board DP55SB

Intel Desktop Board D945GCCR

Evolving Small Cells. Udayan Mukherjee Senior Principal Engineer and Director (Wireless Infrastructure)

Intel QuickPath Interconnect Electrical Architecture Overview

Intel Desktop Board DG41RQ

Intel Desktop Board D975XBX2

Intel Desktop Board D945GCLF2

Intel Desktop Board DG41CN

Intel Desktop Board DH61CR

Intel Server Board S2400SC

Link Layer & Network Layer

Software Evaluation Guide for ImTOO* YouTube* to ipod* Converter Downloading YouTube videos to your ipod

Intel Desktop Board DP67DE

Intel Desktop Board DG31PR

Intel and the Future of Consumer Electronics. Shahrokh Shahidzadeh Sr. Principal Technologist

Intel Desktop Board DH61SA

Design of Vhost-pci - designing a new virtio device for inter-vm communication

Intel Setup and Configuration Service. (Lightweight)

Intel Desktop Board D946GZAB

Intel Atom Processor Based Platform Technologies. Intelligent Systems Group Intel Corporation

2013 Intel Corporation

Expand Your HPC Market Reach and Grow Your Sales with Intel Cluster Ready

Making the Cloud Work for You Introducing the Intel Xeon Processor E5 Family

How to Create a.cibd File from Mentor Xpedition for HLDRC

Data Center Energy Efficiency Using Intel Intelligent Power Node Manager and Intel Data Center Manager

LED Manager for Intel NUC

How to Create a.cibd/.cce File from Mentor Xpedition for HLDRC

Maximize Performance and Scalability of RADIOSS* Structural Analysis Software on Intel Xeon Processor E7 v2 Family-Based Platforms

Intel Serial to Parallel PCI Bridge Evaluation Board

Intel Desktop Board DQ35JO

Intel Transparent Computing

Drive Recovery Panel

Intel G31/P31 Express Chipset

Upgrading Intel Server Board Set SE8500HW4 to Support Intel Xeon Processors 7000 Sequence

Intel 848P Chipset. Specification Update. Intel 82848P Memory Controller Hub (MCH) August 2003

Intel Desktop Board D945GCLF

Intel Xeon Phi Coprocessor. Technical Resources. Intel Xeon Phi Coprocessor Workshop Pawsey Centre & CSIRO, Aug Intel Xeon Phi Coprocessor

An Introduction to Com puter Networks

The Transition to PCI Express* for Client SSDs

KVM as The NFV Hypervisor

Intel Server Board S2600STB

The Intel Processor Diagnostic Tool Release Notes

Intel Virtualization Technology Roadmap and VT-d Support in Xen

Intel s next generation processors for IBM System x servers

Software Evaluation Guide for CyberLink MediaEspresso *

Intel Stereo 3D SDK Developer s Guide. Alpha Release

Intel Cache Acceleration Software - Workstation

Intel Desktop Board DP45SG

Intel NetStructure DMN160TEC ISDN Call Control Performance Testing

Non-Volatile Memory Cache Enhancements: Turbo-Charging Client Platform Performance

Sample for OpenCL* and DirectX* Video Acceleration Surface Sharing

Intel Server Board S5520HC

Data Plane Development Kit

Intel Unite. Intel Unite Firewall Help Guide

Omni-Path Cluster Configurator

Intel Desktop Board DQ57TM

Intel Server Board S2600CW2S

Mary Yeoh Intel Penang Design Center (ipdc) Intel Corporation Penang, Malaysia

Intel Unite Plugin Guide for VDO360 Clearwater

Intel Desktop Board D945GSEJT

OpenCL* and Microsoft DirectX* Video Acceleration Surface Sharing

Solid-State Drive System Optimizations In Data Center Applications

Intel Atom Processor E3800 Product Family Development Kit Based on Intel Intelligent System Extended (ISX) Form Factor Reference Design

AMD and OpenCL. Mike Houst on Senior Syst em Archit ect Advanced Micro Devices, I nc.

Intel & Lustre: LUG Micah Bhakti

Technology challenges and trends over the next decade (A look through a 2030 crystal ball) Al Gara Intel Fellow & Chief HPC System Architect

Performance and Energy Efficiency of the 14 th Generation Dell PowerEdge Servers

Supra-linear Packet Processing Performance with Intel Multi-core Processors

Intel Platform Administration Technology Quick Start Guide

Theory and Practice of the Low-Power SATA Spec DevSleep

Intel Server Board S1200BTS

Open FCoE for ESX*-based Intel Ethernet Server X520 Family Adapters

GraphBuilder: A Scalable Graph ETL Framework

GUID Partition Table (GPT)

Intel Server Board S1200V3RPO Intel Server System R1208RPOSHORSPP

Intel Core TM i7-4702ec Processor for Communications Infrastructure

Intel Desktop Board DH61HO. MLP Report. Motherboard Logo Program (MLP) 09/20/2012

Intel Server Board S1200V3RPO Intel Server System R1208RPOSHORSPP

True Scale Fabric Switches Series

Intel vpro Technology Virtual Seminar 2010

Intel Setup and Configuration Service Lite

Lenovo ThinkCentre M90z with Intel vpro Technology. Stefan Richards Intel Corporation Business Client Platform Division

Innovating and Integrating for Communications and Storage

BIOS Update Release Notes

Intel Server Board S3000PT Spares/Parts List and Configuration Guide for Production Products

Intel Education Theft Deterrent Release Note WW16'14. August 2014

Intel Dialogic CPi/200B2 and CPi/400B2 Fax Boards

Optimizing the operations with sparse matrices on Intel architecture

Making Nested Virtualization Real by Using Hardware Virtualization Features

Intel 865G/865GV/865PE/865P Chipset

Intel Desktop Board D845PT Specification Update

Software Evaluation Guide Adobe Premiere Pro CS3 SEG

Transcription:

Overview presented by Robert Safranek Contributors: Gurbir Singh, Robert Maddox

Legal Disclaim er INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS. Intel may make changes to specifications and product descriptions at any time, without notice. All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Intel, Intel Inside, and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others. Hot Copyright Chips 21 2009 Intel Corporation. 2

High Perform ance / Low pin count syst em int erface New Syst em Choices Efficient scaling Num ber of processors Plat form Topologies I m proved System Robustness Error Detection & Recovery Layered Archit ect ure Modular, Flexible Prot ocol Rout ing Physical Packet s Flits Phits Prot ocol Rout ing Physical Applied to Xeon and I tanium processor- based syst em s 3

Com m on Plat form Configurat ions 1 and 2 Socket Server Topologies Node Cont roller based Topologies 4 and 8 Socket Server Fully ( and Part ially) Connect ed Topologies 4

Physical Layer Prot ocol Rout ing Packet s Prot ocol Rout ing QPI -Pair Two sets of Unidirectional links Physical Flits Phits Physical Transm itter provides a Forwarded Clock Full width -Pair is 84 signals Widths - 20 or 10 or 5 lanes Data Rate: up to 6.4 GT/ s Full Width is 12.8GB/ s (or 25.6GB/ s for a -Pair) BW Calculation is based on data payload only Ot her Physical Layer feat ures Polarit y I nversion and Lane Reversal Transm itter Equalization Probe- less Test ing 5

Logical View of a Pair 6

Transm itter Discrete-tim e Linear Equalizer Data in c -1 c 0 c 1 c 2 + Channel QPI Tx circuit reshapes w aveform to m atch channel characteristics 7

Coefficient Search Proper selection of Tx coefficients is critical to open Data Eye at the Receiver 8

Support for Probe- Less Test ing At 6.4GT/ s physical probing of a link is no longer an option Physical layer provides additional diagnostic hooks to support several variants of loopback testing Digit al Near- End Loopback Local I nter/ I ntra Loopback Rem ote Loopback 9

Layer Prot ocol Rout ing Packet s Prot ocol Rout ing Flow Cont rol Credit/ debit schem e Physical Flits Phits Physical Higher layer services Arbitrate am ong the m ultiple m essage classes (HOM / NCS / NCB / NDR / DRS / SNP / Special flits - link layer) Manage the m ultiple virtual networks Prevent prot ocol deadlocks Ensure the BW is realizable Error Checking and Recovery CRC checking level ret ry recovery Self-healing for hard errors 10

Layer Quantum of I nform ation Physical Layer deals with Phits (20, 10 or 5 bits) Protocol Layer deals with Messages (or Packets) Layer works at a Flit granularity Always 72 Bits of Payload + 8 Bits CRC Full Width Half Width Quarter Width 11

Virt ual Channels Mapping Funct ion Layer m aps Messages t o VN s 6 Non- blocking m essage classes 3 Virtual networks defined (VN0, VN1, VNA) 18 Virt ual channel com binat ions 1 3 Credit Pools 6 for m essage classes on VN0 (Packets) 6 for m essage classes on VN1 (Packets) 1 for everything on VNA (Flits) Note: Message Class and VN inform ation are encoded in the fields of the header flit 12

Message Classes Protocol events are grouped into m essage classes to prevent undesirable dependencies, I.e., situations where dependencies creat e deadlock or livelock scenarios 7 Total Message Classes with 6 for the Protocol Layer N oncoherent Classes Response Classes Coherent Classes NC Standard NC Bypa ss No- Data Re sponse Data Re sponse HOM Snoop Special Noncoherent Request s Noncoherent W rit es Com plet ions - Pair Dat a Com plet ions Hom e Node Request s, Snoop Responses Peer Node Snoops 13

I nt erleaving of Messages Referred to as Com m and I nsert I nterleave (or CI I ) Allows for link layer to insert single (& dual) flit m essages into a m ulti-flit m essages Benefits of CI I Elim inates any reason to Store&Forward packets in the route through case. Elim inates issues that could potentially inject bubbles into the link Minim izes lead off latency without sacrificing link utilization H 1 9 14

CRC Protection Properties Per Flit Calculation (not packet) No Additional latency incurred collecting the rest of the packet Protocol Flits can be consum ed im m ediately CRC8 Properties of a Flit All 1, 2, and 3 bit errors are detected Any odd num ber of bit errors is detected All bit errors of burst length 8 or less are detected 99 percent of all errors of burst length 9 are detected 99.6 percent of all errors of burst length greater than 9 are detected Additional CRC Protection Option Rolling CRC across two flits I ncur an additional flit delay in latency 15

Rout ing Layer Routing algorithm s Specified through Routing Tables At source based on System Address Prot ocol Rout ing At interm ediate points - based on destination Node I D Program m ing done by firm ware provides a flexible and distributed m ethod to route QuickPath I nterconnect transactions from source to destination Program m able Routing Table is key to supporting higher level system features OL*, Dynam ic Partitioning, etc. Physical Packet s Flits Phits Prot ocol Rout ing Physical

Prot ocol Layer Cat egories of Transact ions Coherent / Non-Coherent Virtual Legacy Wires Power Managem ent Flexible cache coherency prot ocol opt ions Source Snoop option m echanism Caching Agent issues snoops (or spawned by system topology) to other CA s and snoop responses to the Hom e Agent Hom e Agent collect s snoop responses Hom e Snoop option m echanism Caching Agent issues request to Hom e Hom e Agent m ulticasts snoop, hom e node collects snoop responses (directory based protocol) MESI F cache states (Modified, Exclusive, Shared, I nvalid, Forwarding) HOM m essage class per source/ destination pair is ordered for sam e addresses Coherent request s and snoop responses are on HOM Pre-allocated resources at hom e agent to prevent deadlock / forward progress issues No Ret ries / Built in conflict resolut ion Prot ocol Rout ing Physical Packet s Flits Phits Prot ocol Rout ing Physical 17

Agent Types Configurat ion Agent ( self explanat ory) Caching Agent I ssues request to hom e after cache m iss Supplies data on snoop hits or forward requests from Hom e Detects and acknowledges conflicts to Hom e Responds with acknowledgem ent of conflict if snooped on an outstanding request or forced by a Hom e Hom e Agent ( coherency/ order cont roller) Front end of a m em ory controller Tracks every potential outstanding request Recipient of caching agent s request s Recipient of all snoop responses Decides the next owner in address conflict cases Caching agents report their observed conflicts to Hom e Forces acknowledgem ent response t o am biguous conflict cases 18

Message Class Dependency Diagram 19

Convention for Flow Diagram s A B C H MC Caching Agents Home Agent Memory Controller Message traveling along ordered HOM Message Class Message traveling along unordered snoop or response channel Allocate requester entry or home agent tracking entry De-allocate requester entry or home agent tracking entry Time progresses in a down direction of the Flow Diagram 20

RdData request after Cache Miss A B C H I I I MC I E 21

RdI nvown with C2C Transfer A B C H I M I MC I M M I

I nvalidating Write with no Cached Copy A (I W) B C H I I I MC I E E M M I

Where is this QPI Headed? Server I nterconnect Strategy for years to com e QPI is m ore than a link definition it is an infrastructure for Legacy Support for pre- exist ing soft ware Efficient Processing m arket segm ent s feat ures Low Lat ency / High BW Topology I nvalidat ing Writ e Flow / Snoop Spawning System Power Managem ent (via PMReq Message) Reliabilit y/ Availabilit y/ Serviceabilit y feat ures for t he Highend and Mission Critical m arket segm ents, such as Clock Fail- safe / Self- healing / Rolling CRC Ability to re-route around a failed link Static / Dynam ic Partitioning and OL* usage of the Quiesce Message / Program m able Rout e Tables I nter-socket m em ory m irroring / DI MM sparing / etc 24

For More I nform ation Web-based info: http: / / www.intel.com / technology/ quickpath/ index.htm http: / / www.intel.com / technology/ quickpath/ whitepaper.pdf Book published by I ntel Press: W eaving High Perform ance Multiprocessor Fabric Architectural I nsights to the QuickPath I nt erconnect By Robert A. Maddox, Gurbir Singh and Robert J. Safranek http: / / www.intel.com / intelpress/ sum _qpi.htm Training from MindShare* http: / / www.m indshare.com Cont act Ravi Budruk ravi@m indshare.com 25

I nt el Xeon 5500 Perform ance Publicat ions SPECint*_rate_base2006 241 score (+72%) SPECpower*_ssj2008 1977 ssj_ops/watt (+74%) IBM J9* JVM SPECfp*_rate_base2006 197 score (+128%) SPECjAppServer*2004 TPC*-C SAP-SD* 2-Tier 3,975 JOPS (+93%) 631,766 tpmc (+130%) 5,100 SD Users (+103%) Oracle WebLogic* Server Oracle 11g* database SAP* ERP 6.0/IBM DB2* VMmark* 24.35 @17 tiles(+166%) VMware* ESX 4.0 Fluent* 12.0 benchmark Geo mean of 6 (+127%) ANSYS FLUENT* TPC*-E 800 tpse (+152%) Microsoft SQL Server* 2008 SPECjbb*2005 604,417 BOPS (+64%) IBM J9* JVM SPECWeb*2005 75023 score (+150%) Rock Web* Server SPECapc* for Maya 6.5 7.70 score (+87%) Autodesk* Maya Over 30 New 2S Server and Workstation World Records! Percentage gains shown are based on com parison to Xeon 5400 series; Perform ance results based on published/ subm itted results as of April 27, 2009. Platform configuration details are available at http: / / www.intel.com / perform ance/ server/ xeon/ sum m ary.htm * Other nam es and brands m ay be claim ed as the property of others Perform ance tests and ratings are m easured using specific com puter system s and/ or com ponents and reflect the approxim ate perform ance of I ntel products as m easured by those tests. Any difference in system hardware or software design or configuration m ay affect actual perform ance. Buyers should consult other sources of inform at ion t o evaluat e t he perform ance of syst em s or com ponent s t hey are considering purchasing. For m ore inform at ion on perform ance t est s and on t he perform ance of I nt el product s, visit I ntel Perform ance Benchm ark Lim itations