Building blocks for custom HyperTransport solutions


1 Building blocks for custom HyperTransport solutions
Holger Fröning
2nd Symposium of the HyperTransport Center of Excellence
February 2009, Mannheim, Germany

2 Motivation
Back in 2005:
- Quite some experience with PCI(-X)
- Convinced of the HyperTransport protocol
  - Lean and efficient, low latency, possibility of coherency, direct connection
  - Highly suitable for networking and FPGA technologies
- Learned about HTX at ISC05
OK, now let's do this

3 Outline
- HT 1.x/2.x
- HT 3.x
- HT-based on-chip network
- Conclusion

4 HyperTransport 2.x
- Established chip-to-chip and board-to-board interconnect technology
- AMD's Direct Connect Architecture
- Lean packet protocol
- Parallel links
  - Dedicated source clocking
  - No 8b/10b coding
  - No clock recovery
- Variable link width and frequency
- Compared to PCIe
  - Higher bandwidth
  - Much lower latency
  - Less protocol overhead
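The "No 8b/10b coding" point can be made concrete with a little arithmetic. The following is a minimal sketch, not a figure from the talk: the function name and the example raw rate are assumptions chosen for illustration, and packet-level protocol overhead is ignored on both sides.

```python
def payload_rate_gbit(raw_gbit_per_s: float, line_coding: str = "none") -> float:
    """Usable bit rate on one signal after line coding (packet overhead ignored)."""
    # 8b/10b sends 10 line bits for every 8 payload bits.
    efficiency = {"none": 1.0, "8b10b": 8 / 10}
    return raw_gbit_per_s * efficiency[line_coding]


# Illustrative comparison at the same raw signalling rate (2.5 Gbit/s is the
# raw rate of a first-generation PCIe lane; using it here is an assumption
# made only to have a common baseline).
coded = payload_rate_gbit(2.5, "8b10b")
uncoded = payload_rate_gbit(2.5, "none")
print(f"with 8b/10b: {coded:.2f} Gbit/s, without: {uncoded:.2f} Gbit/s")
```

At equal raw signalling rate, a link without 8b/10b coding carries 25% more payload bits than an 8b/10b-coded one.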

5 HT Core: ncHT version
- HT Cave unit
- 8 or 16 bit HT links
- Efficient queue interface
  - P, NP and R queues, each with 96 bit CMD & 64 bit DATA
- Resource usage
  - V4-FX60: 11% of slices
  - V4-FX100: 7% of slices
- FPGA & ASIC technologies
  - Xilinx, Altera, Lattice
  - IBM 130nm
- Open source
[Figure: AMD Opteron connected via HT to the HT Core; NP, P and R CMD & DATA queues in both directions between the HT Core and the User Module]
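The queue interface described above can be pictured with a small software model. This is only a sketch under stated assumptions: the class and method names are hypothetical, the channel names P, NP and R are taken from the slide, and the real core is a Verilog design whose signal-level interface is not shown in the talk.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List

CMD_BITS = 96   # command width per queue entry, as stated on the slide
DATA_BITS = 64  # data word width per queue entry, as stated on the slide


@dataclass
class Packet:
    """One queue entry: a 96 bit command plus zero or more 64 bit data words."""
    cmd: int
    data: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0 <= self.cmd < (1 << CMD_BITS):
            raise ValueError("command does not fit in 96 bits")
        for word in self.data:
            if not 0 <= word < (1 << DATA_BITS):
                raise ValueError("data word does not fit in 64 bits")


class QueueInterface:
    """Hypothetical model: one independent FIFO per channel (P, NP, R)."""
    CHANNELS = ("P", "NP", "R")

    def __init__(self) -> None:
        self.queues = {ch: deque() for ch in self.CHANNELS}

    def push(self, channel: str, packet: Packet) -> None:
        self.queues[channel].append(packet)

    def pop(self, channel: str) -> Packet:
        return self.queues[channel].popleft()


# Usage: traffic in one channel does not reorder or block another channel.
iface = QueueInterface()
iface.push("NP", Packet(cmd=0x1))
iface.push("P", Packet(cmd=0x2, data=[0xDEADBEEF]))
first_posted = iface.pop("P")
```

Separate FIFOs per channel are the point of the model: a stalled non-posted request does not hold up posted traffic.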

6 HT Core: cHT version
- Same features as ncHT Core
- Not open source
  - Available from AMD under NDA
- Introduces 4th queue for probes & broadcasts
- Some additional special packets
- Different device header
- Coherent Memory Controller/Cache
  - Not part of cHT Core
  - Must be implemented separately
  - Partial implementation of coherency protocol possible
[Figure: HT Core, coherent version]

7 HT Core Features
- Fully synchronous design
- Efficient pipelined structure
- Synthesizable Verilog HDL code
- Ultra low latency
  - Incoming: 12 cycles
  - Outgoing: 6 cycles
- Xilinx V4 tech: HT200 & HT400 mode, 200MHz, 8 or 16 bit HT interface
- Xilinx V5 tech: HT800 mode, 200MHz, 8 bit HT interface (preliminary)
- IBM 130nm tech: HT800 mode, 400MHz, 8 or 16 bit HT interface
[Figure: HT Core block diagram with Deserializer, Sync, Decode, RX Buffer, Init, Config, CRC Gen, Credit, Serializer, Mux, Outputgen and TX Buffer blocks]
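The cycle counts above become times once a clock frequency is fixed. The sketch below assumes the 12 and 6 cycles are counted in core clock cycles at the frequencies listed on the slide; the slide does not state this explicitly, and the helper name is hypothetical.

```python
def cycles_to_ns(cycles: int, clock_mhz: float) -> float:
    """Convert a cycle count to nanoseconds for a given clock in MHz."""
    return cycles * 1000.0 / clock_mhz


# Assumption: latencies are counted in core clock cycles.
for tech, clock_mhz in (("Xilinx FPGA, 200MHz", 200.0), ("IBM 130nm, 400MHz", 400.0)):
    incoming = cycles_to_ns(12, clock_mhz)
    outgoing = cycles_to_ns(6, clock_mhz)
    print(f"{tech}: incoming {incoming:.0f} ns, outgoing {outgoing:.0f} ns")
```

Under that assumption the core adds 60 ns on the incoming and 30 ns on the outgoing path at 200MHz, and half of that at 400MHz.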

8 HT Core Status & Users
- Status
  - HT Core FPGA proven
  - ASIC ports ongoing

9 HTX-Board
- First available FPGA-based test bed with HTX connection
- Reference design of HTC
- Validated with
  - HP DL145 G3
  - HP DL165 G5
  - Tyan N3600R / S2912E
  - Supermicro H8QME-2+
  - (Iwill DK8-HTX)
- Features
  - Xilinx Virtex-4 FX100
  - 16 bit wide HT link
  - Six high speed serial links
  - Flash memory
  - Embedded DDR2-DRAM

10 HTX-Board Block Diagram
- Programming options
  - JTAG
  - PROM
  - USB
- HT PHY
  - Based on parallel I/O
  - Lowest latency method possible

11 HTX-Board Current Status
- Mature product, successfully used in academia and industry
- 4th generation currently in production
- Demonstrator for developments of the Computer Architecture Group
  - Ultra low latency interconnect networks
  - Fine grain communication
- HT Core users
  - AMD, Dresden, Germany
  - AMD, Bellevue, USA
  - Cadence Design Systems, San Jose, USA
  - Dialogic Corporation, Parsippany, NJ, USA
  - Dolphinics, Oslo, Norway
  - Google
  - IBM, Yorktown Heights, NY, USA
  - IBM, Research Center, China
  - SUN Microsystems, CA, USA
  - Technical University Chemnitz, Germany
  - University of Karlsruhe, Germany
  - Universidad Politécnica de Valencia, Spain
  - Georgia Tech, USA
  - TU Delft, Netherlands

12 HTX-Board as HyperTransport Verification Platform
- Use as
  - Verification platform
  - Rapid prototyping station
  - Bring-up & test of new designs

13 HTX-Board as Interconnection Network
- Use as fully-featured development
- Six links enable direct networks
  - No external switching required
- Special cluster features
  - SPD for non-volatile data
  - Programming using USB
- Examples
  - EXTOLL: fine grain communication
  - Valencia Cluster: HT native extension
  - Particle physics experiments: clock distribution & synchronization, data & control transfer
- Start-up latency ~1us
- See paper presentation at WHTRA tomorrow
[Figure courtesy of Universidad Politécnica de Valencia]

14 HT Core & HTX Board Performance
[Figure: DMA bandwidth, Xilinx V4, HT400 16 bit. Bandwidth [MB/s] over transfer size [Byte]; series: HT-400 HP DMA write, HT-400 HP DMA read]
[Figure: PIO read latency [µs]; callout: 180ns]

PIO read latency
         Iwill [µs]   HP [µs]
HT200    0,371        0,330
HT400    0,195        0,

15 Outline
- HT 1.x/2.x
- HT 3.x
- HT-based on-chip network
- Conclusion

16 HyperTransport 3.x
- HT3 supersedes HT1 with
  - Higher frequencies
  - Improved error handling
  - Link splitting
  - AC mode
- HTX3 specification published 2008
  - Support for additional CTL pair
  - Improvements for FPGAs
- Transfer rates
  - HT 1.x/2.x: 400MT/s to 2.8GT/s; Opteron: up to 1GT/s
  - HT 3.x: 2.4GT/s to 6.4GT/s; Opteron: not yet available
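The transfer rates above map to peak link bandwidth once a link width is chosen. The sketch below is illustrative: the helper name is hypothetical, the 16 bit width is just one of the widths mentioned in the talk, and the result is the raw per-direction peak with no protocol overhead subtracted.

```python
def link_bandwidth_gbyte(width_bits: int, transfer_rate_gt: float) -> float:
    """Raw peak bandwidth per direction in GB/s: one transfer moves width_bits."""
    return width_bits / 8 * transfer_rate_gt


# Range endpoints quoted on the slide, evaluated for a 16 bit link.
for label, rate_gt in (
    ("HT 1.x/2.x min", 0.4),
    ("HT 1.x/2.x max", 2.8),
    ("HT 3.x min", 2.4),
    ("HT 3.x max", 6.4),
):
    print(f"{label}: {link_bandwidth_gbyte(16, rate_gt):.1f} GB/s per direction")
```

Halving the link width, for example when a 16 bit link is split into two 8 bit links, halves the per-link figure.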

17 HTX3 Link splitting scenarios
[Figure: link splitting scenarios; labels: HT3 8bit, HT3 8bit, HT3, HT3 16bit]

18 HT3 Core
- cHT & ncHT versions
- HT Cave unit
- Efficient queue interface
  - Posted, Non-Posted, Response
  - Each 96 bit CMD & 128 bit DATA
- Resource usage
  - V5-LXT110: 70% of slices
- Xilinx XC5V technology
  - 8 bit, 150MHz core clock, 2.4GB/s: current implementation
  - 16 bit, 300MHz core clock, 4.8GB/s: future plan, frequency challenging for XC5V
- See paper presentation at WHTRA tomorrow!
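The core clock and bandwidth figures above are consistent with a 128 bit data path that moves one word per cycle. The sketch below checks that arithmetic; the one-word-per-cycle behaviour is an assumption, not something the slide states, and the helper name is hypothetical.

```python
def datapath_bandwidth_gbyte(data_width_bits: int, core_clock_mhz: float) -> float:
    """Peak data path bandwidth in GB/s, assuming one data word per clock cycle."""
    return data_width_bits / 8 * core_clock_mhz / 1000.0


DATA_WIDTH_BITS = 128  # DATA width of the HT3 Core queue interface

current = datapath_bandwidth_gbyte(DATA_WIDTH_BITS, 150.0)  # 8 bit link configuration
planned = datapath_bandwidth_gbyte(DATA_WIDTH_BITS, 300.0)  # 16 bit link configuration
print(f"150MHz: {current:.1f} GB/s, 300MHz: {planned:.1f} GB/s")
```

This reproduces the 2.4GB/s and 4.8GB/s figures, which is why doubling the link width requires doubling the core clock when the internal data width stays at 128 bit.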

19 HTX3 Xilinx Board
- First available FPGA-based test bed with HTX3 connection
- HT3 PHY design
  - Based on GTPs (up to 3.75GBit/s)
  - Supports link splitting
- Features
  - XC5VLX110T & 2x LX50T
  - 16 bit wide HT3 link
  - 2 CX-4 connectors (IB, 10GigEth)
  - SO-DIMM connector
- Supported by [logos not preserved]

20 HTX3 Altera Board (preliminary)
- Altera Stratix-IV based
- Single FPGA solution
- HT3 PHY design
  - Based on GXs (up to 8.5GBit/s)

21 Outline
- HT 1.x/2.x
- HT 3.x
- HT-based on-chip network
- Conclusion

22 HT-based on-chip network
- Need to interconnect multiple modules
  - FUs, RFs, TLBs, Caches, MCs
  - Don't introduce protocol conversion
- Solution: HTAX
  - Non-blocking crossbar
  - Protocol derived from HT
  - Support for more endpoints & source tags
  - Extended packet length
  - Fully configurable, scalable, low latency
[Figure: latency annotations: 6/12 cycles, 3 cycles, 2 cycles]

23 Outline
- HT 1.x/2.x
- HT 3.x
- HT-based on-chip network
- Conclusion

24 Conclusion (1): Availability
- ncHT Core
  - Fully verified
  - Sponsored by AMD
  - Open source
  - Downloadable for free after short registration
- cHT Core
  - Fully verified
  - Sponsored by AMD
  - Available for free, NDA required
  - Contact AMD
- HTX-Board
  - Fully verified
  - Available from University of Heidelberg
  - Contact us for pricing
- ncHT3 Core
  - Under development
- cHT3 Core
  - Design finished
  - Extended verification phase
  - Contact us
- Xilinx HTX3 Board
  - Prototyping phase finished
  - Contact us
- Altera HTX3 Board
  - Board design finished
  - Prototypes expected Q2/2008
  - Small volume expected Q3/

25 Conclusion (2)
- HTX-Board / HT Core
  - PIO access: 180ns
  - DMA BW: ~1.4GB/s
- Complete framework
  - Turnkey solution for accelerator applications
  - (c/nc) HT/HT3 Core
  - HTAX crossbar
  - RF
  - TLB
  - Caches
  - Upcoming cmc
- HTX3 Board(s) / HT3 Core
  - Design of cHT3 Core finished!
[Figure labels: USER, COMM]

A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing

A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing A HT3 Platform for Rapid Prototyping and High Performance Reconfigurable Computing Second International Workshop on HyperTransport Research and Application (WHTRA 2011) University of Heidelberg Computer

More information

Leveraging HyperTransport for a custom high-performance cluster network

Leveraging HyperTransport for a custom high-performance cluster network Leveraging HyperTransport for a custom high-performance cluster network Mondrian Nüssle HTCE Symposium 2009 11.02.2009 Outline Background & Motivation Architecture Hardware Implementation Host Interface

More information

Maintaining Cache Coherency with AMD Opteron Processors using FPGA s. Parag Beeraka February 11, 2009

Maintaining Cache Coherency with AMD Opteron Processors using FPGA s. Parag Beeraka February 11, 2009 Maintaining Cache Coherency with AMD Opteron Processors using FPGA s Parag Beeraka February 11, 2009 Outline Introduction FPGA Internals Platforms Results Future Enhancements Conclusion 2 Maintaining Cache

More information

Flexible Architecture Research Machine (FARM)

Flexible Architecture Research Machine (FARM) Flexible Architecture Research Machine (FARM) RAMP Retreat June 25, 2009 Jared Casper, Tayo Oguntebi, Sungpack Hong, Nathan Bronson Christos Kozyrakis, Kunle Olukotun Motivation Why CPUs + FPGAs make sense

More information

HyperTransport Extending Technology Leadership

HyperTransport Extending Technology Leadership HyperTransport Extending Technology Leadership International HyperTransport Symposium 2009 February 11, 2009 Mario Cavalli General Manager HyperTransport Technology Consortium HyperTransport Extending

More information

HyperTransport. Dennis Vega Ryan Rawlins

HyperTransport. Dennis Vega Ryan Rawlins HyperTransport Dennis Vega Ryan Rawlins What is HyperTransport (HT)? A point to point interconnect technology that links processors to other processors, coprocessors, I/O controllers, and peripheral controllers.

More information

S2C K7 Prodigy Logic Module Series

S2C K7 Prodigy Logic Module Series S2C K7 Prodigy Logic Module Series Low-Cost Fifth Generation Rapid FPGA-based Prototyping Hardware The S2C K7 Prodigy Logic Module is equipped with one Xilinx Kintex-7 XC7K410T or XC7K325T FPGA device

More information

Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds

Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds Jeffrey Young, Sudhakar Yalamanchili School of Electrical and Computer Engineering, Georgia Institute of Technology Motivation

More information

A (Very Hand-Wavy) Introduction to. PCI-Express. Jonathan Heathcote

A (Very Hand-Wavy) Introduction to. PCI-Express. Jonathan Heathcote A (Very Hand-Wavy) Introduction to PCI-Express Jonathan Heathcote Motivation Six Week Project Before PhD Starts: SpiNNaker Ethernet I/O is Sloooooow How Do You Get Things In/Out of SpiNNaker, Fast? Build

More information

Pactron FPGA Accelerated Computing Solutions

Pactron FPGA Accelerated Computing Solutions Pactron FPGA Accelerated Computing Solutions Intel Xeon + Altera FPGA 2015 Pactron HJPC Corporation 1 Motivation for Accelerators Enhanced Performance: Accelerators compliment CPU cores to meet market

More information

Compute Node Design for DAQ and Trigger Subsystem in Giessen. Justus Liebig University in Giessen

Compute Node Design for DAQ and Trigger Subsystem in Giessen. Justus Liebig University in Giessen Compute Node Design for DAQ and Trigger Subsystem in Giessen Justus Liebig University in Giessen Outline Design goals Current work in Giessen Hardware Software Future work Justus Liebig University in Giessen,

More information

INT 1011 TCP Offload Engine (Full Offload)

INT 1011 TCP Offload Engine (Full Offload) INT 1011 TCP Offload Engine (Full Offload) Product brief, features and benefits summary Provides lowest Latency and highest bandwidth. Highly customizable hardware IP block. Easily portable to ASIC flow,

More information

Integrating FPGAs in High Performance Computing A System, Architecture, and Implementation Perspective

Integrating FPGAs in High Performance Computing A System, Architecture, and Implementation Perspective Integrating FPGAs in High Performance Computing A System, Architecture, and Implementation Perspective Nathan Woods XtremeData FPGA 2007 Outline Background Problem Statement Possible Solutions Description

More information

Dynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization

Dynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization Dynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization Jeffrey Young, Sudhakar Yalamanchili School of Electrical and Computer Engineering, Georgia Institute of Technology Talk

More information

Experience with the NetFPGA Program

Experience with the NetFPGA Program Experience with the NetFPGA Program John W. Lockwood Algo-Logic Systems Algo-Logic.com With input from the Stanford University NetFPGA Group & Xilinx XUP Program Sunday, February 21, 2010 FPGA-2010 Pre-Conference

More information

Introduction to the OpenCAPI Interface

Introduction to the OpenCAPI Interface Introduction to the OpenCAPI Interface Brian Allison, STSM OpenCAPI Technology and Enablement Speaker name, Title Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration

More information

An FPGA based Verification Platform for HyperTransport 3.x

An FPGA based Verification Platform for HyperTransport 3.x An FPGA based Verification Platform for HyperTransport 3.x Heiner Litz Holger Froening Maximilian Thuermer Ulrich Bruening University of Heidelberg Computer Architecture Group Germany {heiner.litz, holger.froening,

More information

The HTX-Board: A Rapid Prototyping Station

The HTX-Board: A Rapid Prototyping Station The HTX-Board: A Rapid Prototyping Station Holger Fröning Mondrian Nüssle David Slogsnat Heiner Litz Ulrich Brüning University of Mannheim Computer Architecture Group B6,26 68159 Mannheim, Germany {froening,nuessle,slogsnat,heiner.litz,bruening}@uni-mannheim.de

More information

Gate Estimate. Practical (60% util)* (1000's) Max (100% util)* (1000's)

Gate Estimate. Practical (60% util)* (1000's) Max (100% util)* (1000's) The Product Brief October 07 Ver. 1.3 Group DN9000K10PCIe-4GL XilinxVirtex-5 Based ASIC Prototyping Engine, 4-lane PCI Express (Genesys Logic PHYs) Features PCI Express (4-lane) logic prototyping system

More information

HORUS. Large Scale SMP for Opterons

HORUS. Large Scale SMP for Opterons HORUS Large Scale SMP for Opterons Rich Oehler Rajesh Kota 23 August 2004 1 Outline Newisys, Inc. A Sanmina-SCI company Limits of Scalability on Opteron Horus Our Custom ASIC System Management around Horus

More information

Maximizing heterogeneous system performance with ARM interconnect and CCIX

Maximizing heterogeneous system performance with ARM interconnect and CCIX Maximizing heterogeneous system performance with ARM interconnect and CCIX Neil Parris, Director of product marketing Systems and software group, ARM Teratec June 2017 Intelligent flexible cloud to enable

More information

Multi-Gigabit Transceivers Getting Started with Xilinx s Rocket I/Os

Multi-Gigabit Transceivers Getting Started with Xilinx s Rocket I/Os Multi-Gigabit Transceivers Getting Started with Xilinx s Rocket I/Os Craig Ulmer cdulmer@sandia.gov July 26, 2007 Craig Ulmer SNL/CA Sandia is a multiprogram laboratory operated by Sandia Corporation,

More information

INT G bit TCP Offload Engine SOC

INT G bit TCP Offload Engine SOC INT 10011 10 G bit TCP Offload Engine SOC Product brief, features and benefits summary: Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx/Altera FPGAs or Structured ASIC flow.

More information

Spartan-6 & Virtex-6 FPGA Connectivity Kit FAQ

Spartan-6 & Virtex-6 FPGA Connectivity Kit FAQ 1 P age Spartan-6 & Virtex-6 FPGA Connectivity Kit FAQ April 04, 2011 Getting Started 1. Where can I purchase a kit? A: You can purchase your Spartan-6 and Virtex-6 FPGA Connectivity kits online at: Spartan-6

More information

Snoop-Based Multiprocessor Design III: Case Studies

Snoop-Based Multiprocessor Design III: Case Studies Snoop-Based Multiprocessor Design III: Case Studies Todd C. Mowry CS 41 March, Case Studies of Bus-based Machines SGI Challenge, with Powerpath SUN Enterprise, with Gigaplane Take very different positions

More information

Atlys (Xilinx Spartan-6 LX45)

Atlys (Xilinx Spartan-6 LX45) Boards & FPGA Systems and and Robotics how to use them 1 Atlys (Xilinx Spartan-6 LX45) Medium capacity Video in/out (both DVI) Audio AC97 codec 220 US$ (academic) Gbit Ethernet 128Mbyte DDR2 memory USB

More information

ML605 PCIe x8 Gen1 Design Creation

ML605 PCIe x8 Gen1 Design Creation ML605 PCIe x8 Gen1 Design Creation March 2010 Copyright 2010 Xilinx XTP044 Note: This presentation applies to the ML605 Overview Virtex-6 PCIe x8 Gen1 Capability Xilinx ML605 Board Software Requirements

More information

Lecture 41: Introduction to Reconfigurable Computing

Lecture 41: Introduction to Reconfigurable Computing inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 41: Introduction to Reconfigurable Computing Michael Le, Sp07 Head TA April 30, 2007 Slides Courtesy of Hayden So, Sp06 CS61c Head TA Following

More information

Onyx: A Prototype Phase-Change Memory Storage Array

Onyx: A Prototype Phase-Change Memory Storage Array Onyx: A Prototype Phase-Change Memory Storage Array Ameen Akel * Adrian Caulfield, Todor Mollov, Rajesh Gupta, Steven Swanson Non-Volatile Systems Laboratory, Department of Computer Science and Engineering

More information

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info. A FPGA based development platform as part of an EDK is available to target intelop provided IPs or other standard IPs. The platform with Virtex-4 FX12 Evaluation Kit provides a complete hardware environment

More information

System-on-a-Programmable-Chip (SOPC) Development Board

System-on-a-Programmable-Chip (SOPC) Development Board System-on-a-Programmable-Chip (SOPC) Development Board Solution Brief 47 March 2000, ver. 1 Target Applications: Embedded microprocessor-based solutions Family: APEX TM 20K Ordering Code: SOPC-BOARD/A4E

More information

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures Programmable Logic Design Grzegorz Budzyń Lecture 15: Advanced hardware in FPGA structures Plan Introduction PowerPC block RocketIO Introduction Introduction The larger the logical chip, the more additional

More information

FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures

FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures Tayo Oguntebi, Sungpack Hong, Jared Casper, Nathan Bronson Christos Kozyrakis, Kunle Olukotun Outline Motivation The Stanford

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

Application Performance on Dual Processor Cluster Nodes

Application Performance on Dual Processor Cluster Nodes Application Performance on Dual Processor Cluster Nodes by Kent Milfeld milfeld@tacc.utexas.edu edu Avijit Purkayastha, Kent Milfeld, Chona Guiang, Jay Boisseau TEXAS ADVANCED COMPUTING CENTER Thanks Newisys

More information

Digital Integrated Circuits

Digital Integrated Circuits Digital Integrated Circuits Lecture 9 Jaeyong Chung Robust Systems Laboratory Incheon National University DIGITAL DESIGN FLOW Chung EPC6055 2 FPGA vs. ASIC FPGA (A programmable Logic Device) Faster time-to-market

More information

Field Program mable Gate Arrays

Field Program mable Gate Arrays Field Program mable Gate Arrays M andakini Patil E H E P g r o u p D H E P T I F R SERC school NISER, Bhubaneshwar Nov 7-27 2017 Outline Digital electronics Short history of programmable logic devices

More information

AN 690: PCI Express DMA Reference Design for Stratix V Devices

AN 690: PCI Express DMA Reference Design for Stratix V Devices AN 690: PCI Express DMA Reference Design for Stratix V Devices an690-1.0 Subscribe The PCI Express Avalon Memory-Mapped (Avalon-MM) DMA Reference Design highlights the performance of the Avalon-MM 256-Bit

More information

CS/ECE 217. GPU Architecture and Parallel Programming. Lecture 16: GPU within a computing system

CS/ECE 217. GPU Architecture and Parallel Programming. Lecture 16: GPU within a computing system CS/ECE 217 GPU Architecture and Parallel Programming Lecture 16: GPU within a computing system Objective To understand the major factors that dictate performance when using GPU as an compute co-processor

More information

Industry Collaboration and Innovation

Industry Collaboration and Innovation Industry Collaboration and Innovation Open Coherent Accelerator Processor Interface OpenCAPI TM - A New Standard for High Performance Memory, Acceleration and Networks Jeff Stuecheli April 10, 2017 What

More information

OCP Engineering Workshop - Telco

OCP Engineering Workshop - Telco OCP Engineering Workshop - Telco Low Latency Mobile Edge Computing Trevor Hiatt Product Management, IDT IDT Company Overview Founded 1980 Workforce Approximately 1,800 employees Headquarters San Jose,

More information

Interconnection Network for Tightly Coupled Accelerators Architecture

Interconnection Network for Tightly Coupled Accelerators Architecture Interconnection Network for Tightly Coupled Accelerators Architecture Toshihiro Hanawa, Yuetsu Kodama, Taisuke Boku, Mitsuhisa Sato Center for Computational Sciences University of Tsukuba, Japan 1 What

More information

SunFire range of servers

SunFire range of servers TAKE IT TO THE NTH Frederic Vecoven Sun Microsystems SunFire range of servers System Components Fireplane Shared Interconnect Operating Environment Ultra SPARC & compilers Applications & Middleware Clustering

More information

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD

Leveraging OpenSPARC. ESA Round Table 2006 on Next Generation Microprocessors for Space Applications EDD Leveraging OpenSPARC ESA Round Table 2006 on Next Generation Microprocessors for Space Applications G.Furano, L.Messina TEC- OpenSPARC T1 The T1 is a new-from-the-ground-up SPARC microprocessor implementation

More information

Product Overview. Programmable Network Cards Network Appliances FPGA IP Cores

Product Overview. Programmable Network Cards Network Appliances FPGA IP Cores 2018 Product Overview Programmable Network Cards Network Appliances FPGA IP Cores PCI Express Cards PMC/XMC Cards The V1151/V1152 The V5051/V5052 High Density XMC Network Solutions Powerful PCIe Network

More information

Virtex-7 FPGA Gen3 Integrated Block for PCI Express

Virtex-7 FPGA Gen3 Integrated Block for PCI Express Virtex-7 FPGA Gen3 Integrated Block for PCI Express Product Guide Table of Contents Chapter 1: Overview Feature Summary.................................................................. 9 Applications......................................................................

More information

INT-1010 TCP Offload Engine

INT-1010 TCP Offload Engine INT-1010 TCP Offload Engine Product brief, features and benefits summary Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx or Altera FPGAs INT-1010 is highly flexible that is

More information

DINI Group. FPGA-based Cluster computing with Spartan-6. Mike Dini Sept 2010

DINI Group. FPGA-based Cluster computing with Spartan-6. Mike Dini  Sept 2010 DINI Group FPGA-based Cluster computing with Spartan-6 Mike Dini mdini@dinigroup.com www.dinigroup.com Sept 2010 1 The DINI Group We make big FPGA boards Xilinx, Altera 2 The DINI Group 15 employees in

More information

Design of Scalable Network Considering Diameter and Cable Delay

Design of Scalable Network Considering Diameter and Cable Delay Tohoku Design of Scalable etwork Considering Diameter and Cable Delay Kentaro Sano Tohoku University, JAPA Agenda Introduction Assumption Preliminary evaluation & candidate networks Cable length and delay

More information

The Cray XD1. Technical Overview. Amar Shan, Senior Product Marketing Manager. Cray XD1. Cray Proprietary

The Cray XD1. Technical Overview. Amar Shan, Senior Product Marketing Manager. Cray XD1. Cray Proprietary The Cray XD1 Cray XD1 Technical Overview Amar Shan, Senior Product Marketing Manager Cray Proprietary The Cray XD1 Cray XD1 Built for price performance 30 times interconnect performance 2 times the density

More information

SOFTWARE DEFINED RADIO

SOFTWARE DEFINED RADIO SOFTWARE DEFINED RADIO USR SDR WORKSHOP, SEPTEMBER 2017 PROF. MARCELO SEGURA SESSION 1: SDR PLATFORMS 1 PARAMETER TO BE CONSIDER 2 Bandwidth: bigger band better analysis possibilities. Spurious free BW:

More information

Virtex 6 FPGA Broadcast Connectivity Kit FAQ

Virtex 6 FPGA Broadcast Connectivity Kit FAQ Getting Started Virtex 6 FPGA Broadcast Connectivity Kit FAQ Q: Where can I purchase a kit? A: Once the order entry is open, you can purchase your Virtex 6 FPGA Broadcast Connectivity kit online or contact

More information

Ultra-Fast NoC Emulation on a Single FPGA

Ultra-Fast NoC Emulation on a Single FPGA The 25 th International Conference on Field-Programmable Logic and Applications (FPL 2015) September 3, 2015 Ultra-Fast NoC Emulation on a Single FPGA Thiem Van Chu, Shimpei Sato, and Kenji Kise Tokyo

More information

AN 829: PCI Express* Avalon -MM DMA Reference Design

AN 829: PCI Express* Avalon -MM DMA Reference Design AN 829: PCI Express* Avalon -MM DMA Reference Design Updated for Intel Quartus Prime Design Suite: 18.0 Subscribe Latest document on the web: PDF HTML Contents Contents 1....3 1.1. Introduction...3 1.1.1.

More information

Exploiting Dynamically Changing Parallelism with a Reconfigurable Array of Homogeneous Sub-cores (a.k.a. Field Programmable Core Array or FPCA)

Exploiting Dynamically Changing Parallelism with a Reconfigurable Array of Homogeneous Sub-cores (a.k.a. Field Programmable Core Array or FPCA) Exploiting Dynamically Changing Parallelism with a Reconfigurable Array of Homogeneous Sub-cores (a.k.a. Field Programmable Core Array or FPCA) Sponsored by SRC and NSF as a Part of Multicore Chip Design

More information

Yet Another Implementation of CoRAM Memory

Yet Another Implementation of CoRAM Memory Dec 7, 2013 CARL2013@Davis, CA Py Yet Another Implementation of Memory Architecture for Modern FPGA-based Computing Shinya Takamaeda-Yamazaki, Kenji Kise, James C. Hoe * Tokyo Institute of Technology JSPS

More information

NetFPGA Hardware Architecture

NetFPGA Hardware Architecture NetFPGA Hardware Architecture Jeffrey Shafer Some slides adapted from Stanford NetFPGA tutorials NetFPGA http://netfpga.org 2 NetFPGA Components Virtex-II Pro 5 FPGA 53,136 logic cells 4,176 Kbit block

More information

GRVI Phalanx. A Massively Parallel RISC-V FPGA Accelerator Accelerator. Jan Gray

GRVI Phalanx. A Massively Parallel RISC-V FPGA Accelerator Accelerator. Jan Gray GRVI Phalanx A Massively Parallel RISC-V FPGA Accelerator Accelerator Jan Gray jan@fpga.org Introduction FPGA accelerators are hot MSR Catapult. Intel += Altera. OpenPOWER + Xilinx FPGAs as computers Massively

More information

CCIX: a new coherent multichip interconnect for accelerated use cases

CCIX: a new coherent multichip interconnect for accelerated use cases : a new coherent multichip interconnect for accelerated use cases Akira Shimizu Senior Manager, Operator relations Arm 2017 Arm Limited Arm 2017 Interconnects for different scale SoC interconnect. Connectivity

More information

nforce 680i and 680a

nforce 680i and 680a nforce 680i and 680a NVIDIA's Next Generation Platform Processors Agenda Platform Overview System Block Diagrams C55 Details MCP55 Details Summary 2 Platform Overview nforce 680i For systems using the

More information

Zynq-7000 All Programmable SoC Product Overview

Zynq-7000 All Programmable SoC Product Overview Zynq-7000 All Programmable SoC Product Overview The SW, HW and IO Programmable Platform August 2012 Copyright 2012 2009 Xilinx Introducing the Zynq -7000 All Programmable SoC Breakthrough Processing Platform

More information

RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015

RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015 RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015 Outline Motivation Current situation Goal RFNoC Basic concepts Architecture overview Summary No Demo! See our booth,

More information

ML605 PCIe x8 Gen1 Design Creation

ML605 PCIe x8 Gen1 Design Creation ML605 PCIe x8 Gen1 Design Creation October 2010 Copyright 2010 Xilinx XTP044 Revision History Date Version Description 10/05/10 12.3 Recompiled under 12.3. AR35422 fixed; included in ISE tools. 07/23/10

More information

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA L2: FPGA HARDWARE 18-545: ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA 18-545: FALL 2014 2 Admin stuff Project Proposals happen on Monday Be prepared to give an in-class presentation Lab 1 is

More information

ML505 ML506 ML501. Description. Description. Description. Features. Features. Features

ML505 ML506 ML501. Description. Description. Description. Features. Features. Features ML501 Purpose: General purpose FPGA development board. Board Part Number: HW-V5-ML501-UNI-G Device Supported: XC5VLX50FFG676 Price: $995 The ML501 is a feature-rich and low-cost evaluation/development

More information

Ettus Research Update

Ettus Research Update Ettus Research Update Matt Ettus Ettus Research GRCon13 Outline 1 Introduction 2 Recent New Products 3 Third Generation Introduction Who am I? Core GNU Radio contributor since 2001 Designed

More information

Field Programmable Gate Array (FPGA) Devices

Field Programmable Gate Array (FPGA) Devices Field Programmable Gate Array (FPGA) Devices 1 Contents Altera FPGAs and CPLDs CPLDs FPGAs with embedded processors ACEX FPGAs Cyclone I,II FPGAs APEX FPGAs Stratix FPGAs Stratix II,III FPGAs Xilinx FPGAs

More information

LEON4: Fourth Generation of the LEON Processor

LEON4: Fourth Generation of the LEON Processor LEON4: Fourth Generation of the LEON Processor Magnus Själander, Sandi Habinc, and Jiri Gaisler Aeroflex Gaisler, Kungsgatan 12, SE-411 19 Göteborg, Sweden Tel +46 31 775 8650, Email: {magnus, sandi, jiri}@gaisler.com

More information

DRPM architecture overview

DRPM architecture overview DRPM architecture overview Jens Hagemeyer, Dirk Jungewelter, Dario Cozzi, Sebastian Korf, Mario Porrmann Center of Excellence Cognitive action Technology, Bielefeld University, Germany Project partners:

More information

genzconsortium.org Gen-Z Technology: Enabling Memory Centric Architecture

genzconsortium.org Gen-Z Technology: Enabling Memory Centric Architecture Gen-Z Technology: Enabling Memory Centric Architecture Why Gen-Z? Gen-Z Consortium 2017 2 Why Gen-Z? Gen-Z Consortium 2017 3 Why Gen-Z? Businesses Need to Monetize Data Big Data AI Machine Learning Deep

More information

OpenCAPI Technology. Myron Slota Speaker name, Title OpenCAPI Consortium Company/Organization Name. Join the Conversation #OpenPOWERSummit

OpenCAPI Technology. Myron Slota Speaker name, Title OpenCAPI Consortium Company/Organization Name. Join the Conversation #OpenPOWERSummit OpenCAPI Technology Myron Slota Speaker name, Title OpenCAPI Consortium Company/Organization Name Join the Conversation #OpenPOWERSummit Industry Collaboration and Innovation OpenCAPI Topics Computation

More information

Why Put FPGAs in your CPU Socket?

Why Put FPGAs in your CPU Socket? Why Put FPGAs in your CPU Socket? Paul Chow High-Performance Reconfigurable Computing Group Department of Electrical and Computer Engineering University of Toronto What are we talking about? 1. Start with

More information
