1 Copyright 2013 Oracle and/or its affiliates. All rights reserved.


2 Bixby: the Scalability and Coherence Directory ASIC in Oracle's Highly Scalable Enterprise Systems
Thomas Wicki and Jürgen Schulz, Senior Principal Hardware Engineers, Microelectronics
Hot Chips 25, August 25-27, 2013

3 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

4 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

5 Motivation
- M5 and M6's direct interconnects scale up to 8 processors using Coherence Links (CL) (Glueless system)
- To enable systems to scale beyond 8 processors (Glued system):
  - Scalability Links (SL) were added to M5 and M6
  - Bixby ASICs are needed
[Diagram: M6 processors connected by CLs, each with SLs]

6 Bixby Design Objectives
- Large System Scaling
  - Scalable up to 96 processors
  - Communication switch between 8-processor SMPs
- Coherence Directory & Processing
  - Directory for L3 caches of all processors
  - Multi-generation support
  - Enabling mixed processor systems
- Enterprise System Focus
  - Enterprise-Class RAS feature set
  - High bandwidth, low latency

7 Challenges and Trade-Offs
- Directory Size
  Challenge: Large directory size requirement
  Solution: Scale up number of Bixbys with system size
- Directory Width
  Challenge: Massive number of L3 cache ways x number of processors per look-up
  Solution: Pipeline look-ups
- Switch Size
  Challenge: 24 x 24 crossbar efficiency
  Solution: Overprovision switching bandwidth
- Shared Resources
  Challenge: Some resources shared by multiple hardware domains
  Solution: Associate errors with single domain and clean up shared resources after error

8 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

9 Oracle's M5-32 System
- 32 M5 SPARC processors
- 12 Bixbys
- 4 physical (hardware) domains
- 3.1TB/s payload coherence bandwidth
- 1.5TB/s payload scalability bandwidth
[Figure label: Bixbys]

10 M5-32 System Coherence Interconnect
- Coherence Links (CL): 12 lanes per direction
- Scalability Links (SL): 4 lanes per direction
- 12Gbps per lane
- 7 CLs + 6 SLs per processor
- 16 SLs per Bixby
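The link counts on this slide can be cross-checked against the system composition on the previous slide. The sketch below is only arithmetic on the figures quoted in the slides; the variable and function names are illustrative, and the per-link rates it prints are raw signaling rates, which exclude encoding and protocol overhead and so are not directly comparable with the payload bandwidth figures.

```python
# Figures quoted on the slides; all names here are illustrative.
PROCESSORS = 32          # M5 processors in an M5-32
BIXBYS = 12              # Bixby ASICs in an M5-32
SLS_PER_PROCESSOR = 6
SLS_PER_BIXBY = 16
CL_LANES = 12            # lanes per direction
SL_LANES = 4             # lanes per direction
GBPS_PER_LANE = 12


def raw_link_gbps(lanes, gbps_per_lane=GBPS_PER_LANE):
    """Raw signaling rate of one link in one direction, in Gb/s."""
    return lanes * gbps_per_lane


# Every SL has a processor at one end and a Bixby at the other,
# so both ways of counting should give the same number of links.
processor_side_sls = PROCESSORS * SLS_PER_PROCESSOR
bixby_side_sls = BIXBYS * SLS_PER_BIXBY

cl_raw = raw_link_gbps(CL_LANES)
sl_raw = raw_link_gbps(SL_LANES)

print("SLs counted from processors:", processor_side_sls)
print("SLs counted from Bixbys:    ", bixby_side_sls)
print("Raw CL rate per direction (Gb/s):", cl_raw)
print("Raw SL rate per direction (Gb/s):", sl_raw)
```

Both counts agree, which is consistent with each processor's six SLs terminating on Bixbys.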

11 Scalability Link Bandwidth

12 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

13 Physical (Hardware) Domain Support
- Up to 12 physical domains
- Dynamically configurable by Service Processor
- Packet filtering and Physical Address fencing
- Errors resolved to physical domain
- Per-domain Cease Operation support
[Diagram: four rows of 8 processors (S) labeled Domain A, Domain B, Domain A, and Domain C, connected through four Bixbys (BX)]

14 5-of-6 Redundancy Mode
- System can boot with any 5 out of each group of 6 Bixbys
- Increases availability since system can be used until service is performed
[Diagram: normal configuration with 12 Bixbys (BX); failover configuration with 12 Bixbys (BX), labeled "Still Boots!"]
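The boot rule on this slide can be expressed as a small predicate. This is a minimal sketch, not the Service Processor's actual logic: the function name, the 0-based Bixby numbering, and the assumption that consecutive indices form a group of six are all illustrative.

```python
GROUP_SIZE = 6
MIN_WORKING_PER_GROUP = 5


def can_boot(failed, total_bixbys=12):
    """Return True if every group of six Bixbys still has at least five working.

    `failed` is a set of 0-based Bixby indices. Grouping consecutive indices
    (0-5, 6-11) is an assumption made for illustration only.
    """
    for start in range(0, total_bixbys, GROUP_SIZE):
        group = range(start, start + GROUP_SIZE)
        working = sum(1 for bx in group if bx not in failed)
        if working < MIN_WORKING_PER_GROUP:
            return False
    return True


print(can_boot(set()))      # no failures
print(can_boot({3, 9}))     # one failure in each group
print(can_boot({3, 4}))     # two failures in the same group
```

Under this rule one failure per group is tolerated, but two failures inside the same group are not.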

15 Hot Maintenance Support
- In a running system, a failing Bixby or SMP can be:
  - Replaced
  - Tested
  - Re-integrated
[Diagram: 8-processor SMPs (S) in Domain A and Domain B and a new SMP, connected through Bixbys (BX), with BIST and IBIST labels]

16 Link Protection
- CRC check and auto retry
  - Replay, if CRC error detected
- Guaranteed lane failure detection
  - Built in PRBS testing during link training
- Auto link re-initialization
  - Re-training link, if Replay unsuccessful
  - No Service Processor intervention required
- Auto single lane failover (per direction)
  - Based on PRBS testing
  - No Service Processor intervention required
[Diagram: SL between Chip A and Chip B, with an LFU on each side]
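The CRC-check-and-replay idea can be sketched in software. The slides do not state the CRC polynomial, frame layout, or replay limit, so everything below is an assumption for illustration: `zlib.crc32` stands in for the link CRC, `max_replays` is a made-up limit, and `flaky_channel` is a hypothetical test channel that corrupts the first few transmissions.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (stand-in for the real link CRC)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(frame_bytes: bytes):
    """Return the payload if the CRC matches, otherwise None."""
    payload, received_crc = frame_bytes[:-4], frame_bytes[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == received_crc:
        return payload
    return None


def send_with_replay(payload, channel, max_replays=3):
    """Send once, then replay on CRC error up to max_replays times.

    Returns (payload, replays_used) on success, or (None, max_replays) when
    every attempt failed, the point at which a real link would re-train.
    """
    for attempt in range(max_replays + 1):
        result = check(channel(frame(payload)))
        if result is not None:
            return result, attempt
    return None, max_replays


def flaky_channel(bad_transmissions):
    """Hypothetical channel that flips one bit in the first N transmissions."""
    remaining = [bad_transmissions]

    def channel(data):
        if remaining[0] > 0:
            remaining[0] -= 1
            corrupted = bytearray(data)
            corrupted[0] ^= 0x01
            return bytes(corrupted)
        return data

    return channel


print(send_with_replay(b"coherence request", flaky_channel(2)))
print(send_with_replay(b"coherence request", flaky_channel(10)))
```

The second call exhausts its replays, which corresponds to the slide's "Re-training link, if Replay unsuccessful" step.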

17 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

18 Implementation Details
- 96 Tx + 96 Rx 16Gb/s Long-Reach AC coupled SerDes
- Package: 45mm x 45mm 1677-pin FPBGA (~500 signal IO)
- Process: 28nm, 10 layer metal, 0.85V ASIC
- ~160 Mbits SRAM (~20MB Tags)
- ~70M Gates (nand2 equivalent)

19 Functional Blocks: Forward Packet
[Block diagram labels:]
- SerDes Links (24 x4)
- Link Framing Units (LFU)
- LQU Input Queues (IQU)
- ASU Crossbar In (AXI)
- Forwarding Crossbar (FXU)
- Address Serialization Units (8 x ASU)
- ASU Crossbar Out (AXO)
- LQU Output Queues (OQU)
- Link Framing Units (LFU)
- SerDes Links (24 x4)

20 Functional Blocks: Directory Lookup
[Block diagram labels:]
- SerDes Links (24 x4)
- Link Framing Units (LFU)
- LQU Input Queues (IQU)
- ASU Crossbar In (AXI)
- Forwarding Crossbar (FXU)
- Address Serialization Units (8 x ASU)
- ASU Crossbar Out (AXO)
- LQU Output Queues (OQU)
- Link Framing Units (LFU)
- SerDes Links (24 x4)

21 Floorplan
[Floorplan labels: CRC/Retry (x8), ECC for Datapath, Parity for Control]
- SEC-DED on all major datapaths
- Parity on control signals
- Custom top level wires on top two routing layers
- Critical nets implemented by Buffer on route
- Faster ps/mm
- PVT invariant clock distribution
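As a reminder of what SEC-DED provides, here is a toy extended Hamming (8,4) code: it corrects any single-bit error and detects, without correcting, any double-bit error. The slides do not give Bixby's actual code width or construction, so this is a textbook sketch under that assumption, with illustrative function names.

```python
def encode(nibble):
    """Encode a 4-bit value as an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1-7 are the classic
    Hamming(7,4) positions with check bits at 1, 2, and 4.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2          # make overall parity even
    return code


def decode(received):
    """Return (nibble, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(received)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position         # XOR of the positions of all set bits
    overall = sum(code) % 2
    if overall == 1:
        # Odd overall parity: a single-bit error. The syndrome is its position;
        # a zero syndrome means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even overall parity with a nonzero syndrome: a double-bit error.
        return None, "uncorrectable"
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
one_error = list(word)
one_error[6] ^= 1
two_errors = list(one_error)
two_errors[2] ^= 1
print(decode(word))
print(decode(one_error))
print(decode(two_errors))
```

A hardware datapath would use a much wider code, but the correct-one, detect-two behavior is the same.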

22 Link Queuing Unit (LQU)
- Each manages an x4 Scalability Link
- Provides queuing support for multiple Virtual Channels
- Each LQU is part of a single physical domain
- SEC-DED on Link FIFOs (RAM based)
[Block diagram repeated from slide 19]

23 Cross Bar Units (XBU)
- A separate data path forwards traffic and is sized to account for any Head-of-Line blocking inefficiencies
- FXU: 24in x 24out (2-cycle packet)
- AXI: 24in x 8out
- AXO: 16in x 24out
- Switch fabrics implemented as custom layout hard macros
- Bixby fully sustains mixed request and data traffic at full line rate
- FXU is single domain, AXI/AXO are multi-domain logic
- Flow through SEC-DED, parity on routing control
[Chart: Switch Efficiencies, % Efficiency for FXU, AXI, and AXO; labels Input and Output]
[Block diagram repeated from slide 19]

24 Address Serialization Unit (ASU)
- Partitioned into eight parallel units
- Each directory unit can compare and process up to 22,656 bits per cycle (total 181,248 bits per chip per cycle)
- 0.5 request lookups per cycle (total 4 per chip)
- Flow-through correction on incoming packets and tag directory contents
- Retry on directory tag staging flops
- Supports up to 12 hardware domains with error steering
- Per domain Built-In Self Initialization (BISI)
- Tag RAM scrubber
[Block diagram repeated from slide 19]
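The per-chip totals on this slide follow directly from the per-unit figures. The sketch below only reproduces that arithmetic; the names are illustrative. The last value, cycles per lookup per unit, is a derived reading of "0.5 request lookups per cycle" rather than a figure stated on the slide.

```python
from fractions import Fraction

ASU_UNITS = 8                           # "eight parallel units"
BITS_PER_UNIT_PER_CYCLE = 22_656
LOOKUPS_PER_UNIT_PER_CYCLE = Fraction(1, 2)

bits_per_chip_per_cycle = ASU_UNITS * BITS_PER_UNIT_PER_CYCLE
lookups_per_chip_per_cycle = ASU_UNITS * LOOKUPS_PER_UNIT_PER_CYCLE

# Derived, not stated on the slide: at 0.5 lookups per cycle, each unit
# starts a new lookup every two cycles.
cycles_per_lookup_per_unit = 1 / LOOKUPS_PER_UNIT_PER_CYCLE

print("Bits compared per chip per cycle:", bits_per_chip_per_cycle)
print("Lookups per chip per cycle:", lookups_per_chip_per_cycle)
print("Cycles per lookup per unit:", cycles_per_lookup_per_unit)
```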

25 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

26 Debug and DFT Features
- Monitoring link at full signaling speed is challenging
- Two internal rings to allow capturing packet flow in ingress or egress direction
- Internal triggering logic and RAM to store packet flow
- External DDR interface to allow capturing packet flow on Logic Analyzer
- In-system test features:
  - MemBIST
  - InterconnectBIST
- ASU tag RAM can be read, written or read-modify-write

27 Outline
- Motivation and Design Objectives
- M5 System and Beyond
- System RAS Features
- Implementation Details
- Debug and DFT Features
- Summary

28 Bixby Design Objectives Accomplished
- Large System Scaling
  - Scales to 96 processors
  - Provides communication switching between 8-processor SMPs
- Coherence Directory & Processing
  - Directory for L3 caches of all processors
  - Multi-generation support
  - Enabling mixed processor systems
- Enterprise System Focus
  - Enterprise-Class RAS feature set
  - High bandwidth, low latency

29 Bixby Scalability ASIC
- Hosts L3 cache directory
- Processes coherence requests
- Includes comprehensive Enterprise-Class RAS
- Provides extensive debug and DFT features
- Pushes ASIC boundaries: technology, complexity, die size, SerDes count, power
- Flexible scaling up to 12x
[Diagram labels: Glueless systems, Glued systems]

30 Q&A


32 Glossary
- BISI: Built-In Self Initialization
- BIST: Built-In Self Test
- BX: Bixby ASIC
- CL: Coherence Link
- CRC: Cyclic Redundancy Check
- IBIST: Interconnect Built-In Self Test
- MemBIST: Memory Built-In Self Test
- PRBS: Pseudo-Random Binary Sequence
- PVT: Process Voltage Temperature
- RAS: Reliability Availability Serviceability
- SEC-DED: Single-bit Error Correction, Double-bit Error Detection
- SL: Scalability Link
- SMP: Shared Memory Processor
- SPARC: Scalable Processor ARChitecture


Cisco Series Internet Router Architecture: Packet Switching

Cisco Series Internet Router Architecture: Packet Switching Cisco 12000 Series Internet Router Architecture: Packet Switching Document ID: 47320 Contents Introduction Prerequisites Requirements Components Used Conventions Background Information Packet Switching:

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Overview of the SPARC Enterprise Servers

Overview of the SPARC Enterprise Servers Overview of the SPARC Enterprise Servers SPARC Enterprise Technologies for the Datacenter Ideal for Enterprise Application Deployments System Overview Virtualization technologies > Maximize system utilization

More information

KeyStone C665x Multicore SoC

KeyStone C665x Multicore SoC KeyStone Multicore SoC Architecture KeyStone C6655/57: Device Features C66x C6655: One C66x DSP Core at 1.0 or 1.25 GHz C6657: Two C66x DSP Cores at 0.85, 1.0, or 1.25 GHz Fixed and Floating Point Operations

More information

Packet Switch Architectures Part 2

Packet Switch Architectures Part 2 Packet Switch Architectures Part Adopted from: Sigcomm 99 Tutorial, by Nick McKeown and Balaji Prabhakar, Stanford University Slides used with permission from authors. 999-000. All rights reserved by authors.

More information

Data Sheet Fujitsu M10-4 Server

Data Sheet Fujitsu M10-4 Server Data Sheet Fujitsu M10-4 Server High-performance, highly reliable midrange server that is ideal for data center integration and virtualization The Fujitsu M10-4 The Fujitsu M10-4 server can be configured

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

40Gbps+ Full Line Rate, Programmable Network Accelerators for Low Latency Applications SAAHPC 19 th July 2011

40Gbps+ Full Line Rate, Programmable Network Accelerators for Low Latency Applications SAAHPC 19 th July 2011 40Gbps+ Full Line Rate, Programmable Network Accelerators for Low Latency Applications SAAHPC 19 th July 2011 Allan Cantle President & Founder www.nallatech.com Company Overview ISI + Nallatech + Innovative

More information

P51: High Performance Networking

P51: High Performance Networking P51: High Performance Networking Lecture 6: Programmable network devices Dr Noa Zilberman noa.zilberman@cl.cam.ac.uk Lent 2017/18 High Throughput Interfaces Performance Limitations So far we discussed

More information

TEXAS INSTRUMENTS ANALOG UNIVERSITY PROGRAM DESIGN CONTEST MIXED SIGNAL TEST INTERFACE CHRISTOPHER EDMONDS, DANIEL KEESE, RICHARD PRZYBYLA SCHOOL OF

TEXAS INSTRUMENTS ANALOG UNIVERSITY PROGRAM DESIGN CONTEST MIXED SIGNAL TEST INTERFACE CHRISTOPHER EDMONDS, DANIEL KEESE, RICHARD PRZYBYLA SCHOOL OF TEXASINSTRUMENTSANALOGUNIVERSITYPROGRAMDESIGNCONTEST MIXED SIGNALTESTINTERFACE CHRISTOPHEREDMONDS,DANIELKEESE,RICHARDPRZYBYLA SCHOOLOFELECTRICALENGINEERINGANDCOMPUTERSCIENCE OREGONSTATEUNIVERSITY I. PROJECT

More information

The Network Layer and Routers

The Network Layer and Routers The Network Layer and Routers Daniel Zappala CS 460 Computer Networking Brigham Young University 2/18 Network Layer deliver packets from sending host to receiving host must be on every host, router in

More information

CSE 123A Computer Networks

CSE 123A Computer Networks CSE 123A Computer Networks Winter 2005 Lecture 8: IP Router Design Many portions courtesy Nick McKeown Overview Router basics Interconnection architecture Input Queuing Output Queuing Virtual output Queuing

More information

Next Generation Multi-Purpose Microprocessor

Next Generation Multi-Purpose Microprocessor Next Generation Multi-Purpose Microprocessor Presentation at MPSA, 4 th of November 2009 www.aeroflex.com/gaisler OUTLINE NGMP key requirements Development schedule Architectural Overview LEON4FT features

More information

A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS

A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS A ONE CHIP HARDENED SOLUTION FOR HIGH SPEED SPACEWIRE SYSTEM IMPLEMENTATIONS Joseph R. Marshall, Richard W. Berger, Glenn P. Rakow Conference Contents Standards & Topology ASIC Program History ASIC Features

More information

IBM Cell Processor. Gilbert Hendry Mark Kretschmann

IBM Cell Processor. Gilbert Hendry Mark Kretschmann IBM Cell Processor Gilbert Hendry Mark Kretschmann Architectural components Architectural security Programming Models Compiler Applications Performance Power and Cost Conclusion Outline Cell Architecture:

More information

Massively Parallel Processor Breadboarding (MPPB)

Massively Parallel Processor Breadboarding (MPPB) Massively Parallel Processor Breadboarding (MPPB) 28 August 2012 Final Presentation TRP study 21986 Gerard Rauwerda CTO, Recore Systems Gerard.Rauwerda@RecoreSystems.com Recore Systems BV P.O. Box 77,

More information

Future of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1

Future of Interconnect Fabric A Contrarian View. Shekhar Borkar June 13, 2010 Intel Corp. 1 Future of Interconnect Fabric A ontrarian View Shekhar Borkar June 13, 2010 Intel orp. 1 Outline Evolution of interconnect fabric On die network challenges Some simple contrarian proposals Evaluation and

More information

Programmable Data Plane at Terabit Speeds

Programmable Data Plane at Terabit Speeds AUGUST 2018 Programmable Data Plane at Terabit Speeds Milad Sharif SOFTWARE ENGINEER PISA: Protocol Independent Switch Architecture PISA Block Diagram Match+Action Stage Memory ALU Programmable Parser

More information

DDR4 Memory Technology on HP Z Workstations

DDR4 Memory Technology on HP Z Workstations Technical white paper DDR4 Memory Technology on HP Z Workstations DDR4 is the latest memory technology available for main memory on mobile, desktops, workstations, and server computers. DDR stands for

More information

Portland State University ECE 588/688. Cray-1 and Cray T3E

Portland State University ECE 588/688. Cray-1 and Cray T3E Portland State University ECE 588/688 Cray-1 and Cray T3E Copyright by Alaa Alameldeen 2014 Cray-1 A successful Vector processor from the 1970s Vector instructions are examples of SIMD Contains vector

More information

A Four-Terabit Single-Stage Packet Switch with Large. Round-Trip Time Support. F. Abel, C. Minkenberg, R. Luijten, M. Gusat, and I.

A Four-Terabit Single-Stage Packet Switch with Large. Round-Trip Time Support. F. Abel, C. Minkenberg, R. Luijten, M. Gusat, and I. A Four-Terabit Single-Stage Packet Switch with Large Round-Trip Time Support F. Abel, C. Minkenberg, R. Luijten, M. Gusat, and I. Iliadis IBM Research, Zurich Research Laboratory, CH-8803 Ruschlikon, Switzerland

More information

High Performance Memory Opportunities in 2.5D Network Flow Processors

High Performance Memory Opportunities in 2.5D Network Flow Processors High Performance Memory Opportunities in 2.5D Network Flow Processors Jay Seaton, VP Silicon Operations, Netronome Larry Zu, PhD, President, Sarcina Technology LLC August 6, 2013 2013 Netronome 1 Netronome

More information

Building blocks for custom HyperTransport solutions

Building blocks for custom HyperTransport solutions Building blocks for custom HyperTransport solutions Holger Fröning 2 nd Symposium of the HyperTransport Center of Excellence Feb. 11-12 th 2009, Mannheim, Germany Motivation Back in 2005: Quite some experience

More information

Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment

Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment Artisan Technology Group is your source for quality new and certified-used/pre-owned equipment FAST SHIPPING AND DELIVERY TENS OF THOUSANDS OF IN-STOCK ITEMS EQUIPMENT DEMOS HUNDREDS OF MANUFACTURERS SUPPORTED

More information

RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015

RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015 RFNoC : RF Network on Chip Martin Braun, Jonathon Pendlum GNU Radio Conference 2015 Outline Motivation Current situation Goal RFNoC Basic concepts Architecture overview Summary No Demo! See our booth,

More information

HP solutions for mission critical SQL Server Data Management environments

HP solutions for mission critical SQL Server Data Management environments HP solutions for mission critical SQL Server Data Management environments SQL Server User Group Sweden Michael Kohs, Technical Consultant HP/MS EMEA Competence Center michael.kohs@hp.com 1 Agenda HP ProLiant

More information

POWER4 Systems: Design for Reliability. Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX

POWER4 Systems: Design for Reliability. Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX Systems: Design for Reliability Douglas Bossen, Joel Tendler, Kevin Reick IBM Server Group, Austin, TX Microprocessor 2-way SMP system on a chip > 1 GHz processor frequency >1GHz Core Shared L2 >1GHz Core

More information

UCLA 3D research started in 2002 under DARPA with CFDRC

UCLA 3D research started in 2002 under DARPA with CFDRC Coping with Vertical Interconnect Bottleneck Jason Cong UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/ cs edu/~cong Outline Lessons learned Research challenges and opportunities

More information

TwinCastle: : A Multi-processor North Bridge Server Chipset

TwinCastle: : A Multi-processor North Bridge Server Chipset TwinCastle: : A Multi-processor North Bridge Server Chipset Debendra Das Sharma, Ashish Gupta, Gordon Kurpanek, Dean Mulla, Bob Pflederer, Ram Rajamani Advanced Components Division, Intel Corporation 1

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introduction In a packet-switched network, packets are buffered when they cannot be processed or transmitted at the rate they arrive. There are three main reasons that a router, with generic

More information

3DIC and the Hybrid Memory Cube

3DIC and the Hybrid Memory Cube 3DIC and the Hybrid Memory Cube Dean Klein Micron Technology, Inc. 2012 Micron Technology, Inc. All rights reserved. Products are warranted only to meet Micron s production data sheet specifications. Information,

More information

White Paper Enabling Quality of Service With Customizable Traffic Managers

White Paper Enabling Quality of Service With Customizable Traffic Managers White Paper Enabling Quality of Service With Customizable Traffic s Introduction Communications networks are changing dramatically as lines blur between traditional telecom, wireless, and cable networks.

More information

OPENSPARC T1 OVERVIEW

OPENSPARC T1 OVERVIEW Chapter Four OPENSPARC T1 OVERVIEW Denis Sheahan Distinguished Engineer Niagara Architecture Group Sun Microsystems Creative Commons 3.0United United States License Creative CommonsAttribution-Share Attribution-Share

More information

Portland State University ECE 588/688. IBM Power4 System Microarchitecture

Portland State University ECE 588/688. IBM Power4 System Microarchitecture Portland State University ECE 588/688 IBM Power4 System Microarchitecture Copyright by Alaa Alameldeen 2018 IBM Power4 Design Principles SMP optimization Designed for high-throughput multi-tasking environments

More information

Chapter 2: Memory Hierarchy Design Part 2

Chapter 2: Memory Hierarchy Design Part 2 Chapter 2: Memory Hierarchy Design Part 2 Introduction (Section 2.1, Appendix B) Caches Review of basics (Section 2.1, Appendix B) Advanced methods (Section 2.3) Main Memory Virtual Memory Fundamental

More information

8. Migrating Stratix II Device Resources to HardCopy II Devices

8. Migrating Stratix II Device Resources to HardCopy II Devices 8. Migrating Stratix II Device Resources to HardCopy II Devices H51024-1.3 Introduction Altera HardCopy II devices and Stratix II devices are both manufactured on a 1.2-V, 90-nm process technology and

More information