A New Era of Hardware Microservices in the Cloud. Doug Burger, Distinguished Engineer, Microsoft. UW Cloud Workshop, March 31, 2017
Transcription
1 A New Era of Hardware Microservices in the Cloud. Doug Burger, Distinguished Engineer, Microsoft. UW Cloud Workshop, March 31, 2017
3 Moore's Law. Dennard scaling has been dead for a decade. Moore's Law is over. Remember, it's a rate. Silicon scaling continues, but with lots of weirdnesses.
5 Specialization (?)
6 Hardware Microservices. [Diagram: a cloud (or client) service connected to HuS A instances 1 and 2, HuS B instances 1 and 2, and HuS C instance 1; label: IP + us.]
7 Catapult V2 Architecture. The architecture justifies the economics: 1. Can act as a local compute accelerator. 2. Can act as a network/storage accelerator. 3. Can act as a remote distributed computing fabric. [Photo captions: Catapult v2 mezzanine card; WCS Gen4.1 blade with Mellanox NIC and Catapult FPGA; Pikes Peak; option card mezzanine connectors; WCS tray backplane.] [Diagram labels: WCS 2.0 server blade, CPU, QPI, DRAM, Gen3 2x8, Gen3 x8, NIC, FPGA, QSFP, 40Gb/s, switch.]
8 Configurable Cloud: CPU compute layer, reconfigurable compute layer, converged network.
9 HuS shell support. [Block diagram labels: top-of-rack switch, server NIC, host CPU, two 40G MACs, two x8 PCIe cores (HIP 0, HIP 1), FIFOs, bypass control, 4 GB DDR3 with DDR3 controller, network switch, Lightweight Transport Layer, slot DMA engine, 256 Mb QSPI config flash, config flash (RSU), clock, elastic router, user logic, I2C, SEU, JTAG, Temp.; regions: role, ASLs, shell.]
10 LTL network reach and latencies. [Figure: round-trip latency (us) versus number of reachable hosts/FPGAs, showing LTL average latency and LTL 99.9th percentile latency for L0 (same TOR), L1, and L2; insets: example L0 latency histogram, example L1 latency histogram, and examples of L2 latency histograms for different pairs of FPGAs.]
11 Scaling the HuS fabric. [Diagram: TOR switches connected through L1 and L2 switches; services shown: deep neural networks, web search ranking, SQL, GFT offload.]
12 Use case: shared service. Most accelerators have more throughput than a single host requires. Share excess capacity, use fewer instances. Frees up FPGAs for other services. Sustains 3.6x clients / FPGA; 73% remaining FPGAs available. [Chart legend: bed-wide and remote-local series, including 95th percentile.]
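As a rough illustration of the capacity-sharing argument on the slide above, the sketch below estimates how many FPGAs a pooled service needs and what fraction is freed for other services. Only the 3.6 clients-per-FPGA figure comes from the slide; the function names and the simple `1 - 1/ratio` model are assumptions made for illustration, and the slide does not say how its 73% figure was derived.

```python
import math

# From the slide: one shared FPGA sustains about 3.6 clients.
CLIENTS_PER_FPGA = 3.6


def fpgas_needed(num_clients, clients_per_fpga):
    """FPGAs required to serve num_clients when each FPGA is shared
    by clients_per_fpga clients (hypothetical helper)."""
    return math.ceil(num_clients / clients_per_fpga)


def fraction_freed(clients_per_fpga):
    """Fraction of FPGAs freed for other services, relative to a
    baseline of one dedicated FPGA per client (assumed model)."""
    return 1.0 - 1.0 / clients_per_fpga


if __name__ == "__main__":
    print(f"Fraction freed at {CLIENTS_PER_FPGA} clients/FPGA: "
          f"{fraction_freed(CLIENTS_PER_FPGA):.1%}")
    # Integer example to show the pool-sizing helper.
    print("FPGAs for 1000 clients at 4 clients/FPGA:", fpgas_needed(1000, 4))
```

Under this simple model, 3.6 clients per FPGA frees roughly 72% of the FPGAs, which is in the neighborhood of the figure quoted on the slide.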
13 Hardware as a Service architecture
14 Scaling HuS FPGAs to ultra-large DNNs (thanks to Eric Chung and team). Distribute NN models across as many FPGAs as needed (up to thousands). Recent ImageNet competition: 152-layer model. Use HaaS and LTL to manage multi-FPGA execution. Only vectors travel over the network. Low FPGA-to-FPGA latency at ~1.8us per L0 hop.
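To make the "only vectors travel over the network" point concrete, here is a minimal sketch of one way a layered model could be split across FPGAs, with the added network latency estimated from the ~1.8us per L0 hop quoted on the slide. The contiguous layer-wise split, the linear pipeline, the assumption that every hop is an L0 hop, and both function names are illustrative assumptions; the slide does not describe the actual partitioning scheme.

```python
# From the slide: approximate FPGA-to-FPGA latency per L0 hop, in microseconds.
L0_HOP_US = 1.8


def partition_layers(num_layers, num_fpgas):
    """Split layers 0..num_layers-1 into contiguous (start, end) ranges,
    one per FPGA, as evenly as possible (hypothetical scheme)."""
    if num_fpgas < 1 or num_fpgas > num_layers:
        raise ValueError("need 1 <= num_fpgas <= num_layers")
    base, extra = divmod(num_layers, num_fpgas)
    ranges, start = [], 0
    for i in range(num_fpgas):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def pipeline_network_latency_us(num_fpgas, hop_us=L0_HOP_US):
    """Network latency added by forwarding activation vectors through a
    linear pipeline of FPGAs: one hop between each adjacent pair."""
    return (num_fpgas - 1) * hop_us


if __name__ == "__main__":
    # 152 layers is the model depth mentioned on the slide; 8 FPGAs is arbitrary.
    for fpga, (lo, hi) in enumerate(partition_layers(152, 8)):
        print(f"FPGA {fpga}: layers {lo}..{hi - 1}")
    print(f"Added network latency: {pipeline_network_latency_us(8):.1f} us")
```

With hops this cheap, the network contribution stays in the tens of microseconds even for a pipeline of several FPGAs, which is what makes spreading one model across many devices plausible.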
15 Huge infrastructure: scale is the enabler. Microsoft's Cloud business growing in the triple digits annually. FPGAs deployed across 15 countries and 5 continents. Catapult included in every new server for all major services. In large-scale production in both Bing and Azure (and others). Hardware Microservices (+ DNNs) in a subset and scaling. [Map labels: Dublin, Chicago, Quincy, Cheyenne, Finland, Amsterdam, Boydton, Shanghai, Des Moines, Hong Kong, San Antonio, Singapore, Brazil, Australia, Japan.]
16 Looking forward
17 Gen2 Shell Abstractions. [Block diagram labels: top-of-rack switch, server NIC, host CPU, two 40G MACs, two x8 PCIe cores (HIP 0, HIP 1), FIFOs, bypass control, 4 GB DDR3 with DDR3 controller, network switch, Lightweight Transport Layer, slot DMA engine, QSPI config flash, config flash (RSU), clock, SDN, elastic router, user logic, I2C, SEU, JTAG, Temp.; regions: role, ASLs, shell.]
18 Democratizing Hardware Microservices. [Diagram labels: network, host, FPGA, third-party ASIC.]
20 FPGAs deployed across 15 countries and 5 continents. Catapult included in every new server for all major services. In large-scale production in both Bing and Azure (and others). Hardware Microservices (+ DNNs) in a subset and scaling.
21 Accelerated Networking. Generic Flow Table (GFT): rule-based packet rewriting. 10x latency reduction vs. software; CPU load now <1 core. 25Gb/s throughput at <25µs latency: the fastest cloud network VMs. On Haswell, AES GCM-128 costs 1.26 cycles/byte [1] (5+ 2.4GHz cores to sustain 40Gb/s). CBC and other algorithms are more expensive. AES CBC-128-SHA1 is 11µs in FPGA vs. 4µs on CPU (1500B packet): higher latency, but significant CPU savings. [Diagram labels: GFT, FPGA, Crypto, 40G NIC, 40G links, GFT table with Flow and Action columns; actions: Decap, DNAT, Rewrite, Meter.] [1] S. Gulley, et al. Haswell Cryptographic Performance.
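The core-count claim for software AES GCM-128 can be checked with a short back-of-envelope calculation. The cycles-per-byte, clock rate, and line rate below are the slide's numbers; the `directions` parameter and the reading that "5+" cores refers to full-duplex traffic (encrypting one direction while decrypting the other) are assumptions, since the slide does not spell out how the figure was reached.

```python
# Figures quoted on the slide.
CYCLES_PER_BYTE = 1.26   # AES GCM-128 on Haswell
LINE_RATE_GBPS = 40      # network line rate
CORE_GHZ = 2.4           # CPU core clock


def cores_needed(line_rate_gbps, cycles_per_byte, core_ghz, directions=1):
    """Cores required to run the cipher at line rate.

    directions=1 covers one-way traffic; directions=2 models full duplex
    (an assumption about how the slide's figure was derived).
    """
    bytes_per_sec = line_rate_gbps * 1e9 / 8
    cycles_per_sec = bytes_per_sec * cycles_per_byte * directions
    return cycles_per_sec / (core_ghz * 1e9)


if __name__ == "__main__":
    one_way = cores_needed(LINE_RATE_GBPS, CYCLES_PER_BYTE, CORE_GHZ)
    duplex = cores_needed(LINE_RATE_GBPS, CYCLES_PER_BYTE, CORE_GHZ, directions=2)
    print(f"One direction: {one_way:.2f} cores")
    print(f"Full duplex:   {duplex:.2f} cores")
```

One direction at 40Gb/s works out to a little over two and a half cores; doubling for full duplex gives a little over five, which is consistent with the "5+" on the slide under that assumption.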
22 [Diagram labels: training, inference, client, cloud, humans, GPUs, ASICs, FPGAs.]