RouteBricks: Exploiting Parallelism To Scale Software Routers
|
|
- Marshall Goodman
- 5 years ago
- Views:
Transcription
1 outebricks: Exploiting Parallelism To Scale Software outers Mihai Dobrescu & Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, Sylvia atnasamy EPFL, Intel Labs Berkeley, Lancaster University
2 Building routers Fast Programmable» custom statistics» filtering» packet transformation» 2
3 Why programmable routers New ISP services» intrusion detection, application acceleration Simpler network monitoring» measure link latency, track down traffic New protocols» IP traceback, Trajectory Sampling, Enable flexible, extensible networks 3
4 Today: fast or programmable Fast hardware routers» throughput : Tbps» no programmability Programmable software routers» processing by general-purpose CPUs» throughput < 10Gbps 4
5 outebricks A router out of off-the-shelf PCs» familiar programming environment» large-volume manufacturing Can we build a Tbps router out of PCs? 5
6 outer = N packet processing + switching N: number of external router ports : external line rate 6
7 A hardware router N linecards linecards Processing at rate ~ per linecard 7
8 A hardware router N linecards switch fabric linecards Processing at rate ~ per linecard Switching at rate N x by switch fabric 8
9 outebricks N commodity interconnect servers servers Processing at rate ~ per server Switching at rate ~ per server 9
10 outebricks N commodity interconnect servers servers Per-server processing rate: c x 10
11 Outline Interconnect Server optimizations Performance Conclusions 11
12 Outline Interconnect Server optimizations Performance Conclusions 12
13 equirements N commodity interconnect Internal link rates < Per-server processing rate: c x Per-server fanout: constant 13
14 A naive solution N 14
15 A naive solution N N external links of capacity N 2 internal links of capacity 15
16 Valiant load balancing N /N /N 16
17 Valiant load balancing /N /N N N external links of capacity N 2 internal links of capacity 2/N 17
18 Valiant load balancing /N /N N Per-server processing rate: 3 Uniform traffic: 2 18
19 Per-server fanout? N 19
20 Per-server fanout? N Increase server capacity 20
21 Per-server fanout? N Increase server capacity 21
22 Per-server fanout? N Increase server capacity Add intermediate nodes» k-degree n-stage butterfly 22
23 Our solution: combination Assign max external ports per server Full mesh, if possible Extra servers, otherwise 23
24 Example Assuming current servers» 5 NICs, 2 x 10G ports or 8 x 1G ports» 1 external port per server N = 32 ports: full mesh» 32 servers N = 1024 ports: 16-ary 4-fly» 2 extra servers per port 24
25 ecap N Valiant load balancing + full mesh k-ary n-fly Per-server processing rate:
26 Outline Interconnect Server optimizations Performance Conclusions 26
27 Setup: NUMA architecture min-size packets I/O hub Mem Ports Cores Mem» Nehalem architecture, QuickPath interconnect» CPUs: 2 x [2.8GHz, 4 cores, 8MB L3 cache]» NICs: 2 x Intel XFS 2x10Gbps» kernel-mode Click 27
28 Single-server performance I/O hub Mem Ports Cores Mem First try: 1.3 Gbps 28
29 Problem #1: book-keeping Managing packet descriptors» moving between NIC and memory» updating descriptor rings Solution: batch packet operations» NIC batches multiple packet descriptors» CPU polls for multiple packets 29
30 Single-server performance I/O hub Mem Ports Cores Mem First try: 1.3 Gbps With batching: 3 Gbps 30
31 Problem #2: queue access Ports Cores 31
32 Problem #2: queue access ule #1: 1 core per port 32
33 Problem #2: queue access ule #1: 1 core per port ule #2: 1 core per packet 33
34 Problem #2: queue access ule #1: 1 core per port ule #2: 1 core per packet 34
35 Problem #2: queue access ule #1: 1 core per port ule #2: 1 core per packet 35
36 Problem #2: queue access ule #1: 1 core per port ule #2: 1 core per packet queue 36
37 Single-server performance I/O hub Mem Ports Cores Mem First try: 1.3 Gbps With batching: 3 Gbps With multiple queues: 9.7 Gbps 37
38 ecap State-of-the art hardware» NUMA architecture, multi-queue NICs Modified NIC driver» batching Careful queue-to-core allocation» one core per queue, per packet 38
39 Outline Interconnect Server optimizations Performance Conclusions 39
40 Single-server performance Gbps ealistic size mix Min-size packets No-op forwarding IP routing ealistic size mix: = 8 12 Gbps Min-size packets: = 2 3 Gbps 40
41 Bottlenecks Gbps ealistic size mix Min-size packets No-op forwarding IP routing ealistic size mix: I/O Min-size packets: CPU 41
42 With upcoming servers ealistic size mix Gbps Min-size packets No-op forwarding IP routing ealistic size mix: = Gbps Min-size packets: = Gbps 42
43 B4 prototype N = 4 external ports» 1 server per port» full mesh ealistic size mix: 4 x 8.75 = 35 Gbps» expected = 8 12 Gbps Min-size packets: 4 x 3 = 12 Gbps» expected = 2 3 Gbps 43
44 I did not talk about eordering» avoid per-flow reordering» 0.15% Latency» 24 microseconds per server (estimate) Open issues» power, form-factor, programming model 44
45 Conclusions outebricks: high-end software router» Valiant LB cluster of commodity servers Programmable with Click Performance:» easily = 1Gbps, N = 100s» = 10Gbps for realistic traffic» for worst case, with upcoming servers 45
46 Thank you. NIC driver and more information at 46
Today s Paper. Routers forward packets. Networks and routers. EECS 262a Advanced Topics in Computer Systems Lecture 18
EECS 262a Advanced Topics in Computer Systems Lecture 18 Software outers/outebricks March 30 th, 2016 John Kubiatowicz Electrical Engineering and Computer Sciences University of California, Berkeley Slides
More informationToday s Paper. Routers forward packets. Networks and routers. EECS 262a Advanced Topics in Computer Systems Lecture 18
EECS 262a Advanced Topics in Computer Systems Lecture 18 Software outers/outebricks October 29 th, 2012 John Kubiatowicz and Anthony D. Joseph Electrical Engineering and Computer Sciences University of
More informationRouteBricks: Exploiting Parallelism To Scale Software Routers
RouteBricks: Exploiting Parallelism To Scale Software Routers Mihai Dobrescu 1 and Norbert Egi 2, Katerina Argyraki 1, Byung-Gon Chun 3, Kevin Fall 3, Gianluca Iannaccone 3, Allan Knies 3, Maziar Manesh
More informationEvaluating the Suitability of Server Network Cards for Software Routers
Evaluating the Suitability of Server Network Cards for Software Routers Maziar Manesh Katerina Argyraki Mihai Dobrescu Norbert Egi Kevin Fall Gianluca Iannaccone Eddie Kohler Sylvia Ratnasamy EPFL, UCLA,
More informationRouteBricks: Exploi2ng Parallelism to Scale So9ware Routers
RouteBricks: Exploi2ng Parallelism to Scale So9ware Routers Mihai Dobrescu and etc. SOSP 2009 Presented by Shuyi Chen Mo2va2on Router design Performance Extensibility They are compe2ng goals Hardware approach
More informationTOWARD PREDICTABLE PERFORMANCE IN SOFTWARE PACKET-PROCESSING PLATFORMS. Mihai Dobrescu, EPFL Katerina Argyraki, EPFL Sylvia Ratnasamy, UC Berkeley
TOWARD PREDICTABLE PERFORMANCE IN SOFTWARE PACKET-PROCESSING PLATFORMS Mihai Dobrescu, EPFL Katerina Argyraki, EPFL Sylvia Ratnasamy, UC Berkeley Programmable Networks 2 Industry/research community efforts
More informationForwarding Architecture
Forwarding Architecture Brighten Godfrey CS 538 February 14 2018 slides 2010-2018 by Brighten Godfrey unless otherwise noted Building a fast router Partridge: 50 Gb/sec router A fast IP router well, fast
More informationResQ: Enabling SLOs in Network Function Virtualization
ResQ: Enabling SLOs in Network Function Virtualization Amin Tootoonchian* Aurojit Panda Chang Lan Melvin Walls Katerina Argyraki Sylvia Ratnasamy Scott Shenker *Intel Labs UC Berkeley ICSI NYU Nefeli EPFL
More informationControlling Parallelism in a Multicore Software Router
Controlling Parallelism in a Multicore Software Router Mihai Dobrescu, Katerina Argyraki EPFL, Switzerland Gianluca Iannaccone, Maziar Manesh, Sylvia Ratnasamy Intel Research Labs, Berkeley ABSTRACT Software
More informationImproved parallelism and scheduling in multi-core software routers
J Supercomput DOI 10.1007/s11227-011-0579-3 Improved parallelism and scheduling in multi-core software routers Norbert Egi Gianluca Iannaccone Maziar Manesh Laurent Mathy Sylvia Ratnasamy Springer Science+Business
More informationCan Software Routers Scale?
Can Software Routers Scale? Katerina Argyraki EPFL, Switzerland Salman Baset Columbia Univ., NY Byung-Gon Chun ICSI, Berkeley, CA Kevin Fall Intel Research Gianluca Iannaccone Allan Knies Eddie Kohler
More informationPacketShader: A GPU-Accelerated Software Router
PacketShader: A GPU-Accelerated Software Router Sangjin Han In collaboration with: Keon Jang, KyoungSoo Park, Sue Moon Advanced Networking Lab, CS, KAIST Networked and Distributed Computing Systems Lab,
More information15-744: Computer Networking. Routers
15-744: Computer Networking outers Forwarding and outers Forwarding IP lookup High-speed router architecture eadings [McK97] A Fast Switched Backplane for a Gigabit Switched outer Optional [D+97] Small
More informationFast packet processing in the cloud. Dániel Géhberger Ericsson Research
Fast packet processing in the cloud Dániel Géhberger Ericsson Research Outline Motivation Service chains Hardware related topics, acceleration Virtualization basics Software performance and acceleration
More informationRule-Based Forwarding
Building Extensible Networks with Rule-Based Forwarding Lucian Popa Norbert Egi Sylvia Ratnasamy Ion Stoica UC Berkeley/ICSI Lancaster Univ. Intel Labs Berkeley UC Berkeley Making Internet forwarding flexible
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationPacketShader as a Future Internet Platform
PacketShader as a Future Internet Platform AsiaFI Summer School 2011.8.11. Sue Moon in collaboration with: Joongi Kim, Seonggu Huh, Sangjin Han, Keon Jang, KyoungSoo Park Advanced Networking Lab, CS, KAIST
More informationIX: A Protected Dataplane Operating System for High Throughput and Low Latency
IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this
More informationImplementing Software Virtual Routers on Multi-core PCs using Click
Implementing Software Virtual Routers on Multi-core PCs using Click Mickaël Hoerdt, Dept. of computer engineering Université catholique de Louvain la neuve mickael.hoerdt@uclouvain.be LANCASTER UNIVERSITY
More information소프트웨어기반고성능침입탐지시스템설계및구현
소프트웨어기반고성능침입탐지시스템설계및구현 KyoungSoo Park Department of Electrical Engineering, KAIST M. Asim Jamshed *, Jihyung Lee*, Sangwoo Moon*, Insu Yun *, Deokjin Kim, Sungryoul Lee, Yung Yi* Department of Electrical
More informationThe Power of Batching in the Click Modular Router
The Power of Batching in the Click Modular Router Joongi Kim, Seonggu Huh, Keon Jang, * KyoungSoo Park, Sue Moon Computer Science Dept., KAIST Microsoft Research Cambridge, UK * Electrical Engineering
More informationPOLYMORPHIC ON-CHIP NETWORKS
POLYMORPHIC ON-CHIP NETWORKS Martha Mercaldi Kim, John D. Davis*, Mark Oskin, Todd Austin** University of Washington *Microsoft Research, Silicon Valley ** University of Michigan On-Chip Network Selection
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationCS 498 Hot Topics in High Performance Computing. Networks and Fault Tolerance. 9. Routing and Flow Control
CS 498 Hot Topics in High Performance Computing Networks and Fault Tolerance 9. Routing and Flow Control Intro What did we learn in the last lecture Topology metrics Including minimum diameter of directed
More informationNetwork Design Considerations for Grid Computing
Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom
More informationThemes. The Network 1. Energy in the DC: ~15% network? Energy by Technology
Themes The Network 1 Low Power Computing David Andersen Carnegie Mellon University Last two classes: Saving power by running more slowly and sleeping more. This time: Network intro; saving power by architecting
More informationTopology basics. Constraints and measures. Butterfly networks.
EE48: Advanced Computer Organization Lecture # Interconnection Networks Architecture and Design Stanford University Topology basics. Constraints and measures. Butterfly networks. Lecture #: Monday, 7 April
More informationReducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet
Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Pilar González-Férez and Angelos Bilas 31 th International Conference on Massive Storage Systems
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico February 29, 2016 CPD
More informationNUMA replicated pagecache for Linux
NUMA replicated pagecache for Linux Nick Piggin SuSE Labs January 27, 2008 0-0 Talk outline I will cover the following areas: Give some NUMA background information Introduce some of Linux s NUMA optimisations
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing Department of Computer Science and Engineering (DEI) Instituto Superior Técnico September 26, 2011 CPD
More informationHardware Evolution in Data Centers
Hardware Evolution in Data Centers 2004 2008 2011 2000 2013 2014 Trend towards customization Increase work done per dollar (CapEx + OpEx) Paolo Costa Rethinking the Network Stack for Rack-scale Computers
More informationSpeeding up Linux TCP/IP with a Fast Packet I/O Framework
Speeding up Linux TCP/IP with a Fast Packet I/O Framework Michio Honda Advanced Technology Group, NetApp michio@netapp.com With acknowledge to Kenichi Yasukata, Douglas Santry and Lars Eggert 1 Motivation
More informationOFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management
Marina Garcia 22 August 2013 OFAR-CM: Efficient Dragonfly Networks with Simple Congestion Management M. Garcia, E. Vallejo, R. Beivide, M. Valero and G. Rodríguez Document number OFAR-CM: Efficient Dragonfly
More informationNetwork-on-chip (NOC) Topologies
Network-on-chip (NOC) Topologies 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and performance
More informationNon-Uniform Memory Access (NUMA) Architecture and Multicomputers
Non-Uniform Memory Access (NUMA) Architecture and Multicomputers Parallel and Distributed Computing MSc in Information Systems and Computer Engineering DEA in Computational Engineering Department of Computer
More informationLecture 3: Topology - II
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 3: Topology - II Tushar Krishna Assistant Professor School of Electrical and
More informationInterconnection Networks: Routing. Prof. Natalie Enright Jerger
Interconnection Networks: Routing Prof. Natalie Enright Jerger Routing Overview Discussion of topologies assumed ideal routing In practice Routing algorithms are not ideal Goal: distribute traffic evenly
More informationNetchannel 2: Optimizing Network Performance
Netchannel 2: Optimizing Network Performance J. Renato Santos +, G. (John) Janakiraman + Yoshio Turner +, Ian Pratt * + HP Labs - * XenSource/Citrix Xen Summit Nov 14-16, 2007 2003 Hewlett-Packard Development
More informationTopologies. Maurizio Palesi. Maurizio Palesi 1
Topologies Maurizio Palesi Maurizio Palesi 1 Network Topology Static arrangement of channels and nodes in an interconnection network The roads over which packets travel Topology chosen based on cost and
More informationResearch on DPDK Based High-Speed Network Traffic Analysis. Zihao Wang Network & Information Center Shanghai Jiao Tong University
Research on DPDK Based High-Speed Network Traffic Analysis Zihao Wang Network & Information Center Shanghai Jiao Tong University Outline 1 Background 2 Overview 3 DPDK Based Traffic Analysis 4 Experiment
More informationDPDK Summit China 2017
Summit China 2017 Embedded Network Architecture Optimization Based on Lin Hao T1 Networks Agenda Our History What is an embedded network device Challenge to us Requirements for device today Our solution
More informationParallel Architectures
Parallel Architectures Part 1: The rise of parallel machines Intel Core i7 4 CPU cores 2 hardware thread per core (8 cores ) Lab Cluster Intel Xeon 4/10/16/18 CPU cores 2 hardware thread per core (8/20/32/36
More informationRuler: High-Speed Packet Matching and Rewriting on Network Processors
Ruler: High-Speed Packet Matching and Rewriting on Network Processors Tomáš Hrubý Kees van Reeuwijk Herbert Bos Vrije Universiteit, Amsterdam World45 Ltd. ANCS 2007 Tomáš Hrubý (VU Amsterdam, World45)
More informationThe Arbitration Problem
HighPerform Switchingand TelecomCenterWorkshop:Sep outing ance t4, 97. EE84Y: Packet Switch Architectures Part II Load-balanced Switches ick McKeown Professor of Electrical Engineering and Computer Science,
More informationReducing Network Contention with Mixed Workloads on Modern Multicore Clusters
Reducing Network Contention with Mixed Workloads on Modern Multicore Clusters Matthew Koop 1 Miao Luo D. K. Panda matthew.koop@nasa.gov {luom, panda}@cse.ohio-state.edu 1 NASA Center for Computational
More informationNetwork Architecture Laboratory
Automated Synthesis of Adversarial Workloads for Network Functions Luis Pedrosa, Rishabh Iyer, Arseniy Zaostrovnykh, Jonas Fietz, Katerina Argyraki Network Architecture Laboratory Software NFs The good:
More informationIntel Workstation Technology
Intel Workstation Technology Turning Imagination Into Reality November, 2008 1 Step up your Game Real Workstations Unleash your Potential 2 Yesterday s Super Computer Today s Workstation = = #1 Super Computer
More informationBest Practices for Setting BIOS Parameters for Performance
White Paper Best Practices for Setting BIOS Parameters for Performance Cisco UCS E5-based M3 Servers May 2013 2014 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public. Page
More informationFairness Issues in Software Virtual Routers
Fairness Issues in Software Virtual Routers Norbert Egi, Adam Greenhalgh, h Mark Handley, Mickael Hoerdt, Felipe Huici, Laurent Mathy Lancaster University PRESTO 2008 Presenter: Munhwan Choi Virtual Router
More informationASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed
ASPERA HIGH-SPEED TRANSFER Moving the world s data at maximum speed ASPERA HIGH-SPEED FILE TRANSFER Aspera FASP Data Transfer at 80 Gbps Elimina8ng tradi8onal bo
More informationG-NET: Effective GPU Sharing In NFV Systems
G-NET: Effective Sharing In NFV Systems Kai Zhang*, Bingsheng He^, Jiayu Hu #, Zeke Wang^, Bei Hua #, Jiayi Meng #, Lishan Yang # *Fudan University ^National University of Singapore #University of Science
More informationLecture 2: Topology - I
ECE 8823 A / CS 8803 - ICN Interconnection Networks Spring 2017 http://tusharkrishna.ece.gatech.edu/teaching/icn_s17/ Lecture 2: Topology - I Tushar Krishna Assistant Professor School of Electrical and
More informationMuch Faster Networking
Much Faster Networking David Riddoch driddoch@solarflare.com Copyright 2016 Solarflare Communications, Inc. All rights reserved. What is kernel bypass? The standard receive path The standard receive path
More informationObject Storage: Redefining Bandwidth for Linux Clusters
Object Storage: Redefining Bandwidth for Linux Clusters Brent Welch Principal Architect, Inc. November 18, 2003 Blocks, Files and Objects Block-base architecture: fast but private Traditional SCSI and
More informationNetSlices: Scalable Mul/- Core Packet Processing in User- Space
NetSlices: Scalable Mul/- Core Packet Processing in - Space Tudor Marian, Ki Suh Lee, Hakim Weatherspoon Cornell University Presented by Ki Suh Lee Packet Processors Essen/al for evolving networks Sophis/cated
More informationRecent Advances in Software Router Technologies
Recent Advances in Software Router Technologies KRNET 2013 2013.6.24-25 COEX Sue Moon In collaboration with: Sangjin Han 1, Seungyeop Han 2, Seonggu Huh 3, Keon Jang 4, Joongi Kim, KyoungSoo Park 5 Advanced
More informationAccelerating Pointer Chasing in 3D-Stacked Memory: Challenges, Mechanisms, Evaluation Kevin Hsieh
Accelerating Pointer Chasing in 3D-Stacked : Challenges, Mechanisms, Evaluation Kevin Hsieh Samira Khan, Nandita Vijaykumar, Kevin K. Chang, Amirali Boroumand, Saugata Ghose, Onur Mutlu Executive Summary
More informationSurvey of ETSI NFV standardization documents BY ABHISHEK GUPTA FRIDAY GROUP MEETING FEBRUARY 26, 2016
Survey of ETSI NFV standardization documents BY ABHISHEK GUPTA FRIDAY GROUP MEETING FEBRUARY 26, 2016 VNFaaS (Virtual Network Function as a Service) In our present work, we consider the VNFaaS use-case
More informationGPGPU introduction and network applications. PacketShaders, SSLShader
GPGPU introduction and network applications PacketShaders, SSLShader Agenda GPGPU Introduction Computer graphics background GPGPUs past, present and future PacketShader A GPU-Accelerated Software Router
More informationReliably Scalable Name Prefix Lookup! Haowei Yuan and Patrick Crowley! Washington University in St. Louis!! ANCS 2015! 5/8/2015!
Reliably Scalable Name Prefix Lookup! Haowei Yuan and Patrick Crowley! Washington University in St. Louis!! ANCS 2015! 5/8/2015! ! My Topic for Today! Goal: a reliable longest name prefix lookup performance
More informationImpact of Cache Coherence Protocols on the Processing of Network Traffic
Impact of Cache Coherence Protocols on the Processing of Network Traffic Amit Kumar and Ram Huggahalli Communication Technology Lab Corporate Technology Group Intel Corporation 12/3/2007 Outline Background
More informationL0 L1 L2 L3 T0 T1 T2 T3. Eth1-4. Eth1-4. Eth1-2 Eth1-2 Eth1-2 Eth Eth3-4 Eth3-4 Eth3-4 Eth3-4.
Click! N P 10.3.0.1 10.3.1.1 Eth33 Eth1-4 Eth1-4 C0 C1 Eth1-2 Eth1-2 Eth1-2 Eth1-2 10.2.0.1 10.2.1.1 10.2.2.1 10.2.3.1 Eth3-4 Eth3-4 Eth3-4 Eth3-4 L0 L1 L2 L3 Eth24-25 Eth1-24 Eth24-25 Eth24-25 Eth24-25
More informationLearning with Purpose
Network Measurement for 100Gbps Links Using Multicore Processors Xiaoban Wu, Dr. Peilong Li, Dr. Yongyi Ran, Prof. Yan Luo Department of Electrical and Computer Engineering University of Massachusetts
More informationA Look at Intel s Dataplane Development Kit
A Look at Intel s Dataplane Development Kit Dominik Scholz Chair for Network Architectures and Services Department for Computer Science Technische Universität München June 13, 2014 Dominik Scholz: A Look
More informationAdvanced Computer Networks. End Host Optimization
Oriana Riva, Department of Computer Science ETH Zürich 263 3501 00 End Host Optimization Patrick Stuedi Spring Semester 2017 1 Today End-host optimizations: NUMA-aware networking Kernel-bypass Remote Direct
More informationFast-Response Multipath Routing Policy for High-Speed Interconnection Networks
HPI-DC 09 Fast-Response Multipath Routing Policy for High-Speed Interconnection Networks Diego Lugones, Daniel Franco, and Emilio Luque Leonardo Fialho Cluster 09 August 31 New Orleans, USA Outline Scope
More informationDPDK Summit 2016 OpenContrail vrouter / DPDK Architecture. Raja Sivaramakrishnan, Distinguished Engineer Aniket Daptari, Sr.
DPDK Summit 2016 OpenContrail vrouter / DPDK Architecture Raja Sivaramakrishnan, Distinguished Engineer Aniket Daptari, Sr. Product Manager CONTRAIL (MULTI-VENDOR) ARCHITECTURE ORCHESTRATOR Interoperates
More informationSupporting Fine-Grained Network Functions through Intel DPDK
Supporting Fine-Grained Network Functions through Intel DPDK Ivano Cerrato, Mauro Annarumma, Fulvio Risso - Politecnico di Torino, Italy EWSDN 2014, September 1st 2014 This project is co-funded by the
More informationNetwork Requirements for Resource Disaggregation
Network Requirements for Resource Disaggregation Peter Gao (Berkeley), Akshay Narayan (MIT), Sagar Karandikar (Berkeley), Joao Carreira (Berkeley), Sangjin Han (Berkeley), Rachit Agarwal (Cornell), Sylvia
More informationScaling Internet Routers Using Optics Producing a 100TB/s Router. Ashley Green and Brad Rosen February 16, 2004
Scaling Internet Routers Using Optics Producing a 100TB/s Router Ashley Green and Brad Rosen February 16, 2004 Presentation Outline Motivation Avi s Black Box Black Box: Load Balance Switch Conclusion
More informationThe NE010 iwarp Adapter
The NE010 iwarp Adapter Gary Montry Senior Scientist +1-512-493-3241 GMontry@NetEffect.com Today s Data Center Users Applications networking adapter LAN Ethernet NAS block storage clustering adapter adapter
More informationCOSC 6374 Parallel Computation. Parallel Computer Architectures
OS 6374 Parallel omputation Parallel omputer Architectures Some slides on network topologies based on a similar presentation by Michael Resch, University of Stuttgart Edgar Gabriel Fall 2015 Flynn s Taxonomy
More informationMWC 2015 End to End NFV Architecture demo_
MWC 2015 End to End NFV Architecture demo_ March 2015 demonstration @ Intel booth Executive summary The goal is to demonstrate how an advanced multi-vendor implementation of the ETSI ISG NFV architecture
More informationASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed
ASPERA HIGH-SPEED TRANSFER Moving the world s data at maximum speed ASPERA HIGH-SPEED FILE TRANSFER 80 GBIT/S OVER IP USING DPDK Performance, Code, and Architecture Charles Shiflett Developer of next-generation
More informationBackground. IBM sold expensive mainframes to large organizations. Monitor sits between one or more OSes and HW
Virtual Machines Background IBM sold expensive mainframes to large organizations Some wanted to run different OSes at the same time (because applications were developed on old OSes) Solution: IBM developed
More informationIX: A Protected Dataplane Operating System for High Throughput and Low Latency
IX: A Protected Dataplane Operating System for High Throughput and Low Latency Adam Belay et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Presented by Han Zhang & Zaina Hamid Challenges
More informationThe Power of Batching in the Click Modular Router
The Power of Batching in the Click Modular Router Joongi Kim Seonggu Huh Keon Jang KyoungSoo Park Sue Moon Department of Computer Science, KAIST, Korea {joongi, seonggu}@an.kaist.ac.kr, sbmoon@kaist.edu
More informationAn FPGA-Based Optical IOH Architecture for Embedded System
An FPGA-Based Optical IOH Architecture for Embedded System Saravana.S Assistant Professor, Bharath University, Chennai 600073, India Abstract Data traffic has tremendously increased and is still increasing
More informationChapter 7 Slicing and Dicing
1/ 22 Chapter 7 Slicing and Dicing Lasse Harju Tampere University of Technology lasse.harju@tut.fi 2/ 22 Concentrators and Distributors Concentrators Used for combining traffic from several network nodes
More informationGetting Real Performance from a Virtualized CCAP
Getting Real Performance from a Virtualized CCAP A Technical Paper prepared for SCTE/ISBE by Mark Szczesniak Software Architect Casa Systems, Inc. 100 Old River Road Andover, MA, 01810 978-688-6706 mark.szczesniak@casa-systems.com
More informationOpenOnload. Dave Parry VP of Engineering Steve Pope CTO Dave Riddoch Chief Software Architect
OpenOnload Dave Parry VP of Engineering Steve Pope CTO Dave Riddoch Chief Software Architect Copyright 2012 Solarflare Communications, Inc. All Rights Reserved. OpenOnload Acceleration Software Accelerated
More informationPerformance assessment of CORBA for the transport of userplane data in future wideband radios. Piya Bhaskar Lockheed Martin
Performance assessment of CORBA for the transport of userplane data in future wideband radios Piya Bhaskar Lockheed Martin 1 Outline Introduction to the problem Test Setup Results Conclusion 2 Problem
More informationCSE398: Network Systems Design
CSE398: Network Systems Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University March 14, 2005 Outline Classification
More informationScalable Ethernet Clos-Switches. Norbert Eicker John von Neumann-Institute for Computing Ferdinand Geier ParTec Cluster Competence Center GmbH
Scalable Ethernet Clos-Switches Norbert Eicker John von Neumann-Institute for Computing Ferdinand Geier ParTec Cluster Competence Center GmbH Outline Motivation Clos-Switches Ethernet Crossbar Switches
More informationPCI Express x8 Single Port SFP+ 10 Gigabit Server Adapter (Intel 82599ES Based) Single-Port 10 Gigabit SFP+ Ethernet Server Adapters Provide Ultimate
NIC-PCIE-1SFP+-PLU PCI Express x8 Single Port SFP+ 10 Gigabit Server Adapter (Intel 82599ES Based) Single-Port 10 Gigabit SFP+ Ethernet Server Adapters Provide Ultimate Flexibility and Scalability in Virtual
More informationNetronome NFP: Theory of Operation
WHITE PAPER Netronome NFP: Theory of Operation TO ACHIEVE PERFORMANCE GOALS, A MULTI-CORE PROCESSOR NEEDS AN EFFICIENT DATA MOVEMENT ARCHITECTURE. CONTENTS 1. INTRODUCTION...1 2. ARCHITECTURE OVERVIEW...2
More informationECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts
ECE 4750 Computer Architecture, Fall 2017 T06 Fundamental Network Concepts School of Electrical and Computer Engineering Cornell University revision: 2017-10-17-12-26 1 Network/Roadway Analogy 3 1.1. Running
More informationCommunication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.
Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance
More informationECE/CS 757: Advanced Computer Architecture II Interconnects
ECE/CS 757: Advanced Computer Architecture II Interconnects Instructor:Mikko H Lipasti Spring 2017 University of Wisconsin-Madison Lecture notes created by Natalie Enright Jerger Lecture Outline Introduction
More informationHeterogeneous SoCs. May 28, 2014 COMPUTER SYSTEM COLLOQUIUM 1
COSCOⅣ Heterogeneous SoCs M5171111 HASEGAWA TORU M5171112 IDONUMA TOSHIICHI May 28, 2014 COMPUTER SYSTEM COLLOQUIUM 1 Contents Background Heterogeneous technology May 28, 2014 COMPUTER SYSTEM COLLOQUIUM
More informationOptimizing Performance: Intel Network Adapters User Guide
Optimizing Performance: Intel Network Adapters User Guide Network Optimization Types When optimizing network adapter parameters (NIC), the user typically considers one of the following three conditions
More informationANALYSIS AND IMPROVEMENT OF VALIANT ROUTING IN LOW- DIAMETER NETWORKS
ANALYSIS AND IMPROVEMENT OF VALIANT ROUTING IN LOW- DIAMETER NETWORKS Mariano Benito Pablo Fuentes Enrique Vallejo Ramón Beivide With support from: 4th IEEE International Workshop of High-Perfomance Interconnection
More informationLecture 26: Multiprocessing continued Computer Architecture and Systems Programming ( )
Systems Group Department of Computer Science ETH Zürich Lecture 26: Multiprocessing continued Computer Architecture and Systems Programming (252-0061-00) Timothy Roscoe Herbstsemester 2012 Today Non-Uniform
More informationCOSC 6374 Parallel Computation. Parallel Computer Architectures
OS 6374 Parallel omputation Parallel omputer Architectures Some slides on network topologies based on a similar presentation by Michael Resch, University of Stuttgart Spring 2010 Flynn s Taxonomy SISD:
More informationOvercoming the Memory System Challenge in Dataflow Processing. Darren Jones, Wave Computing Drew Wingard, Sonics
Overcoming the Memory System Challenge in Dataflow Processing Darren Jones, Wave Computing Drew Wingard, Sonics Current Technology Limits Deep Learning Performance Deep Learning Dataflow Graph Existing
More informationArrakis: The Operating System is the Control Plane
Arrakis: The Operating System is the Control Plane Simon Peter, Jialin Li, Irene Zhang, Dan Ports, Doug Woos, Arvind Krishnamurthy, Tom Anderson University of Washington Timothy Roscoe ETH Zurich Building
More informationSharing CPUs via endpoint congestion control
Sharing CPUs via endpoint congestion control Laura Vasilescu laura.vasilescu@cs.pub.ro Vladimir Olteanu vladimir.olteanu@cs.pub.ro University Politehnica of Bucharest Splaiul Independentei 313a Bucharest,
More informationHigh bandwidth, Long distance. Where is my throughput? Robin Tasker CCLRC, Daresbury Laboratory, UK
High bandwidth, Long distance. Where is my throughput? Robin Tasker CCLRC, Daresbury Laboratory, UK [r.tasker@dl.ac.uk] DataTAG is a project sponsored by the European Commission - EU Grant IST-2001-32459
More informationHigh Performance Packet Processing with FlexNIC
High Performance Packet Processing with FlexNIC Antoine Kaufmann, Naveen Kr. Sharma Thomas Anderson, Arvind Krishnamurthy University of Washington Simon Peter The University of Texas at Austin Ethernet
More information