Ethan Kao, CS 6410, Oct. 18th, 2011


2
- Active Messages: A Mechanism for Integrated Communication and Computation. Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. In Proceedings of the 19th Annual International Symposium on Computer Architecture, 1992.
- U-Net: A User-Level Network Interface for Parallel and Distributed Computing. von Eicken, Basu, Buch, and Werner Vogels. 15th SOSP, December 1995.

3 Parallel System
- Multiple processors in one machine
- Shared memory
- Supercomputing

4 Distributed System
- Multiple machines linked together
- Distributed memory
- Cloud computing

5 How to communicate efficiently?
- Between processors: Active Messages
- Between machines: U-Net

6
- Thorsten von Eicken: Berkeley Ph.D. -> Assistant professor at Cornell -> UCSB; founded RightScale; Chief Architect at Expertcity.com
- David E. Culler: Professor at Berkeley
- Seth Copen Goldstein: Berkeley Ph.D. -> Associate professor at CMU
- Klaus Erik Schauser: Berkeley Ph.D. -> Associate professor at UCSB

7
- Existing message-passing multiprocessors had high communication costs
- Message-passing machines made inefficient use of underlying hardware capabilities
  - nCUBE/2, CM-5
  - Thousands of nodes interconnected
- Poor overlap between computation and communication

8
- Improve overlap between computation and communication
- Aim for 100% utilization of resources
- Low start-up costs for network usage

9
- Asynchronous communication
- Minimal buffering
- Handler interface
- Weaknesses:
  - Address of the message handler must be known
  - Design needs to be hardware-specific?

10 Asynchronous communication mechanism
- Messages contain the address of a user-level handler
- Handler is executed on message arrival
  - Takes the message off the network
  - Message body is its argument
  - Does not block

11
- Sender blocks until messages can be injected into the network
- Receiver is interrupted on message arrival and runs the handler
- User-level program pre-allocates receiving structures
  - Eliminates buffering
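The mechanism on the two slides above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Node` class, the `send` function, the `store_word` handler, and the use of a handler's name as its "address" are all hypothetical, and polling a queue stands in for the interrupt on message arrival.

```python
from collections import deque


class Node:
    """Toy node: a handler table plus pre-allocated user-level storage."""

    def __init__(self, name):
        self.name = name
        self.handlers = {}          # handler "address" -> function (assumption)
        self.network_in = deque()   # stands in for the network interface FIFO
        self.memory = {}            # pre-allocated receiving structures

    def register(self, func):
        """Register a handler and return its 'address' (here, its name)."""
        self.handlers[func.__name__] = func
        return func.__name__

    def poll(self):
        """On arrival, run the handler named at the head of each message."""
        while self.network_in:
            handler_addr, args = self.network_in.popleft()
            self.handlers[handler_addr](self, *args)


def send(dest, handler_addr, *args):
    """Inject a message; the handler address travels inside the message."""
    dest.network_in.append((handler_addr, args))


def store_word(node, key, value):
    """Handler: move the data off the network straight into user memory."""
    node.memory[key] = value


receiver = Node("n1")
addr = receiver.register(store_word)
send(receiver, addr, "x", 42)
receiver.poll()
print(receiver.memory)
```

The handler does no computation of its own; it only moves the message body into storage the program set up in advance, which is where the reduction in buffering comes from.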

12 Traditional send/receive models

13 Key optimization in AM vs. send/receive is the reduction of buffering. AM can achieve a near order-of-magnitude reduction:
- nCUBE/2 AM send/handle: 11 µs / 15 µs overhead
- nCUBE/2 async send/receive: 160 µs overhead
- CM-5 AM: < 2 µs overhead
- CM-5 blocking: 86 µs overhead
- Prototype of blocking send/receive on top of AM: 23 µs overhead
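As a quick check of the "near order of magnitude" claim, the ratios can be worked out from the figures on the slide. Adding the nCUBE/2 send and handle costs into one number is an assumption made here for the comparison, and the CM-5 figure is only an upper bound because the slide gives "< 2".

```python
# Overheads in microseconds, as listed on the slide.
ncube_am = 11 + 15     # AM send + handle (summing the two is an assumption)
ncube_async = 160      # asynchronous send/receive
cm5_am_upper = 2       # slide says "< 2", so treat this as an upper bound
cm5_blocking = 86      # blocking send/receive

ncube_ratio = ncube_async / ncube_am
cm5_ratio_lower = cm5_blocking / cm5_am_upper

print(f"nCUBE/2: {ncube_ratio:.1f}x lower overhead with AM")
print(f"CM-5: more than {cm5_ratio_lower:.0f}x lower overhead with AM")
```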

14
- Non-blocking implementations of PUT and GET
- Implementations consist of a message formatter and a message handler
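A rough sketch of how the formatter/handler split might look, again as a toy model: the `Node` class, the `pending` counter, and the names `put`, `get`, `put_handler`, `get_handler`, and `get_reply_handler` are illustrative assumptions, not the interface from the paper. The formatters only build and enqueue a message, so they return immediately; the work happens in handlers on the receiving side.

```python
from collections import deque


class Node:
    def __init__(self):
        self.memory = {}
        self.inbox = deque()   # stands in for the network interface
        self.pending = 0       # outstanding GETs (hypothetical completion counter)

    def poll(self):
        """Run the handler carried by each arrived message."""
        while self.inbox:
            handler, args = self.inbox.popleft()
            handler(self, *args)


# Message handlers: run on the node that receives the message.
def put_handler(node, addr, value):
    node.memory[addr] = value


def get_handler(node, requester, remote_addr, local_addr):
    # Service the request by sending the value back to the requester.
    reply = (get_reply_handler, (local_addr, node.memory[remote_addr]))
    requester.inbox.append(reply)


def get_reply_handler(node, local_addr, value):
    node.memory[local_addr] = value
    node.pending -= 1


# Message formatters: run on the sender and return without waiting.
def put(dest, addr, value):
    dest.inbox.append((put_handler, (addr, value)))


def get(src, dest, remote_addr, local_addr):
    src.pending += 1
    dest.inbox.append((get_handler, (src, remote_addr, local_addr)))


a, b = Node(), Node()
put(b, "v", 7)
b.poll()
get(a, b, "v", "copy")   # returns immediately; a.pending is now 1
b.poll()                 # b services the request
a.poll()                 # a receives the reply
print(a.memory, a.pending)
```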

15 Matrix multiplication C = A x B
- Each processor GETs one column of A after another and performs a rank-1 update with its own columns of B
- Achieves 95% of peak performance
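The rank-1 update scheme can be illustrated on a single processor. This is a sequential sketch under assumptions: `matmul_rank1` and `owned_cols` are hypothetical names, a list comprehension stands in for the GET of a column of A, and nothing here overlaps communication with computation the way the real implementation would.

```python
def matmul_rank1(A, B, owned_cols):
    """Compute the owned columns of C = A x B by successive rank-1 updates."""
    n_rows, n_inner = len(A), len(B)
    C = {j: [0.0] * n_rows for j in owned_cols}
    for k in range(n_inner):
        # Stands in for a GET of column k of A from its owner.
        col_k = [A[i][k] for i in range(n_rows)]
        for j in owned_cols:
            b_kj = B[k][j]
            # Rank-1 update restricted to the columns this processor owns.
            for i in range(n_rows):
                C[j][i] += col_k[i] * b_kj
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul_rank1(A, B, owned_cols=[0, 1])
print(C)
```

Each key of the result is a column index of C, so with both columns owned the output holds the full product, stored column by column.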

16
- Computation occurs in the message handler
  - Specialized hardware -> Monsoon, J-Machine
  - Memory allocation and scheduling required upon message arrival
  - Tricky to implement in hardware
  - Expensive
- In Active Messages, the handler only removes messages from the network
- Threaded Abstract Machine (TAM)
  - Parallel execution model based on Active Messages
  - Typically no memory allocation upon message arrival
  - No test results

17
- Good performance
- Not a new parallel programming paradigm
  - Evolutionary, not revolutionary
- AM systems?
  - Multiprocessor vs. cluster

18
- Thorsten von Eicken
- Anindya Basu: advised by von Eicken
- Vineet Buch: M.S. from Cornell; co-founded Like.com -> Google
- Werner Vogels: Research Scientist at Cornell -> CTO of Amazon

19
- Bottleneck of local area communication is at the kernel
  - Several copies of messages made
  - Processing overhead dominates for small messages
- Low round-trip latencies growing in importance
  - Especially for small messages
- Traditional networking architecture inflexible
  - Cannot easily support new protocols or send/receive interfaces

20
- Remove kernel from critical path of communication
- Provide low-latency communication in local area settings
- Exploit full network bandwidth even with small messages
- Facilitate the use of novel communication protocols

21
- Flexible
- Low latency for smaller messages
- Off-the-shelf hardware, good performance
- Weaknesses:
  - Multiplexing resources between processes not in kernel
  - Specialized NI needed?

22
- User-level communication, architecture-independent
- Virtualizes network devices
- Kernel control of channel set-up and tear-down

23 Remove kernel from critical path: send/recv

24
- U-Net:
  - Multiplexes the NI among all processes accessing the network
  - Enforces protection boundaries and resource limits
- Process:
  - Controls the contents of each message and manages send/recv resources (i.e., buffers)

25 Main building blocks of U-Net:
- Endpoints
- Communication segments
- Message queues
Each process that wishes to access the network:
- Creates one or more endpoints
- Associates a communication segment with each endpoint
- Associates a set of send, receive, and free message queues with each endpoint


27
- Prepare packet -> place it in the comm segment
- Place descriptor on the send queue
- U-Net takes descriptor from the queue
- Transfers packet from memory to the network
(Figure: packet, U-Net NI, network. From Itamar Sagi.)

28
- U-Net receives message and identifies the endpoint
- Takes free space from the free queue
- Places message in the communication segment
- Places descriptor in the receive queue
- Process takes descriptor from the receive queue and reads the message
(Figure: U-Net NI, packet, network. From Itamar Sagi.)
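The send and receive paths on the two slides above can be traced with a small simulation. Everything here is a hypothetical stand-in: the `Endpoint` and `ToyUNet` classes, the string tags, the descriptor layout, and the choice to reserve buffer 0 for sends are assumptions made for illustration, not the actual U-Net data structures.

```python
from collections import deque


class Endpoint:
    """Toy endpoint: a communication segment plus send/recv/free queues."""

    def __init__(self, n_buffers, buf_size):
        self.segment = [bytearray(buf_size) for _ in range(n_buffers)]
        self.send_q = deque()                      # descriptors to transmit
        self.recv_q = deque()                      # descriptors of arrivals
        self.free_q = deque(range(1, n_buffers))   # buffer 0 kept for sends


class ToyUNet:
    """Stands in for the NI: multiplexes endpoints, keyed by a tag."""

    def __init__(self):
        self.endpoints = {}

    def attach(self, tag, endpoint):
        self.endpoints[tag] = endpoint

    def service_send(self, src_tag):
        """Take descriptors off the send queue and push packets out."""
        ep = self.endpoints[src_tag]
        while ep.send_q:
            buf, length, dest_tag = ep.send_q.popleft()
            self.deliver(dest_tag, bytes(ep.segment[buf][:length]))

    def deliver(self, dest_tag, payload):
        """Demultiplex to the endpoint, fill a free buffer, post a descriptor."""
        ep = self.endpoints[dest_tag]
        if not ep.free_q:
            return False                           # no free buffer: drop
        buf = ep.free_q.popleft()
        ep.segment[buf][:len(payload)] = payload
        ep.recv_q.append((buf, len(payload)))
        return True


net = ToyUNet()
a, b = Endpoint(4, 64), Endpoint(4, 64)
net.attach("a", a)
net.attach("b", b)

msg = b"hello"
a.segment[0][:len(msg)] = msg          # compose packet in the comm segment
a.send_q.append((0, len(msg), "b"))    # place descriptor on the send queue
net.service_send("a")                  # NI takes descriptor and transmits

buf, length = b.recv_q.popleft()       # receiver takes descriptor
received = bytes(b.segment[buf][:length])
b.free_q.append(buf)                   # hand the buffer back for reuse
print(received)
```

Note that the process never calls into a kernel on this path; it only reads and writes its own segment and queues, which is the point of taking the kernel off the critical path.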

29
- Only the owning process can access:
  - Endpoints
  - Communication segments
  - Message queues
- Outgoing messages are tagged with the originating endpoint
- Incoming messages are demultiplexed by U-Net

30
- Base-level: zero-copy
  - Comm segments not regarded as memory regions
  - 1 copy between application data structure and buffer in comm segment
  - Small messages held entirely in queue
- Direct-access: true zero copy
  - Comm segments can span the entire process address space
  - Sender can specify offset within destination comm segment for data
  - Difficult to implement on existing workstation hardware

31
- U-Net implementations support base-level
  - Hardware for direct-access not available
  - Copy overhead not a dominant cost
- Kernel-emulated endpoints

32 Implemented on SPARCstations running SunOS 4.1.3
- Fore SBA-100 interface
  - Lack of hardware for CRC computation = overhead
- Fore SBA-200 interface
  - Uses custom firmware to implement base-level architecture
  - i960 processor reprogrammed to implement U-Net directly
  - Small messages: 65 µs RTT vs. 12 µs for CM-5
  - Fiber saturated with packet sizes of 800 bytes


35
- Traditional UDP and TCP over ATM: performance disappointing
  - < 55% of max bandwidth for TCP
- Better performance with UDP and TCP over U-Net
  - Not bounded by kernel resources
  - More state awareness = better application-network relationships


37
- Main goals were to achieve low-latency communication and flexibility
- NetBump


More information

Performance of a High-Level Parallel Language on a High-Speed Network

Performance of a High-Level Parallel Language on a High-Speed Network Performance of a High-Level Parallel Language on a High-Speed Network Henri Bal Raoul Bhoedjang Rutger Hofman Ceriel Jacobs Koen Langendoen Tim Rühl Kees Verstoep Dept. of Mathematics and Computer Science

More information

CHAPTER 9: PACKET SWITCHING N/W & CONGESTION CONTROL

CHAPTER 9: PACKET SWITCHING N/W & CONGESTION CONTROL CHAPTER 9: PACKET SWITCHING N/W & CONGESTION CONTROL Dr. Bhargavi Goswami, Associate Professor head, Department of Computer Science, Garden City College Bangalore. PACKET SWITCHED NETWORKS Transfer blocks

More information

LS Example 5 3 C 5 A 1 D

LS Example 5 3 C 5 A 1 D Lecture 10 LS Example 5 2 B 3 C 5 1 A 1 D 2 3 1 1 E 2 F G Itrn M B Path C Path D Path E Path F Path G Path 1 {A} 2 A-B 5 A-C 1 A-D Inf. Inf. 1 A-G 2 {A,D} 2 A-B 4 A-D-C 1 A-D 2 A-D-E Inf. 1 A-G 3 {A,D,G}

More information

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors University of Crete School of Sciences & Engineering Computer Science Department Master Thesis by Michael Papamichael Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

More information

CS-534 Packet Switch Architecture

CS-534 Packet Switch Architecture CS-534 Packet Switch Architecture The Hardware Architect s Perspective on High-Speed Networking and Interconnects Manolis Katevenis University of Crete and FORTH, Greece http://archvlsi.ics.forth.gr/~kateveni/534

More information

Communication Networks

Communication Networks Communication Networks Spring 2018 Laurent Vanbever nsg.ee.ethz.ch ETH Zürich (D-ITET) April 30 2018 Materials inspired from Scott Shenker & Jennifer Rexford Last week on Communication Networks We started

More information

Multiprocessor Interconnection Networks

Multiprocessor Interconnection Networks Multiprocessor Interconnection Networks Todd C. Mowry CS 740 November 19, 1998 Topics Network design space Contention Active messages Networks Design Options: Topology Routing Direct vs. Indirect Physical

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

Network Design Considerations for Grid Computing

Network Design Considerations for Grid Computing Network Design Considerations for Grid Computing Engineering Systems How Bandwidth, Latency, and Packet Size Impact Grid Job Performance by Erik Burrows, Engineering Systems Analyst, Principal, Broadcom

More information

An Extensible Message-Oriented Offload Model for High-Performance Applications

An Extensible Message-Oriented Offload Model for High-Performance Applications An Extensible Message-Oriented Offload Model for High-Performance Applications Patricia Gilfeather and Arthur B. Maccabe Scalable Systems Lab Department of Computer Science University of New Mexico pfeather@cs.unm.edu,

More information

Scalable Multiprocessors

Scalable Multiprocessors arallel Computer Organization and Design : Lecture 7 er Stenström. 2008, Sally A. ckee 2009 Scalable ultiprocessors What is a scalable design? (7.1) Realizing programming models (7.2) Scalable communication

More information

Asynchronous Transfer Mode (ATM) ATM concepts

Asynchronous Transfer Mode (ATM) ATM concepts Asynchronous Transfer Mode (ATM) Asynchronous Transfer Mode (ATM) is a switching technique for telecommunication networks. It uses asynchronous time-division multiplexing,[1][2] and it encodes data into

More information

Next Steps Spring 2011 Lecture #18. Multi-hop Networks. Network Reliability. Have: digital point-to-point. Want: many interconnected points

Next Steps Spring 2011 Lecture #18. Multi-hop Networks. Network Reliability. Have: digital point-to-point. Want: many interconnected points Next Steps Have: digital point-to-point We ve worked on link signaling, reliability, sharing Want: many interconnected points 6.02 Spring 2011 Lecture #18 multi-hop networks: design criteria network topologies

More information

An RDMA Protocol Specification (Version 1.0)

An RDMA Protocol Specification (Version 1.0) draft-recio-iwarp-rdmap-v.0 Status of this Memo R. Recio IBM Corporation P. Culley Hewlett-Packard Company D. Garcia Hewlett-Packard Company J. Hilland Hewlett-Packard Company October 0 An RDMA Protocol

More information

Section 3; in Section 4 we will attempt a discussion of the four approaches. The paper ends with a ConclusionèSection 5è. 2 Issues in Communication In

Section 3; in Section 4 we will attempt a discussion of the four approaches. The paper ends with a ConclusionèSection 5è. 2 Issues in Communication In Process Communication on Clusters Sultan Al-Muhammadi, Peter Petrov, Ju Wang, Bogdan Warinschi November 21, 1998 Abstract Clusters of computers promise to be the supercomputers of the future. Traditional

More information

Packet Switching. Hongwei Zhang Nature seems to reach her ends by long circuitous routes.

Packet Switching. Hongwei Zhang  Nature seems to reach her ends by long circuitous routes. Problem: not all networks are directly connected Limitations of directly connected networks: limit on the number of hosts supportable limit on the geographic span of the network Packet Switching Hongwei

More information

EXTENDING AN ASYNCHRONOUS MESSAGING LIBRARY USING AN RDMA-ENABLED INTERCONNECT. Konstantinos Alexopoulos ECE NTUA CSLab

EXTENDING AN ASYNCHRONOUS MESSAGING LIBRARY USING AN RDMA-ENABLED INTERCONNECT. Konstantinos Alexopoulos ECE NTUA CSLab EXTENDING AN ASYNCHRONOUS MESSAGING LIBRARY USING AN RDMA-ENABLED INTERCONNECT Konstantinos Alexopoulos ECE NTUA CSLab MOTIVATION HPC, Multi-node & Heterogeneous Systems Communication with low latency

More information

Last time. Wireless link-layer. Introduction. Characteristics of wireless links wireless LANs networking. Cellular Internet access

Last time. Wireless link-layer. Introduction. Characteristics of wireless links wireless LANs networking. Cellular Internet access Last time Wireless link-layer Introduction Wireless hosts, base stations, wireless links Characteristics of wireless links Signal strength, interference, multipath propagation Hidden terminal, signal fading

More information

Implementing TreadMarks over GM on Myrinet: Challenges, Design Experience, and Performance Evaluation

Implementing TreadMarks over GM on Myrinet: Challenges, Design Experience, and Performance Evaluation Implementing TreadMarks over GM on Myrinet: Challenges, Design Experience, and Performance Evaluation Ranjit Noronha and Dhabaleswar K. Panda Dept. of Computer and Information Science The Ohio State University

More information

Local Area Network Overview

Local Area Network Overview Local Area Network Overview Chapter 15 CS420/520 Axel Krings Page 1 LAN Applications (1) Personal computer LANs Low cost Limited data rate Back end networks Interconnecting large systems (mainframes and

More information

Network Implementation

Network Implementation CS 256/456: Operating Systems Network Implementation John Criswell! University of Rochester 1 Networking Overview 2 Networking Layers Application Layer Format of Application Data Transport Layer Which

More information

CS 455/555 Intro to Networks and Communications. Link Layer Addressing, Ethernet, and a Day in the Life of a Web Request

CS 455/555 Intro to Networks and Communications. Link Layer Addressing, Ethernet, and a Day in the Life of a Web Request CS 455/555 Intro to Networks and Communications Link Layer Addressing, ernet, and a Day in the Life of a Web Request Dr. Michele Weigle Department of Computer Science Old Dominion University mweigle@cs.odu.edu

More information

Architecture or Parallel Computers CSC / ECE 506

Architecture or Parallel Computers CSC / ECE 506 Architecture or Parallel Computers CSC / ECE 506 Summer 2006 Scalable Programming Models 6/19/2006 Dr Steve Hunter Back to Basics Parallel Architecture = Computer Architecture + Communication Architecture

More information

Goals for Today s Class. EE 122: Networks & Protocols. What Global (non-digital) Communication Network Do You Use Every Day?

Goals for Today s Class. EE 122: Networks & Protocols. What Global (non-digital) Communication Network Do You Use Every Day? Goals for Today s Class EE 122: & Protocols Ion Stoica TAs: Junda Liu, DK Moon, David Zats http://inst.eecs.berkeley.edu/~ee122/fa09 (Materials with thanks to Vern Paxson, Jennifer Rexford, and colleagues

More information

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies

Data Link Layer. Our goals: understand principles behind data link layer services: instantiation and implementation of various link layer technologies Data Link Layer Our goals: understand principles behind data link layer services: link layer addressing instantiation and implementation of various link layer technologies 1 Outline Introduction and services

More information

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996

Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 Lecture 28: Networks & Interconnect Architectural Issues Professor Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Review: ABCs of Networks Starting Point: Send bits between 2 computers Queue

More information

Chapter 6 Queuing Disciplines. Networking CS 3470, Section 1

Chapter 6 Queuing Disciplines. Networking CS 3470, Section 1 Chapter 6 Queuing Disciplines Networking CS 3470, Section 1 Flow control vs Congestion control Flow control involves preventing senders from overrunning the capacity of the receivers Congestion control

More information

G Robert Grimm New York University

G Robert Grimm New York University G22.3250-001 Receiver Livelock Robert Grimm New York University Altogether Now: The Three Questions What is the problem? What is new or different? What are the contributions and limitations? Motivation

More information