Remote Procedure Calls

Similar documents
416 Distributed Systems. RPC Day 2 Jan 11, 2017

416 Distributed Systems. RPC Day 2 Jan 12, 2018

Network Communication and Remote Procedure Calls

Distributed Systems. Lecture 06 Remote Procedure Calls Thursday, September 13 th, 2018

REMOTE PROCEDURE CALLS EE324

Building up to today. Remote Procedure Calls. Reminder about last time. Threads - impl

Administrivia. Remote Procedure Calls. Reminder about last time. Building up to today

Distributed Systems 8. Remote Procedure Calls

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI

Distributed Systems. How do regular procedure calls work in programming languages? Problems with sockets RPC. Regular procedure calls

Operating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski

CSci Introduction to Distributed Systems. Communication: RPC

Distributed Systems [Fall 2013]

Remote Procedure Call (RPC) and Transparency

Today CSCI Communication. Communication in Distributed Systems. Communication in Distributed Systems. Remote Procedure Calls (RPC)

Distributed Systems. Lec 4: Remote Procedure Calls (RPC)

Overview. Communication types and role of Middleware Remote Procedure Call (RPC) Message Oriented Communication Multicasting 2/36

CHAPTER - 4 REMOTE COMMUNICATION

C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Remote Procedure Call. Chord: Node Joins and Leaves. Recall? Socket API

Outline. EEC-681/781 Distributed Computing Systems. The OSI Network Architecture. Inter-Process Communications (IPC) Lecture 4

COMMUNICATION PROTOCOLS: REMOTE PROCEDURE CALL (RPC)

Lecture 8: February 19

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004

RMI & RPC. CS 475, Spring 2019 Concurrent & Distributed Systems

Distributed Objects and Remote Invocation. Programming Models for Distributed Applications

03 Remote invoaction. Request-reply RPC. Coulouris 5 Birrel_Nelson_84.pdf RMI

Outline. Interprocess Communication. Interprocess Communication. Communication Models: Message Passing and shared Memory.

Remote Procedure Calls (RPC)

MODELS OF DISTRIBUTED SYSTEMS

CSCI-1680 RPC and Data Representation. Rodrigo Fonseca

Remote Procedure Call

MODELS OF DISTRIBUTED SYSTEMS

Distributed Systems Theory 4. Remote Procedure Call. October 17, 2008

DISTRIBUTED COMPUTER SYSTEMS

Distributed Systems. 03. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Fall 2017

CSCI-1680 RPC and Data Representation. Rodrigo Fonseca

Lecture 5: Object Interaction: RMI and RPC

CSCI-1680 RPC and Data Representation John Jannotti

Last Class: RPCs. Today:

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 21: Network Protocols (and 2 Phase Commit)

CS 417 9/18/17. Paul Krzyzanowski 1. Socket-based communication. Distributed Systems 03. Remote Procedure Calls. Sample SMTP Interaction

Chapter 4 Communication

Remote Invocation. To do. Request-reply, RPC, RMI. q Today q. q Next time: Indirect communication

ECE 650 Systems Programming & Engineering. Spring 2018

CS454/654 Midterm Exam Fall 2004

RMI: Design & Implementation

Chapter 5: Remote Invocation. Copyright 2015 Prof. Amr El-Kadi

NETWORKED STORAGE AND REMOTE PROCEDURE CALLS (RPC)

COMMUNICATION IN DISTRIBUTED SYSTEMS

Network Protocols. Sarah Diesburg Operating Systems CS 3430

Lecture 8: February 17

Distributed Systems Lecture 2 1. External Data Representation and Marshalling (Sec. 4.3) Request reply protocol (failure modes) (Sec. 4.

Remote Invocation Vladimir Vlassov and Johan Montelius


Communication. Distributed Systems Santa Clara University 2016

IPC. Communication. Layered Protocols. Layered Protocols (1) Data Link Layer. Layered Protocols (2)

Communication. Communication. Distributed Systems. Networks and protocols Sockets Remote Invocation Messages Streams. Fall /10/2001 DoCS

Remote Procedure Call. Tom Anderson

Lecture 15: Network File Systems

Communication in Distributed Systems

How do modules communicate? Enforcing modularity. Modularity: client-server organization. Tradeoffs of enforcing modularity

Remote Invocation. Today. Next time. l Indirect communication. l Request-reply, RPC, RMI

Distributed Systems. 02. Networking. Paul Krzyzanowski. Rutgers University. Fall 2017

Networked Applications: Sockets. End System: Computer on the Net

Interprocess Communication

Structured communication (Remote invocation)

CSMC 412. Computer Networks Prof. Ashok K Agrawala Ashok Agrawala Set 2. September 15 CMSC417 Set 2 1

Communication. Distributed Systems IT332

Lecture 06: Distributed Object

MTAT Enterprise System Integration. Lecture 2: Middleware & Web Services

DS 2009: middleware. David Evans

CS603: Distributed Systems

416 Distributed Systems. Networks review; Day 2 of 2 Fate sharing, e2e principle And start of RPC Jan 10, 2018

RPCs in Go. COS 418: Distributed Systems Precept 2. Themis Melissaris and Daniel Suo

RPC. Remote Procedure Calls. Robert Grimm New York University

EECS 482 Introduction to Operating Systems

Communication. Overview

Distributed Systems. The main method of distributed object communication is with remote method invocation

Computer Networks Prof. Ashok K. Agrawala

CS321: Computer Networks Introduction to Application Layer

struct foomsg { u_int32_t len; }

Lecture 5: RMI etc. Servant. Java Remote Method Invocation Invocation Semantics Distributed Events CDK: Chapter 5 TVS: Section 8.3

Networks and distributed computing

a. Under overload, whole network collapsed iii. How do you make an efficient high-level communication mechanism? 1. Similar to using compiler instead

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 22: Remote Procedure Call (RPC)

Hybrid of client-server and P2P. Pure P2P Architecture. App-layer Protocols. Communicating Processes. Transport Service Requirements

Networks and Operating Systems Chapter 3: Remote Procedure Call (RPC)

Remote Procedure Calls CS 707

INTRODUCTION TO PROTOCOLS, LAYERING, FRAMING AND PARSING

Two Phase Commit Protocol. Distributed Systems. Remote Procedure Calls (RPC) Network & Distributed Operating Systems. Network OS.

CS555: Distributed Systems [Fall 2017] Dept. Of Computer Science, Colorado State University

Chapter IV INTERPROCESS COMMUNICATION. Jehan-François Pâris

Randall Stewart, Cisco Systems Phill Conrad, University of Delaware

CS4700/CS5700 Fundaments of Computer Networks

Networked Applications: Sockets. Goals of Todayʼs Lecture. End System: Computer on the ʻNet. Client-server paradigm End systems Clients and servers

416 Distributed Systems. Networks review; Day 2 of 2 And start of RPC Jan 13, 2016

ELEC / COMP 177 Fall Some slides from Kurose and Ross, Computer Networking, 5 th Edition

System Models and Communication

Dr. Robert N. M. Watson

Distributed Systems are Everywhere!" CS162 Operating Systems and Systems Programming Lecture 22 Client-Server" Client-Server" Message Passing"

Transcription:

CS 5450 Remote Procedure Calls Vitaly Shmatikov

Abstractions Abstractions for communication TCP masks some of the pain of communicating over unreliable IP Abstractions for computation Goal: programming abstraction that enables splitting work across multiple networked computers 2

The Problem of Communication Process on Host A wants to talk to process on Host B A and B must agree on the meaning of the bits being sent and received at many different levels, including: How many volts is a 0 bit, a 1 bit? How does receiver know which is the last bit? How many bits long is a number? 3

The Problem of Communication Applications HTTP Skype SSH FTP Transmission media Coaxial cable Fiber optic Wi-Fi Re-implement every application for every new underlying transmission medium? Consequence: change every application on any change to an underlying transmission medium No! 4

Solution: Layering Applications HTTP Skype SSH FTP Intermediate layers Transmission media Coaxial cable Fiber optic Wi-Fi Intermediate layers provide a set of abstractions for applications and media New applications or media need only implement for intermediate layer s interface 5

Layering in the Internet Applications Transport layer Network layer Link layer Physical layer Host Transport: Provide end-to-end communication between processes on different hosts Network: Deliver packets to destinations on other (heterogeneous) networks Link: Enables end hosts to exchange atomic messages with each other Physical: Moves bits between two hosts connected by a physical link 6

Communication Between Hosts How to forge agreement on the meaning of the bits exchanged between two hosts? Protocol: Rules that governs the format, contents, and meaning of messages Each layer on a host interacts with its peer host s corresponding layer via the protocol interface Application Transport Network Link Physical Network Link Physical Application Transport Network Link Physical Host A Router Host B 7

Socket-Based Communication Socket: the interface OS provides to the network Explicit inter-process message exchange Can build distributed systems atop sockets: send(), recv() Application layer Process Socket Transport layer Network layer Link layer Physical layer Host A Application layer Process Socket Transport layer Network layer Link layer Physical layer Host B 8

Principle of Transparency Principle of transparency: Hide that resource is physically distributed across multiple computers Access resource the same way as locally Users can t tell where resource is physically located Network sockets provide applications with point-topoint communications between processes Is this transparent? 9

// Create a socket for the client if ((sockfd = socket (AF_INET, SOCK_STREAM, 0)) < 0) { perror( Socket creation"); exit(2); } // Set server address and port memset(&servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_addr.s_addr = inet_addr(argv[1]); servaddr.sin_port Sockets = don t htons(serv_port); provide transparency // to big-endian // Establish TCP connection if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) { perror( Connect to server"); exit(3); } // Transmit the data over the TCP connection send(sockfd, buf, strlen(buf), 0); 10

Remote Procedure Call A type of client/server communication Attempts to make remote procedure calls look like local ones {... foo() } void foo() { invoke_remote_foo() } figure from Microsoft MSDN 11

RPC Goals Ease programming Hide complexity Automate task of implementing distributed computation Familiar model for programmers (just make a function call) 12

RPC Properties A remote procedure call makes a call to a remote service look like a local call RPC makes transparent whether server is local or remote RPC allows applications to become distributed transparently RPC makes architecture of remote machine transparent 13

RPC Issues Heterogeneity Client needs to rendezvous with the server Server must dispatch to the required function Different address spaces, data representation Failure What if messages get dropped? What if client, server, or network fails? Performance Procedure call takes 10 cycles 3 ns RPC in a data center takes 10 µs (103 times slower) In the wide area, typically 106 times slower 14

Differences in Data Representation Not an issue for local procedure call For an RPC, a remote machine may: Represent data types using different sizes Use a different byte ordering (endianness) Represent floating point numbers differently Have different data alignment requirements E.g., 4-byte type begins only on 4-byte memory boundary 15

Stubs: Obtaining Transparency Compiler generates from API stubs for a procedure on the client and server Client stub Marshals arguments into machine-independent format Sends request to server Waits for response Unmarshals result and returns to caller Server stub Unmarshals arguments and builds stack frame Calls procedure Server stub marshals results and sends reply slide 16 16

Marshaling and Unmarshaling Example of marshaling: convert into string in a predefined byte order, followed by string length Floating point... Nested structures? Design question for the RPC system support them or not? Complex data structures? Some RPC systems let you send lists and maps as first-order objects 17

Stubs and IDLs RPC stubs do the work of marshaling and unmarshaling data But how do they know how to do it? Typically: write a description of the function signature using an IDL -- interface definition language Lots of these. Some look like C, some look like XML... details don t matter much. 18

Interface Description Language Mechanism to pass procedure parameters and return values in a machine-independent way Use IDL to define API for procedure calls: names, parameter/return types IDL compiler generates: Code to marshal (convert) and unmarshal native data types into machine-independent byte streams Client stub: forwards local procedure call as a request to server Server stub: dispatches RPC to its implementation 19

Server Stub Structure Dispatcher Receives a client s RPC request Identifies appropriate server-side method to invoke Skeleton Unmarshals parameters to server-native types Calls the local server procedure Marshals the response, sends it back to the dispatcher All this is hidden from the programmer 20

A Day in the Life of RPC 1. Client calls stub function (pushes parameters onto stack) Client machine Client process k = add(3, 5) Client stub (RPC library) 21

A Day in the Life of RPC 1. Client calls stub function (pushes params onto stack) 2. Stub marshals parameters to a network message Client machine Client process k = add(3, 5) Client stub (RPC library) proc: add int: 3 int: 5 Client OS 22

A Day in the Life of RPC 2. Stub marshals parameters to a network message 3. OS sends a network message to the server Client machine Server machine Client process k = add(3, 5) Client stub (RPC library) Client OS proc: add int: 3 int: 5 Server OS 23

A Day in the Life of RPC 3. OS sends a network message to the server 4. Server OS receives message, sends it up to stub Client machine Server machine Client process k = add(3, 5) Client stub (RPC library) Server stub (RPC library) Client OS Server OS proc: add int: 3 int: 5 24

A Day in the Life of RPC 4. Server OS receives message, sends it up to stub 5. Server stub unmarshals parameters, calls server function Client machine Client process k = add(3, 5) Client stub (RPC library) Client OS Server machine Server process Implementation of add Server stub (RPC library) proc: add int: 3 int: 5 Server OS 25

A Day in the Life of RPC 5. Server stub unmarshals parameters, calls server function 6. Server function runs, returns a value Client machine Client process k = add(3, 5) Client stub (RPC library) Server machine Server process 8! add(3, 5) Server stub (RPC library) Client OS Server OS 26

A Day in the Life of RPC 6. Server function runs, returns a value 7. Server stub marshals the return value, sends msg Client machine Client process k = add(3, 5) Client stub (RPC library) Server machine Server process 8! add(3, 5) Server stub (RPC library) Result int: 8 Client OS Server OS 27

A Day in the Life of RPC 7. Server stub marshals the return value, sends msg 8. Server OS sends the reply back across the network Client machine Client process k = add(3, 5) Client stub (RPC library) Server machine Server process 8! add(3, 5) Server stub (RPC library) Client OS Server OS Result int: 8 28

A Day in the Life of RPC 8. Server OS sends the reply back across the network 9. Client OS receives the reply and passes up to stub Client machine Client process k = add(3, 5) Client stub (RPC library) Server machine Server process 8! add(3, 5) Server stub (RPC library) Client OS Result int: 8 Server OS 29

A Day in the Life of RPC 9. Client OS receives the reply and passes up to stub 10. Client stub unmarshals return value, returns to client Client machine Client process k! 8 Client stub (RPC library) Server machine Server process 8! add(3, 5) Server stub (RPC library) Result int: 8 Client OS Server OS 30

Passing Value Parameters (1) Doing a remote computation through RPC 31

Passing Value Parameters (2) a) Original message on x86 b) The message after receipt on the SPARC c) The message after being inverted (the little numbers in boxes indicate the address of each byte) 32

Passing Reference Parameters Replace with pass by copy/restore Need to know size of data to copy Difficult in some programming languages Solves the problem only partially What about data structures containing pointers? Access to memory in general? Distributed shared memory People have kind of given up on it - it turns out often better to admit that you re doing things remotely 33

RPC vs. LPC Achieving transparency is difficult! Partial failures Latency Memory access 34

RPC Failures Request from client to server lost Reply from server to client lost Server crashes after receiving request Client crashes after sending request These look the same to client 35

Partial Failures Local computing: if machine fails, application fails Distributed computing: if a machine fails, part of application fails - cannot tell the difference between a machine failure and network failure How to make partial failures transparent to client? 36

Strawman Solution Make remote behavior identical to local behavior: every partial failure results in complete failure Abort and reboot the whole system Wait patiently until system is repaired Problems with this solution: Many catastrophic failures Clients block for long periods System might not be able to recover 37

Real Solution: Break Transparency Possible semantics for RPC Exactly-once Impossible in practice At-least-once Only for idempotent operations At-most-once Zero, don t know, or once Zero-or-once Transactional semantics 38

At-Least-Once Keep retrying on client side until you get a response Server just processes requests as normal, doesn t remember anything. Simple! 39

At-Least-Once and Side Effects Client Server Debit(acct, $10) Timeout ACK! Debit(acct, $10) (debit $10) ACK! (debit $10) Time 40

At-Least-Once and Writes Client put(x,10) put(x,20) put(x, 10) Server Timeout put(x, 10) ACK! put(x, 20) x!10 x=20 ACK! x!20 Time 41

At-Least-Once and Writes Client put(x,10) put(x,20) put(x, 10) Server Timeout put(x, 10) ACK! put(x, 20) x!10 x=20 ACK! x!20 x!10 Time 42

At-Least-Once is Ok...... for read-only operations with no side effects Example: read a key s value in a database... if the application has its own functionality to cope with duplication and reordering 43

At-Most-Once Server might get same request twice... Must re-send previous reply, not process request Implies: keep cache of handled requests/responses Discard replies after client confirmed receipt (how?) Must be able to identify requests Same name, same arguments = same request Give each RPC an ID, remember all RPC IDs handled Have client number RPC IDs sequentially, keep sliding window of valid RPC IDs Never re-use IDs! Store on disk, or use boot time, or use big random numbers. 44

What Does the Client Know? If the server receives request with invalid ID (or server has crashed), responds with exception... What does the client know if they receive an exception from an at-most-once server? Maybe the money was transferred, maybe it was not but it was not transferred twice! 45

Exactly-Once Impossible in general Imagine that message triggers an external physical thing Example: medical device injects medicine into patient The device could crash immediately before or after firing and lose its state - don t know which one happened Can, however, make this window very small 46

Implementation Concerns As a general library, performance is often a big concern for RPC systems Major source of overhead: copies and marshaling/unmarshaling overhead Zero-copy tricks Representation: send on the wire in native format and indicate that format with a bit/byte beforehand. What does this do? Think about sending uint32 between two little-endian machines Scatter-gather writes - writev() and friends 47

Traditional RPC 48

Asynchronous RPC (1) 49

Asynchronous RPC (2) 50

Example: Distributed Bitcoin Miner Request client Submits cracking request to server, waits until response Worker Initially a client, sends join request to server, then reverses role and becomes a server, receives requests from main server to attempt cracking over limited range Server (orchestrates the whole thing) When receives request from client, splits it into smaller jobs over limited ranges, farms these out to workers. When finds bitcoin or exhausts complete range, responds to request client. 51

Using RPC Request Server Response Classic synchronous RPC Worker Server Synchronous RPC, but no return value ("I'm a worker and I'm listening for you on host XXX, port YYY ) Server Worker Synchronous RPC would be a bad idea: have to block while worker does its work, which misses the whole point of having many workers Need asynchronous RPC 52

RPC Systems ONC RPC (a.k.a. Sun RPC) Fairly basic, includes encoding standard XDR + language for describing data formats Java RMI (remote method invocation) Very elaborate, tries to make it look like can perform arbitrary methods on remote objects Thrift Developed at Facebook, open-sourced at Apache. Multiple data encodings and transport mechanisms Avro From Hadoop, now part of Apache 53

Important Lessons Procedure calls Simple way to pass control and data Elegant, transparent way to distribute application Hard to provide true transparency Failures, performance, memory access, etc. Practical solution: expose RPC properties to client since can t hide them Application developer must decide how to deal with partial failures (example: e-commerce app vs. game) Worse is better 54