RPC and Threads
Jinyang Li
These slides are based on lecture notes of MIT 6.824.


Why Go?

The labs are based on the Go language. Why Go, as opposed to the popular alternative, C++?
- good support for concurrency
- good support for RPC
- garbage-collected (no use-after-free problems)
- good and comprehensive library
Notable systems built using Go: Dropbox's backend infrastructure, CockroachDB.

Threads

Thread = a thread of execution. Threads allow one program to logically execute many things at once. Each thread has its own per-thread state:
- program counter
- registers
- stack
All threads share memory (they occupy the same address space). Go refers to threads as goroutines.

Why threads, and how many?

Threads are created to exploit concurrency:
- I/O concurrency: while waiting for a response from another machine, process another request (see the sketch below)
- Multicore: threads run in parallel on many CPU cores
Go encourages you to create many goroutines, many more than the number of cores; the Go runtime schedules the goroutines on the available cores. Goroutines are more efficient than C++ threads, but still more expensive than function calls.
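To make the I/O-concurrency point concrete, here is a minimal sketch (my example, not from the slides): each pretend request runs in its own goroutine, so many more goroutines than cores can be in flight at once. fetch and the request names are hypothetical stand-ins for any blocking network call; the done channel used to collect results is introduced a few slides below.

    package main

    import (
        "fmt"
        "time"
    )

    // fetch is a hypothetical stand-in for a blocking network call.
    func fetch(url string) string {
        time.Sleep(100 * time.Millisecond) // pretend to wait on another machine
        return "response for " + url
    }

    func main() {
        urls := []string{"u1", "u2", "u3", "u4"}
        done := make(chan string)
        for _, u := range urls {
            go func(u string) { // one goroutine per request
                done <- fetch(u)
            }(u)
        }
        for range urls { // while one goroutine waits on I/O, others make progress
            fmt.Println(<-done)
        }
    }

All four fetches overlap, so the batch takes roughly one round-trip time instead of four.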

Threading challenges: races

Races arise because of shared memory:

    var x int
    for i := 0; i < 2; i++ {
        go func() {
            x++
        }()
    }

Each x++ is really three machine steps, so the two goroutines can interleave:

    Read x=0 into %rax        Read x=0 into %rax
    Incr %rax to 1            Incr %rax to 1
    Write %rax=1 to x         Write %rax=1 to x

Both goroutines write 1 back, losing an increment. Detect races with: go run -race main.go

Threading challenges: races

Races due to shared memory can be fixed with a mutex:

    var x int
    var mu sync.Mutex
    for i := 0; i < 2; i++ {
        go func() {
            mu.Lock()
            defer mu.Unlock()
            x++
        }()
    }

defer is executed when the enclosing function returns; try to always put mu.Unlock in a defer. Verify with: go run -race main.go

Threading challenges: coordination

Mechanism: Go channels
- for passing information between goroutines
- can be unbuffered or have a bounded-size buffer
- several threads can send/receive on the same channel (the Go runtime uses locks internally)
- sending blocks when a buffered channel is full, or when no thread is ready to receive from an unbuffered channel
- receiving blocks when the channel is empty
- a channel is closed to indicate the end; receiving from a closed (and drained) channel returns immediately with the zero value and ok == false
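A small runnable sketch of these rules (my example, not from the slides): sends fill a bounded buffer without blocking, and receives from a closed, drained channel return immediately with ok == false.

    package main

    import "fmt"

    func main() {
        ch := make(chan int, 2) // buffered channel, capacity 2
        ch <- 1                 // does not block: buffer has room
        ch <- 2                 // does not block: buffer is now full
        // a third send would block here until some goroutine receives
        close(ch) // signal the end; buffered values can still be drained

        v, ok := <-ch // v=1, ok=true
        fmt.Println(v, ok)
        v, ok = <-ch // v=2, ok=true
        fmt.Println(v, ok)
        v, ok = <-ch // closed and drained: v=0, ok=false
        fmt.Println(v, ok)
    }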

Threading challenges: coordination

Mechanism: sync.WaitGroup
- used to wait for a collection of threads to finish
- supports 3 methods:
  - Add(x int): add x (threads) to the collection
  - Done(): called when one thread has finished
  - Wait(): blocks until all threads in the collection have finished

Channels and WaitGroups

    func main() {
        workchan := make(chan int) // unbuffered channel
        for i := 1; i <= 20; i++ {
            workchan <- i // deadlock! main blocks here because no
        }                 // goroutine is ready to receive yet
        for i := 0; i < 5; i++ {
            go func() {
                for {
                    n := <-workchan
                    f := computeFactorial(n)
                    fmt.Printf("n=%d, f=%d\n", n, f)
                }
            }()
        }
    }

The slide also sketches two fixes: send from a separate goroutine (go func() { for i := 1; i <= 20; i++ { workchan <- i } }()), or give the channel a buffer (make(chan int, 20)), as on the next slide.

Channels and WaitGroups

    func main() {
        workchan := make(chan int, 20) // buffer size 20
        for i := 1; i <= 20; i++ {
            workchan <- i
        }
        var wg sync.WaitGroup
        for i := 0; i < 5; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for {
                    n := <-workchan
                    f := computeFactorial(n)
                    fmt.Printf("n=%d, f=%d\n", n, f)
                }
            }()
        }
        wg.Wait()
    }

This still hangs: once workchan is drained, every worker blocks on the receive forever, none ever calls Done, and wg.Wait() never returns. The next slide fixes this by closing the channel.

Channels and WaitGroups

    func main() {
        workchan := make(chan int, 20) // buffer size 20
        for i := 1; i <= 20; i++ {
            workchan <- i
        }
        close(workchan)
        var wg sync.WaitGroup
        for i := 0; i < 5; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for {
                    n, ok := <-workchan // alternative: for n := range workchan
                    if !ok {
                        break
                    }
                    f := computeFactorial(n)
                    fmt.Printf("n=%d, f=%d\n", n, f)
                }
            }()
        }
        wg.Wait()
    }

RPC

RPC is a key piece of infrastructure when building distributed systems. RPC's goals:
- easier to program with than raw sockets
- hides the details of client/server communication

The ideal RPC interface:
- Client: z = fn(x, y)
- Server: fn(x, y) { ... return z }

Example: KV service (server side)

    import "net/rpc"

    type PutArgs struct {
        Key   string
        Value string
    }

    type PutReply struct {
        Err Err
    }

    type KV struct {
        mu       sync.Mutex
        keyValue map[string]string
    }

    // Unexported, so NOT served as an RPC handler.
    func (kv *KV) get(key string) string { ... }

    func (kv *KV) Put(args *PutArgs, reply *PutReply) error {
        kv.mu.Lock()
        defer kv.mu.Unlock()
        kv.keyValue[args.Key] = args.Value
        reply.Err = OK
        return nil
    }

RPC handlers must have a particular signature (two arguments, the second being a pointer, return type error) and must be exported (first letter capitalized).

Example: starting the RPC server

    func startServer() {
        rpcs := rpc.NewServer()
        kv := KV{keyValue: make(map[string]string)}
        rpcs.Register(&kv)
        l, e := net.Listen("tcp", ":8888")
        if e != nil {
            log.Fatal(e)
        }
        go func() {
            for {
                conn, err := l.Accept()
                if err == nil {
                    go rpcs.ServeConn(conn)
                } else {
                    break
                }
            }
            l.Close()
        }()
    }

The server handles each connection, and each request on it, in a separate thread.

Example: client side

    type KVClient struct {
        clt *rpc.Client
    }

    func NewKVClient() *KVClient {
        clt, err := rpc.Dial("tcp", ":8888")
        if err != nil {
            log.Fatal(err)
        }
        return &KVClient{clt: clt}
    }

    func (c *KVClient) Put(key string, value string) {
        args := &PutArgs{Key: key, Value: value}
        reply := PutReply{}
        err := c.clt.Call("KV.Put", args, &reply)
        _ = err // error handling is the topic of the failure slides below
    }

Example: putting it together

    func main() {
        startServer()
        client := NewKVClient()
        client.Put("nyu", "New York University")
        client.Put("cmu", "Carnegie Mellon University")
        fmt.Printf("Get value=%s\n", client.Get("nyu"))
    }

(client.Get, the read-side twin of client.Put, is not shown on the slides.)

RPC software structure

Client side:
- the client application calls a stub
- the RPC library constructs the RPC request (marshaled args plus an RPC handler id), sends it over the network, and waits for the reply

Server side:
- the RPC library receives the request, and its dispatcher identifies the handler to call
- it unmarshals the args, invokes the RPC handler with them, marshals the reply, and sends the RPC response back over the network

Details of the RPC library

Which server function to call?
- in Go RPC, it's specified in the call: Call("KV.Put", ...)

Marshalling: format data structures into a byte stream
- Go RPC forbids channels and functions in RPC args/replies

Binding: how does the client know which server to talk to?
- Go RPC: the client supplies the server's host name
- alternatively, a name service (e.g. rpcbind) maps service names to a server host
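To see what marshalling means concretely, here is a small sketch (my example) using encoding/gob, the codec Go's net/rpc uses by default; PutArgs is the type from the KV example above.

    package main

    import (
        "bytes"
        "encoding/gob"
        "fmt"
    )

    type PutArgs struct {
        Key   string
        Value string
    }

    func main() {
        // Marshal: flatten the struct into a byte stream for the wire.
        var buf bytes.Buffer
        if err := gob.NewEncoder(&buf).Encode(PutArgs{Key: "nyu", Value: "NYU"}); err != nil {
            panic(err)
        }
        fmt.Printf("wire format: %d bytes\n", buf.Len())

        // Unmarshal: rebuild an equivalent struct on the other side.
        var decoded PutArgs
        if err := gob.NewDecoder(&buf).Decode(&decoded); err != nil {
            panic(err)
        }
        fmt.Printf("decoded: %+v\n", decoded)
    }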

RPC challenge

What to do about failures? Lost packets, a broken network, a slow server, a crashed server.

What does a failure look like to the client RPC library? The client does not see a response from the server, and it cannot distinguish between two scenarios:
- the server never saw/processed the request
- the server processed the request, but the reply was lost

RPC behavior under failure: at-least-once?

A simple scheme to handle failure:
- the RPC library waits for the response for a while
- if none arrives, it re-sends the request (re-establishing the network connection if necessary)
- repeat several times
- if there is still no response, return an error to the application client
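A hedged sketch of this scheme (note that Go's own net/rpc does NOT do this, as a later slide explains): try, time out, and re-send a bounded number of times. The attempt function and the numbers are illustrative placeholders.

    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // callAtLeastOnce re-sends attempt() until a reply arrives or we give up.
    // attempt is a hypothetical stand-in for one RPC request/response try.
    func callAtLeastOnce(retries int, timeout time.Duration,
        attempt func() (string, error)) (string, error) {
        for i := 0; i < retries; i++ {
            replyCh := make(chan string, 1) // buffered so a late reply doesn't leak a goroutine
            go func() {
                if r, err := attempt(); err == nil {
                    replyCh <- r
                }
            }()
            select {
            case r := <-replyCh:
                return r, nil // got a response
            case <-time.After(timeout):
                // no response in time: fall through and re-send
            }
        }
        return "", errors.New("no response after retries")
    }

    func main() {
        r, err := callAtLeastOnce(3, time.Second, func() (string, error) {
            return "ok", nil // a real attempt would send the request and wait
        })
        fmt.Println(r, err)
    }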

Perils of at-least-once semantics

    err1 := clt.Call("KV.Put", &PutArgs{"k", "10"}, ...)
    err2 := clt.Call("KV.Put", &PutArgs{"k", "20"}, ...)

What's the expected value stored under k if err1 == nil and err2 == nil? Could it be otherwise?

Perils of at-least-once semantics

A bad interleaving (time flows downward):
- the client sends Put k=10; no reply arrives in time, so the RPC library re-sends Put k=10
- the server executes the first Put k=10, reply=ok
- the client sends Put k=20; the server executes it, reply=ok
- the delayed duplicate Put k=10 now arrives and is executed: k=10 overwrites k=20

Both calls returned ok, yet the final value under k is 10.

Perils of at-least-once

Is at-least-once semantics ever okay?
- if it's okay to repeat operations, e.g. read-only operations (see the sketch below)
- if the application has its own plan for coping with duplicates
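As a concrete instance of "okay to repeat", here is a hedged sketch of a read-only Get handler in the style of the KV example above (GetArgs and GetReply are assumed types mirroring PutArgs and PutReply): re-executing a duplicate Get cannot change server state, so at-least-once delivery is harmless for it.

    type GetArgs struct {
        Key string
    }

    type GetReply struct {
        Err   Err
        Value string
    }

    // Get is read-only, hence idempotent: running a duplicated request
    // a second time returns the same answer and mutates nothing.
    func (kv *KV) Get(args *GetArgs, reply *GetReply) error {
        kv.mu.Lock()
        defer kv.mu.Unlock()
        reply.Value = kv.keyValue[args.Key]
        reply.Err = OK
        return nil
    }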

At-most-once RPC semantics

Idea: the RPC library detects duplicate requests and returns the previous reply instead of re-running the handler.
- the client includes a unique ID (xid) with each request, and uses the same xid for a re-send
- server:

    if oldReply, ok := seen[xid]; ok {
        reply = oldReply
    } else {
        reply = handler()
        seen[xid] = reply
    }
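Expanded into a hedged, self-contained Go sketch (the type and method names are mine, not part of Go's RPC library): replies are cached by xid under a lock, so a re-sent request gets the cached reply instead of re-running the handler.

    import "sync"

    type AtMostOnceServer struct {
        mu   sync.Mutex
        seen map[int64]string // xid -> previously computed reply
    }

    func NewAtMostOnceServer() *AtMostOnceServer {
        return &AtMostOnceServer{seen: make(map[int64]string)}
    }

    func (s *AtMostOnceServer) Dispatch(xid int64, handler func() string) string {
        s.mu.Lock()
        defer s.mu.Unlock()
        if oldReply, ok := s.seen[xid]; ok {
            return oldReply // duplicate: return previous reply, do not re-run
        }
        reply := handler()
        s.seen[xid] = reply
        return reply
    }

Holding mu across handler() is deliberately crude: it also serializes a duplicate that arrives mid-execution (the pending-flag issue discussed below), but at the cost of running only one request at a time.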

Complexities in realizing at-most-once

How to ensure xids are unique?
- big random numbers (must be big)
- unique client ID + sequence number

How to eventually discard old RPC replies?
- possible solution 1: xid = unique client ID + sequence number; clients include x in each request, indicating they have seen all replies <= x; the server discards replies <= x
- possible solution 2: the client agrees to retry for < 5 minutes; the server discards replies after 5+ minutes
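A hedged sketch of possible solution 1 (the names are illustrative): xids become (client ID, sequence number) pairs, and each request carries seenUpTo, the highest sequence number for which the client has seen every reply, letting the server drop older cached replies.

    type xid struct {
        ClientID int64
        Seq      int64
    }

    // discardOldReplies drops cached replies the client has acknowledged.
    // seen maps xid -> cached reply, as in the sketch above.
    func discardOldReplies(seen map[xid]string, clientID, seenUpTo int64) {
        for id := range seen {
            if id.ClientID == clientID && id.Seq <= seenUpTo {
                delete(seen, id) // the client promises never to re-send these
            }
        }
    }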

Complexities in realizing at-most-once

How to handle a duplicate request while the original is still executing?
- the server does not know the reply yet, but must not run the handler twice
- solution: keep a pending flag per executing RPC; wait for it, or ignore the duplicate

What if an at-most-once server crashes and restarts?
- if the duplicate-detection state is in memory, the server will forget it
- possible solution: the server uses a unique number (nonce) on each startup; the client obtains the server's nonce at startup and includes it in every RPC request; the server rejects all requests carrying an old nonce
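A hedged sketch of the nonce idea (type and field names are mine): the server draws a fresh nonce at startup, and a request carrying a stale nonce proves the client's duplicate-detection state predates a crash, so the server rejects it rather than risk re-execution.

    import (
        "errors"
        "math/rand"
    )

    type NonceServer struct {
        nonce int64 // drawn randomly once, at each startup
    }

    func NewNonceServer() *NonceServer {
        return &NonceServer{nonce: rand.Int63()}
    }

    type Request struct {
        ServerNonce int64 // learned by the client from the server at client startup
        Xid         int64
    }

    func (s *NonceServer) Check(req Request) error {
        if req.ServerNonce != s.nonce {
            // The server restarted after the client learned this nonce,
            // so the duplicate table covering old xids has been lost.
            return errors.New("stale server nonce")
        }
        return nil
    }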

At-most-once semantics

    err = clt.Call(...)

What are the possible scenarios?
- if err == nil: the handler was executed exactly once
- if err != nil: the handler was either executed once (but the reply was lost) or not executed at all; the client cannot tell which

Go RPC semantics

Go RPC is at-most-once:
- the client opens a TCP connection and writes the request to it
- TCP may retransmit, but the server's TCP stack filters out duplicates
- there is no retry in the Go RPC library (e.g. it will NOT create another TCP connection if the original one fails)
- Go RPC returns an error if it does not get a reply after a TCP connection error

What about exactly-once?

If an RPC call returns an error, how should the application client respond? Re-transmit to the same server? Re-transmit to a different server replica? Applications need solutions to avoid duplicates. Lab 3 will handle this in the context of a fault-tolerant key-value service capable of handling unbounded client retransmissions.