Googles Approach for Distributed Systems. Slides partially based upon Majd F. Sakr, Vinay Kolar, Mohammad Hammoud and Google PrototBuf Tutorial

Similar documents
s Protocol Buffer Knight76 at gmail.com

Protocol Buffers, grpc

CSCI-1680 RPC and Data Representation. Rodrigo Fonseca

Semesterprojekt Implementierung eines Brettspiels (inklusive computergesteuerter Spieler) Remote Procedure Calls

Piqi-RPC. Exposing Erlang services via JSON, XML and Google Protocol Buffers over HTTP. Friday, March 25, 2011

CSCI-1680 RPC and Data Representation. Rodrigo Fonseca

CSCI-1680 RPC and Data Representation John Jannotti

CloudI Integration Framework. Chicago Erlang User Group May 27, 2015

CS 378 Big Data Programming

grpc - A solution for RPCs by Google Distributed Systems Seminar at Charles University in Prague, Nov 2016 Jan Tattermusch - grpc Software Engineer

External data representation

Contents. Java RMI. Java RMI. Java RMI system elements. Example application processes/machines Client machine Process/Application A

Google Protocol Buffers for Embedded IoT

Teach your (micro)services speak Protocol Buffers with grpc.

C 1. Recap: Finger Table. CSE 486/586 Distributed Systems Remote Procedure Call. Chord: Node Joins and Leaves. Recall? Socket API

Transitioning from C# to Scala Using Apache Thrift. Twitter Finagle

Distributed Systems COMP 212. Lecture 8 Othon Michail

Apache Thrift Introduction & Tutorial

About the Tutorial. Audience. Prerequisites. Disclaimer & Copyright. Avro

Apache Avro. Sharing and managing data efficiently

Announcements. me your survey: See the Announcements page. Today. Reading. Take a break around 10:15am. Ack: Some figures are from Coulouris

Communication Paradigms

Comparative Survey of Object Serialization Techniques and the Programming Supports

Chapter 4 Communication

Introduction to Programming Using Java (98-388)

DS 2009: middleware. David Evans

Compaq Interview Questions And Answers

ΠΙΝΑΚΑΣ ΠΛΑΝΟΥ ΕΚΠΑΙΔΕΥΣΗΣ

DISTRIBUTED SYSTEMS. Instructor: Prasun Dewan (FB 150,

MTAT Enterprise System Integration. Lecture 2: Middleware & Web Services

Java J Course Outline

Programming Web Services in Java

Cloud Computing CS

RMI & RPC. CS 475, Spring 2019 Concurrent & Distributed Systems

Interprocess Communication

Contents. Figures. Tables. Examples. Foreword. Preface. 1 Basics of Java Programming 1. xix. xxi. xxiii. xxvii. xxix

{ REST } vs. Battle of API s

RProtoBuf: An R API for Protocol Buffers

CS 417 9/18/17. Paul Krzyzanowski 1. Socket-based communication. Distributed Systems 03. Remote Procedure Calls. Sample SMTP Interaction

Operating Systems. 18. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Spring /20/ Paul Krzyzanowski

Distributed Systems. 03. Remote Procedure Calls. Paul Krzyzanowski. Rutgers University. Fall 2017

Remote Invocation. 1. Introduction 2. Remote Method Invocation (RMI) 3. RMI Invocation Semantics

Projects and Apache Thrift. August 30 th 2018 Suresh Marru, Marlon Pierce

15CS45 : OBJECT ORIENTED CONCEPTS

RMI: Design & Implementation

Distributed Programming and Remote Procedure Calls (RPC): Apache Thrift. George Porter CSE 124 February 19, 2015

Cross-Platform Data Models and API Using grpc

Communication. Distributed Systems Santa Clara University 2016

Java SE7 Fundamentals

Structured communication (Remote invocation)

Executive Summary. It is important for a Java Programmer to understand the power and limitations of concurrent programming in Java using threads.

OCPP Implementation Guide Protocol Buffers & MQTT

A Scalable and Reliable Message Transport Service for the ATLAS Trigger and Data Acquisition System

CORBA (Common Object Request Broker Architecture)

Chapter 2. Application Layer. Chapter 2: Application Layer. Application layer - Overview. Some network apps. Creating a network appication

Introduction to Apache Kafka

3. Remote Procedure Call

Outline. EEC-681/781 Distributed Computing Systems. The OSI Network Architecture. Inter-Process Communications (IPC) Lecture 4

Course Description. Learn To: : Intro to JAVA SE7 and Programming using JAVA SE7. Course Outline ::

Chapter 4. Internet Applications

Coroutines & Data Stream Processing

Short Notes of CS201

Advanced Java Programming. Networking

MCSA Universal Windows Platform. A Success Guide to Prepare- Programming in C# edusum.com

Today: Distributed Objects. Distributed Objects

Communication. Overview

CS201 - Introduction to Programming Glossary By

JXTA TM Technology for XML Messaging

Chapter 4: Inter-process Communications

Distributed Systems Lecture 2 1. External Data Representation and Marshalling (Sec. 4.3) Request reply protocol (failure modes) (Sec. 4.

Distributed Xbean Applications DOA 2000

CSci Introduction to Distributed Systems. Communication: RPC

The Umbilical Cord And Alphabet Soup

Distributed Systems 8. Remote Procedure Calls

A Brief History of Distributed Programming: RPC. YOW Brisbane 2016

Selenium Testing Course Content

Page 1

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

Notes. Submit homework on Blackboard The first homework deadline is the end of Sunday, Feb 11 th. Final slides have 'Spring 2018' in chapter title

Cloud Computing CS

03 Remote invoaction. Request-reply RPC. Coulouris 5 Birrel_Nelson_84.pdf RMI

Distributed Systems. 29. Distributed Caching Paul Krzyzanowski. Rutgers University. Fall 2014

IBD Intergiciels et Bases de Données

Writing REST APIs with OpenAPI and Swagger Ada

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI

RPC flow. 4.3 Remote procedure calls IDL. RPC components. Procedure. Program. sum (j,k) int j,k; {return j+k;} i = sum (3,7); Local procedure call

A Lightweight and High Performance Remote Procedure Call Framework for Cross Platform Communication

Elliotte Rusty Harold August From XML to Flat Buffers: Markup in the Twenty-teens

Remote Invocation. Today. Next time. l Indirect communication. l Request-reply, RPC, RMI

Programming by Delegation

Apache Multipurpose Infrastructure for Network Applications Building Scalable Network Applications

DOWNLOAD PDF CORE JAVA APTITUDE QUESTIONS AND ANSWERS

(800) Toll Free (804) Fax Introduction to Java and Enterprise Java using Eclipse IDE Duration: 5 days

spindle Documentation

CMSC 330: Organization of Programming Languages

COURSE DETAILS: CORE AND ADVANCE JAVA Core Java

Core Java SYLLABUS COVERAGE SYLLABUS IN DETAILS

Advanced Topics in Operating Systems

Unit 7: RPC and Indirect Communication

Introduction to RPC, Apache Thrift Workshop. Tomasz Powchowicz

Transcription:

Protocol Buffers Googles Approach for Distributed Systems Slides partially based upon Majd F. Sakr, Vinay Kolar, Mohammad Hammoud and Google PrototBuf Tutorial https://developers.google.com/protocol-buffers/docs/tutorials

Google Case Study Google provides a case study of Distributed Systems Scalability: Google has scaled from an initial production system in 1998 to handling 63'000 per Second. Reliability and Fault-tolerance: The main search engine has never experienced an outage. Users can expect a query result in 0.2 seconds 24x7 availability with 99.9% Service Level Agreement (for paying customers) Variety of software applications are supported on Google Infrastructure Google has built a generic infrastructure that handles many varying web applications (search, maps, voice, email, social networking, ads, docs, ) Google is offering Platform As A Service With the launch of Google App Engine, Google is stepping beyond providing software services It is now providing its distributed systems infrastructure for application developers 2 von 30

Google Infrastructure Physical Model Google has created a large distributed system from commodity PCs Commodity PC Data Centers Cluster Approx 30 racks (around 2400 PCs) 2 high-bandwidth switches (each rack connected to both the switches for redundancy) Placement and replication generally done at cluster level 3 von 30

Google Infrastructure Conceptual HW View 4 von 30

Google Infrastructure - Logical View Google has developed a set of services that are tailored for various applications running on Google infrastructure We will study two important communication paradigms developed by Google Google Protocol Buffer Google Publish-Subscribe 5 von 30

Communication Protocols 6 von 30

Google Publish-Subscribe Google adopts a topic-based PS system A number of channels for event streams with channels corresponding to particular topics Event contains the following fields: Header Set of keywords Payload: Opaque to the programmer Subscription request specify Channel Filter defined over the set of keywords Reliability guarantees at least once 7 von 30

Google Protocol Buffer Google adopts a minimal (but efficient) remote invocation service Recall that: Remote invocation requires among all the other services the following two components Serialization of data Agreement on data representation (data-type size and format) Protocol Buffer (PB) is a common serialization format for Google 8 von 30

Data serialization (in Java): 1. RMI/Java built-in serialization Ex.: writeobject/readobject(aobject) Easy to use, but inefficient in terms of space (extra fields) No language interoperability 3. CORBA Part of Java But deprecated technology Language interoperable 2. Manual binary encoding Ex.: 4 ints as - and aobject.getbytes() Space efficient, but time efficiency depends on parsing methods Difficult for complex objects Language interoperable 3. Human-readable formats Ex.: Using XML, JSON, DOM, SAX, STAX, JAXB, JAXP, etc. Inefficient (space and time w/ human readable format) Language interoperable 9 von 30

Comparison with some other Products Google Google Apache Apache (Hadoop) (Hadoop) Apache Apache Tony Ng 10 von 30

Protocol Buffer Protocol Buffers (Protobuf) is Google's approach to data serialization: A description language A compiler A library Easy-to-use, efficient automatic binary encoding Created by and in production at Google Inc. Publicly launched in 2008. Language interoperable: Java, Go, C++, and Python, C, C#, Erlang, Perl, PHP, Ruby, etc. 11 von 30

Goal of Protocol Buffer The goal of Protocol Buffer is to provide a language- and platform-neutral way to specify and serialize data such that: Serialization process is efficient, extensible and simple to use Serialized data can be stored or transmitted over the network In Protocol buffers, Google has designed a language to specify messages 12 von 30

Sample for an IDL Defintion (.proto File) Define the interface in an IDL File syntax="proto2"; package tutorial; option java_package = "com.example.tutorial"; option java_outer_classname = "AddressBookProtos"; option java_generic_services = true; // generate stubs for rpc message Person { required string name = 1; required int32 id = 2; optional string email = 3; enum PhoneType { MOBILE = 0; HOME = 1; WORK = 2; message PhoneNumber { required string number = 1; optional PhoneType type = 2 [default = HOME]; repeated PhoneNumber phone = 4; 13 von 30

Protocol Buffer Language Message contains uniquely numbered fields Field is represented by <field-type, data-type, field-name, encoding-value, [default value]> Available data-types Primitive data-type int, float, bool, string, raw-bytes Enumerated data-type Nested Message Allows structuring data into an hierarchy 14 von 30

Protocol Buffer Language (cont d) Field-types can be: Required fields Optional fields Repeated fields Dynamically sized array Encoding-value A unique number (=1,=2, ) represents a tag that a particular field has in the binary encoding of the message 15 von 30

Compiling the Description Compile the *.proto File c:> protoc --java_out=. name_of_protoc.proto File generated 16 von 30

Using in a Java Program Import in your Source Instantiate an Object The Builder class: Messages are immutable in protocol buffer, Builder class is mutable 17 von 30

File as Storage Write to a file Read from a file 18 von 30

TCP/IP Socket for Date Transfer Server Side (receive data) Client Side (send date) 19 von 30

UDP for Date Transfer Receive Data Send Data 20 von 30

Remote Procedure Call 21 von 30

Extensibility of PB In addition to being language- and platform-neutral, PBs are also agnostic with respect to underlying RPC protocol PB library provides two abstract interfaces: RpcChannel: Provides a common interface to underlying communication e.g., Programmer can specify if HTTP or FTP has to be used for communicating data RpcController: Providing common control interface 22 von 30

Supporting RPC using Protocol Buffers PB produces a serialized data that can be used for storage or communications Most common use is to use PB for RPCs Example: RequestType can correspond to list of keywords ResponseType can then correspond to a list of books matching the keywords service SearchService { rpc Search(RequestType) returns (ResponseType) protoc compiler takes this specification and produces Abstract interface SearchService A stub that supports type-safe RPC calls Attention only if: option java_generic_services = true; 23 von 30

The Definition of a Service Interface proto-file to describe the Messages and Services syntax="proto2"; package tutorial; a a "wrapper" "wrapper" class class HelloProtos HelloProtos is is generated generated option java_package = "com.example.tutorial"; option java_outer_classname = "HelloProtos"; //if you don't do this, protoc wont generate the stubs you need for rpc option java_generic_services = true; message HelloRequest { required string name = 1; message HelloReply { required string message = 1; //In generated class, this class is abstract class //that extends service method need to extends this service GreetingService { rpc sayhello(hellorequest) returns (HelloReply); c:> protoc --java_out=. name_of_protoc.proto 24 von 30

Implementation of the Services MyServiceImpl extends abstract generated Class "Service" package com.example.tutorial; import java.util.logging.*; import com.google.protobuf.rpccallback; import com.google.protobuf.rpccontroller; import com.example.tutorial.helloprotos.*; public class MyGreetingServiceImpl extends GreetingService { private static final Logger log =Logger.getLogger(MyGreetingServiceImpl.class.getName()); private static HelloReply createreplymessage(string name) { HelloReply.Builder helloreplybuilder = HelloReply.newBuilder(); helloreplybuilder.setmessage("hello:"+name); return helloreplybuilder.build(); get get all all inner inner classes classes @Override public void sayhello(rpccontroller controller, HelloRequest hellorequest, RpcCallback<HelloReply> done) { log.info("got request:"+hellorequest.getname()); HelloReply helloreply = createreplymessage(hellorequest.getname()); done.run(helloreply); 25 von 30

Implementation of the Services Register and Start the Server package com.example.tutorial; import java.util.concurrent.executors; import com.googlecode.protobuf.socketrpc.rpcserver; import com.googlecode.protobuf.socketrpc.serverrpcconnectionfactory; import com.googlecode.protobuf.socketrpc.socketrpcconnectionfactories; public class ServerCode { public static void main(string[] args) { ServerRpcConnectionFactory rpcconnectionfactory = SocketRpcConnectionFactories.createServerRpcConnectionFactory( 4446); RpcServer server = new RpcServer(rpcConnectionFactory, Executors.newFixedThreadPool(5), true); server.registerservice(new MyGreetingServiceImpl()); server.run(); 26 von 30

Client for Calling the Service Connector Methods for Sync and Blocking Calls package com.example.tutorial; import com.example.tutorial.helloprotos.*; import com.google.protobuf.*; import java.util.logging.*; import com.googlecode.protobuf.socketrpc.*; import java.util.concurrent.*; public class ProtoClient { private static String host = "127.0.0.1"; private static int port = 4446; private static final Logger log = Logger.getLogger(ProtoClient.class.getName() ); private static GreetingService.Stub connecttoasyncservice() { // Create a thread pool ExecutorService threadpool = Executors.newFixedThreadPool(1); // Create channel RpcConnectionFactory connectionfactory = SocketRpcConnectionFactories.createRpcConnectionFactory(host, port); RpcChannel channel = RpcChannels.newRpcChannel(connectionFactory,threadPool); GreetingService.Stub myservice = GreetingService.newStub(channel); return myservice; private static GreetingService.BlockingInterface connecttoblockingservice() { GreetingService.BlockingInterface helloservice; RpcConnectionFactory connectionfactory = SocketRpcConnectionFactories.createRpcConnectionFactory(host, port); BlockingRpcChannel channel = RpcChannels.newBlockingRpcChannel(connectionFactory); helloservice = GreetingService.newBlockingStub(channel); return helloservice;... 27 von 30

Client for Calling the Service Do an Async or Blocking Call private static HelloRequest createrequestmessage(string name) { HelloRequest.Builder hellorequestbuilder = HelloRequest.newBuilder(); hellorequestbuilder.setname(name); return hellorequestbuilder.build(); public static void main(string[] args) throws Exception { RpcController controller = new SocketRpcController(); HelloRequest hellorequest = createrequestmessage("hugo"); // Call service sync GreetingService.BlockingInterface myservicerpc = connecttoblockingservice(); HelloReply reply = myservicerpc.sayhello(controller,hellorequest); log.info("received Sync Response: " + reply.getmessage()); // Call service async GreetingService.Stub myservice = connecttoasyncservice(); myservice.sayhello(controller, hellorequest, new RpcCallback<HelloReply>() { public void run(helloreply reply) { log.info("received Aync Response: " + reply.getmessage()); ); Thread.sleep(1000); System.exit(0); 28 von 30

Java Installation (Version 2.4.1) Minimal for Java Protoc Compiler (protoc-2.4.1-win32.zip) for Windows https://github.com/google/protobuf/releases/tag/v2.4.1 Protobuff Java Runtime http://search.maven.org/remotecontent?filepath=com/google/protobuf/protobuf-java/2.4.1/protobufjava-2.4.1.jar Protobuff socket-rpc https://github.com/sdeo/protobuf-socket-rpc/blob/master/downloads/protobuf-socket-rpc-2.0.jar 29 von 30

Compare PB with traditional RPCs In messages, field-types are encoded as numbers. Hence, lesser data needs to be communicated RPCs using PB restricts single input parameter and single result parameter Supports extensibility and software reusability Pushes the complexity towards data Programmer can control protocols used for communication by writing their own RpcChannel But, we have studied that RPC was designed to relieve programmer from communication. Discuss this dilemma. 30 von 30