Erlang. Joe Armstrong

Similar documents
Erlang. Joe Armstrong.

Robust Erlang (PFP Lecture 11) John Hughes

The do s and don ts of error handling. Joe Armstrong

Erlang 101. Google Doc

Erlang. Functional Concurrent Distributed Soft real-time OTP (fault-tolerance, hot code update ) Open. Check the source code of generic behaviours

Erlang. Functional Concurrent Distributed Soft real-time OTP (fault-tolerance, hot code update ) Open. Check the source code of generic behaviours

concurrent programming XXL

Message Passing. Advanced Operating Systems Tutorial 7

FRANCESCO CESARINI. presents ERLANG/OTP. Francesco Cesarini Erlang

Concurrency Oriented Programming in Erlang

Functional Programming In Real Life

Erlang: distributed programming

Programming Language Impact on the Development of Distributed Systems

Programming Paradigms

The Actor Model applied to the Raspberry Pi and the Embedded Domain. The Erlang Embedded Project. Omer

The Actor Model applied to the Raspberry Pi and the Embedded Domain. Omer

MapReduce in Erlang. Tom Van Cutsem

Erlang and Go (CS262a, Berkeley Fall 2016) Philipp Moritz

<Urban.Boquist. com> Rev PA

An Introduction to Erlang

All you need is fun. Cons T Åhs Keeper of The Code

An Introduction to Erlang

20 Years of Commercial Functional Programming

Akka: Simpler Concurrency, Scalability & Fault-tolerance through Actors. Jonas Bonér Viktor Klang

An Introduction to Erlang

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems

An Introduction to Erlang

Reminder from last time

Beautiful Concurrency with Erlang

An Introduction to Erlang

Lecture 5: Erlang 1/31/12

6.033 Lecture Fault Tolerant Computing 3/31/2014

The Golden Trinity of Erlang How Something Simple Has Real Business Value

Executive Summary. It is important for a Java Programmer to understand the power and limitations of concurrent programming in Java using threads.

DS 2009: middleware. David Evans

Erlang. An introduction. Paolo Baldan Linguaggi e Modelli per il Global Computing AA 2016/2017

DISTRIBUTED SYSTEMS [COMP9243] Lecture 1.5: Erlang INTRODUCTION TO ERLANG BASICS: SEQUENTIAL PROGRAMMING 2. Slide 1

Erlang: An Overview. Part 2 Concurrency and Distribution. Thanks to Richard Carlsson for most of the slides in this part

Programming Paradigms

Parallel Programming in Erlang (PFP Lecture 10) John Hughes

A History of the Erlang VM

ERLANG EVOLVES FOR MULTI-CORE AND CLOUD ENVIRONMENTS

Fault Tolerance. Goals: transparent: mask (i.e., completely recover from) all failures, or predictable: exhibit a well defined failure behavior

Lecture 8. Linda and Erlang

The Actor Model, Part Two. CSCI 5828: Foundations of Software Engineering Lecture 18 10/23/2014

MODELS OF DISTRIBUTED SYSTEMS

Piqi-RPC. Exposing Erlang services via JSON, XML and Google Protocol Buffers over HTTP. Friday, March 25, 2011

Architectural Blueprint

Erlang. The Good the Bad and the Ugly. Joe Armstrong

Distributed Systems. Characteristics of Distributed Systems. Lecture Notes 1 Basic Concepts. Operating Systems. Anand Tripathi

Distributed Systems. Characteristics of Distributed Systems. Characteristics of Distributed Systems. Goals in Distributed System Designs

General Overview of Mozart/Oz

Erlang functional programming in a concurrent world

The case for reactive objects

Erlang functional programming in a concurrent world Johan Montelius and Vladimir Vlassov

G Distributed Systems: Fall Quiz I

Lecture 9. Part I. Overview of Message Passing. Communication Coupling. Decoupling Blackboard. Decoupling Broadcast. Linda and Erlang.

Fault Tolerance Causes of failure: process failure machine failure network failure Goals: transparent: mask (i.e., completely recover from) all

CSC 261/461 Database Systems Lecture 20. Spring 2017 MW 3:25 pm 4:40 pm January 18 May 3 Dewey 1101

MODELS OF DISTRIBUTED SYSTEMS

Message-passing concurrency in Erlang

SCALARIS. Irina Calciu Alex Gillmor

Producer Consumer in Ada

Distributed Systems. The main method of distributed object communication is with remote method invocation

Concurrent & Distributed Systems Supervision Exercises

Erlang and Concurrency. André Pang Rising Sun Research

Remote Invocation. Today. Next time. l Overlay networks and P2P. l Request-reply, RPC, RMI

The development of Erlang. Joe Armstrong. Computer Science Laboratory. Ericsson Telecom AB

The Actor Model. CSCI 5828: Foundations of Software Engineering Lecture 13 10/04/2016

02 - Distributed Systems

02 - Distributed Systems

CloudI Integration Framework. Chicago Erlang User Group May 27, 2015

Operating Systems (ECS 150) Spring 2011

CprE Fault Tolerance. Dr. Yong Guan. Department of Electrical and Computer Engineering & Information Assurance Center Iowa State University

Two Phase Commit Protocol. Distributed Systems. Remote Procedure Calls (RPC) Network & Distributed Operating Systems. Network OS.

Thinking in a Highly Concurrent, Mostly-functional Language

Erlang and VoltDB TechPlanet 2012 H. Diedrich

CPSC 213. Introduction to Computer Systems. The Operating System. Unit 2e

Haskell in the corporate environment. Jeff Polakow October 17, 2008

Distributed Systems Fault Tolerance

The Ideas in Erlang. Joe Armstrong

Consistency. CS 475, Spring 2018 Concurrent & Distributed Systems

Today: Fault Tolerance

Architectural Blueprint The 4+1 View Model of Software Architecture. Philippe Kruchten

Concurrent Object-Oriented Programming

Concurre Concurr nt n Systems 8L for Part IB Handout 3 Dr. Steven Hand 1

Today: Fault Tolerance. Fault Tolerance

Erlang - functional programming in a concurrent world. Johan Montelius KTH

3C05 - Advanced Software Engineering Thursday, April 29, 2004

NewSQL Database for New Real-time Applications

Today: Fault Tolerance. Failure Masking by Redundancy

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 22: Remote Procedure Call (RPC)

Distributed KIDS Labs 1

Distributed Systems. Before We Begin. Advantages. What is a Distributed System? CSE 120: Principles of Operating Systems. Lecture 13.

Processes. CS439: Principles of Computer Systems January 30, 2019

EOS: An Extensible Operating System

1. Introduction: Key Features of Erlang

Distributed Transaction Management 2003

Today: Fault Tolerance. Replica Management

CSE544 Database Architecture

Transcription:

Erlang Joe Armstrong

Though OOP came from many motivations, two were central. The large scale one was to find a better module scheme for complex systems involving hiding of details, and the small scale one was to find a more flexible version of assignment, and then to try to eliminate it altogether....doing encapsulation right is a commitment not just to abstraction of state, but to eliminate state oriented metaphors from programming. The Early History of Smalltalk Alan Kay

for i in {objects, processes} { create very large numbers of $i $i work the same way on all OS's $i's are garbage collected $i are location transparent $i cannot damage other $i $i are defined in the language creating and destroying $i is light-weight }

Erlang is Smalltalk as Alan Kay wanted it - Niall Dalton

How do we build systems that run forever, are scalable, fault-tolerant, evolve with time and work reasonably well works despite errors in the software?

Difficult

To make a fault-tolerant system you need at least two computers

this is Distributed Programming

Simplify the problem no sharing pure message passing no locks

This is Concurrency Oriented Programming

Concurrency Oriented Programming A style of programming where concurrency is used to structure the application Large numbers of processes Complete isolation of processes No sharing of data Location transparency Pure message passing My first message is that concurrency is best regarded as a program structuring principle Structured concurrent programming Tony Hoare Redmond, July 2001

COP Design Rules 1) Identify the concurrent operations in your problem 2) Identify the message channels 3) Write down the set of message seen on each channel 4) Write down the protocols 5) Write the code Try to make the design isomorphic to the problem ie a 1:1 correspondence between the process/message structure in the model and the problem.

Who am I? Inventor of Erlang, UBF Chief designer of OTP Founder of the company Bluetail Currently Senior System Architect Ericsson AB Current Interests Concurrency Oriented Programming Multi-core CPUs FPGAs Cats Motorbikes

How do we correct hardware failures? Replicate the hardware How do we correct software errors? Having two identical copies of the software won't work both will fail at the same time and for the same reason Why does your computer crash? Which fails more often, hardware or software?

Talk organisation Introduction Architecture Erlang Building an application OTP It works! Case studies Programming faulttolerant systems Programming techniques API's and protocols Conclusions

History 1986 Pots Erlang (in Prolog) 1987 ACS/Dunder 1988 Erlang -> Strand (fails) 1989 JAM (Joe's abstract machine) 1990 Erlang syntax changes (70x faster) 1991 Distribution 1992 Mobility Server 1993 Erlang Systems AB 1995 AXE-N collapses. AXD starts 1996 OTP starts 1998 AXD deployed. Erlang Banned. Open Source Erlang. Bluetail formed 1999 BMR sold 2000 Alteon buys Blutail. Nortel buys Alteon 2002 UBF. Concurrency Oriented Programming 2003 Ph.D. Thesis - Making reliable systems 2006 Multi-core Erlang

How do we make systems? Systems are made of black boxes (components) Black boxes execute concurrently Black boxes communicate How the black box works internally is irrelevant Failures inside one black box should not crash another black box

Problem domain Highly concurrent (hundreds of thousands of parallel activities) Real time Distributed High Availability (down times of minutes/year never down) Complex software (million of lines of code) Continuous operation (years) Continuous evolution In service upgrade

Architecture Philosophy Way of doing things Construction Guidelines Programming examples

Philosophy Concurrency Oriented Programming 1. COPLs support processes 2. Processes are Isolated 3. Each process has a unique unforgeable Id 4. There is no shared state between processes 5. Message passing is unreliable 6. It should be possible to detect failure in another processes and we should know the reason for failure

System requirements R1. Concurrency processes R2. Error encapsulation isolation R3. Fault detection what failed R4. Fault identification why it failed R5. Live code upgrade evolving systems R6. Stable storage crash recovery

Isolation. Hardware components operate concurrently are isolated and communicate by message passing

Consequences of Isolation Processes have share nothing semantics and data must be copied Message passing is the only way to exchange data Message passing is asynchronous

Processes GOOD STUFF Copying Message passing

Language My program should not be able to crash your program Need strong isolation and concurrency Processes are OK threads are not (threads have shared resources) Can't use OS processes (Heavy semantics depends on OS)

Isolation My program should not be able to crash your program. This is the single most important property that a system component must have All things are not equally important

Erlang Lightweight processes (lighter than OS threads) Good isolation (not perfect yet...) Programs never loose control Error detection primitives Reason for failure is known Exceptions Garbage collected memory Lots of processes Functional Agner Krarup Erlang (1878-1929)

Erlang in 11 minutes

Erlang You can create a parallel process Pid = spawn(fun() ->... end). then send it a message Pid! Msg It typically takes 1 microsecond to create a process or send a message and then wait for a reply receive {Pid, Rely} -> Actions end Processes are isolated

Generalisation Client Pid = spawn(fun() -> loop() end) Pid! {self(), 21}, receive {Pid, Val} ->... end Server loop() -> receive {From, X} -> From! {self(), 2*X}, loop() end. A simple process Client Double = fun(x) -> 2 *X end, Pid = spawn(fun() -> loop(double) end) Pid! {self(), 21}, receive {Pid, Val} ->... end Server loop(f) -> receive {From, X} -> From! {self(), F(X)}, loop(f) end. Generalised

A generic server -module(gserver). -export([start/1, rpc/2, code_change/2]). start(fun) -> spawn(fun() -> loop(fun) end). rpc(pid, Q) -> Pid! {self(), Q}, receive {Pid, Reply} -> Reply end. code_change(pid, Fun1) -> Pid! {swap_code, Fun1}. loop(f) -> receive {swap_code, F1} -> loop(f1); {Pid, X} -> Pid! {self(), F(X)}, loop(f); end. Double = fun(x) -> 2*X end, Pid = gserver:start(double),... Triple = fun(x) -> 3*X end, gserver:code_change(pid, Triple)

A generic server with data -module(gserver). -export([start/2, rpc/2, code_change/2]). start(fun, Data) -> spawn(fun() -> loop(fun, Data) end). rpc(pid, Q) -> Pid! {self(), Q}, receive {Pid, Reply} -> Reply end. loop(f, Data) -> receive {swap_code, F1} -> loop(f1, Data); {Pid, X} -> {Reply, Data1} = F(X), Pid! {self(), Reply}, loop(f, Data1); end. code_change(pid, Fun1) -> Pid! {swap_code, Fun1}.

Trapping errors In Pid1... Pid2 = spawn_link(fun() ->... end). process_flag(trap_exit, true)... receive {'EXIT', Pid, Why} -> Actions end. Pid1 {'EXIT', Pid1, badarith} 1/0 Pid2 Pid1 error detection + reason for failure (slide 10)

Why remote trapping of errors? To do fault-tolerant computing you need at least TWO computers Computer1 Error Computer2 {'EXIT', Computer2, Reason} Computer1 Which means you can't share data

Programming for errors If you can't do what you want to do try and do something simpler Links Supervisor Workers The supervisor monitors the workers and restarts them if they fail

A supervision hierarchy Links Supervisor Supervisor and worker Workers Links Workers

OTP behaviours Generic libraries for building components of a real-time system. Includes Client-server Finite State machine Supervisor Event Handler Applications Systems

case studies Ericsson AXD301 (in Engine) Size = 1136150 lines Erlang Dirty functions = 0.359% Availability = 99.9999999% Alteon (Nortel) SSL accelerator Size = 74440 line Erlang Dirty functions = 0.82% Ref: Armstrong Ph.D. thesis

Commercial Successes Ericsson AXD301 (part of Engine ) Ericsson GPRS system Alteon (Nortel) SSL accelerator Alteon (Nortel) SSL VPN Teba Bank (credit card system South Africa) T-mobile SMS system (UK) Kreditor (Sweden) jabber.org

How do we make systems? Systems are made of black boxes (components) Black boxes execute concurrently Protocol checker Black boxes communicate with defined (universal) protocols The protocol is checked externally How the black box works internally is irrelevant

APIs done wrong +type file:open(filename(), read write) -> {ok, filehandle()} {error, string()}. +type file:read_line(filehandle()) -> {ok, string()} eof. +type file:close(filehandle()) -> true. +deftype filename() = [int()] +deftype string() = [int()]. +deftype filehandle() = pid(). silly() -> {ok, H} = file:open("foo.dat", read), file:close(h), file:read_line(h).

APIs with state +type start x file:open(filename(), read write) -> {ok, filehandle()} x ready {error, string()} x stop. +type ready x file:read_line(filehandle()) -> {ok, string()} x ready eof x ateof. +type ateof ready x file:close(filehandle()) -> true x stop. +type ateof ready x file:rewind(filehandle()) -> true x ready. silly() -> {ok, H} = file:open("foo.dat", read), file:close(h), file:read_line(h).

Protocols or APIs +state start x {open, filename(), read write} -> {ok, filehandle()} x ready {error, string()} x stop. How things work inside the black box is irrelevant +state ready x {read_line, filehandle()} -> {ok, string()} x ready eof x ateof. +state ready ateof x {close, filehandle()}-> true x stop. +state ready ateof x {rewind, filehandle()) -> true x ready Check the protocol at the boundaries to the black box

Finally My program should not be able to crash your program. This is the single most important property that a system component must have All things are not equally important