Beautiful Concurrency with Erlang

Similar documents
Erlang. Joe Armstrong.

Erlang: distributed programming

Erlang. Joe Armstrong

Parallelizing Computations

Phoenix Is Not Your Application. Lance Halvorsen ElixirConf EU 2016

S AMPLE CHAPTER. Erlang AND OTP IN ACTION. Martin Logan Eric Merritt Richard Carlsson FOREWORD BY ULF WIGER MANNING

An Introduction to Erlang

<Urban.Boquist. com> Rev PA

The Actor Model, Part Two. CSCI 5828: Foundations of Software Engineering Lecture 18 10/23/2014

Erlang. Functional Concurrent Distributed Soft real-time OTP (fault-tolerance, hot code update ) Open. Check the source code of generic behaviours

Cliff Moon. Bottleneck Whack-A-Mole. Thursday, March 21, 13

An Introduction to Erlang

Erlang: An Overview. Part 2 Concurrency and Distribution. Thanks to Richard Carlsson for most of the slides in this part

Erlang. Functional Concurrent Distributed Soft real-time OTP (fault-tolerance, hot code update ) Open. Check the source code of generic behaviours

An Introduction to Erlang

Programming Paradigms

Kent Academic Repository

concurrent programming XXL

Robust Erlang (PFP Lecture 11) John Hughes

Parallel Programming in Erlang (PFP Lecture 10) John Hughes

HOW DOES A SEARCH ENGINE WORK?

Erlang 101. Google Doc

Scaling Up from 1000 to 10 Nodes

Advanced Functional Programming, 1DL Lecture 2, Cons T Åhs

Macros, and protocols, and metaprogramming - Oh My!

An Introduction to Erlang

Erlang Programming. for Multi-core. Ulf Wiger. QCON London, March 12 th,

CS5412: TRANSACTIONS (I)

Experiments in OTP-Compliant Dataflow Programming

Functional Programming In Real Life

Erlang functional programming in a concurrent world

Erlang functional programming in a concurrent world Johan Montelius and Vladimir Vlassov

Erlang. An introduction. Paolo Baldan Linguaggi e Modelli per il Global Computing AA 2016/2017

Introduction to Asynchronous Programming Fall 2014

Message Passing. Advanced Operating Systems Tutorial 7

Introduction to Erlang. Franck Petit / Sebastien Tixeuil

Programming Paradigms

Dynamic Types , Spring March 21, 2017

The Mercury project. Zoltan Somogyi

Lesson 3 Transcript: Part 1 of 2 - Tools & Scripting

CS390 Principles of Concurrency and Parallelism. Lecture Notes for Lecture #5 2/2/2012. Author: Jared Hall

SearchWinIT.com SearchExchange.com SearchSQLServer.com

Thinking in a Highly Concurrent, Mostly-functional Language

Replication. Feb 10, 2016 CPSC 416

What did we talk about last time? Finished hunters and prey Class variables Constants Class constants Started Big Oh notation

DISTRIBUTED SYSTEMS [COMP9243] Lecture 1.5: Erlang INTRODUCTION TO ERLANG BASICS: SEQUENTIAL PROGRAMMING 2. Slide 1


FRANCESCO CESARINI. presents ERLANG/OTP. Francesco Cesarini Erlang

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

Erlang and VoltDB TechPlanet 2012 H. Diedrich

What is a programming language?

Analysing and visualising callback modules of Erlang generic server behaviours

CS5412: TRANSACTIONS (I)

API Interlude: Process Creation (DRAFT)

EmberJS A Fitting Face for a D8 Backend. Taylor Solomon

CISC-124. This week we continued to look at some aspects of Java and how they relate to building reliable software.

Programming Languages Fall Prof. Liang Huang

CIO 24/7 Podcast: Tapping into Accenture s rich content with a new search capability

Reminder from last time

Message-passing concurrency in Erlang

Implementing Riak in Erlang: Benefits and Challenges

Erlang - functional programming in a concurrent world. Johan Montelius KTH

Lesson 4 Transcript: DB2 Architecture

ENVIRONMENT MODEL: FUNCTIONS, DATA 18

CSE373: Data Structure & Algorithms Lecture 23: Programming Languages. Aaron Bauer Winter 2014

Recall: Address Space Map. 13: Memory Management. Let s be reasonable. Processes Address Space. Send it to disk. Freeing up System Memory

Erlang. for Concurrent Programming. 26 September 2008 ACM QUEUE rants:

2015 Erlang Solutions Ltd

Processes and Threads

Text Input and Conditionals

Hi, I m SMS SMS BOOTCAMP. Everything you need to know to get started with SMS marketing.

Arranging lunch value of preserving the causal order. a: how about lunch? meet at 12? a: <receives b then c>: which is ok?

NAME SYNOPSIS DESCRIPTION. Behavior of other Perl features in forked pseudo-processes. Perl version documentation - perlfork

Thinking Beyond Search with Solr Understanding How Solr Can Help Your Business Scale. Magento Expert Consulting Group Webinar July 31, 2013

Distributed Systems. Lec 10: Distributed File Systems GFS. Slide acks: Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung

Review of Fundamentals

Programming Language Impact on the Development of Distributed Systems

Bruce Moore Fall 99 Internship September 23, 1999 Supervised by Dr. John P.

Cold, Hard Cache KV? On the implementation and maintenance of caches. who is

Raghav Karol Motorola Solutions. Torben Hoffmann Motorola Solutions

Organising . page 1 of 8. bbc.co.uk/webwise/accredited-courses/level-one/using- /lessons/your- s/organising-

Dealer Reviews Best Practice Guide

Map-Reduce. John Hughes

The HALFWORD HEAP EMULATOR

Agreement and Consensus. SWE 622, Spring 2017 Distributed Software Engineering

Configuring the Oracle Network Environment. Copyright 2009, Oracle. All rights reserved.

CS61A Notes Week 6: Scheme1, Data Directed Programming You Are Scheme and don t let anyone tell you otherwise

Styling your Architecture in an Evolving Concurrent World

VARIABLES AND TYPES CITS1001

Frustrated by all the hype?

Writing Portable Applications for J2EE. Pete Heist Compoze Software, Inc.

Computer Systems Assignment 2: Fork and Threads Package

Chordy: a distributed hash table. Johan Montelius and Vladimir Vlassov. August 29, 2016

The Little Elixir & OTP Guidebook by Tan Wei Hao

Intro. Scheme Basics. scm> 5 5. scm>

I m a really lazy person by nature. I m not lazy in the sense that I like to sit

Compellent Storage Center

6.001 Notes: Section 15.1

CSE 341: Programming Languages

Binghamton University. CS-140 Fall Dynamic Types

Transcription:

Beautiful Concurrency with Erlang Kevin Scaldeferri OSCON 23 July 2008 6 years at Yahoo, building large high-concurrency distributed systems Not an expert, don t use it professionally Dabbled, liked it, want to share what I think is cool

What is Erlang? Strict pure functional language Strong dynamic typing weak structural user-defined types Interpreted Syntax similar to Prolog & ML Concurrency primitives Created at Ericsson for telecom applications in 1987 Not going to talk about syntax, basic language features, etc Go to Francesco Cesarini s talk yesterday.

Erlang Concurrency Primitives spawn - create a process! - send a message to a process receive - listen for a message

Parallelizing Algorithms Quicksort Shamelessly stolen from http:// 21ccw.blogspot.com/2008/05/parallelquicksort-in-erlang-part-ii.html

qsort([]) -> []; qsort([pivot Rest]) -> qsort([ X X <- Rest, X < Pivot]) ++ [Pivot] ++ qsort([ Y Y <- Rest, Y >= Pivot]). Erlang: one of those quicksort in 3 lines languages but... to small to read

qsort([]) -> []; qsort([pivot Rest]) -> qsort([ X X <- Rest, X < Pivot]) ++ [Pivot] ++ qsort([ Y Y <- Rest, Y >= Pivot]).

qsort([]) -> []; qsort([pivot Rest]) -> Left = [ X X <- Rest, X < Pivot], Right = [ Y Y <- Rest, Y >= Pivot], qsort(left) ++ [Pivot] ++ qsort(right). Extract temp variables

qsort([]) -> []; qsort([pivot Rest]) -> Left = [ X X <- Rest, X < Pivot], Right = [ Y Y <- Rest, Y >= Pivot], [SortedLeft, SortedRight] = map(fun qsort/1, [Left, Right]), SortedLeft ++ [Pivot] ++ SortedRight. Add a map(), which looks odd but now we re ready to do some magic

qsort([]) -> []; qsort([pivot Rest]) -> Left = [ X X <- Rest, X < Pivot], Right = [ Y Y <- Rest, Y >= Pivot], [SortedLeft, SortedRight] = pmap(fun qsort/1, [Left, Right]), SortedLeft ++ [Pivot] ++ SortedRight. Now we re running on as many cores as you ve got Who thinks this is a good idea?

Don t try this at home actually 10x slower on my machine spawning a process is fast, but still much slower than a comparison / list cons a better example - web spidering

spider(uris) ->... Links = pmap(fun get_links/1, URIs),... web spider needs to fetch content, parse XML/HTML, extract links Significant speedup here, both from parallelizing network requests and CPU

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> [].

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. pmap uses map

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. but instead of running the function directly, spawns a new process to run it

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> [].

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. apply the function to the list item in the child process

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. then send it back to the parent

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. parent gathers results

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. receive a message from each Pid we spawned

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> []. cons up the return values

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> [].

pmap(f, L) -> S = self(), Pids = map(fun(i) -> spawn(fun() -> pmap_f(s, F, I) end) end, L), pmap_gather(pids). pmap_f(parent, F, I) -> Parent! {self(), (catch F(I))}. pmap_gather([h T]) -> receive {H, Ret} -> [Ret pmap_gather(t)] end; pmap_gather([]) -> [].

Distributed Systems Who uses Twitter? Who s frustrated by twitter? Who s written their own twitter clone?

Twitter Twitter is, fundamentally, a messaging system. Twitter was not architected as a messaging system, however. For expediency's sake, Twitter was built with technologies and practices that are more appropriate to a content management system. -Alex Payne Erlang approach: treat it as a messaging application. Model users by processes sending messages to each other.

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try register(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end.

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try register(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end. create a user record

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try register(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end. spawn a new process to manage the user

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try register(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end. register a name for the process, so we can send using the username rather than pid

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end.

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end. to add a follower

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end. send a message to the user

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end. saying follow that guy

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end.

follow(userpid, OtherName) ->... send(userpid, {follow, OtherName}). send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end. send is just a thin wrapper around! with error handling

Going Global Distribute across multiple machines? Just use global names so far, just running on one machine (can handle tens of thousands, maybe hundreds, of users) eventually need to grow past that to multiple machines. Fortunately this is easy

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try register(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end. just change register

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try global:register_name(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end. to global register

create_user(name) -> User = #user{name=name}, Pid = spawn(fun() -> loop(user) end), try global:register_name(name, Pid) of true -> {ok, Pid} catch error:badarg -> exit(pid, in_use), {error, in_use} end.

send(name, Msg) -> try Name! Msg catch error:badarg -> {error, no_such_user} end. similarly, change!

send(name, Msg) -> try global:send(name, Msg) catch error:badarg -> {error, no_such_user} end. to global:send

send(name, Msg) -> try global:send(name, Msg) catch error:badarg -> {error, no_such_user} end.

Reliable Distributed Systems What if a process crashes?

OTP Open Telecom Platform OTP provides frameworks for common application patterns, and handles reliability by watching and restarting processes

-module(twitterl). -behaviour(gen_server). We ll use the gen_server behaviour (similar to a Java interface)

create_user(name) -> gen_server:start_link( {global, Name},?MODULE, [#user{name=name}], [] ). start_link handles registering names, spawning the process and running the main loop

create_user(name) -> gen_server:start_link( {global, Name},?MODULE, [#user{name=name}], [] ).

create_user(name) -> gen_server:start_link( {global, Name},?MODULE, [#user{name=name}], [] ). using global names again

create_user(name) -> gen_server:start_link( {global, Name},?MODULE, [#user{name=name}], [] ). required callbacks provided by the current module

create_user(name) -> gen_server:start_link( {global, Name},?MODULE, [#user{name=name}], [] ). initial state

follow(username, OtherName) -> gen_server:call( {global, UserName}, {follow, OtherName} ). post(username, Msg) -> gen_server:call( {global, UserName}, {post, Msg} ). other API function follow a common pattern

follow(username, OtherName) -> gen_server:call( {global, UserName}, {follow, OtherName} ). post(username, Msg) -> gen_server:call( {global, UserName}, {post, Msg} ).

follow(username, OtherName) -> gen_server:call( {global, UserName}, {follow, OtherName} ). post(username, Msg) -> gen_server:call( {global, UserName}, {post, Msg} ). make a call to the server

follow(username, OtherName) -> gen_server:call( {global, UserName}, {follow, OtherName} ). post(username, Msg) -> gen_server:call( {global, UserName}, {post, Msg} ). using the global name

follow(username, OtherName) -> gen_server:call( {global, UserName}, {follow, OtherName} ). post(username, Msg) -> gen_server:call( {global, UserName}, {post, Msg} ). follow message

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; set up callbacks for expected messages

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; to follow another user

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; add them to the list of people we re following

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; call the other process

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}};

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; and tell them to add you as a follower

handle_call({follow, Other}, _From, State) -> NewF = [Other State#user.following], gen_server:call( {global, Other}, {add_follower, State#user.name}), {reply, ok, State#user{following=NewF}}; tell gen_server all is good, and the new state

handle_call({add_follower, F}, _From, State) -> NewF = [F State#user.followers], {reply, ok, State#user{followers=NewF}}; the other process adds you to their follower list

handle_call({post, Msg}, map(fun(name) -> _From, State) -> gen_server:cast(name, {posted, State#user.name, Msg}) end, State#user.followers), {reply, ok, State}; to post a message

handle_call({post, Msg}, map(fun(name) -> _From, State) -> gen_server:cast(name, {posted, State#user.name, Msg}) end, State#user.followers), {reply, ok, State}; for each follower

handle_call({post, Msg}, map(fun(name) -> _From, State) -> gen_server:cast(name, {posted, State#user.name, Msg}) end, State#user.followers), {reply, ok, State}; send the message we posted

handle_call({post, Msg}, map(fun(name) -> _From, State) -> gen_server:cast(name, {posted, State#user.name, Msg}) end, State#user.followers), {reply, ok, State}; use cast() because we don t care about any reply

handle_cast({posted, Other, Msg}, State) -> %% really store in DB, SMS, etc io:fwrite("~s ~s~n", [Other, Msg]), {noreply, State}; just for illustrative purposes, print messages we receive. Really stick it in DB, send to SMS, etc

In Summary Erlang makes adding concurrency to your programs nearly trivial

Thank You

Resources http://erlang.org/ http://trapexit.org/ #erlang on freenode Erlang-questions@erlang.org CEAN