Introduction to Erlang. Franck Petit / Sebastien Tixeuil

Introduction to Erlang Franck Petit / Sebastien Tixeuil Firstname.Lastname@lip6.fr

Hello World % starts a comment. ends a declaration Every function must be in a module one module per source file source file name is module name +.erl : used for calling functions in other modules

Recursive Functions Variables start with upper-case characters ; separates function clauses, separates instructions Variables are local to the function clause Pattern matching and guards to select clauses

Recursive Functions -module(mymath). -export([factorial/1]). factorial(0) -> 1; factorial(n) -> N * factorial(n-1). > mymath:factorial(6). 720

Tail Recursion The arity is part of the function name Non-exported functions are local to the module

Tail Recursion -module(mylists). -export([reverse/1]). reverse(l) -> reverse(l, []). reverse([h T], L) -> reverse(t, [H L]); reverse([], L) -> L. > mylists:reverse([3,2,1]). [1,2,3]

Recursion over Lists Pattern-matching selects components of the data _ is a don t care pattern (not a variable) [] is the empty list [X,Y,Z] is a list with exactly three elements [X,Y,Z Tail] has three or more elements

List Recursion with Accumulator The same syntax is used to construct lists Strings are simply lists of character codes Avoid adding data to the end of the list

Numbers Regular numbers 123-34567 12.345-27.45e-05 #-notation for base-n integers 16#ffff $-notation for character codes (ISO-8859-1) $A (->65)

Atoms Must start with lower case character or be quoted friday unquoted_atoms_cannot_contain_blanks A quoted atom with several blanks hello \n my friend Similar to hashed strings use only one word of data constant-time equality test

Tuples Terms separated by, and enclosed in {} {a,12, hello } {1,2,{3, 4},{a,{b,c}}} {} A fixed number of items (similar to structure or record in conventional programming languages) A tuple whose first element is an atom is called a tagged tuple

Other Data Types Functions Binaries Process identifiers All terms are ordered and can be compared with <, >, ==, =:=, etc. References No separate booleans Erlang values in general are called terms

Built-in Functions Implemented in C All the type tests and conversions are BIFs Most BIFs (not all) are in the module erlang Many common BIFs are auto-imported (recognized without writing erlang:... ) Operators ( +, -, *, /,...) are also really BIFs

Standard Libraries Application Libraries Kernel erlang code file inet os Stdlib lists dict sets...

Expressions Boolean and/or/xor are strict (always evaluate both arguments) Use andalso/orelse for short circuit evaluation == for equality, not = Always use parentheses when not absolutely certain about the precedence

Fun Expressions Anonymous functions (lambda expressions) Can have several clauses All variables in the pattern are new All variable bindings in the fun are local Variables bound in the environment can be used in the fun-body

Pattern Matching Match failure causes run-time error Successful matching binds the variables but only if they are not already bound to a value previously bound variables can be used in a pattern a new variable can also be repeated in a pattern

Pattern Matching mylength([]) -> 0; mylength([_ T]) -> mylength(t) + 1.

Case-switches Any number of clauses Patterns and guards, just as in functions ; separates clauses Use _ as catch-all Variables may also begin with underscore signals I don t intend to use this value

If-switches Like a case-switch without the patterns and the when keyword Use true as catch-all factorial(n) when N == 0 -> 1; factorial(n) when N > 0 ->! N * factorial(n - 1).

Switching Pattern Matching factorial(0) -> 1; factorial(n) -> N * factorial(n-1). When factorial(n) when N == 0 -> 1; factorial(n) when N > 0 ->! N * factorial(n - 1). If factorial(n) -> if N == 0 -> 1; N > 0 -> N * factorial(n - 1) end. Case factorial(n) -> 1 case (N) of 0 -> 1; N when N > 0 -> N * factorial(n - 1) end.

List Processing List Processing BIFs atom_to_list(a) float_to_list(f) integer_to_list(i) tuple_to_list(t) list_to_atom(l)... hd(l) tl(l) length(l) List Processing Functions member(x,l) append(l1,l2) reverse(l) delete_all(x,l)

Tuple Processing Tuple Processing BIFs tuple_to_list(t) element(n,t) setelement(n,t,val) size(l)... Multiple Return Values PID, now()...

Catching Exceptions throw: user defined error: runtime errors exit: end process only catch throw exceptions normally

Processes Code is executed by a process A process keeps track of the program pointer, the stack, the variables values, etc. Every process has a unique process identifier Processes are concurrent

Processes: Implementation Virtual machine layer processes Preemptive multitasking Little overhead (e.g. 100.000 processes) Can use multiple CPUs on multiprocessor machines

Concurrency Several processes may use the same program code at the same time each has own program counter, stack, and variables programmer need not think about other processes updating the variables

Message Passing! is the send operator Pid of the receiver is used as the address Messages are sent asynchronously The sender continues immediately Any value can be sent as a message

Echo -module(echo). -export([start/0,loop/0]). start() -> spawn(echo, loop, []). loop() -> receive {From, Message} -> io:format("> echo: ~w Msg: ~w ~n", [self(), Message]), From! Message, loop() end. > Id=echo:start(), Id! {self(),hello}. echo: <0.35.0> Msg: hello {<0.32.0>,hello} >

Message Queues Each process has a message queue (mailbox) incoming messages are placed in the queue (no size limit) A process receives a message when it extracts it from the mailbox need not take the first message in the queue

Receiving a Message receive-expressions are similar to case switches patterns are used to match messages in the mailbox messages in the queue are tested in order only one message can be extracted each time

Selective Receive Patterns and guards permit message selection receive-clauses are tried in order If no message matches, the process suspends and waits for a new message

Receive with Timeout A receive-expression can have an after-part can be an integer (milliseconds) or infinity The process waits until a matching message arrives, or the timeout limit is exceeded soft real-time: no guarantees

Send and Reply Pids are often included in messages (self()), so that the receiver can reply to the sender If the reply includes the Pid of the second process, it is easier for the first process to recognize the reply

Message Order The only guaranteed message order is when both the sender and the receiver are the same for both messages (first-in, firstout)

Selecting Unordered Messages Using selective receive, it is possible to choose which messages to accept, even if they arrive in a different order

Starting Processes The spawn function creates a new process The new process will run the specified function The spawn operation always returns immediately The return value is the Pid of the child

Process Termination A process terminates when: it finishes the function call that it started with there is an exception that is not caught All messages sent to a terminated process will be thrown away Same Pid will not be used before long time

A Stateful Server The parameter variables of a server loop can be used to remember the current state

Hot Code Swapping When using module:function(...), the latest version of module is always used If the server module is recompiled and reloaded, the process will jump to the new code after handling the next message

Registered Processes A process can be registered under a name Any process can send a message to a registered process, or look up the Pid The Pid might change (if the process is restarted and re-registered), but the name stays the same Pid = spawn(?module, server, []), register(myserver, Pid), myserver! Msg.

Links and Exit Signals Any two processes can be linked Links are always bidirectionnal When a process dies, an exit signal is sent to all linked processes, which are also killed normal exit does not kill other processes

Trapping Exit Signals If a process sets its trap_exit flag, all signals will be caught and turned into normal messages process_flag(trap_exit, true) { EXIT, Pid, ErrorTerm} This way, a process can watch other processes

Distribution Running erl with the flag -name xxx starts the Erlang network distribution system makes the virtual machine emulator a node ( xxx@host.domain ) Erlang nodes can communicate over the network (but must find each other first)

Connecting Nodes Nodes are connected the first time they try to communicate The function net_adm:ping(node) is the easiest way to set up a connection between nodes returns pong or pang Send a message to a registered process using {Name,Node}! Message

Distribution is Transparent Possible to send a Pid from one node to another (Pids are unique across nodes) You can send a message to any process through its Pid (even on another node) You can run several Erlang nodes (with different names) on the same computer

Running Remote Processes Variants of the spawn function can start processes directly on another node The module global contains functions for registering and using named processes over the whole network of connected nodes setting global locks

Ports: Talking to the Outside Talks to an external (or linked-in) C program A port is connected to the process that opened it The port sends data to the process in messages A process can send data to the port