INF672/ACN910 Protocol Safety and Verification

Karthik Bhargavan, Xavier Rival, Thomas Clausen. http://prosecco.inria.fr/personal/karthik/teaching/ACN910.html

Networking: routers, endpoints, protocol stacks. Concerns: speed (latency, b/w), interoperability, fault-tolerance, preventing attacks. Formal Verification: models, programs, automated analysis. Concerns: termination, crash-freedom, correctness/finding bugs, security theorems. This course: Can we use formal verification to build more robust protocols?

Course Outline. Lecture 1 [Today, Sep 15]: Introduction, Motivating Examples. Lectures 2-4 [Sep 22, 29, Oct 6]: Network Protocol Verification: Spin. Lectures 5-8 [Oct 13, 20, 27, Nov 3]: Program Verification: Properties, Tools & Techniques. Lecture 9 [Nov 10]: Security Protocol Verification: ProVerif. Lecture 10 [Nov 17]: Exam.

Readings. No textbook: slides will be available on moodle after each lecture. Research papers: each lecture will cite several research papers; reading them is optional but recommended (learning to read such papers is one of the objectives of the course). Verification tools: some lectures will describe verification tools; downloading the tools and reading their manuals is expected. Similar master-level courses elsewhere: Princeton: http://www.cs.princeton.edu/courses/archive/spring10/cos598d/formalmethodsnetworkingoutline.html ; University of Pennsylvania: http://netdb.cis.upenn.edu/cis800-fa11/

Network Protocols In this course, we will study network protocols Their goals, specifications, and implementations Network protocols are distributed programs Execute concurrently on multiple hosts Have complex state machines, subtle corner cases Must account for temporary/permanent failures Wire behavior is usually well specified in standards State machines, protocol goals often left underspecified How do we ensure that a protocol (and its implementation) achieves its intended goals?

Formal Protocol Analysis How do we ensure that a protocol (and its implementation) achieves its intended goals? Can we formally state a correctness theorem? Can we (semi-)automatically prove such theorems? Can we (semi-)automatically find counterexamples? Automation is often necessary There are too many cases to consider by hand Proofs and counterexamples should be reproducible

Formal Methods. Formal languages for modeling protocol execution: Communicating Finite State Machines, Pi Calculus, Programs. Formal languages for specifying their goals: Abstract Protocols, Temporal Logic, Safety Assertions, First-order Logic, Higher-order Logic. Verifying that a protocol meets its goals: Model Checking, Constraint (SAT) Solving, Theorem Proving. Verifying that an implementation meets its goals: Static Analysis, Type Checking, Program Synthesis.

Goals Learn how to model and specify protocols Learn how to use verification tools Be able to state and prove theorems about important protocols and their implementations Or find counterexamples/attacks Today: Motivating case studies Specifying and analyzing popular Internet protocols

HTTP TLS TCP IP

Protocol Layers. Hypertext Transfer Protocol (HTTP): Standards (v1.1): IETF RFC 7230-7235; Goal: retrieving resources from URIs. Transport Layer Security (TLS): Standards (v1.2): IETF RFC 5246; Goal: secure communication channel. Transmission Control Protocol (TCP): Standards: IETF RFC 793; Goal: reliable data streams. Internet Protocol (IP): Standards: IETF RFC 791; Goal: packet forwarding. Paths are computed by various routing protocols: BGP, OSPF, IS-IS, AODV, ...

Protocol Goals. Each protocol uses an API provided by the lower layer and provides an API to the higher layer. For each API, we can specify both local and global goals. API provided by IP at host H: send : (dstip,data) -> void; recv : void -> (srcip,data). (All functions can return errors.) Exercise: What are the goals?

Many Kinds of Properties. Safety: the protocol does not do bad things, e.g. no runtime error, no incorrect messages. Liveness: eventually, something good happens, e.g. termination, message delivery. Security: no illegal information flows, e.g. confidentiality, integrity, authenticity. Availability: don't exhaust available resources, e.g. no denial-of-service, no out-of-memory. Some properties matter more than others: e.g. security leak > crash > denial-of-service.

IP Goals. API provided by IP at host H: send : (dstip,data) -> void; recv : void -> (srcip,data). Exercise: What are the goals? Goal: Correct Delivery (Safety): if host H' calls recv and gets (H,D), then host H must have called send(H',D). Assumption: Unreliable network: packets may be silently dropped, but packets will not be tampered with.
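
The correct-delivery goal can be read as a check over event traces. The following is a minimal sketch, assuming a hypothetical trace encoding (the `Event` tuples and the host names are illustrative and not part of the lecture): every `recv` must be justified by an earlier matching `send`, while dropped packets are allowed.

```python
from typing import List, Tuple

# Hypothetical event encoding (an assumption for illustration):
#   ("send", H, H2, D)  means host H called send(H2, D)
#   ("recv", H2, H, D)  means host H2 called recv() and got (H, D)
Event = Tuple[str, str, str, str]

def correct_delivery(trace: List[Event]) -> bool:
    """Return True if every recv in the trace is justified by an earlier send."""
    sent = set()  # (src, dst, data) triples sent so far
    for kind, host, peer, data in trace:
        if kind == "send":
            sent.add((host, peer, data))
        elif kind == "recv":
            # host received (peer, data): peer must have sent data to host
            if (peer, host, data) not in sent:
                return False
    return True

good = [("send", "H", "H2", "d1"), ("recv", "H2", "H", "d1")]
lossy = [("send", "H", "H2", "d1")]     # packet dropped: still safe
forged = [("recv", "H2", "H", "d1")]    # recv without a matching send
```

Under this reading, `good` and `lossy` satisfy the property and `forged` violates it. The property is a safety property: a violation is always witnessed by a finite trace prefix.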

TCP/IP Goals. At each TCP/IP host with address H: connect: (dstip,dstport) -> (srcport,handle); accept: dstport -> (srcip,srcport,handle); send: (handle,data) -> void; recv: handle -> data; close: handle -> void. Exercise: What are the guarantees? Goal: Local Invariants (Safety): if the application calls send before connect, or if it calls send after close, then send must return an error.
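
The local invariant can be sketched as a tiny state machine on a handle. This is an illustrative simplification, not the real sockets API: the class `Handle`, the exception `SocketError`, and the three state names are assumptions, and `connect` does not actually open a connection.

```python
class SocketError(Exception):
    """Raised when an API call violates the local protocol invariant."""

class Handle:
    def __init__(self):
        self.state = "NEW"      # NEW -> CONNECTED -> CLOSED
        self.outbox = []        # data accepted for transmission

    def connect(self, dstip, dstport):
        if self.state != "NEW":
            raise SocketError("connect on a used handle")
        self.state = "CONNECTED"

    def send(self, data):
        # The invariant: send is only legal between connect and close.
        if self.state != "CONNECTED":
            raise SocketError("send in state " + self.state)
        self.outbox.append(data)

    def close(self):
        self.state = "CLOSED"
```

A verification tool would aim to show that no sequence of API calls can make `send` succeed outside the `CONNECTED` state; here that is visible by inspection because the state check guards the only write to `outbox`.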

TCP/IP Goals. At each TCP/IP host with address H: connect: (dstip,dstport) -> (srcport,handle); accept: dstport -> (srcip,srcport,handle); send: (handle,data) -> void; recv: handle -> data. Exercise: What are the guarantees? Goal: Reliable Delivery (Liveness): Suppose H calls connect(H',P') and gets (P,h). Suppose H' calls accept(P') and gets (H,P,h'). Suppose H then calls send(h,D). Suppose H' then repeatedly calls recv(h'). Then eventually recv(h') will return D. Assumption: Connected Network (Fairness): IP delivers at least 1 in n packets from H to H'.
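
To see why the fairness assumption is what makes the liveness goal provable, here is a minimal sketch of retransmission over a channel that delivers exactly 1 in n packets (the worst case the assumption allows). The names `FairLossyChannel` and `send_reliably` are hypothetical, and acknowledgements are omitted for brevity.

```python
from typing import Optional

class FairLossyChannel:
    """Drops packets, but delivers at least 1 in every n (fairness)."""
    def __init__(self, n: int) -> None:
        self.n = n
        self.count = 0

    def transmit(self, packet: str) -> Optional[str]:
        self.count += 1
        # Worst case allowed by the assumption: only every n-th packet arrives.
        return packet if self.count % self.n == 0 else None

def send_reliably(channel: FairLossyChannel, data: str, max_tries: int) -> int:
    """Retransmit until delivered; return the number of attempts used."""
    for attempt in range(1, max_tries + 1):
        if channel.transmit(data) is not None:
            return attempt
    raise TimeoutError("not delivered within max_tries")
```

With this channel, any sender that keeps retransmitting succeeds within n attempts. Without the fairness assumption (a channel that may drop everything), no retransmission strategy can guarantee delivery, so the liveness goal would be false.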

Formally Specifying TCP. Abstractly, client and server are communicating state machines linked by a lossy channel (IP). To state a theorem, we need a formal semantics with a concurrent execution model and channel-based communication. The protocol is not really finite state (unbounded data), so most verification questions are undecidable. Concretely, there are many details in the standard and its implementations: congestion control, timeouts, packet formats, ... TCP is typically implemented in C with aggressive optimizations and pointer arithmetic.

What Is This?
[Figure: "TCP: an approximation to the real state diagram". This graph shows an approximation to the Host Transition System of the TCP specification. Nodes are the socket states NONEXIST, CLOSED, LISTEN, SYN_SENT, SYN_RECEIVED, ESTABLISHED, CLOSE_WAIT, LAST_ACK, FIN_WAIT_1, FIN_WAIT_2, CLOSING and TIME_WAIT; edges are labelled with Host LTS rule names, socket calls, and segment flags.]
Source: TCP, UDP, and Sockets: rigorous and experimentally-validated behavioural specification. Volume 1: Overview. Volume 2: The Specification. Steven Bishop, Matthew Fairbairn, Michael Norrish, Peter Sewell, Michael Smith, and Keith Wansbrough. 2005. http://www.cl.cam.ac.uk/users/pes20/netsem (March 18, 2005)
Figure legend: The states are the classic TCP states, though note that these are only a tiny part of the protocol endpoint state, in the specification or in implementations. The transitions are an over-approximation to the set of all the transitions in the model which (1) affect the TCP state of a socket, and/or (2) involve processing segments from the host's input queue or adding them to its output queue, except that transitions involving ICMPs are omitted, as are transitions arising from the pathological BSD behaviour in which arbitrary sockets can be moved to LISTEN states. Transitions are labelled by their Host LTS rule name (e.g. socket_1, deliver_in_3, etc.), any socket call involved (e.g. close()), and constraints on the flags of any TCP segment received and sent, with e.g. R indicating that RST is set and r indicating RST is clear. Transitions involving segments (either inbound or outbound) with RST set are coloured orange; others that have SYN set are coloured green; others that have FIN set are coloured blue; others are coloured black. The FIN indication includes the case of FINs that are constructed by reassembly rather than appearing in a literal segment. The graph is based on data extracted manually from the HOL specification. The data does not capture all the invariants of the model, so some depicted transitions may not be reachable in the model (or in practice). Similarly, the constraints on flags shown may be overly weak.
Transition Rules:
  close_3: Successful abortive close of a synchronised socket
  close_7: Successfully close the last file descriptor for a socket in the CLOSED, SYN_SENT or SYN_RECEIVED states
  close_8: Successfully close the last file descriptor for a listening TCP socket
  connect_1: Begin connection establishment by creating a SYN and trying to enqueue it on host's outqueue
  connect_4: Fail: socket has pending error
  deliver_in_1: Passive open: receive SYN, send SYN,ACK
  deliver_in_1b: For a listening socket, receive and drop a bad datagram and either generate a RST segment or ignore it. Drop the incoming segment if the socket's queue of incomplete connections is full.
  deliver_in_2: Completion of active open (in SYN_SENT receive SYN,ACK and send ACK) or simultaneous open (in SYN_SENT receive SYN and send SYN,ACK)
  deliver_in_2a: Receive bad or boring datagram and RST or ignore for SYN_SENT socket
  deliver_in_3: Receive data, FINs, and ACKs in a connected state
  deliver_in_3b: Receive data after process has gone away
  deliver_in_3c: Receive stupid ACK or LAND DoS in SYN_RECEIVED state
  deliver_in_6: Receive and drop (silently) a sane segment that matches a CLOSED socket
  deliver_in_7: Receive RST and zap non-{CLOSED; LISTEN; SYN_SENT; SYN_RECEIVED; TIME_WAIT} socket
  deliver_in_7a: Receive RST and zap SYN_RECEIVED socket
  deliver_in_7b: Receive RST and ignore for LISTEN socket
  deliver_in_7c: Receive RST and ignore for SYN_SENT (unacceptable ack) or TIME_WAIT socket
  deliver_in_7d: Receive RST and zap SYN_SENT (acceptable ack) socket
  deliver_in_8: Receive SYN in non-{CLOSED; LISTEN; SYN_SENT; TIME_WAIT} state
  deliver_in_9: Receive SYN in TIME_WAIT state if there is no matching LISTEN socket or sequence number has not increased
  deliver_out_1: Common case TCP output
  listen_1: Successfully put socket in LISTEN state
  listen_1c: Successfully put socket in the LISTEN state from any non-{CLOSED; LISTEN} state on FreeBSD
  shutdown_1: Shut down read or write half of TCP connection
  socket_1: Successfully return a new file descriptor for a fresh socket
  timer_tt_2msl_1: 2*MSL timer expires
  timer_tt_conn_est_1: connection establishment timer expires
  timer_tt_fin_wait_2_1: FIN_WAIT_2 timer expires
  timer_tt_keep_1: keepalive timer expires
  timer_tt_persist_1: persist timer expires
  timer_tt_rexmt_1: retransmit timer expires
  timer_tt_rexmtsyn_1: SYN retransmit timer expires
The Original: [Figure: RFC 793, Transmission Control Protocol Functional Specification, September 1981, Figure 6: TCP Connection State Diagram, with states CLOSED, LISTEN, SYN SENT, SYN RCVD, ESTAB, FIN WAIT-1, FIN WAIT-2, CLOSE WAIT, CLOSING, LAST-ACK and TIME WAIT.]

Full Formal TCP Specification Rigorous specification of TCP, UDP, and the Sockets API Written in Higher-Order Logic 360 pages printed out Can be experimentally validated against real TCP traces Has found real bugs in TCP implementations by conformance testing No proofs of TCP, just a detailed model [OPEN PROBLEM] Implementations are even more complicated [MANY HIDDEN BUGS] Papers (See http://www.cl.cam.ac.uk/~pes20/netsem/ ) Engineering with Logic: HOL Specification and Symbolic-Evaluation Testing for TCP Implementations. In POPL 2006. Steve Bishop, Matthew Fairbairn, Michael Norrish, Peter Sewell, Michael Smith, Keith Wansbrough. Rigorous specification and conformance testing techniques for network protocols, as applied to TCP, UDP, and Sockets. In SIGCOMM 2005. Steve Bishop, Matthew Fairbairn, Michael Norrish, Peter Sewell, Michael Smith, Keith Wansbrough.

Specification Methodology. Define the global state of the network: local protocol state at each node, plus packets in flight. Define local transitions for each node: it may receive data from the application, change local state, and send a network message; or it may receive a message from the network, change local state, forward data to the application, and send a network message. Define the environment model: model the lossy IP network and the generic application. This yields a definition of valid traces of the network + environment. Show that all valid traces satisfy the desired goals: a goal can be any higher-order logic formula, including safety and liveness. Proofs typically proceed by induction on the length of the trace, using generic theorem proving tactics and defining new specialized ones.
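
The methodology (global state, node transitions, environment transitions, goals over all valid traces) can be illustrated on a deliberately tiny finite model. The sketch below uses explicit-state search rather than the inductive proofs in higher-order logic that the lecture describes, and the three-boolean state `(sent, in_flight, delivered)` is an assumption chosen only to keep the example small.

```python
from collections import deque

def successors(state):
    """All one-step transitions of the toy network from a global state."""
    sent, in_flight, delivered = state
    if not in_flight:
        yield (True, True, delivered)      # node: send (or retransmit)
    else:
        yield (sent, False, delivered)     # environment: drop the packet
        yield (sent, False, True)          # node: receive and deliver

def check_invariant(initial, invariant):
    """Breadth-first search over reachable states.

    Returns None if the invariant holds everywhere, otherwise a
    counterexample trace from the initial state to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

initial = (False, False, False)
safe = check_invariant(initial, lambda s: (not s[2]) or s[0])   # delivered implies sent
never_delivered = check_invariant(initial, lambda s: not s[2])  # deliberately false claim
```

Here `safe` is `None` (no reachable state delivers data that was never sent), while the false claim yields a shortest counterexample trace. Exhaustive search only works because this model is finite; the lecture's point is that real protocols need induction or abstraction on top.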

Higher-Order Logic Spec: Received SYN; Sending SYNACK

Internet Routing Protocols Verifying Safety and Liveness

Routing in an Internetwork Hosts (h1,h2) connect to networks (n1,n2,n3,n4) Routers (r1,r2,r3,r4) connect two or more networks The link between a router and a network is an interface (i1,i2,i3) Routers forward packets based on routing tables Routers run a routing protocol to compute (optimal) routing tables

Internet Routing Protocols. Intradomain Routing: Interior Gateway Protocols. Routing within an Autonomous System (AS), a single administrative domain (e.g. ISP, University). Popular protocols: OSPF, IS-IS, RIP. Failures can cause local outages. Interdomain Routing: Exterior Gateway Protocols. Routing between ASes. Dominant protocol: BGP. Failures can cause internet-wide outages. Ad-hoc Network Routing Protocols: Routing between proximate wireless mobile devices. Popular protocols: AODV, DSR, OLSR. Typically run in low-reliability environments.

Routing in a network (simplified). Shortest Paths Problem: find the cheapest route from S to D. L(i, j) = cost of the direct link i --- j. R(a, b) = cost of the route from a to b. R(a, b) = min_k { L(a, k) + R(k, b) }, where k ranges over the neighbors of a and R(b, b) = 0.
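
The recurrence can be solved by repeated relaxation, in the style of Bellman-Ford. The sketch below is a centralized illustration under stated assumptions: links are symmetric, the function name `route_costs` and the example topology are made up, and the base case is that the destination reaches itself at cost 0.

```python
import math

def route_costs(links, dest):
    """Compute R(a, dest) for every node a from direct link costs L(i, j)."""
    nodes = {n for edge in links for n in edge}
    cost = {n: math.inf for n in nodes}
    cost[dest] = 0                       # base case: R(dest, dest) = 0
    both_ways = {}
    for (i, j), c in links.items():      # assumption: links are symmetric
        both_ways[(i, j)] = c
        both_ways[(j, i)] = c
    # A cheapest route uses at most |nodes| - 1 links, so that many
    # relaxation passes are enough.
    for _ in range(len(nodes) - 1):
        for (a, k), c in both_ways.items():
            # R(a, dest) = min over neighbors k of L(a, k) + R(k, dest)
            cost[a] = min(cost[a], c + cost[k])
    return cost

links = {("S", "A"): 1, ("A", "D"): 2, ("S", "D"): 5}
costs = route_costs(links, "D")
```

In this example the direct link S --- D costs 5, but the route through A costs 1 + 2 = 3, so `costs["S"]` is 3.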

Routing Protocol Designs Distance Vector Routing Each node keeps a table with a distance to every destination Global routing graph is computed as an asynchronous distributed computation that solves the shortest paths problem Example: RIP, AODV Path Vector Routing Each node keeps a table with the full path to every destination Each node decides its preferred paths (not necessarily shortest) Example: BGP Link State Routing Each node maintains the full link graph of the network All link updates are propagated throughout the network Global routing graph is computed locally at each node Example: OSPF

Routing Information Protocol (RIP). One of the oldest Internet routing protocols. Based on Asynchronous Distributed Bellman-Ford [Bertsekas 91]. Each node n maintains a routing table: hops_D: number of hops to D (no weighted edges); next_D: next router on the path to D. Global progress: initially, all nodes know their neighbors (hops = 1); finally, all nodes know the distance and successor to all other nodes. Local processes: periodically send the routing table to all neighbors; locally update hops_D to 1 + min(received hops_D); use timeouts to detect link breakage and expire routes.

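Distance-vector routing in RIP (topology A --- B --- C). Initially: A's table: A: 0, B: 1, C: ∞; B's table: A: 1, B: 0, C: 1; C's table: A: ∞, B: 1, C: 0. After exchange: A's table: A: 0, B: 1, C: 2; B's table: A: 1, B: 0, C: 1; C's table: A: 2, B: 1, C: 0.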
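
The exchange on the line topology A --- B --- C can be reproduced with a few lines of simulation. This is a simplified sketch: real RIP is asynchronous, whereas the hypothetical `exchange` function below performs one synchronous round in which every node reads its neighbors' current tables.

```python
import math

INF = math.inf

neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

# Initial tables: each node knows itself (0 hops) and its neighbors (1 hop).
initial = {
    "A": {"A": 0, "B": 1, "C": INF},
    "B": {"A": 1, "B": 0, "C": 1},
    "C": {"A": INF, "B": 1, "C": 0},
}

def exchange(tables, neighbors):
    """One synchronous round of distance-vector updates."""
    new_tables = {}
    for node, table in tables.items():
        new = dict(table)
        for dest in table:
            if dest == node:
                continue
            best = min(tables[nb][dest] for nb in neighbors[node])
            new[dest] = 1 + best         # hops_D = 1 + min(received hops_D)
        new_tables[node] = new
    return new_tables

after = exchange(initial, neighbors)
```

After one round, A learns a 2-hop route to C through B (and C a 2-hop route to A), matching the tables above; a second round changes nothing, so the tables have converged.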

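Routing Loops in RIP: Count to Infinity (topology A --- B --- C). After exchange: A's table: A: 0, B: 1, C: 2; B's table: A: 1, B: 0, C: 1; C's table: A: 2, B: 1, C: 0. When the link B --- C breaks, B sets C: ∞. A still advertises C: 2, so B updates to C: 2+1=3 via A; A then updates to C: 3+1=4, B to C: 4+1=5, and so on.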

Poisoned reverse. Advertise hops_D = ∞ to next_D. Prevents loops of two routers (adds more cases for verification). Example (A --- B --- C): A, whose next hop to C is B, advertises C: ∞ to B. Limitation: doesn't prevent loops of three or more routers.

Infinity = 16. Since we can't solve the loop problem, set Infinity to 16. RIP is not to be used in a network with more than 15 hops. If a routing loop occurs, it will be discovered in at most 15 routing updates; that is, the routing loop is transient. Until then, packets will be forwarded in a loop. The concrete protocol design deviates from theory: the original proof of asynchronous distributed Bellman-Ford no longer directly applies, and there are many corner cases and race conditions to worry about.
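
Count-to-infinity, the bound of 16, and the effect of poisoned reverse can all be observed in a two-router simulation. This is a sketch under simplifying assumptions: only routers A and B and the destination C are modelled, the B --- C link has just failed, updates alternate synchronously, and the function name `count_to_infinity` is hypothetical.

```python
INFINITY = 16  # RIP's representation of "unreachable"

def count_to_infinity(poisoned_reverse):
    """Return the history of (A's hops to C, B's hops to C) per round."""
    # Routes to C as (hops, next_hop), right after the B --- C link fails.
    route = {"A": (2, "B"), "B": (INFINITY, None)}
    history = [(route["A"][0], route["B"][0])]
    while True:
        changed = False
        for node, peer in (("B", "A"), ("A", "B")):
            peer_hops, peer_next = route[peer]
            # Poisoned reverse: peer advertises infinity to its own next hop.
            if poisoned_reverse and peer_next == node:
                advertised = INFINITY
            else:
                advertised = peer_hops
            hops, next_hop = route[node]
            candidate = min(INFINITY, advertised + 1)
            # Accept updates from the current next hop, or strictly better routes.
            if next_hop == peer or candidate < hops:
                new = (candidate, peer if candidate < INFINITY else None)
                if new != route[node]:
                    route[node] = new
                    changed = True
        history.append((route["A"][0], route["B"][0]))
        if not changed:
            return history

plain = count_to_infinity(poisoned_reverse=False)
poisoned = count_to_infinity(poisoned_reverse=True)
```

Without poisoned reverse, the hop counts in `plain` climb step by step until both routers reach 16; during those rounds A and B point at each other, so packets for C loop. With poisoned reverse, A's stale route is invalidated in the first round and `poisoned` is much shorter.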

Formal Goals for Routing Goal: Loop Freedom (Safety) In the global state of a routing protocol, there should not be a subset of routing tables that creates a loop on the route to D Many routing protocols have transient routing loops when links go down or come back up RIP has transient loops that may last 15 updates BGP prevents count-to-infinity by using paths AODV prevents it by using sequence numbers But AODV (v2) still had persistent routing loops
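
Loop freedom is a predicate on the global state, so it can be checked directly on a snapshot of all next-hop tables. A minimal sketch, assuming the snapshot is given as a dictionary from node to next hop for one destination (the function name `has_routing_loop` is hypothetical):

```python
def has_routing_loop(next_hop, dest):
    """Return True if following next hops toward dest revisits some node.

    next_hop maps each node to its next router on the route to dest,
    or to None if the node has no route.
    """
    for start in next_hop:
        seen = set()
        node = start
        while node is not None and node != dest:
            if node in seen:
                return True
            seen.add(node)
            node = next_hop.get(node)
    return False

two_node_loop = {"A": "B", "B": "A"}
valid_routes = {"A": "B", "B": "D"}
three_node_loop = {"A": "B", "B": "C", "C": "A"}
```

A model checker evaluates a predicate like this in every reachable global state; a single violating state, together with the trace that reaches it, is a counterexample.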

Formal Goals for Routing. Goal: Route Convergence (Liveness): in the global state of a routing protocol, if all links remain stable, then all routing tables eventually converge. Soundness: they should converge to valid routes. Optimality: they should converge to minimal routes. Convergence in Internet Routing Protocols: RIP converges in at most 15 routing updates. AODV (-02) may converge to invalid routes! BGP may not converge!

AODV: Ad-hoc On-demand Distance Vector Routing Protocol. Designed for Mobile Ad-hoc Networks: low-range, low-power wireless devices in battlefields or disaster areas. Routes are computed on-demand to save bandwidth; loops are prevented by versioning routes.

AODV Loop Freedom. AODV is designed to prevent transient loops. This avoids bandwidth wastage to unreachable nodes. Loops in on-demand routes are difficult to get rid of, since they cannot rely on regular route updates. Each routing table entry contains: hops_D and next_D as before; seqno_D: a number indicating route freshness. Only fresher routes can update an existing route. Among two routes of equal freshness, the smaller hop-count is preferred. The property to be formally verified is loop freedom.
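
The update rule (fresher wins; on equal freshness, fewer hops wins) is small enough to write down directly. The sketch below is an illustration of that rule only, not the full AODV route-table logic; the `Route` record and the function `should_replace` are hypothetical names.

```python
import math
from typing import NamedTuple, Optional

class Route(NamedTuple):
    hops: float               # hops_D (math.inf means unreachable)
    next_hop: Optional[str]   # next_D
    seqno: int                # seqno_D: higher means fresher

def should_replace(current: Route, offered: Route) -> bool:
    """Decide whether an offered route may overwrite the current entry."""
    if offered.seqno != current.seqno:
        return offered.seqno > current.seqno   # only fresher routes update
    return offered.hops < current.hops         # equal freshness: fewer hops

current = Route(hops=2, next_hop="B", seqno=0)
invalidation = Route(hops=math.inf, next_hop=None, seqno=1)
stale_offer = Route(hops=3, next_hop="A", seqno=0)
```

This is what blocks count-to-infinity: once a node has recorded the fresher invalidation (sequence number 1), a stale advertisement with sequence number 0 is rejected even though its hop count is finite.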

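AODV Loop Freedom (topology A --- B --- C; each entry is hops, seqno). After exchange: A's table: A: 0, 0; B: 1, 0; C: 2, 0. B's table: A: 1, 0; B: 0, 0; C: 1, 0. C's table: A: 2, 0; B: 1, 0; C: 0, 0. After the route to C is lost: A and B both record C: ∞, 1.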

Analyzing AODV for Loop-Freedom (nodes A, B, D). We model the 3-node AODV network in SPIN. Each node runs an identical Promela process. A global link table encodes the dynamic topology. We specify loop freedom as a safety property in Linear Temporal Logic: Always (!((next_D(A) == B) /\ (next_D(B) == A))). We run SPIN, which finds four counterexamples.

Looping Counter-examples. SPIN finds 4 cases where AODV can form loops! E.g. when routes expire, when nodes restart, ... In response, AODV was fixed to cover these cases.

Looping Counter-examples. Let's keep the expired route but set it to infinity?

Looping Counter-examples. Let's keep the expired route but set it to infinity, increase its sequence number, and delete it later. What if the route update gets dropped before the timeout?

Looping Counter-examples What if one of the nodes reboots? Its routing table is reset Its neighbours may not detect that it has been rebooted

Sufficient Conditions for Loop Freedom 1. Increase sequence number on every update, even if route expires or breaks 2. Never delete expired routes Can delete them if all other nodes have indicated knowledge of expiry in some way 3. Detect when a neighbor restarts AODV Treated as if all links to neighbors are broken Are these conditions enough to guarantee loop freedom for all runs of arbitrary AODV networks? Yes, we can prove a general theorem by combining finitary proofs in SPIN with abstraction proofs in HOL

Verifying RIP and AODV Detailed models derived from the standards Fully formal proofs in HOL + SPIN of AODV safety: loop freedom and route validity RIP liveness: convergence with sharp timing bounds Paper: Formal verification of standards for distance vector routing protocols, K Bhargavan, D Obradovic, C Gunter, JACM 2002

TLS: Verifying Security Properties

Secure Channels over Insecure Networks (Web User --- Internet --- Web Server). Man-in-the-middle attackers may: read data sent to and received from servers; tamper with the contents of messages; impersonate a user or a server. We require a secure (private, reliable) channel. To achieve this, we must rely on cryptography. A unilaterally-authenticated key exchange protocol: servers are authenticated, but clients may remain anonymous.

Transport Layer Security (1994--). The most widely deployed cryptographic protocol: Web, Wi-Fi, VPN, Mail, Chat, Voice over IP, ... Implementations: SChannel (Windows), NSS (Firefox, Chrome), SecureTransport (iOS, OS X), OpenSSL (Servers, Linux, Android). Many standardized versions and protocol extensions: 1994 Netscape's Secure Sockets Layer; 1995 SSL3; 1999 TLS1.0 (RFC2246, SSL3); 2006 TLS1.1 (RFC4346); 2008 TLS1.2 (RFC5246). Many recent flaws & attacks: GotoFail, HeartBleed, 3Shake. What causes these attacks?

TLS Handshake (message sequence between Client C and Server S)
C has: cert_C, pk_C, sk_C. S has: cert_S, pk_S, sk_S.
"Hello, I'd like to connect to Google"
  C -> S: ClientHello(cr, [KEX_ALG_1, KEX_ALG_2, ...], [ENC_ALG_1, ENC_ALG_2, ...])
"Hi, I'm Google (here's my certificate). Let's exchange keys with RSA"
  S -> C: ServerHello(sr, sid, KEX_ALG, ENC_ALG)
  S -> C: ServerCertificate(cert_S, pk_S); C verifies cert_S is valid for host S
  S -> C: ServerKeyExchange(kex_S); C verifies signature using pk_S
  S -> C: CertificateRequest
  S -> C: ServerHelloDone
"Ok, I have the keys. Let's start talking"
  C -> S: ClientCertificate(cert_C, pk_C); S verifies cert_C is valid
  C -> S: ClientKeyExchange(kex_C)
  C -> S: CertificateVerify(sign(sk_C, log_1)); S verifies signature using pk_C
  C and S: compute ms from kex_C, kex_S
  C -> S: ClientCCS
  C -> S: ClientFinished(verifydata(ms, log_2)); S verifies finished using ms
"Your keys look fine. Send me data"
  S -> C: ServerCCS
  S -> C: ServerFinished(verifydata(ms, log_3)); C verifies finished using ms
  C and S: cache new session: sid, ms, cert_C/anon, cert_S, cr, sr, KEX_ALG, ENC_ALG
Application Data Exchange (HTTP or SMTP or ...): AppData_i, AppData_j, ...
"I'm done. Goodbye." / "Got it. Goodbye."
  CloseNotifyAlert (in each direction)
The Authenticated Key Exchange can use RSA or Diffie-Hellman or PSK or Kerberos or ...

TLS Security Goals. Goal: Confidentiality: if a client C sends secret data D to server S, then D is not leaked to the adversary. Goal: Authentication and Integrity: when C receives D over a connection with S, it must be that S sent D over this connection. Concretely, an attacker cannot impersonate an honest HTTPS website or steal its users' passwords. Assumption (threat model): Network-based adversary. The adversary fully controls the network and all other clients and servers. We assume that C and S are honest, that is, their long-term credentials are unknown to the adversary. The probability of the adversary breaking a cryptographic primitive or guessing a secret is negligible.

What can go wrong with TLS? Implementation bugs Parsing messages incorrectly [HeartBleed] Forgetting cryptographic checks [GotoFail] Allowing messages at wrong time [SMACK, FREAK] Cryptographic weaknesses RC4, RSA-PKCS1.5, MAC-Then-Encrypt Theoretical attacks may not be immediately exploitable, but they always get you in the end [BREACH, CRIME] Incorrect protocol composition Confusing DHE with ECDHE [CrossProtocol] Session Resumption and Renegotiation [3Shake]

Early CCS Injection [Kikuchi 14]
[Figure: the TLS handshake message sequence between Client C and Server S (ClientHello through ServerFinished, application data, CloseNotifyAlert), annotated with an attack in which the change of ciphers is triggered before the key exchange has completed.]
Annotations on the figure: "Hello, I'd like to connect to Google." "Hi, I'm Google (here's my certificate). Let's exchange keys with RSA." "We don't have keys yet, but let's change ciphers." "Ok, if you say so. Let's set keys to 0000000000!" "Great, I know that key." "Yikes, the key exchange algorithm has been bypassed. Anybody can read the data!"
Bug in OpenSSL state machine. Found when trying to do a formal Coq proof of TLS.

State Machine Attacks (SMACK). A series of attacks that rely on tampering with the order of protocol messages (2015). Attacks: SKIP, FREAK, LOGJAM (affects 25% of Web). Bugs in OpenSSL, Firefox, Chrome, Windows, iOS, ... Some bugs were 15 years old. Security updates to all major browsers in 2015. For demos, see: smacktls.com. Bugs and attacks were found through formal analysis: formal verification and specification-based testing. See the miTLS project: mitls.org. Paper: A Messy State of the Union: Taming the Composite State Machines of TLS. In IEEE S&P 2015. Benjamin Beurdouche, Karthikeyan Bhargavan, Antoine Delignat-Lavaud, Cédric Fournet, Markulf Kohlweiss, Alfredo Pironti, Pierre-Yves Strub, Jean Karim Zinzindohoue.
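
The common root of these attacks is an implementation that accepts a message in a state where the protocol does not allow it. A strict acceptor can be sketched as a transition table. This is a heavily simplified illustration: it covers only the client-to-server messages of a handshake without client authentication, and the state names, `EXPECTED` table, and `run_server` function are assumptions, not the state machine of any real TLS library.

```python
class UnexpectedMessage(Exception):
    """Raised when a message arrives in a state that does not allow it."""

# state -> {allowed message: next state}
EXPECTED = {
    "START": {"ClientHello": "HELLO_DONE"},
    "HELLO_DONE": {"ClientKeyExchange": "KEYS_READY"},
    "KEYS_READY": {"ClientCCS": "CCS_RECEIVED"},
    "CCS_RECEIVED": {"ClientFinished": "ESTABLISHED"},
}

def run_server(messages):
    """Process client messages in order; reject anything out of sequence."""
    state = "START"
    for msg in messages:
        allowed = EXPECTED.get(state, {})
        if msg not in allowed:
            raise UnexpectedMessage(msg + " not allowed in state " + state)
        state = allowed[msg]
    return state

normal = ["ClientHello", "ClientKeyExchange", "ClientCCS", "ClientFinished"]
early_ccs = ["ClientHello", "ClientCCS"]   # CCS before any key exchange
```

With an explicit table, "CCS is only accepted after the key exchange" becomes a property that can be tested, or verified, against the specification; `run_server(early_ccs)` raises `UnexpectedMessage` instead of switching to keys that were never negotiated.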

Triple Handshake Attack [3Shake]. A new protocol-level attack on TLS in 2014. Malicious servers can impersonate a user at trusted servers. Relies on cryptographic weaknesses in key exchange. Exploits the composition of 3 different modes of TLS. Reveals unsafe implementations in mainstream browsers. Affects banking websites, Wi-Fi, VPN, Git, Chat servers.

TLS Handshake [Proved secure in CRYPTO 13]
[Figure: the TLS handshake message sequence between Client C and Server S: ClientHello, ServerHello, ServerCertificate, ServerKeyExchange, CertificateRequest, ServerHelloDone, ClientCertificate, ClientKeyExchange, CertificateVerify, ClientCCS, ClientFinished, ServerCCS, ServerFinished, followed by Application Data Exchange and CloseNotifyAlert.]

TLS Renegotiation Attack [2009]. Martin Rex's version (compound authentication failure): C connects to M; M connects to S; C renegotiates (connects again) with M; M forwards C's messages to S; S confuses data sent by C and M.

TLS Renegotiation Fix [2009]. Cryptographically bind the two handshakes. Proved secure in CCS 13. The binding won't match, so S refuses to renegotiate.

3Shake Attack [2014]. Three handshakes: C connects to M; C resumes the session on a new connection with M; C renegotiates with M. M can then impersonate C at any other server S. Relies on cryptographic weaknesses in TLS's RSA and DHE exchanges. Bypasses the renegotiation fix.

3Shake Impact. Security updates and CVEs: Firefox, Chrome, Windows, Apple, ... (papers, demos: http://secure-resumption.com). New fix proposed for the TLS protocol (IETF RFC7627): will be mandatory for all TLS implementations; a deep protocol change backed by formal analysis. The attack has existed since SSL 3.0 (1998). Why did nobody find it? Three (different) state machines are difficult to analyze by hand; the case analysis requires automated methods. The desired compound authentication property is subtle; it is easier to reason about in a formal setting.

Formal Analyses of TLS
Long history of formal methods for TLS:
- Theorem proving: "Inductive analysis of the Internet protocol TLS", LC Paulson, ACM TISSEC '99
- Specialized cryptographic proving (ProVerif): "Cryptographically verified implementations for TLS", K Bhargavan, C Fournet, R Corin, E Zalinescu, CCS '08
- Model checking: "ASPIER: An Automated Framework for Verifying Security Protocol Implementations", S Chaki, A Datta, IEEE CSF '09
- Type checking: "Implementing TLS with verified cryptographic security", K Bhargavan, C Fournet, M Kohlweiss, A Pironti, P-Y Strub

Symbolic Protocol Analysis
- Define a symbolic model of ideal cryptography
  - E.g. symmetric encryption is represented by two functions, enc and dec
  - Encryption enc is injective and unbreakable without the key
  - Decryption dec inverts encryption (only with the right key): forall k, x. dec(k, enc(k, x)) = x
- Define a process for each protocol participant
  - Sends and receives messages over public channels, maintains state, generates fresh secrets, calls crypto functions
- The attacker is implicit and unbounded
  - Controls all public channels
  - Can create fresh data but cannot guess secrets
  - Has unlimited computational power but cannot break crypto
- Verify that the protocol preserves its security goals
  - Undecidable in general, but some protocol classes are known to be decidable
  - Tools like ProVerif can still verify many protocols
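The symbolic model above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how ProVerif works internally: terms are either atomic names or encryptions, `dec` implements the single equation dec(k, enc(k, x)) = x, and the attacker's knowledge is closed only under decryption with keys it already knows (it does not enumerate the infinitely many terms the attacker could build). The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Name:
    """An atomic value: a key, a nonce, or a payload."""
    label: str


@dataclass(frozen=True)
class Enc:
    """The symbolic term enc(key, body); it has no bit-level structure."""
    key: "Term"
    body: "Term"


Term = Union[Name, Enc]


def dec(k: Term, m: Term) -> Optional[Term]:
    """Apply dec(k, enc(k, x)) = x; any other decryption fails (None)."""
    if isinstance(m, Enc) and m.key == k:
        return m.body
    return None


def attacker_knowledge(observed: set) -> frozenset:
    """Close the observed terms under decryption with known keys."""
    known = set(observed)
    changed = True
    while changed:
        changed = False
        for m in list(known):
            if isinstance(m, Enc) and m.key in known and m.body not in known:
                known.add(m.body)
                changed = True
    return frozenset(known)


k1, k2, secret = Name("k1"), Name("k2"), Name("secret")
wire = {Enc(k1, k2), Enc(k2, secret)}  # messages seen on the public channel

safe = secret not in attacker_knowledge(wire)          # no key is known
broken = secret in attacker_knowledge(wire | {k1})     # k1 has leaked
print(safe, broken)
```

With no key known, the attacker learns nothing from the two ciphertexts; once `k1` leaks, it recovers `k2` and then `secret`, which is the kind of reachability fact a symbolic tool checks.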

3Shake in ProVerif
Figure panels (ProVerif model excerpts):
- Crypto Model: RSA Encryption
- Client Process: Initial RSA Handshake
- Client Process: Resumption Handshake
- Client Process: Renegotiation Handshake
ProVerif automatically finds the 3Shake attack, coded as a reachability query

Proofs for Protocol Software
- Reference implementation of TLS 1.2 in F#
  - 5000 lines of code, 2500 lines of logical specification
  - Automated proofs by typechecking with F7
- Collaboration between Microsoft Research and INRIA
- Supports major protocol versions and ciphersuites
- Papers in IEEE S&P (Oakland) 2013, CRYPTO 2014
- More details, source code: http://mitls.org

Summary and Conclusions

Protocols and their Properties
- TCP: Reliable Communication Channel
- TLS: Secure Communication Channel
- RIP: Convergent Distance Vector Routing
- AODV: Loop-Free On-Demand Routing
- BGP: Convergent Path Vector Routing
Many other interesting protocol properties to explore (next lectures)

Formal Methods for Protocol Analysis
- Theorem proving (Coq, HOL, Isabelle)
  - Powerful proof techniques, but not very automated
  - Example: detailed specification of TCP in HOL
- Cryptographic protocol analysis (ProVerif, Tamarin)
  - Unbounded (symbolic) attacker, not finite state
  - Example: proofs and attacks for TLS in ProVerif
- Model checking (SPIN, SMV)
  - Proofs for finitary approximations of protocol and network
  - Example: proofs and counterexamples for AODV in SPIN
- Program analysis (Astrée, F*)
  - Directly analyzes protocol source code, not just a model
  - Needs sound abstractions to reduce verification complexity
  - Example: verified TLS implementation

Conclusions
- Formal methods can be effective for the precise specification and (semi-)automated verification of network protocols
  - They uncover attacks
  - They explain observed phenomena
  - They expose specification ambiguities
  - They can aid with testing implementations
- Formal proofs increase our confidence in protocols and their implementations
  - See: http://mitls.org
- Formal models drive new networking paradigms like Software Defined Networking
  - See: http://www.frenetic-lang.org/

Exercises
- Download and install SPIN: http://spinroot.com/spin/whatispin.html
  - Try out its examples and read its tutorial
- Encode a simple static forwarding (routing) network in Promela (topology figure: nodes A, B, C)
  - One proctype each for A, B, C
  - Each node non-deterministically sends messages to the others
  - A, B, C forward messages according to the topology in the figure
- Show by random simulation that packets sent from A to C do reach C
- Can you prove that all such messages will reach C? (Liveness)
- Can you prove that no message is delivered to the wrong node? (Safety)
- Can you generalize the model by using a global topology table?
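Before writing the Promela model, it can help to pin down what a global topology table might look like. The sketch below is in Python rather than Promela and assumes a line topology A - B - C, since the slide's figure is not reproduced in the text. The tables `LINE` and `LOOPY` and the function names are hypothetical. It checks a deterministic table exhaustively; it does not model the concurrent, non-deterministic senders that SPIN would explore.

```python
# Assumed line topology A - B - C. table[node][dst] is the next hop.
LINE = {
    "A": {"B": "B", "C": "B"},
    "B": {"A": "A", "C": "C"},
    "C": {"A": "B", "B": "B"},
}

# A deliberately broken table: B sends C-bound packets back to A.
LOOPY = {
    "A": {"B": "B", "C": "B"},
    "B": {"A": "A", "C": "A"},
    "C": {"A": "B", "B": "B"},
}


def route(table, src, dst):
    """Follow next hops from src toward dst; return the path, or None on a loop."""
    path = [src]
    while path[-1] != dst:
        if len(path) >= len(table):
            # Every node could have been visited once already without
            # reaching dst, so the packet must be revisiting a node.
            return None
        path.append(table[path[-1]][dst])
    return path


def all_delivered(table):
    """True if every source/destination pair is delivered without looping."""
    return all(
        route(table, s, d) is not None
        for s in table for d in table if s != d
    )


print(route(LINE, "A", "C"))
print(all_delivered(LINE), all_delivered(LOOPY))
```

The broken table shows the kind of counterexample to look for: a packet from A to C bounces between A and B and never arrives.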

Course Outline
- Lecture 1 [Today, Sep 14]: Introduction, Motivating Examples
- Lectures 2-4 [Sep 21, 28, Oct 5]: Program Verification: Properties, Tools & Techniques
- Lectures 5-8 [Oct 12, 19, 26, Nov 2]: Protocol Verification: Case Studies
- Lecture 9 [Nov 9]: Advanced Topics
- Lecture 10 [Nov 16]: Exam

End of Lecture 1

BGP: Unachievable Goals

Path Vector Routing in BGP
- A BGP configuration has two distinctive features
  - Each routing entry has a full path to the destination
  - Each node has an ordered list of preferred paths
- BGP does not solve the shortest path problem
- It tries to solve the Stable Paths Problem
  - Each node picks its most preferred path that is also consistent with its neighbors' chosen paths
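The stability condition can be made concrete with a small check. This is a sketch under assumptions: paths are tuples of node ids ending at destination 0, each node's permitted paths are listed most preferred first, and a path is available to a node only if its next hop has currently chosen exactly the remainder of that path. The two-node configuration is in the style of the DISAGREE example from the Stable Paths Problem literature; the function names are hypothetical.

```python
EMPTY = ()  # a node with no available path

# Node 1 prefers going via 2; node 2 prefers going via 1.
PERMITTED = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
}


def best_choice(u, assignment, permitted):
    """Most preferred permitted path of u that extends its next hop's choice."""
    for path in permitted[u]:
        rest = path[1:]
        # The direct path (u, 0) is always available; any other path needs
        # the next hop to have chosen exactly the rest of the path.
        if rest == (0,) or assignment.get(rest[0]) == rest:
            return path
    return EMPTY


def is_stable(assignment, permitted):
    """Stable: no node would switch, given what its neighbors have chosen."""
    return all(
        best_choice(u, assignment, permitted) == assignment[u]
        for u in permitted
    )


both_direct = {1: (1, 0), 2: (2, 0)}
one_via_two = {1: (1, 2, 0), 2: (2, 0)}

print(is_stable(both_direct, PERMITTED))
print(is_stable(one_via_two, PERMITTED))
```

When both nodes route directly, node 1 would rather switch to its path via 2, so that assignment is not stable. Once node 1 routes via 2, node 2's preferred path is no longer available and neither node wants to change.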

BGP Convergence
- BGP convergence depends on configuration
  - May not have a unique solution!
  - May not have any solution!
- Real-world impact
  - Misconfiguration can cause outages
  - BGP routes may be unstable ("flapping")
  - BGP may not converge even if good routes exist

BGP Configurations and Solutions
Figure: three example configurations
- Not shortest paths, but still has a stable solution
- Has a stable solution, but BGP may diverge
- No solution


Formalizing BGP Convergence
- Formal definition of the Stable Paths Problem (SPP)
  - Can be modeled using Alloy
- Formal model of the core BGP protocol: Simple Path Vector Protocol (SPVP)
  - Can be modeled in Promela/Spin
- Verify whether an SPP instance has a solution: SAT solving
- Verify whether SPVP converges for an SPP instance: model checking
Papers:
- "The Stable Paths Problem and Interdomain Routing", TG Griffin, F Bruce Shepherd, G Wilfong, IEEE TON '02
- "Toward a lightweight model of BGP safety", M Arye, R Harrison, R Wang, P Zave, J Rexford, WRiPE '11
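For very small instances, the question "does this SPP instance have a solution?" can be answered by brute-force enumeration instead of SAT solving. The sketch below does exactly that; it is not the Alloy or SAT encoding from the papers. It assumes paths are tuples ending at destination 0 and that permitted paths are listed most preferred first. The two instances are illustrative configurations in the style of the DISAGREE and BAD GADGET examples from the Stable Paths Problem paper; all function names are hypothetical.

```python
from itertools import product

EMPTY = ()  # a node with no available path

# Two nodes that each prefer to route through the other.
DISAGREE = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
}

# Three nodes with cyclic preferences: 1 via 3, 3 via 2, 2 via 1.
BAD = {
    1: [(1, 3, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
    3: [(3, 2, 0), (3, 0)],
}


def best_choice(u, assignment, permitted):
    """Most preferred permitted path of u that extends its next hop's choice."""
    for path in permitted[u]:
        rest = path[1:]
        if rest == (0,) or assignment.get(rest[0]) == rest:
            return path
    return EMPTY


def solutions(permitted):
    """Enumerate every path assignment and keep the stable ones."""
    nodes = sorted(permitted)
    options = [list(permitted[u]) + [EMPTY] for u in nodes]
    found = []
    for combo in product(*options):
        assignment = dict(zip(nodes, combo))
        if all(best_choice(u, assignment, permitted) == assignment[u]
               for u in nodes):
            found.append(assignment)
    return found


print(len(solutions(DISAGREE)))
print(len(solutions(BAD)))
```

The first instance has two stable assignments (no unique solution) and the second has none. Enumeration grows exponentially with the number of nodes, which is why the slides point to SAT solving for this check and to model checking for the separate question of whether SPVP converges.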