Optimising Functional Programming Languages. Max Bolingbroke, Cambridge University CPRG Lectures 2010


Objectives
- Explore optimisation of functional programming languages using the framework of equational rewriting
- Compare some approaches to deforestation of functional programs

Making FP fast is important

The great promise of functional programming is that you can write simple, declarative specifications of how to solve problems:

wordcount :: String -> Int
wordcount = length . words

sumsquares :: String -> Int
sumsquares = sum . map (square . read) . words
  where square x = x * x

Unfortunately, simple and declarative programs are rarely efficient. If we want functional programming to displace the imperative style, it needs to be reasonably fast.
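Both specifications compile as ordinary Haskell (a minimal sketch; the slide's sumsquares needs a read to typecheck, assuming whitespace-separated numbers as input):

```haskell
-- Declarative one-liners: count words, and sum the squares of the
-- numbers in a string.
wordcount :: String -> Int
wordcount = length . words

sumsquares :: String -> Int
sumsquares = sum . map (square . read) . words
  where square x = x * x :: Int
```

For example, wordcount "the quick brown fox" is 4 and sumsquares "1 2 3" is 14.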

FP vs Imperative Optimisation

                 Glasgow Haskell Compiler   GNU C Compiler
Input program    Haskell                    C
Source AST       Haskell syntax             C syntax
IR               λ-calculus                 GENERIC
Lowered AST      C--                        GIMPLE/RTL
Machine code     x86/x64/...                x86/x64/...

Pure λ-calculus is almost embarrassingly easy to optimise. Optimisation consists of applying rules derived from the axioms of the calculus. Things are much more complicated if the language has impure features such as reference cells (sorry, ML fans!).

Pure λ-calculus

Some of the equational rules (the slide marks which of them remain valid in the pure, CBV and impure settings):

(\x -> e1) e2                                    = let x = e2 in e1
case Just e1 of { Just x -> e2; Nothing -> e3 }  = let x = e1 in e2
let x = e1 in let y = e2 in e3                   = let y = e2 in let x = e1 in e3   (x n.f. in e2)
let x = e1 in ... x ...                          = let x = e1 in ... e1 ...
let x = e1 in e2                                 = e2                               (x n.f. in e2)

Artificial example of equational optimisation

let fst = \pair -> case pair of (a, b) -> a
    (.) = \f g x -> f (g x)
in (\y -> y + 1) . fst

(desugar operator)

let fst = \pair -> case pair of (a, b) -> a
    (.) = \f g x -> f (g x)
in (.) (\y -> y + 1) fst

(inline)

(\f g x -> f (g x)) (\y -> y + 1) (\pair -> case pair of (a, b) -> a)

(β-reduce)

\x -> (\y -> y + 1) ((\pair -> case pair of (a, b) -> a) x)

(β-reduce)

\x -> let y = case x of (a, b) -> a in y + 1

(inline)

\x -> (case x of (a, b) -> a) + 1

(+ is strict)

\x -> case x of (a, b) -> a + 1

Equational optimisation is the bread and butter of a functional compiler

Equational optimisation is the number one most important optimisation in a functional language compiler. Inlining to remove higher-order functions (e.g. in the arguments to the composition (.) function) is a particularly large win:
- Removes the need to allocate closures for those functions
- Eliminates some jumps through a function pointer (which are hard for the CPU to predict)
- Allows some intraprocedural optimisation

Simple equational optimisation is not sufficient

Consider the following reduction sequence:

map (\y -> y + 1) (map (\x -> x + 1) [1, 2, 3, 4, 5])   -- input list
map (\y -> y + 1) [2, 3, 4, 5, 6]                       -- intermediate list: is this really necessary?
[3, 4, 5, 6, 7]                                         -- output list

Idea: use higher-level equations to optimise!

We could build some facts about library functions into the compiler. These can take the form of extra equations to be applied by the compiler wherever possible, just like those derived from the axioms of the λ-calculus.

Example

Before, we had this expression, which allocates a useless intermediate list:

map (\y -> y + 1) (map (\x -> x + 1) [1, 2, 3, 4, 5])

However, if the compiler knows that:

∀f g xs. map f (map g xs) = map (f . g) xs

it can spot the (inefficient) original expression at compile time and equationally rewrite it to:

map ((\y -> y + 1) . (\x -> x + 1)) [1, 2, 3, 4, 5]

Since there is now only one call to map, there is no intermediate list. If f and g had side effects this rule wouldn't always hold - purity pays off.
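GHC actually lets library authors state such equations directly, as RULES pragmas. A sketch of how the map/map law could be written (standard pragma syntax; note that real base libraries phrase map fusion differently, via foldr/build, so this exact rule is illustrative):

```haskell
module MapMap where

-- The fusion law as a GHC rewrite rule: wherever the left-hand side
-- appears in optimised code, GHC may replace it with the right-hand side.
{-# RULES
"map/map" forall f g xs. map f (map g xs) = map (f . g) xs
  #-}

-- Both sides of the equation agree on any example:
lhs, rhs :: [Int]
lhs = map (+ 1) (map (* 2) [1, 2, 3])
rhs = map ((+ 1) . (* 2)) [1, 2, 3]
```

The rule is only sound because map is pure; GHC never checks rules for correctness, it just applies them.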

Removing intermediate data is important for an FP compiler

In a purely functional programming language, you can never update an existing data structure. Instead, the program is constantly allocating brand new data structures. A whole family of optimisations known as deforestation has sprung up to remove intermediate data structures (we just saw a very simple deforester).

Deforestation in practice

Naively, you might imagine that the compiler needs to know (at least) one equation for every possible pair of composed functions (map of a map, sum of a map, map of an enumfromto, etc.). Instead, the main implementation of the Haskell programming language, GHC, implements a form of deforestation called foldr/build fusion based on a single equational rewrite rule. This is a (much) more general version of the map/map fusion shown earlier: knowing just a single equation, the compiler is able to deforest compositions of all sorts of list functions!

sumsq x = sum (map (\x -> x * x) (enumfromto 1 x))

(deforestation)

sumsq x = if 1 > x then 0 else go 1
  where go y = if y == x then y * y else (y * y) + go (y + 1)

N.B.: no lists - instead, we have a simple loop!
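As a sanity check (not from the slides; the names are hypothetical), the fused loop and the list-based definition can be compared directly:

```haskell
-- Unfused: builds an intermediate list of squares before summing it.
sumsqList :: Int -> Int
sumsqList x = sum (map (\n -> n * n) (enumFromTo 1 x))

-- Fused: the list-free loop the slide claims deforestation produces.
sumsqLoop :: Int -> Int
sumsqLoop x = if 1 > x then 0 else go 1
  where go y = if y == x then y * y else (y * y) + go (y + 1)
```

Both give, e.g., 55 for x = 5 (1 + 4 + 9 + 16 + 25) and 0 for an empty range.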

foldr/build fusion

The idea (Gill et al., FPCA 1993):
1. Write all your list consumers (sum, length, etc.) using the foldr function
2. Write all your list producers (e.g. enumfromto) using the build function
3. Provide a clever equational rewrite for an application of a foldr to a build (i.e. a consumer to a producer), which causes the intermediate list to be removed

Writing a foldr list consumer

In case you've forgotten:

foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c n []     = n
foldr c n (x:xs) = c x (foldr c n xs)

Intuitively, foldr c n on a list replaces every cons (:) in the list with c and the nil [] with n:

  foldr c n (e1 : e2 : ... : em : [])
= foldr c n ((:) e1 ((:) e2 (... ((:) em []))))
= c e1 (c e2 (... (c em n)))

Lots of useful list consumers can be written as a foldr:

sum :: [Int] -> Int
sum = foldr (\x y -> x + y) 0

  sum (1 : 2 : 3 : [])
(inline)
  foldr (\x y -> x + y) 0 (1 : 2 : 3 : [])
(foldr replaces cons and nil in the list)
  1 + (2 + (3 + 0))

Lots of useful list consumers can be defined using foldr

Another example:

length :: [a] -> Int
length = foldr (\_ y -> y + 1) 0

  length (1 : 2 : 3 : [])
(inline)
  foldr (\_ y -> y + 1) 0 (1 : 2 : 3 : [])
(foldr replaces cons and nil in the list)
  ((0 + 1) + 1) + 1

Another (more complicated) consumer:

unzip :: [(a, b)] -> ([a], [b])
unzip = foldr (\(a, b) (as, bs) -> (a : as, b : bs)) ([], [])
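These foldr-based consumers can be checked against the Prelude versions (primed names used here to avoid shadowing):

```haskell
-- sum, length and unzip written as folds, exactly as on the slides.
sum' :: [Int] -> Int
sum' = foldr (\x y -> x + y) 0

length' :: [a] -> Int
length' = foldr (\_ y -> y + 1) 0

unzip' :: [(a, b)] -> ([a], [b])
unzip' = foldr (\(a, b) (as, bs) -> (a : as, b : bs)) ([], [])
```

For example, sum' [1, 2, 3] is 6 and unzip' [(1, 'a'), (2, 'b')] is ([1, 2], "ab").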

Writing a build list producer

The build function is apparently trivial:

build g = g (:) []

The real magic is in the type signature:

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]

You might be wondering what that forall means. Don't worry! You've secretly used it before:

id    :: forall a. a -> a
map   :: forall a b. (a -> b) -> [a] -> [b]
foldr :: forall a b. (a -> b -> b) -> b -> [a] -> b
(.)   :: forall a b c. (b -> c) -> (a -> b) -> a -> c

The types written above are just the normal types for those functions, but with the forall quantification written explicitly.

Writing a build list producer

id    :: forall a. a -> a
map   :: forall a b. (a -> b) -> [a] -> [b]
foldr :: forall a b. (a -> b -> b) -> b -> [a] -> b
(.)   :: forall a b c. (b -> c) -> (a -> b) -> a -> c

The funny thing about these function types is that the forall quantification is always at the outermost left of the type. This is known as rank-1 polymorphism. The person calling the function gets to choose what a, b, ... are. For example, in an expression like id 10, the caller has chosen that a should be Int.

Writing a build list producer

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]

In build, the forall quantification is nested within the argument type. This is known as rank-2 polymorphism. The function itself gets to choose what b is. (Types inferred by Hindley-Milner are always rank-1.)

build (\c n -> 1 : c 2 n)     -- Does not typecheck!

1 : c 2 n requires that c returns a [Int]. However, the rank-2 type enforces that all you know is that c returns a value of some type that build chooses (called b), which may or may not be [Int]!

build (\c n -> c 1 (c 2 n))   -- Typechecks (builds a 2-element list)
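In GHC the rank-2 signature needs the RankNTypes extension; a minimal compilable sketch:

```haskell
{-# LANGUAGE RankNTypes #-}

-- build as defined by Gill et al.: the argument must be polymorphic in
-- the result type b, so it cannot mention (:) or [] itself.
build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- The well-typed example from the slide: a two-element list.
twoElems :: [Int]
twoElems = build (\c n -> c 1 (c 2 n))
```

Replacing the body with \c n -> 1 : c 2 n is rejected by the typechecker, exactly as the slide says.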

Some intuition about build

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

If we wrote list producers using (:) and [] all over the place, it would be hard for the compiler to spot and remove them when it wants to stop an intermediate list being constructed. Instead, we λ-abstract our list producer functions over the cons and nil functions for building a list. Now, by simply applying that function to different arguments, we can make the producer do something other than heap-allocate a cons/nil - just change the cons and nil we pass in. E.g. make nil be 0, and then have cons add 1 to its second argument (i.e. add 1 every time the producer tries to output a list cons cell) - this gives us the length function.

The build function takes something abstracted over the cons and nil, and fills in the real (:) and []. (The rank-2 type ensures that our abstracted version hasn't cheated by using (:) and [] directly.)
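The length trick described above can be made concrete (a sketch; producer and its instantiations are hypothetical names, not from the slides):

```haskell
-- A producer abstracted over cons and nil: "the list [10, 20]" with
-- the constructors left as holes.
producer :: (Int -> b -> b) -> b -> b
producer c n = c 10 (c 20 n)

-- Fill in the real constructors: an ordinary list.
asList :: [Int]
asList = producer (:) []

-- Fill in "count" instead: cons adds 1, nil is 0 - the length.
asLength :: Int
asLength = producer (\_ r -> r + 1) 0
```

No list is ever allocated when computing asLength; the producer counts directly.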

Writing a build list producer

enumfromto :: Int -> Int -> [Int]
enumfromto from to = build (go from)
  where go from c n = if from > to then n else c from (go (from + 1) c n)

Proof that this does the right thing:

enumfromto from to = build (go from)
  where go from c n = if from > to then n else c from (go (from + 1) c n)

(inline build)

enumfromto from to = go from (:) []
  where go from c n = if from > to then n else c from (go (from + 1) c n)

(notice that c and n are invariant in the recursion)

enumfromto from to = go from
  where go from = if from > to then [] else from : go (from + 1)

So our enumfromto does do the right thing. The version without build is easier to understand, but our deforestation rewrite will only recognise list producers written with build, so we use the build version of enumfromto.
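The build-style producer compiles and behaves like the Prelude's enumFromTo (a sketch under RankNTypes, with build defined locally so the module is self-contained):

```haskell
{-# LANGUAGE RankNTypes #-}

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- The slide's producer: abstracted over cons (c) and nil (n).
enumfromto :: Int -> Int -> [Int]
enumfromto from to = build (go from)
  where go from c n = if from > to then n else c from (go (from + 1) c n)
```

For example, enumfromto 1 5 is [1, 2, 3, 4, 5], and an empty range gives [].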

The magic foldr/build rule

∀c n g. foldr c n (build g) = g c n

Intuitively:
- Take a g which has been abstracted over the cons and nil functions
- Where the list produced by build g is being immediately consumed by a foldr c n
- Instead of building that intermediate list and then consuming it, just instantiate the list constructors in the producer with the thing doing the consuming

The magic foldr/build rule

∀c n g. foldr c n (build g) = g c n

sum (enumfromto 1 10)

(inline sum and enumfromto)

foldr (\x y -> x + y) 0 (build (go 1))
  where go from c n = if from > 10 then n else c from (go (from + 1) c n)

(apply the foldr/build equation, instantiating c and n in go)

go 1
  where go from = if from > 10 then 0 else from + go (from + 1)

The intermediate list has been totally eliminated.
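The result the rule produces can be checked against the direct list version (illustrative names, not from the slides):

```haskell
-- What fusion leaves behind for sum (enumfromto 1 10): a plain loop.
fused :: Int
fused = go 1
  where go from = if from > 10 then 0 else from + go (from + 1)

-- The unfused original, building the list [1 .. 10] and summing it.
direct :: Int
direct = sum (enumFromTo 1 10)
```

Both evaluate to 55, but fused never allocates a list.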

The foldr/build equation is type correct

∀c n g. foldr c n (build g) = g c n

where g :: forall b. (a -> b -> b) -> b -> b, c :: a -> b' -> b' and n :: b'. On the left, build g :: [a] and the whole foldr has type b'; on the right, g's quantified b is instantiated to b', so both sides have type b'.

The rank-2 polymorphism in the type of build is essential! If it weren't polymorphic, we could only fuse when b = b'. This would break most interesting deforestations: e.g. when deforesting sum (enumfromto 1 10) we need b' = Int, whereas an unfused build would have fixed b = [Int].

Functions which are both producers and consumers are defined using both build and foldr

map :: (a -> b) -> [a] -> [b]
map f xs = build (\c n -> foldr (\x ys -> c (f x) ys) n xs)

(++) :: [a] -> [a] -> [a]
xs ++ ys = build (\c n -> foldr c (foldr c n ys) xs)

It all looks very weird, but it works! The upshot is that you can eliminate superfluous intermediate lists (and hence reduce allocation) for compositions of almost all of the common list functions. The map/map deforestation example shown at the start is a special case of the foldr/build rule.
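These build/foldr definitions really do compute map and (++) (a self-contained sketch, with primed names to avoid clashing with the Prelude):

```haskell
{-# LANGUAGE RankNTypes #-}

build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

-- Consume xs with foldr, produce the result with build.
map' :: (a -> b) -> [a] -> [b]
map' f xs = build (\c n -> foldr (\x ys -> c (f x) ys) n xs)

-- Append: fold xs, using ys (itself folded) as the nil.
append' :: [a] -> [a] -> [a]
append' xs ys = build (\c n -> foldr c (foldr c n ys) xs)
```

For example, map' (+ 1) [1, 2, 3] is [2, 3, 4] and append' [1, 2] [3] is [1, 2, 3].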

Extensions and alternatives

The foldr/build framework can be generalised to data types other than simple lists (but that is not so useful in practice). The main issue with the framework is that zip-like functions (which consume two or more lists) cannot be deforested.

There is a categorically dual framework called unfoldr/destroy (Svenningsson, ICFP 2002) which can deal with such functions. However, it can turn non-terminating programs into terminating ones (i.e. the unfoldr/destroy rule is not actually an equivalence), and it fails to deforest some functions that will deforest with foldr/build (such as filter and concatmap).

Yet another approach is stream fusion (Coutts et al., ICFP 2007), which relies on the equation stream . unstream = id. It can fuse everything that the above approaches can, except for concatmap.

The deforestation landscape

[Diagram: which list functions (concatmap, concat, filter, foldr, sum, map, zip) each of the foldr/build, stream/unstream and unfoldr/destroy frameworks can fuse]

Deforestation by supercompilation

There are many other approaches to deforestation, many of which don't use simple equational rewriting. One such method is supercompilation (Turchin, TOPLAS 1986). A supercompiler is based around an evaluator which is capable of evaluating expressions containing free variables.

Deforestation by supercompilation

h0 xs = map (\x -> x + 1) (map (\x -> x + 2) xs)

(inline outer map)

h0 xs = case map (\x -> x + 2) xs of
  []     -> []
  (y:ys) -> y + 1 : map (\x -> x + 1) ys

(inline inner map)

h0 xs = case (case xs of
          []     -> []
          (z:zs) -> z + 2 : map (\x -> x + 2) zs) of
  []     -> []
  (y:ys) -> y + 1 : map (\x -> x + 1) ys

(case-of-case)

h0 xs = case xs of
  []     -> case [] of
              []     -> []
              (y:ys) -> y + 1 : map (\x -> x + 1) ys
  (z:zs) -> case z + 2 : map (\x -> x + 2) zs of
              []     -> []
              (y:ys) -> y + 1 : map (\x -> x + 1) ys

Deforestation by supercompilation

h0 xs = case xs of
  []     -> case [] of
              []     -> []
              (y:ys) -> y + 1 : map (\x -> x + 1) ys
  (z:zs) -> case z + 2 : map (\x -> x + 2) zs of
              []     -> []
              (y:ys) -> y + 1 : map (\x -> x + 1) ys

(evaluate both inner cases)

h0 xs = case xs of
  []     -> []
  (z:zs) -> z + 2 + 1 : map (\x -> x + 1) (map (\x -> x + 2) zs)

(tie back, since the case branch which recursively calls map is just a renaming of what we started with and thoughtfully called h0)

h0 xs = case xs of
  []     -> []
  (z:zs) -> z + 2 + 1 : h0 zs
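The supercompiled loop can be checked against the original composition (a sanity-check sketch; h0Orig is a hypothetical name for the starting expression):

```haskell
-- The starting point: two composed maps, allocating an intermediate list.
h0Orig :: [Int] -> [Int]
h0Orig xs = map (\x -> x + 1) (map (\x -> x + 2) xs)

-- The tied-back result of the supercompilation derivation: one pass,
-- no intermediate list.
h0 :: [Int] -> [Int]
h0 xs = case xs of
  []     -> []
  (z:zs) -> z + 2 + 1 : h0 zs
```

For example, both map [1, 2, 3] to [4, 5, 6].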

The deforestation landscape

[Diagram: the previous landscape extended with supercompilation, which additionally handles direct recursion and compile-time constants, and covers filter, concatmap, concat, foldr, sum, map and zip]

Supercompilation

Supercompilation is a very powerful transformation which can achieve much more than just deforestation. There is no need to define your library functions in a stylised way (e.g. in foldr/build you must carefully use foldr for everything). It is closely related to partial evaluation. Currently, however, supercompilers are too slow to be practical (they can take on the order of hours to compile some simple examples).

Conclusion

You can write beautiful, declarative functional programs which compile to very fast code! Deforestation is an important optimisation for achieving this, and can be achieved in practice using foldr/build. The fact that we are optimising a pure functional language makes such optimisations reliable and simple to do; removing intermediate data structures from e.g. C programs is much harder (but possible!). All programs should be written in Haskell :-)

Further Reading
- Shrinking lambda expressions in linear time (Appel and Jim, JFP 7:5)
- Call-pattern specialisation for Haskell programs (Peyton Jones, ICFP 2007)
- The worker/wrapper transformation (Gill and Hutton, JFP 19:2)
- The source code to GHC! (http://hackage.haskell.org/trac/ghc/)