High-performance defunctionalization in Futhark

1 High-performance defunctionalization in Futhark
Anders Kiel Hovgaard, Troels Henriksen, Martin Elsman
Department of Computer Science, University of Copenhagen (DIKU)
Trends in Functional Programming, 2018

2 Motivation
Massively parallel processors, like GPUs, are common but difficult to program.
Functional programming can make it easier to program GPUs:
  Referential transparency.
  Expressing data-parallelism.
Problem: Higher-order functions cannot be directly implemented on GPUs.
Can we do higher-order functional GPU programming anyway?

3 Motivation
Higher-order functions on GPUs? Yes!
Using moderate type restrictions, we can eliminate all higher-order functions at compile-time.
Gain many benefits of higher-order functions without any run-time performance overhead.

4 Reynolds's defunctionalization

5 Defunctionalization (Reynolds, 1972)
John Reynolds: Definitional interpreters for higher-order programming languages, ACM Annual Conference 1972.
Basic idea:
Replace each function abstraction by a tagged data value that captures the free variables:
  λx : int. x + y  ⇒  LamN y
Replace application by case dispatch over these functions:
  f a = case f of
          Lam1 ... → ...
          Lam2 ... → ...
          LamN y → a + y
          ...
Drawback: the case dispatch causes branch divergence on GPUs.
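The basic idea on this slide can be sketched in Python. This is only an illustration under assumptions: the tags `AddY` and `Double` and the helper names `apply` and `twice` are hypothetical and do not come from the slides. Each lambda of an imagined source program becomes a tagged record holding its free variables, and every application goes through one dispatch function.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical tags, one per lambda in the imagined source program.
@dataclass(frozen=True)
class AddY:
    """Stands for  λx. x + y  and captures the free variable y."""
    y: int


@dataclass(frozen=True)
class Double:
    """Stands for  λx. x * 2, which has no free variables."""


Closure = Union[AddY, Double]


def apply(f: Closure, a: int) -> int:
    """Case dispatch over every closure tag in the program."""
    if isinstance(f, AddY):
        return a + f.y
    if isinstance(f, Double):
        return a * 2
    raise TypeError(f"unknown closure tag: {f!r}")


def twice(f: Closure, a: int) -> int:
    """First-order replacement for the higher-order  twice f a = f (f a)."""
    return apply(f, apply(f, a))
```

The branch inside `apply` is the part that is problematic on a GPU: threads holding different tags take different branches.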

6 Language and type restrictions

7 Futhark
A purely functional, data-parallel array language with an optimizing compiler that generates GPU code via OpenCL.
Parallelism expressed through built-in higher-order functions, called second-order array combinators (SOACs):
  map, reduce, scan, ...
No recursion, but sequential loop constructs:
  loop pat = init for x in arr do body
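For readers unfamiliar with these constructs, here is a rough sequential reading in Python. It is a sketch of the meaning only, not of Futhark's parallel execution; the helper `loop` is a hypothetical name, and the standard-library functions stand in for the SOACs.

```python
from functools import reduce
from itertools import accumulate
import operator

xs = [1, 2, 3, 4]

# map: apply a function to every element.
mapped = list(map(lambda x: x * 2, xs))

# reduce: combine all elements with an associative operator and a neutral element.
total = reduce(operator.add, xs, 0)

# scan: inclusive prefix combination with the same kind of operator.
prefix = list(accumulate(xs, operator.add))


def loop(init, arr, body):
    """Sequential reading of  loop pat = init for x in arr do body."""
    acc = init
    for x in arr:
        acc = body(acc, x)
    return acc
```

For example, `loop(0, xs, lambda acc, x: acc + x)` computes the same sum as `reduce` above, but with an explicitly sequential meaning.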

8 Type-based restrictions on functions
To permit efficient defunctionalization, we introduce type-based restrictions on the use of functions.
Statically determine the form of every applied function.
Transformation is simple and eliminates all higher-order functions.
Instead of allowing unrestricted functions and relying on subsequent analysis, we entirely avoid such analysis.

9 Type-based restrictions on functions
Conditionals may not produce functions:
  let f = if b1 then ...
          if bn then λx → e_n
          else ... λx → e_k
  in ... f y
Which function f is applied?
If our goal is to eliminate higher-order functions without introducing branching, we must restrict conditionals from returning functions.
Require that branches have order zero type.

10 Type-based restrictions on functions
Arrays may not contain functions:
  let fs = [λy → y+a, λz → z+b, ...]
  in ... fs[n] 5
Which function fs[n] is applied?
Also need to restrict map to not create array of functions:
  map (λx → λy → ...) xs

11 Type-based restrictions on functions
Loops may not produce functions:
  loop f = (λz → z+1) for x in xs do (λz → x + f z)
The shape of f depends on the number of iterations of the loop.
Require that loop has order zero type.
All other typing rules are standard and do not restrict functions.
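The three restrictions all come down to an "order zero" test on a type. A minimal sketch of such a test follows, assuming the usual definition of type order (base types have order 0, and an arrow has order one more than its parameter or equal to its result, whichever is larger). The slides do not spell this definition out, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    """A base type."""


@dataclass(frozen=True)
class Arrow:
    param: object
    result: object


@dataclass(frozen=True)
class Record:
    fields: tuple  # tuple of (label, type) pairs


@dataclass(frozen=True)
class Array:
    elem: object


def order(t) -> int:
    """Order of a type under the assumed standard definition."""
    if isinstance(t, Int):
        return 0
    if isinstance(t, Arrow):
        return max(order(t.param) + 1, order(t.result))
    if isinstance(t, Record):
        return max((order(ft) for _, ft in t.fields), default=0)
    if isinstance(t, Array):
        return order(t.elem)
    raise TypeError(f"unknown type: {t!r}")


def is_order_zero(t) -> bool:
    """The check applied to conditional branches, array elements and loops."""
    return order(t) == 0
```

Under this reading, `Array(Arrow(Int(), Int()))` is rejected as an array type, while a record of integers is accepted as the type of a conditional.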

12 Defunctionalization

13 Defunctionalization
Type restrictions enable us to track functions precisely.
Control-flow is restricted so every applied function is known and every application can be specialized.

14 Defunctionalization
Defunctionalization in a nutshell:
  let a = 1
  let b = 2
  let f = λx → x+a
  in f b
is translated to
  let a = 1
  let b = 2
  let f = {a=a}
  in f′ f b
Create lifted function:
  let f′ env x =
    let a = env.a
    in x+a
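The same before-and-after can be written in Python as a rough sketch. The names `original_program`, `translated_program` and `f_lifted` are hypothetical, and a dict stands in for the closure record.

```python
def original_program() -> int:
    """Higher-order version: f is a real function value."""
    a = 1
    b = 2
    f = lambda x: x + a
    return f(b)


def f_lifted(env: dict, x: int) -> int:
    """Lifted function: free variables are read from the environment record."""
    a = env["a"]
    return x + a


def translated_program() -> int:
    """First-order version: f is only a record of captured variables."""
    a = 1
    b = 2
    f = {"a": a}
    return f_lifted(f, b)
```

Because the function being applied is known statically, the call goes directly to `f_lifted` and no tag dispatch is needed.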

15 Defunctionalization
Static values:
  sv ::= Dyn τ
       | Lam x e_0 E
       | Rcd {(l_i ↦ sv_i)^{i ∈ 1..n}}
Static approximation of the value of an expression.
Precisely capture the closures produced by an expression.
Translation environment E maps variables to static values.
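One way to picture static values is as a small data type. The sketch below is an assumption-laden illustration: the field names and the helper `residual_type` are hypothetical, lambda bodies are kept as plain text, and each `Lam` environment is assumed to list only the captured variables. The helper computes the record type that a closure is represented by after translation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dyn:
    tau: str  # type of an ordinary run-time value, e.g. "int"


@dataclass(frozen=True)
class Lam:
    param: str
    body: str   # body kept as source text in this sketch
    env: tuple  # tuple of (name, static value) pairs


@dataclass(frozen=True)
class Rcd:
    fields: tuple  # tuple of (label, static value) pairs


def residual_type(sv) -> str:
    """Type of the first-order value that represents sv after translation."""
    if isinstance(sv, Dyn):
        return sv.tau
    if isinstance(sv, Lam):
        inner = ", ".join(f"{n}: {residual_type(v)}" for n, v in sv.env)
        return "{" + inner + "}"
    if isinstance(sv, Rcd):
        inner = ", ".join(f"{l}: {residual_type(v)}" for l, v in sv.fields)
        return "{" + inner + "}"
    raise TypeError(f"unknown static value: {sv!r}")
```

With `g = Lam("y", "y + a", (("a", Dyn("int")),))`, the call `residual_type(g)` gives `{a: int}`, and wrapping `g` in the environment of another `Lam` gives `{g: {a: int}}`.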

16 Defunctionalization
Source program:
  let twice (g: int → int) = λx → g (g x)
  in twice (λy → y+a)

The definition of twice becomes an empty closure record, with a static value that records the function:
  let twice = {}
  twice ↦ Lam g (λx → g (g x)) [ ]

The argument of the application is translated to a closure record and a static value:
  λy → y + a  ↦  {a = a}, Lam y (y + a) [a ↦ Dyn int]

The application becomes a call to the lifted function twice′, where g is bound to the static value of the argument:
  in twice′ twice {a = a}
  let twice′ (env: {}) (g: {a: int}) = λx → g (g x)

The body of twice′ is itself a function, so it is translated in turn:
  λx → g (g x)  ↦  {g = g}, Lam x (g (g x)) [g ↦ Lam y (y + a) (...)]
  let twice′ (env: {}) (g: {a: int}) = {g = g}

The call to twice′ then simplifies to:
  in {g = {a = a}}

The resulting closure f has the static value:
  f ↦ Lam x (g (g x)) [g ↦ Lam y (y + a) (a ↦ Dyn int)]

Applying f to 1 becomes a call to the lifted function f′:
  in f′ f 1
  let f′ (env: {g: {a: int}}) (x: int) =
    let g = env.g
    in g (g x)

Inside f′, g has the static value
  g ↦ Lam y (y + a) [a ↦ Dyn int]
so both applications of g become calls to the lifted function g′:
  let f′ (env: {g: {a: int}}) (x: int) =
    let g = env.g
    in g′ g (g′ g x)
  let g′ (env: {a: int}) (y: int) =
    let a = env.a
    in y+a
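The end result of this walkthrough can be checked with a small Python sketch. All names are hypothetical, dicts stand in for closure records, and the value of `a` is passed in as a parameter because the slide leaves it free.

```python
def g_lifted(env: dict, y: int) -> int:
    """Lifted  λy → y+a : reads a from its environment record."""
    a = env["a"]
    return y + a


def f_lifted(env: dict, x: int) -> int:
    """Lifted  λx → g (g x) : g is a closure record, applied via g_lifted."""
    g = env["g"]
    return g_lifted(g, g_lifted(g, x))


def twice_lifted(env: dict, g: dict) -> dict:
    """Lifted twice: returns the closure record of its inner lambda."""
    return {"g": g}


def run_first_order(a: int) -> int:
    twice = {}                         # closure record of twice (no free variables)
    f = twice_lifted(twice, {"a": a})  # evaluates to {"g": {"a": a}}
    return f_lifted(f, 1)


def run_higher_order(a: int) -> int:
    twice = lambda g: lambda x: g(g(x))
    return twice(lambda y: y + a)(1)
```

Both versions compute 1 + a + a. The first-order version contains no function values and no dispatch on tags.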

17 Correctness

18 Correctness
Defunctionalization has been proven correct:
  Defunctionalization terminates and yields a consistently typed residual expression. For order 0, the type is unchanged. Proof using a logical relations argument.
  Meaning is preserved.
More details in the paper.

19 Implementation

20 Implementation
Compiler pipeline:
  Futhark program
    → Type checking → Typed Futhark program
    → Static interpretation → Module-free program
    → Monomorphization → Module-free, monomorphic
    → Defunctionalization → Module-free, monomorphic, first-order
    → Internalizer → Compiler IR
    → Compiler back end

21 Implementation
Polymorphism and defunctionalization
What if type a is instantiated with a function type?
  let ite 'a (b: bool) (x: a) (y: a) : a =
    if b then x else y
Distinguish lifted type variables:
  'a   regular type variable
  '^a  lifted type variable

22 Evaluation

23 Evaluation
Does defunctionalization yield efficient programs?
Rewrite benchmark programs to use higher-order functions:
  Most SOACs converted to higher-order library functions.
  Higher-order utility functions: function composition, application, flip, curry, etc.
  Segmented operations and sorting functions in library use higher-order functions instead of parametric modules.

24 Evaluation
Immediately after adding defunctionalization. Using higher-order SOACs, utilities etc.
[Figure: bar chart of speedup per benchmark. All reported speedups lie between 0.990 and 1.020.]
Benchmarks and run-times: FFT 12.06ms, Pagerank 12.85ms, CFD 2260.70ms, K-means 350.47ms, MRI-Q 16.13ms, Stencil 132.56ms, TPACF 3534.01ms.
Run-time performance is unaffected.
Relies on the optimizations performed by the compiler.

25 Functional images
Represent images as functions:
  type image a = point → a
  type filter a = image a → image a
Due to Conal Elliott. Implemented in the Haskell EDSL Pan.
The entire Pan library has been translated to Futhark.
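The two type definitions can be illustrated with a short Python sketch. The functions `checker` and `translate` are illustrative examples written for this note and are not taken from the Pan library or from the Futhark translation.

```python
import math
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
Point = Tuple[float, float]
Image = Callable[[Point], A]        # type image a = point → a
Filter = Callable[[Image], Image]   # type filter a = image a → image a


def checker(p: Point) -> bool:
    """A boolean image: a unit checkerboard over the plane."""
    x, y = p
    return (math.floor(x) + math.floor(y)) % 2 == 0


def translate(dx: float, dy: float) -> Filter:
    """A filter that shifts an image by (dx, dy)."""
    def apply_filter(img: Image) -> Image:
        return lambda p: img((p[0] - dx, p[1] - dy))
    return apply_filter


# A filter takes an image (a function) and returns a new image (a function).
shifted = translate(1.0, 0.0)(checker)
```

Filters are functions over functions, which is why a library of this style exercises higher-order features heavily.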

26 Function-type conditionals

27 Support for function-type conditionals
  let r = if b then {f = λx → x+1, a = 1}
               else {f = λx → x+n, a = 2}
  in r.f r.a
Introduce new form of static value: Or sv_1 sv_2
Static value representation of r:
  Rcd {f ↦ Or (Lam x (x + 1) [ ]) (Lam x (x + n) [n ↦ Dyn int]),
       a ↦ Dyn int}
Straightforward translation is ill-typed:
  if b then {f = {}, a = 1}
       else {f = {n=n}, a = 2}
Even worse with nested conditionals.
Binary sum types to complement Or static value: τ_1 + τ_2
Translated program:
  let r = if b then {f = inl {}, a = 1}
               else {f = inr {n=n}, a = 2}
  in let x = r.a
     in case r.f of
          inl e → x+1
          inr e → let n = e.n in x+n
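A Python sketch of the translated program follows. The classes `Inl` and `Inr` and the function names are hypothetical stand-ins for the binary sum type, and dicts stand in for records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inl:
    value: dict  # closure record of  λx → x+1  (no free variables)


@dataclass(frozen=True)
class Inr:
    value: dict  # closure record of  λx → x+n  (captures n)


def translated(b: bool, n: int) -> int:
    """First-order version: the branch taken is remembered as a sum-type tag."""
    if b:
        r = {"f": Inl({}), "a": 1}
    else:
        r = {"f": Inr({"n": n}), "a": 2}
    x = r["a"]
    f = r["f"]
    if isinstance(f, Inl):
        return x + 1
    e = f.value
    return x + e["n"]


def higher_order(b: bool, n: int) -> int:
    """Original program, with real function values in the record."""
    r = {"f": lambda x: x + 1, "a": 1} if b else {"f": lambda x: x + n, "a": 2}
    return r["f"](r["a"])
```

Both branches of the conditional now have the same type, at the cost of a case dispatch at the application site.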

28 Conclusion
General and practical approach to implementing higher-order functions in high-performance functional languages for GPUs.
Proof of correctness.
Implementation in Futhark.
No performance overhead, but gain many of the benefits.
Questions, comments?