Functional Programming Concepts for Data Processing

Similar documents
Quick announcement. Midterm date is Wednesday Oct 24, 11-12pm.

JVM ByteCode Interpreter

Imperative languages

CSc 372. Comparative Programming Languages. 2 : Functional Programming. Department of Computer Science University of Arizona

Introduction to Functional Programming in Haskell 1 / 56

More Lambda Calculus and Intro to Type Systems

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages

Introduction to Functional Programming

Lambda Calculus and Type Inference

More Lambda Calculus and Intro to Type Systems

Functional Programming

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages. Lambda Calculus

Homework. Lecture 7: Parsers & Lambda Calculus. Rewrite Grammar. Problems

Programming with Math and Logic

CSCE 314 Programming Languages Functors, Applicatives, and Monads

Foundations. Yu Zhang. Acknowledgement: modified from Stanford CS242

Haske k ll An introduction to Functional functional programming using Haskell Purely Lazy Example: QuickSort in Java Example: QuickSort in Haskell

Denotational Semantics. Domain Theory

CMSC 330: Organization of Programming Languages. Lambda Calculus

Haskell Overview II (2A) Young Won Lim 8/9/16

An introduction introduction to functional functional programming programming using usin Haskell

11/6/17. Outline. FP Foundations, Scheme. Imperative Languages. Functional Programming. Mathematical Foundations. Mathematical Foundations

Overview. Elements of Programming Languages. Evaluation order. Evaluation order

Haskell: From Basic to Advanced. Part 2 Type Classes, Laziness, IO, Modules

Functional Programming

Harvard School of Engineering and Applied Sciences CS 152: Programming Languages. Lambda calculus

Elixir, functional programming and the Lambda calculus.

Introductory Example

CMSC 330: Organization of Programming Languages. Functional Programming with OCaml

Introduction to Functional Programming and Haskell. Aden Seaman

Exercise 1 ( = 24 points)

CSC312 Principles of Programming Languages : Functional Programming Language. Copyright 2006 The McGraw-Hill Companies, Inc.

9/23/2014. Why study? Lambda calculus. Church Rosser theorem Completeness of Lambda Calculus: Turing Complete

A Brief Introduction to Scheme (I)

Background Type Classes (1B) Young Won Lim 6/28/18

Refactoring to Functional. Hadi Hariri

CS152: Programming Languages. Lecture 11 STLC Extensions and Related Topics. Dan Grossman Spring 2011

Functional Programming. Big Picture. Design of Programming Languages

Functional Languages. Hwansoo Han

CSCE 314 Programming Languages

functional programming in Python, part 2

Introduction to Concepts in Functional Programming. CS16: Introduction to Data Structures & Algorithms Spring 2017

Advanced Topics in Programming Languages Lecture 2 - Introduction to Haskell

Haskell Overview II (2A) Young Won Lim 8/23/16

Exercise 1 ( = 18 points)

CS457/557 Functional Languages

Side Effects (3A) Young Won Lim 1/13/18

COMP3151/9151 Foundations of Concurrency Lecture 8

Types And Categories. Anat Gilboa

Functional Programming and λ Calculus. Amey Karkare Dept of CSE, IIT Kanpur

Fundamentals and lambda calculus

Fall Lecture 3 September 4. Stephen Brookes

Programming Languages and Techniques (CIS120)

Organization of Programming Languages CS3200/5200N. Lecture 11

Writing code that I'm not smart enough to write. A funny thing happened at Lambda Jam

CSCE 314 TAMU Fall CSCE 314: Programming Languages Dr. Flemming Andersen. Haskell Basics

Polymorphic lambda calculus Princ. of Progr. Languages (and Extended ) The University of Birmingham. c Uday Reddy

Lecture #3: Lambda Calculus. Last modified: Sat Mar 25 04:05: CS198: Extra Lecture #3 1

CS 242. Fundamentals. Reading: See last slide

Functional Programming and Haskell

More Lambda Calculus and Intro to Type Systems

Functional Programming in C++

Fundamentals and lambda calculus. Deian Stefan (adopted from my & Edward Yang s CSE242 slides)

3. Functional Programming. Oscar Nierstrasz

The story so far. Elements of Programming Languages. While-programs. Mutable vs. immutable

Classical Themes of Computer Science

Haskell Overview II (2A) Young Won Lim 9/26/16

Introduction to the Lambda Calculus. Chris Lomont

The Haskell HOP: Higher-order Programming

Pure (Untyped) λ-calculus. Andrey Kruglyak, 2010

Lecture 5: Lazy Evaluation and Infinite Data Structures

CS 11 Haskell track: lecture 1

SOFTWARE ARCHITECTURE 6. LISP

Introduction. chapter Functions

Higher Order Functions in Haskell

Introduction 2 Lisp Part III

Introduction to Functional Programming in Racket. CS 550 Programming Languages Jeremy Johnson

Lambda Calculus alpha-renaming, beta reduction, applicative and normal evaluation orders, Church-Rosser theorem, combinators

Monads. COS 441 Slides 16

Control Structures. Christopher Simpkins CS 3693, Fall Chris Simpkins (Georgia Tech) CS 3693 Scala / 1

Functional Programming in Java. CSE 219 Department of Computer Science, Stony Brook University

.consulting.solutions.partnership. Clojure by Example. A practical introduction to Clojure on the JVM

College Functors, Applicatives

Applicative, traversable, foldable

Lecture 4: Higher Order Functions

Functional Programming Principles in Scala. Martin Odersky

Monads in Haskell. Nathanael Schilling. December 12, 2014

Lambda Calculus and Type Inference

CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009

M. Snyder, George Mason University LAMBDA CALCULUS. (untyped)

CITS3211 FUNCTIONAL PROGRAMMING

CSCI.4430/6969 Programming Languages Lecture Notes

CSE : Python Programming

CMSC330. Objects, Functional Programming, and lambda calculus

Monads and all that I. Monads. John Hughes Chalmers University/Quviq AB

Functional Programming. Pure Functional Languages

Processadors de Llenguatge II. Functional Paradigm. Pratt A.7 Robert Harper s SML tutorial (Sec II)

Parsing. Zhenjiang Hu. May 31, June 7, June 14, All Right Reserved. National Institute of Informatics

Applicative, traversable, foldable

Transcription:

Functional Programming Concepts for Data Processing Chris Lindholm UCAR SEA 2018 1 / 70

About Me I work at CU LASP I work primarily with Scala I teach Haskell at work Goal: leverage FP to improve scientific software 2 / 70

The Plan Convince you Functional Programming is worth learning about Introduce Functional Programming Refactor some code to be more Functional 3 / 70

Why Functional Programming? 4 / 70

Why Functional Programming? FP helps us write good software 5 / 70

Side E ects 6 / 70

Side E ects Side effects are difficult to reason about 7 / 70

Side E ects Side effects are difficult to reason about Side effects are difficult to test 8 / 70

Side E ects Side effects are difficult to reason about Side effects are difficult to test Side effects are difficult to scale 9 / 70

Side E ects Side effects are difficult to reason about Side effects are difficult to test Side effects are difficult to scale Side effects are unnecessary 10 / 70

Why Functional Programming? FP helps us write good software FP is naturally suited for expressing data-oriented workflows 11 / 70

Data Processing is Function-y Functions are math's way of processing data. 12 / 70

Data Processing is Function-y Functions are math's way of processing data. We can build pipelines with function composition and application. 13 / 70

Data Processing is Function-y Functions are math's way of processing data. We can build pipelines with function composition and application. a :: A d :: D -- D -- \ f :: A -> B g :: B -> C -- -- E / h :: C -> D -> E -- A -> B -> C 14 / 70

Data Processing is Function-y Functions are math's way of processing data. We can build pipelines with function composition and application. a :: A d :: D -- D -- \ f :: A -> B g :: B -> C -- -- E / h :: C -> D -> E -- A -> B -> C (g. f) :: A -> C -- (.) :: (b -> c) -> (a -> b) -> (a -> c) (h. g. f) :: A -> D -> E (h. g. f) a d :: E 15 / 70

Why Functional Programming? FP helps us write good software FP is naturally suited for expressing data-oriented workflows Learning FP makes you a better programmer 16 / 70

Introduction to Functional Programming 17 / 70

Imperative Programming Express computations using statements that perform side effects. 18 / 70

Imperative Programming Express computations using statements that perform side effects. // Declare a variable to mutate later. int var; // Change control flow depending on p. if (p) { var = 0; // If p is true, execute a statement that mutates var. } else { var = 1; // If p is false, execute a statement that mutates var. } 19 / 70

Imperative Programming Express computations using statements that perform side effects. // Declare a variable to mutate later. int var; // Change control flow depending on p. if (p) { var = 0; // If p is true, execute a statement that mutates var. } else { var = 1; // If p is false, execute a statement that mutates var. } Functional Programming Express computations using expressions that evaluate to values. 20 / 70

Imperative Programming Express computations using statements that perform side effects. // Declare a variable to mutate later. int var; // Change control flow depending on p. if (p) { var = 0; // If p is true, execute a statement that mutates var. } else { var = 1; // If p is false, execute a statement that mutates var. } Functional Programming Express computations using expressions that evaluate to values. -- Declare var to be 0 if p is true, 1 otherwise. var = if p then 0 else 1 21 / 70

Lambda Calculus The theory of computation underlying functional programming. 22 / 70

Lambda Calculus The theory of computation underlying functional programming. Lambda calculus has only Variables Functions Expressions but can express anything that is computable. 23 / 70

(λpq. pqp) (λxy. x) (λab. b) AND TRUE FALSE T RUE F ALSE = F ALSE (λpq. pqp)(λxy. x)(λab. b) β λab. b 24 / 70

(λpq. pqp)(λxy. x)(λab. b) β (λ[p := (λxy. x)]q. pqp)(λab. b) β (λq. (λxy. x)q(λxy. x))(λab. b) β λ[q := (λab. b)]. (λxy. x)q(λxy. x) β (λxy. x)(λab. b)(λxy. x) β (λ[x := (λab. b)]y. x)(λxy. x) β (λy. (λab. b))(λxy. x) β (λ[y := (λxy. x)]. (λab. b)) β λab. b 25 / 70

Pure Functions Pure functions take inputs, produce outputs using only those inputs, and do nothing else. 26 / 70

Pure Functions Pure functions take inputs, produce outputs using only those inputs, and do nothing else. def pure(x): return x + 1 27 / 70

Pure Functions Pure functions take inputs, produce outputs using only those inputs, and do nothing else. def pure(x): return x + 1 def maybe_pure(x): return f(x) + 1 28 / 70

First-class Functions First-class functions are values rather than syntactic constructs. 29 / 70

First-class Functions First-class functions are values rather than syntactic constructs. f = lambda x: x + 1 30 / 70

First-class Functions First-class functions are values rather than syntactic constructs. f = lambda x: x + 1 (+) :: Int -> Int -> Int addone :: Int -> Int addone = (+) 1 31 / 70

Higher-order Functions Higher-order functions operate on other functions. 32 / 70

Higher-order Functions Higher-order functions operate on other functions. (.) :: (b -> c) -> (a -> b) -> (a -> c) 33 / 70

Referential Transparency A referentially transparent expression is one that can be replaced with the result of its evaluation and not change the meaning of the program. 34 / 70

Referential Transparency A referentially transparent expression is one that can be replaced with the result of its evaluation and not change the meaning of the program. Expressions that are not referentially transparent have side effects. 35 / 70

Are these the same? def program1(): # expr is the expression we're testing expr = 1 return (expr, expr) def program2(): return (1, 1) 36 / 70

Are these the same? def program1(): # expr is the expression we're testing expr = 1 return (expr, expr) def program2(): return (1, 1) >>> program1() (1,1) >>> program2() (1,1) 37 / 70

Are these the same? def program1(): expr = datetime.now() return (expr, expr) def program2(): return (datetime.now(), datetime.now()) 38 / 70

Are these the same? def program1(): expr = datetime.now() return (expr, expr) def program2(): return (datetime.now(), datetime.now()) >>> program1() (datetime.datetime(2018, 4, 1, 17, 16, 17, 489959), datetime.datetime(2018, 4, 1, 17, 16, 17, 489959)) >>> program2() (datetime.datetime(2018, 4, 1, 17, 16, 19, 304800), datetime.datetime(2018, 4, 1, 17, 16, 19, 304815)) 39 / 70

Are these the same? def my_length(x): res = 0 for i in range(0, len(x)): res += 1 return res def program1(): expr = my_length([1,2,3]) return (expr, expr) def program2(): return (my_length([1,2,3]), my_length([1,2,3])) 40 / 70

Are these the same? def my_length(x): res = 0 for i in range(0, len(x)): res += 1 return res def program1(): expr = my_length([1,2,3]) return (expr, expr) def program2(): return (my_length([1,2,3]), my_length([1,2,3])) >>> program1() (3, 3) >>> program2() (3, 3) 41 / 70

Equational Reasoning Reasoning about code through substitution rather than execution. 42 / 70

Equational Reasoning Reasoning about code through substitution rather than execution. -- Apply a function to every element of a list. map :: (a -> b) -> [a] -> [b] map _ [] = [] map f x:xs = f x : map f xs 43 / 70

Equational Reasoning Reasoning about code through substitution rather than execution. -- Apply a function to every element of a list. map :: (a -> b) -> [a] -> [b] map _ [] = [] map f x:xs = f x : map f xs map (+ 1) [1, 2, 3] 44 / 70

Equational Reasoning Reasoning about code through substitution rather than execution. -- Apply a function to every element of a list. map :: (a -> b) -> [a] -> [b] map _ [] = [] map f x:xs = f x : map f xs map (+ 1) [1, 2, 3] map (+ 1) [1, 2, 3] = (1 + 1) : map (+ 1) [2, 3] = (1 + 1) : (2 + 1) : map (+ 1) [3] = (1 + 1) : (2 + 1) : (3 + 1) : map (+ 1) [] = (1 + 1) : (2 + 1) : (3 + 1) : [] = 2 : 3 : 4 : [] = [2, 3, 4] 45 / 70

Recap 46 / 70

Recap Functional programs are expressions. 47 / 70

Recap Functional programs are expressions. We run them by evaluating the expressions. 48 / 70

Recap Functional programs are expressions. We run them by evaluating the expressions. We try to keep functions and expressions free from side effects. 49 / 70

Recap Functional programs are expressions. We run them by evaluating the expressions. We try to keep functions and expressions free from side effects. We manipulate functions like any other values. 50 / 70

Recap Functional programs are expressions. We run them by evaluating the expressions. We try to keep functions and expressions free from side effects. We manipulate functions like any other values. These things enable us to apply mathematical reasoning to our programs. 51 / 70

Recap Functional programs are expressions. We run them by evaluating the expressions. We try to keep functions and expressions free from side effects. We manipulate functions like any other values. These things enable us to apply mathematical reasoning to our programs. There's a lot more to functional programming. 52 / 70

Refactoring to be More Functional 53 / 70

Refactoring to be More Functional def process(): # Read input data. data = read_data("input") # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, "output") 54 / 70

Refactoring to be More Functional def process(): # Read input data. data = read_data("input") # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, "output") Testing this is hard. 55 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, output_file) Make testing easier by taking arguments. 56 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. data = [g(x) for x in data] data = [f(x) for x in data] # Write our output. write_data(data, output_file) 57 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = [f(x) for x in data] output_g = [g(x) for x in output_f] # Write our output. write_data(output_g, output_file) Keep data immutable. 58 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = process_f(data) output_g = process_g(output_f) # Write our output. write_data(output_g, output_file) def process_f(data): return [f(x) for x in data] def process_g(data): return [g(x) for x in data] Separate pure algorithms from side-effecting infrastructure. 59 / 70

def process(): # Read input data. data = read_data("input") # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, "output") def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = process_f(data) output_g = process_g(output_f) # Write our output. write_data(output_g, output_file) def process_f(data): return [f(x) for x in data] def process_g(data): return [g(x) for x in data] 60 / 70

def process(): # Read input data. data = read_data("input") # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, "output") def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = process_f(data) output_g = process_g(output_f) # Write our output. write_data(output_g, output_file) def process_f(data): return [f(x) for x in data] def process_g(data): return [g(x) for x in data] 61 / 70

def process(): # Read input data. data = read_data("input") # Run our processing algorithm. data = [f(x) for x in data] data = [g(x) for x in data] # Write our output. write_data(data, "output") def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = process_f(data) output_g = process_g(output_f) # Write our output. write_data(output_g, output_file) def process_f(data): return [f(x) for x in data] def process_g(data): return [g(x) for x in data] 62 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output_f = map(f, data) output_g = map(g, output_f) # Write our output. write_data(output_g, output_file) #def process_f(data): # return [f(x) for x in data] #def process_g(data): # return [g(x) for x in data] Build larger programs from smaller ones. 63 / 70

Refactoring to be More Functional def process(input_file, output_file): # Read input data. data = read_data(input_file) # Run our processing algorithm. output = map(compose(g, f), data) # Write our output. write_data(output, output_file) def compose(g, f): return lambda x: g(f(x)) Steal from math. -- Functor composition law map g. map f = map (g. f) 64 / 70

Refactoring Example in Haskell main :: IO () main = let inputfile = "input" outputfile = "output" in pipeline inputfile outputfile pipeline :: FilePath -> FilePath -> IO () pipeline inputfile outputfile = readdata inputfile <&> map (g. f) >>= writedata outputfile -- (<&>) :: Functor f => f a -> (a -> b) -> f b -- (>>=) :: Monad m => m a -> (a -> m b) -> m b 65 / 70

Guidelines for Writing Functional Code 66 / 70

Guidelines for Writing Functional Code Write functions that take inputs and return outputs 67 / 70

Guidelines for Writing Functional Code Write functions that take inputs and return outputs Keep data immutable 68 / 70

Guidelines for Writing Functional Code Write functions that take inputs and return outputs Keep data immutable Reach for side effects last 69 / 70

Guidelines for Writing Functional Code Write functions that take inputs and return outputs Keep data immutable Reach for side effects last Limit side effects to the outer layers of your program 70 / 70