Skeletons and the Anatomy of Monads


Chuan-kai Lin
Department of Computer Science, Portland State University

Abstract

Monads are used heavily in Haskell for supporting computational effects, and the language offers excellent support for defining monadic computations. Unfortunately, defining a monad remains a difficult challenge. There are no libraries that a programmer can use to define a monad that is not a composition of existing monad transformers; every such effort must therefore start from scratch, even though all monads share the same structure and need to satisfy the same minimum set of properties. I propose a monadic programming framework called skeletons to simplify the task of defining monads and monad transformers. Skeletons provide a modular term representation that can be used for all monadic computations, and observer functions that implement the basic semantics of the monad operators and guarantee their compliance with the monad laws. Using the skeleton framework frees programmers from the semantic boilerplate in defining monads so that they can focus on the more important and more interesting task of implementing computational effects.

Categories and Subject Descriptors: D.1.1 [Programming Techniques]: Applicative (Functional) Programming; D.2.13 [Software Engineering]: Reusable Software; D.3.2 [Programming Languages]: Language Classifications - Applicative (functional) languages; D.3.3 [Programming Languages]: Language Constructs and Features - Frameworks; F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs - Functional constructs

General Terms: Design, Languages, Algorithms

Keywords: Haskell, monads, monad transformers

1. Introduction

Monads are a mathematical abstraction that can be used to model a wide variety of computational effects. They were initially used by Moggi to structure denotational semantics [18, 19] and were later introduced into Haskell to support input/output operations [9]. In Haskell, monadic computations are first-class values which can be stored in data structures, and programmers can define their own monads to implement computational effects including state, exceptions, nondeterminism, resumptions, and continuations. These qualities made Simon Peyton Jones conclude that Haskell is "the world's finest imperative programming language" [8].

While monads with multiple effects can be built using monad transformers, composition of existing monad transformers is not always sufficient to satisfy all application-specific needs. Unfortunately, defining a monad is an undertaking not for the faint of heart. After formulating the desired computational effects and how they should be made available through nonstandard morphisms, the programmer must come up with a monad datatype that captures the computational effects, define monad operators on the datatype, implement the nonstandard morphisms of the monad, and verify that the operators satisfy a set of algebraic properties called the monad laws. If the computational effects need to be made available on top of another monad, the programmer must instead define monad transformers, which are even more complicated due to the need to interleave computations in the underlying monad with the effects provided by the monad transformer.

I propose a monadic programming framework called skeletons that simplifies the task of defining monads. Skeletons eliminate the need to define monad datatypes and consolidate the semantics of a monad into an evaluation function.
Term rewriting on monadic computations guarantees compliance with the monad laws without the programmer doing any real work. Finally, the observer functions reduce the amount of semantic boilerplate involved in defining monads and monad transformers, and their use highlights the deep connections between these two constructs. The implementation of skeletons uses rank-2 types [11], generalized algebraic datatypes [12], and lexically-scoped type variables [10]; therefore it works only in recent versions of GHC. I believe that skeletons will make it easier not only to define monads, but to define monads correctly.

The specific technical contributions of this paper are:

- A modular term representation that can capture computations in any monad (Section 3)
- An observation function that implements the basic behavior of a skeleton monad and ensures that the monad complies with the monad laws (Section 4)
- An extension of the skeleton framework that supports monad transformers (Section 6)
- A procedure to convert a skeleton implementation of a monad to support the corresponding monad transformer (Section 6.4)
- An application of skeletons to meta-programming on monadic computations (Section 7)

This paper also contains the following examples:

- An implementation of the continuation monad that is not based on continuation passing (Section 4.3)
- An implementation of Claessen's parallel parsing processes [1] constructed from an algorithm description instead of through formal derivations (Section 5)

- An implementation of the list monad transformer that does not require a commutative underlying monad (Section 6.5)
- Two functions that implement transparent tracing of monad and monad transformer computations (Section 7.2)
- A sandboxing function that denies all file open attempts in an IO computation (Section 7.3)

Section 2 presents background information about monads, Section 8 discusses related work, and finally Section 9 concludes the paper.

2. Background

In Haskell, a monad consists of a type constructor M :: * -> * and two monad operators with the following types:

  return :: a -> M a
  (>>=)  :: M a -> (a -> M b) -> M b

A value of type M a is a monadic computation in the monad M that produces a result of type a. The return operator turns a value into a computation that does nothing except return the value. The binary operator >>= (read "bind") runs the computation in its first argument, applies the continuation in its second argument to the result, and runs the computation returned by the continuation. In Haskell both operators are declared in the Monad type class, and the programmer should overload them when defining a monad. (The Monad type class also includes a fail function, but I will not consider it here because it is not part of the mathematical formulation of a monad.)

Figure 1 shows a definition of the list monad in Haskell. The list monad models nondeterministic computations, which can produce multiple results. The monad type constructor of the list monad is the list; the return operator returns a single-element list, and the >>= operator feeds each result of the first computation to the continuation and flattens the resulting nested list.

  instance Monad [] where
    return x     = [x]
    []     >>= _ = []
    (x:xs) >>= f = f x ++ (xs >>= f)

  instance MonadPlus [] where
    mzero     = []
    mplus u v = u ++ v

Figure 1. The list monad in Haskell. The Monad instance declaration overloads the monad operators, and the MonadPlus instance declaration defines the nonstandard morphisms.

The functions in the MonadPlus type class are called nonstandard morphisms; they are the interface through which programmers access the computational effects implemented in the monad. Here mzero, which indicates failure, produces no results, and mplus, which introduces nondeterminism, merges the results of two computations.

In addition to choosing a type constructor and defining monad operators and nonstandard morphisms, the programmer must also ensure that the definitions satisfy the monad laws. The monad laws are a set of algebraic properties a monad must obey; they ensure that a monad works the way programmers expect (for example, that a return computation does not cause any computational effects). Here are the monad laws:

  return v >>= k   =  k v
  m >>= return     =  m
  (m >>= k) >>= g  =  m >>= (\v -> k v >>= g)

The first two laws, also known as the left unit and the right unit laws, ensure that the return operator does not cause any computational effects.

  newtype State s a = State { runState :: s -> (a, s) }

  instance Monad (State s) where
    return a = State (\s -> (a, s))
    m >>= k  = State composite
      where composite s = let (a, s1) = runState m s
                          in  runState (k a) s1

  instance MonadState s (State s) where
    get   = State (\s -> (s, s))
    put s = State (\_ -> ((), s))

Figure 2. The state monad in Haskell. The type constructor of the monad is State, which defines a state transformer. The >>= operator works by passing the state s1 produced by m to the continuation k. The get morphism returns the current state, and the put morphism sets the state.
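As a small usage sketch of the Figure 2 definitions (an illustration of mine, not from the paper; the name tick is an assumption):

  tick :: State Int Int
  tick = do { n <- get; put (n + 1); return n }

  -- runState tick 0  ==  (0, 1)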
The last law, also known as the associativity law, ensures that >>= is associative. The lambda wrapping on the right-hand side is necessary because the second argument of >>= must be a continuation. Since the operators in the Monad type class can be overloaded to arbitrary functions with appropriate types, the programmer bears the sole responsibility of checking the definitions and making sure that the laws are upheld.

Sometimes no existing datatype supports the desired computational effect, and in that case the programmer must define a new type constructor as the basis of a monad. Coming up with the right datatype for a computational effect is not always easy. For example, what datatype should be used to model stateful computations? A standard technique is to use a state transformer, which is a function that maps the current state to the result of the computation and the new state. Figure 2 shows a definition of the state monad. Such a solution is not obvious to functional programmers who are not already familiar with the technique, and the difficulty of designing monad type constructors increases with the complexity of the computational effects.

Hughes proposed deriving a monad from its term representation using the formal semantics of the monad [5], and the technique was later used by Hinze [4] and Claessen [1] in other application domains. However, successful derivation requires a deep understanding of formal semantics and advanced functional programming techniques, so the technique is out of reach for most functional programmers.

Skeletons aim to strike a middle ground between the lack of guidance in Haskell monadic programming and the complete rigor of Hughes's derivation technique. By packaging the results of certain derivation steps into an observer function that can be reused, the skeleton framework allows programmers to enjoy some benefits of the derivation approach without doing much work. At the same time, the skeleton framework places few restrictions on the definitions of evaluation functions, and therefore allows programmers to follow whatever design approach is most appropriate.

3. Skeleton term representations

The basic principle of the skeleton framework is to represent the structure of a monadic computation as algebraic data and then define functions that assign semantics to the captured computations. The framework consists of a modular term representation that can be parameterized to encode computations in different monads, and evaluation functions that evaluate skeleton computations to produce results. Even though this indirection adds a certain execution overhead, the separation between syntax and semantics exposes the commonality of all monads and enables such commonality to be captured and presented in a reusable form.

In this section I describe a skeleton datatype that represents monadic computations.

The skeleton datatype Ske in Figure 3 represents the common structure of all monads. Each data constructor of Ske represents a way to define a computation in the skeleton monad: Unit corresponds to the return operator, Bind corresponds to the >>= operator, and Call corresponds to nonstandard morphisms of the monad. In the figure I declared Ske as an instance of the Monad type class; however, a language that does not support ad hoc polymorphism can still take advantage of skeletons by using Bind and Unit as the monad operators in place of the overloaded >>= and return.

  data Ske r a
    = Unit a
    | forall b. Bind (Ske r b) (b -> Ske r a)
    | Call (r (Ske r) a)

  instance Monad (Ske r) where
    return = Unit
    (>>=)  = Bind

Figure 3. The skeleton term representation. The Ske datatype is used by all skeleton monads, and each of its data constructors represents a way to define a computation.

Since nonstandard morphisms differ from one monad to another, the Call constructor represents a nonstandard morphism through the parameterized type constructor r. The parameterization separates the monad-specific parts from the shared skeleton infrastructure and allows the latter to be reused when defining a new monad. The type constructor r used for nonstandard morphisms in the Ske datatype has kind (* -> *) -> * -> *, and each of its data constructors should represent a nonstandard morphism. The first argument of r (of kind * -> *) is the type of the skeleton monad for defining embedded computations (see the Call data constructor in Ske), and its second argument (of kind *) is the type of the result produced by the nonstandard morphism. If r represents a monad with a nonstandard morphism that produces results of a specific type, then it must use generalized algebraic datatypes (GADTs) [12] to specify the type of the result.

Figure 4 shows the skeleton representations of the state and the list nonstandard monad morphisms. The Plus constructor in PlusI uses the parameter m to specify that it contains two embedded computations. The StateI datatype takes an additional type parameter s as the type of the state and uses a GADT to specify that Get produces a result of type s and Put produces a trivial result (an empty tuple).

  data PlusI m a = Zero | Plus (m a) (m a)

  data StateI s (m :: * -> *) a where
    Get :: StateI s m s
    Put :: s -> StateI s m ()

Figure 4. Nonstandard morphism datatypes of the state and the list monads. Each data constructor in these types represents a nonstandard morphism. StateI is defined as a generalized algebraic datatype because both get and put return results of specific types.

Putting the Ske datatype and a nonstandard morphism datatype together produces a skeleton monad type with which we can define computations. For example, by defining

  get   = Call Get
  put s = Call (Put s)

we can define computations in Ske (StateI s) a the same way we define computations in an instance of the MonadState type class. Note that unlike in the derivation technique by Hughes, the skeleton framework does not make use of simplified term representations. There is no need to perform any derivations or simplifications on the term representation, and all the processing is left to the evaluation functions.

4. Skeleton evaluation functions

The skeleton term representation by itself is obviously inadequate as a monadic programming framework because it defines only a representation of a computation.
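For instance (an illustration of mine, not from the paper; the name tickSke is an assumption, and on current GHCs the Ske type would also need the Functor and Applicative instances that the paper predates), the get and put helpers above let us write a familiar-looking stateful program, but the result is nothing more than a data structure built from Unit, Bind, and Call:

  tickSke :: Ske (StateI Int) Int
  tickSke = do { n <- get; put (n + 1); return n }
  -- the same term as: Bind (Call Get) (\n -> Bind (Call (Put (n + 1))) (\_ -> Unit n))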
Technically the representation is not even a monad because it does not observe the monad laws. To actually make the computation happen, the representation must be evaluated by a function that defines a suitable semantics for the monad. In this section I describe the design of skeleton evaluation functions.

Following the formulation of skeleton term representations, an evaluation function is also separated into shared and monad-specific parts. The shared part, which I call the observer following the convention established by Hinze [4], evaluates the Ske datatype and ensures compliance with the monad laws. The monad-specific part, which I call a nonstandard morphism interpreter, defines the semantics of nonstandard morphisms and implements the computational effects of the monad. This separation promotes reuse and reduces the effort required to define a monad.

4.1 Observer and monad law enforcement

By specifying the algebraic properties that a monad must satisfy, the monad laws implicitly define a core semantics that acts as the basis of every monad. In the ideal world, the semantics would be implemented once and for all as a library function or a language primitive. Programmers would use such facilities when they define monads, and everyone would live happily ever after without thinking about monad laws again. Unfortunately, we do not live in the ideal world. Among other things, in Haskell the definition of a monad is heavily influenced by the choice of the monad type, so there is no way to implement the core semantics in a truly reusable form, and compliance with the monad laws can be reached only with careful definitions of the monad operators. As a result, programmers must spend time and energy implementing semantic boilerplate that has nothing to do with the computational effects of the monad, only to have it become another source of bugs that they need to guard against. This situation clearly leaves a lot to be desired.

Since all skeleton monads share the same basic term representation, presenting the core semantics in a reusable form is no longer an issue. That leaves us with the problem of implementing the core semantics by construction so that programmers no longer bear the burden of ensuring compliance with monad laws. The solution I adopted was motivated by the following observation: it is okay to violate the monad laws as long as there is no evidence that can be used to prove the violation. In the context of monad laws, incriminating evidence would be a pair of computations that should be equal but produce different results. If we represent all computations in an equivalence class (as defined by the monad laws) with a single normalized computation, we can eliminate all evidence of violation and guarantee compliance with the monad laws without restricting the definitions of the monad operators. I implemented the computation normalization procedure in an observer function using term rewriting [2] with the following rules, derived from the monad laws:

  return v >>= k     ⟶   k v

  m >>= return       ⟵   m
  (m >>= k) >>= g    ⟶   m >>= (\v -> k v >>= g)

In all three rules the direction of rewrite is forced because there is no way to perform pattern matching on functions. Some readers may find the second rule alarming because it introduces rather than eliminates a return operator, and also because it is applicable to all monadic computations. Repeated application of this rule alone would guarantee nontermination, and therefore we must be careful to apply it only under specific circumstances. I will get back to this issue when we discuss the implementation of the observer function.

Figure 5 shows the observe_monad skeleton observer function. The function accepts a skeleton computation as input, rewrites the computation with the appropriate rules, and invokes the nonstandard morphism interpreter (described in the next subsection) when there are no appropriate rewrite rules.

  data Step r m a b = Stop a | Run (r m b) (b -> m a)

  type Interp r a v = forall b. Step r (Ske r) a b -> v

  observe_monad :: Interp r a v -> Ske r a -> v
  observe_monad interpret = eval
    where eval (Unit v)            = interpret (Stop v)
          eval (Bind (Call e) k)   = interpret (Run e k)
          eval (Call e)            = interpret (Run e Unit)
          eval (Bind (Unit v) k)   = eval (k v)
          eval (Bind (Bind m s) k) = eval (Bind m cont)
            where cont v = Bind (s v) k

Figure 5. The skeleton observer function and auxiliary declarations. The Step datatype specifies how the observer function passes computations to a nonstandard morphism interpreter. The first two cases in eval invoke the interpreter directly; the remaining ones use term rewriting to transform the input computation.

Among the five pattern cases defined in the eval function, the first two do not correspond to any rewrite rule, so I invoke the nonstandard morphism interpreter. I rewrite the last two cases using the left unit and the associativity rules. That leaves the third case, a single nonstandard morphism, which I rewrite with the right unit rule, inlining the invocation from the second case. The fact that the right unit rule is used only in this specific context indicates that the monad laws are not totally independent of each other. For example, the right unit law on the left-hand side of the >>= operator can be proved as follows:

  (m >>= return) >>= k
    = m >>= (\v -> return v >>= k)    (associativity)
    = m >>= (\v -> k v)               (left unit)
    = m >>= k                         (eta equivalence)

The rewriting in the observer function serves two purposes. First, it ensures compliance with the monad laws regardless of how the computational effects in the skeleton monad are implemented. Second, it relieves programmers of the responsibility to implement much of the behavior of the monad operators, so they can concentrate on the computational effects instead. In the next subsection I will describe how that is done through nonstandard morphism interpreters.

4.2 Nonstandard morphism interpreters

Generic rewriting does not work for the first two cases in the eval function because their semantics vary from one monad to another. The single-return case determines how a monadic computation presents its result upon normal termination, and the nonstandard-morphism-bind case determines the semantics of those morphisms and how they interact with the computational effects implemented by the monad.
For these reasons the observer function does not handle these two cases directly but delegates their interpretation to the nonstandard morphism interpreter.

A skeleton nonstandard morphism interpreter has type Interp r a v (defined in Figure 5), where r is the datatype for nonstandard morphisms, a is the result type of the monadic computation, and v is the result type of evaluating the computation (usually instantiated as a type parameterized by a). The observer function packages a computation in the first two cases as a value of Step (where Stop represents a single return, and Run represents a nonstandard morphism on the left-hand side of a >>= computation) and passes the Step value to the nonstandard morphism interpreter. The interpreter, which defines the computational effects supported by the monad, performs a case analysis on the Step value and produces results based on the semantics of the monad and the morphisms.

Figure 6 shows the evaluation functions for the list and state monads. Both functions evaluate a monadic computation by invoking observe_monad with a nonstandard morphism interpreter named intp and the computation as its arguments.

  run_list :: Ske PlusI a -> [a]
  run_list = observe_monad intp
    where intp (Stop a)           = [a]
          intp (Run Zero _)       = []
          intp (Run (Plus u v) k) = let us = run_list (Bind u k)
                                        vs = run_list (Bind v k)
                                    in  us ++ vs

  run_state :: Ske (StateI s) a -> s -> (a, s)
  run_state m s = observe_monad (intp s) m
    where intp :: s -> Interp (StateI s) a (a, s)
          intp s (Stop v)        = (v, s)
          intp s (Run Get k)     = run_state (k s) s
          intp _ (Run (Put s) k) = run_state (k ()) s

Figure 6. Skeleton evaluation functions of the list and the state monads. Each evaluation function invokes the observer function with a locally-defined nonstandard morphism interpreter that implements the desired computational effects.

The intp function in run_list is a straightforward transcription of the list monad semantics: return returns a single-element list, Zero returns an empty list, and Plus distributes over >>= into list concatenation. Note that in the Plus case intp invokes run_list recursively to evaluate the expanded computations; this is a standard technique that will be used extensively in the rest of the paper.

The nonstandard morphism interpreter for the state monad is a little more complicated because the result of a get computation depends on the state set by the previous put. The solution used in the run_state function is to introduce a parameter s that represents the current state. When evaluating a nonstandard morphism, intp recursively invokes run_state using the continuation k applied to the result of the nonstandard morphism (the state s for Get and an empty tuple for Put) as the new computation, and a suitable value s as the new state.
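As a quick sanity check (again my own sketch, not the paper's; pick is an assumed helper name, and tickSke is the stateful term sketched in the previous section), the two evaluation functions can be applied to small terms:

  pick :: Ske PlusI Int
  pick = Call (Plus (Unit 1) (Unit 2))

  -- run_list (pick >>= \x -> return (x * 10))  ==  [10, 20]
  -- run_state tickSke 0                        ==  (0, 1)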

4.3 Example: continuation monad with skeletons

The skeleton-based definitions of the list and the state monads are both quite straightforward because all constructors in a nonstandard morphism datatype correspond to nonstandard morphisms of the monad. However, some monads require augmenting the nonstandard morphism datatype with auxiliary constructors for use in the nonstandard morphism interpreter. I illustrate one such case by defining the continuation monad in the skeleton framework.

The continuation monad supports a single nonstandard morphism, callcc, which captures the current continuation and passes it into its embedded computation as an argument. The callcc morphism has the following type:

  callcc :: ((a -> Cont r b) -> Cont r a) -> Cont r a

Here Cont r is the type of the continuation monad, and r is the type of the result of the continuation. The callcc morphism runs the specified computation with the current continuation as its argument. Applying a captured continuation replaces the then-current continuation and changes the control flow of the computation.

To implement the continuation monad with skeletons, the nonstandard morphism interpreter must be able to capture the current continuation and apply it at a later time. Capturing a continuation is quite easy because the continuation is explicit in each >>= statement, but there is no easy way to package the captured continuation so that it can be applied later. One way to solve the problem is to present all computations in continuation-passing style, but such a solution would make the code quite difficult to understand. A much simpler solution is to use an auxiliary constructor called Apply to package the captured continuation. When the nonstandard morphism interpreter encounters an Apply computation, it abandons the continuation from the second argument of >>= and runs the computation attached to Apply instead.

Figure 7 shows the implementation of the continuation monad. The CallCC case packages the current continuation k from >>= into a function cont that produces an Apply computation, and the Apply case gives up the current continuation and evaluates the attached computation instead.

  data ContI r m a
    = forall b. CallCC ((a -> m b) -> m a)
    | Apply (m r)

  run_cont :: Ske (ContI a) a -> a
  run_cont = observe_monad intp
    where intp (Stop v)           = v
          intp (Run (Apply m) _)  = run_cont m
          intp (Run (CallCC c) k) = run_cont (Bind (c cont) k)
            where cont a = Call (Apply (k a))

Figure 7. The continuation monad in skeletons. The CallCC data constructor represents the callcc nonstandard morphism. The Apply data constructor is used internally by the nonstandard morphism interpreter and does not correspond to a nonstandard morphism of the monad.

Since the implementation does not use continuation passing, I believe that it is easier to understand than the conventional definition.
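A usage sketch (mine, not the paper's): wrapping CallCC in a helper, in the same style as get and put, gives the usual escape behavior. The callcc wrapper and the escape example are assumptions of this sketch:

  callcc :: ((a -> Ske (ContI r) b) -> Ske (ContI r) a) -> Ske (ContI r) a
  callcc f = Call (CallCC f)

  escape :: Int
  escape = run_cont (do { n <- callcc (\exit -> do { _ <- exit 42; return 0 }); return (n + 1) })
  -- escape == 43: applying exit abandons the "return 0" and resumes
  -- the captured continuation (which adds 1) with the value 42.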
4.4 The essence of evaluation functions

The evaluation function of a skeleton monad plays the same role as the monad operators and nonstandard morphisms of a typical Haskell monad, and indeed the two definitions of the same monad do resemble each other to some degree. Are these two formulations really different? And if they are different, how can we characterize their differences?

The standard Haskell approach to defining monads is based on the notion of denotations. Following the principles of denotational semantics [20], each construct in a Haskell monad must be denoted by a value or a function in Haskell. For example, in the state monad (Figure 2) a computation is denoted by a state transformer in the State value, and the monad operators are functions that produce State values. Such a programming model is useful if a denotation of the monad is readily available. If not, the programmer would essentially have to come up with a denotational semantics by defining the monad type constructor (the denotation of computations) and declaring an instance of the Monad type class (the denotations of the monad operators). Coming up with denotations is hard, especially for monads that support complicated computational effects.

In contrast, the skeleton framework is based on the notion of operations. Like the typical formulation of operational semantics, a skeleton evaluation function specifies the steps involved in running a monadic computation. The operational perspective simplifies the sharing of basic behavior between different monads, and it allows programmers to think in terms of what a computation does instead of what a computation is. These features make skeletons especially attractive for defining monads whose denotations are unknown or otherwise difficult to come by.

There is, however, some contention between these perspectives. The term rewriting in the observer function allows the programmer to define the associative variant of a non-associative operator with relative ease, and such an operator can be difficult to characterize in a denotational manner. For example, an associative integer subtraction operator defined through right-associative rewriting would satisfy equations such as

  (7 - 3) - 2  =  7 - (3 - 2)  =  6,

even though ordinary subtraction gives (7 - 3) - 2 = 2. The associative subtraction operator is not a function on integers anymore because its behavior is context dependent. As a result, any denotation of the operator must take some form of context into consideration. The correspondence between operational and denotational semantics is a difficult problem, and whether there is a standard way to derive denotations from a skeleton evaluation function is left as future work.

The contention described in the previous paragraph should not be taken as a criticism of skeletons. Some monadic constructs, such as the list monad transformer, are just difficult to characterize in a denotational manner. The skeleton framework does not, and is not meant to, change that one way or the other. What it does do is allow those monads to be defined operationally, so that practical programmers can satisfy their needs while theoreticians ponder the issue of denotations. I will get back to the list monad transformer in Section 6.5, after presenting an implementation of parallel parsing processes in the next section.

5. Parallel parsing processes

Even though the short examples in the previous section help to demonstrate some of the benefits of skeletons, it is only through a complicated example that one can get a deeper appreciation of them. In this section I will define parallel parsing processes in skeletons and compare the result with the original implementation by Claessen.

Parallel parsing processes [1] is a monad proposed by Claessen that eliminates the space leak exhibited by backtracking monadic parser combinator libraries [6]. When there are multiple parses for a sequence of symbols, a backtracking parser employs a depth-first search strategy that tries all parses in turn. When the monad pursues a successful parse, all alternative parses must be kept in memory until the current parse completes successfully, thus creating a space leak. Parallel parsing processes solve the problem by pursuing all possible parses concurrently so that the unsuccessful parses can be eliminated quickly.
To develop the parallel parsing monad, Claessen started with a naïve term representation and applied a series of intricate program transformations to combine terms and to enforce desired algebraic properties. (For example, one section in the paper was devoted to

maintaining the associativity of the >>= operator.) In spite of the complicated reasoning employed in Claessen's derivation, parallel parsing processes are based on only two ideas:

1. Classify a parse computation by its first nonstandard morphism (if there is one), and then
2. Save the results of completed parses, discard failed parses, and continue processing the parses that require additional symbols.

I now show how to define parallel parsing processes with skeletons by building on these two ideas.

5.1 Defining the parser nonstandard morphisms

Figure 8 shows the nonstandard morphism datatype ParserI for a parser skeleton monad. A parser computation maps a sequence of symbols of type s to a set of parses of type a. The monad provides three nonstandard morphisms: Symbol returns the next symbol in the sequence, Fail indicates a parse failure, and Choice poses two alternative parses for the same symbol sequence.

  data ParserI s m a where
    Symbol :: ParserI s m s
    Fail   :: ParserI s m a
    Choice :: m a -> m a -> ParserI s m a

  type Parser s = Ske (ParserI s)

Figure 8. The nonstandard morphism datatype of a parser monad. The definition of nonstandard morphisms in the ParserI datatype follows the formulation by Claessen [1].

The ParserI type defines a standard programming interface for monadic parser combinators; it is not in any way tailored toward parallel parsing processes. Whether parsing is conducted depth-first or in parallel depends solely on the skeleton evaluation function.

5.2 Classifying parse computations

Figure 9 shows the classify function, which classifies a list of parse computations by status. The purpose of the classification is to plug space leaks by identifying failed parses early without fully evaluating the successful parses. The classify function implements this functionality by defining a nonstandard morphism interpreter that performs a case analysis on the input Step value. Completed parses are collected in the done variable, parses that need more input symbols are collected in the more variable without further evaluation, and failed parses are discarded directly. If a parse starts with a Choice constructor, the interpreter expands the embedded parses and classifies them using the same rules.

  classify :: forall a s. [a] -> [s -> Parser s a] -> [Parser s a]
           -> ([a], [s -> Parser s a])
  classify done more []     = (done, more)
  classify done more (p:ps) = observe_monad intp p
    where intp :: Interp (ParserI s) a ([a], [s -> Parser s a])
          intp (Stop v)             = classify (v:done) more ps
          intp (Run Symbol k)       = classify done (k:more) ps
          intp (Run Fail _)         = classify done more ps
          intp (Run (Choice u v) k) = classify done more parsers
            where parsers = Bind u k : Bind v k : ps

Figure 9. The classification function of parallel parsing processes. The classify function contains a nonstandard morphism interpreter that classifies each parser computation based on its first nonstandard morphism (or the lack thereof).

Here we see the power of capturing a monadic computation as algebraic data that is then processed by a function. Coming up with a type constructor that supports the computational effect of parallel parsing, in the same way we model stateful computation with state transformers, would undoubtedly be difficult.
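Before moving on to evaluation, here is a small parser written against the ParserI interface (an illustration of mine, not from the paper; item, orElse, char, and abOrAc are assumed helper names):

  item :: Parser s s
  item = Call Symbol

  orElse :: Parser s a -> Parser s a -> Parser s a
  orElse u v = Call (Choice u v)

  char :: Char -> Parser Char Char
  char c = do { s <- item; if s == c then return s else Call Fail }

  abOrAc :: Parser Char String
  abOrAc = orElse (do { x <- char 'a'; y <- char 'b'; return [x, y] })
                  (do { x <- char 'a'; y <- char 'c'; return [x, y] })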
5.3 Evaluating classified parse computations

Figure 10 shows the evaluation function parallel_parse, which pursues all possible parses concurrently. The function takes a list of parse computations (parsers) and a list of symbols (symbols) as input, and for each successful parse it returns a pair containing the parse result and the remaining symbols that were not consumed by the parse.

  parallel_parse :: [Parser s a] -> [s] -> [(a, [s])]
  parallel_parse []      _       = []
  parallel_parse parsers symbols =
    let (done, more)   = classify [] [] parsers
        results        = zip done (repeat symbols)
        further []     = []
        further (s:sx) = parallel_parse (feed s) sx
        feed s         = map ($ s) more
    in  results ++ further symbols

Figure 10. The evaluation function of parallel parsing processes. The parallel_parse function collects the results of completed parses and feeds one additional symbol to the computations that are still in progress.

The parallel_parse function is defined recursively; one symbol is consumed in each invocation until all parses have terminated or all symbols have been consumed. The results variable contains the results of the successful parses that terminate in this invocation, and the further function evaluates the parses that are still in progress by supplying one additional symbol with feed and then invoking parallel_parse again.

5.4 Comparison with Claessen's implementation

The skeleton-based implementation of parallel parsing processes presented in this section is significantly simpler than the original implementation by Claessen [1]. While Claessen's implementation is a little shorter, coming up with the SymbolBind and ReturnPlus constructors and applying context passing [5] to enforce the right-associativity of >>= require ingenious thinking, careful derivation, and an intimate knowledge of advanced functional programming techniques. In contrast, the skeleton-based implementation consists of two recursive functions on lists, a technique already well understood by most functional programmers.

The modular designs of both the skeleton term representation and the evaluation function greatly reduce the boilerplate required to define a monad. As a result, programmers can concentrate on the strategies for parallel parsing with little distraction. The rewriting in the observation function not only ensures that the implementation complies with the monad laws, but also flattens a parse computation into a sequence of nonstandard morphisms, which greatly simplifies the definition of the classification function.

I have translated the grammar of the Oberon programming language [21] into a parser and benchmarked the two parallel parser monads using sample code from Project Oberon [22]. Compared to Claessen's implementation, the skeleton-based implementation consumes roughly the same amount of memory but runs at only 60%-80% of the speed.
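Continuing the small example from Section 5.2 (again my own sketch, not the paper's; it assumes the abOrAc parser defined there), parallel_parse explores both alternatives in lockstep:

  testA, testB :: [(String, String)]
  testA = parallel_parse [abOrAc] "ac"    -- [("ac", "")]
  testB = parallel_parse [abOrAc] "abz"   -- [("ab", "z")]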

6. Skeleton monad transformers

A monad transformer is a mechanism for adding new computational effects on top of an existing monad [16]. For example, a state monad transformer adds support for stateful computation, and a list monad transformer adds support for nondeterminism. In Haskell a monad transformer is a monad type constructor parameterized by the constructor of the underlying monad. In addition to the standard monad operators and nonstandard morphisms, a monad transformer must also provide a lift operator that lifts a computation in the underlying monad into the transformed monad. Without the lift operator, it would be impossible to use the computational effects of the underlying monad in the transformed monad. In this section I extend the skeleton framework to support monad transformers.

6.1 Monad transformers in Haskell

Figure 11 shows a state monad transformer in Haskell. I grayed out the parts that remain unchanged from Figure 2 so that the changes stand out.

  newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

  instance Monad m => Monad (StateT s m) where
    return a = StateT (\s -> return (a, s))
    m >>= k  = StateT composite
      where composite s = do (a, s1) <- runStateT m s
                             runStateT (k a) s1

  instance MonadTrans (StateT s) where
    lift m = StateT lifted
      where lifted s = do a <- m
                          return (a, s)

  instance Monad m => MonadState s (StateT s m) where
    get   = StateT (\s -> return (s, s))
    put s = StateT (\_ -> return ((), s))

Figure 11. The state monad transformer in Haskell. The grayed-out code segments are identical to those in the state monad in Figure 2. The state monad transformer requires a new monad type constructor StateT, a new operator lift in the MonadTrans type class, and modifications to the monad operators and nonstandard morphisms to accommodate the underlying monad.

Except for the renaming of type and data constructors, the most significant changes are:

1. The StateT type is parameterized by the underlying monad m. One of the challenges of implementing a monad transformer is to figure out how the underlying monad fits into the type of the transformed monad. A state monad transformer computation maps a state to a computation in the underlying monad. A list monad transformer computation is a computation in the underlying monad whose result is a list. The continuation monad transformer is like the continuation monad, but the result of the continuation is a computation in the underlying monad. There does not appear to be a general rule that covers all these cases.

2. The definitions of monad operators and nonstandard morphisms need to incorporate the underlying monad as specified in the monad transformer type. Basically, a computation that returns a value now needs to return an underlying computation through return, and multiple embedded computations in a nonstandard morphism need to be sequenced with >>= in the underlying monad (as we see in the definition of >>= here).

3. The monad transformer provides a lift operator that turns a computation of type m a in the underlying monad into a computation of type StateT s m a in the transformed monad. Here the lift operator packages the underlying computation by binding it with a return statement that propagates the current state.

Finally, we need to check that the transformed monad satisfies the monad laws, a step I omit here. Due to the intricate issues programmers must face when designing the monad transformer type and the lift operator, as well as the lack of understanding of the general principles of monad transformers, defining a monad transformer remains a challenging task.
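As a usage sketch (not from the paper; the same code also works with the StateT shipped in the standard Control.Monad.State library), lift is what lets a transformed computation reach the effects of the underlying monad, here IO:

  demo :: IO (Int, Int)
  demo = runStateT (do { lift (putStrLn "tick"); put 5; get }) 0
  -- prints "tick" and returns (5, 5)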
6.2 The principles of monad transformers

There have been several prior works on the principles of monad transformers. Liang and associates proposed the following monad transformer laws [16]:

  lift . return_u   =  return_t
  lift (m >>=_u k)  =  lift m >>=_t (lift . k)

The suffix of each monad operator denotes whether it belongs to the transformed monad (t) or to the underlying monad (u). The laws provide little guidance for the design of monad transformers because they say nothing about important issues such as the integration of the underlying monad into the monad transformer.

Later, Hinze characterized a monad transformer as a type constructor τ with a pair of functions promote and observe [4]:

  promote :: Monad m => m a -> τ m a
  observe :: Monad m => τ m a -> m a

The promote function is the lift operator in Haskell, and the observe function maps a computation in the transformed monad back to the underlying monad by implementing the computational effects introduced by the monad transformer. The promote function must satisfy the monad transformer laws, and in addition the two functions must also satisfy the following laws:

  observe . return_t           =  return_u
  observe (promote m >>=_t k)  =  m >>=_u (observe . k)

The last law is particularly useful for designing monad transformers because it points out that a monad transformer needs to process only the nonstandard morphisms defined by the transformer and can pass lifted computations along as part of the output.

Even though Hinze's characterization provides us with a good understanding of how monad transformers work, it does not generalize beyond the backtracking monad transformer in Haskell that Hinze studied in his work. The problem lies in the return type of the observe function. Let us consider the state monad transformer as an example. Since a stateful computation is modeled as a state transformer, observe should have the following type:

  observe :: Monad m => StateT s m a -> s -> m (a, s)

Or we can consider a list monad transformer ListT with a semantics similar to the list monad:

  observe :: Monad m => ListT m a -> m [a]

Neither of these types is an instance of the general type of observe listed previously. Hinze did not run into this problem because his backtracking monad returns only one result and the Monad type class in Haskell happens to include a fail function. However, if we wish to use this characterization as a basis for supporting monad transformers in skeletons, the formulation of observe must be generalized to cover these cases as well. I will show how this can be done in the next subsection.

6.3 Term representation for monad transformers

Figure 12 shows an extended version of the skeleton term representation that supports monad transformers. The definitions here replace those in Figure 3, and I grayed out the parts that remain unchanged to highlight the changes. The SkeT datatype includes a Lift data constructor to represent the lift operator and is declared as a trivial instance of MonadTrans. The Ske datatype becomes a special case of SkeT in which the underlying monad is instantiated with the empty type Xm to prevent the use of the Lift constructor.

  data SkeT r m a
    = Unit a
    | Lift (m a)
    | Call (r (SkeT r m) a)
    | forall b. Bind (SkeT r m b) (b -> SkeT r m a)

  data Xm a

  type Ske r = SkeT r Xm

  instance Monad (SkeT r m) where
    return = Unit
    (>>=)  = Bind

  instance MonadTrans (SkeT r) where
    lift = Lift

Figure 12. The skeleton term representation for monad transformers. The definitions here replace the ones in Figure 3 in a backward-compatible manner. The most notable changes are the introduction of the underlying monad type m and the new data constructor Lift, which represents the lift operator.

Figure 13 replaces Figure 5 with an observation function that supports monad transformers. The InterpT type specifies that the result of invoking a transformer nonstandard morphism interpreter should be a computation in the underlying monad, by using m v as the last argument of Interpreter. The result type v of the returned computation is, however, unrestricted, which generalizes the type of observe given by Hinze and gives programmers the freedom to present the result of transformer computations as they see fit.

  type Interpreter r m a v = forall b. Step r (SkeT r m) a b -> v
  type Interp  r   a v = Interpreter r Xm a v
  type InterpT r m a v = Interpreter r m a (m v)

  observe_core :: (forall b. m b -> (b -> v) -> v)
               -> Interpreter r m a v -> SkeT r m a -> v
  observe_core bind_under interpret = eval
    where eval (Unit v)            = interpret (Stop v)
          eval (Bind (Call e) k)   = interpret (Run e k)
          eval (Call e)            = interpret (Run e Unit)
          eval (Lift m)            = m `bind_under` (eval . Unit)
          eval (Bind (Lift m) k)   = m `bind_under` (eval . k)
          eval (Bind (Unit v) k)   = eval (k v)
          eval (Bind (Bind m s) k) = eval (Bind m cont)
            where cont v = Bind (s v) k

  observe_monad :: Interp r a v -> Ske r a -> v
  observe_monad = observe_core undefined

  observe_trans :: Monad m => InterpT r m a v -> SkeT r m a -> m v
  observe_trans = observe_core (>>=)

Figure 13. The generic skeleton observer function and auxiliary declarations for both monads and monad transformers. The definitions here replace the ones in Figure 5 in a backward-compatible manner. Even though there are changes all around, the eval function remains mostly the same, with the exception of two new cases introduced to handle computations in the underlying monad.
Programmers should not invoke observe core directly; instead either observe monad (for skeleton monads) or observe trans (for skeleton monad transformers) should be used. Note that in addition to the nonstandard morphism interpreter, the observe core function is also parameterized by the = operator in the underlying monad. This parameterization is necessary to make observe core backwards compatible with observe monad. Substituting = for bind under requires the result of observe core be a computation in the underlying monad, which would not work because Ske uses the empty type Xm as the underlying monad. The bind under parameter allows observe monad and observe trans to share the same code base and therefore reduces duplication in the framework. 6.4 Monad transformers using skeletons Figure 14 shows an evaluation function of state monad transformer adopted from Figure 6. In the skeleton framework, the simple change of replacing Ske with SkeT introduces support for the lift operator and turns the term representation into an instance of MonadTrans, so there is no need to define a new type constructor like StateT in Figure 11. The SkeT term representations works only with the observe trans observer function, and likewise the type signature of intp needs to use InterpT instead of Interp. These changes in turn requires run state t to produce a computation in the underlying monad as its result, so we change the result type from (a,s) to m (a,s) as well. Now the only thing left to be done is to make sure that run state t does produce computation in the underlying monad as its result. Of the three cases in intp, the first can be adopted by using the return operator, and the others do not require any change because they are both recursive invocations of run state t. While this final step requires some thought, it is not really all that difficult. The walk-through in the previous paragraph suggests a semimechanical procedure for converting a skeleton monad evaluation function into evaluating monad transformers. 1. Replace Ske in type signatures with SkeT and specify that the underlying monad type m is an instance of the Monad class. 2. Add the underlying monad type constructor m to the result type of the evaluation function. 3. Replace Interp in the nonstandard morphism interpreter type signature with InterpT. 4. Replace observe monad with observe trans. 5. Make the nonstandard morphism interpreter produce results as computations in the underlying monad. This change is usually Submitted to ICFP /4/13

Though I have not done any in-depth analysis of this procedure, it does appear to be applicable to a wide variety of skeleton monads, including the continuation monad in Section 4.3 and the parallel parsing monad developed in Section 5. By providing a common infrastructure for both monads and monad transformers, the skeleton framework conceals their superficial differences and exposes the underlying connections between these two constructs.

6.5 List monad transformer using skeletons

I conclude this section by demonstrating how to define a list monad transformer using skeletons. The question of how to define a list monad transformer is of particular interest to functional programmers because the straightforward definition, such as the one used in the Haskell Hierarchical Libraries, violates the associativity monad law when the underlying monad is not commutative [7]. Consider the following example:

  t1 = (mplus a b >> c) >> d
  t2 = mplus a b >> (c >> d)

Here a, b, c, d are lifted IO computations that print out the corresponding characters. Evaluating t1 prints abccdd while evaluating t2 prints abcdcd, which clearly violates the associativity law.

Figure 15 shows an evaluation function for the list monad transformer, adapted from Figure 6. The adaptation follows the procedure outlined in the previous subsection. The first two cases in intp use return to turn pure values into computations in the underlying monad, and the final case sequences the embedded computations in the underlying monad and produces the result with another return.

  run_list_t :: Monad m => SkeT PlusI m a -> m [a]
  run_list_t = observe_trans intp
    where intp (Stop a)           = return [a]
          intp (Run Zero _)       = return []
          intp (Run (Plus u v) k) = do us <- run_list_t (Bind u k)
                                       vs <- run_list_t (Bind v k)
                                       return (us ++ vs)

Figure 15. The skeleton evaluation function of the list monad transformer. The grayed-out code segments are identical to those in the list monad evaluation function in Figure 6. Unlike the standard Haskell implementation, this straightforward definition of the list monad transformer does not depend on a commutative underlying monad.

Due to the associativity rewrite in observe_trans, this definition of the list monad transformer satisfies all monad laws without additional work on the programmer's part, and both t1 and t2 print acdbcd when they are evaluated.

7. Meta-programming with skeletons

A monadic computation in Haskell is typically a black box due to the use of functions to define monad operators and nonstandard morphisms. While a program can run the computation and obtain some kind of result, nothing else can be done with the computation. Such a limitation does not exist in the skeleton framework: since the monad operators and nonstandard morphisms are data constructors, the programmer can open up a monadic computation and see what happens inside. In this section I will show how to use this capability to support meta-programming on skeleton computations.
7.1 Observer function for skeleton translators

Meta-programming in the skeleton framework is implemented through skeleton monad translators. A skeleton monad translator is like an evaluation function, but instead of evaluating the computation, the translator generates another computation as its result. Using a translator, a programmer can take a skeleton computation apart, make some changes, and put everything back together.

Figure 16 shows the observer function for skeleton translators. The InterpX type indicates that a nonstandard morphism interpreter of a skeleton translator should produce a SkeT value as its result. The observer function observe_xlate is based on observe_core like the others, but it handles a computation in the underlying monad by lifting it back into the transformed monad so that it can be bound back to its continuation. To define a skeleton translator, the programmer defines a nonstandard morphism interpreter that translates each nonstandard morphism and combines it with observe_xlate.

  type SkeX    p q m u v = SkeT p m u -> SkeT q m v
  type InterpX p q m u v = Interpreter p m u (SkeT q m v)

  observe_xlate :: InterpX p q m u v -> SkeX p q m u v
  observe_xlate = observe_core (Bind . Lift)

Figure 16. The skeleton observer function for monad translators. The definitions here reference types and a function defined previously in Figures 12 and 13.

Note that the translation applies to the nonstandard morphisms executed in a computation rather than to the definition of the computation itself. If a single nonstandard morphism written in the computation is executed multiple times (for example, due to recursion), it will be translated many times, with possibly different results. This behavior is unavoidable because the translator works on a runtime data value instead of the Haskell source code. Runtime translation incurs a heavier performance penalty, but in some sense it is also more general because it allows the translation process to make use of runtime information. In the remainder of this section I will provide two examples of skeleton meta-programming: tracing and sandboxing skeleton computations.

7.2 Tracing skeleton computations

Direct observation is a powerful technique for understanding the behavior of a system. When a monadic computation behaves in an unexpected manner, the first step of debugging is to find out what exactly is going on under the hood. While some monads support output, the programmer still needs to insert trace code into the computation to perform the output operations. Manual insertion of trace code works for simple cases, but it is better to have a program do it while the computation executes so that there is no need to modify the source code of the computation.

Figure 17 shows the trace_monad function, which produces a self-tracing skeleton computation. The traced computation uses the logger function to maintain a summary s of its execution. When the computation runs a nonstandard morphism, it invokes the logger to update the summary based on the executed morphism and its result. When the computation reaches the end, it returns a pair consisting of the result and the final execution summary. The trace_monad function is designed with flexibility in mind: by using appropriate logger functions, the programmer can produce the complete execution history, count how many times a nonstandard morphism is executed, or compute the distribution of the arguments of nonstandard morphisms.

Even though trace_monad may look reasonable, there are still cases where it fails to work as expected. The problem lies in the decision to pass the execution summary as part of the result of


More information

Functional Programming Language Haskell

Functional Programming Language Haskell Functional Programming Language Haskell Mohammed Aslam CIS 24 Prof. Kopec Presentation: 03 Date: 05/05/2003 Haskell is a general purpose, purely functional programming language named after the logician

More information

Background Type Classes (1B) Young Won Lim 6/28/18

Background Type Classes (1B) Young Won Lim 6/28/18 Background Type Classes (1B) Copyright (c) 2016-2017 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2

More information

Fusion-powered EDSLs

Fusion-powered EDSLs Fusion-powered EDSLs Philippa Cowderoy flippa@flippac.org Fusion-powered EDSLs p. Outline What is fusion? Anatomy of an EDSL Shallow vs deep embedding Examples: Identity & State monads Self-analysing EDSL

More information

Imperative Functional Programming

Imperative Functional Programming Imperative Functional Programming Uday S. Reddy Department of Computer Science The University of Illinois at Urbana-Champaign Urbana, Illinois 61801 reddy@cs.uiuc.edu Our intuitive idea of a function is

More information

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements

Programming Languages Third Edition. Chapter 9 Control I Expressions and Statements Programming Languages Third Edition Chapter 9 Control I Expressions and Statements Objectives Understand expressions Understand conditional statements and guards Understand loops and variation on WHILE

More information

The Arrow Calculus (Functional Pearl)

The Arrow Calculus (Functional Pearl) The Arrow Calculus (Functional Pearl) Sam Lindley Philip Wadler Jeremy Yallop University of Edinburgh Abstract We introduce the arrow calculus, a metalanguage for manipulating Hughes s arrows with close

More information

CS422 - Programming Language Design

CS422 - Programming Language Design 1 CS422 - Programming Language Design Elements of Functional Programming Grigore Roşu Department of Computer Science University of Illinois at Urbana-Champaign 2 The two languages that we defined so far

More information

CS 242. Fundamentals. Reading: See last slide

CS 242. Fundamentals. Reading: See last slide CS 242 Fundamentals Reading: See last slide Syntax and Semantics of Programs Syntax The symbols used to write a program Semantics The actions that occur when a program is executed Programming language

More information

14.1 Encoding for different models of computation

14.1 Encoding for different models of computation Lecture 14 Decidable languages In the previous lecture we discussed some examples of encoding schemes, through which various objects can be represented by strings over a given alphabet. We will begin this

More information

Functional Parsing A Multi-Lingual Killer- Application

Functional Parsing A Multi-Lingual Killer- Application RIT Scholar Works Presentations and other scholarship 2008 Functional Parsing A Multi-Lingual Killer- Application Axel-Tobias Schreiner James Heliotis Follow this and additional works at: http://scholarworks.rit.edu/other

More information

Concepts of Programming Languages

Concepts of Programming Languages Concepts of Programming Languages Lecture 1 - Introduction Patrick Donnelly Montana State University Spring 2014 Patrick Donnelly (Montana State University) Concepts of Programming Languages Spring 2014

More information

CS 11 Haskell track: lecture 1

CS 11 Haskell track: lecture 1 CS 11 Haskell track: lecture 1 This week: Introduction/motivation/pep talk Basics of Haskell Prerequisite Knowledge of basic functional programming e.g. Scheme, Ocaml, Erlang CS 1, CS 4 "permission of

More information

Functional Programming. Big Picture. Design of Programming Languages

Functional Programming. Big Picture. Design of Programming Languages Functional Programming Big Picture What we ve learned so far: Imperative Programming Languages Variables, binding, scoping, reference environment, etc What s next: Functional Programming Languages Semantics

More information

Shared Subtypes. Subtyping Recursive Parameterized Algebraic Data Types

Shared Subtypes. Subtyping Recursive Parameterized Algebraic Data Types Shared Subtypes Subtyping Recursive Parameterized Algebraic Data Types Ki Yung Ahn kya@cs.pdx.edu Tim Sheard sheard@cs.pdx.edu Department of Computer Science Maseeh College of Engineering & Computer Science

More information

JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1)

JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1) Technology & Information Management Instructor: Michael Kremer, Ph.D. Class 2 Professional Program: Data Administration and Management JAVASCRIPT AND JQUERY: AN INTRODUCTION (WEB PROGRAMMING, X452.1) AGENDA

More information

AXIOMS OF AN IMPERATIVE LANGUAGE PARTIAL CORRECTNESS WEAK AND STRONG CONDITIONS. THE AXIOM FOR nop

AXIOMS OF AN IMPERATIVE LANGUAGE PARTIAL CORRECTNESS WEAK AND STRONG CONDITIONS. THE AXIOM FOR nop AXIOMS OF AN IMPERATIVE LANGUAGE We will use the same language, with the same abstract syntax that we used for operational semantics. However, we will only be concerned with the commands, since the language

More information

(Refer Slide Time: 4:00)

(Refer Slide Time: 4:00) Principles of Programming Languages Dr. S. Arun Kumar Department of Computer Science & Engineering Indian Institute of Technology, Delhi Lecture - 38 Meanings Let us look at abstracts namely functional

More information

Handout 10: Imperative programs and the Lambda Calculus

Handout 10: Imperative programs and the Lambda Calculus 06-02552 Princ of Progr Languages (and Extended ) The University of Birmingham Spring Semester 2016-17 School of Computer Science c Uday Reddy2016-17 Handout 10: Imperative programs and the Lambda Calculus

More information

JVM ByteCode Interpreter

JVM ByteCode Interpreter JVM ByteCode Interpreter written in Haskell (In under 1000 Lines of Code) By Louis Jenkins Presentation Schedule ( 15 Minutes) Discuss and Run the Virtual Machine first

More information

Semantics of programming languages

Semantics of programming languages Semantics of programming languages Informatics 2A: Lecture 27 John Longley School of Informatics University of Edinburgh jrl@inf.ed.ac.uk 21 November, 2011 1 / 19 1 2 3 4 2 / 19 Semantics for programming

More information

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus

Types. Type checking. Why Do We Need Type Systems? Types and Operations. What is a type? Consensus Types Type checking What is a type? The notion varies from language to language Consensus A set of values A set of operations on those values Classes are one instantiation of the modern notion of type

More information

LECTURE 16. Functional Programming

LECTURE 16. Functional Programming LECTURE 16 Functional Programming WHAT IS FUNCTIONAL PROGRAMMING? Functional programming defines the outputs of a program as a mathematical function of the inputs. Functional programming is a declarative

More information

Type families and data kinds

Type families and data kinds Type families and data kinds AFP Summer School Wouter Swierstra 1 Today How do GADTs work? Kinds beyond * Programming with types 2 Calling functions on vectors Given two vectors xs : Vec a n and ys : Vec

More information

Outcome-Oriented Programming (5/12/2004)

Outcome-Oriented Programming (5/12/2004) 1 Outcome-Oriented Programming (5/12/2004) Daniel P. Friedman, William E. Byrd, David W. Mack Computer Science Department, Indiana University Bloomington, IN 47405, USA Oleg Kiselyov Fleet Numerical Meteorology

More information

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam

Compilers. Type checking. Yannis Smaragdakis, U. Athens (original slides by Sam Compilers Type checking Yannis Smaragdakis, U. Athens (original slides by Sam Guyer@Tufts) Summary of parsing Parsing A solid foundation: context-free grammars A simple parser: LL(1) A more powerful parser:

More information

Processadors de Llenguatge II. Functional Paradigm. Pratt A.7 Robert Harper s SML tutorial (Sec II)

Processadors de Llenguatge II. Functional Paradigm. Pratt A.7 Robert Harper s SML tutorial (Sec II) Processadors de Llenguatge II Functional Paradigm Pratt A.7 Robert Harper s SML tutorial (Sec II) Rafael Ramirez Dep Tecnologia Universitat Pompeu Fabra Paradigm Shift Imperative Paradigm State Machine

More information

Programming Languages Third Edition. Chapter 7 Basic Semantics

Programming Languages Third Edition. Chapter 7 Basic Semantics Programming Languages Third Edition Chapter 7 Basic Semantics Objectives Understand attributes, binding, and semantic functions Understand declarations, blocks, and scope Learn how to construct a symbol

More information

Chapter 2 The Language PCF

Chapter 2 The Language PCF Chapter 2 The Language PCF We will illustrate the various styles of semantics of programming languages with an example: the language PCF Programming language for computable functions, also called Mini-ML.

More information

6.001 Notes: Section 15.1

6.001 Notes: Section 15.1 6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define

More information

3.4 Deduction and Evaluation: Tools Conditional-Equational Logic

3.4 Deduction and Evaluation: Tools Conditional-Equational Logic 3.4 Deduction and Evaluation: Tools 3.4.1 Conditional-Equational Logic The general definition of a formal specification from above was based on the existence of a precisely defined semantics for the syntax

More information

Applicative, traversable, foldable

Applicative, traversable, foldable Applicative, traversable, foldable Advanced functional programming - Lecture 3 Wouter Swierstra 1 Beyond the monad So far, we have seen how monads define a common abstraction over many programming patterns.

More information

Functional abstraction. What is abstraction? Eating apples. Readings: HtDP, sections Language level: Intermediate Student With Lambda

Functional abstraction. What is abstraction? Eating apples. Readings: HtDP, sections Language level: Intermediate Student With Lambda Functional abstraction Readings: HtDP, sections 19-24. Language level: Intermediate Student With Lambda different order used in lecture section 24 material introduced much earlier sections 22, 23 not covered

More information

Functional abstraction

Functional abstraction Functional abstraction Readings: HtDP, sections 19-24. Language level: Intermediate Student With Lambda different order used in lecture section 24 material introduced much earlier sections 22, 23 not covered

More information

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking

COMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking Agenda COMP 181 Type checking October 21, 2009 Next week OOPSLA: Object-oriented Programming Systems Languages and Applications One of the top PL conferences Monday (Oct 26 th ) In-class midterm Review

More information

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS522 Programming Language Semantics

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS522 Programming Language Semantics CONVENTIONAL EXECUTABLE SEMANTICS Grigore Rosu CS522 Programming Language Semantics Conventional Semantic Approaches A language designer should understand the existing design approaches, techniques and

More information

Programming Languages 2nd edition Tucker and Noonan"

Programming Languages 2nd edition Tucker and Noonan Programming Languages 2nd edition Tucker and Noonan" " Chapter 1" Overview" " A good programming language is a conceptual universe for thinking about programming. " " " " " " " " " " " " "A. Perlis" "

More information

Adding GADTs to OCaml the direct approach

Adding GADTs to OCaml the direct approach Adding GADTs to OCaml the direct approach Jacques Garrigue & Jacques Le Normand Nagoya University / LexiFi (Paris) https://sites.google.com/site/ocamlgadt/ Garrigue & Le Normand Adding GADTs to OCaml 1

More information

Polymorphism Overview (1A) Young Won Lim 2/20/18

Polymorphism Overview (1A) Young Won Lim 2/20/18 Polymorphism Overview (1A) Copyright (c) 2016-2017 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2

More information

7. Introduction to Denotational Semantics. Oscar Nierstrasz

7. Introduction to Denotational Semantics. Oscar Nierstrasz 7. Introduction to Denotational Semantics Oscar Nierstrasz Roadmap > Syntax and Semantics > Semantics of Expressions > Semantics of Assignment > Other Issues References > D. A. Schmidt, Denotational Semantics,

More information

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS422 Programming Language Semantics

CONVENTIONAL EXECUTABLE SEMANTICS. Grigore Rosu CS422 Programming Language Semantics CONVENTIONAL EXECUTABLE SEMANTICS Grigore Rosu CS422 Programming Language Semantics Conventional Semantic Approaches A language designer should understand the existing design approaches, techniques and

More information

Flang typechecker Due: February 27, 2015

Flang typechecker Due: February 27, 2015 CMSC 22610 Winter 2015 Implementation of Computer Languages I Flang typechecker Due: February 27, 2015 Project 3 February 9, 2015 1 Introduction The third project is to implement a type checker for Flang,

More information

Introducing Wybe a language for everyone

Introducing Wybe a language for everyone Introducing Wybe a language for everyone Peter Schachte joint work with Matthew Giuca The University of Melbourne Department of Computing and Information Systems 4 December 2013 Peter Schachte (Melbourne)

More information

Applicative, traversable, foldable

Applicative, traversable, foldable Applicative, traversable, foldable Advanced functional programming - Lecture 4 Wouter Swierstra and Alejandro Serrano 1 Beyond the monad So far, we have seen how monads define a common abstraction over

More information

Functional Programming

Functional Programming Functional Programming Björn B. Brandenburg The University of North Carolina at Chapel Hill Based in part on slides and notes by S. Olivier, A. Block, N. Fisher, F. Hernandez-Campos, and D. Stotts. Brief

More information

Programming with Math and Logic

Programming with Math and Logic .. Programming with Math and Logic an invitation to functional programming Ed Morehouse Wesleyan University The Plan why fp? terms types interfaces The What and Why of Functional Programming Computing

More information

6.001 Notes: Section 8.1

6.001 Notes: Section 8.1 6.001 Notes: Section 8.1 Slide 8.1.1 In this lecture we are going to introduce a new data type, specifically to deal with symbols. This may sound a bit odd, but if you step back, you may realize that everything

More information

Optimising Functional Programming Languages. Max Bolingbroke, Cambridge University CPRG Lectures 2010

Optimising Functional Programming Languages. Max Bolingbroke, Cambridge University CPRG Lectures 2010 Optimising Functional Programming Languages Max Bolingbroke, Cambridge University CPRG Lectures 2010 Objectives Explore optimisation of functional programming languages using the framework of equational

More information

The Typed Racket Guide

The Typed Racket Guide The Typed Racket Guide Version 5.3.6 Sam Tobin-Hochstadt and Vincent St-Amour August 9, 2013 Typed Racket is a family of languages, each of which enforce

More information

CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009

CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009 CSCI B522 Lecture 11 Naming and Scope 8 Oct, 2009 Lecture notes for CS 6110 (Spring 09) taught by Andrew Myers at Cornell; edited by Amal Ahmed, Fall 09. 1 Static vs. dynamic scoping The scope of a variable

More information

Typed Racket: Racket with Static Types

Typed Racket: Racket with Static Types Typed Racket: Racket with Static Types Version 5.0.2 Sam Tobin-Hochstadt November 6, 2010 Typed Racket is a family of languages, each of which enforce that programs written in the language obey a type

More information

Monad Overview (3B) Young Won Lim 1/16/18

Monad Overview (3B) Young Won Lim 1/16/18 Based on Haskell in 5 steps https://wiki.haskell.org/haskell_in_5_steps 2 Copyright (c) 2016-2018 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of

More information

Formal semantics of loosely typed languages. Joep Verkoelen Vincent Driessen

Formal semantics of loosely typed languages. Joep Verkoelen Vincent Driessen Formal semantics of loosely typed languages Joep Verkoelen Vincent Driessen June, 2004 ii Contents 1 Introduction 3 2 Syntax 5 2.1 Formalities.............................. 5 2.2 Example language LooselyWhile.................

More information

3. Functional Programming. Oscar Nierstrasz

3. Functional Programming. Oscar Nierstrasz 3. Functional Programming Oscar Nierstrasz Roadmap > Functional vs. Imperative Programming > Pattern Matching > Referential Transparency > Lazy Evaluation > Recursion > Higher Order and Curried Functions

More information

6.001 Notes: Section 6.1

6.001 Notes: Section 6.1 6.001 Notes: Section 6.1 Slide 6.1.1 When we first starting talking about Scheme expressions, you may recall we said that (almost) every Scheme expression had three components, a syntax (legal ways of

More information

Zipping Search Trees. Tom Schrijvers. with Denoit Desouter and Bart Demoen

Zipping Search Trees. Tom Schrijvers. with Denoit Desouter and Bart Demoen Zipping Search Trees Tom Schrijvers with Denoit Desouter and Bart Demoen Motivation 2 Earlier Work 3 Earlier Work Monadic Constraint Programming T. S., P. Stuckey, P. Wadler. (JFP 09) Haskell 3 Earlier

More information

INTRODUCTION TO HASKELL

INTRODUCTION TO HASKELL INTRODUCTION TO HASKELL PRINCIPLES OF PROGRAMMING LANGUAGES Norbert Zeh Winter 2018 Dalhousie University 1/81 HASKELL: A PURELY FUNCTIONAL PROGRAMMING LANGUAGE Functions are first-class values: Can be

More information

Inheritance and Overloading in Agda

Inheritance and Overloading in Agda Inheritance and Overloading in Agda Paolo Capriotti 3 June 2013 Abstract One of the challenges of the formalization of mathematics in a proof assistant is defining things in such a way that the syntax

More information

Haskell Monads CSC 131. Kim Bruce

Haskell Monads CSC 131. Kim Bruce Haskell Monads CSC 131 Kim Bruce Monads The ontological essence of a monad is its irreducible simplicity. Unlike atoms, monads possess no material or spatial character. They also differ from atoms by their

More information

Haskell: From Basic to Advanced. Part 2 Type Classes, Laziness, IO, Modules

Haskell: From Basic to Advanced. Part 2 Type Classes, Laziness, IO, Modules Haskell: From Basic to Advanced Part 2 Type Classes, Laziness, IO, Modules Qualified types In the types schemes we have seen, the type variables were universally quantified, e.g. ++ :: [a] -> [a] -> [a]

More information

CS52 - Assignment 8. Due Friday 4/15 at 5:00pm.

CS52 - Assignment 8. Due Friday 4/15 at 5:00pm. CS52 - Assignment 8 Due Friday 4/15 at 5:00pm https://xkcd.com/859/ This assignment is about scanning, parsing, and evaluating. It is a sneak peak into how programming languages are designed, compiled,

More information

Lecture 5: The Halting Problem. Michael Beeson

Lecture 5: The Halting Problem. Michael Beeson Lecture 5: The Halting Problem Michael Beeson Historical situation in 1930 The diagonal method appears to offer a way to extend just about any definition of computable. It appeared in the 1920s that it

More information

Type Checking in COOL (II) Lecture 10

Type Checking in COOL (II) Lecture 10 Type Checking in COOL (II) Lecture 10 1 Lecture Outline Type systems and their expressiveness Type checking with SELF_TYPE in COOL Error recovery in semantic analysis 2 Expressiveness of Static Type Systems

More information

Typing Control. Chapter Conditionals

Typing Control. Chapter Conditionals Chapter 26 Typing Control 26.1 Conditionals Let s expand our language with a conditional construct. We can use if0 like before, but for generality it s going to be more convenient to have a proper conditional

More information

Parser Design. Neil Mitchell. June 25, 2004

Parser Design. Neil Mitchell. June 25, 2004 Parser Design Neil Mitchell June 25, 2004 1 Introduction A parser is a tool used to split a text stream, typically in some human readable form, into a representation suitable for understanding by a computer.

More information

GADTs. Wouter Swierstra. Advanced functional programming - Lecture 7. Faculty of Science Information and Computing Sciences

GADTs. Wouter Swierstra. Advanced functional programming - Lecture 7. Faculty of Science Information and Computing Sciences GADTs Advanced functional programming - Lecture 7 Wouter Swierstra 1 Today s lecture Generalized algebraic data types (GADTs) 2 A datatype data Tree a = Leaf Node (Tree a) a (Tree a) This definition introduces:

More information

A Small Interpreted Language

A Small Interpreted Language A Small Interpreted Language What would you need to build a small computing language based on mathematical principles? The language should be simple, Turing equivalent (i.e.: it can compute anything that

More information

CIS552: Advanced Programming

CIS552: Advanced Programming CIS552: Advanced Programming Handout 8 What is a Parser? A parser is a program that analyzes a piece of text to deine its structure (and, typically, returns a tree representing this structure). The World

More information

Chapter 11 :: Functional Languages

Chapter 11 :: Functional Languages Chapter 11 :: Functional Languages Programming Language Pragmatics Michael L. Scott Copyright 2016 Elsevier 1 Chapter11_Functional_Languages_4e - Tue November 21, 2017 Historical Origins The imperative

More information

Shell CSCE 314 TAMU. Haskell Functions

Shell CSCE 314 TAMU. Haskell Functions 1 CSCE 314: Programming Languages Dr. Dylan Shell Haskell Functions 2 Outline Defining Functions List Comprehensions Recursion 3 Conditional Expressions As in most programming languages, functions can

More information

Haskell & functional programming, some slightly more advanced stuff. Matteo Pradella

Haskell & functional programming, some slightly more advanced stuff. Matteo Pradella Haskell & functional programming, some slightly more advanced stuff Matteo Pradella pradella@elet.polimi.it IEIIT, Consiglio Nazionale delle Ricerche & DEI, Politecnico di Milano PhD course @ UniMi - Feb

More information

Functional Languages. Hwansoo Han

Functional Languages. Hwansoo Han Functional Languages Hwansoo Han Historical Origins Imperative and functional models Alan Turing, Alonzo Church, Stephen Kleene, Emil Post, etc. ~1930s Different formalizations of the notion of an algorithm

More information

GADTs. Wouter Swierstra and Alejandro Serrano. Advanced functional programming - Lecture 7. [Faculty of Science Information and Computing Sciences]

GADTs. Wouter Swierstra and Alejandro Serrano. Advanced functional programming - Lecture 7. [Faculty of Science Information and Computing Sciences] GADTs Advanced functional programming - Lecture 7 Wouter Swierstra and Alejandro Serrano 1 Today s lecture Generalized algebraic data types (GADTs) 2 A datatype data Tree a = Leaf Node (Tree a) a (Tree

More information

Background Type Classes (1B) Young Won Lim 6/14/18

Background Type Classes (1B) Young Won Lim 6/14/18 Background Type Classes (1B) Copyright (c) 2016-2017 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2

More information

Written Presentation: JoCaml, a Language for Concurrent Distributed and Mobile Programming

Written Presentation: JoCaml, a Language for Concurrent Distributed and Mobile Programming Written Presentation: JoCaml, a Language for Concurrent Distributed and Mobile Programming Nicolas Bettenburg 1 Universitaet des Saarlandes, D-66041 Saarbruecken, nicbet@studcs.uni-sb.de Abstract. As traditional

More information

Software Reuse and Component-Based Software Engineering

Software Reuse and Component-Based Software Engineering Software Reuse and Component-Based Software Engineering Minsoo Ryu Hanyang University msryu@hanyang.ac.kr Contents Software Reuse Components CBSE (Component-Based Software Engineering) Domain Engineering

More information

This example highlights the difference between imperative and functional programming. The imperative programming solution is based on an accumulator

This example highlights the difference between imperative and functional programming. The imperative programming solution is based on an accumulator 1 2 This example highlights the difference between imperative and functional programming. The imperative programming solution is based on an accumulator (total) and a counter (i); it works by assigning

More information

An introduction to functional programming. July 23, 2010

An introduction to functional programming. July 23, 2010 An introduction to functional programming July 23, 2010 About Outline About About What is functional programming? What is? Why functional programming? Why? is novel. is powerful. is fun. About A brief

More information

UNIT II Requirements Analysis and Specification & Software Design

UNIT II Requirements Analysis and Specification & Software Design UNIT II Requirements Analysis and Specification & Software Design Requirements Analysis and Specification Many projects fail: because they start implementing the system: without determining whether they

More information

MISRA C:2012 WHITE PAPER

MISRA C:2012 WHITE PAPER WHITE PAPER MISRA C:2012 Since its launch in 1998, MISRA C has become established as the most widely used set of coding guidelines for the C language throughout the world. Originally developed within the

More information

Minsoo Ryu. College of Information and Communications Hanyang University.

Minsoo Ryu. College of Information and Communications Hanyang University. Software Reuse and Component-Based Software Engineering Minsoo Ryu College of Information and Communications Hanyang University msryu@hanyang.ac.kr Software Reuse Contents Components CBSE (Component-Based

More information

Functional Programming

Functional Programming Functional Programming Overview! Functional vs. imperative programming! Fundamental concepts! Evaluation strategies! Pattern matching! Higher order functions! Lazy lists References! Richard Bird, Introduction

More information

GADTs. Alejandro Serrano. AFP Summer School. [Faculty of Science Information and Computing Sciences]

GADTs. Alejandro Serrano. AFP Summer School. [Faculty of Science Information and Computing Sciences] GADTs AFP Summer School Alejandro Serrano 1 Today s lecture Generalized algebraic data types (GADTs) 2 A datatype data Tree a = Leaf Node (Tree a) a (Tree a) This definition introduces: 3 A datatype data

More information

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor

COSC252: Programming Languages: Semantic Specification. Jeremy Bolton, PhD Adjunct Professor COSC252: Programming Languages: Semantic Specification Jeremy Bolton, PhD Adjunct Professor Outline I. What happens after syntactic analysis (parsing)? II. Attribute Grammars: bridging the gap III. Semantic

More information

The design of a programming language for provably correct programs: success and failure

The design of a programming language for provably correct programs: success and failure The design of a programming language for provably correct programs: success and failure Don Sannella Laboratory for Foundations of Computer Science School of Informatics, University of Edinburgh http://homepages.inf.ed.ac.uk/dts

More information

Haskell Introduction Lists Other Structures Data Structures. Haskell Introduction. Mark Snyder

Haskell Introduction Lists Other Structures Data Structures. Haskell Introduction. Mark Snyder Outline 1 2 3 4 What is Haskell? Haskell is a functional programming language. Characteristics functional non-strict ( lazy ) pure (no side effects*) strongly statically typed available compiled and interpreted

More information

CS 457/557: Functional Languages

CS 457/557: Functional Languages CS 457/557: Functional Languages Lists and Algebraic Datatypes Mark P Jones Portland State University 1 Why Lists? Lists are a heavily used data structure in many functional programs Special syntax is

More information

4 CoffeeStrainer Virtues and Limitations

4 CoffeeStrainer Virtues and Limitations In this chapter, we explain CoffeeStrainer s virtues and limitations and the design decisions that led to them. To illustrate the points we want to make, we contrast CoffeeStrainer with a hypothetical,

More information

Overloading, Type Classes, and Algebraic Datatypes

Overloading, Type Classes, and Algebraic Datatypes Overloading, Type Classes, and Algebraic Datatypes Delivered by Michael Pellauer Arvind Computer Science and Artificial Intelligence Laboratory M.I.T. September 28, 2006 September 28, 2006 http://www.csg.csail.mit.edu/6.827

More information