Type Inference for Multi-Parameter Type Classes
Martin Sulzmann (1) and Peter J. Stuckey (2)

(1) School of Computing, National University of Singapore, S16 Level 5, 3 Science Drive 2, Singapore. sulzmann@comp.nus.edu.sg
(2) NICTA Victoria Laboratory, Department of Computer Science and Software Engineering, The University of Melbourne, Vic. 3010, Australia. pjs@cs.mu.oz.au

Abstract. We observe that the combination of multi-parameter type classes with existential types and type annotations leads to a loss of principal types. As a consequence, type inference in implementations such as GHC is incomplete. This may be a surprising fact for users of these standard features. We conduct a concise investigation of the problem and are able to precisely explain why we lose principality and completeness of inference. As a remedy, we propose several novel inference methods to retain completeness and principality.

1 Introduction

Type systems are important tools in the design of programming languages. They are typically specified in terms of a set of typing rules formulated in natural deduction style. The standard approach towards establishing type soundness is to show that any well-typed program cannot go wrong at run-time. Hence, one of the first tasks of a compiler is to verify whether a program is well-typed or not. The trouble is that typing rules are often not syntax-directed. Also, we often have a choice of which types to assign to variables, unless we demand that the programmer supply the compiler with this information; however, the programming language may then become impractical to use. What we need is a type inference algorithm which automatically checks whether a program is well-typed and, as a side-effect, assigns types to program text. For programming languages based on the Hindley/Milner system [Mil78] we can typically verify that type inference is complete and the inferred type is principal [DM82].
Completeness guarantees that if the program is well-typed, type inference will infer a type for it, whereas principality guarantees that any type that can be given to the program is derivable from the inferred type. In this paper, we ask whether this happy situation continues in the case of multi-parameter type classes [JJM97], a popular extension of the Hindley/Milner system available as part of a number of compiler systems such as GHC [GHC]. In particular, we are interested in type inference in the presence of (by now standard) features such as type annotations [OL96] and (boxed) existential types [LO94]. The surprising observation is that we lose completeness
and principal types, as the following examples show. We will make use of Haskell syntax [Has] throughout the paper. Consider the following simple program where we make use of the combination of existential types and type classes.

    class Foo a b where
      foo :: a -> b -> Int

    instance Foo a Int  where foo _ _ = 1   -- (F1)
    instance Foo a Bool where foo _ _ = 2   -- (F2)

    data Bar = forall b. Mk b

    f x z = (case z of (Mk y) -> foo y x, x == length [])

We introduce a two-parameter type class Foo which comes with a method foo of type ∀a, b. Foo a b => a -> b -> Int. The two instance declarations state that Foo a Int and Foo a Bool hold for any a. The data type Bar introduces a constructor Mk whose parameter b does not appear in the result type (here Bar). Hence, we consider b as existentially quantified. Note that in Haskell syntax these existential variables are introduced by the forall keyword. We expect function f to be well-typed. Variable x has type Int (since it is equated to the result of length :: [a] -> Int) and variable z has type Bar. The pattern variable y has type b for any b (because b is existential). Hence, foo y x gives rise to the (type class) constraint Foo b Int. According to the instance (F1) this constraint holds. Hence, we find that f has type Int -> Bar -> (Int, Bool). A surprising observation is that GHC fails to infer a type for the above program whereas Hugs [HUG] succeeds. Though, for the following variant

    f2 x z = (x == length [], case z of (Mk y) -> foo y x)

GHC succeeds whereas Hugs fails. It gets stranger still: GHC and Hugs both fail for the following variant

    f3 x z = case z of (Mk y) -> foo y x

We note that similar observations can be made in the case of type annotations (see Appendix C). In our system the types of constructors may be constrained by type classes. E.g., the following is accepted in our system.

    data Erk a = forall b. Foo a b => Mk2 a b

Such an extension is known as type classes with existential types.
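The first program above should be accepted by GHC once f is given the signature derived in the text; the LANGUAGE pragma set below is our assumption, not the paper's:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, ExistentialQuantification #-}

class Foo a b where
  foo :: a -> b -> Int

instance Foo a Int  where foo _ _ = 1   -- (F1)
instance Foo a Bool where foo _ _ = 2   -- (F2)

data Bar = forall b. Mk b

-- With the signature, x is known to be an Int when the case alternative
-- is checked, so the constraint Foo b Int is discharged by (F1)
-- regardless of the traversal order of the abstract syntax tree.
f :: Int -> Bar -> (Int, Bool)
f x z = (case z of (Mk y) -> foo y x, x == length [])
```

For example, f 0 (Mk 'c') evaluates to (1, True): foo picks instance (F1), and 0 == length [] holds.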
Similarly, type classes may constrain the types in annotations. Both extensions do not complicate the type inference problem any further. Hence, we assume that both extensions are part of our formal system. For the rest of this paper we write MPTC to denote the system of multi-parameter type classes with existential types and type annotations. We conclude that type inference for MPTC turns out to be surprisingly difficult. We are not aware of any formal investigation of the MPTC type inference problem. Previous work either neglects existential types and type annotations [Jon92,Jon95], or considers the single-parameter case only [Läu96]. Here, we conduct a detailed investigation of the inference problem and propose several novel solutions. In summary, our contributions are as follows:
- We show that we need a richer form of constraints to precisely capture the typing problem for multi-parameter type classes with existential types and type annotations. Thus, we can precisely explain why we lose principality and completeness of inference and which type inference methods are necessary to improve this unpleasant situation (Sections 2 and 5).
- We introduce a solving procedure for implication constraints. Thus, we obtain sound type inference for MPTC programs (Section 6).
- We investigate under which conditions inference is complete and inferred types are principal (Section 7): In case type class constraints are fixed (this can be enforced by type annotations) we obtain principal types (Section 7.1). In case all parameters of type class constraints in data type definitions are universally quantified, each inferred type is principal. Though, inference is still incomplete in the sense that we may fail while a valid type exists (Section 7.2).
- For the general case, we can give a refined inference procedure which enumerates all possible types (Section 8).

We continue in Section 2 where we give an overview of the main problems we face and how we intend to solve them. In Section 3 we introduce some preliminary definitions and spell out some basic assumptions. In Section 4 we briefly introduce the MPTC system. We conclude in Section 9.

2 Overview

The standard route towards type inference is to generate an appropriate constraint out of the program text. The solution of these constraints yields a solution to type inference. E.g., in the case of Hindley/Milner, we generate type equation constraints which can be solved by a unification procedure [Rob65]. In the case of extensions of Hindley/Milner with subtypes [Pot98] or records [Rém93] we need to solve additional primitive constraints. For many extensions of Hindley/Milner, we can generalize this process by abstracting over the particular constraint domain [OSW99,Sul00].
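For the Hindley/Milner case just mentioned, the unification step over type equations can be sketched as follows; this is a minimal Robinson-style illustration with our own names, not the paper's formal machinery:

```haskell
import qualified Data.Map as M

data Ty = TVar String | TCon String [Ty] deriving (Eq, Show)

type Subst = M.Map String Ty

emptySubst :: Subst
emptySubst = M.empty

-- Apply a substitution, chasing chains of variable bindings.
apply :: Subst -> Ty -> Ty
apply s t@(TVar a)  = maybe t (apply s) (M.lookup a s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- Solve a single type equation t1 = t2 under an accumulated substitution.
unify :: Ty -> Ty -> Subst -> Maybe Subst
unify t1 t2 s = go (apply s t1) (apply s t2)
  where
    go (TVar a) (TVar b) | a == b = Just s
    go (TVar a) t
      | a `notElem` vars t = Just (M.insert a t s)   -- occurs check
    go t v@(TVar _) = go v t
    go (TCon c ts) (TCon c' ts')
      | c == c' && length ts == length ts'
      = foldr (\(x, y) ms -> ms >>= unify x y) (Just s) (zip ts ts')
    go _ _ = Nothing
    vars (TVar a)    = [a]
    vars (TCon _ ts) = concatMap vars ts
```

For example, unifying a -> b with Int -> Bool (encoding -> as a binary type constructor) yields the substitution binding a to Int and b to Bool.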
The common type inference path is that we need to solve sets of primitive constraints such as type equations, subtype and record constraints. For MPTC type inference we need a richer form of constraints to capture the typing problem precisely. The significant difference compared to Hindley/Milner is that we additionally need Boolean implication (⊃) and universal quantification (∀) (we will see shortly why). We refer to these constraints as implication constraints. The idea of using such constraints for type inference dates back to Zenger's index types [Zen99]. Our formulation follows Simonet and Pottier [SP05] who recently formalized implication constraint generation for an extension of Hindley/Milner. In the case of the program from the introduction

    class Foo a b where
      foo :: a -> b -> Int

    instance Foo a Int  where foo _ _ = 1   -- (F1)
    instance Foo a Bool where foo _ _ = 2   -- (F2)

    data Bar = forall b. Mk b

    f x z = (case z of (Mk y) -> foo y x, x == length [])
the following (simplified) implication constraint arises out of the program text of f:

    t_f = t_x -> Bar -> t_1, t_1 = (t_2, Bool), t_x = Int, ∀t_y. (True ⊃ (Foo t_y t_x, t_2 = Int))

Note that only the trivial True (assumption) constraint arises out of the pattern (Mk y). Hence, we could further simplify (True ⊃ (Foo t_y t_x, t_2 = Int)) to Foo t_y t_x, t_2 = Int. The universal quantifier ∀t_y ensures that no existential variable escapes. We find that t_f = Int -> Bar -> (Int, Bool) is a valid solution; hence, f :: Int -> Bar -> (Int, Bool) is a correct type, as we have already observed earlier. In existing implementations such as GHC, a traditional algorithm W style inference system is employed where we only generate sets of primitive constraints. Hence, language features such as type classes with type annotations and existential types demand that we perform additional tests while traversing the abstract syntax tree (AST). E.g., type inference for type classes with existential types demands that we additionally check that no existential variable escapes. In GHC, we traverse the AST from left to right. Hence, in function f we first consider the sub-expression case z of (Mk y) -> foo y x without considering x == length []. Then, foo y x gives rise to the constraint Foo t_y t_x. We cannot let Foo t_y t_x float out because t_y is universally quantified (because y is an existential variable). However, we also cannot satisfy this constraint (the type inferencer does not guess that t_x = Int). Hence, type inference in GHC fails. The situation is similar for Hugs, though Hugs seems to favor a right-to-left traversal of the AST. With implication constraints we can directly express additional inference tests. Hence, we can take advantage of constraints from the surrounding context while solving. Thus, we obtain an improved inference algorithm. But we still face the incompleteness problem.
E.g.,

    f3 x z = case z of (Mk y) -> foo y x

gives rise to

    t_f3 = t_x -> Bar -> t_1, ∀t_y. (True ⊃ (Foo t_y t_x, t_1 = Int))

By guessing we find that t_f3 = Int -> Bar -> Int and t_f3 = Bool -> Bar -> Int are two incomparable solutions, and there is no more general solution. The trouble is that the set of constraints specifiable by users comes from a less expressive constraint domain than the (implication) constraints necessary to represent the typing problem. Hence, we may need to guess a solution. A trivial way out of this dilemma seems to be to enrich the user constraint domain, e.g., we could take the implication constraint itself as a solution. However, types then become unreadable and we still need to verify that implication constraints are satisfiable, which seems as hard a problem as solving them. We employ an extension of algorithm W where we generate implication constraints out of the program text. Details are in Section 5. In Section 6 we show how to solve such constraints. To attack the completeness of inference problem we propose the following three approaches. In our first approach, we identify sufficient conditions under which solving of implication constraints always yields a most general solution or no solution
exists. We demand that the type of each type class method which appears within a type annotation or case expression must be fixed (e.g. by a type annotation). Thus, we rule out function f3. Details are in Section 7.1. The above conditions are quite severe. E.g., we also rule out function f, though the surrounding context implicitly constrains the type class method. In our second approach, we do not impose any conditions, and the maybe surprising result is that if inference succeeds, then the inferred type is principal. For this to hold, we must require that all parameters of type class constraints in data type definitions are universally quantified. Unfortunately, we may miss some valid types. Details are in Section 7.2. In our final approach, we solve implication constraints locally by enumerating all of the finitely many solutions. E.g., in the case of function f we first solve the sub-problem ∀t_y. (True ⊃ (Foo t_y t_x, t_2 = Int)) which has the two solutions (t_2 = Int, t_x = Int) and (t_2 = Int, t_x = Bool). We then test which of these solutions is consistent with the solutions for the other sub-problems. Here, we find that only (t_2 = Int, t_x = Int) is consistent with t_f = t_x -> Bar -> t_1, t_1 = (t_2, Bool), t_x = Int. Hence, t_f = Int -> Bar -> (Int, Bool) is the final answer. Thus, we obtain complete inference; however, we lose principality. E.g., in the case of function f3 we infer that Int -> Bar -> Int and Bool -> Bar -> Int are the two possible types which can be given to f3. But there is no more general type. Details are in Section 8.

3 Preliminaries

For convenience, we write o̅ as a short-hand for a sequence of objects o_1, ..., o_n (e.g. expressions, types, etc.). As is common, we write Γ to denote an environment consisting of a sequence of type assignments x_1 : σ_1, ..., x_n : σ_n. Types σ will be defined later. We commonly treat Γ as a set. We write − to denote set subtraction. We write fv(o) to denote the free variables in some object o.
We generally assume that the reader is familiar with the concepts of substitutions, unifiers, most general unifiers (m.g.u.) etc. [LMM87] and first-order logic [Sho67]. We write [t̅/ā] to denote the simultaneous substitution of variables a_i by types t_i for i = 1, ..., n. We use common notation for Boolean conjunction (∧), implication (⊃) and universal (∀) and existential (∃) quantifiers. Often, we abbreviate ∧ by ',' and use set notation for conjunctions of formulae. We sometimes use ∃_V.Fml as a short-hand for ∃(fv(Fml) − V).Fml where Fml is some first-order formula and V a set of variables, that is, existential quantification of all variables in Fml apart from V. We write |= to denote the model-theoretic entailment relation. When writing logical statements we often leave (outermost) quantifiers implicit. E.g., let Fml_1 and Fml_2 be two formulae where Fml_1 is closed (contains no free variables). Then, Fml_1 |= Fml_2 is a short-hand for Fml_1 |= ∀fv(Fml_2).Fml_2, stating that in any (first-order) model for Fml_1 the formula ∀fv(Fml_2).Fml_2 is satisfied.

4 Multi-Parameter Type Classes

We first describe the syntax of MPTC programs. Then, we make use of our earlier work [SS02] to explain the meaning of type classes in terms of Constraint
Handling Rules (CHRs) [Frü95]. Finally, we state some basic conditions which ensure that CHRs resulting from MPTC programs are well-behaved.

    Expressions       e     ::= K | x | λx.e | e e | let g = e in e
                              | let g :: C => t; g = e in e | case e of [p_i -> e_i]_{i in I}
    Patterns          p     ::= x | K p ... p
    Types             t     ::= a | t -> t | T t̅
    Type Classes      tc    ::= TC t̅
    Constraints       C     ::= tc | C ∧ C
    Type Schemes      σ     ::= t | ∀ā.C => t
    Data Decls        ddec  ::= data T a_1 ... a_m = forall b_1, ..., b_n. C => K t_1 ... t_l
    Type Class Decls  tcdec ::= class TC ā where m :: C => t | instance C => TC t̅
    CHRs              chr   ::= rule TC t̅ <=> C

We assume that K refers to constructors of user-defined data types. Patterns must be linear, i.e., each variable occurs at most once. Type annotations are closed, i.e. in f :: C => t we quantify over all variables in fv(C, t). The extension with lexically scoped annotations is straightforward but omitted for simplicity. We assume that let-defined functions carrying an annotation may be recursive, though we ignore un-annotated recursive functions for simplicity. We assume that user-defined types T t̅ are introduced via data type declarations of the form

    data T a1 ... am = forall b1, ..., bn. C => K t1 ... tl

We assume that these declarations are preprocessed and the types of constructors K : ∀ā, b̅. C => t_1 -> ... -> t_l -> T ā are recorded in the initial environment Γ_init. We assume that ā ∩ b̅ = ∅. Note that ā and b̅ can be empty. Similarly, for each class declaration class TC ā where m :: C => t we find m : ∀fv(C, t, ā). (TC ā, C) => t in Γ_init. For brevity, we ignore super-classes. Following [SS02], we translate each declaration instance C => TC t̅ into the CHR rule TC t̅ <=> C. We assume that the set P contains all CHRs resulting from instance declarations. Commonly, we refer to P as the program theory. Logically, each CHR rule TC t̅ <=> C can be interpreted as the first-order formula ∀fv(t̅). (TC t̅ <-> ∃(fv(C) − fv(t̅)).C). CHRs also have a simple operational reading which we explain shortly.
In the absence of existential types and type annotations, type inference for type classes can be straightforwardly reduced to CHR solving [SS02]. In their presence, we need an extension of the CHR solving mechanism which is discussed in Section 6. We now briefly explain the operational semantics of CHRs (see e.g. [SS02] for more details). CHRs specify a rewrite relation among constraints. We can apply a rule TC t̅ <=> C to a set of constraints C' if we find a matching copy TC s̅ in C' such that φ(t̅) = s̅ for some substitution φ. (We assume that variables in CHRs are renamed before rule application.) Then, we replace TC s̅ by the right-hand side under the matching substitution, φ(C). More formally, we write C' ↣ (C' − TC s̅) ∪ φ(C) to denote this derivation step. Later we will consider derivations that have a mix of type equations C_e and type class
constraints C_t. In this case we assume the first step in the CHR derivation maps C_e, C_t to x̅ = t̅, θ(C_t), where θ = [t̅/x̅] is the m.g.u. of C_e. That is, we apply the m.g.u. to all type class constraints and replace the equations by the canonical form of the m.g.u. For a given program theory P and initial constraint (store) C, we write C ↣*_P C' to denote the exhaustive application of the CHRs in P resulting in the final constraint C'. Of course, CHR derivations may be non-terminating and non-confluent, i.e., a different order of rule applications may yield a different result. Termination and confluence of CHRs are crucial properties for type inference to be well-behaved. The following conditions are commonly imposed on multi-parameter type classes [GHC] such that the resulting CHRs are well-behaved.

Definition 1 (Haskell Multi-Parameter Type Class Restrictions).
- The left-hand side (a.k.a. context) C of an instance declaration can mention only type variables, not type constructors.
- In an instance declaration instance C => TC t_1 ... t_n, at least one of the types t_i must not be a type variable, and fv(C) ⊆ fv(t_1, ..., t_n).
- The instance declarations must not overlap. That is, for any two instance declarations instance C => TC t_1 ... t_n and instance C' => TC t'_1 ... t'_n there is no substitution φ such that φ(t_1) = φ(t'_1), ..., φ(t_n) = φ(t'_n).
- Type classes appearing in programs are unambiguous. A formal definition can be found in the following section.

The first three conditions ensure that the translation from instance declarations to CHRs yields a program theory for which CHR solving is complete [SS02]. The last condition ensures that all type class parameters are uniquely determined. This is a natural restriction and ensures that the meaning of programs is deterministic [Jon93]. Among some other conditions, unambiguity ensures that we do not need to guess types during type inference. The set of well-typed MPTC programs is formally defined in the next section.
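The single CHR derivation step just described (match the rule head against a store constraint, then replace it by the instantiated body) can be sketched in Haskell; the representation and names below are our own illustration, not the paper's:

```haskell
import Control.Monad (foldM)
import qualified Data.Map as M

data Ty  = TVar String | TCon String [Ty] deriving (Eq, Show)
data TC  = TC String [Ty]                 deriving (Eq, Show)  -- TC t̅
data CHR = Rule TC [TC]                                        -- rule TC t̅ <=> C

type Subst = M.Map String Ty

applyS :: Subst -> Ty -> Ty
applyS s t@(TVar a)  = M.findWithDefault t a s
applyS s (TCon c ts) = TCon c (map (applyS s) ts)

-- One-way matching: find φ with φ(pattern) = target.
-- Unlike unification, only pattern variables may be bound.
match :: Ty -> Ty -> Subst -> Maybe Subst
match (TVar a) t s = case M.lookup a s of
  Nothing           -> Just (M.insert a t s)
  Just t' | t' == t -> Just s
  _                 -> Nothing
match (TCon c ts) (TCon c' ts') s
  | c == c' && length ts == length ts'
  = foldM (\acc (p, t) -> match p t acc) s (zip ts ts')
match _ _ _ = Nothing

-- One derivation step: rewrite the first store constraint matching the head.
step :: CHR -> [TC] -> Maybe [TC]
step (Rule (TC n ps) body) = go []
  where
    go _   []                    = Nothing
    go pre (c@(TC n' ts) : rest)
      | n == n', length ps == length ts
      , Just phi <- foldM (\acc (p, t) -> match p t acc) M.empty (zip ps ts)
      = Just (reverse pre ++ [TC b (map (applyS phi) as) | TC b as <- body] ++ rest)
      | otherwise = go (c : pre) rest
```

For instance, the rule Foo a Int <=> True (from instance (F1), with an empty body encoding True) rewrites the store containing Foo Char Int to the empty store, and does not apply to Foo Char Bool.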
5 Type Inference via Implication Constraints

We show how type inference can be reduced to solving of implication constraints.

    Constraints     C ::= t = t | TC t̅ | C ∧ C
    ImpConstraints  F ::= C | ∀b̅.(C ⊃ ∃ā.F) | F ∧ F

Commonly, we use the symbols C and D to refer to constraints. To obtain a type inference result, we need to find a solution of an implication constraint in terms of a set of primitive constraints.

Definition 2 (Solutions). Let P be a program theory, F an implication constraint and C a constraint. We say C is a solution of F w.r.t. P iff P |= C ⊃ F. In such a situation, we write C = solve(F, P). We say C is a principal solution iff (1) C is a solution, and (2) for any other solution C' we have that P |= C' ⊃ ∃_fv(F).C.
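As a concrete rendering of this grammar, one might represent implication constraints as the following Haskell datatypes (our own illustrative encoding, not the paper's); the simplified constraint for f from Section 2 is then a value of type Imp:

```haskell
data Ty  = TVar String | TCon String [Ty] | Ty :-> Ty   deriving (Eq, Show)
infixr 5 :->

data Con = Ty :=: Ty          -- t = t
         | Class String [Ty]  -- TC t̅
         deriving (Eq, Show)

data Imp = Prim [Con]                  -- a conjunction of primitive constraints C
         | Forall [String] [Con] Imp   -- ∀b̅. (C ⊃ F); the inner ∃ā is left implicit
         | Imp :/\: Imp                -- F ∧ F
         deriving (Eq, Show)

-- t_f = t_x -> Bar -> t_1, t_1 = (t_2, Bool), t_x = Int,
-- ∀t_y. (True ⊃ (Foo t_y t_x, t_2 = Int))
fConstraint :: Imp
fConstraint =
  Prim [ TVar "tf" :=: (TVar "tx" :-> TCon "Bar" [] :-> TVar "t1")
       , TVar "t1" :=: TCon "(,)" [TVar "t2", TCon "Bool" []]
       , TVar "tx" :=: TCon "Int" [] ]
  :/\: Forall ["ty"] []   -- an empty assumption list encodes True
         (Prim [ Class "Foo" [TVar "ty", TVar "tx"]
               , TVar "t2" :=: TCon "Int" [] ])
```

The solver of Section 6 can then be understood as a traversal of such Imp values.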
Fig. 1 (Type Inference via Implication Constraints) defines the judgments; we reproduce its rules in linear form, writing 'premises ==> conclusion':

    (Var)     (x : ∀ā.C => t) in Γ  ==>  Γ, x ⊢_W (C | t)

    (Abs)     a fresh;  Γ ∪ {x : a}, e ⊢_W (F | t)  ==>  Γ, λx.e ⊢_W (F | a -> t)

    (App)     Γ, e_1 ⊢_W (F_1 | t_1);  Γ, e_2 ⊢_W (F_2 | t_2);  t fresh;
              F ≡ F_1 ∧ F_2 ∧ t_1 = t_2 -> t  ==>  Γ, e_1 e_2 ⊢_W (F | t)

    (Let)     Γ, e_1 ⊢_W (F_1 | t_1);  C_1 = solve(F_1, P);  ā = fv(C_1, t_1) − fv(Γ);
              Γ ∪ {g : ∀ā.C_1 => t_1}, e_2 ⊢_W (F_2 | t_2)
              ==>  Γ, let g = e_1 in e_2 ⊢_W (F_2 | t_2)

    (LetA)    ā = fv(C_1, t_1);  Γ ∪ {g : ∀ā.C_1 => t_1}, e_1 ⊢_W (F_1 | t'_1);
              Γ ∪ {g : ∀ā.C_1 => t_1}, e_2 ⊢_W (F_2 | t_2);
              F ≡ F_2 ∧ ∀ā.(C_1 ⊃ ∃_fv(Γ,t_1).(F_1 ∧ t_1 = t'_1))
              ==>  Γ, let g :: C_1 => t_1; g = e_1 in e_2 ⊢_W (F | t_2)

    (Case)    Γ, p_i -> e_i ⊢_W (F_i | t_i) for i in I;  Γ, e ⊢_W (F_e | t_e);
              t_1, t_2 fresh;  F ≡ F_e ∧ t_1 = t_e -> t_2 ∧ ⋀_{i in I} (F_i ∧ t_1 = t_i)
              ==>  Γ, case e of [p_i -> e_i]_{i in I} ⊢_W (F | t_2)

    (Pat)     p ⊢ ∀b̅.(D | Γ_p | t_1);  Γ ∪ Γ_p, e ⊢_W (F_e | t_e);  t fresh;
              F ≡ ∀b̅.(D ⊃ ∃_fv(Γ,b̅,t_e).F_e) ∧ t = t_1 -> t_e
              ==>  Γ, p -> e ⊢_W (F | t)

    (Pat-Var) t fresh  ==>  x ⊢ ∅.(True | {x : t} | t)

    (Pat-K)   K : ∀ā,b̅.D => t_1 -> ... -> t_l -> T ā;  b̅ ∩ ā = ∅;
              p_k ⊢ ∀b̅_k.(D_k | Γ_pk | t_pk);  φ m.g.u. of t_pk = t_k for k = 1, ..., l;
              dom(φ) ∩ b̅ = ∅
              ==>  K p_1 ... p_l ⊢ ∀b̅_1, ..., b̅_l, b̅.(φ(D_1) ∧ ... ∧ φ(D_l) ∧ φ(D) |
                     φ(Γ_p1) ∪ ... ∪ φ(Γ_pl) | T φ(ā))

Fig. 1. Type Inference via Implication Constraints

Note that P |= C ⊃ F states that in any model M which satisfies the logical interpretation of the CHRs P and the constraint C, the implication constraint F is satisfied too. Thus equipped, we introduce an algorithm W style inference system where we generate implication constraints out of the program text and solve them before building type schemes at let statements. How to solve such constraints will be discussed in Section 6. The algorithm is formulated in terms of a deduction system where we make use of judgments of the form Γ, e ⊢_W (F | t) where environment Γ and expression
e are input values and implication constraint F and type t are output values. For patterns we introduce an auxiliary judgment of the form p ⊢ ∀b̅.(D | Γ | t) to compute the set of existential variables b̅, constraint D, environment Γ and type t arising out of pattern p. Details are in Figure 1. Note that the statement 't fresh' is meant to introduce a fresh type variable t. We briefly review the individual rules. Note that we write ≡ to denote syntactic equality, to avoid confusion with equality among types. In rules (Var) and (Pat-K) we assume that the bound variables ā and b̅ are fresh. In rule (Pat) we make use of the implication constraint ∀b̅.(D ⊃ ∃_fv(Γ,b̅,t_e).F_e) which states that under the temporary assumptions D arising out of the pattern p we can satisfy the implication constraint F_e arising out of e. The ∀ quantifier ensures that no existential variables b̅ escape. Note that ∃_fv(Γ,b̅,t_e).F_e is a short-hand for ∃ā.F_e where ā = fv(F_e) − fv(Γ, b̅, t_e). That is, we existentially quantify over all variables which are strictly local in F_e. In rule (LetA) we again make use of implication constraints to represent the subsumption problem among two type schemes. In rule (Let) we need to solve implication constraints before we can build the type scheme. Based on our inference system, we can define the set of well-typed MPTC programs.

Definition 3 (Well-Typed MPTC Program). Let P be the program theory resulting from translating instances to CHRs. Let e be an expression and Γ an environment. We say that e is well-typed under Γ iff we find a constraint C and a type t such that P |= (C ∧ t = t') ⊃ F where Γ, e ⊢_W (F | t'). In such a situation, we write C, Γ ⊢ e : t.

We do not verify type soundness here, which we take for granted. It is straightforward to show that principal solutions imply principal types. Let P be a program theory, Γ an environment and e an expression. We say (C, t) is a principal type iff (1) C, Γ ⊢ e : t, and (2) if C', Γ ⊢ e : t' then P |= C' ⊃ ∃_fv(Γ).(C ∧ t = t').

Theorem 1 (Principal Types).
Let P be a program theory. Let Γ be an environment and e an expression such that all implication constraints generated from (Γ, e) have a principal solution. Then, (Γ, e) has a principal type.

The above result is important and justifies that we can focus on solving of implication constraints for the rest of the paper. For convenience, all solving procedures in the following sections operate on normalized constraints. We also define type class unambiguity in terms of normalized constraints at the end of this section. Normalization of constraints proceeds as follows. In a first step, we transform implication constraints based on the following first-order equivalences: (i) (F_1 ∧ Qa.F_2) <-> Qa.(F_1 ∧ F_2) where a is not in fv(F_1), (ii) (Qa.F_1) ∧ (Qb.F_2) <-> Qa,b.(F_1 ∧ F_2) where a is not in fv(F_2), b is not in fv(F_1) and Q in {∀, ∃}, and (iii) C_1 ⊃ (C_2 ⊃ C_3) <-> (C_1 ∧ C_2) ⊃ C_3. We exhaustively apply the above identities from left to right.

Lemma 1 (Normalization). Each implication constraint resulting from Figure 1 can be equivalently represented by

    C_0 ∧ Q.((D_1 ⊃ C_1) ∧ ... ∧ (D_n ⊃ C_n))    (1)

where Q is a mixed prefix of the form ∃b̅_0.∀ā_1.∃b̅_1. ... .∀ā_n.∃b̅_n.
Note that no explicit quantifier scopes over C_0 (variables in C_0 are implicitly existentially quantified). We apply Skolemization [Mil92] to formula (1) and remove universal quantifiers. That is, we transform ∃b̅.∀ā.F into ∃b̅.[Sk_a(b̅)/ā]F where the Sk_a's are fresh Skolem constructors. We apply this step repeatedly until we reach the form

    C_0 ∧ ∃b̅.((D'_1 ⊃ C'_1) ∧ ... ∧ (D'_n ⊃ C'_n))    (2)

Note that Skolemization is (in general) not a satisfiability-preserving transformation. However, based on our specific assumptions (only multi-parameter type class constraints to the left of implications, only single-headed simplification rules where there are no type equations on the right-hand side), we are able to guarantee that any solution to the transformed constraint yields a solution to the original constraint. We formally define type class unambiguity. We say that C_0 ∧ ∃b̅.((D_1 ⊃ C_1) ∧ ... ∧ (D_n ⊃ C_n)) is unambiguous iff fv(D_i, C_i, C_0) ⊆ fv(C_0) for i = 1, ..., n. We assume a semantically oriented definition of the set of free variables in a constraint, where a is in fv(C) iff ∃a.C is not logically equivalent to C. Thus, we can argue that the existentially bound variables b̅ are uniquely determined by the variables in fv(C_0). Hence, we can safely drop the existential quantifier ∃b̅. This is important and avoids us having to guess types.

Definition 4 (Unambiguous MPTC Program). We say a well-typed MPTC program is unambiguous iff all normalized and Skolemized implication constraints arising in inference derivations are unambiguous and for each type scheme ∀ā.C => t appearing we have that fv(C, t) ⊆ ā.

We can check for unambiguity as follows. It suffices to consider a single normalized and Skolemized constraint F ≡ C_0 ∧ (D ⊃ C). We simply build the m.g.u. φ of C_0 and apply it to D and C and check whether any free variables other than fv(φ(C_0)) remain. If this is the case, F is ambiguous. Note that because we apply the m.g.u. it is sufficient to calculate the set of syntactically free type variables.
Based on a similar method we can check that a type scheme is unambiguous.

6 Solving of Implication Constraints

We introduce a general-purpose procedure to solve implication constraints arising out of MPTC programs. Thus, we obtain sound type inference. Before we consider the solving problem we investigate the checking problem. We consider a (local) sub-problem of the form D ⊃ C. The following methods can easily be extended to deal with conjunctions of sub-problems.

Lemma 2 (Checking). Let P be a MPTC program theory. Let D ⊃ C be an implication constraint. Then, S is a solution if (1) S, D ↣*_P D' and (2) S, D, C ↣*_P C' such that (3) |= D' <-> C'.

Note that D ⊃ C iff D <-> D, C. Hence, S is a solution if S, D <-> S, D, C holds in the program theory P. Following our earlier work [SS02], we can check the last statement by executing (1) and (2). The test (3) verifies that the final stores are logically equivalent. Effectively, we check that the m.g.u.s in both stores are equivalent and the type class constraints are syntactically equivalent (after applying the m.g.u.). Note that MPTC instances never introduce any fresh
variables (see the second condition of Definition 1). Hence, fv(D') ⊆ fv(S, D) and fv(C') ⊆ fv(S, D, C). In the case that the constraints are unambiguous, we have that fv(S, D) = fv(S, D, C). Then, the above check is even complete. Of course, for type inference we need more than checking. Though, checking for solutions provides us with some vital clues for how to solve implication constraints. Consider D ⊃ C and assume we check whether True is a solution. In case the check fails, the best guess we can make is to add the difference between the final stores, being careful not to let universal variables (represented by Skolem constructors) escape. Then, we check again, and so on. Based on this observation we suggest the following solving procedure.

General solving method: Consider the normalized and Skolemized implication constraint F ≡ C_0, (D ⊃ C), F'. We execute C_0, D ↣*_P D' and C_0, D, C ↣*_P C' for some D' and C'. If False is in C' we immediately fail. If C' − D' yields the empty set (i.e. the final stores are logically equivalent) we consider the implication as solved and continue with C_0, F'. We immediately stop in case F' is empty. Otherwise, we set S to be the subset of all constraints in C' − D' which do not refer to a Skolem constructor. In case S is non-empty, we continue with (C_0, S), (D ⊃ C), F'. In case S is empty, we pick some (D_1 ⊃ C_1) in F' and continue with C_0, (D_1 ⊃ C_1), (F' − (D_1 ⊃ C_1)), (D ⊃ C). Otherwise, we fail. We illustrate our solving method by an example. Consider

    class Foo a b where
      foo :: b -> a -> Int

    instance Foo Int b  -- (F)

    class Bar a b where
      bar :: b -> a -> a

    data Erk a = forall b. Bar a b => K1 (a,b)
               | forall b. K2 (a,b)

    g (K1 (a,b)) = bar b a
    g (K2 (a,b)) = foo b a

Here is the (slightly simplified) implication constraint arising out of g's program text. Note that t refers to the type of g.

    t = Erk a -> t_3, t_3 = t_1, t_3 = t_2,
    ∀b.(Bar a b ⊃ (Bar a b, t_1 = a)),
    ∀b.(True ⊃ (Foo a b, t_2 = Int))

We find that t = Erk a -> t_3, t_3 = t_1, t_2 = t_1, t_1 = a, t_2 = Int is a solution.
Indeed, GHC also infers that g has the type Erk Int -> Int. In our solving method we Skolemize constraints, which leads to

    t = Erk a -> t_3, t_3 = t_1, t_3 = t_2              (C_0)
    (Bar a Sk_1 ⊃ (Bar a Sk_1, t_1 = a))                (1)
    (True ⊃ (Foo a Sk_2, t_2 = Int))                    (2)

Assume we start solving with sub-problem (1). We find that S_1 ≡ t_1 = a after taking the difference between the final stores. Hence, we continue with C_0, S_1 and we can then verify that (1) is solved by C_0, S_1. Consider sub-problem (2). We find that S_2 ≡ t_2 = Int. Finally, we can verify that C_0, S_1, S_2 solves sub-problem (2) and we obtain that g has type Erk Int -> Int. Note that if we start solving with (2) we encounter S_1 ≡ t_2 = Int. We cannot solve (2) yet. Hence, we consider (1), which leads to S_2 ≡ t_1 = a. Then, we can
verify that C_0, S_1, S_2 solves (1) and (2). Obviously, we obtain the same result. Note that Erk Int -> Int is the principal type here. We conclude this section by stating the following results.

Lemma 3 (Solving). Let P be a MPTC program theory and F a normalized and Skolemized implication constraint. Every constraint S obtained by the general solving method is a valid solution of F w.r.t. P.

We can apply the general solving procedure to check whether a MPTC program is well-typed.

Theorem 2 (Soundness of MPTC Type Inference). We achieve sound type inference for MPTC programs.

Note that our inference method succeeds for function f from the introduction. However, we still fail for function f3. Hence, our inference method is incomplete. Next, we discuss how to achieve complete type inference where the inferred type is principal.

7 Principal Types for MPTC

We identify two conditions under which we obtain principal types. The first condition demands that the type class constraints arising on the right-hand side of ⊃ must be fixed. This can be enforced by type annotations. The second condition demands that all constraints on the left-hand side of ⊃ are universal. That is, all parameters of type class constraints in data type definitions are universally quantified. For both conditions we must additionally demand that the instances satisfy the Haskell multi-parameter type class restrictions (Definition 1). We refer to the set P of CHRs resulting from such instances as a MPTC program theory.

7.1 Fixed Condition

In our first result we characterize under which conditions the solutions computed by the general solving method from the previous sections are principal.

Lemma 4 (Local Principal Solutions). Let P be a MPTC program theory and F ≡ C_0 ∧ (D ⊃ C) be a normalized, Skolemized and unambiguous MPTC implication constraint such that D ↣*_P D' and D, C ↣*_P C' for some D' and C'. Let S be the set difference between C' and D', i.e. S = C' − D'.
Then, C0, S is a principal solution of F if (1) S does not contain False, (2) each type class constraint in S does not refer to a Skolem constructor, and (3) F has a solution.

A proof can be found in Appendix A. Note that we can decide whether the conditions of Lemma 4 are satisfied. Conditions (1) and (2) can easily be checked. To verify condition (3) we can simply test whether C0, S is a solution based on the above method (note that the lemma then guarantees that C0, S must be principal). Checking for unambiguity is discussed at the end of Section 5.

Our next result (which holds for arbitrary program theories and implication constraints) shows how to combine locally computed principal solutions.
Lemma 5 (Combining Principal Solutions). Let P be a program theory. Let F1 and F2 be two implication constraints and S1 and S2 be two constraints such that S1 is a principal solution of F1 and S2 is a principal solution of F2. Then, S1 ∧ S2 is a principal solution of F1 ∧ F2.

Proof: Note that S1 ∧ S2 is a solution of F1 ∧ F2. Assume P |= S ⊃ F1 ∧ F2. Hence, P |= S ⊃ F1 and P |= S ⊃ F2. S1 and S2 are principal. Hence, P |= S ⊃ S1 and P |= S ⊃ S2, which shows that P |= S ⊃ S1 ∧ S2.

Immediately, we can state the following result.

Lemma 6 (Local Principal Types for MPTC). We can infer principal types for multi-parameter type classes satisfying the MPTC conditions if local principal solutions exist and programs are unambiguous.

To guarantee that local principal solutions exist we require that the type of each type class method which appears within a type annotation or case expression must be explicitly annotated. In such a situation we say that type class methods are fixed. Fixed type class methods imply that the type parameters of type class constraints are fixed. Hence, we can guarantee that local principal solutions exist or no solutions exist at all. We summarize.

Theorem 3 (Principal Types for Fixed MPTC). We can infer principal types for multi-parameter type classes satisfying the Haskell multi-parameter type class conditions if type class methods are fixed.

Note that the fixed condition implies that programs are unambiguous. Instead of fixing the type of type class methods, we conjecture that we can equivalently demand that parameters of type class constraints on the right-hand side of implication constraints are universally quantified (which implies that these constraints arise from method applications to existential variables). For the specific case of single-parameter type classes, the single parameter of a type class is either universally quantified or unbound. Hence, there is no need to impose the fixed condition.
Hence, we can easily verify the following result, which has been proven before by Läufer [Läu96].

Corollary 1. We can infer principal types for single-parameter type classes satisfying the Haskell multi-parameter type class conditions.

Note that in order to sufficiently annotate programs, we may need pattern annotations and lexically scoped annotations. E.g., in case of function f the following annotations are sufficient.

  f x z = (case z of (Mk (y::b)) -> (foo::b->Int->Int) y x, x == length [])

However, our current syntax does not support such forms of annotations. The extension is fairly straightforward and is discussed elsewhere [SW05].

7.2 Universal Condition

Instead of the fixed condition we demand that all type class parameters in data type definitions are universally quantified. In such a case constraints on the
left-hand side of ⊃ are fully Skolemized. In such a situation, we say that the universal condition is satisfied. Note that the example from the introduction satisfies this condition (trivially), whereas in the example from Section 6 the definition of Erk a contains the type class constraint Bar a b whose parameter a is not universally quantified.

If left-hand side constraints are fully Skolemized and the Haskell multi-parameter type class restrictions hold, we can guarantee that these constraints will not introduce any fresh variables while solving. Hence, all (partial) solutions will either lead to a principal solution or we reach failure.

Theorem 4 (Principal Types for Universal MPTC). For each program which satisfies the Haskell multi-parameter type class restrictions and where all parameters of type class constraints in data type definitions are universally quantified, if inference succeeds, then the inferred type is principal.

Note that our inference method fails for function f3 from the introduction, which satisfies the universal condition. However, as we have observed, function f3 has no principal type. We conjecture that failure of our solving method implies that the program has no principal type.

The condition that all parameters of type class constraints in data type definitions are universally quantified is crucial to establish the above result. Consider

  class Foo a
  class Bar a b where bar :: a->b
  instance Foo a => Bar a [a]
  data Erk a = Foo a => Mk a
  f (Mk x) = bar x

In the data type definition for Erk a we find the type class constraint Foo a where a is not universally quantified. Our inference method applied to the normalized and Skolemized implication constraint t_f = Erk a -> b, (Foo a ⊃ Bar a b) yields the solution Bar a b, t_f = Erk a -> b. However, the type Bar a b => Erk a -> b is not principal. E.g., we can give f the incomparable type Erk a -> [a].
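The loss of principality can be observed directly in GHC: both incomparable types are accepted as annotations for f. Below is a self-contained sketch; the method body for bar, the Foo Int instance, and the names f1/f2 are our additions for the sake of a runnable example.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances,
             ExistentialQuantification #-}

class Foo a
class Bar a b where bar :: a -> b

-- The instance from the text, given a body so the example runs.
instance Foo a => Bar a [a] where bar x = [x]
instance Foo Int

-- Constructor context Foo a: matching on Mk brings Foo a into scope.
data Erk a = Foo a => Mk a

-- Both annotated variants type-check, yet neither type is an
-- instance of the other: the two types are incomparable.
f1 :: Bar a b => Erk a -> b
f1 (Mk x) = bar x

f2 :: Erk a -> [a]
f2 (Mk x) = bar x   -- uses instance Bar a [a], discharged by Foo a

main :: IO ()
main = print (f2 (Mk (1 :: Int)))   -- prints [1]
```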
The above suggests that it is not obvious how to relax the fixed and universal conditions while maintaining complete inference with principal types. In general, the only way to obtain complete inference seems to be to enumerate all possible solutions. This is what we discuss in the next section.

8 Complete and Semi-Decidable Type Inference

The basic idea of our complete inference method is to unify constraints to find further possible solutions. Consider the implication constraint t_f = Erk a -> b, (Foo a ⊃ Bar a b) from the previous section. We execute (the calculations here are trivial, no CHR applies)

  t_f = Erk a -> b, Foo a ↠* t_f = Erk a -> b, Foo a
and

  t_f = Erk a -> b, Foo a, Bar a b ↠* t_f = Erk a -> b, Foo a, Bar a b

We unify Bar a b against the head of the instance

  instance Foo a => Bar a [a]

which yields b = [a]. Thus, we find the additional solution t_f = Erk a -> [a].

The following example shows that we may also need to unify constraints which arise from the left-hand side of ⊃. Consider the code

  class Foo a b where foo::a->b
  class Foo2 a b
  instance Foo b a => Foo2 b [a]
  data Erk a = forall b. Foo2 b a => Mk b
  g (Mk x) = foo x

In inferring the type of g we obtain the (simplified) constraint

  t = Erk a -> c, ∃b.(Foo2 b a ⊃ Foo b c)

or, Skolemized,

  t = Erk a -> c, (Foo2 Sk a ⊃ Foo Sk c)

While there is no instance of Foo to match Foo Sk c, there is a unique principal solution t = Erk [c] -> c, given by assuming that a = [c]. This arises from unifying the constraint Foo2 Sk a on the left-hand side with the instance head.

In general, the complete solving method is as follows. We manipulate a set ℱ of implication constraints representing the possible solutions of the original implication constraint F. Initially ℱ = {F}. Choose an implication constraint F ∈ ℱ. Apply the incomplete method of the previous section. If it succeeds, then we add the solution C0 built by the process to the set of solutions. Suppose F is of the form C0, (D ⊃ C), F′ such that C′ − D′ is not empty, where C0, D ↠*_P D′ and C0, D, C ↠*_P C′. If C′ − D′ contains a type equation, then it must include Skolem constructors, which would require them to escape; hence we can conclude that F has no solutions, and we simply remove it from ℱ. Otherwise we choose a type class constraint TC s̄ in C′ − D′ or D′. For each (renamed apart) instance declaration rule with head TC t̄, where the mgu of s̄ = t̄ exists, we create a new implication constraint

  C0, (∃fv(s̄). s̄ = t̄), (D ⊃ C), F′

Note that by writing (∃fv(s̄). s̄ = t̄) we denote the projection of the type equations s̄ = t̄ onto the variables fv(s̄). Thus, we avoid introducing spurious type equations containing Skolem constructors.
E.g., when unifying Foo2 Sk a against the instance head we generate the spurious constraint b = Sk, which is projected away. Additionally, if TC s̄ appears in C, then for each unifying type class constraint TC ū appearing in D′, where the mgu of ū = s̄ exists, we create the new implication constraint

  C0, (∃fv(s̄). s̄ = ū), (D ⊃ C), F′

We add each of the new implication constraints constructed above to ℱ and continue.

Examining f3, the Skolemized simplified constraint F is

  t = t_x -> Bar Int, (True ⊃ Foo Sk t_x)
and ℱ = {F}. At this point the incomplete method fails. The complete method selects the only type class constraint and, using the two unifying instances for Foo Sk t_x, builds two new implication constraints {F1, F2} defined by

  F1 ≡ t = t_x -> Bar Int, t_x = Int, (True ⊃ Foo Sk t_x)
  F2 ≡ t = t_x -> Bar Int, t_x = Bool, (True ⊃ Foo Sk t_x)

These two implication constraints are solvable (by the incomplete method) giving constraints

  t = t_x -> Bar Int, t_x = Int
  t = t_x -> Bar Int, t_x = Bool

which are the two possible solutions.

Lemma 7. Let E be a satisfiable constraint not involving Skolem constructors arising out of the procedure above. Then E ∈ solve(F, P), that is, it is a solution of F.

Proof: Each of the new implication constraints is a consequence of the original constraint F; hence any solutions to them are solutions to F.

Theorem 5. Let P be a program theory satisfying the MPTC conditions, and let F be a normalized, Skolemized and unambiguous MPTC implication constraint. Let E ∈ solve(F, P). Let {E1,...,En,...} be the set of constraints (solutions, not implication constraints) arising out of the complete procedure above. Then E ⊃ E_i for some i.

A proof is given in Appendix B. The complete algorithm as defined above is clearly non-terminating, but with a fair breadth-first search procedure we create a semi-decision procedure that eventually discovers each individual solution. We could clearly make the algorithm more deterministic, but there is an underlying problem of non-termination. Consider the declarations

  class Foo a b
  instance Foo a b => Foo [a] b

Assume we attempt to solve F ≡ True ⊃ Foo c Sk. We unify Foo c Sk with the instance head, which leads to c = [d], (True ⊃ Foo c Sk). In the next step we have that

  c = [d], Foo c Sk ↠ c = [d], Foo d Sk

and a variant of the same unmatched constraint Foo d Sk arises. It is interesting further work to see when we can safely detect such cycles.
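The fair search just described can be sketched generically: each subproblem either yields a solution or expands into finitely many refined subproblems, and processing the worklist in breadth-first order guarantees that every solution at finite depth is eventually emitted, even if some branch (like the Foo [a] b cycle above) expands forever. The following is a toy model with invented names (solutions, partitions), not the paper's solver.

```haskell
-- Generic fair enumeration: 'step' either reports a solution (Right)
-- or refines a subproblem into finitely many new ones (Left).
-- New subproblems go to the back of the queue, so no branch can
-- starve the others: a semi-decision procedure.
solutions :: (p -> Either [p] s) -> [p] -> [s]
solutions step = go
  where
    go []       = []
    go (p : ps) = case step p of
      Right s  -> s : go ps         -- emit a solution, keep searching
      Left ps' -> go (ps ++ ps')    -- enqueue refinements at the back

-- Toy instance: enumerate the ways to write n as an ordered sum of
-- 1s and 2s; a subproblem is (remaining amount, choices so far).
partitions :: Int -> [[Int]]
partitions n = solutions step [(n, [])]
  where
    step (0, acc) = Right (reverse acc)
    step (m, acc) = Left [ (m - k, k : acc) | k <- [1, 2], k <= m ]

main :: IO ()
main = print (partitions 3)   -- prints [[1,2],[2,1],[1,1,1]]
```

Laziness makes the enumeration usable even when the subproblem tree is infinite: consumers can demand solutions one at a time.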
9 Conclusion

We have pointed out subtle problems when performing type inference for multi-parameter type classes with existential types and type annotations. We have proposed a number of novel inference methods under which we retain principality and completeness of type inference. To the best of our knowledge, we are the first to identify the inference problem and to suggest several potential solutions. An interesting question is what happens once we include further features such as functional dependencies [Jon00]. We plan to pursue this topic in the future.
References

[DM82] L. Damas and R. Milner. Principal type-schemes for functional programs. In Proc. of POPL'82. ACM Press, January 1982.
[Fax03] K. F. Faxén. Haskell and principal types. In Proc. of Haskell Workshop'03. ACM Press, 2003.
[Frü95] T. Frühwirth. Constraint handling rules. In Constraint Programming: Basics and Trends, LNCS. Springer-Verlag, 1995.
[GHC] Glasgow Haskell Compiler home page.
[Has] Haskell 98 language report. haskell98-revised/haskell98-report-html/.
[HUG] Hugs home page. haskell.cs.yale.edu/hugs/.
[JJM97] S. Peyton Jones, M. P. Jones, and E. Meijer. Type classes: an exploration of the design space. In Haskell Workshop, June 1997.
[JN00] M. P. Jones and J. Nordlander, 2000.
[Jon92] M. P. Jones. Qualified Types: Theory and Practice. D.Phil. thesis, Oxford University, September 1992.
[Jon93] M. P. Jones. Coherence for qualified types. Research Report YALEU/DCS/RR-989, Yale University, Department of Computer Science, September 1993.
[Jon95] M. P. Jones. Simplifying and improving qualified types. In FPCA'95: Conference on Functional Programming Languages and Computer Architecture. ACM Press, 1995.
[Jon00] M. P. Jones. Type classes with functional dependencies. In Proc. of ESOP'00, volume 1782 of LNCS. Springer-Verlag, 2000.
[Läu96] K. Läufer. Type classes with existential types. Journal of Functional Programming, 6(3), 1996.
[LMM87] J. Lassez, M. Maher, and K. Marriott. Unification revisited. In Foundations of Deductive Databases and Logic Programming. Morgan Kaufmann, 1987.
[LO94] K. Läufer and M. Odersky. Polymorphic type inference and abstract data types. ACM Trans. Program. Lang. Syst., 16(5), 1994.
[Mil78] R. Milner. A theory of type polymorphism in programming. Journal of Computer and System Sciences, 17, December 1978.
[Mil92] Dale Miller. Unification under a mixed prefix. J. Symb. Comput., 14(4), 1992.
[OL96] M. Odersky and K. Läufer. Putting type annotations to work. In Proc. of POPL'96. ACM Press, 1996.
[OSW99] M. Odersky, M. Sulzmann, and M. Wehr. Type inference with constrained types. Theory and Practice of Object Systems, 5(1):35–55, 1999.
[Pot98] F. Pottier. A framework for type inference with subtyping. In Proc. of ICFP'98. ACM Press, 1998.
[Rém93] D. Rémy. Type inference for records in a natural extension of ML. In Carl A. Gunter and John C. Mitchell, editors, Theoretical Aspects of Object-Oriented Programming: Types, Semantics and Language Design. MIT Press, 1993.
[Rob65] J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the Association for Computing Machinery, 12:23–41, 1965.
[Sho67] J. R. Shoenfield. Mathematical Logic. Addison-Wesley, 1967.
[SP05] V. Simonet and F. Pottier. Constraint-based type inference for guarded algebraic data types. Research Report 5462, INRIA, January 2005.
[SS02] P. J. Stuckey and M. Sulzmann. A theory of overloading. In Proc. of ICFP'02. ACM Press, 2002.
[Sul00] M. Sulzmann. A General Framework for Hindley/Milner Type Systems with Constraints. PhD thesis, Yale University, Department of Computer Science, May 2000.
[SW05] M. Sulzmann and J. Wazny. Lexically scoped type annotations. 2005.
[Zen99] C. Zenger. Indizierte Typen. PhD thesis, Universität Karlsruhe, 1999.

A Proof of Lemma 4

In preparation, we state the following result, which can be proven by induction over the CHR derivations.

Lemma 8. Let P be an MPTC program theory. Let C, D2 ↠*_P C1 and C, D1, D2 ↠*_P C2 such that fv(D1, D2) ⊆ fv(C) and |= C1 ↔ C2. Then, C ↠*_P C1′ and C, D1 − D2 ↠*_P C2′ such that |= C1′ ↔ C2′.

We also restate (a specific instance of) the Canonical Normal Form Lemma from [SS02].

Lemma 9. Let P be an MPTC program theory. We have that P |= D ⊃ C iff D ↠*_P D′ and D, C ↠*_P C′ where |= D′ ↔ C′.

We verify Lemma 4: Let P be an MPTC program theory and F ≡ C0, (D ⊃ C) be a normalized, Skolemized and unambiguous MPTC implication constraint such that D ↠*_P D′ and D, C ↠*_P C′ for some D′ and C′. Let S be the set difference between C′ and D′, i.e. S = C′ − D′. Then, C0, S is a principal solution of F if (1) S does not contain False, (2) each type class constraint in S does not refer to a Skolem constructor, and (3) F has a solution.

Proof: Condition (1) is obvious. Condition (2) prevents existential variables (which are Skolemized in the normalized constraint) from escaping. Condition (3) is interesting. If F has a solved form, then adding the difference among final stores yields a principal solved form. The proof goes as follows. We first show that if S′ is a solved form then P |= S′ ⊃ (S, C0); hence, S, C0 is principal. W.l.o.g., we assume that fv(S′) = fv(C0). Note that Skolemization is a satisfiability-preserving transformation. Hence, if S′ is a solved form then P |= S′ ⊃ (C0, (D ⊃ C)). Note that constraints are unambiguous; hence, we can drop all existential quantifiers. We have that P |= S′ ⊃ (C0, (D ⊃ C)) iff P |= S′ ⊃ C0 and P |= (S′, D) ⊃ C, which yields P |= (S′, D) ⊃ (S′, D, C) (4). Soundness of CHR solving and range-restriction allows us to conclude from D ↠*_P D′ and D, C ↠*_P C′ that |= D ↔ D′ and |= (D, C) ↔ C′ (5). From (4) and (5) we conclude that P |= (S′, D′) ⊃ (S′, D′, C′).
From Lemma 9 we conclude that S′, D′ ↠*_P C1 and S′, D′, C′ ↠*_P C2 where |= C1 ↔ C2 (recall that unambiguity implies that fv(D, C, C0) ⊆ fv(C0) and by assumption fv(S′) = fv(C0); hence, w.l.o.g. we assume that fv(D, C) ⊆ fv(S′), which could be enforced by normalizing constraints by building the m.g.u.). From Lemma 8 we obtain that S′ ↠*_P C1′ and S′, C′ − D′ ↠*_P C2′ where |= C1′ ↔ C2′. Lemma 9 finally yields that P |= S′ ⊃ (C′ − D′). Hence, we have verified that C0, C′ − D′ is principal.

We show that C0, C′ − D′ is a solved form, i.e. P |= (C0, C′ − D′) ⊃ (C0, (D ⊃ C)). From Lemma 9 we conclude that this holds iff C0, C′ − D′, D ↠*_P C1 and C0, C′ − D′, D, C ↠*_P C2 where |= C1 ↔ C2. We can verify this statement by induction over the CHR derivation.
We provide several examples to show that the Haskell restrictions are necessary. In terms of the CHR theory, these restrictions guarantee confluence and range-restriction (i.e. grounding the lhs of a rule also grounds the rhs).

First, we show that confluence is necessary.

Example 1. We assume a constant constraint Foo, a unary constraint Erk and a binary constraint Bar, and the non-confluent program logic P = {Bar a b <=> Foo, Bar a b <=> Erk b}. Consider the implication constraint ∀b.(True ⊃ Bar a b), which yields True ⊃ Bar a Sk after normalization. We have Bar a Sk ↠ Erk Sk. Hence, our local principal solved form procedure suggests that True is a principal solved form. However, this is not the case. Note that we also have that Bar a Sk ↠ Foo, which shows that Foo is a solved form.

Next, we consider range-restrictedness.

Example 2. We assume a unary constraint Bar and a binary constraint Foo, and the non-range-restricted program logic P = {Bar a <=> Foo a b}. Consider the implication constraint Foo a b ⊃ Bar a. The logical reading of P is ∀a.(Bar a ↔ ∃b.Foo a b). Hence, True is a solved form of Foo a b ⊃ Bar a. Our procedure computes Foo a b ↠* Foo a b and Bar a ↠ Foo a b. We find that Foo a b is a solved form which is not principal.

In our final example, we show that multi-headed rules break our result.

Example 3. We assume three constant constraints Bar, Foo and Erk, and the multi-headed program logic P = {Foo, Bar <=> Erk}. Consider Foo ⊃ Bar. We have that Foo, Bar ↠ Erk. Indeed, Erk is a solved form, but Bar is another, incomparable solved form. Note that the same argument applies to multi-headed propagation rules. E.g., consider Pp = {Foo, Bar ==> Erk}.

B Proof of Theorem 5

Essentially the proof relies on the fact that in order for E to be a solution it must give rise to derivations E, D ↠*_P D′ and E, D, C ↠*_P C′ where |= D′ ↔ C′. These derivations arising for E will be explored by the complete algorithm, and hence it will construct a solution E_i using these derivations which is no more specific than E.
The following lemma follows from the definition of derivation steps.

Lemma 10. Let θ be the mgu of the type equations in E. Suppose that θ(TC s̄) is rewritten to φ(C) using the rule TC t̄ <=> C with matching substitution φ, where φ(t̄) = θ(s̄). Then |= E ⊃ (∃fv(s̄). s̄ = t̄).

Proof (of Theorem 5): Assume F is of the form C0, (D ⊃ C), F′. Since E is a solution we have by Lemma 9 that for each Skolemized implication (D ⊃ C) in F, E, D ↠*_P D′ and E, D, C ↠*_P C′ where |= D′ ↔ C′. Let C0, D ↠*_P D′′ and C0, D, C ↠*_P C′′. By confluence of the CHR program there is a derivation E, D′′ ↠*_P D′ since |= E ⊃ C0. Similarly there must be a derivation E, C′′ ↠*_P C′. In building one of the implication constraints F′ ∈ ℱ the complete inference algorithm will mimic exactly all the derivation steps in these extra derivations
More informationThe Typed λ Calculus and Type Inferencing in ML
Notes on Types S. Arun-Kumar Department of Computer Science and Engineering Indian Institute of Technology New Delhi, 110016 email: sak@cse.iitd.ernet.in April 14, 2002 2 Chapter 1 The Typed λ Calculus
More informationSafe Stratified Datalog With Integer Order Does not Have Syntax
Safe Stratified Datalog With Integer Order Does not Have Syntax Alexei P. Stolboushkin Department of Mathematics UCLA Los Angeles, CA 90024-1555 aps@math.ucla.edu Michael A. Taitslin Department of Computer
More informationComputing Fundamentals 2 Introduction to CafeOBJ
Computing Fundamentals 2 Introduction to CafeOBJ Lecturer: Patrick Browne Lecture Room: K408 Lab Room: A308 Based on work by: Nakamura Masaki, João Pascoal Faria, Prof. Heinrich Hußmann. See notes on slides
More informationThe Encoding Complexity of Network Coding
The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network
More informationFunction Symbols in Tuple-Generating Dependencies: Expressive Power and Computability
Function Symbols in Tuple-Generating Dependencies: Expressive Power and Computability Georg Gottlob 1,2, Reinhard Pichler 1, and Emanuel Sallinger 2 1 TU Wien and 2 University of Oxford Tuple-generating
More informationTrees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.
Trees 1 Introduction Trees are very special kind of (undirected) graphs. Formally speaking, a tree is a connected graph that is acyclic. 1 This definition has some drawbacks: given a graph it is not trivial
More informationLet Arguments Go First
Let Arguments Go First Ningning Xie and Bruno C. d. S. Oliveira The University of Hong Kong {nnxie,bruno}@cs.hku.hk Abstract. Bi-directional type checking has proved to be an extremely useful and versatile
More informationCOMP 181. Agenda. Midterm topics. Today: type checking. Purpose of types. Type errors. Type checking
Agenda COMP 181 Type checking October 21, 2009 Next week OOPSLA: Object-oriented Programming Systems Languages and Applications One of the top PL conferences Monday (Oct 26 th ) In-class midterm Review
More information3 No-Wait Job Shops with Variable Processing Times
3 No-Wait Job Shops with Variable Processing Times In this chapter we assume that, on top of the classical no-wait job shop setting, we are given a set of processing times for each operation. We may select
More informationDerivation of a Pattern-Matching Compiler
Derivation of a Pattern-Matching Compiler Geoff Barrett and Philip Wadler Oxford University Computing Laboratory, Programming Research Group 1986 Introduction This paper presents the derivation of an efficient
More informationIncompatibility Dimensions and Integration of Atomic Commit Protocols
The International Arab Journal of Information Technology, Vol. 5, No. 4, October 2008 381 Incompatibility Dimensions and Integration of Atomic Commit Protocols Yousef Al-Houmaily Department of Computer
More informationSimplifying and Improving Qualified Types
Simplifying and Improving Qualified Types Mark P. Jones Yale University, Department of Computer Science P.O. Box 208285, New Haven, CT 06520-8285. jones-mark@cs.yale.edu Research Report YALEU/DCS/RR-1040,
More informationA system of constructor classes: overloading and implicit higher-order polymorphism
A system of constructor classes: overloading and implicit higher-order polymorphism Mark P. Jones Yale University, Department of Computer Science, P.O. Box 2158 Yale Station, New Haven, CT 06520-2158.
More informationLectures 20, 21: Axiomatic Semantics
Lectures 20, 21: Axiomatic Semantics Polyvios Pratikakis Computer Science Department, University of Crete Type Systems and Static Analysis Based on slides by George Necula Pratikakis (CSD) Axiomatic Semantics
More informationOn the Relationships between Zero Forcing Numbers and Certain Graph Coverings
On the Relationships between Zero Forcing Numbers and Certain Graph Coverings Fatemeh Alinaghipour Taklimi, Shaun Fallat 1,, Karen Meagher 2 Department of Mathematics and Statistics, University of Regina,
More informationHandout 9: Imperative Programs and State
06-02552 Princ. of Progr. Languages (and Extended ) The University of Birmingham Spring Semester 2016-17 School of Computer Science c Uday Reddy2016-17 Handout 9: Imperative Programs and State Imperative
More informationTYPE INFERENCE. François Pottier. The Programming Languages Mentoring ICFP August 30, 2015
TYPE INFERENCE François Pottier The Programming Languages Mentoring Workshop @ ICFP August 30, 2015 What is type inference? What is the type of this OCaml function? let f verbose msg = if verbose then
More informationLecture Notes on Induction and Recursion
Lecture Notes on Induction and Recursion 15-317: Constructive Logic Frank Pfenning Lecture 7 September 19, 2017 1 Introduction At this point in the course we have developed a good formal understanding
More informationFormally-Proven Kosaraju s algorithm
Formally-Proven Kosaraju s algorithm Laurent Théry Laurent.Thery@sophia.inria.fr Abstract This notes explains how the Kosaraju s algorithm that computes the strong-connected components of a directed graph
More informationTypes and Static Type Checking (Introducing Micro-Haskell)
Types and Static (Introducing Micro-Haskell) Informatics 2A: Lecture 13 Alex Simpson School of Informatics University of Edinburgh als@inf.ed.ac.uk 16 October, 2012 1 / 21 1 Types 2 3 4 2 / 21 Thus far
More informationMutable References. Chapter 1
Chapter 1 Mutable References In the (typed or untyped) λ-calculus, or in pure functional languages, a variable is immutable in that once bound to a value as the result of a substitution, its contents never
More informationEncryption as an Abstract Datatype:
June 2003 1/18 Outline Encryption as an Abstract Datatype: an extended abstract Dale Miller INRIA/Futurs/Saclay and École polytechnique 1. Security protocols specified using multisets rewriting. 2. Eigenvariables
More informationCS558 Programming Languages
CS558 Programming Languages Winter 2017 Lecture 7b Andrew Tolmach Portland State University 1994-2017 Values and Types We divide the universe of values according to types A type is a set of values and
More informationVS 3 : SMT Solvers for Program Verification
VS 3 : SMT Solvers for Program Verification Saurabh Srivastava 1,, Sumit Gulwani 2, and Jeffrey S. Foster 1 1 University of Maryland, College Park, {saurabhs,jfoster}@cs.umd.edu 2 Microsoft Research, Redmond,
More informationSemantic Subtyping. Alain Frisch (ENS Paris) Giuseppe Castagna (ENS Paris) Véronique Benzaken (LRI U Paris Sud)
Semantic Subtyping Alain Frisch (ENS Paris) Giuseppe Castagna (ENS Paris) Véronique Benzaken (LRI U Paris Sud) http://www.cduce.org/ Semantic Subtyping - Groupe de travail BD LRI p.1/28 CDuce A functional
More informationTypes and Static Type Checking (Introducing Micro-Haskell)
Types and Static (Introducing Micro-Haskell) Informatics 2A: Lecture 14 John Longley School of Informatics University of Edinburgh jrl@inf.ed.ac.uk 17 October 2017 1 / 21 1 Types 2 3 4 2 / 21 So far in
More informationPart III. Chapter 15: Subtyping
Part III Chapter 15: Subtyping Subsumption Subtype relation Properties of subtyping and typing Subtyping and other features Intersection and union types Subtyping Motivation With the usual typing rule
More informationSORT INFERENCE \coregular" signatures, they derive an algorithm for computing a most general typing for expressions e which is only slightly more comp
Haskell Overloading is DEXPTIME{complete Helmut Seidl Fachbereich Informatik Universitat des Saarlandes Postfach 151150 D{66041 Saarbrucken Germany seidl@cs.uni-sb.de Febr., 1994 Keywords: Haskell type
More informationType Systems COMP 311 Rice University Houston, Texas
Rice University Houston, Texas 1 Type Systems for Programming Language were invented by mathematicians before electronic computers were invented. What is a type? A meaningful subset of the set of the domain
More informationCovered Clause Elimination
Covered Clause Elimination Marijn Heule TU Delft, The Netherlands Matti Järvisalo Univ. Helsinki, Finland Armin Biere JKU Linz, Austria Abstract Generalizing the novel clause elimination procedures developed
More informationProgram Calculus Calculational Programming
Program Calculus Calculational Programming National Institute of Informatics June 21 / June 28 / July 5, 2010 Program Calculus Calculational Programming What we will learn? Discussing the mathematical
More informationProgramming Languages Fall 2014
Programming Languages Fall 2014 Lecture 7: Simple Types and Simply-Typed Lambda Calculus Prof. Liang Huang huang@qc.cs.cuny.edu 1 Types stuck terms? how to fix it? 2 Plan First I For today, we ll go back
More informationTowards a Safe Partial Evaluation of Lazy Functional Logic Programs
WFLP 2007 Towards a Safe Partial Evaluation of Lazy Functional Logic Programs Sebastian Fischer 1 Department of Computer Science, University of Kiel Olshausenstr. 40, D-24098, Kiel, Germany Josep Silva
More informationTerm Algebras with Length Function and Bounded Quantifier Elimination
with Length Function and Bounded Ting Zhang, Henny B Sipma, Zohar Manna Stanford University tingz,sipma,zm@csstanfordedu STeP Group, September 3, 2004 TPHOLs 2004 - p 1/37 Motivation: Program Verification
More informationPrinciples of Programming Languages
Principles of Programming Languages Lesson 14 Type Checking Collaboration and Management Dana Fisman www.cs.bgu.ac.il/~ppl172 1 Type Checking We return to the issue of type safety we discussed informally,
More informationLecture Notes on Static and Dynamic Semantics
Lecture Notes on Static and Dynamic Semantics 15-312: Foundations of Programming Languages Frank Pfenning Lecture 4 September 9, 2004 In this lecture we illustrate the basic concepts underlying the static
More informationOperational Semantics
15-819K: Logic Programming Lecture 4 Operational Semantics Frank Pfenning September 7, 2006 In this lecture we begin in the quest to formally capture the operational semantics in order to prove properties
More informationAn Efficient Staging Algorithm for Binding-Time Analysis
An Efficient Staging Algorithm for Binding-Time Analysis Takuma Murakami 1, Zhenjiang Hu 1,2, Kazuhiko Kakehi 1, and Masato Takeichi 1 1 Department of Mathematical Informatics, Graduate School of Information
More informationMathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras. Lecture - 37 Resolution Rules
Mathematical Logic Prof. Arindama Singh Department of Mathematics Indian Institute of Technology, Madras Lecture - 37 Resolution Rules If some literals can be unified, the same algorithm should be able
More informationOn Meaning Preservation of a Calculus of Records
On Meaning Preservation of a Calculus of Records Emily Christiansen and Elena Machkasova Computer Science Discipline University of Minnesota, Morris Morris, MN 56267 chri1101, elenam@morris.umn.edu Abstract
More informationA Type System for Object Initialization In the Java TM Bytecode Language
Electronic Notes in Theoretical Computer Science 10 (1998) URL: http://www.elsevier.nl/locate/entcs/volume10.html 7 pages A Type System for Object Initialization In the Java TM Bytecode Language Stephen
More informationTopic 3: Propositions as types
Topic 3: Propositions as types May 18, 2014 Propositions as types We have seen that the main mathematical objects in a type theory are types. But remember that in conventional foundations, as based on
More informationType-Safe Update Programming
Type-Safe Update Programming Martin Erwig and Deling Ren Oregon State University Department of Computer Science [erwig rende]@cs.orst.edu Abstract. Many software maintenance problems are caused by using
More informationProving the Correctness of Distributed Algorithms using TLA
Proving the Correctness of Distributed Algorithms using TLA Khushboo Kanjani, khush@cs.tamu.edu, Texas A & M University 11 May 2007 Abstract This work is a summary of the Temporal Logic of Actions(TLA)
More information6.001 Notes: Section 6.1
6.001 Notes: Section 6.1 Slide 6.1.1 When we first starting talking about Scheme expressions, you may recall we said that (almost) every Scheme expression had three components, a syntax (legal ways of
More informationFundamental Concepts. Chapter 1
Chapter 1 Fundamental Concepts This book is about the mathematical foundations of programming, with a special attention on computing with infinite objects. How can mathematics help in programming? There
More information