The Hack programming language: Types for PHP
Andrew Kennedy
Facebook
Facebook's PHP codebase
  - 350,000 files
  - >10,000,000 LoC (www.facebook.com & internally)
  - 1000s of commits per day, 2 releases per day
  - Anecdotally, good engineers are really productive in PHP
And yet
    "1" + "2"   =>   3
How far does this go?
    "1ne" + "2wo"   =>   3
How far does this go?
    "15" + "0xF"   =>   15
How far does this go?
    "15" == "0xF"   =>   true
What if it's not a numeric string?
    "hagfish" + 2   =>   2
OK, it treats non-numeric strings as zero.
    "hagfish" | "9000000"   =>   "yqwvysx"
It gets worse.
    $n = "hagfish"; $n++;   =>   "hagfisi"
It gets even worse. It never ends.
    $n = "z"; $n++;   =>   "aa"
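The two increments above follow a Perl-style rule: bump the last letter, wrap "z" to "a" with a carry to the left, and grow the string when the carry runs off the front. A minimal sketch of that rule, written in TypeScript rather than PHP and restricted to lower-case letters; `incrementLowercase` is a hypothetical helper name, not a PHP or Hack API.

```typescript
// Sketch only: handles non-empty strings of a-z, nothing else.
function incrementLowercase(s: string): string {
  if (!/^[a-z]+$/.test(s)) {
    throw new Error("this sketch only handles non-empty a-z strings");
  }
  let suffix = "";
  for (let i = s.length - 1; i >= 0; i--) {
    const c = s.charAt(i);
    if (c !== "z") {
      // No wrap-around: bump this letter and stop carrying.
      return s.slice(0, i) + String.fromCharCode(c.charCodeAt(0) + 1) + suffix;
    }
    // "z" wraps to "a" and the carry moves one position left.
    suffix = "a" + suffix;
  }
  // The carry ran off the front, so the string grows by one letter.
  return "a" + suffix;
}
```

Under these assumptions, "hagfish" becomes "hagfisi" and "z" becomes "aa", matching the slides.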
What to do? Types!*
  * And quite a lot of other things: remove features (e.g. "variable variables"), add async, typed XML syntax, ...
Some history
  - First focus for Facebook was performance of PHP
      2009: HipHop: translation from PHP to C++
      2013: HHVM: highly performant runtime used by FB, Wikipedia, ...
  - Dynamic checking of type hints => static type system => Hack (Julien Verlaguet)
  - Really two projects
      A programming language project (this talk)
      A systems project (make it scale to 10M lines, with parallelism, background incremental type checking, etc.)
  - Also see Flow (Avik Chaudhuri), a similar effort for JavaScript
Types matter at Facebook!
Hack: types for PHP
  - Object-oriented type system with generics in the style of Java, C# or Scala
  - Some structural subtyping (tuples, shapes, functions)
  - ML-like type inference based on unification
  - Flow-sensitive typing for locals
  - Type refinement for isnull/type tests
  - Internal use of union types and recursive types
  - ML-style abstract types
  - Gradual typing for mixed code (PHP, Hack)
Pragmatism
  - We're typing code that already exists!
  - Lots of special casing for common PHP idioms
  - Driven by the need to convert millions of lines of code and hundreds of reluctant developers
  => Language has a "materials in the room" feel to it
  - Materials being drawn from many years of P.L. research
Rich object-oriented type system
  - Primitive types, named classes, interfaces, and traits, with static and virtual methods
  - Generic type parameters on types and methods, with variance annotations and lower/upper bounds

    // Named class; Tk is a type parameter, +Tv a covariant type parameter
    abstract class ChunkIterable<Tk, +Tv> {
      abstract public function getIterator(): AsyncIterator<ResultChunk<Tk, Tv>>;

      // ?ChunkCursor is a maybe/option type
      abstract protected function getIteratorWithCursor(
        ?ChunkCursor $from,
        ?ChunkCursor $to,
        bool $iterate_backwards
      ): AsyncIterator<(ChunkCursorMaker<Tk>, ResultChunk<Tk, Tv>)>;

      // "Tu super Tv" is a lower bound
      final public function filter<Tu super Tv>(
        IChunqPredicate<Tu> $predicate
      ): ChunkIterable<Tk, Tu> {
        return new FilteredChunkIterable($this, $predicate);
      }
    }
More inference than Java, C# or Scala
  - Type annotation on function arguments and results only
  - Types inferred for locals; type parameters inferred for new and generic methods

    class List<T> { ... }
    function MakeSingleton<T>(T $x): List<T> { ... }

    function foo(int $b): void {
      $y = new List();
      $y->add($b);
      $z = new List();
      $z->add($y);              // $z inferred to be List<List<int>>
      $s = MakeSingleton($z);   // type parameter is inferred
    }

  Note: no type-based overloading! (contrast Java, C#)
Flow-sensitive typing of locals
  - Locals aren't even declared in PHP
  - What types can we write on parameter and result?

    function f($b) {
      if ($b) {
        $x = "b";
        bar($x);
        $x = 12;
      } else {
        $x = "a";
      }
      return $x;
    }

Flow-sensitive typing of locals
  - Locals aren't even declared in PHP
  - What types can we write on parameter and result?

    function f(bool $b): mixed {
      if ($b) {
        $x = "b";
        bar($x);
        $x = 12;
      } else {
        $x = "a";
      }
      return $x;
    }
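The same shape of code can be tried in another flow-sensitive checker. A minimal sketch in TypeScript (an analogue, not Hack), assuming the compiler's strict settings so that an undeclared-type local is tracked by assignment; `useText` is a hypothetical stand-in for the slide's bar, and `unknown` plays the role of mixed.

```typescript
// Hypothetical stand-in for bar($x): it only accepts a string.
function useText(s: string): number {
  return s.length;
}

function f(b: boolean): unknown {
  let x; // no declared type: the checker follows the assignments
  if (b) {
    x = "b";
    useText(x); // x is a string at this point
    x = 12;     // x is a number from here on
  } else {
    x = "a";
  }
  return x;     // number or string where the branches join
}
```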
Flow-sensitive refinement
  - Types in Hack do not contain null by default. Must write ?type to include null.
  - At last! Tony Hoare's billion-dollar mistake, rectified
  - Null tests in conditionals refine the type inside the branch
  - Similarly, can test dynamic class using instanceof

    function foo(?int $xopt): int {
      if ($xopt == null) {
        return 42;
      } else {
        return $xopt;   // type of $xopt is now int
      }
    }

    function bar(Widget $a): void {
      if ($a instanceof Button) {
        $a->click();
      } else {
        $a->doSomethingGeneric();
      }
    }
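For readers without a Hack checker to hand, both refinements on this slide can be reproduced in TypeScript. This is an analogue under assumed names (`Gadget`, `PushButton`, `orDefault`, `activate` are all illustrative), not Hack code.

```typescript
class Gadget {
  describe(): string {
    return "generic gadget";
  }
}

class PushButton extends Gadget {
  click(): string {
    return "clicked";
  }
}

function orDefault(xopt: number | null): number {
  if (xopt === null) {
    return 42;
  } else {
    // The null test has refined xopt from number | null to number.
    return xopt;
  }
}

function activate(g: Gadget): string {
  if (g instanceof PushButton) {
    // Refined to PushButton, so click() is available here.
    return g.click();
  } else {
    return g.describe();
  }
}
```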
Flow-sensitive refinement: expressions
  - Types of some expressions can be refined. But care needed!

    class C {
      private ?int $f = 0;
      function setToNull(): void {
        $this->f = null;
      }
      function get(): int {
        if ($this->f == null) {
          return 0;
        } else {
          return $this->f;   // type of $this->f is now int
        }
      }
    }

Flow-sensitive refinement: expressions
  - Types of some expressions can be refined. But care needed!

    class C {
      private ?int $f = 0;
      function setToNull(): void {
        $this->f = null;
      }
      function get(): int {
        if ($this->f == null) {
          return 0;
        } else {
          $this->setToNull();   // type is invalidated by function call
          return $this->f;      // TYPE ERROR!
        }
      }
    }
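One way to keep a refinement across a call is to copy the field into a local first, since a method call cannot reassign a local. A minimal sketch in TypeScript (an analogue, not Hack; `Holder`, `getSafely` and `peek` are hypothetical names). TypeScript's own checker is more permissive about property refinements than the behaviour described on this slide, so here the local copy matters for the run-time result rather than for the type check.

```typescript
class Holder {
  private f: number | null;

  constructor(initial: number | null) {
    this.f = initial;
  }

  setToNull(): void {
    this.f = null;
  }

  getSafely(): number {
    const snapshot = this.f; // local copy: later calls cannot change it
    if (snapshot === null) {
      return 0;
    } else {
      this.setToNull();      // mutates the field, not the local
      return snapshot;       // still a number
    }
  }

  peek(): number | null {
    return this.f;
  }
}
```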
Internal types
  - Internally, Hack uses a kind of union type for flow-sensitive typing

    class A { function Foo(): int { ... } }
    class B { function Foo(): string { ... } }

    function foo(bool $b, A $x, B $y): mixed {
      if ($b) {
        $obj = $x;
      } else {
        $obj = $y;
      }
      // Hack gives $obj the type A | B
      $result = $obj->Foo();
      // Hack gives $result the type int | string
      // int | string is a subtype of mixed
    }
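In Hack these unions are internal to the checker; in TypeScript they can be written and inspected directly, which makes the idea easy to try out. A small sketch under that substitution (`Alpha`, `Beta` and `pick` are illustrative names; `unknown` stands in for mixed).

```typescript
class Alpha {
  foo(): number {
    return 1;
  }
}

class Beta {
  foo(): string {
    return "one";
  }
}

function pick(b: boolean, x: Alpha, y: Beta): unknown {
  // After the two branches join, obj has the union type Alpha | Beta.
  const obj = b ? x : y;
  // Calling foo() on the union yields number | string,
  // which is assignable to the top type unknown.
  const result = obj.foo();
  return result;
}
```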
Structural typing
  - PHP uses arrays a lot. Arrays can be indexed by integer or string; they're extensible; and values are dynamically typed.
  - PHP arrays are often used in idiomatic ways; these are reflected in structural types in Hack:
      Tuples, e.g. tuple(int, string, MyClass)
      Associative maps, e.g. array<string,item>
      Records a.k.a. shapes, e.g. shape('id' => int, 'name' => string)
  - Also: function types, with proper co/contra-variant subtyping
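The three array idioms and the function-subtyping rule can be sketched in TypeScript, whose structural types are close in spirit. This is an analogue with hypothetical names throughout, not Hack syntax.

```typescript
type Entry = [number, string, Date];              // tuple
type Prices = Record<string, number>;             // associative map
type Person = { id: number; fullName: string };   // record, a.k.a. shape

const entry: Entry = [1, "first", new Date(0)];
const prices: Prices = { apple: 3 };
const person: Person = { id: 7, fullName: "Ada" };

class Animal {
  sound(): string {
    return "generic";
  }
}

class Dog extends Animal {
  fetch(): string {
    return "stick";
  }
}

// A function that accepts a supertype and returns a subtype can be used
// where (Dog) => Animal is expected: contravariant in, covariant out.
const adopt = (a: Animal): Dog => (a instanceof Dog ? a : new Dog());
const handler: (d: Dog) => Animal = adopt;
```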
Type abstraction
  - Where they are used, types are surprisingly strong
  - Abstraction is enforced on enumeration types (contrast C#)

    enum NodeColour : int {
      Red = 0;
      Black = 1;
    }

    No type compatibility between NodeColour and int

  - Opaque types, with optional supertype

    newtype UserId = string;

    Outside file, no compatibility between UserId and string
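TypeScript has no newtype, but a "branded" intersection type gives a rough emulation of an opaque type that keeps string as its supertype. A sketch under that assumption; `makeUserId` and `greet` are hypothetical helpers, and the brand exists only at the type level.

```typescript
// The brand field is never created at run time; it only marks the type.
type UserId = string & { readonly __brand: "UserId" };

// The single place where a raw string is turned into a UserId.
function makeUserId(raw: string): UserId {
  return raw as UserId;
}

function greet(id: UserId): string {
  // A UserId can still be used as a string (its supertype).
  return "user:" + id;
}

// A plain string is rejected by the checker where a UserId is required.
// @ts-expect-error
greet("raw-string");
```

Unlike Hack's file-scoped opacity, nothing stops other code from writing the same cast, so this is a convention rather than an enforced boundary.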
Gradual typing
  - Hack code is marked as strict, partial or decl
      Strict code has full type annotations and is fully type checked. It cannot call into legacy PHP code
      Partial code has optional annotations and is type checked as much as it can. It can call into legacy PHP code
      Decl code is not checked; but type annotations are processed for use by other files
  - Where types are omitted, Hack assumes an "any" type that is compatible with all other types
  - Contrast mixed, which is the top type w.r.t. subtyping
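The difference between a compatible-with-everything type and a top type shows up the same way in TypeScript, where `any` and `unknown` play roughly the roles of Hack's any and mixed. A small sketch of that analogy (function names are illustrative).

```typescript
function lengthViaAny(v: any): number {
  // any is compatible with every type, so this access is not checked.
  return v.length;
}

function lengthViaUnknown(v: unknown): number {
  // unknown is a top type: it must be refined before it can be used.
  if (typeof v === "string") {
    return v.length;
  }
  return -1;
}
```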
Type safety
  - The intention is that strict mode code is type safe
  - But no soundness theorem; and what would it say about mixed code?
  - Also, plenty of back doors, e.g. the invariant construct
Implementation
  - Hack is implemented in OCaml
  - Core of type checker is purely functional
  - Open sourced on GitHub: see http://hacklang.org
That's all folks. Questions?