Operads and the Tree of Life John Baez and Nina Otter

Similar documents
Surfaces Beyond Classification

Understanding Spaces of Phylogenetic Trees

The space of tropically collinear points is shellable

Topos Theory. Lectures 3-4: Categorical preliminaries II. Olivia Caramello. Topos Theory. Olivia Caramello. Basic categorical constructions

Confidence Regions and Averaging for Trees

4 Fractional Dimension of Posets from Trees

Basic Combinatorics. Math 40210, Section 01 Fall Homework 4 Solutions

Lecture 11 COVERING SPACES

The exotic dodecahedron M 0,5 (R) David P. Roberts University of Minnesota, Morris

Fractals: Self-Similarity and Fractal Dimension Math 198, Spring 2013

Algebras. A General Survey. Riley Chien. May 4, University of Puget Sound. Algebras

Planar Graphs and Surfaces. Graphs 2 1/58

Mutation-linear algebra and universal geometric cluster algebras

Olivier Gascuel Arbres formels et Arbre de la Vie Conférence ENS Cachan, septembre Arbres formels et Arbre de la Vie.

6c Lecture 3 & 4: April 8 & 10, 2014

1.1 Topological Representatives for Automorphisms

1. a graph G = (V (G), E(G)) consists of a set V (G) of vertices, and a set E(G) of edges (edges are pairs of elements of V (G))

Evolution of Tandemly Repeated Sequences

TWO CONTRIBUTIONS OF EULER

Lecture 22 Tuesday, April 10

Point-Set Topology 1. TOPOLOGICAL SPACES AND CONTINUOUS FUNCTIONS

Thompson groups, Cantor space, and foldings of de Bruijn graphs. Peter J. Cameron University of St Andrews

Toric Ideals of Homogeneous Phylogenetic Models

Lecture 20: Clustering and Evolution

Introduction III. Graphs. Motivations I. Introduction IV

Alex Schaefer. April 9, 2017

Graphs and trees come up everywhere. We can view the internet as a graph (in many ways) Web search views web pages as a graph

Spaces with algebraic structure

4/4/16 Comp 555 Spring

arxiv: v1 [math.ag] 26 Mar 2012

Introduction to Computers and Programming. Concept Question

MT5821 Advanced Combinatorics

Graph Theory S 1 I 2 I 1 S 2 I 1 I 2

CS 2336 Discrete Mathematics

QUOTIENTS OF THE MULTIPLIHEDRON AS CATEGORIFIED ASSOCIAHEDRA

Mathematics and Statistics, Part A: Graph Theory Problem Sheet 1, lectures 1-4

Univalent fibrations in type theory and topology

Lecture 20: Clustering and Evolution

Homotopy theory of higher categorical structures

Lecture 1: An Introduction to Graph Theory

Adjacent: Two distinct vertices u, v are adjacent if there is an edge with ends u, v. In this case we let uv denote such an edge.

Evolutionary tree reconstruction (Chapter 10)

An Investigation of Closed Geodesics on Regular Polyhedra

2-9 Operations with Complex Numbers

NICOLAS BOURBAKI ELEMENTS OF MATHEMATICS. General Topology. Chapters 1-4. Springer-Verlag Berlin Heidelberg New York London Paris Tokyo

Convex Hull Realizations of the Multiplihedra

Notebook Assignments

Permutation Matrices. Permutation Matrices. Permutation Matrices. Permutation Matrices. Isomorphisms of Graphs. 19 Nov 2015

CMSC th Lecture: Graph Theory: Trees.

Math 311. Trees Name: A Candel CSUN Math

The diagonal of the Stasheff polytope

Leinster s globular theory of weak -categories

Grade 7/8 Math Circles Graph Theory - Solutions October 13/14, 2015

Hyperbolic Geometry and the Universe. the questions of the cosmos. Advances in mathematics and physics have given insight

Linear Systems on Tropical Curves

Week 9: Planar and non-planar graphs. 7 and 9 November, 2018

Surfaces. 14 April Surfaces 14 April /29

Category Theory 3. Eugenia Cheng Department of Mathematics, University of Sheffield Draft: 8th November 2007

6.3 Poincare's Theorem

Exercise set 2 Solutions

Inductive Types for Free

arxiv: v4 [math.gr] 16 Apr 2015

INTRODUCTION TO ALGEBRAIC GEOMETRY, CLASS 11

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. Discrete Mathematics

BAR-MAGNET POLYHEDRA AND NS-ORIENTATIONS OF MAPS

Polynomials. Math 4800/6080 Project Course

TRANSITIVE GRAPHS UNIQUELY DETERMINED BY THEIR LOCAL STRUCTURE

THE BRAID INDEX OF ALTERNATING LINKS

The Associahedra and Permutohedra Yet Again

Modular Arithmetic. is just the set of remainders we can get when we divide integers by n

On the Topology of Finite Metric Spaces

Lecture 1. 1 Notation

Towards an n-category of cobordisms

A history of the Associahedron

Topic 1: What is HoTT and why?

1.2 Functions and Graphs

Figure 4.1: The evolution of a rooted tree.

Grades 7 & 8, Math Circles 31 October/1/2 November, Graph Theory

I. An introduction to Boolean inverse semigroups

v V Question: How many edges are there in a graph with 10 vertices each of degree 6?

A Particular Type of Non-associative Algebras and Graph Theory

Instant Insanity Instructor s Guide Make-it and Take-it Kit for AMTNYS 2006

3.5 Day 1 Warm Up. Graph each line. 3.4 Proofs with Perpendicular Lines

Week 9: Planar and non-planar graphs. 1st and 3rd of November, 2017

An Introduction to Belyi Surfaces

Two Connections between Combinatorial and Differential Geometry

Today. Types of graphs. Complete Graphs. Trees. Hypercubes.

Chapter 3 Trees. Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.

Definition 1.1. A matching M in a graph G is called maximal if there is no matching M in G so that M M.

Session 27: Fractals - Handout

Different geometry in the two drawings, but the ordering of the edges around each vertex is the same

6.2 Classification of Closed Surfaces

Dynamics on some Z 2 -covers of half-translation surfaces

+ + (124 3) (142 3) (143 2) (412 3) (413 2) (431 2) P 2 P 4 P 2

Cluster algebras and infinite associahedra

An Investigation of the Planarity Condition of Grötzsch s Theorem

RATIONAL CURVES ON SMOOTH CUBIC HYPERSURFACES. Contents 1. Introduction 1 2. The proof of Theorem References 9

Maths. Formative Assessment/key piece of work prior to end of unit: Term Autumn 1

Graphs to semigroups. Alexei Vernitski University of Essex

Non-commutative symmetric functions III:

Transcription:

Operads and the Tree of Life John Baez and Nina Otter

We have entered a new geological epoch, the Anthropocene, in which the biosphere is rapidly changing due to human activities.

Last week two teams of scientists claimed the Western Antarctic Ice Sheet has been irreversibly destabilized, and will melt causing 3 meters of sea level rise in the centuries to come.

So, we can expect that mathematicians will be increasingly focused on biology, ecology and complex systems just as last century s mathematics was dominated by fundamental physics. Luckily, these new topics are full of fascinating mathematical structures and while mathematics takes time to have an effect, it can do truly amazing things. Think of Church and Turing s work on computability, and computers today!

Trees are important, not only in mathematics, but also biology.

The most important is the tree of life. Darwin thought about it:

In the 1860s, the German naturalist Haeckel drew it:

Now we know that the tree of life is not really a tree, due to endosymbiosis and horizontal gene transfer:

But a tree is often a good approximation. Biologists who try to infer phylogenetic trees from present-day genetic data often use simple models where: the genotype of each species follows a random walk, but species branch in two at various times. These are called Markov models.

The simplest Markov model for DNA evolution is the Jukes Cantor model. Consider a genome of fixed length: that is, one or more pieces of DNA having a total of N base pairs, each taken from the set {A,T,C,G}: ATCGATTGAGCTCTAGCG As time passes, the Jukes Cantor model says the genome changes randomly, with each base pair having the same constant rate of randomly flipping to any other. So, we get a Markov process on the set of genomes, X ={A,T,C,G} N I ll explain Markov processes later!

However, a species can also split in two! So, given current-day genome data from various species, biologists try to infer the most probable tree where, starting from a common ancestor, the genome undergoes a random walk most of the time but branches in two at certain times.

For example, here is Elaine Ostrander s reconstruction of the tree for canids:

Define a phylogenetic tree to be a rooted tree with leaves labelled by numbers 1, 2,..., n and edges labelled by times or lengths in [0, ). We require that: the length of every edge is positive, except perhaps for edges incident to a leaf or the root; a vertex that is an only child cannot have only one child.

For example, here is a phylogenetic tree with 5 leaves: 3 1 4 5 2 0 0 t 3 t 1 t 2 0 0 t 4 where t 1, t 2, t 3 0 and t 4 > 0 The root is labelled 0, the leaves 1, 2, 3, 4, 5.

The embedding of the tree in the plane is irrelevant, so these are the same phylogenetic tree: 3 1 4 5 2 0 0 t 4 t 1 t 2 0 0 t 3 = 1 3 4 5 2 0 0 t 4 t 2 t 1 0 0 t 3

Let Phyl n be the set of phylogenetic trees with n leaves. This has an obvious topology. Here is a continuous path in Phyl 4 : 1 2 3 4 1 2 3 4 1 2 3 4 1 1 1 1 1 1 1.2 1 1 1 1 1 0.5 1.2 1 1 1 1 1 1.2

Phylogenetic trees reconstructed by biologists are typically binary. There s a heated debate about trees of higher arity: can a species split into 3 or more species simultaneously? Generically, no. Binary trees form an open dense set of Phyl n, except for Phyl 1. But trees of higher arity matter when we consider paths, paths of paths,... etc. in Phyl n. This was emphasized here: Louis Billera, Susan Holmes and Karen Vogtmann, Geometry of the space of phylogenetic trees, Advances in Applied Mathematics 27 (2001), 733 767.

Billera, Holmes and Vogtmann study the set T n of phylogenetic trees with n leaves where the lengths of external edges edges incident to the root and leaves are fixed to a constant value. The reason: Phyl n = Tn [0, ) n+1 For example, they note T 4 is the cone on the Petersen graph:

The cone on any edge of the Petersen graph is a quadrant: (All these drawings with upside-down trees are theirs!)

The cone on any vertex of the Petersen graph is a ray, at which 3 quadrants meet:

The cone on any pentagon in the Petersen graph looks like this: This is a copy of the famous Stasheff pentagon!

This should remind us of the connection between trees and operads! Today my operads will always be permutative, meaning we have an action of S n on O n, compatible with composition. They will also be topological, meaning that O n is a topological space, and composition and permutations act as continuous maps.

Proposition. There is an operad Phyl, the phylogenetic operad, whose space of n-ary operations is Phyl n. Composition and permutations are defined in the visually evident way. So: What is the mathematical nature of this operad? How is it related to Markov processes with branching? How is it related to known operads in topology? Answer: Phyl is the coproduct of Com, the operad for commutative semigroups, and [0, ), the operad having only unary operations, one for each t [0, ). The first describes branching, the second describes Markov processes. Phyl is closely related to the Boardman Vogt W construction applied to Com. Let s see how this works...

The point of operads is that they have algebras. An algebra of O is a topological space X on which each operation f O n acts as a map α(f ): X n X obeying some plausible conditions. These conditions simply say there s an operad homomorphism α: O End(X ), where End(X ) is the operad whose n-ary operations are maps X n X. More generally we can consider an algebra of O in any symmetric monoidal category C enriched over Top. This is an object X C with an operad homomorphism α: O End(X ). A coalgebra of O in C is an algebra of O in C op. We ll see that the phylogenetic operad has interesting coalgebras in FinStoch, the category of finite sets and stochastic maps.

I ll use [0, ) as the name for the operad having only unary operations, one for each t [0, ), with composition of operations given by addition. The category FinStoch has finite sets as objects, and a morphism f : X Y is a map sending each point in X to a probability distribution on Y. Alternatively, a morphism f : X Y is a Y X -shaped matrix of real numbers where: the entries are nonnegative, each column sums to 1. Such a matrix is called stochastic. Composition of morphisms is matrix multiplication.

FinStoch becomes a symmetric monoidal category enriched over Top, where the tensor product of X and Y is X Y. A Markov process is an algebra of [0, ) in FinStoch. Concretely, a Markov process is a finite set X together with a stochastic map α(t): X X for each t 0, such that: α(s + t) = α(s)α(t), α(0) = 1 α(t) depends continuously on t. Since [0, ) is commutative, a coalgebra of [0, ) in FinVect is the same thing! If X is a set of possible genomes, a Markov process on X describes the random changes of the genome with the passage of time.

There s a unique operad Com with one n-ary operation for each n > 0, and none for n = 0. Algebras of Com in Top are commutative topological semigroups: there is just one way to multiply n elements for n > 0.

Any finite set X becomes a cocommutative coalgebra in FinStoch. The unique n-ary operation of Com acts as the diagonal n : R X R X R X This is a map, a special case of a stochastic map. This map describes the n-fold duplication of a probability distribution f on the set X of genomes when a species branches! 1 2 3 0

Any pair of operads O and O has a coproduct O + O. By general abstract nonsense, an algebra of O + O is an object X that is both an algebra of O and an algebra of O, with no compatibility conditions imposed. In fact: Theorem. The operad Phyl is the coproduct Com + [0, ).

And thus: Corollary. Given any Markov process, its underlying finite set X naturally becomes a coalgebra of Phyl in FinStoch. Proof. X is automatically a coalgebra of Com, and the Markov process makes it into a coalgebra of [0, ). Thus, it becomes a coalgebra of Phyl = Com + [0, ).

How is Phyl related to W(Com), where W is the construction that Boardman and Vogt used to get an operad for loop spaces? Define addition on [0, ] in the obvious way, where + t = t + = Then [0, ] becomes a topological monoid, so there s an operad with only unary operations, one for each t [0, ]. Let s call this operad [0, ]. Theorem. For any operad O, Boardman and Vogt s operad W(O) is isomorphic to a suboperad of O + [0, ].

Operations in O + [0, ] are equivalence classes of planar rooted trees with: vertices except for leaves and the root labelled by operations in O, edges labelled by lengths in [0, ], such that only external edges can have length 0. The equivalence relation comes from permuting edges coming into a vertex, e.g.: 1 2 3 f σ 2 3 1 f ( ) 1 2 3 σ = 2 3 1 0 0

Here is an operation in O + [0, ]: 3 1 4 5 2 t 2 t t 1 t 4 3 t 5 g h t 6 t 7 f 0 where t 1, t 2, t 3, t 4, t 5, t 8 0 and t 6, t 7 > 0. t 8

An operation is in W(O) iff either it is the identity operation or all external edges have length. Here is an operation in W(O): 3 1 4 5 2 g h t 1 t 2 f 0 where t 1, t 2 > 0.

Berger and Moerdijk showed: if S n acts freely on O n and O 1 is well-pointed, W(O) is a cofibrant replacement for O. This is true for O = Assoc, the operad whose algebras are topological semigroups, with n! operations of arity n > 0. This is why Boardman and Vogt could use W(Assoc) as an operad for loop spaces. But S n does not act freely on Com n. W(Com) is not a cofibrant replacement for Com. It is not an operad for infinite loop spaces. Nonetheless W(Com) is interesting because W(Com) n = Tn where T n is Billera, Holmes and Vogtmann s space of phylogenetic trees with external edges having fixed lengths.

And the larger operad Com + [0, ], a compactification of Phyl = Com + [0, ), is also interesting. The reason is that any Markov process α: [0, ) End(X ) approaches a limit as t. Indeed, it extends uniquely to a homomorphism of topological monoids α: [0, ] End(X ). We thus get: Proposition. Given any Markov process, its underlying finite set X naturally becomes a coalgebra of Com + [0, ] in FinStoch.

Summary for topologists who don t care about applications: For any operad O we have weak equivalences O W(O) O + [0, ) O + [0, ] and if O is well-pointed and S n acts freely on O n, W(O) is cofibrant.

Finally: the mystery of tropical trees!

The tropical rig is (, ] with minimization as + and addition as. We can do algebraic geometry over this rig and define tropical curves. In 2007, Gathmann, Kerber and Markwig showed that a certain moduli space of genus 0 tropical curves with n + 1 marked points is the space of trees T n studied by Billera, Holmes and Vogtmann. Also in 2007, Mikhalkin showed this moduli space has a compactification that is a smooth compact tropical variety. Nina Otter has shown this compactification is W(Com) n. The operad structure on W (Com) corresponds to the tropical clutching map described by Abramovich, Caporaso and Payne in 2012. The mystery: why are tropical curves related to phylogenetic trees? Is this connection good for something?