Olivier Gascuel. Formal Trees and the Tree of Life (Arbres formels et Arbre de la Vie). Conference, ENS Cachan, September.
Transcription
1 Formal Trees and the Tree of Life (Arbres formels et Arbre de la Vie). Olivier Gascuel, Centre National de la Recherche Scientifique, LIRMM, Montpellier, France. 10 permanent researchers, 2 technical staff, 3 postdocs, 10 PhDs.
2 Phylogenetics: tree building algorithms and software (PhyML); tree combinatorics (triplets, quartets, minimum evolution); supertrees, gene trees/species trees; probabilistic modeling of substitutions; branch testing.
3 Text algorithmics; alignment; searching for repeats, motifs, tags; comparative genomics; new generation sequencing; machine learning and clustering; expression data analysis; hidden Markov models; mining pathogens: Plasmodium falciparum (malaria), HIV, flu; biodiversity. Outline of the talk: a bit of history and biology; definitions; numbers; topological distances; consensus; random models; algorithms to build trees.
4 Charles Darwin
5 Charles Darwin
6 Haeckel's tree (Arbre d'Haeckel), ~1875. Jakob Steiner.
7 Watson and Crick, 1962. Sequence. Replication: DNA duplicates. Transcription: RNA synthesis. Translation: protein synthesis. Protein folding. Function.
8 Replication: DNA duplicates, with mutations. Transcription: RNA synthesis. Translation: protein synthesis. Protein folding. Selection. Theodosius Dobzhansky: "Nothing in Biology Makes Sense Except in the Light of Evolution."
9 The Tree(s) of Life
10 HIV subtype A, Eastern & Southern Europe. The Tree(s) of Life.
11 A rooted phylogenetic tree: molecular clock assumption.
12 A rooted phylogenetic tree: outgroup species (OutG.). An unrooted phylogenetic tree, as inferred by most methods: topology, branch lengths, binary.
13 An unrooted phylogenetic tree, as inferred by some methods: topology, branch lengths, unresolved. Graphs (vertices, leaves, labels, edges, weights). Adjacency table:
Node    1  2  3  4  5  A  B  C
Ngbr 1  A  A  B  C  C  1  3  4
Ngbr 2  -  -  -  -  -  2  A  5
Ngbr 3  -  -  -  -  -  B  C  B
14 Graphs (vertices, leaves, labels, edges, weights). Computer representation, with branch lengths:
Node    1   2   3   4   5   A    B    C
Ngbr 1  A   A   B   C   C   1    3    4
Lgth 1  W1  W2  W3  W4  W5  W1   W3   W4
Ngbr 2  -   -   -   -   -   2    A    5
Lgth 2  -   -   -   -   -   W2   Wab  W5
Ngbr 3  -   -   -   -   -   B    C    B
Lgth 3  -   -   -   -   -   Wab  Wbc  Wbc
Parentheses (expressions, Newick format): ((1 × 2) + (3 × (4 − 5))) = −1.
15 Parentheses (expressions and recursions). An expression is either a Value or ( LeftExp Operator RightExp ).
compute (Exp)
  If Exp = Value, then return: Value
  Else Exp = (LeftExp Op RightExp)
    x = compute (LeftExp)
    y = compute (RightExp)
    Return: x Op y
This recursive procedure is strongly connected to trees and widely used, e.g. with parsimony and ML.
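The recursive procedure above can be sketched in Python as follows. This is only an illustration under assumptions not on the slide: expressions are encoded as nested tuples `(left, op, right)`, and the operator table `OPS` is hypothetical.

```python
import operator

# Hypothetical operator table; the slide only refers to a generic "Op".
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def compute(exp):
    """Evaluate an expression that is either a number or a (left, op, right) tuple."""
    if not isinstance(exp, tuple):   # Exp = Value
        return exp
    left, op, right = exp            # Exp = (LeftExp Op RightExp)
    x = compute(left)                # recurse on the left subtree
    y = compute(right)               # recurse on the right subtree
    return OPS[op](x, y)

# (1 * 2) + (3 * (4 - 5))
example = ((1, "*", 2), "+", (3, "*", (4, "-", 5)))
print(compute(example))  # -1
```

The call tree of `compute` has the same shape as the expression tree, which is why the same pattern carries over to tree algorithms such as parsimony or likelihood computations.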
16 Parentheses (expressions, Newick format). (Figures: example trees, including a taxon Tax 5, written as nested parentheses.)
17 Parentheses (expressions, Newick format). Output format: ((1:W1, 2:W2):Wab, 3:W3, (4:W4, 5:W5):Wbc);
Bipartitions (binary characters, splits). Tree building and comparison. A topology corresponds to its set of bipartitions: { {1,2}|{3,4,5}, {1,2,3}|{4,5}, {1}|L − {1}, {2}|L − {2}, {3}|L − {3}, {4}|L − {4}, {5}|L − {5} }.
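As a rough sketch of how the adjacency representation can be turned into the Newick output format, the following Python function walks the tree recursively from an internal node. The dictionary layout, the function name `to_newick`, the taxon labels 1 to 5 and the numeric branch lengths are assumptions made for illustration; the slides only give symbolic lengths (W1, Wab, ...).

```python
def to_newick(adj, node, parent=None):
    """Newick string of the subtree below `node`, when arriving from `parent`."""
    children = [(nb, w) for nb, w in adj[node] if nb != parent]
    if not children:                 # a leaf is written as its label
        return str(node)
    parts = [f"{to_newick(adj, nb, node)}:{w}" for nb, w in children]
    return "(" + ",".join(parts) + ")"

# Adjacency lists with branch lengths (illustrative values, not from the talk).
adj = {
    "1": [("A", 0.1)], "2": [("A", 0.2)], "3": [("B", 0.3)],
    "4": [("C", 0.4)], "5": [("C", 0.5)],
    "A": [("1", 0.1), ("2", 0.2), ("B", 1.0)],
    "B": [("A", 1.0), ("3", 0.3), ("C", 2.0)],
    "C": [("B", 2.0), ("4", 0.4), ("5", 0.5)],
}
print(to_newick(adj, "B") + ";")
# ((1:0.1,2:0.2):1.0,3:0.3,(4:0.4,5:0.5):2.0);
```

Starting the traversal from another internal node gives a different string for the same unrooted tree, which is one reason Newick strings are usually not compared directly.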
18 Bipartitions (binary characters, splits). A topology defines a bipartition set. Given a bipartition set, it's easy to check that it defines a unique topology, using a local condition: A|B and A'|B' are tree compatible iff one of $A \cap A'$, $A \cap B'$, $B \cap A'$, $B \cap B'$ is empty. A topology is equal to its bipartition set.
Bipartitions (binary characters, splits). A phylogenetic tree defines a bipartition set. Given a bipartition set, it's easy to check that it defines a unique topology, which may be unresolved, for example when the set contains a single non-trivial bipartition such as {1,2}|{3,4,5} in addition to the trivial ones {i}|L − {i}.
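The local compatibility condition is simple to test in code. The sketch below assumes a bipartition is stored as a pair of `frozenset` objects; the helper `split` and the five-taxon label set `L` are illustrative choices, not part of the talk.

```python
from itertools import combinations

def compatible(split1, split2):
    """A|B and A'|B' are compatible iff one of the four intersections is empty."""
    (a, b), (a2, b2) = split1, split2
    return any(not (x & y) for x in (a, b) for y in (a2, b2))

def all_compatible(splits):
    """Pairwise check over a whole bipartition set."""
    return all(compatible(s, t) for s, t in combinations(splits, 2))

L = frozenset("12345")  # assumed taxon set

def split(side):
    """Build the bipartition side | L - side."""
    side = frozenset(side)
    return (side, L - side)

print(compatible(split("12"), split("45")))  # True: {1,2} and {4,5} are disjoint
print(compatible(split("12"), split("23")))  # False: all four intersections are non-empty
```

Checking a set of m bipartitions this way needs on the order of m² pair tests, each a handful of set intersections.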
19 Bipartitions (binary characters, splits). L = {eagle, duck, dog, mouse, kiwi}. wings = {eagle, duck, kiwi} | {mouse, dog}. fly = {eagle, duck} | {kiwi, mouse, dog}. {eagle, duck} ∩ {mouse, dog} = ∅, so the two characters are compatible (figure: tree on eagle, duck, kiwi, mouse, dog). Maximizing character compatibility is hard!
Four Characters Suffice. Katharina Huber, The Swedish University of Agricultural Sciences and The Linnaeus Centre for Bioinformatics, Uppsala University, Sweden; Vincent Moulton, The Linnaeus Centre for Bioinformatics; Mike Steel, University of Canterbury, New Zealand.
20 Phylogenetic trees. Throughout this talk, we let X denote a finite set (of taxa). A tree T = (V, E) together with a map f : X → V is called a phylogenetic tree (on X) if f is a bijection of X onto the leaf set of T and all interior vertices of T have degree at least 3. Example: X = {1, 2, …, 6}.
Partitions from characters. We can associate a collection of partitions of X to any given collection of characters defined on X.
1  A T C G C T C
2  A T G C C G C
3  A G C T A G A
4  T C C A G T A
21 Convexity. A partition P is called convex on a phylogenetic tree T if for distinct parts A, B of P the subtrees T_A and T_B (i.e. the minimal subtrees of T containing A and B, respectively) have no vertex in common. (Figure: an example partition and the subtrees it induces on T.)
Defining trees. A collection of partitions P defines a phylogenetic tree T if P is convex on T, and T is the only phylogenetic tree with this property. (Figure: a collection of three partitions that defines T.)
22 But another collection P of three partitions can be convex on two different trees T and T' (figure), in which case it defines neither. Question: how many partitions suffice to define any given phylogenetic tree?
23 Two partitions suffice for caterpillars. Three partitions do not always suffice...
24 ...but at most five do! Theorem (Semple and Steel, 2002): every phylogenetic tree can be defined by at most five partitions. Four do suffice! Theorem (Huber, Moulton, Steel, 2003): any phylogenetic tree can be defined by at most four partitions.
25 Quartets (4-trees). A topology corresponds to its set of quartets: {12|34, 12|35, 12|45, 13|45, …}. Tree building and comparison.
26 Quartets (4-trees). A complete quartet set: for every quadruple {i, j, k, l} we have one resolved 4-tree, e.g. ij|kl. A binary topology defines a complete quartet set. It is easy to check whether a complete quartet set is tree compatible, in which case it defines a unique tree. A tree is equal to its quartet set.
Quartets (4-trees). It's easy to infer 4-trees for all quadruples (e.g. by ML). But: 4-trees are not reliable; it is computationally hard to check that an incomplete quartet set is tree compatible; it is computationally hard to select the maximum number of compatible 4-trees. Heuristics needed!
27 Additive distances. Tree with branch lengths. Tree building (and comparison). (Figure: worked example with i = 1, j = 2, k = 3, l = 4.)
Additive distances. A tree with lengths defines an additive distance. A distance $d$ is additive iff it satisfies the local 4-point condition: for every quadruple i, j, k, l, the two largest of $d_{ij} + d_{kl}$, $d_{ik} + d_{jl}$, $d_{il} + d_{jk}$ are equal.
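The 4-point condition translates directly into a check over all quadruples. The sketch below is a minimal illustration: the dictionary-of-dictionaries distance format, the tolerance `tol` and the helper `make_distance` are assumptions, and the example distances come from a made-up tree ((1,2),(3,4)) with all branch lengths equal to 1.

```python
from itertools import combinations

def make_distance(pairs):
    """Build a symmetric distance lookup d[i][j] from a {(i, j): value} dictionary."""
    d = {}
    for (i, j), v in pairs.items():
        d.setdefault(i, {})[j] = v
        d.setdefault(j, {})[i] = v
    return d

def is_additive(d, taxa, tol=1e-9):
    """4-point condition: for every quadruple, the two largest sums are equal."""
    for i, j, k, l in combinations(taxa, 4):
        sums = sorted([d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True

# Path-length distances of the tree ((1,2),(3,4)) with unit branch lengths.
tree_like = make_distance({(1, 2): 2, (3, 4): 2, (1, 3): 3, (1, 4): 3, (2, 3): 3, (2, 4): 3})
# Same distances with one entry perturbed, as happens with estimated data.
noisy = make_distance({(1, 2): 2, (3, 4): 2, (1, 3): 4, (1, 4): 3, (2, 3): 3, (2, 4): 3})

print(is_additive(tree_like, [1, 2, 3, 4]))  # True
print(is_additive(noisy, [1, 2, 3, 4]))      # False
```

With estimated distances an exact test like this almost always fails, which is the motivation for the optimization-based approaches mentioned on the following slide.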
28 Additive distances. A tree with lengths defines an additive distance. A distance is additive iff it satisfies the local 4-point condition, which is easily checked. An additive distance defines a unique tree, which is easily built. A tree is equal to its path length distance.
Additive distances. Estimating evolutionary distances between all taxon pairs is easy (ML). But these distances are never 100% additive. This induces hard optimization problems. Numerous approaches and heuristics.
29 Summary of definitions. Numerous definitions of trees (graphs, parentheses, bipartitions, characters, quartets, distances), used to represent, compare and infer trees. These definitions involve easy (polynomial) algorithms to recognize trees and to change representation, but hard problems to infer trees from data.
30 Numbers. Number of edges in a binary tree with n taxa: 2 tax: 1, 3 tax: 3, 4 tax: 5, 5 tax: 7; n tax: $e(n) = e(n-1) + 2 = 2n - 3$.
Numbers. Number of unrooted binary trees with n taxa: 2 tax: 1, 3 tax: 1, 4 tax: 3, 5 tax: 15, …, with the count eventually comparable to the number of atoms in the universe; n tax: $t(n) = t(n-1) \times e(n-1) = (2n-5)(2n-7)\cdots 3 \cdot 1$. Hence hard optimization problems!
31 Numbers. Number of rooted binary trees with n taxa: 2 tax: 1, 3 tax: 3, 4 tax: 15, 5 tax: 105; n tax: $r(n) = t(n) \times e(n) = t(n+1) = (2n-3)(2n-5)\cdots 3 \cdot 1$ (a root can be placed on any of the $e(n)$ edges of an unrooted tree).
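The recurrences for the number of edges, unrooted trees and rooted trees are easy to tabulate. The following sketch uses hypothetical function names (`edges`, `unrooted`, `rooted`) and assumes n ≥ 2.

```python
def edges(n):
    """Number of edges of an unrooted binary tree with n taxa: e(n) = 2n - 3."""
    return 2 * n - 3

def unrooted(n):
    """t(n) = t(n-1) * e(n-1), with t(2) = 1: a new taxon can be attached to any edge."""
    t = 1
    for m in range(3, n + 1):
        t *= edges(m - 1)
    return t

def rooted(n):
    """r(n) = t(n) * e(n): the root can sit on any edge of an unrooted tree."""
    return unrooted(n) * edges(n)

for n in range(2, 6):
    print(n, edges(n), unrooted(n), rooted(n))
# 2 1 1 1
# 3 3 1 3
# 4 5 3 15
# 5 7 15 105
```

Python integers have arbitrary precision, so the same functions can be used to see how quickly the counts grow for a few dozen taxa.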
32 Topological distance. Measures the distance between two topologies with the same taxon set. Uses: to analyze alternative trees (e.g. with parsimony); to compare reconstruction methods with simulated data; to infer horizontal gene transfers.
Robinson & Foulds topological distance: number of moves to transform one tree into the other, where moves are edge contractions and unresolved node expansions. (Figure: Tree1.)
33 Robinson & Foulds topological distance, continued. (Figures: Tree2; a contraction move.)
34 Robinson & Foulds topological distance, continued. (Figure: an expansion move.) R&F(Tree1, Tree2) = 2.
Bipartition distance: number of bipartitions in one tree but not the other. Bipartition and R&F distances are equal (easy calculation). Tree1: {12|345, 123|45}; Tree2: {12|345, 124|35}.
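Since the R&F distance equals the bipartition distance, it can be computed as the size of a symmetric difference of bipartition sets. The sketch below assumes trees are written as nested tuples and stores each non-trivial bipartition as the side that does not contain the smallest taxon; both conventions, and the two five-taxon example trees, are illustrative assumptions chosen to agree with the quartet lists given in the talk.

```python
def leaves(tree):
    """Set of taxa below a node of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions, each stored as the side without the smallest taxon."""
    all_leaves = leaves(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return
        side = leaves(node)
        if ref in side:                       # canonical side: the one without `ref`
            side = all_leaves - side
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)                   # skip trivial and empty bipartitions
        for child in node:
            visit(child)

    visit(tree)
    return found

def rf_distance(t1, t2):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))

tree1 = ((1, 2), 3, (4, 5))
tree2 = ((1, 2), 4, (3, 5))
print(rf_distance(tree1, tree2))  # 2
```

This version recomputes leaf sets at every node and is therefore quadratic in the number of taxa; faster algorithms exist but are not needed for small examples.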
35 Quartet distance: number of 4-trees in one tree but not the other. Easy to compute, more refined than the R&F distance. Tree1: {12|34, 12|35, 12|45, 13|45, 23|45}; Tree2: {12|34, 12|35, 12|45, 14|35, 24|35}; QD = 4.
Horizontal gene transfers and SPR distance. (Figure: species tree.)
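The quartet sets listed for Tree1 and Tree2 can be derived from bipartitions: a tree displays ab|cd when one of its bipartitions puts a, b on one side and c, d on the other. The sketch below assumes each tree is given by one side of each of its non-trivial bipartitions; the function name `quartets` and this input encoding are illustrative, and the brute-force loop over all quadruples is only meant for small examples.

```python
from itertools import combinations

def quartets(taxa, splits):
    """Resolved 4-trees ab|cd induced by a set of compatible bipartitions."""
    result = set()
    for four in combinations(sorted(taxa), 4):
        four = frozenset(four)
        for side in splits:
            inside = four & frozenset(side)
            if len(inside) == 2:              # two taxa on each side of the split
                result.add(frozenset([inside, four - inside]))
                break
    return result

taxa = [1, 2, 3, 4, 5]
tree1 = [{1, 2}, {4, 5}]   # bipartitions 12|345 and 123|45
tree2 = [{1, 2}, {3, 5}]   # bipartitions 12|345 and 124|35

q1, q2 = quartets(taxa, tree1), quartets(taxa, tree2)
print(len(q1 ^ q2))  # 4
```

Enumerating all quadruples costs on the order of n⁴ operations, so dedicated algorithms are used in practice for large trees.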
36 Horizontal gene transfers and SPR distance. (Figure: gene transfer.)
Horizontal gene transfers and SPR distance. (Figure: gene tree.) A subtree is pruned and regrafted: 1 HGT = 1 SPR.
37 Horizontal gene transfers and SPR distance. SPR distance: minimum number of SPR moves required to transform one tree into the other. Biologically relevant: number of HGTs. Very hard to compute!
Exercise: compute the RF and SPR distances between the two trees (figure).
38 Consensus. We aim at estimating the consensus of a family of trees with the same taxon set. Most consensus problems are hard (think of elections). But it's easy to define and compute the majority rule consensus tree.
39 Majority rule consensus tree. Given n trees with the same taxon set: every tree t defines a bipartition set B_t = {b}; collect B = {b seen in more than n/2 sets B_t}; any pair b, b' in B is seen in at least one common set B_t, since two groups of more than n/2 trees each must share a tree; therefore b and b' are tree compatible, and B defines a unique tree!
Exercise: compute the majority consensus tree between the given trees (figure).
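The collection step of the majority rule can be sketched as a simple count. In this illustration each tree is represented by the set of its non-trivial bipartitions, and each bipartition is encoded canonically by its smaller side; that encoding, the helper `fs` and the three example trees are assumptions, not data from the talk.

```python
from collections import Counter

def majority_rule(split_sets):
    """Bipartitions seen in more than half of the input trees."""
    n = len(split_sets)
    counts = Counter(b for splits in split_sets for b in set(splits))
    return {b for b, c in counts.items() if c > n / 2}

def fs(*taxa):
    """Shorthand for one side of a bipartition."""
    return frozenset(taxa)

# Three trees on taxa 1..5, each given by its non-trivial bipartitions.
trees = [
    {fs(1, 2), fs(4, 5)},
    {fs(1, 2), fs(3, 5)},
    {fs(1, 2), fs(4, 5)},
]
consensus = majority_rule(trees)
print(sorted(sorted(b) for b in consensus))  # [[1, 2], [4, 5]]
```

The result is only a bipartition set; building the consensus tree from it relies on the pairwise compatibility guaranteed by the majority argument.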
More informationParallelizing SuperFine
Parallelizing SuperFine Diogo Telmo Neves ESTGF - IPP and Universidade do Minho Portugal dtn@ices.utexas.edu Tandy Warnow Dept. of Computer Science The Univ. of Texas at Austin Austin, TX 78712 tandy@cs.utexas.edu
More informationAlgorithms for MDC-Based Multi-locus Phylogeny Inference
Algorithms for MDC-Based Multi-locus Phylogeny Inference Yun Yu 1, Tandy Warnow 2, and Luay Nakhleh 1 1 Dept. of Computer Science, Rice University, 61 Main Street, Houston, TX 775, USA {yy9,nakhleh}@cs.rice.edu
More informationSequence length requirements. Tandy Warnow Department of Computer Science The University of Texas at Austin
Sequence length requirements Tandy Warnow Department of Computer Science The University of Texas at Austin Part 1: Absolute Fast Convergence DNA Sequence Evolution AAGGCCT AAGACTT TGGACTT -3 mil yrs -2
More informationParsimonious Reconciliation of Non-binary Trees. Louxin Zhang National University of Singapore
Parsimonious Reconciliation of Non-binary Trees Louxin Zhang National University of Singapore matzlx@nus.edu.sg Gene Tree vs the (Containing) Species Tree. A species tree S represents the evolutionary
More informationFrom Trees to Networks and Back
From Trees to Networks and Back Sarah Bastkowski Supervisor: Prof. Vincent Moulton Co-supervisor: Dr. Geoffrey Mckeown A thesis submitted for the degree of Doctor of Philosophy at the University of East
More informationProtein phylogenetics
Protein phylogenetics Robert Hirt PAUP4.0* can be used for an impressive range of analytical methods involving DNA alignments. This, unfortunately is not the case for estimating protein phylogenies. Only
More informationThe worst case complexity of Maximum Parsimony
he worst case complexity of Maximum Parsimony mir armel Noa Musa-Lempel Dekel sur Michal Ziv-Ukelson Ben-urion University June 2, 20 / 2 What s a phylogeny Phylogenies: raph-like structures whose topology
More informationhuman chimp mouse rat
Michael rudno These notes are based on earlier notes by Tomas abak Phylogenetic Trees Phylogenetic Trees demonstrate the amoun of evolution, and order of divergence for several genomes. Phylogenetic trees
More informationPrior Distributions on Phylogenetic Trees
Prior Distributions on Phylogenetic Trees Magnus Johansson Masteruppsats i matematisk statistik Master Thesis in Mathematical Statistics Masteruppsats 2011:4 Matematisk statistik Juni 2011 www.math.su.se
More informationFast and accurate branch lengths estimation for phylogenomic trees
Binet et al. BMC Bioinformatics (2016) 17:23 DOI 10.1186/s12859-015-0821-8 RESEARCH ARTICLE Open Access Fast and accurate branch lengths estimation for phylogenomic trees Manuel Binet 1,2,3, Olivier Gascuel
More informationTerminology. A phylogeny is the evolutionary history of an organism
Phylogeny Terminology A phylogeny is the evolutionary history of an organism A taxon (plural: taxa) is a group of (one or more) organisms, which a taxonomist adjudges to be a unit. A definition? from Wikipedia
More informationApproximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 8, 2006 Mary Ann Liebert, Inc. Pp. 1419 1434 Approximating Subtree Distances Between Phylogenies AU1 AU2 MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3
More informationEukaryotic Gene Finding: The GENSCAN System
Eukaryotic Gene Finding: The GENSCAN System BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC
More informationUnderstanding Spaces of Phylogenetic Trees
Understanding Spaces of Phylogenetic Trees Williams College SMALL REU 2012 September 25, 2012 Trees Which Tell an Evolutionary Story The Tree of Life Problem Given data (e.g. nucleotide sequences) on n
More informationAnalyzing Evolutionary Trees
Analyzing Evolutionary Trees Katherine St. John Lehman College and the Graduate Center City University of New York stjohn@lehman.cuny.edu Katherine St. John City University of New York 1 Overview Talk
More informationMarkovian Models of Genetic Inheritance
Markovian Models of Genetic Inheritance Elchanan Mossel, U.C. Berkeley mossel@stat.berkeley.edu, http://www.cs.berkeley.edu/~mossel/ 6/18/12 1 General plan Define a number of Markovian Inheritance Models
More informationCISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment
CISC 889 Bioinformatics (Spring 2003) Multiple Sequence Alignment Courtesy of jalview 1 Motivations Collective statistic Protein families Identification and representation of conserved sequence features
More informationA practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees
A practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees Andreas Sand 1,2, Gerth Stølting Brodal 2,3, Rolf Fagerberg 4, Christian N. S. Pedersen 1,2 and Thomas Mailund
More informationOperads and the Tree of Life John Baez and Nina Otter
Operads and the Tree of Life John Baez and Nina Otter We have entered a new geological epoch, the Anthropocene, in which the biosphere is rapidly changing due to human activities. Last week two teams
More informationPhylogenetic Networks
Reconstructing evolution Algorithms for Combining Phylogenetic Trees into a Network Phylogenetic Tree 82 Mya 76 Mya 68 Mya 35 Mya Kiwi (New Zealand) Cassowary (New Guinea + Australia) Emu (Australia) Ostrich
More informationEfficient Quartet Representations of Trees and Applications to Supertree and Summary Methods
1 Efficient Quartet Representations of Trees and Applications to Supertree and Summary Methods Ruth Davidson, MaLyn Lawhorn, Joseph Rusinko*, and Noah Weber arxiv:1512.05302v3 [q-bio.pe] 6 Dec 2016 Abstract
More information