Supplementary Material, corresponding to the manuscript Accumulated Coalescence Rank and Excess Gene count for Species Tree Inference


Sourya Bhattacharyya and Jayanta Mukherjee
Department of Computer Science and Engineering
Indian Institute of Technology Kharagpur
Kharagpur, West Bengal 721302, India
January

Proof of lemmas

Lemma 1. $\min_{x,y} R_G(x, y) = 2$.

Proof: By definition, the minimum is attained when $x$ and $y$ are siblings placed at the lowest level of a caterpillar, where $R_G(x, y) = 2$.

Lemma 2. For a perfect caterpillar $G$ covering $N$ taxa, $\max_{x,y} R_G(x, y) = \sum_{i=2}^{N} i = \frac{N(N+1)}{2} - 1$.

Proof: As no rank value accumulated in $R_G(x, y)$ exceeds $N$ for any couplet $(x, y)$, the maximum of $R_G(x, y)$ occurs when it includes all the rank values from 2 to $N$. This corresponds to the case where $x$ (or $y$) lies at the lowest level of the caterpillar and $\mathrm{LCA}_G(x, y)$ is the root of the caterpillar. The example couplets (A, E) and (A, F) in the tree of the referenced figure exhibit maximum accumulated rank values.

Lemma 3. For non-caterpillar trees covering $N$ taxa, $\max_{x,y} R_G(x, y) = 2\left\{\sum_{i=N+1-\lg N}^{N-1} i\right\} + N$.

Proof: The maximum $R_G(x, y)$ includes $\lambda_G(r)$ of the root $r$ in $G$. Further, it should include the coalescence ranks of the internal nodes lying at the lowest levels on either side of the root. For a height-balanced complete binary gene tree, the coalescence rank of each of the lowest-level internal nodes is $N - (\lg N - 1) = N + 1 - \lg N$. So, the maximum $R_G(x, y)$ will include the values from $(N + 1 - \lg N)$ to $(N - 1)$ twice, and also include $N$, thus equaling $2\left\{\sum_{i=N+1-\lg N}^{N-1} i\right\} + N$.

Lemma 2 and Lemma 3 show that the maximum value of $R$ is $O(N^2)$ (given $|L(G)| = N$). Next, we find the number of distinct $R_G(x, y)$ values over all couplets $(x, y)$ of a gene tree $G$, to derive the exact correspondence between the number of couplets and $R$.

Lemma 4. For a gene tree $G$ with $N$ taxa, the maximum number of distinct $R_G(x, y)$ values is $\frac{N^2 - 3N + 4}{2}$.

Proof: For any three taxa $x$, $y$, $z$ in $G$, $R_G(x, y) = R_G(x, z)$ means that $y$ and $z$ are siblings. So, to achieve the maximum number of distinct $R_G(x, y)$ values, we need to minimize the number of sibling taxa. Such minimization happens for a caterpillar, which has just one pair of sibling taxa. Here, $(N - 2)$ is the number of couplets having duplicate $R$ entries, and the number of distinct $R_G(x, y)$ values is $\frac{N(N-1)}{2} - (N - 2) = \frac{N^2 - 3N + 4}{2}$.

Lemma 5. For a binary gene tree $G$ with $N$ taxa, the minimum number of distinct $R_G(x, y)$ values is $\frac{N^2 + 4N}{32}$.

Proof: Lemma 4 shows that, to achieve the minimum number of distinct $R_G(x, y)$ values, we need to maximize the number of sibling taxa pairs. Given $|L(G)| = N$, the maximum number of sibling couplets is $N/2$. For minimum distinct rank values, $G$ needs to be symmetrical, indicating $N/4$ couplets on either side of the root.

Case 1: Couplets $C_1$ and $C_2$ are on the same side of the root. There will be $\frac{N}{4}$ distinct rank values.

Case 2: Couplets $C_1$ and $C_2$ are on either side of the root. Each such pair $(C_1, C_2)$ can be chosen in $\binom{N/4}{2}$ ways, each of which has a distinct rank value.

Considering both cases, the minimum number of distinct $R_G(x, y)$ values is $\binom{N/4}{2} + \frac{N}{4} = \frac{N^2 + 4N}{32}$.
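The closed forms stated in Lemmas 2, 4 and 5 can be checked numerically. The following is a minimal sketch that only verifies the arithmetic of those expressions; it does not build gene trees or compute $R_G(x, y)$ itself. The function names are hypothetical and chosen for illustration, and $N$ is assumed to be divisible by 4 for the Lemma 5 count.

```python
from fractions import Fraction
from math import comb


def max_rank_caterpillar(n):
    """Lemma 2: sum of all rank values from 2 to N."""
    return sum(range(2, n + 1))


def max_distinct(n):
    """Lemma 4: all N(N-1)/2 couplets minus the (N - 2) duplicates."""
    return comb(n, 2) - (n - 2)


def min_distinct(n):
    """Lemma 5: C(N/4, 2) + N/4, assuming N is divisible by 4."""
    q = n // 4
    return comb(q, 2) + q


# Compare each count with the closed form given in the lemma statement.
for n in (4, 8, 16, 32, 64):
    assert max_rank_caterpillar(n) == Fraction(n * (n + 1), 2) - 1
    assert max_distinct(n) == Fraction(n * n - 3 * n + 4, 2)
    assert min_distinct(n) == Fraction(n * n + 4 * n, 32)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison does not depend on floating-point rounding.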

Figure 1: Topology and bootstrap clade supports for the Angiosperm dataset, for NJst and STAR. The red branch corresponds to the topology from STAR. 100% bootstrap support values are indicated by the '*' symbol. The topology of a group of taxa with 100% bootstrap support is represented by the group name, shown in colored and underlined labels.

Figure 2: Topology and bootstrap clade supports for the Angiosperm dataset, for ASTRAL and mulRF. The red branch corresponds to the topology from mulRF. 100% bootstrap support values are indicated by the '*' symbol. The topology of a group of taxa with 100% bootstrap support is represented by the group name, shown in colored and underlined labels.

RAxML command used for Bootstrapping

RAxML [6] was used to generate the bootstrap replicates for the biological datasets. We followed [4] for this bootstrapping.

Command for the bestml input tree:
    raxmlhpc-sse3 -m GTRGAMMA -s alignmentfile -n bestml -N 0 -p pseed

Command for bootstrapping:
    raxmlhpc-sse3 -m GTRGAMMA -s alignmentfile -n BSreplica -N 00 -p pseed -b bseed
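The two invocations above can be assembled programmatically when many alignments have to be processed. The sketch below only builds the argument lists using the flags that appear in the commands (`-m`, `-s`, `-n`, `-N`, `-p`, `-b`); it does not run RAxML. The helper names are hypothetical, and the run and replicate counts are left as arguments: the values in the usage example are placeholders, not the settings used in this study.

```python
import shlex


def best_ml_command(alignment, seed, runs, binary="raxmlhpc-sse3"):
    """Argument list for the best-ML tree search (run name 'bestml')."""
    return [binary, "-m", "GTRGAMMA", "-s", alignment,
            "-n", "bestml", "-N", str(runs), "-p", str(seed)]


def bootstrap_command(alignment, seed, bseed, replicates,
                      binary="raxmlhpc-sse3"):
    """Argument list for bootstrapping (run name 'BSreplica')."""
    return [binary, "-m", "GTRGAMMA", "-s", alignment,
            "-n", "BSreplica", "-N", str(replicates),
            "-p", str(seed), "-b", str(bseed)]


if __name__ == "__main__":
    # Placeholder file name, seeds, and counts, for illustration only.
    print(shlex.join(best_ml_command("gene1.phy", seed=12345, runs=10)))
    print(shlex.join(bootstrap_command("gene1.phy", seed=12345,
                                       bseed=54321, replicates=100)))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed directly to `subprocess.run` without shell quoting issues.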

(a) Species tree derived by Phylonet [7]
(b) Species tree derived by iGTP [1]
Figure 3: Species trees obtained by Phylonet and iGTP for the Amniota [3, 2] dataset.

Figure 4: Detailed view of the topology and bootstrap clade supports for the Mammalian dataset, for AcRNJXL and the reference approaches. Species tree topologies and the corresponding bootstrap support values (less than 100) are shown in the order ASTRAL/mulRF/STAR/AcRNJXL. NJst performs identically to STAR.

Figure 5: Species trees obtained by Phylonet [7] for the Mammalian [3, 5] dataset.

Figure 6: Species trees obtained by iGTP [1] for the Mammalian [3, 5] dataset.

Figure 7: Species trees obtained by iGTP [1] for the Angiosperm dataset [8]. Bootstrap clade supports are also shown.

References

[1] R. Chaudhary, M. S. Bansal, A. Wehe, D. Fernández-Baca, and O. Eulenstein. iGTP: a software package for large-scale gene tree parsimony analysis. BMC Bioinformatics, 11(574):1-7, 2010.
[2] Y. Chiari, V. Cahais, N. Galtier, and F. Delsuc. Phylogenomic analyses support the position of turtles as the sister group of birds and crocodiles (Archosauria). BMC Biology, 10(65):1-14, 2012.
[3] S. Mirarab, R. Reaz, M. S. Bayzid, T. Zimmermann, M. S. Swenson, and T. Warnow. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics, 30(17):i541-i548, 2014.
[4] S. Mirarab and T. Warnow. ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes. Bioinformatics, 31(12):i44-i52, 2015.
[5] S. Song, L. Liu, S. V. Edwards, and S. Wu. Resolving conflict in eutherian mammal phylogeny using phylogenomics and the multispecies coalescent model. Proc Natl Acad Sci USA, 109(37), 2012.
[6] A. Stamatakis. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics, 22(21):2688-2690, 2006.
[7] C. Than and L. Nakhleh. Species tree inference by minimizing deep coalescences. PLOS Computational Biology, 5(9):1-12, 2009.
[8] Z. Xi, L. Liu, J. S. Rest, and C. C. Davis. Coalescent versus concatenation methods and the placement of Amborella as sister to water lilies. Syst. Biol., 63(6):919-932.


More information

V Advanced Data Structures

V Advanced Data Structures V Advanced Data Structures B-Trees Fibonacci Heaps 18 B-Trees B-trees are similar to RBTs, but they are better at minimizing disk I/O operations Many database systems use B-trees, or variants of them,

More information

Trinets encode tree-child and level-2 phylogenetic networks

Trinets encode tree-child and level-2 phylogenetic networks Noname manuscript No. (will be inserted by the editor) Trinets encode tree-child and level-2 phylogenetic networks Leo van Iersel Vincent Moulton the date of receipt and acceptance should be inserted later

More information

The History Bound and ILP

The History Bound and ILP The History Bound and ILP Julia Matsieva and Dan Gusfield UC Davis March 15, 2017 Bad News for Tree Huggers More Bad News Far more convincingly even than the (also highly convincing) fossil evidence, the

More information

Locus aware decomposition of gene trees with respect to polytomous species trees

Locus aware decomposition of gene trees with respect to polytomous species trees https://doi.org/10.1186/s13015-018-0128-1 Algorithms for Molecular Biology RESEARCH Open Access Locus aware decomposition of gene trees with respect to polytomous species trees Michał Aleksander Ciach

More information

Codon models. In reality we use codon model Amino acid substitution rates meet nucleotide models Codon(nucleotide triplet)

Codon models. In reality we use codon model Amino acid substitution rates meet nucleotide models Codon(nucleotide triplet) Phylogeny Codon models Last lecture: poor man s way of calculating dn/ds (Ka/Ks) Tabulate synonymous/non- synonymous substitutions Normalize by the possibilities Transform to genetic distance K JC or K

More information

MASTtreedist: Visualization of Tree Space based on Maximum Agreement Subtree

MASTtreedist: Visualization of Tree Space based on Maximum Agreement Subtree MASTtreedist: Visualization of Tree Space based on Maximum Agreement Subtree Hong Huang *1 and Yongji Li 2 1 School of Information, University of South Florida, Tampa, FL, 33620 2 Department of Computer

More information

Site class Proportion Clade 1 Clade 2 0 p 0 0 < ω 0 < 1 0 < ω 0 < 1 1 p 1 ω 1 = 1 ω 1 = 1 2 p 2 = 1- p 0 + p 1 ω 2 ω 3

Site class Proportion Clade 1 Clade 2 0 p 0 0 < ω 0 < 1 0 < ω 0 < 1 1 p 1 ω 1 = 1 ω 1 = 1 2 p 2 = 1- p 0 + p 1 ω 2 ω 3 Notes for codon-based Clade models by Joseph Bielawski and Ziheng Yang Last modified: September 2005 1. Contents of folder: The folder contains a control file, data file and tree file for two example datasets

More information

V Advanced Data Structures

V Advanced Data Structures V Advanced Data Structures B-Trees Fibonacci Heaps 18 B-Trees B-trees are similar to RBTs, but they are better at minimizing disk I/O operations Many database systems use B-trees, or variants of them,

More information

A RANDOMIZED ALGORITHM FOR COMPARING SETS OF PHYLOGENETIC TREES

A RANDOMIZED ALGORITHM FOR COMPARING SETS OF PHYLOGENETIC TREES A RANDOMIZED ALGORITHM FOR COMPARING SETS OF PHYLOGENETIC TREES SEUNG-JIN SUL AND TIFFANI L. WILLIAMS Department of Computer Science Texas A&M University College Station, TX 77843-3112 USA E-mail: {sulsj,tlw}@cs.tamu.edu

More information

Efficient Generation of Evolutionary Trees

Efficient Generation of Evolutionary Trees fficient Generation of volutionary Trees MUHMM ULLH NN 1 M. SIUR RHMN 2 epartment of omputer Science and ngineering angladesh University of ngineering and Technology (UT) haka-1000, angladesh 1 adnan@cse.buet.ac.bd

More information

Species Trees with Relaxed Molecular Clocks Estimating per-species substitution rates using StarBEAST2

Species Trees with Relaxed Molecular Clocks Estimating per-species substitution rates using StarBEAST2 Species Trees with Relaxed Molecular Clocks Estimating per-species substitution rates using StarBEAST2 Joseph Heled, Remco Bouckaert, Walter Xie, Alexei J. Drummond and Huw A. Ogilvie 1 Background In this

More information

Enabling Phylogenetic Research via the CIPRES Science Gateway!

Enabling Phylogenetic Research via the CIPRES Science Gateway! Enabling Phylogenetic Research via the CIPRES Science Gateway Wayne Pfeiffer SDSC/UCSD August 5, 2013 In collaboration with Mark A. Miller, Terri Schwartz, & Bryan Lunt SDSC/UCSD Supported by NSF Phylogenetics

More information

What is a phylogenetic tree? Algorithms for Computational Biology. Phylogenetics Summary. Di erent types of phylogenetic trees

What is a phylogenetic tree? Algorithms for Computational Biology. Phylogenetics Summary. Di erent types of phylogenetic trees What is a phylogenetic tree? Algorithms for Computational Biology Zsuzsanna Lipták speciation events Masters in Molecular and Medical Biotechnology a.a. 25/6, fall term Phylogenetics Summary wolf cat lion

More information

Fixed Parameter Tractability of Binary Near-Perfect Phylogenetic Tree Reconstruction

Fixed Parameter Tractability of Binary Near-Perfect Phylogenetic Tree Reconstruction Fixed Parameter Tractability of Binary Near-Perfect Phylogenetic Tree Reconstruction Guy E. Blelloch, Kedar Dhamdhere, Eran Halperin, R. Ravi, Russell Schwartz and Srinath Sridhar Abstract. We consider

More information

Sequence length requirements. Tandy Warnow Department of Computer Science The University of Texas at Austin

Sequence length requirements. Tandy Warnow Department of Computer Science The University of Texas at Austin Sequence length requirements Tandy Warnow Department of Computer Science The University of Texas at Austin Part 1: Absolute Fast Convergence DNA Sequence Evolution AAGGCCT AAGACTT TGGACTT -3 mil yrs -2

More information

The Lattice BOINC Project Public Computing for the Tree of Life

The Lattice BOINC Project Public Computing for the Tree of Life The Lattice BOINC Project Public Computing for the Tree of Life Presented by Adam Bazinet Center for Bioinformatics and Computational Biology Institute for Advanced Computer Studies University of Maryland

More information

Parsimony Least squares Minimum evolution Balanced minimum evolution Maximum likelihood (later in the course)

Parsimony Least squares Minimum evolution Balanced minimum evolution Maximum likelihood (later in the course) Tree Searching We ve discussed how we rank trees Parsimony Least squares Minimum evolution alanced minimum evolution Maximum likelihood (later in the course) So we have ways of deciding what a good tree

More information

3 Competitive Dynamic BSTs (January 31 and February 2)

3 Competitive Dynamic BSTs (January 31 and February 2) 3 Competitive Dynamic BSTs (January 31 and February ) In their original paper on splay trees [3], Danny Sleator and Bob Tarjan conjectured that the cost of sequence of searches in a splay tree is within

More information

Selecting Genomes for Reconstruction of Ancestral Genomes

Selecting Genomes for Reconstruction of Ancestral Genomes Selecting Genomes for Reconstruction of Ancestral Genomes Guoliang Li 1,JianMa 2, and Louxin Zhang 3 1 Department of Computer Science National University of Singapore (NUS), Singapore 117543 ligl@comp.nus.edu.sg

More information

Applied Mathematics Letters. Graph triangulations and the compatibility of unrooted phylogenetic trees

Applied Mathematics Letters. Graph triangulations and the compatibility of unrooted phylogenetic trees Applied Mathematics Letters 24 (2011) 719 723 Contents lists available at ScienceDirect Applied Mathematics Letters journal homepage: www.elsevier.com/locate/aml Graph triangulations and the compatibility

More information

FINAL REPORT. Milestone/Deliverable Description: Final implementation and final report

FINAL REPORT. Milestone/Deliverable Description: Final implementation and final report FINAL REPORT PRAC Topic: Petascale simulations of complex biological behavior in fluctuating environments NSF Award ID: 0941360 Principal Investigator: Ilias Tagkopoulos, UC Davis Milestone/Deliverable

More information

Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis

Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis B. B. Zhou 1, D. Chu 1, M. Tarawneh 1, P. Wang 1, C. Wang 1, A. Y. Zomaya 1, and R. P. Brent 2 1 School of Information Technologies

More information

Approximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT

Approximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 8, 2006 Mary Ann Liebert, Inc. Pp. 1419 1434 Approximating Subtree Distances Between Phylogenies AU1 AU2 MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3

More information

From gene trees to species trees through a supertree approach

From gene trees to species trees through a supertree approach From gene trees to species trees through a supertree approach Celine Scornavacca 1,2,, Vincent Berry 2, and Vincent Ranwez 1 1 Institut des Sciences de l Evolution (ISEM, UMR 5554 CNRS), Université Montpellier

More information

Algorithms for Ultra-large Multiple Sequence Alignment and Phylogeny Estimation

Algorithms for Ultra-large Multiple Sequence Alignment and Phylogeny Estimation Algorithms for Ultra-large Multiple Sequence Alignment and Phylogeny Estimation Tandy Warnow Department of Computer Science The University of Texas at Austin Phylogeny (evolutionary tree) Orangutan Gorilla

More information