Seeing the wood for the trees: Analysing multiple alternative phylogenies
1 Seeing the wood for the trees: Analysing multiple alternative phylogenies Tom M. W. Nye, Newcastle University Isaac Newton Institute, 17 December 2007
2 Multiple alternative phylogenies
- Phylogenetic analysis often produces many possible trees
- Variability in data / uncertainty in inferred trees:
  - ML bootstrap trees
  - Bayesian posterior samples
- Different trees for different genes
- How can we summarize / represent this information?
- How can we compare different alternative tree topologies?
4 Representing collections of trees
- Existing approaches include...
  - Consensus trees
  - Consensus networks
- But also...
  - Multi-dimensional scaling (Hillis et al, Systematic Biology 2005)
  - Clustering (Stockham et al, Bioinformatics 2002)
6 Toy example
- How are these trees related?
  [Figure: several trees on the leaves A, B, C, D, E]
7 Toy example
- Relate trees by a "tree of trees":
  [Figure: the trees on the leaves A, B, C, D, E arranged as a tree of trees]
8 Why use a tree of trees?
- Tree-space does not have a tree-like structure
- Advantages:
  - Cluster similar trees together
  - Edges represent gain/loss of topological features
  - Conflicting histories show up as separate clades on the meta-tree:
    - different modes in a distribution
    - outliers
  - Convenient form of visualisation
9 Meta-trees
- Definition: given a fixed set of trees $T_1, T_2, \ldots, T_n$ all having leaf-set $L$, a meta-tree $\hat{T}$ is an unrooted tree with $n$ leaves such that every vertex $\hat{v}$ in $\hat{T}$ has associated to it a species tree $T_{\hat{v}}$ with leaf-set $L$, and the leaf vertices of $\hat{T}$ are associated to the trees $T_1, \ldots, T_n$
- Aim to find meta-trees with minimum score
- Tree score = sum of edge scores
- $\mathrm{score}(\hat{e}) = d(T_{\hat{v}_1}, T_{\hat{v}_2})$ for an edge $\hat{e}$ between vertices $\hat{v}_1, \hat{v}_2$
- Different metrics $d(\cdot, \cdot)$ are available
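As a rough illustration of the scoring rule, the sketch below sums a distance over the edges of a small meta-tree. The function names, the vertex labels, the example trees, and the use of symmetric-difference size as the metric are assumptions made for this example and do not come from the talk.

```python
def symmetric_difference_size(tree_a, tree_b):
    """One possible metric d: number of splits found in exactly one tree."""
    return len(tree_a ^ tree_b)


def meta_tree_score(edges, tree_at, d):
    """Sum the edge scores d(T_v1, T_v2) over all edges of the meta-tree."""
    return sum(d(tree_at[v1], tree_at[v2]) for v1, v2 in edges)


# Hypothetical example: three input trees on leaves A..E, each written as
# its set of non-trivial splits, all joined to one internal vertex "v".
tree_at = {
    "T1": frozenset({"AB|CDE", "ABC|DE"}),
    "T2": frozenset({"AC|BDE", "ABC|DE"}),
    "T3": frozenset({"AB|CDE", "ABD|CE"}),
    "v": frozenset({"AB|CDE", "ABC|DE"}),
}
edges = [("v", "T1"), ("v", "T2"), ("v", "T3")]

score = meta_tree_score(edges, tree_at, symmetric_difference_size)
print(score)
```

Passing the metric as an argument keeps the scoring code independent of which distance between trees is chosen.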
10 Splits
- A split is a bi-partition of the leaves $L$ induced by cutting a branch, e.g. $AB \mid CDE$
  [Figure: tree on the leaves A, B, C, D, E with one branch cut]
- Trees consist of sets of compatible splits, e.g. $AB \mid CDE$ and $AC \mid BDE$ cannot both be in a tree
- The majority consensus of $T_1, \ldots, T_n$ is the tree consisting of all splits in strictly greater than $n/2$ trees
- The Robinson-Foulds metric is defined by:
  $d(T_a, T_b) = (\text{number of splits in } T_a \setminus T_b) + (\text{number of splits in } T_b \setminus T_a)$
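The two definitions above can be sketched on trees stored as sets of splits. This is a minimal illustration, assuming that each tree is given only by its non-trivial splits and that a split is stored as an unordered pair of leaf sets; the helper names `split`, `rf_distance` and `majority_consensus` and the three example trees are hypothetical.

```python
from collections import Counter

LEAVES = frozenset("ABCDE")


def split(side, leaves=LEAVES):
    """Canonical form of a split: the unordered pair {side, complement}."""
    side = frozenset(side)
    return frozenset({side, leaves - side})


def rf_distance(tree_a, tree_b):
    """Robinson-Foulds distance: splits in one tree but not in the other."""
    return len(tree_a - tree_b) + len(tree_b - tree_a)


def majority_consensus(trees):
    """Keep the splits present in strictly more than half of the trees."""
    counts = Counter(p for tree in trees for p in tree)
    return frozenset(p for p, c in counts.items() if 2 * c > len(trees))


# Hypothetical trees on five leaves, non-trivial splits only.
t1 = frozenset({split("AB"), split("DE")})
t2 = frozenset({split("AC"), split("DE")})
t3 = frozenset({split("AB"), split("CE")})

print(rf_distance(t1, t2))
print(majority_consensus([t1, t2, t3]) == t1)
```

Storing a split as a pair of sets means that `split("AB")` and `split("CDE")` compare equal, so the same branch is never counted twice.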
11 Analogy with parsimony for DNA trees
- Suppose we use the Robinson-Foulds metric
- Represent tree topologies by strings of 0s and 1s for presence / absence of splits
- Could we apply DNA parsimony algorithms to these strings to build an optimal meta-tree?
- No, because strings must always represent trees
- General problem: the set of characters $\{A, C, G, T\}$ is replaced with the set of trees with leaf-set $L$
- Meta-tree construction is equivalent to the Steiner tree problem
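The obstacle named above can be made concrete: not every 0/1 string over a list of splits decodes to a tree. The sketch below is an illustration under assumptions, with a hypothetical column order `COLUMNS` and hypothetical helper names. It relies on the standard fact that a set of splits forms a tree exactly when the splits are pairwise compatible.

```python
from itertools import combinations

LEAVES = frozenset("ABCDE")


def split(side, leaves=LEAVES):
    """Canonical form of a split: the unordered pair {side, complement}."""
    side = frozenset(side)
    return frozenset({side, leaves - side})


def compatible(p, q):
    """Two splits are compatible if some side of p misses some side of q."""
    return any(not (a & b) for a in p for b in q)


# Hypothetical fixed column order for the 0/1 encoding.
COLUMNS = [split("AB"), split("AC"), split("DE")]


def encode(tree):
    """Presence / absence string of a tree over COLUMNS."""
    return "".join("1" if p in tree else "0" for p in COLUMNS)


def represents_tree(bits):
    """True if the splits switched on in `bits` are pairwise compatible."""
    present = [p for p, bit in zip(COLUMNS, bits) if bit == "1"]
    return all(compatible(p, q) for p, q in combinations(present, 2))


t1 = {split("AB"), split("DE")}
print(encode(t1))
print(represents_tree("101"), represents_tree("111"))
```

The string `111` switches on both AB|CDE and AC|BDE, so a method that fills in each character independently could propose it even though no tree contains both splits.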
16 Majority consensus and optimality
- Consider a meta-tree with the star topology, central node $T_0$:
  [Figure: star with centre $T_0$ and leaves $T_1, T_2, T_3, \ldots, T_{n-1}, T_n$]
- Meta-tree score $= \sum_{\text{splits } p} (\text{number of edges with } p \text{ at one end but not at the other end})$
- Score minimised by majority consensus: majority consensus is a median tree
- NB: optimisation performed for each split independently
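The claim that the majority consensus minimises the star score can be checked by brute force on a small case. The sketch below is only a sanity check on hypothetical data: it tries every subset of the observed splits as the central node, including subsets that are not trees, and compares the best score with the score of the majority consensus.

```python
from collections import Counter
from itertools import combinations


def star_score(centre, trees):
    """Sum over leaves of the number of splits that differ from the centre."""
    return sum(len(centre ^ tree) for tree in trees)


def majority_consensus(trees):
    """Keep the splits present in strictly more than half of the trees."""
    counts = Counter(p for tree in trees for p in tree)
    return frozenset(p for p, c in counts.items() if 2 * c > len(trees))


# Hypothetical collection of five trees on leaves A..E.
trees = (
    [frozenset({"AB|CDE", "ABC|DE"})] * 3
    + [frozenset({"AC|BDE", "ABC|DE"})]
    + [frozenset({"AB|CDE", "ABD|CE"})]
)

all_splits = sorted(set().union(*trees))
candidates = [
    frozenset(subset)
    for size in range(len(all_splits) + 1)
    for subset in combinations(all_splits, size)
]

best = min(star_score(c, trees) for c in candidates)
consensus = majority_consensus(trees)
print(star_score(consensus, trees), best)
```

Because the search also covers split sets that are not trees, a consensus that attains the unrestricted minimum is in particular optimal among trees.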
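19 A local optimality condition
- Consider internal vertex $\hat{v}$ on an optimal meta-tree: when does $T_{\hat{v}}$ contain a split $p$?
  [Figure: two configurations, (a) and (b), of split $p$ among the neighbours of $\hat{v}$]
- (a) Score = 1 if $p \in T_{\hat{v}}$; Score = 2 if $p \notin T_{\hat{v}}$
- (b) Score = 2 if $p \in T_{\hat{v}}$; Score = 1 if $p \notin T_{\hat{v}}$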
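The two cases can be reproduced by counting, for a single split, the incident edges whose ends disagree. The sketch assumes that case (a) is a vertex with three neighbours of which two contain $p$, and case (b) one with a single neighbour containing $p$; this reading is inferred from the scores on the slide, and the function name is hypothetical.

```python
def split_cost(neighbour_has_p, vertex_has_p):
    """Number of incident edges with split p at exactly one end."""
    return sum(1 for has_p in neighbour_has_p if has_p != vertex_has_p)


# Assumed reading of the figure: three neighbours per internal vertex.
case_a = [True, True, False]   # p present in two neighbours
case_b = [True, False, False]  # p present in one neighbour

for name, case in (("a", case_a), ("b", case_b)):
    print(name, split_cost(case, True), split_cost(case, False))
```

In both cases the cheaper choice agrees with the majority of the neighbours, which is the local condition used later by the algorithm.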
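20 The Meta-NJ algorithm
[Figure: $\hat{T}_r$, a star with leaves $A, B, Z_1, Z_2, Z_3, \ldots, Z_k$; $\hat{T}^{AB}_r$, in which $A$ and $B$ attach to a new node $X$, which attaches to $Z$ with neighbours $Z_1, Z_2, \ldots, Z_k$; $\hat{T}_{r+1}$, a star with centre $Z$ and leaves $X, Z_1, Z_2, \ldots, Z_k$]
- Start with star phylogeny
- At $r$-th step pick two nodes $A, B$ to agglomerate
- Form new nodes $X$ and $Z$ that are the majority consensus of their neighbours: $X = \mathrm{maj}\{A, B, Z\}$ and $Z = \mathrm{maj}\{X, Z_1, \ldots, Z_k\}$
- Calculate score for the resulting configuration $\hat{T}^{AB}_r$
- Try every pair $A, B$ and pick the pair with min score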
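The two consensus equations are coupled, since $X$ depends on $Z$ and $Z$ on $X$. Treating one split at a time, a brute-force check over the four presence / absence combinations shows which assignments satisfy both equations. This is an illustrative sketch, not the talk's implementation; `solve_split` and its arguments are hypothetical names.

```python
from itertools import product


def solve_split(in_a, in_b, in_z_neighbours):
    """All (x, z) with x = maj{a, b, z} and z = maj{x, z_1..z_k} for one split.

    Each argument says whether the split is present in A, in B, and in each
    of Z_1..Z_k. Majority means strictly more than half of the neighbours.
    """
    k = len(in_z_neighbours)
    count = sum(in_z_neighbours)
    solutions = []
    for x, z in product((False, True), repeat=2):
        x_majority = 2 * (in_a + in_b + z) > 3
        z_majority = 2 * (x + count) > k + 1
        if x == x_majority and z == z_majority:
            solutions.append((x, z))
    return solutions


# Split present in both A and B but in none of Z_1..Z_3: one solution.
print(solve_split(True, True, [False, False, False]))

# Split present in A only and in one of two Z_i: two solutions.
print(solve_split(True, False, [True, False]))
```

The second call returns two consistent assignments, a small instance of the equations not always having a unique solution; how such cases are resolved is not specified here.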
23 Features of the algorithm
- Simultaneous equations for $X$ and $Z$ can be solved (almost) uniquely: splits are considered independently
- Each vertex on the resulting meta-tree is the majority consensus of its neighbours
- Algorithm greedily constructs meta-trees with the local optimality condition
- Zero-length branches are sometimes produced, which leads to multifurcations
- Ties in score: pick one agglomeration at random
24 Yeast data set
- Rokas et al, Genome-scale approaches to resolving incongruence, Nature 2003:
  - Genomes from 8 species of yeast
  - ML trees constructed for 106 orthologs
  - 23 different topologies obtained
- Consensus network (Holland et al, Mol. Biol. and Evolution 2004):
  [Figure: consensus network]
25 Yeast data set results
[Figure: meta-tree of the 23 topologies, with leaf labels:]
- Topology 11, YDR484W
- Topology 15, repeated 5
- Topology 1, repeated 41
- Topology 6, repeated 2
- Topology 10, repeated 4
- Topology 7, repeated 9
- Topology 8, repeated 8
- Topology 18, YPL210C
- Topology 2, YDR531W
- Topology 20, YKL120W
- Topology 9, YGL225W
- Topology 5, repeated 2
- Topology 21, repeated 4
- Topology 3, repeated 6
- Topology 17, repeated 4
- Topology 12, repeated 2
- Topology 13, repeated 3
- Topology 14, YGL192W
- Topology 22, YJR068W
- Topology 16, repeated 4
- Topology 23, repeated 2
- Topology 19, YKL034W
- Topology 4, repeated 2
28 Yeast data set results
[Figure legend: split absent; split present]
29 Yeast data set results
[Figure legend: split present; split absent]
30 Fish data set
- 10 orthologous genes in 14 species of ray-finned fish
- Li et al, BMC Evolutionary Biology, 2007
31 Fish bootstrap analysis
- To what extent is incongruence caused by:
  - (a) lack of phylogenetic signal in each gene sequence, or
  - (b) genuine evolutionary differences?
- Generate 10 bootstrap replicates for each gene
- Replicates for each gene form clusters: distinct evolutionary histories
- Replicates scattered: lack of signal in each gene
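The talk does not say how the replicates were generated. A common choice is the nonparametric bootstrap, which resamples alignment columns with replacement. The sketch below assumes that method; the taxon names and sequences are made-up toy data, and `bootstrap_replicate` is a hypothetical helper. Inferring a tree from each replicate is outside its scope.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, same columns per taxon."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {
        name: "".join(seq[i] for i in picks)
        for name, seq in alignment.items()
    }


# Toy alignment for one gene (made-up sequences, not real data).
alignment = {"sp1": "ACGTAC", "sp2": "ACGTTC", "sp3": "ACCTAC"}

rng = random.Random(0)  # fixed seed so that a run can be repeated
replicates = [bootstrap_replicate(alignment, rng) for _ in range(10)]
print(len(replicates), len(replicates[0]["sp1"]))
```

Each replicate has the same length as the original alignment, and every column in it is a copy of some original column.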
32 Fish bootstrap results
33 Fish bootstrap results
[Figure labels: sreb2, ENC1, plag12, Glyt]
34 Fish bootstrap results
[Figure legend: clade absent; gain / loss of clade (Danio, Ictalur, Ochor, Semiotil); clade present]
35 Summary
- Attempt to represent a collection of trees by a tree-of-trees or meta-tree
- Finding an optimal meta-tree is computationally hard
- Meta-NJ algorithm: heuristic approach that builds meta-trees by maintaining a local optimality condition: each vertex is the majority consensus of its neighbours
- Examples show typical insights meta-trees can provide
39 Acknowledgements
- Thanks to:
  - Wally Gilks
  - Antonis Rokas (yeast data set)
  - Chenhong Li (fish data set)
- Web site: software is available online at: ntmwn/phylo comparison/multiple.html
More informationLecture 20: Clustering and Evolution
Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 11/11/2014 Comp 555 Bioalgorithms (Fall 2014) 1 Clique Graphs A clique is a graph where every vertex is connected via an edge to every other
More informationNeighbor Joining Plus - algorithm for phylogenetic tree reconstruction with proper nodes assignment.
Neighbor Joining Plus - algorithm for phylogenetic tree reconstruction with proper nodes assignment. Piotr Płoński 1ξ, Jan P. Radomski 2ξ 1 Institute of Radioelectronics, Warsaw University of Technology,
More informationCS 2750 Machine Learning. Lecture 19. Clustering. CS 2750 Machine Learning. Clustering. Groups together similar instances in the data sample
Lecture 9 Clustering Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Clustering Groups together similar instances in the data sample Basic clustering problem: distribute data into k different groups
More informationDistance Methods. "PRINCIPLES OF PHYLOGENETICS" Spring 2006
Integrative Biology 200A University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS" Spring 2006 Distance Methods Due at the end of class: - Distance matrices and trees for two different distance
More informationLab 13: The What the Hell Do I Do with All These Trees Lab
Integrative Biology 200A University of California, Berkeley Principals of Phylogenetics Spring 2012 Updated by Michael Landis Lab 13: The What the Hell Do I Do with All These Trees Lab We ve generated
More informationResampling Methods. Levi Waldron, CUNY School of Public Health. July 13, 2016
Resampling Methods Levi Waldron, CUNY School of Public Health July 13, 2016 Outline and introduction Objectives: prediction or inference? Cross-validation Bootstrap Permutation Test Monte Carlo Simulation
More informationMain Reference. Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taking a Random Walk through Tree Space Genetics 2005
Stochastic Models for Horizontal Gene Transfer Dajiang Liu Department of Statistics Main Reference Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taing a Random Wal through Tree Space
More informationA New Algorithm for the Reconstruction of Near-Perfect Binary Phylogenetic Trees
A New Algorithm for the Reconstruction of Near-Perfect Binary Phylogenetic Trees Kedar Dhamdhere ½ ¾, Srinath Sridhar ½ ¾, Guy E. Blelloch ¾, Eran Halperin R. Ravi and Russell Schwartz March 17, 2005 CMU-CS-05-119
More informationFast and accurate branch lengths estimation for phylogenomic trees
Binet et al. BMC Bioinformatics (2016) 17:23 DOI 10.1186/s12859-015-0821-8 RESEARCH ARTICLE Open Access Fast and accurate branch lengths estimation for phylogenomic trees Manuel Binet 1,2,3, Olivier Gascuel
More informationPhylogenetic networks that display a tree twice
Bulletin of Mathematical Biology manuscript No. (will be inserted by the editor) Phylogenetic networks that display a tree twice Paul Cordue Simone Linz Charles Semple Received: date / Accepted: date Abstract
More informationINFERRING OPTIMAL SPECIES TREES UNDER GENE DUPLICATION AND LOSS
INFERRING OPTIMAL SPECIES TREES UNDER GENE DUPLICATION AND LOSS M. S. BAYZID, S. MIRARAB and T. WARNOW Department of Computer Science, The University of Texas at Austin, Austin, Texas 78712, USA E-mail:
More informationFastJoin, an improved neighbor-joining algorithm
Methodology FastJoin, an improved neighbor-joining algorithm J. Wang, M.-Z. Guo and L.L. Xing School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, P.R. China
More informationPackage rwty. June 22, 2016
Type Package Package rwty June 22, 2016 Title R We There Yet? Visualizing MCMC Convergence in Phylogenetics Version 1.0.1 Author Dan Warren , Anthony Geneva ,
More informationLab 12: The What the Hell Do I Do with All These Trees Lab
Integrative Biology 200A University of California, Berkeley Principals of Phylogenetics Spring 2010 Updated by Nick Matzke Lab 12: The What the Hell Do I Do with All These Trees Lab We ve generated a lot
More informationCooperative Rec-I-DCM3: A Population-Based Approach for Reconstructing Phylogenies
Cooperative Rec-I-DCM3: A Population-Based Approach for Reconstructing Phylogenies Tiffani L. Williams Department of Computer Science Texas A&M University tlw@cs.tamu.edu Marc L. Smith Department of Computer
More informationStat 547 Assignment 3
Stat 547 Assignment 3 Release Date: Saturday April 16, 2011 Due Date: Wednesday, April 27, 2011 at 4:30 PST Note that the deadline for this assignment is one day before the final project deadline, and
More informationRapid Neighbour-Joining
Rapid Neighbour-Joining Martin Simonsen, Thomas Mailund, and Christian N.S. Pedersen Bioinformatics Research Center (BIRC), University of Aarhus, C. F. Møllers Allé, Building 1110, DK-8000 Århus C, Denmark
More informationSistemática Teórica. Hernán Dopazo. Biomedical Genomics and Evolution Lab. Lesson 03 Statistical Model Selection
Sistemática Teórica Hernán Dopazo Biomedical Genomics and Evolution Lab Lesson 03 Statistical Model Selection Facultad de Ciencias Exactas y Naturales Universidad de Buenos Aires Argentina 2013 Statistical
More informationA practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees
A practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees Andreas Sand 1,2, Gerth Stølting Brodal 2,3, Rolf Fagerberg 4, Christian N. S. Pedersen 1,2 and Thomas Mailund
More informationarxiv: v2 [q-bio.pe] 8 Aug 2016
Combinatorial Scoring of Phylogenetic Networks Nikita Alexeev and Max A. Alekseyev The George Washington University, Washington, D.C., U.S.A. arxiv:160.0841v [q-bio.pe] 8 Aug 016 Abstract. Construction
More informationSAMPLING DISCRETE COMBINATORIAL SPACES IN PHYLOGENETICS
SAMPLING DISCRETE COMBINATORIAL SPACES IN PHYLOGENETICS by Alexander Safatli Submitted in partial fulfillment of the requirements for the degree of Master of Computer Science at Dalhousie University Halifax,
More information