Alignment of Trees and Directed Acyclic Graphs

Gabriel Valiente
Algorithms, Bioinformatics, Complexity and Formal Methods Research Group, Technical University of Catalonia
Computational Biology and Bioinformatics Research Group, Research Institute of Health Science, University of the Balearic Islands
Centre for Genomic Regulation, Barcelona Biomedical Research Park
Ben-Gurion University of the Negev, Israel, April 27, 2009
Gabriel Valiente (UPC) Alignment of Directed Acyclic Graphs BGU / 35

Abstract

It is well known that the string edit distance and the alignment of strings coincide, while the alignment of trees differs from the tree edit distance. In this talk, we recall various constraints on directed acyclic graphs that allow for a unique (up to isomorphism) representation, called the path multiplicity representation, and present a new method for the alignment of trees and directed acyclic graphs that exploits the path multiplicity representation to produce a meaningful optimal alignment in polynomial time.

Plan of the Talk

String edit distance and alignment
Tree edit distance and alignment
DAG representation of phylogenetic networks
Path multiplicity representation
DAG alignment
Tree alignment as DAG alignment
Tool support
  BioPerl module
  Web interface to the BioPerl module
Conclusion

String edit distance and alignment

Definition. The edit distance between two strings is the smallest number of insertions, deletions, and substitutions needed to transform one string into the other.

Definition. An alignment of two strings is an arrangement of the two strings as rows of a matrix, with additional gaps (dashes) between the elements to make some or all of the remaining (aligned) columns contain identical elements, but with no column gapped in both strings.

Example (Optimal alignment). Asterisks mark the mismatched columns.

-GCTTCCGGCTCGTATAATGTGTGG
      * *          *
TGCTTCTGACT---ATAATA-G---
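The definition of edit distance above can be made concrete with the standard dynamic program. The following is a minimal sketch assuming unit costs for insertions, deletions, and substitutions; the function name `edit_distance` is illustrative and does not come from the talk.

```python
def edit_distance(s: str, t: str) -> int:
    """Unit-cost edit distance between s and t (standard dynamic program)."""
    # prev[j] is the distance between the prefix of s read so far and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute, free when a == b
            ))
        prev = curr
    return prev[-1]


# The two strings of the alignment example, with gaps removed.
S1 = "GCTTCCGGCTCGTATAATGTGTGG"
S2 = "TGCTTCTGACTATAATAG"
print(edit_distance(S1, S2))
```

Every gapped or mismatched column of an alignment corresponds to one edit operation, so the 11 such columns in the example alignment give an upper bound on the value printed for `S1` and `S2`.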

Tree edit distance and alignment

Definition. The edit distance between two trees is the smallest number of insertions, deletions, and substitutions needed to transform one tree into the other.

Example (Edit distance). [Figure: trees with nodes labeled a, b, c, d, e, f]

Definition. An alignment of two trees is an arrangement of the trees with space-labeled nodes inserted such that their structures coincide.

Example (Optimal alignment). [Figure: aligned trees with nodes labeled a, b, c, d, e, f]

Remark. An alignment of trees is a restricted form of tree edit distance in which all the insertions precede all the deletions.

Remark. With insertion cost 1, deletion cost 1, identical substitution cost 0, and non-identical substitution cost 2, an optimal tree edit yields a largest common subtree and an optimal alignment yields a smallest common supertree.

T. Jiang, L. Wang, and K. Zhang. Alignment of trees: an alternative to tree edit. Theoretical Computer Science, 143(1), 1995

H. Bunke, X. Jiang, and A. Kandel. On the minimum common supergraph of two graphs. Computing, 65(1):13-25, 2000

M.-L. Fernández and G. Valiente. A graph distance measure combining maximum common subgraph and minimum common supergraph. Pattern Recognition Letters, 22(6-7), 2001

Theorem. The problems of finding a largest common subtree and a smallest common supertree of two trees, in each case together with a pair of witness (minor, topological, homeomorphic, or isomorphic) embeddings, are reducible to each other in time linear in the size of the trees.

F. Rosselló and G. Valiente. An algebraic view of the relation between largest common subtrees and smallest common supertrees. Theoretical Computer Science, 362(1-3):33-53, 2006

Example

A. Lozano, R. Pinter, O. Rokhlenko, G. Valiente, and M. Ziv-Ukelson. Seeded tree alignment and planar tanglegram layout. In Proc. 7th Workshop on Algorithms in Bioinformatics, volume 4645 of Lecture Notes in Bioinformatics. Springer, 2007

A. Lozano, R. Pinter, O. Rokhlenko, G. Valiente, and M. Ziv-Ukelson. Seeded tree alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 5(4), 2008

DAG representation of phylogenetic networks

D. H. Huson and D. Bryant. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol., 23(2), 2006

Definition. A phylogenetic network is a directed acyclic graph whose terminal nodes are labeled by taxa names and whose internal nodes are either tree nodes (if they have only one parent) or hybrid nodes (if they have two or more parents).
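The distinction between tree nodes and hybrid nodes depends only on the number of parents, so it can be read off an edge list. The sketch below assumes a network given as (parent, child) pairs and treats the root, which has no parent, as a tree node; the function name and the toy network `EDGES` are illustrative and not taken from the talk.

```python
from collections import defaultdict


def classify_nodes(edges):
    """Map each node of a DAG, given as (parent, child) pairs, to its kind."""
    indeg = defaultdict(int)
    outdeg = defaultdict(int)
    for parent, child in edges:
        outdeg[parent] += 1
        indeg[child] += 1
    kinds = {}
    for node in set(indeg) | set(outdeg):
        if outdeg[node] == 0:
            kinds[node] = "leaf"      # terminal node, labeled by a taxon
        elif indeg[node] >= 2:
            kinds[node] = "hybrid"    # two or more parents
        else:
            kinds[node] = "tree"      # one parent, or the root (assumption)
    return kinds


# Toy network: root r, internal nodes a and b, hybrid node h, leaves 1, 2, 3.
EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
         ("b", "h"), ("b", "3"), ("h", "2")]
print(classify_nodes(EDGES))
```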

Example. 44 polymorphic sites in a sample of 11 sequences of the single gene encoding alcohol dehydrogenase, from 5 natural populations of D. melanogaster.

Wa-S   CCGCAATAATGGCGCTACTCTCACAATAACCCACTAGACAGCCT
Fl-1S  CCCCAATATGGGCGCTACTTTCACAATAACCCACTAGACAGCCT
Af-S   CCGCAATATGGGCGCTACCCCCCGGAATCTCCACTAAACAGTCA
Fr-S   CCGCAATATGGGCGCTGTCCCCCGGAATCTCCACTAAACTACCT
Fl-2S  CCGAGATAAGTCCGAGGTCCCCCGGAATCTCCACTAGCCAGCCT
Ja-S   CCCCAATATGGGCGCGACCCCCCGGAATCTCTATTCACCAGCTT
Fl-F   CCCCAATATGGGCGCGACCCCCCGGAATCTGTCTCCGCCAGCCT
Fr-F   TGCAGATAAGTCGGCGACCCCCCGGAATCTGTCTCCGCGAGCCT
Wa-F   TGCAGATAAGTCGGCGACCCCCCGGAATCTGTCTCCGCGAGCCT
Af-F   TGCAGATAAGTCGGCGACCCCCCGGAATCTGTCTCCGCGAGCCT
Ja-F   TGCAGGGGAGGGCTCGACCCCACGGGATCTGTCTCCGCCAGCCT

M. Kreitman. Nucleotide polymorphism at the alcohol dehydrogenase locus of Drosophila melanogaster. Nature, 304(5925), 1983

Example. [Figure: phylogenetic network with leaves Ja-F, Af-F, Fr-F, Wa-F, Fl-2S, Wa-S, Af-S, Fr-S, Fl-1S, Ja-S, Fl-F]

Example. [Figure: phylogenetic network with leaves Fl-F, Ja-F, Fr-F, Wa-F, Af-F, Ja-S, Fl-2S, Fr-S, Wa-S, Af-S, Fl-1S]

Definition. A phylogenetic network is called tree-sibling if every hybrid node has at least one sibling that is a tree node.

Remark. The biological meaning of the tree-sibling condition is that, in each recombination or hybridization process, at least one of the species involved also has some descendant through mutation.
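The tree-sibling condition can be checked directly from an edge list. This is a minimal sketch, assuming (parent, child) pairs as input, taking a sibling to be any other child of any parent of the hybrid node, and counting every node with exactly one parent (leaves included) as a tree node. The names `is_tree_sibling`, `EDGES`, and `NO_TREE_SIBLING` are illustrative.

```python
from collections import defaultdict


def is_tree_sibling(edges):
    """True if every hybrid node has at least one sibling with a single parent."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)
        children[p].add(c)
    hybrids = [v for v, ps in parents.items() if len(ps) >= 2]
    for h in hybrids:
        # Siblings of h: all other children of the parents of h.
        siblings = {s for p in parents[h] for s in children[p]} - {h}
        if not any(len(parents[s]) == 1 for s in siblings):
            return False
    return True


# Toy network in which the hybrid node h has the leaf siblings 1 and 3.
EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
         ("b", "h"), ("b", "3"), ("h", "2")]
# Toy network in which the only sibling of each hybrid node is the other hybrid.
NO_TREE_SIBLING = [("r", "a"), ("r", "b"), ("a", "h"), ("a", "g"),
                   ("b", "h"), ("b", "g"), ("h", "1"), ("g", "2")]
print(is_tree_sibling(EDGES), is_tree_sibling(NO_TREE_SIBLING))
```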

Definition. A phylogenetic network is called tree-child if every internal node has at least one child that is a tree node.

Remark. The biological meaning of the tree-child condition is that every non-extant species has some descendant through mutation.
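The tree-child condition is a local test on the children of each internal node. The sketch below assumes a network given as (parent, child) pairs and counts every child with exactly one parent, leaves included, as a tree node. The function name and both toy networks are illustrative.

```python
from collections import defaultdict


def is_tree_child(edges):
    """True if every internal node has at least one child with a single parent."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)
        children[p].add(c)
    # Internal nodes are exactly the nodes that have children.
    for node, kids in children.items():
        if not any(len(parents[k]) == 1 for k in kids):
            return False
    return True


# Toy network: every internal node has a leaf or tree-node child.
EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
         ("b", "h"), ("b", "3"), ("h", "2")]
# Toy network: both children of a (and of b) are hybrid nodes.
NOT_TREE_CHILD = [("r", "a"), ("r", "b"), ("a", "h"), ("a", "g"),
                  ("b", "h"), ("b", "g"), ("h", "1"), ("g", "2")]
print(is_tree_child(EDGES), is_tree_child(NOT_TREE_CHILD))
```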

Definition. A phylogenetic network is time-consistent if there is a temporal representation of the network, that is, an assignment of times to the nodes of the network that strictly increases on tree edges (those edges whose head is a tree node) and remains the same on hybrid edges (those whose head is a hybrid node).

Remark. Biologically, a temporal assignment gives the time at which a species exists or a hybridization process occurs: for such a process to take place, the species involved must coexist in time.
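One way to test time consistency, sketched here as an illustration rather than as the method used in the talk, is to merge the endpoints of every hybrid edge (they must share a time) and then check that the tree edges induce an acyclic order on the merged classes. The function name `temporal_representation` and both toy networks are assumptions made for this example.

```python
from collections import defaultdict, deque


def temporal_representation(edges):
    """Return {node: time} if the DAG is time-consistent, otherwise None."""
    nodes = {v for e in edges for v in e}
    indeg = defaultdict(int)
    for _, c in edges:
        indeg[c] += 1

    # Union-find: both endpoints of a hybrid edge must receive the same time.
    rep = {v: v for v in nodes}

    def find(v):
        while rep[v] != v:
            rep[v] = rep[rep[v]]
            v = rep[v]
        return v

    for p, c in edges:
        if indeg[c] >= 2:
            rep[find(p)] = find(c)

    # Tree edges must go strictly forward between merged classes.
    succ, pending = defaultdict(set), defaultdict(int)
    for p, c in edges:
        if indeg[c] < 2:
            a, b = find(p), find(c)
            if a == b:
                return None  # a tree edge inside one class cannot increase time
            if b not in succ[a]:
                succ[a].add(b)
                pending[b] += 1

    # Kahn's algorithm; the time of a class is its longest distance from a source.
    classes = {find(v) for v in nodes}
    time = {c: 0 for c in classes if pending[c] == 0}
    queue = deque(time)
    done = 0
    while queue:
        a = queue.popleft()
        done += 1
        for b in succ[a]:
            time[b] = max(time.get(b, 0), time[a] + 1)
            pending[b] -= 1
            if pending[b] == 0:
                queue.append(b)
    if done < len(classes):
        return None  # the classes contain a cycle of tree edges
    return {v: time[find(v)] for v in nodes}


EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
         ("b", "h"), ("b", "3"), ("h", "2")]
# Here a and b are both parents of h, yet a is also the tree parent of b.
NOT_CONSISTENT = [("r", "a"), ("r", "3"), ("a", "b"), ("a", "h"),
                  ("b", "h"), ("b", "1"), ("h", "2")]
print(temporal_representation(EDGES))
print(temporal_representation(NOT_CONSISTENT))
```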

Example (Time consistency)

[Figure: classes of phylogenetic networks: phylogenetic networks, tree-sibling, tree-child, galled-trees, phylogenetic trees, with a region marked "not time-consistent"]

[Table: number of phylogenetic trees, galled-trees, tree-child, and tree-sibling networks for ρ = 0, 1, 2, 4, 8]

M. Arenas, G. Valiente, and D. Posada. Characterization of phylogenetic reticulate networks based on the coalescent with recombination. Molecular Biology and Evolution, 25(12), 2008

Path multiplicity representation

Definition. The µ-vector µ(u) of a node u records, for each leaf, the number of distinct paths from u to that leaf. The µ-representation of a tree-child phylogenetic network is the multiset of the µ-vectors µ(u) of all its nodes u.

Lemma. The µ-representation of a tree-child phylogenetic network can be obtained in polynomial time.
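A sketch of why the computation is polynomial: the µ-vector of a leaf is a unit vector, and the µ-vector of an internal node is the sum of the µ-vectors of its children, so one memoized bottom-up pass suffices. The function name `mu_vectors`, the edge-list input format, and the toy network are assumptions made for this illustration.

```python
from collections import defaultdict
from functools import lru_cache


def mu_vectors(edges, leaves):
    """Return {node: mu-vector}; component i counts the paths to leaves[i]."""
    children = defaultdict(list)
    for p, c in edges:
        children[p].append(c)
    index = {leaf: i for i, leaf in enumerate(leaves)}

    @lru_cache(maxsize=None)
    def mu(u):
        if u in index:
            # A leaf has exactly one (empty) path to itself.
            return tuple(int(i == index[u]) for i in range(len(leaves)))
        # Every path from u starts with an edge to one of its children.
        vectors = [mu(c) for c in children[u]]
        return tuple(sum(column) for column in zip(*vectors))

    nodes = {v for e in edges for v in e}
    return {u: mu(u) for u in nodes}


EDGES = [("r", "a"), ("r", "b"), ("a", "1"), ("a", "h"),
         ("b", "h"), ("b", "3"), ("h", "2")]
MU = mu_vectors(EDGES, ["1", "2", "3"])
print(sorted(MU.values()))  # the mu-representation, as a sorted multiset
```

In this toy network the root reaches leaf 2 along two paths (through a and through b), which is why the second component of its µ-vector is 2.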

Theorem. Two tree-child phylogenetic networks are isomorphic if and only if they have the same µ-representation.

Definition. The µ-distance between two tree-child phylogenetic networks $N$ and $N'$ is the size of the symmetric difference of their µ-representations, taken as multisets:

$$d_\mu(N,N') = |\mu(N) \,\triangle\, \mu(N')|$$

Example ($d_\mu(N,N') = 6$). [Figure: networks $N$ and $N'$]
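As a small illustration of the definition, the symmetric difference of two multisets can be computed with `collections.Counter`. The function name `mu_distance` and the two three-leaf trees below are made up for this sketch; they are not the networks of the example figure.

```python
from collections import Counter


def mu_distance(mu1, mu2):
    """Size of the symmetric difference of two multisets of mu-vectors."""
    c1, c2 = Counter(mu1), Counter(mu2)
    # Counter subtraction keeps only positive counts, so the two one-sided
    # differences together form the multiset symmetric difference.
    return sum(((c1 - c2) + (c2 - c1)).values())


LEAVES = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
T1 = LEAVES + [(1, 1, 0), (1, 1, 1)]  # the tree ((1,2),3)
T2 = LEAVES + [(0, 1, 1), (1, 1, 1)]  # the tree (1,(2,3))
print(mu_distance(T1, T2))
```

The two trees differ in one cluster each, so exactly two µ-vectors are unmatched.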

Theorem. The µ-distance induces a metric on the space of tree-child phylogenetic networks that generalizes the bipartition distance for phylogenetic trees.

G. Cardona, F. Rosselló, and G. Valiente. Comparison of tree-child phylogenetic networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2009

Theorem. The µ-distance induces a metric on the space of semi-binary tree-sibling time-consistent phylogenetic networks that generalizes the bipartition distance for phylogenetic trees.

G. Cardona, M. Llabrés, F. Rosselló, and G. Valiente. A distance metric for a class of tree-sibling phylogenetic networks. Bioinformatics, 24(13), 2008

DAG alignment

Definition. For every $v \in V$ and $v' \in V'$ of two phylogenetic networks $N = (V,E)$ and $N' = (V',E')$, let

$$m(v,v') = \|\mu(v) - \mu(v')\|$$

$$\chi(v,v') = \begin{cases} 0 & \text{if } v, v' \text{ are both tree nodes or both hybrid nodes} \\ 1 & \text{otherwise} \end{cases}$$

The weight of the pair $(v,v')$ is

$$w(v,v') = m(v,v') + \frac{\chi(v,v')}{2n}$$

The total weight of a matching $M : V \to V'$ is

$$w(M) = \sum_{v \in V} w(v, M(v))$$

Definition. An optimal alignment between two phylogenetic networks is a matching with the smallest total weight among all possible matchings.

Lemma. A matching $M$ between two phylogenetic networks $N = (V,E)$ and $N' = (V',E')$ is an optimal alignment if and only if it minimizes the sum

$$\sum_{v \in V} m(v, M(v))$$

and, among those matchings minimizing this sum, it maximizes the number of nodes that are sent to nodes of the same type.

G. Cardona, F. Rosselló, and G. Valiente. Comparison of tree-child phylogenetic networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2009
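The definitions above can be tried out on the µ-vectors of the internal nodes from the example that follows. This sketch rests on several assumptions that the slides do not state: the norm is taken to be the sum of absolute componentwise differences, capital letters are taken to mark hybrid nodes, $n$ is taken to be the number of nodes being matched, leaves are left out, and the root of the second network is renamed `r'` to keep the names distinct. The exhaustive search over permutations is only workable for such a tiny example; a polynomial-time assignment solver (for instance the Hungarian method) would take its place in practice.

```python
from fractions import Fraction
from itertools import permutations

# mu-vectors of the internal nodes of the two example networks.
N1 = {"r": (1, 1, 2, 3, 1), "b": (0, 0, 1, 2, 1), "a": (1, 1, 1, 1, 0),
      "A": (0, 0, 1, 1, 0), "c": (1, 1, 0, 0, 0), "d": (0, 0, 1, 1, 0),
      "e": (0, 0, 0, 1, 1), "B": (0, 0, 0, 1, 0)}
N2 = {"r'": (1, 2, 1, 2, 1), "u": (1, 1, 0, 0, 0), "v": (0, 1, 1, 2, 1),
      "x": (0, 1, 1, 1, 0), "y": (0, 0, 1, 1, 0), "z": (0, 0, 0, 1, 1),
      "X": (0, 1, 0, 0, 0), "Y": (0, 0, 0, 1, 0)}


def is_hybrid(name):
    # Assumption: capital letters mark hybrid nodes.
    return name[0].isupper()


def scaled_weight(v, w, n):
    """2n * w(v, w), kept as an integer so that comparisons are exact."""
    m = sum(abs(a - b) for a, b in zip(N1[v], N2[w]))
    chi = int(is_hybrid(v) != is_hybrid(w))
    return 2 * n * m + chi


def optimal_alignment():
    rows, cols = list(N1), list(N2)
    n = max(len(rows), len(cols))  # assumption: n is the number of nodes
    cost = {(v, w): scaled_weight(v, w, n) for v in rows for w in cols}
    # Brute force over all matchings (8! = 40320 candidates here).
    best = min(permutations(cols, len(rows)),
               key=lambda perm: sum(cost[v, w] for v, w in zip(rows, perm)))
    total = sum(cost[v, w] for v, w in zip(rows, best))
    return Fraction(total, 2 * n), dict(zip(rows, best))


total, matching = optimal_alignment()
print(total, matching)
```

Under these assumptions the minimum total weight is 8, and every node is matched to a node of the same type, as the lemma requires. The optimal matching need not be unique: the two hybrid nodes can be paired in more than one way at the same cost.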

Example (Optimal alignment of two phylogenetic networks). µ-vectors of the internal nodes of the two networks:

First network:
r  (1,1,2,3,1)
b  (0,0,1,2,1)
a  (1,1,1,1,0)
A  (0,0,1,1,0)
c  (1,1,0,0,0)
d  (0,0,1,1,0)
e  (0,0,0,1,1)
B  (0,0,0,1,0)

Second network:
r  (1,2,1,2,1)
u  (1,1,0,0,0)
v  (0,1,1,2,1)
x  (0,1,1,1,0)
y  (0,0,1,1,0)
z  (0,0,0,1,1)
X  (0,1,0,0,0)
Y  (0,0,0,1,0)

Example (Optimal alignment of two phylogenetic networks). Matched nodes, with the weights shown in the figure:

First network   Second network   Weight
r               r                3
b               v                1
a               x                1
c               u                0
A               X                3
d               y
B               Y
e               z

Tree alignment as DAG alignment

Remark. If we restrict this alignment method to phylogenetic trees, the weight of a pair of nodes $(v_1,v_2)$ is simply $|C_L(v_1) \,\triangle\, C_L(v_2)|$. This can be seen as an unnormalized version of the score used in TreeJuxtaposer.

T. Munzner, F. Guimbretière, S. Tasiran, L. Zhang, and Y. Zhou. TreeJuxtaposer: Scalable tree comparison using focus+context with guaranteed visibility. ACM T. Graphics, 22(3), 2003
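To see why the weight reduces to a cluster comparison on trees, note that in a tree there is at most one path from a node to each leaf, so a µ-vector is the 0/1 indicator of the node's cluster of descendant leaves. The sketch below assumes the weight of a pair is the sum of absolute componentwise differences of the µ-vectors and checks that this equals the size of the symmetric difference of the clusters; all names and the sample clusters are illustrative.

```python
def cluster_weight(c1, c2):
    """Size of the symmetric difference of two clusters of leaf labels."""
    return len(set(c1) ^ set(c2))


def indicator(cluster, leaves):
    """mu-vector of a tree node whose descendant leaves form `cluster`."""
    return tuple(int(leaf in cluster) for leaf in leaves)


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


LEAVES = ["1", "2", "3", "4"]
C1, C2 = {"1", "2"}, {"2", "3", "4"}
print(cluster_weight(C1, C2),
      manhattan(indicator(C1, LEAVES), indicator(C2, LEAVES)))
```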

Example (Optimal alignment of two phylogenetic trees). [Figure: two aligned phylogenetic trees with fractional scores on the matched nodes]

Tool support: BioPerl module

Bio::PhyloNetwork

The Perl module Bio::PhyloNetwork implements all the data structures needed to work with phylogenetic networks, as well as algorithms for
  reconstructing a network from its eNewick string
  reconstructing a network from its µ-representation
  exploding a network into the set of its induced subtrees
  computing the µ-representation of a network
  computing the µ-distance between two networks
  computing an optimal alignment between two networks
  computing the set of tripartitions of a network
  computing the tripartition error between two networks
  testing if a network is time-consistent
  computing a temporal representation of a time-consistent network

Tool support: Web interface to the BioPerl module

The web interface at gcardona/bioinfo/alignment.php allows the user to input one or two phylogenetic networks, given by their eNewick strings. A Perl script processes these strings and uses the Bio::PhyloNetwork package to compute all available data for them, including a plot of the networks that can be downloaded in PS format; these plots are generated through the application GraphViz and its companion Perl package.

Given two networks on the same set of leaves, their µ-distance is also computed, as well as an optimal alignment between them. If their sets of leaves are not the same, their topological restriction on the set of common leaves is computed first, followed by the µ-distance and an optimal alignment.

A Java applet displays the networks side by side, and whenever a node is selected, the corresponding node in the other network (with respect to the optimal alignment) is highlighted, if it exists. This is also extended to edges. Similarities and differences between the networks are thus evident at a glance.

Conclusion

String edit distance and alignment of strings coincide, but alignment of trees differs from tree edit distance and alignment of graphs differs from graph edit distance.

The alignment of trees and directed acyclic graphs based on the path multiplicity representation can be computed in polynomial time.

The alignment method can be applied to any directed acyclic graphs with the same set of terminal node labels.

G. Valiente. Algorithms on Trees and Graphs. Springer, 2002

G. Valiente. Combinatorial Pattern Matching Algorithms in Computational Biology using Perl and R. Taylor & Francis/CRC Press, 2009


More information

Answer Set Programming or Hypercleaning: Where does the Magic Lie in Solving Maximum Quartet Consistency?

Answer Set Programming or Hypercleaning: Where does the Magic Lie in Solving Maximum Quartet Consistency? Answer Set Programming or Hypercleaning: Where does the Magic Lie in Solving Maximum Quartet Consistency? Fathiyeh Faghih and Daniel G. Brown David R. Cheriton School of Computer Science, University of

More information

INFERRING OPTIMAL SPECIES TREES UNDER GENE DUPLICATION AND LOSS

INFERRING OPTIMAL SPECIES TREES UNDER GENE DUPLICATION AND LOSS INFERRING OPTIMAL SPECIES TREES UNDER GENE DUPLICATION AND LOSS M. S. BAYZID, S. MIRARAB and T. WARNOW Department of Computer Science, The University of Texas at Austin, Austin, Texas 78712, USA E-mail:

More information

Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea

Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea Descent w/modification Descent w/modification Descent w/modification Descent w/modification CPU Descent w/modification Descent w/modification Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea

More information

Introduction to Trees

Introduction to Trees Introduction to Trees Tandy Warnow December 28, 2016 Introduction to Trees Tandy Warnow Clades of a rooted tree Every node v in a leaf-labelled rooted tree defines a subset of the leafset that is below

More information

The History Bound and ILP

The History Bound and ILP The History Bound and ILP Julia Matsieva and Dan Gusfield UC Davis March 15, 2017 Bad News for Tree Huggers More Bad News Far more convincingly even than the (also highly convincing) fossil evidence, the

More information

of the Balanced Minimum Evolution Polytope Ruriko Yoshida

of the Balanced Minimum Evolution Polytope Ruriko Yoshida Optimality of the Neighbor Joining Algorithm and Faces of the Balanced Minimum Evolution Polytope Ruriko Yoshida Figure 19.1 Genomes 3 ( Garland Science 2007) Origins of Species Tree (or web) of life eukarya

More information

Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters

Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters Erin Molloy University of Illinois at Urbana Champaign General Allocation (PI: Tandy Warnow) Exploratory Allocation

More information

Reconstructing Reticulate Evolution in Species Theory and Practice

Reconstructing Reticulate Evolution in Species Theory and Practice Reconstructing Reticulate Evolution in Species Theory and Practice Luay Nakhleh Department of Computer Science Rice University Houston, Texas 77005 nakhleh@cs.rice.edu Tandy Warnow Department of Computer

More information

Rotation Distance is Fixed-Parameter Tractable

Rotation Distance is Fixed-Parameter Tractable Rotation Distance is Fixed-Parameter Tractable Sean Cleary Katherine St. John September 25, 2018 arxiv:0903.0197v1 [cs.ds] 2 Mar 2009 Abstract Rotation distance between trees measures the number of simple

More information

Folding and unfolding phylogenetic trees and networks

Folding and unfolding phylogenetic trees and networks J. Math. Biol. (2016) 73:1761 1780 DOI 10.1007/s00285-016-0993-5 Mathematical Biology Folding and unfolding phylogenetic trees and networks Katharina T. Huber 1 Vincent Moulton 1 Mike Steel 2 Taoyang Wu

More information

Graph similarity. Laura Zager and George Verghese EECS, MIT. March 2005

Graph similarity. Laura Zager and George Verghese EECS, MIT. March 2005 Graph similarity Laura Zager and George Verghese EECS, MIT March 2005 Words you won t hear today impedance matching thyristor oxide layer VARs Some quick definitions GV (, E) a graph G V the set of vertices

More information

Embedded Subgraph Isomorphism and Related Problems

Embedded Subgraph Isomorphism and Related Problems Embedded Subgraph Isomorphism and Related Problems Graph isomorphism, subgraph isomorphism, and maximum common subgraph can be solved in polynomial time when constrained by geometrical information, in

More information

The SNPR neighbourhood of tree-child networks

The SNPR neighbourhood of tree-child networks Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 22, no. 2, pp. 29 55 (2018) DOI: 10.7155/jgaa.00472 The SNPR neighbourhood of tree-child networks Jonathan Klawitter Department of Computer

More information

Identifiability of Large Phylogenetic Mixture Models

Identifiability of Large Phylogenetic Mixture Models Identifiability of Large Phylogenetic Mixture Models John Rhodes and Seth Sullivant University of Alaska Fairbanks and NCSU April 18, 2012 Seth Sullivant (NCSU) Phylogenetic Mixtures April 18, 2012 1 /

More information

Phylogenetic Networks: Properties and Relationship to Trees and Clusters

Phylogenetic Networks: Properties and Relationship to Trees and Clusters Phylogenetic Networks: Properties and Relationship to Trees and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science, Rice University, Houston, TX 77005, USA nakhleh@cs.rice.edu 2

More information

Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis

Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis Parallel Implementation of a Quartet-Based Algorithm for Phylogenetic Analysis B. B. Zhou 1, D. Chu 1, M. Tarawneh 1, P. Wang 1, C. Wang 1, A. Y. Zomaya 1, and R. P. Brent 2 1 School of Information Technologies

More information

Chordal Graphs and Evolutionary Trees. Tandy Warnow

Chordal Graphs and Evolutionary Trees. Tandy Warnow Chordal Graphs and Evolutionary Trees Tandy Warnow Possible Indo-European tree (Ringe, Warnow and Taylor 2000) Anatolian Vedic Iranian Greek Italic Celtic Tocharian Armenian Germanic Baltic Slavic Albanian

More information

Phylogenetics. Introduction to Bioinformatics Dortmund, Lectures: Sven Rahmann. Exercises: Udo Feldkamp, Michael Wurst

Phylogenetics. Introduction to Bioinformatics Dortmund, Lectures: Sven Rahmann. Exercises: Udo Feldkamp, Michael Wurst Phylogenetics Introduction to Bioinformatics Dortmund, 16.-20.07.2007 Lectures: Sven Rahmann Exercises: Udo Feldkamp, Michael Wurst 1 Phylogenetics phylum = tree phylogenetics: reconstruction of evolutionary

More information

Unique reconstruction of tree-like phylogenetic networks from distances between leaves

Unique reconstruction of tree-like phylogenetic networks from distances between leaves Unique reconstruction of tree-like phylogenetic networks from distances between leaves Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA email: swillson@iastate.edu

More information

Evolution Module. 6.1 Phylogenetic Trees. Bob Gardner and Lev Yampolski. Integrated Biology and Discrete Math (IBMS 1300)

Evolution Module. 6.1 Phylogenetic Trees. Bob Gardner and Lev Yampolski. Integrated Biology and Discrete Math (IBMS 1300) Evolution Module 6.1 Phylogenetic Trees Bob Gardner and Lev Yampolski Integrated Biology and Discrete Math (IBMS 1300) Fall 2008 1 INDUCTION Note. The natural numbers N is the familiar set N = {1, 2, 3,...}.

More information

DIMACS Tutorial on Phylogenetic Trees and Rapidly Evolving Pathogens. Katherine St. John City University of New York 1

DIMACS Tutorial on Phylogenetic Trees and Rapidly Evolving Pathogens. Katherine St. John City University of New York 1 DIMACS Tutorial on Phylogenetic Trees and Rapidly Evolving Pathogens Katherine St. John City University of New York 1 Thanks to the DIMACS Staff Linda Casals Walter Morris Nicole Clark Katherine St. John

More information

Computing the All-Pairs Quartet Distance on a set of Evolutionary Trees

Computing the All-Pairs Quartet Distance on a set of Evolutionary Trees Journal of Bioinformatics and Computational Biology c Imperial College Press Computing the All-Pairs Quartet Distance on a set of Evolutionary Trees M. Stissing, T. Mailund, C. N. S. Pedersen and G. S.

More information

TreeCmp 2.0: comparison of trees in polynomial time manual

TreeCmp 2.0: comparison of trees in polynomial time manual TreeCmp 2.0: comparison of trees in polynomial time manual 1. Introduction A phylogenetic tree represents historical evolutionary relationship between different species or organisms. There are various

More information

Introduction to Evolutionary Computation

Introduction to Evolutionary Computation Introduction to Evolutionary Computation The Brought to you by (insert your name) The EvoNet Training Committee Some of the Slides for this lecture were taken from the Found at: www.cs.uh.edu/~ceick/ai/ec.ppt

More information

Dynamic Programming & Smith-Waterman algorithm

Dynamic Programming & Smith-Waterman algorithm m m Seminar: Classical Papers in Bioinformatics May 3rd, 2010 m m 1 2 3 m m Introduction m Definition is a method of solving problems by breaking them down into simpler steps problem need to contain overlapping

More information

Improved parameterized complexity of the Maximum Agreement Subtree and Maximum Compatible Tree problems LIRMM, Tech.Rep. num 04026

Improved parameterized complexity of the Maximum Agreement Subtree and Maximum Compatible Tree problems LIRMM, Tech.Rep. num 04026 Improved parameterized complexity of the Maximum Agreement Subtree and Maximum Compatible Tree problems LIRMM, Tech.Rep. num 04026 Vincent Berry, François Nicolas Équipe Méthodes et Algorithmes pour la

More information

CS 441 Discrete Mathematics for CS Lecture 26. Graphs. CS 441 Discrete mathematics for CS. Final exam

CS 441 Discrete Mathematics for CS Lecture 26. Graphs. CS 441 Discrete mathematics for CS. Final exam CS 441 Discrete Mathematics for CS Lecture 26 Graphs Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Final exam Saturday, April 26, 2014 at 10:00-11:50am The same classroom as lectures The exam

More information

Cost Partitioning Techniques for Multiple Sequence Alignment. Mirko Riesterer,

Cost Partitioning Techniques for Multiple Sequence Alignment. Mirko Riesterer, Cost Partitioning Techniques for Multiple Sequence Alignment Mirko Riesterer, 10.09.18 Agenda. 1 Introduction 2 Formal Definition 3 Solving MSA 4 Combining Multiple Pattern Databases 5 Cost Partitioning

More information

Approximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT

Approximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 8, 2006 Mary Ann Liebert, Inc. Pp. 1419 1434 Approximating Subtree Distances Between Phylogenies AU1 AU2 MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3

More information

Discrete mathematics , Fall Instructor: prof. János Pach

Discrete mathematics , Fall Instructor: prof. János Pach Discrete mathematics 2016-2017, Fall Instructor: prof. János Pach - covered material - Lecture 1. Counting problems To read: [Lov]: 1.2. Sets, 1.3. Number of subsets, 1.5. Sequences, 1.6. Permutations,

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

This is the author s version of a work that was submitted/accepted for publication in the following source:

This is the author s version of a work that was submitted/accepted for publication in the following source: This is the author s version of a work that was submitted/accepted for publication in the following source: Chowdhury, Israt J. & Nayak, Richi (2) A novel method for finding similarities between unordered

More information

ABOUT THE LARGEST SUBTREE COMMON TO SEVERAL PHYLOGENETIC TREES Alain Guénoche 1, Henri Garreta 2 and Laurent Tichit 3

ABOUT THE LARGEST SUBTREE COMMON TO SEVERAL PHYLOGENETIC TREES Alain Guénoche 1, Henri Garreta 2 and Laurent Tichit 3 The XIII International Conference Applied Stochastic Models and Data Analysis (ASMDA-2009) June 30-July 3, 2009, Vilnius, LITHUANIA ISBN 978-9955-28-463-5 L. Sakalauskas, C. Skiadas and E. K. Zavadskas

More information

Genetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences Phylogeny methods, part 1 (Parsimony and such)

Genetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences Phylogeny methods, part 1 (Parsimony and such) Genetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences joe@gs Phylogeny methods, part 1 (Parsimony and such) Methods of reconstructing phylogenies (evolutionary trees) Parsimony

More information

Population Genetics in BioPerl HOWTO

Population Genetics in BioPerl HOWTO Population Genetics in BioPerl HOW Jason Stajich, Dept Molecular Genetics and Microbiology, Duke University $Id: PopGen.xml,v 1.2 2005/02/23 04:56:30 jason Exp $ This document

More information

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur

Lecture 5: Graphs. Rajat Mittal. IIT Kanpur Lecture : Graphs Rajat Mittal IIT Kanpur Combinatorial graphs provide a natural way to model connections between different objects. They are very useful in depicting communication networks, social networks

More information

Molecular Evolution & Phylogenetics Complexity of the search space, distance matrix methods, maximum parsimony

Molecular Evolution & Phylogenetics Complexity of the search space, distance matrix methods, maximum parsimony Molecular Evolution & Phylogenetics Complexity of the search space, distance matrix methods, maximum parsimony Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 Learning Objectives understand

More information

Understanding Spaces of Phylogenetic Trees

Understanding Spaces of Phylogenetic Trees Understanding Spaces of Phylogenetic Trees Williams College SMALL REU 2012 September 25, 2012 Trees Which Tell an Evolutionary Story The Tree of Life Problem Given data (e.g. nucleotide sequences) on n

More information

Evolutionary tree reconstruction (Chapter 10)

Evolutionary tree reconstruction (Chapter 10) Evolutionary tree reconstruction (Chapter 10) Early Evolutionary Studies Anatomical features were the dominant criteria used to derive evolutionary relationships between species since Darwin till early

More information

A Lookahead Branch-and-Bound Algorithm for the Maximum Quartet Consistency Problem

A Lookahead Branch-and-Bound Algorithm for the Maximum Quartet Consistency Problem A Lookahead Branch-and-Bound Algorithm for the Maximum Quartet Consistency Problem Gang Wu Jia-Huai You Guohui Lin January 17, 2005 Abstract A lookahead branch-and-bound algorithm is proposed for solving

More information

An undirected graph is a tree if and only of there is a unique simple path between any 2 of its vertices.

An undirected graph is a tree if and only of there is a unique simple path between any 2 of its vertices. Trees Trees form the most widely used subclasses of graphs. In CS, we make extensive use of trees. Trees are useful in organizing and relating data in databases, file systems and other applications. Formal

More information

Algorithms for Grid Graphs in the MapReduce Model

Algorithms for Grid Graphs in the MapReduce Model University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Computer Science and Engineering: Theses, Dissertations, and Student Research Computer Science and Engineering, Department

More information

Graph Theory. Probabilistic Graphical Models. L. Enrique Sucar, INAOE. Definitions. Types of Graphs. Trajectories and Circuits.

Graph Theory. Probabilistic Graphical Models. L. Enrique Sucar, INAOE. Definitions. Types of Graphs. Trajectories and Circuits. Theory Probabilistic ical Models L. Enrique Sucar, INAOE and (INAOE) 1 / 32 Outline and 1 2 3 4 5 6 7 8 and 9 (INAOE) 2 / 32 A graph provides a compact way to represent binary relations between a set of

More information

A more efficient algorithm for perfect sorting by reversals

A more efficient algorithm for perfect sorting by reversals A more efficient algorithm for perfect sorting by reversals Sèverine Bérard 1,2, Cedric Chauve 3,4, and Christophe Paul 5 1 Département de Mathématiques et d Informatique Appliquée, INRA, Toulouse, France.

More information

7.3 Spanning trees Spanning trees [ ] 61

7.3 Spanning trees Spanning trees [ ] 61 7.3. Spanning trees [161211-1348 ] 61 7.3 Spanning trees We know that trees are connected graphs with the minimal number of edges. Hence trees become very useful in applications where our goal is to connect

More information

A Connection between Network Coding and. Convolutional Codes

A Connection between Network Coding and. Convolutional Codes A Connection between Network Coding and 1 Convolutional Codes Christina Fragouli, Emina Soljanin christina.fragouli@epfl.ch, emina@lucent.com Abstract The min-cut, max-flow theorem states that a source

More information

Main Reference. Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taking a Random Walk through Tree Space Genetics 2005

Main Reference. Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taking a Random Walk through Tree Space Genetics 2005 Stochastic Models for Horizontal Gene Transfer Dajiang Liu Department of Statistics Main Reference Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taing a Random Wal through Tree Space

More information

Outline. Guaranteed Visibility. Accordion Drawing. Guaranteed Visibility Challenges. Guaranteed Visibility Challenges

Outline. Guaranteed Visibility. Accordion Drawing. Guaranteed Visibility Challenges. Guaranteed Visibility Challenges Outline Scalable Visual Comparison of Biological Trees and Sequences Tamara Munzner University of British Columbia Department of Computer Science ccordion Drawing information visualization technique TreeJuxtaposer

More information

What is a phylogenetic tree? Algorithms for Computational Biology. Phylogenetics Summary. Di erent types of phylogenetic trees

What is a phylogenetic tree? Algorithms for Computational Biology. Phylogenetics Summary. Di erent types of phylogenetic trees What is a phylogenetic tree? Algorithms for Computational Biology Zsuzsanna Lipták speciation events Masters in Molecular and Medical Biotechnology a.a. 25/6, fall term Phylogenetics Summary wolf cat lion

More information

Computational Genomics and Molecular Biology, Fall

Computational Genomics and Molecular Biology, Fall Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the

More information

Distance-based Phylogenetic Methods Near a Polytomy

Distance-based Phylogenetic Methods Near a Polytomy Distance-based Phylogenetic Methods Near a Polytomy Ruth Davidson and Seth Sullivant NCSU UIUC May 21, 2014 2 Phylogenetic trees model the common evolutionary history of a group of species Leaves = extant

More information

Approximation Algorithms for Constrained Generalized Tree Alignment Problem

Approximation Algorithms for Constrained Generalized Tree Alignment Problem DIMACS Technical Report 2007-21 December 2007 Approximation Algorithms for Constrained Generalized Tree Alignment Problem by Srikrishnan Divakaran Dept. of Computer Science Hofstra University Hempstead,

More information

CSE 549: Computational Biology

CSE 549: Computational Biology CSE 549: Computational Biology Phylogenomics 1 slides marked with * by Carl Kingsford Tree of Life 2 * H5N1 Influenza Strains Salzberg, Kingsford, et al., 2007 3 * H5N1 Influenza Strains The 2007 outbreak

More information

Parsimony-Based Approaches to Inferring Phylogenetic Trees

Parsimony-Based Approaches to Inferring Phylogenetic Trees Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 www.biostat.wisc.edu/bmi576.html Mark Craven craven@biostat.wisc.edu Fall 0 Phylogenetic tree approaches! three general types! distance:

More information

arxiv: v1 [cs.dm] 21 Dec 2015

arxiv: v1 [cs.dm] 21 Dec 2015 The Maximum Cardinality Cut Problem is Polynomial in Proper Interval Graphs Arman Boyacı 1, Tinaz Ekim 1, and Mordechai Shalom 1 Department of Industrial Engineering, Boğaziçi University, Istanbul, Turkey

More information

K 4 C 5. Figure 4.5: Some well known family of graphs

K 4 C 5. Figure 4.5: Some well known family of graphs 08 CHAPTER. TOPICS IN CLASSICAL GRAPH THEORY K, K K K, K K, K K, K C C C C 6 6 P P P P P. Graph Operations Figure.: Some well known family of graphs A graph Y = (V,E ) is said to be a subgraph of a graph

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Leena Salmena and Veli Mäkinen, which are partly from http: //bix.ucsd.edu/bioalgorithms/slides.php. 582670 Algorithms for Bioinformatics Lecture 6: Distance based clustering and

More information

One of the central problems in computational biology is the problem of reconstructing evolutionary. Research Articles

One of the central problems in computational biology is the problem of reconstructing evolutionary. Research Articles JOURNAL OF COMPUTATIONAL BIOLOGY Volume 17, Number 6, 2010 # Mary Ann Liebert, Inc. Pp. 767 781 DOI: 10.1089/cmb.2009.0249 Research Articles The Imperfect Ancestral Recombination Graph Reconstruction Problem:

More information

Crossing bridges. Crossing bridges Great Ideas in Theoretical Computer Science. Lecture 12: Graphs I: The Basics. Königsberg (Prussia)

Crossing bridges. Crossing bridges Great Ideas in Theoretical Computer Science. Lecture 12: Graphs I: The Basics. Königsberg (Prussia) 15-251 Great Ideas in Theoretical Computer Science Lecture 12: Graphs I: The Basics February 22nd, 2018 Crossing bridges Königsberg (Prussia) Now Kaliningrad (Russia) Is there a way to walk through the

More information

PROTEIN MULTIPLE ALIGNMENT MOTIVATION: BACKGROUND: Marina Sirota

PROTEIN MULTIPLE ALIGNMENT MOTIVATION: BACKGROUND: Marina Sirota Marina Sirota MOTIVATION: PROTEIN MULTIPLE ALIGNMENT To study evolution on the genetic level across a wide range of organisms, biologists need accurate tools for multiple sequence alignment of protein

More information

Markovian Models of Genetic Inheritance

Markovian Models of Genetic Inheritance Markovian Models of Genetic Inheritance Elchanan Mossel, U.C. Berkeley mossel@stat.berkeley.edu, http://www.cs.berkeley.edu/~mossel/ 6/18/12 1 General plan Define a number of Markovian Inheritance Models

More information

Maximum Parsimony on Phylogenetic networks

Maximum Parsimony on Phylogenetic networks RESEARCH Open Access Maximum Parsimony on Phylogenetic networks Lavanya Kannan * and Ward C Wheeler Abstract Background: Phylogenetic networks are generalizations of phylogenetic trees, that are used to

More information

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1

Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) January 11, 2018 Lecture 2 - Graph Theory Fundamentals - Reachability and Exploration 1 In this lecture

More information

Missing Data Estimation in Microarrays Using Multi-Organism Approach

Missing Data Estimation in Microarrays Using Multi-Organism Approach Missing Data Estimation in Microarrays Using Multi-Organism Approach Marcel Nassar and Hady Zeineddine Progress Report: Data Mining Course Project, Spring 2008 Prof. Inderjit S. Dhillon April 02, 2008

More information

Algorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic

Algorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic Send Orders for Reprints to reprints@benthamscience.ae 8 The Open Cybernetics & Systemics Journal, 05, 9, 8-3 Open Access Algorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic Trees

More information

Introduction to Computational Phylogenetics

Introduction to Computational Phylogenetics Introduction to Computational Phylogenetics Tandy Warnow The University of Texas at Austin No Institute Given This textbook is a draft, and should not be distributed. Much of what is in this textbook appeared

More information

Lecture 7. Transform-and-Conquer

Lecture 7. Transform-and-Conquer Lecture 7 Transform-and-Conquer 6-1 Transform and Conquer This group of techniques solves a problem by a transformation to a simpler/more convenient instance of the same problem (instance simplification)

More information

A New Approach For Tree Alignment Based on Local Re-Optimization

A New Approach For Tree Alignment Based on Local Re-Optimization A New Approach For Tree Alignment Based on Local Re-Optimization Feng Yue and Jijun Tang Department of Computer Science and Engineering University of South Carolina Columbia, SC 29063, USA yuef, jtang

More information

Isometric gene tree reconciliation revisited

Isometric gene tree reconciliation revisited DOI 0.86/s05-07-008-x Algorithms for Molecular Biology RESEARCH Open Access Isometric gene tree reconciliation revisited Broňa Brejová *, Askar Gafurov, Dana Pardubská, Michal Sabo and Tomáš Vinař Abstract

More information

arxiv: v1 [cs.ds] 12 May 2013

arxiv: v1 [cs.ds] 12 May 2013 Full Square Rhomboids and Their Algebraic Expressions Mark Korenblit arxiv:105.66v1 [cs.ds] 1 May 01 Holon Institute of Technology, Israel korenblit@hit.ac.il Abstract. The paper investigates relationship

More information

Graph Algorithms Using Depth First Search

Graph Algorithms Using Depth First Search Graph Algorithms Using Depth First Search Analysis of Algorithms Week 8, Lecture 1 Prepared by John Reif, Ph.D. Distinguished Professor of Computer Science Duke University Graph Algorithms Using Depth

More information

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES)

A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Chapter 1 A SIMPLE APPROXIMATION ALGORITHM FOR NONOVERLAPPING LOCAL ALIGNMENTS (WEIGHTED INDEPENDENT SETS OF AXIS PARALLEL RECTANGLES) Piotr Berman Department of Computer Science & Engineering Pennsylvania

More information

Chapter 3 Trees. Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path.

Chapter 3 Trees. Theorem A graph T is a tree if, and only if, every two distinct vertices of T are joined by a unique path. Chapter 3 Trees Section 3. Fundamental Properties of Trees Suppose your city is planning to construct a rapid rail system. They want to construct the most economical system possible that will meet the

More information

INFERENCE OF PARSIMONIOUS SPECIES TREES FROM MULTI-LOCUS DATA BY MINIMIZING DEEP COALESCENCES CUONG THAN AND LUAY NAKHLEH

INFERENCE OF PARSIMONIOUS SPECIES TREES FROM MULTI-LOCUS DATA BY MINIMIZING DEEP COALESCENCES CUONG THAN AND LUAY NAKHLEH INFERENCE OF PARSIMONIOUS SPECIES TREES FROM MULTI-LOCUS DATA BY MINIMIZING DEEP COALESCENCES CUONG THAN AND LUAY NAKHLEH Abstract. One approach for inferring a species tree from a given multi-locus data

More information

Simulation of Molecular Evolution with Bioinformatics Analysis

Simulation of Molecular Evolution with Bioinformatics Analysis Simulation of Molecular Evolution with Bioinformatics Analysis Barbara N. Beck, Rochester Community and Technical College, Rochester, MN Project created by: Barbara N. Beck, Ph.D., Rochester Community

More information

On the EXISTENCE of SPECIAL DEPTH FIRST SEARCH TREES *

On the EXISTENCE of SPECIAL DEPTH FIRST SEARCH TREES * On the EXISTENCE of SPECIAL DEPTH FIRST SEARCH TREES * Ephraim Korach ** and Zvi Ostfeld *** Computer Science Department Technion - Israel Institute of Technology Haifa 32000, Israel ABSTRACT The Depth

More information

Sequence alignment algorithms

Sequence alignment algorithms Sequence alignment algorithms Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 23 rd 27 After this lecture, you can decide when to use local and global sequence alignments

More information

Hamilton paths & circuits. Gray codes. Hamilton Circuits. Planar Graphs. Hamilton circuits. 10 Nov 2015

Hamilton paths & circuits. Gray codes. Hamilton Circuits. Planar Graphs. Hamilton circuits. 10 Nov 2015 Hamilton paths & circuits Def. A path in a multigraph is a Hamilton path if it visits each vertex exactly once. Def. A circuit that is a Hamilton path is called a Hamilton circuit. Hamilton circuits Constructing

More information

Lecture 2 : Counting X-trees

Lecture 2 : Counting X-trees Lecture 2 : Counting X-trees MATH285K - Spring 2010 Lecturer: Sebastien Roch References: [SS03, Chapter 1,2] 1 Trees We begin by recalling basic definitions and properties regarding finite trees. DEF 2.1

More information