Lecture: Bioinformatics
|
|
- Rudolph Henderson
- 5 years ago
- Views:
Transcription
1 Lecture: Bioinformatics ENS Sacley, 2018 Some slides graciously provided by Daniel Huson & Celine Scornavacca
2 Phylogenetic Trees - Motivation 2 / 31
3 2 / 31 Phylogenetic Trees - Motivation Motivation - study relation between species - evolution of characteristics - co-evolution (host-parasite) - geological migration - genetic development of viruses/diseases
4 2 / 31 Phylogenetic Trees - Motivation Motivation - study relation between species - evolution of characteristics - co-evolution (host-parasite) - geological migration - genetic development of viruses/diseases Evolution genetic material changes over time new species branch off tree of life
5 Phylogenetic Trees - Motivation 2 / 31
6 Phylogenetic Trees - Motivation 2 / 31
7 3 / 31 Rooted Phylogenetic Trees evolution of species over time, leaves extant, hypothetical ancestors possibly branch lengths (time) Notation taxon, cluster, triplet
8 3 / 31 Rooted Phylogenetic Trees evolution of species over time, leaves extant, hypothetical ancestors possibly branch lengths (time) Notation taxon, cluster, triplet
9 3 / 31 Rooted Phylogenetic Trees evolution of species over time, leaves extant, hypothetical ancestors possibly branch lengths (time) Notation taxon, cluster, triplet
10 3 / 31 Rooted Phylogenetic Trees evolution of species over time, leaves extant, hypothetical ancestors possibly branch lengths (time) Notation taxon, cluster, triplet
11 Rooted Phylogenetic Trees evolution of species over time, leaves extant, hypothetical ancestors possibly branch lengths (time) Exercise Time Notation Polytomies taxon, cluster, triplet history not clear soft known fan out hard Exercise: use xy z LCA(xz)=LCA(yz)>LCA(xy) to prove ab c + bc d ac d 3 / 31
12 4 / 31 Unrooted Phylogenetic Trees similarity between genomes, leaves extant, internal vertices have no meaning possibly branch lengths (amount of change) Notation taxon, split, quartet
13 4 / 31 Unrooted Phylogenetic Trees similarity between genomes, leaves extant, internal vertices have no meaning possibly branch lengths (amount of change) Notation taxon, split, quartet
14 4 / 31 Unrooted Phylogenetic Trees similarity between genomes, leaves extant, internal vertices have no meaning possibly branch lengths (amount of change) Notation taxon, split, quartet
15 4 / 31 Unrooted Phylogenetic Trees similarity between genomes, leaves extant, internal vertices have no meaning possibly branch lengths (amount of change) Notation taxon, split, quartet
16 4 / 31 Unrooted Phylogenetic Trees similarity between genomes, leaves extant, internal vertices have no meaning possibly branch lengths (amount of change) Notation taxon, split, quartet
17 5 / 31 Reconstructing Phylogenetic Trees Group Species By... - morphology - behavior - geography Diptera = two wings
18 5 / 31 Reconstructing Phylogenetic Trees Group Species By... - morphology - behavior - geography - distance of sequences - genetic distance - etc. Diptera = two wings
19 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone
20 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone Tetrapoda has 4 legs
21 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone Tetrapoda has 4 legs Mammalia breast feeding
22 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone Tetrapoda has 4 legs Salientia can leap Mammalia breast feeding
23 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone Tetrapoda has 4 legs Salientia can leap Mammalia breast feeding Elephantidae elephants
24 6 / 31 Reconstructing Phylogenetic Trees Vertebrata has backbone Tetrapoda has 4 legs Salientia can leap Mammalia breast feeding Elephantidae elephants Loxodonta africana African Elephant Elephas maximus Asian Elephant
25 Reconstructing Phylogenetic Trees Tetrapoda has 4 legs Vertebrata has backbone Keep in Mind Automatic reconstruction should be - fast (deal with tons of species & genes) - consistent (optimal data evolution correctly reflected) - non-arbitrary Salientia can leap Mammalia breast feeding Elephantidae elephants Loxodonta africana African Elephant Elephas maximus Asian Elephant 6 / 31
26 Distance-Based Reconstruction Idea: cluster hierarchically 7 / 31
27 7 / 31 Distance-Based Reconstruction Idea: cluster hierarchically
28 7 / 31 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters
29 7 / 31 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters
30 7 / 31 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters
31 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters branch lengths?? assume molecular clock ultrametric 7 / 31
32 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric / 31
33 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric / 31
34 Distance-Based Reconstruction Idea: cluster hierarchically Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric / 31
35 Distance-Based Reconstruction Idea: cluster hierarchically 6 28/3 28/3 Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric / 31
36 Distance-Based Reconstruction Idea: cluster hierarchically 6 28/3 28/3 Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
37 Distance-Based Reconstruction Idea: cluster hierarchically 28/3 Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
38 Distance-Based Reconstruction Idea: cluster hierarchically 28/3 5/3 Idea: merge closest clusters 5/3 update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
39 Distance-Based Reconstruction Idea: cluster hierarchically Unweighted Pair Group Method w/ Avg. - find closest pair - join them - update distances & recurse 28/3 5/3 Idea: merge closest clusters 5/3 update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
40 Distance-Based Reconstruction Idea: cluster hierarchically Unweighted Pair Group Method w/ Avg. - find closest pair - join them - update distances & recurse /3 Idea: merge closest clusters 5/3 update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
41 Distance-Based Reconstruction Idea: cluster hierarchically Unweighted Pair Group Method w/ Avg. - find closest pair - join them - update distances & recurse Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y Exercise Time branch lengths?? assume molecular clock ultrametric 7 / 31
42 Distance-Based Reconstruction Idea: cluster hierarchically Unweighted Pair Group Method w/ Avg. - find closest pair - join them - update distances & recurse Idea: merge closest clusters update matrix d X Y,Z = X d X,Z + Y d Y,Z X + Y branch lengths?? assume molecular clock ultrametric 7 / 31
43 8 / 31 Distance-Based Reconstruction What about unrooted trees?
44 8 / 31 Distance-Based Reconstruction B 9 C 11 8 D A B C What about unrooted trees?
45 8 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest B 9 C 11 8 D A B C
46 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest B 9 C 11 8 D A B C B A D C
47 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse
48 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse
49 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse Theorem Q X,Y max any tree T yielding Q has cherry (X, Y )
50 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n 2
51 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n 2 2 3
52 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n 2 2 3
53 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n 2 2 3
54 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n
55 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n
56 9 / 31 Distance-Based Reconstruction Problem: correct pairs may not be closest Neighbor Joining (unrooted) - build new matrix: Q X,Y = Z (d X,Z + d Y,Z d X,Y ) + 2d X,Y - find max in Q - join them - update distances & recurse update distances d X Y,Z = 1 /2 (d X,Z + d Y,Z d X,Y ) branch lengths 2b(X ) = Z / {X,Y } (d X,Z d Y,Z +d X,Y ) n
57 Parsimony Reconstructing 10 / 31
58 10 / 31 Parsimony Reconstructing fur AUS pouch land X X X X X X X X X
59 10 / 31 Parsimony Reconstructing fur AUS pouch land X X X X X X X X X
60 10 / 31 Parsimony Reconstructing fur AUS pouch land X X X X X X X 0001 X X
61 Parsimony Reconstructing fur AUS pouch land X X X X X X X 0001 X X sum distance of endpoints of each edge 10 / 31
62 Parsimony Reconstructing fur AUS pouch land X X X X X X X 0001 X X sum distance of endpoints of each edge cost 6 (Hamming) 10 / 31
63 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost ATTACAAAC GCC TTTACAAAC CC
64 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost ATTACAAAC GCC TTTACAAAC CC O(nm) time [Fitch 71]
65 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost ATTACAAAC GCC TTTACAAAC CC Exercise Time O(nm) time [Fitch 71]
66 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG ATTACAAAC GCC TTTACAAAC CC Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost O(nm) time Large Parsimony Input: character state matrix M [Fitch 71] Task: find tree T & assign characters to internal nodes minimizing total cost
67 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG ATTACAAAC GCC TTTACAAAC CC Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost O(nm) time Large Parsimony Input: character state matrix M [Fitch 71] Task: find tree T & assign characters to internal nodes minimizing total cost NP-hard
68 10 / 31 Parsimony Reconstructing ACTACGGACCGCTGCG A_TACAAAC GCG ATTACGGACCGCTGCG ATTACAAAC GCC TTTACAAAC CC Small Parsimony Input: character state matrix M, rooted tree T Task: assign characters to internal nodes minimizing total cost O(nm) time Large Parsimony Input: character state matrix M [Fitch 71] Task: find tree T & assign characters to internal nodes minimizing total cost NP-hard Note: alignment is crucial!
69 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes
70 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution
71 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution Jukes & Cantor Model - each base evolves individually - each base occurs with equal frequency in the genome - constant rate µ of mutation - each base is equally likely to be result of mutation
72 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution Jukes & Cantor Model - each base evolves individually - each base occurs with equal frequency in the genome - constant rate µ of mutation - each base is equally likely to be result of mutation Generalized Time Reversible Model - each base evolves individually - each base X has a frequency π X to occur in the genome - each base-substitution has its own rate of occurance
73 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution Jukes & Cantor Model - each base evolves individually - each base occurs with equal frequency in the genome - constant rate µ of mutation - each base is equally likely to be result of mutation Generalized Time Reversible Model - each base evolves individually - each base X has a frequency π X to occur in the genome - each base-substitution has its own rate of occurance compute likelihood, given tree & parameters O(mn) time
74 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution Jukes & Cantor Model - each base evolves individually - each base occurs with equal frequency in the genome - constant rate µ of mutation - each base is equally likely to be result of mutation Generalized Time Reversible Model - each base evolves individually - each base X has a frequency π X to occur in the genome - each base-substitution has its own rate of occurance compute likelihood, given tree & parameters O(mn) time find best tree & parameters NP-hard
75 11 / 31 Maximum Likelihood Reconstruction Idea: find a tree (with branch lengths) under which evolution is most likely to have produced the observed characters/genomes need model of evolution Jukes & Cantor Model - each base evolves individually - each base occurs with equal frequency in the genome - constant rate µ of mutation - each base is equally likely to be result of mutation Generalized Time Reversible Model - each base evolves individually - each base X has a frequency π X to occur in the genome - each base-substitution has its own rate of occurance compute likelihood, given tree & parameters O(mn) time find best tree & parameters NP-hard local search in the tree space
76 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another
77 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another
78 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another
79 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u
80 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u
81 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u
82 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u
83 12 / 31 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u
84 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u Tree Bisection & Reconnection break any edge & insert a new reconnecting edge between any 2 edges 12 / 31
85 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u Tree Bisection & Reconnection break any edge & insert a new reconnecting edge between any 2 edges 12 / 31
86 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u Tree Bisection & Reconnection break any edge & insert a new reconnecting edge between any 2 edges 12 / 31
87 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u Tree Bisection & Reconnection break any edge & insert a new reconnecting edge between any 2 edges 12 / 31
88 ML Reconstruction - Tree Spaces Observe: a tree and a rearrangement operation span a space Nearest Neighbor Interchange change any configuration of 4 (3) neighboring subtrees into another Subtree Prune & Regraft break any edge uv & connect v to any edge of the component of u Tree Bisection & Reconnection break any edge & insert a new reconnecting edge between any 2 edges 12 / 31
89 13 / 31 ML Reconstruction - Tree Spaces Exercise: turn into (any) caterpillar: Exercise Time Exercise: how are the distances related?
90 14 / 31 Checking Robustness Bootstrap Method suppose: method X yields tree T from n m character-state matrix M repeat k times the following experiment: 1. draw m columns from M (with repetition) 2. use X to compute T i Finally, for each branch of T, check how often it occurs in the T i bootstrap value measures robustness ( support ) of each branch
91 Reconstruction by Gene Trees 15 / 31
92 15 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
93 15 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
94 15 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
95 15 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
96 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) a c b f e a d b c
97 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f
98 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f
99 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f a, b, c, d e f
100 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f a, b, c, d e f
101 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f a, b, c, d e f
102 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f a, c, d b e f
103 16 / 31 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f a, c, d b e f
104 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) b c a c b f e a d b c d a e f b c e f a d 16 / 31
105 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) a c b f e a d b c Note: always works if trees are compatible/consistent b c e f a d 16 / 31
106 Supertrees - Build Algorithm Idea: find root partition & recurse (as long as there are 3 taxa) a c b f e a d b c Note: always works if trees are compatible/consistent incompatible? c b e f - largest compatible subset NP-hard (even for triplets) - voting schemes (each tree votes for their clades) - reinterpret clades as characters, combine into matrix & reconstruct a d 16 / 31
107 17 / 31 Consensi of Non-Agreeing Trees strict consensus consensus subtree majority consensus
108 18 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
109 18 / 31 Reconstruction by Gene Trees A Common Method For Reconstructing Trees 1. get genomes of multiple species 2. extract genes using START & STOP codons 3. cluster genes in families of similar genes 4. within each family, infer a gene tree using dissimilarities 5. build a consensus among the gene trees species tree (Note: species tree may differ significantly from individual gene trees) 6. reconcile all gene trees with the species tree to learn the evolution of those genes
110 The History of a Gene Family Recall gene = functional element of DNA, clustered into gene-families each family yields a tree depicting its history gene tree consensus of the gene trees yields species tree But: what really happened??? Mouse Bat Dog Rat Mouse Rat Bat Dog 19 / 31
111 The History of a Gene Family Recall gene = functional element of DNA, clustered into gene-families each family yields a tree depicting its history gene tree consensus of the gene trees yields species tree But: what really happened??? Mouse Bat Dog Rat Mouse Rat Bat Dog 19 / 31
112 20 / 31 Reconciliation A B C a a b b c Embedding Rules gene tree G, species tree S - mapping ρ : V (G) V (S) - l is leaf in G ρ(l) corresponds to l (a A, a A, etc.) - u V (G) is called duplication if ρ(u) = ρ(c) for any child c of u in G - all non-leaves of G that not duplications are called speciations - each edge uv of G incurs a loss-cost equal to the number of edges in the ρ(u)-ρ(v)-path in S minus 1 if v is a speciation or 0 if v is a duplication
113 Reconciliation A B C a a b b c DL-model = speciation = duplication = loss a a b b c 20 / 31
114 Reconciliation Goal: embed gene tree into species tree (extant genes must map to their species) Max. Likelihood find most probable embedding (computationally expensive) Parsimony find embedding minimizing #events (possibly weighted) DL-model = speciation = duplication = loss a a b b c 20 / 31
115 Reconciliation Parsimonious Reconciliation Input: species tree S, gene tree G, δ, λ N Task: embed G in S, minimizing the weighted sum of events Result: LCA-assignment solves this optimally in O( S + G ) DL-model = speciation (0) = duplication (δ) = loss (λ) a a b b c 20 / 31
116 20 / 31 Reconciliation Parsimonious Reconciliation Input: species tree S, gene tree G, δ, λ, τ N Task: embed G in S, minimizing the weighted sum of events Result: LCA-assignment solves this optimally in O( S + G ) Exercise Time A B C D a b c d a b
117 Reconciliation Parsimonious Reconciliation Input: species tree S, gene tree G, δ, λ, τ N Task: embed G in S, minimizing the weighted sum of events Result: LCA-assignment solves this optimally in O( S + G ) DTL-model = speciation (0) = duplication (δ) = loss (λ) = transfer (τ) a a b b c c 20 / 31
118 Reconciliation Parsimonious Reconciliation Input: species tree S, gene tree G, δ, λ, τ N Task: embed G in S, minimizing the weighted sum of events Result: LCA-assignment solves this optimally in O( S + G ) T events only between co-existing species time constraints NP-hard DTL-model = speciation (0) = duplication (δ) = loss (λ) = transfer (τ) a a b b c c 20 / 31
119 Reconciliation Parsimonious Reconciliation Input: species tree S, gene tree G, δ, λ, τ N Task: embed G in S, minimizing the weighted sum of events Result: LCA-assignment solves this optimally in O( S + G ) T events only between co-existing species time constraints NP-hard Idea: dated species tree O( S 2 G ) [Doyon et al. 10] DTL-model = speciation (0) = duplication (δ) = loss (λ) = transfer (τ) a a b b c c 20 / 31
120 21 / 31 Comparing Phylogenetic Trees Distance Measures - Nearest Neighbor Interchange - Subtree Prune & Regraft - Tree Bisection & Reconnection
121 21 / 31 Comparing Phylogenetic Trees Distance Measures - Nearest Neighbor Interchange - Subtree Prune & Regraft - Tree Bisection & Reconnection - Robinson-Foulds distance - quartet/triplet distance
122 22 / 31 Agreement Forests Definition A forest F is called agreement forest of trees T 1 and T 2 if F can be obtained from T 1 and T 2 by removing edges. D C B D A E C E F F B A
123 22 / 31 Agreement Forests Definition A forest F is called agreement forest of trees T 1 and T 2 if F can be obtained from T 1 and T 2 by removing edges. D C B D A E C E F F B A
124 22 / 31 Agreement Forests Definition A forest F is called agreement forest of trees T 1 and T 2 if F can be obtained from T 1 and T 2 by removing edges. F C E A F E D C B A D B
125 22 / 31 Agreement Forests Definition A forest F is called agreement forest of trees T 1 and T 2 if F can be obtained from T 1 and T 2 by removing edges. F C E A F E D C B A D B
126 22 / 31 Agreement Forests Definition A forest F is called agreement forest of trees T 1 and T 2 if F can be obtained from T 1 and T 2 by removing edges. Theorem [Allen & Steel, 01] TBR-distance(T 1, T 2 ) = #trees in smallest agreement forest - 1 NP-hard to compute Theorem [Bordewich & Semple, 04] rspr-distance(t 1, T 2) = #trees in smallest rooted agreement forest - 1 NP-hard to compute
127 Robinson-Foulds Distance Definition RF(T 1,T 2 ) = #splits/clusters occuring in exactly one of T 1 and T 2 = edge-contraction distance a common tree Note: observe relation to NNI: RF(T 1,T 2 ) 2 NNI(T 1,T 2 ) E F F C D B A trivial splits, {A,B} {C,D,E,F}, {E,F} {A,B,E,F}, {C,D} {A,B,E,F} A B trivial splits, {A,B} {C,D,E,F} {E,F} {A,B,C,D} {A,B,D} {C,E,F} 23 / 31 D C E
128 Robinson-Foulds Distance Definition RF(T 1,T 2 ) = #splits/clusters occuring in exactly one of T 1 and T 2 = edge-contraction distance a common tree Note: observe relation to NNI: RF(T 1,T 2 ) 2 NNI(T 1,T 2 ) E F F C D B A trivial splits, {A,B} {C,D,E,F}, {E,F} {A,B,E,F}, {C,D} {A,B,E,F} A B trivial splits, {A,B} {C,D,E,F} {E,F} {A,B,C,D} {A,B,D} {C,E,F} 23 / 31 D C E
129 Robinson-Foulds Distance Definition RF(T 1,T 2 ) = #splits/clusters occuring in exactly one of T 1 and T 2 = edge-contraction distance a common tree Note: observe relation to NNI: RF(T 1,T 2 ) 2 NNI(T 1,T 2 ) (C) Leonardo de Oliveira Martins 23 / 31
130 23 / 31 Robinson-Foulds Distance Definition RF(T 1,T 2 ) = #splits/clusters occuring in exactly one of T 1 and T 2 = edge-contraction distance a common tree Note: observe relation to NNI: RF(T 1,T 2 ) 2 NNI(T 1,T 2 ) Note: splits correspond to clusters when rooted at last leaf Day s Algorithm (common clusters in O(n)) [Day 85] 1. relabel all leaves s.t. leaves continuous in T 1 2. each vertex in T 1 knows: L: smallest leaf in cluster R: largest leaf in cluster T 1 s clusters in table [L, R] (O(n) sparse set/perfect hashing) 3. each vertex in T 2 knows L & R & size N of its cluster 4. each vertex in T 2 only checks for [L, R] if R L = N 1 (lookup in T 1 s table in O(1) (average) time)
131 23 / 31 Robinson-Foulds Distance C D B A C D E A B F Day s Algorithm (common clusters in O(n)) [Day 85] F E 1. relabel all leaves s.t. leaves continuous in T 1 2. each vertex in T 1 knows: L: smallest leaf in cluster R: largest leaf in cluster T 1 s clusters in table [L, R] (O(n) sparse set/perfect hashing) 3. each vertex in T 2 knows L & R & size N of its cluster 4. each vertex in T 2 only checks for [L, R] if R L = N 1 (lookup in T 1 s table in O(1) (average) time)
132 23 / 31 Robinson-Foulds Distance C D B A C D E A B F Day s Algorithm (common clusters in O(n)) [Day 85] F E 1. relabel all leaves s.t. leaves continuous in T 1 2. each vertex in T 1 knows: L: smallest leaf in cluster R: largest leaf in cluster T 1 s clusters in table [L, R] (O(n) sparse set/perfect hashing) 3. each vertex in T 2 knows L & R & size N of its cluster 4. each vertex in T 2 only checks for [L, R] if R L = N 1 (lookup in T 1 s table in O(1) (average) time)
133 23 / 31 Robinson-Foulds Distance C D A,E,5 C,D B A,D,4 A,B A,D A,D,3 A C A,B,2 A,E D E A B F Day s Algorithm (common clusters in O(n)) [Day 85] F E 1. relabel all leaves s.t. leaves continuous in T 1 2. each vertex in T 1 knows: L: smallest leaf in cluster R: largest leaf in cluster T 1 s clusters in table [L, R] (O(n) sparse set/perfect hashing) 3. each vertex in T 2 knows L & R & size N of its cluster 4. each vertex in T 2 only checks for [L, R] if R L = N 1 (lookup in T 1 s table in O(1) (average) time)
134 23 / 31 Robinson-Foulds Distance C D A,E,5 C,D B A,D,4 A,B A,D A,D,3 A C A,B,2 A,E D E A B F Day s Algorithm (common clusters in O(n)) [Day 85] F E 1. relabel all leaves s.t. leaves continuous in T 1 2. each vertex in T 1 knows: L: smallest leaf in cluster R: largest leaf in cluster T 1 s clusters in table [L, R] (O(n) sparse set/perfect hashing) 3. each vertex in T 2 knows L & R & size N of its cluster 4. each vertex in T 2 only checks for [L, R] if R L = N 1 (lookup in T 1 s table in O(1) (average) time)
135 24 / 31 Quartet/Triplet Distance Definition Q/T(T 1,T 2 ) = #quartets/triplets occur. in exactly one of T 1 and T 2
136 24 / 31 Quartet/Triplet Distance Definition Q/T(T 1,T 2 ) = #quartets/triplets occur. in exactly one of T 1 and T 2 computing Q-distance (binary trees) [Bryant et al. 00] 1. each edge uv has 4 sets (2 clusters for each of u & v) 2. quartet AB CD belongs to edge e if e splits AB CD & e touches AB-path or CD-path each split is owned exactly once 3. uv T 1 & ab T 2: intersect 4 sets of uv with split of ab in T 2 4. sizes of intersections can be precomputed bottom-up in O(n 2 ) time
137 24 / 31 Quartet/Triplet Distance Definition Q/T(T 1,T 2 ) = #quartets/triplets occur. in exactly one of T 1 and T 2 computing Q-distance (binary trees) [Bryant et al. 00] 1. each edge uv has 4 sets (2 clusters for each of u & v) 2. quartet AB CD belongs to edge e if e splits AB CD & e touches AB-path or CD-path each split is owned exactly once 3. uv T 1 & ab T 2: intersect 4 sets of uv with split of ab in T 2 4. sizes of intersections can be precomputed bottom-up in O(n 2 ) time State of the Art count conflict quartets/triplets O(n log n) time [Brodal et al. 13] enumerate conflict quartets O(n 2 + d) [Bryant et al. 00] enumerate conflict triplets O(n + d) [Weller 17]
138 25 / 31 Phylogenetic Networks Observation Trees cannot capture hybridization
139 25 / 31 Phylogenetic Networks Observation Trees cannot capture hybridization phylogenetic network
140 25 / 31 Phylogenetic Networks Observation Trees cannot capture hybridization phylogenetic network Definition evolutionary network N = rooted DAG, leaves labeled (taxa) reticulations R = vertices of in-degree 2 binary = all inner vertices degree 3 block = component without cut-vertex display T = subdivision of T is a subgraph
141 25 / 31 Phylogenetic Networks Observation Trees cannot capture hybridization phylogenetic network Definition evolutionary network N = rooted DAG, leaves labeled (taxa) reticulations R = vertices of in-degree 2 binary = all inner vertices degree 3 block = component without cut-vertex display T = subdivision of T is a subgraph
142 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
143 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
144 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
145 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
146 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
147 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
148 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
149 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
150 Split Networks split = bipartition of set of taxa splits A B & X Y incompatible if both A & B intersect both X & Y Convex Hull Algorithm [Holland et al., 04] fur AUS pouch land X X X XX X XX vs vs vs vs c.f.: Neighbor Net [Bryant & Moulton, 03] (for circular splits) 26 / 31
151 27 / 31 Consensus Split Networks Strategy 1. list all splits of all input trees 2. extend splits to full taxa using Z-closure 3. build consensus
152 27 / 31 Consensus Split Networks Strategy 1. list all splits of all input trees 2. extend splits to full taxa using Z-closure 3. build consensus Experimental Study 106 gene trees (yeast) [Rokas et al. 03, Holland et al. 04]
153 27 / 31 Consensus Split Networks Strategy 1. list all splits of all input trees 2. extend splits to full taxa using Z-closure 3. build consensus Experimental Study 106 gene trees (yeast) [Rokas et al. 03, Holland et al. 04]
154 27 / 31 Consensus Split Networks Strategy 1. list all splits of all input trees 2. extend splits to full taxa using Z-closure 3. build consensus Experimental Study 106 gene trees (yeast) [Rokas et al. 03, Holland et al. 04]
155 27 / 31 Consensus Split Networks Strategy 1. list all splits of all input trees 2. extend splits to full taxa using Z-closure 3. build consensus Experimental Study 106 gene trees (yeast) [Rokas et al. 03, Holland et al. 04]
156 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters
157 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Example {a,b,c,d},{c,d,e,f,g,h}, {c,d,e,f,g}, {e,f,g,h}, {c,d,e}, {e,f,g}, {a,b}, {c,d}, {f,g} Exercise Time
158 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Example {a,b,c,d},{c,d,e,f,g,h}, {c,d,e,f,g}, {e,f,g,h}, {c,d,e}, {e,f,g}, {a,b}, {c,d}, {f,g} a,b,c,d c,d,e,f,g c,d,e,f,g,h e,f,g,h c,d,e e,f,g a,b c,d f,g a b c d e f g h
159 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Example {a,b,c,d},{c,d,e,f,g,h}, {c,d,e,f,g}, {e,f,g,h}, {c,d,e}, {e,f,g}, {a,b}, {c,d}, {f,g} a b c d e f g h
160 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Example {a,b,c,d},{c,d,e,f,g,h}, {c,d,e,f,g}, {e,f,g,h}, {c,d,e}, {e,f,g}, {a,b}, {c,d}, {f,g} a b c d e c.f. cluster popping [Huson & Rupp, 08] f g h 28 / 31
161 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Problem may produce more reticulations than necessary to explain the data
162 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Problem may produce more reticulations than necessary to explain the data Hybridization Number Input: set of trees T, int k Question: Is there a network with k reticulations displaying all trees in T?
163 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Problem may produce more reticulations than necessary to explain the data Hybridization Number Input: set of trees T, int k Question: Is there a network with k reticulations displaying all trees in T? NP-hard for 2 trees [Bordewich & Semple, 07]
164 28 / 31 Rooted Network Reconstruction Observation rooted network: cluster of u cluster of v u v rooted network is hasse diagram of its clusters Problem may produce more reticulations than necessary to explain the data Hybridization Number Input: set of trees T, int k Question: Is there a network with k reticulations displaying all trees in T? NP-hard for 2 trees [Bordewich & Semple, 07] Note: HN(T 1,T 2 ) = max. acyclic agreement forest - 1 [Baroni et al., 05]
165 29 / 31 Networks Display Trees Observation A network may display up to 2 R trees.
166 29 / 31 Networks Display Trees Observation A network may display up to 2 R trees.
167 29 / 31 Networks Display Trees Observation A network may display up to 2 R trees. But: how to decide if a given tree is displayed?
168 29 / 31 Networks Display Trees Tree Containment Input: a network N, a tree T Question: Does N display T?
169 29 / 31 Networks Display Trees Tree Containment Input: a network N, a tree T Question: Does N display T? NP-hard (from Disjoint Paths) [Kanj et al. 08]
170 29 / 31 Networks Display Trees Tree Containment Input: a network N, a tree T Question: Does N display T? NP-hard (from Disjoint Paths) [Kanj et al. 08] Note: linear time on reticulation visible N [Gunawan, 18][Weller, 18]
171 Networks Display Trees Tree Containment Input: a network N, a tree T Question: Does N display T? NP-hard (from Disjoint Paths) [Kanj et al. 08] Note: linear time on reticulation visible N [Gunawan, 18][Weller, 18] Can you see it? A F G B C E D D E C F A G B 29 / 31
172 Small Taxonomy of Network Classes c.f. Who is Who in Phylogenetic Networks ( 30 / 31
173 Thanks & Enjoy Part III 31 / 31
Olivier Gascuel Arbres formels et Arbre de la Vie Conférence ENS Cachan, septembre Arbres formels et Arbre de la Vie.
Arbres formels et Arbre de la Vie Olivier Gascuel Centre National de la Recherche Scientifique LIRMM, Montpellier, France www.lirmm.fr/gascuel 10 permanent researchers 2 technical staff 3 postdocs, 10
More informationIntroduction to Trees
Introduction to Trees Tandy Warnow December 28, 2016 Introduction to Trees Tandy Warnow Clades of a rooted tree Every node v in a leaf-labelled rooted tree defines a subset of the leafset that is below
More informationCSE 549: Computational Biology
CSE 549: Computational Biology Phylogenomics 1 slides marked with * by Carl Kingsford Tree of Life 2 * H5N1 Influenza Strains Salzberg, Kingsford, et al., 2007 3 * H5N1 Influenza Strains The 2007 outbreak
More informationEvolutionary tree reconstruction (Chapter 10)
Evolutionary tree reconstruction (Chapter 10) Early Evolutionary Studies Anatomical features were the dominant criteria used to derive evolutionary relationships between species since Darwin till early
More informationDIMACS Tutorial on Phylogenetic Trees and Rapidly Evolving Pathogens. Katherine St. John City University of New York 1
DIMACS Tutorial on Phylogenetic Trees and Rapidly Evolving Pathogens Katherine St. John City University of New York 1 Thanks to the DIMACS Staff Linda Casals Walter Morris Nicole Clark Katherine St. John
More informationIntroduction to Computational Phylogenetics
Introduction to Computational Phylogenetics Tandy Warnow The University of Texas at Austin No Institute Given This textbook is a draft, and should not be distributed. Much of what is in this textbook appeared
More informationPhylogenetics. Introduction to Bioinformatics Dortmund, Lectures: Sven Rahmann. Exercises: Udo Feldkamp, Michael Wurst
Phylogenetics Introduction to Bioinformatics Dortmund, 16.-20.07.2007 Lectures: Sven Rahmann Exercises: Udo Feldkamp, Michael Wurst 1 Phylogenetics phylum = tree phylogenetics: reconstruction of evolutionary
More informationWhat is a phylogenetic tree? Algorithms for Computational Biology. Phylogenetics Summary. Di erent types of phylogenetic trees
What is a phylogenetic tree? Algorithms for Computational Biology Zsuzsanna Lipták speciation events Masters in Molecular and Medical Biotechnology a.a. 25/6, fall term Phylogenetics Summary wolf cat lion
More informationDynamic Programming for Phylogenetic Estimation
1 / 45 Dynamic Programming for Phylogenetic Estimation CS598AGB Pranjal Vachaspati University of Illinois at Urbana-Champaign 2 / 45 Coalescent-based Species Tree Estimation Find evolutionary tree for
More informationCodon models. In reality we use codon model Amino acid substitution rates meet nucleotide models Codon(nucleotide triplet)
Phylogeny Codon models Last lecture: poor man s way of calculating dn/ds (Ka/Ks) Tabulate synonymous/non- synonymous substitutions Normalize by the possibilities Transform to genetic distance K JC or K
More informationRecent Research Results. Evolutionary Trees Distance Methods
Recent Research Results Evolutionary Trees Distance Methods Indo-European Languages After Tandy Warnow What is the purpose? Understand evolutionary history (relationship between species). Uderstand how
More informationCS 581. Tandy Warnow
CS 581 Tandy Warnow This week Maximum parsimony: solving it on small datasets Maximum Likelihood optimization problem Felsenstein s pruning algorithm Bayesian MCMC methods Research opportunities Maximum
More informationEvolution of Tandemly Repeated Sequences
University of Canterbury Department of Mathematics and Statistics Evolution of Tandemly Repeated Sequences A thesis submitted in partial fulfilment of the requirements of the Degree for Master of Science
More informationThe SNPR neighbourhood of tree-child networks
Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 22, no. 2, pp. 29 55 (2018) DOI: 10.7155/jgaa.00472 The SNPR neighbourhood of tree-child networks Jonathan Klawitter Department of Computer
More informationMolecular Evolution & Phylogenetics Complexity of the search space, distance matrix methods, maximum parsimony
Molecular Evolution & Phylogenetics Complexity of the search space, distance matrix methods, maximum parsimony Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 Learning Objectives understand
More informationImproved parameterized complexity of the Maximum Agreement Subtree and Maximum Compatible Tree problems LIRMM, Tech.Rep. num 04026
Improved parameterized complexity of the Maximum Agreement Subtree and Maximum Compatible Tree problems LIRMM, Tech.Rep. num 04026 Vincent Berry, François Nicolas Équipe Méthodes et Algorithmes pour la
More informationof the Balanced Minimum Evolution Polytope Ruriko Yoshida
Optimality of the Neighbor Joining Algorithm and Faces of the Balanced Minimum Evolution Polytope Ruriko Yoshida Figure 19.1 Genomes 3 ( Garland Science 2007) Origins of Species Tree (or web) of life eukarya
More informationScaling species tree estimation methods to large datasets using NJMerge
Scaling species tree estimation methods to large datasets using NJMerge Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana Champaign 2018 Phylogenomics Software
More informationPhylogenetics on CUDA (Parallel) Architectures Bradly Alicea
Descent w/modification Descent w/modification Descent w/modification Descent w/modification CPU Descent w/modification Descent w/modification Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea
More informationAlgorithms for Bioinformatics
Adapted from slides by Leena Salmena and Veli Mäkinen, which are partly from http: //bix.ucsd.edu/bioalgorithms/slides.php. 582670 Algorithms for Bioinformatics Lecture 6: Distance based clustering and
More informationTrinets encode tree-child and level-2 phylogenetic networks
Noname manuscript No. (will be inserted by the editor) Trinets encode tree-child and level-2 phylogenetic networks Leo van Iersel Vincent Moulton the date of receipt and acceptance should be inserted later
More informationLecture 20: Clustering and Evolution
Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 11/11/2014 Comp 555 Bioalgorithms (Fall 2014) 1 Clique Graphs A clique is a graph where every vertex is connected via an edge to every other
More informationLecture 20: Clustering and Evolution
Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 11/12/2013 Comp 465 Fall 2013 1 Clique Graphs A clique is a graph where every vertex is connected via an edge to every other vertex A clique
More informationarxiv: v2 [q-bio.pe] 8 Aug 2016
Combinatorial Scoring of Phylogenetic Networks Nikita Alexeev and Max A. Alekseyev The George Washington University, Washington, D.C., U.S.A. arxiv:160.0841v [q-bio.pe] 8 Aug 016 Abstract. Construction
More informationStudy of a Simple Pruning Strategy with Days Algorithm
Study of a Simple Pruning Strategy with ays Algorithm Thomas G. Kristensen Abstract We wish to calculate all pairwise Robinson Foulds distances in a set of trees. Traditional algorithms for doing this
More information4/4/16 Comp 555 Spring
4/4/16 Comp 555 Spring 2016 1 A clique is a graph where every vertex is connected via an edge to every other vertex A clique graph is a graph where each connected component is a clique The concept of clustering
More informationDistances Between Phylogenetic Trees: A Survey
TSINGHUA SCIENCE AND TECHNOLOGY ISSNll1007-0214ll07/11llpp490-499 Volume 18, Number 5, October 2013 Distances Between Phylogenetic Trees: A Survey Feng Shi, Qilong Feng, Jianer Chen, Lusheng Wang, and
More informationEVOLUTIONARY DISTANCES INFERRING PHYLOGENIES
EVOLUTIONARY DISTANCES INFERRING PHYLOGENIES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 28 th November 2007 OUTLINE 1 INFERRING
More informationIsometric gene tree reconciliation revisited
DOI 0.86/s05-07-008-x Algorithms for Molecular Biology RESEARCH Open Access Isometric gene tree reconciliation revisited Broňa Brejová *, Askar Gafurov, Dana Pardubská, Michal Sabo and Tomáš Vinař Abstract
More informationSeeing the wood for the trees: Analysing multiple alternative phylogenies
Seeing the wood for the trees: Analysing multiple alternative phylogenies Tom M. W. Nye, Newcastle University tom.nye@ncl.ac.uk Isaac Newton Institute, 17 December 2007 Multiple alternative phylogenies
More informationAnswer Set Programming or Hypercleaning: Where does the Magic Lie in Solving Maximum Quartet Consistency?
Answer Set Programming or Hypercleaning: Where does the Magic Lie in Solving Maximum Quartet Consistency? Fathiyeh Faghih and Daniel G. Brown David R. Cheriton School of Computer Science, University of
More information11/17/2009 Comp 590/Comp Fall
Lecture 20: Clustering and Evolution Study Chapter 10.4 10.8 Problem Set #5 will be available tonight 11/17/2009 Comp 590/Comp 790-90 Fall 2009 1 Clique Graphs A clique is a graph with every vertex connected
More informationReconstructing Reticulate Evolution in Species Theory and Practice
Reconstructing Reticulate Evolution in Species Theory and Practice Luay Nakhleh Department of Computer Science Rice University Houston, Texas 77005 nakhleh@cs.rice.edu Tandy Warnow Department of Computer
More informationTerminology. A phylogeny is the evolutionary history of an organism
Phylogeny Terminology A phylogeny is the evolutionary history of an organism A taxon (plural: taxa) is a group of (one or more) organisms, which a taxonomist adjudges to be a unit. A definition? from Wikipedia
More informationPhylogenetic Trees Lecture 12. Section 7.4, in Durbin et al., 6.5 in Setubal et al. Shlomo Moran, Ilan Gronau
Phylogenetic Trees Lecture 12 Section 7.4, in Durbin et al., 6.5 in Setubal et al. Shlomo Moran, Ilan Gronau. Maximum Parsimony. Last week we presented Fitch algorithm for (unweighted) Maximum Parsimony:
More informationML phylogenetic inference and GARLI. Derrick Zwickl. University of Arizona (and University of Kansas) Workshop on Molecular Evolution 2015
ML phylogenetic inference and GARLI Derrick Zwickl University of Arizona (and University of Kansas) Workshop on Molecular Evolution 2015 Outline Heuristics and tree searches ML phylogeny inference and
More informationPhylogenetic Trees and Their Analysis
City University of New York (CUNY) CUNY Academic Works Dissertations, Theses, and Capstone Projects Graduate Center 2-2014 Phylogenetic Trees and Their Analysis Eric Ford Graduate Center, City University
More informationReconciliation Problems for Duplication, Loss and Horizontal Gene Transfer Pawel Górecki. Presented by Connor Magill November 20, 2008
Reconciliation Problems for Duplication, Loss and Horizontal Gene Transfer Pawel Górecki Presented by Connor Magill November 20, 2008 Introduction Problem: Relationships between species cannot always be
More informationComputing the Quartet Distance Between Trees of Arbitrary Degrees
January 22, 2006 University of Aarhus Department of Computer Science Computing the Quartet Distance Between Trees of Arbitrary Degrees Chris Christiansen & Martin Randers Thesis supervisor: Christian Nørgaard
More informationSection 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected
Section 3.1: Nonseparable Graphs Cut vertex of a connected graph G: A vertex x G such that G x is not connected. Theorem 3.1, p. 57: Every connected graph G with at least 2 vertices contains at least 2
More informationSPR-BASED TREE RECONCILIATION: NON-BINARY TREES AND MULTIPLE SOLUTIONS
1 SPR-BASED TREE RECONCILIATION: NON-BINARY TREES AND MULTIPLE SOLUTIONS C. THAN and L. NAKHLEH Department of Computer Science Rice University 6100 Main Street, MS 132 Houston, TX 77005, USA Email: {cvthan,nakhleh}@cs.rice.edu
More informationAlignment of Trees and Directed Acyclic Graphs
Alignment of Trees and Directed Acyclic Graphs Gabriel Valiente Algorithms, Bioinformatics, Complexity and Formal Methods Research Group Technical University of Catalonia Computational Biology and Bioinformatics
More informationMain Reference. Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taking a Random Walk through Tree Space Genetics 2005
Stochastic Models for Horizontal Gene Transfer Dajiang Liu Department of Statistics Main Reference Marc A. Suchard: Stochastic Models for Horizontal Gene Transfer: Taing a Random Wal through Tree Space
More informationApplied Mathematics Letters. Graph triangulations and the compatibility of unrooted phylogenetic trees
Applied Mathematics Letters 24 (2011) 719 723 Contents lists available at ScienceDirect Applied Mathematics Letters journal homepage: www.elsevier.com/locate/aml Graph triangulations and the compatibility
More informationABOUT THE LARGEST SUBTREE COMMON TO SEVERAL PHYLOGENETIC TREES Alain Guénoche 1, Henri Garreta 2 and Laurent Tichit 3
The XIII International Conference Applied Stochastic Models and Data Analysis (ASMDA-2009) June 30-July 3, 2009, Vilnius, LITHUANIA ISBN 978-9955-28-463-5 L. Sakalauskas, C. Skiadas and E. K. Zavadskas
More informationPhylogenetic incongruence through the lens of Monadic Second Order logic
Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 20, no. 2, pp. 189 215 (2016) DOI: 10.7155/jgaa.00390 Phylogenetic incongruence through the lens of Monadic Second Order logic Steven
More informationDistance based tree reconstruction. Hierarchical clustering (UPGMA) Neighbor-Joining (NJ)
Distance based tree reconstruction Hierarchical clustering (UPGMA) Neighbor-Joining (NJ) All organisms have evolved from a common ancestor. Infer the evolutionary tree (tree topology and edge lengths)
More informationNotes 4 : Approximating Maximum Parsimony
Notes 4 : Approximating Maximum Parsimony MATH 833 - Fall 2012 Lecturer: Sebastien Roch References: [SS03, Chapters 2, 5], [DPV06, Chapters 5, 9] 1 Coping with NP-completeness Local search heuristics.
More informationSEEING THE TREES AND THEIR BRANCHES IN THE NETWORK IS HARD
1 SEEING THE TREES AND THEIR BRANCHES IN THE NETWORK IS HARD I A KANJ School of Computer Science, Telecommunications, and Information Systems, DePaul University, Chicago, IL 60604-2301, USA E-mail: ikanj@csdepauledu
More informationhuman chimp mouse rat
Michael rudno These notes are based on earlier notes by Tomas abak Phylogenetic Trees Phylogenetic Trees demonstrate the amoun of evolution, and order of divergence for several genomes. Phylogenetic trees
More informationPhylogenetic networks that display a tree twice
Bulletin of Mathematical Biology manuscript No. (will be inserted by the editor) Phylogenetic networks that display a tree twice Paul Cordue Simone Linz Charles Semple Received: date / Accepted: date Abstract
More informationCISC 636 Computational Biology & Bioinformatics (Fall 2016) Phylogenetic Trees (I)
CISC 636 Computational iology & ioinformatics (Fall 2016) Phylogenetic Trees (I) Maximum Parsimony CISC636, F16, Lec13, Liao 1 Evolution Mutation, selection, Only the Fittest Survive. Speciation. t one
More informationParsimony-Based Approaches to Inferring Phylogenetic Trees
Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 www.biostat.wisc.edu/bmi576.html Mark Craven craven@biostat.wisc.edu Fall 0 Phylogenetic tree approaches! three general types! distance:
More informationOn the Optimality of the Neighbor Joining Algorithm
On the Optimality of the Neighbor Joining Algorithm Ruriko Yoshida Dept. of Statistics University of Kentucky Joint work with K. Eickmeyer, P. Huggins, and L. Pachter www.ms.uky.edu/ ruriko Louisville
More informationDistance-based Phylogenetic Methods Near a Polytomy
Distance-based Phylogenetic Methods Near a Polytomy Ruth Davidson and Seth Sullivant NCSU UIUC May 21, 2014 2 Phylogenetic trees model the common evolutionary history of a group of species Leaves = extant
More informationComparison of commonly used methods for combining multiple phylogenetic data sets
Comparison of commonly used methods for combining multiple phylogenetic data sets Anne Kupczok, Heiko A. Schmidt and Arndt von Haeseler Center for Integrative Bioinformatics Vienna Max F. Perutz Laboratories
More informationParsimony Least squares Minimum evolution Balanced minimum evolution Maximum likelihood (later in the course)
Tree Searching We ve discussed how we rank trees Parsimony Least squares Minimum evolution alanced minimum evolution Maximum likelihood (later in the course) So we have ways of deciding what a good tree
More informationThe worst case complexity of Maximum Parsimony
he worst case complexity of Maximum Parsimony mir armel Noa Musa-Lempel Dekel sur Michal Ziv-Ukelson Ben-urion University June 2, 20 / 2 What s a phylogeny Phylogenies: raph-like structures whose topology
More informationMOLECULAR phylogenetic methods reconstruct evolutionary
Calculating the Unrooted Subtree Prune-and-Regraft Distance Chris Whidden and Frederick A. Matsen IV arxiv:.09v [cs.ds] Nov 0 Abstract The subtree prune-and-regraft (SPR) distance metric is a fundamental
More informationSequence length requirements. Tandy Warnow Department of Computer Science The University of Texas at Austin
Sequence length requirements Tandy Warnow Department of Computer Science The University of Texas at Austin Part 1: Absolute Fast Convergence DNA Sequence Evolution AAGGCCT AAGACTT TGGACTT -3 mil yrs -2
More informationFixed parameter algorithms for compatible and agreement supertree problems
Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2013 Fixed parameter algorithms for compatible and agreement supertree problems Sudheer Reddy Vakati Iowa State
More informationEfficient Non-binary Gene Tree Resolution with Weighted Reconciliation Cost
Efficient Non-binary Gene Tree Resolution with Weighted Reconciliation Cost DIRO Université de Montréal Manuel Lafond, Emmanuel Noutahi Nadia El-Mabrouk Introduction Gene Tree : representation of the evolutionary
More informationApproximating Subtree Distances Between Phylogenies. MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3 RUCHI MAHINDRU, 2,4 and NINA AMENTA 5 ABSTRACT
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 13, Number 8, 2006 Mary Ann Liebert, Inc. Pp. 1419 1434 Approximating Subtree Distances Between Phylogenies AU1 AU2 MARIA LUISA BONET, 1 KATHERINE ST. JOHN, 2,3
More informationAlgorithms for constructing more accurate and inclusive phylogenetic trees
Graduate Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 2013 Algorithms for constructing more accurate and inclusive phylogenetic trees Ruchi Chaudhary Iowa State University
More informationarxiv: v2 [q-bio.pe] 8 Sep 2015
RH: Tree-Based Phylogenetic Networks On Tree Based Phylogenetic Networks arxiv:1509.01663v2 [q-bio.pe] 8 Sep 2015 Louxin Zhang 1 1 Department of Mathematics, National University of Singapore, Singapore
More informationThroughout the chapter, we will assume that the reader is familiar with the basics of phylogenetic trees.
Chapter 7 SUPERTREE ALGORITHMS FOR NESTED TAXA Philip Daniel and Charles Semple Abstract: Keywords: Most supertree algorithms combine collections of rooted phylogenetic trees with overlapping leaf sets
More informationEfficiently Inferring Pairwise Subtree Prune-and-Regraft Adjacencies between Phylogenetic Trees
Efficiently Inferring Pairwise Subtree Prune-and-Regraft Adjacencies between Phylogenetic Trees Downloaded 01/19/18 to 140.107.151.5. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
More informationEvolution Module. 6.1 Phylogenetic Trees. Bob Gardner and Lev Yampolski. Integrated Biology and Discrete Math (IBMS 1300)
Evolution Module 6.1 Phylogenetic Trees Bob Gardner and Lev Yampolski Integrated Biology and Discrete Math (IBMS 1300) Fall 2008 1 INDUCTION Note. The natural numbers N is the familiar set N = {1, 2, 3,...}.
More informationParsimonious Reconciliation of Non-binary Trees. Louxin Zhang National University of Singapore
Parsimonious Reconciliation of Non-binary Trees Louxin Zhang National University of Singapore matzlx@nus.edu.sg Gene Tree vs the (Containing) Species Tree. A species tree S represents the evolutionary
More informationMarkovian Models of Genetic Inheritance
Markovian Models of Genetic Inheritance Elchanan Mossel, U.C. Berkeley mossel@stat.berkeley.edu, http://www.cs.berkeley.edu/~mossel/ 6/18/12 1 General plan Define a number of Markovian Inheritance Models
More informationComputing Maximum Agreement Forests without Cluster Partitioning is Folly
Computing Maximum Agreement Forests without Cluster Partitioning is Folly Zhijiang Li 1 and Norbert Zeh 2 1 Microsoft Canada, Vancouver, BC, Canada zhijiang.li@dal.ca 2 Faculty of Computer Science, Dalhousie
More informationClustering of Proteins
Melroy Saldanha saldanha@stanford.edu CS 273 Project Report Clustering of Proteins Introduction Numerous genome-sequencing projects have led to a huge growth in the size of protein databases. Manual annotation
More informationCSE 373 MAY 10 TH SPANNING TREES AND UNION FIND
CSE 373 MAY 0 TH SPANNING TREES AND UNION FIND COURSE LOGISTICS HW4 due tonight, if you want feedback by the weekend COURSE LOGISTICS HW4 due tonight, if you want feedback by the weekend HW5 out tomorrow
More informationUnique reconstruction of tree-like phylogenetic networks from distances between leaves
Unique reconstruction of tree-like phylogenetic networks from distances between leaves Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA email: swillson@iastate.edu
More informationPrior Distributions on Phylogenetic Trees
Prior Distributions on Phylogenetic Trees Magnus Johansson Masteruppsats i matematisk statistik Master Thesis in Mathematical Statistics Masteruppsats 2011:4 Matematisk statistik Juni 2011 www.math.su.se
More informationThe History Bound and ILP
The History Bound and ILP Julia Matsieva and Dan Gusfield UC Davis March 15, 2017 Bad News for Tree Huggers More Bad News Far more convincingly even than the (also highly convincing) fossil evidence, the
More informationAlgorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic
Send Orders for Reprints to reprints@benthamscience.ae 8 The Open Cybernetics & Systemics Journal, 05, 9, 8-3 Open Access Algorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic Trees
More informationLecture 3: Graphs and flows
Chapter 3 Lecture 3: Graphs and flows Graphs: a useful combinatorial structure. Definitions: graph, directed and undirected graph, edge as ordered pair, path, cycle, connected graph, strongly connected
More informationRecognizing Interval Bigraphs by Forbidden Patterns
Recognizing Interval Bigraphs by Forbidden Patterns Arash Rafiey Simon Fraser University, Vancouver, Canada, and Indiana State University, IN, USA arashr@sfu.ca, arash.rafiey@indstate.edu Abstract Let
More informationINFERENCE OF PARSIMONIOUS SPECIES TREES FROM MULTI-LOCUS DATA BY MINIMIZING DEEP COALESCENCES CUONG THAN AND LUAY NAKHLEH
INFERENCE OF PARSIMONIOUS SPECIES TREES FROM MULTI-LOCUS DATA BY MINIMIZING DEEP COALESCENCES CUONG THAN AND LUAY NAKHLEH Abstract. One approach for inferring a species tree from a given multi-locus data
More information12/5/17. trees. CS 220: Discrete Structures and their Applications. Trees Chapter 11 in zybooks. rooted trees. rooted trees
trees CS 220: Discrete Structures and their Applications A tree is an undirected graph that is connected and has no cycles. Trees Chapter 11 in zybooks rooted trees Rooted trees. Given a tree T, choose
More informationCopyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation
Daniel H. Huson and David Bryant Software Demo, ISMB, Detroit, June 27, 2005 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document under the terms of the
More informationComputing the All-Pairs Quartet Distance on a set of Evolutionary Trees
Journal of Bioinformatics and Computational Biology c Imperial College Press Computing the All-Pairs Quartet Distance on a set of Evolutionary Trees M. Stissing, T. Mailund, C. N. S. Pedersen and G. S.
More informationLeast Common Ancestor Based Method for Efficiently Constructing Rooted Supertrees
Least ommon ncestor ased Method for fficiently onstructing Rooted Supertrees M.. Hai Zahid, nkush Mittal, R.. Joshi epartment of lectronics and omputer ngineering, IIT-Roorkee Roorkee, Uttaranchal, INI
More informationIntroduction to Triangulated Graphs. Tandy Warnow
Introduction to Triangulated Graphs Tandy Warnow Topics for today Triangulated graphs: theorems and algorithms (Chapters 11.3 and 11.9) Examples of triangulated graphs in phylogeny estimation (Chapters
More informationNew Common Ancestor Problems in Trees and Directed Acyclic Graphs
New Common Ancestor Problems in Trees and Directed Acyclic Graphs Johannes Fischer, Daniel H. Huson Universität Tübingen, Center for Bioinformatics (ZBIT), Sand 14, D-72076 Tübingen Abstract We derive
More informationCLUSTERING IN BIOINFORMATICS
CLUSTERING IN BIOINFORMATICS CSE/BIMM/BENG 8 MAY 4, 0 OVERVIEW Define the clustering problem Motivation: gene expression and microarrays Types of clustering Clustering algorithms Other applications of
More informationDesigning parallel algorithms for constructing large phylogenetic trees on Blue Waters
Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters Erin Molloy University of Illinois at Urbana Champaign General Allocation (PI: Tandy Warnow) Exploratory Allocation
More informationOn Low Treewidth Graphs and Supertrees
Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 19, no. 1, pp. 325 343 (2015) DOI: 10.7155/jgaa.00361 On Low Treewidth Graphs and Supertrees Alexander Grigoriev 1 Steven Kelk 2 Nela
More informationLesson 13 Molecular Evolution
Sequence Analysis Spring 2000 Dr. Richard Friedman (212)305-6901 (76901) friedman@cuccfa.ccc.columbia.edu 130BB Lesson 13 Molecular Evolution In this class we learn how to draw molecular evolutionary trees
More informationAnalyzing Evolutionary Trees
Analyzing Evolutionary Trees Katherine St. John Lehman College and the Graduate Center City University of New York stjohn@lehman.cuny.edu Katherine St. John City University of New York 1 Overview Talk
More informationFast Hierarchical Clustering via Dynamic Closest Pairs
Fast Hierarchical Clustering via Dynamic Closest Pairs David Eppstein Dept. Information and Computer Science Univ. of California, Irvine http://www.ics.uci.edu/ eppstein/ 1 My Interest In Clustering What
More informationarxiv: v3 [math.co] 17 Jan 2018
Reconstructing unrooted phylogenetic trees from symbolic ternary metrics arxiv:1702.00190v3 [math.co] 17 Jan 2018 Stefan Grünewald CAS-MPG Partner Institute for Computational Biology Chinese Academy of
More informationFrom Trees to Networks and Back
From Trees to Networks and Back Sarah Bastkowski Supervisor: Prof. Vincent Moulton Co-supervisor: Dr. Geoffrey Mckeown A thesis submitted for the degree of Doctor of Philosophy at the University of East
More informationFast Algorithms for Large-Scale Phylogenetic Reconstruction
Fast Algorithms for Large-Scale Phylogenetic Reconstruction by Jakub Truszkowski A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Doctor of Philosophy
More informationDefinition For vertices u, v V (G), the distance from u to v, denoted d(u, v), in G is the length of a shortest u, v-path. 1
Graph fundamentals Bipartite graph characterization Lemma. If a graph contains an odd closed walk, then it contains an odd cycle. Proof strategy: Consider a shortest closed odd walk W. If W is not a cycle,
More informationFor all questions, E. NOTA means none of the above answers is correct. Diagrams are NOT drawn to scale.
For all questions, means none of the above answers is correct. Diagrams are NOT drawn to scale.. In the diagram, given m = 57, m = (x+ ), m = (4x 5). Find the degree measure of the smallest angle. 5. The
More informationA practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees
A practical O(n log 2 n) time algorithm for computing the triplet distance on binary trees Andreas Sand 1,2, Gerth Stølting Brodal 2,3, Rolf Fagerberg 4, Christian N. S. Pedersen 1,2 and Thomas Mailund
More informationGenetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences Phylogeny methods, part 1 (Parsimony and such)
Genetics/MBT 541 Spring, 2002 Lecture 1 Joe Felsenstein Department of Genome Sciences joe@gs Phylogeny methods, part 1 (Parsimony and such) Methods of reconstructing phylogenies (evolutionary trees) Parsimony
More informationGraph and Digraph Glossary
1 of 15 31.1.2004 14:45 Graph and Digraph Glossary A B C D E F G H I-J K L M N O P-Q R S T U V W-Z Acyclic Graph A graph is acyclic if it contains no cycles. Adjacency Matrix A 0-1 square matrix whose
More information