Speeding up Parsimony Scoring with Streaming SIMD Extensions 2


Jason Evans <jevans@uidaho.edu> and James Foster <foster@uidaho.edu>
Initiative for Bioinformatics and Evolutionary Studies
Department of Computer Science
University of Idaho
Moscow, ID

Abstract

The number of trees that can be evaluated in a given amount of time is critical to the effectiveness of heuristic phylogenetic tree searches. Due to its speed, maximum parsimony is often employed as the optimality criterion when analyzing large datasets. This paper presents a performance enhancement for parsimony scoring that is complementary to the highly refined techniques already discussed in the literature. Many modern microprocessors include single instruction, multiple data (SIMD) extensions. These instructions perform multiple operations in parallel, thus making substantial throughput improvements possible. The Intel IA32 implementation of SIMD is called Streaming SIMD Extensions 2 (SSE2), which is part of all Pentium 4 microprocessors. This paper presents a realistic performance comparison between an optimized C implementation and an SSE2 assembly language implementation of parsimony scoring. Empirical results show a 2.1- to 2.5-fold increase in performance.

Introduction

Systematists strive to determine the evolutionary relationships among sets of taxa. Relationships are determined by obtaining character data such as DNA or amino acid sequences for each taxon, then using the character differences among the taxa to infer phylogenetic relationships. One of the simplest inference approaches is to use maximum parsimony as the optimality criterion that determines the relative fitness of candidate phylogenetic trees. The principle of maximum parsimony rests on the idea that the simplest explanation is preferable. As the number of taxa in a data set increases linearly, the number of possible trees that describe the relationships among those taxa grows factorially.
This makes exhaustive searching of all possible trees intractable for more than perhaps 30 taxa, which forces the use of heuristic searches for many interesting datasets. Effective heuristic searches must 1) carefully choose which candidate trees to consider, and 2) evaluate as many candidate trees as possible in the allotted time. The SSE2 optimizations presented in this paper directly address the second point.

The simplest algorithm for calculating parsimony scores was proposed by (Fitch, 1975). A more sophisticated algorithm was suggested by (Sankoff, 1975), but the Fitch algorithm is faster and more commonly used. The majority of research into optimization techniques for parsimony scoring has focused on the Fitch algorithm, which is also the focus of this paper. Fast Fitch parsimony scoring approaches have been presented in detail (Gladstein, 1997; Goloboff, 1993; Goloboff, 1996; Ronquist, 1998), and are used by some of the fastest publicly available implementations (Goloboff, 1999; Swofford, 2002). Some of the optimization techniques are quite involved; only the aspects that directly impact SSE2 optimization are mentioned in this paper.

Fitch parsimony scoring

The Fitch algorithm for parsimony scoring assumes that all character states are the same evolutionary distance from each other. This means that for DNA character data, no distinction is made between transitions (A↔G, C↔T) and transversions (A/G↔C/T). Since character states are not classified, they can be dealt with as uniform sets. The Fitch algorithm is a dynamic programming algorithm; the partial score of each node in a tree depends only on the states of its children. Therefore, the Fitch algorithm can be implemented as a post-order tree traversal, wherein the following steps are performed at each node:

1) If a leaf node:
   a. Initialize the character state set to contain the characters associated with the taxon. In the case of DNA, ambiguity codes are decomposed into their constituent bases. For example, V translates to {A, C, G}, and a gap translates to {A, C, G, T}.
2) If an internal node:
   a. Create the intersection of the child nodes' state sets, and associate the result with the node.
   b. If the set created in (a) is empty:
      i. Create the union of the child nodes' state sets, and associate the result with the node.
      ii. Increment the parsimony score.

Figure 1 shows a tree with five taxa. The state sets are shown for each node, and a + denotes nodes at which the parsimony score was incremented. A more realistic example would have multiple characters. Figure 2 shows the partial scoring results for an internal node of a tree with four characters. Each character is calculated independently of the others, but the total score for the tree is the sum of the scores for all characters.

[Tree diagram with node state sets {A}, {A}, {C}, {T}, {C}, {A}, {C,T}+, {A,C}+, {C}]
Figure 1. A 5-taxon tree with Fitch parsimony state sets and scores at nodes. The tree has a total score of 2.
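The per-node steps listed above can be sketched in C by representing each state set as a small bitmask, so that set intersection and union become bitwise AND and OR. This is a minimal illustration rather than the authors' code: the function names `fitch_char` and `fitch_node`, the one-set-per-byte layout, and the bit assignment A=8, C=4, G=2, T=1 are all assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* One bit per base. This particular assignment is an assumption of the sketch. */
enum { BASE_A = 8, BASE_C = 4, BASE_G = 2, BASE_T = 1 };

/* Fitch step for one character of an internal node.
 * Returns the parent's state set and adds 1 to *score when the
 * children's state sets do not intersect. */
uint8_t fitch_char(uint8_t left, uint8_t right, unsigned *score)
{
    uint8_t both = (uint8_t)(left & right);   /* step 2a: intersection */
    if (both == 0) {                          /* step 2b: intersection is empty */
        *score += 1;                          /* step 2b-ii: one more change */
        return (uint8_t)(left | right);       /* step 2b-i: union */
    }
    return both;
}

/* Fitch step for every character of an internal node.
 * Characters are independent, so the node's score is the sum over characters. */
unsigned fitch_node(const uint8_t *left, const uint8_t *right,
                    uint8_t *parent, size_t nchars)
{
    unsigned score = 0;
    for (size_t i = 0; i < nchars; i++)
        parent[i] = fitch_char(left[i], right[i], &score);
    return score;
}
```

Calling `fitch_node` with the two child rows of Figure 2 ({G,T} {A,C} {A,G,T} {T} and {A,C} {G} {A,T} {T}) yields the parent row {A,C,G,T} {A,C,G} {A,T} {T} and a score of 2, matching the figure.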

Child    {G,T}        {A,C}      {A,G,T}   {T}
Child    {A,C}        {G}        {A,T}     {T}
Parent   {A,C,G,T}+   {A,C,G}+   {A,T}     {T}
Figure 2. Character state sets and scores of an internal node and its children, in a tree with 4 characters. The parent node has a score of 2.

Tree bisection and reconnection (TBR) hill climbing

Heuristic tree searches typically employ some form of hill climbing. The tree bisection and reconnection (TBR) transform is most often used to create a network of trees that define the landscape in which hill climbing is conducted. Each step of a hill climb evaluates all the neighbors of the current tree, and holds one or more trees with better scores for later consideration. If all trees with better scores are held, the number of held trees can quickly become unmanageable. For large datasets it is commonly necessary to consider only the best of the neighboring trees, which results in following the steepest paths.

Limiting a search to steepest paths allows the early termination of scoring for trees in a neighborhood that are known to not have the best score. This optimization can provide a substantial speedup, and is a critical optimization for any high performance implementation of parsimony/TBR-based hill climbing. As will be seen later, this optimization confounds the SSE2 optimization, since the termination check must happen in the inner loop of the scoring function.

The TBR transform consists of bisecting a tree at some edge, and reconnecting the subtrees by picking one edge from each subtree and connecting those two edges together (Felsenstein, 2004). Figure 3 shows an example TBR transform. During hill climbing searches, trees are evaluated an entire neighborhood at a time. Therefore, pre-calculation and caching of partial parsimony scoring results have the potential to drastically reduce the total amount of calculation.

[Tree diagrams with leaves A through I, labeled "Bisect" and "Reconnect"]
Figure 3. The tree on the left can be transformed via TBR to the tree on the right.

A pre-calculation approach that was originally described by (Goloboff, 1993) reduces the total amount of calculation by approximately a factor of the number of internal nodes in the tree. This optimization is so large that it fundamentally changes the nature of parsimony/TBR-based neighborhood scoring. The key observation that motivates this optimization is that for each bisection of a tree, the trees in that portion of the TBR neighborhood are composed of the same two subtrees. This means that calculating the parsimony scores for an entire TBR neighborhood can be quickly done by performing the following steps for each edge in the tree:

1) Bisect the tree (create two subtrees).
2) For each subtree, calculate the state sets and score for each possible rooting.
3) Reconnect the two subtrees using every valid combination of edges (one combination would reverse the bisection) and calculate the final score for each resulting tree.

A caching optimization developed by (Gladstein, 1997) is applicable to step (2), although Gladstein originally presented the optimization as being orthogonal to Goloboff's approach. Both optimizations are employed in the SSE2 experiments.

SSE2 optimization

Speeding up Fitch parsimony scoring with SIMD was suggested by (Ronquist, 1998), though only theoretical results for a Motorola PowerPC 604 processor were provided. Ronquist provided alternate algorithms for horizontal and vertical packing of character data. The results in this paper were obtained using horizontal packing, which means that two DNA character state sets are stored in each byte of memory. For example, a byte with a value of 107 (binary 01101011) translates to {C, G}, {A, G, T}. SSE2 provides eight 128-bit registers (xmm0 through xmm7), which are treated in this paper as a vector of sixteen independent bytes.
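Horizontal packing as described above can be illustrated with a few small C helpers. The bit assignment A=8, C=4, G=2, T=1 is inferred from the byte-107 example (upper nibble 0110 = {C, G}, lower nibble 1011 = {A, G, T}); the helper names are hypothetical, and rounding the vector length up to a whole number of 16-byte registers is an assumption about how the vectors are padded.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit assignment inferred from the byte-107 example in the text. */
enum { BASE_A = 8, BASE_C = 4, BASE_G = 2, BASE_T = 1 };

/* Pack two 4-bit state sets into one byte: the first set goes in the
 * upper four bits, the second in the lower four bits. */
uint8_t pack_pair(uint8_t first, uint8_t second)
{
    return (uint8_t)(((first & 0x0F) << 4) | (second & 0x0F));
}

/* Recover the state set stored in the upper four bits of a packed byte. */
uint8_t upper_set(uint8_t byte)
{
    return (uint8_t)(byte >> 4);
}

/* Recover the state set stored in the lower four bits of a packed byte. */
uint8_t lower_set(uint8_t byte)
{
    return (uint8_t)(byte & 0x0F);
}

/* Bytes needed to hold nchars characters at two per byte, rounded up to a
 * multiple of 16 so the vector can be read one 128-bit register at a time.
 * The rounding rule is an assumption of this sketch. */
size_t packed_vector_bytes(size_t nchars)
{
    size_t bytes = (nchars + 1) / 2;
    return (bytes + 15) / 16 * 16;
}
```

With these definitions, `pack_pair(BASE_C | BASE_G, BASE_A | BASE_G | BASE_T)` evaluates to 107, and `packed_vector_bytes(759)` evaluates to 384, which agrees with the 384-byte vectors the paper reports for its 759-character dataset.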
Since each byte contains two character state sets, it is possible to process 32 characters per iteration of the inner scoring loop, as compared to two characters per iteration for a non-vectorized implementation.

One of the biggest challenges to vectorizing code is avoiding data-dependent conditional branches. Since many data elements are being processed in parallel, the program cannot take different branches for each data element. SSE2's pcmpeqb instruction provides an elegant solution by performing a bytewise comparison of two SSE2 registers and storing result bitmasks in one of those registers (Figure 4). A general strategy for dealing with branches is to calculate the results for both code paths, mask out the bytes for which the results of each branch are invalid, and then merge the results. The SSE2 implementation of Fitch parsimony scoring uses this basic approach, but also uses bitmasks to calculate the score.

pcmpeqb %%xmm1, %%xmm2
[Diagram of the bytewise comparison of xmm1 and xmm2 and the resulting bitmask]
Figure 4. The pcmpeqb instruction compares the bytes of two registers and sets a bitmask accordingly. If the bytes are equal, the bitmask is set to all 1 bits, otherwise the bitmask is set to all 0 bits.

Although SSE2 was primarily designed for streaming multimedia applications, all the necessary functionality for vectorization of parsimony scoring is present: unaligned and aligned memory load/store instructions, bitwise logical instructions, bitwise shifting instructions, and math instructions. Each iteration of the inner loop of SSE2-based parsimony scoring performs the following operations:

1) Read sixteen bytes of character data (32 characters) from the character state set vector of each child node.
2) Process the state sets stored in the upper four bits of each byte.
3) Process the state sets stored in the lower four bits of each byte.
4) Sum the total number of changes, and add them to the current parsimony score.
5) For an internal node, store the resulting state set vector in memory. For a root node (final tree scoring), do not bother storing the resulting state set vector, and terminate scoring if the maximum interesting score was exceeded.

Note that there are two alternatives for step (5). The implementation that is used for the experiments contains two separate functions that implement these alternatives, in order to maximize performance.

A short description of the C implementation is needed in order to understand the tradeoffs that the SSE2 implementation must make. The inner loop of the C version of the program is unrolled, so that 32 characters are processed per iteration. This measurably improves performance, by reducing the overhead imposed by the loop conditional.
Unlike the SSE2 version, the C version is able to check whether the maximum score has been exceeded precisely when it actually increments the score, and then terminate scoring immediately after exceeding the threshold. By comparison, the SSE2 version only checks once per loop iteration (every 32 characters), which means that the C version typically reads and processes fewer characters.
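The inner loop and the two early-termination behaviors described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses compiler intrinsics from `<emmintrin.h>` instead of hand-written assembly, it folds the two step (5) alternatives into one function selected by a NULL `parent` pointer (the paper uses two separate functions), and the function names are hypothetical. It also assumes that `nbytes` is a multiple of 16 and that any padding bytes hold the same non-empty set in both children (for example 0xFF), so that padding never counts as a change.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; needs an x86 target with SSE2 enabled */
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in a 16-bit byte mask. */
static unsigned popcount16(unsigned m)
{
    unsigned n = 0;
    while (m) { m &= m - 1; n++; }
    return n;
}

/* Branch-free Fitch step on sixteen state sets, one per byte (values 0..15). */
static __m128i fitch16(__m128i a, __m128i b, unsigned *score)
{
    __m128i both  = _mm_and_si128(a, b);                        /* intersections */
    __m128i empty = _mm_cmpeq_epi8(both, _mm_setzero_si128());  /* pcmpeqb: 0xFF where empty */
    __m128i uni   = _mm_and_si128(_mm_or_si128(a, b), empty);   /* unions, kept only where empty */
    *score += popcount16((unsigned)_mm_movemask_epi8(empty));   /* one change per empty byte */
    return _mm_or_si128(both, uni);                             /* merge the two code paths */
}

/* Scores two packed child vectors, 32 characters per iteration.
 * parent != NULL: internal node, store the result, never stop early.
 * parent == NULL: root node, stop once the score exceeds max_score. */
unsigned fitch_sse2(const uint8_t *left, const uint8_t *right, uint8_t *parent,
                    size_t nbytes, unsigned max_score)
{
    const __m128i low = _mm_set1_epi8(0x0F);
    unsigned score = 0;
    for (size_t i = 0; i < nbytes; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(left + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(right + i));
        /* Upper nibbles: shift within 16-bit lanes, then mask off leaked bits. */
        __m128i hi = fitch16(_mm_and_si128(_mm_srli_epi16(a, 4), low),
                             _mm_and_si128(_mm_srli_epi16(b, 4), low), &score);
        /* Lower nibbles. */
        __m128i lo = fitch16(_mm_and_si128(a, low), _mm_and_si128(b, low), &score);
        if (parent != NULL)
            _mm_storeu_si128((__m128i *)(parent + i),
                             _mm_or_si128(_mm_slli_epi16(hi, 4), lo));
        else if (score > max_score)
            break;                      /* checked once per 32 characters */
    }
    return score;
}

/* Scalar root scoring that checks the threshold at every increment. */
unsigned fitch_scalar_root(const uint8_t *left, const uint8_t *right,
                           size_t nbytes, unsigned max_score)
{
    unsigned score = 0;
    for (size_t i = 0; i < nbytes; i++) {
        uint8_t both = (uint8_t)(left[i] & right[i]);
        if ((both & 0xF0) == 0 && ++score > max_score) return score;
        if ((both & 0x0F) == 0 && ++score > max_score) return score;
    }
    return score;
}
```

As a small illustration of the tradeoff, take two 32-byte vectors in which every byte contributes exactly one change. With a threshold of 3, the scalar version returns after the fourth byte with a score of 4, while the vectorized version finishes its first 16-byte block and returns 16.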

Experiment

The SSE2-optimized implementation and a C-only implementation of Fitch parsimony scoring were used to analyze an aligned dataset consisting of 759 informative characters of rbcL data for 500 taxa (Chase, 1993). The dataset has become a rather standard benchmark for the effectiveness of heuristic search techniques on large datasets (Nixon, 1999; Rice, 1997; Snell, 2000). Real data are preferable to simulated data for this experiment, since early termination behavior is data-dependent. The performance of the two implementations was compared for three different experimental configurations:

1. Starting at the locally optimal tree reported by (Rice, 1997) and published in electronic form (Rice), the parsimony scores for all 9,266,156 trees in the immediate neighborhood were calculated. This was repeated 100 times in a single program run, for a total of 926 million tree evaluations.
2. Starting at the same tree as in configuration (1), the best trees in the immediate neighborhood were found. This was repeated 100 times in a single program run, for a total of 926 million tree evaluations.
3. 100 pseudo-random trees were generated. For each tree, the best neighbors were found 100 times. All of this was done during a single program run, for a total of 74.3 billion tree evaluations. The same 100 trees were used for the C and SSE2 program runs.

The tree in configurations (1) and (2) is a local optimum, which reduces the effectiveness of the early termination optimization. The trees in configuration (3) tend to each have a very diverse neighborhood, which allows early termination to happen more often. The configurations focus on the early termination optimization due to its variable negative impact on the effectiveness of the SSE2 optimization.

The 100 pseudo-random trees in configuration (3) were drawn from approximately 10^1280 possible trees. Therefore, the mean parsimony score for the trees is near the population mean, with very high probability. The main point of this configuration is simply to measure performance for trees with diverse neighborhoods, so there is little benefit to separately measuring the results for each of the neighborhoods. With that in mind, the results for this experimental configuration are summarized by dividing the total number of trees considered by the total time taken, just as in configurations (1) and (2).

All neighborhoods were iterated over repeatedly, in order to reduce the stochastic effects of data caching and to increase the total program runtimes to a point where time measurement error (typically ± 10 ms) did not significantly impact the accuracy of the results.

All experiments were run on a four-processor 2.8 GHz Pentium 4-based computer. The operating system is a Linux variant, with an SMP kernel. No multi-threading was used in the experiments, so the multiple processors had no positive impact on the experiments. The test program was compiled using gcc 3.3.3, with the -O2 optimization flag specified.

Figure 5 summarizes the results for these experiments. The speedup ranges from a factor of 2.1 to 2.5. The results are interpreted to indicate that speedup differences are dependent on two factors: 1) the relative overhead of the early termination check for the C and SSE2 versions, and 2) the effectiveness of the early termination check. The speedup difference between configurations (1) and (2) is attributed to the first factor, and the speedup difference between configurations (2) and (3) is attributed to the second factor.

[Bar chart: millions of trees/sec for the C and SSE2 implementations in configurations 1 (Peak all), 2 (Peak best), and 3 (Random best), each annotated with its speedup]
Figure 5. Millions of tree evaluations per second of C and SSE2 implementations, for three different experimental configurations. The SSE2 implementation is 2.1 to 2.5 times faster than the C implementation for these experiments.

Discussion

A speedup of 2.1-2.5X is a substantial, consistent performance improvement that is certainly worth the programming effort, especially if a program spends more than a few days performing data analysis. One might expect an order of magnitude performance improvement, but in fact, the theoretical maximum improvement is approximately in the 3-5X range. There are two contributing factors to this discrepancy. First, the Pentium 4 processor is three-way superscalar, which allows it to retire up to three instructions per cycle. However, there is only one floating point unit, which means that only one SSE2 instruction can run at a time. Second, the Pentium 4 implements the x86 instruction set, but internally, these instructions are translated to a RISC-like set of micro-ops. In the case of many SSE2 instructions, at least two micro-ops are needed, since the floating point unit only handles 64 bits of data at a time. Therefore, if conditions were ideal for non-SSE2 code, six instructions could be retired in the same amount of time that one SSE2 instruction takes to run. This would reduce the theoretical maximum speedup from 16X to approximately 2.7X, in the worst case.

The experiments presented in this paper are meant to be a reasonably realistic comparison of the expected performance difference of replacing an optimized C implementation of parsimony scoring with an SSE2 implementation. A less realistic experiment would leave out the early termination optimization. Earlier versions of the test program did not implement early termination, and the observed speedup was approximately 2.9X.
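The worst-case bound quoted in the Discussion can be written out as a short calculation. The symbols below are introduced only for this illustration, and reading "two micro-ops" as roughly two cycles per SSE2 instruction is an interpretation of the text rather than something the authors state outright: $w = 16$ is the number of bytes handled by one SSE2 instruction, $r = 3$ is the number of non-SSE2 instructions retired per cycle, and $c = 2$ is the number of cycles one SSE2 instruction occupies.

```latex
S_{\text{worst}} \approx \frac{w}{r \cdot c} = \frac{16}{3 \times 2} = \frac{16}{6} \approx 2.7
```

Under these assumptions the scalar code completes six instructions in the time of one SSE2 instruction, which is how the nominal 16X advantage shrinks to about 2.7X.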

A substantial increase in the number of characters would benefit the SSE2 implementation in benchmarks, especially if early termination were omitted. Microprocessors often spend time waiting for data to be read from memory, but linear memory access performs well due to predictive data prefetch. The character state set vectors for the test dataset are only 384 bytes long, so the startup cost of reading each vector probably has a significant impact on overall throughput. The SSE2 implementation is more vulnerable to this issue because it is faster, and because it reads more data on average.

SSE2 optimizations are only of benefit when using certain Intel and AMD microprocessors. Many researchers (authors included) also use PowerPC-based systems, so future work will likely include the implementation of similar optimizations using the AltiVec instruction set, which is present in Apple G4- and G5-based systems.

Availability

Source code for the program that was used in the experiments is available upon request.

Acknowledgements

This work was partially funded by NIH NCRR 1P20 RR. Experiments were run on the IBEST Beowulf cluster, which is funded in part by NSF EPS, NIH NCRR 1P20 RR16448, and NIH NCRR 1P20 RR. The authors are grateful for Robert Beers's help in understanding the internals of the Pentium 4 processor.

References

Chase, M. W., D.E. Soltis, R.G. Olmstead, D. Morgan, D.H. Les, B.D. Mishler, M.R. Duvall, R.A. Price, H.G. Hills, Y.-L. Qiu, K.A. Kron, J.H. Rettig, E. Conti, J.D. Palmer, J.R. Manhart, K.J. Sytsma, H.J. Michaels, W.J. Kress, K.G. Karol, W.D. Clark, M. H. Hédren, B.S. Gaut, R.K. Jansen, K.-J. Kim, C.F. Wimpee, J.F. Smith, G.R. Furnier, S.H. Strauss, Q.-Y. Xiang, G.M. Plunkett, P.S. Soltis, S.M. Swensen, S.E. Williams, P.A. Gadek, C.J. Quinn, L.E. Equiarte, E. Dolenberg, G.H. Learn, Jr., S.W. Graham, S.C.H. Barrett, S. Dayandan, and V.A. Albert. 1993. Phylogenetics of seed plants: An analysis of nucleotide sequences from the plastid gene rbcL. Ann. Mo. Bot. 80.
Felsenstein, J. 2004. Inferring Phylogenies. Sinauer Associates, Inc., Sunderland, MA.
Fitch, W. M. 1975. Toward finding the tree of maximum parsimony in Proceedings of the Eighth International Conference on Numerical Taxonomy. W.H. Freeman, San Francisco.
Gladstein, D. S. 1997. Efficient Incremental Character Optimization. Cladistics 13.
Goloboff, P. A. 1993. Character Optimization and Calculation of Tree Lengths. Cladistics 9.
Goloboff, P. A. 1996. Methods for Faster Parsimony Analysis. Cladistics 12.
Goloboff, P. A. 1999. NONA (NO NAme) version 2. Published by author.
Nixon, K. C. 1999. The Parsimony Ratchet, a New Method for Rapid Parsimony Analysis. Cladistics 15.
Rice, K. A. Treezilla Data Sets, http://
Rice, K. A., M.J. Donoghue, and R.G. Olmstead. 1997. Analyzing Large Data Sets: rbcL 500 Revisited. Sys. Biol. 46.
Ronquist, F. 1998. Fast Fitch-Parsimony Algorithms for Large Data Sets. Cladistics 14.
Sankoff, D. 1975. Minimal mutation trees of sequences. SIAM Journal of Applied Mathematics 28.
Snell, Q., Whiting, M., Clement, M., and McLaughlin, D. 2000. Parallel Phylogenetic Inference in Proceedings of the 2000 ACM/IEEE Conference on Supercomputing. IEEE Computer Society, Dallas, TX.
Swofford, D. L. 2002. PAUP* v4.0b10: Phylogenetic Analysis Using Parsimony * (and other Methods). Sinauer Associates, Inc.


More information

Motivation for Parallelism. Motivation for Parallelism. ILP Example: Loop Unrolling. Types of Parallelism

Motivation for Parallelism. Motivation for Parallelism. ILP Example: Loop Unrolling. Types of Parallelism Motivation for Parallelism Motivation for Parallelism The speed of an application is determined by more than just processor speed. speed Disk speed Network speed... Multiprocessors typically improve the

More information

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5

More information

Cooperative Rec-I-DCM3: A Population-Based Approach for Reconstructing Phylogenies

Cooperative Rec-I-DCM3: A Population-Based Approach for Reconstructing Phylogenies Cooperative Rec-I-DCM3: A Population-Based Approach for Reconstructing Phylogenies Tiffani L. Williams Department of Computer Science Texas A&M University tlw@cs.tamu.edu Marc L. Smith Department of Computer

More information

Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters

Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters Designing parallel algorithms for constructing large phylogenetic trees on Blue Waters Erin Molloy University of Illinois at Urbana Champaign General Allocation (PI: Tandy Warnow) Exploratory Allocation

More information

On the Efficacy of Haskell for High Performance Computational Biology

On the Efficacy of Haskell for High Performance Computational Biology On the Efficacy of Haskell for High Performance Computational Biology Jacqueline Addesa Academic Advisors: Jeremy Archuleta, Wu chun Feng 1. Problem and Motivation Biologists can leverage the power of

More information

Fundamentals of Computer Design

Fundamentals of Computer Design Fundamentals of Computer Design Computer Architecture J. Daniel García Sánchez (coordinator) David Expósito Singh Francisco Javier García Blas ARCOS Group Computer Science and Engineering Department University

More information
