A Scalable Coprocessor for Bioinformatic Sequence Alignments

Scott F. Smith
Department of Electrical and Computer Engineering
Boise State University
Boise, ID, U.S.A.

Abstract

A hardware coprocessor for the rapid calculation of bioinformatic sequence alignments is presented. The coprocessor uses a globally-asynchronous locally-synchronous (GALS) design style, which makes the coprocessor much easier to scale as CMOS feature sizes decrease. The coprocessor is intended to be implemented on a single integrated circuit along with a simple 32-bit RISC processor and memory system. The specific sequence alignment algorithm implemented is that of Smith and Waterman, but the general design strategy could be extended to other bioinformatic sequence alignment algorithms.

Keywords - coprocessor, bioinformatics, sequence alignment, Smith-Waterman, globally-asynchronous locally-synchronous.

1. Introduction

There is a huge amount of biological sequence data available today, and the volume of data is increasing at an exponential rate. This data includes DNA and protein sequences as well as structural data (x, y, and z positions of atoms in organic polymers). A major challenge for molecular biologists is to make sense of these newly available sequences, which run into the billions of units (DNA base pairs or protein amino acids). The only possible way to handle this large amount of data is with automated computer processing, a field now called bioinformatics.

One of the core activities undertaken by bioinformatics is the alignment of sequences. This is similar to string searching in text databases, but with some aspects that are peculiar to biological databases. The simplest of these alignments involves taking a vector of symbols (a query string) and finding ranges in a vector database that are similar to the query string. More complicated alignments use structural data and attempt to find locations in the database with similar relative locations of atoms. This paper describes a coprocessor to handle the non-structural alignment problem, but the coprocessor may be used as a base for designing structural alignment hardware.

One of the reasons that text search algorithms cannot be directly used for sequence alignment is that there is rarely an exact match between the query string and the database. The query string might be a protein found in humans, and the search is for a protein with a similar function in another organism such as mice. The two proteins will have differences that correspond to mutations, insertions, or deletions. A mutation is a difference in a character at a given position. An insertion or deletion is an addition or removal of one or more characters from the query string or database. Insertions and deletions are symmetric in that an insertion in the query string can be viewed as a deletion in the database and vice versa. A further complication is that certain mutations for proteins (amino acids) have less effect on the protein function than other mutations.

In order to generate a high-quality alignment which finds similar locations in the database without a high false alarm rate, it is necessary that the alignment algorithm mirror the statistics of the underlying biological process. One such algorithm is the Smith-Waterman alignment algorithm [1], which gives high-quality alignments but is very computationally demanding.

Alternatives to dynamic-programming-based alignments such as Smith-Waterman exist that give lower-quality alignments but are less computationally demanding. These include BLAST (Basic Local Alignment Search Tool) [2] and FASTA (Fast Align) [3], which are commonly used on web-based servers such as those maintained by the National Center for Biotechnology Information (NCBI) [4]. Even though these algorithms are less computationally demanding, they still use significant computing resources. In order to get the high quality of the Smith-Waterman algorithm, one typically has to resort to the power of scientific supercomputing. Strategies include both the use of general-purpose supercomputers [5] and special-purpose coprocessors [6][7]. This paper describes a special-purpose coprocessor. The coprocessor presented here differs from existing coprocessors in the literature in that it is intended to be a single-chip implementation and that it is more easily scalable due to its globally-asynchronous locally-synchronous design style [8].

There are at least two companies that manufacture and sell systems with coprocessors to accelerate bioinformatic alignments. One of these uses ASIC (application-specific integrated circuit) technology [9] and the other FPGA (field-programmable gate array) technology [10]. Typically these are used to accelerate BLAST processing, which is already computationally intensive enough to warrant special-purpose hardware. The design of these accelerators is, however, proprietary, and very little detailed information about them is available in the public literature.

The main advantage of the GALS design style is that it avoids the problem of designing a large-area, low-skew, high-frequency global clock tree. The problems associated with global clock tree design are documented in [11]. These problems will only get worse as CMOS feature sizes decrease and include new problems, such as crosstalk and wire inductance, that could be safely ignored until recently.
Interfaces between the local clock domains of GALS systems have been designed by [12], [13], and [14]. The latter design is used for the coprocessor of this paper because it has the advantage of not requiring any control of the local clocks by the interface.

The design of a single computation unit which resides within one of the local clock domains of the GALS coprocessor system is described in Section 2. The combination of these units into a multi-clock-domain GALS coprocessor system is the topic of Section 3. Conclusions are presented in Section 4 along with future work to be undertaken.

2. Computation Unit Organization

The computation unit is designed to implement the equations of the Smith-Waterman alignment algorithm corresponding to a single character of the query sequence. If there are more characters in the query sequence than there are computation units, then the coprocessor will be accessed multiple times by the main processor. Each access will pass the entire database through the array of computation units. The processor stores the intermediate results collected at the end of each pass, which generates an intermediate results file several times as large as the initial database.

The Smith-Waterman equations are:

$$I_{i,j} = D_{i,j} = M_{i,j} = 0 \quad \text{for all } i \text{ and } j \text{ such that } i = 0 \text{ or } j = 0$$

$$I_{i,j} = \max\{I_{i-1,j} - c,\; M_{i-1,j} - g\}$$

$$D_{i,j} = \max\{D_{i,j-1} - c,\; M_{i,j-1} - g\}$$

$$M_{i,j} = \max\{I_{i-1,j-1} + d(a_i, b_j),\; D_{i-1,j-1} + d(a_i, b_j),\; M_{i-1,j-1} + d(a_i, b_j),\; 0\}$$

where the indices i and j refer to the position within the query string and within the database. Since the algorithm is symmetric, it does not matter which index is chosen for the query and which for the database. I is the current score if an insertion is underway and D is the current score if a deletion is underway. The choice of assignment of i and j to query string versus database determines whether these insertions or deletions actually refer to query string insertions and deletions or database insertions and deletions. The current score if the current pair of characters (one from the query string and one from the database) is taken as a match/mutation is M. The penalty for starting a new insertion or deletion is g, and the penalty for continuing an insertion or deletion is c (g is normally chosen larger than c).

The reward for a match is d, and this depends in general on how close the match is. Exactly matching characters get the highest reward value and similar characters get a reduced, but positive, reward. For DNA alignments, exact matches are usually assigned a positive reward with all other combinations given a reward of zero. For protein alignments, amino acids with similar properties (such as both being hydrophobic) are given non-zero, but lower, rewards than exact matches. The d matrix is normally symmetric. There are four possible characters for DNA alignments (A, T, C, and G) and twenty possible characters for protein alignments (C, H, I, M, S, V, A, G, L, P, T, F, R, Y, W, D, N, E, Q, and K) [15]. These characters are normally stored as eight-bit ASCII codes in biological databases.

A block diagram of a single computation unit is shown in Figure 1. The unit is divided into two sections, constants and calculation. These two sections have separate request and acknowledge interfaces and work independently. The constants section is loaded first with c, g, d, and valid values for a particular character of the query string. The valid bit allows a computation unit to be bypassed if it is not needed because the query string is shorter than the total number of computation units. The g and c values are the start and continuation penalties for insertions and deletions and are eight-bit values. The twenty d values, indexed d(0) through d(19), are each three-bit rewards for matching/mutating characters that pass through the computation unit.
The d values are a single column of the d matrix corresponding to the query string character assigned to the computation unit. In the case of DNA alignment, only the first four d values will be used, since the characters corresponding to the other sixteen d values will simply never appear in the database data stream.

Figure 1 Computation unit block diagram.
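The recurrences of this section can be checked in software before any hardware is considered. The following is a minimal Python sketch, not the coprocessor's implementation: the function names, the DNA reward of 2 for a match, and the default penalties g = 3 and c = 1 are illustrative assumptions, and reporting the largest M value as the alignment score is likewise an assumption of the sketch.

```python
def dna_reward(a, b):
    """Hypothetical reward d(a, b): positive for an exact match, zero otherwise."""
    return 2 if a == b else 0


def smith_waterman_score(query, database, d=dna_reward, g=3, c=1):
    """Return the largest M value produced by the I, D, M recurrences.

    g is the penalty for starting an insertion or deletion and c the penalty
    for continuing one. The default values are assumptions for illustration.
    """
    n, m = len(query), len(database)
    # Row 0 and column 0 stay at zero: I = D = M = 0 when i = 0 or j = 0.
    I = [[0] * (m + 1) for _ in range(n + 1)]
    D = [[0] * (m + 1) for _ in range(n + 1)]
    M = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            I[i][j] = max(I[i - 1][j] - c, M[i - 1][j] - g)
            D[i][j] = max(D[i][j - 1] - c, M[i][j - 1] - g)
            r = d(query[i - 1], database[j - 1])
            M[i][j] = max(I[i - 1][j - 1] + r,
                          D[i - 1][j - 1] + r,
                          M[i - 1][j - 1] + r,
                          0)
            best = max(best, M[i][j])
    return best


if __name__ == "__main__":
    # The query occurs exactly once in the database: four matches at 2 each.
    print(smith_waterman_score("ACGT", "TTACGTTT"))
```

In this sketch each value of i plays the role of one computation unit (one query character), and the inner loop over j corresponds to the database characters streaming past that unit.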

The database characters (Char) are passed through the pipeline of computation units using a twenty-bit one-hot code. The one-hot code makes it easier to select the required d value from the constants section. This makes the computation unit faster and saves transistors inside the unit at the expense of additional bits at the interface between units. The current score at the current position in the database is labeled Max. An additional intermediate variable X has been added to the equations. This X variable represents a portion of the calculation that can be done prior to the arrival of the current Char value. The variable X is not passed between computation units since it is only a temporary internal state.

3. Full Coprocessor System

The connection of two computation units together using asynchronous interfaces is shown in Figure 2. Two asynchronous interfaces are needed between each pair of computation units to allow independent passage of constants and data. Each computation unit has its own local clock signal. These clock signals are intended to have the same nominal frequency and are most likely derived from the same clock source. Local clock signals have tightly controlled skew such that the usual synchronous design paradigm can be used within a clock domain. This allows the standard types of digital design tools to be used to design the logic internal to a local clock domain. Even though the clock signals of two different clock domains are nominally the same, minimal effort is employed to maintain low skew between the domains, and the clock domain interfaces make no assumptions about the relative phase of the two local clock signals.

Figure 2 Connection of two computation units.

The internal design and performance of the asynchronous interface is described in detail in [14] and [16]. The interface is built around an asynchronous FIFO. The latency of the interface has been estimated from a SPICE model as 1.09 ns plus a clock-phase differential term which varies between zero and one period of the receiving clock signal. This estimate is based on a 180 nm TSMC [17] CMOS process available through MOSIS [18].

The full system including processor, coprocessor, and array of computation units is shown in Figure 3. The processor and coprocessor are in the same local clock domain (clock 0) and each of the n computation units occupies its own local clock domain (clock 1 through clock n). The coprocessor loads constants eight bits at a time through the chain of Const connections. Internal to each computation unit there is an eight-bit-wide, ten-entry-long synchronous FIFO for constant values. After 10n bytes of constants have been sent to the computation unit array by the coprocessor, the constants are fully loaded. There is no need to pass constants from computation unit n back to the coprocessor, and therefore only one asynchronous interface is needed at that place.

Data are passed 84 bits at a time through the chain of Data connections. This data is composed of D, I, M, Max, and Char. There is never a need to pass Char back to the coprocessor from computation unit n, so the 20 bits of Char are omitted and Data is only 64 bits wide at that point. The need to pass D, I, M, and Max from the coprocessor to computation unit 1 occurs only when the query string does not fit in the array of computation units and multiple passes of the database through the array are needed. If the query string completely fits, or it is the first pass of a multi-pass run, the D, I, M, and Max values are set to zero by the coprocessor.
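The bit widths quoted above can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: the 64 bits of D, I, M, and Max are assumed to be split evenly into four 16-bit fields, the field order within the word and the assignment of one-hot bit positions to characters are hypothetical, and the names `one_hot` and `pack_data` are introduced here for illustration.

```python
# Amino acid letters in the order listed in Section 2. Mapping each letter to
# the bit position equal to its index is an assumption of this sketch.
AMINO = "CHIMSVAGLPTFRYWDNEQK"

SCORE_BITS = 16                        # assumed: 64 bits shared equally by D, I, M, Max
CHAR_BITS = len(AMINO)                 # twenty-bit one-hot code
DATA_BITS = 4 * SCORE_BITS + CHAR_BITS # width of one Data word between units

# Constants per computation unit: g, c (8 bits each), twenty 3-bit d values, valid bit.
CONST_BITS = 8 + 8 + 20 * 3 + 1
FIFO_BITS = 8 * 10                     # eight bits wide, ten entries long


def one_hot(ch, alphabet=AMINO):
    """Return the one-hot code with exactly one bit set for character ch."""
    return 1 << alphabet.index(ch)


def pack_data(d_score, i_score, m_score, max_score, char_code):
    """Pack the four scores and the one-hot Char into a single integer word."""
    word = 0
    for value in (d_score, i_score, m_score, max_score):
        word = (word << SCORE_BITS) | (value & 0xFFFF)
    return (word << CHAR_BITS) | char_code


if __name__ == "__main__":
    print(DATA_BITS)               # width of the Data connection
    print(CONST_BITS, FIFO_BITS)   # constants needed versus FIFO capacity
```

Under these assumptions the Data word comes out at 84 bits, and the 77 bits of constants fit within the 80 bits of the ten-byte FIFO, which is consistent with the 10n bytes of constants loaded for n computation units.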

Figure 3 Full coprocessor system.

One of the functions of the coprocessor is to expand eight-bit ASCII-coded symbols for DNA bases or amino acids into the 20-bit one-hot code used to specify Char in the computation unit array. The processor is responsible for maintaining the d matrix and calculating all of the constants to be loaded via Const. One reason for having the coprocessor do the one-hot expansion is to reduce the bandwidth over the processor-coprocessor interface.

A good possible choice for the processor in this system is an ARM922T CPU core [19][20][21]. This is a small 32-bit RISC processor designed to be used as a core on an ASIC. The ARM922T CPU core is built around an ARM9 processor which has a standard coprocessor interface. One standard coprocessor which uses this interface is the memory management unit (MMU), but additional application-specific coprocessors can be designed to use the interface definition. This coprocessor interface is 32 bits wide. After initialization of the constants, information passing from the processor to the coprocessor is in the form of ASCII characters, so four database units can be sent per transfer. Information passing from the coprocessor to the processor is a series of 16-bit scores, one for each database character passed into the coprocessor. Four database characters therefore need one transfer in and two transfers out, so the 32-bit processor-coprocessor interface handles an average of 4/3 database characters per clock cycle.

The ARM922T is capable of operating at 200 MHz in the 180 nm TSMC CMOS process. The synchronous logic within the coprocessor and computation units has not yet been designed, but it is not unreasonable to expect these to operate at a similar speed (the coprocessor is required to operate on the same clock as the processor). If so, then the processor will be near full processing capacity moving data into and out of the coprocessor during the database access phase of processing.
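The 4/3 figure for the processor-coprocessor interface follows from simple counting, which the short Python sketch below reproduces. It assumes one 32-bit transfer per clock cycle in either direction, which is how the figure above appears to be derived; the constant and function names are hypothetical.

```python
from fractions import Fraction

INTERFACE_BITS = 32   # width of the coprocessor interface
CHAR_BITS = 8         # one ASCII database character sent to the coprocessor
SCORE_BITS = 16       # one score returned per database character
CLOCK_HZ = 200_000_000  # 200 MHz clock rate quoted for the ARM922T


def chars_per_cycle():
    """Average database characters handled per interface transfer cycle."""
    transfers_in = Fraction(CHAR_BITS, INTERFACE_BITS)    # 1/4 transfer per character
    transfers_out = Fraction(SCORE_BITS, INTERFACE_BITS)  # 1/2 transfer per character
    return 1 / (transfers_in + transfers_out)


if __name__ == "__main__":
    rate = chars_per_cycle()
    print(rate)                    # characters per clock cycle as an exact fraction
    print(float(rate * CLOCK_HZ))  # interface capacity in characters per second
```

Since the interface capacity is only one third above one character per clock cycle, moving data leaves the processor little spare capacity, as stated above.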
At 200 million database characters per second, a search of the entire human genome (about 3 billion base pairs) would take about 15 seconds. This assumes that there are enough computation units to hold the entire query string. Performance with longer query strings would be significantly lower, since the processor would need to store and retrieve the intermediate D, I, M, and Max values once for every pass in excess of the first.

4. Conclusion

The main advantage of the GALS approach used in the design of the alignment coprocessor is the ease of scaling to smaller CMOS feature sizes, which allows for an increase in the number of computation units in the coprocessor array. Increasing the number of units allows longer query strings to be processed without using multiple passes. Alternatively, more than one set of processor, coprocessor, and computation unit array can be placed on a single integrated circuit. This would increase throughput rather than increase the effective query string length.

The next step in this work will be the design of the synchronous logic within the coprocessor and computation units. This will yield information on the layout size of the computation unit, which in turn will determine how many units can be placed on a single integrated circuit. The design will also allow simulation to determine whether the coprocessor and computation units can in fact run at near 200 MHz in a 180 nm CMOS process. It is already known from the asynchronous interface design that the interfaces are not a large contributor to layout area and can easily support 200 MHz throughput.

References

[1] T. Smith and M. Waterman, Identification of Common Molecular Sequences, Journal of Molecular Biology, pp ,
[2] S. Altschul, W. Gish, E. Myers, and D. Lipman, Basic Local Alignment Search Tool, Journal of Molecular Biology, pp ,
[3] W. Pearson and D. Lipman, Improved Tools for Biological Sequence Comparison, Proceedings of the National Academy of Science, pp ,
[4] National Center for Biotechnology Information (NCBI),
[5] S. Smith and J. Frenzel, Bioinformatics Application of a Scalable Supercomputer-on-chip Architecture, Proceedings of the International Conference on Parallel and Distributed Processing Techniques, Volume 1, pp ,
[6] L. Grate, M. Diekhans, D. Dahle, and R. Hughey, Sequence Analysis with the Kestrel SIMD Parallel Processor, Proceedings of the Pacific Symposium on Biocomputing, pp ,
[7] P. Guerdoux-Jamet and D. Lavenier, SAMBA: Hardware Accelerator for Biological Sequence Comparison, Computer Applications in Biosciences, pp ,
[8] D. Chapiro, Globally-Asynchronous Locally-Synchronous Systems, Doctoral Thesis, Stanford University,
[9] Paracel, Inc. website,
[10] TimeLogic Corp. website,
[11] D. Bailey, Clock Distribution, in Design of High-Performance Microprocessor Circuits, IEEE Press, pp ,
[12] J. Muttersbach, T. Villiger, H. Kaeslin, N. Felber, and W. Fichtner, Globally-Asynchronous Locally-Synchronous Architectures to Simplify the Design of On-Chip Systems, Proceedings of the th IEEE International ASIC/SOC Conference, pp ,
[13] K. Yun and A. Dooply, Pausible Clocking-Based Heterogeneous Systems, IEEE Transactions on VLSI Systems, pp ,
[14] S. Smith and J. Frenzel, Low-latency Multiple Clock Domain Interfacing Without Alteration of Local Clocks, Proceedings of the 15th Biennial IEEE University / Government / Industry Microelectronics Symposium, pp ,
[15] C. Branden and J. Tooze, Introduction to Protein Structure, 2nd Edition, Garland Publishing,
[16] S. Smith, A Multiple-Clock-Domain Bus Architecture Using Asynchronous FIFOs as Elastic Elements, Doctoral Thesis, University of Idaho,
[17] Taiwan Semiconductor Manufacturing Company website,
[18] MOSIS website,
[19] S. Furber, ARM System-on-Chip Architecture, 2nd Edition, Addison-Wesley,
[20] D. Seal, ARM Architecture Reference Manual, 2nd Edition, Addison-Wesley,
[21] ARM Ltd. website,


More information

Scalable Accelerator Architecture for Local Alignment of DNA Sequences

Scalable Accelerator Architecture for Local Alignment of DNA Sequences Scalable Accelerator Architecture for Local Alignment of DNA Sequences Nuno Sebastião, Nuno Roma, Paulo Flores INESC-ID / IST-TU Lisbon Rua Alves Redol, 9, Lisboa PORTUGAL {Nuno.Sebastiao, Nuno.Roma, Paulo.Flores}

More information

Accelerating Smith Waterman (SW) Algorithm on Altera Cyclone II Field Programmable Gate Array

Accelerating Smith Waterman (SW) Algorithm on Altera Cyclone II Field Programmable Gate Array Accelerating Smith Waterman (SW) Algorithm on Altera yclone II Field Programmable Gate Array NUR DALILAH AHMAD SABRI, NUR FARAH AIN SALIMAN, SYED ABDUL MUALIB AL JUNID, ABDUL KARIMI HALIM Faculty Electrical

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics Find the best alignment between 2 sequences with lengths n and m, respectively Best alignment is very dependent upon the substitution matrix and gap penalties The Global Alignment Problem tries to find

More information

USING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT

USING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT IADIS International Conference Applied Computing 2006 USING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT Divya R. Singh Software Engineer Microsoft Corporation, Redmond, WA 98052, USA Abdullah

More information

.. Fall 2011 CSC 570: Bioinformatics Alexander Dekhtyar..

.. Fall 2011 CSC 570: Bioinformatics Alexander Dekhtyar.. .. Fall 2011 CSC 570: Bioinformatics Alexander Dekhtyar.. PAM and BLOSUM Matrices Prepared by: Jason Banich and Chris Hoover Background As DNA sequences change and evolve, certain amino acids are more

More information

An I/O device driver for bioinformatics tools: the case for BLAST

An I/O device driver for bioinformatics tools: the case for BLAST An I/O device driver for bioinformatics tools 563 An I/O device driver for bioinformatics tools: the case for BLAST Renato Campos Mauro and Sérgio Lifschitz Departamento de Informática PUC-RIO, Pontifícia

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

Vertex Shader Design I

Vertex Shader Design I The following content is extracted from the paper shown in next page. If any wrong citation or reference missing, please contact ldvan@cs.nctu.edu.tw. I will correct the error asap. This course used only

More information

Performance Analysis of Parallelized Bioinformatics Applications

Performance Analysis of Parallelized Bioinformatics Applications Asian Journal of Computer Science and Technology ISSN: 2249-0701 Vol.7 No.2, 2018, pp. 70-74 The Research Publication, www.trp.org.in Dhruv Chander Pant 1 and OP Gupta 2 1 Research Scholar, I. K. Gujral

More information

Darwin: A Genomic Co-processor gives up to 15,000X speedup on long read assembly (To appear in ASPLOS 2018)

Darwin: A Genomic Co-processor gives up to 15,000X speedup on long read assembly (To appear in ASPLOS 2018) Darwin: A Genomic Co-processor gives up to 15,000X speedup on long read assembly (To appear in ASPLOS 2018) Yatish Turakhia EE PhD candidate Stanford University Prof. Bill Dally (Electrical Engineering

More information

Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: X (Volume-7, Issue-6)

Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: X (Volume-7, Issue-6) International Journals of Advanced Research in Computer Science and Software Engineering ISSN: 77-18X (Volume-7, Issue-6) Research Article June 017 DDGARM: Dotlet Driven Global Alignment with Reduced Matrix

More information

FASTA. Besides that, FASTA package provides SSEARCH, an implementation of the optimal Smith- Waterman algorithm.

FASTA. Besides that, FASTA package provides SSEARCH, an implementation of the optimal Smith- Waterman algorithm. FASTA INTRODUCTION Definition (by David J. Lipman and William R. Pearson in 1985) - Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence

More information

A BANDED SMITH-WATERMAN FPGA ACCELERATOR FOR MERCURY BLASTP

A BANDED SMITH-WATERMAN FPGA ACCELERATOR FOR MERCURY BLASTP A BANDED SITH-WATERAN FPGA ACCELERATOR FOR ERCURY BLASTP Brandon Harris*, Arpith C. Jacob*, Joseph. Lancaster*, Jeremy Buhler*, Roger D. Chamberlain* *Dept. of Computer Science and Engineering, Washington

More information

Heuristic methods for pairwise alignment:

Heuristic methods for pairwise alignment: Bi03c_1 Unit 03c: Heuristic methods for pairwise alignment: k-tuple-methods k-tuple-methods for alignment of pairs of sequences Bi03c_2 dynamic programming is too slow for large databases Use heuristic

More information

PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology

PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology Nucleic Acids Research, 2005, Vol. 33, Web Server issue W535 W539 doi:10.1093/nar/gki423 PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology Per Eystein

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

Massively Parallel Computing on Silicon: SIMD Implementations. V.M.. Brea Univ. of Santiago de Compostela Spain

Massively Parallel Computing on Silicon: SIMD Implementations. V.M.. Brea Univ. of Santiago de Compostela Spain Massively Parallel Computing on Silicon: SIMD Implementations V.M.. Brea Univ. of Santiago de Compostela Spain GOAL Give an overview on the state-of of-the- art of Digital on-chip CMOS SIMD Solutions,

More information

Biological Sequence Analysis. CSEP 521: Applied Algorithms Final Project. Archie Russell ( ), Jason Hogg ( )

Biological Sequence Analysis. CSEP 521: Applied Algorithms Final Project. Archie Russell ( ), Jason Hogg ( ) Biological Sequence Analysis CSEP 521: Applied Algorithms Final Project Archie Russell (0638782), Jason Hogg (0641054) Introduction Background The schematic for every living organism is stored in long

More information

Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA.

Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA. Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA. Fasta is used to compare a protein or DNA sequence to all of the

More information

A Coprocessor Architecture for Fast Protein Structure Prediction

A Coprocessor Architecture for Fast Protein Structure Prediction A Coprocessor Architecture for Fast Protein Structure Prediction M. Marolia, R. Khoja, T. Acharya, C. Chakrabarti Department of Electrical Engineering Arizona State University, Tempe, USA. Abstract Predicting

More information

Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University Moore s Law Moore, Cramming more components onto integrated circuits, Electronics, 1965. 2 3 Multi-Core Idea:

More information

Sequence Alignment with GPU: Performance and Design Challenges

Sequence Alignment with GPU: Performance and Design Challenges Sequence Alignment with GPU: Performance and Design Challenges Gregory M. Striemer and Ali Akoglu Department of Electrical and Computer Engineering University of Arizona, 85721 Tucson, Arizona USA {gmstrie,

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

UNIT 4 INTEGRATED CIRCUIT DESIGN METHODOLOGY E5163

UNIT 4 INTEGRATED CIRCUIT DESIGN METHODOLOGY E5163 UNIT 4 INTEGRATED CIRCUIT DESIGN METHODOLOGY E5163 LEARNING OUTCOMES 4.1 DESIGN METHODOLOGY By the end of this unit, student should be able to: 1. Explain the design methodology for integrated circuit.

More information

Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13

Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13 Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13 Moore s Law Moore, Cramming more components onto integrated circuits, Electronics,

More information

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan Processors Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@maili.cgu.edu.twcgu General-purpose p processor Control unit Controllerr Control/ status Datapath ALU

More information

A SMITH-WATERMAN SYSTOLIC CELL

A SMITH-WATERMAN SYSTOLIC CELL Chapter A SMITH-WATERMAN SYSTOLIC CELL C.W. Yu, K.H. Kwong, K.H. Lee and P.H.W. Leong Department of Computer Science and Engineering The Chinese University of Hong Kong, Shatin, HONG KONG y chi wai@hotmail.com,edwardkkh@alumni.cuhk.net,khlee@cse.cuhk.edu.hk,phwl@cse.cuhk.edu.hk

More information

Globally Asynchronous Locally Synchronous FPGA Architectures

Globally Asynchronous Locally Synchronous FPGA Architectures Globally Asynchronous Locally Synchronous FPGA Architectures Andrew Royal and Peter Y. K. Cheung Department of Electrical & Electronic Engineering, Imperial College, London, UK {a.royal, p.cheung}@imperial.ac.uk

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

Accelerating Next Generation Genome Reassembly in FPGAs: Alignment Using Dynamic Programming Algorithms

Accelerating Next Generation Genome Reassembly in FPGAs: Alignment Using Dynamic Programming Algorithms Accelerating Next Generation Genome Reassembly in FPGAs: Alignment Using Dynamic Programming Algorithms Maria Kim A thesis submitted in partial fulfillment of the requirements for the degree of Master

More information

A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset

A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset M.Santhi, Arun Kumar S, G S Praveen Kalish, Siddharth Sarangan, G Lakshminarayanan Dept of ECE, National Institute

More information

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing

More information

) I R L Press Limited, Oxford, England. The protein identification resource (PIR)

) I R L Press Limited, Oxford, England. The protein identification resource (PIR) Volume 14 Number 1 Volume 1986 Nucleic Acids Research 14 Number 1986 Nucleic Acids Research The protein identification resource (PIR) David G.George, Winona C.Barker and Lois T.Hunt National Biomedical

More information

Outline Marquette University

Outline Marquette University COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations

More information

ECE520 VLSI Design. Lecture 1: Introduction to VLSI Technology. Payman Zarkesh-Ha

ECE520 VLSI Design. Lecture 1: Introduction to VLSI Technology. Payman Zarkesh-Ha ECE520 VLSI Design Lecture 1: Introduction to VLSI Technology Payman Zarkesh-Ha Office: ECE Bldg. 230B Office hours: Wednesday 2:00-3:00PM or by appointment E-mail: pzarkesh@unm.edu Slide: 1 Course Objectives

More information

InfiniBand SDR, DDR, and QDR Technology Guide

InfiniBand SDR, DDR, and QDR Technology Guide White Paper InfiniBand SDR, DDR, and QDR Technology Guide The InfiniBand standard supports single, double, and quadruple data rate that enables an InfiniBand link to transmit more data. This paper discusses

More information

Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware

Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware Arpith Jacob 1, Marcin Paprzycki 2,3, Maria Ganzha 2,4, and Sugata Sanyal 5 1 Department of Computer Science and Engineering Vellore

More information

Integrated Accelerator Architecture for DNA Sequences Alignment with Enhanced Traceback Phase

Integrated Accelerator Architecture for DNA Sequences Alignment with Enhanced Traceback Phase Integrated Accelerator Architecture for DNA Sequences Alignment with Enhanced Traceback Phase Nuno Sebastião Tiago Dias Nuno Roma Paulo Flores INESC-ID INESC-ID / IST INESC-ID INESC-ID IST-TU Lisbon ISEL-PI

More information

Reconfigurable Architecture for Biological Sequence Comparison in Reduced Memory Space*

Reconfigurable Architecture for Biological Sequence Comparison in Reduced Memory Space* Reconfigurable Architecture for Biological Sequence Comparison in Reduced Memory Space* Azzedine Boukerche 1, Jan M. Correa 2, Alba Cristina M. A. de Melo 2, Ricardo P. Jacobi 2, Adson F. Rocha 3 1 SITE,

More information

A Special-Purpose Processor for Gene Sequence Analysis. Barry Fagin* J. GIll Watt** Thayer School of Engineering

A Special-Purpose Processor for Gene Sequence Analysis. Barry Fagin* J. GIll Watt** Thayer School of Engineering A Special-Purpose Processor for Gene Sequence Analysis Barry Fagin* (barry.fagin@dartmouth.edu) J. GIll Watt** Thayer School of Engineering Robert Gross (bob.gross@dartmouth.edu) Department of Biology

More information

FastA & the chaining problem

FastA & the chaining problem FastA & the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 1 Sources for this lecture: Lectures by Volker Heun, Daniel Huson and Knut Reinert,

More information

Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems

Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems Sam Farrokhi, Masoud Zamani, Hossein Pedram, Mehdi Sedighi Amirkabir University of Technology Department of Computer Eng. & IT E-mail:

More information

FPGA Based Agrep for DNA Microarray Sequence Searching

FPGA Based Agrep for DNA Microarray Sequence Searching 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (20) (20) IACSIT Press, Singapore FPGA Based Agrep for DNA Microarray Sequence Searching Gabriel F. Villorente, 2 Mark

More information

COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design

COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design Lecture Objectives Background Need for Accelerator Accelerators and different type of parallelizm

More information

Eastern Mediterranean University School of Computing and Technology CACHE MEMORY. Computer memory is organized into a hierarchy.

Eastern Mediterranean University School of Computing and Technology CACHE MEMORY. Computer memory is organized into a hierarchy. Eastern Mediterranean University School of Computing and Technology ITEC255 Computer Organization & Architecture CACHE MEMORY Introduction Computer memory is organized into a hierarchy. At the highest

More information

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:

FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10: FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:56 4001 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem

More information

When Girls Design CPUs!

When Girls Design CPUs! When Girls Design CPUs! An overview on one of the world s most famous CPU cores: ARM 1 Once Upon a Time There was a company in UK Acorn This company was the competitor to IBM Apple They were creating personal

More information

Single Pass, BLAST-like, Approximate String Matching on FPGAs*

Single Pass, BLAST-like, Approximate String Matching on FPGAs* Single Pass, BLAST-like, Approximate String Matching on FPGAs* Martin Herbordt Josh Model Yongfeng Gu Bharat Sukhwani Tom VanCourt Computer Architecture and Automated Design Laboratory Department of Electrical

More information

Highly Scalable and Accurate Seeds for Subsequence Alignment

Highly Scalable and Accurate Seeds for Subsequence Alignment Highly Scalable and Accurate Seeds for Subsequence Alignment Abhijit Pol Tamer Kahveci Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA, 32611

More information

The T0 Vector Microprocessor. Talk Outline

The T0 Vector Microprocessor. Talk Outline Slides from presentation at the Hot Chips VII conference, 15 August 1995.. The T0 Vector Microprocessor Krste Asanovic James Beck Bertrand Irissou Brian E. D. Kingsbury Nelson Morgan John Wawrzynek University

More information

A GPU Algorithm for Comparing Nucleotide Histograms

A GPU Algorithm for Comparing Nucleotide Histograms A GPU Algorithm for Comparing Nucleotide Histograms Adrienne Breland Harpreet Singh Omid Tutakhil Mike Needham Dickson Luong Grant Hennig Roger Hoang Torborn Loken Sergiu M. Dascalu Frederick C. Harris,

More information

Sequence alignment theory and applications Session 3: BLAST algorithm

Sequence alignment theory and applications Session 3: BLAST algorithm Sequence alignment theory and applications Session 3: BLAST algorithm Introduction to Bioinformatics online course : IBT Sonal Henson Learning Objectives Understand the principles of the BLAST algorithm

More information

Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison

Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison Van Hoa NGUYEN IRISA/INRIA Rennes Rennes, France Email: vhnguyen@irisa.fr Dominique LAVENIER CNRS/IRISA Rennes, France Email:

More information

Computational Genomics and Molecular Biology, Fall

Computational Genomics and Molecular Biology, Fall Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the

More information

Software Implementation of Smith-Waterman Algorithm in FPGA

Software Implementation of Smith-Waterman Algorithm in FPGA Software Implementation of Smith-Waterman lgorithm in FP NUR FRH IN SLIMN, NUR DLILH HMD SBRI, SYED BDUL MULIB L JUNID, ZULKIFLI BD MJID, BDUL KRIMI HLIM Faculty of Electrical Engineering Universiti eknologi

More information

Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators

Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators Shannon I. Steinfadt Doctoral Research Showcase III Room 17 A / B 4:00-4:15 International Conference for High Performance

More information

CSE : Introduction to Computer Architecture

CSE : Introduction to Computer Architecture Computer Architecture 9/21/2005 CSE 675.02: Introduction to Computer Architecture Instructor: Roger Crawfis (based on slides from Gojko Babic A modern meaning of the term computer architecture covers three

More information

CBMF W4761 Final Project Report

CBMF W4761 Final Project Report CBMF W4761 Final Project Report Christopher Fenton CBMF W4761 8/16/09 Introduction Performing large scale database searches of genomic data is one of the largest problems in computational genomics. When

More information

Lecture 23. Finish-up buses Storage

Lecture 23. Finish-up buses Storage Lecture 23 Finish-up buses Storage 1 Example Bus Problems, cont. 2) Assume the following system: A CPU and memory share a 32-bit bus running at 100MHz. The memory needs 50ns to access a 64-bit value from

More information