A Scalable Coprocessor for Bioinformatic Sequence Alignments
Scott F. Smith
Department of Electrical and Computer Engineering, Boise State University, Boise, ID, U.S.A.

Abstract

A hardware coprocessor for the rapid calculation of bioinformatic sequence alignments is presented. The coprocessor uses a globally-asynchronous locally-synchronous (GALS) design style which makes the coprocessor much easier to scale as CMOS feature sizes decrease. The coprocessor is intended to be implemented on a single integrated circuit along with a simple 32-bit RISC processor and memory system. The specific sequence alignment algorithm implemented is that of Smith and Waterman, but the general design strategy could be extended to other bioinformatic sequence alignment algorithms.

Keywords: coprocessor, bioinformatics, sequence alignment, Smith-Waterman, globally-asynchronous locally-synchronous.

1. Introduction

There is a huge amount of biologic sequence data available today and the volume of data is increasing at an exponential rate. This data includes DNA and protein sequences as well as structural data (x, y, and z positions of atoms in organic polymers). A major challenge for molecular biologists is to make sense of these newly available sequences which run into the billions of units (DNA base pairs or protein amino acids). The only possible way to handle this large amount of data is with automated computer processing, a field now called bioinformatics.

One of the core activities undertaken by bioinformatics is the alignment of sequences. This is similar to string searching in text databases, but with some aspects that are peculiar to biologic databases. The simplest of these alignments involves taking a vector of symbols (a query string) and finding ranges in a vector database that are similar to the query string. More complicated alignments use structural data and attempt to find locations in the database with similar relative locations of atoms.
This paper describes a coprocessor to handle the non-structural alignment problem, but it may also be used as a basis for designing structural alignment hardware.

One of the reasons that text search algorithms cannot be directly used for sequence alignment is that there is rarely an exact match between the query string and the database. The query string might be a protein found in humans and the search is for a protein with a similar function in another organism such as mice. The two proteins will have differences that correspond to mutations, insertions, or deletions. A mutation is a difference in a character at a given position. An insertion or deletion is an addition or removal of one or more characters from the query string or database. Insertions and deletions are symmetric in that an insertion in the query string can be viewed as a deletion in the database and vice versa. A further complication is that certain mutations for proteins (amino acids) have less effect on the protein function than other mutations.

In order to generate a high quality alignment which finds similar locations in the database without a high false alarm rate, it is necessary that the alignment algorithm mirror the statistics of the underlying biological process. One such algorithm is the Smith-Waterman alignment algorithm [1], which gives high quality alignments but is very computationally demanding.
Alternatives to dynamic-programming based alignments such as Smith-Waterman exist which are not as high quality, but are less computationally demanding. These include BLAST (Basic Local Alignment Search Tool) [2] and FASTA (Fast Align) [3], which are commonly used on web-based servers such as those maintained by the National Center for Biotechnology Information (NCBI) [4]. Even though these algorithms are less computationally demanding, they still use significant computing resources. In order to get the high quality of the Smith-Waterman algorithm, one typically has to resort to the power of scientific supercomputing. Strategies include both the use of general-purpose supercomputers [5] and special-purpose coprocessors [6][7]. This paper describes a special-purpose coprocessor. The coprocessor presented here differs from existing coprocessors in the literature in that it is intended to be a single-chip implementation and that it is more easily scalable due to its globally-asynchronous locally-synchronous design style [8].

There are at least two companies that manufacture and sell systems with coprocessors to accelerate bioinformatic alignments. One of these uses ASIC (application-specific integrated circuit) technology [9] and the other FPGA (field-programmable gate array) technology [10]. Typically these are used to accelerate BLAST processing, which is already computationally intensive enough to warrant special-purpose hardware. The design of these accelerators is, however, proprietary and very little detailed information about them is available in the public literature.

The main advantage of the GALS design style is that it gets around the problem of designing a large-area low-skew high-frequency global clock tree. The problems associated with global clock tree design are documented in [11]. These problems will only get worse as CMOS feature sizes decrease and include new problems such as crosstalk and wire inductance that could be safely ignored until recently.
Interfaces between the local clock domains of GALS systems have been designed by [12], [13], and [14]. The last of these designs is used for the coprocessor of this paper because it has the advantage of not requiring any control of the local clocks by the interface.

The design of a single computation unit which resides within one of the local clock domains of the GALS coprocessor system is described in Section 2. The combination of these units into a multi-clock-domain GALS coprocessor system is the topic of Section 3. Conclusions are presented in Section 4 along with future work to be undertaken.

2. Computation Unit Organization

The computation unit is designed to implement the equations of the Smith-Waterman alignment algorithm corresponding to a single character of the query sequence. If there are more characters in the query sequence than there are computation units, then the coprocessor will be accessed multiple times by the main processor. Each access will pass the entire database through the array of computation units. The processor stores the intermediate results collected at the end of each pass, which generates an intermediate results file several times as large as the initial database.

The Smith-Waterman equations are:

\[ I_{i,j} = D_{i,j} = M_{i,j} = 0 \quad \text{for all } i \text{ and } j \text{ such that } i = 0 \text{ or } j = 0 \]
\[ I_{i,j} = \max\{\, I_{i-1,j} - c,\; M_{i-1,j} - g \,\} \]
\[ D_{i,j} = \max\{\, D_{i,j-1} - c,\; M_{i,j-1} - g \,\} \]
\[ M_{i,j} = \max\{\, I_{i-1,j-1} + d(a_i, b_j),\; D_{i-1,j-1} + d(a_i, b_j),\; M_{i-1,j-1} + d(a_i, b_j),\; 0 \,\} \]

where the indices i and j refer to the position within the query string and within the database. Since the algorithm is symmetric, it does not matter which index is chosen for the query and which for the database. I is the current score if an insertion is underway and D is the current score if a deletion is underway. The choice of assignment of i and j to query string versus database determines whether these insertions or deletions actually refer to query string insertions and deletions or database insertions and deletions. The current score if the current pair of characters (one from the query string and one from the database) is taken as a match/mutation is M. The penalty for starting a new insertion or deletion is g, and the penalty for continuing an insertion or deletion is c (g is normally chosen larger than c). The reward for a match is d and this depends in general on how close the match is. Exactly matching characters get the highest reward value and similar characters get a reduced, but positive reward. For DNA alignments, exact matches are usually assigned a positive reward with all other combinations given a reward of zero. For protein alignments, amino acids with similar properties (such as both being hydrophobic) are given non-zero, but lower rewards than exact matches. The d matrix is normally symmetric. There are four possible characters for DNA alignments (A, T, C, and G) and twenty possible characters for protein alignments (C, H, I, M, S, V, A, G, L, P, T, F, R, Y, W, D, N, E, Q, and K) [15]. These characters are normally stored as eight-bit ASCII codes in biological databases.

A block diagram of a single computation unit is shown in Figure 1. The unit is divided into two sections, constants and calculation. These two sections have separate request and acknowledge interfaces and work independently. The constants section is loaded first with c, g, d, and valid values for a particular character of the query string. The valid bit allows a computation unit to be bypassed if it is not needed as a result of the query string being shorter than the total number of computation units. The g and c values are the start and continuation penalties for insertions and deletions and are eight-bit values. The twenty d values, indexed d(0) through d(19), are each three-bit rewards for matching/mutating characters that pass through the computation unit.
The d values are a single column of the d matrix corresponding to the query string character assigned to the computation unit. In the case of DNA alignment, only the first four d values will be used, since the characters for the other sixteen d values will simply never appear in the database data stream.

Figure 1. Computation unit block diagram.
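The recurrences of this section can be sketched in software as follows. This is a minimal reference model, not the hardware design: the function name `smith_waterman_score`, the `reward` callback standing in for d, and the choice of i for the query and j for the database are assumptions made for illustration. Only two rows are kept at a time, which loosely mirrors the way each computation unit holds the state for one query character while the database streams past.

```python
def smith_waterman_score(query, database, reward, g, c):
    """Return the best local alignment score using the I, D, M recurrences.

    reward(a, b) plays the role of d(a_i, b_j); g is the penalty for starting
    an insertion or deletion and c the penalty for continuing one.
    """
    m = len(database)
    # Row i = 0: I, D and M are all zero on the boundary.
    prev_i = [0] * (m + 1)
    prev_d = [0] * (m + 1)
    prev_m = [0] * (m + 1)
    best = 0
    for a in query:
        # Column j = 0 of the new row is also zero.
        cur_i = [0] * (m + 1)
        cur_d = [0] * (m + 1)
        cur_m = [0] * (m + 1)
        for j in range(1, m + 1):
            d = reward(a, database[j - 1])
            # I depends on the previous row, same column.
            cur_i[j] = max(prev_i[j] - c, prev_m[j] - g)
            # D depends on the same row, previous column.
            cur_d[j] = max(cur_d[j - 1] - c, cur_m[j - 1] - g)
            # M depends on the diagonal neighbour and is clamped at zero.
            cur_m[j] = max(prev_i[j - 1] + d, prev_d[j - 1] + d,
                           prev_m[j - 1] + d, 0)
            best = max(best, cur_m[j])
        prev_i, prev_d, prev_m = cur_i, cur_d, cur_m
    return best


def dna_reward(a, b):
    """Example DNA scoring: positive reward for an exact match, zero otherwise."""
    return 2 if a == b else 0
```

With the example scoring above and g = 3, c = 1, aligning "ACGTACGT" against "ACGTTACGT" matches all eight query characters across one single-character gap, so the score is 8 × 2 − 3 = 13.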
The database characters (Char) are passed through the pipeline of computation units using a twenty-bit one-hot code. The one-hot code makes it easier to select the required d value from the constants section. This makes the computation unit faster and saves transistors inside the unit at the expense of additional bits at the interface between units. The current score at the current position in the database is labeled Max. An additional intermediate variable X has been added to the equations. This X variable represents a portion of the calculation that can be done prior to the arrival of the current Char value. The variable X is not passed between computation units since it is only a temporary internal state.

3. Full Coprocessor System

Figure 2. Connection of two computation units.

The connection of two computation units together using asynchronous interfaces is shown in Figure 2. Two asynchronous interfaces are needed between each pair of computation units to allow independent passage of constants and data. Each computation unit has its own local clock signal. These clock signals are intended to have the same nominal frequency and are most likely derived from the same clock source. Local clock signals have tightly controlled skew such that the usual synchronous design paradigm can be used within a clock domain. This allows the standard types of digital design tools to be used to design the logic internal to a local clock domain. Even though the clock signals of two different clock domains are nominally the same, minimal effort is employed to maintain low skew between the domains and the clock domain interfaces make no assumptions about the relative phase of the two local clock signals.

The internal design and performance of the asynchronous interface is described in detail in [14] and [16]. The interface is built around an asynchronous FIFO. The performance of the interface has been estimated using a SPICE model as 1.09 ns plus a clock-phase differential term which varies between zero and one period of the receiving clock signal. This performance estimate is based on a 180 nm TSMC [17] CMOS process available through MOSIS [18].

The full system including processor, coprocessor, and array of computation units is shown in Figure 3. The processor and coprocessor are in the same local clock domain (clock 0) and each of the n computation units occupies its own local clock domain (clock 1 through clock n). The coprocessor loads constants eight bits at a time through the chain of Const connections. Internal to each computation unit there is an eight-bit-wide, ten-entry synchronous FIFO for constant values. After 10n bytes of constants have been sent to the computation unit array by the coprocessor, the constants are fully loaded. There is no need to pass constants from computation unit n back to the coprocessor and therefore only one asynchronous interface is needed at that place.

Data are passed 84 bits at a time through the chain of Data connections. These data are composed of D, I, M, Max, and Char. There is no need to ever pass Char back to the coprocessor from computation unit n, so the 20 bits of Char are omitted and Data is only 64 bits wide at that point. The need to pass D, I, M, and Max from the coprocessor to computation unit 1 occurs only when the query string does not fit in the array of computation units and multiple passes of the database through the array are needed. If the query string completely fits or it is the first pass of a multi-pass run, the D, I, M, and Max values are set to zero by the coprocessor.
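The constants budget can be checked with a short sketch. One valid bit, two eight-bit penalties, and twenty three-bit rewards add up to 1 + 8 + 8 + 60 = 77 bits, which fits in the ten-byte FIFO of a computation unit. The bit layout used below (valid in bit 0, then g, then c, then d(0) through d(19)) and the function names are assumptions made purely for illustration; the text does not specify the actual ordering.

```python
D_COUNT = 20   # number of d values held by one computation unit
D_BITS = 3     # width of each d reward
FIFO_BYTES = 10


def pack_constants(valid, g, c, d_column):
    """Pack one unit's constants into ten bytes (hypothetical bit layout)."""
    if not (0 <= g < 256 and 0 <= c < 256):
        raise ValueError("g and c must fit in eight bits")
    if len(d_column) != D_COUNT or any(not 0 <= d < 2 ** D_BITS for d in d_column):
        raise ValueError("expected twenty three-bit d values")
    word = int(bool(valid))          # bit 0
    word |= g << 1                   # bits 1..8
    word |= c << 9                   # bits 9..16
    for k, d in enumerate(d_column):
        word |= d << (17 + D_BITS * k)   # bits 17..76
    return word.to_bytes(FIFO_BYTES, "little")


def unpack_constants(blob):
    """Inverse of pack_constants."""
    word = int.from_bytes(blob, "little")
    valid = bool(word & 1)
    g = (word >> 1) & 0xFF
    c = (word >> 9) & 0xFF
    d_column = [(word >> (17 + D_BITS * k)) & 0x7 for k in range(D_COUNT)]
    return valid, g, c, d_column


def constants_stream(units):
    """Concatenate the records for n units, giving the 10n bytes sent over Const."""
    return b"".join(pack_constants(*u) for u in units)
```

For an array of n computation units, `constants_stream` yields exactly 10n bytes, matching the load length stated above.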
Figure 3. Full coprocessor system.

One of the functions of the coprocessor is to expand eight-bit ASCII-coded symbols for DNA bases or amino acids into the 20-bit one-hot code used to specify Char in the computation unit array. The processor is responsible for maintaining the d matrix and calculating all of the constants to be loaded via Const. One reason for having the coprocessor do the one-hot expansion is to reduce the bandwidth over the processor-coprocessor interface.

A good candidate for the processor in this system is an ARM922T CPU core [19][20][21]. This is a small 32-bit RISC processor designed to be used as a core on an ASIC. The ARM922T CPU core is built around an ARM9 processor which has a standard coprocessor interface. One standard coprocessor which uses this interface is the memory management unit (MMU), but additional application-specific coprocessors can be designed to use the interface definition. This coprocessor interface is 32 bits wide. After initialization of the constants, information passing from the processor to the coprocessor is in the form of ASCII characters, so four database characters can be sent per transfer. Information passing from the coprocessor to the processor is a series of 16-bit scores, one for each database character passed into the coprocessor. Four characters therefore require one inbound transfer and two outbound transfers, so the 32-bit processor-coprocessor interface handles an average of 4/3 database characters per clock cycle.

The ARM922T is capable of operating at 200 MHz in the 180 nm TSMC CMOS process. The synchronous logic within the coprocessor and computation units has not yet been designed, but it is not unreasonable to expect these to operate at a similar speed (the coprocessor is required to operate on the same clock as the processor). If so, then the processor will be near full processing capacity moving data into and out of the coprocessor during the database access phase of processing.
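The one-hot expansion and the interface arithmetic above can be sketched as follows. The assignment of one-hot bit positions to letters (here simply the order in which the amino acids are listed in Section 2), the byte order within a 32-bit word, and the function names are all assumptions for illustration; the text does not specify them.

```python
from fractions import Fraction

# Hypothetical bit assignment: bit k of the one-hot code stands for the k-th
# letter of this string (the amino acid letters in the order given in the text).
AMINO_ACIDS = "CHIMSVAGLPTFRYWDNEQK"


def expand_word(word, alphabet=AMINO_ACIDS):
    """Split a 32-bit word into four ASCII characters and one-hot encode each.

    Assumes the first character sits in the least significant byte.
    Raises ValueError for a character that is not in the alphabet.
    """
    codes = []
    for shift in (0, 8, 16, 24):
        ch = chr((word >> shift) & 0xFF)
        codes.append(1 << alphabet.index(ch))
    return codes


def chars_per_transfer(bus_bits=32, char_bits=8, score_bits=16):
    """Average database characters handled per bus transfer in steady state."""
    chars_in = bus_bits // char_bits        # characters per inbound transfer
    scores_out = bus_bits // score_bits     # scores per outbound transfer
    # One inbound transfer plus the outbound transfers needed for its scores.
    transfers = 1 + Fraction(chars_in, scores_out)
    return Fraction(chars_in) / transfers
```

With the default widths, `chars_per_transfer()` evaluates to 4/3. Passing `alphabet="ATCG"` to `expand_word` uses only the lowest four one-hot positions, in line with the DNA case described in Section 2.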
At 200 million database characters per second, a search of the entire human genome (about 3 billion base-pairs) would take about 15 seconds. This assumes that there are enough computation units to hold the entire query string. Performance with longer query strings would be significantly lower, since the processor would need to store and retrieve the intermediate D, I, M, and Max values once for every pass in excess of the first.

4. Conclusion

The main advantage of the GALS approach used in the design of the alignment coprocessor is the ease of scaling to smaller CMOS feature sizes, which allows for an increase in the number of computation units in the coprocessor array. Increasing the number of units allows longer query strings to be processed without using multiple passes. Alternatively, more than one set of processor, coprocessor, and computation unit array can be placed on a single integrated circuit. This would increase throughput rather than increase the effective query string length.

The next steps in this work will be the design of the synchronous logic within the coprocessor and computation units. This will yield information on the layout size of the computation unit, which in turn will determine how many units can be placed on a single integrated circuit. The design will also allow simulation to determine whether the coprocessor and computation units can in fact run at near 200 MHz in a 180 nm CMOS process. It is already known from the asynchronous interface design that the interfaces are not a large contributor to layout area and can easily support 200 MHz throughput.

References
[1] T. Smith and M. Waterman, Identification of Common Molecular Sequences, Journal of Molecular Biology.
[2] S. Altschul, W. Gish, E. Myers, and D. Lipman, Basic Local Alignment Search Tool, Journal of Molecular Biology.
[3] W. Pearson and D. Lipman, Improved Tools for Biological Sequence Comparison, Proceedings of the National Academy of Science.
[4] National Center for Biotechnology Information (NCBI).
[5] S. Smith and J. Frenzel, Bioinformatics Application of a Scalable Supercomputer-on-chip Architecture, Proceedings of the International Conference on Parallel and Distributed Processing Techniques, Volume 1.
[6] L. Grate, M. Diekhans, D. Dahle, and R. Hughey, Sequence Analysis with the Kestrel SIMD Parallel Processor, Proceedings of the Pacific Symposium on Biocomputing.
[7] P. Guerdoux-Jamet and D. Lavenier, SAMBA: Hardware Accelerator for Biological Sequence Comparison, Computer Applications in Biosciences.
[8] D. Chapiro, Globally-Asynchronous Locally-Synchronous Systems, Doctoral Thesis, Stanford University.
[9] Paracel, Inc. website.
[10] TimeLogic Corp. website.
[11] D. Bailey, Clock Distribution, in Design of High-Performance Microprocessor Circuits, IEEE Press.
[12] J. Muttersbach, T. Villiger, H. Kaeslin, N. Felber, and W. Fichtner, Globally-Asynchronous Locally-Synchronous Architectures to Simplify the Design of On-Chip Systems, Proceedings of the IEEE International ASIC/SOC Conference.
[13] K. Yun and A. Dooply, Pausible Clocking-Based Heterogeneous Systems, IEEE Transactions on VLSI Systems.
[14] S. Smith and J. Frenzel, Low-latency Multiple Clock Domain Interfacing Without Alteration of Local Clocks, Proceedings of the 15th Biennial IEEE University / Government / Industry Microelectronics Symposium.
[15] C. Branden and J. Tooze, Introduction to Protein Structure, 2nd Edition, Garland Publishing.
[16] S. Smith, A Multiple-Clock-Domain Bus Architecture Using Asynchronous FIFOs as Elastic Elements, Doctoral Thesis, University of Idaho.
[17] Taiwan Semiconductor Manufacturing Company website.
[18] MOSIS website.
[19] S. Furber, ARM System-on-Chip Architecture, 2nd Edition, Addison-Wesley.
[20] D. Seal, ARM Architecture Reference Manual, 2nd Edition, Addison-Wesley.
[21] ARM Ltd. website.
24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, This lecture is based on the following papers, which are all recommended reading:
24 Grundlagen der Bioinformatik, SS 10, D. Huson, April 26, 2010 3 BLAST and FASTA This lecture is based on the following papers, which are all recommended reading: D.J. Lipman and W.R. Pearson, Rapid
More informationOPEN MP-BASED PARALLEL AND SCALABLE GENETIC SEQUENCE ALIGNMENT
OPEN MP-BASED PARALLEL AND SCALABLE GENETIC SEQUENCE ALIGNMENT Asif Ali Khan*, Laiq Hassan*, Salim Ullah* ABSTRACT: In bioinformatics, sequence alignment is a common and insistent task. Biologists align
More informationHardware Accelerator for Biological Sequence Alignment using Coreworks Processing Engine
Hardware Accelerator for Biological Sequence Alignment using Coreworks Processing Engine José Cabrita, Gilberto Rodrigues, Paulo Flores INESC-ID / IST, Technical University of Lisbon jpmcabrita@gmail.com,
More informationBioinformatics explained: Smith-Waterman
Bioinformatics Explained Bioinformatics explained: Smith-Waterman May 1, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com
More informationA CAM(Content Addressable Memory)-based architecture for molecular sequence matching
A CAM(Content Addressable Memory)-based architecture for molecular sequence matching P.K. Lala 1 and J.P. Parkerson 2 1 Department Electrical Engineering, Texas A&M University, Texarkana, Texas, USA 2
More informationBioinformatics explained: BLAST. March 8, 2007
Bioinformatics Explained Bioinformatics explained: BLAST March 8, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com Bioinformatics
More informationBLAST, Profile, and PSI-BLAST
BLAST, Profile, and PSI-BLAST Jianlin Cheng, PhD School of Electrical Engineering and Computer Science University of Central Florida 26 Free for academic use Copyright @ Jianlin Cheng & original sources
More informationParallel Processing for Scanning Genomic Data-Bases
1 Parallel Processing for Scanning Genomic Data-Bases D. Lavenier and J.-L. Pacherie a {lavenier,pacherie}@irisa.fr a IRISA, Campus de Beaulieu, 35042 Rennes cedex, France The scan of a genomic data-base
More informationData Mining Technologies for Bioinformatics Sequences
Data Mining Technologies for Bioinformatics Sequences Deepak Garg Computer Science and Engineering Department Thapar Institute of Engineering & Tecnology, Patiala Abstract Main tool used for sequence alignment
More informationAn Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST
An Analysis of Pairwise Sequence Alignment Algorithm Complexities: Needleman-Wunsch, Smith-Waterman, FASTA, BLAST and Gapped BLAST Alexander Chan 5075504 Biochemistry 218 Final Project An Analysis of Pairwise
More informationComparative Analysis of Protein Alignment Algorithms in Parallel environment using CUDA
Comparative Analysis of Protein Alignment Algorithms in Parallel environment using BLAST versus Smith-Waterman Shadman Fahim shadmanbracu09@gmail.com Shehabul Hossain rudrozzal@gmail.com Gulshan Jubaed
More informationBio-Sequence Analysis with Cradle s 3SoC Software Scalable System on Chip
2004 ACM Symposium on Applied Computing Bio-Sequence Analysis with Cradle s 3SoC Software Scalable System on Chip Xiandong Meng Department of Electrical and Computer Engineering Wayne State University
More informationThe Effect of Inverse Document Frequency Weights on Indexed Sequence Retrieval. Kevin C. O'Kane. Department of Computer Science
The Effect of Inverse Document Frequency Weights on Indexed Sequence Retrieval Kevin C. O'Kane Department of Computer Science The University of Northern Iowa Cedar Falls, Iowa okane@cs.uni.edu http://www.cs.uni.edu/~okane
More informationJyoti Lakhani 1, Ajay Khunteta 2, Dharmesh Harwani *3 1 Poornima University, Jaipur & Maharaja Ganga Singh University, Bikaner, Rajasthan, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2017 IJSRCSEIT Volume 2 Issue 6 ISSN : 2456-3307 Improvisation of Global Pairwise Sequence Alignment
More informationAcceleration of Algorithm of Smith-Waterman Using Recursive Variable Expansion.
www.ijarcet.org 54 Acceleration of Algorithm of Smith-Waterman Using Recursive Variable Expansion. Hassan Kehinde Bello and Kazeem Alagbe Gbolagade Abstract Biological sequence alignment is becoming popular
More informationBLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha
BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio. 1990. CS 466 Saurabh Sinha Motivation Sequence homology to a known protein suggest function of newly sequenced protein Bioinformatics
More informationDynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014
Dynamic Programming User Manual v1.0 Anton E. Weisstein, Truman State University Aug. 19, 2014 Dynamic programming is a group of mathematical methods used to sequentially split a complicated problem into
More informationHARDWARE ACCELERATION OF HIDDEN MARKOV MODELS FOR BIOINFORMATICS APPLICATIONS. by Shakha Gupta. A project. submitted in partial fulfillment
HARDWARE ACCELERATION OF HIDDEN MARKOV MODELS FOR BIOINFORMATICS APPLICATIONS by Shakha Gupta A project submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer
More informationHardware Acceleration of Sequence Alignment Algorithms An Overview
Hardware Acceleration of Sequence Alignment Algorithms An Overview Laiq Hasan Zaid Al-Ars Stamatis Vassiliadis Delft University of Technology Computer Engineering Laboratory Mekelweg 4, 2628 CD Delft,
More informationGPU Accelerated Smith-Waterman
GPU Accelerated Smith-Waterman Yang Liu 1,WayneHuang 1,2, John Johnson 1, and Sheila Vaidya 1 1 Lawrence Livermore National Laboratory 2 DOE Joint Genome Institute, UCRL-CONF-218814 {liu24, whuang, jjohnson,
More informationPerformance Comparison between Linear RVE and Linear Systolic Array Implementations of the Smith-Waterman Algorithm
Performance Comparison between Linear RVE and Linear Systolic Array Implementations of the Smith-Waterman Algorithm Laiq Hasan Zaid Al-Ars Delft University of Technology Computer Engineering Laboratory
More informationComputational Molecular Biology
Computational Molecular Biology Erwin M. Bakker Lecture 3, mainly from material by R. Shamir [2] and H.J. Hoogeboom [4]. 1 Pairwise Sequence Alignment Biological Motivation Algorithmic Aspect Recursive
More informationICB Fall G4120: Introduction to Computational Biology. Oliver Jovanovic, Ph.D. Columbia University Department of Microbiology
ICB Fall 2008 G4120: Computational Biology Oliver Jovanovic, Ph.D. Columbia University Department of Microbiology Copyright 2008 Oliver Jovanovic, All Rights Reserved. The Digital Language of Computers
More informationSearching Biological Sequence Databases Using Distributed Adaptive Computing
Searching Biological Sequence Databases Using Distributed Adaptive Computing Nicholas P. Pappas Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment
More informationAcceleration of Ungapped Extension in Mercury BLAST. Joseph Lancaster Jeremy Buhler Roger Chamberlain
Acceleration of Ungapped Extension in Mercury BLAST Joseph Lancaster Jeremy Buhler Roger Chamberlain Joseph Lancaster, Jeremy Buhler, and Roger Chamberlain, Acceleration of Ungapped Extension in Mercury
More informationTHE Smith-Waterman (SW) algorithm [1] is a wellknown
Design and Implementation of the Smith-Waterman Algorithm on the CUDA-Compatible GPU Yuma Munekawa, Fumihiko Ino, Member, IEEE, and Kenichi Hagihara Abstract This paper describes a design and implementation
More informationA Design of a Hybrid System for DNA Sequence Alignment
IMECS 2008, 9-2 March, 2008, Hong Kong A Design of a Hybrid System for DNA Sequence Alignment Heba Khaled, Hossam M. Faheem, Tayseer Hasan, Saeed Ghoneimy Abstract This paper describes a parallel algorithm
More informationHIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT
HIDDEN MARKOV MODELS AND SEQUENCE ALIGNMENT - Swarbhanu Chatterjee. Hidden Markov models are a sophisticated and flexible statistical tool for the study of protein models. Using HMMs to analyze proteins
More informationFrom Smith-Waterman to BLAST
From Smith-Waterman to BLAST Jeremy Buhler July 23, 2015 Smith-Waterman is the fundamental tool that we use to decide how similar two sequences are. Isn t that all that BLAST does? In principle, it is
More informationRevisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search
Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search Ashwin M. Aji and Wu-chun Feng The Synergy Laboratory Department of Computer Science Virginia Tech {aaji,feng}@cs.vt.edu Abstract
More informationFast Sequence Alignment Method Using CUDA-enabled GPU
Fast Sequence Alignment Method Using CUDA-enabled GPU Yeim-Kuan Chang Department of Computer Science and Information Engineering National Cheng Kung University Tainan, Taiwan ykchang@mail.ncku.edu.tw De-Yu
More informationAs of August 15, 2008, GenBank contained bases from reported sequences. The search procedure should be
48 Bioinformatics I, WS 09-10, S. Henz (script by D. Huson) November 26, 2009 4 BLAST and BLAT Outline of the chapter: 1. Heuristics for the pairwise local alignment of two sequences 2. BLAST: search and
More informationProtein Sequence Comparison on the Instruction Systolic Array
Protein Sequence Comparison on the Instruction Systolic Array Bertil Schmidt, Heiko Schröder and Manfred Schimmler 2 School of Computer Engineering, Nanyang Technological University, Singapore 639798,
More informationOn the Efficacy of Haskell for High Performance Computational Biology
On the Efficacy of Haskell for High Performance Computational Biology Jacqueline Addesa Academic Advisors: Jeremy Archuleta, Wu chun Feng 1. Problem and Motivation Biologists can leverage the power of
More informationChapter Seven Morgan Kaufmann Publishers
Chapter Seven Memories: Review SRAM: value is stored on a pair of inverting gates very fast but takes up more space than DRAM (4 to 6 transistors) DRAM: value is stored as a charge on capacitor (must be
More informationDistributed Protein Sequence Alignment
Distributed Protein Sequence Alignment ABSTRACT J. Michael Meehan meehan@wwu.edu James Hearne hearne@wwu.edu Given the explosive growth of biological sequence databases and the computational complexity
More informationPraveen Krishnamurthy, Jeremy Buhler, Roger Chamberlain, Mark Franklin, Kwame Gyang, and Joseph Lancaster
Biosequence Similarity Search on the Mercury System Praveen Krishnamurthy, Jeremy Buhler, Roger Chamberlain, Mark Franklin, Kwame Gyang, and Joseph Lancaster Praveen Krishnamurthy, Jeremy Buhler, Roger
More informationScalable Hardware Accelerator for Comparing DNA and Protein Sequences
Scalable Hardware Accelerator for Comparing DNA and Protein Sequences Philippe Faes, Bram Minnaert, Mark Christiaens, Eric Bonnet, Yvan Saeys, Dirk Stroobandt, Yves Van de Peer Abstract Comparing genetic
More informationResearch on Pairwise Sequence Alignment Needleman-Wunsch Algorithm
5th International Conference on Mechatronics, Materials, Chemistry and Computer Engineering (ICMMCCE 2017) Research on Pairwise Sequence Alignment Needleman-Wunsch Algorithm Xiantao Jiang1, a,*,xueliang
More informationCOS 551: Introduction to Computational Molecular Biology Lecture: Oct 17, 2000 Lecturer: Mona Singh Scribe: Jacob Brenner 1. Database Searching
COS 551: Introduction to Computational Molecular Biology Lecture: Oct 17, 2000 Lecturer: Mona Singh Scribe: Jacob Brenner 1 Database Searching In database search, we typically have a large sequence database
More informationScalable Accelerator Architecture for Local Alignment of DNA Sequences
Scalable Accelerator Architecture for Local Alignment of DNA Sequences Nuno Sebastião, Nuno Roma, Paulo Flores INESC-ID / IST-TU Lisbon Rua Alves Redol, 9, Lisboa PORTUGAL {Nuno.Sebastiao, Nuno.Roma, Paulo.Flores}
More informationAccelerating Smith Waterman (SW) Algorithm on Altera Cyclone II Field Programmable Gate Array
Accelerating Smith Waterman (SW) Algorithm on Altera yclone II Field Programmable Gate Array NUR DALILAH AHMAD SABRI, NUR FARAH AIN SALIMAN, SYED ABDUL MUALIB AL JUNID, ABDUL KARIMI HALIM Faculty Electrical
More informationBiology 644: Bioinformatics
Find the best alignment between 2 sequences with lengths n and m, respectively Best alignment is very dependent upon the substitution matrix and gap penalties The Global Alignment Problem tries to find
More informationUSING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT
IADIS International Conference Applied Computing 2006 USING AN EXTENDED SUFFIX TREE TO SPEED-UP SEQUENCE ALIGNMENT Divya R. Singh Software Engineer Microsoft Corporation, Redmond, WA 98052, USA Abdullah
Fall 2011 CSC 570: Bioinformatics Alexander Dekhtyar
Fall 2011 CSC 570: Bioinformatics Alexander Dekhtyar PAM and BLOSUM Matrices Prepared by: Jason Banich and Chris Hoover Background As DNA sequences change and evolve, certain amino acids are more
An I/O device driver for bioinformatics tools: the case for BLAST
An I/O device driver for bioinformatics tools: the case for BLAST Renato Campos Mauro and Sérgio Lifschitz Departamento de Informática PUC-RIO, Pontifícia
VLSI Design Automation
VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,
Vertex Shader Design I
The following content is extracted from the paper shown in next page. If any wrong citation or reference missing, please contact ldvan@cs.nctu.edu.tw. I will correct the error asap. This course used only
Performance Analysis of Parallelized Bioinformatics Applications
Asian Journal of Computer Science and Technology ISSN: 2249-0701 Vol.7 No.2, 2018, pp. 70-74 The Research Publication, www.trp.org.in Dhruv Chander Pant 1 and OP Gupta 2 1 Research Scholar, I. K. Gujral
Darwin: A Genomic Co-processor gives up to 15,000X speedup on long read assembly (To appear in ASPLOS 2018)
Darwin: A Genomic Co-processor gives up to 15,000X speedup on long read assembly (To appear in ASPLOS 2018) Yatish Turakhia EE PhD candidate Stanford University Prof. Bill Dally (Electrical Engineering
Research Article International Journals of Advanced Research in Computer Science and Software Engineering ISSN: 2277-128X (Volume-7, Issue-6)
International Journals of Advanced Research in Computer Science and Software Engineering ISSN: 2277-128X (Volume-7, Issue-6) Research Article June 2017 DDGARM: Dotlet Driven Global Alignment with Reduced Matrix
FASTA. Besides that, FASTA package provides SSEARCH, an implementation of the optimal Smith-Waterman algorithm.
FASTA INTRODUCTION Definition (by David J. Lipman and William R. Pearson in 1985) - Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence
A BANDED SMITH-WATERMAN FPGA ACCELERATOR FOR MERCURY BLASTP
A BANDED SMITH-WATERMAN FPGA ACCELERATOR FOR MERCURY BLASTP Brandon Harris*, Arpith C. Jacob*, Joseph M. Lancaster*, Jeremy Buhler*, Roger D. Chamberlain* *Dept. of Computer Science and Engineering, Washington
Heuristic methods for pairwise alignment:
Bi03c_1 Unit 03c: Heuristic methods for pairwise alignment: k-tuple-methods k-tuple-methods for alignment of pairs of sequences Bi03c_2 dynamic programming is too slow for large databases Use heuristic
PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology
Nucleic Acids Research, 2005, Vol. 33, Web Server issue W535-W539 doi:10.1093/nar/gki423 PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology Per Eystein
A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology
http://dx.doi.org/10.5573/jsts.2014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 2014 A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee
INTRODUCTION TO BIOINFORMATICS
Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information (NCBI) to obtain
Massively Parallel Computing on Silicon: SIMD Implementations. V.M. Brea Univ. of Santiago de Compostela Spain
Massively Parallel Computing on Silicon: SIMD Implementations V.M. Brea Univ. of Santiago de Compostela Spain GOAL Give an overview on the state-of-the-art of Digital on-chip CMOS SIMD Solutions,
Biological Sequence Analysis. CSEP 521: Applied Algorithms Final Project. Archie Russell ( ), Jason Hogg ( )
Biological Sequence Analysis CSEP 521: Applied Algorithms Final Project Archie Russell (0638782), Jason Hogg (0641054) Introduction Background The schematic for every living organism is stored in long
Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA.
Compares a sequence of protein to another sequence or database of a protein, or a sequence of DNA to another sequence or library of DNA. Fasta is used to compare a protein or DNA sequence to all of the
A Coprocessor Architecture for Fast Protein Structure Prediction
A Coprocessor Architecture for Fast Protein Structure Prediction M. Marolia, R. Khoja, T. Acharya, C. Chakrabarti Department of Electrical Engineering Arizona State University, Tempe, USA. Abstract Predicting
Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University
Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University Moore's Law Moore, Cramming more components onto integrated circuits, Electronics, 1965. Multi-Core Idea:
Sequence Alignment with GPU: Performance and Design Challenges
Sequence Alignment with GPU: Performance and Design Challenges Gregory M. Striemer and Ali Akoglu Department of Electrical and Computer Engineering University of Arizona, 85721 Tucson, Arizona USA {gmstrie,
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor
UNIT 4 INTEGRATED CIRCUIT DESIGN METHODOLOGY E5163
UNIT 4 INTEGRATED CIRCUIT DESIGN METHODOLOGY E5163 LEARNING OUTCOMES 4.1 DESIGN METHODOLOGY By the end of this unit, student should be able to: 1. Explain the design methodology for integrated circuit.
Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13
Computer Architecture: Multi-Core Processors: Why? Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/11/13 Moore's Law Moore, Cramming more components onto integrated circuits, Electronics,
Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan
Processors Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@maili.cgu.edu.twcgu General-purpose processor Control unit Controller Control/status Datapath ALU
A SMITH-WATERMAN SYSTOLIC CELL
Chapter A SMITH-WATERMAN SYSTOLIC CELL C.W. Yu, K.H. Kwong, K.H. Lee and P.H.W. Leong Department of Computer Science and Engineering The Chinese University of Hong Kong, Shatin, HONG KONG y chi wai@hotmail.com,edwardkkh@alumni.cuhk.net,khlee@cse.cuhk.edu.hk,phwl@cse.cuhk.edu.hk
Globally Asynchronous Locally Synchronous FPGA Architectures
Globally Asynchronous Locally Synchronous FPGA Architectures Andrew Royal and Peter Y. K. Cheung Department of Electrical & Electronic Engineering, Imperial College, London, UK {a.royal, p.cheung}@imperial.ac.uk
The Nios II Family of Configurable Soft-core Processors
The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture
Accelerating Next Generation Genome Reassembly in FPGAs: Alignment Using Dynamic Programming Algorithms
Accelerating Next Generation Genome Reassembly in FPGAs: Alignment Using Dynamic Programming Algorithms Maria Kim A thesis submitted in partial fulfillment of the requirements for the degree of Master
A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset
A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset M.Santhi, Arun Kumar S, G S Praveen Kalish, Siddharth Sarangan, G Lakshminarayanan Dept of ECE, National Institute
VLSI Design Automation. Calcolatori Elettronici Ing. Informatica
VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing
© IRL Press Limited, Oxford, England. The protein identification resource (PIR)
Volume 14 Number 1 1986 Nucleic Acids Research The protein identification resource (PIR) David G. George, Winona C. Barker and Lois T. Hunt National Biomedical
Outline Marquette University
COEN-4710 Computer Hardware Lecture 1 Computer Abstractions and Technology (Ch.1) Cristinel Ababei Department of Electrical and Computer Engineering Credits: Slides adapted primarily from presentations
ECE520 VLSI Design. Lecture 1: Introduction to VLSI Technology. Payman Zarkesh-Ha
ECE520 VLSI Design Lecture 1: Introduction to VLSI Technology Payman Zarkesh-Ha Office: ECE Bldg. 230B Office hours: Wednesday 2:00-3:00PM or by appointment E-mail: pzarkesh@unm.edu Slide: 1 Course Objectives
InfiniBand SDR, DDR, and QDR Technology Guide
White Paper InfiniBand SDR, DDR, and QDR Technology Guide The InfiniBand standard supports single, double, and quadruple data rate that enables an InfiniBand link to transmit more data. This paper discusses
Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware
Applying SIMD Approach to Whole Genome Comparison on Commodity Hardware Arpith Jacob 1, Marcin Paprzycki 2,3, Maria Ganzha 2,4, and Sugata Sanyal 5 1 Department of Computer Science and Engineering Vellore
Integrated Accelerator Architecture for DNA Sequences Alignment with Enhanced Traceback Phase
Integrated Accelerator Architecture for DNA Sequences Alignment with Enhanced Traceback Phase Nuno Sebastião Tiago Dias Nuno Roma Paulo Flores INESC-ID INESC-ID / IST INESC-ID INESC-ID IST-TU Lisbon ISEL-PI
Reconfigurable Architecture for Biological Sequence Comparison in Reduced Memory Space*
Reconfigurable Architecture for Biological Sequence Comparison in Reduced Memory Space* Azzedine Boukerche 1, Jan M. Correa 2, Alba Cristina M. A. de Melo 2, Ricardo P. Jacobi 2, Adson F. Rocha 3 1 SITE,
A Special-Purpose Processor for Gene Sequence Analysis. Barry Fagin* J. Gill Watt** Thayer School of Engineering
A Special-Purpose Processor for Gene Sequence Analysis Barry Fagin* (barry.fagin@dartmouth.edu) J. Gill Watt** Thayer School of Engineering Robert Gross (bob.gross@dartmouth.edu) Department of Biology
FastA & the chaining problem
FastA & the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem 1 Sources for this lecture: Lectures by Volker Heun, Daniel Huson and Knut Reinert,
Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems
Asynchronous Behavior Related Retiming in Gated-Clock GALS Systems Sam Farrokhi, Masoud Zamani, Hossein Pedram, Mehdi Sedighi Amirkabir University of Technology Department of Computer Eng. & IT E-mail:
FPGA Based Agrep for DNA Microarray Sequence Searching
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (20) (20) IACSIT Press, Singapore FPGA Based Agrep for DNA Microarray Sequence Searching Gabriel F. Villorente, 2 Mark
COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design
COPROCESSOR APPROACH TO ACCELERATING MULTIMEDIA APPLICATION [CLAUDIO BRUNELLI, JARI NURMI ] Processor Design Lecture Objectives Background Need for Accelerator Accelerators and different type of parallelism
Eastern Mediterranean University School of Computing and Technology CACHE MEMORY. Computer memory is organized into a hierarchy.
Eastern Mediterranean University School of Computing and Technology ITEC255 Computer Organization & Architecture CACHE MEMORY Introduction Computer memory is organized into a hierarchy. At the highest
FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:
FastA and the chaining problem, Gunnar Klau, December 1, 2005, 10:56 4001 4 FastA and the chaining problem We will discuss: Heuristics used by the FastA program for sequence alignment Chaining problem
When Girls Design CPUs!
When Girls Design CPUs! An overview on one of the world's most famous CPU cores: ARM 1 Once Upon a Time There was a company in UK Acorn This company was the competitor to IBM Apple They were creating personal
Single Pass, BLAST-like, Approximate String Matching on FPGAs*
Single Pass, BLAST-like, Approximate String Matching on FPGAs* Martin Herbordt Josh Model Yongfeng Gu Bharat Sukhwani Tom VanCourt Computer Architecture and Automated Design Laboratory Department of Electrical
Highly Scalable and Accurate Seeds for Subsequence Alignment
Highly Scalable and Accurate Seeds for Subsequence Alignment Abhijit Pol Tamer Kahveci Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL, USA, 32611
The T0 Vector Microprocessor. Talk Outline
Slides from presentation at the Hot Chips VII conference, 15 August 1995.. The T0 Vector Microprocessor Krste Asanovic James Beck Bertrand Irissou Brian E. D. Kingsbury Nelson Morgan John Wawrzynek University
A GPU Algorithm for Comparing Nucleotide Histograms
A GPU Algorithm for Comparing Nucleotide Histograms Adrienne Breland Harpreet Singh Omid Tutakhil Mike Needham Dickson Luong Grant Hennig Roger Hoang Torborn Loken Sergiu M. Dascalu Frederick C. Harris,
Sequence alignment theory and applications Session 3: BLAST algorithm
Sequence alignment theory and applications Session 3: BLAST algorithm Introduction to Bioinformatics online course : IBT Sonal Henson Learning Objectives Understand the principles of the BLAST algorithm
Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison
Speeding up Subset Seed Algorithm for Intensive Protein Sequence Comparison Van Hoa NGUYEN IRISA/INRIA Rennes Rennes, France Email: vhnguyen@irisa.fr Dominique LAVENIER CNRS/IRISA Rennes, France Email:
Computational Genomics and Molecular Biology, Fall
Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the
Software Implementation of Smith-Waterman Algorithm in FPGA
Software Implementation of Smith-Waterman Algorithm in FPGA NUR FARAH AIN SALIMAN, NUR DALILAH AHMAD SABRI, SYED ABDUL MUALIB AL JUNID, ZULKIFLI ABD MAJID, ABDUL KARIMI HALIM Faculty of Electrical Engineering Universiti Teknologi
Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators
Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators Shannon I. Steinfadt Doctoral Research Showcase III Room 17 A / B 4:00-4:15 International Conference for High Performance
CSE 675.02: Introduction to Computer Architecture
Computer Architecture 9/21/2005 CSE 675.02: Introduction to Computer Architecture Instructor: Roger Crawfis (based on slides from Gojko Babic A modern meaning of the term computer architecture covers three
CBMF W4761 Final Project Report
CBMF W4761 Final Project Report Christopher Fenton CBMF W4761 8/16/09 Introduction Performing large scale database searches of genomic data is one of the largest problems in computational genomics. When
Lecture 23. Finish-up buses Storage
Lecture 23 Finish-up buses Storage 1 Example Bus Problems, cont. 2) Assume the following system: A CPU and memory share a 32-bit bus running at 100MHz. The memory needs 50ns to access a 64-bit value from