Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators

1 Harnessing Associative Computing for Sequence Alignment with Parallel Accelerators
Shannon I. Steinfadt
Doctoral Research Showcase III, Room 17 A/B, 4:00-4:15
International Conference for High Performance Computing, Networking, Storage and Analysis (SC 08)
Advisor: Dr. Johnnie W. Baker
Parallel and Associative Computing Lab, Computer Science Department, Kent State University

2 Outline
- Introduction to Bioinformatics
- Local Sequence Alignment
- ASC
- About SWAMP
- SWAMP+
- ASC on Metal
- Dynamic Programming - Automatic Parallelization
- Conclusion
- Questions?
[Figure: example alignment of two DNA sequences]

3 What is Bioinformatics?
"Bioinformatics is the field of science in which biology, computer science, and information technology merge to form a single discipline." *
The ultimate goal:
- enable the discovery of new biological insights
- create a global perspective from which unifying principles in biology can be discerned
Human Xp DNA base pairs (180 of 5303 base pairs):
aattctaatt tctttccatg gagtttttca ttagatccag aaaaaagaag tcaatctctt tttacaaact actgccctaa agaatcatac tttaatccgt tggaggggta agactgcact gtgacatgac tatagaaagt agatttgtat cctagttcta ttatccatgt gtgtaaggca
* Definition from NCBI

4 Pairwise Local Sequence Alignment
- Search for regions of high similarity between two strings
- Similar characters, similar structure, similar function
- One of the most common fundamental tasks is local sequence alignment

5 Pairwise Local Sequence Alignment
- Search for regions of high similarity between two strings
- Preserved by evolution: similar characters, similar structure, similar function
- Derived by humans: homologous sequences, ancestral relationships, gene functionality, aid in drug discovery

6 Goals for Sequence Alignment
Provide:
- Accurate (use Smith-Waterman)
- Fast
- More detailed alignments
One of the most used operations in bioinformatics

7 Sequence Alignment Methods
Two possible approaches:
- Heuristics (approximations), e.g. BLAST, mpiBLAST, FastA: the more efficient the heuristic, the worse the quality of the results usually is
- Exact algorithms: JAligner, MPSRCH, Smith-Waterman
Parallel processing: get high-quality results in less time (using the Smith-Waterman algorithm)

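8 Sequence Alignment Methods: Speed vs. Quality (BLAST, FastA, Smith-Waterman)
[Figure: search speed (slower to faster) against data quality (lower to higher); BLAST is placed at the faster, lower-quality end, FastA in between, and Smith-Waterman at the slower, higher-quality end]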

9 Traceback in the Smith-Waterman Algorithm
1) Find the maximum computed value
Cost key: Match +10, Miss -3, Insert a Gap -3, Extend a Gap -1

10 Aligning using Smith-Waterman Algorithm
Compare all possible combinations of sequence characters against each other

11 Aligning using Smith-Waterman Algorithm
Compare all possible combinations of sequence characters against each other
Cost key: Match +10, Miss -3, Insert a Gap -3, Extend a Gap -1

15 Aligning using Smith-Waterman Algorithm
Compare all possible combinations - but it has dynamic programming data dependencies
Cost key: Match +10, Miss -3, Insert a Gap -3, Extend a Gap -1

23 Traceback in the Smith-Waterman Algorithm
1) Find the maximum computed value
2) Trace back until you reach a 0
Alignment:
CATTG
C--TG
Cost key: Match +10, Miss -3, Insert a Gap -3, Extend a Gap -1
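The fill and traceback steps above can be sketched in Python. This is a minimal illustration, not the SWAMP implementation: the function name `smith_waterman`, the second input sequence "CTG", and the reading of the cost key (the first position of a gap costs -3, each further position in the same gap costs -1) are assumptions made for the example.

```python
NEG_INF = float("-inf")


def smith_waterman(a, b, match=10, mismatch=-3, gap_open=-3, gap_extend=-1):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Assumed gap model: the first gapped position costs gap_open and every
    further position in the same gap costs gap_extend.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]        # best score ending at (i, j)
    E = [[NEG_INF] * cols for _ in range(rows)]  # alignment ends with a gap in a
    F = [[NEG_INF] * cols for _ in range(rows)]  # alignment ends with a gap in b
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Step 1: start at the maximum value. Step 2: trace back until a 0.
    i, j = best_pos
    state = "H"
    out_a, out_b = [], []
    while True:
        if state == "H":
            if H[i][j] == 0:
                break
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            opened_here = E[i][j] == H[i][j - 1] + gap_open
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
            if opened_here:
                state = "H"
        else:  # state == "F"
            opened_here = F[i][j] == H[i - 1][j] + gap_open
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
            if opened_here:
                state = "H"

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = smith_waterman("CATTG", "CTG")
print(score)
print(top)
print(bottom)
```

Under these assumptions the sketch recovers the alignment shown on the slide (CATTG over C--TG): three matches, one gap opening and one gap extension.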

24 Goals for Sequence Alignment
Provide:
- Accurate (use Smith-Waterman)
- Fast
- More detailed alignments
One of the most used operations in bioinformatics

25 Motivation for Faster Alignment
- Sequences are analyzed by comparison with database(s)
- Complexity of the comparisons is proportional to the product of query size and database size, i.e. your sequence size * the size of each sequence * the number of sequences
- 262 * 366 * 1,000,000 = 95,892,000,000 comparisons
- The number of base pairs in GenBank doubles roughly every 18 months
- 85,759,586,764 bases in 82,853,685 sequence records (2/08)
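The comparison count above is the product of three sizes. A small Python check of that arithmetic follows; the function name `total_comparisons` is hypothetical and only used for illustration.

```python
def total_comparisons(query_len, subject_len, num_subjects):
    """Character comparisons needed to align one query against every
    database sequence, assuming all database sequences have the same length."""
    return query_len * subject_len * num_subjects


comparisons = total_comparisons(262, 366, 1_000_000)
print(f"{comparisons:,} comparisons")  # prints "95,892,000,000 comparisons"
```

Because the count is linear in the number of database sequences, a database that doubles in size doubles the work for every query.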

26 Genomic Databases
[Figure: growth of the International Nucleotide Sequence Database Collaboration (INSDC); base pairs (in billions) contributed by EMBL, DDBJ, and GenBank]
Exponential growth of public sequence data means more to align with; the faster an alignment, the better.

27 Get It, Got It, Good (or Better)
Have these:
ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc
gggccctcct ggctcccaac agcttctcag ttcccacttc
Want this: [figure]
Use this: Associative SIMD Model - ASC

28 Get It, Got It, Good (or Better)
Have these:
ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc
gggccctcct ggctcccaac agcttctcag ttcccacttc
Want this: [figure]
Use this: ClearSpeed Advance 620 PCI-X board
- 50 GFLOPS peak performance
- 25W average power dissipation

29 Get It, Got It, Good (or Better)
Have these:
ctcgccgcgc ggcggacgct ccacgtgtcc cccgtctacc
gggccctcct ggctcccaac agcttctcag ttcccacttc
Want this: [figure]
Use this: NVIDIA C870 Tesla GPGPU
- 518 peak GFLOPS on Tesla
- 170W peak, 120W typical

30 ASC: Associative Architecture
- SIMD with special associative features
- Fine-grained parallelism
- Designed for fast associative searches
- Searches by content, not by memory address

31 ASC Advantages
- Quick data movement in SIMD
- Move raw data in parallel
- At each step, PEs follow the algorithmic steps for data movement in lock step
- No message passing like MPI/PVM
- No store/forward
- No headers
- No explicit synchronizing

32 ASC: Associative Architecture
Very fast operations for:
- Finding Maximum / Minimum
- Finding if there are Any Responders
- Pick One active PE
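What these three operations compute can be emulated sequentially in Python. This is only a behavioral sketch: on associative hardware they are fast parallel primitives, whereas the loops here are linear in the number of PEs. All function names (`associative_search`, `any_responders`, `pick_one`, `global_max`) are hypothetical, not part of any ASC language or library.

```python
def associative_search(pe_values, predicate):
    """Each PE tests its own value; the result is the responder mask."""
    return [predicate(value) for value in pe_values]


def any_responders(mask):
    """True if at least one PE responded to the last search."""
    return any(mask)


def pick_one(mask):
    """Index of one responding PE (here: the first), or None if there is none."""
    for index, responded in enumerate(mask):
        if responded:
            return index
    return None


def global_max(pe_values, mask):
    """Maximum value held by the active PEs, or None if no PE is active."""
    active = [value for value, is_active in zip(pe_values, mask) if is_active]
    return max(active) if active else None


# One value per PE, e.g. alignment scores held in PE-local memory.
scores = [0, 10, 7, 26, 14, 26]
all_active = [True] * len(scores)

best = global_max(scores, all_active)
responders = associative_search(scores, lambda value: value == best)
if any_responders(responders):
    print(best, pick_one(responders))
```

The same pattern, find the maximum, search for the PEs holding it, pick one, is what locating the starting cell of a Smith-Waterman traceback needs.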

33-34 Parallelizing the Smith-Waterman Algorithm
[Figure-only slides]

35-37 Parallelizing the Algorithm
[Figure-only slides; slide 37 is labeled with the sequence C A T T G]

38 SWAMP (Smith-Waterman using Associative Massive Parallelism)
[Figure: order of computations; legend: used PEs, unused PEs]
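The data dependencies allow one common parallelization order: every cell depends only on its upper, left, and upper-left neighbors, so all cells on the same anti-diagonal can be computed in the same step. The Python sketch below illustrates that wavefront order only. It is not the SWAMP code, it uses a single linear gap penalty of -3 to stay short (a simplification of the cost key), and the names `antidiagonals` and `wavefront_fill` are hypothetical.

```python
def antidiagonals(m, n):
    """Yield, per wavefront step, the 1-based cells (i, j) with i + j == d."""
    for d in range(2, m + n + 1):
        yield [(i, d - i) for i in range(max(1, d - n), min(m, d - 1) + 1)]


def wavefront_fill(a, b, match=10, mismatch=-3, gap=-3):
    """Fill the local-alignment score matrix one anti-diagonal at a time."""
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for cells in antidiagonals(m, n):
        # Every cell in this step reads only earlier anti-diagonals, so the
        # updates are independent and could run in lock step, one per PE.
        updates = [
            max(
                0,
                H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch),
                H[i - 1][j] + gap,
                H[i][j - 1] + gap,
            )
            for i, j in cells
        ]
        for (i, j), value in zip(cells, updates):
            H[i][j] = value
    return H


steps = list(antidiagonals(5, 3))
print(len(steps), max(len(cells) for cells in steps))
H = wavefront_fill("CATTG", "CTG")
```

For a 5 x 3 matrix the sketch takes 7 steps (m + n - 1) with at most 3 cells active at once (min(m, n)), which is why some PEs sit unused at the start and end of the computation.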

39 Goals for Sequence Alignment
Provide:
- Fast
- Accurate (use Smith-Waterman)
- More detailed alignments
Highest accuracy with higher information content would be better

40 SWAMP+
SWAMP+ returns multiple non-overlapping sequences
- Search and process with SWAMP multiple times
- Return top k non-overlapping, non-intersecting sequences
Reveal additional information
- Spatial information
- Length of comparisons
- Identify regulatory regions and motifs

41 ASC on Metal
ASC: SIMD with additional features
- Associative functions
- Associative search: search via content, not memory address
ClearSpeed SIMD Accelerator (64-bit FP)
- 50 GFLOPS peak performance
- 25W average power dissipation
NVIDIA Tesla GPGPU Stream Processor (32-bit FP)
- 518 peak GFLOPS on Tesla Series
- 170W peak, 120W typical

42 ASC on Metal
- Associative Functions
- NVIDIA Tesla GPGPU x 2

43 GPGPU Internal Organization
Multiple levels of parallelism
- Up to 512 threads per block, which communicate through shared memory
- Grids of thread blocks
SPMD computation model
- All data processed by the same program (kernel)
Source: "Scalable Parallel Programming with CUDA," from GPUs for Parallel Programming, Vol. 6, No. 2 - March/April 2008, by John Nickolls et al.

44 ASC to GPGPU Mapping

ASC                          | GPGPU
PE                           | Thread (local memory belongs solely to the PE / thread)
PE Interconnection Network   | Per-block Shared Memory
All PEs                      | Block (limited here to 512 separate threads per block)

Multiple ASC Model (MASC)    | GPGPU
Multiple Instruction Streams | Multiple Blocks
Multiple MASC programs       | Multiple Grids

45 Q & A
Contact Info: Shannon Steinfadt, ssteinfa@cs.kent.edu

46 References
CUDA Information
- J. Nickolls, I. Buck, M. Garland, K. Skadron, "Scalable Parallel Programming with CUDA," ACM Queue Magazine, March/April 2008.
Parallel Sequence Alignment - Others
- S. A. Manavski and G. Valle, "CUDA Compatible GPU Cards as Efficient Hardware Accelerators for Smith-Waterman Sequence Alignment," BMC Bioinformatics, March.
- M. Farrar, "Striped Smith-Waterman Speeds Database Searches Six Times over Other SIMD Implementations," Bioinformatics, Jan.
- T. Rognes and E. Seeberg, "Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors," Bioinformatics 16(8).
- W. Liu, B. Schmidt, G. Voss, A. Schröder, and W. Müller-Wittig, "Bio-Sequence Database Scanning on a GPU," Proc. 20th IEEE Int'l Parallel and Distributed Processing Symp., High Performance Computational Biology (HiCOMB) Workshop.


CS313 Exercise 4 Cover Page Fall 2017

CS313 Exercise 4 Cover Page Fall 2017 CS313 Exercise 4 Cover Page Fall 2017 Due by the start of class on Thursday, October 12, 2017. Name(s): In the TIME column, please estimate the time you spent on the parts of this exercise. Please try

More information

CISC 636 Computational Biology & Bioinformatics (Fall 2016)

CISC 636 Computational Biology & Bioinformatics (Fall 2016) CISC 636 Computational Biology & Bioinformatics (Fall 2016) Sequence pairwise alignment Score statistics: E-value and p-value Heuristic algorithms: BLAST and FASTA Database search: gene finding and annotations

More information

Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search

Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search Revisiting the Speed-versus-Sensitivity Tradeoff in Pairwise Sequence Search Ashwin M. Aji and Wu-chun Feng The Synergy Laboratory Department of Computer Science Virginia Tech {aaji,feng}@cs.vt.edu Abstract

More information

Bloom Filter Performance on Graphics Engines

Bloom Filter Performance on Graphics Engines Bloom Filter Performance on Graphics Engines Lin Ma 1, Roger D. Chamberlain 1,2, Jeremy D. Buhler 1, Mark A. Franklin 1,2 1 Department of Computer Science and Engineering Washington University in St. Louis

More information

A Scalable Coprocessor for Bioinformatic Sequence Alignments

A Scalable Coprocessor for Bioinformatic Sequence Alignments A Scalable Coprocessor for Bioinformatic Sequence Alignments Scott F. Smith Department of Electrical and Computer Engineering Boise State University Boise, ID, U.S.A. Abstract A hardware coprocessor for

More information

PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology

PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology Nucleic Acids Research, 2005, Vol. 33, Web Server issue W535 W539 doi:10.1093/nar/gki423 PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology Per Eystein

More information

B L A S T! BLAST: Basic local alignment search tool. Copyright notice. February 6, Pairwise alignment: key points. Outline of tonight s lecture

B L A S T! BLAST: Basic local alignment search tool. Copyright notice. February 6, Pairwise alignment: key points. Outline of tonight s lecture February 6, 2008 BLAST: Basic local alignment search tool B L A S T! Jonathan Pevsner, Ph.D. Introduction to Bioinformatics pevsner@jhmi.edu 4.633.0 Copyright notice Many of the images in this powerpoint

More information

Software CBESW: Sequence Alignment on the Playstation 3 Adrianto Wirawan*, Chee Keong Kwoh, Nim Tri Hieu and Bertil Schmidt

Software CBESW: Sequence Alignment on the Playstation 3 Adrianto Wirawan*, Chee Keong Kwoh, Nim Tri Hieu and Bertil Schmidt BMC Bioinformatics BioMed Central Software CBESW: Sequence Alignment on the Playstation 3 Adrianto Wirawan*, Chee Keong Kwoh, Nim Tri Hieu and Bertil Schmidt Open Access Address: School of Computer Engineering,

More information

Pairwise Sequence Alignment. Zhongming Zhao, PhD

Pairwise Sequence Alignment. Zhongming Zhao, PhD Pairwise Sequence Alignment Zhongming Zhao, PhD Email: zhongming.zhao@vanderbilt.edu http://bioinfo.mc.vanderbilt.edu/ Sequence Similarity match mismatch A T T A C G C G T A C C A T A T T A T G C G A T

More information

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav

CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CMPE655 - Multiple Processor Systems Fall 2015 Rochester Institute of Technology Contents What is GPGPU? What s the need? CUDA-Capable GPU Architecture

More information

Bioinformatics explained: BLAST. March 8, 2007

Bioinformatics explained: BLAST. March 8, 2007 Bioinformatics Explained Bioinformatics explained: BLAST March 8, 2007 CLC bio Gustav Wieds Vej 10 8000 Aarhus C Denmark Telephone: +45 70 22 55 09 Fax: +45 70 22 55 19 www.clcbio.com info@clcbio.com Bioinformatics

More information

REDUCING BEAMFORMING CALCULATION TIME WITH GPU ACCELERATED ALGORITHMS

REDUCING BEAMFORMING CALCULATION TIME WITH GPU ACCELERATED ALGORITHMS BeBeC-2014-08 REDUCING BEAMFORMING CALCULATION TIME WITH GPU ACCELERATED ALGORITHMS Steffen Schmidt GFaI ev Volmerstraße 3, 12489, Berlin, Germany ABSTRACT Beamforming algorithms make high demands on the

More information

Grid Computing for Bioinformatics: An Implementation of a User-Friendly Web Portal for ASTI's In Silico Laboratory

Grid Computing for Bioinformatics: An Implementation of a User-Friendly Web Portal for ASTI's In Silico Laboratory Grid Computing for Bioinformatics: An Implementation of a User-Friendly Web Portal for ASTI's In Silico Laboratory R. Babilonia, M. Rey, E. Aldea, U. Sarte gridapps@asti.dost.gov.ph Outline! Introduction:

More information

CUDA Architecture & Programming Model

CUDA Architecture & Programming Model CUDA Architecture & Programming Model Course on Multi-core Architectures & Programming Oliver Taubmann May 9, 2012 Outline Introduction Architecture Generation Fermi A Brief Look Back At Tesla What s New

More information

ECE 8823: GPU Architectures. Objectives

ECE 8823: GPU Architectures. Objectives ECE 8823: GPU Architectures Introduction 1 Objectives Distinguishing features of GPUs vs. CPUs Major drivers in the evolution of general purpose GPUs (GPGPUs) 2 1 Chapter 1 Chapter 2: 2.2, 2.3 Reading

More information

Sequence Alignment Using Graphics Processing Units. Dzivi PS

Sequence Alignment Using Graphics Processing Units. Dzivi PS Sequence Alignment Using Graphics Processing Units Dzivi PS This report is submitted as partial fulfilment of the requirements for the Honours Programme of the School of Computer Science and Software Engineering,

More information

Dynamic Programming & Smith-Waterman algorithm

Dynamic Programming & Smith-Waterman algorithm m m Seminar: Classical Papers in Bioinformatics May 3rd, 2010 m m 1 2 3 m m Introduction m Definition is a method of solving problems by breaking them down into simpler steps problem need to contain overlapping

More information

Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs

Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs Lifan Xu, Michela Taufer, Stuart Collins, Dionisios G. Vlachos Global Computing Lab University of Delaware Multiscale Modeling:

More information

Bio-Sequence Analysis with Cradle s 3SoC Software Scalable System on Chip

Bio-Sequence Analysis with Cradle s 3SoC Software Scalable System on Chip 2004 ACM Symposium on Applied Computing Bio-Sequence Analysis with Cradle s 3SoC Software Scalable System on Chip Xiandong Meng Department of Electrical and Computer Engineering Wayne State University

More information

Massively Parallel Architectures

Massively Parallel Architectures Massively Parallel Architectures A Take on Cell Processor and GPU programming Joel Falcou - LRI joel.falcou@lri.fr Bat. 490 - Bureau 104 20 janvier 2009 Motivation The CELL processor Harder,Better,Faster,Stronger

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2019 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

Lecture 5 Advanced BLAST

Lecture 5 Advanced BLAST Introduction to Bioinformatics for Medical Research Gideon Greenspan gdg@cs.technion.ac.il Lecture 5 Advanced BLAST BLAST Recap Sequence Alignment Complexity and indexing BLASTN and BLASTP Basic parameters

More information

Hardware Acceleration of Sequence Alignment Algorithms An Overview

Hardware Acceleration of Sequence Alignment Algorithms An Overview Hardware Acceleration of Sequence Alignment Algorithms An Overview Laiq Hasan Zaid Al-Ars Stamatis Vassiliadis Delft University of Technology Computer Engineering Laboratory Mekelweg 4, 2628 CD Delft,

More information

Similarity searches in biological sequence databases

Similarity searches in biological sequence databases Similarity searches in biological sequence databases Volker Flegel september 2004 Page 1 Outline Keyword search in databases General concept Examples SRS Entrez Expasy Similarity searches in databases

More information

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono Introduction to CUDA Algoritmi e Calcolo Parallelo References This set of slides is mainly based on: CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory Slide of Applied

More information

Tutorial 4 BLAST Searching the CHO Genome

Tutorial 4 BLAST Searching the CHO Genome Tutorial 4 BLAST Searching the CHO Genome Accessing the CHO Genome BLAST Tool The CHO BLAST server can be accessed by clicking on the BLAST button on the home page or by selecting BLAST from the menu bar

More information

Efficient Lists Intersection by CPU- GPU Cooperative Computing

Efficient Lists Intersection by CPU- GPU Cooperative Computing Efficient Lists Intersection by CPU- GPU Cooperative Computing Di Wu, Fan Zhang, Naiyong Ao, Gang Wang, Xiaoguang Liu, Jing Liu Nankai-Baidu Joint Lab, Nankai University Outline Introduction Cooperative

More information

BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS

BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS BIOINFORMATICS A PRACTICAL GUIDE TO THE ANALYSIS OF GENES AND PROTEINS EDITED BY Genome Technology Branch National Human Genome Research Institute National Institutes of Health Bethesda, Maryland B. F.

More information

Numerical Simulation on the GPU

Numerical Simulation on the GPU Numerical Simulation on the GPU Roadmap Part 1: GPU architecture and programming concepts Part 2: An introduction to GPU programming using CUDA Part 3: Numerical simulation techniques (grid and particle

More information

INTRODUCTION TO BIOINFORMATICS

INTRODUCTION TO BIOINFORMATICS Molecular Biology-2017 1 INTRODUCTION TO BIOINFORMATICS In this section, we want to provide a simple introduction to using the web site of the National Center for Biotechnology Information NCBI) to obtain

More information

Alignment and clustering tools for sequence analysis. Omar Abudayyeh Presentation December 9, 2015

Alignment and clustering tools for sequence analysis. Omar Abudayyeh Presentation December 9, 2015 Alignment and clustering tools for sequence analysis Omar Abudayyeh 18.337 Presentation December 9, 2015 Introduction Sequence comparison is critical for inferring biological relationships within large

More information

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics

More information

Research Article Improving the Mapping of Smith-Waterman Sequence Database Searches onto CUDA-Enabled GPUs

Research Article Improving the Mapping of Smith-Waterman Sequence Database Searches onto CUDA-Enabled GPUs BioMed Research International Volume 2015, Article ID 185179, 10 pages http://dx.doi.org/10.1155/2015/185179 Research Article Improving the Mapping of Smith-Waterman Sequence Database Searches onto CUDA-Enabled

More information

BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha

BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio CS 466 Saurabh Sinha BLAST: Basic Local Alignment Search Tool Altschul et al. J. Mol Bio. 1990. CS 466 Saurabh Sinha Motivation Sequence homology to a known protein suggest function of newly sequenced protein Bioinformatics

More information