Practical and Optimal String Matching
1 Practical and Optimal String Matching. Kimmo Fredriksson, Department of Computer Science, University of Joensuu, Finland; Szymon Grabowski, Technical University of Łódź, Computer Engineering Department. SPIRE 05 p1/25
2 Problem Setting. The classic string matching problem: given a text $T$ of length $n$ and a pattern $P$ of length $m$ over some finite alphabet $\Sigma$ of size $\sigma$, find the occurrences of $P$ in $T$. We focus on the case where $m$ is relatively small. Approach: bit-parallelism.
3 Previous work. A vast number of algorithms exist. Some of the most well-known (classics) are: Knuth-Morris-Pratt, the first $O(n)$ worst case time algorithm; the Boyer-Moore(-Horspool) family, with numerous variants, sublinear on average. Bit-parallel algorithms: Shift-or (Baeza-Yates & Gonnet, 1992), $O(n)$ for $m = O(w)$, where $w$ is the machine word length in bits; the BNDM family, $O(n \log_\sigma(m)/m)$ on average for $m = O(w)$: SBNDM (Navarro, 2001; Peltola & Tarhio, 2003), LNDM (He & Fang, 2004), FNDM (Holub & Durian, 2005).
4 Previous work. In practice, the best current algorithms for short patterns are the BNDM family of algorithms (Navarro & Raffinot, 2000).
5 This work. We develop a novel pattern partitioning technique that allows us to use shift-or while skipping text characters. The algorithm has optimal $O(n \log_\sigma(m)/m)$ average case running time if $m = O(w)$. It is very simple to implement, with a simple inner loop (comparable to plain shift-or), and very efficient in practice. It takes $O(nm)$ time in the worst case, but this can be improved to $O(n)$ without destroying the simplicity of the search algorithm.
6 Our algorithm: the idea. The algorithm is based on the preprocessing / filtering / verification paradigm. The preprocessing phase generates $q$ different alignments of the pattern, each containing only every $q$-th pattern character, i.e. we partition the pattern into $q$ pieces. The filtering phase searches all the pieces in parallel using the shift-or algorithm, reading only every $q$-th text character. If any of the pieces match, then we invoke a verification algorithm.
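7 Preprocessing. Given a pattern $P$, generate a set $\mathcal{P} = \{P^0, \ldots, P^{q-1}\}$ of $q$ patterns as follows: $P^j[i] = P[j + iq]$ for $j = 0, \ldots, q-1$ and $i = 0, \ldots, \lfloor m/q \rfloor - 1$. I.e. we generate $q$ different alignments of the original pattern, each alignment containing only every $q$-th character. Each new pattern has length $\lfloor m/q \rfloor$. The total length of the patterns is $q \lfloor m/q \rfloor \le m$. For example, if $P = abcdef$ and $q = 3$, then $P^0 = ad$, $P^1 = be$, and $P^2 = cf$.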
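The partitioning step can be sketched in a few lines. This is only an illustration in Python (the authors report a C implementation); the function name `partition` and the 0-based indexing are assumptions of this sketch, not taken from the slides.

```python
def partition(pattern: str, q: int) -> list[str]:
    """Split `pattern` into q pieces; piece j holds pattern[j], pattern[j+q], ...

    Every piece is truncated to floor(m/q) characters so that all pieces
    have the same length (hypothetical helper, for illustration only).
    """
    if not 1 <= q <= len(pattern):
        raise ValueError("q must satisfy 1 <= q <= len(pattern)")
    piece_len = len(pattern) // q  # floor(m / q)
    return [pattern[j::q][:piece_len] for j in range(q)]


print(partition("abcdef", 3))  # ['ad', 'be', 'cf']
```

When q does not divide the pattern length, the trailing characters are left out of every piece; they are only checked later, when a candidate is verified against the full pattern.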
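8 Preprocessing: the rationale. Assume that $P$ occurs at $T[i..i+m-1]$. Then each piece $P^j$ matches the text characters $T[i+j], T[i+j+q], T[i+j+2q], \ldots$, and for the $j$ with $(i+j) \bmod q = 0$ these are exactly positions visited when reading every $q$-th text character. Hence (1) we can use the set $\mathcal{P}$ as a filter for the pattern $P$, and (2) the filter needs to scan only every $q$-th character of $T$.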
9 Preprocessing: the rationale (figure). The text $T = xxabcdefxxx$ with the occurrence of $P = abcdef$ marked at position $i$; the pieces $P^0 = ad$, $P^1 = be$, $P^2 = cf$; and their concatenation $adbecf$.
10 Prelude to filtering: Shift-or algorithm. The algorithm is based on a non-deterministic automaton. The automaton for $P = abcdef$ is a chain of states with transitions labelled a, b, c, d, e, f and a $\Sigma$ self-loop on the initial state (figure). The transitions are encoded in a table $B$ of bit-masks: for $c \in \Sigma$, the mask $B[c]$ has the $i$-th bit set to 0 iff $P[i] = c$. The bit-vector $D$ has one bit per state in the automaton; the $i$-th bit of the vector is set to 0 iff the state $i$ is active (initially all bits are 1). It can be shown that the automaton can be simulated as: $D \leftarrow (D \ll 1) \,|\, B[T[i]]$.
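11 Prelude to filtering: Shift-or algorithm. If, after the simulation step, the $m$-th bit of $D$ (the bit of the final state) is zero, then $P$ occurs at $T[i-m+1..i]$. This can be detected as $(D \,\&\, mm) \neq mm$, where $mm$ has only the $m$-th bit set. Clearly each step of the automaton is simulated in $O(\lceil m/w \rceil)$ time, which leads to $O(n \lceil m/w \rceil)$ total time.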
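A minimal Python sketch of plain shift-or may help make the mask table and the update step concrete. It is an illustration under stated assumptions: bits are numbered from 0 here (so the final state is bit m-1), Python integers stand in for a machine word and are masked to m bits, and the names `shift_or`, `masks` and `state` are this sketch's own.

```python
def shift_or(text: str, pattern: str) -> list[int]:
    """Return the start positions of all occurrences of `pattern` in `text`."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    all_ones = (1 << m) - 1
    # masks[c] has bit i cleared iff pattern[i] == c (the table B of the slides).
    masks: dict[str, int] = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)
    mm = 1 << (m - 1)      # only the bit of the final state is set
    state = all_ones       # D: a 0 bit marks an active state
    found = []
    for i, c in enumerate(text):
        # One automaton step: shift, then OR in the mask of the text character.
        state = ((state << 1) | masks.get(c, all_ones)) & all_ones
        if (state & mm) != mm:
            found.append(i - m + 1)
    return found


print(shift_or("xxabcdefxxx", "abcdef"))  # [2]
```

Characters that do not occur in the pattern get the all-ones mask, which deactivates every state in one step.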
12 Filtering. The whole set of patterns $\mathcal{P}$ can be searched simultaneously using the Shift-or algorithm (Baeza-Yates & Gonnet, 1992). All the patterns are preprocessed together, as if they were concatenated: for $\mathcal{P} = \{ad, be, cf\}$, we effectively preprocess the pattern $adbecf$. If the pattern $P^j$ matches, then the $((j+1)\lfloor m/q \rfloor)$-th bit in $D$ is zero. This can be detected as $(D \,\&\, mm) \neq mm$, where $mm$ has every $\lfloor m/q \rfloor$-th bit set to 1.
13 Filtering: the simplicity illustrated
Plain shift-or search:
    i ← 0
    while i < n do
        D ← (D << 1) | B[T[i]]
        if (D & mm) ≠ mm then report match
        i ← i + 1
Our shift-or search:
    i ← 0
    while i < n do
        D ← ((D & ~mm) << 1) | B[T[i]]
        if (D & mm) ≠ mm then Verify
        i ← i + q
14 Verification. If any of the pattern pieces in $\mathcal{P}$ match, we verify whether the original pattern matches (with the corresponding alignment). This can be done by a brute force algorithm, with $O(m)$ worst case cost.
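Putting preprocessing, filtering and verification together, the whole scheme might look roughly as follows in Python. This is a sketch, not the authors' code: the name `aoso_search`, the 0-based bit numbering, and the step that clears each piece's last bit before the shift (so that it cannot carry into the next piece of the concatenated state) are assumptions of this illustration. Verification is plain brute force via `str.startswith`.

```python
def aoso_search(text: str, pattern: str, q: int) -> list[int]:
    """Shift-or filter over every q-th text character, then brute-force verification."""
    m = len(pattern)
    if not 1 <= q <= m:
        raise ValueError("q must satisfy 1 <= q <= len(pattern)")
    k = m // q  # length of each piece, floor(m/q)
    # Preprocess the q pieces as if they were one concatenated pattern.
    concat = "".join(pattern[j::q][:k] for j in range(q))
    all_ones = (1 << (q * k)) - 1
    masks: dict[str, int] = {}
    for i, c in enumerate(concat):
        masks[c] = masks.get(c, all_ones) & ~(1 << i)
    # mm has the last bit of every piece set.
    mm = sum(1 << ((j + 1) * k - 1) for j in range(q))
    state = all_ones
    found = []
    for i in range(0, len(text), q):  # read only every q-th text character
        # Clear piece-end bits so the shift starts every piece afresh.
        state = (((state & ~mm) << 1) | masks.get(text[i], all_ones)) & all_ones
        if (state & mm) != mm:
            for j in range(q):
                if ((state >> ((j + 1) * k - 1)) & 1) == 0:
                    # Piece j ends at text[i], so the pattern would start here.
                    start = i - (k - 1) * q - j
                    if start >= 0 and text.startswith(pattern, start):
                        found.append(start)
    return sorted(found)


print(aoso_search("xxabcdefxxx", "abcdef", 3))  # [2]
```

With q = 1 the loop degenerates to plain shift-or followed by a redundant check; larger q reads fewer text characters but makes each piece shorter, so the filter fires more often.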
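15 Complexity. The filtering time is $O(n/q)$. Assuming that each character occurs with probability $1/\sigma$, the probability that $P^j$ occurs in a given text position is $1/\sigma^{\lfloor m/q \rfloor}$. The verification cost is on average at most $O(nm/\sigma^{\lfloor m/q \rfloor})$. We select $q$ so that $nm/\sigma^{\lfloor m/q \rfloor} = O(n/q)$, i.e. $q = \Theta(m/\log_\sigma m)$. Total average time is $O(n \log_\sigma(m)/m)$, which is optimal.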
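The trade-off behind the choice of the step can be illustrated numerically. The sketch below assumes a uniform random text, writes q for the step and sigma for the alphabet size, and estimates the expected verification work per sampled text character as q * m / sigma^floor(m/q) (q pieces, each matching with probability sigma^-floor(m/q), each candidate costing up to m comparisons). The function `choose_q` and its threshold of 1 are a rule of thumb derived from that estimate, not the authors' tuning procedure.

```python
def choose_q(m: int, sigma: int) -> int:
    """Largest step q whose expected verification work per sampled character is at most 1."""
    best = 1
    for q in range(1, m + 1):
        piece_len = m // q  # floor(m/q)
        # q * m / sigma**piece_len <= 1, written without floating point
        if q * m <= sigma ** piece_len:
            best = q
    return best


print(choose_q(32, 4))  # 8
```

For a pattern of length 32 over a 4-letter alphabet this gives a step of 8: pieces of length 4 match by chance with probability 1/256, which just balances the 8 * 32 units of potential verification work.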
16 Long patterns. If $m > w$, we must use several computer words, which increases the asymptotic running time. The trick in (Peltola & Tarhio, 2003) to make BNDM work with $m > w$ can be applied to our algorithm too. Omitting the details, the resulting average time is not optimal anymore.
17 Linear worst case time. The worst case running time is $O(nm)$. Use any $O(n)$ worst case time algorithm for the verifications, and do the verifications incrementally, saving the search state of the worst case algorithm after each verification. This is a standard trick; the worst case becomes $O(n)$. Not a real problem: if verification time is a problem, then the filter does not work well, and one can use the linear time algorithm instead.
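One way to picture incremental verification is with a Knuth-Morris-Pratt scanner that remembers how far it has read. The slides do not say which linear-time algorithm is used, so KMP is an assumption here, as are the class name `IncrementalVerifier` and the requirement that candidates arrive in non-decreasing order of start position. Because the scan position never moves backwards, all verifications together read each text character at most once.

```python
class IncrementalVerifier:
    """Verify candidate positions with a KMP scan whose state is kept between calls."""

    def __init__(self, text: str, pattern: str):
        self.text, self.pattern = text, pattern
        # fail[i]: length of the longest proper border of pattern[:i+1]
        self.fail = [0] * len(pattern)
        k = 0
        for i in range(1, len(pattern)):
            while k and pattern[i] != pattern[k]:
                k = self.fail[k - 1]
            if pattern[i] == pattern[k]:
                k += 1
            self.fail[i] = k
        self.pos = 0    # next text index the scan will read
        self.state = 0  # length of the pattern prefix currently matched

    def occurs_at(self, start: int) -> bool:
        m = len(self.pattern)
        end = start + m
        if start < 0 or end > len(self.text):
            return False
        if end < self.pos:
            raise ValueError("candidates must be checked in non-decreasing order")
        if start >= self.pos:
            # Nothing saved covers this candidate: restart the scan here.
            self.pos, self.state = start, 0
        while self.pos < end:
            c = self.text[self.pos]
            if self.state == m:
                self.state = self.fail[m - 1]
            while self.state and c != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if c == self.pattern[self.state]:
                self.state += 1
            self.pos += 1
        return self.state == m


verifier = IncrementalVerifier("abababa", "aba")
print([s for s in range(5) if verifier.occurs_at(s)])  # [0, 2, 4]
```

In a filter-based search, the brute-force check of each candidate would be replaced by a call to `occurs_at`, with candidates produced in increasing order.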
18 Implementation. In modern pipelined CPUs branching is costly. Unroll the code $U$ times (i.e. repeat it inline $U$ times). The bit positions indicating the occurrences will then overflow, so reserve extra bits per pattern to avoid interference. Verification is done only every $U$-th step, for those alignments that could match. Much faster in practice.
19 Experimental results. Implementation in C, compiled using icc 8.1 with full optimizations, run on a 2.4GHz Pentium 4 with 512MB RAM, running Linux. The patterns were randomly extracted from the text. Each pattern was then searched for separately. We report the average speed in megabytes per second. Our data: real DNA and protein data, English natural language and random ASCII text.
20 Experimental results. We compared against: BNDM (Navarro & Raffinot, 2000), competitive only for random ASCII; SBNDM, a simplified version of BNDM (Peltola & Tarhio, 2003), competitive only for random ASCII; BMH and BMHS, Boyer-Moore-Horspool and the Sunday variant of BMH, not competitive on any data (results omitted). Our algorithms: AOSO, our basic algorithm; FAOSO, the same with loop unrolling.
21 Experiments: DNA (figure comparing AOSO, FAOSO, BNDM and SBNDM).
22 Experiments: proteins (figure comparing AOSO, FAOSO, BNDM and SBNDM).
23 Experiments: natural language (figure comparing AOSO, FAOSO, BNDM and SBNDM).
24 Experiments: random ASCII (figure comparing AOSO, FAOSO, BNDM and SBNDM).
25 Conclusions. Very simple to implement. Very efficient in practice. Optimal for short patterns ($m = O(w)$). The techniques can be adapted for several other algorithms as well, e.g. Shift-add (for Hamming distance). Any algorithm for multiple string matching can be used in place of Shift-or.