A Performance Study of Hashing Functions for Hardware Applications

M. V. Ramakrishna, E. Fu and E. Bahcekapili
Department of Computer Science
Michigan State University
East Lansing, MI
{rama, fue,

Abstract

Hashing is used extensively in hardware applications such as page tables for address translation. There is not much literature in this regard, although hashing has been extensively studied for file organization. More specifically, there is no study of the practical performance of the hashing functions used. In the literature we find bit extraction and exclusive ORing hashing functions used, but there is no mention of the performance of these functions. Moreover, the performance of the hashing functions in relation to the theoretical performance of hashing schemes is not addressed. In this paper we study the practical performance of a particular class of hashing functions. Our results show that by choosing functions randomly from this class of hashing functions, which can be readily implemented in hardware, we can achieve analytically predicted performance of hashing schemes with real life data.

Proc. ICCI 94, Int. Conf. on Computing and Information

1 Introduction

Hashing is a widely used technique for organizing tables, and it also finds several applications in hardware. For example, hash tables are used to implement page tables in many modern architectures [1] such as the IBM System/38 [2], Monads II [3], etc. Thakkar and Knowles proposed a method of address translation using parallel hashing hardware [4]. Although hashing is widely used in hardware, there is not much literature in this regard; in particular, we have not been able to find any paper dealing with the performance of hashing functions suitable for hardware implementation. In all the hardware applications of hashing we have encountered, the hash address is obtained by simple bit extraction or exclusive ORing of bit segments. There is no report of the performance of any of these hashing functions. This is hardly surprising, since even for software applications there is not much literature on the performance of hashing functions [5], [6].

In this paper, we study the performance of a particular class of hashing functions which can be readily implemented in hardware. This class of hashing functions was shown to be universal$_2$ by Carter and Wegman [7]. We show that choosing functions at random from this class gives exactly the theoretically predicted performance of hashing. We provide a comparison of search lengths for two different hashing schemes. We assume that the reader is familiar with the basics of hashing and the related terminology [5], [8].

Knuth expressed his fears about using hashing [8, p. 540] by concluding his chapter on hashing with: "Finally, we need a great deal of faith in probability theory when we use hashing methods, since they are efficient only on the average, while their worst case is terrible! As in the case of random number generators, we are never completely sure that a hash function will perform properly when it is applied to a new set of data.
Therefore scatter storage would be inappropriate for certain real-time applications such as air traffic control, where people's lives are at stake".

Later, in 1981, Gonnet showed that such fears were baseless, since the probability of the worst case is ridiculously small [9]. He proposed that the expected length of the longest probe sequence (rather than the worst possible length of the longest probe sequence) should be the measure of the "worst case" of hashing. The length of the longest probe sequence, abbreviated as llps, and the expected length of the longest probe sequence, E(llps), can be explained as follows. One key, amongst all the keys hashed, has the maximum search length. This search length is the llps. The average value of llps over a large number of different hash tables (all tables with the same parameters) is the expected llps, E(llps). This E(llps) is much smaller than the O(n) worst case. Our results on the expected length of the longest probe sequence (and on its narrow distribution), which agree closely with the analytical performance measures of Gonnet [9] and Larson [10], are the most significant, since they show that the "worst case" fears about hashing are baseless.

The rest of the paper is organized as follows. In the next section we provide the background of hashing in hardware. Section 3 introduces the class of hashing functions $H_3$ and presents the results of the performance study. The last section provides conclusions and a discussion of future work.

2 Background

We give an overview of the hardware applications of hashing. Irrelevant details of the hardware are omitted in many cases for brevity.

Braidt and Taylor describe a memory system which has two independently addressable cards [11]. There are circuits which enable refresh of one card while normal memory access is made to the other card. A simple hash circuit is used to distribute memory addresses randomly between the two memory cards. Benhase describes the use of hash circuits and a directory for data access from a storage hierarchy [12]. A similar "hash and chain" technique, implemented in hardware to address a cache on direct-access storage devices, was used by Robinson and Taylor [13].

McKenney used a "dictionary hash" technique to determine quickly and stochastically the number of unique labels in a network when presented with a large group of labeled events [14]. The label consists of the source and the destination addresses. At the beginning of each measurement period, a global sequence number is incremented. When a new label arrives, this sequence number is inserted into the corresponding hash location. If a label is repeated, there will be a match between the number retrieved from the hash address and the global sequence number. The hardware implementation of the scheme shown in that paper uses three distinct hardware hash units. The paper does not give details of the hashing functions, only that "the possible candidates for the hashing functions include CRCs, checksums ...". Nor does the paper discuss how to satisfy the requirement that those hash functions produce "statistically independent" hash addresses. The hashing functions we study in the next section readily achieve this exact requirement.

Several papers discuss address translation hardware for virtual memory implementation. Houdek and Mitchell discuss the address translation scheme used by the IBM System/38 [2]. The virtual page addresses are hashed into a page directory table. When a match is found, the corresponding physical page number is obtained. Cocke and Worley described the hashing technique used: bit selection (which involves selecting particular bits) from the virtual page address [15]. No details of the performance of the hashing functions are provided (nor do Ramamohanarao and Sacks-Davis mention any particular hashing function [16]).
Thakkar and Knowles propose a parallel hashing hardware scheme for address translation [4]. Parallel hardware is used to search a whole bucket in parallel. They suggested hash-bit extraction for the hashing function and quadratic probing for collision resolution. No study of the performance of the hashing functions or the probing functions is mentioned. Bryant proposed an extendible hashing scheme for line-oriented paging stores [17]. Chaining was used for collision resolution. The paper does not discuss the hashing function used or its performance. Ida and Goto studied the performance of the parallel hashing address translation scheme with key deletion [18]. They used uniform hashing to resolve collisions. They also did not discuss the performance of any particular hashing function.

In the next section we introduce a class of hashing functions, called $H_3$ by Carter and Wegman [7]. We show that by choosing functions randomly from this class of hashing functions, the theoretically predicted performance of hashing schemes can be achieved in practice on real life data sets. These hashing functions can be used with any hashing scheme and can be readily implemented in hardware.

3 Performance of a class of hashing functions

Let $A = \{0, 1, 2, \ldots, a-1\}$ be the key space and $B = \{0, 1, \ldots, m-1\}$ the address space. Let $I$ be the given key set, $I = \{x_1, x_2, \ldots, x_n\}$, $I \subseteq A$. There are a total of $m^n$ possible functions from $I$ to $B$. The usual assumption in almost all papers analyzing the performance of hashing schemes corresponds to the expected performance over all the $m^n$ possible mapping (hashing) functions. (The assumption is usually stated as follows: the probability of a key hashing to a particular location is $1/m$, independent of the outcome for the other keys.) Our hypothesis is that by choosing functions at random from the class $H_3$, the analytically predicted performance can be achieved in practice on real life files.
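
For reference, the uniform assumption stated above can be written out explicitly. The following is only a sketch in the paper's notation ($n$ keys, $m$ addresses); the symbols $h$ (a mapping drawn uniformly from the $m^n$ possible ones) and $X_s$ (the number of keys that land on address $s$) are introduced here for illustration and do not appear in the original text.

```latex
\Pr\bigl[h(x_1) = s_1, \ldots, h(x_n) = s_n\bigr] = \frac{1}{m^n}
\quad \text{for every } (s_1, \ldots, s_n) \in B^n,
\qquad
\Pr[X_s = k] = \binom{n}{k} \left(\frac{1}{m}\right)^{k} \left(1 - \frac{1}{m}\right)^{n-k},
\quad 0 \le k \le n.
```

The second expression follows from the first because each key independently hits address $s$ with probability $1/m$, so the occupancy of any single address is binomially distributed.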

The class of functions $H_3$

We redefine $A$ and $B$ so that their cardinalities are powers of 2. Let

$$A = \{0, 1, \ldots, 2^i - 1\} \quad \text{and} \quad B = \{0, 1, \ldots, 2^j - 1\}.$$

Here $i$ is the number of bits in the key and $j$ is the number of bits in the address. The class $H_3$ is defined as follows. Let $Q$ denote the set of all $i \times j$ boolean matrices. For a given $q \in Q$ and $x \in A$, let $q(k)$ be the $k$th row of the matrix $q$ and $x_k$ the $k$th bit of $x$. The hashing function $h_q : A \to B$ is defined as

$$h_q(x) = x_1 \cdot q(1) \oplus x_2 \cdot q(2) \oplus \cdots \oplus x_i \cdot q(i),$$

where $\cdot$ denotes the binary AND operation and $\oplus$ the exclusive OR operation. The class $H_3$ is the set $\{h_q \mid q \in Q\}$. The following example illustrates the hashing functions and hash address calculations.

Example: Let $i$ be 8 and $j$ be 3. Then the key space is $A = \{0, \ldots, 255\}$ and the address space is $B = \{0, \ldots, 7\}$. We randomly choose an $8 \times 3$ matrix $q$. Then the hash addresses for keys 53 and 100 are

$$h_q(53) = h_q(00110101) = q(3) \oplus q(4) \oplus q(6) \oplus q(8) = 110 = 6 \text{ (decimal)},$$

$$h_q(100) = h_q(01100100) = q(2) \oplus q(3) \oplus q(6) = 010 = 2 \text{ (decimal)}.$$

This class of hashing functions is universal$_2$ [7]. A class $H$ of hashing functions is said to be universal$_2$ if no pair of distinct keys collides under more than $|H|/m$ of the functions in the class. Here $|H|$ is the number of hashing functions in $H$ and $m$ is the size of the address space.

Hashing functions from this class can be easily implemented in hardware. Figure 1 shows a circuit implementation. When presented with the key $x_1 x_2 x_3$, the circuit outputs the hash address $a_1 a_2$. The matrix $q$ can be generated in software and then loaded into the bank of registers. The circuit is self-explanatory and we will not elaborate further.

Figure 1. Hash address generator: key = $x_1 x_2 x_3$, hash matrix elements = $q_{ij}$, hash address = $a_1 a_2$
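
The definition of $h_q$ can also be mirrored in a few lines of software. The following Python sketch is an illustration only, not the authors' implementation: the function names are hypothetical, each row $q(k)$ is assumed to be packed into a $j$-bit integer, and $x_1$ is assumed to be the most significant key bit (which is consistent with the rows selected for keys 53 and 100 in the example). For tiny parameters it also enumerates every matrix in $Q$ to count how often each pair of distinct keys collides.

```python
import itertools


def selected_rows(x, key_bits):
    """Return the 1-based indices k with x_k = 1, taking x_1 as the most significant bit."""
    return [k for k in range(1, key_bits + 1) if (x >> (key_bits - k)) & 1]


def h3_hash(x, rows, key_bits):
    """Compute h_q(x): the XOR of the rows q(k) whose key bit x_k is 1.

    rows[k - 1] holds row q(k) packed into an integer of j bits (an assumed encoding).
    """
    address = 0
    for k in selected_rows(x, key_bits):
        address ^= rows[k - 1]
    return address


def collision_counts(key_bits, addr_bits):
    """For every pair of distinct keys, count the matrices q under which the pair collides.

    Enumerates all 2**(key_bits * addr_bits) matrices, so only tiny sizes are practical.
    """
    matrices = list(itertools.product(range(1 << addr_bits), repeat=key_bits))
    counts = set()
    for x, y in itertools.combinations(range(1 << key_bits), 2):
        counts.add(sum(h3_hash(x, q, key_bits) == h3_hash(y, q, key_bits) for q in matrices))
    return counts


print(selected_rows(53, 8))    # → [3, 4, 6, 8]
print(selected_rows(100, 8))   # → [2, 3, 6]
print(collision_counts(3, 2))  # → {16}
```

With $i = 3$ and $j = 2$ there are 64 matrices and $m = 4$, so every pair of distinct keys collides under exactly $16 = |H|/m$ functions, which meets the universal$_2$ bound with equality.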

Experiments

Our hypothesis is that by choosing hashing functions at random from the class $H_3$, the analytical performance of hashing schemes can be achieved in practice on real life files. A lengthy justification could be given for this, but we feel it is outside the scope of this paper [5], [7]. In order to verify this hypothesis, we conducted a series of experiments on real life data sets. Obviously, we do not need to build any hardware to experiment. We used the uniform hashing and separate chaining collision resolution schemes. These are typical and simple schemes. The implication is that if certain functions perform according to analytical predictions on these schemes, they will do so for any other hashing scheme.

Each set of experiments was performed as follows. The hash table size, bucket size and load factor are fixed. The number of keys corresponding to the load factor is selected from the test data, which consists of 32-bit integers. A hashing function is generated by generating the requisite number of rows of random numbers. All the keys are then hashed using the hashing function, and the search lengths are computed. The same is repeated for 500 different hashing functions.

Tables 1 and 2 list the average successful and unsuccessful search lengths for separate chaining. The table length was 1024 buckets. The keys were from a file of user-ids from a computer system. We see that the experimental and analytical results agree very closely (the analytical results are from [5], [8] and [10]). The agreement is so close that we do not think statistical tests are necessary. Similar results were obtained for other test files.

In each experiment, one of the keys has the longest search length. The corresponding value is noted. The average value of the length of the longest probe sequence over all the experiments is listed in Table 3. We see that here again the experimental values agree closely with the analytically predicted values.
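
The experimental procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a reproduction of the paper's setup: the keys are synthetic random 32-bit integers rather than the user-id file, the bucket size is 1 (each key in a chain costs one probe), the table has 256 addresses, and 100 random functions are drawn instead of 500. All function names are hypothetical. The comparison value $1 + (n-1)/(2m)$ is the standard textbook expectation for a successful search in separate chaining with bucket size 1; it is not taken from the paper's tables.

```python
import random


def h3_hash(key, rows):
    """XOR the matrix rows selected by the 1 bits of key (rows[0] pairs with the top key bit)."""
    address = 0
    top = len(rows) - 1
    for k, row in enumerate(rows):
        if (key >> (top - k)) & 1:
            address ^= row
    return address


def run_trial(keys, addr_bits, key_bits, rng):
    """Hash all keys with one random H3 function; return (average successful search length, llps)."""
    rows = [rng.getrandbits(addr_bits) for _ in range(key_bits)]
    chain_len = [0] * (1 << addr_bits)
    total_probes = 0
    for key in keys:
        slot = h3_hash(key, rows)
        chain_len[slot] += 1
        total_probes += chain_len[slot]  # the key sits at the end of its chain
    return total_probes / len(keys), max(chain_len)


def experiment(addr_bits=8, load=0.8, trials=100, key_bits=32, seed=1):
    """Repeat the trial for several random functions and compare with the analytic value."""
    rng = random.Random(seed)
    m = 1 << addr_bits
    n = int(load * m)
    key_set = set()
    while len(key_set) < n:  # distinct synthetic keys (an assumption, not real-life data)
        key_set.add(rng.getrandbits(key_bits))
    keys = sorted(key_set)
    results = [run_trial(keys, addr_bits, key_bits, rng) for _ in range(trials)]
    mean_search = sum(r[0] for r in results) / trials
    mean_llps = sum(r[1] for r in results) / trials
    predicted = 1 + (n - 1) / (2 * m)  # textbook expectation for chaining with bucket size 1
    return mean_search, predicted, mean_llps


mean_search, predicted, mean_llps = experiment()
print(f"successful search: experimental {mean_search:.3f}, analytical {predicted:.3f}")
print(f"average llps over all functions: {mean_llps:.2f}")
```

Because any two distinct keys collide under a random $H_3$ function with probability exactly $1/m$, the experimental average in this simplified setting should track the analytic value closely, which is the kind of agreement the experiments report.
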
Tables 4-6 show similar results for uniform hashing. For llps, the analytical results given in [10] are values when the number of keys hashed is fixed at 1000, and the bucket size and load factor change. The experimental results for llps in Table 6 are obtained accordingly.

Figure 2 plots the probability distribution of the length of the longest probe sequence for $b = 10$, $m = 1024$, and load factor = 0.8 for the double hashing scheme. We see that the llps is narrowly distributed between 4 and 11 with peak occurring at 6 with E(llps) being We see that out of 500 experiments none has a search length greater than 16 (quite a small value compared to the worst case, which is 1024). Only one value of llps is 16 and there are none in the range of In view of Knuth's statement about the worst case of hashing, these results are the most significant of all. They show, as Gonnet predicted, that the probability of the worst case of hashing occurring is ridiculously small, and that the llps is narrowly distributed [9]. The value of E(llps) itself is relatively small, and the probability of llps being much higher than E(llps) is very small.

4 Conclusions

There are a number of applications for hashing in hardware, yet there is no literature describing the performance of hashing functions suitable for hardware applications. We have shown that by choosing functions at random from the class $H_3$, the theoretically predicted performance of hashing schemes can be achieved in practice. The results on the llps also show that Knuth's fears (which appear to be the reason that led Thakkar and Knowles to infer "... requires the storage of about 1000 pseudo random numbers into PROM" [4]) are not justified. These functions can be used in any of the applications discussed in Section 2. We are further investigating hash tables for hardware.

Table 1. Expected length of successful search for Separate Chaining (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Table 2. Expected length of unsuccessful search for Separate Chaining (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Table 3. Expected llps for Separate Chaining, m = 1024 (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Table 4. Expected length of successful search for Uniform hashing (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Table 5. Expected length of unsuccessful search for Uniform hashing (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Table 6. Expected llps for Uniform hashing, n = (columns: b, then experimental and analytical values for load = 0.6, 0.7, 0.8 and 0.9)

Figure 2. The probability distribution of the length of the longest probe sequence, m = 1024, load factor = 0.8, b = 10 (results from 500 experiments). Horizontal axis: No. of Probes; vertical axis: Probability.

References

[1] A. Tanenbaum, Modern Operating Systems. Prentice Hall.
[2] M. Houdek and G. Mitchell, "Translating a large virtual address," IBM System/38 Tech. Developments, pp. 22-24.
[3] D. Abramson, "Hardware management of a large virtual memory," Proc. 4th Australian Computer Science Conf., vol. 3, no. 1.
[4] S. Thakkar and A. Knowles, "A high-performance memory management scheme," IEEE Computer, pp. 8-22, May.
[5] M. V. Ramakrishna, "Hashing in practice, analysis of hashing and universal hashing," in Proc. ACM SIGMOD Conf., pp. 191-199.
[6] J. Mullin, "A note on universal classes of hash functions," Information Processing Letters, vol. 37, pp. 247-256.
[7] L. Carter and M. Wegman, "Universal classes of hashing functions," Journal of Computer and System Sciences, vol. 18, no. 2, pp. 143-154.
[8] D. Knuth, The Art of Computer Programming, vol. 3. Reading, MA: Addison Wesley.
[9] G. Gonnet, "Expected length of the longest probe sequence in hash code searching," J. ACM, vol. 28, no. 2, pp. 289-304.
[10] P. Larson, "Expected worst-case performance of hash files," The Computer Journal, vol. 25, no. 3, pp. 347-352.
[11] J. Braidt and J. Taylor, "Address hashing circuit for memory with nonchangeable address block," IBM Technical Disclosure Bulletin, vol. 24, no. 7A, pp. 3531-3532.
[12] M. Benhase, "Resetting storage unit directories," IBM Technical Disclosure Bulletin, vol. 25, no. 7B, pp. 3760-3761.
[13] H. Robinson and G. Tayler, "Hashing addresses to a cache on DASD," IBM Technical Disclosure Bulletin, vol. 24, no. 11A, pp. 5354-5356.
[14] P. McKenney, "High-speed event counting and classification using a dictionary hash technique," in 1989 International Conference on Parallel Processing, pp. III-71-III-75.
[15] J. Cocke and W. Worley, "Virtual to real address translation using hashing," IBM Technical Disclosure Bulletin, vol. 24, no. 6, pp. 2724-2726.
[16] K. Ramamohanarao and R. Sacks-Davis, "Hardware address translation for machines with a large virtual memory," Information Processing Letters, vol. 13, no. 1, pp. 23-29.
[17] R. Bryant, "Extendible hashing for line-oriented paging stores," IBM Technical Disclosure Bulletin, vol. 26, no. 11, pp. 6046-6049.
[18] T. Ida and E. Goto, "Performance of parallel hash hardware with key deletion," Information Processing, vol. 77, pp. 643-647.


More information

CHAPTER 4 BLOOM FILTER

CHAPTER 4 BLOOM FILTER 54 CHAPTER 4 BLOOM FILTER 4.1 INTRODUCTION Bloom filter was formulated by Bloom (1970) and is used widely today for different purposes including web caching, intrusion detection, content based routing,

More information
