Using Timestamps to Track Causal Dependencies
J. A. David McWha
Dept. of Computer Science, University of Waikato, Private Bag 3105, Hamilton

ABSTRACT

As computer architectures speculate more aggressively in an attempt to extract an increasing amount of parallelism, tracking causal dependencies is becoming an increasingly difficult task. Timestamping events is a convenient way to store this information. It is argued that timestamps with a fixed maximum length are easier to implement. A number of fixed length timestamp schemes are proposed and evaluated by functional simulation, and the advantages and shortcomings of each are discussed.

1 Introduction

To improve processor performance computer architects are increasingly turning to parallelism, and in particular out-of-order execution and speculation [5][6]. Current production architectures use only a modest level of speculation; for example, the Intel Pentium Pro [7] uses a reorder buffer to hold a pool of 40 instructions, and this determines how far ahead the processor can speculate (the speculation distance). At this level of speculation causal dependencies can be maintained by relatively simple hardware. As speculation becomes more aggressive this hardware will grow more complex, quickly becoming infeasible.

By timestamping each instruction (or block of instructions) their virtual sequence can be tracked in a scalable way. The virtual sequence is the order in which a sequential machine would execute the instructions. As a program may be of arbitrary length, an arbitrary number of timestamps may be required. If the timestamp representation is allowed to become arbitrarily large then storage and bandwidth requirements are also unbounded. This is of particular importance in hardware implementations, where timestamps of varying lengths are difficult to implement. Arbitrarily large timestamps will also take a potentially unbounded time to generate and compare.
However, a fixed length timestamp representation provides only a finite number of timestamps, and a sufficiently large program will exhaust these. In order to execute such a program a method of reusing timestamps is necessary; solutions to this problem are discussed later in this paper.

In the program fragment shown in Figure 1 the code D will always execute, regardless of whether the branch at A is taken or not. This is known as code convergence, and it is possible to start D in parallel with A, constrained only by data dependencies.

    if (A) then B; else C;
    D;

Figure 1: An example of convergent code (the branch at A leads to B or C, both converging on D).
To do this it must be possible to insert an arbitrary number of timestamps between the start of the section of unknown size (A and B or C) and the convergent code (D) in order to assign a timestamp to each instruction [1].

In this paper we present a conceptual model for timestamps (see also [3]) and consider a number of possible implementations of fixed length timestamp representations and their efficiency. Results are presented from a number of algorithms run on a simulator for the WarpEngine [4], an aggressively optimistic architecture based on the Time Warp algorithm [8]. The WarpEngine extracts large amounts of parallelism, allowing the restrictions caused by the timestamps to be clearly seen.

2 Conceptual Timestamps

2.1 Tree-based Execution

To identify convergent code in a program the code can be mapped to an execution tree. This is done by splitting the instructions into blocks, for example (but not necessarily) basic blocks. Figure 2 shows a tree that might be generated for the code sequence: A; B; C; D. Note that an n-way tree can be decomposed to a binary tree in all cases by using extra levels of nodes which further subdivide the tree, but perform no execution. The tree can be executed serially by a depth-first, left-to-right traversal of the tree (the virtual sequence). Separate branches may have data dependencies, but are guaranteed to have no control dependencies.

Figure 2: Execution tree generated by sequential code (nodes A, B, C, D).

Using the Time Warp algorithm the WarpEngine speculatively executes each branch of the tree in parallel, rolling back any speculation errors and re-executing. Global Virtual Time (GVT) is represented by the earliest node in the virtual sequence with an instruction still pending. All nodes earlier in the virtual sequence will never be rolled back, and can be removed from the system (fossil collected).

2.2 Timestamps

Figure 3: Conceptual timestamps for a binary tree.
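As a minimal sketch (not the WarpEngine implementation), the conceptual timestamps can be produced directly by the traversal of Section 2.1: a depth-first, left-to-right walk of a binary execution tree visits nodes in the virtual sequence, and each node's root-to-node path ('0' for a left branch, '1' for a right branch) is its conceptual timestamp.

```python
# Hypothetical sketch: nodes of a binary execution tree are labelled by
# their root-to-node path ('0' = left, '1' = right); a depth-first,
# left-to-right traversal then visits them in the virtual sequence.

def virtual_sequence(tree, path=""):
    """tree: (label, left_subtree, right_subtree) or None.
    Yields (path_string, label) pairs in virtual-sequence order."""
    if tree is None:
        return
    label, left, right = tree
    yield path, label
    yield from virtual_sequence(left, path + "0")
    yield from virtual_sequence(right, path + "1")

# A small tree standing in for Figure 2's sequential run A; B; C; D.
tree = ("A", ("B", ("C", None, None), None), ("D", None, None))
print(list(virtual_sequence(tree)))
# [('', 'A'), ('0', 'B'), ('00', 'C'), ('1', 'D')]
```

The path strings come out in exactly the order a sequential machine would execute the blocks, which is what makes them usable as timestamps.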
The timestamp representation should efficiently represent a large number of timestamps, and should make creation of new timestamps and comparison of timestamps to obtain an ordering efficient. The initial design for the timestamps is the simplest possible which allows a strict ordering to be determined and an arbitrary number of timestamps to be inserted between any pair of timestamps. This is done in an effort to minimize overheads. The symbolic representation described in this section was not directly simulated; Section 3 describes the practical representations implemented and the results obtained.

Each node in the execution tree is associated with a string that gives the path from the root to the node. A zero is used for a left branch, a one for a right branch, and a terminator symbol (written here as ⊥) ends the string (as in Figure 3). A lexicographic ordering where ⊥ < 0 < 1 places these strings in the same order as the sequential execution order of the associated nodes. Thus the strings can be used as timestamps for the virtual sequence.

2.3 Rescaling

A problem that arises for any finite representation is the need to re-use old timestamps. As the execution tree grows, eventually there will be more nodes than can be represented. Because branches of the tree grow in depth unevenly, it is the number of levels available to a branch that causes the restriction, more than the total number of timestamps themselves. This uneven growth tends to cause inefficient use of the timestamps: some will remain unused and be wasted. As GVT advances, early timestamps will become available for re-use. To make use of these, timestamps are re-allocated while retaining their ordering. We term this operation rescaling. Details on implementations of rescaling can be found in [3].

3 Timestamp Schemes

3.1 Length Representation Timestamps

To make timestamp comparison easy we can map the timestamps from bit strings to integer values. This makes it possible to compare timestamps bitwise, as for integers. The bit string timestamp is converted to an integer by padding the bit string out to (a fixed) I bits with zeros and then appending the length of the original string (less the terminating ⊥).

Table 1: Number of levels which can be represented by different sizes of length and exponential representation timestamps. Timestamps are divided into mantissa (M), integer (I), and length (L) parts.

    Total size (bits) | Length rep. (I,L) | Levels | Exponential rep. (M,I,L) | Levels
    32                | (27,5)            | 28     | (16,12,4)                | ( )
    64                | (58,6)            | 59     | (32,27,5)                | ( )
    96                | (89,7)            | 90     | (32,58,6)                | ( ) 58
Some integers remain unused in the representation. Table 1 shows the division of different sized length representations and the number of levels of nodes they can represent. The advantage of this representation is its simplicity; however, the maximum tree depth is quite limited.
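The length representation above can be sketched as follows. This is an illustrative reading of the scheme, not the paper's hardware: the path string is left-padded with zeros out to I bits, the original length is appended in the low-order L bits, and the resulting integers compare in the same order as the path strings under ⊥ < 0 < 1. The field sizes follow the 32-bit (I=27, L=5) row of Table 1.

```python
# Sketch of the length representation: pad the path string to I bits,
# append its length in L low-order bits, compare as plain integers.
I, L = 27, 5  # integer and length fields of a 32-bit timestamp

def encode(path):
    """path: string of '0'/'1' branch choices from the root."""
    assert len(path) <= I and len(path) < 2 ** L
    bits = int(path.ljust(I, "0"), 2)  # left-align, zero-pad to I bits
    return (bits << L) | len(path)

# Lexicographic order with the terminator sorting first (so a parent
# precedes its left child) matches integer order of the encodings.
paths = ["", "0", "00", "01", "1", "10", "11"]
codes = [encode(p) for p in paths]
assert codes == sorted(codes)  # order preserved
```

Because shorter strings carry a smaller length field and identical padded bits, a parent always encodes below its left child, reproducing the ⊥ < 0 < 1 ordering with a single integer compare.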
3.2 Exponential Representation Timestamps

This representation uses a scheme similar to floating point number representations to allow different parts of the tree to grow to different depths. It is comprised of two parts: a mantissa and an exponent. The exponent is the number of leading zeros in the timestamp, while the mantissa is the normalized tail of the timestamp. In the example in Figure 4 the timestamp 001 becomes 2,1, where the number before the comma is the exponent and the string after the comma is the mantissa. A complete exponential representation requires that the mantissa be coded using the length representation above. The number of levels which can be represented is greatest on the left side of the tree and decreases to the right, as shown for an arbitrary exponent size in Table 1. The proportions in which a timestamp is divided into exponent and mantissa will be the subject of some optimization based on application.

Figure 4: Timestamps in exponential form (each node labelled with its exponent,mantissa pair, e.g. 2,1).

This representation favours execution of the left side of the tree. Nodes that are early in the virtual sequence can have longer strings and so will not exhaust the maximum depth as often. Thus, rescale, and possibly cancel, operations can be reduced. This has two additional advantages. First, by delaying execution of nodes to the right of the tree, which are more speculative, it helps balance the overall execution. Second, a compiler can take advantage of the representation by scheduling more computation on the left of the tree. Provided the compiler can schedule the critical parts of the execution to the left of the tree, execution can progress for much longer without needing to rescale using this representation.

3.3 Ideal Timestamps

In order to show the restrictions placed upon execution by the timestamp schemes we also simulate execution with ideal timestamps of unbounded length, i.e. which never require rescaling.
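The exponential decomposition of Section 3.2 can be sketched as below. The semantics are assumed from the text (exponent = count of leading zeros, mantissa = normalized tail); comparison here simply reconstructs the path string and appends a stand-in terminator that sorts before '0', whereas real hardware would compare the packed fields directly.

```python
# Sketch (assumed semantics) of the exponential representation:
# the path "001" decomposes into exponent 2 and mantissa "1".

def to_exponential(path):
    tail = path.lstrip("0")
    return len(path) - len(tail), tail  # (exponent, mantissa)

def from_exponential(exponent, mantissa):
    return "0" * exponent + mantissa

def precedes(a, b):
    # "\x00" stands in for the terminator, sorting before "0" and "1".
    return a + "\x00" < b + "\x00"

assert to_exponential("001") == (2, "1")
assert precedes("0", "00")  # a parent precedes its left child
# A large exponent (a deep-left node) precedes a smaller one, so the
# left side of the tree can grow deeper within a fixed-size mantissa.
assert precedes(from_exponential(5, "1"), from_exponential(2, "1"))
```

The key property is visible in the last assertion: runs of leading zeros, which are longest for early nodes in the virtual sequence, are compressed into the exponent rather than consuming mantissa bits.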
3.4 Rescaling Method

In the simulation results which follow, the optimistic assumption that rescaling has no cost, in time or resources, is used. This allows an optimistic first approximation of the feasibility of the timestamp schemes to be obtained and allows various rescaling strategies to be studied. Given this assumption the optimal rescaling method is to rescale one level at a time to the root of the tree until sufficient timestamps have been reclaimed. This results in a large number of rescales, impractical in a real scheme, but rescales the minimum number of levels necessary to complete execution. This allows cancelled nodes to be rescheduled as early as possible, because some timestamps remain unused near the root on the right hand side of the tree.

4 Test Algorithms

A set of small algorithms was used for testing the timestamps. These were all hand-coded in WarpEngine assembly language [2], and hence necessarily of small size. The programs that we have simulated span the types of operations that are performed in many programs. The sorting algorithm quicksort (quick) is used. Naive binary tree insertion (bin) and AVL tree insertion (avl) perform dynamic structure manipulation. Matrix and array operations are represented by matrix multiplication (mat) and Gauss-Jordan elimination (gj). Fibonacci number generation (fib) is an example of recursion. The algorithms are simple in concept but vary in the relative amounts of data and control dependence.

5 Timestamp Scheme Comparison

Four different configurations were simulated for each of the algorithms available in order to examine the restrictions caused by the different timestamp schemes. The configurations consisted of: the length scheme using 32 bits; the length scheme using 64 bits; the exponential scheme using 64 bits (32 exponent and 32 mantissa); and ideal timestamps. Using timestamps the same size as the machine's word size is likely to be convenient. Timestamps larger than 64 bits are not used, since these may consume undesirable amounts of resources. By comparing the 64 bit exponential scheme with the 64 bit length scheme we can determine whether the 32 additional bits are more valuable as an exponent, or used to extend the length timestamp. It must also be remembered that all manipulations of the timestamps will take longer for exponential timestamps than length timestamps, due to the added complexity of the scheme. However, for simplicity's sake this has not been simulated. By comparing with the ideal timestamps the extent to which the timestamp scheme is restricting execution can be seen. As described earlier, an n-way execution tree can be decomposed to an equivalent binary tree.
In the simulations, for ease of programming, each node can have up to four children, giving a 4-way execution tree.

5.1 Results

Figure 5 shows graphs of simulated speedup over a range of problem sizes for each of the algorithms using the four timestamp configurations. Speedup is defined as the number of cycles required by the WarpEngine if all instructions were executed in virtual order (i.e. serially) divided by the number of cycles required for execution on the WarpEngine simulator. The simulator makes a number of optimistic assumptions, including zero cost timestamp rescaling and unlimited bandwidth. Still the speedup quickly diverges from the results for ideal timestamps and, in some cases, drops to levels comparable to current production architectures [7].

Figure 5: Comparison of speedup for exponent and length timestamp schemes (panels: AVL, BIN, FIB, GJ, MAT and QUICK, each using ideal, length and exponential timestamps).

Good speedup is achieved for some of the test programs; however, all the test programs are small and in some cases easily parallelisable. It is likely that larger benchmarks would follow the trend shown by larger problem sizes and speedup would continue to diverge widely from the ideal timestamp case.

Adding a 32 bit exponent to form the exponential scheme provided little gain over the length representation. Despite the relatively large proportion of children (more than 60% in most cases) generated as the left-most child, there are few long chains of children on the left-most branch. Often the earliest events in the virtual sequence (and hence the furthest left) are initialization procedures, which are usually brief. Also, the top levels of loop structures tend to have a high fan-out in an effort to extract large amounts of parallelism. Thus the exponent often cannot be used to replace leading zeros in the timestamp, at least until rescaling is done to place the nodes on the left-most branch.

Using ideal timestamps the speedup generally increases with increasing problem size, as one would expect. With the other schemes, however, the speedup generally decreases as the problem size gets larger. This is caused by increasingly long, thin branches in the execution tree of larger problems, which force delays until GVT can progress and allow fossil collection to release timestamps for rescaling to take place. This also forces more cancellation and the attendant delays.

6 Tree Balancing

Further work is currently being done to improve timestamp representations. One approach achieving good results is to alter the shape of the timestamp tree to better fit the shape of the execution tree generated by the program, by using variable range timestamps. The range of timestamps allocated to each subtree is fixed in all the representations discussed so far: the range of the parent node is subdivided evenly and allocated to each child, regardless of the number of timestamps required by the subtree, or whether the subtree even exists. By analyzing the likely size of each subtree, an upper and lower bound for the timestamp range for each subtree can be established. This is equivalent to balancing the execution tree to achieve better timestamp utilization by packing the timestamp tree more densely. There are a number of ways of expressing the analysis of the subtree required to assign variable range timestamps. It may be possible to determine an absolute number of nodes which will be in the subtree, in which case setting the upper limit is trivial. If this is not possible, it may still be possible to estimate the relative sizes of the subtrees, in which case a proportion of the available interval can be allocated to each subtree.
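The variable range idea can be sketched as interval subdivision. This is a hypothetical illustration, not the paper's mechanism: `split_range` and the weights are assumptions. Equal weights reproduce the fixed schemes' even split; weights estimated from subtree sizes give the proportional allocation described above.

```python
# Sketch: divide a parent's timestamp range [lo, hi) among its children
# in proportion to estimated subtree sizes (weights). Equal weights
# correspond to the fixed schemes' even subdivision.

def split_range(lo, hi, weights):
    """Return contiguous (start, end) spans, one per child."""
    total = sum(weights)
    spans, start = [], lo
    for w in weights[:-1]:
        end = start + (hi - lo) * w // total
        spans.append((start, end))
        start = end
    spans.append((start, hi))  # last child absorbs rounding slack
    return spans

# Even split: a tiny subtree still ties up a quarter of the range.
assert split_range(0, 1000, [1, 1, 1, 1]) == \
    [(0, 250), (250, 500), (500, 750), (750, 1000)]
# Proportional split packs more timestamps where the tree is deep.
assert split_range(0, 1000, [700, 100, 100, 100]) == \
    [(0, 700), (700, 800), (800, 900), (900, 1000)]
```

The even split wastes most of the range under small or absent subtrees, which is exactly the inefficiency variable range timestamps aim to remove.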
Some subtrees may preclude any analysis more detailed than "a very large number of nodes", in which case it is pointless executing anything further right in the tree, because it will have to be rolled back to provide more timestamp space. This could save some rollback overheads. Preliminary results suggest that this approach will place minimal restrictions on the speculative execution, compared with other resource limitations.

7 Conclusions

All of the timestamp representations evaluated in this paper unacceptably restrict the parallelism extracted. This is due to the long, thin branches typically present in the execution tree, causing many timestamps to be wasted. This, in turn, causes timestamp precision to be quickly exhausted and prompts frequent rescaling. Even when rescaling itself is assumed to be instantaneous, it restricts the speculation distance to the point where performance is reduced to that of current production architectures.

The addition of an exponent to extend the left-most branch has been shown to be ineffective compared with extending the basic timestamp by the same number of bits. The structure of most programs uses a high fan-out at the top levels to extract large amounts of parallelism, and frequently only initialization is performed in the left-most branch. A more promising approach to allocating timestamps more efficiently is to allocate variable timestamp ranges to subtrees. This relies on compiler technology and runtime analysis to improve allocation efficiency.

Many of the issues described here are also applicable to allocation of other resources for highly speculative programs. Each block of instructions requires a certain amount of resources (for example memory) to be allocated quickly and efficiently. Where the resource is in some sense linear, sparse allocation may seriously affect the ability to utilize that resource.

References

[1] Adam Back and Steve Turner. Time-stamp generation for optimistic parallel computing. In Proceedings of the 28th Annual Simulation Symposium, Phoenix, Arizona, April 1995.
[2] John G. Cleary. WarpEngine instruction set. Internet Web Page, November. URL.
[3] John G. Cleary, J. A. David McWha, and Murray Pearson. Timestamp representations for virtual sequences. In Proceedings of the 11th Workshop on Parallel and Distributed Simulation (PADS 97), pages 98-105, Lockenhaus, Austria, June 1997.
[4] John G. Cleary, Murray W. Pearson, and Husam Kinawi. The architecture of an optimistic CPU: The WarpEngine. In Proceedings of HICSS, volume 1, Hawaii.
[5] Digital Equipment Corporation. DIGITAL Semiconductor Alpha Microprocessor Product Brief, August. Serial: EC-R2YTC-TE.
[6] D. Hunt. Advanced performance features of the 64-bit PA-8000. In Compcon Digest of Papers, March 1995.
[7] Intel Corporation. Pentium Pro processor at 150 MHz, 166 MHz, 180 MHz and 200 MHz. Intel Corporation datasheets, November.
[8] David Jefferson. Virtual time. ACM Transactions on Programming Languages and Systems, 7(3):404-425, July 1985.
More informationOperating Systems 2230
Operating Systems 2230 Computer Science & Software Engineering Lecture 6: Memory Management Allocating Primary Memory to Processes The important task of allocating memory to processes, and efficiently
More informationMultidimensional Indexes [14]
CMSC 661, Principles of Database Systems Multidimensional Indexes [14] Dr. Kalpakis http://www.csee.umbc.edu/~kalpakis/courses/661 Motivation Examined indexes when search keys are in 1-D space Many interesting
More informationSDD Advanced-User Manual Version 1.1
SDD Advanced-User Manual Version 1.1 Arthur Choi and Adnan Darwiche Automated Reasoning Group Computer Science Department University of California, Los Angeles Email: sdd@cs.ucla.edu Download: http://reasoning.cs.ucla.edu/sdd
More informationPrinciples of Parallel Algorithm Design: Concurrency and Mapping
Principles of Parallel Algorithm Design: Concurrency and Mapping John Mellor-Crummey Department of Computer Science Rice University johnmc@rice.edu COMP 422/534 Lecture 3 17 January 2017 Last Thursday
More informationVirtual Memory Design and Implementation
Virtual Memory Design and Implementation To do q Page replacement algorithms q Design and implementation issues q Next: Last on virtualization VMMs Loading pages When should the OS load pages? On demand
More information8.1. Optimal Binary Search Trees:
DATA STRUCTERS WITH C 10CS35 UNIT 8 : EFFICIENT BINARY SEARCH TREES 8.1. Optimal Binary Search Trees: An optimal binary search tree is a binary search tree for which the nodes are arranged on levels such
More informationChapter 4: Trees. 4.2 For node B :
Chapter : Trees. (a) A. (b) G, H, I, L, M, and K.. For node B : (a) A. (b) D and E. (c) C. (d). (e).... There are N nodes. Each node has two pointers, so there are N pointers. Each node but the root has
More informationChapter 9 Graph Algorithms
Chapter 9 Graph Algorithms 2 Introduction graph theory useful in practice represent many real-life problems can be if not careful with data structures 3 Definitions an undirected graph G = (V, E) is a
More informationunused unused unused unused unused unused
BCD numbers. In some applications, such as in the financial industry, the errors that can creep in due to converting numbers back and forth between decimal and binary is unacceptable. For these applications
More informationAnalytical Modeling of Parallel Systems. To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003.
Analytical Modeling of Parallel Systems To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. Topic Overview Sources of Overhead in Parallel Programs Performance Metrics for
More informationEE382A Lecture 7: Dynamic Scheduling. Department of Electrical Engineering Stanford University
EE382A Lecture 7: Dynamic Scheduling Department of Electrical Engineering Stanford University http://eeclass.stanford.edu/ee382a Lecture 7-1 Announcements Project proposal due on Wed 10/14 2-3 pages submitted
More informationChapter 11: Indexing and Hashing" Chapter 11: Indexing and Hashing"
Chapter 11: Indexing and Hashing" Database System Concepts, 6 th Ed.! Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use " Chapter 11: Indexing and Hashing" Basic Concepts!
More informationData Structures and Algorithms
Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,
More informationReading Assignment. Lazy Evaluation
Reading Assignment Lazy Evaluation MULTILISP: a language for concurrent symbolic computation, by Robert H. Halstead (linked from class web page Lazy evaluation is sometimes called call by need. We do an
More informationLecture Notes. char myarray [ ] = {0, 0, 0, 0, 0 } ; The memory diagram associated with the array can be drawn like this
Lecture Notes Array Review An array in C++ is a contiguous block of memory. Since a char is 1 byte, then an array of 5 chars is 5 bytes. For example, if you execute the following C++ code you will allocate
More informationIndexing. Chapter 8, 10, 11. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1
Indexing Chapter 8, 10, 11 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Tree-Based Indexing The data entries are arranged in sorted order by search key value. A hierarchical search
More informationHeckaton. SQL Server's Memory Optimized OLTP Engine
Heckaton SQL Server's Memory Optimized OLTP Engine Agenda Introduction to Hekaton Design Consideration High Level Architecture Storage and Indexing Query Processing Transaction Management Transaction Durability
More information6.001 Notes: Section 31.1
6.001 Notes: Section 31.1 Slide 31.1.1 In previous lectures we have seen a number of important themes, which relate to designing code for complex systems. One was the idea of proof by induction, meaning
More information2-3 Tree. Outline B-TREE. catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } ADD SLIDES ON DISJOINT SETS
Outline catch(...){ printf( "Assignment::SolveProblem() AAAA!"); } Balanced Search Trees 2-3 Trees 2-3-4 Trees Slide 4 Why care about advanced implementations? Same entries, different insertion sequence:
More informationCoping with Conflicts in an Optimistically Replicated File System
Coping with Conflicts in an Optimistically Replicated File System Puneet Kumar School of Computer Science Carnegie Mellon University 1. Introduction Coda is a scalable distributed Unix file system that
More informationHash Tables. CS 311 Data Structures and Algorithms Lecture Slides. Wednesday, April 22, Glenn G. Chappell
Hash Tables CS 311 Data Structures and Algorithms Lecture Slides Wednesday, April 22, 2009 Glenn G. Chappell Department of Computer Science University of Alaska Fairbanks CHAPPELLG@member.ams.org 2005
More informationPrinciples of Data Management. Lecture #5 (Tree-Based Index Structures)
Principles of Data Management Lecture #5 (Tree-Based Index Structures) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Headlines v Project
More informationData Storage and Query Answering. Data Storage and Disk Structure (4)
Data Storage and Query Answering Data Storage and Disk Structure (4) Introduction We have introduced secondary storage devices, in particular disks. Disks use blocks as basic units of transfer and storage.
More information1.3 Data processing; data storage; data movement; and control.
CHAPTER 1 OVERVIEW ANSWERS TO QUESTIONS 1.1 Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical
More information9/24/ Hash functions
11.3 Hash functions A good hash function satis es (approximately) the assumption of SUH: each key is equally likely to hash to any of the slots, independently of the other keys We typically have no way
More informationLSU EE 4720 Dynamic Scheduling Study Guide Fall David M. Koppelman. 1.1 Introduction. 1.2 Summary of Dynamic Scheduling Method 3
PR 0,0 ID:incmb PR ID:St: C,X LSU EE 4720 Dynamic Scheduling Study Guide Fall 2005 1.1 Introduction David M. Koppelman The material on dynamic scheduling is not covered in detail in the text, which is
More informationLecture 15 Notes Binary Search Trees
Lecture 15 Notes Binary Search Trees 15-122: Principles of Imperative Computation (Spring 2016) Frank Pfenning, André Platzer, Rob Simmons 1 Introduction In this lecture, we will continue considering ways
More informationChapter 4. Advanced Pipelining and Instruction-Level Parallelism. In-Cheol Park Dept. of EE, KAIST
Chapter 4. Advanced Pipelining and Instruction-Level Parallelism In-Cheol Park Dept. of EE, KAIST Instruction-level parallelism Loop unrolling Dependence Data/ name / control dependence Loop level parallelism
More informationProcessor (IV) - advanced ILP. Hwansoo Han
Processor (IV) - advanced ILP Hwansoo Han Instruction-Level Parallelism (ILP) Pipelining: executing multiple instructions in parallel To increase ILP Deeper pipeline Less work per stage shorter clock cycle
More informationLecture 15 Binary Search Trees
Lecture 15 Binary Search Trees 15-122: Principles of Imperative Computation (Fall 2017) Frank Pfenning, André Platzer, Rob Simmons, Iliano Cervesato In this lecture, we will continue considering ways to
More informationToday: Amortized Analysis (examples) Multithreaded Algs.
Today: Amortized Analysis (examples) Multithreaded Algs. COSC 581, Algorithms March 11, 2014 Many of these slides are adapted from several online sources Reading Assignments Today s class: Chapter 17 (Amortized
More informationParallelizing Frequent Itemset Mining with FP-Trees
Parallelizing Frequent Itemset Mining with FP-Trees Peiyi Tang Markus P. Turkia Department of Computer Science Department of Computer Science University of Arkansas at Little Rock University of Arkansas
More informationB+ Tree Review. CSE332: Data Abstractions Lecture 10: More B Trees; Hashing. Can do a little better with insert. Adoption for insert
B+ Tree Review CSE2: Data Abstractions Lecture 10: More B Trees; Hashing Dan Grossman Spring 2010 M-ary tree with room for L data items at each leaf Order property: Subtree between keys x and y contains
More informationSearch Algorithms for Discrete Optimization Problems
Search Algorithms for Discrete Optimization Problems Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003. 1 Topic
More informationCS 350 : Data Structures B-Trees
CS 350 : Data Structures B-Trees David Babcock (courtesy of James Moscola) Department of Physical Sciences York College of Pennsylvania James Moscola Introduction All of the data structures that we ve
More informationParallel Programming. Parallel algorithms Combinatorial Search
Parallel Programming Parallel algorithms Combinatorial Search Some Combinatorial Search Methods Divide and conquer Backtrack search Branch and bound Game tree search (minimax, alpha-beta) 2010@FEUP Parallel
More informationFILE SYSTEMS. CS124 Operating Systems Winter , Lecture 23
FILE SYSTEMS CS124 Operating Systems Winter 2015-2016, Lecture 23 2 Persistent Storage All programs require some form of persistent storage that lasts beyond the lifetime of an individual process Most
More informationTrees. Eric McCreath
Trees Eric McCreath 2 Overview In this lecture we will explore: general trees, binary trees, binary search trees, and AVL and B-Trees. 3 Trees Trees are recursive data structures. They are useful for:
More informationEect of fan-out on the Performance of a. Single-message cancellation scheme. Atul Prakash (Contact Author) Gwo-baw Wu. Seema Jetli
Eect of fan-out on the Performance of a Single-message cancellation scheme Atul Prakash (Contact Author) Gwo-baw Wu Seema Jetli Department of Electrical Engineering and Computer Science University of Michigan,
More informationChapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking
Chapter 17 Disk Storage, Basic File Structures, and Hashing Records Fixed and variable length records Records contain fields which have values of a particular type (e.g., amount, date, time, age) Fields
More informationParallel and Distributed VHDL Simulation
Parallel and Distributed VHDL Simulation Dragos Lungeanu Deptartment of Computer Science University of Iowa C.J. chard Shi Department of Electrical Engineering University of Washington Abstract This paper
More informationContents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11
Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed
More informationFile Management By : Kaushik Vaghani
File Management By : Kaushik Vaghani File Concept Access Methods File Types File Operations Directory Structure File-System Structure File Management Directory Implementation (Linear List, Hash Table)
More information15-740/ Computer Architecture Lecture 22: Superscalar Processing (II) Prof. Onur Mutlu Carnegie Mellon University
15-740/18-740 Computer Architecture Lecture 22: Superscalar Processing (II) Prof. Onur Mutlu Carnegie Mellon University Announcements Project Milestone 2 Due Today Homework 4 Out today Due November 15
More information