Data Compression for Bitmap Indexes. Y. Chen

Similar documents
Analysis of Basic Data Reordering Techniques

Bitmap Indices for Fast End-User Physics Analysis in ROOT

Bitmap Index Partition Techniques for Continuous and High Cardinality Discrete Attributes

Compressing Bitmap Indexes for Faster Search Operations

Strategies for Processing ad hoc Queries on Large Data Warehouses

Bitmap Indices for Speeding Up High-Dimensional Data Analysis

Minimizing I/O Costs of Multi-Dimensional Queries with Bitmap Indices

Lawrence Berkeley National Laboratory Lawrence Berkeley National Laboratory

Histogram-Aware Sorting for Enhanced Word-Aligned Compress

Sorting Improves Bitmap Indexes

Approximate Encoding for Direct Access and Query Processing over Compressed Bitmaps

Keynote: About Bitmap Indexes

Enhancing Bitmap Indices

All About Bitmap Indexes... And Sorting Them

Parallel, In Situ Indexing for Data-intensive Computing. Introduction

Binary Encoded Attribute-Pairing Technique for Database Compression

DATA WAREHOUSING II. CS121: Relational Databases Fall 2017 Lecture 23

arxiv: v4 [cs.db] 6 Jun 2014 Abstract

Category: Informational May DEFLATE Compressed Data Format Specification version 1.3

Efficient Iceberg Query Evaluation on Multiple Attributes using Set Representation

CS122 Lecture 15 Winter Term,

V.2 Index Compression

Information Retrieval

Using Bitmap Indexing Technology for Combined Numerical and Text Queries

International Journal of Modern Trends in Engineering and Research. A Survey on Iceberg Query Evaluation strategies

Hybrid Query Optimization for Hard-to-Compress Bit-vectors

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

Data Warehousing & Data Mining

Processing of Very Large Data

GPU-WAH: Applying GPUs to Compressing Bitmap Indexes with Word Aligned Hybrid

A System for Query Based Analysis and Visualization

Information Retrieval

Multi-resolution Bitmap Indexes for Scientific Data

Concise: Compressed n Composable Integer Set

FastBit: interactively searching massive data

Lecture 15. Lecture 15: Bitmap Indexes

Index Compression. David Kauchak cs160 Fall 2009 adapted from:

A Comparative Study of Lossless Compression Algorithm on Text Data

BAH: A Bitmap Index Compression Algorithm for Fast Data Retrieval

UpBit: Scalable In-Memory Updatable Bitmap Indexing

Optimizing Query Execution for Variable-Aligned Length Compression of Bitmap Indices

Better bitmap performance with Roaring bitmaps

UpBit: Scalable In-Memory Updatable Bitmap Indexing. Guanting Chen, Yuan Zhang 2/21/2019

Summary. 4. Indexes. 4.0 Indexes. 4.1 Tree Based Indexes. 4.0 Indexes. 19-Nov-10. Last week: This week:

Bitmap Index Design and Evaluation

Reducing the Number of Test Cases for Performance Evaluation of Components

HICAMP Bitmap. A Space-Efficient Updatable Bitmap Index for In-Memory Databases! Bo Wang, Heiner Litz, David R. Cheriton Stanford University DAMON 14

An Asymmetric, Semi-adaptive Text Compression Algorithm

Conflict Detection-based Run-Length Encoding AVX-512 CD Instruction Set in Action

Column Stores versus Search Engines and Applications to Search in Social Networks

A New Compression Method Strictly for English Textual Data

RACBVHs: Random Accessible Compressed Bounding Volume Hierarchies

On Data Latency and Compression

Cluster based Mixed Coding Schemes for Inverted File Index Compression

Searchable Compressed Representations of Very Sparse Bitmaps (extended abstract)

Advanced Data Management

Run-Length and Markov Compression Revisited in DNA Sequences

Study of LZ77 and LZ78 Data Compression Techniques

Multi-Level Bitmap Indexes for Flash Memory Storage

GPU-PLWAH: GPU-based implementation of the PLWAH algorithm for compressing bitmaps

Efficient integration of data mining techniques in DBMSs

IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 10, 2015 ISSN (online):

Performing Joins without Decompression in a Compressed Database System

HW/SW-Database-CoDesign for Compressed Bitmap Index Processing

Big Data Management and NoSQL Databases

Investigating Design Choices between Bitmap index and B-tree index for a Large Data Warehouse System

Comparative Study of Dictionary based Compression Algorithms on Text Data

Analyzing Dshield Logs Using Fully Automatic Cross-Associations

Indexing. CS6200: Information Retrieval. Index Construction. Slides by: Jesse Anderton

The Effect of Non-Greedy Parsing in Ziv-Lempel Compression Methods

(S)LOC Count Evolution for Selected OSS Projects. Tik Report 315

Data Warehouse Physical Design: Part I

Benchmarking a B-tree compression method

Gipfeli - High Speed Compression Algorithm

1/3/2015. Column-Store: An Overview. Row-Store vs Column-Store. Column-Store Optimizations. Compression Compress values per column

On-Disk Bitmap Index In Bizgres

Basic Compression Library

Network Working Group. Category: Informational August 1996

Modeling the Real World for Data Mining: Granular Computing Approach

Efficient Building and Querying of Asian Language Document Databases

A Simple Lossless Compression Heuristic for Grey Scale Images

Category: Informational December 1998

UNIVERSITY OF OKLAHOMA GRADUATE COLLEGE PARALLEL COMPRESSION AND INDEXING ON LARGE-SCALE GEOSPATIAL RASTER DATA ON GPGPUS A THESIS

Special Issue of IJCIM Proceedings of the

An Advanced Text Encryption & Compression System Based on ASCII Values & Arithmetic Encoding to Improve Data Security

THE RELATIVE EFFICIENCY OF DATA COMPRESSION BY LZW AND LZSS

Huffman Coding Implementation on Gzip Deflate Algorithm and its Effect on Website Performance

7: Image Compression

Binary Representations of Integers and the Performance of Selectorecombinative Genetic Algorithms

Processing Posting Lists Using OpenCL

Unified VLSI Systolic Array Design for LZ Data Compression

Query Processing and Optimization Using Set Predicates

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression

Parallel LZ77 Decoding with a GPU. Emmanuel Morfiadakis Supervisor: Dr Eric McCreath College of Engineering and Computer Science, ANU

Data Warehouse Physical Design: Part I

The Impact of Write Back on Cache Performance

Making the Most of Hadoop with Optimized Data Compression (and Boost Performance) Mark Cusack. Chief Architect RainStor

STORING DATA: DISK AND FILES

Automatic Selection of Bitmap Join Indexes in Data Warehouses

A Simpler Proof Of The Average Case Complexity Of Union-Find. With Path Compression

Transcription:

Data Compression for Bitmap Indexes Y. Chen Abstract Compression Ratio (CR) and Logical Operation Time (LOT) are two major measures of the efficiency of bitmap indexing. Previous works by [5, 9, 10, 11] compare the performance of bitmap compression schemes conducted separately on logical operation time and compression ratio. This paper will describe these works and recommend for consideration a new matrix overall efficiency indicator. The overall efficiency indicator is an integrative approach to evaluating the efficiency of bitmap compression schemes, taking both the compression ratio and the logical operation time into account. A further investigation is conducted to examine the effect of bitmap index density upon an overall efficiency indicator, thus shedding light on the selection of compression schemes depending on the bitmap index density. The finding shows that WAH is the most efficient compression scheme among gzip, BBC and WAH within a certain range of bitmap index density (0.0001 to 0.5). This finding is consistent with the proposal in previous studies of bitmap index compression in [9, 10, 11]. 1.0 Introduction Bitmap indexing appears to be an effective and efficient structure to help with processing complex queries like those in data warehousing applications, and have been implemented in several commercial DBMSs. A major advantage of bitmap indexing is that the main operations on the bitmaps during query processing are bitwise logical operations (AND, OR, NOT, XOR, etc.), which are very efficiently supported by hardware. Conventionally, bitmap indexes are space-efficient only for attributes with low cardinality. However, the size of bitmap indexes may become very large for high cardinality data attributes, thus resulting in heavy storage requirements. To address this problem, a number of bitmap compression schemes have been proposed. Most of the data compression schemes, such as gzip, Expol, etc, are designed primarily to achieve good compression, without considering the CPU processing cost (time) associated with logical operations on compressed data. The result is much slower query processing time than for uncompressed bitmap indexes. Compression schemes were therefore viewed as sacrificing the advantages of bitmap indexing in query processing namely, the low-cost bitwise operations and the capability of multiple indexes scans. For this reason, most commercial DBMSs do not use these compression schemes to compress bitmaps when implementing bitmap indexes. A number of schemes specifically designed to compress bitmap indexes have been proposed. One of the best-known schemes is Byte-aligned Bitmap Code (BBC) [2], which is used in ORACLE. BBC can perform logical operations very efficiently compared to the generic compression schemes while offering good compressions. However, in some cases, BBC is still much slower than the uncompressed bitmap indexes during query processing. A recently proposed compression scheme [5, 9, 10, 11], referred to as Word-aligned hybrid code (WAH), is Addendum-T3-1 receiving new attention because of its significant performance advantages over its predecessors. WAH claims a CPU-friendly scheme, and, in most cases, is faster than an uncompressed bitmap index during query processing while still using less space. This paper analyzes and compares performance of the three compression schemes, with a focus on the efficiency of the bitmap indexing, depending on the density of bitmap indexes. The organization of the paper is as follows: In Section 2, a brief review of simple bitmap indexing is given as an introduction. Section 3 reviews three bitmap compression schemes selected as representatives by identifying their key features and their performance characteristics. We introduce the new overall efficiency indicator and discuss the relative performance of the three bitmap compression schemes using the overall efficiency indicator in Section 4. A short summary and discussion of future work are given in Section 5. 2.0 Simple Bitmap Indexing (LIT) Simple bitmap indexing, referred to as literal (LIT) bit vector [12] or verbatim [5], is the first member of the bitmap indexing family, developed for database use in the Model 204 product from Computer Corporation of America [7]. It is a straightforward way of representing a bitmap. A bitmap index on an attribute is the collection of bitmap vectors. The basic idea behind simple bitmap indexing is to use a string of 1 s or 0 s bits to indicate whether an attribute in a tuple is equal to a specific value or not [7]. We can use the classic example, GENDER, to illustrate this. A simple bitmap index on an attribute GENDER, with domain {F, M}, consists of two bit vectors, one for F and the other for M. The position of a bit in the bit string denotes the position of a tuple in the table, and the length of the bit vectors is equal to cardinality of the table. The bits of the bit vector for M are set to 1 if the tuples of the corresponding bit-positions have the attribute GENDER = M, otherwise the bits are set to 0. The same principle applies to bit vector for F. Figure 1 shows a bitmap index for the attribute GENDER and STATE (STATE can have one of three values, NY, CT, CA). One advantage of bitmap indexes is that complex selection predicates can be computed very quickly, by performing bit-wise AND, OR, and NOT operations on the bitmap indexes. To process the example query GENDER=F AND STATE=NY, a bitmap index on attribute GENDER and a bitmap index on attribute STATE are used separately to generate two bitmaps representing objects satisfying the conditions on GENDER and STATE. The final answer is the result of a bitwise logical AND operation on these two bitmaps. The low-cost of bitwise operations in bitmap index processing has led to considerable interest in their use in Decision Support Systems (DSS). However, within high cardinality domains, both the time and space complexity of building and maintaining the simple bitmap index will rapidly become higher, and the sparse of the bitmap vectors increases with high cardinality attributes, resulting in

RID GENDER STATE B F B M B NY B CA B CT 1 F NY 1 0 1 0 0 2 M NY 0 1 1 0 0 3 M CA 0 1 0 1 0 4 F CT 1 0 0 0 1 5 F CA 1 0 0 1 0 TABLE GENDER STATE poor space utilization and high processing cost. Bitmap compression techniques can make bitmap indexes more compact representations of sparse and high cardinality data attributes. 3.0 Review of Bitmap Index Compression Algorithms A variety of techniques for compressing bitmap indexes can be found in a search of literature. In this paper, three major bitmap compression schemes are selected as representatives, from a number of schemes studied previously [2, 5, 9]: Gzip, Byte-aligned Bitmap Code (BBC), and word-aligned hybrid run-length code (WAH). This section is a brief review of the schemes that identifes their key features and their performance characteristics. 3.1 Gzip Gzip is one of the generic compression schemes [4]. It is an implementation of Lempel-Ziv (LZ) encoding, which is designed primarily for data file compression rather than for bitmap compression. LZ encoding compresses repeated sequences of symbols and replaces them with short compression codes. The Gzip scheme supports only the Basic Boolean Operation Algorithm [5]. In the Basic Boolean Operation Algorithm, the bitmap is first uncompressed, and the bitwise operations between the two bitmaps are performed. The time to uncompress the bitmaps occupies the most time during query processing. We chose the gzip as representative of general purpose compression schemes because it is reported to be usually faster than others in decompressing the data files. 3.2 Byte-aligned Bitmap Code (BBC) The second type of compression scheme is specialized bitmap compression schemes. Most of these compression schemes are byte-based, that is, they access computer memory one byte at a time. The Byte-aligned Bitmap Code (BBC) is one of the well-known schemes of this type. It was first proposed by Antoshenkov in [2, 3]. The claimed advantage of the BBC encoding algorithm is efficient bitwise logical operations, since all operations occur locally on full bytes. In addition, it achieves compression almost as good as gzip. Antoshenkov proposed logical operations on compressed bitmaps, which can be significantly faster than operating on uncompressed bitmaps. The BBC scheme can use the Direct Boolean Operation Algorithm, performing logical operations directly on compressed bitmaps [5]. Figure1. Simple Bitmap Indexes for GENDER and STATE Addendum-T3-2 Most specialized bitmap compression schemes, including the BBC scheme, generally try to store consecutive identical bits in compact forms. Run-length encoding (RLE) is among the simplest of these strategies. Run-length encoding represents the successive identical bits (also called a fill or a gap) by their bit value and their length. The bit value of a fill is called the fill bit. The gap can be zero-filled only for one-sided BBC codes, and two-sided BBC codes can have zero-filled or one-filled. There are two properties of BBC encoding that are crucial to the efficiency of the BBC scheme. The first one is the representation of BBC run. Given a bit sequence, the BBC scheme first divides it into bytes, and then groups the bytes into runs. Each BBC run consists of a fill followed by a tail of literal bytes. The fill represents the fill length as the number of bytes rather than number of bits. This strategy uses more bits to represent a fill length than others such as ExpGol [6]. However, it allows for faster operations [5]. The second property of BBC encoding is the byte alignment. This property limits a fill length to be an integer multiple of bytes, and a tail byte is never broken into bits during bitwise logical operation. On most CPUs, working on a whole byte is much more efficient than working on individual bits. Removing the byte alignment might lead to better compression, but the bitwise logical operations are much slower. 3.3 Word-aligned Hybrid Scheme (WAH) The Word-aligned Hybrid Scheme is the newest scheme. On modern computers, accessing one byte usually takes as much time as accessing one word [8]. Word-based compression schemes can take advantage of this fact. The most efficient word-based compression scheme is proposed by Kesheng and his colleagues [9, 10, 11, 12], and is called the word-aligned hybrid (WAH) code. The WAH scheme achieves performance advantage by ensuring that any logical operations are performed on entire words, rather than on bytes or bits. Thus, they can perform bitwise logical operations faster. This section contains brief descriptions of this compression scheme. The WAH code is based on hybrid run-length code (HRL), which represents long fills using run-length encoding and represents short fills literally. In addition to representing all fills and literal bits in whole words, WAH imposes another requirement on the fills. The word alignment requirement demands that all fill lengths be integer multiples of words. This is similar to Byte-alignment requirement in BBC code, except that BBC stores

compressed data in bytse and WAH stores compressed data in words. ratios of the three compression schemes clusters around 1, which is the uncompressible state. There are two types of words in WAH codes: literal words and fill words. A literal word represents a bitmap literally and a fill word represents a fill. The fill represents the fill length as a number of words. The logical operation can be performed on the compressed bitmaps and the time needed for such an operation on two bitmaps is related to the sizes of the compressed bitmaps. Kesheng and his colleagues came to the conclusion that the BBC scheme achieves better compression than the WAH scheme, but bitwise logical operations on BBC are slower than on WAH codes [11, 12]. This is mainly due to the following three reasons. 1. The types of runs: In WAH codes, there are only two types of words, while there are four different types of bytes in BBC codes. It takes more time to determine the type of runs in BBC. After run type is decided, the run may still need to be fully decoded to determine the fill length of the tail value, since BBC codes use a clever packing to minimize space use. 2. WAH is designed to obey word-alignment requirements, while BBC is designed to obey bytealignment requirements. The requirements ensure that WAH always accesses whole words and BBC always accesses whole bytes during logical operations. It takes more time for BBC to load the data from the main memory to CPU registers than for WAH. 3. The way to encode the short fills is different for the WAH and BBC schemes. The BBC scheme encodes shorter fills more compactly than WAH. WAH typically stores the short fill as a literal word, while BBC starts a new run when it encounters a short fill. It is faster to operate on a WAH literal word than on a BBC run, even though WAH cannot compress as well as BBC. Figure 2: Compression Ratio vs. Bit Density 4.2 Performance Comparison Based on Logical Operation Time Some studies have used the OR query to investigate the relationship between logical operation time and bitmap index density [5, 12]. Figure 3 illustrates an interesting trend. As bitmap index density increases, the logical operation time for gzip, BBC and WAH increases until the bitmap index density reaches around 0.2. Then, the logical operation time starts to decline. When density is between 0.0001 and 0.5, WAH takes the least logical operation time. When density is between 0.0001 and 0.01, BBC uses less logical operation time than gzip, yet between 0.01 and 0.5, gzip uses less logical operation time than BBC. 4.0 Performance Analysis Previous study has investigated the relationship between compression ratio and bitmap index density and the relationship between logical operation time and bitmap index density separately. We will describe these works and recommend for consideration a new matrix overall efficiency indicator. The overall efficiency indicator is an integrative approach to evaluating the efficiency of bitmap compression schemes, taking both the compression ratio and the logical operation time into account. A further investigation is conducted to examine the effect of bitmap index density upon an overall efficiency indicator, thus shedding light on the selection of compression schemes depending on the bitmap index density. 4.1 Performance Comparison Based on Compression Ratio [5, 12] Figure 3: Logical OR Operation Time vs. Bit Density Figure 2 illustrates the relationship between compression ratio and bitmap index density. Compression ratio is defined as the ratio of the compressed size divided by the uncompressed (original) size. Across the three compression schemes (gzip, BBC and WAH) as the bitmap index density increases, the compression ratio increases steadily. When the bitmap index density approaches 0.5, the compression Addendum-T3-3 4.3 The Relationship between Compression Ratio and Logical Operation Time In Kesheng s study [12], plots of the relationship between data compression ratio and logical operation time imply a linear relationship within a certain range of bitmap index density. In general, the higher the compression ratio, the longer the logical operation time it takes to process the operation. Therefore, it is possible that the logical operation

time becomes a function of compression ratio. The illustration of the linear relationship between the logical operation time and compression ratio at a certain level of bitmap index density can be found in [12]. 4.4 Overall Efficiency Indicator Although logical operation time and compression ratio are two related measures, using them separately to evaluate compression schemes does not result in a rationale for evaluating overall performance. To overcome this problem, we can use the slope ratio (LOT/CR) as an indicator of the overall efficiency for each compression scheme, assuming a linear relationship between logical operation time and compression ratio. At a certain level of bitmap index density, one unit increase in compression ratio results in different degrees of change in logical operation time for different compression schemes. The shorter the increase of logical operation time, the shorter time it takes to complete the operation, the more efficient is the compression scheme. The slope ratio (LOT/CR) takes both the logical operation time and compression ratio into account and provides a reasonable way to evaluate the overall efficiency of a compression scheme. At a certain level of bitmap index density, it is possible to compute a slope ratio (LOT/CR) between logical operation time and compression ratio for a compression scheme. It is possible that over a certain range of bitmap index density, the slope ratios of a compression scheme can be plotted. The smaller the ratio, the more efficient is the compression scheme. Thus, it becomes relatively easy to compare the efficiencies of the several compression schemes at different bitmap index densities. The rationale of choosing an optimal scheme becomes clear. We generate the following formula for overall efficiency indicator: Overall indicator = Logical Operation Time / Compression Ratio Figure 4: Overall Indicator (Logical OR Time / Compression Ratio) vs. Bit Density Figure 4 illustrates the plots of overall indicators for gzip, BBC and WAH from 0.0001 to 0.5 of bitmap index density. It is clear that within the range of 0.0001 to 0.5, WAH appears to be the most efficient compression scheme among the three. This finding is consistent with the speculations in previous studies. The overall indicator for WAH is roughly around 1 between bitmap index density of 0.0001 and 0.01 and begins to become more efficient as the bitmap index density increases from 0.01. The overall indicator for BBC is relatively stable across the bitmap index density range of 0.0001 and 0.5. The overall indicator for gzip shows the least efficiency at low bitmap index density. As the bitmap index density increases, the overall indicator for gzip surpasses the one for BBC and becomes a better choice after the bitmap index density of 0.01. As the results indicate, the choice of the compression scheme depends on the density of the bitmap to be compressed and operated on. It also depends on what is the most important to a specific application, logical operation time, or space requirement, or the overall efficiency. Table 1 lists the results of the performance comparison of the three compression schemes. Bit Density 0.0001-0.2 Compression Ratio Logical Operation Time Overall Indicator BBC, gzip WAH WAH 0.2-0.5 WAH, BBC WAH WAH Table1. Performance comparison for three compression schemes 4.5 Limitation Addendum-T3-4 The above analytic approach integrates both logical operation time and compression ratio in evaluating the overall efficiency of compression schemes. Although logical operation time and compression ratio seem to be two of the major factors, other factors might also play significant roles in the overall efficiency, such as CPU capacity, IO speed and other query types. An attempt without fully addressing the effects of these factors is still in a preliminary stage, although it provides an integrative approach in analyzing compression scheme. Within a certain range of bitmap index density, the relationship between compression ratio and logical operation time is assumed to be linear, which provides the possibility of using the slope ratio as a justification indicator to evaluate performance. Yet for a full range of bitmap index density, the relationship between compression ratio and logical operation time is relatively unexplored. Therefore, the overall indicator only applies to a certain range of bitmap index density. 5.0 Summary & Future Work Bitmap index compression has received new attention recently because of the emergence of a new compression scheme, WAH. WAH claims significant performance advantages over its predecessors. In this paper, a review of three representative bitmap compression schemes is given. Then we create an overall efficiency indicator by integrating both compression ratio and logical operation time. Based on the overall efficiency indicator, we evaluate the comparative performance of the three compression schemes in various level of bitmap index density. The result shows that WAH

indeed is the most efficient among the three compression schemes within a certain range of bitmap index density. Bitmap compression can reduce space usage and possibly Boolean operation time. However, bitmap compression introduces new complications to bitmap indexing design. Much research work needs to be done in bitmap index design, taking account into compression. For example: Taking into account the overall efficiency indicator in selection of bitmap index compression scheme, depending on the bitmap density level A linear regression relationship exists between compression ratio and logical operation time in a specific range of bitmap index density. It is also likely that a nonlinear relationship might occur outside that range. Further investigation is called for to examine the full relationship. 6.0 References [1] S. Amer-Yahia, and T. Johnson. Optimizing Queries On Compressed Bitmaps. In 26. VLDB, Cairo, Egypt, 2000 [2] G. Antoshenkov. Byte-Aligned bitmap compression. Technical report, Oracle Corp., 1994. U.S. Patent number 5,363,098 [3] G. Antoshenkov and M. Ziauddin. Query processing and optimization in ORACLE RDB. The VLDB Journal, 5:229 237, 1996 [4] Jean loup Gailly and Mark Adler. zlib 1.1.3 manual, July 1998. Source code available at http://www.infozip.org/pub/infozip/zlib. [5] T. Johnson. Performance Measurements of Compressed Bitmap Indices. In VLDB 99, 1999. Pages 278-289 [6] A. Moffat and J. Zobel. Parameterised ompression for sparse bitmaps. Proceedings of ACM-SIGIR 1992, pages 274-285. ACM Press, 1992 [7] Patrick O Neil. Model 204 Architecture and Performance. Springer-Verlag Lecture Notes in Computer Science 359, 2 nd International Workshop on High Performance Transactions Systems, Asilomar, CA, September 1987. [8] D. A. Patterson, J. L. Hennessy, and D. Goldberg. Computer Architecture: A Quantitative Approach. Morgan Kaufmann. 2 nd edition, 1996. [9] K. Wu, E. J. Otoo, A. Shoshani, and H. Nordberg. Notes on design and implementation of compressed bit vectors. Technical Report LBNL/PUB-3162, Lawrence Berkeley National Laboratory, Berkeley, CA, 2001 [10] K. Wu, E. J. Otoo, and A. Shoshani. Word-aligned compressed bitmaps. Technical Report LBNL47807, Lawrence Berkeley National Laboratory, Berkeley, CA, 2001 [11] K. Wu, E. J. Otoo, and A. Shoshani. A Performance Comparison of bitmap indexes. Proceeding in Conference of Information and Knowwledge Management, ACM, 2001 [12] K. Wu, E. J. Otoo, and A. Shoshani. Compressing Bitmap Indexes for Faster Search Operations. Proceedings of the 14 th International Conference on Scientific and Statistical Database Management (SSDBM 02), IEEE, 2002. Addendum-T3-5

Addendum-T3-6