Unified VLSI Systolic Array Design for LZ Data Compression

Unified VLSI Systolic Array Design for LZ Data Compression
Shih-Arn Hwang and Cheng-Wen Wu
Dept. of EE, NTHU, Taiwan, R.O.C.
IEEE Trans. on VLSI Systems, Vol. 9, No. 4, Aug. 2001, Pages: 489-499
Presenter: Liang-Bi Chen

Abstract
Hardware implementation of data compression algorithms is receiving attention due to exponentially expanding network traffic and digital data storage usage. In this paper, we propose several serial one-dimensional and parallel two-dimensional systolic arrays for Lempel-Ziv data compression. A VLSI chip implementation of our optimal linear array is fabricated and tested. The proposed array architecture is scalable. Also, multiple chips (linear arrays) can be connected in parallel to implement the parallel array structure and provide a proportional speedup.
2005/10/5 Unified VLSI Systolic Array Design for LZ Data Compression L. -B. Chen 2/27

Outline
- What's the problem?
- Introduction
- Systolic Algorithm Design
- Systolic LZ Compressor Design
- Implementation
- Conclusion

What's the problem?
- LZ-based algorithms have been widely implemented in software, for example:
  - Compress
  - Zoo
  - lha
  - Pkzip
  - arj
- However, their speed is still too low for real-time applications such as:
  - Wireless data networking
  - High-speed mass-storage transactions
- Hence, a hardware implementation is required for on-the-fly compression and decompression.

Introduction
- Data compression techniques play an important role in data network and storage utilization, as well as in the promotion of portable computing and data communication.
- Many lossless data compression techniques have been proposed in the past and are widely used:
  - Huffman code
  - Arithmetic code
  - Run-length code
  - Lempel-Ziv compression algorithm

LZ hardware realizations
- Microprocessor approach [19]
  - It is not attractive for real-time applications, since it does not fully exploit hardware parallelism.
- Content-addressable memory (CAM) approach [15]-[18]
  - Advantage: it has a constant symbol search time and thereby achieves optimal compression speed.
  - Disadvantage: it has a high hardware cost.
- Systolic-array approach [13], [14], [20]-[22]
- CAM vs. systolic array
  - The CAM approach performs string matching by fully parallel searching; the systolic-array approach does it by pipelining.
  - Compared with CAM-based designs, systolic-array compressors are slower, but better in hardware cost and testability.

Systolic Algorithm Design
- The major concept behind the LZ algorithm is the temporal locality in the information.
- The buffer size (n) and match length (Ls) determine not only the compression efficiency but also the optimal mapping direction of the array architecture.

The sequential compression algorithm
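
The figure for this slide is not reproduced in the transcription. As a rough stand-in, the following is a minimal Python sketch of a sequential sliding-window LZ compressor. It assumes that n is the number of already-coded symbols that are searched, that Ls is the longest match allowed, and that each codeword is a (pointer, length, next symbol) triple. The function names and the codeword layout are illustrative assumptions and may differ from the paper's notation.

```python
def lz_compress(data, n=16, ls=4):
    """Sequential sliding-window LZ sketch (assumed codeword layout).

    Emits (pointer, length, next_symbol) codewords. `pointer` is the
    distance back from the current position; it is 0 when no match is found.
    """
    out = []
    pos = 0
    while pos < len(data):
        best_len, best_ptr = 0, 0
        start = max(0, pos - n)
        # Leave at least one symbol to send as the explicit next symbol.
        max_len = min(ls, len(data) - pos - 1)
        for cand in range(start, pos):  # every candidate start in the window
            length = 0
            while length < max_len and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_ptr = length, pos - cand
        out.append((best_ptr, best_len, data[pos + best_len]))
        pos += best_len + 1
    return out


def lz_decompress(codewords):
    """Inverse of lz_compress for the same assumed codeword layout."""
    buf = []
    for ptr, length, sym in codewords:
        for _ in range(length):
            buf.append(buf[-ptr])  # copy one symbol at a time so overlaps work
        buf.append(sym)
    return "".join(buf)
```

The two nested loops over candidate positions and match offsets are the part that a hardware array would parallelize or pipeline; everything else is bookkeeping.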

Simulations of compression ratio with respect to n and Ls
[Figure: simulated compression ratio for various n and Ls, with the best point marked on the plot.]
- Simulation on various text files.
- Increasing both n and Ls is not the best way to obtain a high compression ratio.
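
One common reason why enlarging both parameters does not keep paying off is that every codeword gets longer: the pointer field needs about log2(n) bits and the length field about log2(Ls) bits, whether or not the matches actually become longer. The sketch below makes that arithmetic concrete. It assumes a (pointer, length, 8-bit symbol) codeword and defines the compression ratio as compressed size divided by original size; both are assumptions for illustration, not definitions taken from the slides.

```python
import math


def codeword_bits(n, ls, symbol_bits=8):
    """Bits per (pointer, length, symbol) codeword under the assumed layout."""
    pointer_bits = math.ceil(math.log2(n))      # address one of n window positions
    length_bits = math.ceil(math.log2(ls + 1))  # encode lengths 0..ls
    return pointer_bits + length_bits + symbol_bits


def compression_ratio(num_symbols, num_codewords, n, ls, symbol_bits=8):
    """Compressed size divided by original size (smaller is better)."""
    compressed = num_codewords * codeword_bits(n, ls, symbol_bits)
    return compressed / (num_symbols * symbol_bits)
```

For example, under these assumptions `codeword_bits(1024, 16)` is 23 bits while `codeword_bits(4096, 64)` is 27 bits, so the larger setting only wins if it reduces the number of codewords by more than about 15%.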

Compression ratio vs. n for different Ls values

Compression ratio vs. Ls for different n values

Systolic LZ Compression Design
- Dependence Graph (DG) [26]
  - Objective: we can achieve the maximum parallelism in an algorithm by carefully studying the data dependencies in the computations.
  - A DG shows the dependence of the computations that occur in an algorithm and can be considered a graphical representation of single-assignment code.
  - From the single-assignment code, the DG of the LZ algorithm can be obtained.

The single-assignment code
- To guarantee single assignment, we use an extra index, j.
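
The code listing for this slide is not reproduced in the transcription. To illustrate the idea only, the sketch below computes match lengths in single-assignment form: instead of overwriting one running match-length variable, every intermediate value is stored once in an array indexed by the candidate position i and the extra index j (the offset into the look-ahead string). The names `window`, `lookahead`, `run`, and `length` are hypothetical, and a non-empty look-ahead string is assumed.

```python
def match_lengths_single_assignment(window, lookahead):
    """Match length of every window position against the look-ahead string.

    run[i][j] is 1 while the string starting at position i still matches
    lookahead[0..j], and 0 once it has failed. length[i][j] is the match
    length after j + 1 comparisons. Each entry is written exactly once.
    """
    n, ls = len(window), len(lookahead)
    ext = window + lookahead  # a match may run from the window into the look-ahead
    run = [[0] * ls for _ in range(n)]
    length = [[0] * ls for _ in range(n)]
    for i in range(n):
        for j in range(ls):
            prev_run = run[i][j - 1] if j > 0 else 1
            prev_len = length[i][j - 1] if j > 0 else 0
            run[i][j] = prev_run if ext[i + j] == lookahead[j] else 0
            length[i][j] = prev_len + run[i][j]
    return [row[-1] for row in length]
```

Because node (i, j) depends only on node (i, j - 1) and on the two symbols being compared, the index pairs can be read directly as the nodes of a dependence graph.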

Global DG of the compression algorithm
- A DG which contains global signals is called a global DG.

Localized DG
- The global DG can be transformed into a localized DG, in which only local communication is involved.
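
The DG figures are not reproduced in the transcription. As a generic illustration of what localization means, the sketch below simulates a linear array of processing elements (PEs) in which each look-ahead symbol is handed from one PE to its right-hand neighbor once per time step instead of being broadcast to all PEs at once. This is a simplified model written for this summary, not any of the paper's array types: it assumes PE i is responsible for candidate position i and that the buffer symbols each PE needs are already available locally.

```python
def systolic_match(window, lookahead):
    """Match lengths computed with neighbor-to-neighbor symbol passing only."""
    n, ls = len(window), len(lookahead)
    if n == 0:
        return []
    ext = window + lookahead   # symbols assumed locally available to each PE
    y_reg = [None] * n         # look-ahead symbol currently held by each PE
    seen = [0] * n             # how many look-ahead symbols each PE has consumed
    alive = [True] * n         # PE i's match has not failed yet
    length = [0] * n
    for t in range(n + ls - 1):
        # Shift every register one PE to the right (local communication).
        for i in range(n - 1, 0, -1):
            y_reg[i] = y_reg[i - 1]
        y_reg[0] = lookahead[t] if t < ls else None
        # Each PE compares the symbol it holds with its own buffer symbol.
        for i in range(n):
            if y_reg[i] is None:
                continue
            j = seen[i]
            if alive[i] and ext[i + j] == y_reg[i]:
                length[i] += 1
            else:
                alive[i] = False
            seen[i] += 1
    return length
```

In this model the results are the same as with a broadcast, but PE i receives its first symbol i steps late, so the whole computation takes n + Ls - 1 steps. That latency is the price of removing the global signal.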

Type-1 array [14]

The double buffer

Type-2 Array

The longest match length decision block

On-line buffer updating unit

Interleaved Type-2 (Type-2i) Array

Type-3

Type-4

Parallel Type-2i Array

Implementation

Conclusions
- By investigating possible mapping and scheduling directions on the dependence graph, we propose the optimal array structure for LZ compression, which is better than the two recently proposed designs with respect to hardware cost and testability.
- Parallel arrays obtained from the block transform of the dependence graph can be used to improve the compression rate. They provide a tradeoff of cost and performance between two extremes.