Engineering Mathematics II Lecture 16 Compression


010.141 Engineering Mathematics II, Lecture 16: Compression
Bob McKay
School of Computer Science and Engineering, College of Engineering, Seoul National University

Outline

Lossless Compression
- Huffman & Shannon-Fano
- Arithmetic Compression
- The LZ Family of Algorithms

Lossy Compression
- Fourier Compression
- Wavelet Compression
- Fractal Compression

Lossless Compression

Lossless encoding methods guarantee to reproduce exactly the same data as was input to them.

Run Length Encoding

A run of repeated characters is encoded as the character followed by its count in angle brackets:

    Original Data String      Encoded Data String
    $******55.72              $*<6>55.72
    ---------                 -<9>
    Guns          Butter      Guns <10>Butter
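The scheme above can be sketched in a few lines. The function name `rle_encode` is illustrative, and the choice to encode only runs of four or more characters is an assumption (shorter runs would grow under the char<count> notation):

```python
def rle_encode(s, min_run=4):
    """Replace any run of min_run or more repeated characters
    with the notation char<count>; shorter runs pass through."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                      # extend the current run
        run = j - i
        if run >= min_run:
            out.append(f"{s[i]}<{run}>")
        else:
            out.append(s[i] * run)      # too short to be worth encoding
        i = j
    return "".join(out)

print(rle_encode("$******55.72"))  # $*<6>55.72
```

A real encoder would also need some way to escape a literal '<' appearing in the data.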

Relative Encoding

Useful when there are sequences of runs of data that vary only slightly from one run to the next, e.g. the lines of a fax.
- The position of each change is denoted relative to the start of the line
- The position indicator can be followed by a numeric count indicating the number of successive changes
- For further compression, the position of the next change can be denoted relative to the previous change
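A minimal sketch of the idea, assuming each scan line is a string of the same length as the previous one. The function names are illustrative, and this version records positions relative to the start of the line rather than the further-compressed relative-to-previous form:

```python
def relative_encode(prev_line, line):
    """List the (position, new value) pairs where this line
    differs from the previous one."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev_line, line)) if p != c]

def relative_decode(prev_line, changes):
    """Rebuild a line from the previous line plus the change list."""
    out = list(prev_line)
    for i, c in changes:
        out[i] = c
    return "".join(out)

changes = relative_encode("0001111000", "0011110000")
print(changes)  # [(2, '1'), (6, '0')]
```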

Statistical Compression

For the examples below, we will use a simple alphabet with the following frequencies of occurrence (after Held):

    Character   Probability
    X1          0.10
    X2          0.05
    X3          0.20
    X4          0.15
    X5          0.15
    X6          0.25
    X7          0.10

Huffman Encoding

- Arrange the character set in order of decreasing probability
- While there is more than one probability class, merge the two lowest-probability classes and add their probabilities to obtain a composite probability
- At each branch of the resulting binary tree, allocate a '0' to one branch and a '1' to the other
- The code for each character is found by traversing the tree from the root node to that character

Huffman Encoding (example)

    Character   Probability   Code
    X6          0.25          00
    X3          0.20          010
    X4          0.15          011
    X5          0.15          100
    X1          0.10          101
    X7          0.10          110
    X2          0.05          111

(Reading off the slide's tree, the composite probabilities formed during merging are 0.15, 0.25, 0.35, 0.4, 0.6 and 1.0.)
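A minimal sketch of the merging procedure using a priority queue (function name illustrative). Tie-breaking between equal probabilities means the individual codes need not match the table above, but any strict two-smallest merge gives an average code length of 2.7 bits per symbol on this example alphabet:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Repeatedly merge the two lowest-probability classes;
    prepend '0' to one side's codes and '1' to the other's."""
    tiebreak = count()  # keeps heap comparisons away from the dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

freqs = {"X1": 0.10, "X2": 0.05, "X3": 0.20, "X4": 0.15,
         "X5": 0.15, "X6": 0.25, "X7": 0.10}
codes = huffman_codes(freqs)
avg = sum(p * len(codes[s]) for s, p in freqs.items())
print(avg)  # 2.7 (up to floating-point rounding)
```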

Shannon-Fano Algorithm

- Arrange the character set in order of decreasing probability
- While a probability class contains more than one symbol:
  - Divide the probability class in two so that the probabilities in the two halves are as nearly as possible equal
  - Assign a '1' to the first probability class, and a '0' to the second

Shannon-Fano Encoding (example)

    Character   Probability   Code
    X6          0.25          11
    X3          0.20          10
    X4          0.15          011
    X5          0.15          010
    X1          0.10          001
    X7          0.10          0001
    X2          0.05          0000
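The recursive splitting can be sketched as follows (function name illustrative); on the example alphabet it reproduces exactly the codes in the table above:

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, probability) pairs,
    sorted by decreasing probability."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # find the split point where the two halves are as nearly equal as possible
    best_i, best_diff = 1, float("inf")
    running = 0.0
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_i, best_diff = i, diff
    # '1' for the first (higher-probability) class, '0' for the second
    codes = {s: "1" + c for s, c in shannon_fano(symbols[:best_i]).items()}
    codes.update({s: "0" + c for s, c in shannon_fano(symbols[best_i:]).items()})
    return codes

table = [("X6", 0.25), ("X3", 0.20), ("X4", 0.15), ("X5", 0.15),
         ("X1", 0.10), ("X7", 0.10), ("X2", 0.05)]
print(shannon_fano(table))
```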

Arithmetic Coding

Arithmetic coding assumes there is a model for statistically predicting the next character of the string to be encoded.
- An order-0 model predicts the next symbol based on its probability, independent of previous characters. For example, an order-0 model of English predicts the highest probability for 'e'.
- An order-1 model predicts the next symbol based on the preceding character. For example, if the preceding character is 'q', then 'u' is a likely next character.
- And so on for higher-order models.

Arithmetic Coding

Arithmetic coding assumes the coder and decoder share the probability table. The main data structure of arithmetic coding is an interval, representing the string constructed so far; its initial value is [0,1]. At each stage, the current interval [min,max] is subdivided into sub-intervals corresponding to the probability model for the next character. The interval chosen is the one representing the actual next character: the more probable the character, the larger the interval. The coder output is a number in the final interval.

Arithmetic Coding

    Character   Probability
    X1          0.10
    X2          0.05
    X3          0.20
    X4          0.15
    X5          0.15
    X6          0.25
    X7          0.10

Arithmetic Coding

Suppose we want to encode the string X1X3X7:
- After X1, our interval is [0, 0.1]
- After X3, it is [0.015, 0.035]
- After X7, it is [0.033, 0.035]
The natural output to choose is the shortest binary fraction in [0.033, 0.035]. Obviously, the algorithm as stated requires infinite precision; slight variants re-normalise at each stage to remain within computer precision.
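The interval narrowing can be sketched directly from the probability table (function name illustrative; like the description above, this sketch ignores precision limits rather than re-normalising):

```python
def arithmetic_interval(message, probs):
    """Narrow the interval [low, high) symbol by symbol.
    probs maps each symbol to its probability, in a fixed order."""
    cum, start = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = start            # cumulative probability below sym
        start += p
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        high = low + width * (cum[sym] + probs[sym])
        low = low + width * cum[sym]
    return low, high

probs = {"X1": 0.10, "X2": 0.05, "X3": 0.20, "X4": 0.15,
         "X5": 0.15, "X6": 0.25, "X7": 0.10}
print(arithmetic_interval(["X1", "X3", "X7"], probs))
```

Encoding X1X3X7 with the table yields (0.033, 0.035), as on the slide, up to floating-point rounding.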

Substitutional Compression

The basic idea behind a substitutional compressor is to replace an occurrence of a particular phrase with a reference to a previous occurrence. There are two main classes of scheme, LZ77 and LZ78, named after Jacob Ziv and Abraham Lempel, who first proposed them in 1977 and 1978.

LZW

LZW is an LZ78-based scheme designed by Terry Welch in 1984. LZ78 schemes work by putting phrases into a dictionary; when a repeat occurrence of a particular phrase is found, they output the dictionary index instead of the phrase. LZW starts with a 4K dictionary:
- entries 0-255 refer to individual bytes
- entries 256-4095 refer to substrings
Each time a new code is generated, it means a new string has been parsed. New strings are generated by adding the current character k to the end of an existing string w (until the dictionary is full).

LZW Algorithm

    set w = empty
    loop:
        read a character k
        if wk exists in the dictionary
            w = wk
        else
            output the code for w
            add wk to the dictionary
            w = k
    end loop
    output the code for w
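A runnable version of the pseudocode; the 4096-entry cap follows the 4K dictionary described above, and the function name is illustrative:

```python
def lzw_encode(data):
    """LZW encoding: dictionary entries 0-255 are single bytes;
    longer strings get codes from 256 upward, stopping at 4096."""
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    w = b""
    out = []
    for byte in data:
        wk = w + bytes([byte])
        if wk in dictionary:
            w = wk                      # keep extending the current string
        else:
            out.append(dictionary[w])   # emit the longest known prefix
            if next_code < 4096:        # stop growing at 4K entries
                dictionary[wk] = next_code
                next_code += 1
            w = bytes([byte])
    if w:
        out.append(dictionary[w])       # flush the final string
    return out

print(lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT"))
# [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
```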

LZW

The most remarkable feature of this type of compression is that the entire dictionary reaches the decoder without ever being explicitly transmitted: at the end of the run, the decoder will have a dictionary identical to the encoder's, built up entirely as part of the decoding process. Codings in this family are behind such representations as .gif. They were previously under patent, but the relevant patents have now expired.
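The decoder rebuilds the same dictionary as it goes, which is exactly the property described above. A sketch (function name illustrative); the one subtlety is a code that refers to the string currently being defined, handled in the else branch:

```python
def lzw_decode(codes):
    """Rebuild the dictionary while decoding an LZW code stream."""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    w = dictionary[codes[0]]
    out = [w]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            entry = w + w[:1]           # code defined by this very step
        dictionary[next_code] = w + entry[:1]   # same entry the encoder added
        next_code += 1
        out.append(entry)
        w = entry
    return b"".join(out)

print(lzw_decode([84, 79, 66, 69, 79, 82, 78, 79, 84,
                  256, 258, 260, 265, 259, 261, 263]))
# b'TOBEORNOTTOBEORTOBEORNOT'
```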

Lossy Compression

Lossy compression algorithms do not guarantee to reproduce the original input. They achieve much higher compression by discarding only what is near enough to imperceptible to be acceptable. Usually, "acceptable" is judged by a human sense: sight (jpeg), hearing (mp3), motion understanding (mp4). This requires a model of what is acceptable, and the model may only be accurate in some circumstances, which is why compressing a text or line drawing with jpeg is a bad idea.

Fourier Compression (jpeg)

The Fourier transform of a dataset is a frequency representation of that dataset. You have probably already seen graphs of Fourier transforms: the frequency diagram of a sound sample is a graph of the Fourier transform of the original data, which you otherwise see graphed as a time/amplitude diagram.

Fourier Compression

From our point of view, the important features of the Fourier transform are:
- It is invertible: the original dataset can be rebuilt from the Fourier transform
- Graphic images of the world usually contain spatially repetitive information patterns
- Human senses are (usually) poor at detecting low-amplitude visual frequencies
- The Fourier transform usually has information concentrated at particular frequencies and depleted at others
- The depleted frequencies can be transmitted at low precision without serious loss of overall information

Discrete Cosine Transform

A discretised version of the Fourier transform, suited to representing spatially quantised (i.e. raster) images in a frequency-quantised (i.e. tabular) format. Mathematically, the DCT of a function f ranging over a discrete variable x (omitting various important constants) is given by

    F(n) = Σx f(x) cos(nπx)

Of course, we're usually interested in two-dimensional images, and hence need the two-dimensional DCT, given (omitting even more important constants) by

    F(m,n) = Σx Σy f(x,y) cos(mπx) cos(nπy)
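Putting the omitted constants back in, a runnable one-dimensional sketch of the orthonormal DCT-II and its inverse (function names illustrative): F(n) = c(n) Σx f(x) cos(π(2x+1)n / 2N). Note how a constant signal concentrates all its energy in F(0), the property the previous slide relies on:

```python
import math

def dct2(f):
    """Orthonormal 1-D DCT-II, including the normalising constants
    the slide omits."""
    N = len(f)
    F = []
    for n in range(N):
        s = sum(f[x] * math.cos(math.pi * (2 * x + 1) * n / (2 * N))
                for x in range(N))
        c = math.sqrt(1 / N) if n == 0 else math.sqrt(2 / N)
        F.append(c * s)
    return F

def idct2(F):
    """Inverse of the orthonormal DCT-II above."""
    N = len(F)
    return [sum((math.sqrt(1 / N) if n == 0 else math.sqrt(2 / N)) * F[n]
                * math.cos(math.pi * (2 * x + 1) * n / (2 * N))
                for n in range(N))
            for x in range(N)]

print(dct2([5.0, 5.0, 5.0, 5.0]))  # only F(0) is (numerically) nonzero
```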

Fourier Compression Revisited

Fourier-related transforms are based on sine (or cosine) functions of various frequencies. The transform is a record of how to add together the periodic functions to obtain the original function. Really, all we need is a basis set of functions: a set of functions that can generate all the others.

The Haar Transform

Instead of periodic functions, we could add together discrete step functions. (The slide illustrates the first few Haar basis functions: square waves that switch between two levels over successively narrower supports.) This would give us the Haar transform. It can also be used to compress image data, though not as efficiently as the DCT: images compressed at the same rate as the DCT tend to look blocky, so less compression must be used to give the same impression.
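One common formulation of the Haar transform is repeated pairwise averaging and differencing; a sketch, assuming the input length is a power of two and omitting normalisation (function name illustrative):

```python
def haar_transform(signal):
    """Full Haar decomposition: at each level, replace the first n
    values with pairwise averages followed by pairwise half-differences,
    then recurse on the averages."""
    out = list(signal)
    n = len(out)
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avgs + diffs
        n = half
    return out

print(haar_transform([9, 7, 3, 5]))  # [6.0, 2.0, 1.0, -1.0]
```

For smooth data the difference coefficients are small, which is what makes them cheap to store or discard.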

Wavelet Compression

Wavelet compression uses a basis set intermediate between the Fourier and Haar transforms. The functions are smoothed versions of the Haar functions:
- They have a sinusoidal rather than square shape
- They don't die out abruptly at the edges, but decay into lower amplitude
Wavelet compression can give very high ratios, attributed to similarities between wavelet functions and the edge detection present in the human retina: wavelet functions encode just the detail that we see best.

Vector Quantisation

Relies on building a codebook of similar image portions; only one copy of the similar portions is transmitted. This is just as LZ compression relies on building a dictionary of strings seen so far and transmitting only references to the dictionary.

Fractal Compression

Relies on self-similarity of (parts of) the image to reduce transmission. It has a similar relation to vector quantisation methods as LZW has to LZ: LZW can be thought of as LZ in which the dictionary is derived from the part of the text seen so far, and fractal compression can be viewed as deriving its codebook from the portion of the image seen so far.

Compression Times

For transform encodings such as the DCT or wavelets, compression and decompression times are roughly comparable. For fractal compression, compression takes orders of magnitude longer than decompression, because it is difficult to find the right codebook. Fractal compression is therefore well suited to applications where pre-canned images will be accessed many times over.

Summary

Lossless Compression
- Huffman & Shannon-Fano
- Arithmetic Compression
- The LZ Family of Algorithms

Lossy Compression
- Fourier Compression
- Wavelet Compression
- Fractal Compression

Thank you.