Data compression with Huffman and LZW

1 Data compression with Huffman and LZW. André R. Brodtkorb

2 Outline Data storage and compression Huffman: how it works and where it's used LZW: how it works and where it's used Summary and further reading

3 Data Storage and Compression

4 Data Storage: oral tradition, written text, printed text, electronic storage. Image credits: Little Red Riding Hood, Wikipedia [Public domain, Gustave Doré]; Jean Miélot, Wikipedia [Public domain, Jean Le Tavernier]; Gutenberg Bible, Wikipedia [CC-BY-SA 2.0, user NYC Wanderer (Kevin Eng)]; Whirlwind's core memory, Wikipedia [CC-BY-SA 3.0, user Dpbsmith]

5 Why do we need data compression? Data is massive: Slackware Linux consisted of over 70 floppy disks in 1994 [Slackware]; 2.5 billion gigabytes of new data were created every day in 2012 [IBM]. Data is stored inefficiently: ASCII text is used to represent numbers, and consecutive frames of a video are often close to identical. Storage and bandwidth are limited: world average bandwidth was 3.9 Mbit/s [Akamai, 2014], so downloading 1 GB takes more than 30 minutes! Image credit: Floppy disk, Wikipedia [public domain, George Chernilevsky]

6 Types of data compression Data compression tries to remove redundant or superfluous information. Lossless compression: remove redundant information; the original signal can be reproduced exactly. Lossy compression: remove superfluous information; the original signal can only be reproduced with (minor) differences.

7 Lossless data compression example: color lookup tables. A 24-bit (RGB) color image can contain over 16 million different colors. Chicago "Cloud Gate" image has unique colors and 5 million pixels (2592x1936). Need 18 bits to represent the unique colors. Original: 5M pixels x 24 bits / pixel = MB. Look-up table: 5M pixels x 18 bits / pixel + 16K colors x (18+24) bits / color = MB
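The size comparison on this slide can be sketched as a small calculation. The unique-color count and the resulting sizes are not available in this transcript, so `hypothetical_colors` below is a placeholder chosen only because it needs 18 index bits; the function names are illustrative as well.

```python
def raw_bits(pixels: int, color_bits: int = 24) -> int:
    """Size of the image when every pixel stores a full color."""
    return pixels * color_bits


def lookup_table_bits(pixels: int, unique_colors: int, color_bits: int = 24) -> int:
    """Size when every pixel stores an index into a table of unique colors.

    Mirrors the slide's formula: pixels x index bits
    + colors x (index bits + color bits).
    """
    index_bits = max(1, (unique_colors - 1).bit_length())
    return pixels * index_bits + unique_colors * (index_bits + color_bits)


pixels = 2592 * 1936
hypothetical_colors = 200_000  # assumption, NOT the value from the slide

print(f"raw:          {raw_bits(pixels) / 8e6:.2f} MB")
print(f"lookup table: {lookup_table_bits(pixels, hypothetical_colors) / 8e6:.2f} MB")
```

The saving comes entirely from the narrower per-pixel index, so it shrinks as the number of unique colors approaches 2^24.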

8 Lossy data compression example The human eye is not very sensitive when it comes to colors. Simply reduce the number of colors to decrease the bits per pixel. Information is lost and cannot be recovered. Caveat: the following slides require a good projector.

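9-12 [Figures: the example image at decreasing color counts. Recoverable labels: 11.6 MB (slide 9), 256 colors (slide 10), 64 colors (slide 11), three versions compared by size (slide 12).]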

13 Huffman coding

14 Huffman Code A method for lossless compression of data. Introduced by David A. Huffman during his Ph.D. [1]. Basic idea is to replace the original alphabet with variable-length codes, similar to Morse code. Image credit: David A Huffman [Don Harris]. [1] Huffman, D. (1952). "A Method for the Construction of Minimum-Redundancy Codes". Proceedings of the IRE 40 (9).

15 Morse and Huffman 1/2 Morse code uses variable-length codes: symbols are separated by a short pause, words by a long pause. Often-used symbols have shorter codes: E is one dot and takes one unit of time to transmit; 0 is five dashes and takes 19 units of time to transmit. Image credit: Morse code, Wikipedia [Public domain, Rhey T. Snodgrass & Victor F. Camp, 1922]
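The two timings quoted above follow from the standard Morse timing rules (a dot lasts one unit, a dash three units, and the gap between elements of one symbol one unit). A minimal sketch; the `MORSE` table holds only a few symbols for illustration:

```python
# A handful of Morse codes, just enough for the example.
MORSE = {"E": ".", "T": "-", "A": ".-", "0": "-----"}


def transmit_units(code: str) -> int:
    """Time units to send one symbol: dot = 1, dash = 3, gap between elements = 1."""
    elements = sum(1 if element == "." else 3 for element in code)
    gaps = len(code) - 1
    return elements + gaps


for symbol, code in MORSE.items():
    print(symbol, code, transmit_units(code))
```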

16 Morse and Huffman 2/2 Morse code can be written as a binary tree. Start at the top: if the next tone is a dot, go left; if the next tone is a dash, go right. Stop when there is a pause, and you have found your letter. Huffman similarly uses a binary tree, but is an algorithm for finding the optimal code for each symbol; it removes the need for pauses and minimizes the number of dots/dashes. Image credit: Morse code tree, Wikipedia [CC-BY-SA 3.0, user Aris00]

17 Huffman example 1/4 We have an alphabet of symbols used to encode, e.g., "abracadabra".
1. Create a binary tree with all symbols*
2. Assign a binary code to each connector (just like Morse): right child = 1, left child = 0
* We'll cover how to create the tree in a couple of slides
[Figure: binary tree with leaves a, c, d, r, b and branches labelled 0 and 1]

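18 Huffman example 2/4 Read off the code for each symbol by traversing the tree.
Symbol: a b c d r
Code: [values not preserved in this transcript]
Replace symbols by codes: a b r a c a d a b r a
Send message: [bit string not preserved in this transcript]
[Figure: the binary tree with leaves a, c, d, r, b]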

19 Huffman example 3/4 To decode the message, we need the binary tree and the message itself. The tree can be static / predefined, or dynamic and transmitted with the message itself. Read the message bit by bit and traverse the tree to decode: 1: follow the right child; 0: follow the left child. When you reach a leaf node, you have found your symbol! Then start again from the root for the next symbol. [Figure: the binary tree with leaves a, c, d, r, b]
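The traversal described above can be sketched as follows. The tree shape is an assumption: it places the leaves in the order a, c, d, r, b with a directly under the root, which matches the leaf order shown on the slides, but the exact codes used in the lecture are not preserved here.

```python
# Assumed tree: a leaf is a string, an internal node is a (left, right) pair.
TREE = ("a", (("c", "d"), ("r", "b")))

# Codes implied by that assumed tree (left = 0, right = 1).
CODES = {"a": "0", "c": "100", "d": "101", "r": "110", "b": "111"}


def huffman_decode(bits: str, tree=TREE) -> str:
    """Walk the tree bit by bit; emit a symbol and restart at every leaf."""
    output = []
    node = tree
    for bit in bits:
        node = node[1] if bit == "1" else node[0]
        if isinstance(node, str):  # reached a leaf
            output.append(node)
            node = tree
    if node is not tree:
        raise ValueError("bit stream ended in the middle of a code")
    return "".join(output)


message = "".join(CODES[symbol] for symbol in "abracadabra")
print(message, "->", huffman_decode(message))
```

No separators are needed between codes because no code is a prefix of another; that is what lets the decoder restart at the root as soon as it hits a leaf.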

20-31 Huffman example 4/4 (animation; one symbol is decoded per step) [Figure: the binary tree with leaves a, c, d, r, b] Message decoded symbol by symbol: a b r a c a d a b r a

32 Creating the Huffman tree 1/4 A Huffman tree is generated from the frequency or probability of each symbol. Our text "abracadabra" gives rise to the following:
Symbol        a      b      c      d      r
Frequency     5      2      1      1      2
Probability   5/11   2/11   1/11   1/11   2/11
The aim is to give the shortest code to the most frequent symbol.
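Counting the symbols is the first step of any implementation. A minimal sketch using only the standard library (`symbol_stats` is an illustrative name):

```python
from collections import Counter


def symbol_stats(text: str) -> dict:
    """Map each symbol to (frequency, probability)."""
    counts = Counter(text)
    total = len(text)
    return {symbol: (n, n / total) for symbol, n in sorted(counts.items())}


for symbol, (freq, prob) in symbol_stats("abracadabra").items():
    print(f"{symbol}: frequency {freq}, probability {prob:.3f}")
```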

33 Creating the Huffman tree 2/4 Algorithm
1. Start by creating a node for each symbol
2. Add all nodes to a priority queue
3. Remove the two least frequent nodes, and make a parent node for them
4. Add the newly created node (with the cumulative probability) to the priority queue
5. Go to 3, until there is a single node left: the tree is then complete
Traverse the tree, and assign the Huffman codes
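The steps above can be sketched with Python's `heapq` module as the priority queue. This is an illustration, not the lecture's implementation: it assumes a non-empty input, and the tie-breaking counter is an addition that keeps equal-frequency nodes in insertion order, so the individual codes may differ from those on the slides while the total encoded length stays the same.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text: str):
    """Return a tree where a leaf is a symbol and a node is a (left, right) pair."""
    tiebreak = count()  # unique counter so the heap never compares tree nodes
    heap = [(freq, next(tiebreak), symbol) for symbol, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        freq1, _, left = heapq.heappop(heap)   # two least frequent nodes
        freq2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (freq1 + freq2, next(tiebreak), (left, right)))
    return heap[0][2]


def assign_codes(node, prefix: str = "", codes=None) -> dict:
    """Left edge = 0, right edge = 1; a leaf's code is the path from the root."""
    if codes is None:
        codes = {}
    if isinstance(node, str):
        codes[node] = prefix or "0"  # single-symbol input still needs one bit
    else:
        assign_codes(node[0], prefix + "0", codes)
        assign_codes(node[1], prefix + "1", codes)
    return codes


codes = assign_codes(build_tree("abracadabra"))
encoded = "".join(codes[symbol] for symbol in "abracadabra")
print(codes, len(encoded), "bits")
```

The most frequent symbol (a) ends up directly under the root with a one-bit code, and the whole text needs 23 bits instead of 11 x 3 = 33 bits with fixed-width codes.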

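34-38 Creating the Huffman tree 3/4 (animation; one merge per step)
Start: a:5 c:1 d:1 r:2 b:2
Merge c and d into a node of weight 2: 2 a:5 r:2 b:2
Merge r and b into a node of weight 4: 4 2 a:5
Merge the nodes of weight 4 and 2 into a node of weight 6: 6 a:5
Merge the node of weight 6 and a into the root of weight 11: 11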

39-43 Creating the Huffman tree 4/4 [Animation: codes are assigned by walking the finished tree with leaves a:5, c:1, d:1, r:2, b:2; the individual codes are not preserved in this transcript]

44 Implementing and testing 1/2 Huffman is "simple" in principle (once you get it), but can be challenging to get right: you need to fiddle with bits, think about byte order, etc. Difficult to debug: the output is bits. Around 290 lines of code. Compression test: Test dataset A: Macbeth by Shakespeare, HTML, 202 KB. Available from Test dataset B: Bus video file, yuv, uncompressed video, 1.37 MB. Available from

45 Implementing and testing 2/2 Test A: Input: , output: (including tree). Compressed size: 65% of original. Symbols: 85 = 2^6.4 => 7 bits / symbol without Huffman. Achieved Entropy: bits / symbol. Optimal [Shannon]: bits / symbol. Test B: Input: , output: (including tree). Compressed size: 86% of original. Symbols: 256 = 2^8 => 8 bits / symbol without Huffman. Entropy: bits / symbol. Optimal [Shannon]: bits / symbol
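The measured values for the two test files are not preserved in this transcript, but the quantities themselves are easy to compute. The sketch below shows the fixed-width bits per symbol and the Shannon entropy, using the short "abracadabra" text as a stand-in for the real data sets:

```python
import math
from collections import Counter


def fixed_width_bits(alphabet_size: int) -> int:
    """Bits per symbol needed without Huffman, e.g. 85 symbols -> 7 bits."""
    return max(1, (alphabet_size - 1).bit_length())


def entropy_bits_per_symbol(data) -> float:
    """Shannon entropy: -sum(p * log2(p)) over the observed symbols."""
    counts = Counter(data)
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


text = "abracadabra"
print(fixed_width_bits(85), fixed_width_bits(256))
print(f"entropy: {entropy_bits_per_symbol(text):.3f} bits / symbol")
print(f"Huffman (23 bits for 11 symbols): {23 / 11:.3f} bits / symbol")
```

A Huffman code can never beat the entropy, because every symbol must be assigned a whole number of bits; the gap is small when the probabilities are close to powers of 1/2.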

46 Uses of Huffman coding Fax machines: combination of run-length encoding and Huffman. JPEG images: DCT, followed by quantization (information loss), and Huffman. MP3 files: information removal followed by Huffman coding. DEFLATE: uses LZ77 and Huffman coding, and is an integral part of many tools and file formats. Examples: zlib, PNG, gzip, SSH, HTTP, ...
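DEFLATE is available from Python's standard library through the `zlib` module, which makes it easy to try on repetitive data. A minimal round-trip sketch (the sample input is made up for illustration):

```python
import zlib


def deflate_roundtrip(data: bytes):
    """Compress with zlib (DEFLATE) at the highest level and restore the input."""
    packed = zlib.compress(data, 9)
    restored = zlib.decompress(packed)
    return packed, restored


sample = b"abracadabra " * 200
packed, restored = deflate_roundtrip(sample)
print(len(sample), "->", len(packed), "bytes")
```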

47 Huffman speed 1/2 The Huffman tree can be generated in O(n log n) for n symbols using a priority queue. Building the Huffman tree is typically negligible for long data sets; compression time is a function of data set size. The ubiquity of Huffman coding means that there are a lot of efficient implementations out there: zlib (compression of text and data); libjpeg-turbo (JPEG encoder tailored for VNC); Intel IPP (Intel Integrated Performance Primitives); FFmpeg (video and audio); lodepng (self-contained PNG decoder); NVIDIA GPUs (hardware support, probably also AMD GPUs); FPGAs (real-time streaming of Huffman).

48 Huffman speed 2/2 A major problem with Huffman today is its serial nature: decoding of the stream can only be done bit by bit. Some JPEG formats use restart markers to enable parallel decoding (patented). DEFLATE can be decoded in parallel by splitting into blocks. Today we have 4-12 threads in standard PCs, but decoding a single Huffman stream does not scale to more than one thread.

49 LZW

50 LZW [1] A compression algorithm named after Abraham Lempel (1936-), Jacob Ziv (1931-), and Terry Welch (1939?-1988). A modification by Terry Welch of the earlier LZ77 and LZ78 algorithms. Patented in 1983 in the US, and in 1984 in the UK, France, Germany, Italy, Japan, and Canada. Basic idea is to replace several symbols with a single code. [1] Welch, Terry (1984). "A Technique for High-Performance Data Compression". Computer 17 (6). Image credits: Abraham Lempel, Wikipedia [CC-BY-SA 3.0, user Staelin]; Jacob Ziv, Wikipedia [Public domain, user חישוביות]

51 Dictionaries for compression The main idea of LZW is similar to logograms: each code refers to a word, part of a word, etc. Create a dictionary which translates symbols into codes and vice versa. The LZW algorithm creates the dictionary automatically based on the input data. Image credits: Hieroglyphs, Wikipedia [Public domain, user Vincnet]; The Story of Shi Shi Eating Lions, Wikipedia [CC-BY-SA 3.0]

52 LZW Example 1/5 We have data we want to compress, in this case "abracadabra". LZW dynamically* creates a dictionary of strings based on the data we want to compress. This dictionary is used to replace multiple symbols with a single dictionary entry. The dictionary is not stored to file. * We'll cover how in a couple of slides
Dictionary: 0 a, 1 b, 2 c, 3 d, 4 r, 5 ab, 6 br, 7 ra, 8 ac, 9 ca, 10 ad, 11 da, 12 abr

53 LZW Example 2/5
1. Initialize the dictionary with all single symbols
2. Set "w" equal to an empty string
3. Read the next letter into "k"
4. If the dictionary has the string "wk": set w equal to wk and go to 3
5. Else: add wk to the dictionary, write the code of w to the output stream, set w equal to k, and go to 3
6. When the input is exhausted, write the code of w to the output stream
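A minimal sketch of these steps, assuming the dictionary is initialized from a given alphabet and that every input symbol is part of that alphabet (`lzw_encode` is an illustrative name):

```python
def lzw_encode(text: str, alphabet: str):
    """Return the list of output codes and the final dictionary."""
    dictionary = {symbol: code for code, symbol in enumerate(alphabet)}
    w = ""
    output = []
    for k in text:
        wk = w + k
        if wk in dictionary:
            w = wk                                # keep extending the match
        else:
            dictionary[wk] = len(dictionary)      # new entry gets the next free code
            output.append(dictionary[w])
            w = k
    if w:
        output.append(dictionary[w])              # flush the last match
    return output, dictionary


codes, dictionary = lzw_encode("abracadabra", "abcdr")
print(codes)
print(sorted(dictionary.items(), key=lambda item: item[1]))
```

For "abracadabra" this emits 0 1 4 0 2 0 3 5 7: the final "ra" is already entry 7 by the time it is read, so it costs a single code. Eleven symbols become nine codes, and the gain grows as longer strings enter the dictionary.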

54 LZW Example 3/5
w     k     wk     wk in dict?   Output code   New entry
      A     A      Yes
A     B     AB     No            0             5 = ab
B     R     BR     No            1             6 = br
R     A     RA     No            4             7 = ra
A     C     AC     No            0             8 = ac
C     A     CA     No            2             9 = ca
A     D     AD     No            0             10 = ad
D     A     DA     No            3             11 = da
A     B     AB     Yes
AB    R     ABR    No            5             12 = abr
R     A     RA     Yes
RA    (end of input)             7

67 LZW Example 4/5 Decoding mirrors encoding and rebuilds the same dictionary from the codes alone
1. Decode the first code, store it in "w", and write it to the output stream
2. Read the next code
3. If the dictionary has the code:
   set "k" equal to the decoded code
   write out k
   set "wk" equal to w plus the first character of k
   add wk to the dictionary
   set w equal to k
   go to 2
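A minimal sketch of these decoding steps, again assuming the dictionary starts from a known alphabet. It covers only codes that are already in the dictionary, exactly as in the steps above, and raises an error otherwise.

```python
def lzw_decode(codes, alphabet: str) -> str:
    """Rebuild the text, growing the dictionary the same way the encoder did."""
    dictionary = {code: symbol for code, symbol in enumerate(alphabet)}
    if not codes:
        return ""
    w = dictionary[codes[0]]
    output = [w]
    for code in codes[1:]:
        if code not in dictionary:
            raise ValueError(f"code {code} is not in the dictionary")
        k = dictionary[code]
        output.append(k)
        dictionary[len(dictionary)] = w + k[0]    # previous string + first char of k
        w = k
    return "".join(output)


print(lzw_decode([0, 1, 4, 0, 2, 0, 3, 5, 7], "abcdr"))
```

The decoder is always one entry behind the encoder: it can only add w + k[0] once it has seen the code that follows w.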

68 LZW Example 5/5
Input code   In dict?   w     k     w + k[0] (new entry)   Output
0            Yes              A                            A
1            Yes        A     B     AB  (5)                B
4            Yes        B     R     BR  (6)                R
0            Yes        R     A     RA  (7)                A
2            Yes        A     C     AC  (8)                C
0            Yes        C     A     CA  (9)                A
3            Yes        A     D     AD  (10)               D
5            Yes        D     AB    DA  (11)               AB
7            Yes        AB    RA    ABR (12)               RA
Dictionary: 0 a, 1 b, 2 c, 3 d, 4 r, 5 ab, 6 br, 7 ra, 8 ac, 9 ca, 10 ad, 11 da, 12 abr

80 LZW Dictionary 1/2 LZW uses 12-bit codes for the dictionary. All 256 single-byte values are added at initialization; the remaining codes are used for combinations: 2^12 = 4096 => 3840 entries for combinations. When the dictionary is full, compression becomes "static". It can also be reset: clear all entries, and initialize with the single-byte values. A "reset" code can be added, which is used when the compression ratio drops. An LZW variant uses variable-bit-length codes: start with 9-bit codes; when the dictionary has 2^9 = 512 entries, continue with 10-bit codes; when the dictionary has 2^10 = 1024 entries, continue with 11-bit codes.
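The variable-width rule can be written as a one-line function of the current dictionary size. This is a sketch of the rule as stated on the slide; the function name and default limits are assumptions, and real file formats differ in exactly when they switch widths.

```python
def code_width(dictionary_size: int, min_bits: int = 9, max_bits: int = 12) -> int:
    """Bits per output code for a dictionary holding `dictionary_size` entries."""
    return min(max_bits, max(min_bits, dictionary_size.bit_length()))


combination_entries = 2**12 - 256  # codes left after the 256 single-byte values

for size in (256, 511, 512, 1023, 1024, 2048, 4096):
    print(size, "entries ->", code_width(size), "bit codes")
print(combination_entries, "entries for combinations")
```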

81 LZW Dictionary 2/2 Occasionally a code is not in the dictionary during decoding. This happens when an entry just added by the encoder is used for the very next output code. Example: the encoder knows about "ab" and gets the string "ababa": the code for ab is output and aba is added to the dictionary; then the code for aba is output, before the decoder has been able to add it. This is the only way an unknown code can arise, so we can deduce that an unknown code represents the previous output with its own first letter appended.
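A decoder that handles this case might look like the sketch below. The two-letter alphabet and the code sequence 0 1 2 4 (which is what an LZW encoder produces for "abababa") are chosen for illustration; code 4 arrives before the decoder has created entry 4.

```python
def lzw_decode_with_unknown(codes, alphabet: str) -> str:
    """LZW decoding that also accepts a code one step ahead of the dictionary."""
    dictionary = dict(enumerate(alphabet))
    if not codes:
        return ""
    w = dictionary[codes[0]]
    output = [w]
    for code in codes[1:]:
        if code in dictionary:
            k = dictionary[code]
        elif code == len(dictionary):
            k = w + w[0]          # unknown code: previous output + its first letter
        else:
            raise ValueError(f"invalid code {code}")
        output.append(k)
        dictionary[len(dictionary)] = w + k[0]
        w = k
    return "".join(output)


print(lzw_decode_with_unknown([0, 1, 2, 4], "ab"))
```

Only the next free code can be unknown in this way; any larger code indicates a corrupt stream, which is why the sketch rejects it.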

82 Implementing and testing LZW is "simple" in principle and practice. The only complication is the need to fiddle with nibbles (12-bit codes do not align to byte boundaries). Around 190 lines of code. Compression test: Test dataset A: Macbeth by Shakespeare, HTML, 202 KB. Available from Test dataset B: Bus video file, yuv, uncompressed video, 1.37 MB. Available from

83 Implementing and testing Test A: Input: , output: Compressed size: 43% of original. Test B: Input: , output: Compressed size: 91% of original

84 LZW Speed 1/2 LZW was designed for efficient hardware implementation: fixed-size dictionary, trivial initialization, finite state machine. Same problems as Huffman with respect to parallelism: inherently serial. LZW used to be a standard Linux tool (compress), but patents and more efficient algorithms limited its use. Closely related to the LZ77 and LZ78 algorithms: LZ77 uses a sliding window and length-distance pairs, and is part of the much-used DEFLATE algorithm; LZ78 uses a dictionary like LZW.

85 LZW Speed 2/2 GIF files use LZW compression: limited to a small alphabet (max 256 colors) and often a lot of repeated patterns, so highly suitable for LZW. LZW appears to have lost a lot of traction: 20 years of patents have taken their toll, and drove forward the creation of the free PNG format. GIF images are still actively used. Image credit: senorgif.com

86 Summary

87 Summary
Huffman is based on replacing fixed-width symbols with variable-length bit codes. Approaches the theoretical entropy given by Shannon. Small overhead in storing the Huffman table itself. Works very well for data with a few highly used symbols; works poorly for data with equal use of all symbols. Very fast.
LZW is based on replacing multiple symbols with a single code. No overhead in storing the dictionary: it is created dynamically. Only starts "compressing" after the dictionary has a lot of combinations. Works very well with small alphabets (fewer string combinations); works poorly with random data (few repeated "words").

88 Further reading Data compression is big bucks! There is a huge number of patents on compression algorithms: check licensing requirements. The most efficient compression algorithms take knowledge of the underlying data structure into account, and combine lossless and lossy compression. Open source implementations of LZW and Huffman: Warning: not written for speed

89 Thank you for your attention! André R. Brodtkorb Homepage:

CS/COE 1501

CS/COE 1501 CS/COE 1501 www.cs.pitt.edu/~lipschultz/cs1501/ Compression What is compression? Represent the same data using less storage space Can get more use out a disk of a given size Can get more use out of memory

More information

ECE 499/599 Data Compression & Information Theory. Thinh Nguyen Oregon State University

ECE 499/599 Data Compression & Information Theory. Thinh Nguyen Oregon State University ECE 499/599 Data Compression & Information Theory Thinh Nguyen Oregon State University Adminstrivia Office Hours TTh: 2-3 PM Kelley Engineering Center 3115 Class homepage http://www.eecs.orst.edu/~thinhq/teaching/ece499/spring06/spring06.html

More information

Entropy Coding. - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic Code

Entropy Coding. - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic Code Entropy Coding } different probabilities for the appearing of single symbols are used - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic

More information

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression An overview of Compression Multimedia Systems and Applications Data Compression Compression becomes necessary in multimedia because it requires large amounts of storage space and bandwidth Types of Compression

More information

Lossless compression II

Lossless compression II Lossless II D 44 R 52 B 81 C 84 D 86 R 82 A 85 A 87 A 83 R 88 A 8A B 89 A 8B Symbol Probability Range a 0.2 [0.0, 0.2) e 0.3 [0.2, 0.5) i 0.1 [0.5, 0.6) o 0.2 [0.6, 0.8) u 0.1 [0.8, 0.9)! 0.1 [0.9, 1.0)

More information

Multimedia Networking ECE 599

Multimedia Networking ECE 599 Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on B. Lee s lecture notes. 1 Outline Compression basics Entropy and information theory basics

More information

CS/COE 1501

CS/COE 1501 CS/COE 1501 www.cs.pitt.edu/~nlf4/cs1501/ Compression What is compression? Represent the same data using less storage space Can get more use out a disk of a given size Can get more use out of memory E.g.,

More information

Lossless Compression Algorithms

Lossless Compression Algorithms Multimedia Data Compression Part I Chapter 7 Lossless Compression Algorithms 1 Chapter 7 Lossless Compression Algorithms 1. Introduction 2. Basics of Information Theory 3. Lossless Compression Algorithms

More information

Simple variant of coding with a variable number of symbols and fixlength codewords.

Simple variant of coding with a variable number of symbols and fixlength codewords. Dictionary coding Simple variant of coding with a variable number of symbols and fixlength codewords. Create a dictionary containing 2 b different symbol sequences and code them with codewords of length

More information

Compression. storage medium/ communications network. For the purpose of this lecture, we observe the following constraints:

Compression. storage medium/ communications network. For the purpose of this lecture, we observe the following constraints: CS231 Algorithms Handout # 31 Prof. Lyn Turbak November 20, 2001 Wellesley College Compression The Big Picture We want to be able to store and retrieve data, as well as communicate it with others. In general,

More information

Compressing Data. Konstantin Tretyakov

Compressing Data. Konstantin Tretyakov Compressing Data Konstantin Tretyakov (kt@ut.ee) MTAT.03.238 Advanced April 26, 2012 Claude Elwood Shannon (1916-2001) C. E. Shannon. A mathematical theory of communication. 1948 C. E. Shannon. The mathematical

More information

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding Fundamentals of Multimedia Lecture 5 Lossless Data Compression Variable Length Coding Mahmoud El-Gayyar elgayyar@ci.suez.edu.eg Mahmoud El-Gayyar / Fundamentals of Multimedia 1 Data Compression Compression

More information

Ch. 2: Compression Basics Multimedia Systems

Ch. 2: Compression Basics Multimedia Systems Ch. 2: Compression Basics Multimedia Systems Prof. Ben Lee School of Electrical Engineering and Computer Science Oregon State University Outline Why compression? Classification Entropy and Information

More information

7: Image Compression

7: Image Compression 7: Image Compression Mark Handley Image Compression GIF (Graphics Interchange Format) PNG (Portable Network Graphics) MNG (Multiple-image Network Graphics) JPEG (Join Picture Expert Group) 1 GIF (Graphics

More information

Compression; Error detection & correction

Compression; Error detection & correction Compression; Error detection & correction compression: squeeze out redundancy to use less memory or use less network bandwidth encode the same information in fewer bits some bits carry no information some

More information

Multimedia Systems. Part 20. Mahdi Vasighi

Multimedia Systems. Part 20. Mahdi Vasighi Multimedia Systems Part 2 Mahdi Vasighi www.iasbs.ac.ir/~vasighi Department of Computer Science and Information Technology, Institute for dvanced Studies in asic Sciences, Zanjan, Iran rithmetic Coding

More information

Engineering Mathematics II Lecture 16 Compression

Engineering Mathematics II Lecture 16 Compression 010.141 Engineering Mathematics II Lecture 16 Compression Bob McKay School of Computer Science and Engineering College of Engineering Seoul National University 1 Lossless Compression Outline Huffman &

More information

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 26 Source Coding (Part 1) Hello everyone, we will start a new module today

More information

DEFLATE COMPRESSION ALGORITHM

DEFLATE COMPRESSION ALGORITHM DEFLATE COMPRESSION ALGORITHM Savan Oswal 1, Anjali Singh 2, Kirthi Kumari 3 B.E Student, Department of Information Technology, KJ'S Trinity College Of Engineering and Research, Pune, India 1,2.3 Abstract

More information

Digital Image Processing

Digital Image Processing Digital Image Processing Image Compression Caution: The PDF version of this presentation will appear to have errors due to heavy use of animations Material in this presentation is largely based on/derived

More information

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION 15 Data Compression Data compression implies sending or storing a smaller number of bits. Although many methods are used for this purpose, in general these methods can be divided into two broad categories:

More information

A New Compression Method Strictly for English Textual Data

A New Compression Method Strictly for English Textual Data A New Compression Method Strictly for English Textual Data Sabina Priyadarshini Department of Computer Science and Engineering Birla Institute of Technology Abstract - Data compression is a requirement

More information

Ch. 2: Compression Basics Multimedia Systems

Ch. 2: Compression Basics Multimedia Systems Ch. 2: Compression Basics Multimedia Systems Prof. Thinh Nguyen (Based on Prof. Ben Lee s Slides) Oregon State University School of Electrical Engineering and Computer Science Outline Why compression?

More information

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey Data Compression Media Signal Processing, Presentation 2 Presented By: Jahanzeb Farooq Michael Osadebey What is Data Compression? Definition -Reducing the amount of data required to represent a source

More information

Basic Compression Library

Basic Compression Library Basic Compression Library Manual API version 1.2 July 22, 2006 c 2003-2006 Marcus Geelnard Summary This document describes the algorithms used in the Basic Compression Library, and how to use the library

More information

Repetition 1st lecture

Repetition 1st lecture Repetition 1st lecture Human Senses in Relation to Technical Parameters Multimedia - what is it? Human senses (overview) Historical remarks Color models RGB Y, Cr, Cb Data rates Text, Graphic Picture,

More information

Data Compression. Guest lecture, SGDS Fall 2011

Data Compression. Guest lecture, SGDS Fall 2011 Data Compression Guest lecture, SGDS Fall 2011 1 Basics Lossy/lossless Alphabet compaction Compression is impossible Compression is possible RLE Variable-length codes Undecidable Pigeon-holes Patterns

More information

Introduction to Data Compression

Introduction to Data Compression Introduction to Data Compression Guillaume Tochon guillaume.tochon@lrde.epita.fr LRDE, EPITA Guillaume Tochon (LRDE) CODO - Introduction 1 / 9 Data compression: whatizit? Guillaume Tochon (LRDE) CODO -

More information

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in

Encoding. A thesis submitted to the Graduate School of University of Cincinnati in Lossless Data Compression for Security Purposes Using Huffman Encoding A thesis submitted to the Graduate School of University of Cincinnati in a partial fulfillment of requirements for the degree of Master

More information
