The impossible patent: an introduction to lossless data compression. Carlo Mazza

1 The impossible patent: an introduction to lossless data compression Carlo Mazza

2 Plan: Introduction, Formalization, Theorem, A couple of good ideas

3 Introduction

4 What is data compression? Data compression is the procedure that reduces the size of information. It is used today in many applications, especially for digital data:
- generic file compression (ZIP, RAR, etc.)
- audio compression (MP3, AAC, FLAC, etc.)
- image compression (JPG, GIF, PNG, etc.)
- video compression (AVI, MP4, WMV, etc.)

5 (Very) Brief historic overview
- 1838: Morse code
- 1940s: Information theory (Shannon, Fano, Huffman)
- 1970s: LZW (Lempel, Ziv and Welch); Microsoft and Apple
- 1980s: ARJ, PKZIP, LHarc; BBS and newsgroups
- 1990s: JPG, MP3; the web, browsers, Yahoo and Google
- 2000s: H.264, AAC, MP4, M4V; dot-com bubble, Facebook

6 Screenshot of PKZIP 2.04g, created on February 15, 2007 using DOSBox

7 Different kinds of compression
- Lossless compression: ZIP, RAR, FLAC, PNG
- Lossy compression: MP3, JPG, MP4, AAC

8 Formalization

9 Lossless compression Lossless compression is compression that loses no information: there is a second operation, decompression, such that compressing a file and then decompressing the result gives back exactly the original file.

10 No loss of information SMS messages:
"hi m8, r u k? sry i 4gt 2 cal u lst nite. why dnt we go c movie 2nite? c u l8r"
"c 6? xke nn ho bekkato ness1 in 3no? cmq c vdm + trd nel pom" (Italian text-speak for "Are you there? Why didn't I run into anyone on the train? Anyway, see you later this afternoon")
Aoccdrnig to a rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoatnt tihng is taht frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae we do not raed ervey lteter by itslef but the wrod as a wlohe.

11 Loss of information
Jane S., a chief sub editor and editor, can always be found
hard at work in her cubicle. Jane works independently, without
wasting company time talking to colleagues. She never
thinks twice about assisting fellow employees, and she always
finishes given assignments on time. Often Jane takes extended
measures to complete her work, sometimes skipping coffee
breaks. She is a dedicated individual who has absolutely no
vanity in spite of her high accomplishments and profound
knowledge in her field. I firmly believe that Jane can be
classed as a high-caliber employee, the type which cannot be
dispensed with. Consequently, I duly recommend that Jane be
promoted to executive management, and a proposal will be
sent away as soon as possible.

12 Formalization We try to formalize the situation:
- let F be a file, that is, a sequence of ones and zeros
- let L(F) be the length of the file F
- we want to find a procedure that from F yields another file G such that L(G) ≤ L(F)
How many files of length N are there? And how many of length at most N?
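The two counting questions can be checked by brute force for small N. The sketch below is only an illustration: the helper names are made up here, and it assumes that the empty file is not counted. Under that assumption the counts agree with 2^N files of length exactly N and 2^(N+1) - 2 files of length at most N.

```python
from itertools import product

def files_of_length(n):
    """Return every binary string of length exactly n."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def count_files_up_to(n):
    """Count binary strings of length 1..n (assumption: the empty file is excluded)."""
    return sum(len(files_of_length(k)) for k in range(1, n + 1))

for n in range(1, 6):
    # columns: N, files of length N, files of length at most N
    print(n, len(files_of_length(n)), count_files_up_to(n))
```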

13 Compression as a function We think of compression as a function f from the set of files to itself such that L(f(F)) ≤ L(F). What properties must f have for the compression to be lossless?
- the function f(F)=0 surely compresses, but loses information
- the function f(F)=F surely loses no information, but does not compress either
What property distinguishes lossless from lossy compression?

14 Compression as a function As we said before, the compression is lossless if there is another operation which recovers the original file. The function f models lossless compression if there is another function g such that for every file F we have
g(f(F)) = F
(g o f)(F) = F
(g o f)(F) = id(F)
We say that f has a left inverse.

15 Left inverses and injective maps Theorem: A function f admits a left inverse if and only if it is injective. Proof: Say f is a map from X to Y, with X non-empty. Suppose f is injective. Then every y in Y is the image of at most one x in X. We define the map g from Y to X as follows: every y which is hit by f goes back to the unique x with f(x)=y, and every other y can go to any element of X. Then g(f(x))=x for every x in X.

16 Left inverses and injective maps Proof (cont'd): Suppose now that f admits a left inverse, call it g. Suppose that f(x)=f(x'). Then g(f(x))=g(f(x')), so x=g(f(x))=g(f(x'))=x', and therefore x=x', that is, f is injective. We managed to translate an intuitive property ("losslessness") into a precise mathematical concept (injectivity).
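On a finite set of files the construction in the proof can be carried out literally. The following is a minimal sketch, not part of the original slides: `left_inverse` is a hypothetical helper that tabulates f and either returns a decompressor g with g(f(F)) = F or reports two files that collide.

```python
def left_inverse(f, domain):
    """Build g with g(f(x)) == x for every x in domain.

    Raises ValueError if f is not injective on domain.
    """
    table = {}
    for x in domain:
        y = f(x)
        if y in table:
            raise ValueError(f"not injective: {table[y]!r} and {x!r} both map to {y!r}")
        table[y] = x
    # Values not hit by f "can do whatever they want"; here they map to None.
    return table.get

files = ["0", "1", "00", "01", "10", "11"]

# Appending a bit is injective (though it does not compress anything).
g = left_inverse(lambda F: F + "0", files)
print(all(g(F + "0") == F for F in files))

# Keeping only the first bit loses information, so no left inverse exists.
try:
    left_inverse(lambda F: F[:1], files)
except ValueError as err:
    print(err)
```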

17 Theorem

18 Limits of lossless compression
- WEB Technologies
- Premier Research Corporation (MINC)
- Hyper Space method
- Matthew Burch
- Pegasus Web Services Inc. (patent 7,096,360)
Actually... Theorem: There is no perfect lossless compression.

19 Proof by contradiction Theorem: There is no lossless compression function f such that L(f(F)) ≤ L(F) for every file F and L(f(F)) < L(F) for at least one file F. Proof: Suppose such a function exists. Let F be a file which is actually compressed and let G=f(F), so that L(G) < L(F). Consider L(f(G)). If L(f(G))=L(G), then let H=f(G)=f(f(F)) and consider L(f(H)), and so on. Since f is injective, we cannot hit the same file twice.

20 Proof (continued) There are only finitely many files of any given length, so the length will have to decrease eventually. But then we will eventually reach files of length one, from where we cannot go any further, which leads to a contradiction.

21 Schubfachprinzip Dirichlet's principle (1834), also known as the pigeonhole principle: Let f be a function from a finite set A to a set B. If the number of elements of B is strictly less than that of A, then f is not injective.

22 Let's count Theorem: There is no lossless compression function f that never lengthens a file and shortens at least one (i.e., L(f(F)) ≤ L(F) for all F, and L(f(F)) < L(F) for at least one F). Proof: Let N be the minimal length of a file which is compressed, and let F be such a file. There are 2^(N-1) files of length N-1, so the files of length less than N number 2^(N-1) + 2^(N-2) + ... + 2 = 2^N - 2. By minimality of N, f sends each of these files to a file of length less than N, and it sends F there too. So f sends a set of size (2^N - 2) + 1 into a set of size 2^N - 2. By the pigeonhole principle, f cannot be injective.
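For very small files the theorem can also be confirmed by exhaustive search. The sketch below is an illustration under stated assumptions: files are non-empty binary strings of length at most n, and `count_lossless_shrinkers` is a name invented here. It tries every map that never lengthens a file and counts those that are injective and shorten at least one file. The search is only feasible for n ≤ 2 (5184 maps for n = 2).

```python
from itertools import product

def files_up_to(n):
    """All non-empty binary strings of length 1..n."""
    return ["".join(bits) for k in range(1, n + 1) for bits in product("01", repeat=k)]

def count_lossless_shrinkers(n):
    """Count injective maps f on files_up_to(n) with L(f(F)) <= L(F) for all F
    and L(f(F)) < L(F) for at least one F. The theorem predicts zero."""
    files = files_up_to(n)
    # For each file, the candidate images are the files that are not longer.
    choices = [[g for g in files if len(g) <= len(f)] for f in files]
    found = 0
    for images in product(*choices):
        injective = len(set(images)) == len(images)
        shrinks = any(len(g) < len(f) for f, g in zip(files, images))
        if injective and shrinks:
            found += 1
    return found

print(count_lossless_shrinkers(2))  # prints 0
```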

23 Impossible compression So there is no universal compression function. Actually, looking at the proof, it's clear that if a lossless method makes some file smaller, it must make some other file larger. So, if we have no good ideas, better leave everything as is.
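This trade-off is easy to observe with a real compressor. The sketch below uses Python's standard `zlib` module, which is a choice made for this illustration and is not mentioned in the slides: a highly repetitive input shrinks, while pseudo-random bytes come out slightly longer than they went in.

```python
import random
import zlib

repetitive = b"a" * 1000

rng = random.Random(0)  # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(1000))

for name, data in [("repetitive", repetitive), ("noise", noise)]:
    packed = zlib.compress(data)
    assert zlib.decompress(packed) == data  # lossless round trip
    print(name, len(data), "->", len(packed))
```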

24 A couple of good ideas RLE and prefix codes

25 Run-Length Encoding The Run-Length Encoding (RLE) technique is one of the oldest compression algorithms: when a symbol repeats, we replace the run with the number of repetitions followed by the symbol.
aaaabbbcccdd -> 4a3b3c2d
mathematics -> 1m1a1t1h1e1m1a1t1i1c1s
It works badly for messages with few repetitions and very well for messages with a lot of repetitions (fax).
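The two examples can be reproduced with a few lines of code. This is a minimal sketch with hypothetical function names; it writes each run as a decimal count followed by the symbol, and the decoder assumes that the message itself contains no digits.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of equal symbols by '<count><symbol>'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(text))

def rle_decode(encoded):
    """Invert rle_encode (assumption: the original symbols are not digits)."""
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # counts may have several digits
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

print(rle_encode("aaaabbbcccdd"))   # 4a3b3c2d
print(rle_encode("mathematics"))    # 1m1a1t1h1e1m1a1t1i1c1s
```

Note that the second message doubles in length, which matches the remark that RLE works badly when there are few repetitions.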

26 ASCII encoding But we still need to encode the letters and the repetition counts in binary. In general, let's say we have a text message that we want to compress. The output will be a binary string, so we need to convert letters to binary numbers. One standard is ASCII, which assigns to each letter a 7-bit number (a string of 7 ones or zeros, so it encodes 2^7 = 128 symbols).
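As a small illustration of the fixed-length approach, the sketch below converts text to a string of 7-bit ASCII codes and back. The function names are invented for this example, and characters outside the 128 ASCII symbols are rejected.

```python
def to_ascii_bits(text):
    """Encode each character as a 7-bit binary string."""
    for ch in text:
        if ord(ch) > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
    return "".join(format(ord(ch), "07b") for ch in text)

def from_ascii_bits(bits):
    """Decode a bit string produced by to_ascii_bits, 7 bits at a time."""
    return "".join(chr(int(bits[i:i + 7], 2)) for i in range(0, len(bits), 7))

bits = to_ascii_bits("mathematics")
print(len(bits))              # 77
print(from_ascii_bits(bits))  # mathematics
```

Because every symbol uses exactly 7 bits, the decoder always knows where one letter ends and the next begins.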

27 Dictionary Encoding The dictionary entries need not be single letters: they may be longer strings. But we still need some kind of fixed length to be able to separate the counts from the symbols.

28 Exercise

29 Reducing number of bits
- encoding mathematics in ASCII requires 7 bits * 11 letters = 77 bits
- mathematics only has 8 different letters, so only 3 bits per letter are needed, for a total of 33 bits
- but we could use fewer bits for the more frequent letters, e.g., a=0 m=1 t=10 h=11 e=100 i=101 c=110 s=111
- so mathematics becomes 1010111001010101110111 (22 bits)
- but that also encodes iasaattihas
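The ambiguity can be verified directly. The sketch below uses the code table from this slide; `encode` and `count_decodings` are names chosen for the illustration. The second function counts, by dynamic programming, in how many ways a bit string can be split into codewords.

```python
CODE = {"a": "0", "m": "1", "t": "10", "h": "11",
        "e": "100", "i": "101", "c": "110", "s": "111"}

def encode(text, code=CODE):
    """Concatenate the codewords of the letters of text."""
    return "".join(code[ch] for ch in text)

def count_decodings(bits, code=CODE):
    """Count the ways to split bits into a sequence of codewords."""
    words = set(code.values())
    ways = [1] + [0] * len(bits)   # ways[i] = number of parses of bits[:i]
    for i in range(1, len(bits) + 1):
        for w in words:
            if i >= len(w) and bits[i - len(w):i] == w:
                ways[i] += ways[i - len(w)]
    return ways[-1]

print(encode("mathematics"))
print(encode("mathematics") == encode("iasaattihas"))  # True
print(count_decodings(encode("mathematics")) > 1)      # True
```

The saving in bits is therefore useless here: the decoder cannot tell which message was sent.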

30 Prefix codes We need to make sure that no code is the prefix of another code:
- a=0 b=1 c=10 doesn't work
- a=0 b=10 c=11 works
Examples: international dialing prefixes (+1 USA, +39 Italy)
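The prefix condition is simple to test mechanically. The following is a sketch with a hypothetical helper name; it relies on the fact that, once the codewords are sorted, any codeword that is a prefix of another is also a prefix of its immediate successor.

```python
def is_prefix_free(code):
    """Return True if no codeword is a prefix of (or equal to) another."""
    words = sorted(code.values())
    # After sorting, a prefix relation always shows up between neighbours.
    return all(not later.startswith(earlier)
               for earlier, later in zip(words, words[1:]))

print(is_prefix_free({"a": "0", "b": "1", "c": "10"}))   # False
print(is_prefix_free({"a": "0", "b": "10", "c": "11"}))  # True
```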

31 Huffman coding We start with a frequency table of the letters. We produce a tree following these rules:
- create a tree for every letter, with weight equal to its frequency
- create a new tree by joining the two trees with the two smallest weights (and give it as weight the sum of the two weights)
- go on until there is only one tree
To see what the codes are, we label the two branches below each node 0 and 1 and read the tree from the top to the bottom.
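The rules above can be sketched with a priority queue. This is one possible implementation, not the one used for the slides: each tree is stored as a dictionary from letters to partial codes, a counter breaks ties between equal weights, and the input is assumed to be non-empty. The choice of which branch gets 0 and which gets 1 is arbitrary, so the codewords may differ from a hand-drawn tree while the total length is the same.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(text):
    """Return a dict letter -> bit string, built by repeatedly joining
    the two lightest trees. Assumes text is non-empty."""
    freq = Counter(text)
    if len(freq) == 1:                       # degenerate case: one distinct letter
        return {next(iter(freq)): "0"}
    tiebreak = count()                       # keeps heap entries comparable
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # lightest tree
        w2, _, right = heapq.heappop(heap)   # second lightest tree
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

def encoded_length(text):
    """Number of bits needed for text with its own Huffman code."""
    code = huffman_code(text)
    return sum(len(code[ch]) for ch in text)

print(huffman_code("assassins"))
print(encoded_length("assassins"))    # 15
print(encoded_length("mathematics"))  # 33
```

For mathematics the result is 33 bits, no better than a fixed 3-bit code, because its letter frequencies are all close to each other.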

32 Examples
1. aaaabbbccdd
   a. RLE: 4a3b2c2d (17 bits)
   b. Huffman:
2. mathematics
   a. RLE: 1m1a1t1h1e1m1a1t1i1c1s (3*11=33 bits)
   b. Huffman:

33 assassins: (5,s) (2,a) (1,i) (1,n) [Figure: step-by-step construction of the Huffman tree, joining i and n first, then a, then s]

34 [Figure: the final Huffman tree, with the branches below each node labeled 0 and 1] So, in the end: s=0 a=10 i=110 n=111, and assassins = 100010001101110 (15 bits). Try sessions, sassafrasses, mummy, beekeeper, but not mathematics.
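Because the code s=0 a=10 i=110 n=111 is a prefix code, the bit string can be decoded from left to right without any separators. The sketch below (the function name is chosen for this illustration) emits a letter as soon as the bits read so far form a codeword.

```python
CODE = {"s": "0", "a": "10", "i": "110", "n": "111"}

def decode(bits, code=CODE):
    """Decode a bit string written with a prefix code."""
    inverse = {word: sym for sym, word in code.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:       # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

bits = "".join(CODE[ch] for ch in "assassins")
print(bits, len(bits))   # 100010001101110 15
print(decode(bits))      # assassins
```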

35 Advantages and disadvantages
- RLE: one can start compressing at once (there is no need to read the whole message to construct a frequency table)
- RLE: works especially well when there are few symbols and lots of repetitions
- Huffman: works well when the frequencies are not close to each other (natural language)
- Huffman: works especially well when the relative frequencies are powers of 1/2

36 That's all folks!

37 (Very) Brief history of data compression
- 1838: Morse code
- 1940s: Information theory (Shannon, Fano, Huffman)
- 1970s: LZW (Lempel, Ziv and Welch), Microsoft, Apple
- 1980s: ARJ, PKZIP, LHarc (BBS and newsgroups)
- 1990s: JPG, MP3 (the web and browsers)
- 1994: Yahoo
- 1998: Google
- 2001: dot-com bubble
- 2004: Facebook

Semi-Lossless Text Compression: a Case Study

Semi-Lossless Text Compression: a Case Study Semi-Lossless Text Compression: a Case Study BRUNO CARPENTIERI Dipartimento di Informatica Università di Salerno Italy bc@dia.unisa.it Abstract: - Text compression is generally considered only as lossless

More information

CNT4406/5412 Network Security

CNT4406/5412 Network Security CNT4406/5412 Network Security Introduction to Cryptography Zhi Wang Florida State University Fall 2014 Zhi Wang (FSU) CNT4406/5412 Network Security Fall 2014 1 / 18 Introduction What is Cryptography Mangling

More information

CSSE SEMESTER 1, 2017 EXAMINATIONS. CITS1001 Object-oriented Programming and Software Engineering FAMILY NAME: GIVEN NAMES:

CSSE SEMESTER 1, 2017 EXAMINATIONS. CITS1001 Object-oriented Programming and Software Engineering FAMILY NAME: GIVEN NAMES: CSSE SEMESTER 1, 2017 EXAMINATIONS CITS1001 Object-oriented Programming and Software Engineering FAMILY NAME: GIVEN NAMES: STUDENT ID: SIGNATURE: This Paper Contains: 20 pages (including title page) Time

More information

Einführung in die Programmierung Introduction to Programming

Einführung in die Programmierung Introduction to Programming Chair of Software Engineering Einführung in die Programmierung Introduction to Programming Prof. Dr. Bertrand Meyer Michela Pedroni Lecture 11: Describing the Syntax Goals of today s lecture Learn about

More information

An example (1) - Conditional. An example (2) - Conditional. An example (3) Nested conditional

An example (1) - Conditional. An example (2) - Conditional. An example (3) Nested conditional Chair of Software Engineering Einführung in die Programmierung Introduction to Programming Prof. Dr. Bertrand Meyer Michela Pedroni October 2006 February 2007 Lecture 8: Describing the Syntax Intro. to

More information

Entropy Coding. - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic Code

Entropy Coding. - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic Code Entropy Coding } different probabilities for the appearing of single symbols are used - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic

More information

History of Typography. (History of Digital Font)

History of Typography. (History of Digital Font) History of Typography (History of Digital Font) 1 What is Typography? The art and technique of printing The study and process of typefaces Study Legibility or readability of typefaces and their layout

More information

Professional Communication

Professional Communication 1 Professional Communication 1 3 Agenda Communication Analyze the writing situation Personal Business E-mail, Memos, Letters Editing your work 4 Analyze the Writing Situation Consider the following Subject

More information

Lossless compression B 1 U 1 B 2 C R D! CSCI 470: Web Science Keith Vertanen

Lossless compression B 1 U 1 B 2 C R D! CSCI 470: Web Science Keith Vertanen Lossless compression B U B U B 2 U B A ϵ CSCI 47: Web Science Keith Vertanen C R D! Lossless compression Mo7va7on Overview Rules and limits of the game Things to exploit Run- length encoding (RLE) Exploit

More information

Error Checking Codes

Error Checking Codes University of Waterloo December 11th, 2015 In the beginning... We had very primitive methods of long distance communication... Smoke Signals String Phone Telephone Wired Internet Wireless Internet Transmitting

More information

Compressing Data. Konstantin Tretyakov

Compressing Data. Konstantin Tretyakov Compressing Data Konstantin Tretyakov (kt@ut.ee) MTAT.03.238 Advanced April 26, 2012 Claude Elwood Shannon (1916-2001) C. E. Shannon. A mathematical theory of communication. 1948 C. E. Shannon. The mathematical

More information

Department of Image Processing and Computer Graphics University of Szeged. Fuzzy Techniques for Image Segmentation. Outline.

Department of Image Processing and Computer Graphics University of Szeged. Fuzzy Techniques for Image Segmentation. Outline. László G. Nyúl systems sets image László G. Nyúl Department of Processing and Computer Graphics University of Szeged 2009-07-07 systems sets image 1 systems 2 sets 3 image thresholding clustering 4 Dealing

More information

CS/COE 1501

CS/COE 1501 CS/COE 1501 www.cs.pitt.edu/~lipschultz/cs1501/ Compression What is compression? Represent the same data using less storage space Can get more use out a disk of a given size Can get more use out of memory

More information

Lossless Compression Algorithms

Lossless Compression Algorithms Multimedia Data Compression Part I Chapter 7 Lossless Compression Algorithms 1 Chapter 7 Lossless Compression Algorithms 1. Introduction 2. Basics of Information Theory 3. Lossless Compression Algorithms

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 1: Entropy Coding Lecture 1: Introduction and Huffman Coding Juha Kärkkäinen 31.10.2017 1 / 21 Introduction Data compression deals with encoding information in as few bits

More information

Computer Security & Privacy. Why Computer Security Matters. Privacy threats abound (identity fraud, etc.) Multi-disciplinary solutions

Computer Security & Privacy. Why Computer Security Matters. Privacy threats abound (identity fraud, etc.) Multi-disciplinary solutions Computer Security & Privacy slides adopted from F. Monrose 1 Why Computer Security Matters Computers/Internet play a vital role in our daily lives Social Networks and Online Communities facebook, flickr,

More information

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 26 Source Coding (Part 1) Hello everyone, we will start a new module today

More information

CS/COE 1501

CS/COE 1501 CS/COE 1501 www.cs.pitt.edu/~nlf4/cs1501/ Compression What is compression? Represent the same data using less storage space Can get more use out a disk of a given size Can get more use out of memory E.g.,

More information

Data Compression. Guest lecture, SGDS Fall 2011

Data Compression. Guest lecture, SGDS Fall 2011 Data Compression Guest lecture, SGDS Fall 2011 1 Basics Lossy/lossless Alphabet compaction Compression is impossible Compression is possible RLE Variable-length codes Undecidable Pigeon-holes Patterns

More information

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey Data Compression Media Signal Processing, Presentation 2 Presented By: Jahanzeb Farooq Michael Osadebey What is Data Compression? Definition -Reducing the amount of data required to represent a source

More information

FauxCrypt - A Method of Text Obfuscation

FauxCrypt - A Method of Text Obfuscation FauxCrypt - A Method of Text Obfuscation Devlin M. Gualtieri Consulting Scientist Ledgewood, New Jersey gualtieri@ieee.org Abstract Warnings have been raised about the steady diminution of privacy. More

More information

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression An overview of Compression Multimedia Systems and Applications Data Compression Compression becomes necessary in multimedia because it requires large amounts of storage space and bandwidth Types of Compression

More information

Repetition 1st lecture

Repetition 1st lecture Repetition 1st lecture Human Senses in Relation to Technical Parameters Multimedia - what is it? Human senses (overview) Historical remarks Color models RGB Y, Cr, Cb Data rates Text, Graphic Picture,

More information

Compression; Error detection & correction

Compression; Error detection & correction Compression; Error detection & correction compression: squeeze out redundancy to use less memory or use less network bandwidth encode the same information in fewer bits some bits carry no information some

More information

CIS 121 Data Structures and Algorithms with Java Spring 2018

CIS 121 Data Structures and Algorithms with Java Spring 2018 CIS 121 Data Structures and Algorithms with Java Spring 2018 Homework 6 Compression Due: Monday, March 12, 11:59pm online 2 Required Problems (45 points), Qualitative Questions (10 points), and Style and

More information

David Rappaport School of Computing Queen s University CANADA. Copyright, 1996 Dale Carnegie & Associates, Inc.

David Rappaport School of Computing Queen s University CANADA. Copyright, 1996 Dale Carnegie & Associates, Inc. David Rappaport School of Computing Queen s University CANADA Copyright, 1996 Dale Carnegie & Associates, Inc. Data Compression There are two broad categories of data compression: Lossless Compression

More information

Compression; Error detection & correction

Compression; Error detection & correction Compression; Error detection & correction compression: squeeze out redundancy to use less memory or use less network bandwidth encode the same information in fewer bits some bits carry no information some

More information

7: Image Compression

7: Image Compression 7: Image Compression Mark Handley Image Compression GIF (Graphics Interchange Format) PNG (Portable Network Graphics) MNG (Multiple-image Network Graphics) JPEG (Join Picture Expert Group) 1 GIF (Graphics

More information

Overview. Last Lecture. This Lecture. Next Lecture. Data Transmission. Data Compression Source: Lecture notes

Overview. Last Lecture. This Lecture. Next Lecture. Data Transmission. Data Compression Source: Lecture notes Overview Last Lecture Data Transmission This Lecture Data Compression Source: Lecture notes Next Lecture Data Integrity 1 Source : Sections 10.1, 10.3 Lecture 4 Data Compression 1 Data Compression Decreases

More information

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION 15 Data Compression Data compression implies sending or storing a smaller number of bits. Although many methods are used for this purpose, in general these methods can be divided into two broad categories:

More information

IMAGE COMPRESSION- I. Week VIII Feb /25/2003 Image Compression-I 1

IMAGE COMPRESSION- I. Week VIII Feb /25/2003 Image Compression-I 1 IMAGE COMPRESSION- I Week VIII Feb 25 02/25/2003 Image Compression-I 1 Reading.. Chapter 8 Sections 8.1, 8.2 8.3 (selected topics) 8.4 (Huffman, run-length, loss-less predictive) 8.5 (lossy predictive,

More information

Ch. 2: Compression Basics Multimedia Systems

Ch. 2: Compression Basics Multimedia Systems Ch. 2: Compression Basics Multimedia Systems Prof. Thinh Nguyen (Based on Prof. Ben Lee s Slides) Oregon State University School of Electrical Engineering and Computer Science Outline Why compression?

More information

Text Compression. Jayadev Misra The University of Texas at Austin July 1, A Very Incomplete Introduction To Information Theory 2

Text Compression. Jayadev Misra The University of Texas at Austin July 1, A Very Incomplete Introduction To Information Theory 2 Text Compression Jayadev Misra The University of Texas at Austin July 1, 2003 Contents 1 Introduction 1 2 A Very Incomplete Introduction To Information Theory 2 3 Huffman Coding 5 3.1 Uniquely Decodable

More information

Ch. 2: Compression Basics Multimedia Systems

Ch. 2: Compression Basics Multimedia Systems Ch. 2: Compression Basics Multimedia Systems Prof. Ben Lee School of Electrical Engineering and Computer Science Oregon State University Outline Why compression? Classification Entropy and Information

More information

I. Introduction II. Mathematical Context

I. Introduction II. Mathematical Context Data Compression Lucas Garron: August 4, 2005 I. Introduction In the modern era known as the Information Age, forms of electronic information are steadily becoming more important. Unfortunately, maintenance

More information

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding Fundamentals of Multimedia Lecture 5 Lossless Data Compression Variable Length Coding Mahmoud El-Gayyar elgayyar@ci.suez.edu.eg Mahmoud El-Gayyar / Fundamentals of Multimedia 1 Data Compression Compression

More information

EE67I Multimedia Communication Systems Lecture 4

EE67I Multimedia Communication Systems Lecture 4 EE67I Multimedia Communication Systems Lecture 4 Lossless Compression Basics of Information Theory Compression is either lossless, in which no information is lost, or lossy in which information is lost.

More information

Lempel-Ziv-Welch (LZW) Compression Algorithm

Lempel-Ziv-Welch (LZW) Compression Algorithm Lempel-Ziv-Welch (LZW) Compression lgorithm Introduction to the LZW lgorithm Example 1: Encoding using LZW Example 2: Decoding using LZW LZW: Concluding Notes Introduction to LZW s mentioned earlier, static

More information

EE-575 INFORMATION THEORY - SEM 092

EE-575 INFORMATION THEORY - SEM 092 EE-575 INFORMATION THEORY - SEM 092 Project Report on Lempel Ziv compression technique. Department of Electrical Engineering Prepared By: Mohammed Akber Ali Student ID # g200806120. ------------------------------------------------------------------------------------------------------------------------------------------

More information

IMAGE PROCESSING (RRY025) LECTURE 13 IMAGE COMPRESSION - I

IMAGE PROCESSING (RRY025) LECTURE 13 IMAGE COMPRESSION - I IMAGE PROCESSING (RRY025) LECTURE 13 IMAGE COMPRESSION - I 1 Need For Compression 2D data sets are much larger than 1D. TV and movie data sets are effectively 3D (2-space, 1-time). Need Compression for

More information

Information Retrieval

Information Retrieval Introduction to Information Retrieval CS3245 Information Retrieval Lecture 4: Dictionaries and Tolerant Retrieval4 Last Time: Terms and Postings Details Ch. 2 Skip pointers Encoding a tree-like structure

More information

Introduction to Data Compression

Introduction to Data Compression Introduction to Data Compression Guillaume Tochon guillaume.tochon@lrde.epita.fr LRDE, EPITA Guillaume Tochon (LRDE) CODO - Introduction 1 / 9 Data compression: whatizit? Guillaume Tochon (LRDE) CODO -

More information

Data Compression Fundamentals

Data Compression Fundamentals 1 Data Compression Fundamentals Touradj Ebrahimi Touradj.Ebrahimi@epfl.ch 2 Several classifications of compression methods are possible Based on data type :» Generic data compression» Audio compression»

More information

Computing in the Modern World

Computing in the Modern World Computing in the Modern World BCS-CMW-7: Data Representation Wayne Summers Marion County October 25, 2011 There are 10 kinds of people in the world: those who understand binary and those who don t. Pre-exercises

More information

So, what is data compression, and why do we need it?

So, what is data compression, and why do we need it? In the last decade we have been witnessing a revolution in the way we communicate 2 The major contributors in this revolution are: Internet; The explosive development of mobile communications; and The

More information

Digital Image Processing

Digital Image Processing Lecture 9+10 Image Compression Lecturer: Ha Dai Duong Faculty of Information Technology 1. Introduction Image compression To Solve the problem of reduncing the amount of data required to represent a digital

More information

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 29 Source Coding (Part-4) We have already had 3 classes on source coding

More information

Lossless compression II

Lossless compression II Lossless II D 44 R 52 B 81 C 84 D 86 R 82 A 85 A 87 A 83 R 88 A 8A B 89 A 8B Symbol Probability Range a 0.2 [0.0, 0.2) e 0.3 [0.2, 0.5) i 0.1 [0.5, 0.6) o 0.2 [0.6, 0.8) u 0.1 [0.8, 0.9)! 0.1 [0.9, 1.0)

More information

An Advanced Text Encryption & Compression System Based on ASCII Values & Arithmetic Encoding to Improve Data Security

An Advanced Text Encryption & Compression System Based on ASCII Values & Arithmetic Encoding to Improve Data Security Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 10, October 2014,

More information

Digital Image Processing

Digital Image Processing Digital Image Processing Image Compression Caution: The PDF version of this presentation will appear to have errors due to heavy use of animations Material in this presentation is largely based on/derived

More information

Engineering Mathematics II Lecture 16 Compression

Engineering Mathematics II Lecture 16 Compression 010.141 Engineering Mathematics II Lecture 16 Compression Bob McKay School of Computer Science and Engineering College of Engineering Seoul National University 1 Lossless Compression Outline Huffman &

More information

CS 206 Introduction to Computer Science II

CS 206 Introduction to Computer Science II CS 206 Introduction to Computer Science II 04 / 25 / 2018 Instructor: Michael Eckmann Today s Topics Questions? Comments? Balanced Binary Search trees AVL trees / Compression Uses binary trees Balanced

More information

Data compression with Huffman and LZW

Data compression with Huffman and LZW Data compression with Huffman and LZW André R. Brodtkorb, Andre.Brodtkorb@sintef.no Outline Data storage and compression Huffman: how it works and where it's used LZW: how it works and where it's used

More information

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression IMAGE COMPRESSION Image Compression Why? Reducing transportation times Reducing file size A two way event - compression and decompression 1 Compression categories Compression = Image coding Still-image

More information

G64PMM - Lecture 3.2. Analogue vs Digital. Analogue Media. Graphics & Still Image Representation

G64PMM - Lecture 3.2. Analogue vs Digital. Analogue Media. Graphics & Still Image Representation G64PMM - Lecture 3.2 Graphics & Still Image Representation Analogue vs Digital Analogue information Continuously variable signal Physical phenomena Sound/light/temperature/position/pressure Waveform Electromagnetic

More information

Data Compression 신찬수

Data Compression 신찬수 Data Compression 신찬수 Data compression Reducing the size of the representation without affecting the information itself. Lossless compression vs. lossy compression text file image file movie file compression

More information

A study in compression algorithms

A study in compression algorithms Master Thesis Computer Science Thesis no: MCS-004:7 January 005 A study in compression algorithms Mattias Håkansson Sjöstrand Department of Interaction and System Design School of Engineering Blekinge

More information

CSC 421: Algorithm Design & Analysis. Spring 2015

CSC 421: Algorithm Design & Analysis. Spring 2015 CSC 421: Algorithm Design & Analysis Spring 2015 Greedy algorithms greedy algorithms examples: optimal change, job scheduling Prim's algorithm (minimal spanning tree) Dijkstra's algorithm (shortest path)

More information

Multimedia Systems. Part 20. Mahdi Vasighi

Multimedia Systems. Part 20. Mahdi Vasighi Multimedia Systems Part 2 Mahdi Vasighi www.iasbs.ac.ir/~vasighi Department of Computer Science and Information Technology, Institute for dvanced Studies in asic Sciences, Zanjan, Iran rithmetic Coding

More information

ITCT Lecture 8.2: Dictionary Codes and Lempel-Ziv Coding

ITCT Lecture 8.2: Dictionary Codes and Lempel-Ziv Coding ITCT Lecture 8.2: Dictionary Codes and Lempel-Ziv Coding Huffman codes require us to have a fairly reasonable idea of how source symbol probabilities are distributed. There are a number of applications

More information

Lecture #3: Digital Music and Sound

Lecture #3: Digital Music and Sound Lecture #3: Digital Music and Sound CS106E Spring 2018, Young In this lecture we take a look at how computers represent music and sound. One very important concept we ll come across when studying digital

More information

Administrivia. FEC vs. ARQ. Reliable Transmission FEC. Last time: Framing Error detection. FEC provides constant throughput and predictable delay

Administrivia. FEC vs. ARQ. Reliable Transmission FEC. Last time: Framing Error detection. FEC provides constant throughput and predictable delay FEC vs. ARQ Administrivia FEC provides constant throughput and predictable delay If high error rate, need long codes/complex circuitry Does not protect against all errors, or packet loss Last time: Framing

More information

Multimedia Networking ECE 599

Multimedia Networking ECE 599 Multimedia Networking ECE 599 Prof. Thinh Nguyen School of Electrical Engineering and Computer Science Based on B. Lee s lecture notes. 1 Outline Compression basics Entropy and information theory basics

More information

Compression. storage medium/ communications network. For the purpose of this lecture, we observe the following constraints:

Compression. storage medium/ communications network. For the purpose of this lecture, we observe the following constraints: CS231 Algorithms Handout # 31 Prof. Lyn Turbak November 20, 2001 Wellesley College Compression The Big Picture We want to be able to store and retrieve data, as well as communicate it with others. In general,

More information
