Image Compression. cs2: Computational Thinking for Scientists.

Similar documents
Multimedia Systems. Part 4. Mahdi Vasighi

Bytes are read Right to Left, so = 0x3412, = 0x

IMAGE COMPRESSION USING FOURIER TRANSFORMS

7: Image Compression

Image coding and compression

Common File Formats. Need a standard to store images Raster data Photos Synthetic renderings. Vector Graphic Illustrations Fonts

This is not an official directory; it is for voluntary participation only and does not guarantee that someone will not use the same identifier.

This is not yellow. Image Files - Center for Graphics and Geometric Computing, Technion 2

Compression. storage medium/ communications network. For the purpose of this lecture, we observe the following constraints:

Simple variant of coding with a variable number of symbols and fixlength codewords.

Digital Image Processing

Lecture Coding Theory. Source Coding. Image and Video Compression. Images: Wikipedia

A Novel Image Compression Technique using Simple Arithmetic Addition

color bit depth dithered

Image compression. Stefano Ferrari. Università degli Studi di Milano Methods for Image Processing. academic year

Dissecting Files. Endianness. So Many Bytes. Big Endian vs. Little Endian. Example Number. The "proper" order of things. Week 6

Data Compression. Media Signal Processing, Presentation 2. Presented By: Jahanzeb Farooq Michael Osadebey

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression

Table of Contents. Logo. Colour

An introduction to JPEG compression using MATLAB

VC 12/13 T16 Video Compression

Other Media Primer. Contents

Data Representation From 0s and 1s to images CPSC 101

Compressing Data. Konstantin Tretyakov

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding

Data Compression Techniques

Image Coding and Compression

Graphics File Formats

Unit 2 Digital Information. Chapter 1 Study Guide

So, what is data compression, and why do we need it?

Lecture 5: Compression I. This Week s Schedule

A QUAD-TREE DECOMPOSITION APPROACH TO CARTOON IMAGE COMPRESSION. Yi-Chen Tsai, Ming-Sui Lee, Meiyin Shen and C.-C. Jay Kuo

Lecture 8 JPEG Compression (Part 3)

Data Compression. An overview of Compression. Multimedia Systems and Applications. Binary Image Compression. Binary Image Compression

Welcome. Web Authoring: HTML - Advanced Topics & Photo Optimisation (Level 3) Richard Hey & Barny Baggs

EE-575 INFORMATION THEORY - SEM 092

Chapter 1. Digital Data Representation and Communication. Part 2

Data and information. Image Codning and Compression. Image compression and decompression. Definitions. Images can contain three types of redundancy

255, 255, 0 0, 255, 255 XHTML:

IMAGE COMPRESSION TECHNIQUES

EE67I Multimedia Communication Systems Lecture 4

CP SC 4040/6040 Computer Graphics Images. Joshua Levine

SIGNAL COMPRESSION Lecture Lempel-Ziv Coding

CS 493: Algorithms for Massive Data Sets Dictionary-based compression February 14, 2002 Scribe: Tony Wirth LZ77

TEXT -> IMAGE ENCODING FLOWCHART (LSB METHOD) start. store color depth of screen. false. false initialize psuedo rand # generator.

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

Lecture 8 JPEG Compression (Part 3)

Data compression with Huffman and LZW

MULTIMEDIA AND CODING

CS 335 Graphics and Multimedia. Image Compression

Colour and Number Representation. From Hex to Binary and Back. Colour and Number Representation. Colour and Number Representation

Repetition 1st lecture

BRAND STYLE GUIDE 2016

A Hybrid Image Compression Technique using Quadtree Decomposition and Parametric Line Fitting for Synthetic Images

CS101 Lecture 12: Image Compression. What You ll Learn Today

IMAGE PROCESSING (RRY025) LECTURE 13 IMAGE COMPRESSION - I

Lempel-Ziv-Welch (LZW) Compression Algorithm

Data Storage. Slides derived from those available on the web site of the book: Computer Science: An Overview, 11 th Edition, by J.

Introduction to Data Compression

Minification techniques

Robert Matthew Buckley. Nova Southeastern University. Dr. Laszlo. MCIS625 On Line. Module 2 Graphics File Format Essay

Standard File Formats

HTML/XML. XHTML Authoring

CREATING A BANNER IN PHOTOSHOP

ITP 140 Mobile App Technologies. Colors

ITCT Lecture 8.2: Dictionary Codes and Lempel-Ziv Coding

Image Coding. Image Coding

1.6 Graphics Packages

7.5 Dictionary-based Coding

LZW Compression. Ramana Kumar Kundella. Indiana State University December 13, 2014

Visual Cryptography of Animated GIF Image Based on XOR Operation

Multimedia Systems. Part 20. Mahdi Vasighi

MARMOL BRAND GUIDELINES APRIL Powered by TECKpert.com

Dictionary techniques

G64PMM - Lecture 3.2. Analogue vs Digital. Analogue Media. Graphics & Still Image Representation

Compression; Error detection & correction

Chapter 9: Data Transmission

Electronic Artwork Information for Authors Glossary and definitions

Representing Characters, Strings and Text

The PackBits program on the Macintosh used a generalized RLE scheme for data compression.

Adding CSS to your HTML

THE RELATIVE EFFICIENCY OF DATA COMPRESSION BY LZW AND LZSS

Chapter 1 (Computer Forensics)

VIDEO SIGNALS. Lossless coding

Example 1: Denary = 1. Answer: Binary = (1 * 1) = 1. Example 2: Denary = 3. Answer: Binary = (1 * 1) + (2 * 1) = 3

Java Oriented Object Programming II Files II - Binary I/O Lesson 3

Numbers and their bases

Overview. Last Lecture. This Lecture. Next Lecture. Data Transmission. Data Compression Source: Lecture notes

Introduction to HTML. SSE 3200 Web-based Services. Michigan Technological University Nilufer Onder

MEDIA RELATED FILE TYPES

ITP 101 Project 2 - Photoshop

Lempel-Ziv-Welch Compression

Entropy Coding. - to shorten the average code length by assigning shorter codes to more probable symbols => Morse-, Huffman-, Arithmetic Code

Formati grafici e Multimediali WWW. Davide Rossi Aprile 2002

DigiPoints Volume 1. Student Workbook. Module 8 Digital Compression

Representing Characters and Text

Digital Technologies Hangarau Matihiko Level 1

CIS 121 Data Structures and Algorithms with Java Spring 2018

Engineering Mathematics II Lecture 16 Compression

Optimized Compression and Decompression Software

Transcription:

Image Compression cs2: Computational Thinking for Scientists Çetin Kaya Koç http://cs.ucsb.edu/~koc/cs2 koc@cs.ucsb.edu The course was developed with input from: Ömer Eǧecioǧlu (Computer Science), Maribel Bueno Cachadina (Mathematics/CCS), Rolf Christoffersen (MCDB), Richard Church (Geography), Wendy Meiring (PSTAT), Todd Oakley (EEMB), and Joan Shea (Chemistry) (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 1 / 28

Size of an Image The size of an image is the number of bits an image file takes Additionally there is (generally) a header file, giving (meta)-information about the image The meta-information informs the application program about the type of the image, the color coding method (rgb, cmyk), and other relevant data The header file is usually very small, a few tens of bytes Image files take significantly more space than text files, as expected (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 2 / 28

Size of an Image Consider the 500 500 image file we studied, represented in rgb: Since each pixel takes 3 bytes (24 bits), we calculate the (raw) size of this image as 3 500 500 = 750,000 bytes or 6,000,000 bits (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 3 / 28

Size of an Image However, the Unix command "ls -l" shows its actual size as 140,461 bytes That is because the image file leo.gif is compressed The purpose of image compression is to have a significantly reduced file size for storage and transmission efficiency Most commonly used image compression methods are: gif, jpg, png (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 4 / 28

Image Compression Methods Each one of the file types (gif, jpg, png) refer to a different image compression method We will study GIF, which is the acronym of Graphics Interface Format GIF was developed by CompuServe and published in June 15, 1987: http://www.w3.org/graphics/gif/spec-gif87.txt CompuServe was the first major commercial online service CompuServe dominated the field in the 80s, up until mid-90s, along with AOL (which later purchased CompuServe) (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 5 / 28

GIF File Format The Hex-Edit program shows the first few bytes of the file leo.gif as All gif image files has their first 6 bytes as 47 49 46 38 37 61 which encodes the ASCI text GIF87a to denote that this is gif image file, described in a standard document published in 1987 These 6 bytes are also called GIF Signature The last three characters 87a is the version number, and refers to one of the standards of GIF (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 6 / 28

GIF File Format The following 2 bytes shows the image size: First the width (the number of columns) and the height (the number of rows), represented in hexadecimal and the least significant byte first Since image file leo.gif is of size 500 500, and the hex representation of 500 is 01F4, we see these two bytes ordered as F401 F401 (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 7 / 28

GIF File Format Consider this image, which is of size 500 400 (500 columns and 400 rows, or width is 500 while the height is 400 pixels) Since the hex representation of 500 is 01F4 and 400 is 0190, we see these two bytes ordered as F401 9001 (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 8 / 28

GIF File Format The next byte holds the image and color map information It contains the following bits from left to right: 7 6 5 4 3 2 1 0 M cr 0 pixel M is a 1-bit; M = 1 implies a color map exists cr is 3 bits; cr +1 gives the number of bits of color resolution pixel is 3 bits; 2 pixel+1 gives the size of color table (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 9 / 28

GIF File Format In image file leo.gif we have this byte as F7 = (1111 0111) 7 6 5 4 3 2 1 0 M cr 0 pixel 1 111 0 111 Thus, we have M = 1, cr = 7, and pixel = 7 cr = 7 implies that the color resolution 8 bits, pixel = 7 implies that the size of the color table is 2 8 = 256, which means there are 256 colors in this image (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 10 / 28

GIF File Format 256 is the maximum number of colors allowable in a GIF image A raw rgb image would have 2 24 = 16,777,216 colors, most of which are indistinguishable for human eye Also many displays are incapable of displaying such number of colors Therefore, GIF standard restricts the number of different colors as 256, and keeps them in a table inside the image, which is called the color table or the color palette Since there 256 different colors, and each color has 3 byte values, the maximum color table would have 3 256 = 768 bytes However, the color table will only keep the colors used in that particular image, and therefore, it could be shorter (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 11 / 28

GIF File Format Before getting into color tables, we should mention that the next byte (after F7) in leo.gif (which is 00) is the background color index This byte represents the color in the global color table to be used for pixels whose value is not specified in the image data This byte should be zero, if there is no global color table or there is no need for a background pixel Finally the last byte (in leo.gif: 00) is the aspect ratio of the pixel This byte is meaningful for pixels that are not square in size, which is applicable to certain printer and display technologies If this byte is zero, we have only square pixels, otherwise if it has the value of x (between 1 and 255), then the aspect ratio is x+15 64 (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 12 / 28

GIF Color Table A randomly selected image (a bird, a flower, a person, sky, etc.) could have any set of colors in them It is therefore not wise to select a small set set of colors and force every imaginable image (!) to use these colors Instead we should let an image to have its own palette of colors This is exactly what GIF standard had done (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 13 / 28

GIF Color Table GIF provides a color table to specify up to 256 colors for an image The color table has at most 256 rows Since each color is 3 bytes, each row takes 3 bytes index red green blue 0 FF 00 00 1 00 FF 00 2 00 00 FF 255 FF FF FF However, some images may use much fewer than that, and thus, in such cases, the color table will be smaller (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 14 / 28

GIF Color Table The color table in image file leo.gif has 256 colors It starts with 000000 and proceeds as 300204 05040B... This means the first color used in the image is 000000 which is black The next color is 300204 which is a very dark red: http://www.colorhexa.com/300204 The next color is 05040B which is a very dark blue: http://www.colorhexa.com/05040b Another web site for displaying RGB colors: http://www.rapidtables.com/web/color/rgb_color.htm (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 15 / 28

GIF Color Table Let us take a simple image of size 10 10 pixels This is a GIF89a file, whose representation is given as 47 49 46 38 39 61 0A 00 0A 00 91 00 00 FF FF FF FF 00 00 00 00 FF 00 00 00 21 F9 04 00 00 00 00 00 2C 00 00 00 00 0A 00 0A 00 00 02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00 3B (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 16 / 28

GIF Color Table The byte for color map in this file is 91: 1 001 0 001 Therefore, M = 1, cr = 1, and pixel = 1 M = 1 implies that we have a global color table cr = 1 implies that the color resolution is 2 bits pixel = 1 implies the size of color table is 2 2 = 4, i.e., we have only 4 colors in this image These colors are given in the image file in this order: FFFFFF: white FF0000: red 0000FF: blue 000000: black (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 17 / 28

GIF File Format The next 8 bytes after the color table is called Graphics Control Extension : 21 F9 04 00 00 00 00 00 The first two bytes are 21F9, and is called the extension introducer, which always starts with 21F9 The third byte is the total block size in bytes: 04 The fourth byte is a packed field, containing several flag bits: 00 The delay time value follows in the next two bytes: 00 After that we have the transparent color index byte: 00 Finally we have the block terminator which is always 00 (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 18 / 28

GIF File Format The next 10 bytes is the image descriptor: 2C 00 00 00 00 0A 00 0A 00 00 A single GIF file may contain multiple images This is useful for creating animated images Each image begins with an image descriptor block This block is exactly 10 bytes long The first byte is the image separator: 2C Every image descriptor begins with the value 2C The next 8 bytes represent the location and size of the image (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 19 / 28

GIF File Format An image in the stream may not necessarily take up the entire canvas size defined by the logical screen descriptor Therefore, the image descriptor specifies the image left position and image top position of where the image should begin on the canvas These are (x,y) = (00 00,00 00) in this image The next 4 bytes specify the image width and image height: (w,h) = (00 0A,00 0A) Our sample image starts at (0,0) and is of size 10 10 This image does take up the whole canvas size (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 20 / 28

GIF File Format The last byte is another packed byte: 00 In our sample file this byte is zero, so all bits are zero The first (most significant) bit in the byte is the local color table flag Setting this flag to 1 implies that the image data is using a different color table than the global color table The local color table is similar to the global color table If no local color table is specified, the global color table is used (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 21 / 28

GIF File Format Finally, we get to the actual image data: 02 16 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 00 The image data is composed of a series of output codes which tell the decoder which colors to spit out to the canvas These codes are combined into the bytes that make up the block GIF files are encoded using an encoding scheme called LZW, which stands for Lempel-Ziv-Welch LZW is a universal lossless data compression algorithm created by Abraham Lempel, Jacob Ziv, and Terry Welch (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 22 / 28

GIF File Format The first byte of this block is the LZW minimum code size The next byte indicates that there is 22 (Hex: 16) bytes of data These 22 bytes represent the compressed image data After we have read those 22 bytes, we see the next value is 00 This marks the end of the image data Finally, all GIF files end with the trailer marker 3B (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 23 / 28

LZW Compression Finally, we need to understand how the 100 pixels in our sample image are to be encoded using only 22 bytes 8C 2D 99 87 2A 1C DC 33 A0 02 75 EC 95 FA A8 DE 60 8C 04 91 4C 01 The color table had 4 colors, and the colors were placed in the color table in this order: white (0), red (1), blue (2), black (3) FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 0000FF 0000FF FF0000 FF0000 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 1 1 1 0 0 0 0 2 2 2 1 1 1 0 0 0 0 2 2 2 2 2 2 0 0 0 0 1 1 1 2 2 2 0 0 0 0 1 1 1 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 1 1 1 1 1 (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 24 / 28

LZW Compression However, even if we choose not to use LZW compression at this stage, we have already accomplished a lot Normally, 3-byte representation of colors would have required 100 3 = 300 bytes for the raw image The above matrix requires at most: 100 1 = 100 bytes, if we reserve one byte for every pixel However, the existing color indices are only one of these 3 values: {0,1,2} (white, red, blue) We can reserve only 2 bits per pixel, and thus, we would need 100 2 = 200 bits! (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 25 / 28

LZW Compression The LZW algorithm is slightly more efficient It represents these 100 pixels using 22 bytes or 176 bits To understand how LZW works, we need to study the algorithm However, in order to understand how data compression methods work, we need to start with simpler methods, such as Huffman encoding We will study data compression methods later on in this course (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 26 / 28

Summary Bytes Meaning 47 49 46 38 39 61 GIF89a 0A 00 0A 00 Size: 10 10 91 Packed byte: 1 001 0 001 Color Table Flag = 1, cr = 1, pixel = 1 Color Table Size = 2 2 = 4 colors 00 00 Background Color and Pixel Aspect Ratio FF FF FF White FF 00 00 Red 00 00 FF Blue 00 00 00 Black (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 27 / 28

Summary Bytes Meaning 21 F9 04 00 00 00 00 00 Graphics Control Extension 2C 00 00 00 00 0A 00 0A 00 00 Image Descriptor 02 16 LZW Code Size and the Number of Bytes 8C 2D 99 87 2A 1C DC 33 A0 02 75 LZW Encoded Image File EC 95 FA A8 DE 60 8C 04 91 4C 01 00 End of Image 3B End of File (http://cs.ucsb.edu/~koc/cs2) cs2 comp think lecture05 - winter 2015 28 / 28