Data Representation From 0s and 1s to images CPSC 101
Learning Goals After the Data Representation: Images unit, you will be able to: recognize and translate between binary and decimal numbers; define bit, byte, word; explain how to represent an image as a grid of pixels (raster graphics); define font, glyph and character set, and explain how they produce text; define the terms pixel and raster graphics; construct or recognize colours in the hexadecimal form used in HTML pages, given a chart to convert between hexadecimal and decimal numbers; explain how to represent an image as a list of drawing commands (vector graphics); compare and contrast the suitability of raster and vector representations for different image representation needs; describe lossless and lossy compression and their relative advantages and disadvantages, and give examples of each.
The Decimal system We are used to representing numerical information using the decimal system. In 248 there is a 2 in the hundreds place, a 4 in the tens place and an 8 in the ones place: 2*10^2 + 4*10^1 + 8*10^0 = 200 + 40 + 8 = 248. We use the digits 0..9 to represent all numbers
The binary system With computers, we are limited to representing numbers with the binary system. For example, 11111000 means 1*2^7 + 1*2^6 + 1*2^5 + 1*2^4 + 1*2^3 + 0*2^2 + 0*2^1 + 0*2^0 = 128 + 64 + 32 + 16 + 8 + 0 + 0 + 0 = 248. We use the digits 0 and 1 to represent all numbers
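The place-value sums on the last two slides can be sketched in JavaScript. This is only an illustrative sketch: the function names binaryToDecimal and decimalToBinary are made up for this example and are not part of the course materials.

```javascript
// Convert a string of binary digits to a number by adding up place values.
function binaryToDecimal(bits) {
  let total = 0;
  for (let i = 0; i < bits.length; i++) {
    // The leftmost digit has the largest exponent.
    const exponent = bits.length - 1 - i;
    if (bits[i] === "1") {
      total += 2 ** exponent;
    }
  }
  return total;
}

// Convert a non-negative whole number to a binary string,
// padded on the left with zeros up to the requested width.
function decimalToBinary(value, width) {
  let bits = "";
  let remaining = value;
  while (remaining > 0) {
    bits = (remaining % 2) + bits; // the remainder is the next digit, right to left
    remaining = Math.floor(remaining / 2);
  }
  return bits.padStart(width, "0");
}

console.log(binaryToDecimal("11111000")); // 248
console.log(decimalToBinary(248, 8)); // "11111000"
```

Repeatedly dividing by 2 and keeping the remainders is one common way to go from decimal to binary by hand as well.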
Binary system terminology A single digit (i.e. a 0 or a 1) is called a bit In a computer, a bit is an electrical signal that is either on or off We string 8 of these together and call it a byte A byte can represent 256 different things (i.e. 256 numbers, or 256 letters) You string bytes together to get a word If you have an 8-bit architecture, then a word is a single byte In a 32-bit system, words comprise 4 bytes, etc. A word can contain a computer instruction, a storage address or application data, etc.
Question Convert 58 to an 8-bit binary number a) 01011110 b) 00111111 c) 00111010 d) 00011101
Question Convert 01011010 to decimal a) 180 b) 90 c) 80 d) 73
Representing Characters All data in a computer is stored in binary (i.e., 1s and 0s) How can we get text out of 1s and 0s?
Character Sets ASCII (ASK-ee) -- American Standard Code for Information Interchange Sequences of 7 bits represent 128 characters: 33 non-printable control characters (0 to 31, and 127) and 95 printable characters (32 to 126) We can convert from bit sequence to character (letter) using a chart Do you notice anything odd? Can we represent all languages with ASCII?
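Looking a character up in the ASCII chart can also be done in code. The sketch below uses JavaScript's built-in charCodeAt and String.fromCharCode; the function names charToAsciiBits and asciiBitsToChar are hypothetical names chosen for this example.

```javascript
// Return the 7-bit ASCII pattern for a single character.
function charToAsciiBits(ch) {
  const code = ch.charCodeAt(0);
  if (code > 127) {
    // ASCII only covers the numbers 0 to 127.
    throw new RangeError("Not an ASCII character: " + ch);
  }
  return code.toString(2).padStart(7, "0");
}

// Return the character for a 7-bit pattern.
function asciiBitsToChar(bits) {
  return String.fromCharCode(parseInt(bits, 2));
}

console.log(charToAsciiBits("A")); // "1000001" (decimal 65)
console.log(asciiBitsToChar("1100001")); // "a" (decimal 97)
```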
Character Sets Unicode numbers can be stored as sequences of up to 32 bits. Unicode defines well over 100,000 characters, including over 70,000 ideograms, and is the industry standard. Unicode is supported in HTML: to use a Unicode character in your document, type &# followed by your Unicode number and a semicolon (you would look up the number in a Unicode chart), e.g. &#37; produces the percent sign
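Instead of looking a number up in a Unicode chart, you can ask JavaScript for it. This is a minimal sketch using the built-in codePointAt; the function name toNumericCharacterReference is made up for this example.

```javascript
// Build the HTML form "&#NUMBER;" for the first character of a string.
function toNumericCharacterReference(ch) {
  return "&#" + ch.codePointAt(0) + ";";
}

console.log(toNumericCharacterReference("%")); // "&#37;"
console.log(toNumericCharacterReference("\u00e9")); // "&#233;" (e with an acute accent)
```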
Text files These characters (ASCII or Unicode) are then stored in files Can be viewed/modified using a text editor (e.g. Notepad)
Fonts There are many ways to actually display a character We do this by mapping a character to an abstract glyph A glyph is what the character ends up looking like on the screen A set of glyphs, grouped according (usually!) to design, size, etc, as well as the mapping for how to translate between characters and glyphs is called a font
Fonts There are two general classes of fonts: Serif fonts have little lines at the end of the character e.g. Times New Roman Sans-serif fonts do not e.g. Arial
Representing Images If everything in a computer is just 0s and 1s, then how do we represent images? A numerical representation is a good way to faithfully transmit an image. Text and music have had abstract symbolic notational systems for thousands of years; the visual arts have just achieved such a system for the first time - Anne Morgan Spalter
Transmitting Images
Raster Representation We cannot transmit images as only numbers until we agree on a data representation scheme. Let's try this one: Give the height and width of the image as numbers For each block in the image, send a 0 if it's black and a 1 if it's white Let's use exactly three digits for height and width. We call the blocks pixels, short for picture elements.
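The scheme on this slide can be sketched in JavaScript. Some details here are assumptions made for the example: the picture is written as rows of text where "#" is a black block and "." is a white block, the height is sent before the width, and the names encodeRaster and decodeRaster are made up.

```javascript
// Turn rows like ["###", ".#.", ".#."] into one string of digits:
// three digits of height, three digits of width, then one digit per pixel.
function encodeRaster(rows) {
  const height = String(rows.length).padStart(3, "0");
  const width = String(rows[0].length).padStart(3, "0");
  let pixels = "";
  for (const row of rows) {
    for (const block of row) {
      pixels += block === "#" ? "0" : "1"; // 0 = black, 1 = white
    }
  }
  return height + width + pixels;
}

// Rebuild the rows from the string of digits.
function decodeRaster(message) {
  const height = parseInt(message.substring(0, 3), 10);
  const width = parseInt(message.substring(3, 6), 10);
  const rows = [];
  for (let r = 0; r < height; r++) {
    const start = 6 + r * width; // skip the six header digits
    const digits = message.substring(start, start + width);
    rows.push(digits.replace(/0/g, "#").replace(/1/g, "."));
  }
  return rows;
}

const letterT = ["###", ".#.", ".#."];
console.log(encodeRaster(letterT)); // "003003000101101"
console.log(decodeRaster("003003000101101")); // [ '###', '.#.', '.#.' ]
```

Without the agreed header, the receiver could not tell a 3 by 3 picture from a 1 by 9 one, which is why both sides must share the scheme.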
Transmitting Images
Shades of Grey How should we represent an image that's not just black and white as numbers?
Full Colour How about an image in full colour?
Common Raster file formats (Also called bitmap) GIF: Graphics Interchange Format JPEG: Joint Photographic Experts Group PNG: Portable Network Graphics BMP: Windows bitmap TIFF: Tagged Image File Format
Problems with Raster graphics Raster graphics do not scale well: when resized, you often see a visible reduction in quality Such distorted images are referred to as pixelated
Scaling with Raster representation original scaled by 400%
Scaling with Raster representation original scaled by 800%
Is Raster Representation a Good Match for These Devices? Eastman Static Cutting Table Model M9000 From http://www.eastmancuts.com/products/m9000.htm
Image Representation for a Plotter How would you use numbers to represent an image on a plotter? How about to represent the A we've been working with?
Vector Representation of Images Our vector representation will be a list of sets of four numbers: x and y coordinate of the start of a line and x and y coordinate of the end of a line. So, vector graphics do not use pixels! Essentially they tell the computer to start drawing at a given point, at a particular angle, and stop drawing at a given point Real vector representations also include much more information such as the end style of a line (rounded or square), shapes, colour, transparency, etc Such graphics keep their integrity, or remain clean, regardless of the resolution
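A list of sets of four numbers is easy to write down in JavaScript. The coordinates below are invented for illustration (a rough capital A made of three lines), and the function names scaleLines and toPlotterCommands are hypothetical.

```javascript
// Each line is [startX, startY, endX, endY].
const letterA = [
  [0, 0, 2, 6], // left leg
  [2, 6, 4, 0], // right leg
  [1, 3, 3, 3], // crossbar
];

// Scaling a vector image just multiplies every coordinate.
// No pixels are involved, so nothing becomes blocky.
function scaleLines(lines, factor) {
  return lines.map((line) => line.map((n) => n * factor));
}

// Describe the drawing as instructions a plotter-like device could follow.
function toPlotterCommands(lines) {
  return lines.map(
    ([x1, y1, x2, y2]) => `move to (${x1}, ${y1}), draw to (${x2}, ${y2})`
  );
}

console.log(toPlotterCommands(letterA));
console.log(scaleLines(letterA, 4)); // the same A, four times larger
```

Notice that the scaled image needs exactly as many numbers as the original, no matter how large it gets.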
Common Vector File Formats SVG: Scalable Vector Graphics EPS: Encapsulated PostScript PICT: Macintosh Picture WMF: Windows Metafile
Which is Better: Raster or Vector Representation?
Common Uses of Raster and Vector Representations Raster is used for: representing photographs, most computer displays, some scientific data (like grid-based maps of wildfires) and much more! Vector is used for: most computer models of real figures (like the models of the characters in computer animated films), most fonts (visual representations of letters and numbers), some scientific data (like elevation profiles for topographical maps) and much more!
Playing Tricks on the Human Vision System The human visual system has lots of weird properties
Image Compression Images, and many other types of data, require a lot of information to store. E.g., a 1000x1000 pixel image contains 1 million pixels. If each pixel requires 3 bytes (one per colour channel, which is fairly standard), each image would require about 3 MB to store or transmit. To deal with this, image compression techniques are used. There are two basic types of compression: lossless and lossy.
Lossless compression Lossless compression techniques store all the information about an image but in a potentially much smaller size. How do they do this? Well a simple example would be this. Imagine compressing the image to the right. Rather than storing each pixel, we can represent the image as: red (1-500,000), blue (500,001-1,000,000) This is a very simplified approach but there exist more sophisticated lossless compression techniques.
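The idea of storing "which colour, for how long" instead of every pixel is known as run-length encoding. Here is a minimal sketch of it, assuming the image is given as a flat list of colour names; the names runLengthEncode and runLengthDecode are made up for this example, and a tiny 10-pixel image stands in for the million-pixel one.

```javascript
// Replace each stretch of identical pixels with one {colour, count} entry.
function runLengthEncode(pixels) {
  const runs = [];
  for (const colour of pixels) {
    const last = runs[runs.length - 1];
    if (last && last.colour === colour) {
      last.count += 1; // same colour as before: extend the current run
    } else {
      runs.push({ colour: colour, count: 1 }); // new colour: start a new run
    }
  }
  return runs;
}

// Expand the runs back into the full list of pixels.
function runLengthDecode(runs) {
  const pixels = [];
  for (const run of runs) {
    for (let i = 0; i < run.count; i++) {
      pixels.push(run.colour);
    }
  }
  return pixels;
}

const pixels = Array(5).fill("red").concat(Array(5).fill("blue"));
const runs = runLengthEncode(pixels);
console.log(runs); // two runs: 5 red, then 5 blue
console.log(runLengthDecode(runs).length); // 10: every pixel is recovered
```

Decoding gives back exactly the original pixels, which is what makes the technique lossless. It saves the most space on images with large areas of a single colour.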
Lossy Compression Lossy Compression techniques reduce the size of an image and lose some of the original information but retain a good approximation of the original. The math behind these techniques can be extremely complex. A simple approach is to just reduce the image resolution by averaging together every group of four pixels into 1.
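The averaging approach can be sketched as follows. This is a simplified illustration, not how JPEG or other real formats work: it assumes a greyscale image stored as a grid of numbers from 0 to 255 with an even width and height, and the name halveResolution is made up.

```javascript
// Average each 2x2 group of pixels into a single pixel.
// The result has half the width and half the height of the original.
function halveResolution(grid) {
  const result = [];
  for (let r = 0; r < grid.length; r += 2) {
    const row = [];
    for (let c = 0; c < grid[r].length; c += 2) {
      const sum =
        grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1];
      row.push(Math.round(sum / 4));
    }
    result.push(row);
  }
  return result;
}

const original = [
  [10, 20, 200, 210],
  [30, 40, 220, 230],
];
console.log(halveResolution(original)); // [ [ 25, 215 ] ]
```

The result needs a quarter of the storage, but the four original values in each group cannot be recovered from their average, which is what makes this lossy.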
No Compression Low Compression (84% reduced) Medium Compression (92% reduced) High Compression (98% reduced)
Playing Tricks on a Computer Monitor One of those weird properties is that distinct stimuli that are sufficiently close together fuse into a single stimulus Computer (and TV) monitors take advantage of this by displaying small areas of red, green, and blue light close together, which our eyes mix into colour. EXPERIMENT: using an old monitor that you're not too worried about, sprinkle a few water drops on the screen to magnify the RGB spots!
Nitty-Gritty Colour Details Computers represent colour using binary numbers, just as for everything else. A colour is broken into a red, green, and blue channel. (Why? Because it works for the human eye!) About 200 levels for each channel turns out to be enough, and we can represent exactly 256 values with 8 binary digits. So, a colour is represented as three values between 0 and 255.
Hexadecimal Colour Representation for HTML But do you want to express colours on your web pages as: 00000000, 01101010, 11111111? Hard to read! How about 0, 106, 255? Clearer (no red, some green, lots of blue: like this) but less convenient for computers. Instead, we use base 16, hexadecimal: 00, 6A, FF. It's close to binary, every channel takes exactly two digits to write, and it's somewhat readable by humans. In HTML, we'd write: #006AFF: 00 red, 6A green, FF blue. You should be able to answer questions like "give the HTML code for a bright blue" or "what colour is #FF00FF?"
JavaScript Code

function colortexttogreennumber(colortext) {
  // In "#RRGGBB", the characters at positions 3 and 4 are the green digits.
  var greentext = colortext.substring(3, 5);
  // Read the two digits as a base 16 number.
  var greennum = parseInt(greentext, 16);
  return greennum;
}

Converts a hex color of the form #XXXXXX to the decimal value for green (Functions for red or blue would look similar.)

function colornumbertotext(colornum) {
  // Write the number in base 16.
  var colortext = colornum.toString(16);
  // Pad single-digit results so every channel takes two digits.
  if (colortext.length == 1) {
    colortext = "0" + colortext;
  }
  return colortext;
}

function colornumberstotext(red, green, blue) {
  return "#" + colornumbertotext(red) + colornumbertotext(green) + colornumbertotext(blue);
}

Converts three decimal colors to the text representation that you can use in HTML. Can you see how we're exploiting the fact that every color takes two digits?
Learning Goals After the Image Representation unit, you will be able to: recognize and translate between binary and decimal numbers; define bit, byte, word; explain how to represent an image as a grid of pixels (raster graphics); define font, glyph and character set, and explain how they produce text; define the terms pixel and raster graphics; construct or recognize colours in the hexadecimal form used in HTML pages, given a chart to convert between hexadecimal and decimal numbers; explain how to represent an image as a list of drawing commands (vector graphics); compare and contrast the suitability of raster and vector representations for different image representation needs.
What's Next? We've looked at two widely used data representation schemes for images. Now we know how computers can store and manipulate visual data. How have humans used those abilities to create art and explore creativity?
TO DO Keep working on your projects. Quiz 2 will be on March 24. If you want extra practice with JavaScript, come and see me (or one of the TAs) soon! As always, do the posted readings