Administrative Notes February 9, 2017 Feb 10: Project proposal resubmission (optional) Feb 13: Art and Images reading quiz Feb 17: In the News call #2
Data Representation: Part 2 Text representation Colour representation
Learning goals Text representation [CT Building Block] Given a list of ASCII codes, students will be able to decode an ASCII representation of a short text document. [CT Building Block] Students will be able to explain why opening a non-ASCII file (e.g., a Word document) in a text editor results in a different display than when the same document is opened in its intended application.
How do we store letters in hex (or binary)? ASCII 128 values (7 bits, since 2^7 = 128) https://en.wikipedia.org/wiki/ascii
How do we store letters in hex (or binary)? ASCII 128 values (7 bits, since 2^7 = 128) ASCII (American Standard Code for Information Interchange) was developed in the 1960s. In addition to letters and numbers, punctuation, spaces and other special control characters are encoded; each encoded item is sometimes called a code point. Why 7 bits? With 7 bits used for the character, the eighth bit of a byte could serve as a check bit to detect certain errors that might arise, e.g., when sending data over a modem. Extended ASCII uses 8 bits (or one byte), allowing for characters with accents (Á, ë and others). https://en.wikipedia.org/wiki/ascii
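One way to picture the check bit is a parity bit. The sketch below is only an illustration under an assumption the slide does not state: it uses even parity, meaning the eighth bit is chosen so that every transmitted byte has an even number of 1 bits. The function names are made up for this example.

```python
def add_even_parity(code: int) -> int:
    """Pack a 7-bit ASCII code and an even-parity bit into one 8-bit value."""
    if not 0 <= code < 128:
        raise ValueError("ASCII codes fit in 7 bits (0-127)")
    ones = bin(code).count("1")
    parity = ones % 2            # 1 only when the count of 1 bits is odd
    return (parity << 7) | code  # the parity bit goes in the top (eighth) position


def looks_valid(byte: int) -> bool:
    """Return True when the 8-bit value has an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


sent = add_even_parity(ord("A"))   # 'A' is hex 41
corrupted = sent ^ 0b00000100      # flip one bit, as line noise might
print(format(sent, "08b"), looks_valid(sent))
print(format(corrupted, "08b"), looks_valid(corrupted))
```

A single flipped bit changes the count of 1 bits from even to odd, so the receiver can tell that something went wrong. Two flipped bits would cancel out, which is why the slide says "certain errors" rather than all errors.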
Translating from ASCII (hex) to text Example
Hex  Binary    Symbol
41   01000001  A
42   01000010  B
43   01000011  C
44   01000100  D
45   01000101  E
46   01000110  F
47   01000111  G
Binary: 01000110 01000001 01000011 01000101
Hex: 46 41 43 45
Text: FACE
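The table lookup in this example can also be done by a short program. This is a minimal sketch, assuming the codes arrive as a space-separated string; the function names are invented for illustration.

```python
def decode_ascii_hex(hex_codes: str) -> str:
    """Turn space-separated hex codes (e.g. '46 41') into text."""
    return "".join(chr(int(code, 16)) for code in hex_codes.split())


def decode_ascii_binary(bit_groups: str) -> str:
    """Turn space-separated 8-bit groups (e.g. '01000110') into text."""
    return "".join(chr(int(group, 2)) for group in bit_groups.split())


print(decode_ascii_hex("46 41 43 45"))
print(decode_ascii_binary("01000110 01000001 01000011 01000101"))
```

Both calls decode the same four characters, since the hex and binary rows on the slide are two ways of writing the same numbers.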
Translate from ASCII (hex) to text Group exercise
Hex  Symbol
41   A
42   B
43   C
44   D
45   E
46   F
47   G
Binary: 01000010 01000001 01000100 01000111 01000101
Hex: 42 41 44 47 45
Text: ?
Extended ASCII: an 8-bit representation If regular ASCII represents 128 values in 7 bits, how many values can we represent in a byte (8 bits)?
What about other languages, like Chinese? Unicode is a text representation standard that has been under development since the late 1980s and is maintained by the Unicode Consortium. Unicode covers most of the world's modern and historic writing systems, and has room for over a million code points. There are different implementations, including UTF-8 and UTF-16. https://en.wikipedia.org/wiki/unicode
What about other languages, like Chinese? Both UTF-8 and UTF-16 are variable-length encodings: UTF-8 is consistent with the ASCII representation, using one byte for ASCII characters, but uses up to four bytes for other characters. UTF-16 uses one or two 16-bit code units per code point. https://en.wikipedia.org/wiki/unicode
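To see the variable lengths in practice, the sketch below encodes a few sample characters both ways. The characters are arbitrary choices for illustration, and `encoded_sizes` is a made-up helper name. It uses Python's built-in `str.encode` with the "utf-8" and "utf-16-be" codecs (the "be" variant avoids the byte-order mark that plain "utf-16" adds).

```python
def encoded_sizes(character: str) -> tuple:
    """Return (UTF-8 byte count, UTF-16 code-unit count) for one character."""
    utf8_bytes = character.encode("utf-8")
    utf16_bytes = character.encode("utf-16-be")
    return len(utf8_bytes), len(utf16_bytes) // 2  # one code unit is 2 bytes


for character in ["A", "é", "中", "😀"]:
    utf8_count, utf16_units = encoded_sizes(character)
    print(f"U+{ord(character):04X}: {utf8_count} UTF-8 byte(s), "
          f"{utf16_units} UTF-16 code unit(s)")

# For plain ASCII characters, UTF-8 produces exactly the ASCII byte.
print("A".encode("utf-8") == "A".encode("ascii"))
```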
Are ASCII, UTF-8 and UTF-16 forms of encryption? Clicker question A. Yes B. No https://en.wikipedia.org/wiki/unicode
What about formatting? How does Word store its data?
What about formatting? How does Word store its data? Uploading a Word document into the online hex editor suggests that the document is not in ASCII representation. In fact, it is a zipped collection of files! If you unzip a Word document, you can see these files (and even change some things in them).
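The "zipped collection of files" claim can be checked with Python's standard `zipfile` module. The sketch below is only an illustration: `list_docx_parts` is a made-up helper, and the archive it inspects is a tiny stand-in built in memory with placeholder contents, not a real Word document. With a real file you would pass its path instead.

```python
import io
import zipfile


def list_docx_parts(docx_file):
    """Return the names of the files packed inside a .docx (a zip archive)."""
    with zipfile.ZipFile(docx_file) as archive:
        return archive.namelist()


# Stand-in archive (assumption: placeholder contents, for demonstration only).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("word/document.xml", "<document>Hello</document>")
    archive.writestr("docProps/core.xml", "<creator>Someone</creator>")
demo.seek(0)

print(list_docx_parts(demo))
```

For a document saved by Word, a call such as `list_docx_parts("report.docx")` (hypothetical file name) would list the actual parts of that file.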
What about formatting? How does Word store its data? Most of the files that comprise a Word document are in XML (Extensible Markup Language) format; they describe metadata such as the font style and size, document creator, etc. The files may also contain information about tracked changes to the document, collaborators, privacy and security settings, and more.
Privacy implication! The information that's encoded in a Word document can include data that you don't necessarily want to share! There are ways to scrub metadata from Word documents (details depend on the type of computer, Mac or PC, and on the version of Word).
Keeping data confidential can be tricky in other formats as well. Consider confidential documents, like the redacted military document at the beginning of Blown to Bits, Chapter 3. http://www.corriere.it/media/documenti/classified.pdf
Learning goals Colour representation [CT Building Block] Define the RGB colour specification and explain its basis.
Red Green Blue (RGB) colours Colours on monitors, phone screens, and TVs are mixes of red, green, and blue lights Computer applications use 256 intensities (8 bits) for each of red, green, and blue
Black and white colours
Black is the absence of light (RGB bit assignment for black):
Binary: 0000 0000 0000 0000 0000 0000
Hex: 00 00 00
White is the full intensity of each colour (RGB bit assignment for white):
Binary: 1111 1111 1111 1111 1111 1111
Hex: FF FF FF
http://www.colorpicker.com/
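The link between the three 8-bit intensities and the six hex digits can be sketched in a few lines. The helper names below are invented for this example, and the "#RRGGBB" spelling is the common colour-code convention.

```python
def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Format three 0-255 intensities as a '#RRGGBB' colour code."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each intensity must fit in 8 bits (0-255)")
    return f"#{red:02X}{green:02X}{blue:02X}"


def hex_to_rgb(code: str) -> tuple:
    """Split a '#RRGGBB' colour code into (red, green, blue) intensities."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hex digits")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(rgb_to_hex(0, 0, 0))        # black: no light at all
print(rgb_to_hex(255, 255, 255))  # white: every channel at full intensity
print(hex_to_rgb("#FF0000"))      # pure red
```

Each pair of hex digits is one byte, so the first pair is the red intensity, the second is green, and the third is blue.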
RGB colours Clicker exercise Suppose red's intensity is 255 (full intensity). What happens if both the blue and green intensities increase at the same rate, starting from 0?
RGB colours Clicker exercise illustration
RGB colours Clicker exercise Which colour best describes the one represented by the hexadecimal colour code: #00B103?