Data Representation

Types of data: numbers, text, audio, images and graphics, video

Analog vs Digital data
- How is data represented?
- What is a signal? Transmission of data
- Analog: continuous signal
- Digital: discrete signal

Analog vs Digital data
[Figure: an analog signal and a digital signal, with a threshold]
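The threshold idea can be sketched in a few lines. This is only an illustration: the sample values, the threshold of 0.5, and the function name `digitize` are assumptions and do not come from the slides.

```python
def digitize(samples, threshold=0.5):
    """Map each analog sample to a discrete 1 or 0 by comparing it to a threshold."""
    return [1 if sample >= threshold else 0 for sample in samples]


# Hypothetical analog readings (continuous values between 0 and 1).
analog_samples = [0.1, 0.4, 0.8, 0.9, 0.6, 0.3, 0.05]
digital_samples = digitize(analog_samples)
print(digital_samples)
```

Every reading at or above the threshold becomes 1 and every reading below it becomes 0, so small variations in the analog value do not change the digital result.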

Representing Text
- Document: paragraphs, sentences, words
- All made up of characters
- English language has 26 letters (52 if you consider upper and lower case)
- Punctuation characters
- Space
- Character sets: ASCII and Unicode

ASCII Character Set

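ASCII Character Set
- Standard ASCII defines 128 characters (7 bits); 8-bit extended variants allow 256 characters
- Each character is stored in 8 bits = 1 byte
- Example: character a --> Dec: 97 --> Binary: 01100001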
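The character --> decimal --> binary chain can be reproduced with Python built-ins. This is a small sketch; the helper names `char_to_binary` and `binary_to_char` are made up for illustration.

```python
def char_to_binary(ch):
    """Return the decimal code and the 8-bit binary string for one character."""
    code = ord(ch)                    # character -> decimal code
    return code, format(code, "08b")  # decimal -> binary, padded to 8 bits


def binary_to_char(bits):
    """Convert a binary string back into its character."""
    return chr(int(bits, 2))


code, bits = char_to_binary("a")
print("a", "-->", code, "-->", bits)
```

Going the other way, `binary_to_char("01100001")` gives back the character a.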

Unicode Character Set
- Originally 16 bits: 2^16 = 65,536 characters (the current standard has room for over a million code points)
- ASCII is a subset of Unicode
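A short check of both points, as a sketch. The UTF-8 encoding and the euro sign are examples chosen here and are not mentioned in the slides.

```python
# Number of distinct values that fit in 16 bits.
sixteen_bit_values = 2 ** 16

# ASCII characters keep the same code in Unicode, and their ASCII and
# UTF-8 byte encodings are identical.
ascii_code = ord("a")
same_bytes = "a".encode("ascii") == "a".encode("utf-8")

# A character outside ASCII has a larger code point and needs more than
# one byte in UTF-8.
euro_code = ord("\u20ac")  # the euro sign
euro_bytes = "\u20ac".encode("utf-8")

print(sixteen_bit_values, ascii_code, same_bytes, euro_code, len(euro_bytes))
```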

Unicode Character Set Why Unicode?

Some terminology 1 gigabyte of storage 20 years ago!

Some terminology

Some terminology
Up to this point we have been talking about data in either bits or bytes (1 byte = 8 bits). While this is the correct way to talk about data, it is sometimes inefficient. Therefore, we use prefixes to give an order of magnitude, much the same way we do with the metric system.

Some terminology
- Kilobyte (KB) = 10^3 = 1,000 bytes
- Megabyte (MB) = 10^6 = 1 million bytes
- Gigabyte (GB) = 10^9 = 1 billion bytes
- Terabyte (TB) = 10^12 = 1 trillion bytes
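The prefixes above translate directly into a lookup table. The following sketch uses the decimal definitions from the slide; the function names `to_bytes` and `human_readable` are hypothetical.

```python
# Decimal prefixes as listed on the slide.
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}


def to_bytes(amount, unit):
    """Convert an amount in KB, MB, GB or TB to a number of bytes."""
    return amount * UNITS[unit]


def human_readable(num_bytes):
    """Express a byte count with the largest prefix that fits."""
    for unit in ("TB", "GB", "MB", "KB"):
        if num_bytes >= UNITS[unit]:
            return f"{num_bytes / UNITS[unit]:g} {unit}"
    return f"{num_bytes} bytes"


print(to_bytes(2, "MB"))
print(human_readable(1_500_000))
```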

Data Compression
- Why compress data? Storage, transmission within PC/over network
- What is data compression? Reducing the physical size of information blocks

Data Compression
- Compression ratio: tells us how much compression occurs; a number between 0 and 1
- ratio = compressed size / uncompressed size
- compressed size = ratio * uncompressed size
- Lossless versus lossy compression: lossy is acceptable for images, sound files and videos; a database of names and numbers needs lossless compression
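The ratio formula can be written as a one-line function. In the sketch below, the function name is an assumption, and `zlib` from the Python standard library stands in as an example of a lossless compressor; the slides do not mention it.

```python
import zlib


def compression_ratio(compressed_size, uncompressed_size):
    """ratio = compressed size / uncompressed size (smaller means more compression)."""
    if uncompressed_size <= 0:
        raise ValueError("uncompressed size must be positive")
    return compressed_size / uncompressed_size


# Lossless round trip: decompressing gives back exactly the original bytes.
data = b"ABCD" * 250
packed = zlib.compress(data)
restored = zlib.decompress(packed)

print(compression_ratio(len(packed), len(data)))
print(restored == data)
```

A lossy method would not pass the final equality check, which is why it is unsuitable for data such as names and numbers.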

Text Compression
Examine three types of text compression:
- Keyword encoding
- Run-length encoding
- Huffman encoding

Keyword Encoding
Frequently used words are replaced by a single character --> reversible

Word     Symbol
as       ^
the      ~
and      +
that     $
must     &
well     %
these    #

Original text:
The human body is composed of many independent systems, such as the circulatory system, the respiratory system, and the reproductive system. Not only must all systems work independently, but they must interact and cooperate as well. Overall health is a function of the well being of separate systems, as well as how these separate systems work in concert.

Keyword Encoding
Encoded text:
The human body is composed of many independent systems, such ^ ~ circulatory system, ~ respiratory system, + ~ reproductive system. Not only & all systems work independently, but they & interact + cooperate ^ %. Overall health is a function of ~ % being of separate systems, ^ % ^ how # separate systems work in concert.

Keyword Encoding
- The encoded text is reduced from 352 to 317 characters
- Compression ratio: 317/352 = 0.9
- Is this efficient?
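Keyword encoding and its reversal can be sketched as follows. The word-to-symbol table and the passage are taken from the slides; the function names, the whole-word matching rule, and the case-sensitive treatment of "The" are assumptions made for this illustration.

```python
import re

# Word -> symbol table from the slide.
KEYWORDS = {"as": "^", "the": "~", "and": "+", "that": "$",
            "must": "&", "well": "%", "these": "#"}

PASSAGE = (
    "The human body is composed of many independent systems, such as the "
    "circulatory system, the respiratory system, and the reproductive system. "
    "Not only must all systems work independently, but they must interact and "
    "cooperate as well. Overall health is a function of the well being of "
    "separate systems, as well as how these separate systems work in concert."
)


def keyword_encode(text, table=KEYWORDS):
    """Replace each whole-word keyword with its symbol (case-sensitive)."""
    if any(symbol in text for symbol in table.values()):
        raise ValueError("text already contains an encoding symbol")
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, table)) + r")\b")
    return pattern.sub(lambda match: table[match.group(1)], text)


def keyword_decode(encoded, table=KEYWORDS):
    """Reverse the encoding by expanding each symbol back into its word."""
    for word, symbol in table.items():
        encoded = encoded.replace(symbol, word)
    return encoded


encoded = keyword_encode(PASSAGE)
saved = len(PASSAGE) - len(encoded)
print(encoded)
print("characters saved:", saved)
print("ratio:", round(len(encoded) / len(PASSAGE), 2))
```

The encoder saves 35 characters on this passage, which matches the difference between 352 and 317 on the slide; the absolute counts printed by the sketch may differ slightly depending on how the passage was transcribed. The guard at the top of `keyword_encode` reflects the requirement that the symbols must not already appear in the text.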

Keyword Encoding
Drawbacks:
- Symbols used for encoding must not appear in the text
- "The" and "the" need to be represented by different symbols
- Would not gain anything by encoding "a" and "I"
- Most frequently used words are often short

Run-Length Encoding
- Also known as recurrence coding
- Encodes a single character that is repeated over and over again
- For example: AAAAAAA is replaced with *A7 (a flag character *, the repeated character, and the number of repetitions)
- Drawbacks?
- Uses: DNA sequences, simple images
- Lossy or lossless compression?
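A minimal sketch of this scheme, using the flag-character-count layout from the example. The minimum run length of 4, the single-digit count (runs longer than 9 are split), and the function names are assumptions made here, not part of the slides.

```python
from itertools import groupby

FLAG = "*"  # marks the start of an encoded run; must not occur in the input


def rle_encode(text, min_run=4, max_run=9):
    """Replace runs of a repeated character with FLAG + character + count."""
    if FLAG in text:
        raise ValueError("input must not contain the flag character")
    pieces = []
    for char, group in groupby(text):
        remaining = len(list(group))
        while remaining > 0:
            chunk = min(remaining, max_run)  # the count is a single digit
            if chunk >= min_run:
                pieces.append(f"{FLAG}{char}{chunk}")
            else:
                pieces.append(char * chunk)  # short runs are left as they are
            remaining -= chunk
    return "".join(pieces)


def rle_decode(encoded):
    """Expand every FLAG + character + count triple back into a run."""
    pieces = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            pieces.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            pieces.append(encoded[i])
            i += 1
    return "".join(pieces)


print(rle_encode("AAAAAAA"))
print(rle_decode(rle_encode("AAAAAAABBBCDDDDD")))
```

Runs shorter than four characters are left alone because the three-character code would not save any space. Decoding the output and comparing it with the input is a quick way to explore the lossy-or-lossless question.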

Huffman Encoding
- Variable bit lengths to represent characters
- ASCII: a --> Binary 01100001 (8 bits)
- Why would character X take up as many bits as a? Represent it using 5 bits instead
- Saving space: frequently appearing characters are represented by shorter bit lengths

Huffman Encoding

Huffman Code    Character
00              A
01              E
100             L
110             O
111             R
1010            B
1011            D

Example: DOORBELL
D = 1011, O = 110, O = 110, R = 111, B = 1010, E = 01, L = 100, L = 100
Encoded: 1011 110 110 111 1010 01 100 100

- If we used a fixed-size bit string (8 bits per character): 64 bits
- With Huffman encoding: 25 bits
- Compression ratio: 25/64 = 0.39
- What about the decoding process?
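Encoding and decoding with the code table above can be sketched as follows. The table is copied from the slide; the function names are hypothetical, and the sketch uses the given codes directly instead of building a Huffman tree from character frequencies.

```python
# Code table from the slide (character -> bit string).
HUFFMAN_CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
                 "R": "111", "B": "1010", "D": "1011"}


def huffman_encode(text, codes=HUFFMAN_CODES):
    """Concatenate the variable-length code of every character."""
    return "".join(codes[char] for char in text)


def huffman_decode(bits, codes=HUFFMAN_CODES):
    """Read bits left to right, emitting a character whenever a full code is seen."""
    lookup = {code: char for char, code in codes.items()}
    decoded = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:  # no code is a prefix of another, so this is safe
            decoded.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(decoded)


word = "DOORBELL"
encoded_bits = huffman_encode(word)
fixed_size_bits = 8 * len(word)  # 8 bits per character

print(encoded_bits, len(encoded_bits))
print(len(encoded_bits) / fixed_size_bits)
print(huffman_decode(encoded_bits))
```

Decoding needs no separators between codes because no code in the table is the beginning of another code, so the first match found while reading left to right is always the right one.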