Data encoding. Lauri Võsandi

Similar documents
Binary representation and data

Objectives. Connecting with Computer Science 2

Princeton University. Computer Science 217: Introduction to Programming Systems. Data Types in C

INF5063: Programming heterogeneous multi-core processors. September 17, 2010

CHAPTER 2 - DIGITAL DATA REPRESENTATION AND NUMBERING SYSTEMS

Multimedia Systems Image III (Image Compression, JPEG) Mahdi Amiri April 2011 Sharif University of Technology

Introduction to Computer Science (I1100) Data Storage

Robert Matthew Buckley. Nova Southeastern University. Dr. Laszlo. MCIS625 On Line. Module 2 Graphics File Format Essay

3 Data Storage 3.1. Foundations of Computer Science Cengage Learning

Lossy compression CSCI 470: Web Science Keith Vertanen Copyright 2013

Features. Sequential encoding. Progressive encoding. Hierarchical encoding. Lossless encoding using a different strategy

Lossy compression. CSCI 470: Web Science Keith Vertanen

MULTIMEDIA AND CODING

IMAGE COMPRESSION USING FOURIER TRANSFORMS

Data Representation 1

Princeton University Computer Science 217: Introduction to Programming Systems The C Programming Language Part 1

NUMBERS AND DATA REPRESENTATION. Introduction to Computer Engineering 2015 Spring by Euiseong Seo

Digital Image Representation Image Compression

ECE 417 Guest Lecture Video Compression in MPEG-1/2/4. Min-Hsuan Tsai Apr 02, 2013

Index. 1. Motivation 2. Background 3. JPEG Compression The Discrete Cosine Transformation Quantization Coding 4. MPEG 5.

General Computing Concepts. Coding and Representation. General Computing Concepts. Computing Concepts: Review

2/12/17. Goals of this Lecture. Historical context Princeton University Computer Science 217: Introduction to Programming Systems

Steganography: Hiding Data In Plain Sight. Ryan Gibson

Video Compression An Introduction

Chapter 1. Data Storage Pearson Addison-Wesley. All rights reserved

DigiPoints Volume 1. Student Workbook. Module 8 Digital Compression

Data Representation and Networking

Introduction ti to JPEG

CSE COMPUTER USE: Fundamentals Test 1 Version D

Compression Part 2 Lossy Image Compression (JPEG) Norm Zeck

Digital Fundamentals

Jianhui Zhang, Ph.D., Associate Prof. College of Computer Science and Technology, Hangzhou Dianzi Univ.

Elementary Computing CSC 100. M. Cheng, Computer Science

Bits and Bit Patterns

Lecture Coding Theory. Source Coding. Image and Video Compression. Images: Wikipedia

CS 335 Graphics and Multimedia. Image Compression

CMPT 365 Multimedia Systems. Media Compression - Image

CS 261 Fall Binary Information (convert to hex) Mike Lam, Professor

ETGG1801 Game Programming Foundations I Andrew Holbrook Fall Lecture 0 - Introduction to Computers 1

Today. o main function. o cout object. o Allocate space for data to be used in the program. o The data can be changed

[301] Bits and Memory. Tyler Caraza-Harter

What is multimedia? Multimedia. Continuous media. Most common media types. Continuous media processing. Interactivity. What is multimedia?

Digital Systems COE 202. Digital Logic Design. Dr. Muhamed Mudawar King Fahd University of Petroleum and Minerals

Bits, bytes, binary numbers, and the representation of information

Number Systems, Scalar Types, and Input and Output

Chap 1. Digital Computers and Information

THE FUNDAMENTAL DATA TYPES

Logic and Computer Design Fundamentals. Chapter 1 Digital Computers and Information

Multimedia. What is multimedia? Media types. Interchange formats. + Text +Graphics +Audio +Image +Video. Petri Vuorimaa 1

Number Systems. TA: Mamun. References: Lecture notes of Introduction to Information Technologies (ITEC 1011) by Dr Scott MacKenzie

Homework 1 graded and returned in class today. Solutions posted online. Request regrades by next class period. Question 10 treated as extra credit

2a. Codes and number systems (continued) How to get the binary representation of an integer: special case of application of the inverse Horner scheme

Moodle WILLINGDON COLLEGE SANGLI. ELECTRONICS (B. Sc.-I) Introduction to Number System

Software and Hardware

IMAGE COMPRESSION. Image Compression. Why? Reducing transportation times Reducing file size. A two way event - compression and decompression

Number Systems. Binary Numbers. Appendix. Decimal notation represents numbers as powers of 10, for example

15 Data Compression 2014/9/21. Objectives After studying this chapter, the student should be able to: 15-1 LOSSLESS COMPRESSION

Lecture 1: What is a computer?

Audio Compression. Audio Compression. Absolute Threshold. CD quality audio:

Image, video and audio coding concepts. Roadmap. Rationale. Stefan Alfredsson. (based on material by Johan Garcia)

Digital Logic Lecture 2 Number Systems

CS107, Lecture 3 Bits and Bytes; Bitwise Operators

CS101 Lecture 12: Image Compression. What You ll Learn Today

Audio-coding standards

Optical Storage Technology. MPEG Data Compression

Lecture #3: Digital Music and Sound

MACHINE LEVEL REPRESENTATION OF DATA

MC1601 Computer Organization

CS107, Lecture 3 Bits and Bytes; Bitwise Operators

Chapter 1. Digital Data Representation and Communication. Part 2

Data Storage. Slides derived from those available on the web site of the book: Computer Science: An Overview, 11 th Edition, by J.

"Digital Media Primer" Yue- Ling Wong, Copyright (c)2011 by Pearson EducaDon, Inc. All rights reserved.

Compression of Stereo Images using a Huffman-Zip Scheme

Princeton University COS 217: Introduction to Programming Systems C Primitive Data Types

Lecture 1. Types, Expressions, & Variables

Multimedia Technology

Introduction to Numbering Systems

Jin-Soo Kim Systems Software & Architecture Lab. Seoul National University. Integers. Spring 2019

M1 Computers and Data

Unit 3. Analog vs. Digital. Analog vs. Digital ANALOG VS. DIGITAL. Binary Representation

2nd Paragraph should make a point (could be an advantage or disadvantage) and explain the point fully giving an example where necessary.

GCSE Computer Science Component 02

Fundamentals of Video Compression. Video Compression

Final Labs and Tutors

3.1. Unit 3. Binary Representation

7: Image Compression

Interframe coding A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames pri

Audio-coding standards

Variables and Constants

Learning Programme Fundamentals of data representation AS Level

Number Systems Prof. Indranil Sen Gupta Dept. of Computer Science & Engg. Indian Institute of Technology Kharagpur Number Representation

2014 Summer School on MPEG/VCEG Video. Video Coding Concept

Bits. Binary Digits. 0 or 1

Digital Data. 10/11/2011 Prepared by: Khuzaima Jallad

Lecture 6 Introduction to JPEG compression

1.1 Bits and Bit Patterns. Boolean Operations. Figure 2.1 CPU and main memory connected via a bus. CS11102 Introduction to Computer Science

EE 109 Unit 3. Analog vs. Digital. Analog vs. Digital. Binary Representation Systems ANALOG VS. DIGITAL

An introduction to JPEG compression using MATLAB

End-to-End Data. Presentation Formatting. Difficulties. Outline Formatting Compression

Content. Information Coding. Where are we? Where are we? Bit vectors (aka Words )

Transcription:

Data encoding Lauri Võsandi

Binary data Binary can represent Letters of alphabet, plain-text files Integers, floating-point numbers (of finite precision) Pixels, images, video Audio samples Could be stored in processor registers, RAM, harddisk, transmitted over network etc Quantization, quantization error

Binary encoding

Binary to decimal Bit indexing starts from 0 Least significant bit is usually on the right Each bit has weight of 2n Multiply each bit with it s weight Add the multiplications

Decimal to binary If weight can be subtracted, bit corresponds to one If the number is smaller than next weight, bit corresponds to zero

Bit order also known as most significant bit (MSB) least significant bit (LSB)

Hexadecimal representation Each hexadecimal digit corresponds to nibble (4-bits) Hexadecimal retains alignment to binary data opposed to decimal binary 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111 hexadecimal 0 1 2 3 4 5 6 7 decimal 0 1 2 3 4 5 6 7 8 9 A B C D E F 8 9 10 11 12 13 14 15

Endianess Motorola 68k (Macintosh) Intel x86 (PC-s)

Integer representation Number 42 (decimal) could be represented as 0b101010 (binary) 0x2a (hexadecimal) 052 (octal) 0o52 (also octal) Check out http://baseconvert.com

Fixed-point numbers

IEEE754 floating point numbers

Text encoding ASCII, Unicode

ISO8859-13 (Baltic) Portion of extended ASCII replaced with letters from Baltic languages

Problems Impossible to mix documents of different character sets 8-bits not enough to describe alphabets of different languages

Unicode More than million characters are described Unicode code point refers to a index of symbol: 0x00000 to 0x10FFFF How it gets mapped to bits is different story: UTF-8 - Variable length coding (1 to 4 bytes) UTF-16 - Also variable-length coding (2 or 4 bytes) UTF-32 - Only fixed-width coding (4 bytes)

Unicode ASCII was used for source code, text files etc. Has been replaced by UTF-8 In-memory data structures different

Python 2.x str is ASCII >>> type("γεια σας") <type 'str'> >>> len("γεια σας") 15 >>> type(u"γεια σας") <type 'unicode'> >>> len(u"γεια σας") 8

Python 3.x str is Unicode >>> type("γεια σας") <class 'str'> >>> len("γεια σας") 8 >>> type(b"γεια σας") File "<stdin>", line 1 SyntaxError: bytes can only contain ASCII literal characters. >>> "γεια σας".encode("utf-8") b'\xce\xb3\xce\xb5\xce\xb9\xce\xb1 \xcf\x83\xce\xb1\xcf\x82' >>> type(b"hello world") <class 'bytes'>

Data types in Java

Data types in C (x86) sizeof(bool) == 1 # 8-bit boolean sizeof(char) == 1 # 8-bit ASCII char or byte sizeof(short) == 2 # 16-bit integer sizeof(int) == 4 # 32-bit integer sizeof(long) == 4 # 32-bit integer sizeof(long long) == 8 # 64-bit integer sizeof(float) == 4 # 32-bit floating point number sizeof(double) == 8 # 64-bit floating point number sizeof(void*) == 4 # 32-bit pointer

Data types in C (armhf) sizeof(bool) == 1 sizeof(char) == 1 sizeof(short) == 2 sizeof(int) == 4 sizeof(long) == 4 sizeof(long long) == 8 sizeof(float) == 4 sizeof(double) == 8 sizeof(void*) == 4

Data types in C (x86-64) sizeof(bool) == 1 # 8-bit boolean sizeof(char) == 1 # 8-bit ASCII char or byte sizeof(short) == 2 # 16-bit integer sizeof(int) == 4 # 32-bit integer sizeof(long) == 8 # 64-bit integer (!) sizeof(long long) == 8 # 64-bit integer sizeof(float) == 4 # 32-bit floating point number sizeof(double) == 8 # 64-bit floating point number sizeof(void*) == 8 # 64-bit pointer

Audio encoding Resolution, sampling rate

Pulse-coded modulation (PCM) Common bit depths are 8, 16 and 24 bits Example on the right uses 4 bits per channel

Audio resolution How accurately audio signal can be represented Speaker cone displacement measuring precision Audio CD: 16-bits/ch

Audio sampling rate How accurately audio signal can be represented Frequently of speaker cone displacement measurement Audio CD: 44.1kHz

Digital-to-analog conversion Each output bit is connected to bit weight resistor Resistances are aggregated Op-amp amplifies the final voltage

Image encoding Pixels, color depth, resolution

Color models

Images Picture element usually known as pixel Red, green, blue channels represent intensity Alpha channel represents transparency Different modes: RGB, BGR, ARGB, RGBA, ABGR,...

Resolution How many pixels Horizontally Vertically DPI (dots per inch) The more pixels, the better it looks

Indexed colors Video card contains the look up table Each pixel is the index in the lookup table RGB values computed on the fly at video output

True color Each pixel contains actual RGB data RGB 8:8:8 corresponds to 224 = 16777216 colors RGB 5:6:5 corresponds to 216 = 65536 colors

256 colors

16 bits per pixel (RGB 5:6:5)

24 bits per pixel (RGB 8:8:8)

Video DAC The simplest/ cheapest use resistor ladder similar to audio DAC

Compression Fourier transform, RLE, Huffman encoding

Audio compression Frames (group of audio samples) are converted from time domain to frequency domain Frequencies with low energy are discarded Peaking frequencies are rounded Adjacent peaks are merged Phase offset information is lost

Fourier transform Frequencies that combined result in the original signal Frequency domain representation (frequencies and their amplitudes) Time domain representation (samples)

Image compression Photographs High correlation between RGB channels No independent pixels A lot of gradients Computer graphics eg. screenshots Adjacent pixels of same color Some pixels occur more frequently than others

Other colorspaces YUV or YCbCr used in image/video Luma and chroma information instead of RGB Less resolution and bit depth for chroma No perceived image quality degradation

RGB vs YUV RGB (8:8:8) representation would result in 12 bytes per 4 pixels The representation on right would result in 6 bytes per 4 pixels

Discrete cosine transform Used in JPEG, MPEG A simplified case of Fourier transform (8x8 pixels)

Discrete cosine transform

Running length encoding Substitute group of identical numbers: How many? What number?

Photo compression with JPEG Colorspace transformation from RGB to YCbCr Downsampling by discarding chroma bits Block splitting usually to 8x8 pixel blocks DCT to convert pixels to waves Quantization, round off insignificant coefficients Running length encoding Huffman encoding, use less bits to represent frequently occurring bit sequences

Potential exam questions What is 0xFF, 0xFFFF, 0xFFFFFF in decimal? What is 0755 in binary? How many bits are required do describe integer range -63 to 64? What integer range / how may colors can be described using 24 bits? What color is 0x88FF8800 (ARGB)?

Potential exam questions Describe simplest 8-bit stereo DAC Describe RGB (4:4:4) DAC What is the minimum audio CD capacity assuming stereo sound at 44.1kHz sampling rate and 16-bits per channel for 80 minute album? What is the bitrate for 7.1 sound system sampled at 96kHz and 24-bits per channel?

Potential exam questions What is the significance of Fourier transform? What is time domain representation? What is frequency domain representation? What is running length encoding? What is Huffman encoding?

Where are we know We know how to install and run OS We know how to use command-line We know how to invoke a program We know how to represent in binary Plain text, integers, floating point numbers Audio and images How to store them efficiently

What next? How is an actual CPU processing the data?