FACULTY OF ENGINEERING LAB SHEET INFORMATION THEORY AND ERROR CODING ETM 2126 ETN2126 TRIMESTER 2 (2011/2012)

Similar documents
Introduction to MATLAB

Chapter 5 VARIABLE-LENGTH CODING Information Theory Results (II)

ECE Lesson Plan - Class 1 Fall, 2001

MATLAB Project: Getting Started with MATLAB

ITCT Lecture 8.2: Dictionary Codes and Lempel-Ziv Coding

Lossless Compression Algorithms

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Optimization of Bit Rate in Medical Image Compression

Fundamentals of Multimedia. Lecture 5 Lossless Data Compression Variable Length Coding

CITS2401 Computer Analysis & Visualisation

MRT based Fixed Block size Transform Coding

Intro. To Multimedia Engineering Lossless Compression

Welcome Back to Fundamentals of Multimedia (MR412) Fall, 2012 Lecture 10 (Chapter 7) ZHU Yongxin, Winson

ENSC Multimedia Communications Engineering Huffman Coding (1)

OUTLINES. Variable names in MATLAB. Matrices, Vectors and Scalar. Entering a vector Colon operator ( : ) Mathematical operations on vectors.

MATLAB Project: Getting Started with MATLAB

Computer Project: Getting Started with MATLAB

Digital Image Processing

MATLAB COURSE FALL 2004 SESSION 1 GETTING STARTED. Christian Daude 1

MATLAB - Lecture # 4

MATLAB BASICS. M Files. Objectives

Creates a 1 X 1 matrix (scalar) with a value of 1 in the column 1, row 1 position and prints the matrix aaa in the command window.

General Instructions. You can use QtSpim simulator to work on these assignments.

Multimedia Networking ECE 599

Information Theory and Communication

A Guide to Using Some Basic MATLAB Functions

Lab of COMP 406. MATLAB: Quick Start. Lab tutor : Gene Yu Zhao Mailbox: or Lab 1: 11th Sep, 2013

Starting with a great calculator... Variables. Comments. Topic 5: Introduction to Programming in Matlab CSSE, UWA

LAB 2: Linear Equations and Matrix Algebra. Preliminaries

MAT 343 Laboratory 1 Matrix and Vector Computations in MATLAB

MATLAB is working with vectors and matrices, using different operators and functions.

MATLAB Demo. Preliminaries and Getting Started with Matlab

Full file at

Data Compression Scheme of Dynamic Huffman Code for Different Languages

MATLAB GUIDE UMD PHYS401 SPRING 2012

LESSON 1. A C program is constructed as a sequence of characters. Among the characters that can be used in a program are:

Figure-2.1. Information system with encoder/decoders.

ESE 150 Lab 03: Data Compression

Introduction to MATLAB

Computer Programming : C++

Midterm Exam 2B Answer key

Introduction to MATLAB

LAB: INTRODUCTION TO FUNCTIONS IN C++

TUTORIAL MATLAB OPTIMIZATION TOOLBOX

Microprocessors & Assembly Language Lab 1 (Introduction to 8086 Programming)

Efficient Sequential Algorithms, Comp309. Motivation. Longest Common Subsequence. Part 3. String Algorithms

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF INFORMATION TECHNOLOGY ACADEMIC YEAR / ODD SEMESTER QUESTION BANK

More Bits and Bytes Huffman Coding

MATLAB Lecture 1. Introduction to MATLAB

Introduction to Matlab. By: Dr. Maher O. EL-Ghossain

MAT 343 Laboratory 1 Matrix and Vector Computations in MATLAB

Constraint-based Metabolic Reconstructions & Analysis H. Scott Hinton. Matlab Tutorial. Lesson: Matlab Tutorial

EE-575 INFORMATION THEORY - SEM 092

Getting To Know Matlab

Experiment 1: Introduction to MATLAB I. Introduction. 1.1 Objectives and Expectations: 1.2 What is MATLAB?

SIGNAL COMPRESSION Lecture Lempel-Ziv Coding

Matlab Tutorial and Exercises for COMP61021

Unix Computer To open MATLAB on a Unix computer, click on K-Menu >> Caedm Local Apps >> MATLAB.

Huffman Coding. (EE 575: Source Coding Project) Project Report. Submitted By: Raza Umar. ID: g

Matlab Tutorial for COMP24111 (includes exercise 1)

A/D Converter. Sampling. Figure 1.1: Block Diagram of a DSP System

EE 368. Weeks 5 (Notes)

Lab 0a: Introduction to MATLAB

Introduction to MATLAB

ELE 201, Spring 2014 Laboratory No. 4 Compression, Error Correction, and Watermarking


CS 206 Introduction to Computer Science II

MATLAB = MATrix LABoratory. Interactive system. Basic data element is an array that does not require dimensioning.

CSE Theory of Computing Spring 2018 Project 2-Finite Automata

An undirected graph is a tree if and only of there is a unique simple path between any 2 of its vertices.

CpSc 1111 Lab 9 2-D Arrays

JME Language Reference Manual

EE67I Multimedia Communication Systems Lecture 4

Inlichtingenblad, matlab- en simulink handleiding en practicumopgaven IWS

CSE Theory of Computing Spring 2018 Project 2-Finite Automata

Introduction to MatLab. Introduction to MatLab K. Craig 1

A Comparative Study of Entropy Encoding Techniques for Lossless Text Data Compression

Introduction to MATLAB

Introduction to MATLAB for Engineers, Third Edition

Boolean Logic & Branching Lab Conditional Tests

CS 1044 Program 6 Summer I dimension ??????

Introduction to MATLAB

SMS 3515: Scientific Computing Lecture 1: Introduction to Matlab 2014

EE 216 Experiment 1. MATLAB Structure and Use

Introduction to MATLAB

Introduction to Scientific Computing with Matlab

2.0 MATLAB Fundamentals

16 Greedy Algorithms

1 Introduction to Matlab

Finding MATLAB on CAEDM Computers

Introduction to MATLAB programming: Fundamentals

APPM 2460 Matlab Basics

FRACTAL IMAGE COMPRESSION OF GRAYSCALE AND RGB IMAGES USING DCT WITH QUADTREE DECOMPOSITION AND HUFFMAN CODING. Moheb R. Girgis and Mohammed M.

What is MATLAB and howtostart it up?

ECON 502 INTRODUCTION TO MATLAB Nov 9, 2007 TA: Murat Koyuncu

Introduction to Matlab for Econ 511b

INTRODUCTION TO MATLAB, SIMULINK, AND THE COMMUNICATION TOOLBOX

CSE COMPUTER USE: Fundamentals Test 1 Version D

CSC 310, Fall 2011 Solutions to Theory Assignment #1

Introduction to MATLAB

Transcription:

FACULTY OF ENGINEERING LAB SHEET INFORMATION THEORY AND ERROR CODING ETM 2126 ETN2126 TRIMESTER 2 (2011/2012) Experiment 1: IT1 Huffman Coding Note: Students are advised to read through this lab sheet before doing experiment. Onthe-spot evaluation may be carried out during or at the end of the experiment. Your performance, teamwork effort, and learning attitude will count towards the marks.

Multimedia University Experiment 1: Huffman Coding 1. OBJECTIVE To apply source coding techniques using Huffman Codes. To design and generate Huffman Codes with MATLAB software. 2. SOFTWARE REQUIRED MATLAB 3. INTRODUCTION TO MATLAB 3.1 MATLAB MATLAB is a powerful collection of tools for algorithm development, computation and visualization. It provides more control and flexibility compared to a traditional highlevel programming language. Unlike such languages, MATLAB is compact and easy to learn, letting you express algorithms in concise, readable code. In addition, MATLAB provides an extensive set of ready-to-use functions including mathematical and matrix operations, graphics, color and sound control, and low-level file I/O. MATLAB is readily extendable - you can use the MATLAB language to easily create functions that operate as part of the MATLAB environment. The MATLAB User's Guide describes how to work with the MATLAB language. It discusses how to enter and manipulate data and use MATLAB's extensive collection of functions. It explains how to perform command-line computations and how to create your own functions and scripts. The MATLAB Reference Guide provides reference descriptions of supplied MATLAB functions and commands. 3.2 SIMULINK SIMULINK is a dynamic system simulation environment. It allows you to represent systems as block diagrams, which you build using your mouse to connect blocks and your keyboard to edit block parameters. The SIMULINK User's Guide describes how to work with SIMULINK. It also provides reference descriptions of each block in the standard SIMULINK libraries. 3.3 Communications Toolbox The Communications Toolbox is a collection of computation functions and simulation blocks for research, development, system design, analysis, and simulation in the communications area. The toolbox is designed to be suitable for both experts and beginners in the communications field. The toolbox contains ready-to-use functions and blocks, which you can easily modify to implement your own schemes, methods, and algorithms. The Communications Toolbox is designed for use with MATLAB and SIMULINK. The Communications Toolbox includes: 1

SIMULINK blocks MATLAB functions You can use the toolbox directly from the MATLAB workspace. You can use the SIMULINK environment to construct a simulation block diagram for your communication system. For convenience, the Communications Toolbox provides online interactive examples that use any of the function found in this toolbox. 3.4 Available Functions This toolbox supports a variety of functions in the communications area. These functions include: Data source Source coding/ decoding Error-control coding Modulation/ demodulation Transmission/ reception filters Transmitting channel Multiple access Synchronization Utilities 4. MATLAB FUNDAMENTALS In this section, we will illustrate some commands used for this experiment. (For additional information on the individual MATLAB commands, we refer you to the MATLAB User's Guide or the online help facility.) 4.1 Fundamental Expressions Working in the MATLAB environment is straightforward because most commands are entered as you would write them mathematically. For example, entering the following simple expression >>a = 4/3 yields the MATLAB response a = 1.3333 In general, it is better to use appropriate and memorable names for variables. MATLAB recognizes the first 19 characters of a variable name but the first character in a variable name be a letter. MATLAB is case sensitive and all MATLAB commands are written in lowercase letters. 2

If you prefer to create a new variable but do not wish to see the MATLAB response, type a semicolon at the end of the expression. For example, >>b = 4+7; will create a new variable b whose value is 11, but MATLAB will not display the value of b. 4.2 Creating Script Files It is much more convenient to use script files than to enter commands line by line at the MATLAB prompt. A script file is an ASCII file (regular text file) that contains a series of MATLAB commands written just as you would enter them in the MATLAB environment. Statements that begin with a '%' are considered to be comments and are ignored by MATLAB. The script file is created outside of MATLAB with any available text editor or word processor. Each script file should have a name that ends in ".m". The commands in the script file are executed in the MATLAB environment simply by entering the name of the script file without the ".m". For example, suppose the text file magphase.m contains the following statements: % magphase.m: example m-file to % compute the magnitude and phase of G at = 1. =1; G =l/(j* + 2); mag =abs(g) phase =atan(imag(g)/real(g)) Then type magphase at the MATLAB prompt will yield the following MATLAB response: mag= 0.4472 phase= -0.4636 which are the magnitude and phase of the transfer function G(j ) = l/(j + 2) evaluated at = 1. By entering help magphase will give you back the text in the comment lines at the beginning of the file: magphase.m: example m-file to compute the magnitude and phase of G at =1. It should be obvious that it is very desirable to include at least some brief comments as a header to each m-file. 3

4.3 Matrices Matrices are entered into MATLAB by listing the elements of the matrix and enclosing them within a pair of brackets. Commas or blanks separate elements of a single row, and semicolons or carriage returns separate rows. For example, >>A = [1 2; 3 4] yields the MATLAB response A= 1 2 3 4 To get the dimensions of a matrix, we can use the size command, e.g., >>[K,N] = size(a) K= 2 N= 2 Special vectors can be created using the ':' operator. For example, the command >>k= 1:10 will generate a row vector with elements from 1 to 10 with increment 1. Using the simple commands described thus far, you can easily manipulate both matrices and vectors. For example, to add a row onto matrix A we type, >>A = [A; [7 8]] A= 1 2 3 4 7 8 To extract the submatrix of A that consists of the first through second columns of the second through third rows, use vectors as indices as follows: >>B = A(2:3,1:2) B= 3 4 7 8 MATLAB has commands for generating special matrices. For example, you can create a n x n identity matrix with the eye(n). The zeros, ones and rand commands work 4

similarly to eye and make matrices with elements equal to zero, elements equal to one, and random elements, respectively. These commands can also be used to create nonsquare matrices. For example, zeros(2,4) generates a 2 x 4 matrix of zeros. Finally, A = A' command will give the transpose of the matrix A. 4.4 Additional Commands A) HUFFMANDICT DICT = HUFFMANDICT(SYM, PROB) Generates a binary Huffman code dictionary using the maximum variance algorithm for the distinct symbols given by the SYM vector. The symbols can be represented as a numeric vector or single-dimensional alphanumeric cell array. The second input, PROB, represents the probability of occurrence for each of these symbols. SYM and PROB must be of same length. DICT = HUFFMANDICT(SYM, PROB, N) Generates an N-ary Huffman code dictionary using the maximum variance algorithm. N is an integer between 2 and 10 (inclusive) that must not exceed the number of source symbols whose probabilities appear in PROB. DICT = HUFFMANDICT(SYM, PROB, N, VARIANCE) Generates an N-ary Huffman code with the specified variance. The possible values for VARIANCE are 'min' and 'max'. [DICT, AVGLEN] = HUFFMANDICT(...) Returns the average codeword length of the Huffman code. B) HUFFMANENCO ENCO = HUFFMANENCO(SIG, DICT) Encodes the input signal, SIG, based on the code dictionary, DICT. The code dictionary is generated using the HUFFMANDICT function. Each of the symbols appearing in SIG must be present in the code dictionary, DICT. The SIG input can be a numeric vector or a single-dimensional cell array containing alphanumeric values. C) HUFFMANDECO HUFFMANDECO(COMP, DICT) Decodes the numeric Huffman code vector COMP using the code dictionary, DICT. The encoded signal is generated by the HUFFMANENCO function. The code dictionary can be generated using the HUFFMANDICT function. The decoded signal will be a numeric vector if the original signals are only numeric. If any signal value in DICT is alphanumeric, then the decoded signal will be represented as a single-dimensional cell 5

array. 5. THEORY 5.1 Source Coding Source encoding (or source coding) is a process by the help of which the data generated by a discrete source is efficiently represented. The device that performs the representation is called a source encoder. For a source encoder to be efficient, we require knowledge of the statistics of the source. In particular, if some source symbols are known to be more probable than the others, then we may exploit this feature in the generation of a source code by assigning short code words to frequent source symbols, and long code words to rare source symbols. We refer to such a source code as a variable-length code. An efficient source encoder should satisfy the following two functional requirements: 1. The code words produced by the encoder are in binary form. 2. The source code is uniquely decodable, so that the original source sequence can be reconstructed perfectly from the encoded binary sequence. Consider a discrete memoryless source whose output s k is converted by the source encoder into a block of 0s and 1s, denoted by b k. We assume that the source has an alphabet with K different symbols, and that the kth symbol s k occurs with probability p k, k = 0, 1,, K-1. Let the binary code word assigned to s k by the encoder have a length l k, measured in bits. We define the average code word length, L, of the source encoder as L = p k l k k = 0, 1,, K-1 L represents the average number of bits per source symbol used in the encoding process. Let L min denote the minimum possible value of L. We then define the coding efficiency of the source encoder as = L min / L With L L min, we clearly have 1. The source encoder is said to be efficient when approaches unity. 5.2 Source Coding Theorem Given a discrete memoryless source of entropy H(S), the average code-word length L for any distortionless source encoding is bounded as L H(S) This is also known as Noiseless coding theorem or Shannon s first theorem. 6

Accordingly, the entropy H(S) represents a fundamental limit on the average number of bits per source symbol necessary to represent a discrete memoryless source. In other words, L cannot be made smaller than H(S). Thus with L min = H(S), we may rewrite the efficiency of a source encoder in terms of the entropy H(S) as = H(S) / L 5.3 Huffman Coding Huffman code is a variable-length code. The basic principle of Huffman coding is to assign short codes to symbols with high p, and long codes to symbols with low p. The purpose is to construct a code such that its average code-word length L that approaches the minimum i.e. entropy of source H(S). The followings are the encoding steps: 1. The source symbols are listed in order of decreasing probability. 2. The two symbols with the lowest probability are assigned a 0 and a 1. 3. The probabilities of the last two symbols are added. 4. The list of symbols is re-sorted with the last two symbols combined. 5. Repeat steps 2, 3, 4 until there are only two symbols left. 6. Assign the final two combined symbols a 0 and a 1. 6. PROCEDURE Step 1: Start the MATLAB program. Step 2: In the Command Window, type the MATLAB code line by line, as given in Example (see Section 7). Observe the result after typing each line. Alternatively, you can type the entire MATLAB code in an m-file (this is actually the preferred method). By saving the m-file, you can run the code at a later date, or you can make minor modifications without retyping the entire code, or you can copy the code to another computer to be executed there. Step 1: Start the MATLAB program. Step 2: To view the present working directory, type: pwd This is where all your m-files will be stored. To change the present working directory, use the "cd" command. For example: cd d:\infot\exp1 Step 3: Go to File -> New -> M-file. The MATLAB Editor window will open. 7

Step 4: Type the entire code as given in Example (see Section 7). Step 5: Save the m-file by going to File -> Save As. Choose a short name for your m- file. Make sure it contains no spaces. Step 6: To run the m-file from the MATLAB Editor window, go to Tools -> Run. You can also execute the m-file from the Command Window by typing the m-file name (without the.m ) on the command line. 7. EXAMPLE STEP 1: Generating the message sequence Let us generate a message which consists of one sentence of text: 'information theory is interesting'. The message will be stored in a variable called msg. In the MATLAB command window, type msg = 'information theory is interesting' Our message contains a total of 33 characters, including spaces. STEP 2: Estimating the source Before a Huffman coder can properly encode the given message, it requires information about the source alphabet and the source statistics. The source alphabet is a list of distinct symbols in the message sequence. There are altogether 14 distinct symbols in our message: symb = {'i' 'n' 'f' 'o' 'r' 'm' 'a' 't' ' ' 'h' 'e' 'y' 's' 'g'} The source statistics, on the other hand, is a list of probabilities corresponding to each symbol in the source alphabet. Since we have 14 distinct symbols, our source statistics should have 14 probabilities: p = [5/33 4/33 1/33 3/33 3/33 1/33 1/33 4/33 1/33 3/33 1/33 2/33 1/33 3/33] Observe that we have enclosed the elements of symb with curly braces { } instead of the usual square brackets [ ]. Doing so indicates that symb is a cell array, instead of a character array. The variable symb will become an input to a later function called huffmandict, which requires that the source alphabet be of cell data type. STEP 3: Constructing the Huffman codebook Based on the supplied source alphabet and source statistics, the MATLAB function huffmandict generates the Huffman codebook: dict = huffmandict(symb, p) The entire codebook construction process (including the Huffman tree) is done in the background, and is invisible to the user. The resulting codebook is stored in the variable 8

dict. Like the variable symb previously, the variable dict is also a cell array. In order to access the elements in dict, we again use curly braces { }. To see the Huffman code for the first symbol, type: dict{1,:} To view the Huffman code for the rest of the symbols, simply increment the subscript value: dict{2,:} dict{3,:} STEP 4: Encoding the message Now that the Huffman codebook is ready, we may proceed to convert the message sequence into a binary stream: binstream = huffmanenco(msg, dict) The encoding process simply consists of replacing each symbol in the message with the corresponding binary word in the codebook. STEP 5: Decoding the binary stream The binary stream will be transmitted over a communications channel to the receiver. At the receiving end, it will be decoded in order to recover the original message sequence. Message decoding is generally performed with the help of a code tree. In MATLAB, we simply type: msgdeco = huffmandeco(binstream, dict) Huffman code is a type of prefix code, so no ambiguity will arise in the decoding process. If our transmission is error-free, the decoded message should match the original message emitted by the source. Also note that the same dictionary (or codebook) is used for both encoding and decoding. 8. PROBLEM 1. Generate a message sequence that incorporates your name. The message should be phrased as 'my name is <insert name here>' Store the message in a variable called myname. 2. Estimate the source alphabet and source statistics of your message. Store the source alphabet in the variable salpha, and the source statistics in the variable sstat. 9

3. Draw the Huffman tree corresponding to the information found in part 2 (do this by hand). Trace the branches in the Huffman tree to construct the Huffman codebook. 4. Type the MATLAB code that automatically generates the Huffman codebook. Compare this codebook with the one constructed manually in part 3. Is there any difference between the two codebooks? If so, why? 5. Encode your message sequence and convert it to a binary stream. 6. Answer the following questions: (a) How many bits were used to encode your message? (b) If 7-bit ASCII were employed instead of Huffman, what would the length of your binary stream be? (c) Estimate the percentage of compression that is achieved by Huffman coding relative to using 7-bit ASCII. (d) Calculate the efficiency of your Huffman coder. 9. REPORT Print out and submit your report 2 weeks after the laboratory session. The report should contain the following: Introduction Procedures Results and Discussion Conclusions Appendices (MATLAB code) Lab report writing is an individual work. Fabricating results and copying the report of others are strictly prohibited. Severe penalties will be imposed for any such offences. 10

FACULTY OF ENGINEERING LAB REPORT SUBMISSION ETM2126 INFORMATION THEORY AND ERROR CODING ETN2126 Information Theory and Error Coding Trimester 2, 2018/2019 TRIMESTER 2 SESSION 2011/2012 Student Name: Student ID: Lab Group No.: Degree Major:.. TE / MCE Declaration of originality: I declare that all sentences, results and data mentioned in this report are from my own work. All work derived from other authors have been listed in the references. I understand that failure to do this is considered plagiarism and will be penalized. Note that collaboration and discussions in conducting the experiments are allowed but copying and any act of cheating in the report, results and data are strictly prohibited. Student signature: Experiment title: Experiment Date: Table/PC No.: Date Submitted: IT1 Huffman Coding... Lab Instructor Name: Verified:. (Please get your lab instructor signature after they have verified your result) 11