Information Science 2


Information Science 2
Path Lengths and Huffman's Algorithm
Week 06
College of Information Science and Engineering, Ritsumeikan University

Agenda
• Review of Weeks 03-05
• Tree traversals and notations for arithmetic expressions
• Review of some of the most basic algorithms
• Binary trees and path lengths
• Huffman coding
• Quiz 2

Recall concepts from Weeks 03-05
• Arrays and linked lists
• Graphs and trees
• Binary and linear search
• Search on a binary tree
• Sorting algorithms
  - Bubble sort, selection sort, and insertion sort

Class objectives
• Discuss some of the most basic algorithms, and learn how binary trees are used to store data and expressions
• After this lecture and study, you must be able to:
  - Show infix, prefix, and postfix notations as code and as binary tree traversals
  - Understand the basic algorithm types
  - Understand and apply Huffman's algorithm to encode data

Tree traversals
• A tree traversal is the process of visiting each node in a tree data structure exactly once, in a systematic way
• We will explore three basic traversals:
  - Inorder
  - Preorder
  - Postorder
• E.g., compilers and interpreters use these traversals in algorithms that convert computer programs into executable code

Traversal applications
• Inorder corresponds to normal infix notation for arithmetic expressions, as used in many programming languages, e.g., 4+3 or 4 ADD 3
• Preorder corresponds to prefix notation for arithmetic expressions, as used in assembly and in languages where operators are functions, e.g., + 4 3 or ADD(4, 3)
• Postorder corresponds to postfix notation, where evaluation order is left to right, as used by interpreters and some types of calculator, e.g., 4 3 + or 4 3 ADD

Inorder traversal
• Problem: Print a tree (where internal nodes represent the operators and external nodes the operands) in normal infix notation
• To solve it, apply the inorder rules:
  - Traverse the left subtree
  - Process the root
  - Traverse the right subtree
• Pseudocode:
    inorder(node) =
      if node ≠ null then
        inorder(node.left)
        print node.value
        inorder(node.right)
• Result: (((a-b)*c)+(d+(e/(f+g))))
[Figure: expression tree with root +; its left subtree is * over (- over a and b) and c; its right subtree is + over d and (/ over e and (+ over f and g))]
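The inorder rules can be sketched in Python as follows. This is only an illustration: the `Node` class and the choice to wrap every operator in parentheses are assumptions made here so that the output is a fully parenthesized infix string; the slide's pseudocode prints the values only.

```python
class Node:
    """Binary tree node; external nodes (operands) have no children."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):
    """Return the fully parenthesized infix string of an expression tree."""
    if node is None:
        return ""
    if node.left is None and node.right is None:
        return node.value                      # operand: no parentheses
    # left subtree, then the root (operator), then the right subtree
    return "(" + inorder(node.left) + node.value + inorder(node.right) + ")"

# The expression tree from the slide
tree = Node("+",
            Node("*", Node("-", Node("a"), Node("b")), Node("c")),
            Node("+", Node("d"),
                 Node("/", Node("e"), Node("+", Node("f"), Node("g")))))

print(inorder(tree))  # (((a-b)*c)+(d+(e/(f+g))))
```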

Preorder traversal
• Problem: Print an expression tree in prefix notation, treating operators as if they were function calls
• To solve it, apply the preorder rules:
  - Process the root
  - Traverse the left subtree
  - Traverse the right subtree
• Pseudocode:
    preorder(node) =
      if node ≠ null then
        print node.value
        preorder(node.left)
        preorder(node.right)
• Result: +(*(-(a, b), c), +(d, /(e, +(f, g))))
[Figure: the same expression tree as on the inorder slide, with root +]
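A minimal Python sketch of the preorder rules, printing operators as function calls. The nested-tuple representation `(operator, left, right)` with plain strings as operands is an assumption chosen to keep the example short; it is not taken from the slides.

```python
# Expression tree as nested tuples: (operator, left, right); operands are strings
TREE = ("+",
        ("*", ("-", "a", "b"), "c"),
        ("+", "d", ("/", "e", ("+", "f", "g"))))

def preorder(node):
    """Return the prefix (function-call) form of an expression tree."""
    if isinstance(node, str):
        return node                            # operand
    op, left, right = node
    # root first, then the left subtree, then the right subtree
    return f"{op}({preorder(left)}, {preorder(right)})"

print(preorder(TREE))  # +(*(-(a, b), c), +(d, /(e, +(f, g))))
```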

Postorder traversal
• Problem: Print an expression tree in postfix notation, where operands and operators appear in the exact order in which they are evaluated
• To solve it, apply the postorder rules:
  - Traverse the left subtree
  - Traverse the right subtree
  - Process the root
• Pseudocode:
    postorder(node) =
      if node ≠ null then
        postorder(node.left)
        postorder(node.right)
        print node.value
• Result: a b - c * d e f g + / + +
[Figure: the same expression tree as on the inorder slide, with root +]
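The following Python sketch produces the postfix token sequence and then evaluates it left to right with a stack, which is why postfix suits interpreters and calculators. The tuple representation of the tree, the helper names, and the sample values for a to g are all assumptions made for illustration.

```python
import operator

# Expression tree as nested tuples: (operator, left, right); operands are strings
TREE = ("+",
        ("*", ("-", "a", "b"), "c"),
        ("+", "d", ("/", "e", ("+", "f", "g"))))

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def postorder(node):
    """Return the tokens of an expression tree in postfix order."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return postorder(left) + postorder(right) + [op]

def evaluate(tokens, values):
    """Evaluate postfix tokens left to right using a stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()                # operands come off in reverse
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(values[tok])
    return stack.pop()

tokens = postorder(TREE)
print(" ".join(tokens))                        # a b - c * d e f g + / + +

# Hypothetical operand values, chosen only to demonstrate evaluation
values = {"a": 9, "b": 1, "c": 2, "d": 3, "e": 8, "f": 1, "g": 3}
print(evaluate(tokens, values))                # 21.0
```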

Basic algorithms
• Previously, we have studied a few basic types of algorithms for data conversion, searching, and sorting
• Other basic types of algorithms include:
  - Error checking
  - Error correction
  - Compression
  - Encryption
  - Data encoding

Error checking
• All data communication, storage, and manipulation carry the possibility of errors
• An error in binary data usually means that some bits have been altered (changed)
• The simplest error checking algorithms, parity algorithms, add up the number of ones (or zeros) in, for example, a byte or word
• The parity data is sent or stored along with the data so that most errors can be detected before the data is used
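A small Python sketch of an even-parity check. The function names are hypothetical, and even parity is just one possible convention. The example also shows why only "most" errors are caught: flipping two bits leaves the parity unchanged.

```python
def parity_bit(word):
    """Even-parity bit: 1 if the word contains an odd number of 1 bits."""
    return bin(word).count("1") % 2

def looks_valid(word, parity):
    """True if the received word is consistent with the received parity bit."""
    return parity_bit(word) == parity

data = 0b1110001                   # 7-bit ASCII code of 'q' (four 1 bits)
p = parity_bit(data)               # 0, stored or sent along with the data

one_flip = data ^ 0b0000100        # a single altered bit
two_flips = data ^ 0b0000110       # two altered bits

print(looks_valid(data, p))        # True
print(looks_valid(one_flip, p))    # False: the error is detected
print(looks_valid(two_flips, p))   # True: a double error goes unnoticed
```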

Error correction
• Algorithms can also be used to correct errors in data, sometimes without having to get the data again
• The simplest way is to send three copies of the same data and use the two that are the same
• Copying the data is called mirroring, but it may waste storage and communication capacity
• Various error correction algorithms are available to encode data in a format that can be checked and corrected efficiently
• Because errors are not uncommon, modern computing and communication would be almost impossible without error checking and correction
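The "three copies" idea can be sketched as a bitwise majority vote. This is an illustrative assumption about how the two agreeing copies might be selected (here bit by bit); the name `majority` is hypothetical.

```python
def majority(a, b, c):
    """Bitwise majority vote: each output bit is the value at least two copies share."""
    return (a & b) | (a & c) | (b & c)

sent = 0b1110001
# Three received copies; the second one has a single corrupted bit
received = (sent, sent ^ 0b0010000, sent)

recovered = majority(*received)
print(format(recovered, "07b"))    # 1110001
print(recovered == sent)           # True
```

The cost is visible in the example: three words are transmitted to deliver one, which is the waste of capacity the slide mentions.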

Data encryption
• Encryption encodes data in a format that is intentionally difficult for others to decode
• Encryption is generally a way of keeping data, and access to it, secret and secure
• After data is encrypted, it may be sent or stored, and then decrypted for use
• Access data, such as passwords, may be encrypted one way, so that it can easily be confirmed but not easily decrypted
• Encryption and decryption have become increasingly important for almost all communication, storage, and access
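As a hedged illustration of the one-way idea for passwords, the sketch below uses Python's standard `hashlib.pbkdf2_hmac`. The slides do not name any particular algorithm; PBKDF2 with SHA-256, the iteration count, and the helper names are assumptions chosen for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000               # assumed work factor, for illustration only

def hash_password(password, salt):
    """One-way transform: easy to compute, impractical to reverse."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def verify(password, salt, stored):
    """Confirm a password by hashing it again and comparing the results."""
    return hmac.compare_digest(hash_password(password, salt), stored)

salt = os.urandom(16)              # random salt stored next to the hash
stored = hash_password("correct horse", salt)

print(verify("correct horse", salt, stored))   # True
print(verify("wrong guess", salt, stored))     # False
```

Only the salt and the hash are stored; the password itself is never decrypted, only re-hashed and compared.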

Data encoding
• In addition to the problems just considered (i.e., error detection and correction, compression, and encryption), we have previously learned some other common tasks related to data encoding: number representation, ASCII, RGB codes, etc.
• In many situations, data encoding is done using simple tables or arithmetic
• There are also many algorithms for encoding data, for example with binary trees, that can be better in some way (for example faster, or requiring less memory)

Data compression
• Compression means encoding data in a more compact (economical) format
• Compression allows more data to be stored, e.g., a high-definition movie, or thousands of photographs or songs, on a single disk
• Compression allows faster communication
• After data is compressed and stored, it must be expanded (or decompressed) before it can be used again
• Like all encoding, compression and expansion require both sides (transmitter and receiver) to have related algorithms
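A compress-then-expand round trip can be tried with Python's standard `zlib` module. The repeated sample text is an assumption made so that the data is long and redundant enough to shrink noticeably; very short inputs may not get smaller.

```python
import zlib

# Redundant sample data (the phrase is repeated purely for illustration)
text = b"this is an example of a huffman tree " * 20

packed = zlib.compress(text)       # transmitter side: compress
restored = zlib.decompress(packed) # receiver side: expand with the matching algorithm

print(len(text), len(packed))      # original size vs. compressed size
print(restored == text)            # True: lossless round trip
```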

Binary trees: Complete and extended
• Recall the basic binary tree concept: a node may have 2, 1, or 0 child nodes (vertices directly below it), the left and right child nodes
• An extended binary tree has either zero child nodes (for an external node) or two child nodes (for an internal node) at each node
• A complete binary tree has two child nodes for each internal node at every level, with a possible exception for the last level of internal nodes

Complete binary tree
• Check each node of the graph, using BFS (do not include the last level of internal nodes): does each checked node have two child nodes?
• If yes, it is complete
• The final level must be filled from the left
[Figure: example tree with level annotations "filled", "filled", "not filled", and "no vertices"]
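One common way to implement this check is a breadth-first scan that fails as soon as a node appears after a missing position. The sketch below follows that textbook formulation, which is assumed here to match the slide's definition; the `Node` class and `is_complete` name are hypothetical.

```python
from collections import deque

class Node:
    """Binary tree node; a missing child is None."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def is_complete(root):
    """BFS check: once a missing position is seen, no further node may appear."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        elif seen_gap:
            return False           # a node to the right of, or below, a gap
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True

# Last level filled from the left: complete
left_filled = Node(Node(Node(), Node()), Node())
# Last level has a gap on the left side: not complete
gap_on_left = Node(Node(None, Node()), Node())

print(is_complete(left_filled))    # True
print(is_complete(gap_on_left))    # False
```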

Extended binary tree
• Check each node of the graph, using DFS (or any other traversal): does every node have zero or two child nodes below it?
• Internal nodes (shown here as circles) have two child nodes
• External nodes (shown here as squares) have zero child nodes
• Suitable for encoding data
[Figure: example of an extended binary tree with circular internal nodes and square external nodes]
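The zero-or-two-children test can be sketched as a depth-first recursion. The `Node` class and the function name are assumptions for illustration.

```python
class Node:
    """Binary tree node; a missing child is None."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def is_extended(node):
    """DFS check that every node has either zero or two child nodes."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False               # exactly one child: not extended
    return is_extended(node.left) and is_extended(node.right)

# Every internal node has two children, every external node has none
extended = Node(Node(), Node(Node(), Node()))
# One internal node has only a left child
one_child = Node(Node(), Node(Node(), None))

print(is_extended(extended))       # True
print(is_extended(one_child))      # False
```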

Encoding with binary trees
• A binary tree can encode a digital, binary code:
  - Each left child corresponds to a binary 0
  - Each right child corresponds to a binary 1
• In the example, the external nodes encode the octal digits; the encoded bits describe a path from the root down to a digit
[Figure: complete binary tree of depth 3 with edges labeled 0 (left) and 1 (right); the external nodes are the octal digits]
  Code   Octal digit
  000    0
  001    1
  010    2
  011    3
  100    4
  101    5
  110    6
  111    7
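Decoding with such a tree means following one edge per bit until an external node is reached. In the sketch below the tree is written as nested pairs `(left, right)` with strings as external nodes; that representation and the name `decode` are assumptions for illustration.

```python
# Octal digit tree: index 0 is the left child (bit 0), index 1 the right child (bit 1)
TREE = ((("0", "1"), ("2", "3")),
        (("4", "5"), ("6", "7")))

def decode(bits, tree=TREE):
    """Follow the bits from the root; emit a symbol at each external node."""
    out, node = [], tree
    for bit in bits:
        node = node[int(bit)]      # 0 -> left child, 1 -> right child
        if isinstance(node, str):  # reached an external node
            out.append(node)
            node = tree            # restart at the root for the next symbol
    return "".join(out)

print(decode("101011"))            # 53
print(decode("000111"))            # 07
```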

Path length
• This binary tree is complete and extended
• In the octal digit code, each symbol is represented with the same number of bits: each symbol has the same path length (i.e., the number of edges between the symbol and the root)
[Figure: the octal digit tree again, with codes 000 to 111 leading to the external nodes 0 to 7]
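Path lengths can be computed by counting edges on the way down to each external node. The nested-pair tree representation and the second, lopsided tree are hypothetical examples added to contrast equal and unequal path lengths.

```python
def leaf_depths(tree, depth=0):
    """Map each external node (a string) to its path length from the root."""
    if isinstance(tree, str):
        return {tree: depth}
    depths = {}
    for child in tree:             # left child, then right child
        depths.update(leaf_depths(child, depth + 1))
    return depths

OCTAL = ((("0", "1"), ("2", "3")), (("4", "5"), ("6", "7")))
SKEWED = ("a", ("b", ("c", "d")))  # hypothetical: extended but not complete

print(leaf_depths(OCTAL))          # every digit has path length 3
print(leaf_depths(SKEWED))         # {'a': 1, 'b': 2, 'c': 3, 'd': 3}
```

In the second tree the code length differs per symbol, which is the property Huffman coding exploits.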

ASCII binary tree sample
[Figure: part of the ASCII binary tree, with subtrees for control characters, symbols, numerals, punctuation, upper case, etc.; the subtree shown in detail has the external nodes ` a b c d e f g h i j k l m n o p q r s t u v w x y z { } ~]
• Example: q = 1110001 = 71h
• This encoding is, however, not very efficient:
  - The resulting tree is huge, even though many symbols may rarely be used
  - The code of every symbol has the same, long length, even when the symbol is used often
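The example code for q, and the cost of a fixed 7-bit code, can be checked in a few lines of Python. The sample sentence is the one used later in the Huffman example; applying the 7-bit cost to it here is an addition for comparison.

```python
ch = "q"
print(format(ord(ch), "07b"))          # 1110001
print(format(ord(ch), "x") + "h")      # 71h

# With a fixed-length code, every symbol costs 7 bits regardless of frequency
text = "this is an example of a huffman tree"
print(len(text) * 7)                   # 252
```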

Huffman coding
• Huffman coding is an algorithm for building a compact tree (i.e., smaller than in the case of the plain binary encoding)
• The resulting compact tree is extended but not necessarily complete: symbols that are used more frequently get shorter codes (fewer bits)
• Huffman coding is used in many kinds of data compression, including those in image, video, and audio files

Huffman coding algorithm: Overview
1. Get sample data (or the actual symbols) to be encoded
2. Count how many times each symbol is used in the sample data
3. Use those frequencies to build the tree from the bottom up, each frequency becoming a node
4. Start with the two least-used symbols to create a node with two child nodes; the frequency of the new node is the sum of the two child node frequencies
5. Evaluate all subtrees, including the new node and any remaining symbols, to find the two least used again, and repeat until a single tree remains
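The steps above can be sketched with a priority queue from Python's `heapq` module. This is one possible implementation under stated assumptions: the function name, the tuple-based tree, and the tie-breaking counter are choices made here, and a non-empty input is assumed. Because ties between equal frequencies can be broken differently, the individual codes may differ from those on the slides, although the total encoded length stays the same.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(text):
    """Build a Huffman tree bottom-up and return {symbol: bit string}."""
    freq = Counter(text)                           # step 2: count symbols
    order = count()                                # tie-breaker for equal weights
    heap = [(f, next(order), sym) for sym, f in freq.items()]
    heapq.heapify(heap)                            # step 3: every frequency is a node
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)          # the two least-used subtrees
        f2, _, right = heapq.heappop(heap)
        # step 4: the new node carries the sum of its children's frequencies
        heapq.heappush(heap, (f1 + f2, next(order), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):                # internal node
            walk(node[0], prefix + "0")            # left child -> 0
            walk(node[1], prefix + "1")            # right child -> 1
        else:                                      # external node: a symbol
            codes[node] = prefix or "0"            # lone-symbol input gets "0"

    walk(heap[0][2], "")
    return codes

text = "this is an example of a huffman tree"
codes = huffman_codes(text)
total_bits = sum(len(codes[ch]) for ch in text)
print(len(codes[" "]))                             # 3
print(total_bits)                                  # 135
```

For comparison, a fixed 7-bit code would need 36 × 7 = 252 bits for the same text.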

Huffman coding: Example
Sample text: this is an example of a huffman tree
Symbols used and their frequencies:
  a: 4    e: 4    f: 3    h: 2
  i: 2    l: 1    m: 2    n: 2
  o: 1    p: 1    r: 1    s: 2
  t: 2    u: 1    x: 1    space: 7
For the sake of simplicity, symbols that do not occur in the text are ignored (i.e., zero frequencies are not used)
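The frequency count for the sample text can be reproduced with `collections.Counter`; the sort order used for printing is an arbitrary choice made here.

```python
from collections import Counter

text = "this is an example of a huffman tree"
freq = Counter(text)

# Most frequent first; ties in alphabetical order
for sym, n in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
    print(repr(sym), n)

print(sum(freq.values()))          # 36 symbols in total
```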

Coding example (cont'd)
• Start with the least-used symbols
• The numbers in each node are the frequencies
• Add the frequencies of the subtrees for new nodes
• In each iteration, build from the lowest frequencies, including the symbols not yet on the tree
[Figure: Huffman tree for the sample text, built from the frequency table above; the external nodes are the symbols with their frequencies (e.g., 7 for space, 4 for a and e, 3 for f), and the internal node weights include 2, 4, 5, 8, 12, 16, 20, and 36 at the root]

Coding example (cont'd)
• Now, shorter codes stand for frequent symbols
• Also, no symbol's code is the beginning (prefix) of any other symbol's code, so a bit sequence can be decoded without ambiguity
• Write the binary codes by frequency for all the symbols (the first two have been done for you):
  space  111
  a      000
[Figure: the completed Huffman tree for the sample text, with root weight 36]
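The prefix property is what makes decoding unambiguous. The sketch below checks it and decodes a bit string; the three-symbol code table is hypothetical and is not the code for the sample text, which is left as the exercise.

```python
def is_prefix_free(codes):
    """True if no code word is the beginning of another code word."""
    words = sorted(codes.values())
    # After sorting, a prefix is always directly followed by a word it starts
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def decode(bits, codes):
    """Read bits left to right, emitting a symbol as soon as a code matches."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)

CODES = {"a": "0", "b": "10", "c": "11"}       # hypothetical prefix-free code
encoded = "".join(CODES[s] for s in "abca")

print(encoded)                                  # 010110
print(decode(encoded, CODES))                   # abca
print(is_prefix_free(CODES))                    # True
print(is_prefix_free({"a": "0", "b": "01"}))    # False: "0" begins "01"
```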

Summary of this lecture
• After this class, you are expected to know the basic algorithm types
• Binary trees can be used to encode data
• Huffman's algorithm is an efficient way of making a compact tree
• You must be able to make a Huffman tree when given small samples of text or characters
• Tree traversal is used in algorithms for compiling and interpreting programs, and in computing
• You must be able to show examples of infix, prefix, and postfix notations as code and as binary trees

Homework
• Read these slides again
• Do the self-preparation assignments
• Learn the English terms that are new to you

Next class
• Overview and mid-semester evaluation for the first six weeks: from Week 01 to Week 06

Quiz 03