CS473-Algorithms I. Lecture 11. Greedy Algorithms. Cevdet Aykanat - Bilkent University Computer Engineering Department


Activity Selection Problem
Input: a set S = {1, 2, …, n} of n activities. s_i = start time of activity i, f_i = finish time of activity i; activity i takes place in [s_i, f_i).
Aim: find a maximum-size subset A of mutually compatible activities (the maximum number of activities, not the maximum time spent in activities).
Activities i and j are compatible if the intervals [s_i, f_i) and [s_j, f_j) do not overlap, i.e., either s_i ≥ f_j or s_j ≥ f_i.

Activity Selection Problem: An Example
S = {[1, 4), [5, 7), [2, 8), [3, 11), [8, 15), [13, 18)}
(Figure: activities 1-6 drawn as intervals on a time line from 1 to 18.)

Optimal Substructure
Theorem: Let k be the activity with the earliest finish time in an optimal solution A ⊆ S. Then A − {k} is an optimal solution to the subproblem S_k = {i ∈ S : s_i ≥ f_k}.
Proof (by contradiction): Suppose B′ is an optimal solution to S_k with |B′| > |A − {k}| = |A| − 1. Then B = B′ ∪ {k} is compatible and |B| = |B′| + 1 > |A|, a contradiction to the optimality of A. Q.E.D.

Repeated Subproblems
Consider a recursive algorithm that tries all possible compatible subsets, and notice the repeated subproblems. E.g. (with f_1 ≤ … ≤ f_n), branching first on whether 2 ∈ A and then on whether 1 ∈ A produces the subproblem S_2 along two different paths, alongside the subproblems S_1, S − {1}, and S − {1, 2}.

Greedy Choice Property
The repeated-subproblems and optimal-substructure properties hold in the activity selection problem. Dynamic programming? Memoize? Yes, but...
Greedy choice property: a sequence of locally optimal (greedy) choices ⇒ an optimal solution.
Assume (without loss of generality) f_1 ≤ f_2 ≤ … ≤ f_n; if not, sort the activities according to their finish times in nondecreasing order.

Greedy Choice Property in Activity Selection
Theorem: There exists an optimal solution A ⊆ S such that 1 ∈ A (remember f_1 ≤ f_2 ≤ … ≤ f_n).
Proof: Let A = {k, ℓ, m, …} be an optimal solution such that f_k ≤ f_ℓ ≤ f_m ≤ …
If k = 1, then schedule A begins with the greedy choice.
If k > 1, we show that there is another optimal solution that begins with the greedy choice 1: let B = (A − {k}) ∪ {1}. Since f_1 ≤ f_k, activity 1 is compatible with A − {k}, so B is compatible, and |B| = |A| − 1 + 1 = |A|. Hence B is optimal. Q.E.D.

Activity Selection Problem
j: the index of the most recent activity added to A.
f_j = max{f_k : k ∈ A}, the maximum finish time of any activity in A, because activities are processed in nondecreasing order of finish times.
Thus, s_i ≥ f_j checks the compatibility of i with the current A.
Running time: Θ(n), assuming that the activities were already sorted.

GAS(s, f, n)
  A ← {1}
  j ← 1
  for i ← 2 to n do
    if s_i ≥ f_j then
      A ← A ∪ {i}
      j ← i
  return A
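The same procedure as a runnable Python sketch (a minimal translation of GAS; the function name and 0-based indexing are my own):

def select_activities(start, finish):
    # Greedy activity selection; assumes the activities are already
    # sorted by nondecreasing finish time.
    if not start:
        return []
    selected = [0]            # the first activity is the greedy choice
    last_finish = finish[0]   # f_j: finish time of the last activity added
    for i in range(1, len(start)):
        if start[i] >= last_finish:   # s_i >= f_j: compatible with A
            selected.append(i)
            last_finish = finish[i]
    return selected

# The lecture's example (0-based indices; these are activities 1, 2, 5
# in the slides' 1-based numbering):
print(select_activities([1, 5, 2, 3, 8, 13], [4, 7, 8, 11, 15, 18]))  # [0, 1, 4]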

Activity Selection Problem: An Example
S = {[1, 4), [5, 7), [2, 8), [3, 11), [8, 15), [13, 18)}
Trace of GAS:
f_j = 0: activity 1 = [1, 4) is added; f_j ← 4
f_j = 4: activity 2 = [5, 7) is compatible (5 ≥ 4) and added; f_j ← 7
f_j = 7: activity 3 = [2, 8) is rejected (2 < 7)
f_j = 7: activity 4 = [3, 11) is rejected (3 < 7)
f_j = 7: activity 5 = [8, 15) is compatible (8 ≥ 7) and added; f_j ← 15
f_j = 15: activity 6 = [13, 18) is rejected (13 < 15)
Result: A = {1, 2, 5}

Greedy vs Dynamic Programming
The optimal substructure property is exploited by both the greedy and DP strategies.
Greedy choice property: a sequence of locally optimal choices ⇒ an optimal solution. We make the choice that seems best at the moment, then solve the subproblem arising after the choice is made.
DP: we also make a choice/decision at each step, but the choice may depend on the optimal solutions to subproblems.
Greedy: the choice may depend on the choices made so far, but it cannot depend on any future choices or on the solutions to subproblems.

Greedy vs Dynamic Programming
DP is a bottom-up strategy. Greedy is a top-down strategy: each greedy choice in the sequence iteratively reduces the problem to a similar but smaller problem.

Proof of Correctness of Greedy Algorithms
Examine a globally optimal solution. Show that this solution can be modified so that
(1) a greedy choice is made as the first step, and
(2) this choice reduces the problem to a similar but smaller problem.
Apply induction to show that a greedy choice can be used at every step. Showing (2) reduces the proof of correctness to proving that the problem exhibits the optimal substructure property.

Elements of Greedy Strategy
How can you judge whether a greedy algorithm will solve a particular optimization problem? Two key ingredients:
Greedy choice property
Optimal substructure property

Key Ingredients of Greedy Strategy
Greedy choice property: a globally optimal solution can be arrived at by making locally optimal (greedy) choices.
In DP, we make a choice at each step, but the choice may depend on the solutions to subproblems.
In greedy algorithms, we make the choice that seems best at that moment, then solve the subproblems arising after the choice is made. The choice may depend on the choices so far, but it cannot depend on any future choice or on the solutions to subproblems.
DP solves the problem bottom-up. Greedy usually progresses in a top-down fashion by making one greedy choice after another, reducing each given problem instance to a smaller one.

Key Ingredients: Greedy Choice Property
We must prove that a greedy choice at each step yields a globally optimal solution.
The proof examines a globally optimal solution and shows that the solution can be modified so that a greedy choice made as the first step reduces the problem to a similar but smaller subproblem. Then induction is applied to show that a greedy choice can be used at each step.
Hence, this induction proof reduces the proof of correctness to demonstrating that an optimal solution must exhibit the optimal substructure property.

Key Ingredients: Optimal Substructure
A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems.
Example: activity selection problem S. If an optimal solution A to S begins with activity 1, then the set of activities A′ = A − {1} is an optimal solution to the activity selection problem S′ = {i ∈ S : s_i ≥ f_1}.

Key Ingredients: Optimal Substructure
The optimal substructure property is exploited by both the greedy and dynamic programming strategies. Hence one may try to generate a dynamic programming solution to a problem when a greedy strategy suffices, or may mistakenly think that a greedy solution works when in fact a DP solution is required.
Example: Knapsack Problems(S, W)

Knapsack Problems
The 0-1 Knapsack Problem(S, W): a thief robbing a store finds n items S = {I_1, I_2, …, I_n}; the ith item is worth v_i dollars and weighs w_i pounds, where v_i and w_i are integers. He wants to take as valuable a load as possible, but he can carry at most W pounds in his knapsack, where W is an integer. The thief cannot take a fractional amount of an item.
The Fractional Knapsack Problem(S, W): the scenario is the same, but the thief can take fractions of items rather than having to make a binary (0-1) choice for each item.

0-1 and Fractional Knapsack Problems
Both knapsack problems exhibit the optimal substructure property.
The 0-1 Knapsack Problem(S, W): consider a most valuable load L with weight w_L ≤ W. If we remove item I_j from this optimal load L, the remaining load L_j = L − {I_j} must be a most valuable load weighing at most W_j = W − w_j pounds that the thief can take from S_j = S − {I_j}. That is, L_j should be an optimal solution to the 0-1 Knapsack Problem(S_j, W_j).

0-1 and Fractional Knapsack Problems
The Fractional Knapsack Problem(S, W): consider a most valuable load L with weight w_L ≤ W. If we remove a weight w (0 < w ≤ w_j) of item I_j from the optimal load L, the remaining load L_j = L − {w pounds of I_j} must be a most valuable load weighing at most W_j = W − w pounds that the thief can take from S_j = (S − {I_j}) ∪ {w_j − w pounds of I_j}. That is, L_j should be an optimal solution to the Fractional Knapsack Problem(S_j, W_j).

Knapsack Problems
Although the problems are similar, the Fractional Knapsack Problem is solvable by a greedy strategy, whereas the 0-1 Knapsack Problem is not.

Greedy Solution to Fractional Knapsack
(1) Compute the value per pound v_i / w_i for each item.
(2) The thief begins by taking as much as possible of the item with the greatest value per pound.
(3) If the supply of that item is exhausted before filling the knapsack, he takes as much as possible of the item with the next greatest value per pound.
(4) Repeat (2)-(3) until the knapsack is full.
Thus, by sorting the items by value per pound, the greedy algorithm runs in O(n lg n) time.
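A minimal Python sketch of this greedy procedure (the function name and return format are my own):

def fractional_knapsack(values, weights, W):
    # Take items in order of value per pound; only the last item taken
    # may be fractional. Returns (total value, list of (item, fraction)).
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, load = 0.0, []
    for i in order:
        if W <= 0:
            break
        take = min(weights[i], W)              # as much of item i as fits
        total += values[i] * take / weights[i]
        load.append((i, take / weights[i]))
        W -= take
    return total, load

# The lecture's instance: W = 50, items ($60, 10 lb), ($100, 20 lb), ($120, 30 lb)
print(fractional_knapsack([60, 100, 120], [10, 20, 30], 50))
# -> (240.0, [(0, 1.0), (1, 1.0), (2, 0.666...)])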

0-1 Knapsack Problem
The greedy strategy does not work. Instance (knapsack capacity W = 50):
Item 1: w_1 = 10, v_1 = $60, v_1/w_1 = 6
Item 2: w_2 = 20, v_2 = $100, v_2/w_2 = 5
Item 3: w_3 = 30, v_3 = $120, v_3/w_3 = 4

0-1 Knapsack Problem
Taking item 1 leaves empty space and lowers the effective value of the load.
Optimal 0-1 load: items 2 and 3 ($100 + $120 = $220).
Loads that begin with the greedy choice of item 1: items 1 and 2 give $160; items 1 and 3 give $180.
The fractional problem is optimally solved with $240: all of items 1 and 2 plus 20 (out of 30) pounds of item 3 ($60 + $100 + $80).
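A small self-contained check of this counterexample; brute force over all subsets is fine at n = 3 (both function names are illustrative):

from itertools import combinations

values, weights, W = [60, 100, 120], [10, 20, 30], 50

def greedy_01_by_ratio(values, weights, W):
    # Greedy by value per pound, whole items only: incorrect for 0-1.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total = 0
    for i in order:
        if weights[i] <= W:
            total += values[i]
            W -= weights[i]
    return total

def best_01(values, weights, W):
    # Exact optimum by enumerating every subset of the items.
    n = len(values)
    return max(sum(values[i] for i in s)
               for r in range(n + 1)
               for s in combinations(range(n), r)
               if sum(weights[i] for i in s) <= W)

print(greedy_01_by_ratio(values, weights, W))  # 160 (items 1 and 2)
print(best_01(values, weights, W))             # 220 (items 2 and 3)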

0-1 Knapsack Problem
When we consider an item I_j for inclusion, we must compare the solutions to two subproblems: the subproblems in which I_j is included and excluded. The problem formulated in this way gives rise to many overlapping subproblems (a key ingredient of DP). In fact, dynamic programming can be used to solve the 0-1 Knapsack Problem.

0-1 Knapsack Problem
A thief robs a store containing n articles {a_1, a_2, …, a_n}. The value of the ith article is v_i dollars and its weight is w_i kg (v_i and w_i are integers). The thief can carry at most W kg in his knapsack. Which articles should he take to maximize the value of his load?
Let K_{n,W} = {a_1, a_2, …, a_n : W} denote this 0-1 knapsack problem. Consider the solution as a sequence of n decisions, i.e., the ith decision: whether the thief should pick a_i for the optimal load.

0-1 Knapsack Problem
Optimal substructure property: let a subset S of articles be optimal for K_{n,W}, and let a_i be the highest-numbered article in S. Then S′ = S − {a_i} is an optimal solution for the subproblem K_{i−1,W−w_i} = {a_1, a_2, …, a_{i−1} : W − w_i}, with c(S) = v_i + c(S′), where c(·) is the value of an optimal load.

0-1 Knapsack Problem
Recursive definition for the value of an optimal solution: define c[i, w] as the value of an optimal solution for K_{i,w} = {a_1, a_2, …, a_i : w}. Then
c[i, w] = 0                                      if i = 0 or w = 0
c[i, w] = c[i−1, w]                              if w_i > w
c[i, w] = max{v_i + c[i−1, w−w_i], c[i−1, w]}    otherwise

0-1 Knapsack Problem
This recurrence says that an optimal solution S_{i,w} for K_{i,w} either contains a_i (c(S_{i,w}) = v_i + c(S_{i−1,w−w_i})) or does not contain a_i (c(S_{i,w}) = c(S_{i−1,w})).
If the thief decides to pick a_i, he takes the value v_i and can choose from {a_1, a_2, …, a_{i−1}} up to the weight limit w − w_i to get c[i−1, w−w_i]. If he decides not to pick a_i, he can choose from {a_1, a_2, …, a_{i−1}} up to the weight limit w to get c[i−1, w]. The better of these two choices should be made.

DP Solution to 0-1 Knapsack
c is an (n+1) × (W+1) array, c[0..n, 0..W]. Note: the table is computed in row-major order. Running time: Θ(nW).

KNAP0-1(v, w, n, W)
  for ω ← 0 to W do c[0, ω] ← 0
  for i ← 1 to n do c[i, 0] ← 0
  for i ← 1 to n do
    for ω ← 1 to W do
      if w_i ≤ ω then c[i, ω] ← max{v_i + c[i−1, ω−w_i], c[i−1, ω]}
      else c[i, ω] ← c[i−1, ω]
  return c[n, W]

Finding the Set S of Articles in an Optimal Load

SOLKNAP0-1(a, v, w, n, W, c)
  i ← n; ω ← W
  S ← ∅
  while i > 0 do
    if c[i, ω] = c[i−1, ω] then
      i ← i − 1
    else
      S ← S ∪ {a_i}
      ω ← ω − w_i
      i ← i − 1
  return S
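The table computation and the traceback, as a runnable Python sketch (function names mine; article indices are 1-based to match the slides):

def knap01(values, weights, W):
    # Bottom-up DP table for the 0-1 knapsack (the slides' KNAP0-1):
    # c[i][w] = value of an optimal load from articles 1..i at capacity w.
    n = len(values)
    c = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 and column 0 stay 0
    for i in range(1, n + 1):
        vi, wi = values[i - 1], weights[i - 1]  # article i
        for w in range(1, W + 1):
            if wi <= w:
                c[i][w] = max(vi + c[i - 1][w - wi], c[i - 1][w])
            else:
                c[i][w] = c[i - 1][w]
    return c

def solknap01(weights, W, c):
    # Traceback (the slides' SOLKNAP0-1): article i was taken exactly
    # when the optimal value changes between rows i-1 and i.
    S, i, w = set(), len(c) - 1, W
    while i > 0:
        if c[i][w] != c[i - 1][w]:
            S.add(i)
            w -= weights[i - 1]
        i -= 1
    return S

c = knap01([60, 100, 120], [10, 20, 30], 50)
print(c[3][50])                        # 220
print(solknap01([10, 20, 30], 50, c))  # {2, 3}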

Huffman Codes
A widely used and very effective technique for compressing data; savings of 20% to 90% are typical, depending on the characteristics of the file being compressed.
Huffman's greedy algorithm uses a table of the frequencies of occurrence of each character to build up an optimal way of representing each character as a binary string.
Example: a 100,000-character data file that is to be compressed; only 6 characters {a, b, c, d, e, f} appear.

                             a      b      c      d      e      f
frequency (in thousands)    45     13     12     16      9      5
fixed-length codeword      000    001    010    011    100    101
variable-length codeword     0    101    100    111   1101   1100
variable-length codeword     0     10    110   1110  11110  11111

Huffman Codes
Binary character code: each character is represented by a unique binary string.
Fixed-length code: needs 3 bits to represent 6 characters; requires 100,000 × 3 = 300,000 bits to code the entire file.
Variable-length code: can do better by giving frequent characters short codewords and infrequent characters long codewords; requires (45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4) × 1,000 = 224,000 bits.

Prefix Codes
Prefix codes: no codeword is also a prefix of some other codeword.
It can be shown that the optimal data compression achievable by a character code can always be achieved with a prefix code.
Prefix codes simplify encoding (compression) and decoding.
Encoding: concatenate the codewords representing each character of the file. E.g., the 3-character file abc is encoded as 0·101·100 = 0101100.

Prefix Codes
Decoding is quite simple with a prefix code: the codeword that begins an encoded file is unambiguous, since no codeword is a prefix of any other.
Identify the initial codeword, translate it back to the original character, remove it from the encoded file, and repeat the decoding process on the remainder of the encoded file.
E.g., the string 001011101 parses uniquely as 0·0·101·1101, decoded as aabe.
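A minimal Python sketch of this decoding loop, using a codeword table for the lecture's optimal code (this flat-table representation is mine; the lecture's tree representation follows):

def decode_prefix(bits, code):
    # Repeatedly strip the unique codeword that begins the remaining input;
    # the prefix property guarantees the first match is the right one.
    inverse = {cw: ch for ch, cw in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    assert buf == "", "input is not a concatenation of codewords"
    return "".join(out)

code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
print(decode_prefix("001011101", code))  # aabe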

Prefix Codes
A convenient representation for a prefix code: a binary tree whose leaves are the given characters.
The binary codeword for a character is the path from the root to that character in the binary tree: 0 means "go to the left child", 1 means "go to the right child".

Binary Tree Representation of Prefix Codes
(Figure: the binary tree corresponding to the fixed-length code. Root weight 100, internal nodes 86, 14, 58, 28, 14, and leaves a:45, b:13, c:12, d:16, e:9, f:5.)

Binary Tree Representation of Prefix Codes
(Figure: the binary tree corresponding to the optimal variable-length code. Root weight 100 with children a:45 and an internal node 55; 55 has children 25 (leaves c:12, b:13) and 30 (an internal node 14 with leaves f:5, e:9, and leaf d:16).)
An optimal code for a file is always represented by a full binary tree.

Full Binary Tree Representation of Prefix Codes
Consider an FBT corresponding to an optimal prefix code. It has |C| leaves (external nodes), one for each letter of the alphabet, where C is the alphabet from which the characters are drawn.
Lemma: An FBT with |C| external nodes has exactly |C| − 1 internal nodes.

Full Binary Tree Representation of Prefix Codes
Consider an FBT T corresponding to a prefix code. How do we compute B(T), the number of bits required to encode a file?
f(c): frequency of character c in the file.
d_T(c): depth of c's leaf in the FBT; note that d_T(c) also denotes the length of the codeword for c.
B(T) = Σ_{c∈C} f(c) d_T(c), which we define as the cost of the tree T.

Prefix Codes
Lemma: Let each internal node i be labeled with the sum w(i) of the weights of the leaves in its subtree. Then
B(T) = Σ_{c∈C} f(c) d_T(c) = Σ_{i∈I} w(i)
where I denotes the set of internal nodes in T.
Proof: Consider a leaf node c with frequency f(c) and depth d_T(c). Then f(c) appears in the weights of the d_T(c) internal nodes along the path from c to the root; hence f(c) appears d_T(c) times in the above summation.

Constructing a Huffman Code
Huffman invented a greedy algorithm that constructs an optimal prefix code, called a Huffman code.
The greedy algorithm builds the FBT corresponding to the optimal code in a bottom-up manner: it begins with a set of |C| leaves and performs a sequence of |C| − 1 merges to create the final tree.

Constructing a Huffman Code
A priority queue Q, keyed on f, is used to identify the two least-frequent objects to merge. The result of the merger of two objects is a new object inserted into the priority queue according to its frequency, which is the sum of the frequencies of the two objects merged.

Constructing a Huffman Code

HUFFMAN(C)
  n ← |C|
  Q ← C
  for i ← 1 to n − 1 do
    z ← ALLOCATE-NODE()
    x ← left[z] ← EXTRACT-MIN(Q)
    y ← right[z] ← EXTRACT-MIN(Q)
    f[z] ← f[x] + f[y]
    INSERT(Q, z)
  return EXTRACT-MIN(Q)   ▷ only one object left in Q

The priority queue is implemented as a binary heap. Initialization of Q (BUILD-HEAP) takes O(n) time; EXTRACT-MIN and INSERT take O(lg n) time on a Q with n objects, so the total is Σ_{i=1}^{n} lg i = O(lg(n!)) = O(n lg n).
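HUFFMAN rendered as runnable Python, with the standard heapq module as the priority queue (the node representation and the code-extraction walk are my own additions):

import heapq
from itertools import count

def huffman(freq):
    # Build the Huffman tree with |C| - 1 merges, then walk it to read
    # off the codewords. freq maps each character to its frequency.
    tie = count()  # tiebreaker so equal-frequency entries never compare trees
    q = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(q)                                     # BUILD-HEAP
    for _ in range(len(freq) - 1):
        fx, _, x = heapq.heappop(q)                      # EXTRACT-MIN
        fy, _, y = heapq.heappop(q)                      # EXTRACT-MIN
        heapq.heappush(q, (fx + fy, next(tie), (x, y)))  # INSERT merged node
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):    # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                          # leaf: a character
            code[node] = prefix or "0"
    walk(q[0][2], "")
    return code

print(huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}))
# Codeword lengths match the lecture's optimal code: a:1, c/b/d:3, f/e:4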

Constructing a Huffman Code
(Figure: the run of HUFFMAN on the example file, steps (a)-(f).)
(a) Initial queue, sorted by frequency: f:5, e:9, c:12, b:13, d:16, a:45.
(b) Merge f and e into a node of weight 14; queue: c:12, b:13, 14, d:16, a:45.
(c) Merge c and b into 25; queue: 14, d:16, 25, a:45.
(d) Merge 14 and d into 30; queue: 25, 30, a:45.
(e) Merge 25 and 30 into 55; queue: a:45, 55.
(f) Merge a and 55 into the root, 100; the tree is complete.

Correctness of Huffman's Algorithm
We must show that the problem of determining an optimal prefix code exhibits the greedy choice property and the optimal substructure property.
Lemma 1: Let x and y be two characters in C having the lowest frequencies. Then there exists an optimal prefix code for C in which the codewords for x and y have the same length and differ only in the last bit.

Correctness of Huffman's Algorithm
Proof: Take a tree T representing an arbitrary optimal code. Modify T to make a tree T″ representing another optimal code such that the characters x and y appear as sibling leaves of maximum depth in the new tree.
Let b and c be two sibling leaves of maximum depth in T. Assume that f[b] ≤ f[c] and f[x] ≤ f[y]. Since f[x] and f[y] are the two lowest leaf frequencies, in order, and f[b] and f[c] are two arbitrary leaf frequencies, in order, f[x] ≤ f[b] and f[y] ≤ f[c].

Correctness of Huffman s Algorithm x b b y y c b c x c x y : exchange the positions of the leaves b & x : exchange the positions of the leaves c & y 57

Greedy-Choice Property of Determining an Optimal Code
Proof of Lemma 1 (continued): The difference in cost between T and T′ is
B(T) − B(T′) = Σ_{c∈C} f(c) d_T(c) − Σ_{c∈C} f(c) d_{T′}(c)
= f[x] d_T(x) + f[b] d_T(b) − f[x] d_{T′}(x) − f[b] d_{T′}(b)
= f[x] d_T(x) + f[b] d_T(b) − f[x] d_T(b) − f[b] d_T(x)
= (f[b] − f[x]) (d_T(b) − d_T(x)) ≥ 0

Greedy-Choice Property of Determining an Optimal Code
Proof of Lemma 1 (continued): Since f[b] − f[x] ≥ 0 and d_T(b) ≥ d_T(x), we have B(T′) ≤ B(T). We can similarly show that B(T′) − B(T″) ≥ 0, which implies B(T″) ≤ B(T). Since T is optimal, B(T) ≤ B(T″); hence B(T″) = B(T), and T″ is also optimal. Q.E.D.

Greedy-Choice Property of Determining an Optimal Code
Lemma 1 implies that the process of building an optimal tree by mergers can begin with the greedy choice of merging the two characters with the lowest frequencies.
We have already proved that B(T) = Σ_{i∈I} w(i), that is, the total cost of the tree constructed is the sum of the costs of its mergers (internal nodes). Of all possible mergers, at each step Huffman chooses the one that incurs the least cost.

Greedy-Choice Property of Determining an Optimal Code
Lemma 2: Consider any two characters x and y that appear as sibling leaves in an optimal tree T, and let z be their parent. Then, considering z as a character with frequency f[z] = f[x] + f[y], the tree T′ = T − {x, y} represents an optimal prefix code for the alphabet C′ = (C − {x, y}) ∪ {z}.

Greedy-Choice Property of Determining an Optimal Code
Proof: We first express the cost B(T) of tree T in terms of the cost B(T′). For each c ∈ C − {x, y} we have d_T(c) = d_{T′}(c), so f(c) d_T(c) = f(c) d_{T′}(c). Since d_T(x) = d_T(y) = d_{T′}(z) + 1 and f[z] = f[x] + f[y],
f[x] d_T(x) + f[y] d_T(y) = (f[x] + f[y]) (d_{T′}(z) + 1) = f[z] d_{T′}(z) + f[x] + f[y]
Hence B(T) = B(T′) + f[x] + f[y], i.e., B(T′) = B(T) − f[x] − f[y].

Greedy-Choice Property of Determining an Optimal Code
Proof (continued): Suppose T′ represents a nonoptimal prefix code for the alphabet C′. Then there exists a tree T″ whose leaves are characters in C′ such that B(T″) < B(T′). Since z is a character in C′, it appears as a leaf in T″. If we add x and y as children of z in T″, then we obtain a prefix code for C with cost B(T″) + f[x] + f[y] < B(T′) + f[x] + f[y] = B(T), contradicting the optimality of T. Q.E.D.