CS 161: Design and Analysis of Algorithms
Announcements: Homework 3, problem 3 removed.
Greedy Algorithms 4: Huffman Encoding/Set Cover. Topics: Huffman Encoding, Set Cover.
Alphabets and Strings: Alphabet = finite set of symbols. English alphabet = {a, b, c, ..., z}. Hex values = {0, 1, ..., 9, A, B, C, D, E, F}. String = sequence of symbols from some alphabet, e.g. "This is a string".
How to Encode: Computers store things as 0s and 1s. How do we encode strings as sequences of bits? The encoding must be invertible (one-to-one). We want to use as few bits as possible. One approach: choose an encoding for characters, then induce an encoding of strings by concatenating the codes for each character.
How to Encode: Obvious solution: if the alphabet size is $2^k$ for some $k$, encode each character using $k$ bits. Each character takes $k$ bits, so $n$ characters take $kn$ bits total. Example (Letter: Encoding): A: 00, B: 01, C: 10, D: 11. Then ABACBDAAADBAC encodes as 00010010011100000011010010.
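The fixed-length scheme above can be sketched in a few lines of Python. This is only an illustration, not code from the lecture; the helper names `fixed_length_code` and `encode` are made up for this example.

```python
def fixed_length_code(alphabet):
    """Give every symbol a k-bit codeword, where k is the smallest
    number of bits with 2**k >= len(alphabet)."""
    k = max(1, (len(alphabet) - 1).bit_length())
    return {sym: format(i, f"0{k}b") for i, sym in enumerate(alphabet)}


def encode(text, code):
    """Encode a string by concatenating the codewords of its characters."""
    return "".join(code[ch] for ch in text)


code = fixed_length_code("ABCD")       # A=00, B=01, C=10, D=11
bits = encode("ABACBDAAADBAC", code)   # 13 characters, 2 bits each
print(bits, len(bits))
```

With a 3-letter alphabet the same function still uses 2 bits per character, which is the waste discussed on the next slide.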
How to Encode: Issues: Wasteful: if there are not exactly $2^k$ characters, some bit sequences are never used. Example (Letter: Encoding): A: 00, B: 01, C: 10; the sequence 11 is never used.
How to Encode: Issues: What if one character occurs very often? Example: AAAAAAABAAACAABAADAAAAAAACAAAB. If almost all letters are A's, then an encoding that uses fewer bits to represent A and more to represent everything else would save space.
Variable Length Encoding: Variable length encoding = encoding of characters as bits where different letters may use a different number of bits. We still need the encoding of strings to be one-to-one. What does this say about the encoding of characters?
Variable Length Encoding: Example (Letter: Encoding): A: 0, B: 01, C: 10, D: 11. Then AC encodes as 010 and BA also encodes as 010. Not one-to-one!
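The collision above can also be found mechanically. The following is a small brute-force sketch (the function name `find_collision` and the `max_len` cutoff are assumptions made for this example): it encodes every short string and reports the first two that share an encoding.

```python
from itertools import product

code = {"A": "0", "B": "01", "C": "10", "D": "11"}


def encode(text, code):
    return "".join(code[ch] for ch in text)


def find_collision(code, max_len=3):
    """Return two different strings with the same encoding, or None if
    no such pair exists among strings of length <= max_len."""
    seen = {}  # encoding -> first string that produced it
    for length in range(1, max_len + 1):
        for letters in product(sorted(code), repeat=length):
            text = "".join(letters)
            bits = encode(text, code)
            if bits in seen:
                return seen[bits], text
            seen[bits] = text
    return None


pair = find_collision(code)
print(pair)
```

Finding no collision up to `max_len` does not prove that an encoding is one-to-one; it is only a quick way to exhibit a counterexample when one exists.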
Prefix-Free Encoding: A prefix of a bit sequence is its first $i$ bits, for some $i$. Example: the prefixes of 0100101101000110101 include 0, 01, 010, 0100, and 01001.
Prefix-Free Encoding: A prefix-free encoding is an encoding of an alphabet such that no encoding of any character is a prefix of the encoding of any other character. Non-example (Letter: Encoding): A: 0, B: 01, C: 10, D: 11. The encoding of A is a prefix of the encoding of B.
Prefix-Free Encoding: Example of a prefix-free encoding (Letter: Encoding): A: 0, B: 10, C: 110, D: 111. No codeword is a prefix of any other codeword.
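The definition translates directly into a pairwise check. A minimal sketch, assuming the encoding is given as a Python dict from characters to bit strings (the name `is_prefix_free` is made up here):

```python
def is_prefix_free(code):
    """True if no character's codeword is a prefix of the codeword of a
    different character (equal codewords also count as a violation)."""
    for c, word_c in code.items():
        for d, word_d in code.items():
            if c != d and word_d.startswith(word_c):
                return False
    return True


bad = {"A": "0", "B": "01", "C": "10", "D": "11"}     # 0 is a prefix of 01
good = {"A": "0", "B": "10", "C": "110", "D": "111"}
print(is_prefix_free(bad), is_prefix_free(good))
```

The quadratic loop is chosen for clarity; for the small alphabets in these examples its cost is negligible.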
Prefix-Free Encoding: Theorem: Any prefix-free encoding of an alphabet induces a one-to-one encoding of strings over that alphabet.
Prefix-Free Encoding: Proof: Suppose toward contradiction that $S$ and $T$ are two different strings that map to the same sequence of bits. Assume w.l.o.g. that $S$ and $T$ differ on the first character (otherwise, strip their common leading characters). Let $c$ be the first character of $S$ and $d$ the first character of $T$. Let $E(c)$ and $E(d)$ be the encodings of $c$ and $d$. Assume w.l.o.g. $|E(c)| \ge |E(d)|$.
Prefix-Free Encoding: Since all bits in the encodings of $S$ and $T$ are the same, their first $|E(d)|$ bits are the same. Therefore, the first $|E(d)|$ bits of $E(c)$ are equal to $E(d)$, so $E(d)$ is a prefix of $E(c)$. Since $c$ was assumed different from $d$, our encoding is not prefix-free, a contradiction.
Tree View of Prefix-Free Encoding: Every node represents a partial codeword. Every non-leaf node has two children, one for appending 0 to the partial codeword and one for appending 1. Leaves correspond to actual codewords. The root is the empty codeword.
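Tree View of Prefix-Free Encoding: Example tree (each edge is labeled with the appended bit; each leaf shows its character and codeword):
root
  0: leaf A:0
  1: node 1
    0: node 10
      0: leaf C:100
      1: leaf D:101
    1: leaf B:11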
Tree View of Prefix-Free Encoding: To encode: find the path from the root to the character and concatenate the edge labels. To decode $b_1 b_2 \ldots$: starting from the root, follow the edge labeled $b_1$, then the edge labeled $b_2$, and so on, until we find a leaf. Output that character, and start over from the root.
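The decoding procedure can be sketched as follows. The representation is an assumption made for this example: internal nodes are nested dicts keyed by '0' and '1', and leaves are the characters themselves. The code table is the one from the example tree (A: 0, B: 11, C: 100, D: 101), and the sketch assumes a prefix-free code with non-empty codewords.

```python
def build_tree(code):
    """Build the code tree: follow or create one edge per bit and store
    the character at the end of its codeword's path."""
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = sym
    return root


def decode(bits, root):
    """Walk from the root one bit at a time; at each leaf, output the
    character and start over from the root."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if not isinstance(node, dict):   # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)


code = {"A": "0", "B": "11", "C": "100", "D": "101"}
tree = build_tree(code)
message = "ABCD"
bits = "".join(code[ch] for ch in message)
print(bits, decode(bits, tree))
```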
Optimal Encoding: What is the best possible prefix-free encoding we can find? Let $n$ be the length of the string. Let $C$ be the cost of the encoding, defined as (length of encoding)/$n$. Equivalently, $C$ = average length of the encoding of a character, weighted by frequency.
Optimal Encoding: Let $l_i$ be the length of the encoding of character $i$. Let $f_i$ be the frequency with which $i$ occurs in the string: $f_i = (\text{number of instances of } i)/n$. Then $C = \sum_i f_i l_i$.
Optimal Encoding: $l_i$ is also the depth of character $i$ in the encoding tree. An optimal encoding always corresponds to a full binary tree: if there is a node with only 1 child, replace the node with its child. The depth of the leaves only decreases.
Optimal Encoding: Entropy: $H = -\sum_i f_i \log f_i$ (logarithms are base 2). Theorem (Shannon Coding Theorem): $C \ge H$.
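To make the cost $C$ and the entropy $H$ concrete, here is a hedged numerical sketch. It uses the A-heavy string from the "Issues" slide and the prefix-free code A: 0, B: 10, C: 110, D: 111 from the earlier example; the function names are made up for illustration.

```python
from collections import Counter
from math import log2


def frequencies(text):
    """f_i = (number of instances of i) / n."""
    n = len(text)
    return {ch: cnt / n for ch, cnt in Counter(text).items()}


def cost(freq, code):
    """C = sum_i f_i * l_i, the average number of bits per character."""
    return sum(f * len(code[ch]) for ch, f in freq.items())


def entropy(freq):
    """H = -sum_i f_i * log2(f_i)."""
    return -sum(f * log2(f) for f in freq.values() if f > 0)


text = "AAAAAAABAAACAABAADAAAAAAACAAAB"
code = {"A": "0", "B": "10", "C": "110", "D": "111"}
freq = frequencies(text)
C, H = cost(freq, code), entropy(freq)
encoded = "".join(code[ch] for ch in text)
print(f"C = {C:.3f}, H = {H:.3f}, bits per char = {len(encoded) / len(text):.3f}")
```

The printed cost equals the encoded length divided by $n$, and it is never below the entropy, as the theorem states.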
Proof of Coding Theorem: Let $g(x) = x \log x$. Lemma ($g$ is convex): $g\left(\frac{x+y}{2}\right) \le \frac{g(x)+g(y)}{2}$.
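Proof of Coding Theorem: True when there are only 2 characters. The only possible encoding is for each character to get 1 bit, so $C = 1$. Since $f_1 + f_2 = 1$, the lemma gives $H = -f_1 \log f_1 - f_2 \log f_2 = -g(f_1) - g(f_2) \le -2\,g\left(\frac{f_1+f_2}{2}\right) = -2\,g(1/2) = 1 = C$.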
Proof of Coding Theorem: Inductively assume true for $m-1$ characters. Let $T$ be the tree corresponding to an optimal encoding over some alphabet of $m$ characters. There are at least two leaves at the bottom level. Assume w.l.o.g. these correspond to characters 1 and 2. Replace all instances of characters 1 and 2 with a new character, which has frequency $f_1 + f_2$.
Proof of Coding Theorem: Now we have an alphabet of size $m-1$. Encoding for the new alphabet: start with $T$, delete the nodes corresponding to characters 1 and 2, and assign the new character to the parent of these nodes (which is now a leaf). The new character has code length 1 less than the deleted characters.
Proof of Coding Theorem: How does $C$ change? Removed character 1 with length $l$, frequency $f_1$. Removed character 2 with length $l$, frequency $f_2$. Added the new character with length $l-1$, frequency $f_1+f_2$. Since $C = \sum_i f_i l_i$, the new cost is $C' = C - (f_1+f_2)\,l + (f_1+f_2)(l-1) = C - (f_1+f_2)$.
Proof of Coding Theorem: By the inductive assumption, $C' \ge H'$, where $H' = -\sum_i f'_i \log f'_i = -\sum_{i \ge 3} f_i \log f_i - (f_1+f_2)\log(f_1+f_2) = H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\log(f_1+f_2)$. Recall $C = C' + f_1 + f_2$.
Proof of Coding Theorem: Therefore $C \ge H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\left(\log(f_1+f_2) - 1\right) = H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\log\frac{f_1+f_2}{2} = H + g(f_1) + g(f_2) - 2\,g\left(\frac{f_1+f_2}{2}\right) \ge H$, where the last step uses the lemma.
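How to Find Optimal Encoding: Claim 1: There is an optimal solution where the two least frequent characters have the longest codewords (i.e. are at the lowest level of the tree), and their codewords are identical except for the last bit. Proof idea: if not, swap these two characters with two of the characters with the longest codewords; this cannot increase the cost, because the less frequent characters receive the longer codewords. We can choose the two to swap with to be siblings.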
How to Find Optimal Encoding: Assume the two lowest-frequency characters are 1 and 2. What if we merge the two characters into a new character with frequency $f_1 + f_2$? The new character gets the codeword obtained by dropping the last bit of the codeword for 1 or 2.
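Merging Two Characters: Before merging C and E (each edge is labeled with the appended bit):
root
  0: leaf A:0
  1: node 1
    0: node 10
      0: node 100
        0: leaf C:1000
        1: leaf E:1001
      1: leaf D:101
    1: leaf B:11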
Merging Two Characters: After merging C and E into the new character CE:
root
  0: leaf A:0
  1: node 1
    0: node 10
      0: leaf CE:100
      1: leaf D:101
    1: leaf B:11
How to Find Optimal Encoding: Claim 2: For any optimal encoding as in Claim 1, the encoding obtained by merging characters 1 and 2 must be an optimal encoding for the reduced alphabet, where characters 1 and 2 are replaced with a new character of frequency $f_1 + f_2$.
How to Find Optimal Encoding: Before merging C and E:
Character  Frequency    Codeword
A          $f_1$        0
B          $f_2$        11
C          $f_3$        1000
D          $f_4$        101
E          $f_5$        1001
After merging C and E:
Character  Frequency    Codeword
A          $f_1$        0
B          $f_2$        11
CE         $f_3 + f_5$  100
D          $f_4$        101
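As a sanity check on the merging step, the sketch below plugs in concrete numbers. The frequencies (A 0.45, B 0.25, C 0.10, D 0.15, E 0.05) are borrowed from the worked example elsewhere in these slides; the helper `merge` is a hypothetical name. It confirms that merging two sibling characters lowers the cost by exactly the sum of their frequencies.

```python
freq = {"A": 0.45, "B": 0.25, "C": 0.10, "D": 0.15, "E": 0.05}
code = {"A": "0", "B": "11", "C": "1000", "D": "101", "E": "1001"}


def cost(freq, code):
    """C = sum_i f_i * l_i."""
    return sum(f * len(code[ch]) for ch, f in freq.items())


def merge(freq, code, x, y):
    """Replace sibling characters x and y by one character x+y whose
    codeword is their shared codeword minus the last bit."""
    if code[x][:-1] != code[y][:-1]:
        raise ValueError("x and y must be siblings in the code tree")
    merged = x + y
    new_freq = {ch: f for ch, f in freq.items() if ch not in (x, y)}
    new_code = {ch: w for ch, w in code.items() if ch not in (x, y)}
    new_freq[merged] = freq[x] + freq[y]
    new_code[merged] = code[x][:-1]
    return new_freq, new_code


new_freq, new_code = merge(freq, code, "C", "E")
before, after = cost(freq, code), cost(new_freq, new_code)
print(before, after, before - after)  # the drop is f_C + f_E, up to rounding
```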
How to Find Optimal Encoding: Idea: take the two characters with the lowest frequency, merge them, recursively solve the reduced problem, then split the characters apart again.
How to Find Optimal Encoding: Example, merging phase (at each step, merge the two characters with the lowest frequency):
Step                        Characters and frequencies
Start                       A 0.45, B 0.25, C 0.10, D 0.15, E 0.05
Merge C and E               A 0.45, B 0.25, [CE] 0.15, D 0.15
Merge [CE] and D            A 0.45, B 0.25, [[CE]D] 0.30
Merge [[CE]D] and B         A 0.45, [[[CE]D]B] 0.55
Merge A and [[[CE]D]B]      [A[[[CE]D]B]] 1.00
How to Find Optimal Encoding: Example, splitting phase (undo the merges in reverse order; each split appends 0 or 1 to the codeword of the merged character):
Step                        Codewords
Start                       [A[[[CE]D]B]] is the root (empty codeword)
Split [A[[[CE]D]B]]         A 0, [[[CE]D]B] 1
Split [[[CE]D]B]            A 0, B 11, [[CE]D] 10
Split [[CE]D]               A 0, B 11, [CE] 100, D 101
Split [CE]                  A 0, B 11, C 1000, D 101, E 1001
Final encoding:
Character  Frequency  Codeword
A          0.45       0
B          0.25       11
C          0.10       1000
D          0.15       101
E          0.05       1001
How to Find Optimal Encoding: Algorithm:
  Let q be a heap of characters, ordered by frequency
  For each character c: q.insert(c)
  While q has at least two characters:
    c_1 = q.deletemin(), c_2 = q.deletemin()
    Create a node labeled [c_1 c_2] with children c_1 and c_2
    f([c_1 c_2]) = f(c_1) + f(c_2)
    q.insert([c_1 c_2])
  Return q.deletemin()
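The heap-based algorithm above can be sketched in Python with the standard `heapq` module. This is one possible rendering, not the lecture's reference implementation: the function name `huffman_code`, the tuple-based tree, and the tie-breaking counter are choices made for this example. It assumes a non-empty frequency table whose symbols are not themselves tuples.

```python
import heapq
from itertools import count


def huffman_code(freq):
    """Return {symbol: codeword} for a dict of symbol frequencies."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) >= 2:
        f1, _, left = heapq.heappop(heap)    # two lowest-frequency nodes
        f2, _, right = heapq.heappop(heap)
        # The merged node is a pair of children with frequency f1 + f2.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    _, _, root = heap[0]

    code = {}

    def assign(node, prefix):
        if isinstance(node, tuple):          # internal node: split again
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                # leaf: a real character
            code[node] = prefix or "0"       # lone symbol still gets 1 bit

    assign(root, "")
    return code


freq = {"A": 0.45, "B": 0.25, "C": 0.10, "D": 0.15, "E": 0.05}
code = huffman_code(freq)
print(code)
```

The exact codewords depend on which child is labeled 0 and on how ties are broken, so they may differ from the ones on the slides; the codeword lengths, and therefore the cost, are the same.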
Running Time: $n$ inserts initially: $O(n \log n)$. Every run of the loop decreases the size of the heap by 1, so there are $n-1$ runs of the loop. Each run of the loop involves 3 heap operations: $O(\log n)$. Total running time: $O(n \log n)$.
Set Cover: Given a set of elements $B$ and a collection of subsets $S_i \subseteq B$, output a selection of the $S_i$ whose union is $B$, such that the number of subsets used is minimal.
Example (Schools): Suppose we have a collection of towns, and we want to figure out the best towns in which to put schools. We need at least one school within 20 miles of each town. Every school should be in a town.
Example (Schools): $B$ = set of towns. $S_i$ = subset of towns within 20 miles of town $i$.
Greedy Solution: Obvious solution: repeatedly pick the set $S_i$ with the largest number of uncovered elements.
Example: $B = \{1, 2, 3, 4, 5, 6\}$, $S_1 = \{1, 2, 3\}$, $S_2 = \{1, 4\}$, $S_3 = \{2, 5\}$, $S_4 = \{3, 6\}$. Greedy algorithm trace:
Sets used                      Elements left
{}                             {1, 2, 3, 4, 5, 6}
{S_1}                          {4, 5, 6}
{S_1, S_2}                     {5, 6}
{S_1, S_2, S_3}                {6}
{S_1, S_2, S_3, S_4}           {}
Greedy uses 4 sets. Optimal: {S_2, S_3, S_4}, which uses 3 sets.
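The greedy rule can be sketched in Python on the instance from this example. The function name `greedy_set_cover` and the dict-of-sets input format are assumptions made for illustration; ties are broken in favor of the set listed first.

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most uncovered elements.
    Returns the names of the chosen subsets in the order picked."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() returns the first maximal name, so ties go to the
        # subset that appears earliest in the dict.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


universe = {1, 2, 3, 4, 5, 6}
subsets = {"S1": {1, 2, 3}, "S2": {1, 4}, "S3": {2, 5}, "S4": {3, 6}}
chosen = greedy_set_cover(universe, subsets)
print(chosen)
```

On this instance the sketch picks all four sets, while S2, S3, S4 alone already cover every element.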
Set Cover: The greedy algorithm isn't optimal! Obtaining an optimal solution is believed to be hard. Settle for approximation: if the optimal solution uses $k$ sets, we want to get a solution using only slightly more than $k$ sets.
Approximation: Claim: If $B$ contains $n$ elements, and the optimal solution uses $k$ sets, then greedy uses at most $k \ln n$ sets.
Proof: Let $n_t$ be the number of uncovered elements after $t$ iterations of the greedy algorithm ($n_0 = n$). The remaining elements are covered by the optimal $k$ sets, so there must be some set containing at least $n_t/k$ of the uncovered elements. Therefore, greedy picks a set that covers at least $n_t/k$ of the remaining elements.
Proof: Greedy picks a set that covers at least $n_t/k$ of the remaining elements, so $n_{t+1} \le n_t - n_t/k = n_t(1-1/k)$. Therefore, $n_t \le n_0(1-1/k)^t = n(1-1/k)^t$.
Proof: Fact: $1 - x \le e^{-x}$, with equality if and only if $x = 0$.
Proof: For $t \ge 1$, $n_t \le n(1-1/k)^t < n\left(e^{-1/k}\right)^t = n e^{-t/k}$. After $t = k \ln n$ iterations, $n_t < n e^{-\ln n} = 1$. Since $n_t$ is a nonnegative integer, $n_t = 0$ after $t = k \ln n$ iterations. Therefore, the greedy algorithm uses at most $k \ln n$ sets, as desired.
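The recurrence in the proof can be checked numerically. The sketch below is an illustration only: it simulates a run in which every iteration covers exactly the guaranteed minimum of $\lceil n_t/k \rceil$ uncovered elements and counts the iterations until nothing is left. Because the iteration count is an integer, the comparison uses $\lceil k \ln n \rceil$; the function name is made up for this example.

```python
from math import ceil, log


def slowest_run(n, k):
    """Number of iterations until n_t reaches 0 when each iteration
    covers only ceil(n_t / k) of the n_t uncovered elements."""
    t = 0
    while n > 0:
        n -= -(-n // k)   # ceil(n / k) using integer arithmetic
        t += 1
    return t


n, k = 6, 3               # sizes from the example: 6 elements, optimum 3 sets
print(slowest_run(n, k), ceil(k * log(n)))
```

For the 6-element example with $k = 3$, this pessimistic run takes 4 iterations, within the bound $\lceil 3 \ln 6 \rceil = 6$.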
Can We Do Better? Our algorithm achieves an approximation ratio of $\ln n$. This raises two questions: Can the analysis be tightened so that greedy achieves a better approximation ratio? Are there more sophisticated algorithms that achieve a better approximation ratio? Answer to both: most likely not. If some efficient algorithm can do much better, then we can solve a whole host of very difficult problems efficiently.