CS 161: Design and Analysis of Algorithms

Announcements
Homework 3, problem 3 has been removed.

Greedy Algorithms 4: Huffman Encoding / Set Cover
Huffman Encoding
Set Cover

Alphabets and Strings
Alphabet = finite set of symbols
  English alphabet = {a, b, c, ..., z}
  Hex values = {0, 1, ..., 9, A, B, C, D, E, F}
String = sequence of symbols from some alphabet
  Example: "This is a string"

How to Encode
Computers store things as 0s and 1s.
How do we encode strings as sequences of bits?
  Must be invertible (one-to-one)
  Want to use as few bits as possible
One approach: choose an encoding for characters, then induce an encoding of strings by concatenating the codes for each character.

How to Encode
Obvious solution: if the alphabet size is $2^k$ for some k, encode each character using k bits.
  Each character takes k bits
  n characters take kn bits total
Letter  Encoding
A       00
B       01
C       10
D       11
ABACBDAAADBAC -> 00010010011100000011010010
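The fixed-length scheme above can be sketched in a few lines of Python. This is only an illustration: the helper names `fixed_length_code` and `encode` are made up for this example and do not come from the lecture.

```python
def fixed_length_code(alphabet):
    """Give every symbol a k-bit codeword, where k is the smallest
    number of bits that can distinguish len(alphabet) symbols."""
    k = max(1, (len(alphabet) - 1).bit_length())
    return {sym: format(i, f"0{k}b") for i, sym in enumerate(alphabet)}


def encode(text, code):
    """Encode a string by concatenating the codeword of each character."""
    return "".join(code[ch] for ch in text)


code = fixed_length_code("ABCD")
bits = encode("ABACBDAAADBAC", code)
print(code)
print(bits, len(bits))  # 13 characters at 2 bits each
```

With a 3-letter alphabet the same function still hands out 2-bit codewords, which is the wasted-sequence issue raised on the next slide.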

How to Encode
Issues:
  Wasteful: if there are not exactly $2^k$ characters, some bit sequences are never used
Letter  Encoding
A       00
B       01
C       10
(11 is never used)

How to Encode
Issues:
  What if one character occurs very often?
  AAAAAAABAAACAABAADAAAAAAACAAAB
  If almost all letters are A's, then an encoding that uses fewer bits to represent A and more bits to represent everything else would save space.

Variable Length Encoding
Variable length encoding = encoding of characters as bits where different letters may use a different number of bits.
The encoding of strings still needs to be one-to-one. What does this say about the encoding of characters?

Variable Length Encoding
Letter  Encoding
A       0
B       01
C       10
D       11
AC -> 010
BA -> 010
Not one-to-one!
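To see the ambiguity concretely, the following sketch enumerates every string that encodes to a given bit sequence under the table above. The function `decodings` is a hypothetical brute-force helper written for this illustration, not part of the lecture.

```python
def decodings(bits, code):
    """Return every string over the alphabet whose encoding is exactly `bits`."""
    if not bits:
        return [""]
    results = []
    for sym, word in code.items():
        if bits.startswith(word):
            # Use this codeword for the first symbol, then decode the rest.
            rest = decodings(bits[len(word):], code)
            results += [sym + tail for tail in rest]
    return results


ambiguous_code = {"A": "0", "B": "01", "C": "10", "D": "11"}
candidates = decodings("010", ambiguous_code)
print(candidates)  # more than one candidate means the encoding is not one-to-one
```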

Prefix-Free Encoding
A prefix of a bit sequence is its first i bits, for some i.
Prefixes of 0100101101000110101 include: 0, 01, 010, 0100, 01001

Prefix-Free Encoding
A prefix-free encoding is an encoding of an alphabet such that no encoding of any character is a prefix of the encoding of any other character.
Letter  Encoding
A       0
B       01
C       10
D       11
Not prefix-free: the encoding of A (0) is a prefix of the encoding of B (01).

Prefix-Free Encoding
A prefix-free encoding is an encoding of an alphabet such that no encoding of any character is a prefix of the encoding of any other character.
Letter  Encoding
A       0
B       10
C       110
D       111
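The definition can be checked mechanically. The sketch below is a straightforward quadratic check over all ordered pairs of codewords; `is_prefix_free` is an illustrative name, and a faster check is possible but not needed for small alphabets.

```python
from itertools import permutations


def is_prefix_free(code):
    """True if no codeword is a prefix of a different character's codeword."""
    words = list(code.values())
    # permutations yields ordered pairs of distinct positions, so a pair of
    # identical codewords for two characters is also reported as a violation.
    return not any(b.startswith(a) for a, b in permutations(words, 2))


not_prefix_free = {"A": "0", "B": "01", "C": "10", "D": "11"}
prefix_free = {"A": "0", "B": "10", "C": "110", "D": "111"}
print(is_prefix_free(not_prefix_free), is_prefix_free(prefix_free))
```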

Prefix-Free Encoding
Theorem: Any prefix-free encoding of an alphabet induces a one-to-one encoding of strings over that alphabet.

Prefix-Free Encoding
Proof: Suppose toward contradiction that S and T are two different strings that map to the same sequence of bits.
Assume w.l.o.g. that S and T differ on the first character.
Let c be the first character of S and d the first character of T.
Let E(c) and E(d) be the encodings of c and d.
Assume w.l.o.g. that $|E(c)| \ge |E(d)|$.

Prefix-Free Encoding
Since all bits in the encodings of S and T are the same, their first $|E(d)|$ bits are the same.
Therefore, the first $|E(d)|$ bits of E(c) are equal to E(d), so E(d) is a prefix of E(c).
Since c was assumed different from d, our encoding is not prefix-free, a contradiction.

Tree View of Prefix-Free Encoding
Every node represents a partial codeword.
Every internal node has two children: one for appending 0 to the partial codeword, one for appending 1.
Leaves correspond to actual codewords.
The root is the empty codeword.

Tree View of Prefix-Free Encoding
(root)
  0: leaf A = 0
  1: node 1
    0: node 10
      0: leaf C = 100
      1: leaf D = 101
    1: leaf B = 11

Tree View of Prefix-Free Encoding
To encode: find the path from the root to the character and concatenate the edge labels.
To decode $b_1 b_2 \ldots$: starting from the root, follow the edge labeled $b_1$, then the edge labeled $b_2$, and so on, until we find a leaf. Output that character, and start over from the root.
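The decoding walk can be sketched with a tree stored as nested dictionaries. This representation is an assumption made for the example (internal nodes are dicts keyed by "0" and "1", leaves are single-character strings); `build_tree` and `decode` are illustrative names.

```python
def build_tree(code):
    """Build the encoding tree of a prefix-free code as nested dicts."""
    root = {}
    for sym, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = sym  # the last edge leads to the leaf
    return root


def decode(bits, root):
    """Follow edges from the root; at each leaf, output it and restart."""
    out, node = [], root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)


tree_code = {"A": "0", "B": "11", "C": "100", "D": "101"}
tree = build_tree(tree_code)
decoded = decode("011100101", tree)
print(decoded)
```

Because the code is prefix-free, the walk never has to look ahead or backtrack: the first leaf reached is the only possible next character.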

Optimal Encoding
What is the best possible prefix-free encoding we can find?
Let n be the length of the string.
Let C be the cost of the encoding, defined as (length of encoding)/n.
C = average length of the encoding of a character, weighted by frequency.

Optimal Encoding
Let $l_i$ be the length of the encoding of character i.
Let $f_i$ be the frequency with which i occurs in the string: $f_i = (\text{number of instances of } i)/n$.
$C = \sum_i f_i l_i$

Optimal Encoding
$l_i$ is also the depth of character i in the encoding tree.
An optimal encoding is always a full binary tree.
  If there is a node with only 1 child, replace the node with its child.
  The depth of the leaves only decreases.

Optimal Encoding
Entropy: $H = -\sum_i f_i \log f_i$ (logarithms are base 2)
Theorem (Shannon Coding Theorem): $C \ge H$
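As a numeric sanity check of the bound, the sketch below computes the cost and the entropy for one prefix-free code under two frequency tables. Both tables are invented for illustration: with the first the bound is tight, with the second it is strict.

```python
from math import log2


def cost(freqs, code):
    """Average codeword length, weighted by frequency: sum of f_i * l_i."""
    return sum(f * len(code[sym]) for sym, f in freqs.items())


def entropy(freqs):
    """H = -sum of f_i * log2(f_i), skipping characters that never occur."""
    return -sum(f * log2(f) for f in freqs.values() if f > 0)


code = {"A": "0", "B": "10", "C": "110", "D": "111"}
dyadic = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}  # assumed frequencies
skewed = {"A": 0.7, "B": 0.1, "C": 0.1, "D": 0.1}       # assumed frequencies

for name, freqs in [("dyadic", dyadic), ("skewed", skewed)]:
    print(name, cost(freqs, code), entropy(freqs))
```

When every frequency is a power of 1/2 and each codeword has length $\log(1/f_i)$, the cost equals the entropy exactly; otherwise the cost is strictly larger.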

Proof of Coding Theorem
Let $g(x) = x \log x$.
Lemma: $g\left(\frac{x+y}{2}\right) \le \frac{g(x)+g(y)}{2}$

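Proof of Coding Theorem
True when there are only 2 characters:
  The only possible encoding is for each character to get 1 bit, so $C = 1$.
  $H = -f_1 \log f_1 - f_2 \log f_2 = -g(f_1) - g(f_2) \le -2\,g\left(\frac{f_1+f_2}{2}\right) = -2\,g(1/2) = -2 \cdot \frac{1}{2} \cdot (-1) = 1$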

Proof of Coding Theorem
Inductively assume the theorem is true for m-1 characters.
Let T be the tree corresponding to an optimal encoding over some alphabet of m characters.
There are at least two leaves at the bottom level. Assume w.l.o.g. that these correspond to characters 1 and 2.
Replace all instances of characters 1 and 2 with a new character.
  It has frequency $f_1 + f_2$.

Proof of Coding Theorem
Now we have an alphabet of size m-1.
Encoding for the new alphabet: start with T and delete the nodes corresponding to characters 1 and 2.
Assign the new character to the parent of these nodes (which is now a leaf).
The new character has code length 1 less than the deleted characters.

Proof of Coding Theorem
How does C change?
  Removed character 1 with length $l$, frequency $f_1$
  Removed character 2 with length $l$, frequency $f_2$
  Added new character with length $l-1$, frequency $f_1 + f_2$
$C = \sum_i f_i l_i$
$C' = C - (f_1+f_2)\,l + (f_1+f_2)(l-1) = C - (f_1+f_2)$

Proof of Coding Theorem
By inductive assumption,
$C' \ge H' = -\sum_i f'_i \log f'_i = -\sum_{i \ge 3} f_i \log f_i - (f_1+f_2)\log(f_1+f_2)$
$= -\sum_i f_i \log f_i + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\log(f_1+f_2)$
$= H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\log(f_1+f_2)$
Recall $C = C' + f_1 + f_2$.

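Proof of Coding Theorem
$C \ge H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\left(\log(f_1+f_2) - 1\right)$
$= H + f_1 \log f_1 + f_2 \log f_2 - (f_1+f_2)\log\frac{f_1+f_2}{2}$
$= H + 2\left(\frac{1}{2}\,g(f_1) + \frac{1}{2}\,g(f_2) - g\left(\frac{f_1+f_2}{2}\right)\right)$
$\ge H$ (by the lemma)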

How to Find Optimal Encoding
Claim 1: There is an optimal solution where the two least frequent characters have the longest codewords (i.e., they are at the lowest level of the tree) and their codewords are identical except for the last bit.
  If not, swap these two characters with two of the characters with the longest codewords.
  Can swap with two that are siblings.

How to Find Optimal Encoding
Assume the two lowest-frequency characters are 1 and 2.
What if we merge the two characters into a new character with frequency $f_1 + f_2$?
The new character gets the codeword obtained by dropping the last bit of the codeword for 1 or 2.

Merging Two Characters
(root)
  0: leaf A = 0
  1: node 1
    0: node 10
      0: node 100
        0: leaf C = 1000
        1: leaf E = 1001
      1: leaf D = 101
    1: leaf B = 11

Merging Two Characters
(root)
  0: leaf A = 0
  1: node 1
    0: node 10
      0: leaf CE = 100
      1: leaf D = 101
    1: leaf B = 11

How to Find Optimal Encoding
Claim 2: For any optimal encoding, the encoding obtained by merging characters 1 and 2 must be an optimal encoding for the reduced alphabet, where characters 1 and 2 are replaced with a new character of frequency $f_1 + f_2$.

How to Find Optimal Encoding
Before merging:
Character  Frequency  Codeword
A          f_1        0
B          f_2        11
C          f_3        1000
D          f_4        101
E          f_5        1001
After merging C and E:
Character  Frequency  Codeword
A          f_1        0
B          f_2        11
CE         f_3 + f_5  100
D          f_4        101

How to Find Optimal Encoding
Idea:
  Take the two characters with the lowest frequency
  Merge them
  Recursively solve the reduced problem
  Split the characters apart again

How to Find Optimal Encoding
Merging, one step at a time (codewords not yet assigned):

Start
Character      Frequency
A              0.45
B              0.25
C              0.10
D              0.15
E              0.05

After merging C and E
Character      Frequency
A              0.45
B              0.25
[CE]           0.15
D              0.15

After merging [CE] and D
Character      Frequency
A              0.45
B              0.25
[[CE]D]        0.30

After merging [[CE]D] and B
Character      Frequency
A              0.45
[[[CE]D]B]     0.55

After merging A and [[[CE]D]B]
Character      Frequency
[A[[[CE]D]B]]  1.00

How to Find Optimal Encoding
Splitting the characters apart again, one step at a time:

Start
Character      Frequency  Codeword
[A[[[CE]D]B]]  1.00       (empty)
Tree: a single node [A[[[CE]D]B]]

After splitting [A[[[CE]D]B]]
Character      Frequency  Codeword
A              0.45       0
[[[CE]D]B]     0.55       1
Tree: root -> (0: A, 1: [[[CE]D]B])

After splitting [[[CE]D]B]
Character      Frequency  Codeword
A              0.45       0
B              0.25       11
[[CE]D]        0.30       10
Tree: root -> (0: A, 1: (0: [[CE]D], 1: B))

After splitting [[CE]D]
Character      Frequency  Codeword
A              0.45       0
B              0.25       11
[CE]           0.15       100
D              0.15       101
Tree: root -> (0: A, 1: (0: (0: [CE], 1: D), 1: B))

After splitting [CE]
Character      Frequency  Codeword
A              0.45       0
B              0.25       11
C              0.10       1000
D              0.15       101
E              0.05       1001
Tree: root -> (0: A, 1: (0: (0: (0: C, 1: E), 1: D), 1: B))

How to Find Optimal Encoding
Let q be a heap of characters, ordered by frequency
For each character c: q.insert(c)
While q has at least two characters:
    c1 = q.deletemin(), c2 = q.deletemin()
    Create a node labeled [c1 c2] with children c1 and c2
    f([c1 c2]) = f(c1) + f(c2)
    q.insert([c1 c2])
Return q.deletemin()
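The pseudocode above can be turned into a short runnable sketch using Python's `heapq` module as the heap. A few choices here are assumptions made for the example rather than part of the lecture: merged nodes are stored as 2-tuples (so symbols themselves must not be tuples), a counter breaks ties between equal frequencies, and the left child gets bit 0.

```python
import heapq
from itertools import count


def huffman(freqs):
    """Map each symbol in `freqs` (symbol -> frequency) to a codeword."""
    tiebreak = count()  # unique integers, so tree nodes are never compared
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) >= 2:
        f1, _, t1 = heapq.heappop(heap)  # two lowest-frequency nodes
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))
    _, _, tree = heap[0]

    codes = {}

    def assign(node, prefix):
        if isinstance(node, tuple):  # internal node: recurse into children
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:  # leaf; a one-symbol alphabet still needs one bit
            codes[node] = prefix or "0"

    assign(tree, "")
    return codes


freqs = {"A": 0.45, "B": 0.25, "C": 0.10, "D": 0.15, "E": 0.05}
codes = huffman(freqs)
print(codes)
print(sum(f * len(codes[s]) for s, f in freqs.items()))  # cost C
```

The exact bit patterns depend on tie-breaking and on which child is labeled 0, so they may differ from the slides; the codeword lengths, and therefore the cost, are the same.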

Running Time
n inserts initially: O(n log n)
Every run of the loop decreases the size of the heap by 1, so there are n-1 runs of the loop
Each run of the loop involves 3 heap operations: O(log n)
Total running time: O(n log n)

Set Cover
Given a set of elements B and a collection of subsets $S_i$ of B, output a selection of the $S_i$ whose union is B, such that the number of subsets used is minimal.

Example: Schools
Suppose we have a collection of towns, and we want to figure out the best towns in which to put schools.
  Need at least one school within 20 miles of each town
  Every school should be in a town

Example: Schools
B = set of towns
$S_i$ = subset of towns within 20 miles of town i

Greedy Solution
Obvious solution: repeatedly pick the set $S_i$ with the largest number of uncovered elements.

Example
B = {1, 2, 3, 4, 5, 6}
S_1 = {1, 2, 3}
S_2 = {1, 4}
S_3 = {2, 5}
S_4 = {3, 6}

Greedy Algorithm
Iteration  Sets used             Elements left
0          {}                    {1, 2, 3, 4, 5, 6}
1          {S_1}                 {4, 5, 6}
2          {S_1, S_2}            {5, 6}
3          {S_1, S_2, S_3}       {6}
4          {S_1, S_2, S_3, S_4}  {}

Optimal: {S_2, S_3, S_4}
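The greedy rule can be sketched directly in Python and run on the instance above. The function name `greedy_set_cover` and the dictionary representation of the subsets are choices made for this illustration; ties are broken in favor of the set listed first, which happens to reproduce the trace shown.

```python
def greedy_set_cover(universe, subsets):
    """Repeatedly pick the subset covering the most uncovered elements.

    `subsets` maps a name to a set; returns the chosen names in order.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # max() returns the first maximal item, so ties go to the earliest set.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("the subsets do not cover the universe")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


B = {1, 2, 3, 4, 5, 6}
S = {"S1": {1, 2, 3}, "S2": {1, 4}, "S3": {2, 5}, "S4": {3, 6}}
chosen = greedy_set_cover(B, S)
print(chosen)
```

Greedy uses four sets here even though S2, S3, and S4 alone already cover B.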

Set Cover
The greedy algorithm isn't optimal!
Obtaining an optimal solution is believed to be hard.
Settle for approximation: if the optimal solution uses k sets, we want a solution using only slightly more than k sets.

Approximation
Claim: If B contains n elements and the optimal solution uses k sets, then greedy uses at most $k \ln n$ sets.

Proof
Let $n_t$ be the number of uncovered elements after t iterations of the greedy algorithm ($n_0 = n$).
The remaining elements are covered by the optimal k sets.
  So there must be some set containing at least $n_t/k$ of the uncovered elements.
Therefore, greedy picks a set that covers at least $n_t/k$ of the remaining elements.

Proof
Greedy picks a set that covers at least $n_t/k$ of the remaining elements.
$n_{t+1} \le n_t - n_t/k = n_t(1 - 1/k)$
Therefore, $n_t \le n_0(1 - 1/k)^t = n(1 - 1/k)^t$

Proof
Fact: $1 - x \le e^{-x}$, with equality if and only if $x = 0$

Proof
For $t \ge 1$: $n_t \le n(1 - 1/k)^t < n(e^{-1/k})^t = n e^{-t/k}$
After $t = k \ln n$ iterations, $n_t < n e^{-\ln n} = 1$
Since $n_t$ is a nonnegative integer, $n_t = 0$ after $t = k \ln n$ iterations.
Therefore, the greedy algorithm uses at most $k \ln n$ sets, as desired.
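To make the chain of inequalities concrete, the following sketch tabulates both upper bounds on $n_t$ for n = 6 and k = 3, the values from the earlier worked example. The helper `uncovered_bounds` is hypothetical, and rounding $k \ln n$ up to an integer number of iterations is an assumption of this example.

```python
from math import ceil, exp, log


def uncovered_bounds(n, k):
    """Rows (t, n(1-1/k)^t, n*e^(-t/k)) for t = 0 .. ceil(k ln n)."""
    steps = ceil(k * log(n))
    return [(t, n * (1 - 1 / k) ** t, n * exp(-t / k)) for t in range(steps + 1)]


rows = uncovered_bounds(6, 3)
for t, exact, relaxed in rows:
    print(t, round(exact, 3), round(relaxed, 3))
```

By the last row the relaxed bound has dropped below 1, so no element can remain uncovered. On that instance greedy finished in 4 iterations, within the bound $k \ln n \approx 5.4$.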

Can We Do Better?
Our algorithm achieves an approximation ratio of $\ln n$.
This raises two questions:
  Can the analysis be tightened so that greedy achieves a better approximation ratio?
  Are there more sophisticated algorithms that achieve a better approximation ratio?
Answer to both: most likely not.
  If some efficient algorithm can do much better, then we can solve a whole host of very difficult problems efficiently.