TU/e Algorithms (2IL15) Lecture 2: THE GREEDY METHOD



Optimization problems

- for each instance there are (possibly) multiple valid solutions
- the goal is to find an optimal solution
- minimization problem: associate a cost to every solution, find a min-cost solution
- maximization problem: associate a profit to every solution, find a max-profit solution

Techniques for optimization

Optimization problems typically involve making choices.

- backtracking: just try all solutions. Can be applied to almost all problems, but gives very slow algorithms. Try all options for the first choice; for each option, recursively make the other choices.
- greedy algorithms: construct the solution iteratively, always making the choice that seems best. Can be applied to few problems, but gives fast algorithms. Only try the option that seems best for the first choice (the greedy choice), then recursively make the other choices.
- dynamic programming: in between: not as fast as greedy, but works for more problems.

Algorithms for optimization: how to improve on backtracking (for greedy algorithms)

1. Try to discover the structure of optimal solutions: what properties do optimal solutions have? What are the choices that need to be made? Do we have optimal substructure (optimal solution = first choice + optimal solution for subproblem)? Do we have the greedy-choice property for the first choice?
2. Prove that optimal solutions indeed have these properties: prove optimal substructure and the greedy-choice property.
3. Use these properties to design an algorithm and prove its correctness: proof by induction (possible because of optimal substructure).

Today: two examples of greedy algorithms

- Activity-Selection
- Optimal text encoding

Activity-Selection Problem

Input: a set A = {a_1, ..., a_n} of n activities; for each activity a_i a start time start(a_i) and a finishing time end(a_i).
Valid solution: any subset of non-overlapping activities.
Optimal solution: a valid solution with the maximum number of activities.

What are the choices? What properties does an optimal solution have?

For each activity, do we select it or not? It is better to look at it differently:

What is the first activity in an optimal solution, what is the second activity, etc.?

Do we have optimal substructure (optimal solution = first choice + optimal solution for subproblem)? Yes: optimal solution = first activity + optimal selection from the activities that do not overlap the first activity.

Proof of optimal substructure

Lemma: Let a_i be the first activity in an optimal solution OPT for A. Let B be the set of activities in A that do not overlap a_i, and let S be an optimal solution for the set B. Then S ∪ {a_i} is an optimal solution for A.

Proof. First note that S ∪ {a_i} is a valid solution for A. Second, note that OPT \ {a_i} is a subset of non-overlapping activities from B. Hence, by definition of S we have size(S) ≥ size(OPT \ {a_i}), which implies that S ∪ {a_i} is an optimal solution for A.

Do we have the greedy-choice property: can we select the first activity greedily and still get an optimal solution? Yes: greedy choice = the activity that ends first.

A = {a 1,, a n }: set of n activities Lemma: Let a i be an activity in A that ends first. Then there is an optimal solution to the Activity-Selection Problem for A that includes a i. Proof. General structure of all proofs for greedy-choice property: take optimal solution if OPT contains greedy choice, then done otherwise modify OPT so that it contains greedy choice, without decreasing the quality of the solution 11

Lemma: Let a_i be an activity in A that ends first. Then there is an optimal solution to the Activity-Selection Problem for A that includes a_i.

Proof. Let OPT be an optimal solution for A. If OPT includes a_i then the lemma obviously holds, so assume OPT does not include a_i. We will show how to modify OPT into a solution OPT* such that
(i) OPT* is a valid solution,
(ii) OPT* includes a_i,
(iii) size(OPT*) ≥ size(OPT).
Thus OPT* is an optimal solution including a_i, and so the lemma holds. To modify OPT we proceed as follows. [Here comes the modification, which is problem-specific.]

(This is standard text you can basically use in the proof of any greedy-choice property; for a general quality measure, (iii) becomes quality(OPT*) ≥ quality(OPT).)

How to modify OPT? Replace the first activity in OPT by the greedy choice.

Lemma: Let a_i be an activity in A that ends first. Then there is an optimal solution to the Activity-Selection Problem for A that includes a_i.

Proof. [...] We show how to modify OPT into a solution OPT* such that (i) OPT* is a valid solution, (ii) OPT* includes a_i, and (iii) size(OPT*) ≥ size(OPT). [...] To modify OPT we proceed as follows. Let a_k be the activity in OPT that ends first, and let OPT* = (OPT \ {a_k}) ∪ {a_i}. Then OPT* includes a_i and size(OPT*) = size(OPT). We have end(a_i) ≤ end(a_k) by definition of a_i, so a_i cannot overlap any activities in OPT \ {a_k}. Hence, OPT* is a valid solution.

And now the algorithm:

Algorithm Greedy-Activity-Selection(A)
1. if A is empty
2.   then return A
3.   else a_i ← an activity from A that ends first
4.        B ← all activities from A that do not overlap a_i
5.        return {a_i} ∪ Greedy-Activity-Selection(B)

Correctness: by induction, using optimal substructure and the greedy-choice property.
Running time: O(n²) if implemented naively; O(n) after sorting on finishing time, if implemented more cleverly.
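The recursive pseudocode above can be turned into the faster iterative version it hints at: sort once by finishing time, then scan. A sketch in Python (the (start, end) tuple representation, and the convention that touching intervals do not overlap, are assumptions not fixed by the slides):

```python
# Iterative greedy activity selection: sort by finishing time, then repeatedly
# take the first activity whose start is not before the end of the last
# selected one. Sorting dominates the running time: O(n log n) overall.

def greedy_activity_selection(activities):
    selected = []
    last_end = float("-inf")
    for start, end in sorted(activities, key=lambda a: a[1]):
        if start >= last_end:  # does not overlap the previously chosen activity
            selected.append((start, end))
            last_end = end
    return selected
```

For example, on the instance [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)] the scan selects the four activities (1, 4), (5, 7), (8, 11), (12, 16).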


Optimal text encoding

Standard text encoding schemes use a fixed number of bits per character:
- ASCII: 7 bits (extended versions 8 bits)
- UCS-2 (Unicode): 16 bits

Can we do better using variable-length encoding? Idea: give characters that occur frequently a short code, and give characters that do not occur frequently a longer code.

The encoding problem

Input: a set C of n characters c_1, ..., c_n; for each character c_i its frequency f(c_i).
Output: a binary code for each character.

Variable-length encoding: how do we know where characters end? Suppose text = 0100101100, with code(c_1) = 01001 and code(c_2) = 010 (not a prefix code). Does the text start with c_1 = 01001 or c_2 = 010 or ...?

Solution: use a prefix code: no character code is a prefix of another character code.

Variable-length prefix encoding: can it help?

Text: "een voordeel"
Frequencies: f(e)=4, f(n)=1, f(v)=1, f(o)=2, f(r)=1, f(d)=1, f(l)=1, f(␣)=1 (␣ denotes the space character)

Fixed-length code: e=000, n=001, v=010, o=011, r=100, d=101, l=110, ␣=111
Length of encoded text: 12 × 3 = 36 bits.

Possible prefix code: e=00, n=0110, v=0111, o=010, r=100, d=101, l=110, ␣=111
Length of encoded text: 4×2 + 2×4 + 6×3 = 34 bits.
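The two bit counts above can be checked mechanically: the encoded length is just frequency times code-word length, summed over the characters. A sketch in Python ("SP" standing in for the space character is an assumption of this snippet):

```python
# Total encoded length of "een voordeel" under the fixed-length code (every
# code word 3 bits) and the prefix code from the slide (code-word lengths per
# character).

freq = {"e": 4, "n": 1, "v": 1, "o": 2, "r": 1, "d": 1, "l": 1, "SP": 1}

fixed_len = {c: 3 for c in freq}                 # fixed-length code: 3 bits each
prefix_len = {"e": 2, "n": 4, "v": 4, "o": 3,
              "r": 3, "d": 3, "l": 3, "SP": 3}   # lengths of the prefix code

def encoded_length(freq, code_len):
    return sum(freq[c] * code_len[c] for c in freq)

print(encoded_length(freq, fixed_len))   # 36 bits
print(encoded_length(freq, prefix_len))  # 34 bits
```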

Representing prefix codes

Code: e=00, n=0110, v=0111, o=010, r=100, d=101, l=110, ␣=111

[Figure: binary tree with edges labeled 0 (left) and 1 (right); leaves e, o, n, v, r, d, l, ␣]

The representation is a binary tree T:
- one leaf for each character
- internal nodes always have two outgoing edges, labeled 0 and 1
- the code of a character: follow the path to its leaf and list the bits

The codes represented by such trees are exactly the non-redundant prefix codes.
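The path-to-leaf rule can be made concrete with a tiny tree walk. A sketch, assuming a nested-pair representation (internal node = (left, right) tuple, leaf = the character itself; "SP" stands in for the space character):

```python
# The tree for the prefix code above, left edge = 0, right edge = 1.
tree = (("e", ("o", ("n", "v"))), (("r", "d"), ("l", "SP")))

def codes_from_tree(node, prefix=""):
    """Walk the tree, collecting the root-to-leaf bit string for each character."""
    if isinstance(node, str):  # leaf: the path followed so far is the code
        return {node: prefix}
    left, right = node
    codes = codes_from_tree(left, prefix + "0")
    codes.update(codes_from_tree(right, prefix + "1"))
    return codes

print(codes_from_tree(tree))
# {'e': '00', 'o': '010', 'n': '0110', 'v': '0111',
#  'r': '100', 'd': '101', 'l': '110', 'SP': '111'}
```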

Writing the frequencies at the leaves of the tree, the cost of the encoding represented by T is

  cost(T) = Σ_i f(c_i) · depth(c_i)

where depth(c_i) is the depth of the leaf of c_i in T, which equals the length of its code word.


Bottom-up construction of the tree: start with separate leaves, and then merge n-1 times until we have the tree. The choices: which subtrees to merge at every step. Note that we do not have to merge adjacent leaves.

Do we have optimal substructure? Do we even have a subproblem of the same type? Yes: after merging, replace the merged leaves c_i, c_k by a single leaf b with f(b) = f(c_i) + f(c_k). (Another way of looking at it: the problem is about merging weighted subtrees.)

Lemma: Let c_i and c_k be siblings in an optimal tree for a set C of characters. Let B = (C \ {c_i, c_k}) ∪ {b}, where f(b) = f(c_i) + f(c_k), and let T_B be an optimal tree for B. Then replacing the leaf for b in T_B by an internal node with c_i, c_k as children results in an optimal tree for C.

Proof. Do yourself.

Do we have a greedy-choice property? Which leaves should we merge first? Greedy choice: first merge the two leaves with the smallest character frequency.

Lemma: Let c i, c k be two characters with the lowest frequency in C. Then there is an optimal tree T OPT for C where c i, c k are siblings. Proof. Let OPT be an optimal tree T OPT for C. If c i, c k are siblings in T OPT then the lemma obviously holds, so assume this is not the case. We will show how to modify T OPT into a tree T* such that (i) T* is a valid tree (ii) c i, c k are siblings in T* (iii) cost(t*) cost(t OPT ) standard text you can basically use in proof for any greedy-choice property Thus T* is an optimal tree in which c i, c k are siblings, and so the lemma holds. To modify T OPT we proceed as follows. now we have to do the modification 27

How to modify T OPT? T OPT T* c k c m c i v depth = d 1 c s c s c m depth = d 2 c i c k take a deepest internal node v make c i, c k children of v by swapping them with current children (if necessary) change in cost due to swapping c i and c s cost (T OPT ) cost (T*) = f (c s ) (d 2 d 1 ) + f (c i ) (d 1 d 2 ) = ( f (c s ) f (c i ) ) (d 2 d 1 ) 0 Conclusion: T* is valid tree where c i, c k are siblings and cost(t*) cost (T OPT ). 28

Algorithm Construct-Huffman-Tree(C: set of n characters)
1. if |C| = 1
2.   then return a tree consisting of a single leaf, storing the character in C
3.   else c_i, c_k ← two characters from C with the lowest frequency
4.        Remove c_i, c_k from C, and replace them by a new character b with f(b) = f(c_i) + f(c_k). Let B denote the new set of characters.
5.        T_B ← Construct-Huffman-Tree(B)
6.        Replace the leaf for b in T_B with an internal node with c_i, c_k as children.
7.        Let T be the new tree.
8.        return T

Correctness: by induction, using optimal substructure and the greedy-choice property.
Running time: O(n²)?! O(n log n) if implemented smartly (use a heap); sorting + O(n) if implemented even smarter (hint: 2 queues).
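The O(n log n) heap-based variant mentioned above can be sketched in Python with the standard-library heapq module (the nested-pair tree representation, "SP" for the space character, and the integer tie-breaker are assumptions of this sketch):

```python
import heapq
from itertools import count

# Construct-Huffman-Tree with a min-heap: repeatedly pop the two
# lowest-frequency subtrees and push their merge. A leaf is a character,
# an internal node a (left, right) tuple. The running counter breaks
# frequency ties so that heapq never compares the tree tuples themselves.

def huffman_tree(freq):
    tiebreak = count()
    heap = [(f, next(tiebreak), char) for char, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # two subtrees with lowest frequency
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))
    return heap[0][2]

def cost(node, freq, depth=0):
    """Sum of f(c) * depth(c) over all leaves, i.e. the encoded text length."""
    if isinstance(node, str):
        return freq[node] * depth
    left, right = node
    return cost(left, freq, depth + 1) + cost(right, freq, depth + 1)

freq = {"e": 4, "n": 1, "v": 1, "o": 2, "r": 1, "d": 1, "l": 1, "SP": 1}
print(cost(huffman_tree(freq), freq))  # 34: the optimal cost for "een voordeel"
```

The exact tree shape depends on how ties are broken, but every Huffman tree for these frequencies has the same optimal cost of 34 bits, matching the prefix code given earlier.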

Summary

A greedy algorithm solves an optimization problem by trying only one option for the first choice (the greedy choice) and then solving the subproblem recursively.

Needed: optimal substructure + greedy-choice property.

Proof of the greedy-choice property: show that an optimal solution can be modified such that it uses the greedy choice.