Part 3: Randomized Algorithms and Probabilistic Analysis Prof. Dr. Institut für Informatik Winter 2013/14
Efficient Algorithms

When is an algorithm considered efficient?

Engineer: The algorithm must be efficient in practice, i.e., it must solve practical instances in an appropriate amount of time.

Theoretician: The algorithm must be efficient in the worst case, i.e., it must solve all instances in polynomial time.
The Knapsack Problem

Knapsack problem (KP)
Input: a set of items {1, ..., n}, profits p_1, ..., p_n, weights w_1, ..., w_n, capacity b.
Goal: Find a subset of items that fits into the knapsack and maximizes the profit.

Formal description:
maximize p_1 x_1 + ... + p_n x_n
subject to w_1 x_1 + ... + w_n x_n ≤ b and x_i ∈ {0, 1}
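As an illustration (not part of the lecture itself), the 0/1 formulation above can be solved exactly by the textbook dynamic program over capacities; the small instance below is made up:

```python
def knapsack(profits, weights, b):
    """Classical DP for 0/1 knapsack: best[c] = maximum profit achievable
    with total weight at most c. Runs in O(n * b) time, i.e. pseudo-polynomial,
    which is consistent with KP being NP-hard."""
    best = [0] * (b + 1)
    for p, w in zip(profits, weights):
        # iterate capacities downwards so each item is used at most once
        for c in range(b, w - 1, -1):
            best[c] = max(best[c], best[c - w] + p)
    return best[b]

# toy instance: items 2 and 3 together (weight 3 + 2 = 5) beat item 1 alone
print(knapsack([10, 7, 4], [5, 3, 2], 5))  # -> 11
```

The inner loop runs backwards over capacities; running it forwards would turn this into the unbounded knapsack variant, where each item may be taken repeatedly.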
Different Opinions

Theorists say...
KP is NP-hard. An FPTAS exists. There is no efficient algorithm for KP, unless P = NP.

Engineers say...
KP is easy to solve! It does not even require quadratic time. There are very good heuristics for practical instances of KP.
Reason for Discrepancy

Worst-case complexity is too pessimistic! There are (artificial) worst-case instances for KP on which the heuristics are not efficient. These instances, however, do not occur in practice. This phenomenon occurs not only for KP, but also for many other problems.

How can we make theory more consistent with practice? Find a more realistic performance measure.

Average-case analysis: Is it realistic to consider the average-case behavior instead of the worst-case behavior?
Random Inputs are not Typical

Random inputs are not typical! If real-world data were random, watching TV would be very boring...
What is a Typical Instance?

[Figure: four example instances, profits plotted against weights]

It depends very much on the concrete application. We cannot say in general what a typical instance for KP looks like.
Smoothed Analysis

Step 1: Adversary chooses input I.
Step 2: Random perturbation: I → per(I).

Smoothed Complexity = worst expected running time the adversary can achieve

Why do we consider this model? The first step alone would be worst-case analysis. The second step models random influences, e.g., measurement errors, numerical imprecision, rounding, ... So we have a combination of both: instances of any structure with some amount of random noise.
Perturbation Example

Step 1: Adversary chooses all p_i, w_i ∈ [0, 1] arbitrarily.
Step 2: Add an independent Gaussian random variable to each profit and weight.

[Figure: Gaussian densities for σ = 0.5, 1, 2; the smaller σ is, the more concentrated the noise (σ_1 > σ_2)]
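The two-step model can be written down directly. A minimal sketch using only the standard library; the function name and the toy instance are illustrative, not taken from the lecture:

```python
import random

def perturb(profits, weights, sigma):
    """Step 2 of the model: add independent Gaussian noise N(0, sigma^2)
    to every profit and weight of the adversarial instance from step 1."""
    noisy_p = [p + random.gauss(0.0, sigma) for p in profits]
    noisy_w = [w + random.gauss(0.0, sigma) for w in weights]
    return noisy_p, noisy_w

# step 1: adversary picks an arbitrary instance with entries in [0, 1]
profits = [0.5, 0.5, 0.5]
weights = [0.5, 0.5, 0.5]

# step 2: with small sigma the instance stays close to the adversarial one
noisy_p, noisy_w = perturb(profits, weights, sigma=0.01)
print(noisy_p)
```

With a large σ the adversary's choice is almost irrelevant (close to average-case analysis); with σ → 0 the model degenerates to worst-case analysis.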
Complexity Measures

C_A(I) = running time of algorithm A on input I
X_n = set of inputs of length n, µ_n = probability distribution on X_n

C_A^worst(n) = max_{I ∈ X_n} C_A(I)
C_A^ave(n) = E_{I ~ µ_n}[C_A(I)]
C_A^smooth(n, σ) = max_{I ∈ X_n} E[C_A(per_σ(I))]

[Figure: running time C_A(I) across instances I, with horizontal lines for C_A^ave(n), C_A^smooth(n, σ_1), C_A^smooth(n, σ_2), and C_A^worst(n); for σ_2 < σ_1, less noise pushes the smoothed complexity closer to the worst case]

Smoothed complexity low ⇒ bad instances are isolated peaks.
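The intuition "bad instances are isolated peaks" can be seen on a toy cost function; everything below (the cost function, the peak, the Monte Carlo estimator) is illustrative and not part of the formal model:

```python
import random

def cost(x):
    """Toy 'running time': expensive only in a tiny interval around 0,
    cheap everywhere else, i.e. the worst case is an isolated peak."""
    return 1000 if abs(x) < 1e-9 else 1

def smoothed_cost(x, sigma, samples=1000):
    """Monte Carlo estimate of E[cost(per_sigma(x))] for one adversarial x."""
    return sum(cost(x + random.gauss(0.0, sigma)) for _ in range(samples)) / samples

worst = cost(0.0)                         # adversary hits the peak exactly
smoothed = smoothed_cost(0.0, sigma=0.1)  # after noise, the peak is almost never hit
print(worst, smoothed)
```

Here C^worst is 1000, but even when the adversary aims exactly at the peak, the perturbed input lands on it with probability near zero, so the smoothed cost stays close to 1.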
Linear Programming

Linear Programs (LPs)
variables: x_1, ..., x_n ∈ R
linear objective function: max c^T x = c_1 x_1 + ... + c_n x_n
m linear constraints:
a_{1,1} x_1 + ... + a_{1,n} x_n ≤ b_1
...
a_{m,1} x_1 + ... + a_{m,n} x_n ≤ b_m

Complexity of LPs
LPs can be solved in polynomial time by the ellipsoid method [Khachiyan 1979] and the interior point method [Karmarkar 1984].
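For two variables, the fact that an optimum (when it exists) is attained at a vertex of the feasible polytope can be checked by brute-force vertex enumeration. This sketch is purely illustrative (it is far slower than simplex and assumes a bounded, feasible instance); the example LP is made up:

```python
from itertools import combinations

def solve_2d_lp(c, A, b, eps=1e-9):
    """Maximize c^T x subject to A x <= b (two variables) by enumerating
    candidate vertices: intersections of pairs of constraint boundaries."""
    best, best_x = None, None
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < eps:
            continue  # parallel boundaries: no unique intersection point
        # Cramer's rule for the 2x2 system a1·x = b1, a2·x = b2
        x = ((b1 * a2[1] - b2 * a1[1]) / det,
             (a1[0] * b2 - a2[0] * b1) / det)
        # keep the intersection only if it satisfies all constraints
        if all(a[0] * x[0] + a[1] * x[1] <= bb + eps for a, bb in zip(A, b)):
            val = c[0] * x[0] + c[1] * x[1]
            if best is None or val > best:
                best, best_x = val, x
    return best, best_x

# max x1 + x2  s.t.  x1 <= 1, x2 <= 1, x1 >= 0, x2 >= 0
val, x = solve_2d_lp([1, 1], [[1, 0], [0, 1], [-1, 0], [0, -1]], [1, 1, 0, 0])
print(val, x)  # optimum 2 at the vertex (1, 1)
```

With m constraints this enumerates O(m^2) candidate vertices and checks each against all constraints; the simplex algorithm instead walks from vertex to neighboring vertex.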
Simplex Algorithm

Start at some vertex of the polytope. Walk along the edges of the polytope in the direction of the objective function c^T x. Local optimum = global optimum.

Pivot Rules
Which vertex is chosen if there are multiple options? Different pivot rules have been suggested: random, steepest descent, shadow vertex pivot rule, ...
Simplex Algorithm: Shadow Vertex Pivot Rule

Let x_0 be some vertex of the polytope. Compute u ∈ R^d such that x_0 maximizes u^T x. Project the polytope onto the plane spanned by c and u. Start at x_0 and follow the edges of the shadow.

The shadow is a polygon. Both x_0 and the optimum x* are vertices of the shadow. Edges of the shadow correspond to edges of the polytope.
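The "shadow" is simply the image of the polytope under orthogonal projection onto span{c, u}. A small sketch projecting the vertices of the unit 3-cube; the particular choice of c and u is arbitrary and only for illustration:

```python
import itertools
import math

def project_to_plane(points, c, u):
    """Orthogonal projection onto the plane spanned by c and u:
    Gram-Schmidt turns (c, u) into an orthonormal basis (e1, e2),
    then each point maps to its coordinates (p . e1, p . e2)."""
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    def scale(a, s): return [x * s for x in a]
    def sub(a, b): return [x - y for x, y in zip(a, b)]
    e1 = scale(c, 1 / math.sqrt(dot(c, c)))
    v = sub(u, scale(e1, dot(u, e1)))  # remove the component of u along e1
    e2 = scale(v, 1 / math.sqrt(dot(v, v)))
    return [(dot(p, e1), dot(p, e2)) for p in points]

# vertices of the unit 3-cube, projected onto span{c, u}
cube = list(itertools.product([0, 1], repeat=3))
shadow = project_to_plane(cube, c=[1, 1, 0], u=[0, 0, 1])
print(shadow)
```

The convex hull of these eight projected points is the shadow polygon; the shadow vertex rule walks along its boundary, and each of its edges lifts back to an edge of the original polytope.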
Simplex Algorithm: Running Time

Theoreticians say...
the shadow vertex pivot rule requires an exponential number of steps in the worst case
no pivot rule with a sub-exponential number of steps is known
ellipsoid and interior point methods are efficient

Engineers say...
the simplex method is usually the fastest algorithm in practice
it usually requires only Θ(m) steps
it clearly outperforms the ellipsoid method
Smoothed Analysis

Observation: In worst-case analysis, the adversary is too powerful.
Idea: Let's weaken him!

Perturbed LPs
Step 1: Adversary specifies an arbitrary LP: max c^T x subject to a_1^T x ≤ b_1, ..., a_n^T x ≤ b_n. W.l.o.g. ‖(a_i, b_i)‖ = 1.
Step 2: Add a Gaussian random variable with standard deviation σ to each coefficient in the constraints.

Smoothed Running Time = worst expected running time the adversary can achieve
Smoothed Analysis

Step 1: Adversary chooses input I.
Step 2: Random perturbation: I → per_σ(I).

Formal Definition:
LP(n, m) = set of LPs with n variables and m constraints
T(I) = number of steps of the simplex method on LP I
smoothed run time: T^smooth(n, m, σ) = max_{I ∈ LP(n,m)} E[T(per_σ(I))]

Why do we consider this model? The first step models the unknown structure of the input. The second step models random influences, e.g., measurement errors, numerical imprecision, rounding, ... Smoothed running time low ⇒ bad instances are unlikely to occur. σ determines the amount of randomness.
Smoothed Analysis of the Simplex Algorithm

Lemma [Spielman and Teng (STOC 2001)]
For every fixed plane and every LP the adversary can choose, after the perturbation the expected number of edges on the shadow is O(poly(n, m, σ⁻¹)).

Theorem [Spielman and Teng (STOC 2001)]
The smoothed running time of the simplex algorithm with shadow vertex pivot rule is O(poly(n, m, σ⁻¹)). Already for small perturbations, exponential running time is unlikely.

Main Difficulties in the Proof of the Theorem:
The starting vertex x_0 is found in phase I, so its coefficients do not follow a Gaussian distribution.
In phase II, the plane onto which the polytope is projected is not independent of the perturbations.
Improved Analysis

Theorem [Vershynin (FOCS 2006)]
The smoothed number of steps of the simplex algorithm with shadow vertex pivot rule is O(poly(n, log m, σ⁻¹)), i.e., only polylogarithmic in the number of constraints m.

Phase I: add a vertex x_0 in a random direction. With constant probability this does not change the optimal solution, and the projection plane is not correlated with the perturbed polytope. With high probability, no angle between consecutive vertices of the shadow is too small.
Overview of the coming Lectures Smoothed Analysis 2-Opt heuristic for the traveling salesperson problem Nemhauser/Ullmann algorithm for the knapsack problem