Subexponential Lower Bounds for the Simplex Algorithm
Oliver Friedmann
Department of Computer Science, Ludwig-Maximilians-Universität Munich, Germany
January 20, 2011
Outline
1 LPs and Simplex algorithm
2 MDPs and Policy iteration
3 Random Edge
4 Random Facet and Least Entered
5 All is well that ends well?
LPs and Simplex algorithm
Outline
1 LPs and Simplex algorithm
  Linear programming
  Simplex algorithm
  Pivoting rules
  Technique
2 MDPs and Policy iteration
3 Random Edge
4 Random Facet and Least Entered
5 All is well that ends well?
Linear programming
Context
Optimization of a linear objective subject to linear (in)equality constraints
One of the most important computational problems
Simplex algorithm introduced by Dantzig in 1947 for solving LPs
Weakly polytime algorithms: Ellipsoid Method by Khachiyan, Interior Point Method by Karmarkar
No strongly polytime algorithm known
Linear programming
max c^T x  s.t. Ax = b, x ≥ 0        min b^T y  s.t. A^T y ≥ c
Simplex algorithm (Dantzig 1947)
Ellipsoid algorithm (Khachiyan 1979)
Interior-point algorithm (Karmarkar 1984)
Simplex algorithm
Context
Introduced by Dantzig in 1947 for solving LPs
Fixpoint iteration algorithm
Asymptotic complexity depends on the number of iterations
Parameterized by pivoting rules
Hirsch conjecture
Simplex algorithm (Dantzig 1947)
Move up, along an edge, to a neighboring vertex, until reaching the top
Pivoting rules
Pivoting rules
The simplex method is parameterized by a pivoting rule.
Pivoting rule = method of choosing adjacent vertices with better objective
single-switching only
deterministic vs. memorizing vs. randomized
oblivious
Deterministic rules
Dantzig's rule
Bland's rule
Lexicographic rule
many more...
Theorem (Klee-Minty (1972) et al.)
All known to require an exponential number of steps in the worst case.
Randomized rules
Random-Facet: recursively find the optimum (see later); due to Kalai ('92), and Matoušek, Sharir and Welzl ('96)
Random-Edge: choose a random improving edge
Theorem (Friedmann-Hansen-Zwick (SODA 2011, ...))
There are (different) explicit LPs on which Random-Facet and Random-Edge require an expected subexponential number of iterations.
Memorizing rule
(taken from David Avis' paper)
Theorem (Friedmann (IPCO 2011))
There are explicit LPs on which Least-Entered requires a subexponential number of iterations.
Technique
Lower bound technique
1 Lower bounds for Markov decision process policy iteration
2 Induce linear programs
3 Correspondence of simplex algorithm and policy iteration
(originally: lower bounds for parity games first)
MDPs and Policy iteration
Outline
1 LPs and Simplex algorithm
2 MDPs and Policy iteration
  Markov decision processes
  Policy iteration
  MDPs as LPs
  Summary
3 Random Edge
4 Random Facet and Least Entered
5 All is well that ends well?
Markov decision processes
Markov decision processes
due to Shapley '53, Bellman '57, Howard '60, ...
[figure: example MDP on states a–f with rewards 2, 5, 6, probability-1/2 transitions, and sink f]
Controller (circle) chooses an outgoing action
Randomizer (square) determines the successive state
Policy σ = choice of an action from each state
Objective: maximize the expected total reward max E[Σ_{i=0}^∞ r(a_i)]
Policy iteration
Context
Introduced by Howard in 1960 to solve Markov decision processes
Transferred to many other areas by several authors
Fixpoint iteration algorithm
Asymptotic complexity depends on the number of iterations
Parameterized by improvement rules
Policy valuations
[figure: the example MDP under policy σ]
val_σ(a) = 1/2 b + 1/2 c = 14
val_σ(b) = 2 + a = 16
val_σ(c) = 6 + e = 12
val_σ(d) = 1/2 b + 1/2 f = 8
val_σ(e) = 1/2 c + 1/2 f = 6
val_σ(f) = 0
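The valuation equations above can be checked mechanically. The following is a minimal Python sketch (not from the talk): the MDP encoding and the function name `evaluate_policy` are our own, and the values are obtained by fixed-point iteration, which converges here because every state eventually reaches the sink f.

```python
# Hypothetical encoding of the slide's valuation equations under policy sigma:
# state -> (immediate reward, [(probability, successor)]).
def evaluate_policy(eqs, iters=200):
    """Solve val(s) = reward(s) + sum(p * val(t)) by fixed-point iteration."""
    val = {s: 0.0 for s in eqs}
    for _ in range(iters):
        val = {s: r + sum(p * val[t] for p, t in succ)
               for s, (r, succ) in eqs.items()}
    return val

eqs = {
    "a": (0, [(0.5, "b"), (0.5, "c")]),   # val(a) = 1/2 b + 1/2 c
    "b": (2, [(1.0, "a")]),               # val(b) = 2 + a
    "c": (6, [(1.0, "e")]),               # val(c) = 6 + e
    "d": (0, [(0.5, "b"), (0.5, "f")]),   # val(d) = 1/2 b + 1/2 f
    "e": (0, [(0.5, "c"), (0.5, "f")]),   # val(e) = 1/2 c + 1/2 f
    "f": (0, []),                         # sink: val(f) = 0
}
val = evaluate_policy(eqs)
print({s: round(v) for s, v in sorted(val.items())})
# {'a': 14, 'b': 16, 'c': 12, 'd': 8, 'e': 6, 'f': 0}
```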
(Improving) switches
A σ-switch is an edge e ∈ E_0 \ σ, i.e. an edge not chosen by σ.
Facts about switches
Comparability: val_σ ≤ val_σ[e] or val_σ[e] ≤ val_σ for every σ-switch e.
Easy check: val_σ < val_σ[(v,w)] iff val_σ(σ(v)) < val_σ(w).
Improving switches: I(σ) = { e | val_σ < val_σ[e] }
Theorem
Switching: ∅ ≠ J ⊆ I(σ) implies val_σ < val_σ[J].
Optimality: I(σ) = ∅ implies that σ is optimal.
Policy iteration algorithm
Howard (1960), Hoffman-Karp (1966), Puri (1995), Vöge / Jurdziński (2000), ...
Strategy improvement / policy iteration
1: while σ is not optimal do
2:     σ ← σ[J] for some ∅ ≠ J ⊆ I(σ)
3: end while
Complexity: depends on the number of iterations.
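The improvement loop above can be sketched in a few lines of Python on the running example. Everything here (the `actions` encoding, the single-switch selection, the iterative evaluation) is our own illustrative assumption, not the talk's implementation.

```python
# Minimal single-switch policy iteration sketch.  An MDP is encoded as
# state -> list of candidate actions, each action = (reward, [(prob, succ)]).
def evaluate(actions, sigma, iters=300):
    val = dict.fromkeys(actions, 0.0)
    for _ in range(iters):   # value of sigma as a fixed point of its equations
        val = {s: sigma[s][0] + sum(p * val[t] for p, t in sigma[s][1])
               for s in actions}
    return val

def policy_iteration(actions, sigma):
    while True:
        val = evaluate(actions, sigma)
        # improving switches I(sigma): actions strictly better than sigma(s)
        improving = [(s, a) for s, cand in actions.items() for a in cand
                     if a[0] + sum(p * val[t] for p, t in a[1]) > val[s] + 1e-6]
        if not improving:          # I(sigma) empty => sigma is optimal
            return sigma, val
        s, a = improving[0]        # any nonempty subset of switches works
        sigma = {**sigma, s: a}

actions = {
    "a": [(0, [(0.5, "b"), (0.5, "c")])],
    "b": [(2, [(1.0, "a")])],
    "c": [(6, [(1.0, "e")]), (5, [(1.0, "d")])],  # the only real choice
    "d": [(0, [(0.5, "b"), (0.5, "f")])],
    "e": [(0, [(0.5, "c"), (0.5, "f")])],
    "f": [(0, [])],
}
sigma, val = policy_iteration(actions, {s: c[0] for s, c in actions.items()})
print(round(val["c"]))   # 14: value of c after the improving switch to 5 + d
```

Starting from the policy of the earlier slide, a single switch at c (from 6 + e to 5 + d) is improving and already yields the optimum.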
Policy iteration
[figure: the example MDP under the initial policy σ]
val_σ(a) = 1/2 b + 1/2 c = 14
val_σ(b) = 2 + a = 16
val_σ(c) = 6 + e = 12
val_σ(d) = 1/2 b + 1/2 f = 8
val_σ(e) = 1/2 c + 1/2 f = 6
val_σ(f) = 0
Policy iteration
[figure: the example MDP after switching c to the action with reward 5]
val_σ(a) = 1/2 b + 1/2 c = 16
val_σ(b) = 2 + a = 18
val_σ(c) = 5 + d = 14
val_σ(d) = 1/2 b + 1/2 f = 9
val_σ(e) = 1/2 c + 1/2 f = 7
val_σ(f) = 0
Complexity
Policy iteration is parameterized by an improvement rule.
Improvement rule = method of choosing improving switches
Single-switching vs. multi-switching
Deterministic vs. memorizing vs. randomized
Oblivious
Diameter
Question: is it theoretically possible to have polynomially many iterations?
Let G be a game and n be the number of nodes.
Definition: the diameter of G is the least number of iterations required to solve G.
Diameter Theorem
The diameter of G is less than or equal to n.
MDPs as LPs
Dual LP of a unichain MDP
(D)  min  Σ_{i ∈ V} y_i
     s.t. y_i − Σ_{j ∈ V} p_{j,a} y_j ≥ r_a,   i ∈ V, a ∈ A_i
V - set of all states
A - set of all actions, A_i - set of actions from state i
r_a - reward of action a, p_{i,a} - transition probability from action a to state i
y_i - value of state i
Primal LP of a unichain MDP
(P)  max  Σ_{a ∈ A} r_a x_a
     s.t. Σ_{a ∈ A_i} x_a − Σ_{a ∈ A} p_{i,a} x_a = 1,   i ∈ V
          x_a ≥ 0,   a ∈ A
x_a - expected number of times action a is taken, when starting from all positions
Summary
Pivoting on the induced LP
Theorem
A vertex (bfs) of the primal LP corresponds to a policy in the MDP.
Improving switches correspond to pivoting steps.
Theorem
An optimal solution of the dual LP corresponds to an optimal policy in the MDP.
A policy corresponds to a basic solution of the dual.
Policy iteration vs. Simplex method

                  Policy iteration            Simplex method
Pivoting rules    single- and multi-switch    single-switch only
Diameter          linear                      unknown / Hirsch conjecture
Random Edge
Outline
1 LPs and Simplex algorithm
2 MDPs and Policy iteration
3 Random Edge
  Random edge rule
  Motivation
  Construction
4 Random Facet and Least Entered
5 All is well that ends well?
Random edge rule
Random edge rule
Random-Edge rule
Perform a single improving switch chosen uniformly at random.
Context
Abstract lower bound: 2^(Ω(n^(1/3))) (Matoušek, Szabó 2006)
Motivation
Construction principles
1 Process always reaches the sink (unichain MDPs)
2 Priority rewards on the nodes
[figure: nodes carrying priority rewards of the form (−n)^p]
Cycle gadgets
[figure: cycle gadget on states a, b, c, d with ε-probability escape edges]
cycle open: val_σ(b) compares against val_σ(d)
cycle closed: val_σ(b) = val_σ(a)
Binary Counting - How does it work?
101011
Setting:    101111
Resetting:  101100
Activating: 101100 (bit pattern unchanged; the new bit becomes active)
Binary Counters
Binary counting proceeds in three phases.
1 Set the least unset bit
2 Reset all lower (inactive) bits
3 Activate the recently set bit
How to separate active from inactive bits? By access control.
Attach an access-control structure to every bit
Set bits are activated
The recently set bit is not activated
After resetting the lower bits, the recently set bit is activated
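The three phases can be simulated directly. A small Python sketch (our own illustration; the `bits`/`active` lists are hypothetical stand-ins for the gadget states, with `bits[0]` the lowest bit):

```python
# Three-phase increment mirroring the Set / Reset / Activate decomposition.
def increment(bits, active):
    k = bits.index(0)              # least unset bit
    bits[k] = 1                    # phase 1: set it (not yet active)
    for i in range(k):             # phase 2: reset all lower bits
        bits[i] = 0
        active[i] = 0
    active[k] = 1                  # phase 3: activate the recently set bit
    return bits, active

bits   = [1, 1, 0, 1, 0, 1]        # 101011 written low bit first
active = [1, 1, 0, 1, 0, 1]
increment(bits, active)
print("".join(map(str, reversed(bits))))   # 101100
```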
Construction
Full construction
[figure: the full construction for n bits — per level i: entry nodes x_i, y_i, cycle nodes a_{i,j}, b_{i,j}, c_{i,j}, bit nodes b_i, d_i, l_i, lane nodes u_i, w_i, priorities 4i+3 through 4i+10, and source/sink nodes r, s, t]
Cycle Gadget
Cycles close one edge at a time
Shorter cycles close faster
Cycle Gadget
Cycles open simultaneously
Increment phases: from 101011 to 101100
1 Setting:    B_k counting cycle closes; C_k helper cycle closes
2 Resetting:  U lane realigns; A_i and B_i cycles (i < k) open
3 Activating: A_k access cycle closes; W lane realigns; C_i cycles of unset bits open
Analysis
Cycles (opening and closing) and lanes compete with each other.
The supposed candidate has to win with high probability.
Solution: increase the length of higher cycles, resulting in O(n^4) vertices.
Work in progress: improved construction.
Lower bound
Theorem (F.-Hansen-Zwick (2011))
The number of improving steps performed by Random-Edge on the MDPs (and LPs), which contain O(n^4) vertices and edges, is 2^(Ω(n)).
Random Facet and Least Entered
Outline
1 LPs and Simplex algorithm
2 MDPs and Policy iteration
3 Random Edge
4 Random Facet and Least Entered
  Random Facet on LPs and MDPs
  Random Facet construction
  Least Entered
5 All is well that ends well?
Random Facet on LPs and MDPs
The algorithm
due to Kalai ('92), and Matoušek, Sharir and Welzl ('96)

Random Facet for LPs:
procedure Random-Facet(H, B)
    if H = B then
        return B
    else
        Choose h ∈ H \ B at random
        B' ← Random-Facet(H \ {h}, B)
        if h is violated by B' then
            B'' ← Basis(B' ∪ {h})
            return Random-Facet(H, B'')
        else
            return B'
        end if
    end if
end procedure

Random Facet for MDPs:
procedure Random-Facet(G, σ)
    if E_0 = σ then
        return σ
    else
        Choose e ∈ E_0 \ σ at random
        σ' ← Random-Facet(G \ {e}, σ)
        if val_σ' < val_σ'[e] then
            σ'' ← σ'[e]
            return Random-Facet(G, σ'')
        else
            return σ'
        end if
    end if
end procedure
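For intuition, the MDP variant can be run on the earlier six-state example. This is a self-contained Python sketch under our own encoding assumptions (state → list of candidate actions); fixed-point iteration stands in for an exact evaluation of each policy.

```python
import random

# Sketch of the MDP version of Random-Facet over our illustrative encoding:
# state -> candidate actions, each action = (reward, [(prob, succ)]).
def evaluate(actions, sigma, iters=300):
    val = dict.fromkeys(actions, 0.0)
    for _ in range(iters):
        val = {s: sigma[s][0] + sum(p * val[t] for p, t in sigma[s][1])
               for s in actions}
    return val

def random_facet(actions, sigma):
    # edges not chosen by sigma, i.e. E_0 \ sigma
    extra = [(s, a) for s, cand in actions.items() for a in cand
             if a is not sigma[s]]
    if not extra:
        return sigma
    s, a = random.choice(extra)                    # choose e at random
    without = {t: [x for x in cand if not (t == s and x is a)]
               for t, cand in actions.items()}
    sigma1 = random_facet(without, sigma)          # solve G \ {e} recursively
    val1 = evaluate(actions, sigma1)
    if a[0] + sum(p * val1[t] for p, t in a[1]) > val1[s] + 1e-6:
        return random_facet(actions, {**sigma1, s: a})   # e improving: switch
    return sigma1

actions = {
    "a": [(0, [(0.5, "b"), (0.5, "c")])],
    "b": [(2, [(1.0, "a")])],
    "c": [(6, [(1.0, "e")]), (5, [(1.0, "d")])],
    "d": [(0, [(0.5, "b"), (0.5, "f")])],
    "e": [(0, [(0.5, "c"), (0.5, "f")])],
    "f": [(0, [])],
}
opt = random_facet(actions, {s: c[0] for s, c in actions.items()})
print(round(evaluate(actions, opt)["c"]))   # 14
```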
Context
Known results
Upper bound: 2^(O(√n)) (Kalai 1992)
Abstract lower bound: 2^(Ω(√n)) (Matoušek 1994)
Theorem (F.-Hansen-Zwick (2011))
Random-Facet for MDPs and LPs is subexponential.
Random Facet construction
Randomized counting
0 0 0 0 0
exclude bit: count recursively on the remaining bits
set bit: count recursively on the lower bits
Analysis
Counting 0 0 0 0 with one excluded (still unset) bit is equivalent to counting 0 0 0 0
Counting 1 1 1 0 0 is equivalent to counting 0 0
Recurrence: f(n) = f(n−1) + (1/n) Σ_{i=0}^{n−1} f(i)
Complexity: f(n) ∼ e^(2√n) / (2√π · n^(1/4)) for n → ∞
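The recurrence f(n) = f(n−1) + (1/n) Σ_{i<n} f(i) and its stated subexponential growth can be checked numerically. A short Python sketch (the base case f(0) = 1 is our own assumption and only affects lower-order terms):

```python
import math

# Tabulate the recurrence f(n) = f(n-1) + (1/n) * sum_{i=0}^{n-1} f(i)
# and compare its growth with e^{(2+o(1)) sqrt(n)}.
def f_table(n):
    f, total = [1.0], 1.0          # total = running sum f(0) + ... + f(m-1)
    for m in range(1, n + 1):
        f.append(f[-1] + total / m)
        total += f[-1]
    return f

f = f_table(400)
ratio = math.log(f[400]) / (2 * math.sqrt(400))
print(round(ratio, 2))   # close to 1, matching the e^{(2+o(1)) sqrt(n)} growth
```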
Simplified construction (for parity games)
[figure: the simplified construction — sink t : 1 and, per level i = 1, 2, 3, nodes c_i : 2i+2, b_i, d_i : 3, a_i with left/right successors]
Lower bound
Theorem (F.-Hansen-Zwick (2011))
The number of improving steps performed by Random-Facet on the MDPs (and LPs) is 2^(Ω(√(n / log n))).
Least Entered
Zadeh's pivoting rule
Zadeh's Least-Entered rule
Perform the single improving switch that has been applied least often.
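The rule's bookkeeping amounts to an occurrence counter over switches. A tiny Python sketch with hypothetical switch names and improving-switch sets (not taken from the construction):

```python
from collections import Counter

# Least-Entered tie-breaking: among the currently improving switches,
# apply one that has been applied least often so far.
counts = Counter()
history = []
for improving in (["e1", "e2"], ["e1", "e2"], ["e2", "e3"], ["e1", "e3"]):
    e = min(improving, key=lambda x: counts[x])   # least-entered switch
    counts[e] += 1
    history.append(e)
print(history)   # ['e1', 'e2', 'e3', 'e1'] — occurrence counts stay balanced
```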
Fair counting
Problem:
Flipping higher bits happens less often than flipping lower bits
Zadeh's rule switches higher bits before they are supposed to be switched
Solution:
Represent every bit by two representatives
Only one representative is actively working
The inactive representative switches back and forth to catch up with the rest
Both representatives change roles after flipping the represented bit
Full construction
[figure: the Least-Entered bit gadget — two representatives per bit with cycle nodes b^0_{i,j}, b^1_{i,j} (ε-probability edges), entry nodes A^0_i, A^1_i, nodes c^0_i, c^1_i, d^0_i, d^1_i, h^0_i, h^1_i, connectives k_i, k_{[1;n]}, and sinks s, t]
Lower bound
Theorem (Friedmann (2011))
The number of improving steps performed by Least-Entered on the MDPs, which contain O(n^2) vertices and edges, is 2^(Ω(n)).
All is well that ends well?
Concluding remarks
Game-theoretic perspective helpful for the construction of lower bounds
Lower bounds transfer to many other classes of determined games
Random-Edge lower bound can be used as a lower bound for Switch-Half
Relation to other games
[diagram: inclusion hierarchy of game classes, annotated with NP ∩ coNP vs. P]
LP-type problems (LPtype)
Turn-based Stochastic Games (TSG) - 2 1/2 players
Linear Programming (LP)
Discounted Payoff Games (DPG) - 2 players
Mean Payoff Games (MPG) - 2 players
Markov Decision Processes (MDP) - 1 1/2 players
Parity Games (PG) - 2 players
Deterministic Markov Decision Processes (DMDP) - 1 player
Open problems
Polytime algorithm for two-player games and the like
Strongly polytime algorithm for LPs (and MDPs)
Resolving the Hirsch conjecture
Find game-theoretic model with unresolved diameter bounds
The slide usually called the end.
Thank you for listening!