Exploiting Degeneracy in MIP
Tobias Achterberg
9 January 2018, Aussois

Performance Impact in Gurobi 7.5+
[Bar chart of speed-ups, axis from 0% to 35%; the bar labels were lost in extraction. Values as extracted: 32.0%, 14.6%, 5.7%, 7.9%, 6.6%, 2.9%, 1.2%, 0.1%, 2.6%, 2.6%]
Time limit: 10000 sec.
Intel Xeon CPU E3-1240 v3 @ 3.40GHz, 4 cores, 8 hyper-threads, 32 GB RAM
Test set has 3257 models:
- 67 discarded due to inconsistent answers
- 9 discarded that none of the versions can solve
- speed-up measured on >100s bracket: 1015 models

Definition of Degeneracy
- An LP solution is called degenerate if there are basic variables equal to their bounds
- Dual solution degenerate: primal problem has multiple optimal solutions
  - Non-basic variable with reduced costs zero
  - Non-basic slack with dual solution value zero
- Primal solution degenerate: dual problem has multiple optimal solutions
  - Basic variable with primal solution value equal to its bound
  - Basic slack with primal solution value zero
- Solutions for LPs of practical problems: typically dual and primal degenerate
[Figures: dual degeneracy (1-dimensional primal optimal face); primal degeneracy (2-dimensional dual optimal face)]

Degeneracy in Gurobi MIP Test Set
- 3485 models (excluding those solved in presolve or with infeasible LP relaxation)
- 3193 (91.6%) with dual degeneracy, 3362 (96.5%) with primal degeneracy
- 187 (5.4%) with 100% dual degeneracy, 3 (0.1%) with 100% primal degeneracy
- avg. dual degeneracy (#non-basic-zero/#cols): 37.3%
- avg. primal degeneracy (#basic-zero/#rows): 43.6%
[Figures: two charts titled "dual degeneracy" and "primal degeneracy"; axis label "fraction of models", tick range 0.0 to 1.0; the individual data labels are not reliably recoverable]

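
The two degeneracy measures quoted on this slide can be computed from a single optimal basis. The following is a minimal sketch, not Gurobi's implementation: the function name, the argument layout, and the tolerance are assumptions made for illustration, and "basic-zero" is interpreted here as "basic variable sitting at one of its bounds".

```python
def degeneracy_ratios(basic, x, lb, ub, rc, n_rows, n_cols, tol=1e-9):
    """Return (dual_degeneracy, primal_degeneracy) for one LP basis.

    basic[j] -- True if variable j is basic (hypothetical input layout)
    x, lb, ub -- primal values and bounds
    rc -- reduced costs
    """
    n = len(x)
    # Dual degeneracy: non-basic variables whose reduced cost is (numerically) zero.
    nonbasic_zero = sum(1 for j in range(n) if not basic[j] and abs(rc[j]) <= tol)
    # Primal degeneracy: basic variables sitting at one of their bounds.
    basic_at_bound = sum(
        1
        for j in range(n)
        if basic[j] and (abs(x[j] - lb[j]) <= tol or abs(x[j] - ub[j]) <= tol)
    )
    return nonbasic_zero / n_cols, basic_at_bound / n_rows


# Toy basis with 2 rows and 3 columns (slacks omitted for brevity).
inf = float("inf")
dual_deg, primal_deg = degeneracy_ratios(
    basic=[True, True, False],
    x=[0.0, 2.0, 0.0],
    lb=[0.0, 0.0, 0.0],
    ub=[inf, inf, inf],
    rc=[0.0, 0.0, 0.0],
    n_rows=2,
    n_cols=3,
)
```

In the toy basis, one of three columns is non-basic with zero reduced cost and one of two basic variables sits at its lower bound.
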
Dual Degeneracy: Multiple Primal Optima
- If there are non-basic variables with zero reduced costs, there may be multiple primal optimal solutions.
- Which one should the MIP solver select?
  - Good starting point for primal heuristics
    - Few integer variables with fractional LP solution?
  - Interesting point for cutting plane separation
    - Many integer variables with fractional LP solution?
    - Small angles between outgoing rays?
    - Many tight formulation constraints and few tight cuts?
  - Interesting point for branching and strong branching
    - Set of different optimal LP solutions may yield some insight for branching decision
    - See Berthold and Salvagnin: "Cloud Branching" (CPAIOR 2013 proceedings)
  - Anything else?
- Obviously: if there is an integer solution on the optimal face, this is an optimal choice
  - Aiming for few integer variables with fractional LP solution may find this
  - Approximation: try to move integer variables out of the basis

Integrality in Simplex
- Tune the simplex algorithm in order to find a less fractional optimal basis
- Dual simplex:
  - Pricing: pick a primal infeasible variable to leave the basis; this defines the dual ray to follow
  - Ratio test: the "first" non-basic variable that is hit by the ray needs to enter the basis
- Both steps are subject to heuristic choices
  - Pricing: trade-off between sparsity and numerical properties
  - Ratio test: trade-off between numerical properties and temporary bound violations (Harris ratio test)
- Idea: add integrality of variables as additional criterion to the trade-off
  - Pricing: prefer integer variables to leave the basis
  - Ratio test: prefer continuous variables to enter the basis
- See talk of Gu: "Gurobi Technology" (INFORMS 2009, Gurobi Workshop, San Diego)
  - Gurobi 2.0

Primal PumpReduce
- Given an optimal LP solution, fix variables and slacks with non-zero reduced costs or duals
  - Fixes solution to stay on optimal face
  - What does "non-zero" mean in floating-point arithmetic?
    - Gurobi uses $10^{-10}$ as threshold
    - Tiny reduced cost values for variables with large domains can still lead to substantial objective changes
- Modify the objective function to move to a different point on the optimal face using primal simplex
  - Search for integer solution similar to Feasibility Pump
  - Try multiple objective function vectors
  - Record least fractional basis found in the process
- Translate final basis of fixed LP to original LP
  - Need to flip non-basic statuses (at lower/upper bound) to match values of variables with non-zero reduced costs
  - Original LP solve cleans up errors we got from not fixing tiny reduced costs
- See talk of Achterberg: "LP Basis Selection and Cutting Planes" (MIP 2010, Atlanta)
  - Introduced by Gu for CPLEX 11.0
  - Extended in CPLEX 12.1

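
The fixing step and the floating-point caveat can be illustrated with a small sketch. This is not the solver's code: the function names are hypothetical, only structural variables are handled (slacks would be treated the same way using row duals), and the threshold of 1e-10 is the value quoted on the slide.

```python
def restrict_to_optimal_face(x, rc, lb, ub, tol=1e-10):
    """Fix every variable with non-zero reduced cost at its current LP value."""
    new_lb, new_ub = list(lb), list(ub)
    for j, r in enumerate(rc):
        if abs(r) > tol:
            # Moving x_j would change the objective, so pin it in place.
            new_lb[j] = new_ub[j] = x[j]
    return new_lb, new_ub


def worst_case_drift(rc, lb, ub, tol=1e-10):
    """Objective change that is still possible through variables whose tiny,
    non-zero reduced cost fell below the threshold and was therefore not fixed."""
    return sum(
        abs(r) * (u - l) for r, l, u in zip(rc, lb, ub) if 0 < abs(r) <= tol
    )


x = [0.0, 3.2, 0.0]
rc = [0.5, 0.0, 1e-11]        # third reduced cost is below the threshold
lb = [0.0, 0.0, 0.0]
ub = [10.0, 10.0, 1e9]        # but its domain is huge

face_lb, face_ub = restrict_to_optimal_face(x, rc, lb, ub)
drift = worst_case_drift(rc, lb, ub)
```

In this toy instance the third variable stays free even though moving it across its domain changes the objective by about 0.01, which is the kind of error the final solve of the original LP has to clean up.
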
Primal PumpReduce: Quick Version
- Perform ratio tests to check if basic integer variables can be replaced by continuous variables
  - Non-basic slacks of rows with zero dual solution value
  - Non-basic variables with zero reduced costs
- Very simple pivots for slacks
  - Can just update basis and solution directly
  - No factorizations or linear system solves needed
- See talk of Christophel: "The Black Art of Pivoting" (MIP 2012, UC Davis)
  - "Pivot Reduce": first push slacks into basis, then continuous structural variables
- Implemented in SCIP 4.0 with SoPlex 3.0
  - Applied only at root node (not active in default settings, but now used in SCIP 5.0)
  - 10% performance improvement for MIPLIB 2010 models, but solves 2 fewer
  - See Maher et al.: "The SCIP Optimization Suite 4.0" (March 2017)
- Gurobi applies quick version after every node LP relaxation solve

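Reduced Cost Strengthening
- Reduced costs of non-basic variables provide a lower bound on the objective change
[Figure: objective value $c$ plotted against $x_j$ on $[l_j, u_j]$, with slope $r_j$ starting from $c_{LP}$ at $l_j$; the incumbent objective yields the tightened bound $u_j'$]
- Given the objective value of an incumbent solution, this can tighten bounds of variables
- Nothing else than propagating
  $c^T x = y_{LP}^T A x + (c - A^T y_{LP})^T x = y_{LP}^T b + r_{LP}^T x = c_{LP} + r_{LP}^T x \le c^*$
- Multiple LP solutions lead to a piecewise linear lower bound
[Figure: piecewise linear lower bound on $c$ as a function of $x_j$ over $[l_j, u_j]$, with incumbent level $c^*$]
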
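
For a single variable, the propagation amounts to $c_{LP} + r_j (x_j - l_j) \le c^*$, that is $x_j \le l_j + (c^* - c_{LP}) / r_j$ for a variable at its lower bound with $r_j > 0$, and symmetrically for a variable at its upper bound with $r_j < 0$. A minimal sketch follows, assuming a minimization problem and a hypothetical function name and argument layout; rounding for integer variables uses a small tolerance chosen for illustration.

```python
import math


def reduced_cost_tightening(c_lp, c_star, r, lb, ub, is_int, tol=1e-9):
    """Tighten bounds from reduced costs r, LP objective c_lp and incumbent c_star.

    Assumes minimization; a variable with r[j] > 0 is non-basic at lb[j],
    one with r[j] < 0 is non-basic at ub[j].
    """
    gap = c_star - c_lp
    new_lb, new_ub = list(lb), list(ub)
    for j, rj in enumerate(r):
        if rj > tol:
            bound = lb[j] + gap / rj          # x_j <= l_j + gap / r_j
            if is_int[j]:
                bound = math.floor(bound + tol)
            new_ub[j] = min(ub[j], bound)
        elif rj < -tol:
            bound = ub[j] + gap / rj          # x_j >= u_j - gap / |r_j|
            if is_int[j]:
                bound = math.ceil(bound - tol)
            new_lb[j] = max(lb[j], bound)
    return new_lb, new_ub


tight_lb, tight_ub = reduced_cost_tightening(
    c_lp=10.0,
    c_star=13.0,
    r=[2.0, -1.5, 0.0],
    lb=[0, 0, 0],
    ub=[5, 4, 1],
    is_int=[True, True, True],
)
```

With a gap of 3, the first variable can move at most 1.5 units away from its lower bound (rounded down to 1), the second at most 2 units away from its upper bound, and the third has zero reduced cost and is left untouched.
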
Lurking Bounds
- Gurobi stores reduced cost strengthening opportunities as "lurking bounds" table
  - "If we find an incumbent with a certain objective value, then we can tighten some bounds"
  - $c^* \le p_j \Rightarrow x_j \le q_j$ (or $x_j \ge q_j$)
  - Need to store only one piece of PWL function for binary variables
  - Also store only a linear lower bound function for general integer and continuous variables
    - Pick the one that yields largest objective value at $l_j + 1$ (or $u_j - 1$)
- Table is populated during the root cut loop
  - Every root LP solution yields a pair $(c_{LP}, r_{LP})$
- Given the optimal LP objective value $c_{LP}$, can we find a "good" $r_{LP}$ on the dual optimal face?
  - Every $r_{LP}$ yields a set of lurking bounds
  - Lurking bounds for individual variables may come from different $r_{LP}$ vectors
  - Any dual feasible solution can be used: can even leave dual optimal face (i.e., $c' < c_{LP}$)

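
A lurking bounds table for binary variables could be organized as in the sketch below. This is an illustration under stated assumptions, not Gurobi's data structure: the class name, the method names, and the choice to key entries by (variable, bound) are all hypothetical, and a minimization problem is assumed. For each dual solution, the entry kept is the one with the largest objective value at the opposite bound, as the slide suggests.

```python
class LurkingBounds:
    """Maps (j, q) to a threshold p: once an incumbent with objective
    c* <= p is known, binary variable x_j can be fixed to q."""

    def __init__(self):
        self.table = {}

    def record(self, c_lp, r, tol=1e-9):
        """Add the opportunities offered by one dual solution (c_lp, r)."""
        for j, rj in enumerate(r):
            if abs(rj) <= tol:
                continue
            q = 0 if rj > 0 else 1      # bound at which x_j currently sits
            p = c_lp + abs(rj)          # objective lower bound if x_j flips
            if p > self.table.get((j, q), float("-inf")):
                self.table[(j, q)] = p

    def fixings(self, c_star):
        """Fixings (j, q) that become valid for incumbent objective c_star."""
        return sorted(key for key, p in self.table.items() if c_star <= p)


bounds = LurkingBounds()
bounds.record(10.0, [2.0, 0.0, -0.5])   # reduced costs from a first root LP solution
bounds.record(9.5, [3.0, 1.0, 0.0])     # a second dual solution, off the optimal face
```

Here `bounds.fixings(12.0)` yields only the fixing of variable 0 to zero, while a better incumbent with objective 10.5 triggers all three stored entries.
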
Selecting "Good" Dual Solutions
- Bajgiran, Cire, and Rousseau: "A First Look at Picking Dual Variables for Maximizing Reduced Cost Fixing" (CPAIOR 2017 proceedings; talk at MIP 2017, Montréal)
- Pick dual solution vector that "maximizes reduced-cost-based filtering"
  - Solve MIP to maximize the number of reduced-cost fixings, subject to dual feasibility and optimality
    - Needs binary variables and big-M constraints for counting
    - Often quick to solve, but sometimes pretty time-consuming
  - Solve LP to minimize total slack needed to satisfy reduced-cost fixing conditions
    - Usually very fast
- Some nice performance improvements (on a small set of models)
[Figures: pictures from MIP 2017 talk]

Dual PumpReduce

Primal PumpReduce
- Given an optimal LP solution, fix variables and slacks with non-zero reduced costs or duals
  - Fixes primal solution to stay on primal optimal face ($r_j > 0 \Rightarrow x_j = 0$)
- Modify the objective function to move to a different point on the primal optimal face using primal simplex
  - Search for integer solution similar to Feasibility Pump
  - Record least fractional basis found in the process
- Translate final basis of fixed LP to original LP

Dual PumpReduce
- Given an optimal LP solution, remove bounds and rows that are not tight
  - Fixes dual solution to stay on dual optimal face ($x_j > 0 \Rightarrow r_j = 0$)
- Modify bounds to move to a different point on the dual optimal face using dual simplex
  - For remaining finite bounds, randomly set either $l_j \in [1, 100]$ or $u_j \in [-100, -1]$
  - Any better ideas?
- For each dual feasible basis, calculate reduced costs in original model and update lurking bounds table
- For each dual ray, fix variables/slacks with non-zero entry in dual ray

Degenerate Reduced Cost Tightening
- See talks of
  - Christophel: "The Black Art of Pivoting" (MIP 2012, UC Davis)
  - Polik: "More Ways to Use Dual Information in MILP" (ISMP 2015, Pittsburgh)
- For each primal degenerate basic variable
  - Compute pivot row
  - Perform a ratio test to get reduced costs after pivot
  - If non-zero, apply reduced cost strengthening
- Equivalent to calculating Driebeek penalties
- SAS/OR also has a version that leaves the dual optimal face
- 2.5% performance improvement in SAS/OR (personal communication with Imre Polik)
[Figure: picture from MIP 2012 talk]

Dual PumpReduce: Quick Version
- For each row, perform ratio test to check if the dual variable can be increased or decreased
  - If possible, move dual variable and update reduced costs
  - May leave dual optimal face
  - Immediately update global lurking bounds table
  - Updated reduced costs may lead to additional opportunities to move dual variables
- Track maximum reduced costs for each variable
  - Use for local reduced cost strengthening
  - Need to be careful with reduced costs coming from dual solutions that are not on the optimal face

Degeneracy and Cutting Planes
- Cutting plane separation depends on current LP solution
  - Gomory cuts are defined by current basis
  - Other cuts like MIR or knapsack covers heuristically try to separate current LP solution from convex hull of integer solutions
- What is a good solution/basis to separate cuts for?
- After adding cuts and resolving the LP, should we stop at the first optimal basis that we encounter?
  - Typically, for this basis many of the newly added cuts will be tight
  - Additional cuts will be based on the previous cuts
  - Leads to high rank cuts
  - Numerical issues in cut separation accumulate

Degeneracy and Cutting Planes
- Zanette, Fischetti and Balas: "Lexicography and degeneracy: Can a pure cutting plane algorithm work?" (Math Programming, 2011)
- Pure Gomory fractional cut approach suffers from "cuts on top of cuts"
  - Determinants of simplex bases grow rapidly: numerical difficulties
  - Objective improvement stalls after few iterations
- Solution: select different optimal basis in each iteration
  - Lexicographic simplex
[Figures: pictures for stein15, see Zanette et al.]

Degeneracy and Cutting Planes
- Tramontani et al.: "Concurrent Root Cut Loops to Exploit Random Performance Variability" (INFORMS 2013)
  - 2 parallel root cut loops, each started from different optimal LP basis
  - Share cuts and solutions between loops
  - 5% speed-up in CPLEX 12.5.1
- Also in Gurobi since version 6.0 (2014)
  - 2 parallel root cut loops
  - Start from same optimal LP basis, but use different settings (including random seed)
  - Data passed only from secondary cut loop to primary one (but not in other direction)
    - New incumbents
    - New global bounds
    - New cuts

Degeneracy and Branch-and-Cut
- Fischetti and Monaci: "Exploiting Erraticism in Search" (Operations Research, 2014)
  - Bet-and-run approach: make a number of short (i.e., few branch-and-bound nodes) sample runs, then pick the "most promising" and bring it to completion
- Carvajal et al.: "Using diversification, communication and parallelism to solve mixed-integer linear programs" (OR Letters, 2014)
  - Concurrent optimization with communication
  - Also available in Gurobi and CPLEX, but without communication
    - Communication is complicated without affecting determinism
  - Also as internal feature of Xpress
- Shinano et al.: "FiberSCIP - A shared memory parallelization of SCIP" (ZIB Report, 2013; IJOC, 2017)
  - Racing ramp-up phase

Degeneracy and Interior Point Methods
- Interior point methods find the analytic center of the optimal face
  $\min\; -\sum_j \ln x_j \quad \text{s.t.} \quad Ax = b,\; x \ge 0$
  (with arithmetic appropriately defined on $\ln 0 = -\infty$)
  - Usually, we then apply crossover to obtain a vertex solution
- Analytic center for zero objective ($c = 0$) may find fixings
  - Detects linearly implied equations
  - If $x_j = 0$ in analytic center, we can permanently fix $x_j := 0$
    - Fix variable to a bound
    - Turn inequality into equation
  - See Berthold, Perregaard, Meszaros: "Four good reasons to use an Interior Point solver within a MIP solver" (ZIB Report, 2017)
  - 2% speed-up due to presolving fixings from analytic center

Fixings from Analytic Center of Optimal Face
- We can also find fixings if we solved the original problem (with objective) with interior point
  - If $x_j = 0$ and $r_j = 0$ in a relative interior solution, then $x_j = 0$ for all feasible solutions
  - Proof (by Gu):
    - Assume there is a feasible solution with $x_j^* > 0$
    - The function $obj(d)$ for an optimal solution with $x_j = d \le x_j^*$ is piecewise linear
    - Say that $0 < x_j' \le x_j^*$ is the first break point of this PWL function, i.e., $[0, x_j']$ is the linear piece with a fixed $r_j'$
    - If $r_j' > 0$, then $0 < r_j < r_j'$, since primal and dual solutions are at analytic center, contradicting our assumption
    - If $r_j' = 0$, then the analytic center solution value for $x_j$ should be inside $(0, x_j')$, which contradicts $x_j = 0$
- Fixings are linearly implied: not very powerful
  - But may trigger other presolve reductions
- Exploit this directly for MIP?
  - Barrier is numerically difficult
  - Barrier presolve reductions may lead to solution that is not the analytic center
  - Solution may yield hints which variable fixings should be tested explicitly (e.g., in an OBBT fashion)
- Use analytic center in LP crossover
  - Do not use variables at bounds in analytic center for crossover basis
  - 1% speed-up on barrier test set

Implied Equations using Simplex Solves
- Guess some fixings $x_j = 0$ with $j \in F$
- Solve
  $\max \sum \{x_j : j \in F\} \quad \text{s.t.} \quad Ax = b,\; x \ge 0,\; x_j \le 0.001 \text{ for all } j \in F$
- If all $x_j = 0$ for $j \in F$, then permanently fix all variables in $F$ and stop
- Otherwise, remove non-zero variables from $F$ and iterate

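
The iteration on this slide can be sketched as a short loop around an LP solve. The sketch below keeps the LP solver abstract: `solve_lp` is a hypothetical callable (not a real library function) that is assumed to solve the auxiliary LP for the current candidate set and return the primal values; the toy solver is only a hand-written stand-in for a tiny system, used to show the control flow.

```python
def implied_zero_fixings(candidates, solve_lp, tol=1e-9):
    """Return the subset of candidates that is proven to be zero.

    solve_lp(F) is assumed to solve
        max sum(x_j for j in F)  s.t.  Ax = b, x >= 0, x_j <= 0.001 for j in F
    and to return a mapping from variable index to primal value.
    """
    F = set(candidates)
    while F:
        x = solve_lp(F)
        nonzero = {j for j in F if x[j] > tol}
        if not nonzero:
            return F          # every remaining candidate is stuck at zero
        F -= nonzero          # these guesses were wrong; drop them and retry
    return set()


def toy_lp(F):
    # Stand-in for a real LP solve on the system x0 + x1 = 0, x >= 0,
    # where x2 is not restricted by any row: x0 and x1 are forced to zero,
    # x2 rises to its artificial bound whenever it is a candidate.
    return {0: 0.0, 1: 0.0, 2: 0.001 if 2 in F else 0.0}


fixed = implied_zero_fixings({0, 1, 2}, toy_lp)
```

The loop terminates because the candidate set strictly shrinks in every round that does not stop; in the toy run, the first solve removes variable 2 and the second solve confirms variables 0 and 1.
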
Total Impact in Gurobi 7.5+: More Details
[Bar chart of speed-ups by solve-time bracket (>0 sec, >1 sec, >10 sec, >100 sec, >1000 sec), axis from 0% to 50%. Values as extracted: 12.4%, 17.6%, 23.9%, 32.8%, 47.6%]
[Histogram: #wins - #losses per model for 5 random seeds for disabling all degeneracy exploits, horizontal axis from -5 to +5, vertical axis from 0 to 1400. Counts as extracted, order not recoverable: 1238, 282, 136, 196, 206, 328, 257, 170, 145, 89, 153]
- Unsolved models (out of 5 x 3214): 52 vs 253 (+108 for both)
- Unsolved models without feasible solution: 16 vs 109 (+31 for both)
Time limit: 10000 sec.
Intel Xeon CPU E3-1240 v3 @ 3.40GHz, 4 cores, 8 hyper-threads, 32 GB RAM
Test set has 3257 models:
- 43 discarded due to inconsistent answers
- 14 discarded that none of the versions can solve