Integer and Combinatorial Optimization: Clustering Problems

Similar documents
Minimum Weight Constrained Forest Problems. Problem Definition

Research Interests Optimization:

Branch-and-Cut for the k-way equipartition problem

Realignment in the National Football League Did They Do It Right?

Integer Programming Theory

Finding optimal realignments in sports leagues using a branch-and-cut-and-price approach

Binary Positive Semidefinite Matrices and Associated Integer Polytopes

Math Models of OR: The Simplex Algorithm: Practical Considerations

A Note on Polyhedral Relaxations for the Maximum Cut Problem

Cutting Planes for Some Nonconvex Combinatorial Optimization Problems

Towards Efficient Higher-Order Semidefinite Relaxations for Max-Cut

A Counterexample to the Dominating Set Conjecture

3 No-Wait Job Shops with Variable Processing Times

CS599: Convex and Combinatorial Optimization Fall 2013 Lecture 14: Combinatorial Problems as Linear Programs I. Instructor: Shaddin Dughmi

Introduction to Mathematical Programming IE496. Final Review. Dr. Ted Ralphs

Rearrangement of DNA fragments: a branch-and-cut algorithm Abstract. In this paper we consider a problem that arises in the process of reconstruction

Introduction to Mathematical Programming IE406. Lecture 20. Dr. Ted Ralphs

Combinatorial Optimization and Integer Linear Programming

Graphs and Network Flows IE411. Lecture 13. Dr. Ted Ralphs

Binary Positive Semidefinite Matrices and Associated Integer Polytopes

SUBSTITUTING GOMORY CUTTING PLANE METHOD TOWARDS BALAS ALGORITHM FOR SOLVING BINARY LINEAR PROGRAMMING

Graphing Linear Inequalities in Two Variables.

A Generic Separation Algorithm and Its Application to the Vehicle Routing Problem

SUBSTITUTING GOMORY CUTTING PLANE METHOD TOWARDS BALAS ALGORITHM FOR SOLVING BINARY LINEAR PROGRAMMING

Two models of the capacitated vehicle routing problem

An SDP Approach to Multi-level Crossing Minimization

Integer and Combinatorial Optimization

Convex Geometry arising in Optimization

5.3 Cutting plane methods and Gomory fractional cuts

Previously Local sensitivity analysis: having found an optimal basis to a problem in standard form,

LATIN SQUARES AND THEIR APPLICATION TO THE FEASIBLE SET FOR ASSIGNMENT PROBLEMS

Domination, Independence and Other Numbers Associated With the Intersection Graph of a Set of Half-planes

The ILP approach to the layered graph drawing. Ago Kuusik

1 date: September 15, 1998 file: mitche1

ACTUALLY DOING IT : an Introduction to Polyhedral Computation

Graph Coloring via Constraint Programming-based Column Generation

NETWORK SURVIVABILITY AND POLYHEDRAL ANALYSIS

ACO Comprehensive Exam October 12 and 13, Computability, Complexity and Algorithms

Facets of the minimum-adjacency vertex coloring polytope

MVE165/MMG631 Linear and integer optimization with applications Lecture 9 Discrete optimization: theory and algorithms

A Computational Study of Conflict Graphs and Aggressive Cut Separation in Integer Programming

Investigation of the Ring Spur Assignment Problem. 1 Introduction and Telecommunications Background. 2 Telecommunications Topological design Problems

Probabilistic Graphical Models

Polyhedral results for the Cardinality Constrained Multi-cycle Problem (CCMcP) and the Cardinality Constrained Cycles and Chains Problem (CCCCP)

DETERMINISTIC OPERATIONS RESEARCH

Stable sets, corner polyhedra and the Chvátal closure

Combinatorial Geometry & Topology arising in Game Theory and Optimization

Target Cuts from Relaxed Decision Diagrams

Introduction to Approximation Algorithms

Week 5. Convex Optimization

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs

Combinatorial Optimization

CS 473: Algorithms. Ruta Mehta. Spring University of Illinois, Urbana-Champaign. Ruta (UIUC) CS473 1 Spring / 36

A New Exact Algorithm for the Two-Sided Crossing Minimization Problem

7KH9HKLFOH5RXWLQJSUREOHP

Winning Positions in Simplicial Nim

4 Linear Programming (LP) E. Amaldi -- Foundations of Operations Research -- Politecnico di Milano 1

Some Advanced Topics in Linear Programming

CHAPTER 3 LINEAR PROGRAMMING: SIMPLEX METHOD

A simple exact separation algorithm for 2-matching inequalities

Discrete Optimization. Lecture Notes 2

Financial Optimization ISE 347/447. Lecture 13. Dr. Ted Ralphs

AN EXACT ALGORITHM FOR A LOCATION-ROUTING PROBLEM. Martine Labbé 1, Inmaculada Rodríguez Martín 2, Juan José Salazar González 2

Exact Algorithms for the Quadratic Linear Ordering Problem

February 10, 2005

Computing Globally Optimal Solutions for Single-Row Layout Problems Using Semidefinite Programming and Cutting Planes

Algorithms for Integer Programming

1. Lecture notes on bipartite matching February 4th,

Coverings and Matchings in r-partite Hypergraphs

Optimality certificates for convex minimization and Helly numbers

Introduction to Mathematical Programming IE406. Lecture 16. Dr. Ted Ralphs

Integer Programming Explained Through Gomory s Cutting Plane Algorithm and Column Generation

Linear Programming. L.W. Dasanayake Department of Economics University of Kelaniya

An Introduction to Chromatic Polynomials

4 LINEAR PROGRAMMING (LP) E. Amaldi Fondamenti di R.O. Politecnico di Milano 1

Graph bisection revisited

6 ROUTING PROBLEMS VEHICLE ROUTING PROBLEMS. Vehicle Routing Problem, VRP:

1 date: September 15, 1998 file: mitche2

McMaster University. Advanced Optimization Laboratory. Authors: Kartik Krishnan, John Mitchell. AdvOL-Report No. 2004/3

Randomized rounding of semidefinite programs and primal-dual method for integer linear programming. Reza Moosavi Dr. Saeedeh Parsaeefard Dec.

DM515 Spring 2011 Weekly Note 7

Solving lexicographic multiobjective MIPs with Branch-Cut-Price

Discrete mathematics , Fall Instructor: prof. János Pach

Using Facets of a MILP Model for solving a Job Shop Scheduling Problem

TR/03/94 March 1994 ZERO-ONE IP PROBLEMS: POLYHEDRAL DESCRIPTIONS AND CUTTING PLANE PROCEDURES. Fatimah Abdul-Hamid Gautam Mitra Leslie-Ann Yarrow

Mathematical Tools for Engineering and Management

15-451/651: Design & Analysis of Algorithms October 11, 2018 Lecture #13: Linear Programming I last changed: October 9, 2018

Artificial Intelligence

Linear Programming. Meaning of Linear Programming. Basic Terminology

Classification of Ehrhart quasi-polynomials of half-integral polygons

arxiv: v1 [cs.cc] 30 Jun 2017

The University of Jordan Department of Mathematics. Branch and Cut

The maximum-impact coloring polytope

Theorem 2.9: nearest addition algorithm

Optimality certificates for convex minimization and Helly numbers

Vertex-Colouring Edge-Weightings

Applied Integer Programming

Applied Lagrange Duality for Constrained Optimization

Chapter 1. Introduction

Linear programming and the efficiency of the simplex algorithm for transportation polytopes

Transcription:

Integer and Combinatorial Optimization: Clustering Problems John E. Mitchell Department of Mathematical Sciences RPI, Troy, NY 12180 USA February 2019 Mitchell Clustering Problems 1 / 14

Clustering Clustering We have n objects, each with a number of attributes. We wish to group similar objects into clusters. There is no limit on the number of clusters, or on the size of each cluster. We have a measure c ij of the difference between two objects i and j; the larger this measure, the less similar the objects. This measure can take positive or negative values. Mitchell Clustering Problems 2 / 14

Clustering Variables and dimension We model this by introducing variables x ij = { 1 if i and j in same cluster 0 if i and j in different clusters for 1 i < j n Let S B 1 2 n(n 1) be the set of feasible solutions. We have the following results regarding conv(s): Proposition The set S is full-dimensional. One way to prove this is to note that the origin and all the unit vectors are in S. Mitchell Clustering Problems 3 / 14

Clustering Nonnegativity Proposition The lower bound constraints x ij > 0 define facets of conv(s). Mitchell Clustering Problems 4 / 14

Clustering Triangle inequalities Proposition Let 1 i < j < k n. The triangle inequalities define facets of conv(s). x ij + x ik x jk 1 x ij x ik + x jk 1 x ij + x ik + x jk 1 These inequalities enforce consistency. For example, the first one says that if i and j are in the same cluster and also i and k are in the same cluster then j and k must be in the same cluster. The only binary solution violating this constraint is x ij = x ik = 1, x jk = 0. Mitchell Clustering Problems 5 / 14

Clustering An integer program Proposition Any binary vector satisfying all the triangle inequalities is the incidence vector of a clustering. Thus, finding the best binary vector satisfying the triangle inequalities will solve the clustering problem. Mitchell Clustering Problems 6 / 14

Clustering Upper bound constraints The upper bound constraints x ij 1 do not define facets of conv(s). In particular, if x ij = 1 then we must also have x ij + x ik x jk = 1 and x ij x ik + x jk = 1 for each other k. Mitchell Clustering Problems 7 / 14

2-partition inequalities Clustering The following proposition generalizes the lower bound and triangle inequalities. Proposition (2-partition inequalities) Let U and W be disjoint collections of objects with U > W. The following inequality defines a facet of conv(s): i U,j W x ij i U,j U x ij i W,j W x ij W. This gives the lower bound constraints when U = 2, W = 0. It gives the triangle constraints when U = 2, W = 1. Mitchell Clustering Problems 8 / 14

Clustering The objective function coefficients Note that if all the c ij are nonnegative then the optimal solution is to place each object in its own cluster, so all x ij = 0. Thus, our measure c ij cannot simply be the distance between two objects, but must allow negative values if we are to have an interesting problem. For more details, see Grötschel and Wakabayashi [3, 4]. Mitchell Clustering Problems 9 / 14

Equipartition Equipartition Given a graph G = (V, E) with n = V = 2q for some integer q, we partition V into two sets of size q. We define the variables x ij = { 1 if i and j in same partition 0 if i and j in different partitions for 1 i < j n Let S be the set of feasible incidence vectors of equipartitions. Mitchell Clustering Problems 10 / 14

Equipartition Polyhedral results We have the following results: Proposition The dimension of conv(s) is 1 2n(n 3). If C is a cycle with q + 1 vertices then the inequality x(e(c)) q 1 is facet defining. If U V with U 3 and odd, the clique inequality x(e(u)) 1 2 U 2 is facet-defining. Other inequalities are known (Conforti et al. [1, 2]). Mitchell Clustering Problems 11 / 14

Clustering with lower bound Clustering with lower bound Now consider a clustering problem where we require each cluster to contain at least q elements, for some positive integer q. For example, this problem arises in the following settings: allocating teams to divisions in a sports league. In this case, often require each division to have the same cardinality. microaggregation in the release of data: in order to preserve privacy, clusters with tiny sizes must be avoided. Mitchell Clustering Problems 12 / 14

Clustering with lower bound Clustering with lower bound Now consider a clustering problem where we require each cluster to contain at least q elements, for some positive integer q. For example, this problem arises in the following settings: allocating teams to divisions in a sports league. In this case, often require each division to have the same cardinality. microaggregation in the release of data: in order to preserve privacy, clusters with tiny sizes must be avoided. Mitchell Clustering Problems 12 / 14

Polyhedral theory Clustering with lower bound Let S B 1 2 n(n 1) be the set of incidence vectors of clusterings where each cluster contains at least q elements. We have the following results regarding conv(s): Proposition If q < n/2 then dim(conv(s)) = 1 2n(n 1), so S is full-dimensional. Proposition The nonnegativity constraints and the triangle constraints of Proposition 3 define facets of conv(s), provided q < n/3. The 2-partition inequalities of Proposition 5 define facets of conv(s) provided ( W + 2)q < n. Other families of valid inequalities are also known [5, 6]. Mitchell Clustering Problems 13 / 14

Polyhedral theory Clustering with lower bound Let S B 1 2 n(n 1) be the set of incidence vectors of clusterings where each cluster contains at least q elements. We have the following results regarding conv(s): Proposition If q < n/2 then dim(conv(s)) = 1 2n(n 1), so S is full-dimensional. Proposition The nonnegativity constraints and the triangle constraints of Proposition 3 define facets of conv(s), provided q < n/3. The 2-partition inequalities of Proposition 5 define facets of conv(s) provided ( W + 2)q < n. Other families of valid inequalities are also known [5, 6]. Mitchell Clustering Problems 13 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14

References The equipartition polytope I: Formulations, dimension and basic facets. Mathematical Programming, 49:49 70, 1990. The equipartition polytope II: Valid inequalities and facets. Mathematical Programming, 49:71 90, 1990. A cutting plane algorithm for a clustering problem. Mathematical Programming, 45:59 96, 1989. Facets of the clique partitioning polytope. Mathematical Programming, 47:367 387, 1990. X. Ji and J. E. Mitchell. Branch-and-price-and-cut on the clique partition problem with minimum clique size requirement. Discrete Optimization, 4(1):87 102, 2007. J. E. Mitchell. Realignment in the national football league: Did they get it right? Naval Research Logistics, 50(7):683 701, 2003. Mitchell Clustering Problems 14 / 14