Optimal transport and redistricting: numerical experiments and a few questions. Nestor Guillen University of Massachusetts

Similar documents
Math 734 Aug 22, Differential Geometry Fall 2002, USC

Conjectures concerning the geometry of 2-point Centroidal Voronoi Tessellations

Topological Data Analysis - I. Afra Zomorodian Department of Computer Science Dartmouth College

Definition A metric space is proper if all closed balls are compact. The length pseudo metric of a metric space X is given by.

04 - Normal Estimation, Curves

EE 584 MACHINE VISION

Lecture 2 September 3

The melting Rubik cube: From Fluids to Combinatorics and vice versa

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Digital Image Processing

MA 323 Geometric Modelling Course Notes: Day 21 Three Dimensional Bezier Curves, Projections and Rational Bezier Curves

Matching under transferable utility

Linear Programming and its Applications

Parameterization of triangular meshes

Mirror Symmetry Through Reflexive Polytopes

Lecture notes for Topology MMA100

Linear Programming: Introduction

As a consequence of the operation, there are new incidences between edges and triangles that did not exist in K; see Figure II.9.

Electoral Redistricting with Moment of Inertia and Diminishing Halves Models

Some Open Problems in Graph Theory and Computational Geometry

Simplicial Complexes: Second Lecture

Chapter 15 Introduction to Linear Programming

Discrete Exterior Calculus How to Turn Your Mesh into a Computational Structure. Discrete Differential Geometry

In what follows, we will focus on Voronoi diagrams in Euclidean space. Later, we will generalize to other distance spaces.

A Flavor of Topology. Shireen Elhabian and Aly A. Farag University of Louisville January 2010

Planar Graphs. 1 Graphs and maps. 1.1 Planarity and duality

1.2 Numerical Solutions of Flow Problems

14.30 Introduction to Statistical Methods in Economics Spring 2009

Lecture 4: Convexity

Quadratic and cubic b-splines by generalizing higher-order voronoi diagrams

Cuts of the Hypercube

Interpretations and Models. Chapter Axiomatic Systems and Incidence Geometry

A Sparse Algorithm for Optimal Transport

A noninformative Bayesian approach to small area estimation

Olmo S. Zavala Romero. Clustering Hierarchical Distance Group Dist. K-means. Center of Atmospheric Sciences, UNAM.

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE

SOLVING PARTIAL DIFFERENTIAL EQUATIONS ON POINT CLOUDS

Math 5320, 3/28/18 Worksheet 26: Ruler and compass constructions. 1. Use your ruler and compass to construct a line perpendicular to the line below:

Unit Activity Correlations to Common Core State Standards. Geometry. Table of Contents. Geometry 1 Statistics and Probability 8

Lecture 8 Object Descriptors

CS443: Digital Imaging and Multimedia Binary Image Analysis. Spring 2008 Ahmed Elgammal Dept. of Computer Science Rutgers University

Orthogonal Ham-Sandwich Theorem in R 3

Mirror, Mirror. String Theory and Pairs of Polyhedra. Ursula Whitcher. February Math Reviews & University of Wisconsin Eau Claire

Ohio Tutorials are designed specifically for the Ohio Learning Standards to prepare students for the Ohio State Tests and end-ofcourse

Convexity: an introduction

Alternative interpretation of the Plücker quadric s ambient space and its application

STATISTICS AND ANALYSIS OF SHAPE

Math 5593 Linear Programming Lecture Notes

On the undecidability of the tiling problem. Jarkko Kari. Mathematics Department, University of Turku, Finland

Discrete geometry. Lecture 2. Alexander & Michael Bronstein tosca.cs.technion.ac.il/book

Short-Cut MCMC: An Alternative to Adaptation

Def De orma f tion orma Disney/Pixar

Algebraic Geometry of Segmentation and Tracking

7 Fractions. Number Sense and Numeration Measurement Geometry and Spatial Sense Patterning and Algebra Data Management and Probability

Carnegie Learning Math Series Course 2, A Florida Standards Program

Crossing Families. Abstract

Facility Location and Strategic Supply Chain Management

Parameterization. Michael S. Floater. November 10, 2011

Geometry Processing & Geometric Queries. Computer Graphics CMU /15-662

MODEL SELECTION AND REGULARIZATION PARAMETER CHOICE

Introduction to Computer Science

CS 468 (Spring 2013) Discrete Differential Geometry

TOPOLOGY, DR. BLOCK, FALL 2015, NOTES, PART 3.

Cluster algebras and infinite associahedra

A GRAPH FROM THE VIEWPOINT OF ALGEBRAIC TOPOLOGY

CS-9645 Introduction to Computer Vision Techniques Winter 2019

Don t just read it; fight it! Ask your own questions, look for your own examples, discover your own proofs. Is the hypothesis necessary?

Laplacian Meshes. COS 526 Fall 2016 Slides from Olga Sorkine and Yaron Lipman

CS S Lecture February 13, 2017

A Singular Example for the Averaged Mean Curvature Flow

Survey of the Mathematics of Big Data

Voronoi Diagram. Xiao-Ming Fu

Technische Universität München Zentrum Mathematik

Note Set 4: Finite Mixture Models and the EM Algorithm

Notes on point set topology, Fall 2010

Summer School: Mathematical Methods in Robotics

Lower bounds on the barrier parameter of convex cones

Mathematics in Orbit

A barrier on convex cones with parameter equal to the dimension

An introduction to Topological Data Analysis through persistent homology: Intro and geometric inference

Computational Geometry

7. The Gauss-Bonnet theorem

Mathematics Scope & Sequence Grade 8 Revised: June 2015

Preliminaries: Size Measures and Shape Coordinates

Shape Modeling and Geometry Processing

Homotopy type of the complement. complex lines arrangements

Level lines based disocclusion

Other Voronoi/Delaunay Structures

Content Standard 1: Numbers, Number Sense, and Computation

Elementary Topology. Note: This problem list was written primarily by Phil Bowers and John Bryant. It has been edited by a few others along the way.

WUSTL Math Circle Sept 27, 2015

Applied Lagrange Duality for Constrained Optimization

Lecture 7: Jan 31, Some definitions related to Simplical Complex. 7.2 Topological Equivalence and Homeomorphism

HPISD Eighth Grade Math

Finite Element Methods

CHAPTER 6 Parametric Spline Curves

Gromov-Hausdorff distances in Euclidean Spaces. Facundo Mémoli

2D Spline Curves. CS 4620 Lecture 18

Exchangeability and continuum limits of discrete random structures

Open and Closed Sets

Transcription:

Optimal transport and redistricting: numerical experiments and a few questions Nestor Guillen University of Massachusetts

From Rebecca Solnit s Hope in the dark (apropos of nothing in particular) To hope is to give yourself to the future - and that commitment to the future is what makes the present inhabitable.

Traditional Districting Principles 1. Equal Population 2. Compactness 3. Contiguity 4. Respect for county/city boundaries 5. Respect for communities of interest 6. Voting Rights Act compliance

Traditional Districting Principles 1. Equal Population 7.???...we shall retain just #1, and see how that interacts with other desirable criteria.

Optimal Transport Optimal transport is a field of mathematics & an array of tools increasingly used throughout engineering and science. It s concerned with maps that preserve volume and/or ways of matching two distributions together and this transport / matching is optimal in some sense.

Optimal Transport Goes back to the 19th century. Two names that will come up today: 1. Gaspard Monge (19th century, French physicist and mathematician) 2. Leonidas Kantorovich (20th century, only Soviet Nobel prize in economics!) Optimal transport is partly a subfield of linear programming and optimization, partly a subfielf of Partial Differential Equations.

Optimal Transport The field has blossomed since the early 90 s and has become a fairly developed field whose methods are used in: image processing, pattern recognition, fluid mechanics, weather modeling, geometric optics, probability, statistical inference, differential geometry, economics...

The Monge Problem Setup Two sets of points on the plane, X (squares) and Y (circles). We are given the transportation cost from x i to y j, denoted c(x i, y j )

Problem The Monge Problem Setup Find a bijective map T : X Y minimizing N c(x, T (x)) i=1 (Total transportation cost of T )

Problem The Monge Problem Setup Find a bijective map T : X Y minimizing N c(x, T (x)) i=1 (Total transportation cost of T )

The Monge Problem As X and Y are finite, there are finitely many choices, so at least one of them achieves the smallest value However, this is a hard optimization problem when X and Y are sufficiently large.

The Monge Problem As X and Y are finite, there are finitely many choices, so at least one of them achieves the smallest value

The Monge Problem As X and Y are finite, there are finitely many choices, so at least one of them achieves the smallest value

The Kantorovich Problem Setup A variation We are now given mass densities f and g, representing Their total masses are equal f i = population at x i g j = capacity at y j N M f i = i=1 j=1 g j

The Kantorovich Problem Setup Instead of looking for T : X Y we broaden our options by allowing ourselves to split the mass leaving each x i

The Kantorovich Problem Setup Instead of looking for T : X Y we broaden our options by allowing ourselves to split the mass leaving each x i

The Kantorovich Problem Setup This splitting we encode by a matrix π, interpreted as π ij = population at x i assigned to y j M f i = π ij, g j = j=1 N i=1 π ij Such a π is called a transportation plan between f and g.

The Kantorovich Problem Setup Then: among all plans π, find one minimizing the otal cost N i=1 j=1 M c(x i, y j )π ij This is known as the Kantorovich problem. This is a linear optimization problem in π: the set of admissible plans forms a convex set in R NM, and the objective functional is a linear function of π.

The Kantorovich Problem In summary: Find π which achieves the minimum value of N i=1 j=1 while satisfying the constraints f i = g j = M c ij π ij M π ij for i = 1,..., N, j=1 N π ij for j = 1,..., M, i=1 π ij 0, for all i, j.

The Monge-Kantorovich Problem Monge versus Kantorovich Monge: Highly nonlinear variational/optimization problem. How do we find a minimizer? Solutions are nice : you do not split a house in two. Kantorovich: Amounts to a linear program! (easy to implement) But solutions may be multi-valued (mass-splitting!). Under extra assumptions, Kantorovich = Monge.

The Wasserstein distance For probability distributions µ and ν, their Wasserstein dist. is W 2 (µ, ν) := ( ) 1 min x y π X Y 2 2 π(x, y)dxdy This is a widely studied metric space (it behaves a lot like an infinite dimensional manifold) A special subset The set E M of empirical distributions made out of M points E M = {ν ν = 1 M M δ yj y 1,..., y M Y } j=1

OT: continuous source and discrete target We look for a map in some region R T : R R 2 {y 1,..., y M } = Y. Minimizing c(x, T (x))µ(x) dx R 2 So the population distribution µ is a mass density function

OT: continuous source and discrete target Theorem (cf. Brennier circa 1990) There are numbers α 1,..., α M such that the function u(x) = min {c(x, y j) + α j } 1 j M gives the optimal transport map, via T (x) = y k u(x) = c(x, y k ) + α k The sets {D j } given by D j = T 1 (y j ) provide a partition of the population distribution. Geomtrically, what does this theorem say about the nature of the minimizers?

Discrete OT So, we have seen that the pre-images T 1 (y k ) are given by the intersection of sets of the form {x c(x, y j ) + c(x, y k ) α j α k } This is just as with convex polygons being the intersection of a number of half-spaces! What happens when both source and target are discrete?

Discrete OT Example: Cartesian Grid and Square Cost Everything is true and trivial in the Cartesian grid! @iamcardib (or @justinmsolomon, I don t remember rn)

Discrete OT Example: Cartesian Grid and Square Cost Everything is true and trivial in the Cartesian grid! @iamcardib (or @justinmsolomon, I don t remember rn) Let us take c(x i, y j ) = x i y j 2, where the population and the y s are supported in Z 2. (Admittedly, here Z 2 hardly plays any role) We have the sets {x Z 2 c(x, y j ) + c(x, y k ) α j α k } But expanding the squares, we get that {x x (y j y k ) λ} for some λ depending on α j, α k and y j, y k.

Discrete OT Geometry of districts More generally, observe that Therefore u(x) c(x, y k ) + α k k. = c(x, y k ) + α k if x T 1 (y k ). T 1 (y k ) := {u(x) c(x, y k ) + α k }. Such a set is called a c-section (no, really!). They are the analogue of convex sets in the geometry induced by the cost c.

Optimal transport and partitioning Problem: Fix µ and K among all ν E K, minimize the Wasserstein dist. W 2 (µ, ν) This is a special instance of what is now known as the Wasserstein barycenter problem. In the math literature, this was first studied by Agueh and Carlier (2013). See Solomon s 2018 survey on numerical OT for an overall discussion of its uses in statistics, computer vision, ML, and more.

Optimal transport and partitioning One can use OT (and linear programming in general) to provide an optimality criterium for partitions, and thus also for computer-assisted redistricting. There is, for instance, this paper from 1965!

Optimal transport and partitioning Data: Cost function c(x i, y j ) between pairs of vertices. Population distribution f is given. (Total population is P ) Number of districts K.

Optimal transport and partitioning Partition Plans A partition plan π for f is one satisfying the following constraints: First, the usual mass balance with the population M π ij = f i for i = 1,..., N, j=1 π ij 0, for all i, j.

Optimal transport and partitioning Partition Plans Second, there are indices j 1,..., j K such that N π ij = P/K if j is not among {j 1,..., j K }, i=1 N π ij = 0 if j is not among {j 1,..., j K }. i=1 Now the problem is to find {y j1,..., y jk } leading to the smallest optimal transport cost from µ

Optimal transport and partitioning Partition Plans Note: One may think of the points y j1,..., y jk as virtual voting centers, or one may make them correspond to real ones, as may be convenient. The point is that as they are not given a priori, the locations y j1,..., y jk are part of the unknowns. (this is just like the Wasserstein barycente problem, except the cost is not quadratic)

Optimal transport and partitioning Contrast this with: Voronoi diagrams Consider a region Ω and points {x 1,..., x N }. Image credit: Wikipedia These points yield a partition of Ω into N regions, known as its Voronoi diagram, its cells being defined by D k = {x Ω x x k x x j j = 1,..., N}

Optimal transport and partitioning A related approach: Balanced Power Diagrams, as in Cohen-Addad, Klein, and Young (2018). They turn census data into weighted point clouds in the plane, and compute a balanced centroidal power diagram. The setting of the problem in R 2 leads to partitions given by convex polygons.

Optimal transport and partitioning From Cohen-Addad, Klein, and Young (2018)

Optimal transport and partitioning From Cohen-Addad, Klein, and Young (2018)

Optimal transport and partitioning Strong geometry dependence on the cost function Important: The resulting polygonal shapes are a consequence of our selection of a cost function c ij, c(x, y) = x y 2 Different cost functions, specially ones that incorporate specific policy principles, could yield law-complaint maps, while providing mathematical and rhetorical justification for them.

Optimal transport and partitioning Strong geometry dependence on the cost function The variation in map shape is best illustrated by Voronoi diagrams made using the Euclidean distance versus the Manhattan (cartesian grid) distance

Some numerical experiments Let us do the following: fix a state (PA), which has 18 seats. Take say, the graph of Census Tracts. The pairwise matrix of distances between the centroids, and µ the state s population distribution. Sample graph locations (uniformly*): {y 1,..., y 18 } Compute the optimal transport map from µ to the uniform distribution given by the y j.

Some numerical experiments Ok, so what did we just see? Although it did not happen with all plans, plenty of them had a couple of disconnected districts and districts with holes. Specially in the plans with 18 districts. In the semi-discrete setting, there is large class of costs for which one can prove the above does NOT happen. The condition involves a tensor invariant proposed by Ma, Trudinger, and Wang in 2005.

The Ma-Trudinger-Wang condition A condition on the cost function for OT problems in the continuum setting. Boils down to a series of inequalities involving 4th order derivatives of c(x, y). It is a a new type of curvature condition, with some relation to the Ricci curvature when c(x, y) = d(x, y) 2 (the squared geodesic distance). It is a necessary condition for differentiability of a related PDE the Monge-Ampère equation, and in turn, for the continuity/differentiability of optimal transport maps.

The Ma-Trudinger-Wang condition Then, the MTW condition is: if V η then So this expression is difficult to parse but it is related to a geometric property of c-hyperplanes (a family of surfaces arising from the cost)

The Ma-Trudinger-Wang condition Question: Is there a discrete analogue of the MTW condition for a cost function c ij in a graph (V, E)? Can one show under such a condition that the resulting districting plans are... path connected?... convex? In other words: when are c-sections for a given cost path-connected? how compact are these sets?

The Ma-Trudinger-Wang condition? A good first step towards answering this question is the following. Theorem (...maybe?, hopefully?) For a graph (V, E), if c(x i, y j ) is just the graph distance, then optimizers should be path connected.

In progress 1. Update the voting centers to further reduce the cost (e.g. Lloyd s algorithm). 2. Understand the lack of connectivity and lack of simple connectivity in some districts. Is it a side effect of the algorithm? Or does the underlying geometry cause this?. 3. Use this method to construct optimal plans according to the Transit Time Compactness metric. 4. Systematically produce optimal maps which are gerrymandered (consider a population mixture, impose relative size constrains district by district). 5. One potential use: computer-aided design of majority-minority districts.

More questions / things to do 1. How good are the compactness scores for the resulting districts? (i.e. compute lots plans for lots of states and tally various compactness scores). 2. Are there inequalities relating optimizers for the transport cost versus optimizers for the perimeter? (i.e. prove some theorems). 3. It would be worth combining this with a Markov chain for the voting centers which then determines the optimal transport plans. This generates a chain in the space of partitions. Does this chain have any interesting properties?.

Thank You! Questions? nguillen@math.umass.edu