
Graduate Artificial Intelligence 15-780
Homework 3: Heuristics in Search
Out on February 15. Due on March 1.

Problem 1: Extending Your SAT Solver (35 points)

In the last homework, you constructed both a SAT solver and a random 3SAT generator. In this problem you will extend your SAT solver, which we'll call S. As in the last homework, generate 25 random 3SAT instances for n = 50 and each c ∈ {150, 170, ..., 350}. These 275 problem instances will be your testing battery.

- Implement trivial backjumping in your SAT solver. "Trivial" means that your solver should jump back only to the most recently set variable in a node's conflict set. Call this SAT solver S_T.
- Replace the implementation of trivial backjumping in S_T with conflict-directed backjumping. Call this SAT solver S_B.
- Augment your SAT solver with conflict-directed backjumping, S_B, with clause learning. Call this SAT solver S_C.
- Extra Credit (15 points): GRASP (Marques-Silva and Sakallah 1999, available at http://goo.gl/krcin) is a SAT solver that uses advanced notions of clause learning and backjumping. Implement the GRASP solver and call it S_G.

Time {S, S_T, S_B, S_C, S_G} against your testing battery (one possible harness is sketched below). You should hand in plots, tables, and a discussion of the performance of the SAT solvers you built. Which solver ran best? Why do you think this is? Did c make a difference in your results?

We do not require you to submit source code, but we reserve the right to ask for it. (If you do choose to do the extra credit, please attach a well-commented printout of your implementation to the problem set.)
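If it helps to organize the experiment, the following minimal Python sketch shows one way to build the 275-instance battery and time a collection of solvers against it. The clause representation (lists of signed integers) and the solve(clauses, n_vars) interface are illustrative assumptions of ours, not requirements of the assignment.

    import random
    import time

    def random_3sat(n_vars, n_clauses, rng=random):
        """Random 3SAT instance: each clause uses 3 distinct variables,
        each negated with probability 1/2. Literal -v means "not v"."""
        clauses = []
        for _ in range(n_clauses):
            vs = rng.sample(range(1, n_vars + 1), 3)
            clauses.append([v if rng.random() < 0.5 else -v for v in vs])
        return clauses

    def build_battery(n_vars=50, c_values=range(150, 351, 20), per_c=25):
        """275 instances: 25 for each clause count c in {150, 170, ..., 350}."""
        return [(c, random_3sat(n_vars, c)) for c in c_values for _ in range(per_c)]

    def time_solvers(solvers, battery, n_vars=50):
        """Wall-clock each solver on every instance; returns {name: [(c, seconds), ...]}."""
        results = {name: [] for name in solvers}
        for name, solve in solvers.items():
            for c, clauses in battery:
                start = time.perf_counter()
                solve(clauses, n_vars)  # hypothetical solver interface
                results[name].append((c, time.perf_counter() - start))
        return results

    # e.g. results = time_solvers({"S": solve_dpll, "S_T": solve_trivial_bj}, build_battery())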
Problem 2: You're the Only Ten I See (65 points)

Every state receives a certain number of delegates to the US Congress based on its population. Congressional districting is the process of dividing a state into districts, each of which will be represented by a single congressperson living in that district. The US Code mandates that districts be contiguous and made as equal as possible in population. In this problem, you will use local search to divide the state of Tennessee into contiguous, compactly shaped districts of approximately equal population.

The basic unit of area in this problem is the tract: a contiguous group of residences where about five thousand people live. In real districting it is possible to go down to the level of individual blocks to make districts as equal-sized as possible; however, because of the coarseness of the tract-level data used here, even a good solution will show a discrepancy of a few thousand people between district sizes. The 2000 Census counted 5,689,283 people living in Tennessee, and the state was allocated nine districts. Consequently, the average number of people per district is p̄ ≈ 632,142.56.

Data files: Tennessee had 1,261 census tracts in the 2000 Census. The relevant information for each tract is given in two files, tracts.txt and neighbors.txt.

tracts.txt has 1,261 lines of the form

    id population x y

where id ∈ {0, ..., 1260} is a unique integer ID for each tract, population is the number of people living in the census tract, and x and y are the longitude and latitude of the center of the tract. (These values will be useful for visualizing your solutions.)

neighbors.txt has the adjacency information for the tracts. It has 1,261 lines of the form

    id neighbor neighbor neighbor ...

that is, an integer ID followed by the integer ID of every tract adjacent to it. This is the adjacency information to use in this problem. Do not use a custom adjacency based on the longitude and latitude values.
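A possible way to load these two files in Python is sketched below; the whitespace-separated field layout matches the formats above, and the function names are placeholders of our choosing.

    from collections import defaultdict

    def load_tracts(path="tracts.txt"):
        """Return {tract_id: (population, x, y)}."""
        tracts = {}
        with open(path) as f:
            for line in f:
                if not line.strip():
                    continue
                tid, pop, x, y = line.split()
                tracts[int(tid)] = (int(pop), float(x), float(y))
        return tracts

    def load_neighbors(path="neighbors.txt"):
        """Return {tract_id: set of adjacent tract ids}."""
        adj = defaultdict(set)
        with open(path) as f:
            for line in f:
                if not line.strip():
                    continue
                tid, *nbrs = map(int, line.split())
                adj[tid] = set(nbrs)
        return adj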
Introduction (15 points): As an introduction to local search, you will compare hill climbing and stochastic hill climbing. Algorithms 1 and 2 give pseudocode for the maximization versions of these search methods.

Algorithm 1: Hill Climbing

    best = current = StartNode();
    for some number of iterations do
        neighbors = GetNeighbors(current);
        sort neighbors by score, descending;
        current = neighbors.pop();
        if score(current) > score(best) then best = current;
    return best;

Algorithm 2: Stochastic Hill Climbing

    best = current = StartNode();
    for some number of iterations do
        neighbors = GetNeighbors(current);
        sort neighbors by score, descending;
        while rand() < δ and !neighbors.empty do
            current = neighbors.pop();
        if score(current) > score(best) then best = current;
    return best;

Several functions that define these algorithms are left for you to write:

- StartNode() should return a random start node. One approach is to assign every tract to a single district, and then re-assign a random collection of 8 tracts, one to each of the other districts, taking care that the re-assignments do not violate the contiguity of the large district.
- GetNeighbors(node) should return the local neighborhood of a search node. One approach is to find the smallest district and allocate it one of its adjoining tracts.
- score(node) should return the score of a node. For this part, use the negative variance of district populations: letting p(d_i) denote the population of district d_i, the score of a node is −(1/9) Σ_i (p(d_i) − p̄)². With this objective function, the upper bound on node score is zero.

Regarding contiguity: a contiguous district forms a connected component in the graph whose vertices are tracts and whose edges are the given adjacencies. Your solution to the problem must be contiguous, and you are very strongly advised never to consider non-contiguous solutions in your search. Your solution should include a function that checks whether a prospective collection of tracts would form a contiguous district (one such check is sketched below).
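Here is a minimal sketch of such a contiguity check, written as a breadth-first search over the neighbors.txt adjacency; the data structures mirror the loader sketched earlier, and the function name is ours.

    from collections import deque

    def is_contiguous(tract_ids, adj):
        """True if tract_ids form a single connected component under adj
        (adj: {tract_id: set of adjacent tract ids})."""
        tract_ids = set(tract_ids)
        if not tract_ids:
            return False
        start = next(iter(tract_ids))
        seen = {start}
        queue = deque([start])
        while queue:
            t = queue.popleft()
            for nbr in adj[t]:
                if nbr in tract_ids and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        return len(seen) == len(tract_ids)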
Implement Hill Climbing and Stochastic Hill Climbing with δ = 0.75, using the starting-node and neighborhood suggestions we have made or ones of your own. Regardless, be sure to fully describe the details of your implementation. Plot the mean and median scores of the best node found from 101 runs over their first 500 steps. How similar are the means and medians? Why? Which algorithm performs better? Why do you think that is? Are there prominent differences between Hill Climbing and Stochastic Hill Climbing in terms of which neighbors are expanded? Did you notice any problems with local optima in Hill Climbing? Why do you think that is?
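For concreteness, here is one possible Python rendering of Algorithm 2 together with the negative-variance score. It assumes a search node is a list of nine sets of tract IDs and that you supply StartNode and GetNeighbors yourself; the value of p̄ is the figure given above, and all names are placeholders, not a prescribed implementation.

    import random

    P_BAR = 632142.56  # average people per district, from the problem statement

    def score(node, tract_pop):
        """Negative variance of district populations (higher is better, max 0).
        node: list of 9 sets of tract ids; tract_pop: {tract_id: population}."""
        pops = [sum(tract_pop[t] for t in district) for district in node]
        return -sum((p - P_BAR) ** 2 for p in pops) / len(node)

    def stochastic_hill_climb(start, get_neighbors, score_fn, iters=500, delta=0.75):
        """Algorithm 2: keep popping down the score-sorted neighbor list while
        rand() < delta, and remember the best node seen so far."""
        current = best = start
        for _ in range(iters):
            neighbors = sorted(get_neighbors(current), key=score_fn, reverse=True)
            while neighbors and random.random() < delta:
                current = neighbors.pop(0)  # take the next-best remaining neighbor
            if score_fn(current) > score_fn(best):
                best = current
        return best

    # e.g. best = stochastic_hill_climb(StartNode(), GetNeighbors,
    #                                   lambda n: score(n, tract_pop))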
Competition (25 points): Consider the more sophisticated objective function

    obj(d⃗) = Σ_i [ −100 ((p(d_i) − p̄) / p̄)² + cc(d_i) ]

where cc is the clustering coefficient of a district: the fraction of the possible adjacencies of the tracts of d_i that fall inside d_i. Formally, let t_j be the number of tracts adjacent to tract j, and let t_j^i be the number of tracts adjacent to tract j that are in district i. Then

    cc(d_i) = ( Σ_{j ∈ d_i} t_j^i ) / ( Σ_{j ∈ d_i} t_j ).

Find a solution to the districting problem with maximum score under this new objective by using local search. Feel free to update any part of the local search method used in the previous part: the starting node, the neighborhood definition, or the neighbor-selection routine. Here are some ideas you might consider:

- Find good starting nodes by separating out the initial tracts of a district based on geographic, adjacency, or some kind of population-weighted distance.
- Expand your neighborhood definition to include swaps and multiple simultaneous moves.
- Try a more advanced search method like Tabu search (which keeps a fixed-length queue of recently visited nodes and will not revisit those nodes) or Beam search (which considers a set of nodes at each step and takes the nodes for the next iteration from the union of the neighborhoods of the current nodes).

Submit your solution by putting a file named districts.txt into your dropbox. The file districts.txt should consist of 1,261 lines of the form

    id district

where id is the unique tract integer ID and district is an integer 1 through 9. For this part of the problem you will be graded on the value of the objective function obj at the solution you submit. Let s = obj(d⃗) be the score of your submission:

- If s ≥ 7, you will receive full credit.
- If 5 ≤ s < 7, you will receive 20 points.
- If 0 ≤ s < 5, you will receive 15 points.
- If −10 ≤ s < 0, you will receive 10 points.
- If s < −10, or if your solution does not have contiguous districts, you will not receive any points for this section.

Additionally, the top five submissions in terms of score will receive 6 − rank extra credit points. The TAs were able to quickly code up a solution with a score of about 7.7; we expect the winning score to be around 8. Your solutions should be entirely algorithmic. Attach your source code for this part of the problem to your homework, and clearly note your implementations of StartNode() and GetNeighbors(node). One way to evaluate this objective and write the submission file is sketched below.
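Under the formulas above (the population term acts as a penalty), the new objective and the districts.txt file might be computed as in the following sketch; the data structures match the earlier loaders, and all function names are ours.

    def clustering_coefficient(district, adj):
        """cc(d_i): fraction of the adjacencies of d_i's tracts that stay inside d_i.
        district: set of tract ids; adj: {tract_id: set of adjacent tract ids}."""
        total = sum(len(adj[t]) for t in district)                # Σ_{j in d_i} t_j
        internal = sum(len(adj[t] & district) for t in district)  # Σ_{j in d_i} t_j^i
        return internal / total if total else 0.0

    def objective(node, tract_pop, adj, p_bar=632142.56):
        """obj = Σ_i [ -100 ((p(d_i) - p̄)/p̄)² + cc(d_i) ]; node: list of 9 sets."""
        total = 0.0
        for district in node:
            pop = sum(tract_pop[t] for t in district)
            total += -100.0 * ((pop - p_bar) / p_bar) ** 2
            total += clustering_coefficient(district, adj)
        return total

    def write_solution(node, path="districts.txt"):
        """Write one 'id district' line per tract, districts numbered 1 through 9."""
        assignment = {tid: i for i, district in enumerate(node, start=1) for tid in district}
        with open(path, "w") as f:
            for tid in sorted(assignment):
                f.write(f"{tid} {assignment[tid]}\n")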
Writeup (25 points): In addition to the writeup of your hill climbing implementation, discuss how you solved the optimization problem. We are as much interested in what worked as in what didn't. Which of the three parts of the local search algorithm (start node, neighborhood definition, selection criteria) did you find most important to improving your solution quality? Also, please consider the following questions in your writeup; we do not expect exact answers to them.
- What is the size of the search space (how many nodes are there)? How hard, from a complexity standpoint, is this problem? How would you translate this problem into a SAT instance?
- California is the most populous state; it has 53 congressional districts. How well do you think your approach would scale to districting California?
- Why do you think we did not want you to look at non-contiguous prospective solutions?
- Can you think of ways other than the clustering coefficient to measure the compactness of a district? If all the tracts are roughly the same size and shape, can you come up with a district that has a very high clustering coefficient but that would not be considered compact?

Extra Credit: Come up with a good way of visualizing your solution. The most innovative and attractive visualization will receive 5 points of extra credit, and the second- and third-best will receive 3 points of extra credit. Attempting any kind of visualization will earn 1 point of extra credit.