Parallel and Distributed Computing
Project Assignment
MAX-SAT SOLVER
Version 1.0 (07/03/2016)
2015/2016, 2nd Semester
Contents

1 Introduction
2 Problem Description
  2.1 Illustrative Example
  2.2 Discussion of the Algorithm to Implement
3 Implementation Details
  3.1 Input Data
  3.2 Output Data
  3.3 Sample Problem
4 Part 1 - Serial implementation
5 Part 2 - OpenMP implementation
6 Part 3 - MPI implementation
7 What to Turn in, and When

Revisions

Version 1.0 (March 7th, 2016): Initial Version
1 Introduction

The purpose of this class project is to gain experience in parallel programming on an SMP and a multicomputer, using OpenMP and MPI, respectively. For this assignment you are to write one sequential and two parallel implementations of a Maximum Satisfiability (MAXSAT) solver.

2 Problem Description

The Boolean Satisfiability Problem, known simply as SAT, is the problem of determining whether there exists an assignment of the variables in a Boolean formula such that the formula evaluates to true. Although the format of the formula description is irrelevant, a format commonly used in practice is the conjunctive normal form (CNF). Under CNF, a Boolean formula consists of a conjunction of clauses, and a clause is a disjunction of literals. A literal is either a variable, then called a positive literal, or the negation of a variable, then called a negative literal.

The maximum satisfiability problem (MAXSAT) can be seen as a generalization of SAT. MAXSAT is defined as the problem of determining the maximum number of clauses of a given Boolean formula in conjunctive normal form that can be made true by an assignment of the variables.

2.1 Illustrative Example

Consider the following Boolean formula in CNF:

φ = (¬A) ∧ (A ∨ B) ∧ (¬B ∨ C) ∧ (A ∨ ¬B ∨ ¬C)

which has 3 variables (A, B and C) and 4 clauses with 1, 2, 2, and 3 literals, respectively. It is easily verified that this formula is not satisfiable: the first clause implies that A = 0; since A = 0, the second clause implies that B = 1; substituting these values into the third and fourth clauses we get (C) ∧ (¬C), which cannot both be true.

However, for this project we are looking for the assignment that maximizes the number of CNF clauses that evaluate to 1. Of course, if the formula is satisfiable, the solution to MAXSAT is the number of clauses in the formula. For our example, the MAXSAT solution is 3, achieved for instance with A = 0, B = 1 and any assignment to C.
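
The example above can be checked mechanically. The following is a minimal sketch, not part of the assignment: it assumes clauses are stored as 0-terminated arrays of signed variable indices (A = 1, B = 2, C = 3), and the names example_formula, count_satisfied and brute_force_maxsat are hypothetical. It simply enumerates all assignments, which is only sensible for very small formulas.

```c
#include <stdlib.h>

/* Assumed encoding: a clause is a 0-terminated array of signed variable
   indices, +v for a positive literal and -v for a negative literal. */
static const int c1[] = {-1, 0};          /* (not A)               */
static const int c2[] = {1, 2, 0};        /* (A or B)              */
static const int c3[] = {-2, 3, 0};       /* (not B or C)          */
static const int c4[] = {1, -2, -3, 0};   /* (A or not B or not C) */
const int *const example_formula[] = {c1, c2, c3, c4};

/* Counts the clauses satisfied by a complete assignment.
   value[v] is 0 or 1 for variable v; index 0 is unused. */
int count_satisfied(const int *const *clauses, int ncl, const int *value)
{
    int satisfied = 0;
    for (int i = 0; i < ncl; i++) {
        for (const int *lit = clauses[i]; *lit != 0; lit++) {
            int v = abs(*lit);
            if ((*lit > 0) == (value[v] == 1)) {   /* this literal is true */
                satisfied++;
                break;
            }
        }
    }
    return satisfied;
}

/* Tries all 2^nvar assignments (illustration only, requires nvar <= 30).
   Returns the maximum number of satisfied clauses and stores in *count
   how many assignments reach it. */
int brute_force_maxsat(const int *const *clauses, int ncl, int nvar,
                       unsigned long *count)
{
    int value[32] = {0};
    int best = -1;
    *count = 0;
    for (unsigned long bits = 0; bits < (1UL << nvar); bits++) {
        for (int v = 1; v <= nvar; v++)
            value[v] = (int)((bits >> (v - 1)) & 1UL);
        int sat = count_satisfied(clauses, ncl, value);
        if (sat > best) {
            best = sat;
            *count = 1;
        } else if (sat == best) {
            (*count)++;
        }
    }
    return best;
}
```

Calling brute_force_maxsat(example_formula, 4, 3, &count) on the formula above should return 3, the MAXSAT value stated in the text.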

2.2 Discussion of the Algorithm to Implement

A brute force approach to the MAXSAT problem is simply to test the formula for all possible variable assignments. There are many sophisticated techniques to make this search more efficient. For this project, we do not want any type of research about these techniques. Keep in mind that the focus is the parallelization strategy.

You are to implement a simple branch and bound search for the MAXSAT solution. The search should be made using a binary tree, where each level corresponds to a variable and the two children of each node correspond to the two possible assignments to that variable. Every time you arrive at a leaf of the tree (all variables have been assigned), the number of satisfied clauses is registered and the maximum value found should be saved.

During the search, the current maximum value should be used as a lower bound to prune the search. At any node, the total number of clauses minus the number of clauses already unsatisfied by the current partial variable assignment is the most that any leaf below that node can achieve. If this value is lower than the lower bound, we do not need to descend the tree further, as it is guaranteed that no better solution can be found below.
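
The search described above can be sketched as follows. This is one possible, deliberately unoptimized rendering under stated assumptions, not a reference solution: the names search_t, bb_search and solve are hypothetical, clauses are assumed to be 0-terminated arrays of signed variable indices, and the number of falsified clauses is recomputed from scratch at every node. The sketch also counts the assignments that reach the maximum, which is why a node is pruned only when its bound is strictly lower than the current best.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_VARS 128   /* the assignment states nvar < 128 */

typedef struct {
    int nvar, ncl;
    const int *const *clauses;   /* each clause: 0-terminated signed literals */
    int best;                    /* best number of satisfied clauses so far */
    unsigned long long count;    /* complete assignments reaching best
                                    (may overflow for very large counts) */
    int best_value[MAX_VARS];    /* one assignment reaching best (index 0 unused) */
} search_t;

/* Clauses already falsified when only variables 1..depth are assigned:
   every literal refers to an assigned variable and evaluates to false. */
static int count_unsatisfied(const search_t *s, const int *value, int depth)
{
    int unsat = 0;
    for (int i = 0; i < s->ncl; i++) {
        int falsified = 1;
        for (const int *lit = s->clauses[i]; *lit != 0; lit++) {
            int v = abs(*lit);
            if (v > depth || (*lit > 0) == (value[v] == 1)) {
                falsified = 0;   /* unassigned or true literal */
                break;
            }
        }
        unsat += falsified;
    }
    return unsat;
}

static void bb_search(search_t *s, int *value, int depth)
{
    /* Most clauses any leaf below this node can still satisfy. */
    int bound = s->ncl - count_unsatisfied(s, value, depth);
    if (bound < s->best)
        return;                              /* prune this subtree */
    if (depth == s->nvar) {                  /* leaf: bound is the exact score */
        if (bound > s->best) {
            s->best = bound;
            s->count = 1;
            memcpy(s->best_value, value, (size_t)(s->nvar + 1) * sizeof(int));
        } else {
            s->count++;                      /* here bound == best */
        }
        return;
    }
    value[depth + 1] = 1;
    bb_search(s, value, depth + 1);
    value[depth + 1] = 0;
    bb_search(s, value, depth + 1);
}

/* Fills in best, count and best_value for the formula described by s. */
void solve(search_t *s)
{
    int value[MAX_VARS] = {0};
    s->best = -1;        /* so that the first leaf is always recorded */
    s->count = 0;
    bb_search(s, value, 0);
}
```

A caller would fill in nvar, ncl and clauses, call solve, and then read best, count and best_value. Tracking the falsified clauses incrementally instead of rescanning the formula at each node is an obvious refinement left out here for brevity.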

For this project you should not only determine the MAXSAT value, but also the number of different variable assignments that achieve that maximum number of satisfied clauses.

3 Implementation Details

3.1 Input Data

The description of the problem is contained in a file (e.g., ex1.in) that starts with two positive integers indicating, respectively, the number of variables (nvar < 128) and the number of clauses (ncl < 2^16) in the formula. Then, each line contains one of the clauses. To simplify, each variable is identified by an integer, signed to indicate a positive or negative literal. Hence, each line is a sequence of signed integers ending with the value 0.

Your program should accept one and only one input parameter on the command line, used to specify the name of this input file.

3.2 Output Data

The output of this problem should be, in two separate lines:

- in the first line, two integers separated by a space: the first is the value of MAXSAT and the second is the number of different complete variable assignments that achieve MAXSAT;
- in the second line, one of the assignments that achieve MAXSAT (it does not matter which), indicated as a sequence of integers separated by a space, positive for a value of 1, negative for a value of 0.

Your program should send these two output lines (and nothing else) to the standard output. The project cannot be graded unless you follow these input and output rules!

3.3 Sample Problem

Consider the formula φ above. The input file (e.g., ex1.in) that describes this formula will be:

    3 4
    -1 0
    1 2 0
    -2 3 0
    1 -2 -3 0

The program execution for this case should be:

    $ maxsat-serial ex1.in
    3 7
    -1 2 3
    $

(the last line can actually be any assignment except 1 2 -3).
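
The input and output rules above could be handled along the following lines. This is a hedged sketch only: formula_t, read_formula, free_formula and print_solution are hypothetical names, and the validation shown (range checks on nvar, ncl and each literal) is an assumption about what a careful reader of the file might want, not a requirement stated in the assignment.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int nvar, ncl;
    int **clauses;   /* clauses[i] is a 0-terminated array of signed literals */
} formula_t;

/* Releases everything allocated by read_formula. */
void free_formula(formula_t *f)
{
    if (f->clauses != NULL)
        for (int i = 0; i < f->ncl; i++)
            free(f->clauses[i]);
    free(f->clauses);
    f->clauses = NULL;
}

/* Reads one clause (signed integers up to and including the final 0).
   Returns a malloc'ed array, or NULL on malformed input or allocation failure. */
static int *read_clause(FILE *in, int nvar)
{
    int cap = 4, len = 0, lit;
    int *c = malloc((size_t)cap * sizeof *c);
    if (c == NULL)
        return NULL;
    do {
        if (fscanf(in, "%d", &lit) != 1 || abs(lit) > nvar) {
            free(c);
            return NULL;
        }
        if (len == cap) {
            int *tmp = realloc(c, (size_t)(cap *= 2) * sizeof *c);
            if (tmp == NULL) {
                free(c);
                return NULL;
            }
            c = tmp;
        }
        c[len++] = lit;
    } while (lit != 0);
    return c;
}

/* Reads "nvar ncl" followed by ncl clauses. Returns 0 on success, -1 otherwise. */
int read_formula(FILE *in, formula_t *f)
{
    f->clauses = NULL;
    if (fscanf(in, "%d %d", &f->nvar, &f->ncl) != 2 ||
        f->nvar <= 0 || f->nvar >= 128 || f->ncl <= 0 || f->ncl >= 65536)
        return -1;
    f->clauses = calloc((size_t)f->ncl, sizeof *f->clauses);
    if (f->clauses == NULL)
        return -1;
    for (int i = 0; i < f->ncl; i++) {
        f->clauses[i] = read_clause(in, f->nvar);
        if (f->clauses[i] == NULL) {
            free_formula(f);
            return -1;
        }
    }
    return 0;
}

/* Writes the two output lines; value[v] is 0 or 1 for v = 1..nvar. */
void print_solution(FILE *out, int maxsat, unsigned long long count,
                    const int *value, int nvar)
{
    fprintf(out, "%d %llu\n", maxsat, count);
    for (int v = 1; v <= nvar; v++)
        fprintf(out, "%s%d", v > 1 ? " " : "", value[v] ? v : -v);
    fputc('\n', out);
}
```

In a complete program, main would check that exactly one command-line argument was given, open that file with fopen, call read_formula, run the search, and call print_solution with stdout.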

4 Part 1 - Serial implementation

Write a serial implementation of the algorithm in C (or C++). Name the source file of this implementation maxsat-serial.c. As stated, your program should expect exactly one input parameter. This will be your base for comparisons, so the branch and bound implementation is expected to be as efficient as possible.

5 Part 2 - OpenMP implementation

Write an OpenMP implementation of the algorithm, with the same rules and input/output descriptions. Name this source code maxsat-omp.c. You can start by simply adding OpenMP directives, but you are free, and encouraged, to modify the code in order to make the parallelization more effective and more scalable. Be careful about synchronization and load balancing!

6 Part 3 - MPI implementation

Write an MPI implementation of the algorithm, as for OpenMP, and address the same issues. Name this source code maxsat-mpi.c. For MPI, you will need to modify your code substantially. Besides synchronization and load balancing, you will need to minimize the impact of communication costs.

Extra credit will be given to groups that present a combined MPI+OpenMP implementation.

7 What to Turn in, and When

You must eventually submit the sequential and both parallel versions of your program (please use the filenames indicated above) and the times to run the parallel versions on input data that will be made available (for 1, 2, 4 and 8 parallel tasks). Note that we will not be using any level of compiler optimization to evaluate the performance of your programs, so you should not use any either.
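
When reporting the measured times, it can help to derive speedup and efficiency from them. The assignment does not prescribe these metrics; the sketch below uses the usual definitions (speedup = serial time / parallel time, efficiency = speedup / number of tasks), and the function names are hypothetical.

```c
#include <stdio.h>

/* Speedup of a parallel run relative to the serial run. */
double speedup(double t_serial, double t_parallel)
{
    return t_serial / t_parallel;
}

/* Efficiency: speedup divided by the number of parallel tasks. */
double efficiency(double t_serial, double t_parallel, int tasks)
{
    return speedup(t_serial, t_parallel) / tasks;
}

/* Prints one row per measurement; tasks[i] and times[i] describe run i. */
void print_scaling_table(FILE *out, double t_serial,
                         const int *tasks, const double *times, int n)
{
    fprintf(out, "%5s %12s %8s %10s\n", "tasks", "time (s)", "speedup", "efficiency");
    for (int i = 0; i < n; i++)
        fprintf(out, "%5d %12.3f %8.2f %10.2f\n", tasks[i], times[i],
                speedup(t_serial, times[i]),
                efficiency(t_serial, times[i], tasks[i]));
}
```

The times themselves have to come from your own runs with 1, 2, 4 and 8 tasks; the helper only formats and relates them.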

You must also submit a short report about the results (1-2 pages) that discusses:

- the approach used for parallelization
- what decomposition was used
- what the synchronization concerns were, and why
- how load balancing was addressed
- what the performance results are, and whether they are what you expected

You will turn in the serial version and the OpenMP parallel version at the first due date, with the short report, and then the serial version again (hopefully the same) and the MPI parallel version at the second due date, with an updated report. Both the code and the report will be uploaded to the Fenix system in a zip file. Name these files g<n>omp.zip and g<n>mpi.zip, where <n> is your group number.

1st due date (serial + OMP): April 8th, until 5pm.
Note: your project will be tested in the practical class just after the due date.

2nd due date (serial + MPI): May 13th, until 5pm.
Note: your project will be tested in the practical class just after the due date.