A Genetic Algorithm for the Number Partitioning Problem

Jordan Junkermeier
Department of Computer Science, St. Cloud State University, St. Cloud, MN 56301 USA

Abstract

The Number Partitioning Problem (NPP) is an NP-hard problem of combinatorial optimization in which a set of positive integers must be partitioned into two subsets such that the sums of the subsets are as equal as possible (Hayes, 2002). The longest processing time heuristic is a known greedy approach to solving this problem. In this paper, a Genetic Algorithm (GA) is presented as a solution to the NPP and is compared to this greedy heuristic. The GA encodes candidate solutions as binary strings, uses k-tournament selection to choose parent chromosomes, and uses two-point crossover and probabilistic bitwise mutation as operators on the encodings. The two algorithms are compared on identical input sets, and the results show that the genetic algorithm produced superior solutions to those of the greedy heuristic. However, the results of the genetic algorithm are probabilistic, in that each successive execution of the genetic program may produce a different overall best partition.

1. The Problem

1.1. Background

The Number Partitioning Problem (NPP), sometimes referred to as the easiest hard problem (Hayes, 2002; Mertens, 2003), is an NP-hard problem of combinatorial optimization. Given a set of positive integers, the problem is to partition the integers into two subsets such that the sums of the subsets are as equal as possible (Hayes, 2002). Formally, given a set $a_1, a_2, \ldots, a_n$ of positive integers, find a partition $\beta \subseteq \{1, \ldots, n\}$ such that

$$E(\beta) = \left| \sum_{i \in \beta} a_i - \sum_{i \notin \beta} a_i \right|$$

is minimized (Mertens, 2003). A perfect partition is a partition in which the sums of the two subsets are equal, such that $E = 0$ (Mertens, 2003). A perfect partition is always desired, yet not all sets have a perfect partition, as shown in the second example below.

The following examples demonstrate the Number Partitioning Problem. Take the set of positive integers A = {1, 2, 3, 4}.
The optimal partition for this set forms subsets $A_1 = \{2, 3\}$ and $A_2 = \{1, 4\}$, such that $E(A_1) = 0$, a perfect partition. In a second, less favorable example, the set B = {4, 6} can, at best, be partitioned to form subsets $B_1 = \{4\}$ and $B_2 = \{6\}$, such that $E(B_1) = 2$. While this partition is not perfect, it is still optimal, since the difference between the sums is minimized. In this example, a perfect partition is not possible.

1.2. Complexity

The computational complexity of the Number Partitioning Problem depends on the type of numbers in the input set $A = \{a_1, a_2, \ldots, a_N\}$. If each $a_i$ is a positive integer bounded by a constant $B$, then the difference $E$ between the partitioned subsets can take on at most $NB$ different values, such that the search space is $O(NB)$ instead of $O(2^N)$. This is known as pseudo-polynomiality (Mertens, 2003). However, a typical input set is composed of independently and identically distributed random numbers, such that the minimal difference $E_1$ is a stochastic variable with median value $O(\sqrt{N} \cdot 2^{-N})$ (Karmarkar et al., 1986). Surprisingly, heuristic algorithms for the NPP are of poor quality (Johnson et al., 1991; Ruml et al., 1996). The differencing method is the best polynomial-time heuristic, which, for an input set $A$ of real-valued $a_i$, yields discrepancies $O(N^{-\alpha \log N})$ for some positive constant $\alpha$ (Yakir, 1996). Due to the NP-hardness of the NPP, for any input set $A$ bounded by $B = 2^{kN}$, the worst-case complexity of any exact algorithm is exponential in $N$ for all $k > 0$ (Mertens, 2003).

1.3. Applications

The Number Partitioning Problem can be applied to areas including task scheduling (Mertens, 2003), public key cryptography (Mertens, 2003; Merkle & Hellman, 1978), and team selection for sporting events (Hayes, 2002). Specifically, the NPP can be utilized in multiprocessor scheduling and in VLSI circuit size and delay minimization (Coffman & Lueker, 1991; Tsai, 1992).
Additionally, the Number Partitioning Problem is one of the six basic NP-hard problems that are fundamental to the theory of NP-completeness (Garey & Johnson, 1997; Mertens, 2002) and is often used in NP-completeness proofs for problems such as knapsack problems, quadratic programming, bin packing, and multiprocessor scheduling (Mertens, 2003).

2. The Greedy Heuristic

Because the Number Partitioning Problem is NP-hard, exact solutions with known algorithms are only possible for small problem instances (Pedroso & Kubo, 2008). Therefore, the idea of an exact solution should be abandoned and approximate heuristic algorithms should be implemented instead (Mertens, 2003). One such heuristic is the longest processing time heuristic, commonly used in the multiprocessor scheduling problem (Pedroso & Kubo, 2008). In this algorithm, the largest number in the original set is placed into one of the two subsets. The largest remaining number is then placed into the subset that has the smaller total sum. This continues until all numbers have been assigned to a subset. The aim of this heuristic is to keep the sum discrepancy as small as possible with each successive decision (Mertens, 2003). Figure 1 shows the heuristic applied to the set A = {4, 5, 6, 7, 8}.

A = {4, 5, 6, 7, 8}              (time runs from top to bottom)
A_1 = {8}          A_2 = { }
A_1 = {8}          A_2 = {7}
A_1 = {8}          A_2 = {7, 6}
A_1 = {8, 5}       A_2 = {7, 6}
A_1 = {8, 5, 4}    A_2 = {7, 6}
Final sum discrepancy = 4

Figure 1. The longest processing time heuristic applied to the set A = {4, 5, 6, 7, 8}.

For this instance, the heuristic produces a partition with a sum discrepancy of four. The optimal partition for the set in the above example is {7, 8} {4, 5, 6}, a perfect partition with a sum discrepancy of zero. In addition to the greedy heuristic's failure to produce the optimal partition, the partition {6, 8} {4, 5, 7}, with a discrepancy of two, was also missed. In short, while this greedy heuristic may be acceptable, it is not ideal.

This algorithm's time complexity is O(N log N), the time complexity of sorting N numbers (Mertens, 2003). As in the example depicted in Figure 1, the worst situation arises when the sums of the two subsets are equal just before the last insertion. In this case, the final discrepancy will necessarily be equal to the last number inserted.
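
The greedy procedure described in this section can be sketched in a few lines. The following is a minimal Python illustration rather than the paper's C# program; the function name `lpt_partition` and the rule that ties go to the first subset are assumptions, the latter chosen so that the run matches the trace in Figure 1.

```python
def lpt_partition(numbers):
    """Greedy longest-processing-time partition of positive integers.

    Hypothetical helper for illustration; not the paper's C# code.
    """
    first, second = [], []
    sum_first = sum_second = 0
    # Visit the numbers from largest to smallest; the sort dominates the cost.
    for value in sorted(numbers, reverse=True):
        # Place the number into the subset with the smaller running sum.
        # Assumption: ties go to the first subset, as in the Figure 1 trace.
        if sum_first <= sum_second:
            first.append(value)
            sum_first += value
        else:
            second.append(value)
            sum_second += value
    return first, second, abs(sum_first - sum_second)


first, second, discrepancy = lpt_partition([4, 5, 6, 7, 8])
print(first, second, discrepancy)  # [8, 5, 4] [7, 6] 4
```

On this input the two running sums are both 13 just before the last insertion, so the final discrepancy equals the last number placed, 4.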
This worst case, in which the final discrepancy equals the last number inserted, is the motivation for assigning the numbers in decreasing order, which gives the scaling $O(N^{-1})$ of the result for real-valued $a_j$ (Mertens, 2003).

3. The Genetic Algorithm

In addition to greedy heuristics, genetic algorithms (GAs) can also be used to produce adequate solutions to NP-hard problems. Genetic algorithms are heuristics based on biological evolution that simulate reproduction with variation and selection according to fitness, like that of a true biological population (Julstrom, 2015). The genetic algorithm created to solve the NPP is implemented in the C# programming language and follows the general structure of a typical genetic algorithm, shown in Figure 2 (Julstrom, 2015). In this GA, the program iterates through a set number of generations and then halts, reporting the best overall solution.

Generate random initial population;
While (not done) {
    For i = 1 to population size {
        Select two parents;
        Crossover to produce an offspring;
        Mutate the offspring;
        Insert offspring into new generation;
    }
    Offspring replace parents;
    Report the best solution in the population;
}
Report the best overall solution;

Figure 2. The general structure of a genetic algorithm.

3.1. Encoding Candidate Solutions

In a GA designed for the Number Partitioning Problem, candidate solutions can be represented as binary strings, parallel to an array of the input numbers, such that each character in the binary string represents one number in the input array. Each number in the array is represented by the character whose position in the string is the same as that number's index in the array. For each binary character in a candidate solution, if the character's value is 0, the corresponding number is a member of the first subset in the partition. Otherwise (the character's value is 1), the number belongs to the second subset. This relationship between candidate solutions and their binary string encodings is illustrated in Figure 3. A Chromosome struct with data members genome and fitness holds each candidate solution and its associated fitness, respectively.
A population array holds every Chromosome in the population.
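
As a rough illustration of this encoding, the sketch below mirrors the Chromosome record and the population array in Python rather than C#. The `decode` helper and the sample genomes are hypothetical and chosen only for demonstration; the sketch follows the convention stated in the text, where a 0 places a number in the first subset and a 1 places it in the second.

```python
from dataclasses import dataclass


@dataclass
class Chromosome:
    genome: str        # one '0'/'1' character per input number
    fitness: int = 0   # placeholder until the fitness is evaluated


def decode(genome, numbers):
    """Split numbers into two subsets according to a binary-string genome.

    Hypothetical helper: '0' -> first subset, '1' -> second subset.
    """
    if len(genome) != len(numbers):
        raise ValueError("genome and input array must have equal length")
    first = [n for bit, n in zip(genome, numbers) if bit == "0"]
    second = [n for bit, n in zip(genome, numbers) if bit == "1"]
    return first, second


numbers = [4, 5, 6, 7, 8]
# Sample genomes picked for illustration only.
population = [Chromosome("00011"), Chromosome("01010"), Chromosome("11111")]
for chromosome in population:
    print(chromosome.genome, decode(chromosome.genome, numbers))
```

Because the genome is parallel to the input array, decoding needs only a single pass over both sequences.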

A = {4, 5, 6, 7, 8}
Chromosome i = 11001    Partition i = {4, 5, 8} {6, 7}
Chromosome j = 11100    Partition j = {4, 5, 6} {7, 8}
Chromosome k = 00000    Partition k = { } {4, 5, 6, 7, 8}

Figure 3. Three chromosomes and their corresponding partitions for the binary string encoding of set A = {4, 5, 6, 7, 8}.

3.2. Fitness

In this genetic algorithm, a chromosome's fitness is equal to the sum discrepancy between the two subsets that its binary string encoding represents. Therefore, fitness should be minimized, such that chromosomes with smaller fitnesses are better solutions. The fitness for each chromosome is also stored with the Chromosome struct. Fitnesses are calculated by taking the absolute value of the difference between the two sums the chromosomes represent. The characters in the binary string encoding are iterated through and the represented numbers are added to the appropriate subset sum. The resulting difference becomes that chromosome's fitness.

At the end of each generation, once the offspring chromosomes replace the parent chromosomes, the chromosome with the smallest fitness is reported as output of the program. This chromosome is also compared to the overall best chromosome found so far. If the best chromosome in the current population has a smaller fitness than the overall best, it becomes the new overall best. At the end of the program's execution, the overall best chromosome and its fitness are reported as output.

3.3. Selection

The genetic algorithm uses k-tournament selection to determine which chromosomes in the population will become parents, with k = 2 for the problem instances used during testing and during comparison with the greedy heuristic. To determine a parent, an array of size k of candidate parents is initialized, and chromosomes are randomly chosen from the population and added to the array until it is full. The candidates' fitnesses are then compared, and the chromosome with the smallest fitness becomes the parent.

3.4. Crossover

Crossover in the genetic algorithm is accomplished via two-point crossover, in which two indices, $X_1$ and $X_2$, of the binary string encoding are chosen, and each of the two parent chromosomes is cut at those indices. The offspring chromosome is created by taking the characters at indices $[0, X_1)$ from the first parent, appending the characters at indices $[X_1, X_2)$ from the second parent, and appending the characters at index $X_2$ onward from the first parent. This process is illustrated in Figure 4. In this particular genetic algorithm, $X_1$ and $X_2$ occur approximately one-third and two-thirds of the way through the chromosome, respectively.

[Figure 4: the binary strings of parent 0, parent 1, and the resulting offspring, with the cut points X_1 and X_2 marked.]

Figure 4. An example of two-point crossover in a genetic algorithm.

3.5. Mutation

After an offspring chromosome has been created, a probabilistic bitwise mutation is performed on that chromosome's binary string encoding. For each character in the binary string, there is a 1% chance that the character will be flipped: an existing 1 becomes a 0, and an existing 0 becomes a 1. This way, each new chromosome that enters the population still maintains some heritability from its parents, while also allowing for the introduction of new traits to the population.

4. Comparison of Algorithms

In this section, several problem instances of the NPP are described, and the results of both the genetic algorithm and the longest processing time greedy heuristic on those instances are stated and compared. For these tests, both algorithms have been implemented in the C# programming language and run as console applications.

4.1. A Small Test Instance

The set A = {4, 5, 6, 7, 8} was used above in Figure 1 to illustrate the longest processing time heuristic. Therefore, it was the first NPP problem instance used in the algorithm comparison. The greedy heuristic produced the partition {8, 5, 4} {7, 6} with sum discrepancy E = 4. When A was used as input to the genetic program with a population size of 1 and 5 generations, the perfect partition {4, 5, 6} {7, 8} with E = 0 was achieved.

It should be noted that, unlike the greedy heuristic, which always produces the same end partition for a given input set, the results of the genetic algorithm are probabilistic, in that each successive run of the program may produce a different overall best partition. In the previous example, the perfect partition was achieved on the first execution of the program. Subsequent executions yielded equivalent results. See Table 1 and Table 2 for more details. While this example may be trivial, it demonstrates the comparative effectiveness of the genetic algorithm on a basic level. Additionally, this input set was the input used to initially test the correctness of the genetic program.

4.2. Additional Comparisons

For another demonstration, an input set of 1 random integers in the range [1, 1] was generated and used as input to both of the programs. As Table 1 shows, the genetic program again produced the better solution, even on the first program execution. On its first execution, the genetic program formed a partition with a sum discrepancy of 1, while the greedy program only managed to produce a partition with a discrepancy of 58. Moreover, on all of the stacked executions of the genetic program, perfect partitions were achieved. However, the greedy program (using Quicksort as the sorting mechanism) had a total execution time of ms, while a single execution of the genetic program lasted 1ms. Execution time summary statistics are detailed in Table 2. Similarly, an input set of 5 random integers in the range [1, 1] was also generated and used as input. The results are shown in Table 1 and Table 2.

Table 1. Summary statistics for the comparison of the longest processing time greedy heuristic and the proposed Genetic Algorithm (with 5 generations).

Input Set: {4, 5, 6, 7, 8}; 1 [1, 1]; 5 [1, 1]
Algorithm: (1 execution), (25 executions), (5 executions), (1 executions)
Greedy 58 58 58 58
Genetic (population = 1)
Genetic (population = 1) 1
Greedy 1 1 1 1
Genetic (population = 1) 2

Table 2. Program execution time summary statistics for the comparison of the longest processing time greedy heuristic and the proposed Genetic Algorithm (with 5 generations).

Input Set: {4, 5, 6, 7, 8}; 1 [1, 1]; 5 [1, 1]
Algorithm: Execution Time (1 execution) (milliseconds)
Greedy 2
Genetic (population = 1) 19
Genetic (population = 1) 1
Genetic (population = 1) 1,17
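
To make the procedure of Section 3 concrete, the following is a minimal Python sketch of a generational GA of the kind described there, run on the small instance from Section 4.1. It is not the paper's C# program. All function names are hypothetical, and the defaults for `population_size`, `generations`, `mutation_rate`, and `seed` are illustrative assumptions rather than the paper's settings. Drawing tournament candidates with replacement and placing the cut points at `len // 3` and `2 * len // 3` are likewise assumptions consistent with, but not confirmed by, the text.

```python
import random


def fitness(genome, numbers):
    """Sum discrepancy: '0' bits feed the first subset, '1' bits the second."""
    first = sum(n for bit, n in zip(genome, numbers) if bit == "0")
    second = sum(n for bit, n in zip(genome, numbers) if bit == "1")
    return abs(first - second)


def tournament(population, numbers, k, rng):
    """Draw k candidates at random (with replacement) and keep the fittest."""
    candidates = [rng.choice(population) for _ in range(k)]
    return min(candidates, key=lambda g: fitness(g, numbers))


def two_point_crossover(parent_a, parent_b):
    """Cut near one-third and two-thirds; the middle comes from parent_b."""
    x1 = len(parent_a) // 3
    x2 = 2 * len(parent_a) // 3
    return parent_a[:x1] + parent_b[x1:x2] + parent_a[x2:]


def mutate(genome, rate, rng):
    """Flip each bit independently with probability rate."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[b] if rng.random() < rate else b for b in genome)


def run_ga(numbers, population_size=20, generations=50, k=2,
           mutation_rate=0.01, seed=0):
    """Generational GA; all default parameter values are illustrative."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in numbers)
                  for _ in range(population_size)]
    best = min(population, key=lambda g: fitness(g, numbers))
    for _ in range(generations):
        offspring = []
        for _ in range(population_size):
            first_parent = tournament(population, numbers, k, rng)
            second_parent = tournament(population, numbers, k, rng)
            child = two_point_crossover(first_parent, second_parent)
            offspring.append(mutate(child, mutation_rate, rng))
        population = offspring  # offspring replace parents
        generation_best = min(population, key=lambda g: fitness(g, numbers))
        if fitness(generation_best, numbers) < fitness(best, numbers):
            best = generation_best  # track the overall best chromosome
    return best, fitness(best, numbers)


numbers = [4, 5, 6, 7, 8]
genome, discrepancy = run_ga(numbers)
print(genome, discrepancy)
```

Changing `seed` corresponds to a fresh program execution with a new random initial population, while raising `generations` lets a single lineage evolve for longer.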

5. Conclusion

From the results collected from the test problem instances described in Section 4, Comparison of Algorithms, and shown in Table 1, it is clear that, of the two algorithms, the genetic algorithm produced superior solutions to those of the greedy heuristic on these instances. However, the results of the genetic algorithm are probabilistic, in that each successive execution of the genetic program may produce a different overall best partition. This depends on the random initial population, the population size, the randomly selected parents, random mutation, and more. Therefore, there is a chance that the genetic program will not produce the optimal partition for a given input set. To increase the chances of producing a superior final partition, subsequent executions of the program should be performed, as demonstrated in this paper. Additionally, the number of offspring generations in each program execution may be altered for different results. The major difference between altering the number of program executions and altering the number of generations is that each generation has some heritability from its parent chromosomes, while the initial generation in a subsequent program execution is randomized, such that there is no heritability from one execution to the next.

As detailed in Table 2, the genetic algorithm's superior solutions are contrasted by its inferior execution time. Execution time could possibly be reduced through minor alterations that make the algorithm more efficient. In conclusion, this genetic algorithm can be used to generate adequate solutions to instances of the Number Partitioning Problem, whereas exact solutions to the problem would be computationally hard to produce, and solutions provided by the longest processing time heuristic are inferior.

References

Coffman, E. & Lueker, G. S. (1991). Probabilistic Analysis of Packing and Partitioning Algorithms. John Wiley & Sons. New York.
Garey, M. R. & Johnson, D. S. (1997). Computers and Intractability. A Guide to the Theory of NP-Completeness. W.H. Freeman. New York.
Hayes, B. (2002). The Easiest Hard Problem. American Scientist. 90, 113.
Johnson, D. S., Aragon, C. R., McGeoch, L. A., & Schevon, C. (1991). Operations Research. 39, 378.
Julstrom, B. (2015). Evolutionary Computation. Lecture Notes.
Karmarkar, N., Karp, R. M., Lueker, G. S., & Odlyzko, A. M. (1986). J. Appl. Prob. 23, 626.
Merkle, R. C. & Hellman, M. E. (1978). IEEE Transactions on Information Theory. 24, 525.
Mertens, S. (2002). Computing in Science and Engineering. 4, 31.
Mertens, S. (2003). The Easiest Hard Problem: Number Partitioning. Magdeburg, Germany.
Pedroso, J. P. & Kubo, M. (2008). Heuristics and Exact Methods for Number Partitioning. Technical Report Series: DCC-2008-3.
Ruml, W., Ngo, J., Marks, J., & Shieber, S. (1996). Journal of Optimization Theory and Applications. 89, 251.
Tsai, L.-H. (1992). SIAM J. Comput. 21, 59.
Yakir, B. (1996). Math. Oper. Res. 21, 85.