A Taxonomic Bit-Manipulation Approach to Genetic Problem Solving

Dr. Goldie Gabrani 1, Siddharth Wighe 2, Saurabh Bhardwaj 3
1: HOD, Delhi College of Engineering, ggabrani@yahoo.co.in
2: Student, Delhi College of Engineering, siddharth_wighe@yahoo.co.in
3: Student, Delhi College of Engineering, saurabh_dce08@rediffmail.com

Abstract: Search and optimization problems today demand both skill and speed. Genetic algorithms offer effective optimization techniques inspired by biological mechanisms such as natural selection and the genetic reproduction processes of crossover and mutation. However, the main difficulty in genetic problem solving is encoding a range of problems, each affected by multiple aspects, into a form that genetic algorithms can solve. This paper aims to overcome that difficulty by establishing a novel, generalized initial Bit-Manipulation Approach to such problems, after which genetic mechanisms can readily take over to solve the problem. This generalized taxonomic approach can be applied to a variety of problems to initiate a genetic programming solution. A case study is presented, and its results suggest that the proposed approach is effective and simplifies genetic problem solving.

Keywords: Bit-Manipulation Approach, classification, genetic problem solving

Introduction

Considerable research has been done in the field of genetic algorithms. Their success is evident from the fact that there have been 36 instances in which genetic algorithms produced results competitive with human performance [8]. Research on their improvement, efficiency and speed of evaluation has been under way since their inception. Researchers and scientists have dedicated their studies to implementing genetic-algorithm-based solutions for specific problems.
Their approaches generally converge on attaining the specific goals of their problems. However, little effort has been directed at broad-spectrum implementation for general problems. Hence there is a need to develop guidelines or rules for a general-purpose approach that can be applied to various genetic algorithm problem environments. Such an approach should be able to tackle a diverse pool of problem sets with various possible constraints, without excessive involvement in the working of the core genetic processes themselves.

Overview of Bit-Manipulation Approach

This paper proposes a novel approach, called the Bit-Manipulation Approach, that overcomes the problems faced in general-purpose genetic implementation. This taxonomic Bit-Manipulation Approach starts with the analysis of the problem and the identification of the various modules, components and the data value domain of each component. All these steps are performed with the help of well-defined functions. A general overview of the primitive steps involved follows.

PRIMARY STEP

The taxonomic nature of our Bit-Manipulation Approach can be exploited to solve the problem with a few defined functions. Each selected component value is assigned an encoded pattern of 0/1 bits. The encoded set of bits for each component corresponds to the data value for that component, and the combined data values of all the components should correspond to a solution to the problem.
E.g. ARR1 = encoded data value for D1, ARR2 = encoded data value for D2, and so on up to ARRn = encoded data value for Dn.

SECONDARY STEP

Once the sets of bits for all components have been encoded, they are coalesced to form a single array of bits in sequence. The new array is a sequence of bits [4] consisting of 0s and 1s only.
E.g. FINAL_ARRAY = Concatenate (ARR1, ARR2, ..., ARRn). The new array will look like 010110101010100100011101. (The size of the
array depends upon the number of components and the bits assigned to each of them.) This array can be considered an individual of a population. As each component has its own set of bits, the following advantages are achieved:
- Clear-cut division or classification of each component
- Data integrity
- Multi-domain structure, which may help in multi-objective programming
Once the above two steps are completed, the resulting array can be operated upon by various genetic procedures to achieve the objective.

Functions Involved

This taxonomic approach presents a small set of functions to implement the primary and secondary steps. These functions are the pillars of the approach and are instrumental in the development of genetic-algorithm-based applications. The functions and their uses are described in this paper, with a general idea of their working. They can be implemented in any language or on any platform, at the user's discretion.

PRIMARY CHECK FUNCTION

The primary objective of this taxonomic approach is to determine whether the problem at hand fits the category of genetic-algorithm-based problems. To determine this, the approach provides a function called primary check, which classifies the problem under study as genetic or non-genetic. Primary check performs the following functions:
1. Checks whether the problem can be divided into components that can be designated by modules.
2. Checks the data space of each module for consistency.
3. Checks the nature of the solution required in the problem.
Depending upon the above conditions, the function categorizes the problem as genetic or non-genetic.

MODULE EVALUATOR FUNCTION

Modules (components) are the fundamental blocks of this taxonomic Bit-Manipulation Approach. Modules cover the input data space that the user intends to provide. Depending upon the nature of the problem, a function called module evaluator decides the number of modules to be incorporated in the program.
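The primary and secondary steps described in the overview can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `encode_component`, `concatenate` and `split_individual`, the use of integer data keys, and the example widths and keys are all hypothetical and are not taken from the paper.

```python
from typing import List


def encode_component(data_key: int, width: int) -> str:
    """Primary step: encode one component's data key as `width` bits."""
    if not 0 <= data_key < 2 ** width:
        raise ValueError(f"data key {data_key} does not fit in {width} bits")
    return format(data_key, f"0{width}b")


def concatenate(encoded: List[str]) -> str:
    """Secondary step: coalesce the per-component bit sets into one array."""
    return "".join(encoded)


def split_individual(individual: str, widths: List[int]) -> List[str]:
    """Recover each component's bit set from an individual, given the widths."""
    if len(individual) != sum(widths):
        raise ValueError("individual length does not match the bit widths")
    parts, start = [], 0
    for width in widths:
        parts.append(individual[start:start + width])
        start += width
    return parts


# Illustrative values only (assumed): three components with 4, 4 and 3 bits.
widths = [4, 4, 3]
data_keys = [10, 3, 5]
arrays = [encode_component(k, w) for k, w in zip(data_keys, widths)]
final_array = concatenate(arrays)
print(final_array)  # 10100011101
```

Because every component keeps a fixed slice of the array, the slices can be recovered after crossover or mutation with `split_individual(final_array, widths)`, which is what gives the clear-cut division of components noted above.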
Each module is distinct in the way it affects the final solution and has its own data domain range. The major objectives of the module evaluator function can be summarized as follows:
1. Decides upon the number of modules.
2. Decides the nature of each module and its range of data.
3. Allows multiple data domains for multi-objective programming.
4. Maps every data domain to its corresponding module number, and each data value within a data domain to a unique data key value.
5. Provides guidelines for the data in the database and controls the data retriever function.

Figure 1. Component with domain set

DATABASE ORGANIZATION

The database is an integral part of this general-purpose approach. User data (the input data range of the problem) is stored in this database along with the module numbers and unique data key values, as governed by the module evaluator function. A data retriever function retrieves data from the database. It receives the decoded value, along with the module number, from the decoder function as arguments. How and in what form the data is stored depends upon the module evaluator function, which controls the data retriever function. Data is stored in the database in the form of domains, one for each module. Inside each domain, data is arranged with tags called data keys and a domain identification number, which is unique to the domain.
DD = Data Domain (figure legend)
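The database organization described above, with one domain per module and data keys inside each domain, can be sketched as a nested mapping with a small lookup function. This is a hedged illustration rather than the authors' implementation: the name `DATABASE`, the record fields `label` and `cost`, and every stored value are placeholders invented for the example, and returning `None` for an unused key is an assumption.

```python
from typing import Any, Dict, Optional

# Hypothetical database: one domain per module, keyed by module number.
# Inside each domain, every data value is tagged with a unique data key.
# All names and values below are placeholders, not data from the paper.
DATABASE: Dict[int, Dict[int, Dict[str, Any]]] = {
    1: {  # domain for module 1
        0: {"label": "option A", "cost": 100},
        1: {"label": "option B", "cost": 150},
        2: {"label": "option C", "cost": 220},
    },
    2: {  # domain for module 2
        0: {"label": "option X", "cost": 40},
        1: {"label": "option Y", "cost": 65},
    },
}


def data_retriever(module_number: int, data_key: int) -> Optional[Dict[str, Any]]:
    """Look up the data stored under (module number, data key).

    Returns None when the module or key is not present; how unused keys
    should be handled is an assumption of this sketch.
    """
    domain = DATABASE.get(module_number, {})
    return domain.get(data_key)


record = data_retriever(module_number=1, data_key=2)
print(record)  # {'label': 'option C', 'cost': 220}
```

Keeping the module number as the outer key mirrors the domain identification number: the same data key can appear in several domains without ambiguity.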
Figure 2. Database Representation

DATA DECODER FUNCTION

This function converts a binary number into a data key. Along with the data key, it also gives the domain identification number, which is passed to the data retriever function. If the data keys are decimal numbers (depending upon the module evaluator function), the function is a simple binary-to-decimal converter. If a component has m data values and k bits are assigned to its corresponding module, then:
$2^{k-1} < m \le 2^{k}$
Functions of the data decoder:
1. Maps a binary number to a data key.
2. Passes the module identification number to the data retriever function.

DATA RETRIEVER FUNCTION

The primary purpose of this function is to retrieve data from the database. It is a bi-directional function: it receives the data key number and the module identification number from the data decoder function, accesses the database, looks up the corresponding data using the key and the module identification number, and passes that data on to the program.

Figure 3. Function of Data Retriever Function

CORE GENETIC FUNCTIONS

Some of the genetic processes [2] involved are:
1) Selection: Any algorithm may be used to select the individuals.
2) Reproduction: Crossover, mutation, etc. are popular reproduction procedures, and any algorithm may be used here too.
3) Crossover: This reproduction process involves swapping bits between the parent chromosomes.
4) Mutation: This process involves inverting a bit within the chromosome.
The functions for these processes are standard and need no modification.

PROPOSED ALGORITHM

The proposed algorithm can be summarized briefly as follows:
1) Analyze the problem and identify the various components contributing to or affecting the solution.
2) Provide the information to the module evaluator function, which decides the number of modules, the number of bits assigned to each module and the data domain of each module.
3) The module evaluator stores the data in the database in a systematic form, with data domains, unique data keys and module identification numbers.
4) Based upon the module evaluator, develop the data decoder function and the data retriever function for data transfer.
5) The bit patterns of all the modules are coalesced in a definite or arbitrary sequence of bits, depending upon the nature of the problem. Genetic operations may now take over the bit array for problem solving.

MULTI-OBJECTIVE PROGRAMMING

Sometimes the solution to a problem cannot be obtained despite working on the various components that contribute towards a specific goal [1,2,3]. This is because the components contributing to the solution may not be governed by a single factor. In fact, some components may have both positive and negative effects upon the solution. The crux of multi-objective programming is the optimization of at least two conflicting objectives of a design [5]. The Bit-Manipulation Approach to genetic problem solving should thus be flexible enough to handle such a situation. The Bit-Manipulation Approach towards single-objective
problems and multi-objective problems is nearly the same, which preserves its simplicity. However, the following considerations should be kept in mind. For multi-objective programming, the paper presents an approach that deals with multi-faceted factors, with priority as the key.

Basics: Depending upon the nature of the problem, this approach categorizes factors into two types:
a) Static factors
b) Dynamic factors
Static factors: These factors are also called veto factors. They can affect anything from a single module to an entire individual.
Dynamic factors: These factors are dynamic in nature. They generally affect an entire individual.
Priority: Priority is a user-defined option. If priority is set high, the solution is governed not only by these factors but also by their priorities.
This paper presents general-purpose guidelines for tackling these types of multi-objective problems with a few pre-fitness functions, so called because they assist the program's fitness function.

PRIORITY FUNCTION

The inclusion of advanced hard/soft priority provides better decision support in multi-objective optimization [6]. This function is a generalized form of all types of priority functions for a genetic algorithm problem. The priority function performs the following operations:
1. Scrutinizes all possible factors affecting the solutions.
2. Segregates them into static and dynamic types.
3. Develops a priority table for static factors.
4. Produces a binary table for static factors. (The number of bits is decided by the number of static factors: 1 if a factor is satisfied and 0 if it fails.) The entries are arranged inside the table according to their priority.
5. Depending upon the priorities of dynamic factors assigned by the user, the overall contribution of the dynamic factors should be in accordance with each factor's assigned priority.
The binary table for static factors and the weights for the dynamic factors are the output of the priority function. With a suitable bridge function, these can be incorporated into the fitness function.

BRIDGE FUNCTION (USER INTERFACING FUNCTION)

This function takes the weights of the dynamic factors and the binary table for static factors as arguments. The static binary table for an individual is mapped into a weight called the static priority weight (depending upon the user's priority). The output of this function is the static priority weight and the weights of the dynamic factors (user input).

Figure 4. Overview of processes involved

FITNESS FUNCTION

The fitness function [2] is a very important part of genetic-algorithm-based programs. The acumen of a program lies in the quality of its fitness function. The priority and bridge functions support the fitness function in a multi-objective environment. These two pre-fitness functions introduce the priority parameter into the fitness function. The arguments taken from these two functions are the static priority weight and the weights of all the dynamic factors. How these two arguments are incorporated into the fitness function depends upon the acumen of the programmer.

Case Study

The following is a demonstration of a cost-performance optimization problem using genetic algorithms, in which one is required to find the optimum solution/combination of multiple components where:
1. Each component has multiple costs. (The total cost is the sum of one chosen cost for each component.)
2. Some or all of the components control the performance (multi-objective).

VENDOR PROBLEM

Assume a common scenario in which a purchaser goes to a computer store with a pre-determined budget in mind to purchase a computer. The store manager gives a list of components such as RAM, hard disk, scanner, monitor and headphones, where each component has its price, performance, manufacturer, etc. The purchaser now has to find the configuration for the PC that best fits the budget. Where does the purchaser start? He or she must not only choose from multiple prices for every single component, but also mind the performance factor of each component.

PROBLEM ANALYSIS

A careful analysis shows that the problem is fit for the Bit-Manipulation Approach. The primary check function would categorize it as a genetic-algorithm-based problem. The module evaluator function would assign a module to each component, such as the processor, hard disk and monitor. A database would be created by the module evaluator function with domains and a data keying system. Some of the modules contain multiple data domains; for the hard disk, for example, these are its cost, disk quality, etc. Thus the problem is a multi-objective one in which both cost and performance need to be optimized, and the fitness function needs to keep both in mind.

BIT DISTRIBUTION

A sequence of encoded bits (0 or 1) is provided to each module and specifies the cost, capacity or type. Each module designating a component has its bits assigned by the module evaluator function. E.g.

MODULE   Bit3  Bit2  Bit1  Bit0
         1     0     1     0

All these guidelines are decided through the module evaluator function.
Binary Form: 0000, 0001, 0010, 0011, ..., 1111

On decoding the 4-bit binary form to decimal form, we get the corresponding key value, which may be searched in the database to get the processor speed/cost (database domain for the processor). E.g. 1010 corresponds to data key number 10. Similarly, bits are assigned to the rest of the components, and the final array of bits provides the configuration of the PC. The total number of bits amounts to 45.

Sample table created through the module evaluator function:

Module or Component   Start Bit   Stop Bit   Total Bits Assigned
Processor             0           3          4
Hard Disk             4           7          4
Ram                   8           11         4
Mother Board          12          14         3
Keyboard              15          17         3
Mouse                 18          20         3
Graphics Card         21          23         3
Sound Card            24          26         3
Speaker               27          29         3
Ups                   30          32         3
Cd / Dvd              33          35         3
Printer               36          38         3
Scanner               39          41         3
Monitor               42          44         3
Total:                                       45

PRIORITY FUNCTION

In the given problem, there are two factors that govern the solution:
1. Cost (dynamic factor)
2. Performance (static factor)
User input: priority set HIGH
Cost weight generated = 1 (factor multiplied with the total cost of the individual)
The performance binary table contains either 1 or 0, depending upon whether the performance criteria are satisfied or not. Since there is a single factor in each of the static and dynamic categories, developing the static priority table and the dynamic weight table is unnecessary. Hence the user can directly input the static priority weight and the dynamic weight.

BRIDGING FUNCTION

These are the input priority values from the user.
Static priority weight = 1 (table entry 1, success) or 1/100 (table entry 0, fail)
Dynamic weight = 1

FITNESS FUNCTION USED

DW = weights of all dynamic factors
SPW = static priority weight
F(X) = [1 / G(DW, X)] * SPW, for X dynamic factors

Results

The proposed approach is successful in retaining the characteristic traits of genetic procedures. The case study produces the following graphs, which are typical of any genetic problem solution [7].

Graph 1. [Plot: GEN. (0 to 25) against DEVIATION (-4000 to 4000)]
Graph 2. [Plot: Fitness Values (0 to 0.012) against Generations (1 to 22)]

Conclusion and Future Work

The approach presented above is generalized and can be incorporated in any genetic algorithm environment. The Bit-Manipulation Approach can be considered the shell within which the core genetic problem solving germinates. Future work is directed towards an ambitious project: creating a GUI-based, general-purpose genetic problem solver. This would be GUI-based application software that uses the above guidelines to solve any genetic algorithm problem. The software would possess an IQ, which would tackle any potential genetic algorithm problem by deciding upon the nature of the functions to be used.

References

[1] J. Branke, K. Deb, H. Dierolf and M. Osswald. Finding Knees in Multi-objective Optimization. Indian Institute of Technology, Kanpur.
[2] K. Deb (March 2003). A Population-Based Algorithm-Generator for Real-Parameter Optimization. KanGAL Report No. 2003003.
[3] K. Deb and Santosh Tiwari (October 2004). Omni-Optimizer: A Procedure for Single and Multi-Objective Optimization. KanGAL Report No. 2004013.
[4] David Goldberg. Genetic Algorithms in Search, Optimization, and Machine Learning (Hardcover).
[5] Kalyanmoy Deb and Aravind Srinivasan. Innovization: Innovative Design Principles Through Optimization. KanGAL Report No. 2005007.
[6] Kay Chen Tan, Eik Fun Khor, Tong Heng Lee and Ramasubramanian Sathikannan. An Evolutionary Algorithm with Advanced Goal and Priority Specification for Multi-objective Optimization. Journal of Artificial Intelligence Research 18 (2003).
[7] Martin Pelikan, David E. Goldberg and Erick Cantú-Paz. Bayesian Optimization Algorithm, Population Sizing, and Time to Convergence. Genetic and Evolutionary Computation Conference, Las Vegas, NV, July 8-12, 2000.
[8] http://www.geneticprogramming.org