Genetic Algorithm Based Solution to SAT-3 Problem
1Department of Computer Science, Jamia Hamdard, New Delhi, India
Abstract
SAT-3 is an NP-complete problem for determining whether there exists a solution satisfying a given Boolean formula in the Conjunctive Normal Form, wherein each clause has at most three literals. Existing approaches of this problem take exponential time and are also memory inefficient. The work uses Genetic Algorithms for finding an optimal solution to this problem. The central idea is the intelligent exploitation of a random search used to solve optimization problems. The work explores previous works to direct the search into regions of better performance within the search space, thus reducing the time and space complexity. It thus establishes the ability of Genetic Algorithms for finding optimal solutions from a huge set of solutions. The work has been implemented and analyzed with satisfactory results.
At a glance: Figures
Keywords: NP complete problem, genetic algorithm, SAT-3 problem, intraceability, optimal solution
Journal of Computer Sciences and Applications, 2015 3 (2),
pp 33-39.
DOI: 10.12691/jcsa-3-2-3
Received March 18, 2015; Revised April 03, 2015; Accepted April 16, 2015
Copyright © 2015 Science and Education Publishing. All Rights Reserved.Cite this article:
- Aiman, Umme, and Nausheen Asrar. "Genetic Algorithm Based Solution to SAT-3 Problem." Journal of Computer Sciences and Applications 3.2 (2015): 33-39.
- Aiman, U. , & Asrar, N. (2015). Genetic Algorithm Based Solution to SAT-3 Problem. Journal of Computer Sciences and Applications, 3(2), 33-39.
- Aiman, Umme, and Nausheen Asrar. "Genetic Algorithm Based Solution to SAT-3 Problem." Journal of Computer Sciences and Applications 3, no. 2 (2015): 33-39.
Import into BibTeX | Import into EndNote | Import into RefMan | Import into RefWorks |
1. Introduction
A boolean expression is made up of clauses. Each clause has literals. If a clause has at most three literals connected by disjunction, and the clauses in turn are connected through conjunctions, the expression is that of SAT3. The problem is a satisfiability problem. The problem calls for finding the values of the literals that make the given expression true. Consider an expression where in there are n literals. Since a literal can have two values, there can be 2n possible set of values of the given set of literals, at each set of values; the expression needs to be evaluated. The solution, by the above method (brute force), therefore becomes computationally very expensive.
There is another way of handling the problem. The use of heuristic techniques like Genetic Algorithms would make the search of the solutions easy and hence tackle the given problem efficiently. GAs are heuristic processes based on the concept of survival of the fittest [1]. The operators like crossover, mutation etc help one to achieve the task. Many problems have been successfully solved using GAs. The present work uses GAs to solve the 3 Problem.
The paper has been organized as follows. Section 2 presents the literature review, section 3 presents Genetic Algorithm, section 4 presents NP Complete problems. In section 5 we discuss existing approaches to SAT-3 problem; section 5.1 describes, section 5.2 states, section 5.3 presents, section 5.4 presents, section 5.5 describes. In section 6 we present the proposed solution, section 7 presents result, section 8 presents the conclusion and section 9 presents references.
2. Literature Review
A systematic review was undertaken to summarize the existing approaches and their restrictions, find gaps in current research and propose areas for further investigation. The review has been carried out in accordance with the guidelines proposed by Kitchenham [2]. Table 1 presents the results of the review.
3. Genetic Algorithm
GAs works on the principle of survival of the fittest. [1] Each generation is represented by a series of character strings similar to the chromosome of DNA. [15] Each individual is a point in the search space. [16] They are then made to go through a series of evolutionary processes. The individuals compete for resources. Genes from successful individuals propagate throughout the population, making successive generations more suited to their environment.
After an initial population is randomly generated, the algorithm evolves through three operators:
Selection- equates to the survival of the fittest;
Crossover- represents mating between individuals;
Mutation- introduces random modifications.
The main steps of GA are explained below.
Randomly generate an initial population. Each chromosome has n cells. The population is generated by the following algorithm:
for each chromosome
for each cell
If (random () % 2 == 0)
Cell=1
else
Cell=0
end if
end
end
The chromosomes generated are a combination of 0s and 1s. A chromosome is a representation of the values assigned to the variables.
Find out the truth value of each chromosome. The truth value is whether the chromosome results in a “TRUE” or a “FALSE” value for the given formula.
CROSSOVER: Randomly select two chromosomes and break them from a randomly selected crossover point. A third child chromosome is formed by joining the first part of the first chromosome and the second part of the second chromosome. Similarly a fourth child chromosome is formed by joining the second part of the first chromosome and the first part of the second chromosome. Now the truth value for these two newly formed chromosomes are found out. Crossover rate is between 2 and 5.
MUTATION: A random chromosome is selected and one of its random bits is complemented and the truth value is calculated. The number of mutations is given by the formula.
No. of mutation = (No. of cells in a chromosome * No. of chromosomes * mutation rate) / 100
The mutation rate is less than 1 and was taken as 0.5 in our implementation. The number of TRUE value is counted. Figure 1 shows the basic steps in GA.
4. NP Complete Problem
There are basically two types of problems
• Polynomial-time problems.
• NP problems.
Polynomial-Time problems-Problems that have algorithms with worst case running time of O(nk), where k is a constant, are called tractable others are called intractable or super-polynomial [17].
NP problems- Another class of algorithms that are “verifiable” in polynomial time is called NP problems. Any problem in P is also in NP [18].
We say that a language M, defining some decision problem, is NP-hard if every other language L in NP is polynomial-time reducible to M. If a language M is NP –hard and it is also in the class NP itself, then M is NP-complete [19].
Thus, an NP –complete problem is, in a very formal sense, one of the hardest problems in NP, as far as polynomial-time computability is concerned. If anyone finds a deterministic polynomial-time algorithm for even one NP-complete problem, then P=NP. [19] Figure 2 shows the relation of complexity classes.
Definition: A Boolean formula is in 3SAT if it is in 3CNF form and is also satisfiable. A Boolean formula is in 3CNF if it is of the form
C1∧C2∧•••∧C3
Where, each Ci is an “AND” of three or less literals. [20].
5. Existing Approaches To SAT3 Problem
5.1. Brute Force AlgorithmThe simplest but also the most time consuming idea. The main approach is to try every possible valuation until a satisfying one is found or there are no more valuations left. It is guaranteed to find a solution but has a slow run-time performance [21].
Implementation of this algorithm involves taking a number of variables and the Boolean function as input from the user. Then a 2D array is used as the data structure to store all the possible 2n combinations of values of all the variables. Then each row of the 2D array is used to insert values of variables in the given Boolean function to check its satisfiability, thus in a worst case scenario, all the possible combinations are tried until the solution is obtained.
Complexity-This approach works well for small inputs i.e. Small no of variables and simple functions. But since it is exponential in nature, its run-time complexity is O (2n) in the worst case. Thus, when the number of variables becomes large the algorithm fails to find optimal solution within practical time limit.
The algorithm works on a propositional formula ф. ф should be in CNF form. The algorithm treats ф as a set of clauses that can be manipulated. It defines 3 basic principles for obtaining the answer.
5.2.1 Unit Propagation Rule
Rule I deals with simplifying the formula for determining satisfiability by analyzing one-literal clauses. There are three cases to be distinguished.
Case 1: an atomic formula p occurs as a one-literal clause in both positive and negated form. In this case the resulting formula ф is unsatisfiable, because the empty clause can be deduced by means of resolution.
Case 2: Case 1 does not apply and an atomic formula p occurs as a one literal clause. Then all clauses that contain p can be deleted and -p can be removed from the remaining clauses.
Case 3: Case 1 does not apply and an atomic formula -p occurs as a one literal clause. Then all clauses that contain -p can be deleted and p can be removed from the remaining clauses.
5.2.2. Pure literal Rule
It deals with redundant clauses. It is generally used as a preprocessing step for efficiency reasons. If only p or only -p occurs in the formula, then all clauses which contain p or -p may be deleted.
5.2.3. Resolution Rule
Resolution can be applied to two clauses that have complementary literals L and L’ such that L ≡ -L’. With the help of the resolution step these two clauses can be combined in a single clause called the resolvent that contains all the literals without L and L’ [21].
Complexity-The fact that this is an exponential algorithm follows directly from the fact that regular resolution is exponential. This means that this algorithm will generate an exponential number of clauses and thus is very memory inefficient. The lower bound complexity of this algorithm has been proven to be 2cn for some constant c > 0 [21]. Figure 3 depicts the basic steps involved in Davis Putnum Algorithm.
In this algorithm rule III of Davis Putnum Algorithm was replaced by splitting rule in order to limit the amount of memory used during runtime.
• The simplifying of the formula is now usually done as a preprocessing step for efficiency reasons.
• Heuristics for the branching variable are used
• Backtracking has been replaced by back jumping or non-chronological backtracking in iterative versions of the DLL, so that similar parts of the same tree are not searched over and over again. [21]
• New clauses can be ascertained during the search process to increase the amount of branches that can be cut away from the search tree later.
• Improvements in Boolean Constraint Propagation algorithms are introduced.
• Random restarts of the algorithm that make use of previously learned information increase the chances of coming to a solution faster
• For these improvements to be efficient and also to allow non-chronological backtracking the DLL algorithm is usually implemented iteratively [21].
Complexity- In general, DPLL requires exponential time (splitting rule!)! Heuristics are needed to determine which variable should be instantiated next and which value should be used.
DPLL is polynomial on Horn clauses, i.e., Clauses with at most one positive literal -A1,V…V-An V B [22].
5.4. Binary Decision Diagrams (BDD)A BDD represents a formula ф (seen as a Boolean function) as a rooted directed acyclic graph G = (V, T, E). Each non-terminal node or vertex v(i) ϵ V is labeled by a variable x and has edges directed towards two successor nodes, called the then-child and the else-child. In a graphical representation, the then-child represents x = 1 via an uninterrupted line and the else-child represents x = 0 via a dotted line. The lines are depicted as undirected for simplicity, but must always be considered as directed to the next node. Each terminal node t(i) ϵ T is labeled with 0 or 1 and does not have any children. For a given assignment of the variables, the value of the function is found by tracing a path from the root to a terminal vertex following the branches indicated by the values assigned to the variables [21].
Using reduced order Binary Decision Diagram a set of satisfying assignments is created by looking at the constraints.
The process is to continuously add clauses to an empty BDD until the perfect pattern is described. By adding all the constraint paths to the BDD we get the complete representation of the function for which the satisfiability is being checked. This method solves the problem by building up the solution set in order to verify satisfiability.
Complexity- ROBDDs perform worse, both in terms of space and time complexity in under constrained and critically constrained set of problems. It only works best in over constrained sets of problems [21]. Figure 4 shows the steps involved in Binary Decision Diagram approach.
5.5. Set TheorySet theory based approach decides the un-satisfiability of a given formula. It describes the problem in set theory and investigates what is needed to decide the un-satisfiability of a given sat3 formula.
For a SAT3 function ф of n variables the set Vφ of possible valuations (variable assignments) v is determined, for which v(ф) evaluates to 1. Here v is a mapping {x0::: xn-1} → {0,1} that instantiates every variable to 0 or 1. So each function vi can be depicted as a unique number in binary representation, which is in the set 2n that denotes n-tuples over a binary alphabet {0, 1} n. If any clause that is part of ф is unsatisfiable, then ф is also unsatisfiable. That is, v(ci) = 0 => v(ф) = 0. Using this information, any 3-SAT formula can be proven (un) satisfiable using sets [21].
Complexity-This solution is not computationally possible with a high amount of clauses and takes more time than applying brute-force approach to all 2n possible valuations [21].
6. Proposed Solution
This work proposes Gas based solution to SAT-3 problem. The algorithm has been presented as follows. The algorithm uses the blend of the theory of natural selection and conventional optimization.
ALGORITHM: SAT-3 problem
1. cell->number of variables
2. expression->given Boolean formula
3. chromosome-> representation of values assigned to the variables of given Boolean formula
4. expression_evaluate->truth value of expression
5. //initial population generation
6. for each chromosome //no. of iterations/initial population ??
7. for each cell
8. If (random () % 100> 50)
9. cell=1;
10. else
11. cell=0;
12. end if
13. end for
14. end for
15. //count number of true values
16. for each chromosome
17. If (expression_evaluate == TRUE)
18. count++;
19. end if
20. end for
21. //crossover
22. for each chromosome-> till crossover_rate
23. chromo some 1=random () % population;
24. chromo some 2=random () % population;
25. crossover_point=random () % cell;
26. for each cell-> beginning to crossover_point
27. chromo some 3[i] = chromo some 1[i];
28. chromo some 4[i] = chromo some 2[i];
29. end for
30. for each cell->crossover_point to end
31. chromo some 3[i] = chromo some 2[i];
32. chromo some 4[i] = chromo some 1[i];
33. end for
34. end for
35. for each chromosome
36. If (expression_evaluate==TRUE)
37. count++;
38. end if
39. end for
40. //mutation
41. for each chromosome->till mutation_rate
42. mutation_point=random () % cell;
43. If (cell[mutation_point]==1)
44. cell[mutation_point]=0;
45. else
46. cell[mutation_point]=1;
47. end if
48. end for
49. for each chromosome
50. If (expression_evaluate==TRUE)
51. count++;
52. End if
53. End for
7. Results
The following tables (Table 2, Table 3, Table 4, Table 5) show the obtained results. Due to time constraints we have considered the case of 4 variables and 7, 8, 9 and 10 clauses. Further analysis with more number of variables and clauses will be considered for future work.
In the following tables the first column shows the no. of true values generated randomly when initial population is formed.
The second column shows the Cross-over Rate. The third column shows the number of true values generated w.r.t the mentioned crossover rate. The fourth column shows the Mutation Rate. Finally, The last column shows the no. of true values generated w.r.t the mentioned mutation rate.
Table 2 shows the results for 4 variables and 7 clauses, Table 3 shows the results for 4 variables and 8 clauses, Table 3 shows the results for 4 vriables and 9 clauses, Table 3 shows the results for 4 variables and 10 clauses.
Table 4. 4 Variables, 10 Clauses (#{A}||#{B}||#{C})&&(#{D}||!#{A}||#{C})&&(!#{B})&&(!#{C}||!#{A})&&(#{A}||!#{B})&&(!#{A}||!#{B}||!#{C})&&(!#{D}||#{C}\||#{A})&&(!#{A}||#{B}||#{C})&&(!#{A}||#{B}||#{D})&&(!#{A}||!#{B}||!#{D})
Figure 5 depicts the graph showing variation in no. of solution with different Crossover Rates. Figure 6 depicts the graph showing variation in no. of solution with different Mutation Rates.
8. Conclusion
The paper successfully implements Gas based algorithm for SAT-3 problem. The algorithm has been analyzed for various crossover rates and mutation rates as explained in the previous section. It can be seen from the graph presented in the previous section. It can be seen from the graph presented in the previous section that the results are optimal for COR=3 and MR=0.3.
The extension to this work would analyze the algorithm for a larger domain to establish the veracity of the algorithm. It is also intended to use other artificial intelligence based techniques and compare the results with the above techniques.
References
[1] | David E. Goldberg, “Genetic Algorithm”, Pearson Education. | ||
In article | |||
[2] | Barbara Kitchenham, (July 2003), “Procedures for Performing Systematic Reviews”, Keele University Technical Report TR/SE-0401 ISSN: 1353-7776 and NICTA Technical Report 0400011T.1 | ||
In article | |||
[3] | Drogoul, A., and Dubreuil, C. “A distributed approach to n-puzzle solving.” Proceedings of the Distributed Artificial Intelligence Workshop. 1993. | ||
In article | PubMed | ||
[4] | Harsh Bhasin and NehaSingla, (August 2012),” Genetic based Algorithm for N-Puzzle Problem”, International Journal of Computer Applications (0975-8887), Volume 51-No. 22. | ||
In article | |||
[5] | Eric Angel, Romain Campigotto and Christian Laforest, Algorithms for the Vertex Cover Problem on Large Graphs. | ||
In article | |||
[6] | Harsh Bhasin and Gitanjali, (2012), “Harnessing Genetic Algorithm for Vertex Cover Problem”, International Journal on Computer Science and Engineering (IJCSE), Vol. 1, Issue 2, pp. 218-223. | ||
In article | |||
[7] | Naresh Kumar, Deepkiran Munjal, Modified Genetic Algorithm for Maximum Clique Problem, International Journal of Computer & Organization Trends-Volume 3 Issue 4-May 2013. | ||
In article | |||
[8] | Harsh Bhasinand Rohan Mahajan, (2013),” Genetic Algorithms Based Solution To Maximum Clique Problem”||, International Journal of Computer Science and Engineering, Vol. 4. | ||
In article | |||
[9] | Rong Long Wang, A genetic algorithm for subset sum problem, New Aspects in Neurocomputing: 10th European Symposium on Artificial Neural Networks 2002. | ||
In article | |||
[10] | Harsh Bhasin and NehaSingla, (2012),” Modified Genetic Algorithms Based Solution to Subset Sum Problem”, (IJARAI) International Journal of Advanced Research in Artificial Intelligence, Vol. 1, No. 1. | ||
In article | |||
[11] | Ling Zhao, Notes in Computer Science Volume 2883, 2003, pp 326-344, Tackling Post’s Correspondence Problem. | ||
In article | CrossRef | ||
[12] | Harsh Bhasin and Nishant Gupta, (2012), “Randomized algorithm approach for solving PCP”, International Journal on Computer Science and Engineering (IJCSE),Vol. 4, Issue 1, pp. 106-113. | ||
In article | |||
[13] | Genetic algorithms for the travelling salesman problem: A review of representations and operators-P Larrañaga, CMH Kuijpers, RH Murga, I Inza…-Artificial Intelligence …, 1999-Springer. | ||
In article | |||
[14] | H Braun, On solving travelling salesman problems by genetic algorithms,-Parallel Problem Solving from Nature, 1991-Springer. | ||
In article | |||
[15] | Lintao Zhang, Searching For Truth: Techniques For Satisfiability of Boolean Formulas, PhD Thesis, Princeton University, 2003. | ||
In article | |||
[16] | Yogesh S. Mahajan, Zhaochui Fu and Sharad Malik, Zchaff 2004: An Efficient SAT Solver, Lecture Notes in Computer Science, 2005, Volume 3542/2005, 898. | ||
In article | |||
[17] | www.personal.kent.edu/.../Algorithms/MyAlgorithms/.../npComplete.html | ||
In article | |||
[18] | T.H. Cormen, C.E. Leiserson, and R.L. Rivest, “Introduction to Algorithms”, the MIT Press, Cambridge, Massachusetts, USA, 2009. | ||
In article | |||
[19] | ww3.algorithmdesign.net/sample/ch13-np.pdf | ||
In article | |||
[20] | www.cs.umd.edu/~gasarch/TOPICS/sat/SATtalk.pdf | ||
In article | |||
[21] | “Solving 3-SAT”, (2012), Peter Maandag, Bachelor Thesis, Radboud University Nijmegen. | ||
In article | |||
[22] | “Foundations of Artificial Intelligence, Satisfiability and Model Construction”, Wolfram Burgard, Bernhard Nebel and Martin Riedmiller. | ||
In article | |||
[23] | “Implementing the Davis-Putnam Method”, (1999), Hantao Zhang (Dept. of Computer Science, The University of Iowa), Mark E. Stickel (Artificial Intelligence Center, U.S.A). | ||
In article | |||
[24] | www.cs.ucc.ie/~dgb/courses/ai1/03-notes.pdf | ||
In article | |||
[25] | Algorithms for Random 3-SAT, Abraham D. Flaxman, Microsoft Research | ||
In article | |||
[26] | A. Mishchenko, An Introduction to Zero-Suppressed Binary Decision Diagrams. https://www.ee.pdx.edu/~alanmi/research/, 2001. | ||
In article | |||
[27] | J. M. Howe and A. King, A Pearl on SAT Solving in Prolog, In Functional and Logic Programming, volume 6009 of Lecture Notes in Computer Science, pages 165-174, Springer, 2010 | ||
In article | |||
[28] | Norbert Manthey, Improving SAT Solvers Using State-of-the-Art Techniques, Thesis, Technische Universitt Dresden (2010). | ||
In article | |||
[29] | Wikipedia-online encyclopedia. | ||
In article | |||
[30] | https://www.stumptown.com/diss/chapter2.html | ||
In article | |||