Selection of Fittest Key Using Genetic Algorithm and Autocorrelation in Cryptography

Table 1. Literature review of GAs techniques

5. Background

A GA was used to find the best-fit key for a cryptographic algorithm. A pseudo-random number generator produced unique keys, which were then used in various ciphers. To make the key strong and nearly unpredictable, the method drew on the theory of natural selection and used the basic GA processes: Initial Population Generation, Crossover, Mutation, Fitness Function Calculation and Final Key Selection. The fitness function was calculated with the Gap Test and the Frequency Test. A Data Encryption Standard (DES) cipher with a 48-bit key was used to demonstrate the implementation.

Random samples were formed by generating an initial random population of 100 chromosomes. Several tests were run on the samples and the results were observed.

Crossover was then executed, with the number of crossovers calculated from the total population size and the crossover rate. A mutation rate was also selected. The fitness values of the keys were calculated and examined using the Frequency and Gap Tests. The maximum frequency in the sample was thus detected, that is, the largest number of times any chromosome was repeated, which indicates the randomness of the sample. The final outcome was therefore as random and unique as possible. The application further used the DES cipher for data encryption. The whole solution consisted of seven rounds and the complete method was repeated 100 times; even so, the key was generated in very little time ^[19].

6. Literature Review

A literature review is essential for an in-depth analysis of previous research and for proposing better, feasible solutions to existing frameworks. A paper is considered good only when it is well "re-searched" and offers different ideas for an optimal solution. This section looks at some of the work proposed by different authors in the area of cryptography and GAs.

7. Proposed Work

In this paper, we generate an initial population, perform crossover and mutation on it, and check the randomness of the resulting populations. Autocorrelation is used as the fitness function. Autocorrelation is a statistical test that determines whether a random number generator is producing independent random numbers in a sequence ^[16].

The test checks the dependence between numbers within a sequence. After the initial population is generated, we perform the autocorrelation test on it to check for randomness.

Next, we perform the autocorrelation test on the population obtained after Crossover. Similarly, after Mutation, the test is performed on the resulting population.

This yields three populations, one from each step, and we choose the population whose autocorrelation value is nearest to zero; it is stored in the repository. The whole process is executed “N” times, and the final population with the best autocorrelation value is chosen. From this final set of keys we choose one key at random, which is passed to the DES cipher for processing.
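The per-iteration choice among the three populations can be sketched as follows. This is only an illustration: the function name `nearest_to_zero`, the stage labels and the numeric values are hypothetical and are not taken from the experiments reported here.

```python
def nearest_to_zero(scores):
    """Return the stage whose autocorrelation value has the smallest magnitude.

    `scores` maps a stage label to the autocorrelation coefficient measured
    on the population produced by that stage.
    """
    return min(scores, key=lambda stage: abs(scores[stage]))


# Hypothetical coefficients, chosen only to show the selection rule.
scores = {"initial": 0.21, "crossover": -0.05, "mutation": 0.12}
best_stage = nearest_to_zero(scores)
```

The absolute value is used because a strongly negative coefficient indicates dependence just as a strongly positive one does.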

The proposed solution is depicted in the flow chart in Figure 2.

Figure 2. Steps involved in Random key generation using GAs and autocorrelation in cryptography.

7.1. Population Generation

The GA starts with randomly generated keys, known as the chromosome population. The key size used is 48 bits. The population size depends on the number of solutions and can vary widely. Once the initial population is produced, it goes through various genetic operations, which increase the number of chromosomes. Some individuals are probabilistically selected from the population to participate in the genetic operations. After this selection, we perform the autocorrelation test on the population set and forward the result.
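A minimal sketch of the initial population step is given below. It assumes that each chromosome is stored as a string of 48 binary digits and that Python's `secrets` module is an acceptable source of randomness; the function name `generate_population` is hypothetical, and the original implementation was written in Java.

```python
import secrets

KEY_BITS = 48  # key length stated in the text


def generate_population(n_keys, key_bits=KEY_BITS):
    """Return n_keys random chromosomes, each a string of key_bits binary digits."""
    return [
        format(secrets.randbits(key_bits), f"0{key_bits}b")  # zero-padded binary
        for _ in range(n_keys)
    ]


# A population of 20 keys, the size used in the observations.
population = generate_population(20)
```

Storing keys as bit strings keeps crossover and mutation simple to express; converting a key to decimal for the fitness test is then a single `int(key, 2)` call.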

7.2. Crossover

Crossover can be understood as genetic recombination. There are a number of ways in which it can be applied in GAs. Here, crossover is performed on two randomly chosen individuals from the population set.

The offspring generated by crossover differ considerably from their parents. The individuals resulting from Crossover again go through the autocorrelation test and are then passed on to the next GA operation. A crossover rate is chosen, and the number of crossovers is calculated with the formula noco = cor * m * n / 100,

where noco = number of crossovers, cor = crossover rate, m = key length and n = number of keys.
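The crossover count and one possible crossover operator can be sketched as follows. Several assumptions are made here: the crossover rate is a percentage, the count is truncated to a whole number, and single-point crossover is used (the text does not say which variant was implemented). All function names are hypothetical.

```python
import random


def number_of_crossovers(cor, m, n):
    """noco = cor * m * n / 100, truncated to a whole number (assumption)."""
    return int(cor * m * n / 100)


def single_point_crossover(a, b, point):
    """Swap the tails of two equal-length bit strings at index `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]


def apply_crossover(population, cor, rng):
    """Perform noco crossovers, each on two randomly chosen individuals."""
    m, n = len(population[0]), len(population)
    result = list(population)
    for _ in range(number_of_crossovers(cor, m, n)):
        i, j = rng.sample(range(n), 2)   # two distinct individuals
        point = rng.randrange(1, m)      # cut point strictly inside the key
        result[i], result[j] = single_point_crossover(result[i], result[j], point)
    return result


# Usage with a seeded generator so that runs are repeatable.
offspring = apply_crossover(["000000", "111111", "101010", "010101"], 50, random.Random(0))
```

With cor = 5, m = 48 and n = 20, the formula gives 5 * 48 * 20 / 100 = 48 crossovers.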

7.3. Mutation

Mutation is a genetic operation used here to preserve diversity from one generation of a population of chromosomes to the next. It is analogous to biological mutation. A mutation may change the result entirely from the former result. Mutation occurs during evolution according to a user-defined probability, which is set low.

Mutation is executed so that the chromosomes in the population do not become too similar to one another. Here, too, the number of mutations is calculated with a formula, nom = mr * m * n / 100,

where nom = number of mutations, mr = mutation rate, m = key length and n = number of keys.

Again, the individuals resulting from Mutation undergo the autocorrelation test and the result is forwarded.
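A sketch of the mutation step under stated assumptions: the mutation rate is a percentage, the count is truncated to a whole number, and each mutation flips one randomly chosen bit of one randomly chosen key. Bit flipping is a common choice for binary chromosomes, but the text does not specify the operator, and the function names are hypothetical.

```python
import random


def number_of_mutations(mr, m, n):
    """nom = mr * m * n / 100, truncated to a whole number (assumption)."""
    return int(mr * m * n / 100)


def flip_bit(key, pos):
    """Return `key` with the bit at index `pos` inverted."""
    flipped = "1" if key[pos] == "0" else "0"
    return key[:pos] + flipped + key[pos + 1:]


def apply_mutation(population, mr, rng):
    """Perform nom single-bit mutations on randomly chosen keys."""
    m, n = len(population[0]), len(population)
    result = list(population)
    for _ in range(number_of_mutations(mr, m, n)):
        i = rng.randrange(n)      # which key to mutate
        pos = rng.randrange(m)    # which bit to flip
        result[i] = flip_bit(result[i], pos)
    return result


# Usage with a seeded generator so that runs are repeatable.
mutated = apply_mutation(["0000", "1111"], 25, random.Random(1))
```

With mr = 1, m = 48 and n = 20, the formula gives 9.6, which this sketch truncates to 9 mutations.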

7.4. Fitness Function

The fitness function is an objective function that defines how close a result is to the expected goal value. In the proposed solution, all the keys, which are in binary format, are first converted into decimal format. The autocorrelation test is then performed on them. The formula used for the autocorrelation test is Karl Pearson's coefficient of correlation.

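The correlation coefficient between x and y is given below:

$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where {x₁, ..., x_n} is one dataset containing “n” values and {y₁, ..., y_n} is another dataset containing “n” values. The sample mean of x is

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i,$$

and the sample mean of y is defined similarly.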

Finally, the key is selected from the repository.
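The fitness computation can be sketched as follows. Pearson's coefficient is implemented directly from its standard definition. How the two datasets x and y are formed from a single population is not stated in the text, so this sketch assumes a lag-1 pairing (each key is paired with the next key in the population). The `lag` parameter and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Karl Pearson's coefficient of correlation between two equal-length datasets."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant dataset")
    return cov / math.sqrt(var_x * var_y)


def autocorrelation(keys, lag=1):
    """Convert binary keys to decimal and correlate the sequence with itself
    shifted by `lag` positions (the lag-based pairing is an assumption)."""
    values = [int(k, 2) for k in keys]  # binary string -> decimal integer
    return pearson(values[:-lag], values[lag:])
```

A coefficient close to zero suggests that consecutive keys are not linearly related; values near +1 or -1 indicate strong dependence.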

7.5. Final Key Selection from Repository

The complete process of Population Generation, Crossover and Mutation is repeated “N” times, and all the keys generated in the “N” iterations are stored in the repository. The value used in our work is N=3.

At N=1, we perform the three steps and, from the sets of keys generated, choose the one with the greatest value. The same procedure is used at N=2 and N=3.

Finally, the final key is selected at random.
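The final selection from the repository can be sketched as follows. This assumes that the repository holds one (autocorrelation value, key set) pair per iteration and that the "best" value is the one nearest to zero, as stated in the description of the proposed work. The function name, the repository layout and the sample entries are hypothetical.

```python
import secrets


def select_final_key(repository):
    """Pick the key set whose autocorrelation is nearest to zero,
    then choose one key from it at random.

    `repository` is a list of (autocorrelation, keys) tuples, one per iteration.
    """
    _, best_keys = min(repository, key=lambda entry: abs(entry[0]))
    return secrets.choice(best_keys)


# Hypothetical repository for N=3; values are illustrative only.
repository = [
    (0.30, ["0001", "0010"]),
    (-0.02, ["0100", "1000"]),
    (0.10, ["1100", "0011"]),
]
final_key = select_final_key(repository)
```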

7.6. Result

The resulting final key can be used as an input for the encryption and decryption procedure in the DES cipher.

8. Observation

The proposed technique was implemented in Java and the observations were examined. These observations led to the conclusions expressed in the next section.

The observations for the population set of 20 keys from the implementation of the proposed algorithm are as follows:

8.1. Iteration 1

Figure 3. Sample set of highest autocorrelation value

Figure 4. Comparison of autocorrelation coefficients

In the implementation of the proposed algorithm, when N=1, a random initial population of chromosomes is generated and the autocorrelation test is performed on it. The test is also performed separately on the Crossover and Mutation populations. Among all the resulting populations, the greatest value is that of the Initial Population Generation.

8.2. Iteration 2

In the second step, when N=2, the above process is repeated. A random initial population of chromosomes is generated and the autocorrelation test is performed on it. The test is likewise performed on the Crossover and Mutation populations. Among all the resulting populations at N=2, the best autocorrelation value is that of the Mutation population, which has the highest degree of randomness.

Figure 5. Sample set of highest autocorrelation value

Figure 6. Comparison of autocorrelation coefficients

8.3. Iteration 3

In the third step, when N=3, the above process is repeated one last time. A random initial population of chromosomes is generated and the autocorrelation test is performed on it. The test is also performed separately on the Crossover and Mutation populations. Among all the resulting populations at N=3, the greatest value is that of the Crossover result.

Table 2. Comparison of autocorrelation values from repository

Figure 7. Sample set of highest autocorrelation value

Figure 8. Comparison of autocorrelation coefficients
