Evolving Numeric Series using Genetic Algorithms in C#

Mike Gold
19y
32.5k
0
4

Article

Figure 1 - Fractal from the Fibonacci sequence

If you ever browsed around the book store, you'll notice these puzzle books or IQ test books and some of the books contain questions asking you to complete a series of numbers. A simple example might be what comes after the following series

1, 2, 3, 4, 5, 6, ?

You'll have a multiple choice to answer the questions such as

a. 6 b. -1 c. 6.5 or d. 7

Of course this is an extremely simple example. A more difficult series might be

1, 2, 8, 28, 100, 356, ?

Do you know the answer? Well I don't, but the computer does using our extremely versatile genetic algorithm. In fact, after taking the IQ test with this program, you might come to the conclusion that the computer has an IQ of somewhere over 180. All kidding aside, the genetic algorithm is only as good as its fitness function and whether or not the operations and constants in the genes are capable of converging to a solution to the next value in the series. After running a bunch of series, the computer seemed to do a fairly good job of converging on many of them. Below are the declarations of many of the series that I tested the algorithm on:

// float[] SeriesValues = new float[] {0, 1, 4, 9, 16, 25, 36, 49, 64, 81}; // squares
// float[] SeriesValues = new float[] {1, -1, 1, -1, 1, -1, 1, -1, 1}; // alternating 1, -1
// float[] SeriesValues = new float[] {1, -.5f, .25f, -.125f, .0625f, -.03125f, .015625f}; // halfed and negative
float[] SeriesValues = new float[] {1,2,3,5,7,11,13,17,19,23}; // primes {unsolved}
// float[] SeriesValues = new float[] {0,1,1,2,3,5,8,13,21,34,55}; // fibanocci
// float[] SeriesValues = new float[] {0,1,8,27,64,125,216,343,512,729,1000}; // cubes
// float[] SeriesValues = new float[] {1,2,3,4,5,6,7,8,9,10}; // ints
// float[] SeriesValues = new float[] {1,2,2,4,8,32,256}; // multiples
// float[] SeriesValues = new float[] {0,3,6,9,12,15,18, 21, 24, 27}; // 3x series
// float[] SeriesValues = new float[] {0, 1, 1, 2, 5, 29, 866}; // sum of squares
// float[] SeriesValues = new float[] {1, 2, 8, 28, 100, 356};
// float[] SeriesValues = new float[] {-3, -2, 1, 6, 13, 22}; // x[i]^2 - 3
// float[] SeriesValues = new float[] {1, 2, 3, 5, 16, 231}; // x[i]^2 - x[i-1]^2
// float[] SeriesValues = new float[] {1, 2, 1.5f, 1.75f, 1.625f, 1.6875f}; // (x[i] + x[i-1])/2 (average series)
// float[] SeriesValues = new float[] {1, 2, 3, 6, 12, 24}; // sum of all previous values
// float[] SeriesValues = new float[] {1, -1, 0, -1, -1, -2, -3, -5}; // sum of x[i] + x[i-1]
// float[] SeriesValues = new float[] {1, 2, 3, 2, 1, 2, 3, 2, 1}; // simple triangle wave
// float[] SeriesValues = new float[] {-1, -1, 1, 1, -1, -1, 1, 1}; // simple square wave
/// float[] SeriesValues = new float[] {1, 10, 100, 1000, 10000, 100000}; // multiples of 10
// float[] SeriesValues = new float[] {0, 1, 2, 5, 10, 19}; // x[i] + x[i-1] + i
// float[] SeriesValues = new float[] {1, (1f/2f), (1f/3f), (1f/4f), (1f/5f), (1f/6f)}; // 1/i
// float[] SeriesValues = new float[] {1, (1f/4f), (1f/9f), (1f/16f), (1f/25f), (1f/36f)}; // 1/(i^2)

Running the Program

To run the program on any of the series above, simply comment out the existing series, and uncomment the series that interests you. Then execute the program, and the results will appear in the console window. Upon running the program, you should see some interesting genomes appearing in the console. Sometimes the final solution to the series would result in different formulas, but the next number in the series would always be the same. For example, the series for squared numbers is:

0, 1, 4, 9, 16, 25, 36, 49, 64, 81, ?

The genetic algorithm (using MEP) came up with several solutions to this problem, with the same next value. Note that i is the index in the series starting from 0, x[i] is the current number in the series and x[i-1] is the previous number in the series. For determining the value after 81: i = 9, x[i] = 81, and x[i-1] = 64

Genome Solution for the square series	Reduced	Calculated Value
--> ((i*(i+(i/i)))+(i+(i/i)))	reduces to (i+1)^2	= (9+1)^2= 100,
--> (x[i]+((i+i)+1))	reduces to x[i] + 2i + 1	= 81 + 2*9 + 1 = 100
--> (x[i-1]+((i+i)+(i+i)))	reduces to x[i-1] + 4i	= 64 + 4 * 9 = 100
--> (((1x[i])+(1x[i]))-((0+x[i-1])-(1+1)))	reduces to 2x[i] - x[i-1] + 2	= 2*81 - 64 + 2 =100

Another interesting series to look at is the Fibonacci series that produces the design in figure 1 of this article. Below is the Fibonacci series beginning at 0 and 1:

0,1,1,2,3,5,8,13,21,34,55, ?

After running our Genetic Algorithm on this series, we come up with the following output

--> (x[i-1]+x[i]) --> 1 --> 89

Anotherwords the Fibonacci series says that the next value in the series is produced by the sum of the previous two values.

Below is a similar series that we presented earlier. Can you guess the pattern?

1, 2, 8, 28, 100, 356, ?

Running this series through the MEP genetic algorithm, we get the following solution:

--> (((x[i-1]+x[i])+x[i])+((x[i-1]+x[i])/1)) --> 1 --> 1268

I'll leave it to you to reduce the equation to its simplest form. After testing on quite a few series, the gene algorithm did quite a good job. Although it admittedly had trouble with the series below:

1,2,3,5,7,11,13,17,19,23,?

Albeit, if it came up with a equation to this series, I'd probably win the Nobel prize. This is the series of prime numbers. The best solution the algorithm would come up with are shown below:

equation	fitness	next value
((i+i)-((1-i)-(1/(((x[i-1]x[i-1])(x[i-1]x[i-1]))((x[i-1]x[i-1])(x[i-1]*x[i-1]))))))	0.762460087787515	26
((((i-x[i-1])/(i-(i/(i-x[i-1]))))+(i+(((i-(i/(i-x[i-1])))/(i-x[i-1]))/(i/(i-x[i-1]))))) +((x[i-1]-(((i-x[i-1])/(i-(i/(i-x[i-1]))))((i-x[i-1])/(i-(i/(i-x[i-1]))))))- (((i-x[i-1])/(i-(i/(i-x[i-1]))))((i-x[i-1])/(i-(i/(i-x[i-1])))))))	0.882005252861989	26.0492916107178
((i+(((x[i-1]*i)-(i-(x[i]/i)))/x[i]))+i)	0.823261873344771	25.1545906066895
(((i+i)-1)+i)	0.683939720585721	26
(i+(i*((x[i-1]/(x[i]+(i/x[i])))+1)))	0.794208164805367	25.3104095458984

Of course none of these solutions approximating the next value in the prime series are even close to the real answer, 29. Anyway, this may not actually be the fault of the algorithm itself, but probably of the genes I am providing as a possibility for producing the genomes. Below are the allowable genes for this genome in the MEP algorithm that I have chosen:

0:   x[i]
1:   x[i-1]
2:   i
3:   + (add)
4:   - (subtract)
5:   * (multiply)
6:   /   (divide)
7:   0
8:   1

This gene combination works very well for most geometric, arithmetic, and even power series.

The Code

As usual, we will not go into the coding of MEP here, you can simply browse the MEP article on this site if you are curious. We will, however, examine the calculation function and the fitness function. The fitness function cycles through each value in the series, and performs a calculation with the genome on that value. Then it compares the result produced by the genome on the next value in the series. So if the genome equation run on x[i] produces x[i+1], we have a perfect fitness for that number in the series. The sum of all of the fitnesses run on every value in the series sample divided by the total number of samples will give us a fitness between (0 and 1).

Listing 1 - Calculating the Fitness of the Genome to produce the next value in the series

public double CalculateSeriesFitness()
{
CurrentFitness = 0.0f;
// cycle through all values in the series (each x[i])
for (int i = 1; i < SeriesValues.Length-1; i++)
{
// perform the calculation on the current value in the series and its index
double calc = PerformCalculation(SeriesValues[i], i);
// compare the calculation to the next value in the series to determine a fitness measurement between the two values
double measure = SeriesValues[i+1];
CurrentFitness += Math.Exp(-Math.Abs(measure-calc));
}
// produce a normalized fitness by dividing by 2 less than all the numbers in the series}
CurrentFitness /= SeriesValues.Length - 2; // we chopped off both ends of the series so subtract 2
return CurrentFitness;
}

The PerformCalculation method runs the calculation from the gene values inside the genome. A successful genome will produce the next value in our series:

Listing 2 - The PerformCalculation method that exercises the calculation contained in the Genome

/// <summary>
/// Perform a calculation based on the value in the series
/// and the index of the value in that series
/// </summary>
/// <param name="measure">value in the series</param>
/// <param name="index">index of that value</param>
/// <returns>calculated next value in the series</returns>
public float PerformCalculation(double measure, int index)
{
int count = 0;
foreach (Gene g in TheArray)
{
// if its a symbol (0,1 or 2), it's either x[i], x[i-1], or i itself
if (g.operation < NumSymbols)
{
if (g.operation == 1) // 1: x[i-1]
CalculationArray[count] = SeriesValues[index - 1];
else if (g.operation == 2) // 2: i
CalculationArray[count] = index;
else
CalculationArray[count] = (float)measure; // 0: x[i]
}
else if (g.operation == 7)
{
// its the number zero
CalculationArray[count] = 0;
}
else if (g.operation == 8)
{
// its the number one
CalculationArray[count] = 1;
}
else
{
// operation (+, -, * or /), use it to fill calculation in array
CalculationArray[count] = DoOperation(CalculationArray[g.instruction1], CalculationArray[g.instruction2], g.operation, index);
}
count++;
}
return CalculationArray[TheArray.Count - 1]; // return last calculation
}

Conclusion

Again we can see how genetic algorithms can be made to "think" of solutions to problems that we encounter in math all the time. It would be interesting to experiment with the genomes to produce some more difficult series and also introduce other calculation capability into the genome (such as the log, mod, or sin functions). This way we may broaden the algorithms ability to solve some of the more difficult series. I encourage you to try different number patterns and see what the algorithm will come up with. You might discover something new in the world of numbers that just happens to add up.