# Genetic Cloud

Complex optimization tasks, such as the travelling salesman problem, can be solved by simple algorithms with a random search component. A genetic algorithm is one such approach. It gives better solutions with large populations, but the execution time grows non-linearly as the population grows. Another way to increase accuracy is to run several instances in parallel. The additional resources for parallel runs can be taken from a cloud.

Cloud computing is for everyone, every day.

Cloud computing provides almost unlimited resources. They are always available. You pay only for the time you use. This article continues the series about new opportunities that are becoming available to everyone.

Do you play billiards? I do. I'm not a professional player, so I enjoy pocketing a ball by accident. The ball rebounds off the cushions, rolls toward the pocket, almost stops while still spinning, and drops in. What a piece of casual luck!

But I'm not sure luck is with me today. The manager coming my way doesn't look like he is about to make my work day simpler.

Leaving out all technical details, commercial secrets, and other things, the task can be presented as the well-known travelling salesman problem with 50 points. Yes, the number of variants is on the order of 50! (about 10^64). In addition, the distance matrix is not static. Even if I find a suitable library quickly, it will take a long time to build the task's features into it.
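To get a feel for that number, the factorial can be computed exactly. This is only a small sketch; the class and method names are made up for illustration, and it counts plain orderings of 50 points without accounting for the fixed start point or direction.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of the article's code.
public static class RouteCount
{
    // Computes n! exactly; BigInteger avoids the overflow that long hits past 20!.
    public static BigInteger Factorial(int n)
    {
        BigInteger result = BigInteger.One;
        for (int i = 2; i <= n; i++)
        {
            result *= i;
        }
        return result;
    }

    // Number of decimal digits of a positive BigInteger.
    public static int DigitCount(BigInteger value)
    {
        return value.ToString().Length;
    }
}
```

`RouteCount.DigitCount(RouteCount.Factorial(50))` returns 65, which matches the estimate of roughly 10^64.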

"I believe in you," my manager says, patting me on the shoulder. "By this evening."

"What if it's ready earlier than this evening?" I try to joke.

"If it's ready earlier, you can even play billiards for the rest of the day," my manager answers in the same manner.

Ok, at least I'm motivated now.

There are many methods for solving such a problem.

It would be great if the task could be solved with minimal effort on my part, even if it takes a bit more of other resources, e.g. computing time. Sure!

Indeed, it can be solved this way! A genetic algorithm is the simplest method that I can implement. But then I will just have to wait for the result.

A waste of time! But I can use cloud resources. That will save me time, and I will be able to play billiards while waiting.

## Stage 1. Genetic Algorithm

Genetic algorithms are very simple. They have only three steps (excluding initialization): crossover, mutation, and selection. So the interface is simple (it is in C#):

```csharp
public class Genetic
{
    public void Init()
    {
    }

    public void Select()
    {
    }

    public void Mutate()
    {
    }

    public void CrossOver()
    {
    }
}
```

The chromosome is a ring path that starts from the first defined point and is directed toward the second defined point (so that the same route is not searched twice).

The implementation is:

1. The initialization is easy, because it's random.

2. The selection is even easier: the total distance of each chromosome has to be calculated, and the best (shortest) chromosomes are kept for the further search.

3. The mutation is not difficult either. Two random cells of a chromosome are swapped.

4. The crossover is a little more complex. Two random cells are taken, which define two parts of the ring for both chromosomes. One part is taken from each chromosome in such a way that the parts overlap as little as possible. Duplicated cells are removed randomly, and absent cells are added randomly according to their placement in the initial chromosomes.
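Steps 1 to 3 can be sketched roughly as follows. This is not the attached code: the `Chromosome` class here is a hypothetical stand-in, its `Generate(int pointCount)` signature and `Path` property are assumptions, and it expects at least two points. It keeps the first point fixed but does not enforce the direction toward the second point.

```csharp
using System;

// Hypothetical stand-in for the article's Chromosome class; names and layout are assumptions.
public class Chromosome
{
    private static readonly Random Rng = new Random();

    // Order in which the points are visited; the route closes back to Path[0].
    public int[] Path { get; }

    public Chromosome(int[] path)
    {
        Path = path;
    }

    // Step 1: random initialization. Point 0 stays first, the rest is shuffled (Fisher-Yates).
    public static Chromosome Generate(int pointCount)
    {
        int[] path = new int[pointCount];
        for (int i = 0; i < pointCount; i++)
        {
            path[i] = i;
        }
        for (int i = pointCount - 1; i > 1; i--)
        {
            int j = Rng.Next(1, i + 1);   // random index in [1, i]
            (path[i], path[j]) = (path[j], path[i]);
        }
        return new Chromosome(path);
    }

    // Step 2: total length of the closed ring, used to rank chromosomes during selection.
    public double Length(double[,] distance)
    {
        double total = 0;
        for (int i = 0; i < Path.Length; i++)
        {
            int from = Path[i];
            int to = Path[(i + 1) % Path.Length];
            total += distance[from, to];
        }
        return total;
    }

    // Step 3: mutation. Returns a copy with two random cells swapped (the start point is left alone).
    public Chromosome Mutate()
    {
        int[] copy = (int[])Path.Clone();
        int a = Rng.Next(1, copy.Length);
        int b = Rng.Next(1, copy.Length);
        (copy[a], copy[b]) = (copy[b], copy[a]);
        return new Chromosome(copy);
    }
}
```

Returning a mutated copy rather than changing the chromosome in place is a design choice of this sketch; it lets the original stay in the population next to its mutant.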

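```csharp
public class Genetic
{
    private const int SIZE = 1000;                  // target population size
    private const float CROSSOVER_SHARE = 0.15f;    // share of the population produced by crossover
    private const float MUTATE_SHARE = 0.15f;       // share of the population produced by mutation
    private const float SELECT_SHARE = 1 - (CROSSOVER_SHARE + MUTATE_SHARE);   // share kept by selection

    private Population _population = new Population();

    // Fills the population with random chromosomes.
    public void Init()
    {
        for (int i = 0; i < SIZE; i++)
        {
            _population.Add(Chromosome.Generate());
        }
    }

    // Sorts the population and removes chromosomes from the end of the list
    // until about SIZE * SELECT_SHARE of them remain.
    public void Select()
    {
        _population.Sort();
        int count = _population.GetCount();
        for (int i = count - 1; i > SIZE * SELECT_SHARE; i--)
        {
            _population.RemoveLast();
        }
        _population.AssignRange();
    }

    public void Mutate()
    {
        for (int i = 0; i < SIZE * MUTATE_SHARE; i++)
        {
            Chromosome chromosome = _population.GetRandom();
            // The mutation of the picked chromosome is not shown in this listing.
        }
    }

    public void CrossOver()
    {
        for (int i = 0; i < SIZE * CROSSOVER_SHARE; i++)
        {
            Chromosome father = _population.GetRandom();
            Chromosome mother = _population.GetRandom();
            // The crossing of the two parents is not shown in this listing.
        }
    }
}
```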
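The listing above does not show how two parents are actually crossed. As an illustration only, here is a simplified order-based crossover, a well-known substitute and not the ring-part scheme described in step 4. The class name, the method signature, and the convention that points are numbered from 0 to n-1 are all assumptions.

```csharp
using System;

// Hypothetical helper; a simplified order-based crossover, not the article's own scheme.
public static class OrderCrossover
{
    // Builds a child from two parent routes of equal length over the points 0..n-1.
    // cut1 and cut2 (0 <= cut1 <= cut2 < n) bound the slice copied from the father.
    public static int[] Cross(int[] father, int[] mother, int cut1, int cut2)
    {
        int n = father.Length;
        int[] child = new int[n];
        bool[] used = new bool[n];   // used[p] is true once point p is in the child

        // Copy the father's slice into the same positions of the child.
        for (int i = cut1; i <= cut2; i++)
        {
            child[i] = father[i];
            used[father[i]] = true;
        }

        // Fill the free positions from left to right with the mother's points,
        // in the mother's order, skipping points that are already used.
        int pos = 0;
        foreach (int point in mother)
        {
            if (used[point])
            {
                continue;
            }
            while (pos >= cut1 && pos <= cut2)
            {
                pos++;   // step over the father's slice
            }
            child[pos++] = point;
        }
        return child;
    }
}
```

The child is always a valid route without duplicates, so no repair step is needed. If both parents start at point 0 and `cut1` is at least 1, the child starts at point 0 as well.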

The attached C# code uses European cities, with the distances computed from their geographical coordinates.
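The exact distance formula of the attached code is not reproduced here. One common way to turn coordinates into a distance matrix is the haversine great-circle formula; the sketch below assumes that choice, along with kilometres as the unit and hypothetical class and method names.

```csharp
using System;

// Hypothetical helper; haversine is an assumed choice, not necessarily the formula of the attached code.
public static class GeoDistance
{
    private const double EarthRadiusKm = 6371.0;   // mean Earth radius

    // Great-circle distance in kilometres between two points given in degrees.
    public static double Haversine(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        // Clamp to 1 so that rounding errors cannot push Asin out of its domain.
        return 2 * EarthRadiusKm * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    // Builds a symmetric distance matrix for cities given by parallel latitude/longitude arrays.
    public static double[,] BuildMatrix(double[] lat, double[] lon)
    {
        int n = lat.Length;
        double[,] matrix = new double[n, n];
        for (int i = 0; i < n; i++)
        {
            for (int j = i + 1; j < n; j++)
            {
                double d = Haversine(lat[i], lon[i], lat[j], lon[j]);
                matrix[i, j] = d;
                matrix[j, i] = d;
            }
        }
        return matrix;
    }

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }
}
```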

So the algorithm is ready. Is the search not the most effective one? But I am very effective. The code was written quickly, and even if the calculation takes some additional time, the whole task will be finished sooner.

## Stage 2. Cloud

The genetic algorithm is still a randomized method. It yields an almost optimal solution, and the result depends on the parameters. I can increase the population size to improve the probability of getting the best result, but that will increase the calculation time. Or I can run several instances simultaneously and pick the best result.
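The idea of several independent runs can be tried on one machine first. In the sketch below, local threads stand in for separate cloud instances; `ParallelRuns`, `BestOf`, and the `search` delegate (one complete genetic search that returns the length of its best route) are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper; local threads stand in for separate cloud instances.
public static class ParallelRuns
{
    // Runs `search` once per run index in parallel and returns the shortest length found.
    // The run index can be used by the caller as a random seed.
    public static double BestOf(int runs, Func<int, double> search)
    {
        double[] results = new double[runs];
        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, runs, i => results[i] = search(i));
        return results.Min();
    }
}
```

Since the runs do not exchange any data, the same scheme carries over to separate machines: each instance runs one search, and the results are compared at the end.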

As I write in .NET, I prefer to have my other working tools in Visual Studio as well. So my cloud is also in the Studio. I use a free tool, EC2Studio. For me it is the most convenient tool for operating Amazon EC2 from Visual Studio. Amazon EC2 provides computing resources when I need them. I choose the necessary AMI configuration and start an instance. After I finish using the instance, I terminate it. You pay only for the time used. The deployment steps are:

1. Launch an instance.

   1. Generate a key pair, if one has not been generated yet.

   2. Find the necessary AMI (I use the following images for the comparative tests: ami-6a6f8d03 (Win2008, large, EBS), ami-a2698bcb (Win2008, small, EBS), ami-dd20c3b4 (Win2003, large), and ami-df20c3b6 (Win2003, small)).

   3. Start it.

   4. Wait until the instance reaches the running status.

2. Run a test on the instance.

   1. Start a console for the instance.

   2. Copy the application there.

   3. Run it there.

Perfect! Soon I will shut down my laptop and go play billiards.

The picture below shows 4 cloud instances and 1 locally run version (on Vista).

The best solution here is 181.85 (it was found on the slowest machine, but the genetic algorithm is random).

Different software and hardware configurations are available on EC2. My tests use small and large images (small images are 32-bit, large images are 64-bit). The comparative test result is the average time per cycle (selection, mutation, and crossover according to the defined shares):

| Machine | Average time per cycle (msec) |
| --- | --- |
| Small Windows 2003 32-bit Amazon instance | 1098 |
| Large Windows 2003 64-bit Amazon instance | 594 |
| Large Windows 2008 64-bit Amazon instance | 503 |
| Small Windows 2008 32-bit Amazon instance | 1139 |
| My laptop (Vista) | 1659 |
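An average time per cycle like the ones above could be measured with a stopwatch around the loop. This is a sketch under assumptions: `CycleTimer` and `AverageMsecPerCycle` are hypothetical names, and `cycle` stands for one pass of selection, mutation, and crossover.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for timing; not part of the attached code.
public static class CycleTimer
{
    // Runs `cycle` the given number of times and returns the average duration in milliseconds.
    public static double AverageMsecPerCycle(Action cycle, int cycles)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < cycles; i++)
        {
            cycle();
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / cycles;
    }
}
```

With the `Genetic` class, a call might look like `CycleTimer.AverageMsecPerCycle(() => { g.Select(); g.Mutate(); g.CrossOver(); }, 100)`, where `g` is an initialized instance.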

Given the fourfold price difference between a small and a large instance in EC2, a small instance is about twice as cheap for this program (when the execution time is not critical): the large instance is roughly twice as fast but costs four times as much per hour.
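The arithmetic behind that estimate can be written down in a few lines. The sketch uses only the measured times and the fourfold price ratio from the text; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper; it uses only the price ratio and the measured times from the article.
public static class InstanceCost
{
    // How many times more one cycle costs on a large instance than on a small one.
    // Cost per cycle is proportional to hourly price multiplied by time per cycle.
    public static double LargeToSmallCostRatio(double priceRatio, double smallMsec, double largeMsec)
    {
        return priceRatio * largeMsec / smallMsec;
    }
}
```

For Windows 2003, `InstanceCost.LargeToSmallCostRatio(4, 1098, 594)` gives about 2.16; for Windows 2008, `InstanceCost.LargeToSmallCostRatio(4, 1139, 503)` gives about 1.77. Both are close to the factor of two.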

## Later the same day

By e-mail:

"Mr. Manager, the nearly optimal variant that was found consists of the following sequence of points."
"Good! But where are you?"
"I'm having a good game of billiards, as you said this morning."
"I understand. Have beautiful doublets! And many thanks for the result!"

Now I think almost all optimization tasks can be solved with two components: a cloud and a genetic algorithm.

Every problem leads to a good solution. Every cloud has a silver lining.

Pavel Klimov

*Don't forget to terminate the instance after you finish using it!