Using a Genetic Algorithm to Do Consultant Scheduling in C#

Mike Gold
12y
82.7k
0
4

Article

Figure 1 - Complex Schedule

Introduction

Let me present you with the following problem. Let's say I have a company of 32 consultants. Suppose I have 5 consulting projects over a 3 month period and I want to rotate each of my consultants through all these projects in such a way so that they all end up with equal pay at the end of the project. One project pays $500/day, one project pays $400/day, one project pays $300/day, one pays $200/day and finally one pays $100/day. To simplify the problem, one consultant can work the same day and earn the sum of the two projects, but there are only 5 slots you can occupy in a day, one for each project. In other words on any given day, you can have at the most 5 consultants working the 5 projects.

How would I go about solving this problem? Well I had several thoughts on it at first. You could some how round robin your consultants. Choose the first 5 then the next 5 then the next 5 and then have then rotate their order. This doesn't work out very easily though, because the number of consultants does not divide easily into the 3 month period. You could just randomly assign them to all the slots and see if it works out. Then just keep switching them around until it evens out. This particular solution is not as easy to resolve as it seems.

So what is the easiest solution? Genetic Algorithms. Let the computer do the work through trial and error. Each time the GA will try to fill a schedule, give a fitness to the result, and the highest fitness survives. (Ain't GA's grand!) The beauty about genetic algorithms is that you never really care how the GA gets to the answer, you only care about how accurately you represent the fitness of the solution contained in each Genome. Although this solution can take a long arduous time to complete, it works. (Imagine when quantum computing comes into play, then the long arduous part goes away!)

In this article we will use a genetic algorithm called PBIL (Population based Learning) to converge on the solution. Population based learning maintains a learning vector and converges on a solution using this vector in conjunction with the fitness of the genomes.

Genome Representation

As is true with every genetic algorithm, the trick is to figure out how the genome can be a representation of a possible solution. In the case of our problem, we want the genome to represent the schedule for the entire 3 months of our 5 projects. Each gene in the genome is a consultant in a particular slot in the schedule. Since we conveniently have 32 consultants, then we can represent each consultant as a 5 digit binary number from 00000b - 11111b.

The schedule has 5 slots over a 3 month period. There are 5 working days in a week and 4 weeks in a month, so the calculation of number of slots is:

(3 months ) * (20 days/month) * (5 slots) = 300 slots.

The calculation of the number of bits in our genome is:

(5 bits/consultant) * 300 slots = 1500 bits

Now we have a representation in the genome of the entire schedule containing a consultant for each slot of the schedule over a 3 month period. For example, here is the start of a possible genome:

11000 | 00100 | 00000 | 10001 | 00111 | ....

The string above represents the first 5 consultants in the first 5 slots of the 300 slot schedule. In other words, these are all the consultants for a possible genome on day 1. If we translate to base 10, we see that this comes out to:

24 , 4 , 0 , 17 , 7

If we assigned each consultant an ID from 0-31 then consultant with ID 24 would be filling slot #1 on day 1, consultant 4 fills slot #2, consultant 0 fills slot #3, consultant 17 fills slot #4 and consultant 7 fills slot #5.

The part of the genetic algorithm that constructs the genome will continue to fill the slots all the way up until the end of the month to represent one genome or an entire schedule of consultants for the 3 month period. Since we want them all to get paid equally, we need to come up with a fitness function that will find a genome that balances the pay of all consultants for the 3 month period.

Fitness Function

The easiest way to figure out what the fitness of the problem is to ask ourselves "What is it exactly we are trying to accomplish?". What we want in this problem is for all consultants to get paid equally over the 3 month period. That means that if we add up the total billing for the period of 3 months for a consultant and compare it to another consultant, the billing should be close to the same (give or take a few hundred bucks). How can we do this comparison? One way to do the comparison is to first add up the amount billed each individual consultant in a particular genome. Then take the average billing for all consultants represented in the genome. Then sum the standard deviations (the amount each consultant's total billing varies from the average). The larger the sum of the standard deviations, the worse the fitness. Therefore we should take the reciprocal of the calculated sum in order to make a higher deviation a worse fitness.

Using this method, as the fitness of the genes get higher and higher as we run our algorithm, the consultants total billing over the 3 month period should begin to equalize.

The Code for Fitness

Following the code for the fitness function, it performs just as we explained in the section above. First the function loops through the 1500 bit genome and calculates the total amount each consultant makes in the 3 month project period. Next it computes the average billing among all of the consultants. Finally it computes the standard deviation for each consultant and sums all of the standard deviations together. Finally it computes the fitness by taking the reciprocal value of the sum of the standard deviations. For very large standard deviation sums, our fitness will be close to zero. If we have a perfect fitness (all consultants getting paid the same exact amount) then our fitness should be close to infinity.

Listing 1 - Fitness function for Computing the Optimal Billing Schedule

double CalculateFullScheduleFitness()
{
float fitness = 0; // total fitness value for this genome
float standardDevSum = 0.0f; // standard deviation
float[] billing = new float[32]; // total billing for each consultant

int slotCount = 0; // used to keep track of the current slot # in the genome

// go through all slots in the month and calculate billing for each consultant
for (int i = 0; i < LENGTH; i += NUMBER_SIZE)
{
   // convert the binary part of the genome to an 5 bit integer
    int nextConsultant = FormNumber(i, NUMBER_SIZE);
   billing[nextConsultant] += _slotValues[slotCount];

   // keep a running total of the slots
   slotCount = (slotCount + 1) % NUMBER_OF_SLOTS;
}

// Now the fitness is based on how close the billing is for all consultants
// so we need to go through each consultant's billing,
// and compute a standard deviation
// the smaller the sum of the deviations, the higher the fitness
// compute the average billing

float totalBilling = 0.0f;
float averageBilling = 0.0f;
foreach (float nextBilling in billing)
   {
      totalBilling += nextBilling;
    }

averageBilling = totalBilling / billing.Length;

// sum the standard deviations for each consultant
foreach (float nextBilling in billing)
{
    float sqrtOfDeviation = (nextBilling - averageBilling);
    standardDevSum += (sqrtOfDeviation * sqrtOfDeviation);
   }

// the higher the sum of the deviations, the worse the fitness
// take the reciprocal (add a small value to prevent a divide by 0
// exception)

fitness += 1.0f / (standardDevSum + .0001f);
return (double)fitness;
}

Results

After running the PBIL algorithm for 1000 generations with our fitness function we get the following results as shown in Figure 2. As we can see, the total billing is beginning to equalize. All total billing for the 3 months are at least $1600 and at most $4400 for each of our 32 consultants.

Figure 2 - Generation 1000 of the PBIL algorithm

By the 14000th generation (Figure 3) the total billing is even more equalized with the lowest paid consultant being paid a total of $2200 and the highest paying consultant being paid a total of $3400.

Figure 3 - Generation 14000 of the PBIL Algorithm

Finally at around generation 18000 (Figure 4), the total billing has pretty much equalized with everyone being paid around $2800. (There are a few lucky consultants who managed to squeak out an extra $100). Note that Figure 4 also shows the scheduling for all of the consultants. Consultants 6, 4, 31, 25, and 27 fill the first 5 slots on day 1. Consultants 14, 22, 4, 11, and 25 fill the next 5 slots on day 2, and so on. Note, the fitness function doesn't take into account that a consultant can work 2 slots on the same day or any distribution of time for each consultant. For example, on day 3, consultant 3 is working two slots. To solve this problem, you probably can easily add something to the program that swaps two consultants in the same slot, if one of them is working two different slots on the same day after the PBIL algorithm has finished converging.

Figure 4 - Generation 18000 of the PBIL Algorithm (Convergence)

Conclusion

In the past we have examined how to use genetic algorithms to search for solutions to problems such as creating sudoku puzzles, electronic logic design, and playing mastermind. In this article we discuss how to solve a scheduling problem with genetic algorithms that might normally be very difficult to solve otherwise. Perhaps you can use this solution to administer some of your own difficult day-to-day activities. Anyway, don't leave your consultants hanging in an unequal billing cycle, get them organized using C# and .NET.