Genetic Algorithm to Solve 2D Mazes

This article shows how to use genetic algorithms to solve problems. The approach uses Genetic Algorithms (GA) in the solutions optimization context.

Scope

In this article, as a sequel to Genetic algorithm for icon generation in Visual Basic, we'll see how genetic algorithms can be applied to solve problems. The approach is particularly suited to a requirement where we don't have a prior model (or, an ideal target to reach), but know the optimal parameters that tend to a solution. In other words, we'll discuss Genetic Algorithms (GA) in the solutions optimization context. The language in which we'll develop a sample application will be Visual Basic .NET again.

Introduction

In the last article we considered how, starting from a given graphical model, it's possible to generate a set of random and chaotic solutions that will evolve towards our target through the constant selection of the best samples from each set, crossing their information to create new champions, perhaps better. Though those are interesting ways to explore GAs, when we discuss those coded to grow towards a strict model have the obvious need of a starting complete and ideal solution. That means if we don't have a final result to match each solution we create, we won't have any way to evolve beyond chaos, closing our algorithm in an infinite loop of random samples, because they won't be oriented to any specificity.

Obviously, a GA must always have a target, in one way or another. However, a monolithic model isn't always necessary, but after we have defined the optimal conditions we expect, we let the algorithm find solutions that meet those requirements (or, the optimal solution to a given problem). There are many practical applications for that, regarding many fields of human knowledge. But since we learn better when we play with something, we'll see here an example of a Visual Basic GA oriented to solve a 2D maze, trying,  using previously seen functionalities, to not only join two points in space, but to do it in one of the better ways, in other words searching for the shorter path.

A brief recap

Let's review the main concepts to refresh your memory.

A GA is so named because it tries to mimic those biological processes at the core of species evolution. We could say it attempts to create a computer model able to auto-tune itself to a specific context, generating individuals (solutions to a given problem) that will be used to evaluate a real goal of a given objective. The problem to be solved is the digital equivalent of the biological context in which individuals move, obeying the "fittest's survive" tenet. In the previous article I've used the example of a world filled with predators, in which the fastest inhabitants (more apt to an efficient flight) or the more pugnacious (mode apt to an efficient fight) represents the best candidates to survive, if compared with slower o weaker samples. In the same way, if we have a digital target (equivalent of the base parameter of biological survival), we could tell about "apt individuals" in the measure in which potential solutions get nearer to the given objective. Among them, in the same way nature does, we'll try to identify the best solutions to get them to reproduce, generating new solutions that will bear their parent's peculiarities. A quick sample about it has been shown in the first article. Given a certain graphical icon (an array of bytes that produces an image), starting from a fixed number of chaotic solutions, in each generation the best solutions are those akin to the model at which they aim.

The biological inspiration of GA passes through its terminology, too. In it, we'll speak of:

  • The concept of individual/chromosome, of a single solution to a given problem
  • Population, or the entire set of possible solutions
  • Gene, or a parte (byte) of an individual
  • Fitness, to be intended as the rate of validity in a solution/individual
  • Crossing-over, or the solutions recombination to produce new models, new chromosomes/individuals
  • Mutation, to be intended as random changes (or pseudo-random) that can occur inside a solution

All those concepts are valid in a digital world and must be implemented in some way to speak about a real GA-based solution, or a solution which can autoevolve. The developer must define each one of them, in theoretical and practical means to deal with each phase.

Optimizing a solution

In principle, when dealing with the case of image replication too, when taling about GAs we could always think about optimization. In fact, each GA case is, ultimately, the search for the best solution, whenever it refers to reach the ideal target or falling in a satisfying local optimum. With the term "local optimum", we indicate the inability to progress further ta a given precision rate, due to certain population characteristics that, after recombination, doesn't produce improvements. A method to prevent stagnation is possibly applied through mutation, using which we introduce, on a random basis, changes in the newly created solutions, to maintain a little constant of chaos in transmitting information from a generation to the next one.

Despite that, we could always refer to the concept of optimization, it becomes more evident when applied to a logistical problem. Let's suppose we are inside a maze that provides several ways to reach the destination. In it we could move as the internal topography allows and it is possible to repeat a path or never reach the destination. Essentially, in such a labyrinth there are many ways to reach a solution, but a solution that will be better compared to others exists, too. In such a place, the best road is the one that leads us from the start point to the end in a straight way, in the least steps possible (in other words, the optimal solution). A GA could help us to spot that path, without previously knowing what the possibilities are. We note it isn't just about finding a way to arrive, but to test its validity, considering what will be better.

Base structure

To be able to think about a labyrinth, we must have a labyrinth. Here we'll see the basis I've used in my example, to briefly explain what will be the ground on which we will work. As usual, at the end of the article you'll find a link to download the source code of the example that I invite you to consult for a detailed view about those aspects that won't be treated in this article. Running the sample code, we'll see a window like this:

The program will create two windows with 20 cells per side that will be used as a graphical representation on the elements constituting the user-defined labyrinth (on the left) and the one that will show the found solutions (on the right). In the left panel, the user can click each cell a given number of times. The following is a short table to explain what each click will produce on a cell:

  • 1 Click = "Wall" cell (Gray, it wil be impassable)
  • 2 Click = Start point (Green, the entering point of the labyrinth)
  • 3 Click = End point (Red, the exit point of the labyrinth)
  • 4 Click = Valid cell (not used in labyrinth creating, it will belong to the GA)
  • 5 Click = "Free" cell (Black, a cell that can be crossed)
To be able to test the program quickly, I've implemented two buttons to save and load a previously created labyrinth. In the source code's package, I've made available the "test_maze.txt" file that is the one we use in this example. Loading that file, we'll obtain the following labyrinth:

We can immediately see there are many possible solutions between the start and end point. Before getting into the GA part, I will spend some words about the creation of the labyrinth, to explain its underlying logic. That will be useful for a better understanding of the algorithm itself.

Implementing a labyrinth in VB

Our labyrinth has a base made by a bidimensional byte array, named sourceMaze(). Each cell of it represents an array position, in the form (X,Y). Each of those bytes is set to a value that represents the current state of a specific cell. Those values are associated with an enumeration, visible in the Globals.vb file:

  1. Module Globals  
  2.     Public Const MAZESIZE As Byte = 20  
  3.     Public Const MAZESIDE As Byte = 15  
  4.     
  5.     Public Enum CELLSTATE  
  6.         Free = 0  
  7.         Wall = 1  
  8.         Start = 2  
  9.         Finish = 3  
  10.         Valid = 4  
  11.         FakeWall = 5  
  12.     End Enum  
  13. End Module  

There is an additional value compared to the values discussed above, namely FakeWall = 5. We will talk about it soon, when we'll analyze how the solutions are created.

During the Form load, a call to the routine that draws the labyrinths is made. Drawing operations are done using MazeCell controls that are PictureBox extensions created to contain data from the data array as in the following:

  1. Private Sub GenerateMaze(container As Panel)  
  2.     Dim _TOP As Integer = 0  
  3.     Dim _LEFT As Integer = 0  
  4.     
  5.     For _Y As Integer = 0 To MAZESIZE - 1  
  6.         For _X As Integer = 0 To MAZESIZE - 1  
  7.     
  8.             Dim pb As New MazeCell  
  9.             With pb  
  10.                 .Name = container.Name & _Y.ToString("00") & _X.ToString("00")  
  11.                 .Top = _TOP  
  12.                 .Left = _LEFT  
  13.                 .Width = MAZESIDE  
  14.                 .Height = MAZESIDE  
  15.                 .BackColor = Color.Black  
  16.                 .BorderStyle = BorderStyle.FixedSingle  
  17.                 .Cursor = Cursors.Hand  
  18.     
  19.                 .MatX = _X  
  20.                 .MatY = _Y  
  21.     
  22.                 AddHandler pb.Click, AddressOf MazeClickCell  
  23.             End With  
  24.     
  25.             _LEFT += MAZESIDE  
  26.             container.Controls.Add(pb)  
  27.         Next  
  28.         _LEFT = 0  
  29.         _TOP += MAZESIDE  
  30.     Next  
  31. End Sub  
GenerateMaze() needs an argument of type Panel that represents the owner control in which we'll generate our strcture. Next, with two nested loops, cells are created. They will be bound with the sourceMaze array during MazeClickCell event, defined as:
  1. Private Sub MazeClickCell(sender As Object, e As EventArgs)  
  2.     Dim pb As MazeCell = CType(sender, MazeCell)  
  3.     
  4.     pb.CellStatus += 1  
  5.     If pb.CellStatus > CELLSTATE.Valid Then pb.CellStatus = CELLSTATE.Free  
  6.     
  7.     sourceMaze(pb.MatX, pb.MatY) = pb.CellStatus  
  8.     pb.BackColor = pb.CellBrush  
  9. End Sub  
We can note MazeCell controls, in comparison with standard PictureBox, have additional properties, like MatX that identifies a cell's abscissa in the matrix, MatY that represents an ordinate, CellStatus that exposes a cell's content and CellBrush that, following the status of a cell, represents the cell itself with a color. The source code that realizes the MazeCell extension is the following:
  1. Public Class MazeCell  
  2.     Inherits PictureBox  
  3.     
  4.     Dim _STATUS As Byte = 0  
  5.     Dim _MATX As Byte = 0  
  6.     Dim _MATY As Byte = 0  
  7.     
  8.     Public Property CellStatus As Byte  
  9.         Get  
  10.             Return _STATUS  
  11.         End Get  
  12.         Set(value As Byte)  
  13.             _STATUS = value  
  14.         End Set  
  15.     End Property  
  16.     
  17.     Public ReadOnly Property CellBrush As Color  
  18.         Get  
  19.             Select Case CellStatus  
  20.                 Case CELLSTATE.Free  
  21.                     Return Color.Black  
  22.                 Case CELLSTATE.Valid  
  23.                     Return Color.Violet  
  24.                 Case CELLSTATE.Start  
  25.                     Return Color.Green  
  26.                 Case CELLSTATE.Finish  
  27.                     Return Color.Red  
  28.                 Case CELLSTATE.Wall  
  29.                     Return Color.Gray  
  30.                 Case CELLSTATE.FakeWall  
  31.                     Return Color.Yellow  
  32.             End Select  
  33.         End Get  
  34.     End Property  
  35.     
  36.     Public Property MatX As Byte  
  37.         Get  
  38.             Return _MATX  
  39.         End Get  
  40.         Set(value As Byte)  
  41.             _MATX = value  
  42.         End Set  
  43.     End Property  
  44.     
  45.     Public Property MatY As Byte  
  46.         Get  
  47.             Return _MATY  
  48.         End Get  
  49.         Set(value As Byte)  
  50.             _MATY = value  
  51.         End Set  
  52.     End Property  
  53.     
  54.     Public ReadOnly Property StringMatrix As String  
  55.         Get  
  56.             Return _MATX.ToString & ";" & _MATY.ToString  
  57.         End Get  
  58.     End Property  
  59. End Class  

The present structure allows the user to define his own base labyrinth, to be then passed as a base for the GA, or saving it for future reference. (I've implemented, but not optimized, basic routines to save and load labyrinth data, but I won't discuss them here, to focus on the topic's fundamentals.)

Genetic algorithm logics

Now that we have an operative context, we can start reasoning about the explorative possibilities of an automaton. We've mentioned above about the rules to be applied for each type of cell and consequently we know what the basic prequisites are that a generic explorative algorithm must possess (please note: generic, not genetic). We know it must start moving from the Green cell, trying to reach the Red one, without crossing Grey ones. All the Black cells can be explored/stepped over. We can think about a routine which, given a start point, evaluates the status of the four adjacent cells, to recursively cross those that will be considered free.

In other words, being solverBytes() and an array containing a given solution, a function like this will do:

  1. Private Function Solve(_X As Integer, _Y As IntegerAs Boolean  
  2.     If solverBytes(_endPos(0), _endPos(1)) = CELLSTATE.Valid Then Return True  
  3.     
  4.     If solverBytes(_X + 1, _Y) = CELLSTATE.Free Or solverBytes(_X + 1, _Y) = CELLSTATE.Start Or solverBytes(_X + 1, _Y) = CELLSTATE.Finish Then  
  5.         solverBytes(_X + 1, _Y) = CELLSTATE.Valid  
  6.         If Solve(_X + 1, _Y) Then Return True  
  7.     End If  
  8.     
  9.     If solverBytes(_X - 1, _Y) = CELLSTATE.Free Or solverBytes(_X - 1, _Y) = CELLSTATE.Start Or solverBytes(_X - 1, _Y) = CELLSTATE.Finish Then  
  10.         solverBytes(_X - 1, _Y) = CELLSTATE.Valid  
  11.         If Solve(_X - 1, _Y) Then Return True  
  12.     End If  
  13.     
  14.     If solverBytes(_X, _Y + 1) = CELLSTATE.Free Or solverBytes(_X, _Y + 1) = CELLSTATE.Start Or solverBytes(_X, _Y + 1) = CELLSTATE.Finish Then  
  15.         solverBytes(_X, _Y + 1) = CELLSTATE.Valid  
  16.         If Solve(_X, _Y + 1) Then Return True  
  17.     End If  
  18.     
  19.     If solverBytes(_X, _Y - 1) = CELLSTATE.Free Or solverBytes(_X, _Y + 1) = CELLSTATE.Start Or solverBytes(_X, _Y + 1) = CELLSTATE.Finish Then  
  20.         solverBytes(_X, _Y - 1) = CELLSTATE.Valid  
  21.         If Solve(_X, _Y - 1) Then Return True  
  22.     End If  
  23.     
  24.     Return False  
  25. End Function  

As can be noted, giving a start point ad [_X,_Y] coordinates results in looping through the function (and recursively into it) until all the cells that leads to the solution are crossed. In other words, the algorithm will explore the entire solution domain. Applying the function to our labyrinth, we will obtain a solution like this:

Obviously, that will be an encompassing solution, with the sole purpose of tracking down all possibilities. It is the most dispersive solution. Nonetheless, it could have its usefulness in a specific context (please note that, at a conceptual level, our function is a trivial graphical floodfill, given drawing margins), but that's not what we desire to obtain here. We are starting to see what the peculiarities are of a GA, what it must be able to do. In our case, it must try to solve a labyrinth, without trying to taking out a massive amount of space. So, generating individuals capable of solving our problem, we can introduce a kind of limiter.

We saw the limiter above, in the CELLSTATE enumeration, with the value of FakeWall = 5. When we'll generate a new individual, we'll introduce in it some "dirt", to alter the results of the Solve() function with fake walls, precisely, or walls that are not in the base labyrinth structure, having the same behavior of a standard wall.

To be spotted more easily, we'll draw them using the color Yellow. As said, those are arbitrary limiters that respond to the same impossability laws as traditional walls and are useful to delimiter, for each related individual, the solution domain. In fact, we can note that, where a Yellow wall is present, the exploration of the maze is blocked from that spot on. Such randomness allows us to generate a manifold population of individuals. It will contain elements that will solve the maze and elements unable to reach the exit (for example, think about a Yellow cell standing right in front of a Red one). Next, each solution will arrive to its conclusion in a given number of steps. There will be elements that will arrive to the Red cell in a more straight way, whereas another will follow more convoluted paths, being more dispersive.

A hypothetical solution in VB

Let's now see the code of the GASolver class that represents the single individual or solution. Please note it has a generic constructor, with a parameter that expects the array of the user-defined maze, to replicate in each solution the bytes representing walls, starting point and an end point that are essential prerequisites to solve the maze, since they are fixed. With some probability, one or more cells will take the FakeWall value, to add limiters where, in drawing the maze, the user has left free cells.

  1. Public Class GASolver  
  2.     
  3.     Dim _startpos(1) As Byte  
  4.     Dim _endPos(1) As Byte  
  5.     Dim _isSolved As Boolean  
  6.     
  7.     Dim solverBytes(MAZESIZE, MAZESIZE) As Byte  
  8.     
  9.     Public ReadOnly Property StartPos As Byte()  
  10.         Get  
  11.             Return _startpos  
  12.         End Get  
  13.     End Property  
  14.     
  15.     Public ReadOnly Property EndPos As Byte()  
  16.         Get  
  17.             Return _endPos  
  18.         End Get  
  19.     End Property  
  20.     
  21.     Public ReadOnly Property GetBytes As Byte(,)  
  22.         Get  
  23.             Return solverBytes  
  24.         End Get  
  25.     End Property  
  26.     
  27.     Public ReadOnly Property Steps As Integer  
  28.         Get  
  29.             Dim _steps As Integer = 0  
  30.             For _Y As Integer = 0 To MAZESIZE - 1  
  31.                 For _X As Integer = 0 To MAZESIZE - 1  
  32.                     If solverBytes(_X, _Y) = CELLSTATE.Valid Then _steps += 1  
  33.                 Next  
  34.             Next  
  35.     
  36.             Return _steps  
  37.         End Get  
  38.     End Property  
  39.     
  40.     Public ReadOnly Property isSolved As Boolean  
  41.         Get  
  42.             Return Solve(_startpos(0), _startpos(1))  
  43.         End Get  
  44.     End Property  
  45.     
  46.     Private Function Solve(_X As Integer, _Y As IntegerAs Boolean  
  47.         If solverBytes(_endPos(0), _endPos(1)) = CELLSTATE.Valid Then Return True  
  48.     
  49.         If solverBytes(_X + 1, _Y) = CELLSTATE.Free Or solverBytes(_X + 1, _Y) = CELLSTATE.Start Or solverBytes(_X + 1, _Y) = CELLSTATE.Finish Then  
  50.             solverBytes(_X + 1, _Y) = CELLSTATE.Valid  
  51.             If Solve(_X + 1, _Y) Then Return True  
  52.         End If  
  53.     
  54.         If solverBytes(_X - 1, _Y) = CELLSTATE.Free Or solverBytes(_X - 1, _Y) = CELLSTATE.Start Or solverBytes(_X - 1, _Y) = CELLSTATE.Finish Then  
  55.             solverBytes(_X - 1, _Y) = CELLSTATE.Valid  
  56.             If Solve(_X - 1, _Y) Then Return True  
  57.         End If  
  58.     
  59.         If solverBytes(_X, _Y + 1) = CELLSTATE.Free Or solverBytes(_X, _Y + 1) = CELLSTATE.Start Or solverBytes(_X, _Y + 1) = CELLSTATE.Finish Then  
  60.             solverBytes(_X, _Y + 1) = CELLSTATE.Valid  
  61.             If Solve(_X, _Y + 1) Then Return True  
  62.         End If  
  63.     
  64.         If solverBytes(_X, _Y - 1) = CELLSTATE.Free Or solverBytes(_X, _Y + 1) = CELLSTATE.Start Or solverBytes(_X, _Y + 1) = CELLSTATE.Finish Then  
  65.             solverBytes(_X, _Y - 1) = CELLSTATE.Valid  
  66.             If Solve(_X, _Y - 1) Then Return True  
  67.         End If  
  68.     
  69.         Return False  
  70.     End Function  
  71.     
  72.     Public Sub New(bytes01(,) As Byte, bytes02(,) As Byte, startPos As Byte(), endPos As Byte())  
  73.         _startpos = startPos  
  74.         _endPos = endPos  
  75.     
  76.         For _Y As Integer = 0 To MAZESIZE - 1  
  77.             For _X As Integer = 0 To MAZESIZE - 1  
  78.                 If bytes01(_X, _Y) = CELLSTATE.Wall Then solverBytes(_X, _Y) = bytes01(_X, _Y)  
  79.                 If bytes01(_X, _Y) = CELLSTATE.FakeWall Then solverBytes(_X, _Y) = bytes01(_X, _Y)  
  80.             Next  
  81.         Next  
  82.     
  83.         For _Y As Integer = 0 To MAZESIZE - 1  
  84.             For _X As Integer = 0 To MAZESIZE - 1  
  85.                 If bytes02(_X, _Y) = CELLSTATE.Wall Then solverBytes(_X, _Y) = bytes02(_X, _Y)  
  86.                 If bytes02(_X, _Y) = CELLSTATE.FakeWall Then solverBytes(_X, _Y) = bytes02(_X, _Y)  
  87.             Next  
  88.         Next  
  89.     
  90.         For _Y As Integer = 0 To MAZESIZE - 1  
  91.             For _X As Integer = 0 To MAZESIZE - 1  
  92.     
  93.                 If solverBytes(_X, _Y) <> CELLSTATE.Wall Then  
  94.                     Randomize()  
  95.                     If Int(Rnd() * 100) Mod 99 = 0 Then  
  96.                         If Int(Rnd() * 100) Mod 99 = 0 Then  
  97.                             solverBytes(_X, _Y) = CELLSTATE.FakeWall  
  98.                         Else  
  99.                             solverBytes(_X, _Y) = CELLSTATE.Free  
  100.                         End If  
  101.                     End If  
  102.                 End If  
  103.     
  104.             Next  
  105.         Next  
  106.     
  107.         solverBytes(startPos(0), startPos(1)) = CELLSTATE.Start  
  108.         solverBytes(endPos(0), endPos(1)) = CELLSTATE.Finish  
  109.     End Sub  
  110.     
  111.     Public Sub New(sourceBytes(,) As Byte)  
  112.         For _Y As Integer = 0 To MAZESIZE - 1  
  113.             For _X As Integer = 0 To MAZESIZE - 1  
  114.     
  115.                 If sourceBytes(_X, _Y) = CELLSTATE.Start Then _startpos(0) = _X : _startpos(1) = _Y  
  116.                 If sourceBytes(_X, _Y) = CELLSTATE.Finish Then _endPos(0) = _X : _endPos(1) = _Y  
  117.     
  118.                 If sourceBytes(_X, _Y) = CELLSTATE.Free Then  
  119.                     Randomize()  
  120.                     If Int(Rnd() * 100) Mod 50 = 0 Then  
  121.                         solverBytes(_X, _Y) = CELLSTATE.FakeWall  
  122.                     Else  
  123.                         solverBytes(_X, _Y) = CELLSTATE.Free  
  124.                     End If  
  125.                 Else  
  126.                     solverBytes(_X, _Y) = sourceBytes(_X, _Y)  
  127.                 End If  
  128.     
  129.             Next  
  130.         Next  
  131.     
  132.     End Sub  
  133. End Class  

A Solve() function is also present, to be able to determine if a solution can reach the exit cell, or if it fails somewhere else. There are other properties and functions that will be analyzed soon.

Solution's fitness

As we saw in the previous article, an essential part of a GA is the one that allows a determination of an adaptability score when confronted with a given problem. It goes without saying that the way in which that score will be calculated rests on the problem's tipology. Fitness is essential, because when we have a population made of a solution more or less close to our target, it allows the extraction of some champions with potential characterics we can desire to try to imrove the individual's generation. It is therefore evident that an error in designing a fitness calculator's function will be the foundation stone to build a failure in a GA realization.

In trying to limit problems laying on an eroneous fitness attribution to one or more solutions, deep problems analyzing is a requirement. In cases as an image replication is simpler, we know it is, at the core, an array of bytes ordered in a given way and the fitness of a solution passes by checking how many bytes are shared between our target and the solutions. When confronted with open problems, like the present one, a step forward is needed. Here we aren't interested in evaluating the affinity with a predefined model, but we want to discuss the potential effctiveness of a solution.

Simplifying, in the case of a maze, we are interested in the following two aspects, already mentioned previously:

  1. The solution must lead to a valid end, it must reach the destination
  2. The solution must be quick; having satisfied the prerequisite 1), it must be able to obtain it in the most neat way
We can say that, in our case, the fitness calculation passes for two properties. The first is to reach a goal, whereas the second is the number of steps taken, where the fewer there are, the greater is our solution's quality. To that means, the GASolver class implements two respective properties, isSolver, which returns a boolean value to understand that it is a specific individual that reaches the target and Steps that count the amount of valid cells in the solution itself.
 
Initialization and crossing-over

The initialization's concept follows simple rules that are about creating a first generation of solutions, using standard constructors. In the attached source code, the operation happens at the btnInit button's Click:

  1. Private Sub btnInit_Click(sender As Object, e As EventArgs) Handles btnInit.Click  
  2.     epochs = 0  
  3.     
  4.     For _Y As Integer = 0 To MAZESIZE - 1  
  5.         For _X As Integer = 0 To MAZESIZE - 1  
  6.             If sourceMaze(_X, _Y) = CELLSTATE.Start Then _startPos(0) = _X : _startPos(1) = _Y  
  7.             If sourceMaze(_X, _Y) = CELLSTATE.Finish Then _endPos(0) = _X : _endPos(1) = _Y  
  8.         Next  
  9.     Next  
  10.     
  11.     gaList = New List(Of GASolver)  
  12.     For ii As Integer = 0 To 999  
  13.         gaList.Add(New GASolver(sourceMaze))  
  14.     Next  
  15.     
  16.     gaCursor = 0  
  17.     MazePaintItem(mazeDest)  
  18.     
  19.     btnStart.Enabled = True  
  20.     btnStop.Enabled = False  
  21. End Sub  

We reset a counter that will be used to count the generations, then we save the cells that the user has defined as the start and end points. In a List (OF GASolver), that we will query via LINQ, we'll add the desired number of newly created GASolvers (in our case, 1000 units), to proced to their graphical representation. Each initial solution, as we saw, will be initialized by adding dummy delimiters into the spaces the user has kept clear, to create variety in the population.

By pressing the btnStart button, the seach for solutions will start. Note that we will extract, during each loop, those elements that solve the maze (ga.isSolved = True), ordered by the number of steps used to complete the task (ordered by ga.Steps Ascending). That way, we will be sure to extract the two best solutions at the moment. Those elements will be recombined, producing two new elements, following the rules we'll see in a moment. Lastly, ordering our list for a descending amount of steps, we can discard the two worst elements. Please note that we won't apply any discriminant on the solution's quality and it is possible to discard elements that arrivees at a valid solution, but having the flaw to be, among all solutions, the most dispersive ones.

Cross-over generates two new elements as explained using the four-argument constructor of the GASolver class, as follows:

  1. Public Sub New(bytes01(,) As Byte, bytes02(,) As Byte, startPos As Byte(), endPos As Byte())  
  2.     _startpos = startPos  
  3.     _endPos = endPos  
  4.     
  5.     For _Y As Integer = 0 To MAZESIZE - 1  
  6.         For _X As Integer = 0 To MAZESIZE - 1  
  7.             If bytes01(_X, _Y) = CELLSTATE.Wall Then solverBytes(_X, _Y) = bytes01(_X, _Y)  
  8.             If bytes01(_X, _Y) = CELLSTATE.FakeWall Then solverBytes(_X, _Y) = bytes01(_X, _Y)  
  9.         Next  
  10.     Next  
  11.     
  12.     For _Y As Integer = 0 To MAZESIZE - 1  
  13.         For _X As Integer = 0 To MAZESIZE - 1  
  14.             If bytes02(_X, _Y) = CELLSTATE.Wall Then solverBytes(_X, _Y) = bytes02(_X, _Y)  
  15.             If bytes02(_X, _Y) = CELLSTATE.FakeWall Then solverBytes(_X, _Y) = bytes02(_X, _Y)  
  16.         Next  
  17.     Next  
  18.     
  19.     For _Y As Integer = 0 To MAZESIZE - 1  
  20.         For _X As Integer = 0 To MAZESIZE - 1  
  21.     
  22.             If solverBytes(_X, _Y) <> CELLSTATE.Wall Then  
  23.                 Randomize()  
  24.                 If Int(Rnd() * 100) Mod 99 = 0 Then  
  25.                     If Int(Rnd() * 100) Mod 99 = 0 Then  
  26.                         solverBytes(_X, _Y) = CELLSTATE.FakeWall  
  27.                     Else  
  28.                         solverBytes(_X, _Y) = CELLSTATE.Free  
  29.                     End If  
  30.                 End If  
  31.             End If  
  32.     
  33.         Next  
  34.     Next  
  35.     
  36.     solverBytes(startPos(0), startPos(1)) = CELLSTATE.Start  
  37.     solverBytes(endPos(0), endPos(1)) = CELLSTATE.Finish  
  38. End Sub  

Essentially, it requires two arrays, referred to the bytes of the elements to be recombined and two pairs of coordinates, namely the start and exit points of the maze.

Looping through the two matrices, a matrix is generated, putting in it the combination of Wall and FakeWall types. A Wall cell will be the same between the two, but a FakeWall will possibily differ. If the two solutions come to a valid end, in other words the FakeWall from both matrices can be applied without affect the solution, then it is made better by removing a potentially free cell to be considered valid steps. When the process ends, an additional mutation loop calculates the eventual transformation of non-Wall cells in Wall or FakeWall cells. The newly created solution will then have the same Walls as the parents, the same start/end point, but the virtual walls from both, plus some mutation (if they occur).

The logic is simple: the thing that determines a more economic solution, is the freedom of movement. The more the free cells are, the greater the solution's domain will be, and consequently the steps towards the exit point. In the described way, we apply a progressive limiter towards a basic path that will be constituted by the sum of the limiters that, in past generations, had represented and helped that objective.

Making the algorithm run, alternating generations, in a few hundreds of epochs we'll have the optimal solution, or the solution that leads from the start point to the end without "indecisions". At the same time, the progressive discarding of poor solutions, in favor of those with higher performances, will lead to an entire population of acceptable solutions.

In the preceding example, it's possible to see that after about 14000 generations of solutions, the result is really close to the optimal one. There is room to improve, and proceeding though epochs it's possible to reach better results, though slow due to the recombination of nearly-ideal solutions (that can count almost exclusively on mutation to improve further). Lastly, please note that, with a given number of potential solutions, the path chosen by the algorithm will be affected by the elaborations themselves. For example, if a mutation inhibits a certain path, the algorithm will focus on the remaining solutions. So, we won't obtain the ultimately better overall solution, but the better depending on the current context and the agents at our disposal, miming at a conceptual level what happens in biology too.

Source Code
Demo