How Count Sort works

Razi Rais
2y
48.4k
0
0

Article

Introduction

Count Sort is a Linear Sorting algorithm that sorts elements in O(n) time, the other linear sorts include Bucket and Radix sorts.

What is Linear Sorting Algorithm

A Sorting algorithm that does not use any comparison operator ( >,<, >=, <=, = = ) to determine the sorting order of the elements, the sorting is achieved by acute logic build, and the overall time taken by the algorithm is hence linear.

Actually, if we contrast linear sorts to other comparison sorts with respect to time we will find that comparison sorts can do n log n at their best and exponential at worse in terms of time, the linear sort gives linear performance and thus have a fine edge in time over these algorithms.

Note: Along with Time, we also have to look at the Space/Memory usage of a particular algorithm, and actually this is exactly where Linear Sort falls back, as Linear Sorts use extra space roughly more than that of the original sorted data structure.

The Internal Working of Count Sort

The Count Sort is quite simple as compared to other sorting algorithms ( not even a single nested loop ), I will explain the logical working of count sort, and for plainness purposes use an array data structure.

There are only two Classes in the code namely :

CountSort: This class contains methods Max, Count_Sort ( ), and Display ( ), for finding the Maximum element in an array, sorting the array with Count Sort, and for display all the elements in the array, respectively.
CountSortApp: This class contains the Main() method that contains the Count Sort class object, calls to methods of Count Sort, and does nothing more.

The Count Sort Class have only two attributes

class CountSort
{
int []thearray; // the array of unsorted elements
int i;
.....

Now go through the Constructor of CountSort

public CountSort(int size)
{
thearray = new int [size] ;
Random ran = new Random();
for(int i = 0 ; i< size; i++)
{
thearray[i] = ran.Next(i,i*size);
}
}

1) The Constructor takes the size of the array and made thearray of this size.

2) Initialize thearray to Random numbers, first the Class Random object is declared i-e ran, and later uses its method Next(lowerbound, upperbound) to generate a random number at each iteration and equate it to thearray [ i ], so in all thearray is filled up with random numbers.

Next, is the Max() method, actually, the soul purpose of this method is to find the maximum number in the thearray.

// Find maximum Number in the Array
// This will take O(n) time
private int Max()
{
int max = thearray[0]; // first element be the max
for(i = 1 ; i<thearray.Length ;i++)
{
if (thearray[i] > max )
{
// if the elemnt at index 'i ' is greater than
//previous max , than set this value to the max
max = thearray[i] ;
}
//end of if
}
//end of for
return max;
}

Let's go through this method's code step by step.

3) First lets take the first element in the array to be the maximum number in the whole array i-e max = thearray[0]

Now iterate over the whole array except the first element (as we already select it as a maximum element), and whenever you find the element greater than the value contained in variable max, just replace the value of max with this value, actually, the check if (thearray[i] > max ) do the checking of element at index i with the current value of max, and in the case, if it is greater, replace it with max = thearray[i]

Now we come to the Count_Sort method, which contains all the real work.

First, there is the declaration for variables used by the CountSort method

int k = Max();
// the maximum element in the array
int []output = new int[thearray.Length];
int []temp = new int[k+1]; //For indeing up to k , we required array of k+1

4) Declaration of variable k which holds the maximum element in thearray, is done so by calling Max() method.

5) Declaration of array output of type int, equals in size of the original array. The output array will have final sorted values, as you can see that in order to have the original array sorted you have to copy from the output array to the original array ie thearray. I have more to say about this and other crucial facts in the last section.

6) Finally the declaration of array temp, which will provide the indexing up to the max value in thearray, temp array is of size k+1, since in C#, in order to have indexing up to 'k' we need to declare it of size k+1.

Now, we have a bunch of for loops, I will go through each of them separately, in order to clarify their functionality

The first for loop

for( i= 0 ; i<k+1 ; i++)
{
temp[ i ] = 0;
}

7) The initialization of the array temp to 0, actually this step is done in order to keep clarity and pace with the algorithm. What is done is only to make sure that every element of temp is set to zero, as this zero will play important role in the next coming role.

The Second for loop

for(i = 0 ; i<thearray.Length ; i++)
{
temp [ thearray [ i ] ] = temp [ thearray [ i ] ] + 1;
}

8) This loop contains some elaborate logic, worth explaining, let's consider the original array thearray contained values {2,1,3,2,4}, so the value of k will be k= 4, and we have array temp of size k+1 i.e. 5, and an output array of size of thearray i.e. 5, it can be shown diagrammatically as fig 1.0

figure 1.0

9) The statement temp [thearray [ i ]] = temp [thearray [i]] + 1 can be breaked into

int p = thearray [ i ] ;
temp [ p ] = temp [ p ] + 1 ;

The element at thearray[i] is saved in variable p and this value is used for indexing the array temp, and finally adding one to the value that is obtained by indexing.

In the temp array, every index corresponds to the element / possible element of the original unsorted array thearray. For e.g. the index number 2 in temp corresponds to the element 2 in thearray, see also that index 0 does not have corresponding element 0 in thearray. So the statement temp [ p ] +1 will actually mean that we have an occurrence element p, so increment it by one so its number of occurrences can be captured, well I also feel pretty much hazy so let's take a look at what happened for different values of i and get the picture right.

For i = 0

p = thearray [ 0 ] ;
// this will results in p = 2
temp [ 2 ] = temp [ 2 ] + 1 ;
// this gives temp [ 2 ] = 0 +1
temp [ 2 ] = 1
// the number 2 is occurred

Now for i = 1

p = thearray [ 1 ] ;
// this will give p = 3
temp [ 3 ] = temp [ 3 ] + 1 ;
// temp [ 3 ] = 0+1
// temp [ 3 ] = 1 ; // the number 3 is occurred once

Now for i = 3

p = thearray [ 3 ] ;
// this will give p = 2
temp [ 2 ] = temp [ 2 ] + 1 ;
// temp [ 2 ] = 1 +1
// temp [ 2 ] = 1 ; // the number 2 is occurred twice

So on for the other values of i. After which we have temp containing occurrences of each and every element of thearray, this is illustrated in figure 1.1

figure 1.1

Now let's move to the third for loop

for( i = 1 ; i<k+1 ; i++)
{
temp[ i ] = temp[ i ] + temp[ i - 1];
}

10) The loop simply runs up to the size of the array temp, the main reason for initializing i = 1 will be clear as I explain what we are trying to do in this loop. The main idea is to start with 2nd element ( at index 1 ) of the array temp and add all previous elements including itself ( in the case of the 2nd element these are elements at indexes 0 and 1 ), then move to the next element 3rd ( at index 2 ) and again add all previous elements including itself (in case of 3rd element these are elements at indexes 0,1 and 2), and so on till the end of array temp. Since there is no element prior to 1st element ( at index 0 ), i is set to 1 at the start.

The temp array after the execution of this loop is shown in figure 1.2.

figure 1.2

and now the fourth and final for loop

for(i = thearray.Length-1 ; i>=0 ; i--)
{
output [ temp [ thearray [ i ] ] -1 ] = thearray [ i ] ;
temp [ thearray [ i ] ] = temp [ thearray [ i ] ] - 1;
}

11) Now break the statement in the loop as

int p = thearray [ i ] ;
temp [ p ] = temp [ p ] - 1

we initialize variable p with the elements of thearray one by one, next we use p as an index for the array temp, we get the value at that index, decrement it by 1, and then put this decremented value back in the array temp at index p. (see figure 1.3 below )

Well here is the explanation, as we are building the final sorted array in this very last loop, these statements actually occur due to the fact that the values in array temp are used by sorted array output as an index, and the value that is just accessed by output array must be decrement by one or else we are not able to put same elements found in original array at consecutive locations in the final sorted array output.

This fact is demonstrated in figure 1.3

12) for i = 4

13) for i = 1

figure 1.3

14) When the temp is not decremented by 1, note that element 3 is inserted at index no 4 again (below)

figure 1.4

Finally, the sorted array output is shown in (figure 1.5), also shown is the state of the array temp after the final step.

figure 1.5

Conclusion

In this article I showed how to implement the Count Sort, a sorting algorithm that sorts elements in Linear Time, well there are other factors that one must consider before using Count Sort, the Count Sort is Stable means numbers with the values appear in the output array in the same order as they do in the input array. The fact that there is no comparison used by this algorithm makes it difficult if not impossible to use it with custom objects rather than primitive data types, and even in the case of primitive types, the algorithm can perform much worse in terms of memory usage if only single element in the input set is close to maximum range for that type.