Taking a cup of tea while dissecting algorithms with Nawas Naziru: MERGE SORT

1.1 Background Insight/ How it was discovered.

When Given an array to sort, the problem is sorting in as little time as possible not as in insertion sort in which in the worst case, the time to sort ramps up quickly with the number of elements to sort. To accomplish that we take rather this seeming foolish approach that will later turn into breakthrough and wonders:

Assume the array is already sorted 😎😂! If so, it then means that we don't even have to do any work i.e. it is in best case. Not so lucky! It is very rare to find array that is already sorted. So, that can't work let's throw this approach away and take another? But...wait! What? We don't easily give up in things in life--we have to tweak things most of the time without which many things in life won't have been possible. Why can't we find way of converting any array to an already sorted state? Doing so is nothing more than the sorting we are trying to do😋 hence back to same problem. Still without giving up, let's calm down a bit, our demand which is too high making things impossible. What way can we take so that many portions of the array is already sorted and thus, we need to do only little work? So what portion of any array or list of items is already sorted? Let's experiment and make trials. Pause 🔇! Each and every single-element sub array is already sorted in what ever direction (ascending, descending...). So, the time to sort each single element is zero. What does this suggest? Even from real experience we know that the time taken to sort increases with increase in the number of items. Since the time taken to sort a single element is zero, then such time would increase linearly or proportionately and so, we expect that the time taken to sort two-element array would be less than that for 5-element array. So, why can't we use that to an advantage and build up our array this way so that it appears finally in sorted form? Wait won't it be still same amount of time or even more since we combine continuously and the time to sort increasing till we obtain the original array in sorted form? Nope! It is easy to remember that at each intermediate stage just before combining, the sub-arrays are already sorted and as such, from common sense, we know that the time taken to sort the two i.e. to combine them so that they appear in sorted form would be relatively small compared to insertion sort worst case where we have to compare almost each and every element with all others and change it's position--here we just compare starting from the first two elements picking the lower to be the lowest of the final resulting array and so on. To see it clearly consider this story: if you are assigned the task of arranging 10 persons in a queue in increasing order of height and luckily enough an idea come to you that instead of assuming the first you encounter is the smallest and you compare it with next and if smaller you swap and continuously which is nothing but insertion sort, you recall that the more the items to sort, the more time consuming the insertion sort would be and so you decide to split them into two and sort each half independently, then you merge finally. You will find out that the time you will take would be less and you find it less difficult because after sorting each half, it is the matter of comparing and picking going position by position.

So, with this discoveries in place, we are fully armed for reducing time taken in sorting. Simply take the array to sort and divide by 2 and sort each half? Yes that's nice idea but we can do better we can minimize the sorting time to the core by further dividing each half into two again,...each obtained into two again and continuously until we get single-element arrays. With single-element arrays, they are already sorted and so we build up from there; building two elements array then 3 or 4 elements array by merging. For example, if the original array has six elements, then each half after first division by two, is three-element array and so second division divides each of the three-element array into a two-element and one-element arrays.

1.2 The Pseudocode

What we should do in the Merge function is to take the array to sort denoted by A with its first and last indices i.e. position of it's first and last elements. The reason for the indices is to be able to calculate the midpoint of the array for subsequent division of the array into two. Note that as used here, the first index is taken as zero for the sake of simplicity and not 1 as you might come across in other materials.

MergeSort(A, p, r)

1. -----

So, the first step is to divide the given array A into two halves. To do so, we calculate its approximate midpoint denoted by q. If q is a number with decimal fraction, the fractional part is simply discarded and not round off - what is usually called flooring in programming languages. For such a case, one of the subarrays would be bigger in size than the other and that doesn't matter - remember that the main goal in merge sorting is to have the portions already sorted so that merging them to appear in sorted form takes little time.

Hence, we add the following line inside the merge function as shown below;

MergeSort(A, p, r)

1. q=(p+r)/2

With this midpoint in place, it is clear to see that the the first half would then be from position p(which is the first position for the first division but can be any other position as can be seen when it comes to further subdividing the other half) to q and the second half would be from q+1 to r. Are we to create two new arrays right here to hold the two halves? Yes of course but not creating them physically as you might expect but rather logically (i.e. we made the single array to work as though it were two separate halves but it is one). Why do that? Remember that in writing algorithms, we are after highest possible efficiency and minimisation of computing resources (space/memory, time, bandwidth and what have you). So, the original array can do but instead, we think of it and use it as two separate halves. How? Simply note down the beginning and end of the first half and the beginning and end of the second half--those are their signatures.

With that in place, the next thing to do is to further divide each half into two again(why not dividing into three why always two? If we divide into three, and sort each portion, then merging would be time consuming because we have to be comparing across 3 which wouldn't be easy 😒.

Good algorithms should show elegance and be free from redundancies. As such, it portrays our expertise and skill if we continue to use this same MergeSort function to divide and merge any sub-arrays rather than polluting everywhere with similar lines of codes since the operations we do are same only that the input varies. This is achieved through what is known as recursion. That is a function during it's execution calls itself one or more times to branch out and perform similar operation on some set of data before finally coming back or returning to finish its execution. This is what we need exactly here. We want to continually divide into two, down to single element arrays without sorting(no point in the use of this algorithm if we sort immediately after dividing by two because the subarrays won't appear in sorted form just after dividing the main array and remember that this algorithm is only helpful when each of the the two divisions is already sorted in itself).

The MergeSort function we have written so far, does only one job for now which is calculating the midpoint of the array passed into it. Hence, we continue to call it on sub-arrays until hitting one element arrays then we begin to merge from there building up until we realise the original array in sorted form. How is the recursion? It is recursion upon recursion...or recursion inside recursion.

Let's see how to write that 🚀. Before that, in the above written MergeSort function, we calculated the q only but then where is the dividing into two and where are the sub-arrays? Remember that, we said we are not going to create new arrays but rather use the original given array in logical sense. For that, the first sub-array is then (A, p, q) i.e. the original array taken from the pth element(element at position p) to the qth element. And the second as (A, q+1, r). But then, we need to further divide each of these sub-arrays and so we need to pass in these two sub-arrays into the Merge Sort function again (merge sort inside merge sort? Yeah!). Let's write that;

MergeSort(A, p, r)

1. q=(p+r)/2

2. MergeSort (A, p, q)

3. MergeSort (A, q+1, r)

Note: Going by how the MergeSort function is defined, what it does in general is simply calculate midpoint between any two positions in the array not necessarily first and last(first and last positions are used only in the first division of the original array into two). So, in line 2 above, q is now the last position which marks the end of the sub-array and in line 3, q+1 is now the first position of the other sub-array. All these is easy to see when one knows how functions are defined and called/used; you can pass in any other valid variables not necessarily the particular variables the function was defined to accept.

This is simple explanation of the above written MergeSort function: first divide into two parts but then we cannot merge and sort them since each part is not yet sorted in itself. Now how to sort each? To sort each, we further divide each portion into two by passing into the MergeSort function whose job is doing that for us. Again, we cannot merge the sub-arrays obtained since not sorted. To sort, we need to further divide. This continues till reaching single element sub-arrays which are sorted so we stop there and we begin merging from there upward. Hence, it is clear to see that we have a kind of nested recursion whereby one MergeSort calls another MergeSort waiting for it to return it's result(merged sorted sub-array) before proceeding to its own merging.

But 😣...What? Wait! Doesn't that seem to be going into a loop that never ends? Nope! We stop the recursion as soon as we obtain single element arrays. Then how do we check that we reach single element arrays so that the recursion stops there? A single element array would be (A,1,1) for example. So when the p and r (first and last index) of an array are equal in value. Now, we put a line of code so that we only recurse if p is less than r; once p is equal to r then it means that we hit single element array and so we don't recurse again to divide further. Therefore we have;

MergeSort(A, p, r)

If (p < r) {

1. q = (p+r)/2

2. Mergesort (A, p, q)

3. MergeSort (A, q+1, r)

}

Now if we hit the single element arrays, what do we do next? We begin the merging process which is the real stuff. We merge the single element arrays since all single element arrays are already sorted. In the merging process, we merge them in sorted manner not just any how; remember that we still have to sort but with less effort than if the two portions under consideration are each unsorted in itself. This merging continues up, till the original array is formed in sorted form. Then how do we write the merging function and where do we place it in the MergeSort function we have written so far. Addressing the second question, it is easy to realise that we have to write the Merge function as the last line in the MergeSort function because it is the last thing we want to happen and that every sub-array needs it so every call to the MergeSort function should eventually execute it. How would it work then? Here it is: when we call the MergeSort function passing in the array we want it to sort, it computes the midpoint and pass each of the two sub-arrays into two other MergeSort functions. So, the Merge function we shall soon write at the bottom doesn't execute immediately until the two MergeSort functions are done. Inside each of them, we have same case and so, the process continues until reaching two element sub-arrays. At that point, we have single element arrays passed into the two MergeSort functions. In each you will find out that p=r and so, no more function call. Therefore, the Merge for the two-element sub-arrays will execute since the two MergeSort functions that take in the single element arrays don't call any other MergeSort function. Once the Merge process is done, it means everything is ready for the upper higher level MergeSort function that requires this to execute first...this continues up to highest upper level where we finally merge to get the original input array sorted...you get the picture 📸 !

Let's rewrite the MergeSort function to show the Merge function inside.

MergeSort(A, p, r)

If (p < r) {

1. q = (p+r)/2

2. Mergesort (A, p, q)

3. MergeSort (A, q+1, r)

4. Merge (A, p, q, r)

}

Note: For the Merge function above, we need q to be passed in also, because it is what we use to separate the two sub-arrays since they are still in contiguous space in A.

We talked about where to place the Merge function. Next is the real stuff; how to write or come up or define the Merge function because up there we just called it to use without defining it earlier.

Actually we have taken so much time explaining the backbone of the MergeSort function😧. Let's quickly write the Merge function and we explain how and why it works along the way.

Note: The first element position in the original array is zero not 1.

Merge(A, p, q, r)

1. Declare L and H to hold the left and right sub-arrays temporarily.

2. Set the sizes of the sub-arrays L and H:

Original Size of L = q - p

Original Size of R = r - (q+1)

For each of the two sub-arrays L and R, we have to add a very large number larger than any possible number i.e. approximately equal to infinity in this case(if it we're sorting in descending order, then we have to add a very small number). The reason for this is that as we shall soon see, during comparison between elements of the two sub-arrays and replacing the corresponding element of the original array A with the result of the comparison, one of the sub-arrays may get exhausted before the other. So, to continue the comparison process and make sure that all the remaining elements of the unexhausted subarray are placed into the original array at their correct position, we then need to be comparing each of them with the infinity in the exhausted subarray so that each of them is less than the very infinite number and so, it gets placed in to the array A. Thus,

n1 = new size of L = q-p+1

n2= new size of R = r-(q+1) + 1 = r-q

3. Copy the elements from the original array A to the sub-arrays. We need to do that, so that we just need to be comparing the elements in sub-array L with that in R and instead of creating new arrays to be putting the results of the comparisons, we just proceed into replacing the elements of the original array with the results of the comparison.

for ( i=0 to n1-1){

L[i] = A[i+p]

* we need to add p because it is not every time that we copy from the beginning of the original arrays. Consider when we are to merge two sub-arrays that form the first right sub-array... So adding p is a kind of offsetting.

}
L[n1]= infinity

for (j=0 to n2-1){

R[i] = A[i + q+1]

}
R[n1]= infinity

4. Next we get into comparing the elements of the L and R sub-arrays then putting the result of the comparison into original array to replace whatever is there.

i = 0

j = 0

for (k = p to r){

if(L[i] <=R[j]){

A[k] = L[i]

i = i + 1

}

else {

A[k] = R[j]

j = j + 1

}

The above last set of codes work this way:

In the first execution of the For loop, we start by comparing first element of L and first element of R i.e. L[0] and R[0]. If L[0] is smaller or equal to R[0], we replace first element of A i.e. A[0] with L[0] . This means , we are done with L[0] and as such, we move to next element in the array L in the next loop or iteration hence, the reason for incrementing i by 1 . Otherwise if L[0] > R[0], we substitute A[0] with R[0] and similarly move to next element in the sub-array R in the next loop. This process continues until one of the sub-arrays L or R gets exhausted and the benefit of the use of the very large numbers at the end of each array comes.

Note: neither of the very large numbers will be copied nor placed into the array A after the sorting process because as written above, the iteration of the for loop goes to highest, the number of elements in the array or sub-array which was earlier divided into two to give the L and R sub-arrays before the addition of the very large numbers. Work out an example to see that more clearly.

Important Note:

See other resources on the internet for practical examples, exercises and pictorial representations on how the MergeSort algorithm works. Definitely you will find everything like a breeze.

Let's put together, the Merge Sort algorithm without any comment.

MergeSort(A, p, r)

If (p < r) {

1. q = (p+r)/2

2. Mergesort (A, p, q)

3. MergeSort (A, q+1, r)

4. Merge (A, p, q, r)

}

Merge(A, p, q, r)

n1 = q-p+1

n2= new size of R = r-q

for ( i=0 to n1-1){

L[i] = A[i+p]

}

L[n1]= infinity

for (j=0 to n2-1){

R[i] = A[i + q+1]

}
R[n1]= infinity

i = 0

j = 0

for (k = p to r){

if(L[i] <=R[j]){

A[k] = L[i]

i = i + 1

}

else {

A[k] = R[j]

j = j + 1

}

}

Coming up next is the implementation of the Merge Sort algorithm in Javascript followed by Time and Space complexity calculations of the algorithm. Stay tuned............

Taking a cup of tea while dissecting algorithms with Nawas Naziru

MERGE SORT - A TRUE LIFE STORY

No comments:

Post a Comment

Popular Posts

Report Abuse