# Analysis of Merge Sort in Data Structure

\$\$/\$\$

In the following video, we will analyse the Merge Sort algorithm. Now that we have used ‘divide and conquer’ in its implementation, it should ideally showcase an improved time efficiency compared to the previous algorithms. Let’s see if it does so.

\$\$/\$\$

## Video Transcript

Now let us look at the time complexity of merge sort. So if you look at this code carefully and if you look at the first few steps of this pseudocode, you would see that here I am merely calculating my n. Here I'm only calculating my mid value which is n by two, or here I am just declaring my L and arrays. So, how much time do you think this will take? Will it really depend on the length of my input array? Suppose that my input array was of length 1 billion. So how much time would it take for me to calculate mid? Well, it would take me one single mathematical operation to calculate mid. And moreover, it would take me one single declaration to declare L and R. So basically, these entire all of these processes are going to take a constant amount of time because they involve a simple mathematical calculation where I'm calculating mid is equal to N by two and simply declaring my LNR array. Note that when it comes to inserting values in the LNR, we are doing it later. But here I'm merely just declaring my LNR array. So these steps are going to take a constant amount of time, which I'll denote by C. Now, moving on to these four loops here, I'm inserting all the elements from zero till mid minus one into L and I'm. Inserting all the remaining elements from mid till N minus one, into R. So basically, I have a large array of size N, and I'm inserting those N elements into two separate subarrays L and R. So how many steps does it involve? Well, it involves a total of n number of steps together. So writing down values in my left subarray and right subarray are total going to take N number of steps. So here both of these four loops together are going to take N amount of steps. Now, if you are still confused as to why these two for loops are taking a total N amount of time, so let us recall that in the beginning we had an array like this of eight elements and we divided it into elements of length for each.

Now, for dividing and filling values in these arrays, what I did was that I iterated from this position to this position and filled this array. So this took me a total number of four steps. Similarly, when I filled in the remaining elements into this array, this again took me a four number of steps. So how many steps did it take for me in total? It took me four plus four eight number of steps which was the length of A. And hence I say that it takes me a total of N number of steps for inserting all the values in my L and R array. Now, coming over to the next step, we see. I'm calling the merge sort function here on my L array and my merge sort function on my R array. So, if I denote the time complexity of this entire merge sort by T of N, note that T of N is merge sort on A, which is my larger array. So how much time is it going to take to do these merge sort operations on L and R? Now, since L and R both are of length N by two, so in terms of T of N, I can write them that they are going to take a time of T of N by two and N by two each.

It is because when I called merge sort of A, it took me T an amount of time. Now, when I call merge sort on array, which is half the length of a it will take me t? N by two amount of time which is for merge sort on l and it will take me t of n by two amount of time, which is merge sort on r now, coming over to the last step. I'm doing a merge operation here. So if you recall how I merged my two arrays. Well, I had these four arrays, these two arrays of length for each. And what I did was that I sequentially compared all the elements in these two smaller subarrays and I combined them into one. So I was beginning from these two points, comparing these two elements and writing them here. Then I was writing my next element here till when did I do it? Well, I did it till I reached the last element of my A array. So, if my length of A array is N, how many steps will it take for me to merge these L and R into my final A array? Yes, it is going to take me a total number of N steps, since I am merely taking elements from L and R, comparing them and putting them inside A. So this process that is this merging operation is going to take me a total number of N steps since I am only combining the elements from L and R and putting it into a final larger array a with which comprises of double elements of L and R. That is, it will take a total amount of N number of steps to merge these two smaller arrays.

## Video Recap

Merge sort is a sorting algorithm that utilizes the concept of divide and conquer.

• A merge function is defined which takes in an input array, the starting index of that input array, the middle element, the mid index of that input array, and the ending index.

• The length of the left and right subarrays are calculated.

• Two new subarrays, i and r, are declared and populated with the respective elements of the left and right subarrays.

• A merging process is executed where the smaller of the two elements is copied into a new array and the respective pointer is incremented.

• If one of the subarrays has been exhausted, then the remaining elements of the other subarray are copied into the new array.

• A sorting function is defined that takes in an input array, the starting index, and the ending index.

• The sorting function calculates the mid index and calls itself recursively on the smaller subarrays.

• Finally, the merge function is called on the left and right subarrays to sort and merge them.

• The code is tested and verified to work correctly.

Till now you have learnt the breakdown of how many steps each part of the Merge Sort code takes. Let's use this information in the next video to calculate the time complexity of the algorithm.

\$\$/\$\$

Merge sort analysis in brief :

To analyse the time complexity, we can simply see what best divide and conquer can do to merge sort, to make it efficient.

1     2       3     ………………………………........................................     n-1  n

.

Step 1: Divide

For dividing, we will compute the middle of the given ‘n’ number of elements. This would take a constant time, let’s say D(n)= Θ(1)

Step 2: Conquer

In this step, the sequence of ‘n’ elements is divided into n/2 elements each and now applying recursion, the n/2 elements would be further divided into n/2 which will continue until a single element is left. Therefore, we are recursively solving two sequences of n/2 elements that give us time complexity, let’s say 2T(n/2).

Step 3: Merge/Combine

Combining the subarrays of ‘n’ elements would take time C(n)= Θ(n).

T(n) = 0, if n<=1
T(n) = T(n/2) + T(n/2) + D(n)+ C(n), otherwise

T(n)= 2*T(n/2) + Θ(n) + Θ(1)

// while finding the Big O /Theta we ignore the constants, hence theta(1) is ignored in the next step. And theta(n) is taken to be n because even in the worst case it can have an impact of ‘n’ only.

T(n)= 2*T(n/2) + n  ………………………………………… (i)

T(n/2) = 2*(2*T(n/4) + n/2) =T(n/)+n   ……………………………… (ii)

// In this step n in equation(i) is replaced by n/2 therefore, wherever there is n we have n/2, when we n/2 it is replaced with n/4 ([n/2]/2= n/4) and so on...

T(n) =  *T(n/2) + 2n ……………………. (iii)

// In this step ‘n’ in equation(ii), is replaced by 2n to make the same equation of the form T(n).

.

.

.

.

.

= (2*T(n/2) + kn

// just like equation (iii), we have obtained a general term for T(n)

Suppose n=2k , therefore k=logn

T()=nT(n/n)+logn*n

= n*T(1) + nlogn

But T(1) = 0

T(n) = O(nlogn)

Some relief! Merge sort finally brings us out of O(N). The efficiency for merge sort comes out to be O(N log N).

\$\$/\$\$