Differences between Merge Sort and Bubble Sort

Table of Contents

SORTING ALGORITHMS

Introduction

Sorting algorithms are methods/procedures used in various computer programs to associate and arrange pieces of data in a certain order. Sorting procedures are widely used throughout the world in diverse fields of mathematics, computer science, machine learning and artificial intelligence. Sorting problems are so trivial that they can be found ubiquitously, for example, probably sorting the marks of students of a class, or getting the second highest mark, or some other instances like that, which we will consider later in this article. Our main focus in this article is to compare two powerful techniques of sorting used everywhere throughout the world – Merge Sort and the Bubble Sort.

PRELIMINARY CONCEPTS

Merge Sort

Merge Sort is a recursive algorithm that is used in most of the servers and applications that require sorting procedures. It is one of the fastest methods to sort a data set and more importantly, it requires minimum time to do so. For large data sets, which is actually our main concern, the merge sort is definitely more effective than most other algorithms in terms of time. Shortly, we will see how this works, and, how it stnds in comparison to the bubble sort.

Bubble Sort

The bubble sort procedure is a more intuitive and easily imaginable approach to visualize sorting. However, it loses its effectiveness when it comes to handling ‘Big Data’. Hence, we need to adopt a better approach for sorting, as we will see during the course of this article.

A scattered plot of numbers, yet to be sorted by Bubble Sort.

COMPARE AND CONTRAST

1. Pseudo Code

Merge Sort:

function merge_sort(list m)
    // if list size is 0 (empty) or 1, consider it sorted and return it
    // (using less than or equal prevents infinite recursion for a zero length m)
    if length(m) <= 1
        return m
    // else list size is > 1, so split the list into two sublists
    // 1. DIVIDE Part...
    var list left, right
    var integer middle = length(m) / 2
    for each x in m before middle
         add x to left
    for each x in m after or equal middle
         add x to right
    // recursively call merge_sort() to further split each sublist
    // until sublist size is 1
    left = merge_sort(left)
    right = merge_sort(right)
    // merge the sublists returned from prior calls to merge_sort()
    // and return the resulting merged sublist
    // 2. CONQUER Part...
    return merge(left, right)

Bubble Sort:

procedure bubbleSort( A : list of sortable items )
   repeat     
     swapped = false
     for i = 1 to length(A) - 1 inclusive do:
       /* if this pair is out of order */
       if A[i-1] > A[i]then
         /* swap them and remember something changed */
         swap( A[i-1], A[i])
         swapped = true
       end if
     end for
   until not swapped
end procedure

Analyzing the pseudo code becomes an imperative part of the comparison. Seeing it, we can easily derive that the merge sort is an recursive procedure, whereas the bubble sort is an iterative procedure.

2. Small Data Sets

If we take small data sets into account, the bubble sort and merge sort are almost equal with respect to sorting numbers in approximately the same time. The difference between the two procedures is not very predominant to observe in such cases.

3. Complexity Relation

Considering the data set input field to be ‘n‘, the corresponding complexity relations for the two sorting procedures is this:

The two relations are absolutely different when it comes to relating them at high and low input field data sets respectively.

4. Large Data Sets

With respect to large input ‘n’, we would expect a much lower value of the complexity rate in merge sort as compared to bubble sort. For example, with a data set of 1000 variables, we would require about 3000 units of time/memory in the case of merge sort. However, on the other hand, we would require about 1000000 units of the same in bubble sort ! There does seem to be a good enough difference already ! 1000 is considered to be a small number in today’s terabyte world. We would certainly need to go with the fastest algorithm in place.

SEE ALSO Power of Batch Programming