Understanding Selection and Insertion Sort in an easy way

Introduction

Selection Sort and Insertion Sort form the most basic of the numerous sorting techniques that aim to efficiently sort a given set of unordered data into ordered data based on a given criteria.

For better understanding, we often refer to sorting techniques as number sorting techniques applying on integers (mostly positive), although these techniques can also be applied on datasets other than numbers such as characters, or any other complex data types.

In this article, let’s try understand what Selection Sort and Insertion Sort work and how they differ from one another.

What is Selection Sort?

Selection Sort is the simplest among all the sorting techniques and also the least efficient sorting technique when used over larger datasets. The algorithm implements on the basic idea of swapping elements to their actual places based on the comparison criteria.

The sorting algorithm looks like below:

SelectionSort(array A)
begin
	lastIndex = length(A) - 1;
	for i = 0 to lastIndex do
		minIndex = i
		for j = i + 1 to lastIndex do
			if A[j] < minElement then
			begin
				minElement = j
			end
		Swap(A, minIndex, i)
	end
end

Swap(array A, i, j)
begin
	temp = A[i]
	A[i] = A[j]
	A[j] = temp
end

The implementation in C# looks like below

public void Sort(int[] arr)
{
	for (int i = 0; i < arr.Length; i++)
    {
    	var min = i;
        for (int j = i + 1; j < arr.Length; j++)
        {
        	if (arr[j] < arr[min])
            {
            	min = j;
            }
        }
		Swap(arr, min, i);
    }
}

private void Swap(int[] arr, int i, int j)
{
	var temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

How Selection Sort works?

In this approach, we have two pointers which run through out the given dataset and the inner loop pointers seeks through the entire dataset once for every single move of the outer loop pointer.

For every seek, the inner loop pointer looks for the element in the loop which is smaller than the element of the outer loop pointer (which is assumed to represent the relatively smallest element in the set) and then swaps the elements once the seek is complete.

This goes on till the outer loop pointer reaches the end of the dataset.

For a given dataset {5,9,2,4,7,8}, The flow looks like this:

Input: 5,9,2,4,7,8
LastIndex: 5

i: 0    2,9,5,4,7,8 5 elements
i: 1    2,4,5,9,7,8 4 elements
i: 2    2,4,5,9,7,8 3 elements
i: 3    2,4,5,7,9,8 2 elements
i: 4    2,4,5,7,8,9 1 element
i: 5    2,4,5,7,8,9 0 element

Output: 4,2,5,8,7,9

Observe that in the first iteration of i where the assumed minIndex element is 5, we have the element 2 which is the smallest element found in the set and so 2 is picked as the first smallest and is swapped for the first element.

Then the index moves to the second element 9 and the loop iterates over the set {5,4,7,8} where 4 is clearly the smallest. So its swapped for the second element and this goes on till all the elements are seeked by the index i.

Time Complexity of Selection Sort

As mentioned before, this is clearly the least efficient in the bunch. Because for every iteration of the outer loop for n elements, the inner loop runs over n-1 elements and so on. So the total work done by both the loops is:

O(n(n-1)/2) which is equivalent to O(n2) time, for all cases

And this is an in-place sorting algorithm, which means there are no other additional memory spaces used inside the algorithm. Hence the algorithm uses a constant space or O(1) space.

What is Insertion Sort?

On the other hand, Insertion Sort works a bit better than the Selection Sort technique. This algorithm relies on the “movement” of the elements from their original places to their actual places. The algorithm for this technique looks like below:

InsertionSort(array A)
begin
	lastIndex = length(A) - 1;
	for i = 1 to lastIndex do
	begin
		k = A[i]
		j = i - 1
		while j >= 0 && A[j] > k do
		begin
			A[j+1] = A[j]
			j = j-1
		end
		A[j+1] = k
	end
end

And the implementation in C# looks like below:

public void Sort(int[] arr)
{
	for (int i = 1; i < arr.Length; i++)
    {
    	var k = arr[i];
        var j = i - 1;
        
		while (j >= 0 && arr[j] > k)
        {
        	arr[j + 1] = arr[j];
            j--;
        }
		arr[j + 1] = k;
    }
}

How Insertion Sort works?

In this approach as well, we have two pointers i and j which run on the dataset and for one iteration of i, j loops from the pointer i till the occurrence of an element smaller than the current element of i encountered “in reverse” and moves the element to a relatively “smaller” position in the set. Then the element at j is placed with the current element at i.

For a given dataset {5,9,2,4,7,8}, The flow looks like this:

Input: 5,9,2,4,7,8
LastIndex: 5

i: 1
5,9,2,4,7,8

i: 2 - element 2 is moved
5,9,9,4,7,8
5,5,9,4,7,8
2,5,9,4,7,8

i: 3 - element 4 is moved
2,5,9,9,7,8
2,5,5,9,7,8
2,4,5,9,7,8

i: 4 - element 7 is moved
2,4,5,9,9,8
2,4,5,7,9,8

i: 5 - element 8 is moved
2,4,5,7,9,9
2,4,5,7,8,9
2,4,5,7,8,9

Output: 2,4,5,7,8,9

Time Complexity of Insertion Sort

Compared to Selection Sort, the inner loop of the Insertion Sort works comparatively less time because the loop doesn’t necessarily run till the beginning of the loop at all times. For example, in the above dataset for the movement on element 7, the inner loop iterates just one place (from 4 to 3).

Hence this algorithm takes O(n(n-1)/2) in its worst case scenario where for every pass the inner loop needs to seek elements from last to first (probably when the set is sorted in descending order). Hence the worst case complexity is O(n2) while the expected case can be somewhere lesser than O(n2) (although not so less).

For the space, the technique takes a constant space of O(1) since we don’t use any additional memory for the comparisons.

Differences between Selection Sort and Insertion Sort

As we can see from both the algorithms, we can find the following differences in how these algorithms work:

From the Selection Sort example, we can see that with every pass the inner loop runs over unsorted elements and tries to sort them as the outer loop moves forward. In case of Insertion Sort, with every pass we’re creating a “relatively” Sorted Set of elements ([5,9], [2,5,9], [2,5,4,9], [2,4,5,7,9], [2,4,5,7,8,9]) for the inner loop as the outer loop progresses forward.
In case of a Selection Sort, we will have one definite “swap” per iteration. Whereas in the case of an Insertion Sort, the element to be “swapped” is first taken aside in a temp variable (k) and once the inner loop moves the elements till an appropriate position for k is found, the swap finally happens.
From a Time perspective, the Selection Sort takes the same O(n2) time for any scenario – be it the best case or the worst case. Whereas, the Insertion Sort takes a varied time for various input scenarios – hence we say that for the worst case it takes O(n2) time while for other cases it is comparatively lower than O(n2) time.

Conclusion

From a performance perspective, both these algorithms are less efficient than the other algorithms when put to test on a large dataset. However, if we have a memory constraint on our execution machine and have a relatively small datasets, these algorithms are worth using as they are very simple to implement.