Algorithms Posted 6 months ago

Selection Sort and Insertion Sort form the most basic of the numerous sorting techniques that aim to efficiently sort a given set of unordered data into ordered data based on a given criteria. For better understanding, we often refer to sorting techniques as number sorting techniques applying on integers (mostly positive), although these techniques can also be applied on datasets other than numbers such as characters, or any other complex datatypes. In this article, let's try understand what Selection Sort and Insertion Sort work and how they differ from one another.

**Selection Sort:**

Selection Sort is the simplest among all the sorting techniques and also the least efficient sorting technique when used over larger datasets. The algorithm implements on the basic idea of swapping elements to their actual places based on the comparison criteria.

The sorting algorithm looks like below:

```
SelectionSort(array A)
begin
lastIndex = length(A) - 1;
for i = 0 to lastIndex do
minIndex = i
for j = i + 1 to lastIndex do
if A[j] < minElement then
begin
minElement = j
end
Swap(A, minIndex, i)
end
end
Swap(array A, i, j)
begin
temp = A[i]
A[i] = A[j]
A[j] = temp
end
```

The implementation looks like below:

```
public void Sort(int[] arr)
{
for (int i = 0; i < arr.Length; i++)
{
var min = i;
for (int j = i + 1; j < arr.Length; j++)
{
if (arr[j] < arr[min])
{
min = j;
}
}
Swap(arr, min, i);
}
}
private void Swap(int[] arr, int i, int j)
{
var temp = arr[i];
arr[i] = arr[j];
arr[j] = temp;
}
```

**How it works:**

In this approach, we have two pointers which run through out the given dataset and the inner loop pointers seeks through the entire dataset once for every single move of the outer loop pointer. For every seek, the inner loop pointer looks for the element in the loop which is smaller than the element of the outer loop pointer (which is assumed to represent the relatively smallest element in the set) and then swaps the elements once the seek is complete. This goes on till the outer loop pointer reaches the end of the dataset.

For a given dataset {5,9,2,4,7,8}, The flow looks like this:

```
Input: 5,9,2,4,7,8
LastIndex: 5
i: 0 2,9,5,4,7,8 5 elements
i: 1 2,4,5,9,7,8 4 elements
i: 2 2,4,5,9,7,8 3 elements
i: 3 2,4,5,7,9,8 2 elements
i: 4 2,4,5,7,8,9 1 element
i: 5 2,4,5,7,8,9 0 element
Output: 4,2,5,8,7,9
```

Observe that in the first iteration of i where the assumed minIndex element is 5, we have the element 2 which is the smallest element found in the set and so 2 is picked as the first smallest and is swapped for the first element. Then the index moves to the second element 9 and the loop iterates over the set {5,4,7,8} where 4 is clearly the smallest. So its swapped for the second element and this goes on till all the elements are seeked by the index i.

**Complexity Analysis:**

As mentioned before, this is clearly the least efficient in the bunch. Because for every iteration of the outer loop for n elements, the inner loop runs over n-1 elements and so on. So the total work done by both the loops is:

```
O(n(n-1)/2) which is equivalent to O(n2) time, for all cases
```

And this is an in-place sorting algorithm, which means there are no other additional memory spaces used inside the algorithm. Hence the algorithm uses a constant space or O(1) space.

**Insertion Sort:**

On the other hand, Insertion Sort works a bit better than the Selection Sort technique. This algorithm relies on the "movement" of the elements from their original places to their actual places. The algorithm for this technique looks like below:

```
InsertionSort(array A)
begin
lastIndex = length(A) - 1;
for i = 1 to lastIndex do
begin
k = A[i]
j = i - 1
while j >= 0 && A[j] > k do
begin
A[j+1] = A[j]
j = j-1
end
A[j+1] = k
end
end
```

And the implementation looks like below:

```
public void Sort(int[] arr)
{
for (int i = 1; i < arr.Length; i++)
{
var k = arr[i];
var j = i - 1;
while (j >= 0 && arr[j] > k)
{
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = k;
}
}
```

**How it works:**

In this approach as well, we have two pointers i and j which run on the dataset and for one iteration of i, j loops from the pointer i till the occurence of an element smaller than the current element of i encountered "in reverse" and moves the element to a relatively "smaller" position in the set. Then the element at j is placed with the current element at i.

For a given dataset {5,9,2,4,7,8}, The flow looks like this:

```
Input: 5,9,2,4,7,8
LastIndex: 5
i: 1
5,9,2,4,7,8
i: 2 - element 2 is moved
5,9,9,4,7,8
5,5,9,4,7,8
2,5,9,4,7,8
i: 3 - element 4 is moved
2,5,9,9,7,8
2,5,5,9,7,8
2,4,5,9,7,8
i: 4 - element 7 is moved
2,4,5,9,9,8
2,4,5,7,9,8
i: 5 - element 8 is moved
2,4,5,7,9,9
2,4,5,7,8,9
2,4,5,7,8,9
Output: 2,4,5,7,8,9
```

**Complexity Analysis:**

Compared to Selection Sort, the inner loop of the Insertion Sort works comparatively less time because the loop doesn't necessarily run till the beginning of the loop at all times. For example, in the above dataset for the movement on element 7, the inner loop iterates just one place (from 4 to 3).

Hence this algorithm takes O(n(n-1)/2) in its worst case scenario where for every pass the inner loop needs to seek elements from last to first (probably when the set is sorted in descending order). Hence the worst case complexity is O(n2) while the expected case can be somewhere lesser than O(n2) (although not so less).

For the space, the technique takes a constant space of O(1) since we don't use any additional memory for the comparisons.

**Selection Sort vs Insertion Sort - Differences:**

As we can see from both the algorithms, we can find the following differences in how these algorithms work:

- From the Selection Sort example, we can see that with every pass the inner loop runs over unsorted elements and tries to sort them as the outer loop moves forward. In case of Insertion Sort, with every pass we're creating a "relatively" Sorted Set of elements ([5,9], [2,5,9], [2,5,4,9], [2,4,5,7,9], [2,4,5,7,8,9]) for the inner loop as the outer loop progresses forward.
- In case of a SelectionSort, we will have one definite "swap" per iteration. Whereas in the case of an InsertionSort, the element to be "swapped" is first taken aside in a temp variable (k) and once the inner loop moves the elements till an appropriate position for k is found, the swap finally happens.
- From a Time perspective, the SelectionSort takes the same O(n2) time for any scenario - be it the best case or the worst case. Whereas, the InsertionSort takes a varied time for various input scenarios - hence we say that for the worst case it takes O(n2) time while for other cases it is comparatively lower than O(n2) time.

**Final Thoughts:**

From a performance perspective, both these algorithms are less efficient than the other algorithms when put to test on a large dataset. However, if we have a memory constraint on our execution machine and have a relatively small datasets, these algorithms are worth using as they are very simple to implement.