Algorithms Posted 6 months ago

**Setting up the Context:**

Imagine you're running a program to transfer a file of 10 units from one system to another system. This program copies this file with a speed of 5 units/sec. Now how much time this transfer takes? it takes 2 sec to copy (at the rate of 5 units/sec) which is mathematically No.Of Units / Speed. As we increase the number of units, the time increases as well, but it takes a linear approach with the increase in the number of units with time - something which we can represent with a mathematical function as well:

N units of something takes N/K sec of time to copy where K is the rate of copy. This ability to represent the time as a variation of input is what we call as the Time Complexity.

**What is Time Complexity?**

Time Complexity can be defined as the total amount of time taken by an algorithm to run with respect to the input. Like in the above example, it is a representation to understand how the time to execute our algorithm changes as the input varies.

To better explain this, let's take another example of an algorithm that adds two numbers a and b to return the result c.

```
Add(a, b)
begin
c = a+b
return c
end
```

The algorithm takes two inputs and returns its sum. If we assume that the algorithm takes 1 unit of time to complete for a given input a and b, the time doesn't change no matter how we vary the inputs a and b because the algorithm just adds and returns the result, nothing varies here. Hence we can say that this algorithm executes at a "constant time" or the time complexity is constant.

Now if we say that for a given number n, we need to print the sum of all natural numbers till n - our algorithm might look like this:

```
AddTillN(n)
begin
sum = 0
i = 1
while i <= n
sum = sum + i
return sum
end
```

In this case, the algorithm iteratively adds all the numbers till the given input n to the result variable "sum" that stores the cumulative sum of numbers till n. In this case, if we take the input as 5 - the sum is calculated 5 times from 1 till 5. For 100, it runs from 1 to 100 - for N, it runs from 1 to N and adds to sum.

Hence in this case, the algorithm takes "N units of time" for an input N. Hence the "time complexity" is "N".

**How it is useful?**

Understanding time complexity of an algorithm helps us in early detection of the design flaws in our business logic. The time complexity we're calculating here directly impacts the execution time of our solution to the problem, and time means money. An ideal solution must try to solve a problem in least possible time. Time complexity analysis of our solution design helps us understand where the logic is taking more time and enables us to think for alternative approaches even before translating our design into an actual programmatical solution. This also saves us a lot of development and load test times, because we don't need to worry about the time performance of the solution once its programmed and deployed into action.

**Expressing Time - Notations:**

To better express the time taken by our algorithm for an input, we need to employ a generalized notation. Time complexity can be expressed in three terms - Big O, Big Omega and Theta.

**Big O** - represents the minimum time an algorithm takes to execute. An algorithm can't take lower than the big-O time of its execution for any input. For the previous example AddTillN() function, we know that the execution time of the algorithm varies "linearly" with the input number N, hence the time complexity of the algorithm can be denoted as O(n) where the algorithm takes atleast O(n) time to execute for an input length n.

**Big Omega** - represents the maximum time an algorithm takes to execute. An algorithm can't execute higher than the big-Omega time of execution for any input. For the previous example AddTillN() function, since the time increase is "linear" to the input N, we can also say that the algorithm takes no more than Ω(n) time.

**Big Theta** - represents a tight-bound on the execution time of an algorithm. This is used when the algorithm has both the upper-bound and lower-bound as same. In the example AddTillN(), since we have both the lower and upper bounds as "linear", the algorithm can be represented by theta notation as Θ(n) time.

"Generally, the time complexity of an algorithm is always represented by O() notation, since it is the maximum time an algorithm can time in any case."

**Best Case, Worst Case and Expected Case Times:**

Similar to the bounds of time, we do have variations in time for our algorithm based on the input. For example, consider a linear-search algorithm that searches for an element n inside the array arr and returns the index of n inside arr if found, as below:

```
linearSearch(arr, n)
begin
if arr.length == 0
return -1
for i = 0 to arr.length
if arr[i] == n
return i
return -1
end
```

in our algorithm linearSearch(n), let's say the element n is at the first index of the array. In this case, the loop runs just once and we're already at our result. In this case, the time-complexity is O(1) since we've run down the array only once. This case is called as the **"Base Case Time"** of the algorithm.

On the contrary, let's say if the element is at the last index of the array, then the algorithm runs till the end of the array (looping through all the n elements) and returns the last index of the array. In this case, the time taken is O(n), since the loop runs for all the n elements. This is called as the **"Worst Case Time"** of the algorithm.

The third one is the generally considered case of input, where the element n is somewhere inside the array and not particularly the "first" or the "last" element. In this case, the time complexity can be somewhere between O(1) and O(n) and not exactly one of it. This is called as the **"Expected Case Time"**.

"Generally, for most of the times the "Expected Case" and "Worst Case" times are same, while there are cases when they are different based on how our algorithm works. In such cases, we need to analyse both the "Expected Case" and "Worst Case" separately."

**Special cases - Amortized Complexity**

Sometimes, an algorithm can have more than one execution times, based on the worst-case scenarios. But the input variation gap between these two scenarios are so long that we generally "amortize" or "write-off" this other time. This is called as "Amortized Time Complexity". The best example for such complexity is the case of an ArrayList.

An ArrayList is an array whose length changes dynamically and hence we can store as many input elements as needed without needing to worry about the overflow scenarios. But internally an ArrayList is still an Array. So how does it work?

When we initialize an ArrayList, an array of certain length is created internally. As we insert elements into the array, the ArrayList checks if there is available space in the array or not. If there is available places in the array, it places the inserted elements into the array. If there is no more space available inside the array, the ArrayList creates a new array with double the length and then copies all the elements into the new array and the operation continues.

The time taken to insert an element into an ArrayList in general is O(1) since we're placing an element exactly at a single index. But when the ArrayList runs out of space in its current array, it has to:

- Create a new Array with double the length
- Copy all elements into the new array
- Insert the new element into the now bigger array

This copy operation needs the ArrayList to run down all the elements in the smaller array, which takes O(n) time. But whenever this scenario occurs, a new Array of double the length of the current array is created. That means if the initial array length is 1, this scenario occurs at 1, 2, 4, 8, 16, .. spaces of Array. The gap between such scenarios doubles as the insertion happens. Hence, the time complexity shall be O(2X) at a certain point X and the amortized time complexity for the ArrayList is O(1).

**Finding out Time Complexity - Examples:**

To understand better about how we calculate our algorithm times, let's look at a few examples.

- To find the factorial of a given number - let's say we have our solution like below:

```
Factorial(n)
begin
i = 1, prod = 1
while(i <= n)
prod = prod * i
return prod
end
```

In this example, we're iteratively multiplying all the numbers from 1 till the given number N to find the product of all numbers from 1 till the N, which is the factorial for that number. Now how much time the algorithm takes? depends on how many times the loop runs. How many times the loop runs? for every input of N, the loop runs till the N from 1 and varies with N. Hence the time taken is O(n).

- To find the sum of digits of a number - let's say we have our solution like below:

```
SumOfDigits(n)
begin
sum = 0, r
while(n > 0)
r = n % 10
sum = sum + r
n = n / 10
return sum
end
```

In this example, for a given input say 12345, we expect the algorithm to slice and sum all the individual digits of the number and return the sum, which in this case is 1+2+3+4+5=15.

If we look at the algorithm, the input number is iteratively divided by 10 and then the sliced remainder is added to the total sum. For how many times the loop runs in this case? For a number 12345, the loop runs 5 times. In other words, to run the loop for 5 times, you need a number with 5 digits which is in the range of 10^5. Similarly to run the loop for D times, the length of the number should be 10^D.

```
N = 10^D
D = log10N
```

Hence the time taken for the algorithm to execute is logN(base 10).

- To find the Nth Fibonacci number - let's say the solution we've developed looks like below:

```
fib(n)
begin
if n <= 0
return 0
else if n == 1
return 1
else
return fib(n-1) + fib(n-2)
end
```

In this recursive solution, we have our call stack like this:

for an input 5, we'll have:

- fib(5) calling out fib(4) and fib(3)
- fib(4) calling out fib(3) and fib(2)
- fib(3) calling out fib(2) and fib(1)
- fib(2) calling out fib(1) and fib(0)
- fib(1) calling out fib(0) and fib(-1)

For every recursive call, the call stack branches out into sub nodes, forming something similar to a balanced binary tree. In this case, we have exactly 2 nodes per branch (fib(n-1) and fib(n-2)) and if we count the total number of nodes, it'd be around 2^n. Hence the time taken for this algorithm is O(2^n)

- Sum of all elements of a Matrix - let's say our solution looks like below:

```
SumOfElements(matrix[][], m, n)
begin
sum = 0
for i = 1 to m
for j = 1 to n
sum = sum + matrix[i][j]
return sum
end
```

In this case, for every since iteration of the outer loop i, we have j iterate from 1 to n. This means that the total times the j loop runs is equal to mxn. Hence the total time taken for this algorithm is O(mxn) where m and n are the rows and columns of the matrix.

**Time Complexity - Points to Remember:**

Time complexities for algorithms follow certain patterns, making it easy to identify and understand the execution time for an algorithm. Here are some important things to remember:

- Looping through an array of size n takes O(n) time
- Looping through a matrix of size mxn takes O(mxn) time
- While calculating the time for algorithm, we can drop off the constants. which means O(2n) is equal to O(n)
- If an algorithm contains two sequences of loops, the complexities are added. which means O(n+n) = O(2n) == O(n)
- If an algorithm contains two nested loops, the complexities are multiplied. which means O(nxn) = O(n^2)
- If an algorithm works on an input set by slicing the input into smaller parts, the complexity is mostly logarithmic. which means O(logn)
- If an algorithm works in a recursive fashion with exactly two sub recursive calls, the complexity is mostly exponential. which means O(2^n)
- If an algorithm works in a recursive fashion with exactly one sub recursive call, the complexity is mostly linear. which means O(n)
- Binary Search algorithm takes O(logn) time, while Merge Sort takes O(nlogn) times, since both approaches involve splicing the input set into smaller chunks while processing.