Data Structures and Algorithm (DSA) – Performance, Complexity And Big-O Notation

Hussain Patel
Aug 29, 2017

42.3k
0
3
- facebook
- twitter
- linkedIn
- Reddit
- WhatsApp
- Email
- Print
- Other Artcile

Introduction

In this article we will be talking about what performance of an algorithm is and how Complexity is used to measure performance also learn what is Big-O notation?

Background

When we write a program or an algorithm that will be used only once on a small amount of data and then discarded, we focus more on writing the code (program) in the easiest possible way we know, we debug it, test it and then use it in our application and then forget about it. However when we need to write a program or an algorithm that will be used over a longer period of time and will be maintained by many people, we need to focus of factors like – simplicity, clarity and efficiency /performance of the program or algorithm. In this article we will be focusing on the efficiency / performance of an algorithm. How it is measured? What is the complexity of the algorithm and also we will discuss about Big-O Notation, it’s a unit for expressing complexity.

So let’s start with Performance and see how performance is measured.

Performance of a program or an algorithm is measured using various factors – like Time, Space, and Network etc.

Time - How much time the program takes to execute?
Space - How much memory or space the program is using while it's executed (also the physical space)
Network - if there are any network related activities the program is doing, which may affect the performance).

Of all the factors mentioned above, Time are Space are usually considered important when measuring the performance of an algorithm. Example – How fast is your algorithm compared to another algorithm. How much space does your algorithm take in terms of memory?. When we talk about the time taken by the algorithm, we are actually talking about Complexity (Time Complexity). “Complexity of an algorithm is high, when the algorithm takes more time to execute” and more time means your algorithm has low performance

Let’s discuss more about Complexity and understand what Time complexity is.

Definition

“Complexity is used to measure performance. Or Complexity is a measure of how the Resources (in this case Time) changes as the size of the problem gets larger.”

Let’s understand this with an example: For small input of data – an algorithm might run fast and might not show any time difference, but as the input size increases the program might take time to execute and become slow and performance is low, this is where the complexity is measured. The Complexity of an algorithm indicates to us how performant the algorithm is.

So we can say that - Size of input determines complexity with in turn determines how the algorithm behaves when the input size is very large. In computer terms this is also called the Worst Case performance of the algorithm.

Now that we know what complexity is, we can see how complexity is measured. This is done using Big – O notation. Let’s take an example.

Example

Suppose you have two algorithms and you want to compare them based on the performance i.e. Time complexity. How do you do that?

One way is - You can practically run each of them for same inputs and wait till they execute and note the time difference, in some cases this might not be feasible to practically run each algorithm taking into consideration the time and resource required to execute them. In such case the other way is to prove this theoretically using mathematical terms – that is using the Big – O notation.

Big – O notation is used to study the performance / complexity of an algorithm in theoretical terms. Big – O notation looks at the upper bound of an algorithms performance; i.e. its worst case behavior. Big –O notation also looks at algorithms asymptotic behavior – what it means is the performance of the algorithm as the size of the input increases to very large. The size of input is represented by n.

Definition

The Big – O notation is a unit to express complexity in terms of the size of the input that goes into an algorithm.

Following are the Big –O notation rules to figure out an algorithm's performance or asymptotic behavior,

Constant Time Complexity O(1)

If the time taken by the algorithm does not change and remains constant as the input size is increased, then the algorithm is said to have complexity of O(1) – read as Order of 1, performance. The algorithm doesn’t depend on the size of the input. Example – Node added at the end of the linked list (assuming we have a tail pointer).
1. static void ConstantTimeComplexity(int n)
2. {
3. int a = 100;
4. int b = 200;
6. int sum = a + n;
7. int mul = b * n;
9. Console.WriteLine("{0}, {1}", sum, mul);
10. }

O(log n)

An algorithm for input size n (where n is very large), breaks the problem into small chunks for each iteration, the algorithm is said to have complexity of O (log n) - read as Order log n performance. Example: Binary Search Tree.
Linear Time Complexity O(n)

An algorithm for input size n (where n is very large) performs n steps and the time taken by the algorithm changes linearly as the input size increases, then the algorithm is said to have complexity of O(n) – read as Order of n performance. The run time complexity is proportionate to the size of n.

Example

for loop to the size of length of array, as the array size increases the time taken will proportionately increase
1. static void LinearTimeComplexity(int[] arr1)
2. {
3. for (int i = 0; i < arr1.Length; i++)
4. {
5. Console.WriteLine("{0}", arr1[i]);
6. }
7. }

O (n log n)

An algorithm for input size n (where n is very large), breaks the problem into small chunks for each invocation, and then takes each of the smallest chunks and stiches them back together, then the algorithm is said to have complexity of O(n log n) - read as Order n log n performance. Example: Quick Sort.

Quadratic: O(n²)
An algorithm for input size n (where n is very large) performs almost double (n²) the steps and the time taken by the algorithm changes Quadratically as the input size increases, the algorithm is said to have complexity of O(n²) – read as Order n square performance. Example bubble Sort.
1. static void QuadraticTimeComplexity(int[] arr1)
2. {
3. for (int i = 0; i < arr1.Length; i++)
4. {
5. for (int j = 0; j < arr1.Length; j++)
6. {
7. Console.WriteLine("{0}", arr1[i] + arr1[j]);
8. }
9. }
10. }

Cubic: O (n³)
very rare algorithms.

Exponential
O (n!) – Incredibly rare.

Note

Constants and number or lower order of N are not considered when expressing Complexity of an algorithm. Example - O (n² + n) = O (n²) as O (n) is very small compared to O (n²) and can be ignored.

Source – By Cmglee - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=50321072

From the graph shown on the left, we can quickly find out which algorithms of which complexity are faster,

Algorithms of complexity O (1) are fastest and take less time while algorithms of complexity O(n!) are slower and take more time.

We can also write – the time taken by algorithms of complexity O (1) < O (log n) < O (n) < O (n²) < O (n!) and so on.

Additional Examples

What is the complexity of below code?
1. static void MoreExamples1(int[] arr1, int[] arr2)
2. {
3. for (int i = 0; i < arr1.Length; i++)
4. {
5. Console.WriteLine(arr1[i]);
6. }
8. for (int j = 0; j < arr2.Length; j++)
9. {
10. Console.WriteLine(arr2[j]);
11. }
12. }
In this case we have 2 for loops and each iterates through a different set of inputs (arr1 and arr2) and each input size is considered to be very large, so for the first for look the complexity is O (a) and for the second for loop the complexity is O (b) and as both for loops are independent the complexity is additive O (a) + O (b), we can write O (a + b) where a = arr1.Length and b = arr2.Length.

What is the complexity of the below code?
1. static void MoreExamples2(int[] arr1, int[] arr2)
2. {
3. for(int i =0;i<arr1.Length; i++)
4. {
5. for (int j = 0; j < arr2.Length; j++)
6. {
7. if (arr1[i] > arr2[j])
8. {
9. Console.WriteLine(arr1[i] + " , " + arr2[j]);
10. }
11. }
12. }
13. }
In this case we have 2 nested for loops and one if condition, so for each element i of arr1 the inner loop j goes through all the elements of arr2, (both arr1 and arr2 are different inputs and the input size is considered to be very large), and the if condition will be executed if the condition is satisfied. so for the first for loop the complexity is O (a), for the inner for loop the complexity is O (b) and for the if condition the complexity is O (1). As it’s a nested for loop, the complexity is multiplicative O (a) * O (b) * O(1), we can write O (a*b) or O (ab) where a = arr1.Length and b = arr2.Length.
Note: It’s not O (n²) as there are 2 different inputs arr1 and arr2 of very large size.

What is the complexity of the below code?
1. static void MoreExamples3(int[] arr1, int[] arr2)
2. {
3. for (int i = 0; i < arr1.Length; i++)
4. {
5. for (int j = 0; j < arr2.Length; j++)
6. {
7. for (int k = 0; k < 100000; k++)
8. {
9. Console.WriteLine(arr1[i] + " , " + arr2[j]);
10. }
11. }
12. }
13. }
In this case we have 3 nested for loops, so for each element i of arr1 and loop j of arr2 the inner loop k goes through 100000 times (both arr1 and arr2 are different and the input size is very large). For the first for loop the complexity is O (a), for the second for loop the complexity is O (b) and for the third for loop the complexity is O (1) (constant time complexity and the input size is fixed – 100000 and would iterate fixed input). As it’s a nested for loop we can write the complexity as O (a) * O (b) * O (1), we can ignore O (1) and write the complexity as O (a * b) or O (ab) where a = arr1.Length and b = arr2.Length.

Note: As mentioned – the constant time complexity O (1) for the constant 10000 can be ignored as it very small compared to input size of arr1 and arr2.

What is the complexity of the below code?
1. static void MoreExamples4(int[] arr1)
2. {
3. for (int i = 0; i < arr1.Length; i++)
4. {
5. for (int j = 0; j < arr1.Length; j++)
6. {
7. Console.WriteLine(arr1[i] + ", " + arr1[j]);
8. }
9. }
11. for (int k = 0; k < arr1.Length; k++)
12. {
13. Console.WriteLine(arr1[k]);
14. }
15. }
In the above example there are 2 set of for loops - nested for loop and the other is a single for loop. For nested for loop the complexity is O (n²) and the single for loop the complexity is O (n), As both for loop sets are independent the complexity as O (n²) + O (n).
Note: O (n) is much smaller as compared to O (n²) and the maximum time the algorithm will take is to execute the nested for loop, so O (n) can be ignored and we can say that the complexity for above code is O (n²).

What’s the complexity of the below code?
1. static void MoreExamples5(int[] arr1)
2. {
3. int i = 1;
5. while (i < arr1.Length)
6. {
7. Console.WriteLine(arr1[i]);
8. i = i * 2;
9. }
10. }

In the above example, the loop variable is doubled in every iteration that means the loop will execute faster than the normal loop executes where the increment is by 1. So the complexity of above code can be written as O (log₂ n), where n = arr1.Length. Let's see how we can derive this. In each iteration the value if i is multiplied by 2, so in first iteration i = 1, then I is multiplied by 2 see below the calculation.

From the figure below, we can see that for every iteration the value if i increases by powers of 2. Like 2⁰, 2¹, 2², 2³ and so on. You can also say that in each iteration for some value of k the value i will be 2^k. For example for the first iteration if the value of K =0 the value of i will be 2⁰ that is i = 1.

Data Structures and Algorithm

So we can write a mathematical formulas as for (K + 1)^thiteration i = 2^K-1* 2 that is i = 2^K.

At some point there will be some number k which will be equal to N that is 2^k >= n and iteration will stop and exist out of the loop. Now let’s do some derivation to find out in how many iterations (K) takes to double the value of i so that it exists out of the loop.

Thus k = log₂n. Since the loop will be iterated only for k number of iterations, so the complexity of the above code is O(K) and so the complexity is O (log n).
Data Structures and Algorithm

Important point to note: When the input (n – where n is very large size) is reduced by half or double in each iteration, your code will execute fast and the loop will exit fast, the complexity of that piece of code has complexity of O (log n).

References

I have referenced a lot of books and training materials to understand the concepts and writing this article would not be possible with the below and many other references.

Introduction to Algorithms - By By Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein
Data Structures Algorithms Made-Easy - By Narasimha Karumanchi
From 0 to 1: Data Structures & Algorithms in Java - Loony Corn Special thanks to Janani for beautiful explanation and examples.

Recommended Free Ebook

Kotlin for Beginners

Download Now!