Analyzing Selection Sort (using Big-O)

Let’s analyze the running time of it using the method we learned in a previous section. There is no code that doesn’t run or ends prematurely so there are no tricky cases to worry about. Here is the code again.

arr = [1, 4, 2, 7, 7, 6]  # change this array to the array you want to sort
for first_idx in range(len(arr)):
    min_idx = first_idx
    for second_idx in range(first_idx + 1, len(arr)):
        if arr[second_idx] < arr[min_idx]:
            min_idx = second_idx
    arr[first_idx], arr[min_idx] = arr[min_idx], arr[first_idx]

print(arr)

View code on GitHub.

Line 1 is the input so it isn’t part of the running time analysis.

Line 2 is a for loop that checks all the indexes (n operations) of the list. The running time is O(n).

Line 3 is assigning a variable, which is one operation. This is always O(1).

Line 4 is a for loop that checks all the indexes of the list starting from first_idx+1. This is tricky. How to find running time for a for loop whose number of iterations keeps changing throughout the program? Well, you would take the average number of iterations throughout the program. Since first_idx could be the numbers from 0 to n-1, second_idx could be 1 to n-1. The average of the numbers from 1 to n-1 is 0.5n. This means there are 0.5n operations. Therefore, the running time is O(n).

Line 5 is an if statement and the condition inside takes a constant number of operations to do regardless of input. Therefore, the running time is O(1).

Line 6 is assigning a variable, which is one operation. This is always O(1).

Line 7 is tuple unpacking which takes a constant number of operations to do regardless of input. Therefore, the running time is O(1).

Line 9 we are ignoring in our analysis since it isn't part of the algorithm itself.

Let’s put the running time next to each line.

arr = [?, ?, ?]  # this is the input, so we're not including it in analysis
for first_idx in range(len(arr)):  # O(n)
    min_idx = first_idx  # O(1)
    for second_idx in range(first_idx + 1, len(arr)):  # O(n)
        if arr[second_idx] < arr[min_idx]:  # O(1)
            min_idx = second_idx  # O(1)
    arr[first_idx], arr[min_idx] = arr[min_idx], arr[first_idx]  # O(1)

Now multiply the running time of the code inside the for loop by the running time of the for loop. Let’s start with the inner for loop.

arr = [?, ?, ?]  # this is the input, so we're not including it in analysis
for first_idx in range(len(arr)):  # O(n)
    min_idx = first_idx  # O(1)
    for second_idx in range(first_idx + 1, len(arr)):  # O(n)
        if arr[second_idx] < arr[min_idx]:  # O(1) * O(n) = O(n)
            min_idx = second_idx  # O(1) * O(n) = O(n)
    arr[first_idx], arr[min_idx] = arr[min_idx], arr[first_idx]  # O(1)

Then let’s do it for the outer loop.

arr = [?, ?, ?]  # this is the input, so we're not including it in analysis
for first_idx in range(len(arr)):  # O(n)
    min_idx = first_idx  # O(1) * O(n) = O(n)
    for second_idx in range(first_idx + 1, len(arr)):  # O(n) * O(n) = O(n^2)
        if arr[second_idx] < arr[min_idx]:  # O(1) * O(n) * O(n) = O(n^2)
            min_idx = second_idx  # O(1) * O(n) * O(n) = O(n^2)
    arr[first_idx], arr[min_idx] = arr[min_idx], arr[first_idx]  # O(1) * O(n) = O(n)

After we calculated the running time of all the lines of code, let’s add them up.