Bubble Sort
Bubble sort is a simple comparison-based sorting algorithm that repeatedly swaps adjacent elements until the list is sorted.Goals
By the end of this lesson, you should be able to:
- describe bubble sort algorithm steps and ways to optimise them.
- implement bubble sort algorithm to sort a sequence of number.
sorting
, bubble sort
The best way to practice your programming skills is by writing actual code. One of the common computation is to sort some items in some way. For example, sorting a number from smallest to biggest or names from A to Z. In this notebook, we will describe some sorting algorithms which you can implement in Python.
A Note about Show Pseudocode Button
Throughout these notes, you will see a button that says Show Pseudocode as shown below. Go ahead and click it.
Show Pseudocode
Steps:
1. Do this
2. Then do that
The purpose of this button is to show you the pseudocode of the algorithm. But more importantly, the button is placed there so that you don't immediately see the pseudocode. The purpose is for you to write down your own pseudocode and compare your pseudocode with what is provided in these notes.
Another thing about pseudocode is that it is not a Python code.
Do not copy and paste the pseudocode into any Python interpreter or Jupyter notebook cell and expect it to work. There is a good reason why it is called pseudocode and not pythoncode. See definition of pseudo.
Bubble Sort
Bubble sort is one of the simplest sorting algorithms. We will be following the PCDIT framework (Problem statement, Test Cases, Design of Algorithm, Implementation, and Testing) in describing the steps of these algorithms.
Problem Statement
The problem is specified as follows. Given a sequence of numbers, write some steps to sort the sequence in some order. Usually, we will sort the sequence from the smallest to the largest.
Test Case
For example, given the following input:
# Python Code
numbers = [16, 14, 10, 8, 7, 8, 3, 2, 4, 1]
We want to write some steps that sort the numbers, such that the output will be:
[1, 2, 3, 4, 7, 8, 8, 10, 14, 16]
We can intuitively try to sort the numbers by comparing two numbers (a pair) at a time. If the order is incorrect, we will swap the two numbers. Let's illustrate the steps!
We start from the input:
[16, 14, 10, 8, 7, 8, 3, 2, 4, 1]
We compare the first two numbers (16, 14), i.e. positions 0 and 1 in the list. Since 16 is bigger than 14, we will swap them.
[14, 16, 10, 8, 7, 8, 3, 2, 4, 1]
Now we move to the next pair (16, 10), i.e. positions 1 and 2 in the list. Since 16 is bigger than 10, we will swap them again.
[14, 16, 10, 8, 7, 8, 3, 2, 4, 1]
[14, 10, 16, 8, 7, 8, 3, 2, 4, 1]
And next pair (16, 8), i.e. positions 2 and 3. Since 16 is bigger than 8, we will swap.
[14, 10, 16, 8, 7, 8, 3, 2, 4, 1]
[14, 10, 8, 16, 7, 8, 3, 2, 4, 1]
We continue until 16 reaches the last position.
[14, 10, 8, 16, 7, 8, 3, 2, 4, 1]
[14, 10, 8, 7, 16, 8, 3, 2, 4, 1]
--
[14, 10, 8, 7, 16, 8, 3, 2, 4, 1]
[14, 10, 8, 7, 8, 16, 3, 2, 4, 1]
--
[14, 10, 8, 7, 8, 16, 3, 2, 4, 1]
[14, 10, 8, 7, 8, 3, 16, 2, 4, 1]
--
[14, 10, 8, 7, 8, 3, 16, 2, 4, 1]
[14, 10, 8, 7, 8, 3, 2, 16, 4, 1]
--
[14, 10, 8, 7, 8, 3, 2, 16, 4, 1]
[14, 10, 8, 7, 8, 3, 2, 4, 16, 1]
--
[14, 10, 8, 7, 8, 3, 2, 4, 16, 1]
[14, 10, 8, 7, 8, 3, 2, 4, 1, 16]
So now the largest number has occupied its place in the last position. But the other numbers are still not in the right order. Therefore, we have to repeat the steps starting from the beginning again. We start from the pair (14, 10). This will repeat the pair-wise comparison and move 14 to the second last position. Similarly, we can see how 10 and 8 will reach their final position. Here, we no longer show the pair-wise comparison and swapping for brevity.
- [10, 8, 7, 8, 3, 2, 4, 1, 14, 16]
- [8, 7, 8, 3, 2, 4, 1, 10, 14, 16]
- [7, 8, 3, 2, 4, 1, 8, 10, 14, 16]
At this point, we will start comparing the pair (7, 8). But since 7 is not greater than 8, there is no swap.
- [7, 8, 3, 2, 4, 1, 8, 10, 14, 16]
Afterward, we will compare the pair (8, 3).
- [7, 8, 3, 2, 4, 1, 8, 10, 14, 16]
Since 8 is greater than 3, there will be a swap, and these pair-wise swaps will continue to push 8 to its final position.
- [7, 3, 2, 4, 1, 8, 8, 10, 14, 16]
Now we begin again with the pair (7, 3) and push 7 to its final position. The detailed pair-wise swapping is not shown below, but the final arrangement at the end of this iteration will be as the one below.
- [3, 2, 4, 1, 7, 8, 8, 10, 14, 16]
At this point, we will start a new iteration by comparing the pair (3, 2). Since 3 is greater than 2, there will be a swap. The next comparison will fall on the pair (3, 4), i.e. [2, 3, 4, 1, 7, 8, 8, 10, 14, 16]. But since 3 is less than 4, there is no swap happening. And the rest of the comparison will push 4 to its final position.
- [2, 3, 1, 4, 7, 8, 8, 10, 14, 16]
A similar situation occurs. There is no swap between (2, 3), but then it will swap (3, 1).
- [2, 1, 3, 4, 7, 8, 8, 10, 14, 16]
In the last iteration, it wil swap (2, 1).
- [1, 2, 3, 4, 7, 8, 8, 10, 14, 16]
Here's an animated example from Wikipedia,sorting a different sequence of numbers.
Design of Algorithm
After we work on the test cases, we can now write down the steps in pseudocode. Several things to note from the above test cases:
- There are two kind of iterations:
- the inner iteration is when we compare the pairs (a, b) and do a swap if ,
- the outer iteration is when repeat the inner iteration pass starting from the first pair again.
- The number of inner iteration is the , where is the number of elements in the list. This is because the inner iteration compares a pair. So if there is elements, there will be pairs to compare.
- The number of outer iteration is also . You can refer back to the case above that there were 9 outer iterations for the 10 elements.
Let's write down what we did in the above case. Note that in this algorithm we chose not to return the sorted list as a new object, but rather sort the list in place. This means that the input list is modified and the sorted list exists in the object pointed by the input list. The advantage of this is that the list need not be duplicated and the memory is saved as we deal only with one list.
Show Pseudocode
Bubble Sort
Version: 1
Input: array
Output: None, sort in place
Steps:
1. n = length of array
2. For outer_index from 1 to n-1, do:
2.1 For inner_index from 1 to n-1, do:
2.1.1 first_number = array[inner_index - 1]
2.1.2 second_number = array[inner_index]
2.1.3 if first_number > second_number, do:
2.1.3.1 swap(first_number, second_number)
Optimised Bubble Sort
Bubble Sort v2
We can optimized bubble sort algorithm by noting the following:
If the sequence is already in order, we can reduce the next outer iteration.
For example, if the input is
[16, 1, 2, 3, 4, 5]
The first iteration will push 16 to the last position.
[1, 2, 3, 4, 5, 16]
In the second iteration, no swap is made since all the numbers are already in the correct order. However, with the above algorithm, the outer iteration will still repeat for times. We can conclude therefore that if no swap is made in one pass of outer iteration, the sequence is already in order, and we can stop the outer iteration. We can then modify the pseudocode as follows.
Show Pseudocode
Bubble Sort
Version: 2
Input: array
Output: None, sort in place
Steps:
1. n = length of array
2. swapped = True
3. As long as swapped is True, do:
3.1 swapped = False
3.2 For inner_index from 1 to n-1, do:
3.2.1 first_number = array[inner_index - 1]
3.2.2 second_number = array[inner_index]
3.2.3 if first_number > second_number, do:
3.2.3.1 swap(first_number, second_number)
3.2.3.2 swapped = True
Comparison with v1
Let's compare this with version 1. Notice that the number of outer iteration is much less in version two as compared to version one. The n-th pass of the outer iteration will place the n-th largest number to its final position. For example, in the given input below,
[16, 14, 10, 8, 7, 8, 3, 2, 4, 1]
In the 1-st outer pass, the 1-st largest number (i.e. 16) will be placed to its final position.
[14, 10, 8, 7, 8, 3, 2, 4, 1, 16]
This means that we can reduce the number of inner iteration. Instead of comparing pairs for each inner iteration, we can reduce the number of inner iteration after each pass of outer iteration is done. For example, in the above example, we have 10 elements. In the first outer iteration, we have 9 comparisons in the inner iteration. In the next outer iteration, we can simply do 8 comparisons instead of 9. In the third outer iteration, we can do just 7 comparisons instead of 9, and so on.
Bubble Sort v3
We can further optimise and re-write the pseudocode as follows.
Show Pseudocode
Bubble Sort
Version: 3
Input: array
Output: None, sort in place
Steps:
1. n = length of array
2. swapped = True
3. As long as swapped is True, do:
3.1 swapped = False
3.2 For inner_index from 1 to n-1, do:
3.2.1 first_number = array[inner_index - 1]
3.2.2 second_number = array[inner_index]
3.2.3 if first_number > second_number, do:
3.2.3.1 swap(first_number, second_number)
3.2.3.2 swapped = True
3.3 n = n -1
The additional step is 3.3 which is to reduce the number of . With this, Step 3.2 will decrease by one in the next outer iteration.
Bubble Sort v4
It can happen that more than one elements are place in their final positions in one outer iteration pass. This means that we don't have to decrease the number of inner iteration by 1 on each step of outer iteration. We can record down, at which position was the last swap, and on the next outer iteration, we can do comparison up to that last position. To illustrate this, consider the following input.
[1000, 1, 4, 7, 3, 10, 100]
In the first outer pass, we push 1000 to its final position. This means we have comparisons and swaps.
[1, 4, 7, 3, 10, 100, 1000]
In the second outer pass, we first compare the pair (1, 4), but no swap happens. Similarly with (4, 7). Now, when comparing the pair (7, 3), we do a swap to result in.
[1, 4, 3, 7, 10, 100, 1000]
When we have a swap, we will record down our position. In this case, it is at the position of the 4th element in the list, or index 3 (if our index starts from 0). Subsequently, we compare (7, 10), but no swap happens. Similarly with (10, 100). We stop at this point because the second iteration only compares up to the second last element in the above algorithm.
Now, if we follow the previous algorithm, the next outer pass will compare up to the third last element, which is 10. However, we note that the last five elements are already ordered. We know this because in the last pass, the last swap happens at (7, 3) to (3, 7) pair. In this case, we just need to run our inner iteration up to this position, i.e. 4th position, or index 3.
We can modify our pseudocode as follows.
Show Pseudocode
Bubble Sort
Version: 4
Input: array
Output: None, sort in place
Steps:
1. n = length of array
2. swapped = True
3. As long as swapped is True, do:
3.1 swapped = False
3.2 new_n = 0
3.3 For inner_index from 1 to n-1, do:
3.3.1 first_number = array[inner_index - 1]
3.3.2 second_number = array[inner_index]
3.3.3 if first_number > second_number, do:
3.3.3.1 swap(first_number, second_number)
3.3.3.2 swapped = True
3.3.3.3 new_n = inner_index
3.4 n = new_n
In the above pseudocode, we set record down the position of the element on the last swap (step 3.3.3.3), and we assign this as the new ending position for the next outer pass (step 3.4).