Bubble sort
How it works
In each pass, compare adjacent values and move the larger value to the right.
This way, the greatest unsorted value ends up on the right side after each pass, where it is locked in and excluded from subsequent passes.
animated demonstration: https://commons.wikimedia.org/wiki/File:Bubble-sort-example-300px.gif
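A minimal sketch in Python of the passes described above, including the "swapped" early-exit flag used in the sorted-input demonstration below (function and variable names are my own):

    def bubble_sort(nums):
        n = len(nums)
        for i in range(n - 1, 0, -1):      # i marks the right end of the unsorted part
            swapped = False
            for j in range(i):             # compare a[j] with a[j+1]
                if nums[j] > nums[j + 1]:
                    nums[j], nums[j + 1] = nums[j + 1], nums[j]
                    swapped = True
            if not swapped:                # a full pass with no swaps: already sorted
                break
        return nums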
Complexity
- Time
O(n^2)
- bubble sort works by repeatedly swapping adjacent elements if they are in the wrong order, and it continues this process until the array is sorted. In the worst case, the number of comparisons and swaps required to sort the array is proportional to the square of the number of elements in the array
- Space
O(1)
Pros and Cons
Pros
- Educational purposes: easy to implement
- memory efficiency: elements are swapped in place without using additional temporary storage
- stable sorting
- For example, consider a list of records that represent people, where each record has a name and an age. If we sort the list based on age, it is important to maintain the relative order of people with the same age, so that the order of their names is preserved in the sorted list. A stable sorting algorithm, such as merge sort, would ensure that the relative order of equal elements is preserved, while an unstable sorting algorithm, such as quick sort, would not.
- efficient for nearly-sorted data
- With the "swapped" early-exit flag, it takes linear time,
O(n)
- fewer swaps, allowing the algorithm to complete more quickly.
Cons
- bad performance: it does not deal well with a list containing a huge number of items.
- not recommended for use in practical applications due to its poor time complexity of
O(n^2)
Demonstration for unsorted numbers
input
7 5 6 1 3
i=4, j=0, compare a[j] & a[j+1] (7>5, swap)
7 5 6 1 3
5 7 6 1 3
j j
i=4, j=1, compare a[j] & a[j+1] (7>6, swap)
5 7 6 1 3
5 6 7 1 3
  j j
i=4, j=2, compare a[j] & a[j+1] (7>1, swap)
5 6 7 1 3
5 6 1 7 3
    j j
i=4, j=3, compare a[j] & a[j+1] (7>3, swap)
5 6 1 7 3
5 6 1 3 7
      j j
i=3, j=0, compare a[j] & a[j+1] (5<6, do nothing)
5 6 1 3 7
j j
i=3, j=1, compare a[j] & a[j+1] (6>1, swap)
5 6 1 3 7
5 1 6 3 7
  j j
i=3, j=2, compare a[j] & a[j+1] (6>3, swap)
5 1 6 3 7
5 1 3 6 7
    j j
i=2, j=0, compare a[j] & a[j+1] (5>1, swap)
5 1 3 6 7
1 5 3 6 7
j j
i=2, j=1, compare a[j] & a[j+1] (5>3, swap)
1 5 3 6 7
1 3 5 6 7
  j j
i=1, j=0, compare a[j] & a[j+1] (1<3, do nothing)
1 3 5 6 7
j j
end
Demonstration for sorted numbers
input
1 3 5 6 7
i=4, j=0, compare a[j] & a[j+1] (1<3, swapped remains false)
1 3 5 6 7
j j
i=4, j=1, compare a[j] & a[j+1] (3<5, swapped remains false)
1 3 5 6 7
  j j
i=4, j=2, compare a[j] & a[j+1] (5<6, swapped remains false)
1 3 5 6 7
    j j
i=4, j=3, compare a[j] & a[j+1] (6<7, swapped remains false)
1 3 5 6 7
      j j
swapped is still false after the whole pass, so break the loop
end
for sorted numbers, it only takes O(n)
Insertion sort
How it works
Set a starting point (i) that moves forward in each iteration; compare the value at i with the already-sorted values before it and move it left past any larger values.
animated demonstration: https://commons.wikimedia.org/wiki/File:Insertion-sort-example.gif
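A sketch of the swap-based variant traced in the demonstrations below; production versions often shift elements instead of swapping them pairwise (names are my own):

    def insertion_sort(nums):
        for i in range(1, len(nums)):          # i is the starting point that moves forward
            j = i
            while j > 0 and nums[j - 1] > nums[j]:   # compare a[j-1] & a[j]
                nums[j - 1], nums[j] = nums[j], nums[j - 1]
                j -= 1
            # the loop breaks as soon as the element is no smaller than its left neighbour
        return nums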
Complexity
- Time
O(n^2)
- insertion sort works by iterating over the elements in the array, and for each element, it shifts all the elements that are greater than it to the right in order to make room for it in its final position. The number of shifts required to insert each element is proportional to the number of elements that have already been processed
- Space
O(1)
Pros and Cons
Pros
- efficient for nearly-sorted data
- It can sort in linear time
O(n)
when the input is already sorted or nearly sorted
- This is because the inner loop of the insertion sort algorithm, which is responsible for most of the computation, will have fewer iterations if the data is already close to being in order.
- online sorting: For example, consider a system that generates log events in real-time and needs to sort the events based on their timestamp. An online sorting algorithm, such as insertion sort, would be a good choice in this scenario, since it can sort the events as they are generated, rather than waiting until all the events have been generated before sorting the entire array.
- memory efficiency: in-place, which means no auxiliary storage is necessary
- stable sorting
- Insertion sort is generally better than selection sort
- Both can be used for small data sets, but insertion sort is usually the more suitable choice
Cons
- bad performance: it does not deal well with a list containing a huge number of items.
O(n^2)
Demonstration for unsorted numbers
input
7 8 5 2 4 6 3
i=1, j=1, compare a[j-1] & a[j] (7<8, break loop)
7 8 5 2 4 6 3
  i
j j
i is the starting point; within each i loop, j decreases by 1 to compare the two values a[j-1] and a[j]
i=2, j=2, compare a[j-1] & a[j] (8>5, swap)
7 8 5 2 4 6 3
7 5 8 2 4 6 3
    i
  j j
i=2, j=1, compare a[j-1] & a[j] (7>5, swap)
7 5 8 2 4 6 3
5 7 8 2 4 6 3
    i
j j
i=3, j=3, compare a[j-1] & a[j] (8>2, swap)
5 7 8 2 4 6 3
5 7 2 8 4 6 3
      i
    j j
i=3, j=2, compare a[j-1] & a[j] (7>2, swap)
5 7 2 8 4 6 3
5 2 7 8 4 6 3
      i
  j j
i=3, j=1, compare a[j-1] & a[j] (5>2, swap)
5 2 7 8 4 6 3
2 5 7 8 4 6 3
      i
j j
i=4, j=4, compare a[j-1] & a[j] (8>4, swap)
2 5 7 8 4 6 3
2 5 7 4 8 6 3
        i
      j j
i=4, j=3, compare a[j-1] & a[j] (7>4, swap)
2 5 7 4 8 6 3
2 5 4 7 8 6 3
        i
    j j
i=4, j=2, compare a[j-1] & a[j] (5>4, swap)
2 5 4 7 8 6 3
2 4 5 7 8 6 3
        i
  j j
i=4, j=1, compare a[j-1] & a[j] (2<4, break loop)
2 4 5 7 8 6 3
        i
j j
i=5, j=5, compare a[j-1] & a[j] (8>6, swap)
2 4 5 7 8 6 3
2 4 5 7 6 8 3
          i
        j j
i=5, j=4, compare a[j-1] & a[j] (7>6, swap)
2 4 5 7 6 8 3
2 4 5 6 7 8 3
          i
      j j
i=5, j=3, compare a[j-1] & a[j] (5<6, break loop)
2 4 5 6 7 8 3
          i
    j j
i=6, j=6, compare a[j-1] & a[j] (8>3, swap)
2 4 5 6 7 8 3
2 4 5 6 7 3 8
            i
          j j
i=6, j=5, compare a[j-1] & a[j] (7>3, swap)
2 4 5 6 7 3 8
2 4 5 6 3 7 8
            i
        j j
i=6, j=4, compare a[j-1] & a[j] (6>3, swap)
2 4 5 6 3 7 8
2 4 5 3 6 7 8
            i
      j j
i=6, j=3, compare a[j-1] & a[j] (5>3, swap)
2 4 5 3 6 7 8
2 4 3 5 6 7 8
            i
    j j
i=6, j=2, compare a[j-1] & a[j] (4>3, swap)
2 4 3 5 6 7 8
2 3 4 5 6 7 8
            i
  j j
i=6, j=1, compare a[j-1] & a[j] (2<3, break loop)
2 3 4 5 6 7 8
            i
j j
end
Demonstration for sorted numbers
input
2 3 4 5 6 7 8
i=1, j=1, compare a[j-1] & a[j] (2<3, break loop)
2 3 4 5 6 7 8
  i
j j
i=2, j=2, compare a[j-1] & a[j] (3<4, break loop)
2 3 4 5 6 7 8
    i
  j j
i=3, j=3, compare a[j-1] & a[j] (4<5, break loop)
2 3 4 5 6 7 8
      i
    j j
i=4, j=4, compare a[j-1] & a[j] (5<6, break loop)
2 3 4 5 6 7 8
        i
      j j
i=5, j=5, compare a[j-1] & a[j] (6<7, break loop)
2 3 4 5 6 7 8
          i
        j j
i=6, j=6, compare a[j-1] & a[j] (7<8, break loop)
2 3 4 5 6 7 8
            i
          j j
for sorted numbers, it only takes O(n)
Selection sort
How it works
Set a starting point (i), find the minimum among the elements from i onward, and swap it with the element at i.
The minimum is then locked in place, and the starting point moves forward in each iteration.
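A minimal sketch matching the walkthrough below, where minIdx remembers the smallest value seen so far (names are my own):

    def selection_sort(nums):
        n = len(nums)
        for i in range(n - 1):
            min_idx = i                    # assume a[i] is the minimum so far
            for j in range(i, n - 1):      # compare a[min_idx] with a[j+1]
                if nums[j + 1] < nums[min_idx]:
                    min_idx = j + 1        # update minIdx
            # end of iteration: swap the minimum with a[i]
            nums[i], nums[min_idx] = nums[min_idx], nums[i]
        return nums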
Complexity
- Time
O(n^2)
- selection sort works by repeatedly finding the minimum element in the unsorted portion of the array and swapping it with the first unsorted element. The number of swaps required to sort the array is proportional to the square of the number of elements in the array
- Space
O(1)
Pros and Cons
Pros
- memory efficiency: in-place algorithm. It does not require a lot of space for sorting
Cons
- bad performance
- It performs poorly when working on huge lists.
O(n^2)
- Among bubble, insertion and selection, selection is the least efficient. It takes
O(n^2)
even if the numbers have been sorted already
- unstable algorithm
Demonstration
input
2 8 1 3 9
i=0, minIdx=0, j=0, compare a[minIdx] & a[j+1] (2<8, do nothing)
2 8 1 3 9
i j
m
i=0, minIdx=0, j=1, compare a[minIdx] & a[j+1] (2>1, minIdx to be updated)
2 8 1 3 9
i   j
m
i=0, minIdx=2, j=1 (update minIdx)
2 8 1 3 9
i   j
    m
i=0, minIdx=2, j=2, compare a[minIdx] & a[j+1] (1<3, do nothing)
2 8 1 3 9
i   m j
i=0, minIdx=2, j=3, compare a[minIdx] & a[j+1] (1<9, do nothing)
2 8 1 3 9
i   m   j
i=0, minIdx=2, j=3 (end of iteration, swap the min with a[i])
2 8 1 3 9
1 8 2 3 9
i   m
i=1, minIdx=1, j=1, compare a[minIdx] & a[j+1] (8>2, minIdx to be updated)
1 8 2 3 9
  i j
  m
i=1, minIdx=2, j=1 (update minIdx)
1 8 2 3 9
  i j
    m
i=1, minIdx=2, j=2, compare a[minIdx] & a[j+1] (2<3, do nothing)
1 8 2 3 9
  i m j
i=1, minIdx=2, j=3, compare a[minIdx] & a[j+1] (2<9, do nothing)
1 8 2 3 9
  i m   j
i=1, minIdx=2, j=3 (end of iteration, swap the min with a[i])
1 8 2 3 9
1 2 8 3 9
  i m
i=2, minIdx=2, j=2, compare a[minIdx] & a[j+1] (8>3, minIdx to be updated)
1 2 8 3 9
    i j
    m
i=2, minIdx=3, j=2 (update minIdx)
1 2 8 3 9
    i j
      m
i=2, minIdx=3, j=3, compare a[minIdx] & a[j+1] (3<9, do nothing)
1 2 8 3 9
    i m j
i=2, minIdx=3, j=3 (end of iteration, swap the min with a[i])
1 2 8 3 9
1 2 3 8 9
    i m
i=3, minIdx=3, j=3, compare a[minIdx] & a[j+1] (8<9, do nothing)
1 2 3 8 9
      i j
      m
i=3, minIdx=3, j=3 (end of iteration, the min is already at a[i], nothing to swap)
1 2 3 8 9
      i
      m
end
Merge sort
How it works
Recursively divide the array into two halves until each sub-array contains only one element. Then merge the sub-arrays back together, in order of value, until the whole array is rebuilt.
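A compact sketch of that divide-and-merge idea; this version returns a new list rather than sorting in place (names are my own):

    def merge_sort(nums):
        if len(nums) <= 1:                 # a single element is already sorted
            return nums
        mid = len(nums) // 2
        left, right = merge_sort(nums[:mid]), merge_sort(nums[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:        # <= keeps equal elements in order (stable)
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]   # append whichever half has leftovers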
Complexity
- Time
O(n log n)
- merge sort works by dividing the array into two halves, sorting each half separately, and then merging the two sorted halves back together. Each level of merging takes linear time in the total number of elements, and there are only logarithmically many levels because the array is halved at each step, so the overall time complexity of merge sort is
O(n log n)
- Space
O(n)
- Merge sort has a space complexity of
O(n)
because it requires an auxiliary array of size n to store the sorted elements during the merging step.
- Additionally, merge sort requires a small amount of memory for the recursive call stack, which is
O(log n)
deep because the array is divided in half with each recursive call; this is typically not significant compared to the size of the auxiliary array.
- It’s worth noting that the total space complexity is therefore
O(log n) + O(n)
, which is O(n) overall
Pros and Cons
Pros
- Good performance
- It is efficient for both small and large data sets, with a time complexity of
O(n log n)
even in the worst case
- widely used in practice due to its efficient time complexity and stability
- Stable sorting
- Parallel processing
- can be implemented in a parallel way, which can greatly speed up sorting large data sets
- External sorting
- Merge sort is often used for external sorting, where the elements to be sorted are too large to be stored in memory all at once. In this case, the elements are divided into smaller chunks that can be sorted and merged in a two-pass process, where the first pass sorts the chunks and the second pass merges the sorted chunks into the final sorted array.
Cons
- uses more memory: it requires extra memory to store the sublists, which can be a problem with a very large list
- slow for small arrays
- For small arrays, simpler sorting algorithms like insertion sort can actually be faster in practice, even though their time complexity of
O(n^2)
is worse than merge sort’s
- the algorithm performs the whole divide-and-merge process even if the array is already sorted
- Marginally slower than quick sort in practice
Demonstration
            [6 5 3 1 8 7 2 4]
             /              \
     [6 5 3 1]              [8 7 2 4]
      /      \               /      \
  [6 5]     [3 1]        [8 7]     [2 4]
  /   \     /   \        /   \     /   \
[6]   [5] [3]   [1]    [8]   [7] [2]   [4]
  \   /     \   /        \   /     \   /
  [5 6]     [1 3]        [7 8]     [2 4]
      \      /               \      /
     [1 3 5 6]              [2 4 7 8]
             \              /
            [1 2 3 4 5 6 7 8]
Quick sort
How it works
Choose a pivot element and partition the array around it, moving values less than the pivot to its left and values greater than the pivot to its right.
Then recursively apply the same partitioning to each of the two sub-arrays.
When all sub-arrays contain at most one element, the entire array is sorted.
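A sketch using the Lomuto-style partition traced in the demonstration below: the pivot is the last element, j scans the array, and i marks the boundary of the "less than pivot" region (names are my own):

    def quick_sort(nums, lo=0, hi=None):
        if hi is None:
            hi = len(nums) - 1
        if lo >= hi:                       # 0 or 1 elements: already sorted
            return nums
        pivot, i = nums[hi], lo
        for j in range(lo, hi):
            if nums[j] < pivot:            # grow the "< pivot" region
                nums[i], nums[j] = nums[j], nums[i]
                i += 1
        nums[i], nums[hi] = nums[hi], nums[i]  # place the pivot between the two parts
        quick_sort(nums, lo, i - 1)
        quick_sort(nums, i + 1, hi)
        return nums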
Complexity
- Time
- (avg)
O(n log n)
- The array is partitioned around the pivot, and quick sort is called recursively on the two partitioned subarrays. When the pivot is chosen well and the array is randomly shuffled, the size of each partition is roughly equal, which results in a good average time complexity of
O(n log n)
- (worst)
O(n^2)
- if the pivot is always chosen poorly, the time complexity of quick sort can be
O(n^2)
, which makes its time complexity heavily dependent on the choice of pivot.
- Space
- it’s tied to the maximum height of the recursion tree
- (avg)
O(log n)
- In the average case, when the pivot is chosen optimally and the array is randomly shuffled, the size of each partition is roughly equal, which results in a good average space complexity of
O(log n)
.
- This is because quick sort uses a recursive approach, where each recursive call uses a small amount of memory on the call stack:
O(1)
space for each recursive call, and the height of the recursion tree is proportional to O(log n)
- It is due to the additional memory needed to store the intermediate state of the recursion (for function call stack).
- (worst)
O(n)
- where
n
is the number of elements in the array (height of tree)
- if the pivot is always chosen poorly, the size of one partition can be much larger than the other, which results in a call stack with a height proportional to the number of elements in the array, resulting in a worst-case space complexity of
O(n)
Pros and Cons
Pros
- Good performance
- Quick sort is highly efficient among all sorting algorithms
- widely used in practice
- fast in most cases
- memory usage:
O(log n)
- It is an in-place algorithm since it just requires a modest auxiliary stack.
O(log n)
- quicksort requires little space
- Parallel processing
- during each call of QUICKSORT, the array is partitioned into two parts and each part is solved recursively.
- Sorting the smaller arrays represents two completely independent subproblems that can be solved in parallel.
- Therefore, one way to parallelize quicksort is to execute it initially on a single process; then, when the algorithm performs its recursive calls, assign one of the subproblems to another process.
- Now each of these processes sorts its array by using quicksort and assigns one of its subproblems to other processes
Cons
- unstable sorting
- it swaps non-adjacent elements.
- worst case:
O(n^2)
- if you always choose the first or last array element as the pivot, then quicksort takes O(n^2) time to sort an already-sorted array
- slowest sort if things are already in reverse-order
O(n^2)
- On sorted and nearly-sorted arrays, its performance is slower than standard InsertionSort
Demonstration
j
6 3 7 5 1 2 [4]   6 > 4, do nothing
i
  j
6 3 7 5 1 2 [4]   3 < 4
i
  j
3 6 7 5 1 2 [4]   swap a[i] & a[j]
i
  j
3 6 7 5 1 2 [4]   i+1
  i
    j
3 6 7 5 1 2 [4]   7 > 4, do nothing
  i
      j
3 6 7 5 1 2 [4]   5 > 4, do nothing
  i
        j
3 6 7 5 1 2 [4]   1 < 4
  i
        j
3 1 7 5 6 2 [4]   swap a[i] & a[j]
  i
        j
3 1 7 5 6 2 [4]   i+1
    i
          j
3 1 7 5 6 2 [4]   2 < 4
    i
          j
3 1 2 5 6 7 [4]   swap a[i] & a[j]
    i
          j
3 1 2 5 6 7 [4]   i+1
      i
3 1 2 5 6 7 [4]   loop ends
      i
swap the pivot and a[i]
3 1 2 [4] 6 7 5
Now, all elements that are less than the pivot are before it, and all elements that are greater than the pivot are after it.
Then divide the array into two and repeat the sorting process recursively for each part using the same steps
[3 1 2] 4 [6 7 5]
and so on…
It doesn’t require an additional array to perform the sorting operation
However, QuickSort is a recursive algorithm, and each recursive call requires a small amount of memory that is allocated on the call stack, a system-provided data structure that tracks function calls.
Each recursive call to QuickSort is a new “layer” on the call stack, and the maximum depth of these layers is what gives rise to the space complexity of QuickSort.
The amount of memory used does not depend on the size of the input to the function (which is why we say it’s a constant amount), but on the number of recursive calls
So each recursive call to the QuickSort function requires a constant amount of space for its stack frame
Heap sort
How it works
- Build a max-heap (or min-heap) first from the given unsorted array using the same idea of
heapify
- In each iteration, swap the maximum with the last element, then run
heapify
again to restore the heap over the remaining unsorted elements (excluding the elements already shifted to the right)
- in effect, pop the maximum, then heapify the rest of the elements
- Once the heap size reduces to 1, all elements are sorted
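A sketch of both phases under those steps, with a sift-down heapify (names are my own):

    def heapify(nums, n, root):
        # sift the value at `root` down until its subtree is a max-heap
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < n and nums[left] > nums[largest]:
            largest = left
        if right < n and nums[right] > nums[largest]:
            largest = right
        if largest != root:
            nums[root], nums[largest] = nums[largest], nums[root]
            heapify(nums, n, largest)

    def heap_sort(nums):
        n = len(nums)
        for i in range(n // 2 - 1, -1, -1):  # only non-leaf nodes need heapifying
            heapify(nums, n, i)
        for end in range(n - 1, 0, -1):
            nums[0], nums[end] = nums[end], nums[0]  # move the max to the sorted tail
            heapify(nums, end, 0)                    # restore the heap on the rest
        return nums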
Complexity
- Time
O(n log n)
- it takes
O(log n)
time, once per element, to maintain the heap property by swapping elements down the tree as necessary
- Space
O(1)
- The algorithm swaps elements within the original array, so no extra memory is needed to store intermediate results
Pros and Cons
Pros
- Efficiency & Consistency
- a guaranteed time complexity of
O(n log n)
even in the worst case
- Memory usage is less: it doesn’t require any extra space to sort an array
O(1)
Cons
- unstable sort: not a stable sorting algorithm
- Expensive constant factors
- In the case of Heapsort vs. Quicksort, it turns out that there are ways (median of 5, for example) to make Quicksort’s worst cases very rare indeed
- Also, maintaining a heap is not free.
- Given an array with a normal distribution, Quicksort and Heapsort will both run in
O(n log n)
. But Quicksort will execute faster because its constant factors are smaller than the constant factors for Heapsort
- partitioning is faster than maintaining the heap
- Huge datasets
- If your dataset is really huge and doesn’t fit into memory, then merge sort works like a charm. It’s frequently used in clusters where dataset can span over hundreds of machines
Demonstration Part 1 - Make unsorted array a max-heap
For an array with 9 elements, only the first 4 elements need to be heapified in order to get a max-heap.
Only the first 4 elements are needed because heapify works downward, so processing them takes care of the whole heap.
An example to illustrate this:
          -> 8
            /   \
      -> 27      14 <-
         /  \    /  \
   -> 18     9  36    55
      /  \
    41    21
[ 8, 27, 14, 18, 9, 36, 55, 41, 21 ]
  *  *   *   *
If there are 10 elements in the array:
          -> 8
            /   \
      -> 27      14 <-
         /  \    /  \
   -> 18  -> 9  36    55
      /  \   /
    41   21 8
[ 8, 27, 14, 18, 9, 36, 55, 41, 21, 8 ]
  *  *   *   *   *
Now, start the work. The input for the following demonstration:
   -> 7
     /   \
 -> 1     2
    / \
   8   4
[7, 1, 2, 8, 4]
 *  *
the first 2 elements will need to be heapified
1st loop: swap value 8 with value 1 (heapify value 1)
      7
    /   \
-> 1     2
   / \
  8   4
      7
    /   \
   8     2
  / \
 1   4
the heapify is downward, so it stops here
2nd loop: swap 7 with 8 (heapify value 7)
   -> 7
     /   \
    8     2
   / \
  1   4
   -> 8
     /   \
    7     2
   / \
  1   4
max-heap is done
Demonstration Part 2 - Sort the max-heap by shifting maximum to the right in each iteration
max-heap
      8
    /   \
   7     2
  / \
 1   4
[8, 7, 2, 1, 4]
1st loop
shift index 0 with the last one (swap value 8 with value 4)
      4
    /   \
   7     2
  / \
 1   8
[4, 7, 2, 1, 8]
do heapify for indices 0, 1, 2, 3 (excluding value 8), starting with index 0
   -> 4
     /   \
    7     2
   /
  1
[4, 7, 2, 1, 8]
swap value 4 and value 7
       7
     /   \
 -> 4     2
    /
   1
[7, 4, 2, 1, 8]
do heapify, start from value 4
       7
     /   \
 -> 4     2
    /
   1
[7, 4, 2, 1, 8]
done (4 > 1, do nothing)
2nd loop
swap value 7 with value 1
      1
    /   \
   4     2
  /
 7
[1, 4, 2, 7, 8]
do heapify for indices 0, 1, 2 (excluding value 7), starting from value 1
   -> 1
     /   \
    4     2
[1, 4, 2, 7, 8]
swap value 1 and value 4
       4
     /   \
 -> 1     2
[4, 1, 2, 7, 8]
3rd loop
swap value 2 with value 4
   2
  / \
 1   4
[2, 1, 4, 7, 8]
do heapify for indices 0, 1 (excluding value 4), starting from value 2
   -> 2
     /
    1
[2, 1, 4, 7, 8]
done (2 > 1, do nothing)
4th loop
swap value 1 and value 2
   1
  /
 2
[1, 2, 4, 7, 8]
done (no need to do heapify)
Counting sort
How it works
The key idea behind counting sort is to count the occurrences of each element in the input array,
and use that information to determine the correct position of each element in the output array.
This makes counting sort very efficient for data sets with a relatively small range of values,
since it avoids the need for expensive comparisons between elements.
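A short sketch of the variant walked through in the demonstration below: count occurrences over the value range, take prefix sums, then fill the output from right to left so the sort stays stable (names are my own; assumes a non-empty list of integers):

    def counting_sort(nums):
        lo, hi = min(nums), max(nums)
        counts = [0] * (hi - lo + 1)       # one slot per value in the range
        for v in nums:
            counts[v - lo] += 1
        for k in range(1, len(counts)):    # sum up values with predecessor values
            counts[k] += counts[k - 1]
        out = [0] * len(nums)
        for v in reversed(nums):           # right-to-left keeps equal keys stable
            counts[v - lo] -= 1
            out[counts[v - lo]] = v
        return out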
Complexity
- Time
O(n + k)
- Space
O(n + k)
- n is the number of elements to be sorted and k is the range of values in the input array
Pros and Cons
Pros
- Linear time complexity
- This makes it very fast for data sets with a relatively small range of values.
- No comparisons needed
- Unlike many other sorting algorithms, counting sort does not rely on comparisons between elements, making it more efficient for certain types of data
- Stable sorting
- Easy to implement
Cons
- Memory usage
- Counting sort requires additional memory to store the counts of each element, as well as to store the sorted output array.
- Not an in-place sort
- may not be suitable for use in situations where memory is limited
- Inefficient for large range of values to be sorted
- Counting sort is only suitable for sorting data sets with a relatively small range of values. If the range of values is too large, the memory requirements of counting sort can become wasteful.
- It’s designed to work with integers (not suitable for decimals)
Demonstration
input
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
min = 18, max = 25, size of count array = 8 (25-18+1)
Create a count array
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Fill the count array with the number of occurrences
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 1 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Sum up values with predecessor values
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Create a sorted array with the same length as origin input
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Now, start to fill the sorted array
To make sure this algorithm performs stable sorting, start with the last element of the input, 21
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Find the value 3 in the count array via the value 21 from the input array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Fill the sorted array with 21 at index 2, which comes from the count array value 3 minus 1.
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
____/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Update the value in the count array to 2 (3 - 1)
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
____/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
The next value is 25. Find its value in the count array using the same step
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Find the index 3 of the sorted array via the value in the count array (4 minus 1)
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
______________________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
______________________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 25 |
-----------------------
Next, the value 21
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\_________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
__________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 25 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\_________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
__________/
/
index 0 1 2 3
-----------------------
value | 0 | 21 | 21 | 25 |
-----------------------
Lastly, the last value 18
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
|
|
index 0 1 2 3
-----------------------
value | 0 | 21 | 21 | 25 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 0 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
|
|
index 0 1 2 3
-----------------------
value | 18 | 21 | 21 | 25 |
-----------------------
Now, all the elements are sorted
index 0 1 2 3
-----------------------
value | 18 | 21 | 21 | 25 |
-----------------------
end
Binary search
How it works
Find a specific value in a sorted array by repeatedly dividing the search interval in half.
It works by comparing the target value to the middle element of the array.
If the target value is less than the middle element, the search continues in the lower half of the array.
If the target value is greater than the middle element, the search continues in the upper half of the array.
The algorithm continues until the target value is found or the search interval is empty.
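An iterative sketch of that loop; it assumes the input list is sorted in ascending order and returns the index of the target, or -1 (names are my own):

    def binary_search(nums, target):
        lo, hi = 0, len(nums) - 1
        while lo <= hi:                    # search interval is nums[lo..hi]
            mid = (lo + hi) // 2
            if nums[mid] == target:
                return mid
            if target < nums[mid]:
                hi = mid - 1               # continue in the lower half
            else:
                lo = mid + 1               # continue in the upper half
        return -1                          # interval is empty: not found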
Complexity
Time: O(log n)
Pros and Cons
Pros
- very fast compared to linear search
- can be implemented both iteratively and recursively
- can be applied to any sorted array or list
Cons
- requires the input array or list to be sorted
- can only be applied to one-dimensional sorted arrays or lists
- not suitable for dynamic arrays or lists because inserting and deleting elements will destroy the sorted order and thus make the binary search useless
Demonstration
a sorted list, try to find 13
1 3 4 5 13 20 25 40 42 44 53
find the middle, 20, and compare with it
1 3 4 5 13 20 25 40 42 44 53
           *
because 13 < 20, then look at the left half
1 3 4 5 13 20 25 40 42 44 53
|        |
find the middle of the left half, 4, then look at its right half as 13 > 4
1 3 4 5 13 20 25 40 42 44 53
|   *    |
find the middle, 5, then look at its right half as 13 > 5
1 3 4 5 13 20 25 40 42 44 53
      * |
find the node 13
1 3 4 5 13 20 25 40 42 44 53
        *
end
Traversal
BFS (Breadth First Search)
tree
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
traversal
9 4 20 1 6 15 170
time complexity is O(n)
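A queue-based sketch of this level-by-level traversal; the TreeNode class is a minimal assumption for the trees pictured here:

    from collections import deque

    class TreeNode:
        def __init__(self, val, left=None, right=None):
            self.val, self.left, self.right = val, left, right

    def bfs(root):
        order, queue = [], deque([root] if root else [])
        while queue:
            node = queue.popleft()         # visit the whole level before going deeper
            order.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        return order                       # [9, 4, 20, 1, 6, 15, 170] for the tree above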
Pros
- If a solution exists, BFS will definitely find it
- easy to find the shortest path between two nodes because BFS visits nodes level-by-level, and it will find the destination node as soon as it is encountered
- Never gets trapped in unwanted branches that do not contain the solution
- useful for finding the maximum or minimum value in a BST
Cons
- BFS uses more memory than DFS because it needs to keep track of all nodes at the same level before moving on to deeper levels (e.g. queue)
- Slower for deep trees (it visits all nodes at the same level before moving on to the next level, so deeply placed targets are reached late)
use cases
- find BT nodes
- search engines use BFS to build their index
- GPS navigation
- Network problems: BFS can be used to find the shortest path in a network. For example, it can be used to find the shortest path between two nodes in a computer network, or to find the shortest path between two nodes in a social network.
- Map problems: BFS can be used to find the shortest path in a map.
- recommendation engine
- on Amazon: what types of items are related or have the closest relation to the last book I bought
- on FB: what types of friend requests should be recommended to me
DFS (Depth First Search)
tree
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
traversal
9 4 1 6 20 15 170
The time complexity is O(n) regardless of the traversal method used
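A recursive preorder sketch that produces the traversal above (TreeNode as in the BFS sketch):

    def dfs(node, order=None):
        if order is None:
            order = []
        if node:
            order.append(node.val)         # root first, then the left and right subtrees
            dfs(node.left, order)
            dfs(node.right, order)
        return order                       # [9, 4, 1, 6, 20, 15, 170] for the tree above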
Pros
- Memory efficiency: DFS uses a stack to store nodes to be processed, which only requires a limited amount of memory compared to BFS, which uses a queue and requires more memory.
- Faster for deep trees because it can quickly reach the bottom-most level of a tree.
- Faster in finding a solution: DFS can backtrack and quickly eliminate branches that do not contain a solution especially if the solution is located deep in the graph. For example, consider a problem where you need to find a specific node in a large graph. With DFS, you can start from a node and follow its edges to explore deeper into the graph. If you find the solution, you can stop the search and return the result.
Cons
- Can get stuck in an infinite loop: If there are cycles in the graph, DFS may get stuck in an infinite loop, whereas BFS is able to avoid this issue.
- Cannot guarantee finding a solution quickly. For example, if we search for a Starbucks from my location, we will likely find one soon, but if we search for a museum, it will take more time and there is no guarantee of finding one
- Cannot find the minimal solution if two solutions are available
- Slower for shallow trees
use cases
- If we perform DFS on an unweighted graph, the traversal tree it creates is a minimum spanning tree (and an all-pairs shortest-path tree)
- Using DFS, we can find a path between two given vertices u and v
- topological sorting: Topological sorting is a linear ordering of the vertices in a directed acyclic graph (DAG) (e.g. A->B->C->D)
- scheduling problems: can be used to schedule jobs from given dependencies among jobs
- solving puzzles with only one solution, such as a maze or a sudoku puzzle.
- analysing networks e.g. testing if a graph is bipartite (a graph where vertices can be partitioned into two disjoint sets, with no edges connecting vertices in the same set. It’s used in matching, scheduling, and network flow problems and can be easily tested for by coloring vertices with two colors and checking if no two adjacent vertices have the same color.)
- on LinkedIn: if I have a connection with someone, I can use DFS to find what degree of connection I have with that person
Why is BFS more effective in finding the shortest path compared to DFS?
- BFS is good at finding the shortest path in a graph or tree because it explores all the nodes at a given level before moving on to the next level. This means that it discovers all the nodes at distance 1 from the starting node before discovering any nodes at distance 2, and so on.
- Because BFS explores all nodes at a given distance before moving on to the next distance, it is guaranteed to find the shortest path from the starting node to any other node in the graph or tree
- DFS does not guarantee to find the shortest path, as it explores the nodes along a single path until it reaches a dead end before backtracking and exploring other paths.
inorder vs preorder vs postorder
- preorder (root, left, right)
- If you know you need to explore the roots before inspecting any leaves, you pick pre-order because you will encounter all the roots before all of the leaves.
- inorder (left, root, right)
- If you know that the tree has an inherent sequence in the nodes, and you want to flatten the tree back into its original sequence, then an in-order traversal should be used.
- postorder (left, right, root)
- If you know you need to explore all the leaves before any nodes, you select post-order because you don’t waste any time inspecting roots in search for leaves.
Example 1
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
preorder
9 4 1 6 20 15 170
inorder
1 4 6 9 15 20 170
postorder
1 6 4 15 170 20 9
Example 2
                  50
                /    \
              /        \
            30          80
           /  \        /  \
          1    47     70   100
              /  \      \    / \
            40    73    78 99   126
           /  \        /  \     /
         38    45    75    79 118
preorder
50 30 1 47 40 38 45 73 80 70 78 75 79 100 99 126 118
inorder
1 30 38 40 45 47 73 50 70 75 78 79 80 99 100 118 126
postorder
1 38 45 40 73 47 30 75 79 78 70 99 118 126 100 80 50
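The three orders differ only in where the root is visited relative to its subtrees; a sketch (TreeNode as above):

    def preorder(node):                    # root, left, right
        return [node.val] + preorder(node.left) + preorder(node.right) if node else []

    def inorder(node):                     # left, root, right
        return inorder(node.left) + [node.val] + inorder(node.right) if node else []

    def postorder(node):                   # left, right, root
        return postorder(node.left) + postorder(node.right) + [node.val] if node else []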
Why does BFS not have inorder, preorder, and postorder implementations, while DFS does?
- BFS visits nodes in a level-by-level manner, which makes it difficult to visit the nodes in a specific order.
- DFS has inorder, preorder, and postorder implementations because it can be used to traverse the tree and visit the nodes in a specific order by visiting the children of a node before visiting the node itself.
What are the pros and cons of a recursive solution?
Pros
- reduce time complexity (if recursion is used with memoisation)
- cleaner code
- Recursion is better at tree traversal
- One of the more efficient ways to traverse these trees when looking for a specific leaf (or node) is to recursively follow a single branch to its end, until you find the value you are looking for.
Cons
- use more memory (might cause stack overflow)
- Because the function has to add to the stack with each recursive call and keep the values there until the call is finished
- can be slow (generally, compared to iteration)
- it requires the allocation of a new stack frame
- the fibonacci question is a good example: iteration is much faster than naive recursion, because iteration saves the value of each calculation for further use
- more difficult to understand and debug
Other sorting algorithms
- Tree sort
- Steps
- Create a binary search tree from an array
- Do an in-order traversal
- Radix sort
- Time
O(nk)
- Space
O(n+k)
- used to sort integers or strings
shortest path (of a weighted graph)
Leetcode techniques
Cumulative Sum
A cumulative sum is a sequence of partial sums of a given sequence. For an array of numbers, the cumulative sum at a certain index is the sum of all elements up to that index.
If nums is [a, b, c, d], the cumulative sum array would be [a, a+b, a+b+c, a+b+c+d].
For example, the original input:
[1, 2, 3, 4]
cumulative sum array:
[1, 3, 6, 10]
Once you have the cumulative sum array, you can quickly calculate the sum of any subarray from index i to j.
For example
- what’s the sum of indices 1 to 3? Ans: cumulativeSum[3] - cumulativeSum[0], which equals 10 - 1 = 9
- what’s the sum of indices 2 to 3? Ans: cumulativeSum[3] - cumulativeSum[1], which equals 10 - 3 = 7
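A tiny sketch of building the cumulative sum array and answering range-sum queries in O(1); the i > 0 guard covers ranges that start at index 0 (names are my own):

    from itertools import accumulate

    def make_query(nums):
        prefix = list(accumulate(nums))    # [1, 2, 3, 4] -> [1, 3, 6, 10]
        def query(i, j):                   # sum of nums[i..j], inclusive
            return prefix[j] - (prefix[i - 1] if i > 0 else 0)
        return query

    query = make_query([1, 2, 3, 4])
    print(query(1, 3))                     # 10 - 1 = 9
    print(query(2, 3))                     # 10 - 3 = 7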
Terms
Dynamic Programming
- Aim to find the best solution among a set of possible solutions
- It is mainly used when the problem can be broken down into overlapping subproblems, and the optimal solutions of these subproblems can be used iteratively to solve the overall problem
- to solve each subproblem only once, rather than repeatedly
- = Divide & Conquer + Memoisation
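A classic illustration of "solve each subproblem only once": Fibonacci with memoisation (a sketch; lru_cache is Python's built-in memoiser):

    from functools import lru_cache

    @lru_cache(maxsize=None)               # memoisation: each fib(n) is computed only once
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)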
Functional Programming
- uses pure functions to create maintainable software
- uses immutable data and avoids concepts like shared states (opposite to OOP)
- focuses on the results, not the process
- make code easier to understand, test, and debug
- independent and reusable
Interview
Top 6 Coding Interview Concepts
- Heaps
- min-heaps
- max-heaps
- K closest points to origin
- network delay time
- min cost to connect all points
- sliding window
- Best time to buy/sell a stock
- Binary search
- Guess number higher or lower
- search a 2-D matrix
- binary search
- DFS & BFS
- usually
O(V+E)
, time & space complexity
- number of islands
- Recursion
- includes Trees, Graphs, Backtracking, DP
- N-queens
- Hash maps
The 10 Most Important Concepts For Coding Interviews
- logarithm
- DFS/BFS
- binary search
- sliding window
- recursion
- inverting a binary tree and reverting a linked list
- suffix trees
- heaps
- dynamic programming
- sorting algorithms