Bubble sort
How it works
In each pass, compare adjacent values and move the larger value to the right.
This way, the greatest unsorted value ends up on the right side after each pass, where it is locked in and excluded from subsequent passes.
animated demonstration: https://commons.wikimedia.org/wiki/File:Bubble-sort-example-300px.gif
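A minimal sketch in Python of the passes described above, including the "swapped" early-exit flag used in the sorted-input demonstration below (function and variable names are my own):

    def bubble_sort(nums):
        n = len(nums)
        for i in range(n - 1, 0, -1):      # i marks the right end of the unsorted part
            swapped = False
            for j in range(i):             # compare a[j] with a[j+1]
                if nums[j] > nums[j + 1]:
                    nums[j], nums[j + 1] = nums[j + 1], nums[j]
                    swapped = True
            if not swapped:                # a full pass with no swaps: already sorted
                break
        return nums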
Complexity
- Time
O(n^2)
- bubble sort works by repeatedly swapping adjacent elements if they are in the wrong order, and it continues this process until the array is sorted. In the worst case, the number of comparisons and swaps required to sort the array is proportional to the square of the number of elements in the array
- Space
O(1)
Pros and Cons
Pros
- Educational purposes: easy to implement
- memory efficiency: elements are swapped in place without using additional temporary storage
- stable sorting
- For example, consider a list of records that represent people, where each record has a name and an age. If we sort the list based on age, it is important to maintain the relative order of people with the same age, so that the order of their names is preserved in the sorted list. A stable sorting algorithm, such as merge sort, would ensure that the relative order of equal elements is preserved, while an unstable sorting algorithm, such as quick sort, would not.
- efficient for nearly-sorted data
- With the "swapped" early-exit flag, it takes linear time,
O(n)
- fewer swaps, allowing the algorithm to complete more quickly.
Cons
- bad performance: it does not deal well with a list containing a huge number of items.
- not recommended for use in practical applications due to its poor time complexity of
O(n^2)
Demonstration for unsorted numbers
input
7 5 6 1 3
i=4, j=0, compare a[j] & a[j+1] (7>5, swap)
7 5 6 1 3
5 7 6 1 3
j j
i=4, j=1, compare a[j] & a[j+1] (7>6, swap)
5 7 6 1 3
5 6 7 1 3
  j j
i=4, j=2, compare a[j] & a[j+1] (7>1, swap)
5 6 7 1 3
5 6 1 7 3
    j j
i=4, j=3, compare a[j] & a[j+1] (7>3, swap)
5 6 1 7 3
5 6 1 3 7
      j j
i=3, j=0, compare a[j] & a[j+1] (5<6, do nothing)
5 6 1 3 7
j j
i=3, j=1, compare a[j] & a[j+1] (6>1, swap)
5 6 1 3 7
5 1 6 3 7
  j j
i=3, j=2, compare a[j] & a[j+1] (6>3, swap)
5 1 6 3 7
5 1 3 6 7
    j j
i=2, j=0, compare a[j] & a[j+1] (5>1, swap)
5 1 3 6 7
1 5 3 6 7
j j
i=2, j=1, compare a[j] & a[j+1] (5>3, swap)
1 5 3 6 7
1 3 5 6 7
  j j
i=1, j=0, compare a[j] & a[j+1] (1<3, do nothing)
1 3 5 6 7
j j
end
Demonstration for sorted numbers
input
1 3 5 6 7
i=4, j=0, compare a[j] & a[j+1] (1<3, swapped remains false)
1 3 5 6 7
j j
i=4, j=1, compare a[j] & a[j+1] (3<5, swapped remains false)
1 3 5 6 7
  j j
i=4, j=2, compare a[j] & a[j+1] (5<6, swapped remains false)
1 3 5 6 7
    j j
i=4, j=3, compare a[j] & a[j+1] (6<7, swapped remains false)
1 3 5 6 7
      j j
swapped is still false after the whole pass, so break the loop
end
for sorted numbers, it only takes O(n)
Insertion sort
How it works
Set a starting point (i) that moves forward in each iteration; compare the value at i with the already-sorted values before it and move it left past any larger values.
animated demonstration: https://commons.wikimedia.org/wiki/File:Insertion-sort-example.gif
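A sketch of the swap-based variant traced in the demonstrations below; production versions often shift elements instead of swapping them pairwise (names are my own):

    def insertion_sort(nums):
        for i in range(1, len(nums)):          # i is the starting point that moves forward
            j = i
            while j > 0 and nums[j - 1] > nums[j]:   # compare a[j-1] & a[j]
                nums[j - 1], nums[j] = nums[j], nums[j - 1]
                j -= 1
            # the loop breaks as soon as the element is no smaller than its left neighbour
        return nums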
Complexity
- Time
O(n^2)
- insertion sort works by iterating over the elements in the array, and for each element, it shifts all the elements that are greater than it to the right in order to make room for it in its final position. The number of shifts required to insert each element is proportional to the number of elements that have already been processed
- Space
O(1)
Pros and Cons
Pros
- efficient for nearly-sorted data
- It can sort in linear time
O(n)
when the input is already sorted or nearly sorted
- This is because the inner loop of the insertion sort algorithm, which is responsible for most of the computation, will have fewer iterations if the data is already close to being in order.
- online sorting: For example, consider a system that generates log events in real-time and needs to sort the events based on their timestamp. An online sorting algorithm, such as insertion sort, would be a good choice in this scenario, since it can sort the events as they are generated, rather than waiting until all the events have been generated before sorting the entire array.
- memory efficiency: in-place, which means no auxiliary storage is necessary
- stable sorting
- Insertion sort is generally better than selection sort
- Both can be used for small data sets, but insertion sort is usually the more suitable choice
Cons
- bad performance: it does not deal well with a list containing a huge number of items.
O(n^2)
Demonstration for unsorted numbers
input
7 8 5 2 4 6 3
i=1, j=1, compare a[j-1] & a[j] (7<8, break loop)
7 8 5 2 4 6 3
  i
j j
i is the starting point; within each i loop, j decreases by 1 to compare the two values a[j-1] and a[j]
i=2, j=2, compare a[j-1] & a[j] (8>5, swap)
7 8 5 2 4 6 3
7 5 8 2 4 6 3
    i
  j j
i=2, j=1, compare a[j-1] & a[j] (7>5, swap)
7 5 8 2 4 6 3
5 7 8 2 4 6 3
    i
j j
i=3, j=3, compare a[j-1] & a[j] (8>2, swap)
5 7 8 2 4 6 3
5 7 2 8 4 6 3
      i
    j j
i=3, j=2, compare a[j-1] & a[j] (7>2, swap)
5 7 2 8 4 6 3
5 2 7 8 4 6 3
      i
  j j
i=3, j=1, compare a[j-1] & a[j] (5>2, swap)
5 2 7 8 4 6 3
2 5 7 8 4 6 3
      i
j j
i=4, j=4, compare a[j-1] & a[j] (8>4, swap)
2 5 7 8 4 6 3
2 5 7 4 8 6 3
        i
      j j
i=4, j=3, compare a[j-1] & a[j] (7>4, swap)
2 5 7 4 8 6 3
2 5 4 7 8 6 3
        i
    j j
i=4, j=2, compare a[j-1] & a[j] (5>4, swap)
2 5 4 7 8 6 3
2 4 5 7 8 6 3
        i
  j j
i=4, j=1, compare a[j-1] & a[j] (2<4, break loop)
2 4 5 7 8 6 3
        i
j j
i=5, j=5, compare a[j-1] & a[j] (8>6, swap)
2 4 5 7 8 6 3
2 4 5 7 6 8 3
          i
        j j
i=5, j=4, compare a[j-1] & a[j] (7>6, swap)
2 4 5 7 6 8 3
2 4 5 6 7 8 3
          i
      j j
i=5, j=3, compare a[j-1] & a[j] (5<6, break loop)
2 4 5 6 7 8 3
          i
    j j
i=6, j=6, compare a[j-1] & a[j] (8>3, swap)
2 4 5 6 7 8 3
2 4 5 6 7 3 8
            i
          j j
i=6, j=5, compare a[j-1] & a[j] (7>3, swap)
2 4 5 6 7 3 8
2 4 5 6 3 7 8
            i
        j j
i=6, j=4, compare a[j-1] & a[j] (6>3, swap)
2 4 5 6 3 7 8
2 4 5 3 6 7 8
            i
      j j
i=6, j=3, compare a[j-1] & a[j] (5>3, swap)
2 4 5 3 6 7 8
2 4 3 5 6 7 8
            i
    j j
i=6, j=2, compare a[j-1] & a[j] (4>3, swap)
2 4 3 5 6 7 8
2 3 4 5 6 7 8
            i
  j j
i=6, j=1, compare a[j-1] & a[j] (2<3, break loop)
2 3 4 5 6 7 8
            i
j j
end
Demonstration for sorted numbers
input
2 3 4 5 6 7 8
i=1, j=1, compare a[j-1] & a[j] (2<3, break loop)
2 3 4 5 6 7 8
  i
j j
i=2, j=2, compare a[j-1] & a[j] (3<4, break loop)
2 3 4 5 6 7 8
    i
  j j
i=3, j=3, compare a[j-1] & a[j] (4<5, break loop)
2 3 4 5 6 7 8
      i
    j j
i=4, j=4, compare a[j-1] & a[j] (5<6, break loop)
2 3 4 5 6 7 8
        i
      j j
i=5, j=5, compare a[j-1] & a[j] (6<7, break loop)
2 3 4 5 6 7 8
          i
        j j
i=6, j=6, compare a[j-1] & a[j] (7<8, break loop)
2 3 4 5 6 7 8
            i
          j j
for sorted numbers, it only takes O(n)
Selection sort
How it works
Set a starting point (i), find the minimum among the elements from i onward, and swap it with the element at i.
The minimum is then locked in place, and the starting point moves forward in each iteration.
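A minimal sketch matching the walkthrough below, where minIdx remembers the smallest value seen so far (names are my own):

    def selection_sort(nums):
        n = len(nums)
        for i in range(n - 1):
            min_idx = i                    # assume a[i] is the minimum so far
            for j in range(i, n - 1):      # compare a[min_idx] with a[j+1]
                if nums[j + 1] < nums[min_idx]:
                    min_idx = j + 1        # update minIdx
            # end of iteration: swap the minimum with a[i]
            nums[i], nums[min_idx] = nums[min_idx], nums[i]
        return nums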
Complexity
- Time
O(n^2)
- selection sort works by repeatedly finding the minimum element in the unsorted portion of the array and swapping it with the first unsorted element. The number of swaps required to sort the array is proportional to the square of the number of elements in the array
- Space
O(1)
Pros and Cons
Pros
- memory efficiency: in-place algorithm. It does not require a lot of space for sorting
Cons
- bad performance
- It performs poorly when working on huge lists.
O(n^2)
- Among bubble, insertion and selection, selection is the least efficient. It takes
O(n^2)
even if the numbers have been sorted already
- unstable algorithm
Demonstration
input
2 8 1 3 9
i=0, minIdx=0, j=0, compare a[minIdx] & a[j+1] (2<8, do nothing)
2 8 1 3 9
i j
m
i=0, minIdx=0, j=1, compare a[minIdx] & a[j+1] (2>1, minIdx to be updated)
2 8 1 3 9
i   j
m
i=0, minIdx=2, j=1 (update minIdx)
2 8 1 3 9
i   j
    m
i=0, minIdx=2, j=2, compare a[minIdx] & a[j+1] (1<3, do nothing)
2 8 1 3 9
i   m j
i=0, minIdx=2, j=3, compare a[minIdx] & a[j+1] (1<9, do nothing)
2 8 1 3 9
i   m   j
i=0, minIdx=2, j=3 (end of iteration, swap the min with a[i])
2 8 1 3 9
1 8 2 3 9
i   m
i=1, minIdx=1, j=1, compare a[minIdx] & a[j+1] (8>2, minIdx to be updated)
1 8 2 3 9
  i j
  m
i=1, minIdx=2, j=1 (update minIdx)
1 8 2 3 9
  i j
    m
i=1, minIdx=2, j=2, compare a[minIdx] & a[j+1] (2<3, do nothing)
1 8 2 3 9
  i m j
i=1, minIdx=2, j=3, compare a[minIdx] & a[j+1] (2<9, do nothing)
1 8 2 3 9
  i m   j
i=1, minIdx=2, j=3 (end of iteration, swap the min with a[i])
1 8 2 3 9
1 2 8 3 9
  i m
i=2, minIdx=2, j=2, compare a[minIdx] & a[j+1] (8>3, minIdx to be updated)
1 2 8 3 9
    i j
    m
i=2, minIdx=3, j=2 (update minIdx)
1 2 8 3 9
    i j
      m
i=2, minIdx=3, j=3, compare a[minIdx] & a[j+1] (3<9, do nothing)
1 2 8 3 9
    i m j
i=2, minIdx=3, j=3 (end of iteration, swap the min with a[i])
1 2 8 3 9
1 2 3 8 9
    i m
i=3, minIdx=3, j=3, compare a[minIdx] & a[j+1] (8<9, do nothing)
1 2 3 8 9
      i j
      m
i=3, minIdx=3, j=3 (end of iteration, the min is already at a[i], nothing to swap)
1 2 3 8 9
      i
      m
end
Merge sort
How it works
Recursively divide the array into two halves until each sub-array contains only one element. Then merge the sub-arrays back together, in order of value, until the whole array is rebuilt.
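A compact sketch of that divide-and-merge idea; this version returns a new list rather than sorting in place (names are my own):

    def merge_sort(nums):
        if len(nums) <= 1:                 # a single element is already sorted
            return nums
        mid = len(nums) // 2
        left, right = merge_sort(nums[:mid]), merge_sort(nums[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:        # <= keeps equal elements in order (stable)
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]   # append whichever half has leftovers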
Complexity
- Time
O(n log n)
- merge sort works by dividing the array into two halves, sorting each half separately, and then merging the two sorted halves back together. Each level of merging takes linear time in the total number of elements, and there are only logarithmically many levels because the array is halved at each step, so the overall time complexity of merge sort is
O(n log n)
- Space
O(n)
- Merge sort has a space complexity of
O(n)
because it requires an auxiliary array of size n to store the sorted elements during the merging step.
- Additionally, merge sort requires a small amount of memory for the recursive call stack, which is
O(log n)
deep because the array is divided in half with each recursive call; this is typically not significant compared to the size of the auxiliary array.
- It’s worth noting that the total space complexity is therefore
O(log n) + O(n)
, which is O(n) overall
Pros and Cons
Pros
- Good performance
- It is efficient for both small and large data sets, with a time complexity of
O(n log n)
even in the worst case
- widely used in practice due to its efficient time complexity and stability
- Stable sorting
- Parallel processing
- can be implemented in a parallel way, which can greatly speed up sorting large data sets
- External sorting
- Merge sort is often used for external sorting, where the elements to be sorted are too large to be stored in memory all at once. In this case, the elements are divided into smaller chunks that can be sorted and merged in a two-pass process, where the first pass sorts the chunks and the second pass merges the sorted chunks into the final sorted array.
Cons
- uses more memory: it requires extra memory to store the sublists, which can be a problem with a very large list
- slow for small arrays
- For small arrays, simpler sorting algorithms like insertion sort can actually be faster in practice, even though their time complexity of
O(n^2)
is worse than merge sort’s
- the algorithm performs the whole divide-and-merge process even if the array is already sorted
- Marginally slower than quick sort in practice
Demonstration
            [6 5 3 1 8 7 2 4]
             /              \
     [6 5 3 1]              [8 7 2 4]
      /      \               /      \
  [6 5]     [3 1]        [8 7]     [2 4]
  /   \     /   \        /   \     /   \
[6]   [5] [3]   [1]    [8]   [7] [2]   [4]
  \   /     \   /        \   /     \   /
  [5 6]     [1 3]        [7 8]     [2 4]
      \      /               \      /
     [1 3 5 6]              [2 4 7 8]
             \              /
            [1 2 3 4 5 6 7 8]
Quick sort
How it works
Choose a pivot element and partition the array around it, moving values less than the pivot to its left and values greater than the pivot to its right.
Then recursively apply the same partitioning to each of the two sub-arrays.
When all sub-arrays contain at most one element, the entire array is sorted.
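A sketch using the Lomuto-style partition traced in the demonstration below: the pivot is the last element, j scans the array, and i marks the boundary of the "less than pivot" region (names are my own):

    def quick_sort(nums, lo=0, hi=None):
        if hi is None:
            hi = len(nums) - 1
        if lo >= hi:                       # 0 or 1 elements: already sorted
            return nums
        pivot, i = nums[hi], lo
        for j in range(lo, hi):
            if nums[j] < pivot:            # grow the "< pivot" region
                nums[i], nums[j] = nums[j], nums[i]
                i += 1
        nums[i], nums[hi] = nums[hi], nums[i]  # place the pivot between the two parts
        quick_sort(nums, lo, i - 1)
        quick_sort(nums, i + 1, hi)
        return nums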
Complexity
- Time
- (avg)
O(n log n)
- The array is partitioned around the pivot, and quick sort is called recursively on the two partitioned subarrays. When the pivot is chosen well and the array is randomly shuffled, the size of each partition is roughly equal, which results in a good average time complexity of
O(n log n)
- (worst)
O(n^2)
- if the pivot is always chosen poorly, the time complexity of quick sort can be
O(n^2)
, which makes its time complexity heavily dependent on the choice of pivot.
- Space
- it’s tied to the maximum height of the recursion tree
- (avg)
O(log n)
- In the average case, when the pivot is chosen optimally and the array is randomly shuffled, the size of each partition is roughly equal, which results in a good average space complexity of
O(log n)
.
- This is because quick sort uses a recursive approach, where each recursive call uses a small amount of memory on the call stack:
O(1)
space for each recursive call, and the height of the recursion tree is proportional to O(log n)
- It is due to the additional memory needed to store the intermediate state of the recursion (for function call stack).
- (worst)
O(n)
- where
n
is the number of elements in the array (height of tree)
- if the pivot is always chosen poorly, the size of one partition can be much larger than the other, which results in a call stack with a height proportional to the number of elements in the array, resulting in a worst-case space complexity of
O(n)
Pros and Cons
Pros
- Good performance
- Quick sort is highly efficient among all sorting algorithms
- widely used in practice
- fast in most cases
- memory usage:
O(log n)
- It is an in-place algorithm since it just requires a modest auxiliary stack.
O(log n)
- quicksort requires little space
- Parallel processing
- during each call of QUICKSORT, the array is partitioned into two parts and each part is solved recursively.
- Sorting the smaller arrays represents two completely independent subproblems that can be solved in parallel.
- Therefore, one way to parallelize quicksort is to execute it initially on a single process; then, when the algorithm performs its recursive calls, assign one of the subproblems to another process.
- Now each of these processes sorts its array by using quicksort and assigns one of its subproblems to other processes
Cons
- unstable sorting
- it swaps non-adjacent elements.
- worst case:
O(n^2)
- if you always choose the first or last array element as the pivot, then quicksort takes O(n^2) time to sort an already-sorted array
- slowest sort if things are already in reverse-order
O(n^2)
- On sorted and nearly-sorted arrays, its performance is slower than standard InsertionSort
Demonstration
j
6 3 7 5 1 2 [4]   6 > 4, do nothing
i
  j
6 3 7 5 1 2 [4]   3 < 4
i
  j
3 6 7 5 1 2 [4]   swap a[i] & a[j]
i
  j
3 6 7 5 1 2 [4]   i+1
  i
    j
3 6 7 5 1 2 [4]   7 > 4, do nothing
  i
      j
3 6 7 5 1 2 [4]   5 > 4, do nothing
  i
        j
3 6 7 5 1 2 [4]   1 < 4
  i
        j
3 1 7 5 6 2 [4]   swap a[i] & a[j]
  i
        j
3 1 7 5 6 2 [4]   i+1
    i
          j
3 1 7 5 6 2 [4]   2 < 4
    i
          j
3 1 2 5 6 7 [4]   swap a[i] & a[j]
    i
          j
3 1 2 5 6 7 [4]   i+1
      i
3 1 2 5 6 7 [4]   loop ends
      i
swap the pivot and a[i]
3 1 2 [4] 6 7 5
Now, all elements that are less than the pivot are before it, and all elements that are greater than the pivot are after it.
Then divide the array into two and repeat the sorting process recursively for each part using the same steps
[3 1 2] 4 [6 7 5]
and so on…
It doesn’t require an additional array to perform the sorting operation
However, QuickSort is a recursive algorithm, and each recursive call requires a small amount of memory that is allocated on the call stack, a system-provided data structure that tracks function calls.
Each recursive call to QuickSort is a new “layer” on the call stack, and the maximum depth of these layers is what gives rise to the space complexity of QuickSort.
The amount of memory used does not depend on the size of the input to the function (which is why we say it’s a constant amount), but on the number of recursive calls
So each recursive call to the QuickSort function requires a constant amount of space for its stack frame
Heap sort
How it works
- Build a max-heap (or min-heap) first from the given unsorted array using the same idea of
heapify
- In each iteration, swap the maximum with the last element, then run
heapify
again to restore the heap over the remaining unsorted elements (excluding the elements already shifted to the right)
- in effect, pop the maximum, then heapify the rest of the elements
- Once the heap size reduces to 1, all elements are sorted
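A sketch of both phases under those steps, with a sift-down heapify (names are my own):

    def heapify(nums, n, root):
        # sift the value at `root` down until its subtree is a max-heap
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < n and nums[left] > nums[largest]:
            largest = left
        if right < n and nums[right] > nums[largest]:
            largest = right
        if largest != root:
            nums[root], nums[largest] = nums[largest], nums[root]
            heapify(nums, n, largest)

    def heap_sort(nums):
        n = len(nums)
        for i in range(n // 2 - 1, -1, -1):  # only non-leaf nodes need heapifying
            heapify(nums, n, i)
        for end in range(n - 1, 0, -1):
            nums[0], nums[end] = nums[end], nums[0]  # move the max to the sorted tail
            heapify(nums, end, 0)                    # restore the heap on the rest
        return nums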
Complexity
- Time
O(n log n)
- it takes
O(log n)
time, once per element, to maintain the heap property by swapping elements down the tree as necessary
- Space
O(1)
- The algorithm swaps elements within the original array, so no extra memory is needed to store intermediate results
Pros and Cons
Pros
- Efficiency & Consistency
- a guaranteed time complexity of
O(n log n)
even in the worst case
- Memory usage is less: it doesn’t require any extra space to sort an array
O(1)
Cons
- unstable sort: not a stable sorting algorithm
- Expensive constant factors
- In the case of Heapsort vs. Quicksort, it turns out that there are ways (median of 5, for example) to make Quicksort’s worst cases very rare indeed
- Also, maintaining a heap is not free.
- Given an array with a normal distribution, Quicksort and Heapsort will both run in
O(n log n)
. But Quicksort will execute faster because its constant factors are smaller than the constant factors for Heapsort
- partitioning is faster than maintaining the heap
- Huge datasets
- If your dataset is really huge and doesn’t fit into memory, then merge sort works like a charm. It’s frequently used in clusters where dataset can span over hundreds of machines
Demonstration Part 1 - Make unsorted array a max-heap
For an array with 9 elements, only the first 4 elements need to be heapified in order to get a max-heap.
Only the first 4 elements are needed because heapify works downward, so processing them takes care of the whole heap.
An example to illustrate this:
          -> 8
            /   \
      -> 27      14 <-
         /  \    /  \
   -> 18     9  36    55
      /  \
    41    21
[ 8, 27, 14, 18, 9, 36, 55, 41, 21 ]
  *  *   *   *
If there are 10 elements in the array:
          -> 8
            /   \
      -> 27      14 <-
         /  \    /  \
   -> 18  -> 9  36    55
      /  \   /
    41   21 8
[ 8, 27, 14, 18, 9, 36, 55, 41, 21, 8 ]
  *  *   *   *   *
Now, start the work. The input for the following demonstration:
   -> 7
     /   \
 -> 1     2
    / \
   8   4
[7, 1, 2, 8, 4]
 *  *
the first 2 elements will need to be heapified
1st loop: swap value 8 with value 1 (heapify value 1)
      7
    /   \
-> 1     2
   / \
  8   4
      7
    /   \
   8     2
  / \
 1   4
the heapify is downward, so it stops here
2nd loop: swap 7 with 8 (heapify value 7)
   -> 7
     /   \
    8     2
   / \
  1   4
   -> 8
     /   \
    7     2
   / \
  1   4
max-heap is done
Demonstration Part 2 - Sort the max-heap by shifting maximum to the right in each iteration
max-heap
      8
    /   \
   7     2
  / \
 1   4
[8, 7, 2, 1, 4]
1st loop
shift index 0 with the last one (swap value 8 with value 4)
      4
    /   \
   7     2
  / \
 1   8
[4, 7, 2, 1, 8]
do heapify for indices 0, 1, 2, 3 (excluding value 8), starting with index 0
   -> 4
     /   \
    7     2
   /
  1
[4, 7, 2, 1, 8]
swap value 4 and value 7
       7
     /   \
 -> 4     2
    /
   1
[7, 4, 2, 1, 8]
do heapify, start from value 4
       7
     /   \
 -> 4     2
    /
   1
[7, 4, 2, 1, 8]
done (4 > 1, do nothing)
2nd loop
swap value 7 with value 1
      1
    /   \
   4     2
  /
 7
[1, 4, 2, 7, 8]
do heapify for indices 0, 1, 2 (excluding value 7), starting from value 1
   -> 1
     /   \
    4     2
[1, 4, 2, 7, 8]
swap value 1 and value 4
       4
     /   \
 -> 1     2
[4, 1, 2, 7, 8]
3rd loop
swap value 2 with value 4
   2
  / \
 1   4
[2, 1, 4, 7, 8]
do heapify for indices 0, 1 (excluding value 4), starting from value 2
   -> 2
     /
    1
[2, 1, 4, 7, 8]
done (2 > 1, do nothing)
4th loop
swap value 1 and value 2
   1
  /
 2
[1, 2, 4, 7, 8]
done (no need to do heapify)
Counting sort
How it works
The key idea behind counting sort is to count the occurrences of each element in the input array,
and use that information to determine the correct position of each element in the output array.
This makes counting sort very efficient for data sets with a relatively small range of values,
since it avoids the need for expensive comparisons between elements.
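A short sketch of the variant walked through in the demonstration below: count occurrences over the value range, take prefix sums, then fill the output from right to left so the sort stays stable (names are my own; assumes a non-empty list of integers):

    def counting_sort(nums):
        lo, hi = min(nums), max(nums)
        counts = [0] * (hi - lo + 1)       # one slot per value in the range
        for v in nums:
            counts[v - lo] += 1
        for k in range(1, len(counts)):    # sum up values with predecessor values
            counts[k] += counts[k - 1]
        out = [0] * len(nums)
        for v in reversed(nums):           # right-to-left keeps equal keys stable
            counts[v - lo] -= 1
            out[counts[v - lo]] = v
        return out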
Complexity
- Time
O(n + k)
- Space
O(n + k)
- n is the number of elements to be sorted and k is the range of values in the input array
Pros and Cons
Pros
- Linear time complexity
- This makes it very fast for data sets with a relatively small range of values.
- No comparisons needed
- Unlike many other sorting algorithms, counting sort does not rely on comparisons between elements, making it more efficient for certain types of data
- Stable sorting
- Easy to implement
Cons
- Memory usage
- Counting sort requires additional memory to store the counts of each element, as well as to store the sorted output array.
- Not an in-place sort
- may not be suitable for use in situations where memory is limited
- Inefficient for large range of values to be sorted
- Counting sort is only suitable for sorting data sets with a relatively small range of values. If the range of values is too large, the memory requirements of counting sort can become wasteful.
- It’s designed to work with integers (not suitable for decimals)
Demonstration
input
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
min = 18, max = 25, size of count array = 8 (25-18+1)
Create a count array
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Fill the count array with the number of occurrences
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 1 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Sum up values with predecessor values
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
Create a sorted array with the same length as origin input
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Now, start to fill the sorted array
To make sure this algorithm performs stable sorting, start with the last element of the input, 21
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Find the value 3 in the count array via the value 21 from the input array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 0 | 0 |
-----------------------
Fill the sorted array with 21 at index 2, which comes from the count array value 3 minus 1.
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
____/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Update the value in the count array to 2 (3 - 1)
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
____/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
The next value is 25. Find its value in the count array using the same step
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Find the index 3 of the sorted array via the value in the count array (4 minus 1)
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 4 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
______________________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 0 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\___________________________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
______________________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 25 |
-----------------------
Next, the value 21
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\_________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 2 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
__________/
/
index 0 1 2 3
-----------------------
value | 0 | 0 | 21 | 25 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
\_________
\
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
__________/
/
index 0 1 2 3
-----------------------
value | 0 | 21 | 21 | 25 |
-----------------------
Lastly, the last value 18
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
|
|
index 0 1 2 3
-----------------------
value | 0 | 21 | 21 | 25 |
-----------------------
Update the sorted array and count array
index 0 1 2 3
-----------------------
value | 18 | 21 | 25 | 21 |
-----------------------
|
|
index 0 1 2 3 4 5 6 7
-----------------------------------------------
value | 0 | 1 | 1 | 1 | 3 | 3 | 3 | 3 |
-----------------------------------------------
(18) (19) (20) (21) (22) (23) (24) (25)
|
|
index 0 1 2 3
-----------------------
value | 18 | 21 | 21 | 25 |
-----------------------
Now, all the elements are sorted
index 0 1 2 3
-----------------------
value | 18 | 21 | 21 | 25 |
-----------------------
end
Binary search
How it works
Find a specific value in a sorted array by repeatedly dividing the search interval in half.
It works by comparing the target value to the middle element of the array.
If the target value is less than the middle element, the search continues in the lower half of the array.
If the target value is greater than the middle element, the search continues in the upper half of the array.
The algorithm continues until the target value is found or the search interval is empty.
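An iterative sketch of that loop; it assumes the input list is sorted in ascending order and returns the index of the target, or -1 (names are my own):

    def binary_search(nums, target):
        lo, hi = 0, len(nums) - 1
        while lo <= hi:                    # search interval is nums[lo..hi]
            mid = (lo + hi) // 2
            if nums[mid] == target:
                return mid
            if target < nums[mid]:
                hi = mid - 1               # continue in the lower half
            else:
                lo = mid + 1               # continue in the upper half
        return -1                          # interval is empty: not found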
Complexity
Time: O(log n)
Pros and Cons
Pros
- very fast compared to linear search
- can be implemented both iteratively and recursively
- can be applied to any sorted array or list
Cons
- requires the input array or list to be sorted
- can only be applied to one-dimensional sorted arrays or lists
- not suitable for dynamic arrays or lists because inserting and deleting elements will destroy the sorted order and thus make the binary search useless
Demonstration
a sorted list, try to find 13
1 3 4 5 13 20 25 40 42 44 53
find the middle, 20, and compare with it
1 3 4 5 13 20 25 40 42 44 53
           *
because 13 < 20, then look at the left half
1 3 4 5 13 20 25 40 42 44 53
|        |
find the middle of the left half, 4, then look at its right half as 13 > 4
1 3 4 5 13 20 25 40 42 44 53
|   *    |
find the middle, 5, then look at its right half as 13 > 5
1 3 4 5 13 20 25 40 42 44 53
      * |
find the node 13
1 3 4 5 13 20 25 40 42 44 53
        *
end
Traversal
BFS (Breadth First Search)
tree
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
traversal
9 4 20 1 6 15 170
time complexity is O(n)
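A queue-based sketch of this level-by-level traversal; the TreeNode class is a minimal assumption for the trees pictured here:

    from collections import deque

    class TreeNode:
        def __init__(self, val, left=None, right=None):
            self.val, self.left, self.right = val, left, right

    def bfs(root):
        order, queue = [], deque([root] if root else [])
        while queue:
            node = queue.popleft()         # visit the whole level before going deeper
            order.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        return order                       # [9, 4, 20, 1, 6, 15, 170] for the tree above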
Pros
- If a solution exists, BFS will definitely find it
- easy to find the shortest path between two nodes because BFS visits nodes level-by-level, and it will find the destination node as soon as it is encountered
- Never gets trapped in unwanted branches that do not contain the solution
- useful for finding the maximum or minimum value in a BST
Cons
- BFS uses more memory than DFS because it needs to keep track of all nodes at the same level before moving on to deeper levels (e.g. queue)
- Slower for deep trees (it visits all nodes at the same level before moving on to the next level, so deeply placed targets are reached late)
use cases
- find BT nodes
- search engines use BFS to build their index
- GPS navigation
- Network problems: BFS can be used to find the shortest path in a network. For example, it can be used to find the shortest path between two nodes in a computer network, or to find the shortest path between two nodes in a social network.
- Map problems: BFS can be used to find the shortest path in a map.
- recommendation engine
- on Amazon: what types of items are related or have the closest relation to the last book I bought
- on FB: what types of friend requests should be recommended to me
DFS (Depth First Search)
tree
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
traversal
9 4 1 6 20 15 170
The time complexity is O(n) regardless of the traversal method used
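A recursive preorder sketch that produces the traversal above (TreeNode as in the BFS sketch):

    def dfs(node, order=None):
        if order is None:
            order = []
        if node:
            order.append(node.val)         # root first, then the left and right subtrees
            dfs(node.left, order)
            dfs(node.right, order)
        return order                       # [9, 4, 1, 6, 20, 15, 170] for the tree above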
Pros
- Memory efficiency: DFS uses a stack to store nodes to be processed, which only requires a limited amount of memory compared to BFS, which uses a queue and requires more memory.
- Faster for deep trees because it can quickly reach the bottom-most level of a tree.
- Faster in finding a solution: DFS can backtrack and quickly eliminate branches that do not contain a solution especially if the solution is located deep in the graph. For example, consider a problem where you need to find a specific node in a large graph. With DFS, you can start from a node and follow its edges to explore deeper into the graph. If you find the solution, you can stop the search and return the result.
Cons
- Can get stuck in an infinite loop: If there are cycles in the graph, DFS may get stuck in an infinite loop, whereas BFS is able to avoid this issue.
- Cannot guarantee finding a solution quickly. For example, if we search for a Starbucks from my location, we will likely find one soon, but if we search for a museum, it will take more time and there is no guarantee of finding one
- Cannot find the minimal solution if two solutions are available
- Slower for shallow trees
use cases
- If we perform DFS on an unweighted graph, the traversal tree it creates is a minimum spanning tree (and an all-pairs shortest-path tree)
- Using DFS, we can find a path between two given vertices u and v
- topological sorting: Topological sorting is a linear ordering of the vertices in a directed acyclic graph (DAG) (e.g. A->B->C->D)
- scheduling problems: can be used to schedule jobs from given dependencies among jobs
- solving puzzles with only one solution, such as a maze or a sudoku puzzle.
- analysing networks e.g. testing if a graph is bipartite (a graph where vertices can be partitioned into two disjoint sets, with no edges connecting vertices in the same set. It’s used in matching, scheduling, and network flow problems and can be easily tested for by coloring vertices with two colors and checking if no two adjacent vertices have the same color.)
- on LinkedIn: if I have a connection with someone, I can use DFS to find what degree of connection I have with that person
Why is BFS more effective in finding the shortest path compared to DFS?
- BFS is good at finding the shortest path in a graph or tree because it explores all the nodes at a given level before moving on to the next level. This means that it discovers all the nodes at distance 1 from the starting node before discovering any nodes at distance 2, and so on.
- Because BFS explores all nodes at a given distance before moving on to the next distance, it is guaranteed to find the shortest path from the starting node to any other node in the graph or tree
- DFS does not guarantee to find the shortest path, as it explores the nodes along a single path until it reaches a dead end before backtracking and exploring other paths.
inorder vs preorder vs postorder
- preorder (root, left, right)
- If you know you need to explore the roots before inspecting any leaves, you pick pre-order because you will encounter all the roots before all of the leaves.
- inorder (left, root, right)
- If you know that the tree has an inherent sequence in the nodes, and you want to flatten the tree back into its original sequence, then an in-order traversal should be used.
- postorder (left, right, root)
- If you know you need to explore all the leaves before any nodes, you select post-order because you don’t waste any time inspecting roots in search for leaves.
Example 1
       9
     /   \
    4     20
   / \   /  \
  1   6 15  170
preorder
9 4 1 6 20 15 170
inorder
1 4 6 9 15 20 170
postorder
1 6 4 15 170 20 9
Example 2
                  50
                /    \
              /        \
            30          80
           /  \        /  \
          1    47     70   100
              /  \      \    / \
            40    73    78 99   126
           /  \        /  \     /
         38    45    75    79 118
preorder
50 30 1 47 40 38 45 73 80 70 78 75 79 100 99 126 118
inorder
1 30 38 40 45 47 73 50 70 75 78 79 80 99 100 118 126
postorder
1 38 45 40 73 47 30 75 79 78 70 99 118 126 100 80 50
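The three orders differ only in where the root is visited relative to its subtrees; a sketch (TreeNode as above):

    def preorder(node):                    # root, left, right
        return [node.val] + preorder(node.left) + preorder(node.right) if node else []

    def inorder(node):                     # left, root, right
        return inorder(node.left) + [node.val] + inorder(node.right) if node else []

    def postorder(node):                   # left, right, root
        return postorder(node.left) + postorder(node.right) + [node.val] if node else []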
Why does BFS not have inorder, preorder, and postorder implementations, while DFS does?
- BFS visits nodes in a level-by-level manner, which makes it difficult to visit the nodes in a specific order.
- DFS has inorder, preorder, and postorder implementations because it can be used to traverse the tree and visit the nodes in a specific order by visiting the children of a node before visiting the node itself.
What are the pros and cons of a recursive solution?
Pros
- reduce time complexity (if recursion is used with memoisation)
- cleaner code
- Recursion is better at tree traversal
- One of the more efficient ways to traverse these trees when looking for a specific leaf (or node) is to recursively follow a single branch to its end, until you find the value you are looking for.
Cons
- use more memory (might cause stack overflow)
- Because the function has to add to the stack with each recursive call and keep the values there until the call is finished
- can be slow (generally, compared to iteration)
- it requires the allocation of a new stack frame
- the fibonacci question is a good example: iteration is much faster than naive recursion, because iteration saves the value of each calculation for further use
- more difficult to understand and debug
Other sorting algorithms
- Tree sort
- Steps
- Create a binary search tree from an array
- Do an in-order traversal
- Radix sort
- Time
O(nk)
- Space
O(n+k)
- used to sort integers or strings
shortest path (of a weighted graph)
Leetcode techniques
Cumulative Sum
A cumulative sum is a sequence of partial sums of a given sequence. For an array of numbers, the cumulative sum at a certain index is the sum of all elements up to that index.
If nums is [a, b, c, d], the cumulative sum array would be [a, a+b, a+b+c, a+b+c+d].
For example, the original input:
[1, 2, 3, 4]
cumulative sum array:
[1, 3, 6, 10]
Once you have the cumulative sum array, you can quickly calculate the sum of any subarray from index i to j.
For example
- what’s the sum of indices 1 to 3? Ans: cumulativeSum[3] - cumulativeSum[0], which equals 10 - 1 = 9
- what’s the sum of indices 2 to 3? Ans: cumulativeSum[3] - cumulativeSum[1], which equals 10 - 3 = 7
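A tiny sketch of building the cumulative sum array and answering range-sum queries in O(1); the i > 0 guard covers ranges that start at index 0 (names are my own):

    from itertools import accumulate

    def make_query(nums):
        prefix = list(accumulate(nums))    # [1, 2, 3, 4] -> [1, 3, 6, 10]
        def query(i, j):                   # sum of nums[i..j], inclusive
            return prefix[j] - (prefix[i - 1] if i > 0 else 0)
        return query

    query = make_query([1, 2, 3, 4])
    print(query(1, 3))                     # 10 - 1 = 9
    print(query(2, 3))                     # 10 - 3 = 7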
Terms
Dynamic Programming
- Aim to find the best solution among a set of possible solutions
- It is mainly used when the problem can be broken down into overlapping subproblems, and the optimal solutions of these subproblems can be used iteratively to solve the overall problem
- to solve each subproblem only once, rather than repeatedly
- = Divide & Conquer + Memoisation
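A classic illustration of "solve each subproblem only once": Fibonacci with memoisation (a sketch; lru_cache is Python's built-in memoiser):

    from functools import lru_cache

    @lru_cache(maxsize=None)               # memoisation: each fib(n) is computed only once
    def fib(n):
        return n if n < 2 else fib(n - 1) + fib(n - 2)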
Functional Programming
- uses pure functions to create maintainable software
- uses immutable data and avoids concepts like shared states (opposite to OOP)
- focuses on the results, not the process
- make code easier to understand, test, and debug
- independent and reusable
Interview
Top 6 Coding Interview Concepts
- Heaps
- min-heaps
- max-heaps
- K closest points to origin
- network delay time
- min cost to connect all points
- sliding window
- Best time to buy/sell a stock
- Binary search
- Guess number higher or lower
- search a 2-D matrix
- binary search
- DFS & BFS
- usually
O(V+E)
, time & space complexity
- number of islands
- Recursion
- includes Trees, Graphs, Backtracking, DP
- N-queens
- Hash maps
The 10 Most Important Concepts For Coding Interviews
- logarithm
- DFS/BFS
- binary search
- sliding window
- recursion
- inverting a binary tree and reverting a linked list
- suffix trees
- heaps
- dynamic programming
- sorting algorithms