Sorting Algorithms Flashcards
Selection Sort
- Find the smallest element in the array and swap it with the first element.
- Find the smallest element in the remaining unsorted part and swap it with the first element of that unsorted part.
- Repeat until the unsorted part is empty; the array is then sorted.
Time Complexity is O(n²) as there are two nested loops. Selection sort is not stable.
// One by one move boundary of unsorted subarray
for (int i = 0; i < n - 1; i++) {
    // Find the minimum element in the unsorted part
    int min_idx = i;
    for (int j = i + 1; j < n; j++)
        if (arr[j] < arr[min_idx])
            min_idx = j;

    // Swap the found minimum element with the first
    // element of the unsorted part
    int temp = arr[min_idx];
    arr[min_idx] = arr[i];
    arr[i] = temp;
}
Stable Sorting Algorithm Definition
A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in sorted output as they appear in the input array to be sorted.
For example, sorting the array [CC, AA, BZ, DD, BM] by first letter with a "stable" sorting algorithm gives [AA, BZ, BM, CC, DD]. As you can see, BZ and BM keep their relative order from the input. An unstable sorting algorithm might swap them:
[AA, BM, BZ, CC, DD]
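One way to see this concretely: sort the example strings by their first letter only, so BZ and BM compare equal. Java's Arrays.sort on objects is stable (it uses TimSort), so the input order of equal keys is preserved. A minimal sketch:

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableDemo {
    public static void main(String[] args) {
        String[] arr = {"CC", "AA", "BZ", "DD", "BM"};
        // Compare by first character only, so "BZ" and "BM" are equal keys.
        // Arrays.sort on objects is stable, so BZ stays before BM.
        Arrays.sort(arr, Comparator.comparingInt((String s) -> s.charAt(0)));
        System.out.println(Arrays.toString(arr)); // [AA, BZ, BM, CC, DD]
    }
}
```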
Insertion Sort
You compare each element with the elements before it. While the previous element is greater than the current element, you shift that (previous) element one position to the right. When you reach an element that is not greater, you insert the current element there.
Time Complexity is O(n²) as there are two nested loops. Insertion sort is stable. Insertion sort is faster than quicksort only on small arrays, because quicksort pays the overhead of the recursion stack it creates.
for (int j = 1; j < arr.length; j++) {
    int key = arr[j];
    int i = j - 1;
    // Shift elements greater than key one position to the right
    while (i >= 0 && arr[i] > key) {
        arr[i + 1] = arr[i];
        i -= 1;
    }
    arr[i + 1] = key;
}
Bubble Sort
Bubble sort is a comparison-based algorithm that compares every pair of elements in an array and swaps them if they are in the wrong order.
Time Complexity is O(n²) as it repeatedly passes through the unsorted part of the list. Bubble sort is stable.
int n = arr.length;
for (int i = 0; i < n - 1; i++)
    for (int j = 0; j < n - i - 1; j++)
        if (arr[j] > arr[j + 1]) {
            // swap arr[j] and arr[j+1]
            int temp = arr[j];
            arr[j] = arr[j + 1];
            arr[j + 1] = temp;
        }
Quick Sort
- Average case: O(n log n).
- Often the fastest sorting algorithm in practice, but not stable.
- Worst case: O(n²) if a bad pivot is chosen (always the leftmost or rightmost element).
- For example, on the input 9, 8, 7, 6, 1 with 1 as the pivot, every partition is maximally unbalanced, which results in O(n²).
- The solution is to select a random pivot every time, or the median.
static void swap(int[] arr, int i, int j)
{
    int temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

static int partition(int[] arr, int low, int high)
{
    int pivot = arr[high];
    int i = (low - 1);
    for (int j = low; j <= high - 1; j++) {
        if (arr[j] < pivot) {
            i++;
            swap(arr, i, j);
        }
    }
    swap(arr, i + 1, high);
    return (i + 1);
}

/* The main function that implements QuickSort
   arr[] --> Array to be sorted,
   low   --> Starting index,
   high  --> Ending index */
static void quickSort(int[] arr, int low, int high)
{
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}
Random Quick Sort
static int partition(int[] arr, int low, int high)
{
    int pivot = arr[high];
    int i = (low - 1); // index of smaller element
    for (int j = low; j < high; j++) {
        // If current element is smaller than the pivot
        if (arr[j] < pivot) {
            i++;
            // swap arr[i] and arr[j]
            int temp = arr[i];
            arr[i] = arr[j];
            arr[j] = temp;
        }
    }
    // swap arr[i+1] and arr[high] (the pivot)
    int temp = arr[i + 1];
    arr[i + 1] = arr[high];
    arr[high] = temp;
    return i + 1;
}

// Picks a random pivot index between low (inclusive) and
// high (inclusive), moves it to the end, then partitions.
// quickSort calls this instead of partition directly.
static int randomPartition(int[] arr, int low, int high)
{
    Random rand = new Random();
    int pivot = low + rand.nextInt(high - low + 1);
    int temp = arr[pivot];
    arr[pivot] = arr[high];
    arr[high] = temp;
    return partition(arr, low, high);
}
Heap Sort Core Ideas
- Build a max heap from the input data.
- At this point, the maximum element is stored at the root of the heap. Swap it with the last item of the heap, then reduce the size of the heap by 1. Finally, heapify the root of the tree.
- Repeat step 2 while the size of the heap is greater than 1.
Note: The heapify procedure can only be applied to a node if its children nodes are heapified. So the heapification must be performed in the bottom-up order.
Applications of HeapSort:
Heapsort is mainly used in hybrid algorithms like the IntroSort.
- Sort a nearly sorted (or K sorted) array
- k largest(or smallest) elements in an array
- The heap sort algorithm has limited uses because quicksort and merge sort are better in practice. Nevertheless, the heap data structure itself is widely used.
Its typical implementation is not stable.
O(n log n)
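For the "k largest (or smallest) elements" application above, a common approach keeps a min-heap of size k, here sketched with Java's PriorityQueue (the method name `kLargest` is made up for illustration):

```java
import java.util.PriorityQueue;

public class KLargest {
    // Returns the k largest values (in ascending heap-poll order).
    // Runs in O(n log k): each of the n elements is pushed into a
    // heap that never grows beyond k + 1 entries.
    static int[] kLargest(int[] arr, int k) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>(); // min-heap
        for (int x : arr) {
            minHeap.offer(x);
            if (minHeap.size() > k)
                minHeap.poll(); // drop the smallest of the k + 1
        }
        int[] result = new int[minHeap.size()];
        for (int i = 0; i < result.length; i++)
            result[i] = minHeap.poll();
        return result;
    }

    public static void main(String[] args) {
        int[] top = kLargest(new int[]{7, 4, 9, 1, 8, 3}, 3);
        System.out.println(java.util.Arrays.toString(top)); // [7, 8, 9]
    }
}
```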
Heap Sort Implementation
public void sort(int[] arr)
{
    int N = arr.length;

    // Build heap (rearrange array)
    for (int i = N / 2 - 1; i >= 0; i--)
        heapify(arr, N, i);

    // One by one extract an element from the heap
    for (int i = N - 1; i > 0; i--) {
        // Move current root to end
        int temp = arr[0];
        arr[0] = arr[i];
        arr[i] = temp;

        // call max heapify on the reduced heap
        heapify(arr, i, 0);
    }
}

// To heapify a subtree rooted with node i, which is
// an index in arr[]. N is the size of the heap
void heapify(int[] arr, int N, int i)
{
    int largest = i;   // Initialize largest as root
    int l = 2 * i + 1; // left = 2*i + 1
    int r = 2 * i + 2; // right = 2*i + 2

    // If left child is larger than root
    if (l < N && arr[l] > arr[largest])
        largest = l;

    // If right child is larger than largest so far
    if (r < N && arr[r] > arr[largest])
        largest = r;

    // If largest is not root
    if (largest != i) {
        int swap = arr[i];
        arr[i] = arr[largest];
        arr[largest] = swap;

        // Recursively heapify the affected subtree
        heapify(arr, N, largest);
    }
}
Merge Sort Core Ideas
Divide the array into smaller subarrays, sort each subarray, then merge the sorted subarrays back together to form the final sorted array.
Time Complexity = O(n log n), stable sort.
Popular for sorting large datasets because it is efficient.
MergeSort(arr[], l, r)
if (r > l)
    1. Find the middle point to divide the array:
       m = l + (r - l) / 2
    2. Call mergeSort for the first half:
       mergeSort(arr, l, m)
    3. Call mergeSort for the second half:
       mergeSort(arr, m + 1, r)
    4. Merge the two halves sorted in steps 2 and 3:
       merge(arr, l, m, r)
Merge Sort Implementation
// Merges two subarrays of arr[].
// First subarray is arr[l..m]
// Second subarray is arr[m+1..r]
void merge(int arr[], int l, int m, int r)
{
// Find sizes of two subarrays to be merged
int n1 = m - l + 1;
int n2 = r - m;
/* Create temp arrays */
int L[] = new int[n1];
int R[] = new int[n2];

/* Copy data to temp arrays */
for (int i = 0; i < n1; ++i)
    L[i] = arr[l + i];
for (int j = 0; j < n2; ++j)
    R[j] = arr[m + 1 + j];

/* Merge the temp arrays */

// Initial indexes of first and second subarrays
int i = 0, j = 0;

// Initial index of merged subarray
int k = l;
while (i < n1 && j < n2) {
    if (L[i] <= R[j]) {
        arr[k] = L[i];
        i++;
    }
    else {
        arr[k] = R[j];
        j++;
    }
    k++;
}

/* Copy remaining elements of L[] if any */
while (i < n1) {
    arr[k] = L[i];
    i++;
    k++;
}

/* Copy remaining elements of R[] if any */
while (j < n2) {
    arr[k] = R[j];
    j++;
    k++;
}
}

// Main function that sorts arr[l..r] using merge()
void sort(int arr[], int l, int r)
{
    if (l < r) {
        // Find the middle point
        int m = l + (r - l) / 2;

        // Sort first and second halves
        sort(arr, l, m);
        sort(arr, m + 1, r);

        // Merge the sorted halves
        merge(arr, l, m, r);
    }
}
Count Sort (With Negative Values)
Time Complexity: O(N + K) where N is the number of elements in the input array and K is the range of input.
Auxiliary Space: O(N + K). As implemented here (building the output in reverse order), counting sort is stable.
Counting sort is efficient if the range of input data is not significantly greater than the number of objects to be sorted.
static void countSort(int[] arr)
{
int max = arr.Max();
int min = arr.Min();
int range = max - min + 1;
int[] count = new int[range];
int[] output = new int[arr.Length];
for (int i = 0; i < arr.Length; i++) {
count[arr[i] - min]++;
}
for (int i = 1; i < count.Length; i++) {
count[i] += count[i - 1];
}
for (int i = arr.Length - 1; i >= 0; i--) {
output[count[arr[i] - min] - 1] = arr[i];
count[arr[i] - min]--;
}
for (int i = 0; i < arr.Length; i++) {
arr[i] = output[i];
}
}
Count Sort (Positive Values)
static void countsort(char[] arr)
{
int n = arr.Length;
// The output character array that will have sorted arr
char[] output = new char[n];

// Create a count array to store count of individual
// characters and initialize count array as 0
int[] count = new int[256];
for (int i = 0; i < 256; ++i)
    count[i] = 0;

// store count of each character
for (int i = 0; i < n; ++i)
    ++count[arr[i]];

// Change count[i] so that count[i] now contains actual
// position of this character in the output array
for (int i = 1; i <= 255; ++i)
    count[i] += count[i - 1];

// Build the output character array
// To make it stable we are operating in reverse order.
for (int i = n - 1; i >= 0; i--) {
    output[count[arr[i]] - 1] = arr[i];
    --count[arr[i]];
}

// Copy the output array to arr, so that arr now
// contains sorted characters
for (int i = 0; i < n; ++i)
    arr[i] = output[i];
}
Radix Sort Time Complexity
Radix sort takes O(ℓ∗(n+k)) time and O(n+k) space, where n is the number of items to sort, ℓ is the number of digits in each item, and k is the number of values each digit can have.
This time complexity comes from the fact that we’re calling counting sort one time for each of the ℓ digits in the input numbers, and counting sort has a time complexity of O(n+k).
The space complexity also comes from counting sort, which requires O(n+k) space to hold the counts, indices, and output arrays.
In many implementations, including ours, we assume that the input consists of 64-bit integers. This means that the number of digits, ℓ is a constant (64). With a binary number, each digit can either be a zero or a one, meaning that k is also a constant (2). With these assumptions, radix sort becomes O(n) time and O(n) space.
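That idea can be sketched for non-negative Java ints with one bit per pass: each pass is a stable counting sort with k = 2 buckets (the class and method names here are illustrative, not the implementation referenced above):

```java
public class BinaryRadix {
    // LSD radix sort on non-negative ints, one bit per pass.
    // Each pass stably partitions by the current bit (zeros first),
    // so the whole sort is O(31 * n) = O(n) time, O(n) space.
    static void sort(int[] arr) {
        int[] output = new int[arr.length];
        for (int bit = 0; bit < 31; bit++) { // bit 31 is the sign bit
            int zeros = 0;
            for (int x : arr)
                if (((x >> bit) & 1) == 0) zeros++;
            int zi = 0, oi = zeros; // write positions for 0s and 1s
            for (int x : arr) {     // stable: keeps input order within a bucket
                if (((x >> bit) & 1) == 0) output[zi++] = x;
                else output[oi++] = x;
            }
            System.arraycopy(output, 0, arr, 0, arr.length);
        }
    }

    public static void main(String[] args) {
        int[] a = {170, 45, 75, 90, 802, 24, 2, 66};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
        // [2, 24, 45, 66, 75, 90, 170, 802]
    }
}
```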
Radix Sort Implementation
// A function to do counting sort of arr[] according to
// the digit represented by exp.
public static void countSort(int[] arr, int n, int exp)
{
int[] output = new int[n]; // output array
int i;
int[] count = new int[10];
// initializing all elements of count to 0
for (i = 0; i < 10; i++)
    count[i] = 0;

// Store count of occurrences in count[]
for (i = 0; i < n; i++)
    count[(arr[i] / exp) % 10]++;

// Change count[i] so that count[i] now contains
// the actual position of this digit in output[]
for (i = 1; i < 10; i++)
    count[i] += count[i - 1];

// Build the output array
for (i = n - 1; i >= 0; i--) {
    output[count[(arr[i] / exp) % 10] - 1] = arr[i];
    count[(arr[i] / exp) % 10]--;
}

// Copy the output array to arr[], so that arr[] now
// contains numbers sorted according to the current digit
for (i = 0; i < n; i++)
    arr[i] = output[i];
}

// A utility function to get the maximum value in arr[]
public static int getMax(int[] arr, int n)
{
    int mx = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > mx)
            mx = arr[i];
    return mx;
}

// The main function that sorts arr[] of size n using Radix Sort
public static void radixsort(int[] arr, int n)
{
    // Find the maximum number to know the number of digits
    int m = getMax(arr, n);

    // Do counting sort for every digit. Note that instead
    // of passing the digit number, exp is passed.
    // exp is 10^i where i is the current digit number
    for (int exp = 1; m / exp > 0; exp *= 10)
        countSort(arr, n, exp);
}
Radix Sort Use Cases
It’s great when you have a large set of data with keys that are somehow constrained. For example, to sort an array of one million 64-bit numbers, you can sort by the 8 least significant bits, then by the next 8, and so on (8 passes in total). That way the array can be sorted in about 8 × 1M operations, rather than 1M × log₂(1M) ≈ 20 × 1M.
An example is when creating a “suffix array” using the skew DC3 algorithm (Kärkkäinen-Sanders-Burkhardt). The algorithm is only linear-time if the sorting algorithm is linear-time, and radix sort is necessary and useful here because the keys are short by construction (3-tuples of integers).
Radix sort takes O(k·n) time. But you have to ask what k is. k is the “number of digits” (a bit simplistic, but basically something like that).
So, how many digits do you have? Quick answer: more than log(n) (log using the “digit size” as its base), which makes the radix algorithm O(n log n) in that framing.
Why is that? If you have fewer than log(n) digits, then you have fewer than n possible values, so you could simply use counting sort, which takes O(n) time (just count how many of each value you have). So the interesting case is when the number of digits k exceeds log(n).