Arrays Flashcards
How to make a read-only BST from an array?
Just sort it you dummy (probably w/ quicksort, since it sorts in place and saves memory). You don’t need an extra data structure for a read-only BST: keep the sorted array and use index arithmetic (i.e. binary search) to look things up.
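A minimal sketch of that idea, assuming a plain JS array of numbers sorted ascending (binarySearch is just an illustrative name):
// Returns the index of target in the sorted array A, or -1 if it's absent.
// The midpoint plays the role of a BST root; going "left" or "right" just
// narrows the index range instead of following child pointers.
function binarySearch(A, target) {
  let lo = 0, hi = A.length - 1
  while (lo <= hi) {
    const mid = lo + Math.floor((hi - lo) / 2)
    if (A[mid] === target) return mid
    else if (A[mid] < target) lo = mid + 1
    else hi = mid - 1
  }
  return -1
}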
what is regarded by many as the go-to sorting algorithm for large enough inputs?
a randomized version of quicksort
The 2 ways to randomize quicksort?
by randomizing the input (e.g. permute by sorting, which however takes nlogn time if using comparison sort), OR better yet, by randomized sampling (e.g. choosing random index for the pivot element)
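A minimal sketch of the first approach (permute by sorting), assuming a plain JS array; permuteBySorting is an illustrative name, and the n lg n cost comes from the comparison sort on the random keys:
// Tag each element with a random key, sort by the keys, and strip the tags.
// With keys drawn from Math.random() they are all distinct with high probability,
// which makes every permutation of A equally likely.
function permuteBySorting(A) {
  return A
    .map(x => ({ x, key: Math.random() }))
    .sort((a, b) => a.key - b.key)
    .map(pair => pair.x)
}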
Quick sort
Quicksort(A, p, r) {
  if (r - p <= 0) return // range has one element or fewer and thus is already sorted
  let q = partition(A, p, r)
  Quicksort(A, p, q - 1)
  Quicksort(A, q + 1, r)
}

partition(A, p, r) {
  let i = p - 1
  for (let j = p; j < r; j++) {
    // if the element currently being processed, x = A[j], is less than the pivot A[r],
    // expand the "less than" section from [p, i] to [p, i+1] so that it now overlaps one
    // element y (which is >= A[r]) of the "greater or equal" section, then swap x and y;
    // after the loop's increment the "greater or equal" section becomes (i+1, j+1) and
    // A[j] holds y, which used to sit at A[i+1]
    if (A[j] < A[r]) swap(A, ++i, j)
  }
  swap(A, i + 1, r) // move the pivot into its final sorted position
  return i + 1
}
// At the start of each iteration of the loop: [p, i] is the "less than" section,
// (i, j) is the ">=" section, [j, r) is the unprocessed section, and [r, r] is the pivot element.
swap(A, a, b) {
  let x = A[a]
  A[a] = A[b]
  A[b] = x
}
How to randomize quicksort with random sampling?
RandomizedPartition(A,p,r){
i = Math.floor(Math.random() * (r-p+1)) + p
swap(A, i, r)
return partition (A, p, r)
}
RandomizedQuicksort(A, p, r) {
  if (r - p <= 0) return // one element or fewer: already sorted
  let q = RandomizedPartition(A, p, r)
  RandomizedQuicksort(A, p, q - 1)
  RandomizedQuicksort(A, q + 1, r)
}
RandomizedQuicksort from scratch?
RandomizedPartition(A, p, r) {
  // pick a random index in [p, r] and move that element into the pivot slot r
  let k = Math.floor(Math.random() * (r - p + 1)) + p
  swap(A, k, r)
  let i = p - 1
  for (let j = p; j < r; j++) {
    if (A[j] < A[r]) {
      swap(A, ++i, j)
    }
  }
  swap(A, i + 1, r)
  return i + 1 // final index of the pivot
}
Proof of randomized select being linear for expected time
high level version: Even though quicksort's expected time is n lg n, randomized select is expected O(n) because each call recurses into only one side of the partition, not both sides.
low level version: the pivot is equally likely to land at any rank, and pessimistically assume the recursion always continues into the larger side. Then E[T(n)] <= (2/n) * sum over k from floor(n/2) to n-1 of E[T(k)] + O(n). Guess E[T(n)] <= cn and substitute: the sum is at most (2/n) * c * (3n^2/8) = 3cn/4, so E[T(n)] <= 3cn/4 + an <= cn whenever c >= 4a. Hence expected time is O(n).
Fundamental differences between nlogn sorting, linear sorting, and linear selection
Comparison sorts take Theta(n lg n) time because of the Omega(n lg n) lower bound in the comparison model; linear-time sorts (counting, radix, bucket) achieve linearity by making assumptions about the input; linear-time selection achieves linearity by avoiding sorting altogether.
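As a concrete example of "linearity by assuming things about the input", here is a simplified (non-stable) counting sort sketch, assuming the elements of A are integers in the known range [0, k]; countingSort and k are illustrative names, not from the cards above:
// Runs in O(n + k) time by tallying keys instead of comparing elements.
function countingSort(A, k) {
  const count = new Array(k + 1).fill(0)
  for (const x of A) count[x]++                       // count occurrences of each key
  const out = []
  for (let v = 0; v <= k; v++) {
    for (let c = 0; c < count[v]; c++) out.push(v)    // emit each key count[v] times
  }
  return out
}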
randomizedSelect
randomizedSelect(A, p, r, i) { // select the ith smallest element of the range A[p..r], where i is 1-based within the range (as opposed to 0-based)
  if (p == r) return A[p] // only one element left, so it must be the answer
  let q = randomizedPartition(A, p, r)
  let k = q - p + 1 // k = # of elements in the low partition plus the pivot, so the pivot is the kth smallest element of the range [p, q]
  // the pivot at position q is already in its correct sorted spot in the entire array,
  // so if i happens to be k we already know what the ith element is: A[q]
  if (i == k) return A[q]
  if (i < k) return randomizedSelect(A, p, q - 1, i)
  return randomizedSelect(A, q + 1, r, i - k) // we've already eliminated the first k elements as candidates
}
Partition, but with partition element as an input parameter
Partition(A, p, r, x) { // used in the worst-case linear time selection algorithm
  // find the chosen pivot value x and swap it into the pivot slot r, then partition as usual
  for (let j = p; j <= r; j++) {
    if (A[j] == x) { swap(A, j, r); break }
  }
  let i = p - 1
  for (let j = p; j < r; j++) {
    if (A[j] < A[r]) swap(A, ++i, j)
  }
  swap(A, i + 1, r)
  return i + 1
}
Select the ith element in worst case linear time
- Split the array into floor(n/5) groups of size 5 (plus perhaps one leftover group with fewer than 5 elements)
- Find the median of each of the groups, by first insertion sorting each group, and then selecting the median
- Use SELECT to recursively find the median of the ceil(n/5) medians. By convention if there are an even number of medians (meaning we could have two median of medians), we will choose the lower median of medians
- Partition the input array around the median of medians by using the modified version of partition that can take an element to partition around as a partition element
- Let k be one more than the number of elements on the low side of the partition, so that x is at the kth position and there are n - k elements on the high side. If i = k, return x, because Partition (particularly the swap after the for loop) guarantees that the partition element ends up in its sorted spot. Otherwise use SELECT recursively to find the ith smallest element on the low side if i < k, or the (i - k)th smallest element on the high side if i > k lol. (A JS sketch of the whole procedure follows below.)
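A minimal JS sketch of the five steps above, written with a 1-based rank i. It uses Array.sort for the tiny 5-element groups instead of hand-rolled insertion sort, and for clarity it partitions into fresh arrays rather than in place around A[p..r]; select is an illustrative name and the structure follows the steps above, not any particular library:
// Returns the ith smallest element (1-based) of A in worst-case O(n) time.
function select(A, i) {
  if (A.length <= 5) {
    return A.slice().sort((a, b) => a - b)[i - 1] // small base case: just sort it
  }
  // Steps 1-2: split into groups of 5 and take each group's (lower) median
  const medians = []
  for (let g = 0; g < A.length; g += 5) {
    const group = A.slice(g, g + 5).sort((a, b) => a - b)
    medians.push(group[Math.floor((group.length - 1) / 2)])
  }
  // Step 3: recursively find the (lower) median of the medians
  const x = select(medians, Math.ceil(medians.length / 2))
  // Step 4: partition A around x
  const low = A.filter(v => v < x)
  const high = A.filter(v => v > x)
  const numEqual = A.length - low.length - high.length // copies of x itself
  // Step 5: the answer is either x or found recursively on one side
  if (i <= low.length) return select(low, i)
  if (i <= low.length + numEqual) return x
  return select(high, i - low.length - numEqual)
}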
Select in worst case linear time
High level:
Like randomized select, the algorithm SELECT finds the desired element by recursively partitioning the input array. Here, however, we guarantee a good split upon partitioning the array.
Proof of time for worst case linear select
Steps 1, 2, and 4 each take O(n) time.
Step 3 takes T(ceil(n/5)) time.
Step 5: to analyze the running time of SELECT, we must first determine a lower bound on the # of elements that are greater than the partitioning element x. At least half of the ceil(n/5) groups (excluding perhaps the last group with < 5 elements and the group containing x itself) have medians >= x and therefore each contribute >= 3 elements greater than x. So at least 3 * (1/2 * ceil(n/5) - 2) >= 3n/10 - 6 elements are greater than x (and symmetrically for elements less than x). Thus in the worst case step 5 calls SELECT recursively on at most 7n/10 + 6 elements (though I think it should be 7n/10 + 5 bc you should exclude x, but who cares).
Assume T(n) = O(1) when n < 140 (a bounded n is constant time). Otherwise T(n) <= T(ceil(n/5)) + T(7n/10 + 6) + O(n).
We'll show T(n) = O(n) by substitution: guess T(n) <= cn for a suitably large constant c. Then T(n) <= c*ceil(n/5) + c(7n/10 + 6) + an <= cn/5 + c + 7cn/10 + 6c + an = 9cn/10 + 7c + an = cn + (-cn/10 + 7c + an). This is at most cn whenever -cn/10 + 7c + an <= 0, i.e. c >= 10a * n/(n - 70) for n > 70. Since n >= 140 gives n/(n - 70) <= 2, choosing c >= 20a works, so T(n) = O(n).
Java ArrayStack
A Stack implemented with an ArrayList, pushing and popping at the end of the list. All stack operations run in O(1) amortized time.
Dynamic array
is a random access, variable-size list data structure that allows elements to be added or removed.
Dynamic array is aka
growable array, resizable array, dynamic table, mutable array, or array list
Dynamic arrays work how?
When the allocated array is full (or nearly full) and you insert a new item, allocate a new, larger array (usually 2x the size, or more generally larger by a growth factor a), copy the existing values to the start of the new array, and keep inserting into it.
Dynamic array runtime complexity for insert/delete at end
O(1) amortized. Inserting n elements takes O(n) total time: each resize must copy the old values, but because the capacity grows geometrically the total copying cost across all n appends is O(n), and every non-resizing append is a single constant-time write.
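A minimal sketch of the append path, assuming a growth factor of 2 and a logical size tracked separately from the capacity of the backing array (DynArray and its field names are illustrative):
class DynArray {
  constructor() {
    this.capacity = 1
    this.size = 0
    this.data = new Array(this.capacity)
  }
  push(x) {
    if (this.size === this.capacity) {
      // full: allocate a 2x larger backing array and copy the old values over
      this.capacity *= 2
      const bigger = new Array(this.capacity)
      for (let i = 0; i < this.size; i++) bigger[i] = this.data[i]
      this.data = bigger
    }
    this.data[this.size++] = x // O(1) when no resize is needed
  }
}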
Downsides of dynamic array vs linked lists
1)dynamic arrays require linear time to insert or delete at an arbitrary location, since all following elements must be moved, while linked lists can do this in constant time 2) Also, in a highly fragmented memory region, it may be expensive or impossible to find contiguous space for a large dynamic array, whereas linked lists do not require the whole data structure to be stored contiguously.
How to mitigate dynamic arrays requiring linear time to do insertion/deletion at an arbitrary location
By using the gap buffer or tiered vector variants of dynamic arrays
Array deque
A dynamic array that can grow from both ends. It has all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends instead of just one end.
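One common way to get amortized O(1) at both ends is a circular buffer over the backing array. A minimal sketch, with removals omitted since they mirror the adds (ArrayDeque, head, etc. are illustrative names, not a specific library's API):
class ArrayDeque {
  constructor() {
    this.capacity = 8
    this.data = new Array(this.capacity)
    this.head = 0 // index of the first element
    this.size = 0
  }
  get(i) { return this.data[(this.head + i) % this.capacity] } // O(1) random access
  addFirst(x) {
    if (this.size === this.capacity) this.resize()
    this.head = (this.head - 1 + this.capacity) % this.capacity
    this.data[this.head] = x
    this.size++
  }
  addLast(x) {
    if (this.size === this.capacity) this.resize()
    this.data[(this.head + this.size) % this.capacity] = x
    this.size++
  }
  resize() {
    // copy into a 2x larger array, un-rotating so head goes back to 0
    const bigger = new Array(this.capacity * 2)
    for (let i = 0; i < this.size; i++) bigger[i] = this.get(i)
    this.data = bigger
    this.capacity *= 2
    this.head = 0
  }
}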
What do dynamic arrays do as logical size decreases?
They deallocate some of the underlying storage if the logical size drops below a certain threshold, such as 30% of the capacity. This threshold must be strictly smaller than 1/a in order to provide hysteresis (a stable band that avoids repeatedly growing and shrinking) and to support mixed sequences of insertions and removals with amortized constant cost.
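Continuing the DynArray sketch from a few cards up, a pop method illustrating the shrink rule, assuming growth factor a = 2 and a shrink threshold of 1/4 (strictly below 1/a = 1/2, which is what creates the hysteresis band):
// Method to add to the DynArray sketch above; assumes its size/capacity/data fields.
pop() {
  const x = this.data[--this.size]
  // Shrink only when usage falls to 1/4 of capacity (not 1/2), so alternating
  // push/pop right at a boundary cannot force a full copy on every operation.
  if (this.capacity > 1 && this.size <= this.capacity / 4) {
    this.capacity = Math.floor(this.capacity / 2)
    const smaller = new Array(this.capacity)
    for (let i = 0; i < this.size; i++) smaller[i] = this.data[i]
    this.data = smaller
  }
  return x
}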
How would you attack the performance of a dynamic array?
If you know the growth factor a and the shrinking threshold, you can park the logical size right at a geometric boundary (e.g. a power of the expansion factor a). The hysteresis band between the grow and shrink triggers has some width r; alternating a run of r deletions with a run of r insertions then forces a full O(n) resize every r operations, so each operation in the sequence costs O(n/r) amortized (with no hysteresis at all, r = 1 and every operation degrades to O(n)).
HAT
Hashed Array Tree
Hashed array tree
is a dynamic array data-structure maintaining an array of separate memory fragments (or “leaves”) to store the data elements, unlike simple dynamic arrays which maintain their data in one contiguous memory area. Its primary objective is to reduce the amount of element copying due to automatic array resizing operations, and to improve memory usage patterns.
Asymptotic complexity of key operations and attributes for a hash array tree
Indexing: O(1)
Insert/delete at beginning: O(n)
Insert/delete at end: O(1) amortized
Insert/delete in middle: O(n)
Wasted space (average): O(n^0.5)
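A minimal sketch of just the HAT indexing arithmetic, assuming each leaf holds a power-of-two number of elements so an index splits into a leaf number (high bits) and a slot within that leaf (low bits). In a real HAT the leaf size and the top-level array grow together (both stay around sqrt(n)); this sketch fixes the leaf size to keep the indexing front and center. HashedArrayTree and leafPower are illustrative names:
class HashedArrayTree {
  constructor(leafPower = 4) {           // each leaf holds 2^leafPower elements
    this.leafPower = leafPower
    this.leafSize = 1 << leafPower
    this.leaves = []                     // top-level array of separate leaf arrays
    this.size = 0
  }
  get(i) {
    const leaf = i >> this.leafPower         // high bits pick the leaf
    const slot = i & (this.leafSize - 1)     // low bits pick the slot inside it
    return this.leaves[leaf][slot]
  }
  push(x) {
    const leaf = this.size >> this.leafPower
    if (leaf === this.leaves.length) this.leaves.push(new Array(this.leafSize)) // grow one leaf at a time
    this.leaves[leaf][this.size & (this.leafSize - 1)] = x
    this.size++
  }
}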
Quicksort: at the start of each iteration of the partition loop, what do p, i, j, r all demarcate?
[p, i] = all processed elements < A[r]
(i, j) = all processed elements >= A[r]
[j, r) = unprocessed elements
[r, r] = the pivot element
Insertion sort implementation
InsertionSort(A) {
  for (let i = 1; i < A.length; i++) {
    let key = A[i]
    let j = i - 1
    // shift elements of the sorted prefix A[0..i-1] that are greater than key one slot to the right
    while (j >= 0 && A[j] > key) {
      A[j + 1] = A[j]
      j--
    }
    A[j + 1] = key // drop key into the hole
  }
}
In quicksort at the start of each iteration of the for loop in partition, what is the range of the < A[r] section?
[p, i]
in quicksort at the start of each iteration in the partition for loop before the increment, what is the range of the >= A[r] section?
(i, j)
in each iteration of the quicksort partition for loop, what element is being processed?
A[j]
JS Math.random range
[0,1)
In quicksort at the start of each iteration of the partition loop, what is the range of the unprocessed section?
[j, r) (the pivot itself occupies [r, r])