Coding University - Data Structures and Algorithms Flashcards
What is ZeroMQ?
A socket-based system, can be used as a queue, pub/sub, etc.
Carries messages across inproc, IPC, TCP, TIPC, multicast.
Smart patterns like pub-sub, push-pull (pipeline), and router-dealer.
What is ActiveMQ?
Apache ActiveMQ is an open source message broker written in Java.
What is MessagePack?
MessagePack is an efficient binary serialization format. It lets you exchange data among multiple languages like JSON. But it’s faster and smaller. Small integers are encoded into a single byte, and typical short strings require only one extra byte in addition to the strings themselves.
No IDL.
What is Avro?
Apache Avro is a data serialization system. IDL-based.
Rich data structures.
A compact, fast, binary data format.
A container file, to store persistent data.
Remote procedure call (RPC).
Code generation is not required to read or write data files nor to use or implement RPC protocols. Code generation is an optional optimization, only worth implementing for statically typed languages.
What is a Bloom filter?
A Bloom filter is a data structure used to quickly test membership in a set where the number and size of possible elements would be very large. Too large to keep in memory.
A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not; thus a Bloom filter has a 100% recall rate. In other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed (though this can be addressed with a "counting" filter). The more elements that are added to the set, the larger the probability of false positives.
How can you easily generate multiple hashes for the same element?
Double hashing. This method gives you as many hashes as you need:
hash_i(x) = (hash_a(x) + i * hash_b(x)) mod m
In Python:
import mmh3
mmh3.hash64('foo')  # two 64-bit signed ints, in a tuple
Now you have two 64-bit hashes. Substituting successive values of i into the formula yields as many derived hashes as a Bloom filter needs.
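The double-hashing trick above can be sketched as a small Bloom filter. This is a minimal illustration, not a production implementation; the two base hashes are derived here from an md5 digest (an assumption made so the example needs only the standard library, where the text uses mmh3.hash64):

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch using the double-hashing trick."""

    def __init__(self, m, k):
        self.m = m      # number of bits in the filter
        self.k = k      # number of derived hash functions
        self.bits = 0   # bit array, stored as a Python int

    def _hashes(self, item):
        # Split one md5 digest into two 64-bit base hashes (assumption:
        # md5 stands in for mmh3.hash64 from the text).
        digest = hashlib.md5(item.encode()).digest()
        h_a = int.from_bytes(digest[:8], "little")
        h_b = int.from_bytes(digest[8:], "little")
        # hash_i(x) = (hash_a(x) + i * hash_b(x)) mod m
        return [(h_a + i * h_b) % self.m for i in range(self.k)]

    def add(self, item):
        for h in self._hashes(item):
            self.bits |= 1 << h

    def might_contain(self, item):
        # True means "possibly in set"; False means "definitely not in set".
        return all(self.bits >> h & 1 for h in self._hashes(item))
```

Usage: after `bf = BloomFilter(1024, 3); bf.add("foo")`, `bf.might_contain("foo")` is always True (100% recall), while absent items return False except with small false-positive probability.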
What is a cache-oblivious algorithm?
A cache-oblivious algorithm does not mean that the algorithm does not take advantage of the cache; to the contrary, it does so quite effectively. What it means is that the algorithm does not need to know the cache line size; it works effectively for all cache line sizes simultaneously, removing the need to tune or optimize for a given machine.
Optimal cache-oblivious algorithms are known for the Cooley–Tukey FFT algorithm, matrix multiplication, sorting, matrix transposition, and several other problems.
How can you augment a splay tree so you can find how many items are between x and y?
Store size of subtrees at each node.
Find x, splay to root. Each splay, insert, and delete must maintain size in node.
Find y, and along the way add up the sizes in the left subtrees, and 1 for each visited left-hand node.
Splay y to root to ensure balance.
In a maximum flow problem, what is the minimum cut?
The minimum cut is the cut of smallest total capacity separating the source from the sink. By the max-flow min-cut theorem, its capacity equals the maximum flow through the graph.
What is the Ford-Fulkerson algorithm?
The Ford–Fulkerson method or Ford–Fulkerson algorithm (FFA) is a greedy algorithm that computes the maximum flow in a flow network. It is called a "method" instead of an "algorithm" because the approach to finding augmenting paths in a residual graph is not fully specified, or is specified in several implementations with different running times. The name "Ford–Fulkerson" is often also used for the Edmonds–Karp algorithm, which is a specialization of Ford–Fulkerson.
What is the running time for the disjoint set data structure?
Due to merging smaller disjoint sets into larger ones (called union by rank) during union, and performing path compression during find, the amortized time per operation is only O(alpha(n)), where alpha(n) is the inverse of the extremely fast-growing Ackermann function. Since the Ackermann function grows so quickly, alpha(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.
The worst-case for find() is Theta(log u) where u is the number of unions, and no finds have been done to allow for path compression yet.
What Python flag turns on optimizations and removes assertions from code?
python -O
Why is doing work in a constructor a bad thing?
It can make your code harder to test.
What should be avoided to ensure testing is easier/possible?
- static methods and properties
- final keyword
- use of new in methods (use dependency injection)
What are some guidelines to keep in mind to not violate the dependency inversion principle?
- No variable should have a concrete class type. An abstract type is better.
- No class should derive from a concrete class.
- No method should override an implemented method of any of its base classes.
These are guidelines and may not be feasible all the time.
What is separate chaining?
In hash table conflict resolution, each bucket is independent and has some sort of linked list of entries with the same index. The time for hash table operations is the time to find the bucket (which is constant) plus the time for the list operation.
In a good hash table, each bucket has zero or one entries, and sometimes two or three, but rarely more than that. Therefore, structures that are efficient in time and space for these cases are preferred. Structures that are efficient for a fairly large number of entries per bucket are not needed or desirable. If these cases happen often, the hashing function needs to be fixed.
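The separate-chaining scheme can be sketched as follows (a minimal illustration; the class and method names are made up for the example, and Python lists stand in for the per-bucket linked lists):

```python
class ChainedHashTable:
    """Sketch of separate chaining: each bucket holds a list of
    (key, value) pairs whose keys hashed to the same index."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # Constant-time bucket lookup; list operations come on top.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: append to the chain

    def get(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        raise KeyError(key)
```

With a good hash function and a sensible load factor, each chain stays short, so `get` and `put` are expected constant time.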
What is open addressing?
In hash table conflict resolution, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found. When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table. The name "open addressing" refers to the fact that the location ("address") of the item is not determined by its hash value. (This method is also called closed hashing; it should not be confused with "open hashing" or "closed addressing", which usually mean separate chaining.)
What is the length of the longest chain in a hash table using separate chaining?
The expected chain length is alpha, the load factor n/m, so a search takes O(1 + alpha) expected time.
Since uniform hashing is difficult to achieve in practice, what is a great alternative?
double hashing
How can you test if a number is odd in bitwise operations?
return (x & 1)
How can you test if a number is even in bitwise operations?
return (x & 1) == 0
What is another name for a breadth-first search traversal?
Level-order traversal.
What is a 2-3-4 tree?
2–3–4 tree (also called a 2–4 tree) is a self-balancing data structure that is commonly used to implement dictionaries. The numbers mean a tree where every node with children (internal node) has either two, three, or four child nodes:
- 2-node has one data element, and if internal has two child nodes;
- 3-node has two data elements, and if internal has three child nodes;
- 4-node has three data elements, and if internal has four child nodes.
What is the complexity of all operations on a splay tree?
O(log n) on average.
A single operation Theta(n) in the worst case.
What is the maximum height of a red-black tree?
2 log n
In a b-tree, how many children are there per node?
A node with k keys has k + 1 children.
root: 1 to 2t-1 keys
non-root: t-1 to 2t-1 keys, i.e. t to 2t children
t could be up to 100, or more.
Leaves are all at the same level.
What does the max degree of a b-tree depend on?
The number of items being stored, and page size based on disk characteristics.
A b-tree’s data is organized to correspond with what?
Pages on disk.
Give an example of how a b-tree might be organized.
1024 children per node.
Store root in memory.
3 nodes accessed gets us 1024^3 disk pages.
4 nodes accessed gets us 1024^4 disk pages.
On descending a b-tree, what’s the rule?
Never step into a minimal node.
On insertion in a b-tree, what’s the rule?
Never step into a full node.
How many nodes are in a compressed trie with k leaves (big-O)?
O(k) nodes with k leaves due to compression.
What is a suffix tree?
A suffix tree is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.
The construction of such a tree for the string S takes time and space linear in the length of S. Once constructed, several operations can be performed quickly, for instance locating a substring in S, locating a substring if a certain number of mistakes are allowed, locating matches for a regular expression pattern etc. Suffix trees also provide one of the first linear-time solutions for the longest common substring problem. These speedups come at a cost: storing a string’s suffix tree typically requires significantly more space than storing the string itself.
In brief, how does selection sort work?
On each pass, find the minimum item in the unsorted portion and swap it into the leftmost unsorted position.
When can insertion sort run in n log n time?
Load the items into a balanced binary search tree (n inserts at O(log n) each), then do an inorder traversal.
How can you speed up selection sort with a heap?
Replace the unsorted portion with a min-heap. Gives O(log n) removal. Makes n log n overall.
What data structure is well suited for a heap sort and which is bad?
Array - good
Linked list - clumsy
What data structure is well suited for a merge sort and which is just okay?
Linked list - a natural
Array - just okay; the standard merge step cannot easily be done in place.
How can you optimize finding a pivot when the segment to pivot is large (not random choice)?
Choose a median of three.
What is counting sort?
Counting sort is an algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct key value, and using arithmetic on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum and minimum key values, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. However, it is often used as a subroutine in another sorting algorithm, radix sort, that can handle larger keys more efficiently.
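The counting idea above can be sketched in a few lines (a minimal version that emits keys directly rather than computing output positions, which is enough when sorting bare integer keys):

```python
def counting_sort(items):
    """Sketch of counting sort for integer keys.

    Runs in O(n + q), where q is the range of key values.
    """
    if not items:
        return []
    lo, hi = min(items), max(items)
    counts = [0] * (hi - lo + 1)
    for x in items:
        counts[x - lo] += 1        # tally each distinct key
    out = []
    for key, count in enumerate(counts):
        out.extend([key + lo] * count)
    return out
```

The full position-computing variant (prefix sums over `counts`) is what makes counting sort stable and usable as the per-digit pass inside radix sort.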
What is radix sort?
Radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value.
Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts. LSD radix sorts process the integer representations starting from the least digit and move towards the most significant digit. MSD radix sorts work the other way around.
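An LSD radix sort can be sketched as repeated stable bucketing by digit, least significant first (a minimal sketch for non-negative integers; bucket lists stand in for the stable counting-sort pass):

```python
def radix_sort(items, base=10):
    """Sketch of an LSD radix sort for non-negative integers."""
    if not items:
        return []
    items = list(items)
    max_val = max(items)
    exp = 1
    while max_val // exp > 0:
        # One stable pass: distribute by the current digit, then collect.
        buckets = [[] for _ in range(base)]
        for x in items:
            buckets[(x // exp) % base].append(x)
        items = [x for bucket in buckets for x in bucket]
        exp *= base
    return items
```

Because each pass is stable, ties on the current digit preserve the order established by previous (less significant) passes, which is what makes the overall sort correct.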
What is the counting sort running time?
O(q + n) where q is the range of key values. If q is in O(n), then linear time.
What radix is most natural to use?
A power of 2 radix.
How would radix sort work for IEEE floating point numbers?
Flip all bits of negative numbers and flip only the sign bit of non-negative numbers, sort the results as unsigned integers, then invert the transformation.
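The bit-flipping trick can be sketched as a sort key (a minimal sketch; it maps each IEEE-754 double to an unsigned integer whose natural ordering matches the float ordering, so any integer sort, radix included, can then be used):

```python
import struct

def float_sort_key(x):
    """Map a double to an unsigned 64-bit int that preserves ordering."""
    # Reinterpret the float's bytes as a big-endian unsigned integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    if bits & (1 << 63):
        # Negative: flip every bit, so more-negative values map lower.
        return bits ^ 0xFFFFFFFFFFFFFFFF
    # Non-negative: flip only the sign bit, lifting positives above
    # all (flipped) negatives.
    return bits | (1 << 63)
```

Usage: `sorted(values, key=float_sort_key)` orders floats correctly; in a real radix sort the same mapping would be applied before the digit passes and inverted afterwards.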
How to choose q for radix sort?
Choose q within a power of 2 of n, which keeps the number of passes small. A good rule is the largest power of 2 not exceeding n.
To save memory, round sqrt(n) down to a power of 2 instead, at the cost of roughly twice as many passes.
What operations are a treap optimized for?
- union
- intersection
- difference
What is the Day–Stout–Warren (DSW) algorithm?
The Day–Stout–Warren (DSW) algorithm is a method for efficiently balancing binary search trees — that is, decreasing their height to O(log n) nodes, where n is the total number of nodes. Unlike a self-balancing binary search tree, it does not do this incrementally during each operation, but periodically, so that its cost can be amortized over many operations.
What is the insertion sort algorithm?
for (i = 0; i < n; ++i) {
    j = i;
    while (j > 0 && a[j - 1] > a[j]) {
        swap(a, j, j - 1);
        j -= 1;
    }
}
Is radix sort stable?
yes
What is the algorithmic time complexity of radix sort?
O(d(n + b)), where d is the number of digits, n the number of items, and b the base. Effectively linear when d is a constant.
Give the code for selection sort.
for (i = 0; i < n; ++i) {
    min_index = i;
    for (j = i; j < n; ++j) {
        if (a[j] < a[min_index]) {
            min_index = j;
        }
    }
    swap(a, i, min_index);
}
All comparison-based sorting is bounded by what complexity?
Omega(n log n)
What do you call a linear ordering of a directed graph of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering?
Topological sort
What is a good method for performing a topological sort?
- Calculate in-degree for each node. O(v + e)
- Go through 0s, add to queue.
- For each item in queue, look at each connection, and decrement in-degree of each, if they got to 0, add to queue, repeat.
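The steps above can be sketched in Python (the adjacency-dict representation, mapping each node to its outgoing neighbors, is an assumption for the example):

```python
from collections import deque

def topological_sort(graph):
    """Kahn's algorithm sketch: repeatedly emit in-degree-0 nodes."""
    in_degree = {node: 0 for node in graph}
    for node in graph:
        for neighbor in graph[node]:
            in_degree[neighbor] += 1

    # Seed the queue with every node that has no incoming edges.
    queue = deque(node for node, deg in in_degree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order
```

If the queue empties before every node is emitted, some in-degree never reached 0, which means the graph contains a cycle.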
How many possible trees are there that span all nodes in a graph?
Up to n^(n-2) in a complete graph on n vertices (Cayley's formula); exponentially many in general.
What is Prim’s algorithm?
def prim(self):
    """Returns a dictionary of parents of vertices in a minimum spanning tree.

    :rtype: dict
    """
    s = set()
    q = queue.PriorityQueue()
    parents = {}
    start_weight = float("inf")
    weights = {}  # since we can't peek into the queue

    for i in self.get_vertex():
        weight = start_weight
        if i == 0:
            q.put([0, i])
        weights[i] = weight
        parents[i] = None

    while not q.empty():
        v_tuple = q.get()
        vertex = v_tuple[1]
        s.add(vertex)
        for u in self.get_neighbor(vertex):
            if u.vertex not in s:
                if u.weight < weights[u.vertex]:
                    parents[u.vertex] = vertex
                    weights[u.vertex] = u.weight
                    q.put([u.weight, u.vertex])

    return parents
What is the time complexity of Prim’s algorithm on an adjacency matrix?
O(v^2)
What is the time complexity of Prim’s algorithm on an adjacency list and a binary heap?
O(e log v)
derived from:
O((e + v) log v)
What is the time complexity of Prim’s algorithm on an adjacency list and a Fibonacci heap?
O(e + v log v)
What is the pseudocode Kruskal’s algorithm?
KRUSKAL(G):
    A = ∅
    foreach v ∈ G.V:
        MAKE-SET(v)
    foreach (u, v) ∈ G.E ordered by weight(u, v), increasing:
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A
What is the time complexity of Kruskal’s algorithm?
O(E log V)
or
O(e log e + e α(v) + v)
What is Kruskal’s algorithm?
Kruskal’s algorithm is a minimum-spanning-tree algorithm which finds an edge of the least possible weight that connects any two trees in the forest. It is a greedy algorithm in graph theory as it finds a minimum spanning tree for a connected weighted graph adding increasing cost arcs at each step. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
How can you find the number of connected components?
For each node:
if node not yet visited, increment component count and do DFS.
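The counting loop above can be sketched with an iterative DFS (the adjacency-dict representation of an undirected graph is an assumption for the example):

```python
def count_components(graph):
    """Count connected components of an undirected graph."""
    visited = set()
    components = 0
    for start in graph:
        if start in visited:
            continue
        components += 1      # a new, unvisited node starts a new component
        stack = [start]
        while stack:         # DFS marks everything reachable from start
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            stack.extend(graph[node])
    return components
```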
How can you get a topological sort with DFS?
Do a DFS, and when each node is being marked as complete, add node to a list.
Reverse the list.
How can you check for a cycle with DFS?
For each neighbor of the current node:
- if the neighbor is not yet visited, recurse with DFS
- if the neighbor is already visited and is not the current node's parent, there is a cycle
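That check can be sketched in Python for an undirected graph (the adjacency-dict representation is an assumption for the example):

```python
def has_cycle(graph):
    """DFS cycle detection in an undirected graph: a visited neighbor
    that is not the node we just came from closes a cycle."""
    visited = set()

    def dfs(node, parent):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                if dfs(neighbor, node):
                    return True
            elif neighbor != parent:   # back edge: cycle found
                return True
        return False

    # Check every component, not just the first one reached.
    return any(dfs(n, None) for n in graph if n not in visited)
```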
How can you get the strongly connected components of a graph?
- DFS - calculate the finish times for each node
- Reverse the edges in the graph
- Call DFS on nodes in reverse graph in reverse order of finishing times.
How do you reverse the edges in a directed graph represented as an adjacency matrix?
Transpose the matrix, so [i, j] becomes [j, i]
How can you find the shortest path on a DAG?
- Topological sort
2. follow the topological sort, relaxing edges
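The relaxation step can be sketched as follows (a minimal single-source version; the representation, mapping each node to a list of `(neighbor, weight)` pairs plus a precomputed topological order, is an assumption for the example):

```python
def dag_shortest_paths(graph, source, order):
    """Single-source shortest paths on a DAG.

    `order` is a topological order of the nodes (e.g. from Kahn's
    algorithm); processing nodes in that order means every edge is
    relaxed after its tail's distance is final.
    """
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    for node in order:
        if dist[node] == float("inf"):
            continue                    # unreachable so far; skip
        for neighbor, weight in graph[node]:
            if dist[node] + weight < dist[neighbor]:
                dist[neighbor] = dist[node] + weight  # relax the edge
    return dist
```

Because each edge is relaxed exactly once, the whole computation is O(V + E), and negative edge weights are handled for free, which is what makes the negate-and-relax trick for longest paths work.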
How to find the longest path on a weighted DAG?
- Set all edges to their negative weight.
- Topological sort
- follow the topological sort, relaxing edges
What is the diameter of a graph?
The greatest distance between any pair of vertices; that is, the length of the longest shortest path. To find the diameter of a graph, first find the shortest path between each pair of vertices. The greatest length of any of these paths is the diameter of the graph.
Under what condition can you not use Dijkstra's algorithm?
When the graph contains a negative edge weight. Dijkstra's greedy choice assumes that a finalized vertex's distance can never improve, which fails once negative edges exist; with a negative cycle, shortest paths are not even well defined.
In plain words, how does Kruskal’s algorithm work?
- Create a set T and list for result
- Make a list of all edges in G
- Sort edges by weight, from least to greatest.
- Iterate edges in sorted order.
- For each edge (u, v), if u and v are not already in the same set (checked with FIND), UNION them and add the edge to the result list.
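The steps above can be sketched with a small union-find (a minimal sketch; vertices are assumed to be numbered 0..num_vertices-1 and edges given as `(weight, u, v)` tuples, both assumptions made for the example):

```python
def kruskal(num_vertices, edges):
    """Kruskal's algorithm sketch with a union-find forest."""
    parent = list(range(num_vertices))

    def find(x):
        # Path halving: point each visited node at its grandparent.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):      # edges by increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # different trees: safe to join
            parent[root_u] = root_v
            mst.append((weight, u, v))
    return mst
```

Sorting the edges dominates, giving the O(E log E) = O(E log V) bound from the previous card.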
What can most dynamic programming problems be expressed as?
Finding the shortest path in a DAG. Formulating it this way often lets you solve the problem in time linear (or near-linear) in the size of the DAG.
What metric can you use to measure the badness of a line in a text justification problem?
(page width - text width)^3
Minimize the sum of the badness of the lines.
How can you tell if a graph is 2-colorable?
If it’s bipartite. All trees are bipartite.
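A 2-colorability check can be sketched as a BFS coloring (the adjacency-dict representation of an undirected graph is an assumption for the example):

```python
from collections import deque

def is_bipartite(graph):
    """BFS 2-coloring: the graph is 2-colorable iff no edge ever
    connects two nodes of the same color."""
    color = {}
    for start in graph:          # handle every component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in color:
                    color[neighbor] = 1 - color[node]  # opposite color
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return False                       # odd cycle found
    return True
```

A graph fails this check exactly when it contains an odd-length cycle, which is why all trees (no cycles at all) are bipartite.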
What is it called when you have too many base cases in your recursion?
arm’s length recursion
What is the base case of a recursion?
The code required to give the solution to the smallest subproblem.
What is the formula for n choose k?
n! / (k! (n - k)!)
What is the general outline of a backtracking algorithm?
def solve(conf):
    if no more choices:
        return conf
    choices = get_available_choices(conf)
    for c in choices:
        make choice c
        if solve(conf using c):
            return True
        unmake choice c
    return False
At how many items should you expect a collision when hashing among n buckets?
At about sqrt(n) items the probability of a collision reaches roughly 1/2 (the birthday paradox).
What is n/n^2?
1/n
What does it mean when a problem is NP-Hard?
It is at least as hard as any problem in NP. A problem X is NP-Hard if every problem Y in NP reduces to X.
Is "3-D matching" NP-Complete?
Yes
Is "triple coloring a graph" NP-Complete?
Yes
Is "two coloring a graph" NP-Complete?
No
Is "subset sum" NP-Complete?
Yes
Is "bin packing" NP-Complete?
Yes
Is "vertex cover" NP-Complete?
Yes
Is "set cover" NP-Complete?
Yes
Name some NP-Complete problems.
- tsp
- knapsack problem
- satisfiability
- 3D matching
- tricoloring
- subset sum
- rectangle packing
- bin packing
- vertex cover
- set cover
What is one way of doing approximate traveling salesman?
Select a vertex as root.
Build a MST.
Do a preorder traversal, store nodes in H.
Return H (a Hamiltonian cycle)
How can an LRU cache be implemented with a linked list?
When an item is accessed, it moves to the head of the list.
The trailing items can be overwritten with new items, or removed.
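The linked-list scheme above can be sketched with `collections.OrderedDict`, which plays the role of the list (move an entry to the head on access, drop from the tail on eviction); this is a minimal sketch, and the class name is made up for the example:

```python
from collections import OrderedDict

class LRUCache:
    """LRU cache sketch: OrderedDict keeps entries in recency order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)        # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
```

A hand-rolled version would pair a doubly linked list (for O(1) move-to-head and tail removal) with a hash map from key to list node, which is exactly what OrderedDict does internally.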
What is a skip list?
A data structure that allows fast search within an ordered sequence of elements. Fast search is made possible by maintaining a linked hierarchy of subsequences, with each successive subsequence skipping over fewer elements than the previous one. Searching starts in the sparsest subsequence until two consecutive elements have been found, one smaller and one larger than or equal to the element searched for.
A skip list is built in layers. The bottom layer is an ordinary ordered linked list. Each higher layer acts as an "express lane" for the lists below, where an element in layer i appears in layer i+1 with some fixed probability p (two commonly used values for p are 1/2 and 1/4).
What operations does a skip list support and what is their avg and worst case times?
search: O(log n) O(n)
insert: O(log n) O(n)
delete: O(log n) O(n)
What operations does a van Emde Boas tree support and what are the time complexities?
All are O(log log M), where M = 2^m is the size of the universe of keys that can be stored; equivalently O(log m), where m is the number of bits per key.
Space: O(M)
Search
Insert
Delete
Predecessor
Successor
What are the complexities for treap operations?
For all the basic maintenance operations, they are O(log n) average case and O(n) worst case.
- Search
- Insert
- Delete
For these operations, O(m log(n/m)) for treaps of sizes m and n, with m ≤ n.
- union
- intersection
- difference