General Data Structure Flashcards

1
Q

Two-Stack Queue

“Dirty Dishes” Analogy

A

Maintain an “In” stack and an “Out” stack.
● To enqueue an element, push it onto the “In” stack.
● To dequeue an element:
● If the “Out” stack is nonempty, pop from it.
● If the “Out” stack is empty, pop elements from the “In” stack, pushing them onto the “Out” stack, until the bottom of the “In” stack is exposed; then pop from the “Out” stack.

● Intuition: we only do expensive dequeues after a long run of cheap enqueues.
● Think “dishwasher”: we very slowly introduce a lot of dirty dishes to get cleaned up all at once.
● Provided we clean up all the dirty dishes at once, and provided that dirty dishes accumulate slowly, this is a fast strategy! (See the Java sketch below.)
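
A minimal Java sketch of this idea (the class and method names are mine, not from the cards):

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** A queue built from two stacks ("In" and "Out"); amortized O(1) per operation. */
    class TwoStackQueue<T> {
        private final Deque<T> in = new ArrayDeque<>();   // receives new elements
        private final Deque<T> out = new ArrayDeque<>();  // serves dequeues

        public void enqueue(T value) {
            in.push(value);                               // cheap: O(1)
        }

        public T dequeue() {
            if (out.isEmpty()) {                          // only pay when "Out" runs dry
                while (!in.isEmpty()) out.push(in.pop()); // reverse "In" into "Out"
            }
            return out.pop();                             // throws if the queue is empty
        }
    }

Each element is pushed at most twice and popped at most twice, which is exactly the amortized argument made in the amortization cards later in this deck.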

2
Q

When to use bloom filters

A

Bloom filters are a great first layer of filtering, since they don’t require much space and can fit in fast storage, like RAM. Consider using them anywhere where knowing if something is definitely not present or possibly present would be helpful.

One common use is to eliminate unnecessary accesses to slower storage / expensive lookups.

For instance, say we wanted to query a large database stored on a rotating hard drive (slow to read from). And suppose the thing we’re querying for has a good chance of not being present at all. Before querying the disk, we could check for the record in a bloom filter; if the bloom filter says the record definitely isn’t present, then we don’t need to touch the slow disk at all.
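
A minimal Java sketch of this pattern (TinyBloomFilter and its two hash functions are illustrative toys, not a production design):

    import java.util.BitSet;

    /** Illustrative Bloom filter: k = 2 hash functions over an m-bit bitmap. */
    class TinyBloomFilter {
        private final BitSet bits;
        private final int m;

        TinyBloomFilter(int m) { this.m = m; this.bits = new BitSet(m); }

        // Two cheap hash functions derived from hashCode(), for illustration only.
        private int h1(String key) { return Math.floorMod(key.hashCode(), m); }
        private int h2(String key) { return Math.floorMod(31 * key.hashCode() + 17, m); }

        void insert(String key) { bits.set(h1(key)); bits.set(h2(key)); }

        /** false => definitely absent; true => only "possibly present". */
        boolean mightContain(String key) { return bits.get(h1(key)) && bits.get(h2(key)); }
    }

The filter then guards the slow lookup: if mightContain(key) returns false, skip the disk read entirely; otherwise fall through to the real query.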

3
Q

Hash Map Operations

A

insert()
delete()
find()

All operations take O(1) expected time

A hash map is a “really big array” whose indices can be anything, not just integers: strings, sequences of integers, etc. It takes a string, object, etc. and converts it into an integer index via a HASH function.

But an arbitrarily “large” hash map is not good/reasonable; the hash value must be mapped down into a table of practical size.
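
A quick Java illustration with java.util.HashMap:

    import java.util.HashMap;
    import java.util.Map;

    class HashMapDemo {
        public static void main(String[] args) {
            // "Indices" can be strings; hashCode() converts each key to an integer bucket.
            Map<String, Integer> ages = new HashMap<>();
            ages.put("alice", 30);                         // insert: O(1) expected
            System.out.println(ages.containsKey("alice")); // find:   O(1) expected -> true
            ages.remove("alice");                          // delete: O(1) expected
        }
    }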

4
Q

Abstract Data Types (ADTs) are:

List
Queue
Stack
Set (or Multiset)
Graph
Tree
Container
A

INTERFACES

  • specifies only interfaces (method names and specs)
  • they do not deal with IMPLEMENTATION (bodies, method definitions, fields, …)
5
Q

Concrete Data Structures are:

LinkedList
Array
Binary Tree
Heaps
Bloom Filter 
Count-Min Sketch
A

CLASSES (e.g. node classes to support linked lists)

An ADT can have multiple implementations by different concrete data structures.

6
Q

Classes: ArrayList, LinkedList

What is the backing storage?

A

ArrayList: array
add(i, val) O(n), add(0, val) O(n), add(n, val) amortized O(1)
get(i) O(1), get(0) O(1), get(n) O(1)

LinkedList: chain of nodes
add(i, val) O(n), add(0, val) O(1), add(n, val) O(1)
get(i) O(n), get(0) O(1), get(n) O(1)

7
Q

What are examples of Concrete Data Structures?

A

Array
LinkedList (singly-linked, doubly-linked)
Tree (binary, general)
Heaps

8
Q

Heaps

Is it a ADT or Concrete Data Structure?

A

Heaps are Concrete Data Structures
A binary tree with properties such that:
- Heap Order Invariant: every element in the tree is greater than or equal to its parent (a min-heap).
- Complete Binary Tree: every level of the tree, except possibly the last, is completely filled, and there are no holes.

Don’t confuse heap memory (Java) with this heap

10
Q

Heap Order Property

A

(again) Every element is greater than or equal to its parent

Completeness: Every level except the last is filled.

Nodes on the bottom are as far left as possible.

11
Q

Know what makes something a complete tree/heap

A

It does not have any holes or missing nodes, and the last row is filled as far left as possible.

12
Q

Heap: KeyMethods

A

add(e): adds a new element to the heap

poll(): deletes the least element and returns it

13
Q

Heap ADD

A

Adds to the “end” of the tree and bubbles it up while it is less than its parent.

This maintains heap invariant.

O(log n): since the tree is balanced.

  • size is exponential as a function of depth
  • depth of tree is logarithmic as a function of size
14
Q

Treap (a randomized binary search tree):

A Binary Search Tree, NOT guaranteed to have height O(log n)

Uses random priorities plus the binary-heap property to maintain balance (in expectation)

A

Similar in purpose to AVL Trees and Red-Black Trees

15
Q

How does the key in a node compare to the keys of its children in . . .
I. . . . a binary search tree?

II. . . . a max heap?

A

I. node.left.key < node.key < node.right.key

II. node.key > node.left.key, node.right.key

16
Q

Suppose that the universe U of possible keys is {0, 1, …, n^2 − 1}. For a hash table of size n, what is the greatest number of distinct keys the table can hold with each of these collision resolution strategies?

I. Chaining

II. Linear probing

III. Quadratic probing

A

I. With chaining, we can hold as many keys as there are in the universe
which, in this case, is n^2.

Common errors were miscounting the size of the universe as (n^2 − 1) or saying that the table could hold an infinite number of elements.

II. Linear probing can only store as many elements as the size of the table, which is n.

III. Quadratic probing (not taught)
Similarly to linear probing, quadratic probing can only store as many elements as the size of the table, which is n.

17
Q

What order should we insert the elements {1, 2, . . . , 7} into an empty AVL tree so that
we don’t have to perform any rotations on it?

A

You should insert in the order {4, 2, 6, 1, 3, 5, 7}. The relative ordering of {2, 6} and of {1, 3, 5, 7} does not matter. One can see the resulting binary search tree is perfectly balanced and therefore an AVL tree. One point is taken off if the student did not explain why the resulting BST is an AVL tree (balanced, i.e. left and right depths differ by 0). A common mistake was giving a max-heap ordering, which is completely different from a BST.

18
Q

Suppose you insert three keys into a hash table with m slots. Assuming the simple
uniform hashing assumption, and given that collisions are resolved by chaining, what is the probability that both slots 0 and 1 are empty?

A
Under SUHA (the simple uniform hashing assumption), each key is independent of the others and has equal probability of being inserted into each of the m slots. So for each key the chance of not being inserted into slot 0 or 1 is (m−2)/m. And because the keys are independent, the total probability is ((m−2)/m)^3.

A common mistake is thinking the probability of not inserting into slot 1 is independent of not inserting into slot 0, which is not true. Given a key did not end up in slot 0, the chance that it will also not end up in slot 1 is (m−2)/(m−1).
19
Q

Explain why, when resolving hash-table collisions via linear probing, one cannot remove an entry from the hash table by resetting the slot to NULL.

A

When one looks up a key in the hash table, the search returns NIL when it sees an empty slot and therefore stops. So if one deletes an entry by resetting its slot to NIL, the probe chain is broken, and one may not be able to find items that were inserted after the key in the deleted slot. Full credit is given for the correct explanation or a correct example.

20
Q

Explain how a tree with n nodes and the property that the heights of the two children of any node differ by at most 2 has O(log n) height.

A

Using the same approach as proving AVL trees have O(log n) height, we say that n_h is the minimum number of elements in such a tree of height h.

n_h ≥ 1 + n_{h−1} + n_{h−3}   (1)
n_h > 2·n_{h−3}               (2)
n_h > 2^{h/3}                 (3)
h < 3 lg(n_h)                 (4)
h = O(log n)                  (5)

Simply stating “if the heights of every node’s two children differ by at most some constant c, the tree will have height O(log n)” is true, but we’re looking for why exactly. Some got to the right conclusion with an alternate method but had some logical flaws. A common mistake was providing a counterexample where the height was greater than log n. This is not a valid counterexample, since that’s not what O(log n) height means: h = O(log n) describes the asymptotic relationship between the height and the number of elements in the tree; it does not say h < log n for all n.

21
Q

Is the following array a max heap: [10, 3, 5, 1, 4, 2]? True/False

A

False. The element 3 is smaller than its child 4, violating the max-heap property.

22
Q

Every problem in NP can be solved in exponential time. True/False?

A

True

23
Q

Could a binary search tree be built using o(n lg n) comparisons in the comparison
model? Explain why or why not.

A

No, or else we could sort in o(n lg n) time by building a BST in o(n lg n)
time and then doing an in-order tree walk in O(n) time.

24
Q

Describe how any comparison-based sorting algorithm can be made stable, without affecting the running time by more than a constant factor

A

Tag elements with their original positions in the array (breaking comparison ties by position); this increases the work by at most a factor of 2.

25
Q

True/False: Binary insertion sorting (insertion sort that uses binary search to find each insertion point) requires O(n log n) total operations.

A

False. While binary insertion sort improves the time it takes to find the right position for the next element being inserted, it may still take O(n) time to perform the swaps necessary to shift it into place. This results in an O(n^2) running time, the same as that of insertion sort.

26
Q

True/False: In the merge-sort execution tree, roughly the same amount of work is done at each level of the tree.

A

True. At the top level, roughly n work is done to merge all n elements. At the next level, there are two branches, each doing roughly n/2 work to merge n/2 elements; in total, roughly n work is done on that level. This pattern continues on through to the leaves, where a constant amount of work is done on each of n leaves, resulting in roughly n work being done on the leaf level as well.

27
Q

True/False: In a BST, we can find the next smallest element to a given element in
O(1) time.

A

False. Finding the next smallest element, the predecessor, may require traveling down the height of the tree, making the running time O(h).

For example, the predecessor might be deep in the adjacent (left) subtree.

28
Q

True/False: In an AVL tree, during the insert operation there are at most two
rotations needed.

A

True. The AVL property is restored on every operation. Therefore, inserting another item will require at most two rotations to restore the balance.

29
Q
True/False: In a min-heap, the next largest element of any element can be found
in O(log n) time.
A

False. A min-heap cannot provide the next largest element in O(log n)
time. To find the next largest element, we need to do a linear, O(n), search
through the heap’s array.

30
Q

True/False: Double hashing satisfies the uniform hashing assumption.

A

False. The notes state that double hashing only “comes close”: it provides about n^2 of the n! probe-sequence permutations required by the uniform hashing assumption.

31
Q

True/False: In a weighted undirected graph G = (V, E, w), breadth-first search from a vertex s finds single-source shortest paths from s (via parent pointers) in O(V + E) time.

A

False. That holds only in unweighted graphs.

32
Q

True/False: In a weighted undirected tree G = (V, E, w), breadth-first search from a vertex s finds single-source shortest paths from s (via parent pointers) in O(V + E) time.

A

True. In a tree, there is only one path between two vertices, and breadth-first search finds it.

33
Q

True/False: In a weighted undirected tree G = (V, E, w), depth-first search from
a vertex s finds single-source shortest paths from s (via parent pointers) in O(V +
E) time.

A

True. In a tree, there is only one path between two vertices, and depth-first search finds it.

35
Q

True/False: If a graph represents tasks and their interdependencies (i.e., an edge (u, v) indicates that u must happen before v happens), then the breadth-first search order of vertices is a valid order in which to tackle the tasks.

A

False. You’d prefer depth-first search, which can easily be used to produce a topological sort of the graph, which would correspond to a valid task order. BFS order can produce incorrect results.

36
Q

True/False: Dijkstra’s shortest-path algorithm may relax an edge more than once in a graph with a cycle.

A

False. Dijkstra’s algorithm always visits each node at most once; this
is why it produces an incorrect result in the presence of negative-weight edges.

37
Q

(NOT part of UCSD CSE 100)
Given a weighted directed graph G = (V, E, w) and a source s ∈ V , if G has a negative-weight cycle somewhere, then the Bellman-Ford algorithm will necessarily compute an incorrect result for some δ(s, v).

A

False. The negative-weight cycle has to be reachable from s.

38
Q

(NOT part of UCSD CSE 100)
True/False: In a weighted directed graph G = (V, E, w) containing no zero- or positive-weight cycles, Bellman-Ford can find a longest (maximum-weight) path from vertex s to vertex t.

A

True. Negate the weights.

40
Q

True/False: Given a weighted directed graph G = (V, E, w) and a shortest path p
from s to t, if we doubled the weight of every edge to produce G’ = (V, E, w’), then p is also a shortest path in G’.

A

True. Multiplying edge weights by any positive constant factor preserves their relative order, as well as the relative order of any linear combination of the weights. All path weights are linear combinations of edge weights, so the relative order of path weights is preserved. This means that a shortest path in G will still be a shortest path in G’.

41
Q

What implementations of a search can determine if a graph has multiple paths to a vertex from another vertex?

A

Stack-based implementation

Recursion-based implementation

42
Q

Minimum Spanning Tree (MST):

Given an UNdirected weighted graph, an MST is a subgraph that connects all the vertices with the lowest possible sum of edge weights.

The MST’s total edge weight is minimized by removing heavier edges while retaining connections to all vertices.

The MST of a graph is NOT necessarily unique; more than one MST can exist. However, each MST will have the same total edge weight.

A

An MST always contains all the vertices of the original graph.

A subgraph with all the vertices, the minimum number of edges needed to connect them all, AND the minimum edge-weight sum.

43
Q

Shortest Path:

Given a weighted graph, the shortest path between two vertices is a path with the lowest possible sum of edge weights.

A

A shortest path does not require inclusion of all the vertices, just the two endpoint vertices and whatever vertices lie on the path between them.

44
Q

Comparing Prim’s and Dijkstra’s Algorithm

  1. Dijkstra’s Algorithm finds the shortest path, but Prim’s finds the MST
  2. Dijkstra’s can work on both directed and undirected graphs. Prim’s only works with undirected graphs
  3. Prim’s algorithm can handle negative edge weights, but Dijkstra’s algorithm may fail to compute distances accurately if even one negative edge weight exists.
A

Dijkstra: shortest path
Prim’s: MST

Dijkstra: works on directed and undirected graphs
Prim’s: only for undirected

Dijkstra: cannot handle graphs with negative edge weights
Prim’s: can handle negative edge weights

45
Q

Implementing Prim’s Algorithm

A

Requires a data structure to store the elements; the elements need to represent the edge weights in some way.

Include a boolean variable per vertex to indicate whether it is currently included/visited during the traversal of the graph.

Create a “forest” of vertices initialized to indicate the vertices are “separated”/disconnected, and use a while loop.

There should be a way for the class/function to check whether every vertex’s boolean is Yes or whether a No still exists.

46
Q

Dijkstra’s Algorithm

Given an initial/start vertex and an end/destination vertex, Dijkstra’s can determine the shortest path and distance between the two vertices.

Dijkstra’s can find the shortest distance from a start vertex to ANY destination in a connected graph.

A

The core idea is to continuously eliminate longer paths between the starting node and all possible destinations (or at least give them lower priority) during the traversal.

Discriminate between visited nodes and unvisited nodes. Visited nodes are “known”/verified to have their minimum distance from the start.

Unvisited nodes are those that have not been confirmed to be at minimum distance but can still reach the destination vertex.

Implementation requires initializing every vertex with a “distance” of infinity and a null Prev/Predecessor.

The start vertex is the only node not given an “infinity” distance; it gets 0.

Iteratively traverse the graph BFS-style, marking each vertex visited as you go. The traversal is powered by a PRIORITY QUEUE (min-heap), popping the minimum-distance vertex each time from the set of possible vertices.

The priority is not the static edge weight: a vertex added to the PRIORITY QUEUE is keyed by the weight of its edge to the previously visited vertex PLUS (+) that vertex’s current distance from the start. (See the Java sketch below.)
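
A compact Java sketch of this loop (the graph representation and names are my own; non-negative edge weights assumed):

    import java.util.*;

    /** Dijkstra sketch over an adjacency list; returns shortest distances from src. */
    class Dijkstra {
        record Edge(int to, int weight) {}

        static int[] shortestDistances(List<List<Edge>> adj, int src) {
            int n = adj.size();
            int[] dist = new int[n];
            Arrays.fill(dist, Integer.MAX_VALUE);
            dist[src] = 0;                                 // start vertex gets 0, all others "infinity"
            boolean[] visited = new boolean[n];
            // Min-heap of {distance, vertex}, keyed by tentative distance
            PriorityQueue<int[]> pq = new PriorityQueue<>(Comparator.comparingInt((int[] a) -> a[0]));
            pq.add(new int[]{0, src});
            while (!pq.isEmpty()) {                        // runs as long as the PQ is not empty
                int u = pq.poll()[1];
                if (visited[u]) continue;                  // stale duplicate entry: ignore
                visited[u] = true;                         // dist[u] is now final (boolean flag)
                for (Edge e : adj.get(u)) {
                    int cand = dist[u] + e.weight();       // edge weight + distance so far
                    if (cand < dist[e.to()]) {
                        dist[e.to()] = cand;
                        pq.add(new int[]{cand, e.to()});   // same vertex may appear more than once
                    }
                }
            }
            return dist;
        }
    }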

47
Q

Dijkstra’s Algorithm Implementation

A

A vertex/node can be described with a label/name, a LinkedList recording the shortest path found so far, a distance from the source, and an adjacency list (list of neighbors and edge weights).

49
Q

Kruskal’s Algorithm

With N edges, Kruskal’s algorithm will find an MST with V−1 edges and NO cycles.

  1. Collect all graph edges and initialize all as “unvisited”/un-included
  2. Sort all edges based on weight (PQ)
  3. Select the lowest-weight edge (first off the PQ); if it doesn’t create a cycle with the current edges, then add it.
  4. Continue selecting until V−1 edges are reached.
A

Reversing the PQ to give the max edge weight first can provide the Maximum Spanning Tree.

The challenge in implementation is checking for cycles. A DFS can be used to check for a cycle, but…

the optimal solution is to use the UNION-FIND algorithm with a disjoint-set data structure (it suits the incremental edge-adding approach to detecting cycles). Implementing this into the spanning-tree construction process improves Kruskal’s algorithm. (See the sketch below.)
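
A Java sketch of the selection loop. It assumes a UnionFind class whose union method returns false when both endpoints are already connected; one is sketched under the up-tree / path-compression card below:

    import java.util.*;

    /** Kruskal sketch: edges come off a min-PQ; union-find rejects cycle-forming edges. */
    class Kruskal {
        record Edge(int u, int v, int weight) {}

        static List<Edge> mst(int numVertices, List<Edge> edges) {
            List<Edge> result = new ArrayList<>();
            PriorityQueue<Edge> pq = new PriorityQueue<>(Comparator.comparingInt(Edge::weight));
            pq.addAll(edges);                            // step 2: all edges ordered by weight
            UnionFind uf = new UnionFind(numVertices);
            while (!pq.isEmpty() && result.size() < numVertices - 1) {
                Edge e = pq.poll();                      // step 3: lowest-weight edge first
                if (uf.union(e.u(), e.v())) {            // false => would create a cycle
                    result.add(e);                       // safe: connects two components
                }
            }
            return result;                               // V-1 edges if the graph is connected
        }
    }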

50
Q

Checking to see if two nodes are in the same set:

A

Start from a forest of disjoint sets/nodes, where each set contains only one node: itself.

Each time an edge is to be added to the tree, we check whether its two endpoints are already in the same set.

A tree structure can be used to represent each disjoint set. Each node has a parent pointer. In each set, there is a unique root node that represents the set. At the start, each root node has a self-referencing parent pointer.

51
Q

Improving Kruskal RunTime Complexity

Up-Tree with PATH COMPRESSION (self adjusting structure)

A

With a certain union order the worst case could be O(n), the up-tree resembling a linked list. To improve this we use the path compression technique, where each node visited during a lookup is attached directly to the root. Thus, future lookups of this node will take one step to get to the root/sentinel node.

Path compression reattaches the visited vertices directly to the root. (See the Java sketch below.)
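
A minimal Java sketch of union-find with path compression (union-by-rank/size is omitted to keep it short):

    /** Up-tree union-find: find() compresses paths by pointing nodes straight at the root. */
    class UnionFind {
        private final int[] parent;

        UnionFind(int n) {
            parent = new int[n];
            for (int i = 0; i < n; i++) parent[i] = i;  // each node starts as its own root
        }

        int find(int x) {
            if (parent[x] != x) {
                parent[x] = find(parent[x]);            // path compression: attach directly to root
            }
            return parent[x];
        }

        /** Returns false if x and y were already in the same set (the cycle case in Kruskal's). */
        boolean union(int x, int y) {
            int rx = find(x), ry = find(y);
            if (rx == ry) return false;
            parent[rx] = ry;                            // merge the two up-trees
            return true;
        }
    }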

52
Q

Amortized Cost Analysis

Self-Adjusting Operations in iterative uses

A

In the context of path compression, the first use of the technique does the re-adjusting work, guaranteeing that subsequent uses take no longer (in fact, they take less time).

Amortized cost analysis considers the time or space cost of doing a sequence of operations (as opposed to a single operation), since the total cost of the entire sequence of operations might be less with extra initial work than without!

The first application of path compression is the non-constant-time part, where the structure is re-adjusting itself. The second is the constant-time part, in which the structure has finished readjusting and is now reaping the benefits of the self-adjustment.

53
Q

Dijkstra’s Algorithm takes advantage of the Priority Queue ADT to store the possible paths and uses the total weights of the paths to determine priority.

Since we are storing vertices that are immediately reachable from the current vertex in the priority queue, the same vertex can in fact appear in the priority queue more than once if there exists more than one path to reach it.

Dijkstra’s Algorithm runs as long as the PQ is not empty.

Once a vertex/node is removed from the PQ, the algorithm never updates that vertex’s info/fields again, guaranteeing that once it has been popped, that vertex has achieved its shortest path.

A

A shorter path to an already-popped vertex cannot still exist in the PQ, since the PQ is designed to bring the SHORTEST path to the top. Thus the shortest option for a vertex is popped before any other options for the same vertex that are larger in distance, and after that vertex is popped it will never be updated again (boolean flag).

This is part of the GREEDY algorithm approach.

54
Q

Bloom Filter is a Data Structure

Quick ( O(1) ) verification/check if element is in set.

Benefits: space efficient. Uses bits/bytes (integers), not a lot of data, for all element information. Uses a bitmap (initialized to 0).

Tradeoff/Compromise: due to probabilistic methods (not deterministic), it can only give accurate verification of a true negative, i.e. “that something is definitely not in the set.” It cannot definitively report that an element is in the set, just a maybe; a guaranteed true-positive report is not possible.

Compromise: Limited interface. Can only insert and look up. Cannot iterate or delete elements.

A

Space: O(1)

Insert: O(1)

Lookup: O(1)

How to use and implement?
It is a good first “layer” of filtering.

As the number of elements inserted into the bitmap grows, the likelihood of eventually reporting a false positive gets larger. If the size is known ahead of time, try to size the filter for it up front; otherwise the filter needs to be resized periodically to maintain a low probability of a false positive (“maybe”).

56
Q

Priority Queue (ADT)

“Special” Queue, where every element has a priority rating/tag associated with it. The “higher” the priority, the earlier it can get popped off the queue, or the higher up the list/container it is.

Priority can be defined by user

Benefits: the top will always be the “highest” priority

Benefits: generally fast operations over all functions

Tradeoff: pushing/insert and popping/remove take the longest compared to other operations due to maintenance of the ordering rules. The first insert is quick, and removal from a low-occupancy container is fast.

Implementation uses BINARY HEAPS (data structure). In the tree structure the root is the highest priority, allowing O(1) access (peek). But after a removal, surfacing the next item costs time due to restructuring.

A

Space: O(n)

Peek: O(1)

Dequeue: O(log n) (“popping”/removing out of the queue, correctly positioning the next item on the top, and ensuring the remainder of the container maintains the priority rules)

Enqueue: O(log n) (“pushing” into the queue and correctly positioning it based on priority)

Highly useful in Dijkstra’s Algorithm, graph traversal, and Huffman encoding.

Applications: any environment with different levels of urgency, like hospitals, auctions, or scheduling, where tradeoffs can be made (discrimination between elements).

If using a sorted LIST to implement a priority queue, the highest priority is simply at position 0, but insertion will need to move a certain number (up to N−1) of items over, and removal will require the same shifting in the opposite direction.

As a linked LIST, the first/head node is the highest priority, so access is fast, but insertion requires an O(n) traversal to find where to rewire the pointers. Removal and restructuring are fast and easy.
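
Java’s built-in java.util.PriorityQueue (a binary min-heap) shows these costs directly:

    import java.util.PriorityQueue;

    class PQDemo {
        public static void main(String[] args) {
            // Min-heap by default; a custom Comparator can define "priority" differently.
            PriorityQueue<Integer> pq = new PriorityQueue<>();
            pq.add(5);                     // enqueue: O(log n)
            pq.add(1);
            pq.add(3);
            System.out.println(pq.peek()); // peek: O(1) -> 1
            System.out.println(pq.poll()); // dequeue: O(log n) -> 1
        }
    }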

57
Q

Burrows-Wheeler Transform Optimization

No matter how large our database string may be, we can find all occurrences of any given query as fast as simply scanning the query! Given that this is just as fast as even reading in our query, in any practical use-case, it’s impossible to beat this Big-O time complexity O(mk).

n is the length of the database string
Q is the query collection
m is the number of strings in Q
k is the length of the strings in the collection of m

All occurrences of all m query strings can be found in O(mk)

A

Burrows-Wheeler Transform construction: O(n)

58
Q

Heaps are naturally structured like a priority queue, and so a heap is usually the default choice of data structure to implement a Priority Queue.

A

They are also used to implement the heapsort sorting algorithm, which is a nice, fast O(N log N) sorting algorithm.

59
Q

When inserting into a heap be aware to:

  • preserve the structural and ordering properties: the result must be a heap!

insertion is O(log n)

A

Basic algorithm:
1. Create a new node in the proper location, filling the level left-to-right
2. Put the key to be inserted in this new node
3. “Bubble up” the key toward the root, exchanging it with the key in its parent, until it is in a node whose parent has a smaller (or equal) key, or it is in the root

Fill in the left child before the right: no node should have a right child and an empty left child. (See the Java sketch below.)
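
A minimal Java sketch of the bubble-up step for a min-heap (array-backed, matching the heap-as-array cards later in this deck):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    /** Min-heap insert sketch: append at the next open slot, then bubble up. */
    class MinHeap {
        private final List<Integer> a = new ArrayList<>();  // array-backed complete tree

        void add(int key) {
            a.add(key);                                // steps 1-2: new node at the "end"
            int i = a.size() - 1;
            while (i > 0) {                            // step 3: bubble up toward the root
                int parent = (i - 1) / 2;
                if (a.get(parent) <= a.get(i)) break;  // parent is smaller/equal: order holds
                Collections.swap(a, i, parent);
                i = parent;
            }
        }
    }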

60
Q

What is the worst-case time complexity of the Huffman coding algorithm with a heap/priority queue?

A

O(K + N log N)

This also depends on the I/O speed, which can dominate if the number of elements is large.

61
Q

What is the time cost of then using that tree to code the input message?
• Coding one symbol from the input message takes time proportional to the number of bits in the code for that symbol; and so the total time cost for coding the entire message is proportional to the total number of bits in the coded version of the message

A

The answer to this question lies in the motivation for information theory and the minimum number of bits/units of information needed for a message of size K.

By Shannon → Entropy:

≈ log_2(K) bits to distinguish K equally likely possibilities.

To find the average number of bits per symbol, take the weighted average of p · log_2(1/p) over the symbols, + 1 (or ceiling) for Huffman’s code lengths.

62
Q

What is the range?
0 <= H <= log_2(N)

What is the lowest value of a character/symbol/data entropy?

A

The lowest is when the symbol always occurs: probability 1 gives log_2(1) = 0.

If every element occurs the same number of times, then they all have the same probability 1/N, so H = SUM over all N symbols of (1/N) · log_2(N) = log_2(N), the maximum.
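
The two boundary cases, worked in LaTeX (this is the standard entropy definition, not something specific to these cards):

    % Entropy of a source with symbol probabilities p_1, ..., p_N:
    %   H = \sum_{i=1}^{N} p_i \log_2 \frac{1}{p_i}
    % Certain symbol (p = 1):  H = 1 \cdot \log_2 1 = 0   (the minimum)
    % Uniform case (p_i = 1/N):
    \[
      H = \sum_{i=1}^{N} \frac{1}{N}\,\log_2 N = \log_2 N \quad \text{(the maximum)}
    \]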

63
Q

Arrays can be used to implement trees; this is a good choice in some cases if the tree has a very regular structure, or a fixed size

A

One common case is: implementing a heap

Because heaps have a very regular structure (they are complete binary trees), the array representation can be very compact: array entries themselves don’t need to hold parent-child relational information.
• The index of the parent, left child, or right child of any node in the array can be found by simple computations on that node’s index.

64
Q

Huffman code trees do not have as regular a structure as a heap (they can be extremely unbalanced), but they do have some structure, and they do have a determinable size

A
  • structure: they are “full” binary trees; every node is either a leaf, or has 2 children
  • size: to code N possible items, they have N leaves, and N-1 internal nodes

These features make it potentially interesting to use arrays to implement a Huffman
code tree

65
Q

One way to implement a heap with N nodes holding keys of type T is to use an N-element array, T heap[N]. Nodes of the heap correspond to entries in the array as follows:
• The root of the heap is the array element indexed 0, heap[0]
• If a heap node corresponds to the array element indexed i, then
• its left child corresponds to the element indexed 2i + 1
• its right child corresponds to the element indexed 2i + 2
• …and so a node indexed k has parent indexed (k−1)/2 (integer division)

A

The result: Nodes at the same level are contiguous in the array, and the array has no “gaps”
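
The index arithmetic in Java (0-indexed, matching the formulas above):

    /** Index arithmetic for an array-backed heap (0-indexed). */
    class HeapIndex {
        static int leftChild(int i)  { return 2 * i + 1; }
        static int rightChild(int i) { return 2 * i + 2; }
        static int parent(int k)     { return (k - 1) / 2; }  // integer division
    }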

66
Q

Red Black Trees try to improve on AVL Trees:

The red-black invariants are more complicated than the AVL balance property; however they can be implemented to provide somewhat faster operations on the tree

A

The red-black invariants imply that the tree is balanced

Red-black trees are balanced; insert, delete, and find operations are O(log N) worst-case. In fact, the height can never be more than 2·log2(N) + 1.

It is fairly easy to implement insert and delete operations to be faster by a constant factor than for AVL trees

The average level of a node in a random red-black tree is very close to that in a random AVL tree (about log2 N), so find operations are as fast in the average case

67
Q

The trick to Red-Black Trees improving on AVL trees is :

The trick is to implement Insert and Delete operations that ensure that all the red-black invariants are… invariant

A

The height of an R-B tree is at most 2·log2(N) + 1: the invariants let the longest root-to-leaf path be at most about a factor of 2 longer than the shortest, so the rules don’t let the height get out of control.

All internal nodes will have at least 2 children

68
Q

Contrast with Insert in AVL trees:

Descent from the root finds the place to insert the new node; then detecting failures of the AVL property and fixing them is done on the way back up to the root

A

Red-Black trees can do all the needed fixing in the initial top-down pass, never needing to bubble back up and rotate after insertion, which AVL insert does at extra runtime cost.

For AVL, this means in effect traveling the path from root to leaf twice, whether it is done recursively or iteratively (by using parent pointers).

69
Q

The tree resulting from any possible AVL rotation on any possible Binary Search Tree (BST) is always also a valid BST.

A

TRUE

70
Q

Tries

A trie data structure is a tree in which:
• internal nodes represent decision rules…
• edges from a node to its children represent possible outcomes of the decision…
• (usually only) leaves represent data

A

Tries don’t have to be binary:
• an “alphabet trie” is a 26-ary tree for storing words
• a “ternary trie” is a 3-ary tree supporting efficient prefix queries on strings

Tries don’t have to be completely filled: in general they can have any tree structure

Other applications of TRIE are decision trees and discrimination nets

71
Q

A Treap containing n elements has the same tightest worst-case Big-O time complexity for “insert” operations as a Linked List containing n elements.

Removing a leaf is easy, but removing other nodes will require restructuring and rebalancing to maintain the structure.

A TREAP is a combination of a binary search TREE + a binary HEAP, storing pairs of values: data and priority.

A

Formed by inserting the nodes highest-priority-first into a binary search tree without doing any rebalancing

If the priorities are independent random numbers (from a distribution over a large enough space of possible priorities to ensure that two nodes are very unlikely to have the same priority) then the shape of a treap has the same probability distribution as the shape of a random binary search tree, a search tree formed by inserting the nodes without rebalancing in a randomly chosen insertion order

72
Q

(definition/fact)

To store n strings, it is generally more memory-efficient to use a Ternary Search Tree (TST) than to use a Multiway Trie (MWT)

A

An MWT uses much more memory due to the empty nullptrs from unused character edges.

73
Q

In the case where the length (m) of the strings in a set is small(er) than the number of different strings (k), it is better to use an MWT than a Ternary Search Tree.

A

Depending on the initial insertion order and how many different words/strings there are, in a TST you can traverse down the left or right multiple times before reaching the actual initial character.

74
Q

Comparatively, an MWT will NOT ALWAYS be smaller (shorter in height) than a TST.

They will either be the same height, or the MWT will be shorter.

A

An MWT will not always have an advantage; it is more likely to have an advantage, but NOT ALWAYS.

75
Q

An AVL Tree is a special type of Binary Search Tree that maintains a tight balance factor (−1, 0, or +1) at every node

A

All AVL trees are binary search trees, but not all binary search trees are AVL trees

76
Q

Doing just a (left or right) rotation at the root without rebalancing is O(1).

A

This is because the only nodes affected are the 2-3 nodes connected to the node being rotated, and the readjustment of pointers is O(1).

In general, the higher up the node, the lower the time complexity.

77
Q

What data structure can maintain a dynamic set with operations Insert(x,S), Delete(x,S), and Member(x,S) with an expected run time of O(1) per operation?

A

Hash Table

78
Q

Is this array a max heap?

The array
20 15 18 7 9 5 12 3 6 2

A

True

Using 1-based indexing: a[1] has two children a[2], a[3] that are smaller; a[2] has two children a[4] and a[5]; and so on down the array.

79
Q

Suppose that a hash table with collisions resolved by chaining contains n items and has a load factor of ALPHA = 1/(log_2 n). Assuming simple uniform hashing, the expected time to search for an item in the table is O(1/log_2 n).

FALSE

A

FALSE

The expected time to search for an item in the table is O(1 + ALPHA) = O(1 + 1/log_2 n) = O(1)

At least a constant running time O(1) is needed to search for an item; subconstant running time O(1/log_2 n) is not possible

80
Q

Suppose that a hash table of m slots contains a single element with key k and the rest of the slots are empty. Suppose further that we search r times in the table for various other keys not equal to k. Assuming simple uniform hashing, the probability is r/m that one of the r searches probes the slot containing the single element stored in the table.

FALSE

A

FALSE

The probability p that one of the r searches probes the slot containing the single element stored in the table is equal to 1 minus the probability that none of the r searches does. That is, p = 1 − (1 − 1/m)^r.

81
Q

Let S be a set of n integers. One can create a data structure for S so that determining whether an integer x belongs to S can be performed in O(1) time in the worst case.

TRUE

A

TRUE

Perfect Hashing O(1)

82
Q

In a BST, we can find the next smallest element to a given element in O(1) time.

FALSE

A

O(h)

h = height (log_2 n when the tree is balanced)

83
Q

In an AVL tree, during the insert operation there are at most two rotations needed.

A

TRUE. AVL property restored on every operation. Therefore, inserting another item will require at most two rotations to restore the balance.

84
Q

Linked List is a container object made up of zero(0) or more nodes

A

A LinkedList can be used as a data structure holding a sequence of data/objects allocated on the HEAP.

HW5 (CSE12 GARY) Objects are nameless

85
Q

Tree Traversal

Depth First Search
PRE-ORDER: V, L, R
IN-ORDER: L, V, R
POST-ORDER: L, R, V

Breadth First Search
LEVEL Order Traversal: top to bottom and left to right
Layers are distances from the root

A

Post-order is guaranteed to visit a node’s descendants before the node itself.

Pre-order visits each node before any of its descendants. (See the Java sketch below.)
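
The four orders as a Java sketch (TreeNode and visit are illustrative names):

    import java.util.ArrayDeque;
    import java.util.Queue;

    /** Traversal orders on a binary tree (V = visit, L = left, R = right). */
    class TreeNode {
        int value;
        TreeNode left, right;

        void preOrder()  { visit(this); if (left != null) left.preOrder();  if (right != null) right.preOrder(); }  // V, L, R
        void inOrder()   { if (left != null) left.inOrder();   visit(this); if (right != null) right.inOrder(); }   // L, V, R
        void postOrder() { if (left != null) left.postOrder(); if (right != null) right.postOrder(); visit(this); } // L, R, V

        void levelOrder() {                          // BFS: top to bottom, left to right
            Queue<TreeNode> q = new ArrayDeque<>();
            q.add(this);
            while (!q.isEmpty()) {
                TreeNode n = q.poll();
                visit(n);
                if (n.left != null) q.add(n.left);
                if (n.right != null) q.add(n.right);
            }
        }

        static void visit(TreeNode n) { System.out.println(n.value); }
    }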

86
Q

Unbalanced:
worst case is O(n)
height is n−1

Balanced:
worst case is O(log n)

BST worst case is O(n)

A

A self-balancing BST gives a guarantee on the tree’s branch heights.

Best case: O(1)

Worst case: O(n) if the item can’t be found or the tree is unbalanced

Average: E[Y], the expected search depth

87
Q

BST Remove Algorithm

Case 1: no children (easy), just remove it

Case 2: 1 child

Case 3: 2 children

A

For Case 3, replace the node with its successor: the leftmost node in the right subtree.

88
Q

Queue is an ADT
First in First Out (FIFO)

STACK is Last In First Out (LIFO)

A

Queue implemented as Linked List

Stack implemented as Linked List

Queue or Stack can be implemented as an ArrayList

A data structure can implement “any” ADT, but speed can vary depending on which data structure is used.

89
Q

Data Structure

How ADTs are implemented

Holds data, relates the data, operates on the data

Concrete

Built on ADT

A

Array List

Binary Search Tree

Heap

Linked List

Hashmap

90
Q

Tree is a GRAPH with NO CYCLES

GRAPH is an ADT and the DATA STRUCTURE is the TREE

A

A car is like an ADT; the engine is the data structure.

How the car is built is the data structure.

91
Q
Queue - FIFO
enqueue: add to back
dequeue: remove from front
               BEST     WORST
Space:   O(n)         O(n)
Search:  O(n)         O(n)
Insert:  O(1)         O(1)
Delete:  O(1)         O(1)
A

Dynamic memory allocation: “new” memory allocated during runtime, as opposed to static memory, assigned at start of program (constants, variables, etc.)

92
Q

SET is ADT in C++

ADT: no description of how data is organized
- describes functions, not how it’s implemented

How an ADT is implemented describes organization of the data

A

Array can implement SET

Hashtable can implement MAP ADT

C++ uses RBT for MAP ADT

93
Q

Red-Black Tree

Tree runtimes are related to height. A Red-Black tree doesn’t try to minimize the height of its subtrees exactly; it only constrains how the nodes are organized so as to minimize insertion work. The constraints still keep the worst case at O(log n), which is the height.

A

When rearranging, only rotate nodes within the path of insertion; you never have to deal with other subtrees unrelated to the path of insertion.

Height will always be a factor in runtime, but an RBT minimizes insertion work: search can take somewhat longer, but insertion is faster.

Before and after an insertion, the RBT satisfies its balance invariants (approximately, not perfectly, balanced).

94
Q

AVL Trees

Self-balancing BSTs that compute each node’s balance factor and keep it within {−1, 0, +1}

A

AVL insert will always need to go down to insert and back up to check balance and rebalance if necessary.

95
Q

A tree is a container object composed of 0 or more nodes.

TRUE, still a tree, but empty

A

The first node of a TREE is the root

96
Q

Search Path: nodes visited from root to a leaf in search of an item in a tree

A

Term for visiting all nodes in a Tree: Traversal

97
Q

A recursive solution involves less code than a loop based solution. TRUE

When a program overflows the Run-Time Stack, the cause is usually infinite Recursion

A

Loop-based solutions often use less memory than recursive solutions.

98
Q

Hashtables and Hashing

Hash Functions, properties of a good hash function, examples of hash functions, determining good vs. bad hash functions.

Probability of collisions

Collision resolution: open addressing (linear probing, double hashing, random hashing, cuckoo hashing), separate chaining.

Running time for hashtables and issues (load factor)

How to choose hash table size

Hashtables vs. Tree/List data structure (advantages and disadvantages)

A

Read/write code that uses arrays on the stack or heap, such as ArrayStack

Trees: understand/implement operations on a tree (BST or trie) such as adding elements or performing traversals, and understand qualities of trees such as balance.

Graphs: look at a graph and answer questions about it (connectedness, cyclicity, degrees); understand the execution of graph algorithms such as DFS, BFS, Dijkstra’s.

99
Q

Amortized

Any series of m operations on a two-stack queue will take time O(m).

Every element is pushed at most twice and popped at most twice.

Intuition: Average case is O(1) per operation, total work done is O(m). True, but not precise.
Total work done: THETA (m)
Total Operations: THETA (m)

Key Idea: Backcharge expensive operations to cheaper ones. We overestimate the average cost of EACH operation so we never underestimate the overall amount of work that we do

No matter when we stop performing operations, the actual cost of the operations performed will be LESS THAN (or equal to) the ceiling represented by the total amortized cost of those operations.

A

Key Idea: Design Data structures that trade per-operation efficiency for overall efficiency

Use two-stack queue as example

We only do expensive dequeues after a long run of cheap enqueues (dishwasher: we slowly introduce a lot of dirty dishes to get cleaned up all at once. Provided we clean up all the dirty dishes at once, and provided that dirty dishes accumulate slowly, this is the fastest strategy).

Dam Example: Lots of expensive up front work but payoff over time (long term). The average work done at each point in time is high until lots of operations are performed. Early expensive operations, cheap later ones.

Dish Washer Example: Lots of cheap operations that need to be made up for by an expensive one later. The average work done at each point in time is low.

Grocery Store Example: unlikely there will be large operations because of the randomization, but every now and then, we run into trouble when demand for one item is high. Performs well in expectation, but can’t guarantee efficiency.

100
Q

When Amortization Works?

Array/Vector: most appends take O(1) and consume free space. Occasionally an append takes O(n) but produces a lot of free space. The amortized cost of any append is O(1). (See the sketch after this card.)

Binary Tree: insertion takes O(log n) and can unbalance the tree, but with rebalancing, search and remove can also be O(log n).

A

Amortization works best if:

  1. imbalances accumulate slowly, and
  2. imbalances get cleaned up quickly
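
A minimal Java sketch of the Array/Vector case (a doubling array, illustrative only):

    /** Doubling-array append: occasional O(n) grow step, amortized O(1) per append. */
    class DynArray {
        private int[] data = new int[1];
        private int size = 0;

        void append(int x) {
            if (size == data.length) {                   // rare expensive step...
                int[] bigger = new int[2 * data.length]; // ...produces lots of free space
                System.arraycopy(data, 0, bigger, 0, size);
                data = bigger;
            }
            data[size++] = x;                            // common cheap step: O(1)
        }
    }
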
101
Q

Amortized Analysis with the Banker’s Method

To perform an amortized analysis using the banker’s method, do the following:
● Figure out the actual runtimes of each operation.
● Indicate where you’ll place down credits, and compute the amortized cost of operations that place credits this way.
● Indicate where you’ll spend credits, and justify why the credits you intend to spend are guaranteed to be there. Then, compute the amortized cost of each operation that spends credits this way.

A

● It doesn’t matter where these credits are placed or removed from.
● The total number of credits added and removed doesn’t matter; all that matters is the difference between these two.