Data Structures 4 Flashcards
Algorithm Analysis
Estimating how the time (or space) an algorithm needs grows with the size of its input
Algorithm
has input, produces output, definite, finite, operates on the data it is given
Big-Oh
T(n) = O(f(n)) if there are positive constants c & n₀ such that T(n) <= c * f(n) for all n >= n₀
Big Omega
T(n) = Ω(f(n)) if ∃ positive constants c & n₀ such that T(n) >= c * f(n) for all n >= n₀
Little-Oh
T(n) = o(f(n)) if T(n) = O(f(n)) and T(n) != Ω(f(n))
Queues
First in, first out. O(1) enqueue and dequeue.
L N R
in-order traversal
N L R
Preorder traversal (Polish)
L R N
Postorder traversal (Reverse Polish)
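The three traversal orders above can be sketched with a minimal binary-tree node (the `Node` class here is a hypothetical helper, not from the cards):

```python
class Node:
    """Minimal binary-tree node for demonstrating traversal orders."""
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def inorder(n):    # L N R
    return inorder(n.left) + [n.val] + inorder(n.right) if n else []

def preorder(n):   # N L R (Polish)
    return [n.val] + preorder(n.left) + preorder(n.right) if n else []

def postorder(n):  # L R N (Reverse Polish)
    return postorder(n.left) + postorder(n.right) + [n.val] if n else []

# Tree:    2
#         / \
#        1   3
root = Node(2, Node(1), Node(3))
```

On this tree, in-order visits 1, 2, 3; preorder visits 2, 1, 3; postorder visits 1, 3, 2.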
Insertion Sort
Stable, O(n^2), Ω(n) : Inserts each element into the sorted prefix at the beginning of the array, shifting elements one at a time.
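A minimal sketch of the shifting described above:

```python
def insertion_sort(a):
    """Stable; O(n^2) worst case, Omega(n) on already-sorted input."""
    a = list(a)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger elements right until key's position is found.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```

The Ω(n) best case falls out of the inner loop: on sorted input it never executes.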
Selection Sort
Unstable, O(n^2), Ω(n^2) : Repeatedly scans the unsorted elements for the smallest one and swaps it into place.
Bubble Sort
Stable, O(n^2), Ω(n) : Compares neighboring elements to see if sorted. Stops when there’s nothing left to sort.
Heap Sort
Unstable, O(n log n), Ω(n log n): Make a heap, take everything out.
Tree Sort
Stable, O(n log n), Ω(n log n) : Put everything in the tree, traverse in-order.
Merge Sort
Stable, O(n log n), Ω(n log n): Use recursion to split arrays in half repeatedly. An array with size 1 is already sorted.
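A sketch of the recursive splitting described above:

```python
def merge_sort(a):
    if len(a) <= 1:               # an array of size 1 is already sorted
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge the two sorted halves.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal elements in order (stable)
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```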
Quick Sort
Unstable, O(n log n) for a good pivot, O(n^2) for a bad pivot, Ω(n log n) : Uses O(n) partitioning. Pick the median of the 1st, middle, and last elements as the pivot. Random selection is also good, but expensive. Can be slow in practice because of the many function calls.
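A sketch of median-of-three pivot selection with an O(n) (Lomuto-style) partition; the helper names are illustrative:

```python
def median_of_three(a, lo, hi):
    """Index of the median of the first, middle, and last elements."""
    mid = (lo + hi) // 2
    trio = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return trio[1][1]

def quick_sort(a):
    a = list(a)
    def sort(lo, hi):
        if lo >= hi:
            return
        p = median_of_three(a, lo, hi)
        a[p], a[hi] = a[hi], a[p]          # move pivot out of the way
        pivot, store = a[hi], lo
        for i in range(lo, hi):            # O(n) partition pass
            if a[i] < pivot:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[hi] = a[hi], a[store]  # pivot into final position
        sort(lo, store - 1)
        sort(store + 1, hi)
    sort(0, len(a) - 1)
    return a
```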
Insertion & Quick Sort
Using both algorithms together is more efficient: quick sort's O(n log n) advantage only pays off for large (sub)arrays, so switch to insertion sort once the subarrays get small.
Indirect Sorting
Involves the use of smart pointers: objects which contain pointers. The pointers are sorted instead of moving the (large) objects themselves.
Bucket Sort
O(n+m) where m is the # of buckets.
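A minimal sketch of the counting variant (one bucket per possible key value, an assumption here), which gives the O(n+m) bound:

```python
def bucket_sort(a, m):
    """Sort non-negative ints in range [0, m) using m buckets; O(n + m)."""
    buckets = [0] * m
    for x in a:                    # O(n): drop each item in its bucket
        buckets[x] += 1
    out = []
    for value, count in enumerate(buckets):  # O(m): read buckets in order
        out.extend([value] * count)
    return out
```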
Graphs
"Uber" data structure. Shows connections between objects. Can be displayed as either a matrix or linked list representation.
DAG
directed acyclic graph
Strongly connected
A directed graph in which there exists a path between every pair of vertices.
Weakly connected
A directed graph that becomes connected when its edges are treated as undirected.
Depth First Search
Runs in time equal to the size of the graph, can determine if a graph has a cycle.
Topological Sorting
Receives a DAG as input, outputs an ordering of the vertices. Repeatedly selects a node with no incoming edges and reads its outgoing edges.
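The select-a-node-with-no-incoming-edges step above is Kahn's algorithm; a sketch:

```python
from collections import deque

def topo_sort(vertices, edges):
    """Kahn's algorithm: repeatedly remove a vertex with no incoming edges."""
    indeg = {v: 0 for v in vertices}
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    q = deque(v for v in vertices if indeg[v] == 0)
    order = []
    while q:
        u = q.popleft()
        order.append(u)
        for v in adj[u]:          # read u's outgoing edges
            indeg[v] -= 1
            if indeg[v] == 0:     # v has no remaining incoming edges
                q.append(v)
    if len(order) != len(vertices):
        raise ValueError("graph has a cycle; not a DAG")
    return order
```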
Single Source Shortest Path
Input: graph and starting vertex.
Output: shortest path to all points.
Unweighted: BFS
Weighted: Dijkstra's Method
Dijkstra’s Method
Calculates the shortest path to all vertices in a single-source shortest path problem using a priority queue, or a heap. Checks the "frontier" based on cost. The distance to any node is known once it has been "visited".
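A sketch of the method with Python's `heapq` as the priority queue (edge weights assumed non-negative):

```python
import heapq

def dijkstra(adj, src):
    """adj: {u: [(v, weight), ...]} with non-negative weights."""
    dist = {src: 0}
    visited = set()
    heap = [(0, src)]                 # frontier, ordered by cost
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)                # distance to u is now final
        for v, w in adj.get(u, []):
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd          # relax the edge u -> v
                heapq.heappush(heap, (nd, v))
    return dist
```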
Relaxation
Getting from A->C more cheaply by using B as an intermediary.
Negative Edge Costs
Dijkstra’s cannot solve. Requires Bellman-Ford.
Bellman-Ford
Allowed to reconsider costs of reaching vertices. Can detect negative cost cycles. Able to handle graphs with negative edge weights by performing relaxation on all edges V-1 times, where V is the number of vertices.
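A sketch of the relax-all-edges-V-1-times idea, including negative-cycle detection:

```python
def bellman_ford(vertices, edges, src):
    """edges: list of (u, v, weight). Relax every edge V-1 times; if a
    V-th pass still improves a distance, a negative-cost cycle exists."""
    INF = float("inf")
    dist = {v: INF for v in vertices}
    dist[src] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:   # relaxation
                dist[v] = dist[u] + w
    for u, v, w in edges:               # one extra pass detects cycles
        if dist[u] + w < dist[v]:
            raise ValueError("negative-cost cycle detected")
    return dist
```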
All Pairs Shortest Path
Can be solved using Floyd-Warshall.
Floyd-Warshall
Implicitly determines shortest paths between all pairs, taking each vertex in turn into account as an intermediate. O(V^3).
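The triple loop below sketches how each vertex k is allowed in turn as an intermediate:

```python
def floyd_warshall(n, weight):
    """weight: {(u, v): cost} over vertices 0..n-1.
    Returns the n x n matrix of all-pairs shortest distances; O(V^3)."""
    INF = float("inf")
    dist = [[0 if i == j else weight.get((i, j), INF) for j in range(n)]
            for i in range(n)]
    for k in range(n):                 # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```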
Minimum Spanning Tree
Acyclic, contains all vertices with minimum total edge cost. Can be approached with either Prim's or Kruskal's method.
Prim’s
Same overall algorithm as Dijkstra’s except that it only considers lowest cost of single edge. Continually builds onto a tree with the cheapest cost edges.
Kruskal’s
Takes edges in sorted order by cost, creates many trees which join into one large tree.
Disjoint Sets
Never allowed to break apart sets. Also known as the Union/Find algorithm. Each node has a parent pointer which points toward the representative of its set.
2 Ways to Improve Disjoint Sets
Union by rank - make the smaller tree point to the larger tree.
Path compression - update parent pointers directly to the root.
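Both improvements can be sketched together:

```python
class DisjointSets:
    """Union/Find with union by rank and path compression."""
    def __init__(self, n):
        self.parent = list(range(n))   # each node starts as its own root
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            # Path compression: point directly at the root.
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Union by rank: smaller tree points to larger tree.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
```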
Abstract Data Types
Consists of 2 parts:
1. Data it contains
2. Operations that can be performed on it
Parameter Passing
Small, no modification - value
Large, no modification - const reference
Modified - pointer
Iterators
An object that knows how to "walk" over a collection of things. Encapsulates everything it needs to know about what it's iterating over. Should all have similar interfaces. Can read data, move, and know when to stop.
AVL Trees
Adelson-Velskii & Landis: the heights of any node's two subtrees differ by at most 1. On insertion, at most one rotation (single or double) is needed to restore balance. On removal, multiple rotations may be necessary.
Single Rotation
Rotation preserves order. Inner children become the child of the node which was replaced.
Double Rotation
Two single rotation at different locations, either right-left or left-right. First rotation is deeper than the second.
Splay Tree
Any valid BST. Amortized O(log n) access: any sequence of m operations takes O(m log n) for large m. Any node that is inserted, removed, or accessed gets splayed to the root.
Spatial locality
Memory locations near a recently accessed one are likely to be accessed soon.
Temporal locality
A recently accessed location is likely to be accessed again within a short period of time.
Red Black Trees
1. Every node is red or black
2. The root is black
3. If a node is red, its children must be black
4. Every path from a node to a NULL pointer must contain the same number of black nodes
B-Trees
Popular in disk storage. Keys are in nodes. Data is in the leaves.
Heap
A type of priority queue. Stores data which is order-able. O(1) access to highest priority item.
Trie
Has only part of a key for comparison at each node.
Hash Table
Constant access time (on average).
Hash function
takes an object and tells you where to put it.
Collision
Entering into a space already in use.
Load Factor
items(n) / table size
Separate Chaining
Uses a linked list to handle collisions at a specific point.
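A minimal sketch of separate chaining; Python lists stand in for the linked lists, and the class name is illustrative:

```python
class ChainedHashTable:
    """Hash table resolving collisions with a chain per bucket."""
    def __init__(self, size=11):          # prime size reduces collisions
        self.buckets = [[] for _ in range(size)]

    def put(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)   # key exists: update in place
                return
        chain.append((key, value))        # collision: append to the chain

    def get(self, key):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for k, v in chain:
            if k == key:
                return v
        raise KeyError(key)
```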
Open addressing
Uses probes to find an open location to store data.
Linear Probing
Checks each spot in order to find available location, causes primary clustering.
Quadratic Probing
On the i-th attempt, checks i^2 slots from the original position; causes secondary clustering. Not guaranteed to find an open table spot unless the table is at least 1/2 empty.
Double Hashing
The process of using two hash functions to determine where to store the data.
Rehashing
Expanding the table: double table size, find closest prime number. Rehash each element for the new table size.
Lazy Deletion
Marking a spot as deleted in a hash table rather than actually deleting it.
Prime number Tables
Reduce the chance of collision.
Bloom Filters
Probabilistic hash table. No means no. Yes means maybe. Multiple (different) hash functions. Can’t resize table. Also can’t remove elements.
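A sketch of the "no means no, yes means maybe" behavior; the salted hash functions here are an illustrative assumption:

```python
class BloomFilter:
    """Probabilistic membership test: a negative answer is definite,
    a positive answer may be a false positive."""
    def __init__(self, m=64, salts=("s1", "s2")):
        self.m = m
        self.bits = [False] * m
        self.salts = salts                # one (different) hash per salt

    def _positions(self, item):
        return [hash(salt + item) % self.m for salt in self.salts]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = True           # bits only turn on: no removal

    def might_contain(self, item):
        # True only if every hash position is set.
        return all(self.bits[p] for p in self._positions(item))
```

Note the table cannot be resized or have elements removed without rebuilding it from scratch.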
Important Sorting Assumptions
1. Sorting an array of integers
2. Length of array is n
3. Sorting least to greatest
4. Can access an array element in constant time
5. Compare ints in the array only with '<'
6. Focus on # of comparisons
Average Lower bound for adjacent swaps
n(n-1)/4 swaps on average, so any sort that swaps only adjacent elements is Ω(n^2)
Inversions
Min: 0
Max: n(n-1)/2
Swapping adjacent elements removes at most 1 inversion
4 Rules of Recursion
1. Base cases: you must always have some base cases, which can be solved without recursion.
2. Making progress: for the cases that are to be solved recursively, the recursive call must make progress toward a base case.
3. Design rule: assume that all recursive calls work.
4. Compound interest rule: never duplicate work by solving the same instance of a problem in separate recursive calls.