dsa Flashcards
what is the amortised cost of an algorithm?
the mean complexity of a set of instructions (considering all cases and how often they occur)
what does it mean for two algorithms to be asymptotically equal? ( f(x)~g(x) )
the algorithms tend to the same time i.e.
f(x)/ g(x) = 1 such as if f(x) is x^2 and g(x) is x^2 +3x
what is theta complexity?
if f(n) = ϴg(n):
ag(n) ≥ f(n) ≥ bg(n) ≥ 0 ∀ n≥n₀
has an upper and lower bound so f and g grow at the same rate
what is little o notation?
if f(n) = o(g(n)) then f(n) is ultimately negligible to g(n) or lim n→∞ f(n)/g(n) = 0
what does it mean if an algorithm has Ω(x^2) complexity
the algorithm grows at the same rate or faster than x^2 as n increases
what is implicit and explicit memory management
implicit: doesn’t expose memory locations to programmer and has garbage collection, bounds of data structures are checked, memory allocation and memory freeing is automatic.
explicit: all of these things have to be done automatically
what does it mean if an algorithm has O(x^2) complexity
the algorithm grows at the same rate or slower than x^2 as n increases
what is size, height, depth, order and balance in a tree?
size: number of elements
height: number of levels (empty=-1, height starts from 0)
depth: how far down an element is from the top
order: maximum number of children of a node
balance: difference in height of each subtree of a node
what is special about a quad tree?
each node points to 4 other quad trees but 3 of them has to be a leaf. they are used for representing images
how do you delete an element from a binary search tree assuming the node has 2 children?
delete it and fill its place with the leftmost node on the right subtree. then replace that node with its right subtree if there is one
what is an AVL binary tree?
a binary tree where each node has a balance of either 1, 0 or -1
what is a perfect and minimal AVL tree?
perfect: a full AVL tree
minimal (Fibonacci): an AVL tree where every node has a balance of 1 or -1 (as empty as an AVL tree is allowed)
what are the two formulas to calculate the size of a minimal tree?
|Tₕ₊₂| = 1+|Tₕ|+|Tₕ₊₁|
|Tₕ|= Fₕ₊₃
what is the formula to calculate the size of a full binary tree?
|Tₕ|= 2ʰ⁺¹ -1
balance an LL, RR, RL and LR tree
Check answers online
what are the characteristics of a B+ tree of order m?
- node has 2-m children (unless its a leaf)
- non-node, non-leaf nodes have m/2 -m children
- all leaf nodes are on the same level
- balanced
- operations are O(log n) complexity
- data structure is stored in secondary storage used for databases
what is a block of memory?
a way that disks store data that groups data together usually of size 4-64Kb and they can only operate on whole blocks at the time
what is a discriminator in a B+ tree?
a value used to decide which path to take down the tree in searching for a record
what is the maximum number of children in a B+ tree?
the maximum isn’t fixed because each child is a variable length key and together they are stored in a block of memory. there will always however be an amount of children such that the block of memory is at least half full.
what are the two types of B+ tree and how do they work?
record-imbedded B+ tree: non-leaf nodes store disk addresses which point to nodes that have key values in the range of the two discriminators that the disk address is sandwiched between. whole records are stored in leaf nodes. leaves also point to their neighbours.
index B+tree: only stores keys and disk addresses of records in leaves which accompanies another data structure storing the records
how does inserting work on a record-imbedded B+ tree?
Search with the key to find where to insert the record. if it has space, insert it.
If not, split the leaf into two leaves and add the record into the correct one. Then add the key into the node above (posting) as a discriminator if there is space.
If not, continue splitting and posting the discriminators. If necessary, a new node could be made.
how does deletion work on a record-imbedded B+ tree?
record is deleted and if it makes the node less than half full, it and it’s neighbour’s contents are distributed. if they’re both less than half full, they are merged and the discriminator is removed from the node above and this process is repeated until
how does bulk-loading work on a record-imbedded B+ tree?
records are ordered and then are inserted from left to right at leaf level under a root node. when a node gets full, make another node on the same level and add a discriminator to the parent node.
what is the difference between a primary and secondary index B+ tree?
the disk addresses in primary trees point to the block where all records that are between the two discriminators either side of it are. in secondary trees, in the leaf level, each discriminator is a key and the disk address is that of the specific record because the records are not stored in the order that the tree is set to
what is a binary heap tree?
a binary tree where all leaves are on the last level and are as far to the left as possible. also parents are larger or equal to their children
how does inserting, deleting and updating work in a BHT?
inserting: add node to next available space and bubble up (swap with parent) where necessary (if parent is lesser)
deleting: delete node and fill with the last leaf node. then bubble down (swap with greatest child) where necessary (if any child is greater) and bubble up where necessary
updating: same as deleting only not replaced with lowest element
how could a BHT be implemented and sorted from a random order
use an array with the root at array[1], its children at array[2] and array[3], the next level at array[4-7] etc. To sort (heapify), bubble down every element from array[n/2] to array[1]
what three ways could two BHTs be merged?
1: each element from one tree is inserted one by one to the other one - O(nlogn)
2: elements are added from the larger of the two to the smaller until both trees are the same size, then they are put in a tree with a dummy root as the left and right subtree and then the root is deleted and the tree is balanced- O(nlogn) but faster
3: the two arrays are concatenated and it is heapified- O(n)
what is a binomial tree and a binomial heap??
binomial tree: a binomial tree of order 0 is a single node and a binomial tree of order k has children that are roots of binomial trees of order k-1, k-2, …, 1, 0
binomial heap: a group of binomial trees but there can only be 0 or 1 of trees of each order. also each node in the binomial trees must have less than or equal priority to its parent.
how can two binomial trees of the same order be merged?
an additional pointer is added to the root of one of the trees which points to the second tree.
what is the difference between a stable and unstable sorting algorithm?
stable sorting algorithms don’t swap the order of equal elements, unstable algorithms might
what is heap sort?
enters all elements (as an order 1 binomial tree) into a binomial heap and then takes the largest out one by one, adding them to the end of the new output array
what are the two types of open-addressing in hash tables?
linear probing:
double hashing:
what is the load factor of a hash table?
load factor: the number of elements/ the size of the hash table
what do we have to assume for a hash table using direct chaining to have all operations O(1)
a maximum load factor and a good hash function
how do tombstones work in linear probing?
when deleting an element, replace with a tombstone. if an item needs to be inserted, the tombstone can be replaced
where are the backup locations for elements in a hash table that uses double hashing?
1: ( hash1(key)+hash2(key) ) MOD T
2: ( hash1(key)+2hash2(key) ) MOD T
3: ( hash1(key)+3hash2(key) ) MOD T
where T is the hash table size
what is a:
-simple graph
-path
-simple path
-cycle
simple graph: a graph where there are no loops and no more than one edge between nodes
path: a set of connecting edges
simple path: a path where a node cannot appear more than once unless it is the starting and ending node of the path
cycle: a path that ends with the same node that it started with
what is a:
-connected undirected graph
-weakly connected directed graph
-strongly connected directed graph
connected undirected graph: an undirected graph where every pair of nodes is connected by a path
weakly connected directed graph: each node connected to every other node in at least one direction
strongly connected directed graph: each node is connected to every other node both ways
what ways could a graph be represented as?
adjacency matrix: vertices are numbered and a value in a matrix M[i][j] shows the weight of the edge i→j or 0/1 if unweighted
adjacency lists: each node has a list of nodes it is connected to together with the weight if there is one
how does Djikstra’s shortest path algorithm work?
arrays store current shortest path length to each node (initially ∞), current predecessor and if each node has been checked yet. then, starting at the first node, each
outgoing edge is checked and if the weight + the current shortest path length < that node’s current shortest path length, it is updated along with its predecessor. then the outgoing node is marked as checked until all nodes are checked. the next node to be checked will be the current closest unchecked node. the shortest path can be found by backtracking the predecessors of the final node
how could the operation of finding the shortest path be sped up from O(n^2) on a graph stored using adjacency lists?
use a min-priority queue to find which unfinished node has the shortest distance to use next. put all nodes into the binary/ Fibonacci heap with the priority being the distance and delete the minimum priority one each time and update the heap. Fibonacci heaps are slightly faster at this but both reduce complexity to nlogn)
what is a minimal spanning tree?
a collection of edges on a graph that covers each node only once and the added weight of all the edges is as low as possible
how does the Jarnik-Prim algorithm work to find the minimal spanning tree on a graph?
works same way as Dijkstra’s algorithm but the edges are obtained from the nodes’ predecessors as each node to its predecessor is an edge
do the flashcards on the last pdf on dsa (forrests)