Exam Flashcards
Negatives of adjacency lists
Enumerating all nodes is O(n), where n is the number of vertices
if each node has k edges on average, enumerating all edges requires O(n*k) work, which equals O(m) where m is the total number of edges; this scales badly for dense graphs
What is extendible hashing
A dynamic hashing technique where the hash code is treated as a bit string; when a bucket overflows it is split, and more bits of the hash code are used to differentiate keys
Advantages and disadvantages of quadratic probing
Advantages: reduces primary clustering in comparison to linear probing
better distribution of keys in the table
disadvantages: more complex than linear probing and can still fail to find a free slot if the table size isn't a prime number
suffers from secondary clustering, where keys with the same initial hash value follow the same probe sequence
Pseudocode for BFS
BFS(Graph graph, startNode) {
    mark startNode as discovered
    currentLevel = [startNode]              // start with the list containing the start node
    while currentLevel is not empty {
        nextLevel = []                      // prepare a new list for the next level of nodes
        for each node in currentLevel {
            visit(node)                     // process or visit the node
            for each edge incident on node {
                if edge is unexplored {
                    neighborNode = graph.opposite(edge, node)   // the other endpoint of the edge
                    if neighborNode is undiscovered {
                        label(edge, "discovery")                // discovery edge: it leads to a new node
                        mark neighborNode as discovered
                        add neighborNode to nextLevel
                    } else {
                        label(edge, "cross")                    // cross edge: the neighbour was already discovered
                    }
                }
            }
        }
        currentLevel = nextLevel            // move on to the next level of nodes
    }
}
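A minimal Java sketch of the same level-by-level idea, assuming the graph is given as a plain adjacency-list map from each node to its neighbours (a simplification for illustration, not the Graph interface used in the pseudocode):
import java.util.*;

// Sketch: level-by-level BFS over an adjacency-list map; returns nodes in visit order.
static List<Integer> bfsOrder(Map<Integer, List<Integer>> adj, int start) {
    List<Integer> visitOrder = new ArrayList<>();
    Set<Integer> discovered = new HashSet<>();
    List<Integer> currentLevel = new ArrayList<>(List.of(start));
    discovered.add(start);
    while (!currentLevel.isEmpty()) {
        List<Integer> nextLevel = new ArrayList<>();
        for (int node : currentLevel) {
            visitOrder.add(node);                                    // "visit" the node
            for (int neighbor : adj.getOrDefault(node, List.of())) {
                if (discovered.add(neighbor)) {                      // true only the first time we see it
                    nextLevel.add(neighbor);
                }
            }
        }
        currentLevel = nextLevel;                                    // move on to the next level
    }
    return visitOrder;
}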
Advantages and disadvantages of adjacency matrices
Fast edge lookup: a simple query A[i][j] is O(1)
disadvantages: requires O(n^2) memory for n nodes even if the graph is sparse
finding all neighbours is an O(n) operation as it involves scanning an entire row/column
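A rough sketch of those two operations, assuming nodes are numbered 0..n-1 and the matrix is a boolean array:
import java.util.ArrayList;
import java.util.List;

// Sketch: adjacency-matrix lookups.
static boolean hasEdge(boolean[][] adj, int i, int j) {
    return adj[i][j];                          // O(1) edge lookup
}

static List<Integer> neighbors(boolean[][] adj, int i) {
    List<Integer> result = new ArrayList<>();
    for (int j = 0; j < adj.length; j++) {     // O(n): scan the whole row for node i
        if (adj[i][j]) result.add(j);
    }
    return result;
}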
What’s the length of a path
The number of edges it traverses
Why does Dijkstra's algorithm not work with negative weights
The algorithm assumes that adding an edge to a path will never shorten it. Negative weights violate this assumption, leading to incorrect results
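As a small made-up illustration: take edges A→B with weight 2, B→D with weight 1, A→C with weight 3 and C→B with weight -4. Dijkstra finalises B at distance 2 and uses it to set D to 3; only afterwards does the cheaper route A→C→B (total -1) appear, so the true shortest path to D (A→C→B→D, weight 0) is never found.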
Advantages and disadvantages of extendible hashing
Advantages: don’t need to resize entire table, dynamically adjusts to varying data sizes, handles collisions efficiently
disadvantages: more complex to implement and splitting buckets requires rehashing and reorganising data
Write an implementation of insertion sort (in place)
public void insertionSort(int[] array) {
    for (int i = 1; i < array.length; i++) {
        int temp = array[i];          // element to insert into the sorted prefix
        int j = i - 1;
        while (j >= 0 && array[j] > temp) {
            array[j + 1] = array[j];  // shift larger elements one place right
            j--;
        }
        array[j + 1] = temp;          // drop the element into its correct position
    }
}
Advantages of separate chaining
Handles collisions automatically
buckets can grow dynamically
performs well with a good hash function and low load factor
What's a path
A path is a sequence of nodes and edges that links two nodes
How to remove a node in an edge list
Find the node object
remove from node list
traverse edge list and remove edges involving the node
Worst and average case for separate chaining
Worst case: the hash function is poor and all keys map to the same bucket, so it's essentially a linked list
inserting is O(1) and searching/deleting becomes O(n)
average case: the hash function distributes keys evenly
search/delete is O(n/m)
if the load factor is low this is close to O(1)
Approach to BFS
Choose a node as first node, mark as visited
discover all neighbours that haven’t been discovered
visit each undiscovered node
once all neighbours are visited move to their discovered neighbours
Properties of a good hash function
Uniform distribution of keys
minimal collisions
fast computation
Definition of a balanced binary search tree
In a balanced binary search tree, the heights of the left and right subtrees of every node differ by at most 1
What's a strongly connected graph
A directed graph is strongly connected when there is a path between every pair of nodes while respecting edge directions
What's the recursive implementation of post order traversal
public void traverse(BinaryTreeNode node) {
    if (node.left != null) traverse(node.left);
    if (node.right != null) traverse(node.right);
    visit(node.element);
}
What's the complexity of rebalancing an AVL tree
A single rotation is an O(1) operation
propagating changes is an O(log n) operation: after a rotation the height of the subtree may change, which requires updates to the balance factors of its parent, grandparent and so on. In the worst case the propagation may need to travel up the full height of the tree, which is proportional to O(log n)
Complexity of removal from an edge list
O(m) where m is the number of edges
What is a bucket overflow
A bucket overflows when it can’t store any more key value pairs due to collisions
How do we label nodes when updating a tree to deal with addition
w - the node that was added
z - the first unbalanced node above w
y - the child of z with larger height
x - the child of y with larger height
How is the minimal-disruption value determined when deleting a node from a binary search tree
It's the smallest value larger than the deleted node (the leftmost node of the right subtree, its in-order successor) or the largest value smaller than it (the rightmost node of the left subtree, its in-order predecessor)
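A minimal sketch of finding that replacement value (the in-order successor), assuming a BinaryTreeNode with left/right references as used in the traversal cards:
// Sketch: smallest value in the right subtree; assumes node.right != null.
static BinaryTreeNode inOrderSuccessor(BinaryTreeNode node) {
    BinaryTreeNode current = node.right;   // step into the right subtree
    while (current.left != null) {
        current = current.left;            // keep going left to find the smallest value there
    }
    return current;
}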
What is a disadvantage of using heuristics
They don’t always return the correct or most optimal solution
Pseudocode for Dijkstra's weighted shortest-path algorithm
findShortestWeightedPaths(graph, sourceNode) {
    unvisitedNodes = graph.getAllNodes()            // set of all unvisited nodes
    for each node in unvisitedNodes {
        shortestDistance[node] = infinity           // initialise all distances to infinity
    }
    shortestDistance[sourceNode] = 0                // distance to the source node is 0
    while unvisitedNodes is not empty {
        currentNode = node in unvisitedNodes with smallest shortestDistance
        remove currentNode from unvisitedNodes      // mark the current node as visited
        for each edge connectedEdge incident on currentNode {
            neighborNode = graph.getOppositeNode(connectedEdge, currentNode)
            edgeWeight = graph.getEdgeWeight(connectedEdge)             // weight of the edge
            newDistance = shortestDistance[currentNode] + edgeWeight    // tentative distance via currentNode
            if newDistance < shortestDistance[neighborNode] {
                shortestDistance[neighborNode] = newDistance            // relax: keep the smaller distance
            }
        }
    }
}
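A compact Java sketch of the same idea, swapping the linear scan for the minimum with java.util.PriorityQueue, and assuming a hypothetical weighted adjacency-list representation where each node's list holds {neighbour, weight} pairs; stale queue entries are skipped rather than decreased in place:
import java.util.*;

// Sketch: Dijkstra over a weighted adjacency list; nodes are numbered 0..n-1.
static int[] shortestDistances(List<List<int[]>> adj, int source) {
    int n = adj.size();
    int[] dist = new int[n];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[source] = 0;

    // Queue entries are {node, distance}, ordered by distance (smallest first).
    PriorityQueue<int[]> queue = new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[1]));
    queue.add(new int[]{source, 0});

    while (!queue.isEmpty()) {
        int[] entry = queue.poll();
        int node = entry[0], distanceSoFar = entry[1];
        if (distanceSoFar > dist[node]) continue;        // stale entry: a shorter path was already found
        for (int[] edge : adj.get(node)) {               // edge = {neighbour, weight}
            int neighbor = edge[0], weight = edge[1];
            int newDistance = distanceSoFar + weight;
            if (newDistance < dist[neighbor]) {
                dist[neighbor] = newDistance;            // relax: keep the smaller distance
                queue.add(new int[]{neighbor, newDistance});
            }
        }
    }
    return dist;
}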
Advantages of adjacency lists over edge lists
No need to traverse the entire edge list
direct access to adjacency lists ensures O(1) removal time per edge
Complexity of node removal in an adjacency list
O(k) where k is the degree of the node
edge removal is O(1) per edge as we can directly access the adjacency lists
How to remove a node in an adjacency list
Find the node object
remove from node list
remove incident edges:
access the node's adjacency list; for each incident edge, remove the edge itself and update the adjacency list of the other endpoint to remove its reference to this edge
What's a weakly connected graph
A directed graph is weakly connected if it is connected when the direction of its edges is ignored
Define a connected graph
A graph is connected if there’s a path between every pair of nodes
What’s a subgraph
A subset of the original nodes and edges
Definition of a tree in graphs
A tree is an undirected graph in which any two nodes are connected by exactly one path
What's path length
Path length is the number of edges in the path
What is the degree of a node in a graph
The degree of a node is the number of edges connected to it
undirected graph: the total number of edges connected to the node
directed graph: in-degree - edges coming into the node
out-degree - edges going out of the node
What's a spanning tree
A spanning tree is a subgraph that connects all nodes, has no cycles and contains exactly n-1 edges, where n is the number of nodes
What are the differences between directed and undirected graphs
Directed graphs: edges have a direction, going from one node to another
undirected graphs: edges have no direction; the connection is bidirectional
What’s a graph formally
A graph G =(V,E)
V: a set of nodes
E: a set of edges connecting the nodes
Advantages and disadvantages of hashing over arrays
Advantages: hashing provides O(1) average-case search, insertion and deletion, compared to O(n) for unsorted arrays and O(log n) search for sorted arrays
hash tables can also grow dynamically with resizing or extendible hashing, unlike fixed-size arrays
hash tables can allocate memory dynamically for buckets, whereas arrays allocate memory upfront and their size is fixed
disadvantages: hash tables don't maintain the order of elements, while arrays allow ordered traversal
hashing works with unique keys, and duplicate keys require extra handling, unlike arrays where duplicates are inherently supported
arrays allow direct access to elements via indices, whereas hashing requires computing a hash code, which adds extra overhead
What's the advantage of using sorted lists in chaining
Searching within a bucket becomes O(log(n/m)) using binary search, but this adds overhead during insertion
Advantages and disadvantages of secondary hashing
Advantages: minimises clustering by providing a better distribution of probes
disadvantages: more complex to implement than quadratic and linear probing, and requires careful choice of the second hash function (it must never evaluate to 0) to avoid infinite loops
What’s secondary hashing
A second hash function is used to calculate the step size for probing
Disadvantages of linear probing
Leads to primary hashing where groups of filled slots form increasing chances of collisions and slowing future insertions and searches
performance degrades as table fills up
Advantages of linear probing
Simple to implement and cache friendly due to sequential memory access
What is quadratic probing
Quadratic probing skips slots in a quadratic pattern.
h(k,i) = (h(k)+i^2) mod m
What is linear probing
A collision handling technique where the next available slot is searched sequentially;
h(k,i) = (h(k)+i) mod m
How does open addressing handle collisions
Open addressing stores key-value pairs directly in the hash table. If a collision occurs it probes for the next available slot using techniques such as
linear probing
quadratic probing
secondary hashing
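A minimal sketch of the probe loop for linear probing (quadratic probing and secondary hashing only change how the step is computed); the parallel keys/values arrays here are invented for illustration:
// Sketch: insert with linear probing; a null entry in keys[] marks an empty slot.
static boolean insert(String[] keys, int[] values, String key, int value) {
    int m = keys.length;
    int home = Math.floorMod(key.hashCode(), m);          // h(k)
    for (int i = 0; i < m; i++) {
        int slot = (home + i) % m;                        // linear: h(k, i) = (h(k) + i) mod m
        // quadratic would use (home + i * i) % m; secondary hashing (home + i * h2(key)) % m
        if (keys[slot] == null || keys[slot].equals(key)) {
            keys[slot] = key;                             // claim the slot (or overwrite the same key)
            values[slot] = value;
            return true;
        }
    }
    return false;                                         // every slot probed: table is full
}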
How does separate chaining handle collisions
Each bucket stores a list (or chain) of key-value pairs; when keys collide they are added to the same bucket's list
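A small Java sketch of that idea (class and field names invented for illustration), using an array of java.util.LinkedList buckets:
import java.util.LinkedList;

// Sketch: separate chaining with an array of linked lists of (key, value) entries.
class ChainedHashTable<K, V> {
    private static class Entry<K, V> {
        final K key;
        V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry<K, V>>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int numBuckets) {
        buckets = new LinkedList[numBuckets];                   // generic-array workaround
        for (int i = 0; i < numBuckets; i++) buckets[i] = new LinkedList<>();
    }

    private int bucketIndex(K key) {
        return Math.floorMod(key.hashCode(), buckets.length);   // which chain this key belongs to
    }

    void put(K key, V value) {
        for (Entry<K, V> entry : buckets[bucketIndex(key)]) {
            if (entry.key.equals(key)) { entry.value = value; return; }  // key already present: update
        }
        buckets[bucketIndex(key)].add(new Entry<>(key, value));          // colliding keys share the list
    }

    V get(K key) {
        for (Entry<K, V> entry : buckets[bucketIndex(key)]) {
            if (entry.key.equals(key)) return entry.value;
        }
        return null;                                                     // key not found
    }
}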
What’s a collision in hashing
A collision occurs when 2 keys hash to the same bucket
What’s load factor in a hash table
The load factor is the ratio of the number of keys n to the number of buckets m
LF=n/m
What is a bucket
A bucket is a container at an array index where one or more key value pairs may be stored
What is hashing
Hashing maps keys to values using a hash function enabling efficient data storage and retrieval
What’s the balance factor in an AVL tree
The difference in heights between the left and right subtrees of a node
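A tiny sketch of computing it, assuming a BinaryTreeNode with left/right references and taking the height of an empty subtree as -1:
// Sketch: balance factor = height(left) - height(right).
static int height(BinaryTreeNode node) {
    if (node == null) return -1;                     // empty subtree
    return 1 + Math.max(height(node.left), height(node.right));
}

static int balanceFactor(BinaryTreeNode node) {
    return height(node.left) - height(node.right);   // outside [-1, 1] means rebalancing is needed
}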
When is rebalancing needed in a BST
After a node is added or removed and the balance factor of any node is no longer within [-1, 1]
What are the cases to consider when deleting a node from a binary tree
Leaf node
node with one child
node with 2 children
What is the complexity of searching in a balanced vs unbalanced binary search tree and why
In a balanced binary search tree with n nodes, the height is log base 2 of n
each step, the target value is compared with the current node's value and we move to either the left or right child, so each step halves the search space. The search path is proportional to the height of the tree, so the search time is O(log n)
if the tree becomes unbalanced its height can grow as large as n in the worst case. Its time complexity becomes O(n) as you may have to visit every node to find the target
What is the key invariant of a binary search tree
For each node all labels in the left subtree are less than the nodes label and all labels in the right subtree are greater
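A short sketch of search relying on that invariant, assuming a hypothetical BSTNode type with an int label and left/right children:
// Sketch: BST search; each comparison discards one whole subtree.
static boolean contains(BSTNode node, int target) {
    if (node == null) return false;           // fell off the tree: value not present
    if (target == node.label) return true;
    if (target < node.label) {
        return contains(node.left, target);   // everything in the right subtree is larger, skip it
    } else {
        return contains(node.right, target);  // everything in the left subtree is smaller, skip it
    }
}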
Compare the strategies of choosing a pivot between the first, last or a randomly selected element, or the median of 3 values
First/last: simple to implement but can lead to unbalanced partitions (e.g. worst case O(n^2)) if the list is already sorted
median of 3: improves balance and avoids worst-case scenarios but adds extra computation to find the median
random: generally ensures balanced partitions on average, avoiding input-specific patterns, and is computationally cheap
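A small sketch of the median-of-three choice (helper name invented for illustration); the partition step would then use the element at the returned index as the pivot:
// Sketch: pick the median of the first, middle and last elements as the pivot index.
static int medianOfThreeIndex(int[] array, int low, int high) {
    int mid = low + (high - low) / 2;
    int a = array[low], b = array[mid], c = array[high];
    if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;   // b is the middle value
    if ((b <= a && a <= c) || (c <= a && a <= b)) return low;   // a is the middle value
    return high;                                                // otherwise c is
}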
Compare structural vs value-driven algorithms and their advantages/disadvantages
Structural algorithms like merge sort:
approach: divide the list based on structure (e.g. splitting by size)
Advantages: predictable performance (O(n log n)) in all cases and no reliance on input values
disadvantages: uses extra space due to auxiliary arrays
value-driven algorithms like quicksort:
approach: divide based on value comparisons
advantages: can be done in place using O(1) extra memory
disadvantages: worst-case performance is O(n^2) when partitions are imbalanced, and it requires careful pivot selection
What's the time complexity of merge sort and why
Splitting produces O(log n) levels of division and merging at each level is an O(n) operation, making it O(n log n) overall
How does merge sort use divide and conquer
It splits the list into two halves, recursively sorts them and merges the sorted halves
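A compact top-down Java sketch of that approach, using an auxiliary array for the merge (the extra space mentioned in the structural-vs-value-driven card); called as mergeSort(array, 0, array.length - 1):
import java.util.Arrays;

// Sketch: top-down merge sort; sorts array[low..high] inclusive.
static void mergeSort(int[] array, int low, int high) {
    if (low >= high) return;                      // 0 or 1 elements: already sorted
    int mid = low + (high - low) / 2;
    mergeSort(array, low, mid);                   // sort the left half
    mergeSort(array, mid + 1, high);              // sort the right half
    merge(array, low, mid, high);                 // merge the two sorted halves
}

static void merge(int[] array, int low, int mid, int high) {
    int[] aux = Arrays.copyOfRange(array, low, high + 1);       // auxiliary copy of the range
    int i = 0, j = mid - low + 1, k = low;                      // heads of left and right halves
    while (i <= mid - low && j <= high - low) {
        array[k++] = (aux[i] <= aux[j]) ? aux[i++] : aux[j++];  // take the smaller head element
    }
    while (i <= mid - low) array[k++] = aux[i++];               // copy leftovers from the left half
    while (j <= high - low) array[k++] = aux[j++];              // copy leftovers from the right half
}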
Complexity of selection sort and why
Selection sort is O(n^2) because it performs roughly n comparisons for each element; since there are n elements, the number of comparisons grows quadratically
What is the idea behind selection sort
Repeatedly find the smallest element and place it at the beginning of the list
What is the time complexity of insertion sort in its average, best and worst case and why
It’s average case is o(n^2) because well roughly have to do n/4 swaps for n elements giving you an average complexity of o(n^2)
its worst case is o(n^2) and it occurs when it’s reverse sorted. For each of the n elements it performs up to n shifts
its best case is o(n) this is when the list is sorted because each element only needs a single comparison to check whether it’s in the right place
Why are heuristics used
They provide faster, approximate solutions. Useful when exact algorithms take too long or fail to terminate
Difference between an algorithm and a heuristic
Algorithms terminate for finite inputs; heuristics aren't always guaranteed to terminate and give approximate results
How are expressions derived in postfix notation
Operators come after their operands; the expression is derived using post-order traversal
How are expressions derived in infix notation
Infix notation uses in-order traversal and brackets are used to surround each sub-expression. Operators sit between their operands
How are expressions derived in prefix notation
Prefix notation uses pre-order traversal; operators come before their operands
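For example, the expression tree for (2 + 3) * 4 yields: prefix (pre-order) * + 2 3 4, infix (in-order, bracketing each sub-expression) ((2 + 3) * 4), and postfix (post-order) 2 3 + 4 *.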
Implement in order traversal recursively
public void traverse(BinaryTreeNode node) {
    if (node.left != null) traverse(node.left);
    visit(node.element);
    if (node.right != null) traverse(node.right);
}
How does in order traversal work
Visit the left subtree, then the node, then the right subtree
How does post order tree traversal work
Visit the children then visit the node
What's the recursive implementation of preorder tree traversal
public void traverse(BinaryTreeNode node) {
    visit(node.element);
    if (node.left != null) traverse(node.left);
    if (node.right != null) traverse(node.right);
}

private void visit(Object element) {
    System.out.print(element + ",");
}
How does pre order tree traversal work
Visit the node, then visit that node's children
Write selection sort in Java
public void selectionSort(int[] array) {
    for (int i = 0; i < array.length - 1; i++) {
        int min = i;                          // index of the smallest element seen so far
        for (int j = i + 1; j < array.length; j++) {
            if (array[min] > array[j]) {
                min = j;
            }
        }
        int temp = array[i];                  // swap the smallest element into position i
        array[i] = array[min];
        array[min] = temp;
    }
}
Write bubble sort
public void bubbleSort(int[] array) {
    for (int i = 0; i < array.length; i++) {
        for (int j = 0; j < array.length - i - 1; j++) {
            if (array[j] > array[j + 1]) {
                int temp = array[j];          // swap adjacent out-of-order elements
                array[j] = array[j + 1];
                array[j + 1] = temp;
            }
        }
    }
}
Disadvantages of array based stack implementation
Arrays have a fixed size so the stack can't be resized
the user could easily overestimate or underestimate the number of elements their stack will need to store, which could lead to memory wastage or unexpected errors
What are some advantages of an array based stack implementation
The implementation of an array-based stack is simple since it only requires an index to keep track of the top element
pushing and popping from an array-based stack runs in O(1) time, so it's efficient
array-based implementations have lower memory overhead as they don't need to store node references/pointers
A stack is a LIFO data structure what does this mean
This means the last element that is added to the stack is the first that is popped out
Write an array-based stack implementation for the following interface
public interface IStack<T> {
    void push(T element) throws StackOverflowException;
    T pop() throws StackEmptyException;
    T top() throws StackEmptyException;
    int size();
    boolean isEmpty();
    void clear();
}

public class ArrayStack<T> implements IStack<T> {
    private int size;
    private T[] elements;

    @SuppressWarnings("unchecked")
    public ArrayStack(int maxSize) {
        size = 0;
        elements = (T[]) new Object[maxSize];  // generic array created via Object[]
    }

    public void push(T element) throws StackOverflowException {
        if (size < elements.length) {
            elements[size++] = element;        // store at the top, then grow the size
        } else {
            throw new StackOverflowException();
        }
    }

    public T pop() throws StackEmptyException {
        if (isEmpty()) {
            throw new StackEmptyException();
        } else {
            size--;                            // shrink the stack first
            return elements[size];             // the old top element
        }
    }

    public T top() throws StackEmptyException {
        if (isEmpty()) {
            throw new StackEmptyException();
        } else {
            return elements[size - 1];         // peek at the top without removing it
        }
    }

    public int size() {
        return size;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void clear() {
        size = 0;
    }
}
Implement a queue using a singly linked list; define the generic ListNode and Queue classes as well. Use the following interface
interface IQueue<T> {
    void enqueue(T element);
    T dequeue() throws QueueEmptyException;
    int size();
    boolean isEmpty();
}

public class ListNode<T> {
    public T element;
    public ListNode<T> next;

    public ListNode(T element) {
        this(element, null);
    }

    public ListNode(T element, ListNode<T> next) {
        this.element = element;
        this.next = next;
    }
}

public class Queue<T> implements IQueue<T> {
    private ListNode<T> front;
    private ListNode<T> back;
    private int size = 0;

    public Queue() {
        this.front = null;
        this.back = null;
        size = 0;
    }

    public void enqueue(T element) {
        ListNode<T> newNode = new ListNode<>(element);
        if (isEmpty()) {
            front = newNode;          // the only element is both front and back
            back = front;
        } else {
            back.next = newNode;      // append at the back
            back = back.next;
        }
        size++;
    }

    public T dequeue() throws QueueEmptyException {
        if (isEmpty()) {
            throw new QueueEmptyException();
        }
        T frontElement = front.element;  // save the element before unlinking it
        front = front.next;              // advance the front pointer
        if (front == null) {
            back = null;                 // the queue is now empty
        }
        size--;
        return frontElement;
    }

    public int size() {
        return size;
    }

    public boolean isEmpty() {
        return size == 0;
    }
}