DSALG Flashcards
Name the 3 linear data structures
Array, Linked list, Hash table
Name all the hierarchical data strucutres
Binary Tree
Binary Search Tree
AVL Tree
Splay Tree
Heap
2-3 Tree
B Tree
Name the 3 graphical data structure algorithms
Tree traversal(Breadth first and Depth first search)
Shortest Path Trees( Dijkstra’s algorithm)
Spanning Trees(Prim and Kruskals algorithm)
What are the 2 types of arrays
Sorted and unsorted arrays
What are the special cases of arrays
Stacks ADT, Queue ADT, Priority queue
What are the special cases of linked lists
Double linked list
Circularly linked lists
Skip list
What is a doubly linked list
A linked list in which every node has a forward and a backward link
What is a circularly linked list
A SLL in which the last node contains the
reference to the first node in the SLL
What is a skip list
A probabilistic data structure, based on parallel linked lists, with an improved efficiency
What is a Hash table
A searching tool that uses hashing for the time-space trade off
What are the special cases for a B tree
B * Tree
B+ Tree
How is the heap structure applied
Priority Queue
Huffman coding
What is an algorithm
A set of instructions on how to complete a specific task
What are the different classifications of algorithms?
Brute force
Divide and Conquer
Backtracking
Greedy
What is a backtracking algorithm
Keep trying but when you a reach a point where things don’t work out you go back and try a different possibility
What is a greedy algorithm
An algorithm where a decision is made that is deemed good but no consideration is given to future consequences
What is a data structure
A way to store and organise data in order to facilitate access and modifications
What is a hierarchical data structure
A data structure where each node has a unique predecessor and many successors
What is a linear data structure
A data structure where each node has a unique predecessor and a unique successor
What is a graphical data structure
A data structure where each node has many predecessors and many successors
What is a set structure
No predecessor and no successor
What does ADT stand for
Abstract data type
What is an ADT
A collection of data and associated methods stored as a single module
Can data be accessed directly in an ADT
No data is accessed indirectly through its methods to access and modify
Compare an ADT and a data structure
ADT:
What it can do
Implementation independent
Data structure:
How it does it
Implementation dependent
What is a stack
A collection of objects where only the most recently inserted object (Top) can be removed at anytime works on LIFO(Last in first out) basis.
Name some stack methods
- Push-Add an item to the top of the stack
- Pop-Remove an item from the top of the stack
- Peek-Examine item at the top of the stack
- Empty-Determine if the stack is empty
- Full-Determine if the stack is full
What is a queue
A collection of objects organised such that the object that has been stored in the queue the longest is the next one removed.
Works on a FIFO (First in First Out)
Name some ADT
Stack
Queue
Linked lists
Hash Table
Where are elements inserted and removed in a queue
Elements are inserted to the tail(back) of the queue and elements are removed from the head(front)
Name some queue methods
- Enqueue - Add to tail of queue
- Dequeue - Remove from the queue
- Full - See if its full
- Empty - See if its empty
- First - See what element is first
What is the big O notation for Constant complexity?
O(1) - The time taken does not grow no matter how the input n varies or when you find the element immediately
What is the big O notation for Linear complexity?
O(n) - As the input end gets bigger the time also increases e.g linear search
What is the big O notation for Polynomial complexity?
O(n^2) - the time it takes increases at a faster rate than the rate of increase in the size of input n e.g bubble sort
What is the big O notation for Logarithmic complexity?
O(log n) - As the input size grows the number of operation grows very slowly
What does complexity of O(logn) mean
An algorithm that repeatedly excludes half the variables during its execution
Name all the searching algorithms
Linear search
Binary Search
What happens in linear search
You search in order until you find what you want
What happens in binary search
Check the middle element. Discard the left or right in the list that doesn’t contain the item. List must be sorted tho
What is the worst, best and average case of binary search
○ Best case O(1)
○ Worst Case O((log2n)
Average O(log2n)
What is the worst, best and average case of linear search
○ Best case O(1)
○ Worst case O(n)
○ Average case N/2
Name the 5 sorting algorithms
Selection Sort
Insertion sort
Bubble sort
Quick sort
Merge Sort
What happens in selection sort
Smallest/largest element is selected from the unsorted array and swapped with the leftmost element of the unsorted section. Item now in its correct final position with the ascending/descending order. Repeat on unsorted side until all are in order
What happens in insertion sort
Item is inserted into the correct position
What happens in bubble sort
Works by swapping the items via comparison. At the end of each pass the largest item is at the end
What is the best and worst case of Insertion
- Best case O(n)
- Worst Case O(n^2)
What is the best and worst case of bubble sort
○ Best case O(n^2)
○ Worst case O(n^2)
What is the best and worst case of selection sort
○ Best case O(n^2)
○ Worst case O(n^2)
What is recursion
Recursion is a alternative method of repeating a code
Its when a function calls itself
When is recursion applied
○ Solution is easy to specify for certain conditions–stopping case.
Rules for proceeding to a new state which is either a stopping case or eventually leads to a stopping case–recursive steps
What happens in merge sort
Merge sort is when you break apart the list then in the final bit you merge so in order
What is the worst, average and best case for merge sort
- Best case O(n log n)
- Worst case O(n log n)
What happens in quick sort
You choose a pivot then compare your pointers and swap accordingly. Then you repeat this for either side of the pivot treating either side as its own list
What is the best, worst and average case of quicksort
- Best case is O(n log n)
- Worst case is n^2
- Average case is O( n logn)
What is a static array
A fixed array
Give some advantages and disadvantages of static arrays
Advantage:
Faster access to each entry as long as the index is known - O(1)
Disadvantage:
Fixed size - can not be easily extended
Can be expensive to maintain
What is a linked list
A collection of objects, called nodes
Its a dynamic data structure
What are the 2 components of nodes
Information to be stored (the data)
Reference to the next node (often called a link
In code its written as
○ head.data
§ Accesses data sorted in the first node.
○ head.link
§ Reference to the second node
Name some advantages and disadvantages of dynamic data structures
Advantages:
○ Easily extended or reduced to fit the data set.
Efficient - O(1) to insert/delete item, once located
Disadvantages:
Does not allow direct access
Uses more memory
What is the difference between a general tree, binary tree and a binary search tree
General Trees: No restriction on the number of child nodes a parent can have
Binary Trees: At most two children per node, with ordering.
Binary Search Trees: At most two children per node but left node < parent and right node > parent makes search and retrieval efficient
What is the formula for the max number if nodes in a binary search tree of height n
(2^n) -1
What is the formula for the height of a balanced binary tree
h=⌊log2(n)⌋+1
What is another way to find the height of a tree
Number of comparison in the worst case
Does the middle element have to be the root in a binary tree
No
What is a balanced tree
A tree in which all paths from the root node to the leaf nodes are of the same length(all leaves are on the same level)
Define a tree
A hierarchical structure that consists of nodes connected by edges
Define an edge
The connection between nodes
Define the root
The topmost node in a tree, which has no parent
Define a branch
A connection between a parent node and its child node
Define subtree
A tree formed by a node and its descendants
Define a leaf(terminal node)
node with no preceding node(nothing after it)
Define degree of a node
The number of branches connected to it. It tells you how many children the node has
Define the level of a node
The distance of a node from the root. The root is at level 0
Define the height/depth of a tree
The length of the longest path from the root to a leaf node
What are the 5 traversals
Pre order
Post order
In order
Breadth first
Depth first
What are the tips for post,pre an in order traversal
- For PreOrder you start with the root
- For post order you end with root also start with left then right
- In order just write out the numbers from smallest to largest
What does AVL mean
Self balancing
What does splay mean
Self organising
What are the pros and cons of balancing in trees
Pros:
If tree reasonably balanced then operations (insertion, deletion and search) are fast
Best case O(log2n)
Cons
If tree unbalanced then operations are slow
Worst case O(n)
Why do we use AVL trees
To avoid the worst case
What is an AVL tree
- An AVL tree is a binary tree search tree in which:
○ The height of the subtrees are at most a difference of 1
○ Left and right subtrees are also their own AVL trees( a recursive definition)
How do we ensure the height of subtrees are at most a difference of 1 in AVL trees
We make use of rotations
What is the balance factor of a node
The height of its left subtree minus the height of its right subtree
How are AVL trees created
Insert the items as per usual then balance them out using rotations
Name the types and the actual rotations used in AVL trees
Single rotations - LL and RR
Double rotations - LR and RL
Where do rotations take place in AVL trees
Around the unbalanced node
Give some pros and cons to AVL Trees
Pros:
Searching is fast
Insertion and deletion is both fast (rotation is required)
Cons:
Since rotation is required its complex to understand and maintain
What is the complexity of AVL trees
O(log2n)
What is a splay tree
. A self organising BST with the property that recently accessed elements are always quick to access again
. It keeps the most commonly used data near the top of the tree
. Self-adjusts after every search, insertion and deletion operation.
What is splaying
The movement of a node S to the root through a series of rotations
How does insertion work in splay trees
Look where to insert the item and insert it
Once inserted splay the item to the root using a series of double rotations
How does deletion work in splay trees
- Search for the item to be deleted
- If its found splay the item to be deleted to the root
- Then remove and replace the deleted item
- If the item to be deleted is not found:
Splay the last item located to the root
How does searching work in splay trees
- Search for the item normally.
- If the item is found splay it to the root
- If not found splay the last item located to the root
Name the type of rotations and actual rotations in Splay trees
Single rotations
o Zig (equivalent to LL and RR rotations used in AVL trees)
Double rotations
o ZigZag (equivalent to LR and RL rotations used in AVL trees)
o ZigZig
What is the best, worst and average case for splay trees
best case : O(1).
worst case : O(n).
average case: O(log2 n).
What is a 2-3 tree
- A multi-way search tree in which:
○ Each node contains 1 or 2 data items
○ Every internal node must have 2 or 3 children
Why is a 2-3 tree balanced
All leaves are on the same level
What is a 2 node
A node that has:
1 item of data
Has either 2 children or no children
What is a 3 node
A node that has:
○ 2 items of data
○ Has either 3 children or no
children
What are the 2 extra rules for nodes in a 2-3 tree
○ Values of all descendants in the centre subtree are less than the value of the second data item
○ Values of all descendants in the right subtree are greater than the value of the second data item.
How does insertion work in 2-3 trees
- The new item is inserted directly into the leaf node
- The steps are as follows:
○ Locate the new leaf node that should contain the new item
○ If the leaf node contains one data item: the new item is inserted into this leaf node and the insertion completed.
○ If leaf node contains two data items: space has to be created – this is achieved by splitting nodes.
How is splitting done in 2-3 trees
- The node is split into two nodes - say, L1&L2.
- The smallest of the 3 data items is placedintoL1.
- The largest of the 3 data items is placedinL2.
- The remaining data item(the middle one) is promoted to the parent node.
(Look at slides for what to do based on typa node)
Give the pros and cons of 2-3 trees
Pros:
Tree always balanced
Insertion, deletion and searching is fast - O(log3n)
Cons:
Complex and maintaining
What is a B tree
- B Tree of order m is a self-balancing multi-way search tree with the following properties to maintain its balance:
○ root node can store between 2 and m-1 items
○ Internal nodes(any node below the root) have between [m/2] and m children
All leaves are on the same level
What is my simplified version of what a B tree
Like a 2-3 tree but you specify the order/degree of nodes with the letter m
What is the formula for the number of data items in a 2 -3 tree
(2^h)-1
and (3^h)-1
What does each node in a B tree represent
A block(or page) of secondary storage which means accessing is more expensive
What is a B* Tree
- The B* Tree is a variation of the B Tree with aim to improve efficiency of access.
- Where each node except the root node is at least 2/3 full
- The frequency of splitting a node is decreased by delaying a split
- When a node overflows:
- A split is delayed by redistributing the keys between the node and its sibling.
- When a split is made two nodes are split into three
What is a B + Tree
Internal nodes of a B+ Tree are used as an index to enable fast access to the data.
The index is a B Tree
Compare a B Tree and a B + Tree
B Tree:
Keys/indices not repeated
Data stored in all nodes
Long search
Deletion is hard
B + Tree:
Stores redundant keys/indices
Data only stored in leaf nodes
Quick search and quicker in general
Deletion is easy
What is Hashing
Hashing involves using an array for the efficient storage and retrieval of information
How does hashing work
- Mapping the input key (data item) to an output index of the array.
- The mapping will not keep a sorted array of keys
Hashing makes the retrieval of data faster
What must a good hashing function have
○ Quick and easy to compute
○ Minimise collisions
○ Achieve an even distribution of keys
. It must aim for the efficiency O(1) for the retrieval and storage of data
.use less memory
What is truncation
Ignores part of the key and uses remaining parts as the hash value. Its fast but can fail to distribute keys uniformly
What is folding
Splits the key into several parts and then combines the parts to get the hash values. It gives better spread than truncation
What is modulo arithmetic
Divide key by table size and uses the remainder as the hash value
Name some ways to solve collisions
Chaining & open addressing
What are the open addressing methods for solving collisions
○ Linear Probing - linear search of the table from the hashed position is used to find an empty slot (h + i)
○ Quadratic Probing (h + i^2)
○ Double Hashing - 2 hash functions are used. The hash value of another hash function g is added to the hashed value h repetitively to find an empty slot(h, h + g, h + 2g and etc)
○ Random Rehashing - RNG used to obtain increments.
What is a priority queue?
An abstract data type(ADT) supporting the following operations:
○ Insert an element into the queue with an associated priority;
○ Remove element from the queue with the highest priority.
○ Locating the element with the highest priority
How are priority queues implemented
Static arrays
Heap data structure because the highest priority item is stored at the root
Why are priority queues efficient
Because you are inserting into the first available space O(1)
When is a heap data structure useful
When it is necessary to repeatedly remove the object with the highest (or lowest) priority, or when insertions need to be scattered with removals of the root node
What is a max heap
A binary tree with the following characteristics:
○ It’s a complete binary tree
○ The key present at the root node must be greatest among the keys present at all of it’s children.
○ The same property must be recursively true for all sub-trees in that Binary Tree
What is a min heap
A binary tree with the following characteristics:
○ The key present at the root node must be minimum among the keys present at all of it’s children.
○ The same property must be recursively true for all sub-trees in that Binary Tree
What happens in heap insertion
- Item is inserted as the next leaf on that level
- If full the item is inserted on the next level on the left side
- Insertion can destroy the heap property
- To restore the heap property the inserted item is moved up the tree until it ends up as the root or it finds a parent which restores the heap property (Keep swapping until the biggest all the smallest is the parent)
what happens in heap removal
Items at the root are easy to remove - leaving us with two sub-trees from which we must re-create a single tree which satisfies properties of a heap.
Compare AVL, Splay and heap
AVL - BST, focuses on the balance(difference of height) of the tree, height is not guaranteed
Splay - BST, focuses on the most recently visited node, height not guaranteed
Heap - Complete BT, focuses on priority, shortest height
Name some applications of heap
- Dijkstra’s algorithm
- Kruskal’s algorithm
Huffman’s algorithm
- Kruskal’s algorithm
What is data compression:
- Aim to represent information as accurately as possible using the fewest number of bits
What are some good things about data compression
○ Reduces storage space used and storage costs
Reduces time to retrieve data & time to transmit data
What are the 2 types if compression
Lossy and lossless
Name some data compression techniques
Text compression, Voice compression, Image compression
What is fixed length encoding
- When each character/element is encoded using a fixed number of bits
- Using 2 bits per character instead of 8 bits per character
What is variable length encoding
- Code words used to represent characters
- Shorter code words used more frequently
- Longer code words are used more frequently
- Variable length coding can reduce the size of an encoded string
What is huffman coding
- Huffman coding is a statistical compression method to convert characters into variable length bit strings.
- Most frequently occurring characters are converted into short code-words and the least frequently occurring ones into longer code-words
- Makes use of a priority queue so we use heap data structure
When are graphical data structures used
Mostly used when linear and tree structures are not applicable
Name some graph theories
- Shortest path problem - Graph traversal and path search with tradeoff between space and time
- Ramsey theory - For any 6 people, either at least 3 of them are mutual strangers or at least 3 of them are mutual acquaintances
Name some ways graphs are applied irl
- Planning a journey by air:
○ What is the shortest route
○ Shortest distance between 2 cities - A maze
○ Is there a path from the entrance to the exit
What is path
The route taken along the edges in a graph
What is a cycle
A path that starts and ends at the same vertex in a graph
What are vertices
The nodes in a graph
What is a directed graph
Graph with arrows
What is a undirected graph
Graph without arrows
What is a weighted graph
A graph where the edged hold numerical weight(edges have numbered)
What is an adjacency matrix
A square matrix used to describe which vertices (or nodes) of the graph are adjacent to each other.
If they are adjacent we represent this using a 1, if they aren’t we use a 0
In a weighted graph we replace the 1 with the weight of the edges and 0s with infinite.
What is an adjacency list
An alternative to an adjacency matrix
* A 1D array is used to store vertices and their adjacency
* A set of singly linked lists with one SLL for each vertex.
* Each SLL contains all vertices that are adjacent to the vertex
What is a short path tree and name an example
- A tree that contains all vertices and a subset of edges - the path from the root to any other vertex has the shortest distance
Example dijkstras algorithm
What is a spanning tree
- A spanning tree of a graph is a tree that contains all the vertices with only path between pairs if vertices
- A graph can have many different spanning trees (E.g. Breadth first search and depth first search)
Name some applications of spanning trees
Design of computer networks
Airline routes
Name algorithms that involve spanning trees
Prim’s Algorithms and Kruskal’s Algorithm
Why is Binary Search Preferred over Ternary or N-ary Search?
. Simple and easy to understand
. Fewer comparisons
. Efficient memory acces
Compare a SLL and a skip list
SLL:
. Elements are connected one after the other in a straight line.
Searching takes
. O(n) time on average
. Insertion and deletion quick and easy
Skip-list:
. Consists of multiple parallel SLL
. Searching is faster
. Insertion and deletion is harder
Depth First Traversal (DFT)
* Proceeds along a path from the root through one child to the most distant descendant of that first child before processing the second child.
* Implementation uses a stack.
* Leaf by leaf.
Breadth First Traversal (BFT)
* Proceeds horizontally from the root to all of its children then to its children’s children and so on…
*Implementation uses a queue. * Level by level.
Name the 3 types of depth first traversal
Pre order,inorder postorder
. Explain the difference between a Depth First Traversal and a Breadth First Traversal of a graph.
BFS of a graph processes the start vertex and then processes all of its adjacent vertices. Then it picks the first adjacent vertex and processes all of its adjacent vertices(makes use of edge table)
What is the formula for the maximum number of nodes in B tree
(o^h) -1
o is the order of the tree
j is the height of a tree