Data Structures And Algorithms Flashcards
What is a primitive data type?
A data type that can’t be broken down further.
What is an abstract data type?
A conceptual model of how data is to be organised and the operations we might perform on the data.
What is a data structure?
Implementation of ADTs. Can combine multiple data under a single identifier.
What are the key ADTs?
- Stacks
- Queues
- Graphs
- Trees
- Hash Tables
What are arrays?
Static data structures that stored data of the same type arranged as a contiguous block in memory with a fixed length
How does indexing work in arrays?
The index acts as an offset to the data from the start off the structure in memory. For this reason the dimensions and data type of the array must be known.
What is a multidimensional array?
An n-dimensional array is a set of elements with the same data type that are indexed by a tuple of n integers, where a tuple is an ordered list of elements.
What are the uses of arrays?
- Can store multiple homogenous values
- Can represent vectors
- Can store tabular data in a matrix
- 2D arrays are common in machine learning applications that consist of large related data sets
What are the advantages of static data structures?
- Do not require pointers so take up less space
- Can randomly access any element in constant time because data is stored continuously in memory
- Faster than dynamic data structures
What are the disadvantages of static data structures?
- Cannot grow to accommodate more data being declared
- Cannot shrink to free up memory
- Inefficient if more space is allocated than required
- Can lead to errors if not enough space is allocated
On what kind of basis is date retrieved from a stack?
Last-In-First-Out
On what kind of basis is data retrieved from a queue?
First-In-First-Out
What are the key operations of a stack?
- Push
- Pop
- Peek
- Test for empty stack
- Test for full stack
What are the key operations of a queue?
- Add and item
- Remove an item
- Test for an empty queue
- Test for a full queue
What is a stack overflow?
An attempt is made to push more data onto the stack that it can store
What is a stack underflow?
Occurs when a pop is attempted on an empty stack.
What are stacks comprised of?
- A sequence of items
- A stack pointer
What are queues comprised of?
- Sequence of items
- Rear pointer
- Front pointer
What are the uses of stacks?
- Reversing sequences of items
- Maintaining “undo” lists in applications
- Maintaining a web-browser history
- Storing processor state (register values) when handling an interrupt
What are the uses of queues?
- Maintaining a queue of print jobs
- Managing an input buffer
- Handling multiple file downloads
- Maintaining a playlist of media
Test if a stack is empty?
Check if the stack pointer is equal to -1
Test if a stack is full?
Compare the stack pointer to the maximum size of the array (-1)
Push an item onto a stack?
- Check if the state is full if not…
- Increment the stack pointer by 1
- Assign the pushed item to the element within item indexed by the stack pointer
Peeking a stack?
- Check if the stack is empty, if not…
- Return the item at the stack pointer
Popping a stack?
- Check if the stack is empty, if not…
- Decrement the stack pointer
- Return the item at the stack pointer + 1
What is a linear queue?
A simple static and array based implementation of a queue
What are the advantages and disadvantages of linear queues?
+Simple to program
-Can lead to wasted capacity as memory is never used up
-Process of shuffling data to solve this issue is inefficient
What is a circular queue?
A static array based implementation of a queue with “wrapping” front and rear pointers
What are the advantages and disadvantages of a circular queue?
+It avoids wasted space with no need of shuffling
- more complex to implement
(-Can still run out of storage space)
What are the different types of queues?
- Linear
- Circular
- Priority
How do you test if a linear queue is empty?
Check if the front pointer is greater than rear pointer
How do you test of a linear queue is full?
Check if the rear pointer points to the last element
How do you enqueue items in a linear queue?
- Check if full
- Increment rear pointer
- Assign item to index indicated by rear pointer
How do you dequeue items in a linear queue?
- Check if empty
- Return value at the front pointer
- Increment front pointer
How do you peek a linear queue?
- Check if empty
- Return value at front pointer
How do you enqueue items in a circular queue?
- Check if queue is full
- Increment rear pointer and modulus it with the maximum size of the array
- Add items at index indicated by rear pointer
- Increment queue size
How do you dequeue items from a circular queue?
- Check if empty
- Store removed item
- Increment front pointer and modulus with maximum size of the array
- Decrement the queue size and return item
What area the differences between a circular and linear queue? (Structure only)
- Circular more efficient but also more complex
- Can wrap around memory locations
- No space is wasted after dequeueing
- Linear queues shuffle but this is inefficient
- Circular queues don’t need to do this
What is a priority queue?
A type of queue where each item in the queue has a priority.
How do priority queues work?
- Items are stored within the queue based on priority with highest priority at the front.
- When enqueueing we search for an item that has less priority and insert the item in front of it by shuffling everything after it
- This preserves FIFO within priority levels
What are the advantages of a dynamic data structure?
- Can change size at run-time by dynamically allocating memory
- More flexible, with no wasted space
What are the disadvantages of a dynamic data structure?
- Do not store items in continuous memory locations so retrieving items takes longer
- Require memory to store pointers to next items so take up more space
What is a recursive subroutine?
A subroutine that calls itself
What is the base case?
Condition that evaluates to a known value to stop recursion.
What is a call stack?
An area of memory used to store data used by running processes and subroutines.
How does the call stack work?
- Each new subroutine call pushes a stack frame onto the call stack
- When a process completes, its stack frame is popped off the call stack
- The previous process continues
What information does a stack frame contain?
- local variables
- parameter values
- return addresses
What are graphs?
An ADT comprised of a network of nodes connected by edges used to model complex relationships.
What is a node?
Part of a graph that represents objects or entities
What is an edge?
Part of a graph that represents a relationship between entities
What is a weight in a graph?
A data value used to label an edge.
What are the uses of graphs
- Human (social) networks
- Transport networks , including road and rail networks
- Connections between devices on a network
- Planning telecommunications networks
What is an undirected graph?
A graph in which each relationship between nodes is two-way.
What is a directed graph?
A graph where each relationship between nodes can be one-way as shown by arrows or bidirectional.
What is a weighted graph?
A graph in which each edge is given a weight.
What are some possible graph operations?
- Adding/Removing a node
- Adding/Removing an edge
- Test if an edge exists between two nodes
- Getting and updating weights on edges
- Finding a valid path between two nodes
What are the two main ways of implementing a graph?
- Adjacency matrix - size of N squared
- Adjacency list - size of N + E
How can an adjacency matrix be represented?
Static 2D array
How can an adjacency list be represented?
Dynamically as an array of graph nodes, each containing a pointer to a list of nodes
What are the advantages of adjacency matrices?
- Edges between nodes can be quickly identified using the coordinates of the nodes in the 2D array
What are the disadvantages of adjacency matrices?
- Almost half of the matrix is repeated data
- Stores every possible edge between nodes, even those that don’t exist
- Wasted data
When are adjacency matrices used?
- “Dense graphs” where there are a lot of edges
- When edges need to be frequently created,removed,identified or their weights changed
When are adjacency lists used?
- “Sparse graphs” where there are not many edges
- Nodes are frequently added/removed
What are the advantages of adjacency lists?
- Memory efficient as they only store edges that exist in the graph
What are the disadvantages of adjacency lists?
- They are slow to identify edges as each item in a list must be be searched linearly until the desired edge is found
What is graph traversal?
Visiting every node within the graph to achieve an purpose
What are the two different types of graph traversal?
- Depth-First Search
- Breadth-First Search
How does Depth First traversal work?
- Recursively visits nodes that are adjacent to a starting node
- Until a node is reached with no unvisited adjacent nodes
- At this point the algorithm backtracks
How is a stack used in depth first traversal?
Uses it to keep track of the currently visiting node and the history of nodes required to get there.
What are the uses of a depth first traversal algorithm.
- Finding a solution to a maze, where the maze is modelled as a graph
- Finding the valid path from one node to another
- Determining the order with which processes that depend on one another should occur
How does breadth first traversal work?
Adds all adjacent nodes to a queue and uses iteration to repeat the process with the node at the front of the queue.
What are some uses of breadth first searches?
- Identifying “friends” and first-degree contacts
- Finding immediately connected devices on a network
- Crawling web pages to build an index of linked pages as used by search engines
- Mapping an unknown graph
What is a tree data structure?
A specialised form of undirected graph with exactly one path between two nodes used to model hierarchical relationships.
What are the features of a tree?
- Undirected
- Unweighted
- Fully connected
- Acyclical
What is a binary tree?
A type of rooted tree where each node can have at most two children.
What are the uses of trees?
- Efficient searching and sorting algorithms
- Storing data that has a hierarchical structure
- Processing syntax in natural and programming languages
- Implementing probability and decision trees
Why are trees well suited for recursion?
They are a recursive data type such that each child node can be the root of a separate tree.
What is a rooted tree?
- One node is designated as the root which has no parents
- All nodes described as parent or child nodes
- Direction of hierarchy often shown by arrows
What are the features on a rooted tree?
- Root node: has no parent
- Child and Parent nodes
- Edges/Branches
- Leaf node: has no children
What is a binary search tree?
A type of binary tree where a particular method is used to insert new node in an optimal location for later search/retreival
What are Binary Search Trees used for?
Provide an efficient way of searching for items as well as providing an effective way of returning all items in the correct order.
What are the key operations of a tree?
- Add nodes
- Get child nodes
- Getting a subtree
- Searching for a node
- Determining the height of the tree
What is the height of a tree?
The number of edges along the longest path from the root to the furthest leaf node.
What are the uses of a pre-order traversal?
- Copying a tree
- Evaluating mathematical expressions using prefix notation
What are the uses of an in-order traversal?
- Outputting the contents of a binary tree in ascending order
What are the uses of a post-order traversal?
- Converting mathematical expressions from infix to postfix notation (Reverse Polish)
- Producing a postfix expression from an expression tree
- Deleting a tree
- Completing dependant processes in order
What is an expression tree?
A tree that represents a mathematical expressions evaluated using post-order traversals.
What it Reverse Polish Notation?
An alternate means of writing mathematical expressions using postfix notation
What is infix notation?
The operator sits between the operands
What is postfix notation?
Operator comes after operands
What are the benefits of Reverse Polish Notation?
- Simplifies mathematical expressions by eliminating the need for brackets as order is implied by notation
- More easily evaluated by stack-based interpreters
What are some properties of stack-based interpreters?
- Translate and execute program code line-by-line
- Invoke pre-complied subroutines to perform the line being translated
What is a hash table?
A data structure that stores a key/value pair based on an index calculated from a hashing algorithm
What is a collision in hash tables?
When two key values compute to the same hash value
What are properties of good hash functions?
- Make use of all info provided by the key
- Uniformly distribute output across table
- Maps similar keys to very different hash values
- Make use of only the fastest operations
How can collisions be dealt with?
- Rehashing
- Linear probing
- Chaining
What is linear probing and what are its disadvantages?
- Look for the next available index and insert item there
- However increases potential for further collisions and can promote clustering
What is clustering in hash tables?
Nearby indexes are not randomly or evenly distributed
What is chaining?
Replacing the existing contents of the index with a list and add the new items to the list
What is rehashing?
Process of increasing the size of a hash table and redistributing the elements to new indexes based on their new hash values
What are the uses of hash tables?
- Databases: quick storage and retrieval of data based on primary keys
- Cache memory: determining the memory address for a given item of data
- Operating Systems: store locations of its programs and utilities for quick access
(Encryption)
How do hash tables work?
- When an item is added, its key is processed by a hashing algorithm that returns a hash value
- Items are placed in indexes specified by the hash value
- To search for an item, key is processed again and the resulting index is checked for the item, if it does not exist we can confidently say it is not in the table
What are the advantages of using hash tables to store data?
- Faster to lookup a record
- Time complexity of O(1) for retrieval
What are the advantages of storing data in a dictionary?
- More compact file as there are no empty records
- Produce a sorted list
- Do not have to design a hash algorithm so easier to design
- No need to deal with collisions
What is a vector?
A data structure that represents a size and a magnitude, made up of at least 2 components.
How can vectors be represented?
- List of numbers
- Functions
- Geometric point in space pointed to by an arrow
How can a vector with n components from the set of reals be written?
Rn
How can a vector be represented using a dictionary?
- Key: The value in the domain
- Value: The image in the codomain
How can we express a vector as a function?
f:S —> R where S is the set of values that the function can be applied to (domain) and R is the potential outputs of the function (co-domain)
What is the dot product of 2 vectors?
The single value result when applying one vector’s magnitude and direction to another
How do you calculate the dot product?
U = [u1, u2, … , un]
V = [v1, v2, … , vn]
U x V = u1v1 + u2v2 + … unvn
What are the uses of the dot product?
Can be used to find the angle between them using the formula:
ab = |a||b|cosx
What is a convex combination of two vectors?
A vector that lies on the line between two vectors
When is D a convex combination where:
D = aA + bB
Where A, B are vectors and a,b are scalars
If a and b are both greater than or equal to 0 and a + b = 1
What is a linear search algorithm?
An algorithm that looks through each item in a set until the search key is found or every item has been inspected.
What are the advantages and disadvantages of linear search?
+ Only option for unsorted lists
- Inefficient: O(n)
What does the efficiency of a linear search depend on?
- Size of the dataset
- The location of the item to search for
What is a binary search algorithm?
An algorithm that looks at the middle item, if it is less look at the middle item in the lower half, if it is more look at the middle item in the upper half. This is repeated until it is found or low pointer is above the high pointer
What are the advantages and disadvantages of binary search?
+ Very efficient for large databases - O(log n)
- Can be more difficult to implement
- Must be a sorted list
How do you search in a binary search tree?
Using an in-order traversal
How does bubble sort work?
Each item in a list is compared to the one after it. If the item after is smaller that the current one then they swap until no swaps are made and it is sorted.
What are the disadvantages of bubble sort?
- Inefficient time wise as time complexity of O(n2)
- Algorithm continues to iterate over the list even if no further swaps are required
How can bubble sort be improved?
- Reducing the number of comparisons after each pass
- A sorted flag that can stop the sort if no swaps are made
How does merge sort work?
- Example of a divide and conquer algorithm where the program is broken down into smaller units
- List is continuously divided until there is only one item in each
- Sublists are combined in order until the list os fully sorted
What are the disadvantages of merge sort?
If done recursively is not very memory efficient
What are the advantages of merge sort?
Time efficient with an efficiency of O(nlogn)
What is an optimisation algorithm?
An algorithm that finds the best possible solution to the problem proposed.
What is an example of an optimisation algorithm?
Dijkstra’s Shortest Path (DSP)
What does DSP do?
Calculates the shortest distance between a starting node to all other nodes in the graph
What is the efficiency of DSP?
O(n2)
What are the uses of DSP?
- Shortest path between two locations in a transport network
- Telephone and computer network planning
- Network routing / packet switching
In what terms can algorithms be classified as?
- Time complexity
- Space required
What is a tractable algorithm?
An algorithm that can be solved within polynomial time: O(nk)
What are all the time complexities?
- Constant complexity: O(1)
- Logarithmic Complexity: O(logn)
- Linear Complexity: O(n)
- Polynomial Complexity: O(nk)
- Exponential Complexity: O(xn)
- (Factorial Complexity: O(n!) )
What is an intractable algorithm?
An algorithm that has a computational solution which is not within polynomial time so is not practical for anything other than small n.
What are the properties of an algorithm with constant complexity?
- Time Taken is constant so is independent of size of the dataset
- As n increases the time taken stays the same
What are some examples of algorithms with a constant complexity?
- Accessing an index in an array
- Looking up a value in a hash table (assuming no collisions)
- Popping from a stack or dequeueing from a queue
What are the properties of an algorithm with logarithmic complexity?
- Time taken increases logarithmically with respect to the size of the database
- The size of the problem set is halved after each iteration or recursive call
What are some examples of algorithms with a logarithmic time complexity?
- Binary Search
- Merge Sort
What are the features of an algorithm with a linear time complexity?
Time taken increases proportionally with the size of the dataset
What are some examples of algorithms with a linear time complexity?
- Linear Search
- Traversing a linked list
What are the properties of an algorithm with a polynomial complexity?
- Time taken increases exponentially with the size of the dataset
- At the limit of the tractability
What are some examples of algorithms with a polynomial time complexity?
- Bubble Sort
- Dijkstra’s shortest path
What are the properties of an algorithm with exponential complexity?
- Time taken increases exponentially with size of dataset
- Intractable
What are some examples of algorithms with an exponential time complexity?
- Brute force password cracking
- Travelling Salesman Problem
What are the number of different permutations for a set with size n?
n!
How to identify O(1)
No iteration or number of steps doesn’t depend on n
How to identify O(n)
Single iteration that depends on n
How to identify O(nk)
Nested iteration where each iterative structure depends on n and k is the iterative depth
How to identify O(logn)
Breaking down the problem into halves
How to identify O(xn)
Number of iterative structures/ recursive calls depends on n
What do you do if an algorithm has a combination of time complexities?
You select the worst Big O factor
What is a heuristic?
Rules based on the knowledge of a problem that can be used to find an approximate solution by changing some constraints in the problem or reducing the problem space.
What is the halting problem?
Determining whether a program will halt for a particular input without running the problem
What did the Halting Problem prove?
There was an example of a problem that is unsolvable
What was the significance of the Halting Problem?
Demonstrated that there are un-computable problems where no algorithm can solve them
Why are heuristics used?
A solution will arise within a reasonable time. However this may not be the best solution.
How are records added to a hash table?
- Hashing algorithm is applied to a key
- Result is the location in the table where the record should be stored
- If location is not empty then use next free location