Data Structures Flashcards
Data Type
Defines potential set of values a variable can hold and the operations that can be performed on it
Aspects of Data types
- Range of values
- Size of memory required
- How binary data is interpreted
- Operations that can be performed on it
Abstract Data Type (ADT)
Conceptual model of how data is to be stored and the operations that can be carried out on it
Data Structure
- (Partial) implementation of user-defined types (ADTs)
- Each item known as member and can be of different data types
Purpose of data structures
To combine multiple data items under a single identifier
Key ADTs
- Stacks
- Queues
- Graphs
- Trees
- Hash tables
Static data structures
- Max size fixed at compilation
- Data stored in contiguous memory locations
Advantages of static data structures
- Fast to access
- (Sometimes) require less memory
Disadvantages of static data structures
- Inflexible
- (Sometimes) wastes space
- Operations like insertion and deletion are very slow
Dynamic data structures
- Can change size at run-time
- Data not (usually) stored contiguously
- Each item points to the next item in the structure
Advantages of dynamic data structures
- More flexible
- No wasted space or limits
- (Sometimes) require less memory
Disadvantages of dynamic data structures
- (Sometimes) require more memory (have to store pointers)
- Slower to access
Queue
Data structure comprised of sequence of items and front and rear pointers
Order of item insertion in Queue
Items added at rear and retrieved from front in First-in-First-Out (FIFO)
Key operations of Queue
- Enqueue
- Dequeue
- isEmpty
- isFull
Enqueue
If not isFull() then add new item to rear
Dequeue
If not isEmpty() then remove front item
isFull
Test whether queue is full
isEmpty
Test whether queue is empty
Uses of queues
- Queues of print jobs
- Managing input buffers
- Handling multiple file downloads
- Priority-based resource allocation
- …
Queue implementations
- Linear queues
- Circular queues
- Priority queues
Linear queues
- Stores items in ‘line’
- Front and rear pointers move down memory
Advantages of linear queues
- Simple to program
- Limited capacity useful when representing finite resources
Disadvantages of linear queues
Wasted capacity (whole queue shifts down)
Solving waste problem in linear queue
- Shuffling every item one index forward when first item deleted
- However, it’s inefficient (O(n))
Circular queues
- ‘Wraps’ front and rear pointers using MOD function
- Static, array-based
Advantages of circular queues
- Avoids wasting space
- No need for shuffling
Disadvantage of circular queue
More complex to program
Priority queue
Items enqueued (or dequeued) based on priority
Priority queue implementations
- Can be implemented statically using a 2D array
- More efficient to implement dynamically, with each node storing value and priority
Disadvantages of static priority queue
- Inherent drawbacks of static (e.g. space limitation)
- However, dequeue and enqueue time is still O(n) since linear search required to find highest priority
Stack
- Comprised of sequence of items and stack pointer that points to top of stack
- Last-in-First-Out (LIFO) - items added to top and removed from top
Stack structure
int maxSize;
int items[maxSize];
int stackPointer = -1;
Stack operations
- Push
- Pop
- Peek
- isFull
- isEmpty
Push
If not isFull, push value to top of stack
Pop
If not isEmtpy, returns value at top of stack and removes it
Peek
Returns item at top of stack (doesn’t remove it)
Uses of stack
- Reversing sequence of items
- Storing stack frames
- Storing processor state when handling interrupt
- Evaluating mathematical expressions written in Reverse Polish (postfix) notation
Stack overflow
Attempt to push item when stack is full
Stack underflow
Attempt to pop item when stack is empty
Call stack
Area of memory used to store data used by running processes and subroutines
Stack frame
- Return addresses (keeps reference to parent subroutine)
- Parameter values (keeps parameter values of different subroutines separate)
- Local variables (isolates these from rest of system)
Call stack pushing and popping
- Stack frame is pushed with each new subroutine call
- Stack frame popped when process completes
Linked list
Dynamic data structure comprised of nodes that point to its consecutive node in the list
Aims of linked lists
- Convenience of storing multiple values per element
- Flexibility to grow
- Quick at inserting and deleting items (no shuffling)
Disadvantages of linked lists
- Slower to access items (traversal)
- Takes up more memory (each node store value + 64 bit pointer)
Graphs
Set of nodes connected by edges
Directed vs undirected graph
- In directed, edges can be unidirectional
- In undirected, all edges are bidirectional
Weighted graph
Edges have a cost of traversing (e.g. time)
Purpose of graphs
Modelling complex relationships between interralated entities
Uses of graphs
- Human (social) networks
- Transport networks
- Computer networks
- Planning telecommunication networks
- Project management
Typical graph operations
- Adding / removing node
- Adding / removing edge
- Test edge existence
- Get edges / adjacent nodes
- Get weight of edge
Implementations of graphs
- Adjacency matrix (static)
- Adjacency list (dynamic)
Adjacency matrix
Edges between nodes represented by ‘1’ at intersection in matrix
E.g. if edges[0][1] == 1, then there is an edge between node 0 and node 1