1.4.2 Data Structures Flashcards
Arrays and Lists
Array : Ordered, Finite set of elements of a single data type.
- Stored contiguously in memory
- Static - cannot change size of array at runtime (stored contiguously but memory reserved even if not used)
- Mutable - data can be changed
- Zero Indexed
- suited where size is known and single data type as easy access through indexes
Lists : Ordered, unfixed set of elements of variable data types.
- Stored non-contiguously in memory
- Dynamic - can change size of array at runtime (Memory not reserved but data spread out in memory (fragmented))
- Mutable - can change data
- good where size unknown and multiple data types
Records and Tuples
Record (Row in file) : Made up of fields that can be different data types relating to entity.
Tuples : Immutable (data can’t change), Static (size can’t change) Ordered set of values of any data type
Linked list
Linked List : Dynamic data structure used to store an ordered list not contiguously in memory.
- often used to store a dynamic queue where data can be added and removed.
Contains : Start Pointer to first item
- Nodes that have data and pointer field
- Pointer gives location / address of next node
- Linked List that contains free memory locations - add to when remove items.
To Add/Change/Delete Item in list - traverse list until find item/end, Change relevant pointer fields then Change relevant data fields.
And change free list pointer - stores free spaces as linked list - each point to next free space.
Advantages : Values can easily be added or removed - dynamic
Disadvantages : Wasted memory on pointers and removed values,
- Items cannot be directly accessed - must traverse to find : O(n)
Graphs
Graph : Abstract data structure that represents relationships between nodes.
- Abstracts real life scenarios e.g. Network
- Contains Vertices / Nodes connected by arcs / edges
- Can be Directed, Undirected or Weighted
Ways to Represent Graphs
Adjacency Matrix : Row and columns are nodes with connections as items.
Advantages : Easier to add nodes.
Disadvantages : Sparse graph with few connections leaves many cells empty - wastes memory.
Adjacency List : Nodes Stored alongside list of adjacent nodes.
Advantages : Space efficient - good for large sparse networks.
Disadvantages : Harder to add nodes.
Traversing Graphs
Depth First : Goes as far down as it can following child nodes and adding nodes to stack then when node is reached with no sub-nodes this node is visited and popped off of stack and then backtrack to previous node to follow other unvisited paths and repeat until no unvisited paths.
- Uses stack
- Less memory efficient
- Finds sooner if deeper in the tree
- Doesn’t always find best / Most efficient solution
Breadth First : Adds root node to queue then adds all nodes connected to back of queue - each time node is visited, its sub-nodes are added to the back of the queue, Nodes are visited in order added
to the queue.
- Uses Queue
- So Visits all nodes connecting to first node then all nodes connected to those nodes.
- finds sooner if higher up in tree
- more memory efficient
Stacks
Stack : Last in First out, abstract data structure.
- One Pointer that points to top of stack
- When removing item decrement pointer and check for underflow
- When adding item increment pointer and check for overflow
- Can be static or dynamic (Static more memory efficient if max size known)
- Used in interrupts and to reverse data input
Operations : push(item), pop(), sFull(), isEmpty()
Stack.pop()
Queues
Queue : First in First out, abstract data structure.
- Two Pointers : One to back of queue other to front (head and tail)
- Linear queue - array but as values removed addresses cannot be used again
- Circular queue - Once pointers equal max size of array they loop back to start - more memory efficient but harder to implement
- Queue used to pass data from one process to another or process orders where orders are processed in order they arrive.
- Check for under flow and overflow (tail> head for underflow or head = last index for overflow)
Operations : queue.Enqueue(item), Dequeue(), isFull(), isEmpty()
Trees
Tree : Form of graph - contains nodes and edges (branches) to sub-nodes, abstract data structure.
- Cannot contain cycles and unweighted and 1 directional.
- Root node at top and leaf node at bottom (no sub-nodes) - Hierarchical
Binary Tree
Binary Tree : Tree where each node has max of 2 children, Abstract data structure.
- Represented in 2D array with left pointer, data and right pointer
Binary search tree - easy to search through :
- If child node is less than parent it goes to left, if child node is greater it goes to right of parent.
- Divide and Conquer and (O(logn)) (O(n) worst case)
To delete nodes :
- Traverse tree until node found… compare against… bigger/smaller so go right/left - apply to question.
- Change relevant pointer of parent to null / pointer of child of node / pointer of next largest node.
- Add deleted node to free storage list
Traversing Binary Tree
Pre-order traversal:
Follows order – node, left subtree, right subtree
In-order traversal:
Follows order – left subtree, node, right subtree
Post-order traversal:
Follows order – left subtree, right subtree, node
Hash Tables and Hashing for Storage
Hash Table : Array coupled with hash algorithm that maps input to index by hashing.
- Result Generated by applying algorithm to value
- Enable direct access to location of record
- O(1) time complexity as doesn’t change as more items are added
- Hash table takes same time to search as more items are added
- Used where data needs constant lookup times (e.g. Cache)
- Can be used in database indexing
Collisions : If two inputs return same hash to resolve can store in next available location (Linear probing - search through one at a time)
- Or Store in overflow with a linked list - Location points to start of linked list and new item added to end of the linked list
- Collisions more likely as table gets larger
Hash function should have:
- Low collision rate
- Be quick to calculate
Depth First Traversal Describe
Depth first starts at the root
…and goes all the way down one branch (left) to the bottom
It stores which nodes it has visited / pushes nodes visited onto a
stack
When it cannot go any further
…It then backtracks/returns to the previous node
And continues to backtrack until a node is reached with unvisited
children