Data Structures Flashcards
Data Structure
Data structures are used by computers as the containers within which information is stored
Static Data Structure
A fixed data structure that is reserved memory at the start of the program. The contents of a static array can be changed but the array cannot be resized
Dynamic Data Structure
Memory can be allocated and deallocated as required while the program is running.
Abstract data types
Do not exist as data structures in their own right, and instead make use of other data structures such as arrays to form a new way of storing data. It is a conceptual model that describes how data is organized and which operations can be carried out on the data
Primitive Data Type
only contains one value
Benefits of Static Data Structures?
- The compiler can allocate space during compilation
- Easy to program
- Arrays allow for random access
- Easy to check for overflow
- Contiguous
- Random access allowed
- Uses less memory when storing the same amount of data
- Store data in consecutive memory locations
Points of Dynamic Data Structures?
- Only uses the space needed at any time.
- Makes efficient use of memory; storage no longer required can be returned to the system for other uses
- Non-contiguous
- Sequential Access
- If fragmented can be slow to access
- Do not store data in consecutive memory locations
Array
An array is an indexed set of related elements, finite and only contains elements of the same data type
Example of 1D array
Array Names = {“Bob, “Will”, “Sam”}
Example of 2D array
Array Colours = {“light blue, dark blue, blue”}, {“light pink, dark pink, pink”}, {“light green, dark green, green”}}
(indexing starts at 0)
Colours(0,1) would return ‘dark blue’
Colours(2,0) would return ‘light green’
Static Memory Allocation
- A fixed list can be set up, this is a contiguous space on disk.
- The next memory location is the next address and its position can be implied.
- Removing something is not easy as all succeeding elements have to be moved back one place. Leaving unused memory locations (wasted space)
How does Dynamic Memory Allocation occur
- Dynamic memory allocation uses linked lists making it much easier to remove and add elements.
- Each element points to the address of the succeeding element.
- Removing an element just requires pointing to a different address
What is Recursion?
A recursive subroutine is one that calls itself.
What is a Base case?
The case where recursion does not occur
Criteria for recursion?
- Size of the problem must get repeatedly smaller
- There is a general case
- There is a base case
Factorial
A Factorial (n!) is a function that multiplies a number by every number below it.
What are Queues? How/where are they used?
- First in first out data structure, abstract and based on arrays
- Typically queues are used in buffering where a sequence of instructions are sent to a printer for instance, and the printer prints documents in the order that they arrive - Lists can be used to represent queues.
- Also used in keyboard buffers where each key press is added to the queue, so that letters appears in the order they are type
- The breadth first search algorithm uses a queue to keep track of which nodes in a network have been visited
Queue Operations
- Add; add element to the queue
- Remove; remove element
- IsEmpty
- IsFull
- Queue.Dequeue() removes an element from the front
- Queue.Enqueue() adds an element
Emptiness can be checked by comparing front and rear pointers, if they are the same, a queue is empty
How to remove items in a circular queue?
1. Check the queue is not already empty;
2. Compare the value of the front pointer with the maximum size of the array;
3. If equal then front pointer becomes one; A. index of the first position in the array instead of one
4. Otherwise, add one to the front pointer;
How to add items to a Linear queue?
- Check that the queue is not already full;
- (if it isn’t) then add 1 to the value of the rear pointer;
- then add the new item to the position indicated by the rear pointer;
How to add items to a Priority Queue?
- Starting with the item at the rear of the queue move each item back one place in the array;
- Until you (reach the start of the queue or) find an item with the same or higher priority than the item to add;
- Add the new item in the position after that item;
Linear Queue
As an item is removed from the queue all other items move up one space. For a long queue this can take a lot of processing.
pointers
the pointer representing the start of the queue moves as an item is removed from the queue. However, this leads to a lot of empty cells/wasted memory at the front of the list. You also need to know the queue length and how many elements have been removed.
Circular Queues
As an item is removed memory locations are recycled: Unused memory locations at the front now become memory locations at the back of the queue, making it more memory efficient
Priority Queue
Each element is assigned a priority and highest priority items are removed first. If items have the same priority, then the item nearest the front of the queue is removed first.
Stacks
Last in first out file system, just like a stack of plates. The last item added to a stack is the first to be retrieved.
What are 4 features of a stack frame?
- Local Variables
- Return Address
- Parameters
- Register Values
Stack Operations
- push(data) - add element to a stack
- pop() - remove element
- peek()/Top - view top element without removing any
- is_empty()
- is_full()
How do stacks reverse a sequence of numbers?
Stacks can reverse a sequence of numbers by popping one stack and pushing another
Graphs
A graph is a way of representing relations between data. They are made up of Nodes/Vertices and Arcs/edges.
Graph uses
Popular uses of graphs are mapping, social networks, chemical compounds and electrical circuits
Neighbours
When there are two vertices connected by an edge, the two vertices are called neighbours
degree of a vertex
how many other vertices it is connected to (or amount of neighbours)
weighted graph
adds values to the arc. This could represent, for example, the distance between places.
adjacency matrices
A way of representing a graph in a table. If the graph has no weights, it is given a value of 1 for connected nodes.
Adjacency Matrix
Stores every possible edge between nodes, even those that don’t exist -Memory inefficient
Specific edge to be queried very quickly, it can be looked up by its row and column -Time efficient
Better for dense graphs with a lot of edges
Adjacency List
Only stores the edges that exist. -Memory efficient
Slow to query, each item in a list must be searched sequentially until desired edge is found -Time inefficient
Better for sparse graphs with few edges
Directed graphs
have arrows and only apply in the direction of the arrow, without this they are bidirectional and apply both ways
Graph study uses
- Human networks
- Transport Networks
- internet
- CompSci
- Game Theory
- Medical Research
Define tree
A tree is a connected, undirected graph with no cycles.
- Connected – every node is connected indirectly to every other node
- No cycles – there is only one path between nodes
Define Rooted Tree
has a root node that has no parent, and all other nodes are descended from the root. All other nodes can be a parent/child node. A leaf node has no descending nodes
What are the uses/benefits of trees?
- Trees can be used to store data that has an inherent hierarchy nature e.g. OS using a tree for directories, files and folders in its file management system
- They are dynamic which means it is easy to add and delete nodes
- Easy to search and sort using traversal algorithms
- Can be used to process the syntax of statements
Why are binary trees useful?
- They are more structured and therefore quicker to search than linked lists
- When combined with hashing can be used as a basis for constructing a database
What are the benefits of Hash Tables?
- Hashing allows stored data to be accessed very quickly without the need to search through every record. This is achieved by relating the data itself to its index position using a key.
What is a Hash tables time complexity?
Constant time complexity of O(1)
How do Hash tables work?
- If the derived number is bigger than the length of the list, then you will need to apply modulo
- Collisions occur when a bin is already occupied. In such a situation data is placed in the next available bin.
- You can rehash with the higher modulus and number of elements when the number of collisions becomes higher.
- The load factor is the number of occupied bins divided by the total number of bins.
- The hash table should contain more bins than there are elements you would like to store by a load factor of 0.75.
- If the load factor is exceeded, we can rehash using a larger hash table with a greater number of bins.
clustering
Too many collisions occur
Dictionaries
an abstract data structure that contains a list of values associated with a key that you can use to access the value.
One use of this could be dictionary based compression
Vector Addition
- Each element in the vector is added/subtracted to the corresponding value at that index in the other vector
E.g. Find A+B where A = [2, 4, 6, 8] and B = [1, 2, 3, 4]
A+B = [3, 6, 9, 12]
Scalar Vector Multiplication
- Vectors can be multiplied by scalars (single numbers)
- Each element is multiplied by the scalar
E.g. Find 2a where a = [3, 5, 6, 9]
2a = [6, 10, 12, 18]
Dot Product
- Calculated by multiplying the corresponding element in each vector and adding together all of the elements
E.g. Find a.b where a = [2, 4, 6, 8] and b = [1, 2, 3, 4]
a.b = [2 + 8 + 18 + 32] = 60
Convex combination of 2 vectors
A combination
e.g. u[1,2] v=[4,3]
a=0.4, b=0.6
au_bv (0.4x1) + (0.6 x 4), (0.4 x 2) + (0.6 x 3)
au+bv = (2.8, 2.6)