Data Structures and Algorithms Flashcards
Array
An array is a collection of elements of the same type. The location of each element has a numerical value (the index) which is used to identity/locate
Unsorted Array adv and disadv
Advantages:
Insertion is fast when adding at the end (O(1))
Deletion is fast if the index is known (O(1))
Searching is fast if the index is known (O(1))
Disadvantages:
Deletion is slow if the index is unknown (O(n))
Searching is slow if the index is unknown (O(n))
Fixed size (cannot dynamically resize)
Sorted Array adv and disadv
Advantages:
Searching is fast using binary search (O(log n))
Disadvantages:
Insertion is slow because elements need to be shifted (O(n))
Deletion is slow for the same reason (O(n))
Fixed size (cannot dynamically resize)
Stack ADT
A stack is a collection of objects organised such that only the most recently inserted object can be removed at any time. Last-In First-Out structure - LIFO.
Queue ADT
A queue is a collection of objects organised such that the object that has been stored in the queue the longest is the next one removed. First-In First-Out structure - FIFO.
Priority Queue
In a priority queue an element is inserted to the queue with an associated priority. Deletion locates and removes the element from the queue with the highest priority. Inserting an element into a priory queue is efficient - O(1). Removing an element requires searching the list sequentially for the element
with the highest
Linked List
A linked list is a collection of objects, called nodes. Every node (except the last one) contains a reference (location) to the next node.
Advantages:
Insertion - fast, if insert at head or position already located
Deletion - fast, if already located
Size not fixed
Disadvantages:
Search - slow, need to traverse all elements to locate, even if ordered
Circular Singly Linked List
A circular singly linked list is a SLL in which the last node contains the reference to the first node in the SLL.
Doubly Linked List
A doubly linked list is a linked list in which every node has a forward and backward link.
SkipList
A skipList is a probabilistic data structure, based on parallel linked lists, with an improved efficiency, compared to a SLL, for most operations. A SkipList is an extension of an ordered singly linked list with additional forward links, added in a randomized way, so that a search in the list can quickly skip parts of the list. Performance: SkipLists are not guaranteed to give good performance. The best case will be O(log2 n) and the worst case will be O(n).
Hash Table
A Hash Table is a searching tool that uses hashing for the time-space trade-off.
Advantages Faster searching - map the keys into indices by hash functions. O(1)
Disadvantages Design of the hashing functions to provide unique indices and avoid collision.
Binary Tree
A Binary Tree is a special case of a tree where every node is of degree two or less.
Binary Search Tree
A Binary Search Tree is a special case of a Binary Tree where:
The keys (if any) in the left subtree of the root precede the key in the root.
The key in the root node precedes the keys (if any) in the right subtree.
The left and right subtrees of the root are also Binary Search Trees (BST).
Advantages:
If tree reasonably balanced then operations (insertion, deletion and search) are fast. O(log2n)
Disadvantages:
If tree unbalanced, then operations are slow. Worse case- O(n)
AVL Tree
An AVL tree is a self-balancing binary search tree in which:
. Heights of the left and right subtrees of the root node differ by at most one.
. Left and right subtrees are again AVL trees. Uses rotations to keep the tree balanced: Single rotations - . LL and RR
. Double rotations - LR and RL
Advantages:
Searching - fast
Insertion - fast (dependent on the length of the path from the position of the inserted node back to the root). Will involve maximum of one rotation. The rotation operations take constant time as only few references are being changed.
Deletion - fast but potentially, involves more rotations - can lead to O(log2 n) rotations, because every node on the path from the deleted node to the root may require rebalancing. The rotation operations take constant time as only few references are being changed.
O(log2n)
O(log2n)
O(log2n)
Disadvantages Complex Structure and Maintaining
Splay Tree
A splay tree is a self-organising binary search tree with the property that recently accessed elements are always quick to access again.
Uses rotations to keep the tree organised:
Single rotations
. Zig (equivalent to LL and RR rotations used in AVL trees)
Double rotations
. ZigZag (equivalent to LR and RL rotations used in AVL trees)
. ZigZig (no equivalent rotation used in AVL trees)
Splay Tree Performance
For many non-uniform sequences of operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. No single search operation is guaranteed to be efficient.
. The best case will be O(1).
. The worst case will be O(n).
. On average a series of operations tend to be of O(log2 n).
Two-Three Tree
A 2-3 tree is a balanced multi-way search tree where:
. Internal nodes have two or three children (2-node or 3-node).
. Leaves are always on the same level, making it a completely balanced search tree.
. It follows strict properties for organizing values across its subtrees.
Properties:
. Each node contains one or two data items (2-node or 3-node).
. Internal nodes have either:
. Two children for a 2-node, or
. Three children for a 3-node.
Values of descendants:
. Left subtree: Values < the first data item.
. Center subtree: Values > the first data item (and < the second data item, if present).
. Right subtree (if it exists): Values > the second data item.
B* Tree
The B* Tree is a variation of the B Tree with aim to improve efficiency of access.
In a B* Tree all nodes except the root must be at least two thirds full. The frequency of splitting a node is decreased by delaying a split. The split is delayed by redistributing the keys between a node and its sibling when the node overflows. When a split is required then two nodes are split into three.
B Tree
B Tree of order m is a balanced multi-way search tree with the following properties:
. The root node has either no children or between 2 and m children
. All internal nodes have between (m/2) and m children
. All leaves are on the same level
Advantages:
Insertion - fast O(logmn)
Deletion - fast O(logmn)
Search - fast O(logmn)
Tree is always balanced
Disadvantages:
Complex Structure and Maintaining
B+ Tree
Internal nodes of a B+ Tree are used as an index to enable fast access to the data. The index is a B Tree. The leaf nodes have a different structure to the other nodes of the B+ Tree. They are linked together in sequence (SLL) thus facilitating the scanning of a sequence of leaves to give a list of the data in ascending order.
Heap Data Structure
A max heap is in a binary tree with the following characteristics:
. It is a complete binary tree.
. Key in the root is larger than the key in either child node, and both sub-trees satisfy
Advantages:
Insertion - fast O(log2n)
Deletion - fast O(log2n)
Search for the largest item - fast O(1)
Disadvantages:
Search for items other than the largest item - slow.
Heap Data Structure Applications
Priority Queue:
In a priority queue an element is inserted to the queue with an associated priority. Deletion locates and removes the element from the queue with the highest priority.
. Inserting element O(log2n)
. Locating element with highest priority O(1)
. Removing element with highest priority O(log2n)
Huffman Coding:
Huffman coding is a statistical compression method to convert characters into variable length bit strings. The most-frequently occurring characters are converted into short code-words and the least frequent into longer code-words.
Graphical Data Structures
The Graph (or network) data structure consists of a collection of nodes (elements) where each node has many predecessors and many successors. Algorithms studied:
. Tree traversals - Depth First Search and Breadth First Search
. Shortest Path Trees - Dijkstra’s Algorithm
. Spanning Trees - Prim’s Algorithms and Kruskal’s Algorithm