Data Structures Flashcards
Principle for a Stack
LIFO: Last in first out
Principle for a Que
FIFO: First in first out
Best analogy for explaining how a stack works
It’s like a “stack” of dishes. When you wash dishes, you start at the top and work your way down. When someone adds a new dish, they add it to the top of the stack.
Best analogy to explain how a que works
Similar to a line of people waiting for a service, where the first person to join the line is the first to be served.
Data Structure
Organization of data in memory that behaves in a predictable way.
A data structure is an actual implementation or realization of an abstract data type (ADT). It involves organizing and storing data in a specific way to facilitate efficient access, modification, and retrieval of the data.
Characteristics:
1) Defines how the data is organized, stored, and accessed.
2) Specifies the internal representation and implementation details.
3) Determines the efficiency and performance of operations based on the chosen representation.
Abstract Datatype (ADT)
Focuses on what operations can be performed on the data and what behavior those operations should have, independent of any particular language
Characteristics:
1) Defines the operations that can be performed on the data.
2) Specifies the interface (operations and behavior) without revealing the implementation details.
3) Provides a high-level abstraction, encapsulating the data and its associated operations.
Data Structures v. Abstract Data Types
1) An abstract data type (ADT) is a theoretical concept that defines a set of operations and behavior without specifying implementation details, while a data structure is the actual implementation of an ADT, providing concrete ways to store and manipulate data.
2) ADT focuses on the behavior and interface, allowing for various implementations (data structures) to achieve that behavior. On the other hand, a data structure defines how data is organized and accessed, emphasizing implementation specifics.
3) ADT is more concerned with the logical view of the data and operations, while a data structure deals with the physical representation and manipulation of data in memory.
Most Basic Data Types
1) Stack
2) Que
3) List
4) Trees
5) Heaps
6) Graphs
Stack
Operates as a container
Can hold other things (possibly even other stacks)
All insertions and deletions can only occur at the TOP of the stack (LIFO)
Pros of using a stack
Pros:
1) Efficient Push and Pop Operations: Adding an element (push) or removing the top element (pop) is highly efficient and takes constant time (O(1)).
2) Simplified Data Structure: Simple and intuitive structure based on the Last-In-First-Out (LIFO) principle, making it easy to understand and use.
3) Space Efficiency (with Proper Implementation): Can be space-efficient, especially when implemented using a linked list. Efficient memory usage for managing function call contexts.
4) Algorithm Design: Suited for specific algorithms, simplifying their design and implementation. Useful for tasks like backtracking and parsing.
5) Function Call Management: Integral for managing function calls, recursion, and execution contexts in programming languages.
Cons of using a stack
Cons:
1) Limited Access: Restricts access to the topmost element; direct access or modification of other elements requires removing the top elements.
2) No Random Access: Lacks random access to elements, limiting flexibility in accessing or modifying specific elements in the middle of the stack.
3) Memory Management (with Array-based Implementation): If implemented using an array, frequent push/pop operations may lead to resizing, potentially causing memory wastage and performance issues.
4) Potential Stack Overflow: Excessive recursion or deep function call stacks can lead to stack overflow issues, affecting program stability.
5) Context Sensitivity: Context and state are highly sensitive to the order of operations, which can be challenging to manage in certain scenarios.
Operations that can be performed on a stack
Push: Adds an element to the top of the stack.
Pop: Removes and retrieves the top element of the stack.
Peek: Retrieves the top element without removing it.
Accessibility concerns with using a stack
Only the top element is accessible for reading or removal. Access to other elements requires popping off the top elements.
Makes it much less accessible than a queue
Accessibility of a queue
Both the front and rear elements are accessible. Front element is for reading or removal, while rear element is for insertion.
Makes it much more accessible than a stack
Operations that can be done on a que
Enqueue: Adds an element to the end of the queue.
Dequeue: Removes and retrieves the front element of the queue.
Peek: Retrieves the front element without removing it.
Applications of stacks
1) Algorithmic Problems: Various algorithmic problems, such as Tower of Hanoi, can be solved efficiently using stacks, demonstrating their usefulness in solving complex puzzles and scenarios.
2) Compiler Design: In compiler design, stacks are employed to implement the parsing and evaluation of algebraic expressions during the compilation process.
3) Depth-First Traversal: Depth-first traversal of trees and graphs, such as in graph traversal algorithms (e.g., depth-first search), can be efficiently implemented using stacks to manage nodes to be visited.
4) Memory Management: Stacks are used in memory management, like in the call stack of an operating system, to keep track of the execution context of programs, memory allocations, and deallocations.
5) Function Call Management: Stacks are used by programming languages to manage function calls. When a function is called, its context (local variables, return addresses) is pushed onto the stack, and it’s popped off when the function completes its execution.
6) Expression Evaluation: Stacks are vital for evaluating arithmetic expressions, postfix expressions, and handling infix-to-postfix conversions. Operators and operands are pushed and popped based on their precedence.
7) Backtracking Algorithms: Algorithms like depth-first search (DFS) use stacks to keep track of visited nodes and paths, making it easier to backtrack and explore alternative paths.
8) Parenthesis Matching and Syntax Parsing: Stacks are used to match opening and closing brackets, braces, and parentheses in expressions, ensuring proper syntax and balanced structures.
Applications of Queues
1) Message Queues: Queues are extensively used in message-driven architectures and communication systems. They help manage messages between different parts of a system or between different systems, ensuring reliable message delivery and processing.
2) Task Processing and Load Balancing: Queues are used in task processing systems to distribute and manage tasks across workers or processing units. This helps balance the workload and ensures efficient task execution.
3) Breadth-First Search (BFS) in Graphs: BFS traversal of graphs uses a queue to explore nodes at the current level before moving on to nodes at the next level, making it an essential component for graph algorithms.
4) Job Scheduling: Queues are employed in job scheduling systems where jobs or tasks are added to a queue and processed in the order they were added, facilitating task scheduling and prioritization.
5) Print Queue in Operating Systems: In operating systems, print queues are used to manage print requests from multiple users. Jobs are placed in a queue and processed one after another.
6) Request Handling in Web Servers: Web servers use queues to manage incoming requests. Each request is added to a queue, and the server processes them sequentially, preventing overload and ensuring responsive performance.
7) Buffering in I/O Operations: Queues are used to buffer I/O operations, especially in scenarios where data is produced faster than it can be consumed. The queue allows for smooth data flow between producers and consumers.
8) Background Task Execution: Background tasks, such as periodic updates or data processing, can be scheduled and executed using a queue, ensuring orderly processing and optimal resource utilization.
9) Event Handling in GUI Applications: Queues are utilized in graphical user interfaces (GUIs) to handle user-generated events like mouse clicks and keyboard inputs. Events are queued and processed in the order they occur.
10) Buffering in Networking: Queues are employed in networking for buffering data packets when transferring data between devices. This helps manage traffic and maintain a smooth flow of data.
11) Caching: Queues can be used for caching to store frequently accessed data, ensuring a consistent and ordered approach to data access.
Operations that can be performed on a stack
1) Push: Adds an element to the top of the stack.
2) Pop: Removes and returns the element at the top of the stack.
3) Peek (or Top): Returns the element at the top of the stack without removing it.
isEmpty: Checks if the stack is empty or not.
4) isFull: Checks if the stack is full, applicable in fixed-size stack implementations.
5) Size (or Count): Returns the number of elements currently in the stack.
6) Clear (or Empty): Removes all elements from the stack, making it empty.
Syntax of an implementation of the ‘push’ command for a stack
in C++:
class MyStack {
private:
std::stack<int> data; // Use an STL stack for simplicity
public:
void push(int value) {
data.push(value);
}
};</int>
Syntax for an implementation of the ‘top’ command for a stack
in C++:
class MyStack {
private:
std:stack<int> data;
public:
int top() {
if (!data.empty()) {
return data.top();
} else {
std::cout << "Stack is empty.\n";
return -1; // Or any sentinel value indicating an error
}
}
};</int>
Syntax for an implementation of the ‘pop’ command for a stack
in C++:
class MyStack {
private:
std::stack<int> data;
public:
void pop() {
if (!data.empty()) {
data.pop();
} else {
std::cout << "Stack is empty. Cannot pop.\n";
}
}
};</int>
Syntax for an implementation of the ‘isEmpty’ command for a stack
in C++:
class MyStack {
private:
std::stack<int> data;
public:
// Check if the stack is empty
bool isEmpty() {
return data.empty();
}
};</int>
What happens if ‘Push’ is called when the stack is already full?
“Stack Overflow”
Exaclty how this is handled depends on which language you are using. Four basic possibilities (the first two are the most common):
1) Error or Exception:
Many programming languages and stack implementations will raise an error, throw an exception, or terminate the program when a push operation is attempted on a full stack.
2) Undefined Behavior:
In some cases, attempting to push an element onto a full stack might result in undefined behavior. This means the program might crash, produce unpredictable results, or corrupt data.
*these two are much less common, but can still happen in some fairly popular languages
3) No Action:
Some implementations may choose to silently ignore the push operation when the stack is full, without modifying the stack. This can be risky as it doesn’t provide feedback to the program that the operation was unsuccessful. (example: Python)
4) Resizing or Dynamic Stack:
If the stack is dynamically resizable, the implementation might allocate additional memory, resize the stack, and then push the element. However, if resizing is not possible (e.g., due to memory constraints), this may revert to an error or exception. (Example: Java)
Error Handling in for Stack Overflows
Option #1 – Client is responsible
*Client code must contain defensive code to detect and avoid these situations in which the operations of the container class are undefined
Option #2 – Container is responsible
*Container class must contain defensive code that signals client code if these situations occur so that client can take appropriate action
Stack Overflows
Occurs when there is an attempt to add an element (push) to a stack that has already reached its maximum capacity.
Can result in one of two things:
1) Exception or Error: In many programming languages, attempting to push an element onto a full stack will result in an exception being thrown or an error being raised. This informs the program that the operation is not allowed due to a full stack.
2) Undefined Behavior: In some cases, the behavior might be undefined, which means the program’s behavior is unpredictable and may lead to program crashes, corrupted data, or other undesirable outcomes.
*Some languages may simply ignore the push command OR attempt to resize the stack (this will not work if there simply is not enough memory)
*Generally, you have discretion over how this situation will be handled in the code. Two options:
1) You handle the exception in the container
2) The client will decide how they will handle the exception, since they’re the one using your code.
Two questions you have to ask yourself whenever you use a stack
1) What happens if I push when there is no more memory left in the stack? (ie: how will I handle stack overflow)
2) What happens if ‘Pop’ or ‘Top’ is called when the stack is empty?
*Need to consider whether these errors will be handled on the client side or the server side
Implementing an Abstract Data Type (ADT)
There are many MANY ways to implement an abstract data type.
Your implementation depends on the language you are using and how you choose to handle potential exceptions
It’s up to you as the programmer to choose what is best for your use case
Advantages of implementing a stack as a linked list of dynamically allocated nodes
a. Dynamic Size: The linked list allows for a dynamic size, meaning the stack can grow or shrink as needed without a fixed maximum capacity.
b. Ease of Expansion: Adding elements does not require reallocation or shifting of elements, making push and pop operations efficient in terms of time complexity.
c. No Pre-allocation: There is no need to pre-allocate a fixed amount of memory for the stack, which can be more memory-efficient for sparse or variable-size usage.
Disadvantages of implementing a stack as a linked list of dynamically allocated nodes
a. Memory Overhead: Each node in a linked list carries additional memory overhead due to storing pointers/references to the next node.
b. No Random Access: Accessing elements in a linked list is not as efficient as in an array because you need to traverse nodes sequentially to access a specific element.
Advantages of implementing a stack as a sequential array
a. Fast Access: Elements can be accessed directly through array indices, resulting in constant-time access (O(1)) to any element in the stack.
b. Memory Efficiency: Arrays use less memory per element compared to linked list nodes because there are no additional pointers.
c. Cache Efficiency: Accessing elements sequentially in an array can be more cache-friendly, potentially leading to faster access times.
Disadvantages of implementing a stack as a sequential array
a. Fixed Size: Arrays have a fixed size, meaning you need to pre-allocate a specific amount of memory, and the stack size is limited to that allocation.
b. Costly Resizing: If the stack exceeds the allocated size, resizing the array involves allocating a new, larger array, copying elements, and deallocating the old array, which can be a costly operation.
c. Inefficient Expansion/Shrinking: Expanding or shrinking the stack requires costly operations, as it involves creating a new array and copying elements.
When to use a linked list of dynamically allocated nodes versus a sequential array for stack implementation
- Use a linked list for dynamic and flexible stack sizes without the need for pre-allocation (evaluating an arithmetic expression)
- Use a sequential array for fast access and memory efficiency when the stack size is known or can be estimated in advance.
When is the size of the stack defined?
You determine the size when you create the stack class
(usually by setting a global variable)
When is the “itemtype” for the stack defined?
You determine the “itemtype” when you create the stack class
(usually by setting a global variable)
–as a good programming practice, you should use integer (because you can use a stack of integers to hold any datatype with minimal changes)
Dynamic Allocation
Indicates that we are using the heap
Creating “nodes” individually and having them hold pointers to eachother
The Heap
Basically “sandbox” memory for the programmer to play with
Node structure needed for a linked list
Contains two things:
1) an ItemType (tells you what will be stored in that node)
2) a pointer (tells you where to look for the next node)
List (data structure)
Simply put: a sequence of zero or more components
A collection of elements or items arranged in a specific order. Lists are fundamental data structures used to organize and store data, allowing operations such as adding, removing, and accessing elements based on their position or key.
Homogenous Lists
Lists where all of the elements/items are of the same type
Examples: A list of integers, A list of strings
Heterogenous Lists
Lists where NOT all of the elements/items are of the same type
The opposite of a Homogeneous List
Basic Operations for a List
1) Create a list
2) Add an Item to the list
3) Delete an Item from the List
4) Print the entire list
5) Search List for a particular Item
6) Sort the list items
7) Purge Duplicate Items
Sorted List v. Unsorted List
- Ordering:
- A sorted list has elements in a specific order, such as ascending or descending numerical order or alphabetical order.
- An unsorted list does not follow a specific order; elements are arranged arbitrarily.
- Insertion of New Elements:
- In a sorted list, when adding a new element, it is inserted in a way that preserves the order of the list.
- In an unsorted list, new elements are typically appended at the end or inserted based on specific rules but without considering overall order.
- Search Operations:
- Searching for an element in a sorted list is often more efficient, commonly using binary search due to the ordered nature of the list.
- Searching in an unsorted list typically involves linear search, which is less efficient than binary search.
Sorted List
- In a sorted list, the elements are arranged in a specific order according to a defined sorting criterion (e.g., numerical order, alphabetical order).
- When a new element is added to a sorted list, it is inserted in a way that maintains the order of the elements.
- Common operations on a sorted list include searching for an element (often using binary search) and inserting new elements in a way that preserves the order.
Unsorted List
- In an unsorted list, the elements are not arranged in any particular order.
- Elements can be added to an unsorted list in any order, and they are typically appended or inserted at the end of the list.
- Common operations on an unsorted list include searching for an element (often using linear search), adding new elements at the end, and removing elements based on their value or position.
Sequential (Array-Based) Implementation of a list
- Sequential indicates “Array-based”
- Components stored in contiguous array cells
- Order of components determined by the order in which the elements are stored
- Data Storage:
- Elements are stored in a contiguous block of memory (an array).
- Each element is accessed using an index (position in the array).
- Size and Capacity:
- The size of the array is fixed or can be dynamically resized.
- Dynamic resizing involves creating a new, larger array, copying elements, and deallocating the old array.
- Insertion and Deletion:
- Insertion or deletion involves shifting elements to make space or remove elements, resulting in potentially costly operations for adding or removing elements in the middle of the list.
- Advantages:
- Efficient random access to elements using indexing.
- Better cache performance due to contiguous memory storage.
- Disadvantages:
- Costly insertions and deletions within the list (except at the ends).
- Wasted memory if the allocated size is not fully utilized.
Linked Implementation or a List
- Components stored in nodes that may not be consecutive in memory
- Each node typically contains the component data along with a “link” which indicates the location of the next node
- Data Storage:
- Elements are stored in nodes, each containing the element and a reference to the next node.
- Elements are not stored in contiguous memory; they are scattered across memory, linked by pointers.
- Size and Capacity:
- The size of the linked list can grow or shrink dynamically as nodes are added or removed.
- Insertion and Deletion:
- Insertion and deletion are more efficient as they involve updating pointers, without the need to shift elements.
- Insertion and deletion can be done anywhere in the list with consistent time complexity.
- Advantages:
- Efficient insertions and deletions at any position within the list.
- No wasted memory due to dynamic allocation of nodes.
- Disadvantages:
- Less efficient random access (no direct indexing), requiring traversal from the head to reach a specific element.
- Higher memory overhead due to storing pointers/references for each element.
Iterators for Lists
- Iterators are methods that allow a structure such as
a List to be processed item by item - They allow for the traversal and access of the elements within a list in a sequential manner. It provides a way to visit each element of the list and perform operations on them without exposing the internal structure or implementation of the list.
- An iterator typically follows a specific pattern that includes functions or methods to:
1) Initialize: Start the iteration at the beginning of the list.
2) Check for End: Determine if there are more elements to iterate over.
3) Retrieve Current Element: Access the current element in the iteration.
4) Move to Next Element: Move the iterator to the next element in the list.
Array-based implementation of a linked list
- Refers to a way of representing a “linked list-like structure” using an array for certain operations or optimizations.
- An array is used to represent the elements of a linked list or to enhance certain functionalities, such as random access to elements, indexing, or combining the benefits of both data structures.
- Array-Based Implementation of a “Linked List-Like” Structure:
- Node Representation:
- Define a structure or class to represent nodes with data and a pointer (index) to the next node.
- Array to Store Nodes:
- Use an array where each element represents a node in the linked list.
- Operations:
- Implement operations like adding nodes, removing nodes, and traversing the linked list using array indices.
- Node Representation:
Creating a linked list using dynamic allocation
- Dynamic allocation is a fundamental concept in computer programming that involves allocating memory for data structures during the program’s execution.
- When creating a linked list, dynamic allocation is commonly used to create nodes and link them together to form the linked list.
- Concept
- Dynamically allocate records (“nodes”)
- Each node holds a data item and one or more links to other nodes in the list
- If one knows where the list begins (“head”), one can follow the links to find the rest of the list
- Must remember to mark the last node so that one can tell when the end of list has been reached
- Here’s a step-by-step approach to create a linked list using dynamic allocation:
- Define the Node Structure: Create a structure or a class to represent a node in the linked list. This structure typically contains the data and a pointer/reference to the next node.
- Allocate Memory for Nodes Dynamically: Use a memory allocation function (e.g., malloc in C/C++, new in C++) to dynamically allocate memory for each node.
- Link Nodes Together: Set the pointers/references in each node to link them together, forming the linked list.
- Handle Deallocation: After using the linked list, ensure to deallocate the allocated memory using the appropriate deallocation function (e.g., free in C/C++, delete in C++) to prevent memory leaks.
Benefits of Dynamic Allocation
- Dynamic allocation is a crucial aspect of working with data structures in programming, particularly when dealing with complex or variable-size data structures. Here are several reasons why dynamic allocation is essential:
Variable Size and Flexibility:
Dynamic allocation allows data structures to adapt to varying or unpredictable sizes of data, enabling the creation of flexible and resizable structures like dynamic arrays, linked lists, trees, etc.
Efficient Memory Usage:
Dynamic allocation optimizes memory usage by allocating memory only when needed. This contrasts with static allocation, where memory is allocated in advance, possibly leading to wasted memory.
Avoid Compile-Time Limits:
Dynamic allocation bypasses compile-time limitations by allowing data structures to grow and shrink as needed during runtime. Static allocation, on the other hand, has fixed memory requirements determined at compile time.
Avoiding Stack Overflow:
In cases where memory requirements exceed the stack’s capacity, using dynamic allocation (e.g., heap allocation) prevents stack overflow issues, ensuring the program can handle larger data structures.
Support for Recursive Data Structures:
Recursive data structures like trees, graphs, and linked lists can be implemented and managed effectively using dynamic allocation. These structures often have an indefinite or unpredictable number of nodes, making dynamic allocation necessary.
Ease of Dealing with Large Data:
Large datasets or complex data structures can be efficiently managed using dynamic allocation. Allocating memory on the heap allows for the handling of extensive data without imposing limits based on stack size or program memory layout.
Sharing Data Across Functions or Modules:
Dynamic allocation enables sharing and passing of data structures across various functions or modules, facilitating more flexible and modular programming.
Dynamically Linked Data Structures:
Data structures like linked lists, trees, and graphs heavily rely on dynamic allocation to link nodes or elements dynamically, as the number of nodes can vary based on runtime conditions.
Memory Recycling and Reusability:
Dynamic allocation enables memory recycling and reusability, improving memory management by deallocating and reallocating memory as needed, reducing memory fragmentation.
Improved Performance:
Dynamic allocation, when used judiciously, can improve performance by allowing efficient memory allocation and deallocation, resulting in reduced overhead for managing data structures.
Singly-linked list v. Doubly-linked list
- Singly-Linked List:
- Contains a reference to the next node.
- Efficient memory usage (single reference per node).
- Forward traversal only.
- Limited to efficient operations in the forward direction.
- Doubly-Linked List:
- Contains references to both the next and previous nodes.
- Less memory-efficient (two references per node).
- Supports forward and backward traversal.
- Enables efficient insertions and deletions in both directions.
Singly-linked lists
- Node Structure:
- Each node in a singly-linked list contains the data and a reference (or pointer) to the next node in the list.
- Traversal:
- Traversing a singly-linked list is done in one direction, starting from the head (first node) and moving towards the tail (last node).
- Operations:
- Common operations include insertion and deletion at the head or tail, and in the middle (with some additional operations compared to a doubly-linked list for similar actions).
- Backward Traversal:
- Not possible to traverse the list backward efficiently, as there is no reference to the previous node from a given node.
- Memory Efficiency:
- More memory-efficient than a doubly-linked list as it only requires a single reference per node (next).
Doubly-linked list
- Node Structure:
- Each node in a doubly-linked list contains the data and references (pointers) to both the next and previous nodes.
- Traversal:
- Traversing a doubly-linked list can be done in both directions: forward (from head to tail) and backward (from tail to head).
- Operations:
- Supports all operations possible in a singly-linked list, as well as efficient insertion and deletion of nodes in both forward and backward directions.
- Backward Traversal:
- Allows efficient backward traversal, as each node has a reference to the previous node.
- Memory Efficiency:
- Less memory-efficient than a singly-linked list due to the extra reference (previous) per node.
Classifying lists by Data-Structure
- Array-Based Lists: Elements are stored in a contiguous block of memory (e.g., dynamic arrays).
- Linked Lists: Elements are stored in nodes with pointers/references to the next and/or previous elements.
- Skip Lists: A probabilistic data structure that uses layers of linked lists for efficient searching.