Fork-Join Parallelism with Data Structures Flashcards
It utilizes multiple cores inside the computer to tackle tasks simultaneously, making processes much quicker.
Parallel Processing
Parallel processing utilizes multiple _ inside the computer to tackle tasks simultaneously, making processes much quicker.
cores
_ is a programming model that allows tasks to be split into subtasks (forking) and later combined (joining) after execution.
Fork-Join Parallelism
The _ splits the task into smaller subtasks and these tasks are executed concurrently.
Fork
After the execution of the subtasks, the task may _ all the results into one result.
join
After the _, one flow turns into two separate flows.
fork
It’s particularly useful in scenarios where a problem can be broken down into subproblems that can be solved independently.
The fork-join framework employs a _ strategy, making it ideal for parallel processing.
divide and conquer
Divide-and-Conquer Algorithm
void DivideAndConquer(Problem P) {
    if (P is base case) {
        Solve P;
    } else {
        Divide P into K subproblems;
        Fork to conquer each subproblem in parallel;
        Join;
        Combine subsolutions into final solution;
    }
}
Noted
The function DivideAndConquer takes problem P as input
If the problem P is a _ , the function directly solves it and returns the solution.
base case
A small, easily solvable instance.
Base Case
If the problem is not a base case, it is _ into K subproblems.
divided
These subproblems are then solved in parallel using the Fork operation, creating new tasks for each subproblem.
The _ operation waits for all subproblems to finish.
Join
Finally, the solutions of the subproblems are combined to form the solution to the original problem.
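The template above can be made concrete with Java's fork/join framework. This is a minimal sketch, not the definitive implementation: the class name SumTask and the THRESHOLD of 1,000 elements are illustrative assumptions. It sums an int array by forking the left half and computing the right half in the current thread.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // base-case size (assumption)
    private final int[] data;
    private final int lo, hi;

    SumTask(int[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {          // base case: solve directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;           // divide into two subproblems
        SumTask left = new SumTask(data, lo, mid);
        SumTask right = new SumTask(data, mid, hi);
        left.fork();                         // conquer the left half in parallel
        long rightSum = right.compute();     // conquer the right half in this thread
        return left.join() + rightSum;       // join, then combine subsolutions
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) data[i] = 1;
        long total = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(total); // prints 10000
    }
}
```

Computing one half directly instead of forking both subtasks is a common idiom: it keeps the current thread doing useful work instead of only waiting.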
The fork/join framework supports a style of parallel programming that solves problems by divide and conquer, in the following manner:
1. Splitting a task into sub-tasks.
2. Solving sub-tasks in parallel
* Sub-tasks can run in parallel on different cores.
* Sub-tasks can also run concurrently in different threads on a single core.
3. Waiting for them to complete
* join() waits for a sub-task to finish
4. Merging the results.
* A task uses calls to join() to merge the sub-task results together.
Noted
For good speedup, most of the work must occur deep in the _, where parallelism is _.
nesting, high
Three-level, two-way nesting = eight parallel flows at the innermost level
FORK-JOIN CREWS
Three (3) key components to achieve parallel processing:
1. Fork-Join Pool: The Conductor of Threads
2. ForkJoinTasks
3. Worker Threads
It manages a pool of worker threads and assigns them tasks from a queue, keeping the workers busy.
Fork-Join Pool: The Conductor of Threads
A thread pool that manages the execution of ForkJoin Tasks.
ForkJoinPool
When a task is submitted to the pool, the ForkJoinPool decides whether to execute it directly or split it into subtasks, distributing them among available _.
worker threads
An abstract class that defines a task that runs within a ForkJoinPool.
ForkJoinTask<V>
(ForkJoinTasks)
There are two (2) main types of tasks:
- RecursiveTask: ForkJoinTask’s subclass for tasks that return values
- RecursiveAction: ForkJoinTask’s subclass for tasks that don’t return values.
ForkJoinTask’s subclass for tasks that return values.
RecursiveTask
Similar to RecursiveAction, but a RecursiveTask returns a result whose type is specified by the type parameter V.
ForkJoinTask’s subclass for tasks that don’t return values.
RecursiveAction
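A small sketch of the no-result case, assuming an illustrative DoubleAction class and THRESHOLD: a RecursiveAction that doubles every element of an array in place, returning nothing.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

class DoubleAction extends RecursiveAction {
    private static final int THRESHOLD = 512; // base-case size (assumption)
    private final int[] data;
    private final int lo, hi;

    DoubleAction(int[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected void compute() {               // note: void, unlike RecursiveTask
        if (hi - lo <= THRESHOLD) {
            for (int i = lo; i < hi; i++) data[i] *= 2;   // base case
        } else {
            int mid = (lo + hi) >>> 1;
            // invokeAll forks one subtask, runs the other, and joins both
            invokeAll(new DoubleAction(data, lo, mid),
                      new DoubleAction(data, mid, hi));
        }
    }
}
```

Usage: `ForkJoinPool.commonPool().invoke(new DoubleAction(arr, 0, arr.length));` — the call returns only after every element has been doubled.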
_ means that the task can be split into subtasks of itself by a divide-and-conquer strategy.
Recursive
These are the workhorses that execute the tasks.
Worker Threads
The worker threads continuously check the _ for available tasks. Once they receive a task, they follow the instructions effectively completing their assigned part of the overall job.
queue
The Fork/Join Framework uses a cool trick called _.
work-stealing
Here’s the idea: when a worker thread finishes its own tasks, it can peek at another thread’s queue and steal any extra work waiting there. This keeps every thread busy and makes sure no time is wasted.
Adds an item to the back of the queue.
Enqueue
Removes an item from the front of the queue.
Dequeue
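The two queue operations can be sketched with java.util.ArrayDeque, one of Java's standard queue implementations: offer() enqueues at the back and poll() dequeues from the front, giving FIFO order.

```java
import java.util.ArrayDeque;

public class QueueDemo {
    public static void main(String[] args) {
        ArrayDeque<String> queue = new ArrayDeque<>();
        queue.offer("task-1");            // enqueue: add to the back
        queue.offer("task-2");
        System.out.println(queue.poll()); // dequeue: removes the front, prints task-1
        System.out.println(queue.poll()); // prints task-2
    }
}
```

(Worker threads in a ForkJoinPool actually use internal work-stealing deques, so this is only an illustration of the enqueue/dequeue idea, not the pool's real data structure.)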
Fork-join Real-life Examples
- WEATHER FORECASTING: Large-scale weather simulations can be divided into smaller regions, each of which can be simulated independently.
- BIG DATA PROCESSING: Large datasets can be divided into smaller chunks and processed in parallel using techniques like MapReduce.
A _ is a path that is followed during a program’s execution.
thread
The majority of programs written nowadays run as a _ thread.
single
For example, a single-threaded program cannot read keystrokes while it is drawing; these tasks cannot be executed by the program at the same time. This problem can be solved through multitasking, so that two or more tasks can be executed simultaneously.
An example of a multithreaded program that we are all familiar with is a word processor. While you are typing, multiple threads are used to
* display your document,
* asynchronously check the spelling and grammar of your document,
* generate a PDF version of the document.
When working on a spreadsheet, a user enters data into a cell, and the following may happen:
* column widths may be adjusted
* repeating cell elements may be replicated;
* spreadsheets may be saved multiple times as the file is further developed.
These are all happening concurrently, with independent threads performing these tasks internally.
Noted
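The word-processor example can be sketched with two Java threads started from one program, each an independent flow of control. The task names are illustrative; start() launches a thread and join() waits for it to finish.

```java
public class TwoThreads {
    public static void main(String[] args) throws InterruptedException {
        Thread speller = new Thread(() -> System.out.println("checking spelling..."));
        Thread renderer = new Thread(() -> System.out.println("rendering document..."));
        speller.start();     // both threads now run concurrently with main
        renderer.start();
        speller.join();      // main waits for each thread to finish
        renderer.join();
    }
}
```

The two println calls may appear in either order, which is exactly the point: once started, the threads are scheduled independently.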
It is the smallest unit of processing that can be performed in an OS.
Thread
Thread is the smallest unit of processing that can be performed in an _.
Operating System (OS)
Thread simply is a _ of a process.
subset
A thread contains all the information in a _.
Thread Control Block (TCB)
The components of Thread Control Block (TCB) have been defined as:
* Thread ID
* Thread state
* CPU information :
— Program counter
— Register contents
* Thread priority
* Pointer to process that created this thread
* Pointer(s) to other thread(s) that were created by this thread
Noted
It is a unique identifier assigned by the Operating System to the thread when it is being created.
Thread ID (TID)
These are the states of the thread which changes as the thread progresses through the system.
Thread states
It includes everything that the OS needs to know about, such as how far the thread has progressed and what data is being used.
CPU Information
It indicates the weight of the thread over other threads which helps the thread scheduler to determine which thread should be selected next from the READY queue.
Thread Priority
Points to the process which triggered the creation of this thread.
Pointer
Pointer which points to the thread(s) created by this thread.
Pointer
A _ is like a container that holds all the resources needed to run a program. It’s a separate execution environment with its own memory space, open files, and other resources.
process
A _ is a lightweight unit of execution within a process. Think of it as a smaller, independent unit that can run concurrently within the process.
thread
Relationship between the Process and its Thread
- Process Control Block (PCB): This block stores information about a process, including its process ID (PID), state, counter, registers, memory limits, and list of open files.
- Thread Control Block (TCB): This block stores information about a thread, including its parent process pointer, thread ID (TID), state, program counter, register set, and stack pointer.
Process Memory: This section represents the memory space allocated to a process, which is divided into different segments:
—Text: Contains the executable code of the process.
—Data: Stores the data used by the process.
—Stack: Used for storing local variables, function arguments, and return addresses during function calls.
Noted
Relationship between the Process and its Thread
This block stores information about a process, including its process ID (PID), state, counter, registers, memory limits, and list of open files.
Process Control Block (PCB)
Relationship between the Process and its Thread
This block stores information about a thread, including its parent process pointer, thread ID (TID), state, program counter, register set, and stack pointer.
Thread Control Block (TCB)
Relationship between the Process and its Thread
This section represents the memory space allocated to a process.
Process Memory
[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread
Contains the executable code of the process.
Text
[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread
Stores the data used by the process.
Data
[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread
Used for storing local variables, function arguments, and return addresses during function calls.
Stack
Relationship between the Process and its Thread
- The thread control block points to the process control block, indicating that the thread belongs to that specific process.
- The thread control block also points to the process memory, indicating that the thread operates within the memory space allocated to its parent process.
Noted
PROCESS VS THREAD
Process
* Process is heavyweight, or resource-intensive.
* Process switching needs interaction with the operating system.
* In multiple-processing environments, each process executes the same code but has its own memory and file resources.
* If a single-threaded process is blocked, no other work in that process can execute until it is unblocked.
* Multiple processes without using threads use more resources.
* In multiple processes, each process operates independently of the others.
Thread
* Thread is lightweight, taking fewer resources than a process.
* Thread switching does not need to interact with the operating system.
* All threads of a process can share the same set of open files and child processes.
* While one thread is blocked and waiting, a second thread in the same task can run.
* Multithreaded processes use fewer resources.
* One thread can read, write, or change another thread’s data.
Noted
It is the ability of a program or an operating system to enable more than one user at a time without requiring multiple copies of the program running on the computer.
Multithreading
Multithreading can also handle _ requests from the _ user.
multiple, same
Each user request for a program or system service is tracked as a thread with a _ identity.
separate
Fast _ and large _ are needed for multithreading.
CPU speed,
memory capacities
A _ executes pieces, or threads, of various programs so fast, it appears the computer is handling multiple requests simultaneously.
single processor
Multiple threads can exist within one process where:
* Each thread contains its own register set and local variables (stored in the stack) .
* All threads of a process share global variables (stored in heap) and the program code.
Noted
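That sharing can be sketched with an illustrative SharedState class: the counter field lives on the heap and is visible to both threads, while each thread's loop index i lives on its own stack. AtomicInteger is used so the concurrent increments do not race.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedState {
    static final AtomicInteger counter = new AtomicInteger(); // shared (heap)

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 1_000; i++) {   // i is per-thread (stack)
                counter.incrementAndGet();       // atomic update of shared state
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println(counter.get()); // prints 2000
    }
}
```

With a plain `int` counter and `counter++` instead of AtomicInteger, the two threads could interleave their updates and lose increments, which is exactly the kind of conflict thread synchronization guards against.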
In _, the state of a thread is saved and the state of another thread is loaded whenever any interrupt (due to I/O or manually set) takes place.
context switching
In context switching, the state of a thread is _ and the state of another thread is _ whenever any interrupt (due to I/O or manually set) takes place.
saved, loaded
Context switching takes place so frequently that all the threads appear to be running _.
in parallel
This is termed multitasking.
In context switching, the state of a thread is saved and the state of another thread is loaded whenever any interrupt (due to I/O or manually set) takes place. Context switching takes place so frequently that all the threads appear to be running in parallel (this is termed multitasking).
Noted
The concept of multi-threading needs a proper understanding of these two terms - a _ and a _.
process, thread
A _ is a program being executed.
process
A process can be further divided into _ units known as threads.
independent
A process can be further divided into independent units known as _.
threads
A _ is like a small light-weight process within a process.
thread
We can say a collection of threads is what is known as a _.
process
How does multithreading work?
* Processor Handling: The processor can execute only one instruction at a time, but it switches between different threads so fast that it gives the illusion of simultaneous execution.
* Thread Synchronization: Each thread is like a separate task within a program. They share resources and work together smoothly, ensuring programs run efficiently.
* Efficient Execution: Threads in a program can run independently or wait for their turn to process, making programs faster and more responsive.
* Programming Considerations: Programmers need to be careful about managing threads to avoid problems like conflicts or situations where threads get stuck waiting for each other.
Noted
How does multithreading work?
The processor can execute only one instruction at a time, but it switches between different threads so fast that it gives the illusion of simultaneous execution.
Processor Handling
How does multithreading work?
Each thread is like a separate task within a program. They share resources and work together smoothly, ensuring programs run efficiently.
Thread Synchronization
How does multithreading work?
Threads in a program can run independently or wait for their turn to process, making programs faster and more responsive.
Efficient Execution
How does multithreading work?
Programmers need to be careful about managing threads to avoid problems like conflicts or situations where threads get stuck waiting for each other.
Programming Considerations
Multithreading vs. Multiprocessing
Multithreading - refers to the ability of a processor to execute multiple threads concurrently, where the threads belong to and share the resources of a process.
Multiprocessing - refers to the ability of a system to use multiple processors concurrently, where each processor can run one or more threads.
Noted
It refers to the ability of a processor to execute multiple threads concurrently, where the threads belong to and share the resources of a process.
Multithreading
It refers to the ability of a system to use multiple processors concurrently, where each processor can run one or more threads.
Multiprocessing
The thread has its own code, data, and files, and it runs sequentially.
Single Processor Single Thread
Each thread has its code, data, and files, but they share the same processor.
Single Processor Multithread
Each processor has its memory and can execute threads independently.
Multiprocessing
This provides the highest level of parallelism and is suitable for large-scale, computationally intensive tasks.
Multiprocessing
Multithreading vs. Multiprocessing
- Multithreading is useful for IO-bound processes, such as reading files from a network or database since each thread can run the IO-bound process concurrently.
- Multiprocessing is useful for CPU-bound processes, such as computationally heavy tasks since it will benefit from having multiple processors; similar to how multicore computers work faster than computers with a single core.
Noted
It is useful for IO-bound processes, such as reading files from a network or database since each thread can run the IO-bound process concurrently.
Multithreading
It is useful for CPU-bound processes, such as computationally heavy tasks since it will benefit from having multiple processors; similar to how multicore computers work faster than computers with a single core.
Multiprocessing
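The IO-bound case above can be sketched with a fixed thread pool from java.util.concurrent. The sleeps stand in for network or database waits, and the pool size, sleep time, and file names are illustrative assumptions.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IoBoundDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Callable<String>> tasks = List.of(
            () -> { Thread.sleep(100); return "file-1"; },  // each task mostly waits,
            () -> { Thread.sleep(100); return "file-2"; },  // so the four "downloads"
            () -> { Thread.sleep(100); return "file-3"; },  // overlap on the pool's
            () -> { Thread.sleep(100); return "file-4"; }   // four threads
        );
        // invokeAll blocks until every task finishes; because the waits overlap,
        // the batch takes roughly 100 ms rather than 400 ms run one after another.
        List<Future<String>> results = pool.invokeAll(tasks);
        for (Future<String> f : results) System.out.println(f.get());
        pool.shutdown();
    }
}
```

For a CPU-bound workload the same pool would gain little beyond the number of physical cores, which is why such tasks are better served by multiprocessing.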
Fork–join parallelism delineates a set of tasks that can be executed simultaneously, beginning at the same starting point, the fork, and continuing until all concurrent tasks have finished by reaching the joining point. Only when all the concurrent tasks defined by the fork-join have been completed will the succeeding computation proceed.
Multithreading is a CPU feature that allows programmers to split processes into smaller subtasks called threads that can be executed concurrently. These threads may be run asynchronously, concurrently, or parallelly across one or more processors to improve the performance of the application. The ability to run tasks concurrently also makes multithreaded applications and APIs more responsive to the end user.
Noted