Fork-Join Parallelism with Data Structures Flashcards

1
Q

It utilizes multiple cores inside the computer to tackle tasks simultaneously, making processes much quicker.

A

Parallel Processing

2
Q

Parallel processing utilizes multiple _ inside the computer to tackle tasks simultaneously, making processes much quicker.

A

cores

3
Q

_ is a programming model that allows tasks to be split into subtasks (forking) and later combined (joining) after execution.

A

Fork-Join Parallelism

4
Q

The _ splits the task into smaller subtasks and these tasks are executed concurrently.

A

Fork

5
Q

After the execution of the subtasks, the task may _ all the results into one result.

A

join

6
Q

After the _, one flow turns into two separate flows.

A

fork

It’s particularly useful in scenarios where a problem can be broken down into subproblems that can be solved independently.

7
Q

The fork-join framework employs a _ strategy, making it ideal for parallel processing.

A

divide and conquer

8
Q

Divide-and-Conquer Algorithm

void DivideAndConquer(Problem P) {
    if (P is base case) {
        Solve P;
    } else {
        Divide P into K subproblems;
        Fork to conquer each subproblem in parallel;
        Join;
        Combine subsolutions into final solution;
    }
}

A

Noted
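
A minimal Java sketch of this skeleton, assuming the standard java.util.concurrent Fork/Join classes; the element-squaring problem and the THRESHOLD cut-off are illustrative choices, not part of the lecture:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example: square every element of an array in parallel.
class SquareTask extends RecursiveAction {
    private static final int THRESHOLD = 1_000;  // assumed "base case" size
    private final long[] data;
    private final int lo, hi;

    SquareTask(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {              // base case: solve P directly
            for (int i = lo; i < hi; i++) data[i] *= data[i];
        } else {
            int mid = (lo + hi) >>> 1;           // divide P into two subproblems
            // fork both halves and join (wait) until they finish
            invokeAll(new SquareTask(data, lo, mid),
                      new SquareTask(data, mid, hi));
            // nothing to combine here: results are written in place
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        java.util.Arrays.fill(data, 3);
        ForkJoinPool.commonPool().invoke(new SquareTask(data, 0, data.length));
        System.out.println(data[0]);             // prints 9
    }
}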

9
Q

The function DivideAndConquer takes a problem P as input.

If the problem P is a _, the function directly solves it and returns the solution.

A

base case

10
Q

A small, easily solvable instance.

A

Base Case

11
Q

If the problem is not a base case, it is _ into K subproblems.

A

divided

These subproblems are then solved in parallel using the Fork operation, creating new tasks for each subproblem.

12
Q

The _ operation waits for all subproblems to finish.

A

Join

Finally, the solutions of the subproblems are combined to form the solution to the original problem.

13
Q

The fork/join framework supports a style of parallel programming that solves problems by “divide and conquer”, in the following manner:

1. Splitting a task into sub-tasks.

2. Solving sub-tasks in parallel
* Sub-tasks can run in parallel on different cores.
* Sub-tasks can also run concurrently in different threads on a single core.

3. Waiting for them to complete
* join() waits for a sub-task to finish

4. Merging the results.
* A task uses calls to join() to merge the sub-task results together.

A

Noted
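
A minimal Java sketch of these four steps, assuming the standard ForkJoinPool and RecursiveTask classes; the array-sum problem and the THRESHOLD value are illustrative:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example: sum an array, following the four steps above.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;  // assumed cut-off for splitting
    private final long[] values;
    private final int lo, hi;

    SumTask(long[] values, int lo, int hi) {
        this.values = values; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {              // small enough: solve directly
            long sum = 0;
            for (int i = lo; i < hi; i++) sum += values[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;               // 1. split the task into sub-tasks
        SumTask left  = new SumTask(values, lo, mid);
        SumTask right = new SumTask(values, mid, hi);
        left.fork();                             // 2. run the left half in parallel
        long rightSum = right.compute();         //    compute the right half in this thread
        long leftSum  = left.join();             // 3. wait for the forked sub-task to finish
        return leftSum + rightSum;               // 4. merge the results
    }

    public static void main(String[] args) {
        long[] values = new long[1_000_000];
        java.util.Arrays.fill(values, 1);
        long total = ForkJoinPool.commonPool().invoke(new SumTask(values, 0, values.length));
        System.out.println(total);               // prints 1000000
    }
}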

14
Q

For good speedup, most of the work must occur deep in the _, where parallelism is _.

A

nesting, high

Three-level, two-way nesting = eight parallel flows at the innermost level

15
Q

FORK-JOIN CREWS

Three (3) key components to achieve parallel processing:

A

1. Fork-Join Pool: The Conductor of Threads
2. ForkJoinTasks
3. Worker Threads

16
Q

It manages a pool of worker threads and assigns them tasks from a queue, keeping every worker busy, much like a conductor directing an orchestra.

A

Fork-Join Pool: The Conductor of Threads

17
Q

A thread pool that manages the execution of ForkJoinTasks.

A

ForkJoinPool
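
A tiny illustrative snippet, assuming the standard java.util.concurrent.ForkJoinPool API:

import java.util.concurrent.ForkJoinPool;

// Illustrative only: create a pool, inspect it, and reuse the shared common pool.
public class PoolDemo {
    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(4);          // pool with 4 worker threads
        System.out.println(pool.getParallelism());        // prints 4
        pool.shutdown();

        // Most code simply reuses the JVM-wide common pool:
        System.out.println(ForkJoinPool.commonPool().getParallelism());
    }
}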

18
Q

When a task is submitted to the pool, the ForkJoinPool decides whether to execute it directly or split it into subtasks, distributing them among available _.

A

worker threads

19
Q

An abstract class that defines a task that runs within a ForkJoinPool.

A

ForkJoinTask<V>
(ForkJoinTasks)

20
Q

There are two (2) main types of tasks:

A
  • RecursiveTask: ForkJoinTask’s subclass for tasks that return values
  • RecursiveAction: ForkJoinTask’s subclass for tasks that don’t return values.
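
A short sketch of the difference; the class names CountWords and ClearCache are made up purely for illustration:

import java.util.concurrent.RecursiveAction;
import java.util.concurrent.RecursiveTask;

// Hypothetical stubs showing the difference in compute() signatures.
class CountWords extends RecursiveTask<Integer> {   // returns a value
    @Override protected Integer compute() { return 0; }
}

class ClearCache extends RecursiveAction {          // returns nothing
    @Override protected void compute() { /* do the work */ }
}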
21
Q

ForkJoinTask’s subclass for tasks that return values.

A

RecursiveTask

Similar to RecursiveAction, but a RecursiveTask returns a result whose type is specified by the type parameter V.

22
Q

ForkJoinTask’s subclass for tasks that don’t return values.

A

RecursiveAction

23
Q

_ means that the task can be split into subtasks of itself by a divide-and-conquer strategy.

A

Recursive

24
Q

These are the workhorses that execute the tasks.

A

Worker Threads

25
Q

The worker threads continuously check the _ for available tasks. Once they receive a task, they follow the instructions, effectively completing their assigned part of the overall job.

A

queue

26
Q

The Fork/Join Framework uses a cool trick called _.

A

work-stealing

Here’s the idea: when a worker thread finishes its own task, it can peek over at another thread’s queue and take on any extra work waiting there. This keeps every worker busy and makes sure no time is wasted.
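
Purely as an illustration, ForkJoinPool exposes a rough count of how often idle workers stole queued sub-tasks from busier ones; the BusyWork task below is a made-up workload:

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Illustrative only: report the pool's estimate of how many sub-tasks were "stolen".
public class StealDemo {
    static class BusyWork extends RecursiveAction {
        private final int n;
        BusyWork(int n) { this.n = n; }
        @Override protected void compute() {
            if (n <= 1) {
                Math.sqrt(n);                       // trivial leaf work
            } else {
                invokeAll(new BusyWork(n / 2), new BusyWork(n - n / 2));
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool();
        pool.invoke(new BusyWork(100_000));
        System.out.println("approximate steals: " + pool.getStealCount());
        pool.shutdown();
    }
}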

27
Q

Adds an item to the back of the queue.

A

Enqueue

28
Q

Removes an item from the front of the queue.

A

Dequeue
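
A minimal Java sketch of these two operations, using java.util.ArrayDeque as the queue implementation (an illustrative choice):

import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: a plain FIFO queue of task names.
public class QueueDemo {
    public static void main(String[] args) {
        Queue<String> tasks = new ArrayDeque<>();
        tasks.offer("taskA");               // enqueue: add to the back
        tasks.offer("taskB");
        System.out.println(tasks.poll());   // dequeue: remove from the front -> taskA
        System.out.println(tasks.poll());   // -> taskB
    }
}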

29
Q

Fork-join Real-life Examples

A
  • WEATHER FORECASTING: Large-scale weather simulations can be divided into smaller regions, each of which can be simulated independently.
  • BIG DATA PROCESSING: Large datasets can be divided into smaller chunks and processed in parallel using techniques like MapReduce.
30
Q

A _ is a path that is followed during a program’s execution.

A

thread

31
Q

The majority of programs written nowadays run as a _ thread.

A

single

For example, such a program is not capable of reading keystrokes while making drawings; these tasks cannot be executed by the program at the same time. This problem can be solved through multitasking, so that two or more tasks can be executed simultaneously.

32
Q

An example of a multithreaded program that we are all familiar with is a word processor. While you are typing, multiple threads are used to
* display your document,
* asynchronously check the spelling and grammar of your document, and
* generate a PDF version of the document.

When working on a spreadsheet, a user enters data into a cell, and the following may happen:
* column widths may be adjusted
* repeating cell elements may be replicated;
* spreadsheets may be saved multiple times as the file is further developed.

These are all happening concurrently, with independent threads performing these tasks internally.

A

Noted

33
Q

It is the smallest unit of processing that can be performed in an OS.

A

Thread
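
A minimal Java sketch, assuming the standard java.lang.Thread API:

// Illustrative only: starting two threads and waiting for them to finish.
public class ThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> System.out.println("hello from " + Thread.currentThread().getName()));
        Thread t2 = new Thread(() -> System.out.println("hello from " + Thread.currentThread().getName()));
        t1.start();          // both threads now run concurrently with main
        t2.start();
        t1.join();           // main waits until each thread has finished
        t2.join();
    }
}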

34
Q

Thread is the smallest unit of processing that can be performed in an _.

A

Operating System (OS)

35
Q

A thread is simply a _ of a process.

A

subset

36
Q

All the information about a thread is kept in a _.

A

Thread Control Block (TCB)

37
Q

The components of Thread Control Block (TCB) have been defined as:
* Thread ID
* Thread state
* CPU information:
  - Program counter
  - Register contents
* Thread priority
* Pointer to process that created this thread
* Pointer(s) to other thread(s) that were created by this thread

A

Noted
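
Purely as an illustration (a real TCB is a kernel data structure, not a Java object), the fields above could be sketched like this; all field names are hypothetical:

import java.util.List;

// Hypothetical sketch of the information a Thread Control Block holds.
class ThreadControlBlock {
    long threadId;                       // Thread ID (TID)
    String state;                        // thread state, e.g. READY, RUNNING, WAITING
    long programCounter;                 // CPU information: next instruction address
    long[] registerContents;             // CPU information: saved register values
    int priority;                        // thread priority
    ProcessControlBlock parentProcess;   // pointer to the process that created this thread
    List<ThreadControlBlock> children;   // pointers to threads created by this thread
}

class ProcessControlBlock { long processId; }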

38
Q

It is a unique identifier assigned by the Operating System to the thread when it is being created.

A

Thread ID (TID)

39
Q

These are the states of the thread, which change as the thread progresses through the system.

A

Thread states

40
Q

It includes everything that the OS needs to know about the thread's execution, such as how far the thread has progressed and what data is being used.

A

CPU Information

41
Q

It indicates the weight of the thread relative to other threads, which helps the thread scheduler determine which thread should be selected next from the READY queue.

A

Thread Priority

42
Q

Points to the process which triggered the creation of this thread.

A

Pointer to the parent process

43
Q

Pointer which points to the thread(s) created by this thread.

A

Pointer(s) to child thread(s)

44
Q

A _ is like a container that holds all the resources needed to run a program. It’s a separate execution environment with its own memory space, open files, and other resources.

A

process

45
Q

A _ is a lightweight unit of execution within a process. Think of it as a smaller, independent unit that can run concurrently within the process.

A

thread

46
Q

Relationship between the Process and its Thread

  • Process Control Block (PCB): This block stores information about a process, including its process ID (PID), state, counter, registers, memory limits, and list of open files.
  • Thread Control Block (TCB): This block stores information about a thread, including its parent process pointer, thread ID (TID), state, program counter, register set, and stack pointer.
  • Process Memory: This section represents the memory space allocated to a process, which is divided into different segments:
    Text: Contains the executable code of the process.
    Data: Stores the data used by the process.
    Stack: Used for storing local variables, function arguments, and return addresses during function calls.
A

Noted

47
Q

Relationship between the Process and its Thread

This block stores information about a process, including its process ID (PID), state, counter, registers, memory limits, and list of open files.

A

Process Control Block (PCB)

48
Q

Relationship between the Process and its Thread

This block stores information about a thread, including its parent process pointer, thread ID (TID), state, program counter, register set, and stack pointer.

A

Thread Control Block (TCB)

49
Q

Relationship between the Process and its Thread

This section represents the memory space allocated to a process.

A

Process Memory

50
Q

[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread

Contains the executable code of the process.

A

Text

51
Q

[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread

Stores the data used by the process.

A

Data

52
Q

[PROCESS MEMORY SEGMENT] Relationship between the Process and its Thread

Used for storing local variables, function arguments, and return addresses during function calls.

A

Stack

53
Q

Relationship between the Process and its Thread

  • The thread control block points to the process control block, indicating that the thread belongs to that specific process.
  • The thread control block also points to the process memory, indicating that the thread operates within the memory space allocated to its parent process.
A

Noted

54
Q

PROCESS VS THREAD

Process
* A process is heavyweight, or resource-intensive.
* Process switching requires interaction with the operating system.
* In multiple-processing environments, each process executes the same code but has its own memory and file resources.
* If one process is blocked, then no other process can execute until the first process is unblocked.
* Multiple processes without using threads use more resources.
* In multiple processes, each process operates independently of the others.

Thread
* A thread is lightweight, taking fewer resources than a process.
* Thread switching does not need to interact with the operating system.
* All threads can share the same set of open files and child processes.
* While one thread is blocked and waiting, a second thread in the same task can run.
* Multithreaded processes use fewer resources.
* One thread can read, write, or change another thread’s data.

A

Noted

55
Q

It is the ability of a program or an operating system to enable more than one user at a time without requiring multiple copies of the program running on the computer.

A

Multithreading

56
Q

Multithreading can also handle _ requests from the _ user.

A

multiple, same

57
Q

Each user request for a program or system service is tracked as a thread with a _ identity.

A

separate

58
Q

Fast _ and large _ are needed for multithreading.

A

CPU speed,
memory capacities

59
Q

A _ executes pieces, or threads, of various programs so fast that it appears the computer is handling multiple requests simultaneously.

A

single processor

60
Q

Multiple threads can exist within one process, where:
* Each thread contains its own register set and local variables (stored in the stack).
* All threads of a process share global variables (stored in the heap) and the program code.

A

Noted
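
A minimal Java sketch of this sharing, assuming java.util.concurrent.atomic.AtomicInteger for the shared counter; the loop counts are illustrative:

import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: two threads share one heap object but keep their own local variables.
public class SharingDemo {
    static final AtomicInteger shared = new AtomicInteger();   // shared by all threads (heap)

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            int local = 0;                       // each thread gets its own copy (stack)
            for (int i = 0; i < 1_000; i++) {
                local++;
                shared.incrementAndGet();
            }
            System.out.println(Thread.currentThread().getName() + " local = " + local);
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join();  b.join();
        System.out.println("shared = " + shared.get());        // 2000: both threads updated it
    }
}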

61
Q

In _, the state of a thread is saved and the state of another thread is loaded whenever any interrupt (due to I/O or manually set) takes place.

A

context switching

62
Q

In context switching, the state of a thread is _ and the state of another thread is _ whenever any interrupt (due to I/O or manually set) takes place.

A

saved, loaded

63
Q

Context switching takes place so frequently that all the threads appear to be running _.

A

in parallel

This is termed multitasking.

64
Q

In context switching, the state of a thread is saved and the state of another thread is loaded whenever any interrupt (due to I/O or manually set) takes place. Context switching takes place so frequently that all the threads appear to be running in parallel (this is termed multitasking).

A

Noted

65
Q

The concept of multithreading requires a proper understanding of these two terms: a _ and a _.

A

process, thread

66
Q

A _ is a program being executed.

A

process

67
Q

A process can be further divided into _ units known as threads.

A

independent

68
Q

A process can be further divided into independent units known as _.

A

threads

69
Q

A _ is like a small, lightweight process within a process.

A

thread

70
Q

We can say a collection of threads is what is known as a _.

A

process

71
Q

How does multithreading work?
* Processor Handling: The processor can execute only one instruction at a time, but it switches between different threads so fast that it gives the illusion of simultaneous execution.
* Thread Synchronization: Each thread is like a separate task within a program. They share resources and work together smoothly, ensuring programs run efficiently.
* Efficient Execution: Threads in a program can run independently or wait for their turn to process, making programs faster and more responsive.
* Programming Considerations: Programmers need to be careful about managing threads to avoid problems like conflicts or situations where threads get stuck waiting for each other.

A

Noted
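
A minimal Java sketch of these ideas, using a fixed thread pool from java.util.concurrent (an illustrative choice, not part of the lecture): four tasks share two worker threads, so tasks take turns while still running concurrently.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: four tasks scheduled onto two threads.
public class MultithreadingDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        for (int i = 1; i <= 4; i++) {
            int taskId = i;
            workers.submit(() ->
                System.out.println("task " + taskId + " on " + Thread.currentThread().getName()));
        }
        workers.shutdown();                          // accept no new tasks; let queued ones finish
        workers.awaitTermination(5, TimeUnit.SECONDS);
    }
}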

72
Q

How does multithreading work?

The processor can execute only one instruction at a time, but it switches between different threads so fast that it gives the illusion of simultaneous execution.

A

Processor Handling

73
Q

How does multithreading work?

Each thread is like a separate task within a program. They share resources and work together smoothly, ensuring programs run efficiently.

A

Thread Synchronization

74
Q

How does multithreading work?

Threads in a program can run independently or wait for their turn to process, making programs faster and more responsive.

A

Efficient Execution

75
Q

How does multithreading work?

Programmers need to be careful about managing threads to avoid problems like conflicts or situations where threads get stuck waiting for each other.

A

Programming Considerations

76
Q

Multithreading vs. Multiprocessing

Multithreading - refers to the ability of a processor to execute multiple threads concurrently, where each thread runs a process.
Multiprocessing - refers to the ability of a system to run multiple processors concurrently, where each processor can run one or more threads.

A

Noted

77
Q

It refers to the ability of a processor to execute multiple threads concurrently, where each thread runs a process.

A

Multithreading

78
Q

It refers to the ability of a system to run multiple processors concurrently, where each processor can run one or more threads.

A

Multiprocessing

79
Q

The thread has its own code, data, and files, and it runs sequentially.

A

Single Processor Single Thread

80
Q

Each thread has its own code, data, and files, but they share the same processor.

A

Single Processor Multithread

81
Q

Each processor has its own memory and can execute threads independently.

A

Multiprocessing

82
Q

This provides the highest level of parallelism and is suitable for large-scale, computationally intensive tasks.

A

Multiprocessing

83
Q

Multithreading vs. Multiprocessing

  • Multithreading is useful for IO-bound processes, such as reading files from a network or database since each thread can run the IO-bound process concurrently.
  • Multiprocessing is useful for CPU-bound processes, such as computationally heavy tasks since it will benefit from having multiple processors; similar to how multicore computers work faster than computers with a single core.
A

Noted

84
Q

It is useful for IO-bound processes, such as reading files from a network or database since each thread can run the IO-bound process concurrently.

A

Multithreading

85
Q

It is useful for CPU-bound processes, such as computationally heavy tasks since it will benefit from having multiple processors; similar to how multicore computers work faster than computers with a single core.

A

Multiprocessing

86
Q

Fork–join parallelism delineates a set of tasks that can be executed simultaneously, beginning at the same starting point, the fork, and continuing until all concurrent tasks have finished and reached the joining point. Only when all the concurrent tasks defined by the fork-join have been completed will the succeeding computation proceed.

Multithreading is a CPU feature that allows programmers to split processes into smaller subtasks called threads that can be executed concurrently. These threads may be run asynchronously, concurrently, or in parallel across one or more processors to improve the performance of the application. The ability to run tasks concurrently also makes multithreaded applications and APIs more responsive to the end user.

A

Noted