Nextflow Flashcards

1
Q

DSL

A

Domain-specific language: “a computer language specialized to a particular application domain. This is in contrast to a general-purpose language, which is broadly applicable across domains.” (Wikipedia). Nextflow has its own DSL, which is the language used to write pipeline scripts.

2
Q

DSL2

A

Nextflow released a major update to its DSL (the pipeline code language) around 2020, called DSL2. Pipelines written in the old syntax are called DSL1. DSL1 support was removed in Nextflow 22.12.0-edge, so current releases only run DSL2 pipelines.

3
Q

Process

A

The base unit of a Nextflow pipeline. A process has inputs, outputs, and a script block, among other things. The script block can be written in any scripting language available where the task runs, but it is usually Bash and usually just runs a bioinformatics command-line tool.
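
A minimal sketch of what a process can look like in DSL2. The process name COUNT_LINES and the .count output suffix are made up for illustration; the script block is plain Bash.

```nextflow
// Hypothetical process: counts the lines in each input file.
process COUNT_LINES {
    input:
    path reads                      // one staged input file per task

    output:
    path "${reads}.count"           // file written by the script below

    script:
    """
    wc -l < ${reads} > ${reads}.count
    """
}
```

A process definition like this does nothing on its own; it only runs once a workflow calls it with an input channel.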

4
Q

Module

A

A Nextflow script that can be shared and imported into a pipeline. It typically contains just one process, but can contain more.
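
As a rough illustration, importing a module uses an include statement. The module path ./modules/count_lines.nf, the process name COUNT_LINES, and the data/*.txt glob are assumptions for this sketch, not part of any real pipeline.

```nextflow
// main.nf
// Assumes a module file ./modules/count_lines.nf that defines a process
// named COUNT_LINES taking a single path input (hypothetical names).
include { COUNT_LINES } from './modules/count_lines'

workflow {
    // Feed every matching file to the imported process.
    COUNT_LINES(Channel.fromPath('data/*.txt'))
}
```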

5
Q

Workflow/Sub-Workflow

A

Workflow can be a generic term for a pipeline, but it also has a specific meaning in Nextflow code. Every runnable DSL2 pipeline must include a workflow block, which collects processes and channel logic into a single unit. Workflows can also be named, chained together, and run independently. A sub-workflow is a workflow that is called by another workflow.
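
A small sketch of a named workflow used as a sub-workflow. The names SAY_HELLO, GREET, and greetings are invented for this example.

```nextflow
process SAY_HELLO {
    input:
    val name

    output:
    stdout

    script:
    """
    echo "Hello, ${name}"
    """
}

// Named workflow: takes a channel, runs a process, emits a channel.
workflow GREET {
    take:
    names

    main:
    SAY_HELLO(names)

    emit:
    greetings = SAY_HELLO.out
}

// Unnamed entry workflow: calls GREET as a sub-workflow.
workflow {
    GREET(Channel.of('Ada', 'Grace'))
    GREET.out.greetings.view()
}
```

The input channel holds two values, so Nextflow runs SAY_HELLO twice, once per value.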

6
Q

Task

A

A task is Nextflow's unit of execution. A process defines a script template and input/output channels. When you run the pipeline, Nextflow generates a task for every set of inputs to a process, resolving the variables in the template script and then running the result. A single process can therefore spawn many tasks. You can think of a task as a process instance.

7
Q

Nextflow Workflow

A

See: Nextflow Pipeline

8
Q

Nextflow Pipeline

A

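The complete set of processes, channels, and workflow logic that Nextflow runs from start to finish. "Workflow" is often used as a generic synonym. See also: Workflow/Sub-Workflow.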

9
Q

Nextflow Channels

A

Nextflow channels are the pipes that connect processes together. Outputs from each task of a process go into a channel, which can then be used as the input to another process. Channels are special dataflow variables: they pass items along asynchronously, so a downstream process can start as soon as its inputs arrive.
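
A sketch of two processes connected by a channel. The process names MAKE_GREETING and SHOUT and the file name greeting.txt are assumptions made up for this example.

```nextflow
process MAKE_GREETING {
    input:
    val name

    output:
    path 'greeting.txt'

    script:
    """
    echo "hello ${name}" > greeting.txt
    """
}

process SHOUT {
    input:
    path greeting

    output:
    stdout

    script:
    """
    tr '[:lower:]' '[:upper:]' < ${greeting}
    """
}

workflow {
    names_ch = Channel.of('ada', 'grace')   // channel of input values
    MAKE_GREETING(names_ch)                 // outputs land in MAKE_GREETING.out
    SHOUT(MAKE_GREETING.out)                // that channel feeds the next process
    SHOUT.out.view()
}
```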

10
Q

Operator

A

Special functions (methods) for working with channels, for example to filter, fork, transform, combine, or reduce them.
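
For illustration, a short chain of operators on a channel of numbers. The values are arbitrary.

```nextflow
workflow {
    Channel.of(1, 2, 3, 4, 5, 6)
        .filter { n -> n % 2 == 0 }        // keep even values: 2, 4, 6
        .map { n -> n * n }                // square each: 4, 16, 36
        .reduce { acc, n -> acc + n }      // sum into a single value: 56
        .view { total -> "sum of even squares: ${total}" }
}
```

Each operator returns a new channel, which is why the calls can be chained.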

11
Q

Executor

A

An interface between the Nextflow pipeline and the underlying compute infrastructure. Nextflow has many executors to support many compute environments (AWS Batch, Azure Batch, Google Cloud Batch, Slurm, LSF, Kubernetes, and others).
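
The executor is normally chosen in configuration rather than in the pipeline code. A minimal sketch of a nextflow.config, where the profile name cluster and the queue name general are placeholders:

```nextflow
// nextflow.config
profiles {
    // Default profile: run tasks on the local machine.
    standard {
        process.executor = 'local'
    }
    // Hypothetical cluster profile: submit each task as a Slurm job.
    cluster {
        process.executor = 'slurm'
        process.queue    = 'general'   // placeholder queue name
    }
}
```

With a config like this, the same pipeline could be launched with -profile cluster to send its tasks to Slurm instead of running them locally.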

12
Q

xpack

A

Seqera sells Nextflow xpack licenses, which provide paid extensions to Nextflow.

13
Q

Nextflow head job

A

Nextflow orchestrates the execution of numerous tasks, so when you run a Nextflow pipeline you can have hundreds or even thousands of tasks running at the same time. Although they are all managed by Nextflow, there is only one Nextflow process, which we refer to as the head job when working in a cloud computing environment or on an HPC cluster. Some people run Nextflow on the login node of a cluster, but the best practice is to submit it as a job (hence the term head job) that, in turn, submits a job for every task.
