Getting started

Requirements

Nextflow can be used on any POSIX compatible system (Linux, OS X, etc). It requires Bash 3.2 (or later) and Java 11 (or later, up to 21) to be installed.

For the execution in a cluster of computers, the use of a shared file system is required to allow the sharing of tasks input/output files.

Nextflow can also be run on Windows through WSL.

Tip

We recommend that you install Java through SDKMAN!, and that you use the latest LTS version of Corretto or Temurin. See this website for more information. While other Java distros may work at first or even most of the time, many users have experienced issues that are difficult to debug and are usually resolved by using one of the recommended distros.

To install Corretto 17:

sdk install java 17.0.6-amzn

To install Temurin 17:

sdk install java 17.0.6-tem

Installation

Nextflow is distributed as a self-installing package, which means that it does not require any special installation procedure.

It only needs two easy steps:

  1. Download the executable package by copying and pasting either one of the following commands in your terminal window: wget -qO- https://get.nextflow.io | bash

    Or, if you prefer curl: curl -s https://get.nextflow.io | bash

    This will create the nextflow main executable file in the current directory.

  2. Make the binary executable on your system by running chmod +x nextflow.

  3. Optionally, move the nextflow file to a directory accessible by your $PATH variable (this is only required to avoid remembering and typing the full path to nextflow each time you need to run it).

Tip

Set export CAPSULE_LOG=none to make the dependency installation logs less verbose.

Tip

If you don’t have curl or wget, you can also download the Nextflow launcher script from the project releases page on GitHub, in lieu of step 1.

Tip

To avoid downloading the dependencies, you can also use the nextflow-VERSION-all distribution available for every Nextflow release on Github.

  1. Go to the Github releases page and expand the Assets section for a specific release.

  2. Copy the URL of the nextflow-VERSION-all asset and enter the download command in your terminal, e.g. wget -qO- ASSET-URL. It will create the completely self-contained nextflow-VERSION-all executable file in the current directory.

Updates

Having Nextflow installed in your computer you can update to the latest version using the following command:

nextflow self-update

Tip

You can temporarily switch to a specific version of Nextflow by prefixing the nextflow command with the NXF_VER environment variable. For example:

NXF_VER=20.04.0 nextflow run hello

Stable and Edge releases

A stable version of Nextflow is released on a six-months basic schedule, in the 1st and 3rd quarter of every year.

Along with the stable release, an edge version is released on a monthly basis. This version is useful to test and use most recent updates and experimental features.

To use the latest edge release run the following snippet in your shell terminal:

export NXF_EDGE=1
nextflow self-update

Your first script

Copy the following example into your favorite text editor and save it to a file named tutorial.nf:

params.str = 'Hello world!'

process splitLetters {
    output:
    path 'chunk_*'

    """
    printf '${params.str}' | split -b 6 - chunk_
    """
}

process convertToUpper {
    input:
    path x

    output:
    stdout

    """
    cat $x | tr '[a-z]' '[A-Z]'
    """
}

workflow {
    splitLetters | flatten | convertToUpper | view { it.trim() }
}

Note

For versions of Nextflow prior to 22.10.0, you must explicitly enable DSL2 by adding nextflow.enable.dsl=2 to the top of the script or by using the -dsl2 command-line option.

This script defines two processes. The first splits a string into 6-character chunks, writing each one to a file with the prefix chunk_, and the second receives these files and transforms their contents to uppercase letters. The resulting strings are emitted on the result channel and the final output is printed by the view operator.

Execute the script by entering the following command in your terminal:

$ nextflow run tutorial.nf

N E X T F L O W  ~  version 22.10.0
executor >  local (3)
[69/c8ea4a] process > splitLetters   [100%] 1 of 1 ✔
[84/c8b7f1] process > convertToUpper [100%] 2 of 2 ✔
HELLO
WORLD!

You can see that the first process is executed once, and the second twice. Finally the result string is printed.

It’s worth noting that the process convertToUpper is executed in parallel, so there’s no guarantee that the instance processing the first split (the chunk Hello) will be executed before the one processing the second split (the chunk world!).

Thus, it is perfectly possible that you will get the final result printed out in a different order:

WORLD!
HELLO

Tip

The hexadecimal string, e.g. 22/7548fa, is the unique hash of a task, and the prefix of the directory where the task is executed. You can inspect a task’s files by changing to the directory $PWD/work and using this string to find the specific task directory.

Modify and resume

Nextflow keeps track of all the processes executed in your pipeline. If you modify some parts of your script, only the processes that are actually changed will be re-executed. The execution of the processes that are not changed will be skipped and the cached result used instead.

This helps a lot when testing or modifying part of your pipeline without having to re-execute it from scratch.

For the sake of this tutorial, modify the convertToUpper process in the previous example, replacing the process script with the string rev $x, so that the process looks like this:

process convertToUpper {
  input:
    path x
  output:
    stdout

  """
  rev $x
  """
}

Then save the file with the same name, and execute it by adding the -resume option to the command line:

nextflow run tutorial.nf -resume

It will print output similar to this:

N E X T F L O W  ~  version 22.10.0
executor >  local (2)
[69/c8ea4a] process > splitLetters   [100%] 1 of 1, cached: 1 ✔
[d0/e94f07] process > convertToUpper [100%] 2 of 2 ✔
olleH
!dlrow

You will see that the execution of the process splitLetters is actually skipped (the process ID is the same), and its results are retrieved from the cache. The second process is executed as expected, printing the reversed strings.

Tip

The pipeline results are cached by default in the directory $PWD/work. Depending on your script, this folder can take up a lot of disk space. It’s a good idea to clean this folder periodically, as long as you know you won’t need to resume any pipeline runs.

For more information, see the Caching and resuming page.

Pipeline parameters

Pipeline parameters are simply declared by prepending to a variable name the prefix params, separated by dot character. Their value can be specified on the command line by prefixing the parameter name with a double dash character, i.e. --paramName

For the sake of this tutorial, you can try to execute the previous example specifying a different input string parameter, as shown below:

nextflow run tutorial.nf --str 'Bonjour le monde'

The string specified on the command line will override the default value of the parameter. The output will look like this:

N E X T F L O W  ~  version 22.10.0
executor >  local (4)
[8b/16e7d7] process > splitLetters   [100%] 1 of 1 ✔
[eb/729772] process > convertToUpper [100%] 3 of 3 ✔
m el r
edno
uojnoB

Changed in version 20.11.0-edge: Any . (dot) character in a parameter name is interpreted as the delimiter of a nested scope. For example, --foo.bar Hello will be interpreted as params.foo.bar. If you want to have a parameter name that contains a . (dot) character, escape it using the back-slash character, e.g. --foo\.bar Hello.