Dataflow
Nextflow uses a dataflow programming model to define workflows declaratively. In this model, processes in a pipeline are connected to each other through dataflow channels and dataflow values.
Channels
A dataflow channel (or simply channel) is an asynchronous sequence of values.
The values in a channel cannot be accessed directly, but only through an operator or process. For example:
channel.of(1, 2, 3).view { v -> "channel emits ${v}" }
channel emits 1
channel emits 2
channel emits 3
Factories
A channel can be created by factories in the channel namespace. For example, the channel.fromPath() factory creates a channel from a file name or glob pattern, similar to the files() function:
channel.fromPath('input/*.txt').view()
See Channel factories for the full list of channel factories.
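As a minimal sketch of how factories are used, the snippet below creates three channels inside a workflow block and prints their values. It assumes the channel.of(), channel.fromList(), and channel.fromPath() factories; the sample values and the "of:", "fromList:", and "fromPath:" labels are illustrative only, and the input/*.txt pattern is reused from the example above.

```nextflow
workflow {
    // channel.of: emit each argument as a separate value
    channel.of('a', 'b', 'c')
        .view { v -> "of: ${v}" }

    // channel.fromList: emit each element of an existing list
    channel.fromList([1, 2, 3])
        .view { v -> "fromList: ${v}" }

    // channel.fromPath: emit one file per match of the glob pattern
    // (emits nothing if no file matches)
    channel.fromPath('input/*.txt')
        .view { f -> "fromPath: ${f.name}" }
}
```

Because channels are asynchronous, the lines printed by the three channels may interleave in any order, although each channel emits its own values in sequence.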
Operators
Channel operators, or operators for short, are functions that consume and produce channels. Because channels are asynchronous, operators are necessary to manipulate the values in a channel. Operators are particularly useful for implementing glue logic between processes.
Commonly used operators include:
combine: emit the combinations of two channels
collect: collect the values from a channel into a list
filter: select the values in a channel that satisfy a condition
flatMap: transform each value from a channel into a list and emit each list element separately
groupTuple: group the values from a channel based on a grouping key
join: join the values from two channels based on a matching key
map: transform each value from a channel with a mapping function
mix: emit the values from multiple channels
view: print each value in a channel to standard output
See Operators for the full list of operators.
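The following sketch shows how a few of the operators listed above might be chained to build glue logic. The sample identifiers (s1, s2), file names, and channel names (reads, labels, sizes) are hypothetical and serve only as illustration.

```nextflow
workflow {
    // map, filter, collect: transform each value, keep those above 10,
    // then gather the remaining values into a single list
    channel.of(1, 2, 3, 4)
        .map { v -> v * 10 }
        .filter { v -> v > 10 }
        .collect()
        .view { list -> "collected: ${list}" }   // collected: [20, 30, 40]

    // groupTuple: group tuples by their first element (the grouping key)
    reads = channel.of(['s1', 'a.fq'], ['s2', 'b.fq'], ['s1', 'c.fq'])
    reads
        .groupTuple()
        .view { v -> "grouped: ${v}" }

    // join: pair up tuples from two channels that share the same key
    labels = channel.of(['s1', 'tumor'], ['s2', 'normal'])
    sizes  = channel.of(['s2', 20], ['s1', 10])
    labels
        .join(sizes)
        .view { v -> "joined: ${v}" }
}
```

Each grouped value has the form [key, [values...]], and each joined value has the form [key, left value, right value]. The order in which groups and joined tuples are emitted should not be relied upon.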
Values
A dataflow value is an asynchronous value.
Dataflow values can be created explicitly with the channel.value() factory. Processes also produce dataflow values under certain conditions.
A dataflow value cannot be accessed directly, but only through an operator or process. For example:
channel.value(1).view { v -> "dataflow value is ${v}" }
dataflow value is 1
See Value<V> for the set of available methods for dataflow values.
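As a small illustration, the sketch below applies operators to dataflow values. It assumes that map applied to a dataflow value yields another dataflow value, and that collect reduces a channel to a single dataflow value holding a list; the "doubled:" and "collected:" labels are illustrative only.

```nextflow
workflow {
    // Transform a dataflow value; the result is still a single value
    channel.value(3)
        .map { v -> v * 2 }
        .view { v -> "doubled: ${v}" }          // doubled: 6

    // Gather a channel into one list, which is emitted as a single value
    channel.of(1, 2, 3)
        .collect()
        .view { list -> "collected: ${list}" }  // collected: [1, 2, 3]
}
```

The two lines may be printed in either order, since each chain runs asynchronously.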