Deployment
Advantages of the CLI
The Command Line Interface (CLI) is the recommended way to deploy Pangeo Forge recipes, both for production and local testing. Advantages of using the CLI include:
Centralized configuration
Sensible defaults
Deploy from version control refs
The CLI is itself a thin wrapper around Apache Beam’s pipeline deployment logic (in pseudocode):
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(runner="DirectRunner", ...)
with beam.Pipeline(options=options) as p:
    p | recipe
Users are welcome to use this native Beam deployment approach for their recipes as well.
Beam Runners
Apache Beam (and therefore, Pangeo Forge) supports flexible deployment via “runners”, which include:
DirectRunner: Useful for testing during recipe development and, in multithreaded mode, for certain production workloads. (Note that Apache Beam does not recommend this runner for production.)
FlinkRunner: Executes pipelines using Apache Flink.
DataflowRunner: Uses the Google Cloud Dataflow managed service.
DaskRunner: Executes pipelines via Dask.distributed.
When deploying with the CLI, the runner is specified via a configuration file.
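For native Beam deployments, runner selection ultimately comes down to pipeline options. The sketch below shows one possible way to turn a flat settings mapping into Beam-style command-line flags. The `beam_flags` helper and the `direct_multithreaded` mapping are hypothetical and do not represent the CLI's configuration file format; `--runner`, `--direct_num_workers`, and `--direct_running_mode` are standard Beam options.

```python
def beam_flags(settings):
    """Convert a mapping of option names to values into a list of flags.

    Hypothetical helper: True becomes a bare flag, False/None are skipped,
    and any other value becomes --name=value.
    """
    flags = []
    for name, value in settings.items():
        if value is True:
            flags.append(f"--{name}")
        elif value is False or value is None:
            continue
        else:
            flags.append(f"--{name}={value}")
    return flags


# Example settings for the DirectRunner in multithreaded mode.
direct_multithreaded = {
    "runner": "DirectRunner",
    "direct_num_workers": 4,
    "direct_running_mode": "multi_threading",
}

print(beam_flags(direct_multithreaded))
# ['--runner=DirectRunner', '--direct_num_workers=4', '--direct_running_mode=multi_threading']
```

The resulting list can be passed as the first argument to `PipelineOptions`, which accepts an iterable of command-line style flags.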