Once you have a file pattern for your source data, it’s time to define a set of transforms to apply to the data, which may include:
Standard transforms from Apache Beam’s Python transform catalog
Third-party extensions from the Pangeo Forge Ecosystem
Your own transforms, such as custom Preprocessors
⚙️ Deploy-time configurable keyword arguments
Keyword arguments designated by the gear emoji ⚙️ below are deploy-time configurable. They should therefore not be provided in your recipe file. Instead, values for these arguments are specified in a per-deployment Configuration file. The values provided in the configuration file will be injected into your recipe by the Command Line Interface.
Once you’ve created a file pattern for your source data, you’ll need to open the files it describes. Pangeo Forge currently provides the following openers:
Before writing out your analysis-ready, cloud-optimized (ARCO) dataset, you may want to preprocess the data. A custom Apache Beam PTransform can be written for this purpose and included in your recipe.
# TODO: Add preprocessor example.
Once your recipe is defined, you’re ready to move on to Deployment.