A recipe describes the steps to transform archival source data in one format / location into analysis-ready, cloud-optimized (ARCO) data in another format / location. Technically, a recipe is a set of composite Apache Beam transforms applied to the data collection associated with a file pattern. To write a recipe:
Most recipes will be composed following the generic sequence:
In Apache Beam, transforms are connected with the | pipe operator.
Or, in pseudocode:
    recipe = (
        beam.Create(pattern.items())
        | Opener
        | Preprocessor  # optional
        | Writer
    )
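To make the chaining concrete, the following is a minimal pure-Python sketch of how stages joined with the pipe operator compose into a single recipe-like object. It is a toy stand-in, not Apache Beam and not Pangeo Forge code: the Stage class, the create and each helpers, the file names, and the in-memory store are all hypothetical and exist only to illustrate the create, open, preprocess, write ordering.

```python
# Toy stand-in for Beam-style `|` composition.
# NOT Apache Beam or Pangeo Forge code; every name below is hypothetical.
from typing import Callable, Iterable, List


class Stage:
    """A step that maps an iterable of items to a new iterable."""

    def __init__(self, fn: Callable[[Iterable], Iterable]):
        self.fn = fn

    def __or__(self, other: "Stage") -> "Stage":
        # `a | b` builds a composite stage that runs a, then b.
        return Stage(lambda items: other.fn(self.fn(items)))

    def run(self, items: Iterable = ()) -> List:
        return list(self.fn(items))


def create(items: Iterable) -> Stage:
    # Source stage: ignores upstream input and emits a fixed collection.
    data = list(items)
    return Stage(lambda _: iter(data))


def each(fn: Callable) -> Stage:
    # Element-wise stage: applies fn to every item.
    return Stage(lambda items: (fn(item) for item in items))


# Hypothetical (index, filename) pairs standing in for pattern.items().
pattern_items = [(0, "file_0.nc"), (1, "file_1.nc")]

store = {}  # in-memory stand-in for a target store


def write(kv):
    store[kv[0]] = kv[1]
    return kv[0]


# Pretend to open each file, yielding (index, dataset-like dict).
opener = each(lambda kv: (kv[0], {"source": kv[1], "values": [1, 2, 3]}))
# Pretend preprocessing: rescale the values.
preprocessor = each(
    lambda kv: (kv[0], {**kv[1], "values": [v * 10 for v in kv[1]["values"]]})
)
writer = each(write)

recipe = (
    create(pattern_items)
    | opener
    | preprocessor  # optional
    | writer
)

written = recipe.run()
```

Nothing executes until run is called, which loosely mirrors how a composed transform only does work once a pipeline runs it. Dropping the preprocessor line leaves a valid three-stage recipe, which is why that step is marked optional.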
Pangeo Forge does not provide any importable, pre-defined sequences of transforms. This is by design: it leaves the composition process flexible enough to accommodate the heterogeneity of real-world data. In practice, however, certain Common styles may work as the basis for many datasets.