Storage

Recipes need a place to store data. The location where the final dataset produced by the recipe is stored is called the Target. Pangeo Forge has a special class for this: pangeo_forge_recipes.storage.FSSpecTarget.

Creating a Target requires two arguments:

  • The fs argument is an fsspec filesystem. Fsspec supports many different types of storage via its built-in and third-party implementations.

  • The root_path argument specifies the path under which the data should be stored.

For example, creating a storage target for AWS S3 might look like this:

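import s3fs
from pangeo_forge_recipes.storage import FSSpecTarget

# Replace the placeholder credentials with your own.
fs = s3fs.S3FileSystem(key="MY_AWS_KEY", secret="MY_AWS_SECRET")
target_path = "pangeo-forge-bucket/my-dataset-v1.zarr"
target = FSSpecTarget(fs=fs, root_path=target_path)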

Temporary data can be cached via a pangeo_forge_recipes.storage.CacheFSSpecTarget object. Some recipes require separate caching of metadata, which is provided by a third storage object, a pangeo_forge_recipes.storage.FSSpecTarget.

class pangeo_forge_recipes.storage.FSSpecTarget(fs, root_path='')

Representation of a storage target for Pangeo Forge.

Parameters
  • fs (AbstractFileSystem) – The filesystem object we are writing to.

  • root_path (str) – The path under which the target data will be stored.

exists(path)

Check that the file is in the cache.

Return type

bool

get_mapper()

Get a mutable mapping object suitable for storing Zarr data.

Return type

FSMap

open(path, **kwargs)

Open file with a context manager.

Return type

Iterator[Any]

rm(path)

Remove file from the cache.

Return type

None

size(path)

Get the file size.

Return type

int

class pangeo_forge_recipes.storage.FlatFSSpecTarget(fs, root_path='')

Bases: pangeo_forge_recipes.storage.FSSpecTarget

A target that sanitizes all the path names so that everything is stored in a single directory.

Designed to be used as a cache for inputs.
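To make the idea of path sanitization concrete, here is a standard-library sketch of one way to map arbitrary paths or URLs to flat file names. The function flatten_path and its naming scheme are hypothetical and are not taken from the library; the actual implementation may differ.

```python
import hashlib
import os
import re
import unicodedata


def flatten_path(root_path: str, path: str) -> str:
    """Map an arbitrary path or URL to a single file name under root_path."""
    # A hash prefix keeps names distinct even if two inputs sanitize to the same slug.
    prefix = hashlib.md5(path.encode()).hexdigest()
    # Reduce to ASCII, then replace each run of unsafe characters with "-".
    ascii_path = unicodedata.normalize("NFKD", path).encode("ascii", "ignore").decode()
    slug = re.sub(r"[^A-Za-z0-9._-]+", "-", ascii_path).strip("-")
    return os.path.join(root_path, f"{prefix}-{slug}")


flat = flatten_path("cache", "https://example.com/data/2020/file.nc")
other = flatten_path("cache", "https://example.com/data/2021/file.nc")
```

Because directory separators are removed from the name, every cached input ends up directly inside the cache root, regardless of how deeply nested its source URL is.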

class pangeo_forge_recipes.storage.CacheFSSpecTarget(fs, root_path='')

Bases: pangeo_forge_recipes.storage.FlatFSSpecTarget

Alias for FlatFSSpecTarget.