Release Notes
v0.10.4 - 2023-11-15
Add dynamic_chunking_fn/dynamic_chunking_fn_kwargs keywords to StoreToZarr. This allows users to pass a function that will be called at runtime to determine the target chunks for the resulting datasets based on the in-memory representation/size of the recipe dataset. GH PR 595
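As a rough illustration of what such a chunking function could look like, here is a minimal sketch. The notes above do not spell out the signature StoreToZarr expects, so the sketch assumes the function receives a dataset-like object and returns a mapping of dimension name to chunk length, with extra arguments such as target_chunk_bytes supplied through dynamic_chunking_fn_kwargs. The names chunk_along_time and FakeDataset are hypothetical and are not part of the library.

```python
import math
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass
class FakeDataset:
    """Hypothetical stand-in exposing only the attributes the sketch reads.

    xarray.Dataset also provides `sizes` and `nbytes`, so the same function
    should be usable on a real dataset.
    """

    sizes: Mapping[str, int]
    nbytes: int


def chunk_along_time(
    ds, target_chunk_bytes: int = 100_000_000, dim: str = "time"
) -> Dict[str, int]:
    """Pick a chunk length along `dim` so that one chunk is at most
    roughly `target_chunk_bytes` in memory (assumed signature)."""
    dim_len = ds.sizes[dim]
    bytes_per_step = ds.nbytes / dim_len
    steps = max(1, math.floor(target_chunk_bytes / bytes_per_step))
    # Never ask for a chunk longer than the dimension itself.
    return {dim: min(steps, dim_len)}


# 1000 time steps of a 180 x 360 float32 field.
ds = FakeDataset(
    sizes={"time": 1000, "lat": 180, "lon": 360},
    nbytes=1000 * 180 * 360 * 4,
)
chunks = chunk_along_time(ds, target_chunk_bytes=50_000_000)
small_chunks = chunk_along_time(ds, target_chunk_bytes=10**12)
```

Deriving the chunk length from the dataset's in-memory size, rather than hard-coding it, lets the same recipe produce sensibly sized chunks for inputs of different resolution.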
v0.10.3 - 2023-10-03
v0.10.2 - 2023-09-19
Fix bug preventing use of multiple merge dims GH PR 591
Add parquet storage target for reference recipes GH PR 620
Support addition of dataset-level attrs for zarr recipes GH PR 622
Integration testing upgrades GH PR 590 GH PR 605 GH PR 607 GH PR 611
Add missing py.typed marker for mypy compatibility GH PR 613
v0.10.1 - 2023-08-31
Add sentinel as default for transform keyword arguments that are required at runtime and which recipe developers may not want to set in recipe modules. This allows recipe modules to be importable (i.e., unit-testable) and type-checkable during development. GH PR 588
StoreToZarr now emits a zarr.storage.FSStore which can be consumed by downstream transforms. This is useful for opening and testing the completed zarr store, adding it to a catalog, etc. GH PR 574
Concurrency limiting transform added. This base transform can be used to limit concurrency for calls to external services. It is now used internally to allow OpenURLWithFSSpec to be limited to a specified maximum concurrency. GH PR 557
Various packaging, testing, and maintenance upgrades GH PR 565 GH PR 567 GH PR 576
Patched deserialization bug that affected rechunking on GCP Dataflow GH PR 548
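The sentinel-default idea from this release can be sketched in plain Python. This is an illustration of the general pattern only, not the library's implementation: the names _Unset, UNSET, StoreStep, and target_root are hypothetical. The point is that a module defining the step can be imported, type-checked, and unit-tested before the runtime-only value is known.

```python
from dataclasses import dataclass
from typing import Union


class _Unset:
    """Hypothetical sentinel type marking an argument to be supplied at runtime."""

    def __repr__(self) -> str:
        return "<UNSET>"


UNSET = _Unset()


@dataclass
class StoreStep:
    """Hypothetical transform-like object; not the library's StoreToZarr."""

    store_name: str
    # Required at runtime, but defaulted so the module stays importable.
    target_root: Union[str, _Unset] = UNSET

    def run(self) -> str:
        if isinstance(self.target_root, _Unset):
            raise ValueError("target_root must be injected before execution")
        return f"{self.target_root}/{self.store_name}"


# A recipe module can construct the step without any deployment configuration.
step = StoreStep(store_name="example.zarr")

# Later, deployment tooling (assumed here) fills in the missing value.
step.target_root = "/tmp/output"
path = step.run()
```

A dedicated sentinel is used instead of None so that a missing value can be distinguished from a deliberately empty one, and so the failure happens at execution time with a clear message rather than at import time.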
v0.10.0 - 2023-06-30
Major breaking change: This release represents a nearly complete rewrite of the package, removing the custom recipe constructor classes and executors, and replacing them with a set of modular, domain-specific Apache Beam PTransforms, which can be flexibly composed and executed on any Apache Beam runner. The documentation has been updated to reflect this change. As the first release following this major rewrite, we expect bugs and documentation gaps to exist, and we look forward to community support in finding and triaging those issues. A blog post and further documentation of the motivations for and opportunities created by this major change are forthcoming.
v0.9 - 2022-05-11
Breaking changes: Deprecated XarrayZarrRecipe manual stage methods. Also deprecated FilePattern(..., is_opendap=True) kwarg, which is superseded by FilePattern(..., file_type="opendap"). GH PR 362
Added serialization module along with BaseRecipe.sha256 and FilePattern.sha256 methods. Collectively, this provides for generation of deterministic hashes for both recipe and file pattern instances. Checking these hashes against those from a prior version of the recipe can be used to determine whether or not a particular recipe instance in a Python module (which may contain arbitrary numbers of recipe instances) has changed since the last time the instances in that module were executed. The file pattern hashes are based on a merkle tree built cumulatively from all of the index:filepath pairs yielded by the pattern’s self.items() method. As such, in cases where a new pattern is intended to append to an existing dataset which was built from a prior version of that pattern, the pattern hash can be used to determine the index from which to begin appending. This is demonstrated in the tests. GH PR 349
Created new Prefect executor which wraps the Dask executor in a single Task. This should mitigate problems related to large numbers of Prefect Tasks (GH issue 347).
Implemented feature to cap cached filename lengths at 255 bytes on local filesystems, to accommodate the POSIX filename length limit. Cached filename lengths are not truncated on any other filesystem. GH PR 353
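The cumulative file pattern hashing described in the v0.9 notes above can be illustrated with a simplified sketch. It uses a linear hash chain over index:filepath pairs rather than the library's actual merkle tree, and the function names cumulative_hashes and append_start are hypothetical. The idea shown is only that, if each item's digest folds in everything before it, the final hash of an old pattern identifies the position in a longer pattern where appending should start.

```python
import hashlib
from typing import List, Optional, Sequence, Tuple

Item = Tuple[int, str]  # (index, filepath), as an assumed simplification


def cumulative_hashes(items: Sequence[Item]) -> List[str]:
    """Return one digest per item; each digest covers all items up to it."""
    digests: List[str] = []
    running = b""
    for index, path in items:
        running = hashlib.sha256(running + f"{index}:{path}".encode()).digest()
        digests.append(running.hex())
    return digests


def append_start(old_final_hash: str, new_items: Sequence[Item]) -> Optional[int]:
    """Return the position in `new_items` where appending should begin,
    or None if the new pattern does not extend the old one."""
    for position, digest in enumerate(cumulative_hashes(new_items)):
        if digest == old_final_hash:
            return position + 1
    return None


old_items = [(0, "a.nc"), (1, "b.nc")]
new_items = old_items + [(2, "c.nc"), (3, "d.nc")]
old_hash = cumulative_hashes(old_items)[-1]

start = append_start(old_hash, new_items)

# A pattern whose earlier items changed no longer matches the old hash.
changed_items = [(0, "a.nc"), (1, "x.nc"), (2, "c.nc")]
no_match = append_start(old_hash, changed_items)
```

Because every digest depends on all preceding pairs, a change anywhere in the earlier part of the pattern prevents a match, which signals that a plain append is not safe.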
v0.8.3 - 2022-04-19
Added .file_type attribute to pangeo_forge_recipes.patterns.FilePattern. This attribute will eventually supersede .is_opendap, which will be deprecated in 0.9.0. Until then, FilePattern(..., is_opendap=True) is supported as equivalent to FilePattern(..., file_type="opendap"). GH PR 322
v0.8.2 - 2022-02-23
Removed click from dependencies and removed cli entrypoint.
v0.8.1 - 2022-02-23
Fixed dependency issue with pip installation.
Fixed bug where recipes would fail if the target chunks exceeded the full array length. GH issue 279
v0.8.0 - 2022-02-17
v0.7.0 - 2022-02-14 ❤️
Apache Beam executor added. GH issue 169. By Alex Merose.
Index type update. GH PR 257
Fix incompatibility with fsspec>=2021.11.1. GH PR 247
v0.6.1 - 2021-10-25
Major internal refactor of executors. GH PR 219. Began deprecation cycle for recipe methods (e.g. recipe.prepare_target()) in favor of module functions.
Addition of open_input_with_fsspec_reference option on pangeo_forge_recipes.recipes.XarrayZarrRecipe, permitting the bypassing of h5py when opening inputs. GH PR 218
v0.6.0 - 2021-09-02
Added pangeo_forge_recipes.recipes.HDFReferenceRecipe class to create virtual Zarrs from collections of NetCDF / HDF5 files. GH PR 174
Limit output from logging. GH PR 175
Change documentation structure. GH PR 178
Move fsspec_open_kwargs and is_opendap parameters out of pangeo_forge_recipes.recipes.XarrayZarrRecipe and into pangeo_forge_recipes.patterns.FilePattern. Add query_string_secrets as attribute of pangeo_forge_recipes.patterns.FilePattern. GH PR 167
v0.5.0 - 2021-07-11
Added subset_inputs option to pangeo_forge_recipes.recipes.XarrayZarrRecipe. GH issue 93, GH PR 166
Fixed file opening to eliminate HDF errors related to closed files. GH issue 170, GH PR 171
Changed default behavior of executors so that the cache_input loop is always run, regardless of the value of cache_inputs. GH PR 168
v0.4.0 - 2021-06-25
Fixed issue with recipe serialization causing high memory usage of Dask schedulers and workers when executing recipes with Prefect or Dask. GH PR 160
Added new methods .to_dask(), .to_prefect(), and .to_function() for converting a recipe to one of the Dask, Prefect, or Python execution plans. The previous method, recipe.to_pipelines(), is now deprecated.
v0.3.4 - 2021-05-25
Added copy_pruned method to pangeo_forge_recipes.recipes.XarrayZarrRecipe to facilitate testing.
Internal refactor of storage module.
v0.3.3 - 2021-05-10
Many feature enhancements.
Non-backwards compatible changes to core API.
Package renamed from pangeo_forge to pangeo_forge_recipes.
There were problems with packaging for the 0.3.0-0.3.2 releases.
v0.2.0 - 2021-04-26
First release since major Spring 2021 overhaul. This release depends on Xarray v0.17.1, which has not yet been released as of the date of this release.
v0.1.0 - 2020-10-22
First release.