NetCDF Zarr Multi-Variable Sequential Recipe: NOAA World Ocean Atlas#
This recipe is a bit more complicated than the Xarray-to-Zarr Sequential Recipe: NOAA OISST. You should probably review that one first; here we will skip the basics.
For this example, we will use data from NOAA’s World Ocean Atlas. As we can see from the data access page, the dataset is spread over many different files. What’s important here is that:
There is a time sequence (month) to the files.
Different variables live in different files.
Because our dataset is spread over multiple files, we will have to use a more complex File Pattern than in the previous example.
Step 1: Get to know your source data#
This step can’t be skipped! It’s impossible to write a recipe without an intimate understanding of how the source data are organized. World Ocean Atlas has eight different variables: Temperature, Salinity, Dissolved Oxygen, Percent Oxygen Saturation, Apparent Oxygen Utilization, Silicate, Phosphate, Nitrate. Each variable has its own page on the data access site.
For the purpose of this tutorial, we will use the 5-degree resolution monthly data. We can follow the links to finally find an HTTP download link for a single month of data.
download_url = 'https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t01_5d.nc'
Let’s download it and try to open it with xarray.
! wget https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t01_5d.nc
--2021-11-13 22:38:40-- https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t01_5d.nc
Resolving www.ncei.noaa.gov... 205.167.25.177, 205.167.25.178, 205.167.25.167, ...
Connecting to www.ncei.noaa.gov|205.167.25.177|:443... connected.
HTTP request sent, awaiting response... 200
Length: 2389903 (2.3M) [application/x-netcdf]
Saving to: ‘woa18_decav_t01_5d.nc.4’
woa18_decav_t01_5d. 100%[===================>] 2.28M 8.91MB/s in 0.3s
2021-11-13 22:38:41 (8.91 MB/s) - ‘woa18_decav_t01_5d.nc.4’ saved [2389903/2389903]
import xarray as xr
try:
    ds = xr.open_dataset("woa18_decav_t01_5d.nc")
except ValueError as e:
    print(e)
unable to decode time units 'months since 1955-01-01 00:00:00' with 'the default calendar'. Try opening your dataset with decode_times=False or installing cftime if it is not installed.
❗️ Oh no, we got an error!
This is a very common problem. The calendar is encoded using “months since” units, which are ambiguous in the CF Conventions. (The precise length of a month varies by month and year.)
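To see why a "month" is not a fixed-length unit, one can tabulate month lengths in the standard (Gregorian) calendar with Python's built-in calendar module. This is only an illustrative sketch; the helper name and the two years are arbitrary choices, not part of the recipe.

```python
import calendar


def month_lengths(year):
    """Return the number of days in each month of `year` (Gregorian calendar)."""
    return [calendar.monthrange(year, month)[1] for month in range(1, 13)]


lengths_1955 = month_lengths(1955)  # a common year
lengths_1956 = month_lengths(1956)  # a leap year

# Distinct month lengths that occur across the two years
distinct_lengths = sorted(set(lengths_1955 + lengths_1956))
print(distinct_lengths)  # [28, 29, 30, 31]
```

Because "one month" can mean four different durations, an offset such as 372.5 months cannot be converted to a date without an extra assumption about the calendar.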
We will follow the advice in the error message and open the dataset with decode_times=False:
ds = xr.open_dataset("woa18_decav_t01_5d.nc", decode_times=False)
ds
<xarray.Dataset>
Dimensions:             (lat: 36, nbounds: 2, lon: 72, depth: 57, time: 1)
Coordinates:
  * lat                 (lat) float32 -87.5 -82.5 -77.5 -72.5 ... 77.5 82.5 87.5
  * lon                 (lon) float32 -177.5 -172.5 -167.5 ... 167.5 172.5 177.5
  * depth               (depth) float32 0.0 5.0 10.0 ... 1.45e+03 1.5e+03
  * time                (time) float32 372.5
Dimensions without coordinates: nbounds
Data variables:
    crs                 int32 -2147483647
    lat_bnds            (lat, nbounds) float32 -90.0 -85.0 -85.0 ... 85.0 90.0
    lon_bnds            (lon, nbounds) float32 -180.0 -175.0 ... 175.0 180.0
    depth_bnds          (depth, nbounds) float32 0.0 2.5 ... 1.475e+03 1.5e+03
    climatology_bounds  (time, nbounds) float32 0.0 404.0
    t_mn                (time, depth, lat, lon) float32 ...
    t_dd                (time, depth, lat, lon) float64 ...
    t_sd                (time, depth, lat, lon) float32 ...
    t_se                (time, depth, lat, lon) float32 ...
Attributes: (12/49)
    Conventions:            CF-1.6, ACDD-1.3
    title:                  World Ocean Atlas 2018 : sea_water_tempe...
    summary:                PRERELEASE Climatological mean temperatu...
    references:             Locarnini, R. A., A. V. Mishonov, O. K. ...
    institution:            National Centers for Environmental Infor...
    comment:                global climatology as part of the World ...
    ...                     ...
    publisher_email:        NCEI.info@noaa.gov
    nodc_template_version:  NODC_NetCDF_Grid_Template_v2.0
    license:                These data are openly available to the p...
    metadata_link:          http://www.nodc.noaa.gov/OC5/WOA18/pr_wo...
    date_created:           2018-02-19
    date_modified:          2018-02-19
ds.time
<xarray.DataArray 'time' (time: 1)>
array([372.5], dtype=float32)
Coordinates:
  * time     (time) float32 372.5
Attributes:
    standard_name:  time
    long_name:      time
    units:          months since 1955-01-01 00:00:00
    axis:           T
    climatology:    climatology_bounds
We have opened the data, but the time coordinate is just a number, not an actual datetime object.
We can work around this issue by explicitly specifying the 360_day calendar (in which every month is assumed to have 30 days).
ds.time.attrs['calendar'] = '360_day'
ds = xr.decode_cf(ds)
ds
<xarray.Dataset>
Dimensions:             (lat: 36, nbounds: 2, lon: 72, depth: 57, time: 1)
Coordinates:
  * lat                 (lat) float32 -87.5 -82.5 -77.5 -72.5 ... 77.5 82.5 87.5
  * lon                 (lon) float32 -177.5 -172.5 -167.5 ... 167.5 172.5 177.5
  * depth               (depth) float32 0.0 5.0 10.0 ... 1.45e+03 1.5e+03
  * time                (time) object 1986-01-16 00:00:00
Dimensions without coordinates: nbounds
Data variables:
    crs                 int32 ...
    lat_bnds            (lat, nbounds) float32 ...
    lon_bnds            (lon, nbounds) float32 ...
    depth_bnds          (depth, nbounds) float32 ...
    climatology_bounds  (time, nbounds) float32 ...
    t_mn                (time, depth, lat, lon) float32 ...
    t_dd                (time, depth, lat, lon) float64 ...
    t_sd                (time, depth, lat, lon) float32 ...
    t_se                (time, depth, lat, lon) float32 ...
Attributes: (12/49)
    Conventions:            CF-1.6, ACDD-1.3
    title:                  World Ocean Atlas 2018 : sea_water_tempe...
    summary:                PRERELEASE Climatological mean temperatu...
    references:             Locarnini, R. A., A. V. Mishonov, O. K. ...
    institution:            National Centers for Environmental Infor...
    comment:                global climatology as part of the World ...
    ...                     ...
    publisher_email:        NCEI.info@noaa.gov
    nodc_template_version:  NODC_NetCDF_Grid_Template_v2.0
    license:                These data are openly available to the p...
    metadata_link:          http://www.nodc.noaa.gov/OC5/WOA18/pr_wo...
    date_created:           2018-02-19
    date_modified:          2018-02-19
ds.time
<xarray.DataArray 'time' (time: 1)>
array([cftime.Datetime360Day(1986, 1, 16, 0, 0, 0, 0, has_year_zero=False)],
      dtype=object)
Coordinates:
  * time     (time) object 1986-01-16 00:00:00
Attributes:
    standard_name:  time
    long_name:      time
    axis:           T
    climatology:    climatology_bounds
We will need this trick again later, when we write the recipe.
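The decoded date of 1986-01-16 can be checked by hand. Under the 360_day calendar every month has 30 days and every year has 360, so 372.5 months after 1955-01-01 is 31 whole years plus half a month. The helper below is a hypothetical illustration of that arithmetic (its name and signature are assumptions for this sketch); it is not how xarray or cftime implement decoding.

```python
def decode_360_day(months_since, ref_year=1955):
    """Convert 'months since <ref_year>-01-01' to (year, month, day),
    assuming a 360_day calendar: 30 days per month, 360 days per year."""
    total_days = months_since * 30
    years, remainder = divmod(total_days, 360)
    months, days = divmod(remainder, 30)
    # Month and day numbers are 1-based
    return ref_year + int(years), int(months) + 1, int(days) + 1


decoded = decode_360_day(372.5)
print(decoded)  # (1986, 1, 16)
```

The result matches the cftime.Datetime360Day(1986, 1, 16, ...) value shown in the decoded time coordinate.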
Step 2: Define the File Pattern#
We can browse through the files on the website and see how they are organized.
https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t01_5d.nc
https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t02_5d.nc
...
https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/salinity/decav/5deg/woa18_decav_s01_5d.nc
https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/salinity/decav/5deg/woa18_decav_s02_5d.nc
...
From this we can deduce the general pattern. We write a function to return the correct filename for a given variable / month combination.
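# Here it is important that the function argument name "time" match
# the name of the dataset dimension "time"
def format_function(variable, time):
    # The one-letter file code is the first letter of the variable name, which
    # holds for the two variables used here (temperature -> t, salinity -> s)
    return ("https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/"
            f"{variable}/decav/5deg/woa18_decav_{variable[0]}{time:02d}_5d.nc")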
format_function("temperature", 2)
'https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/temperature/decav/5deg/woa18_decav_t02_5d.nc'
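As a quick sanity check, the same function can be used to list every URL the recipe will need. The sketch below repeats the function so that it runs on its own and pairs it with itertools.product; the variable list assumes only temperature and salinity, the two variables used in this recipe.

```python
from itertools import product


def format_function(variable, time):
    return (
        "https://www.ncei.noaa.gov/thredds-ocean/fileServer/ncei/woa/"
        f"{variable}/decav/5deg/woa18_decav_{variable[0]}{time:02d}_5d.nc"
    )


variables = ["temperature", "salinity"]
months = range(1, 13)

# One URL per (variable, month) combination
urls = [format_function(v, m) for v, m in product(variables, months)]

print(len(urls))  # 24
print(urls[0])
print(urls[-1])
```

Two variables times twelve months gives 24 files, and the `:02d` format specifier supplies the zero padding seen in names such as `t01` and `s12`.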
Now we turn this into a FilePattern object. This pattern has two distinct combine_dims: variable name and month. We want to merge over variable names and concatenate over months.
from pangeo_forge_recipes import patterns
variable_merge_dim = patterns.MergeDim("variable", keys=["temperature", "salinity"])
# Here it is important that the ConcatDim name "time" match the name of the
# dataset dimension "time" (and the argument name in format_function)
month_concat_dim = patterns.ConcatDim("time", keys=list(range(1, 13)), nitems_per_file=1)
pattern = patterns.FilePattern(format_function, variable_merge_dim, month_concat_dim)
pattern
<FilePattern {'variable': 2, 'time': 12}>
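The difference between merging and concatenating can be sketched without any libraries. In the toy model below each "file" is a plain dict holding one variable for one month; the placeholder values and both helper functions are invented for illustration and do not reflect how pangeo-forge-recipes combines data internally.

```python
# Toy stand-in for the 2 x 12 grid of source files: each "file" holds one
# variable for one month. The values are made-up placeholders, not WOA data.
variables = ["temperature", "salinity"]
months = list(range(1, 13))
files = {(v, m): {v: [f"{v}-month-{m:02d}"]} for v in variables for m in months}


def concat_over_time(variable):
    """Concatenate one variable's monthly files end to end along time."""
    series = []
    for m in months:
        series.extend(files[(variable, m)][variable])
    return {variable: series}


def merge_over_variables(datasets):
    """Merge datasets that hold different variables into a single mapping."""
    merged = {}
    for ds in datasets:
        merged.update(ds)
    return merged


combined = merge_over_variables([concat_over_time(v) for v in variables])
print({name: len(values) for name, values in combined.items()})
# {'temperature': 12, 'salinity': 12}
```

Concatenation lengthens an existing dimension (twelve months per variable), while merging adds new variables side by side; the result has the same 2 by 12 shape reported by the FilePattern.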
Step 3: Write the Recipe#
Now that we have a FilePattern, we are ready to write our XarrayZarrRecipe.
Define an Input Preprocessor Function#
Above we noted that the time was encoded incorrectly in the original data. We might also have noticed that many variables that seem like coordinates (e.g. lat_bnds) were in the Data Variables part of the dataset. We will write a function that fixes both of these issues.
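def fix_encoding_and_attrs(ds, fname):
    # Declare a 360_day calendar so the "months since" time units can be decoded
    ds.time.attrs['calendar'] = '360_day'
    ds = xr.decode_cf(ds)
    # Promote the bounds and grid-mapping variables to coordinates
    ds = ds.set_coords(['crs', 'lat_bnds', 'lon_bnds', 'depth_bnds', 'climatology_bounds'])
    return ds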
Define the Recipe Object#
from pangeo_forge_recipes.recipes import XarrayZarrRecipe
recipe = XarrayZarrRecipe(
    pattern,
    xarray_open_kwargs={'decode_times': False},
    process_input=fix_encoding_and_attrs
)
recipe
XarrayZarrRecipe(file_pattern=<FilePattern {'variable': 2, 'time': 12}>, storage_config=StorageConfig(target=FSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x113568f70>, root_path='/var/folders/f8/rh42xb3d1tnbw2bxsjwgym1c0000gn/T/tmp6dljapud/PKyYuJFI'), cache=CacheFSSpecTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x113568f70>, root_path='/var/folders/f8/rh42xb3d1tnbw2bxsjwgym1c0000gn/T/tmp6dljapud/37IRtt94'), metadata=MetadataTarget(fs=<fsspec.implementations.local.LocalFileSystem object at 0x113568f70>, root_path='/var/folders/f8/rh42xb3d1tnbw2bxsjwgym1c0000gn/T/tmp6dljapud/1kGCgZma')), inputs_per_chunk=1, target_chunks={}, cache_inputs=True, copy_input_to_local_file=False, consolidate_zarr=True, consolidate_dimension_coordinates=True, xarray_open_kwargs={'decode_times': False}, xarray_concat_kwargs={}, delete_input_encoding=True, process_input=<function fix_encoding_and_attrs at 0x112f39820>, process_chunk=None, lock_timeout=None, subset_inputs={}, open_input_with_fsspec_reference=False)
Step 4: Run the Recipe#
In Xarray-to-Zarr Sequential Recipe: NOAA OISST we went through each step of recipe execution in detail. We will not do that here; instead, we will let Prefect do the work for us.
flow = recipe.to_prefect()
flow.run()
[2022-02-16 17:50:16-0500] INFO - prefect.FlowRunner | Beginning Flow run for 'pangeo-forge-recipe'
[2022-02-16 17:50:16-0500] INFO - prefect.TaskRunner | Task 'cache_input': Starting task run...
[2022-02-16 17:50:16-0500] INFO - prefect.TaskRunner | Task 'cache_input': Finished task run for task with final state: 'Mapped'
[2022-02-16 17:50:16-0500] INFO - prefect.TaskRunner | Task 'cache_input[0]': Starting task run...
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[0]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[1]': Starting task run...
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[1]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[2]': Starting task run...
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[2]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[3]': Starting task run...
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[3]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[4]': Starting task run...
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[4]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:17-0500] INFO - prefect.TaskRunner | Task 'cache_input[5]': Starting task run...
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[5]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[6]': Starting task run...
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[6]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[7]': Starting task run...
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[7]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[8]': Starting task run...
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[8]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[9]': Starting task run...
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[9]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:18-0500] INFO - prefect.TaskRunner | Task 'cache_input[10]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[10]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[11]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[11]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[12]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[12]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[13]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[13]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[14]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[14]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[15]': Starting task run...
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[15]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:19-0500] INFO - prefect.TaskRunner | Task 'cache_input[16]': Starting task run...
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[16]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[17]': Starting task run...
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[17]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[18]': Starting task run...
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[18]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[19]': Starting task run...
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[19]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:20-0500] INFO - prefect.TaskRunner | Task 'cache_input[20]': Starting task run...
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[20]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[21]': Starting task run...
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[21]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[22]': Starting task run...
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[22]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[23]': Starting task run...
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'cache_input[23]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'prepare_target': Starting task run...
/Users/rwegener/repos/copy/pangeo-forge-recipes/pangeo_forge_recipes/recipes/xarray_zarr.py:111: RuntimeWarning: Failed to open Zarr store with consolidated metadata, falling back to try reading non-consolidated metadata. This is typically much slower for opening a dataset. To silence this warning, consider:
1. Consolidating metadata in this existing store with zarr.consolidate_metadata().
2. Explicitly setting consolidated=False, to avoid trying to read consolidate metadata, or
3. Explicitly setting consolidated=True, to raise an error in this case instead of falling back to try reading non-consolidated metadata.
return xr.open_zarr(target.get_mapper())
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'prepare_target': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'store_chunk': Starting task run...
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'store_chunk': Finished task run for task with final state: 'Mapped'
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'store_chunk[0]': Starting task run...
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[0]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[1]': Starting task run...
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[1]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[2]': Starting task run...
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[2]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[3]': Starting task run...
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[3]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[4]': Starting task run...
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[4]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[5]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[5]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[6]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[6]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[7]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[7]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[8]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[8]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[9]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[9]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[10]': Starting task run...
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[10]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:23-0500] INFO - prefect.TaskRunner | Task 'store_chunk[11]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[11]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[12]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[12]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[13]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[13]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[14]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[14]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[15]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[15]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[16]': Starting task run...
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[16]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:24-0500] INFO - prefect.TaskRunner | Task 'store_chunk[17]': Starting task run...
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[17]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[18]': Starting task run...
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[18]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[19]': Starting task run...
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[19]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[20]': Starting task run...
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[20]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[21]': Starting task run...
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[21]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:25-0500] INFO - prefect.TaskRunner | Task 'store_chunk[22]': Starting task run...
[2022-02-16 17:50:26-0500] INFO - prefect.TaskRunner | Task 'store_chunk[22]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:26-0500] INFO - prefect.TaskRunner | Task 'store_chunk[23]': Starting task run...
[2022-02-16 17:50:26-0500] INFO - prefect.TaskRunner | Task 'store_chunk[23]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:26-0500] INFO - prefect.TaskRunner | Task 'finalize_target': Starting task run...
[2022-02-16 17:50:26-0500] INFO - prefect.TaskRunner | Task 'finalize_target': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:26-0500] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded
<Success: "All reference tasks succeeded.">
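With 24 mapped `cache_input` tasks and 24 mapped `store_chunk` tasks, the log above is long enough that a single failure would be easy to miss. The following is a minimal sketch for tallying final task states from log text in the format shown above. The helper `summarize_task_states` is hypothetical (it is not part of Prefect or pangeo-forge-recipes), and it assumes you have captured the log output as a string.

```python
import re
from collections import Counter

# Matches the "Finished task run" lines in the log format shown above, e.g.
#   ... | Task 'store_chunk[0]': Finished task run for task with final state: 'Success'
# The mapped index in square brackets is optional.
FINISHED = re.compile(
    r"Task '(?P<name>[A-Za-z_]\w*)(?:\[(?P<index>\d+)\])?': "
    r"Finished task run for task with final state: '(?P<state>\w+)'"
)


def summarize_task_states(log_text):
    """Count final states per task name, ignoring the mapped index.

    Hypothetical helper for illustration only.
    """
    summary = {}
    for match in FINISHED.finditer(log_text):
        summary.setdefault(match["name"], Counter())[match["state"]] += 1
    return summary


# Three lines copied from the log output above.
sample_log = """\
[2022-02-16 17:50:21-0500] INFO - prefect.TaskRunner | Task 'store_chunk': Finished task run for task with final state: 'Mapped'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[0]': Finished task run for task with final state: 'Success'
[2022-02-16 17:50:22-0500] INFO - prefect.TaskRunner | Task 'store_chunk[1]': Finished task run for task with final state: 'Success'
"""

summary = summarize_task_states(sample_log)
for task_name, states in summary.items():
    print(task_name, dict(states))
```

Run against the full log, any state other than `Success` or `Mapped` in the tally would point to the task worth inspecting.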
flow.visualize()
Step 5: Check the Target#
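All the data should be there! Open the target and confirm that it holds all 12 months and both the temperature (`t_*`) and salinity (`s_*`) variables.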
ds = xr.open_zarr(recipe.target_mapper)
ds
<xarray.Dataset>
Dimensions:             (time: 12, nbounds: 2, depth: 57, lat: 36, lon: 72)
Coordinates:
    climatology_bounds  (time, nbounds) float32 dask.array<chunksize=(1, 2), meta=np.ndarray>
    crs                 int32 ...
  * depth               (depth) float32 0.0 5.0 10.0 ... 1.45e+03 1.5e+03
    depth_bnds          (depth, nbounds) float32 dask.array<chunksize=(57, 2), meta=np.ndarray>
  * lat                 (lat) float32 -87.5 -82.5 -77.5 -72.5 ... 77.5 82.5 87.5
    lat_bnds            (lat, nbounds) float32 dask.array<chunksize=(36, 2), meta=np.ndarray>
  * lon                 (lon) float32 -177.5 -172.5 -167.5 ... 167.5 172.5 177.5
    lon_bnds            (lon, nbounds) float32 dask.array<chunksize=(72, 2), meta=np.ndarray>
  * time                (time) object 1986-01-16 00:00:00 ... 1986-12-16 00:0...
Dimensions without coordinates: nbounds
Data variables:
    s_dd                (time, depth, lat, lon) float64 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    s_mn                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    s_sd                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    s_se                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    t_dd                (time, depth, lat, lon) float64 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    t_mn                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    t_sd                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
    t_se                (time, depth, lat, lon) float32 dask.array<chunksize=(1, 57, 36, 72), meta=np.ndarray>
Attributes: (12/49)
    Conventions:               CF-1.6, ACDD-1.3
    cdm_data_type:             Grid
    comment:                   global climatology as part of the World ...
    contributor_name:          Ocean Climate Laboratory
    contributor_role:          Calculation of climatologies
    creator_email:             NCEI.info@noaa.gov
    ...                        ...
    summary:                   Climatological mean salinity for the glo...
    time_coverage_duration:    P63Y
    time_coverage_end:         2017-01-31
    time_coverage_resolution:  P01M
    time_coverage_start:       1955-01-01
    title:                     World Ocean Atlas 2018 : sea_water_salin...
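Reading the summary by eye works for a small dataset, but the same check can be written down so it is repeatable. The sketch below compares dimension sizes and variable names against expected values copied from the dataset summary above. The helper `check_target` and the constants are hypothetical names introduced for illustration, and plain dictionaries stand in for the dataset so the example runs without the Zarr store.

```python
def check_target(sizes, data_vars, expected_sizes, expected_vars):
    """Return a list of problems; an empty list means the target looks right.

    Hypothetical helper for illustration only.
    """
    problems = []
    for dim, expected in expected_sizes.items():
        actual = sizes.get(dim)
        if actual != expected:
            problems.append(f"dimension {dim!r}: expected {expected}, found {actual}")
    missing = sorted(set(expected_vars) - set(data_vars))
    if missing:
        problems.append(f"missing variables: {missing}")
    return problems


# Expected values copied from the dataset summary above.
EXPECTED_SIZES = {"time": 12, "depth": 57, "lat": 36, "lon": 72}
EXPECTED_VARS = ["s_dd", "s_mn", "s_sd", "s_se", "t_dd", "t_mn", "t_sd", "t_se"]

# A stand-in that matches the summary: no problems reported.
good = check_target(
    {"time": 12, "nbounds": 2, "depth": 57, "lat": 36, "lon": 72},
    EXPECTED_VARS,
    EXPECTED_SIZES,
    EXPECTED_VARS,
)

# A stand-in with one month and six variables missing: two problems reported.
bad = check_target(
    {"time": 11, "nbounds": 2, "depth": 57, "lat": 36, "lon": 72},
    ["s_mn", "t_mn"],
    EXPECTED_SIZES,
    EXPECTED_VARS,
)

print(good)
for problem in bad:
    print(problem)
```

With the real dataset, the call would be `check_target(dict(ds.sizes), list(ds.data_vars), EXPECTED_SIZES, EXPECTED_VARS)`.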
Just to check, we will make a plot.
ds.s_mn.isel(depth=0).mean(dim='time').plot()
<matplotlib.collections.QuadMesh at 0x16a60b310>

🎉 Yay! Our recipe worked!