I’m trying to use the existing data_loader_hrrr_era5 (in examples/generative/stormcast/datasets/data_loader_hrrr_era5.py) to train the regression model (e.g. via regression.train). The documentation describes the folder layout under <location>: separate era5/ and <hrrr_dataset_name>/ directories, each containing per-year .zarr files inside their respective train/, valid/, and test/ subdirectories. However, the required internal structure of those .zarr files is not clear to me.
Specifically, I’d like to know:
For ERA5 per-year .zarr files, how the variables should be structured (dimensions, coordinate names, etc.).
For HRRR per-year .zarr files, what the internal layout must look like and how the variables should be structured (dimensions, coordinate names, etc.).
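For reference, this is how I currently read the documented directory layout, written as a small sketch. It only restates the folder structure described above. The <year>.zarr file naming, the helper name expected_store_paths, and the example dataset name and years are my own assumptions and are not taken from the loader's code, so please correct anything that is wrong.

```python
from pathlib import Path


def expected_store_paths(location, hrrr_dataset_name, years_by_split):
    """List the per-year Zarr stores I expect the loader to look for.

    ASSUMPTION: each store is named "<year>.zarr". The documentation only
    says "per-year .zarr files", so the real naming may differ.
    """
    root = Path(location)
    paths = []
    # Two top-level directories: era5/ and <hrrr_dataset_name>/
    for source in ("era5", hrrr_dataset_name):
        # Each contains train/, valid/, and test/ subdirectories
        for split, years in years_by_split.items():
            for year in years:
                paths.append(root / source / split / f"{year}.zarr")
    return paths


if __name__ == "__main__":
    # Hypothetical dataset name and years, for illustration only.
    example = expected_store_paths(
        "data",
        "my_hrrr_dataset",
        {"train": [2018, 2019], "valid": [2020], "test": [2021]},
    )
    for p in example:
        print(p.as_posix())
```

What I am missing is what each of these stores has to contain for the loader to accept it.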
Without knowing the exact schema, it’s difficult to build a custom Zarr export that the loader can consume. Any examples, minimal specs, or references to how the original datasets were organized (paths + internal variable names) would be extremely helpful. Thanks in advance for any pointers!