Command Line Interface¶
The OpenClimateGIS command line interface provides access to Python capabilities using command line syntax. Supported subcommands are called using:
ocli <subcommand> <arguments>
Current subcommands:
Subcommand | Long Name | Description |
---|---|---|
chunked-rwg | Chunked Regrid Weight Generation | Chunked regrid weight generation using OCGIS spatial decompositions and ESMF weight generation. Allows weight generation for very high resolution grids in memory-limited environments. |
Chunked Regrid Weight Generation¶
Chunked regrid weight generation calculates regridding weights by breaking the source and destination grids into smaller pieces (chunks). This allows very high resolution grids to participate in regridding without exhausting machine memory. The destination grid is chunked using a spatial decomposition (unstructured grids) or index-based slicing (structured, logically rectangular grids). The source grid is then spatially subset by the spatial extent of each destination chunk, increased by a spatial buffer to ensure the destination chunk is fully mapped by the source chunk. Weights are calculated for each source-destination chunk pair using ESMPy, the Python interface to the Earth System Modeling Framework (ESMF). By default, the chunk weight files are then merged into a single global weight file.
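The index-based slicing used for logically rectangular destination grids can be illustrated with a minimal sketch. The helper names `chunk_slices` and `grid_chunks` are hypothetical and are not part of the OpenClimateGIS API; the sketch only shows how a two-value decomposition such as 5,5 maps to a set of index slices, and assumes each axis has at least as many cells as chunks.

```python
def chunk_slices(size, nchunks):
    """Split range(size) into `nchunks` contiguous slices of near-equal length."""
    base, extra = divmod(size, nchunks)
    slices = []
    start = 0
    for i in range(nchunks):
        # The first `extra` chunks absorb the remainder, one cell each.
        stop = start + base + (1 if i < extra else 0)
        slices.append(slice(start, stop))
        start = stop
    return slices


def grid_chunks(ny, nx, nchunks_y, nchunks_x):
    """Yield (y_slice, x_slice) pairs that together cover an ny-by-nx grid."""
    for ys in chunk_slices(ny, nchunks_y):
        for xs in chunk_slices(nx, nchunks_x):
            yield ys, xs


# Hypothetical 180 x 360 destination grid decomposed into 5 x 5 chunks.
chunks = list(grid_chunks(180, 360, 5, 5))
print(len(chunks))
print(chunks[0])
```

Each `(y_slice, x_slice)` pair identifies one destination chunk; weights would then be generated for that chunk against the matching subset of the source grid.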
In addition to chunked weight generation, the interface also offers spatial subsetting of the source grid using the global spatial extent of the destination grid. This is useful in situations where the destination grid spatial extent is very small compared to the spatial extent of the source grid.
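As a rough illustration of subsetting a source grid by a buffered destination extent, the sketch below selects source cell centers that fall inside a bounding box. The function names, the 10-degree grid, and the buffer value are assumptions made for this example; the sketch ignores longitude wrapping and cell bounds, and it is not how OpenClimateGIS implements the subset internally.

```python
def buffered_extent(extent, buffer_distance):
    """Grow a (minx, miny, maxx, maxy) extent by `buffer_distance` on every side."""
    minx, miny, maxx, maxy = extent
    return (minx - buffer_distance, miny - buffer_distance,
            maxx + buffer_distance, maxy + buffer_distance)


def subset_by_extent(centers, extent):
    """Return the indices of (x, y) cell centers that fall inside `extent`."""
    minx, miny, maxx, maxy = extent
    return [i for i, (x, y) in enumerate(centers)
            if minx <= x <= maxx and miny <= y <= maxy]


# Hypothetical global source grid with 10-degree cells (centers only).
src_centers = [(x, y) for y in range(-85, 90, 10) for x in range(-175, 180, 10)]

# Hypothetical destination extent, much smaller than the source grid.
dst_extent = (-100.0, 30.0, -90.0, 40.0)
selected = subset_by_extent(src_centers, buffered_extent(dst_extent, 10.0))
print(len(selected), 'of', len(src_centers), 'source cells retained')
```

Only the retained source cells need to take part in weight generation, which is where the memory savings come from when the destination extent is small.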
Usage¶
$ ocli chunked-rwg --help
Usage: ocli chunked-rwg [OPTIONS]
Generate regridding weights using a spatial decomposition.
Options:
-s, --source PATH Path to the source grid NetCDF file.
[required]
-d, --destination PATH Path to the destination grid NetCDF file.
[required]
-n, --nchunks_dst TEXT Single integer or sequence defining the
chunking decomposition for the destination
grid. Each value is the number of chunks
along each decomposed axis. For unstructured
grids, provide a single value (i.e. 100).
For logically rectangular grids, two values
are needed to describe the x and y
decomposition (i.e. 10,20). Required if
--genweights and not --spatial_subset.
--merge / --no_merge (default=merge) If --merge, merge weight
file chunks into a global weight file.
-w, --weight PATH Path to the output global weight file.
Required if --merge.
--esmf_src_type TEXT (default=GRIDSPEC) ESMF source grid type.
Supports GRIDSPEC, UGRID, and SCRIP.
--esmf_dst_type TEXT (default=GRIDSPEC) ESMF destination grid
type. Supports GRIDSPEC, UGRID, and SCRIP.
--genweights / --no_genweights (default=True) Generate weights using ESMF
for each source and destination subset.
--esmf_regrid_method TEXT (default=CONSERVE) The ESMF regrid method.
Only applicable with --genweights. Supports
CONSERVE, BILINEAR, PATCH, and NEAREST_STOD.
--spatial_subset / --no_spatial_subset
(default=no_spatial_subset) Optionally
subset the destination grid by the bounding
box spatial extent of the source grid. This
will not work in parallel if --genweights.
--src_resolution FLOAT Optionally overload the spatial resolution
of the source grid. If provided, assumes an
isomorphic structure. Spatial resolution is
the mean distance between grid cell center
coordinates.
--dst_resolution FLOAT Optionally overload the spatial resolution
of the destination grid. If provided,
assumes an isomorphic structure. Spatial
resolution is the mean distance between grid
cell center coordinates.
--buffer_distance FLOAT Optional spatial buffer distance (in units
of the destination grid coordinates) to use
when subsetting the source grid by the
spatial extent of a destination grid or
chunk. This is computed internally if not
provided. Useful to override if the area of
influence for a source-destination mapping
is known a priori.
--wd PATH Optional working directory for intermediate
chunk files. Creates a directory in the
system's temporary scratch space if not
provided.
--persist / --no_persist (default=no_persist) If --persist, do not
remove the working directory --wd following
execution.
--eager / --not_eager (default=eager) If --eager, load all data
from the grids into memory before
subsetting. This will increase performance
as loading data for each chunk is avoided.
Set this to --not_eager for a more memory
efficient execution at the expense of
additional IO operations.
--help Show this message and exit.
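The "Required if" notes in the help text can be checked before launching the command. The following is a minimal sketch of a hypothetical helper, `build_chunked_rwg_cmd`, that assembles an argument list from the options documented above; the helper and its validation are illustrative assumptions, and `ocli` performs its own argument checking regardless.

```python
def build_chunked_rwg_cmd(source, destination, weight=None, nchunks_dst=None,
                          merge=True, genweights=True, spatial_subset=False):
    """Assemble an `ocli chunked-rwg` argument list (hypothetical helper)."""
    # Mirrors the help text: --weight is required if --merge.
    if merge and weight is None:
        raise ValueError('--weight is required when merging')
    # Mirrors the help text: --nchunks_dst is required if --genweights
    # and not --spatial_subset.
    if genweights and not spatial_subset and nchunks_dst is None:
        raise ValueError('--nchunks_dst is required with --genweights '
                         'unless --spatial_subset is used')

    cmd = ['ocli', 'chunked-rwg', '-s', source, '-d', destination]
    if nchunks_dst is not None:
        cmd += ['-n', ','.join(str(n) for n in nchunks_dst)]
    if weight is not None:
        cmd += ['-w', weight]
    if not merge:
        cmd.append('--no_merge')
    if not genweights:
        cmd.append('--no_genweights')
    if spatial_subset:
        cmd.append('--spatial_subset')
    return cmd


cmd = build_chunked_rwg_cmd('src.nc', 'dst.nc', weight='esmf_weights.nc',
                            nchunks_dst=(5, 5))
print(' '.join(cmd))
```

The returned list can be passed directly to `subprocess.check_call`.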
ESMF Cross-Reference¶
Limitations¶
- Reducing memory overhead relies heavily on IO. Best performance is attained when netCDF4-python is built with parallel support to allow concurrent IO writes with OpenClimateGIS. A warning will be emitted by OpenClimateGIS if a serial-only netCDF4-python installation is detected.
- Supports weight generation only, without weight application (sparse matrix multiplication).
- Works for spherical latitude/longitude grids only.
- When a spatial decomposition is used on the destination grid, there may be duplicate entries in the merged, global weight file. These may be ignored, as they cause only a minor performance hit in sparse matrix multiplication.
Examples¶
Weight Generation with Logically Rectangular Grids¶
This example creates two global, spherical, latitude/longitude grids with differing spatial resolutions. First, we write the grids to NetCDF files. We then run chunked regrid weight generation from the command line in parallel using MPI. The destination grid is decomposed into 25 chunks: five along the y-axis and five along the x-axis.
import os
import subprocess
import tempfile
import ocgis
from ocgis.test.base import create_gridxy_global
DATADIR = tempfile.mkdtemp(prefix='ocgis_chunked_rwg_')
SRC_CFGRID = os.path.join(DATADIR, 'src.nc')
DST_CFGRID = os.path.join(DATADIR, 'dst.nc')
WEIGHT = os.path.join(DATADIR, 'esmf_weights.nc')
# Write the source and destination grids. The destination grid has a slightly lower spatial resolution. ----------------
srcgrid = create_gridxy_global(crs=ocgis.crs.Spherical())
srcgrid.parent.write(SRC_CFGRID)
dstgrid = create_gridxy_global(resolution=1.33, crs=ocgis.crs.Spherical())
dstgrid.parent.write(DST_CFGRID)
# ----------------------------------------------------------------------------------------------------------------------
# Construct the chunked regrid weight generation command and execute in a subprocess.
cmd = ['mpirun', '-n', '4', 'ocli', 'chunked-rwg', '-s', SRC_CFGRID, '-d', DST_CFGRID, '-w', WEIGHT, '-n', '5,5']
print(' '.join(cmd))
# Command line looks like:
# mpirun -n 4 ocli chunked-rwg -s src.nc -d dst.nc -w esmf_weights.nc -n 5,5
subprocess.check_call(cmd)
# Inspect the weight file output.
ocgis.RequestDataset(WEIGHT).inspect()
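The example above assumes `mpirun` is installed. On machines where it may be missing, one option is to add the launcher prefix only when it is found. The helper `with_mpi` below is a hypothetical convenience, not part of OpenClimateGIS; it uses only `shutil.which` from the standard library and falls back to serial execution.

```python
import shutil


def with_mpi(cmd, nprocs, launcher='mpirun'):
    """Prefix `cmd` with an MPI launcher when one is on PATH and nprocs > 1."""
    if nprocs > 1 and shutil.which(launcher) is not None:
        return [launcher, '-n', str(nprocs)] + list(cmd)
    # Fall back to a serial run when no launcher is available.
    return list(cmd)


base_cmd = ['ocli', 'chunked-rwg', '-s', 'src.nc', '-d', 'dst.nc',
            '-w', 'esmf_weights.nc', '-n', '5,5']
print(' '.join(with_mpi(base_cmd, 4)))
```

Passing `nprocs=1` always yields the serial command, which is also the safe choice when combining --spatial_subset with --genweights, since the help text notes that combination does not work in parallel.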
Weight Generation with Spatial Subset¶
This example creates a global, spherical, latitude/longitude grid. It also creates a grid with a single cell. The spatial extent of the single cell grid is much smaller than the global grid. Spatially subsetting the source grid prior to weight generation decreases the amount of source grid information required in the weight calculation.
import os
import subprocess
import tempfile
import ocgis
from ocgis.test.base import create_gridxy_global
DATADIR = tempfile.mkdtemp(prefix='ocgis_chunked_rwg_')
SRC_CFGRID = os.path.join(DATADIR, 'src.nc')
DST_CFGRID = os.path.join(DATADIR, 'dst.nc')
WEIGHT = os.path.join(DATADIR, 'esmf_weights.nc')
# Write the source and destination grids. The destination grid has a much smaller spatial extent. ----------------------
srcgrid = create_gridxy_global(crs=ocgis.crs.Spherical())
srcgrid.parent.write(SRC_CFGRID)
# Write the destination grid. Slice the grid first to create a single cell.
dstgrid = create_gridxy_global(resolution=5, crs=ocgis.crs.Spherical())
dstgrid = dstgrid[18, 36]
dstgrid.parent.write(DST_CFGRID)
# ----------------------------------------------------------------------------------------------------------------------
# Construct the chunked regrid weight generation command and execute in a subprocess.
cmd = ['ocli', 'chunked-rwg', '-s', SRC_CFGRID, '-d', DST_CFGRID, '-w', WEIGHT, '--spatial_subset']
print(' '.join(cmd))
# Command looks like:
# ocli chunked-rwg -s src.nc -d dst.nc -w esmf_weights.nc --spatial_subset
subprocess.check_call(cmd)
# Inspect the weight file output.
ocgis.RequestDataset(WEIGHT).inspect()
Chunked Sparse Matrix Multiplication (Weight Application)¶
Using the output of chunked regrid weight generation, chunked sparse matrix multiplication applies the weights (the sparse matrix) to each chunk in sequence. The individual multiplication operations are performed in parallel when MPI is used.
$ ocli chunked-smm --help
Usage: ocli.py chunked-smm [OPTIONS]
Apply weights in chunked files with an option to insert the global data.
Options:
--wd PATH Optional working directory containing
destination chunk files. If empty, the
current working directory is used.
--index_path FILE Path to the grid chunker index file. If not
provided, it will assume the default name in
the working directory.
--insert_weighted / --no_insert_weighted
If --insert_weighted, insert the weighted
data back into the global destination file.
-d, --destination FILE Path to the destination grid NetCDF file.
Needed if using --insert_weighted.
--data_variables TEXT List of comma-separated data variable names
to overload auto-discovery.
--help Show this message and exit.
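For readers unfamiliar with weight application, the sparse matrix multiplication itself can be sketched in a few lines. This sketch assumes the common ESMF weight file layout, in which each weight `S` pairs a 1-based destination index `row` with a 1-based source index `col`; the function name and sample values are hypothetical, and the sketch does not reproduce the chunked or parallel logic of `chunked-smm`.

```python
def apply_weights(row, col, S, src, n_dst):
    """Apply regridding weights: dst[row] += S * src[col] for every entry.

    `row` and `col` are assumed to be 1-based indices, following the
    ESMF weight file convention.
    """
    dst = [0.0] * n_dst
    for r, c, s in zip(row, col, S):
        dst[r - 1] += s * src[c - 1]
    return dst


# Three source cells mapped onto two destination cells.
src = [10.0, 20.0, 30.0]
row = [1, 1, 2]
col = [1, 2, 3]
S = [0.5, 0.5, 1.0]

dst = apply_weights(row, col, S, src, n_dst=2)
print(dst)  # [15.0, 30.0]
```

In a chunked setting, the same operation would be applied per chunk using that chunk's weights and source subset.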