Command Line Interface

The OpenClimateGIS command line interface provides access to Python capabilities using command line syntax. Supported subcommands are called using:

ocli <subcommand> <arguments>

Current subcommands:

Subcommand   Long Name                         Description
chunked-rwg  Chunked Regrid Weight Generation  Chunked regrid weight generation using OCGIS
                                               spatial decompositions and ESMF weight
                                               generation. Allows weight generation for very
                                               high resolution grids in memory-limited
                                               environments.

Chunked Regrid Weight Generation

Chunked regrid weight generation calculates regridding weights by breaking the source and destination grids into smaller pieces (chunks). This allows very high resolution grids to participate in regridding without exhausting machine memory. The destination grid is chunked using a spatial decomposition (unstructured grids) or index-based slicing (structured, logically rectangular grids). For each destination chunk, the source grid is spatially subset by the chunk's spatial extent, enlarged by a spatial buffer so that the destination chunk is fully covered by the source chunk. Weights are calculated for each source-destination chunk pair using ESMPy, the Python interface to the Earth System Modeling Framework (ESMF). By default, the weight chunks are then merged into a single global weight file.
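The index-based slicing used for logically rectangular grids can be pictured with a short sketch. This is only an illustration of the idea, not the OpenClimateGIS implementation: the function name `chunk_slices`, the (y, x) argument order, and the 180 x 360 grid shape are all assumptions made for the example, and it assumes each axis has at least as many cells as chunks.

```python
import numpy as np


def chunk_slices(shape, nchunks):
    """Return a list of (y_slice, x_slice) tuples covering a 2D grid.

    shape   -- (ny, nx) size of the structured grid
    nchunks -- (nchunks_y, nchunks_x) number of chunks along each axis
    """
    per_axis = []
    for size, n in zip(shape, nchunks):
        # array_split tolerates sizes that are not evenly divisible by n.
        parts = np.array_split(np.arange(size), n)
        per_axis.append([slice(int(p[0]), int(p[-1]) + 1) for p in parts])
    # Every y-slice is paired with every x-slice.
    return [(ys, xs) for ys in per_axis[0] for xs in per_axis[1]]


# A hypothetical 180 x 360 cell grid split into 5 x 5 = 25 chunks.
slices = chunk_slices((180, 360), (5, 5))
print(len(slices))  # 25
print(slices[0])    # (slice(0, 36, None), slice(0, 72, None))
```

Each tuple can be used to index a 2D coordinate or data array, so only one chunk of the destination grid needs to be held in memory at a time.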

In addition to chunked weight generation, the interface also offers spatial subsetting of the source grid using the global spatial extent of the destination grid. This is useful in situations where the destination grid spatial extent is very small compared to the spatial extent of the source grid.
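As a rough sketch of what subsetting the source grid by a buffered destination extent means, the example below selects source cell centers that fall inside a bounding box grown by a buffer distance. The function name `subset_by_extent`, the grid values, and the buffer of 2 degrees are hypothetical, and the sketch ignores complications such as extents that wrap across the dateline. OpenClimateGIS computes the subset and buffer internally.

```python
import numpy as np


def subset_by_extent(lon, lat, extent, buffer_distance):
    """Return a boolean mask of source cell centers inside the buffered extent.

    extent -- (minx, miny, maxx, maxy) of the destination grid or chunk
    """
    minx, miny, maxx, maxy = extent
    # Grow the box on every side so that source cells bordering the
    # destination extent are retained.
    minx, miny = minx - buffer_distance, miny - buffer_distance
    maxx, maxy = maxx + buffer_distance, maxy + buffer_distance
    return (lon >= minx) & (lon <= maxx) & (lat >= miny) & (lat <= maxy)


# Hypothetical 1-degree global source grid (cell centers only).
lon, lat = np.meshgrid(np.arange(-179.5, 180.0, 1.0), np.arange(-89.5, 90.0, 1.0))

# Hypothetical destination extent covering a single 5-degree cell.
mask = subset_by_extent(lon, lat, extent=(0.0, 0.0, 5.0, 5.0), buffer_distance=2.0)
print(mask.sum(), "of", mask.size, "source cells kept")
```

Only the cells selected by the mask would need to take part in weight generation, which is why the subset helps when the destination extent is small relative to the source extent.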

Usage

$ ocli chunked-rwg --help

Usage: ocli chunked-rwg [OPTIONS]

  Generate regridding weights using a spatial decomposition.

Options:
  -s, --source PATH               Path to the source grid NetCDF file.
                                  [required]
  -d, --destination PATH          Path to the destination grid NetCDF file.
                                  [required]
  -n, --nchunks_dst TEXT          Single integer or sequence defining the
                                  chunking decomposition for the destination
                                  grid. Each value is the number of chunks
                                  along each decomposed axis. For unstructured
                                  grids, provide a single value (i.e. 100).
                                  For logically rectangular grids, two values
                                  are needed to describe the x and y
                                  decomposition (i.e. 10,20). Required if
                                  --genweights and not --spatial_subset.
  --merge / --no_merge            (default=merge) If --merge, merge weight
                                  file chunks into a global weight file.
  -w, --weight PATH               Path to the output global weight file.
                                  Required if --merge.
  --esmf_src_type TEXT            (default=GRIDSPEC) ESMF source grid type.
                                  Supports GRIDSPEC, UGRID, and SCRIP.
  --esmf_dst_type TEXT            (default=GRIDSPEC) ESMF destination grid
                                  type. Supports GRIDSPEC, UGRID, and SCRIP.
  --genweights / --no_genweights  (default=True) Generate weights using ESMF
                                  for each source and destination subset.
  --esmf_regrid_method TEXT       (default=CONSERVE) The ESMF regrid method.
                                  Only applicable with --genweights. Supports
                                  CONSERVE, BILINEAR, PATCH, and NEAREST_STOD.
  --spatial_subset / --no_spatial_subset
                                  (default=no_spatial_subset) Optionally
                                  subset the destination grid by the bounding
                                  box spatial extent of the source grid. This
                                  will not work in parallel if --genweights.
  --src_resolution FLOAT          Optionally overload the spatial resolution
                                  of the source grid. If provided, assumes an
                                  isomorphic structure. Spatial resolution is
                                  the mean distance between grid cell center
                                  coordinates.
  --dst_resolution FLOAT          Optionally overload the spatial resolution
                                  of the destination grid. If provided,
                                  assumes an isomorphic structure. Spatial
                                  resolution is the mean distance between grid
                                  cell center coordinates.
  --buffer_distance FLOAT         Optional spatial buffer distance (in units
                                  of the destination grid coordinates) to use
                                  when subsetting the source grid by the
                                  spatial extent of a destination grid or
                                  chunk. This is computed internally if not
                                  provided. Useful to override if the area of
                                  influence for a source-destination mapping
                                  is known a priori.
  --wd PATH                       Optional working directory for intermediate
                                  chunk files. Creates a directory in the
                                  system's temporary scratch space if not
                                  provided.
  --persist / --no_persist        (default=no_persist) If --persist, do not
                                  remove the working directory --wd following
                                  execution.
  --eager / --not_eager           (default=eager) If --eager, load all data
                                  from the grids into memory before
                                  subsetting. This will increase performance
                                  as loading data for each chunk is avoided.
                                  Set this to --not_eager for a more memory
                                  efficient execution at the expense of
                                  additional IO operations.
  --help                          Show this message and exit.
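The help text states two conditional requirements: --nchunks_dst is required if --genweights and not --spatial_subset, and --weight is required if --merge. The sketch below encodes those two rules in a small helper that assembles the argument list for use with subprocess. The helper `build_chunked_rwg_cmd` is hypothetical and not part of OpenClimateGIS; it uses only flags listed above and does not run the command.

```python
def build_chunked_rwg_cmd(source, destination, nchunks_dst=None, weight=None,
                          merge=True, genweights=True, spatial_subset=False):
    """Hypothetical helper: assemble an ``ocli chunked-rwg`` argument list."""
    # Conditional requirements taken from the help text.
    if genweights and not spatial_subset and nchunks_dst is None:
        raise ValueError('--nchunks_dst is required if --genweights and not --spatial_subset')
    if merge and weight is None:
        raise ValueError('--weight is required if --merge')

    cmd = ['ocli', 'chunked-rwg', '-s', source, '-d', destination]
    if nchunks_dst is not None:
        # One value for unstructured grids, two for logically rectangular grids.
        cmd += ['-n', ','.join(str(n) for n in nchunks_dst)]
    if weight is not None:
        cmd += ['-w', weight]
    if not merge:
        cmd.append('--no_merge')
    if not genweights:
        cmd.append('--no_genweights')
    if spatial_subset:
        cmd.append('--spatial_subset')
    return cmd


# Placeholder file names; substitute real paths.
cmd = build_chunked_rwg_cmd('src.nc', 'dst.nc', nchunks_dst=(10, 20), weight='esmf_weights.nc')
print(' '.join(cmd))
# ocli chunked-rwg -s src.nc -d dst.nc -n 10,20 -w esmf_weights.nc
```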

Limitations

  • Reducing memory overhead relies heavily on IO. Performance is best when netCDF4-python is built with parallel support, which allows OpenClimateGIS to perform concurrent IO writes. OpenClimateGIS emits a warning if a serial-only netCDF4-python installation is detected.
  • The chunked-rwg subcommand generates weights only. It does not apply them (sparse matrix multiplication).
  • Works for spherical latitude/longitude grids only.
  • When a spatial decomposition is used on the destination grid, there may be duplicate entries in the merged, global weight file. These may be ignored, as they cause only a minor performance penalty in sparse matrix multiplications.
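To get a sense of how many duplicate entries a merged weight file contains, the (row, col) index pairs can be counted. The sketch below uses small stand-in arrays rather than a real file. It assumes the ESMF weight file convention of 1-based index variables named `row` and `col`, which in practice would be read from the NetCDF file first.

```python
import numpy as np

# Stand-in values for the `row` and `col` variables of a weight file.
# The pair (1, 2) appears twice.
row = np.array([1, 1, 2, 2, 1])
col = np.array([1, 2, 2, 3, 2])

# Treat each (row, col) pair as one record and count how often it occurs.
pairs = np.stack([row, col], axis=1)
unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)

duplicated = unique_pairs[counts > 1]
n_extra = int((counts - 1).sum())
print(f"{len(duplicated)} duplicated pair(s), {n_extra} redundant entries")
```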

Examples

Weight Generation with Logically Rectangular Grids

This example creates two global, spherical, latitude/longitude grids with differing spatial resolutions. First, we write the grids to NetCDF files. We then run chunked regrid weight generation from the command line in parallel using four MPI processes. The destination grid is decomposed into 25 chunks: five chunks along the y-axis and five chunks along the x-axis.

import os
import subprocess
import tempfile

import ocgis
from ocgis.test.base import create_gridxy_global

DATADIR = tempfile.mkdtemp(prefix='ocgis_chunked_rwg_')
SRC_CFGRID = os.path.join(DATADIR, 'src.nc')
DST_CFGRID = os.path.join(DATADIR, 'dst.nc')
WEIGHT = os.path.join(DATADIR, 'esmf_weights.nc')

# Write the source and destination grids. The destination grid has a slightly lower spatial resolution. ----------------
srcgrid = create_gridxy_global(crs=ocgis.crs.Spherical())
srcgrid.parent.write(SRC_CFGRID)

dstgrid = create_gridxy_global(resolution=1.33, crs=ocgis.crs.Spherical())
dstgrid.parent.write(DST_CFGRID)
# ----------------------------------------------------------------------------------------------------------------------

# Construct the chunked regrid weight generation command and execute in a subprocess.
cmd = ['mpirun', '-n', '4', 'ocli', 'chunked-rwg', '-s', SRC_CFGRID, '-d', DST_CFGRID, '-w', WEIGHT, '-n', '5,5']
print(' '.join(cmd))

# The command looks like this (temporary directory paths shortened):

# mpirun -n 4 ocli chunked-rwg -s src.nc -d dst.nc -w esmf_weights.nc -n 5,5

subprocess.check_call(cmd)

# Inspect the weight file output.
ocgis.RequestDataset(WEIGHT).inspect()

Weight Generation with Spatial Subset

This example creates a global, spherical, latitude/longitude grid. It also creates a grid with a single cell. The spatial extent of the single cell grid is much smaller than the global grid. Spatially subsetting the source grid prior to weight generation decreases the amount of source grid information required in the weight calculation.

import os
import subprocess
import tempfile

import ocgis
from ocgis.test.base import create_gridxy_global

DATADIR = tempfile.mkdtemp(prefix='ocgis_chunked_rwg_')
SRC_CFGRID = os.path.join(DATADIR, 'src.nc')
DST_CFGRID = os.path.join(DATADIR, 'dst.nc')
WEIGHT = os.path.join(DATADIR, 'esmf_weights.nc')

# Write the source and destination grids. The destination grid has a much smaller spatial extent. ----------------------
srcgrid = create_gridxy_global(crs=ocgis.crs.Spherical())
srcgrid.parent.write(SRC_CFGRID)

# Write the destination grid. Slice the grid first to create a single cell.
dstgrid = create_gridxy_global(resolution=5, crs=ocgis.crs.Spherical())
dstgrid = dstgrid[18, 36]
dstgrid.parent.write(DST_CFGRID)
# ----------------------------------------------------------------------------------------------------------------------

# Construct the chunked regrid weight generation command and execute in a subprocess.
cmd = ['ocli', 'chunked-rwg', '-s', SRC_CFGRID, '-d', DST_CFGRID, '-w', WEIGHT, '--spatial_subset']
print(' '.join(cmd))

# The command looks like this (temporary directory paths shortened):

# ocli chunked-rwg -s src.nc -d dst.nc -w esmf_weights.nc --spatial_subset

subprocess.check_call(cmd)

# Inspect the weight file output.
ocgis.RequestDataset(WEIGHT).inspect()

Chunked Sparse Matrix Multiplication (Weight Application)

Using output from the chunked regrid weight generation, a chunked sparse matrix multiplication applies the sparse matrix (the weights) using a sequential approach. The individual multiplication operations are performed in parallel if using MPI.
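Applying weights amounts to accumulating dst[row] += S * src[col] over every weight entry. The sketch below illustrates that operation for two hypothetical weight chunks processed one after the other. It is not the OpenClimateGIS implementation: the function name `apply_weights` and the sample values are invented, indices are assumed to be 1-based as in ESMF weight files, and for simplicity both chunks index into the same global source and destination arrays.

```python
import numpy as np


def apply_weights(row, col, S, src, n_dst):
    """Compute dst[row] += S * src[col] for 1-based row/col indices."""
    dst = np.zeros(n_dst, dtype=float)
    # np.add.at accumulates correctly when a destination index repeats.
    np.add.at(dst, row - 1, S * src[col - 1])
    return dst


# Two hypothetical weight chunks, each given as (row, col, S).
chunks = [
    (np.array([1, 1, 2]), np.array([1, 2, 2]), np.array([0.5, 0.5, 1.0])),
    (np.array([3, 3]), np.array([3, 4]), np.array([0.25, 0.75])),
]
src = np.array([10.0, 20.0, 30.0, 40.0])

# Apply the chunks sequentially and sum their contributions.
dst = np.zeros(3)
for row, col, S in chunks:
    dst += apply_weights(row, col, S, src, n_dst=3)
print(dst)  # destination values: 15.0, 20.0, 37.5
```

Because each chunk contributes independently to the destination values, the per-chunk multiplications are natural candidates for distribution across MPI ranks.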

$ ocli chunked-smm --help
Usage: ocli.py chunked-smm [OPTIONS]

  Apply weights in chunked files with an option to insert the global data.

Options:
  --wd PATH                       Optional working directory containing
                                  destination chunk files. If empty, the
                                  current working directory is used.
  --index_path FILE               Path grid chunker index file. If not
                                  provided, it will assume the default name in
                                  the working directory.
  --insert_weighted / --no_insert_weighted
                                  If --insert_weighted, insert the weighted
                                  data back into the global destination file.
  -d, --destination FILE          Path to the destination grid NetCDF file.
                                  Needed if using --insert_weighted.
  --data_variables TEXT           List of comma-separated data variable names
                                  to overload auto-discovery.
  --help                          Show this message and exit.