Class Reference

RequestDataset

class ocgis.RequestDataset(uri=None, variable=None, units=None, time_range=None, time_region=None, time_subset_func=None, level_range=None, conform_units_to=None, crs='auto', t_units=None, t_calendar=None, t_conform_units_to=None, grid_abstraction='auto', grid_is_isomorphic='auto', dimension_map=None, field_name=None, driver=None, regrid_source=True, regrid_destination=False, metadata=None, format_time=True, opened=None, uid=None, rename_variable=None, predicate=None, rotated_pole_priority=False, driver_kwargs=None)[source]

Contains all the information necessary to create an OCGIS field via an OCGIS driver.

>>> from ocgis import RequestDataset
>>> uri = 'http://some.opendap.dataset'
>>> # It is also okay to enter the path to a local file.
>>> uri = '/path/to/local/file.nc'
>>> variable = 'tasmax'
>>> rd = RequestDataset(uri, variable)
Parameters:uri (str | sequence of str | None) – The absolute path (URLs included) to the data’s location. If None, either opened or metadata must be provided.
>>> uri = 'http://some.opendap.dataset'
>>> uri = '/path/to/local/file.nc'
>>> # Multifile datasets are supported for local and remote targets.
>>> uri = ['/path/to/local/file1.nc', '/path/to/local/file2.nc']

Warning

The ordering of the files is not checked internally. If the datasets should be concatenated along the time dimension, consider first running the sequence of URIs through the time-sorting function get_sorted_uris_by_time_dimension().
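The ordering concern can be illustrated without OCGIS itself. The following is a minimal sketch and not the library's get_sorted_uris_by_time_dimension(): the helper name sort_uris_by_first_time and the first_times mapping are hypothetical, and in practice the first time value would be read from each file.

```python
from datetime import datetime


def sort_uris_by_first_time(uris, first_times):
    """Return URIs ordered by the first time value of each file.

    `first_times` is a hypothetical mapping of URI -> datetime; a real
    implementation would read this value from each file's time variable.
    """
    return sorted(uris, key=lambda uri: first_times[uri])


# The order in which files are listed does not guarantee chronological order.
uris = ['/path/to/local/file2.nc', '/path/to/local/file1.nc']
first_times = {
    '/path/to/local/file1.nc': datetime(2001, 1, 1),
    '/path/to/local/file2.nc': datetime(2011, 1, 1),
}
sorted_uris = sort_uris_by_first_time(uris, first_times)
```

The sorted sequence could then be passed as the uri argument.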

Parameters:variable (str | sequence of str | None) – The target variable. If the argument value is None, then a search on the target data object will be performed to find variables having a minimum set of dimensions (i.e. time and space). The value of this property will then be updated.
>>> variable = 'tas'
>>> variable = ['tas', 'tasmax']
Parameters:
  • time_range (two-element sequence of datetime-like objects) – Lower and upper bounds for time dimension subsetting. If None, return all time points.
  • time_region (dict) – A dictionary with keys of 'month' and/or 'year' and values as sequences corresponding to target month and/or year values. Empty region selection for a key may be set to None.
>>> time_region = {'month':[6,7],'year':[2010,2011]}
>>> time_region = {'year':[2010]}
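To make the selection rule concrete, here is a minimal standalone sketch of how a time_region dictionary could be applied to a list of datetimes. It assumes the 'month' and 'year' constraints are combined with a logical AND and that a missing key behaves like None; the helper name in_time_region is hypothetical and is not part of OCGIS.

```python
from datetime import datetime


def in_time_region(dt, time_region):
    """Return True if `dt` matches every non-None key in `time_region`."""
    for key in ('month', 'year'):
        allowed = time_region.get(key)
        # A missing key or a value of None places no restriction on that key.
        if allowed is not None and getattr(dt, key) not in allowed:
            return False
    return True


times = [datetime(2010, 6, 15), datetime(2010, 8, 15), datetime(2012, 7, 15)]
time_region = {'month': [6, 7], 'year': [2010, 2011]}
selected = [t for t in times if in_time_region(t, time_region)]
```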
Parameters:
  • time_subset_func – See get_subset_by_function().
  • level_range (two-element sequence of int or float) – Lower and upper bounds for level dimension subsetting. If None, return all levels.
  • crs (AbstractCoordinateReferenceSystem) – Overload the autodiscovered coordinate system.
>>> from ocgis.variable.crs import WGS84
>>> crs = WGS84()
Parameters:
  • t_units (str) – Overload the time units.
  • t_calendar (str) – Overload the time calendar.
  • t_conform_units_to (str) – Conform the time dimension to the provided units. The calendar may not be changed. The optional dependency cf_units is required.
>>> t_conform_units_to = 'days since 1949-1-1'
Parameters:grid_abstraction (str) – Abstract the geometry data to either 'point' or 'polygon'. If 'polygon' is not possible due to missing bounds, 'point' will be used instead. If 'auto' (the default), identify the grid abstraction automatically. Unstructured data formats also allow for 'line'.

Note

The abstraction argument in OcgOperations will overload this.

Parameters:
  • dimension_map (DimensionMap | dict) – Maps dimensions to axes in the case of a projection/realization axis or an uncommon axis ordering. All axes must be in the dictionary. A fully-specified dimension map for a CF grid file containing time, x, and y axes is below. The file also contains a scalar level axis. At minimum, a 'variable' must be provided for each axis. See Configuring a Dimension Map for a usage example.
  • units (str | cf_units.Unit | sequence of possible types) – The units of the source variable. This will be read from metadata if this value is None.
  • conform_units_to (str | cf_units.Unit | sequence of possible types) – Destination units for conversion. If this parameter is set, then the cf_units module must be installed.
  • driver (str | AbstractDriver) – If None, autodiscover the appropriate driver. Acceptable values are listed below. Class objects for the associated driver key are also accepted.
Value            File Extension(s)   Description
'netcdf-cf'      'nc'                A netCDF file using a CF-Grid metadata convention.
'netcdf-ugrid'   'nc'                A netCDF file using the UGRID (Unstructured Grid) metadata convention.
'netcdf-scrip'   'nc'                A netCDF file using the SCRIP metadata convention.
'netcdf'         'nc'                A netCDF file with no metadata convention.
'vector'         'shp'               An ESRI Shapefile or other vector source.
'csv'            'csv'               A CSV file.
Parameters:
  • field_name (str) – Name of the requested field in the output collection. If None, defaults to the data variable name. If there are multiple data variables, the default name is 'ocgis_field'.
  • regrid_source (bool) – If False, do not regrid this dataset. This is relevant only if a regrid_destination dataset is present. Please see ESMPy Regridding for an overview.
  • regrid_destination (bool) – If True, use this dataset as the destination grid for a regridding operation. Only one RequestDataset may be set as the destination grid. Please see ESMPy Regridding for an overview.
  • rename_variable (sequence of str) – A sequence with the same length as variable. Provides new names for the variables.
  • metadata (dict) – Overload the metadata that would normally be loaded by the driver. If metadata is provided and uri is None, a field will be created by interpreting the provided metadata.
  • opened (varies by driver class) – An open file used as a write target for the driver.
  • uid (int) – A unique identifier for the request dataset.
  • predicate (function) – A filter function returning True if a variable should be included in the output field. The function should take a single argument which is a sequence of string variable names. This function is applied directly to the metadata before other functions (i.e. identifying data variables).
  • rotated_pole_priority (bool) – If False, attempt to use representative spherical coordinates if available in a dataset having a rotated pole coordinate system. If True, use the rotated coordinate even if representative coordinates are available.
>>> predicate = lambda x: x.startswith('w')
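As a rough illustration of the filtering behavior, the sketch below applies a predicate to a list of variable names. It assumes the predicate is called once per variable name, as the one-line example suggests; the helper filter_variable_names and the sample names are hypothetical and do not reflect OCGIS internals.

```python
def filter_variable_names(names, predicate):
    """Keep only the names for which `predicate` returns True."""
    return [name for name in names if predicate(name)]


predicate = lambda x: x.startswith('w')
names = ['time', 'lat', 'lon', 'wind_speed', 'wdir', 'tas']
kept = filter_variable_names(names, predicate)
```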
Parameters:
  • driver_kwargs (dict) – Any keyword arguments to driver creation. See the driver documentation for a description of accepted parameters. These are often format-specific and not easily generalized.
  • grid_is_isomorphic – See documentation for ocgis.Field
inspect()[source]

Print a string containing important information about the source driver.

OcgOperations

class ocgis.OcgOperations(dataset=None, spatial_operation='intersects', geom=None, geom_select_sql_where=None, geom_select_uid=None, geom_uid=None, aggregate=False, calc=None, calc_grouping=None, calc_raw=False, abstraction='auto', snippet=False, backend='ocg', prefix=None, output_format='ocgis', agg_selection=False, select_ugid=None, vector_wrap=True, allow_empty=False, dir_output=None, slice=None, file_only=False, format_time=True, calc_sample_size=False, search_radius_mult=None, output_crs=None, interpolate_spatial_bounds=False, add_auxiliary_files=True, optimizations=None, callback=None, time_range=None, time_region=None, time_subset_func=None, level_range=None, conform_units_to=None, select_nearest=False, regrid_destination=None, regrid_options=None, melted=False, output_format_options=None, spatial_wrapping=None, spatial_reorder=False, optimized_bbox_subset=False)[source]

Entry point for all OCGIS operations.

Parameters:
  • dataset (ocgis.RequestDataset | Field | sequence(RequestDataset, …) | sequence(Field, …)) – A dataset is the target file(s) or object(s) containing data to process.
  • spatial_operation (str) – The geometric operation to be performed.
  • geom (list of dict, list of float, str) – The selection geometry(s) used for the spatial subset. If None, selection defaults to entire spatial domain.
  • geom_select_sql_where (str) – A string suitable for insertion into a SQL WHERE statement. See http://www.gdal.org/ogr_sql.html for documentation (section titled “WHERE”).
  • geom_select_uid (sequence of integers) – The unique identifiers of specific geometries contained in the geometry datasets. Geometries having these unique identifiers will be used for subsetting.
  • geom_uid (str) – If provided, use this as the unique geometry identifier. If None, use the value of DEFAULT_GEOM_UID. If that is not present, generate a one-based unique identifier with that name.
  • aggregate (bool) – If True, dataset geometries are aggregated to coincident selection geometries.
  • calc (list of dictionaries or string-based function) – Calculations to be performed on the dataset subset.
  • calc_grouping (list(str), int, None) – Temporal grouping to apply for calculations.
  • calc_raw (bool) – If True, perform calculations on the “raw” data regardless of aggregation flag.
  • abstraction (str) – The geometric abstraction to use for the dataset geometries. If 'auto' (the default), use the highest order geometry available.
  • snippet (bool) – If True, return a data “snippet” composed of the first time point, first level (if applicable), and the entire spatial domain.
  • backend (str) – The processing backend to use.
  • prefix (str) – The output prefix to prepend to any output data filename.
  • output_format (str) – The desired output format.
  • agg_selection (bool) – If True, the selection geometry will be aggregated prior to any spatial operations.
  • vector_wrap (bool) – If True, keep any vector output on a -180 to 180 longitudinal domain.
  • allow_empty (bool) – If True, do not raise an exception in the case of an empty geometric selection.
  • dir_output (str) – The output directory to which any disk format folders are written. If the directory does not exist, an exception will be raised. This will override env.DIR_OUTPUT.
  • slice (list) – A five-element list to use for slicing the input data. This will override any other subsetting.
  • format_time (bool) – If True (the default), attempt to coerce time values to datetime stamps. If False, pass values through without a coercion attempt. This only affects RequestDataset objects.
  • calc_sample_size (bool) – If True, calculate statistical sample sizes for calculations.
  • output_crs (ocgis.variable.crs.AbstractCRS) – If provided, all output geometries will be projected to match the provided CRS.
  • search_radius_mult (float) – This value is multiplied by the target data’s spatial resolution to determine the buffer radius for point selection geometries.
  • interpolate_spatial_bounds (bool) – If True and no bounds are available, attempt to interpolate bounds from centroids.
  • add_auxiliary_files (bool) – If True, create a new directory and add metadata and other informational files in addition to the converted file. If False, write the target file only to dir_output and do not create a new directory.
  • callback (function) – A function taking two parameters: percent_complete and message.
  • time_range ([datetime.datetime, datetime.datetime]) – Lower and upper bounds for time dimension subsetting. If None, return all time points. Using this argument will overload all RequestDataset time_range values.
  • time_region (dict) – A dictionary with keys of ‘month’ and/or ‘year’ and values as sequences corresponding to target month and/or year values. Empty region selection for a key may be set to None. Using this argument will overload all RequestDataset time_region values.
  • time_subset_func (FunctionType) – See ocgis.interface.base.dimension.temporal.TemporalDimension.get_subset_by_function() for usage instructions.
  • level_range ([int/float, int/float]) – Lower and upper bounds for level dimension subsetting. If None, return all levels. Using this argument will overload all RequestDataset level_range values.
  • conform_units_to (str or cf_units.Unit) – Destination units for conversion. If this parameter is set, then the cf_units module must be installed. Setting this parameter will override conformed units set on dataset objects.
  • select_nearest (bool) – If True, the nearest geometry to the centroid of the current selection geometry is returned. This is useful when subsetting by a point, and it is preferred to not return all geometries within the selection radius.
  • regrid_destination (str | ocgis.Field) – If provided, regrid dataset objects using ESMPy to this destination grid. If a string is provided, then the ocgis.RequestDataset with the corresponding name will be selected as the destination. Please see ESMPy Regridding for an overview.
  • regrid_options (dict) – Overload the default keywords for regridding. Dictionary elements must map to the names of keyword arguments for iter_regridded_fields(). If this is left as None, then the default keyword values are used. Please see ESMPy Regridding for an overview.
  • melted (bool) – If None, default to ocgis.env.MELTED. If False (the default), variable names are individual columns in tabular output formats (i.e. 'csv'). If True, all variable values will be collected under a single value column.
  • output_format_options (dict) – A dictionary of output-specific format options.
  • spatial_wrapping (str) – If "wrap" or "unwrap", wrap or unwrap the spatial coordinates if the associated coordinate system is a wrappable coordinate system like spherical latitude/longitude.
  • spatial_reorder (bool) – If True, reorder wrapped coordinates such that the longitude values are in ascending order. Reordering assumes the first row of longitude coordinates are representative of the other longitude coordinate rows. Bounds and corners will be removed in the event of a reorder. Only applies to spherical coordinate systems.
  • optimized_bbox_subset (bool) – If True, perform only the bounding box subset, ignoring other subsetting procedures such as spatial operations on geometry objects using a spatial index.
execute()[source]

Execute the request using the selected backend.

Return type:Path to an output file/folder or dictionary composed of ocgis.driver.collection.AbstractCollection objects.
get_base_request_size()[source]

Return the estimated request size in kilobytes. This is the estimated size of the requested data, not the returned data product.

Returns:Dictionary containing sizes of variables. Format is: dict['field'][<field name>][<variable name>].
Return type:dict
>>> ops = OcgOperations(...)
>>> ret = ops.get_base_request_size()
{'field': {'tas': {u'height': {'dtype': dtype('float64'),
                               'kb': 0.0,
                               'shape': ()},
           u'lat': {'dtype': dtype('float64'),
                    'kb': 0.5,
                    'shape': (64,)},
           u'lat_bnds': {'dtype': dtype('float64'),
                         'kb': 1.0,
                         'shape': (64, 2)},
           'latitude_longitude': {'dtype': None,
                                  'kb': 0.0,
                                  'shape': (0,)},
           u'lon': {'dtype': dtype('float64'),
                    'kb': 1.0,
                    'shape': (128,)},
           u'lon_bnds': {'dtype': dtype('float64'),
                         'kb': 2.0,
                         'shape': (128, 2)},
           'tas': {'dtype': dtype('float32'),
                   'kb': 116800.0,
                   'shape': (3650, 64, 128)},
           u'time': {'dtype': dtype('float64'),
                     'kb': 28.515625,
                     'shape': (3650,)},
           u'time_bnds': {'dtype': dtype('float64'),
                          'kb': 57.03125,
                          'shape': (3650, 2)}}},
 'total': 116890.046875}
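The 'kb' values in this output are consistent with multiplying the element count by the bytes per element and dividing by 1024. The sketch below checks that arithmetic for a few entries. It is not the library's implementation: the helper estimate_kb and the ITEMSIZE table are hypothetical. The scalar height entry (shape ()) is reported as 0.0, a case this sketch does not reproduce.

```python
from math import prod

# Bytes per element for the data types appearing in the output above.
ITEMSIZE = {'float32': 4, 'float64': 8}


def estimate_kb(shape, dtype):
    """Estimate the size in kilobytes of an array with `shape` and `dtype`."""
    return prod(shape) * ITEMSIZE[dtype] / 1024.0


tas_kb = estimate_kb((3650, 64, 128), 'float32')
time_kb = estimate_kb((3650,), 'float64')
lon_bnds_kb = estimate_kb((128, 2), 'float64')
```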

Data Interface

Dimension

class ocgis.Dimension(name, size=None, size_current=None, src_idx=None, dist=False, is_empty=False, source_name=-999, aliases=None, uid=None)[source]

Bases: ocgis.base.AbstractNamedObject

A dimension tracks the count of elements along an axis in a multi-dimensional array. All Variable objects use dimensions. Dimensions are used to track global and local bounds when running in parallel. They also track source indices, allowing data to be sliced without loading from source. See https://en.wikipedia.org/wiki/Dimension for an overview of dimensions.

Parameters:
  • name (str) – The dimension’s name.
  • size (int) – The dimension’s size. Set to None if this dimension is unlimited.
  • size_current – The dimension’s current size. The current size is needed to track sizes if the dimension is unlimited.
  • src_idx (numpy.ndarray | str) – A one-dimensional, integer array containing the source indices for a “dimensioned” element. If 'auto', generate the index array automatically from the dimension size (not applicable for unlimited dimensions).
  • dist (bool) – If True, this dimension is distributed. Used by ocgis.OcgDist to generate the parallel distribution.
  • is_empty (bool) – If True, the dimension is empty on the current rank.
  • source_name – See AbstractNamedObject.
  • aliases – See AbstractNamedObject.
  • uid – See AbstractNamedObject.

Example Code:

>>> # Create standard dimension.
>>> dim = Dimension('the_dim', size=5)
>>> assert dim.size == 5 and len(dim) == 5
>>> # Create an unlimited dimension.
>>> udim = Dimension('unlimited_dim', size_current=10)
>>> assert udim.size is None and len(udim) == 10 and udim.size_current == 10
__eq__(other)[source]

x.__eq__(y) <==> x==y

__getitem__(slc)[source]

Dimensions may be sliced like other sliceable Python objects. A shallow copy of the dimension is created before slicing. Use get_distributed_slice() for parallel slicing.

Parameters:slc – A slice-like object.
Return type:Dimension
Raises:IndexError
>>> dim = Dimension('five', 5)
>>> sub = dim[2:4]
>>> assert len(sub) == 2
>>> assert id(dim) != id(sub)
__init__(name, size=None, size_current=None, src_idx=None, dist=False, is_empty=False, source_name=-999, aliases=None, uid=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

__ne__(other)[source]

x.__ne__(y) <==> x!=y

__repr__() <==> repr(x)[source]
bounds_global

Get or set the global bounds for the dimension across all non-empty ranks.

Index Description
0 The lower bound.
1 The upper bound.
Return type:tuple(int, int)
bounds_local

Get or set the rank-local bounds for the dimension.

Index Description
0 The lower bound.
1 The upper bound.
Return type:tuple(int, int)
convert_to_empty()[source]

Convert the dimension to an empty dimension.

Raises:ValueError
eq(other, check_src_idx=True)[source]

Return True if dimensions are equal.

Parameters:
  • other (Dimension) – The other dimension.
  • check_src_idx (bool) – If True, assert dimension source indices are equal.
Return type:bool

get_distributed_slice(slc)[source]

Slice the dimension in parallel. The sliced dimension object is a shallow copy. The returned dimension may be empty.

Parameters:slc – A slice-like object or a fancy slice. If this is a fancy slice, slc must be processor-local. If the fancy slice uses integer indices, the indices must be local. In other words, a fancy slc is not manipulated or redistributed prior to slicing.
Return type:Dimension
Raises:EmptyObjectError
is_empty
Returns:True if the dimension is empty. Allows for the creation of empty objects.
Return type:bool
is_unlimited
Returns:True if the dimension is unlimited.
Return type:bool
set_size(value, src_idx=None)[source]

Set the dimension’s size.

Parameters:
  • value (int | None) – The new size of the dimension. Allows setting to None to convert to an unlimited dimension.
  • src_idx – The new source index. If 'auto', create a default source index.
Raises:ValueError

size
Returns:The dimension’s size.
Return type:int | None if unlimited
size_current
Returns:The current size of the dimension. Needed to track sizes for unlimited dimensions.
Return type:int
size_global
Returns:The global size of the dimension.
Return type:int
to_xarray()[source]

Convert this object to a type understood by xarray.

Return type:str

Variable

class ocgis.Variable(name=None, value=None, dimensions=None, dtype='auto', mask=None, attrs=None, fill_value='auto', units='auto', parent=None, bounds=None, is_empty=None, source_name=-999, uid=None, repeat_record=None)[source]

Bases: ocgis.variable.base.AbstractContainer, ocgis.variable.attributes.Attributes

A variable contains data values. They may be masked and have attributes.

Parameters:
  • name (str) – The variable’s name (required).
  • value (numpy.ndarray | numpy.ma.MaskedArray | sequence) – The variable’s data.
  • dimensions (sequence of Dimension | str) – Dimensions for value. The number of dimensions must match the dimension count of value (if provided). None is allowed for scalar or attribute container variables.
  • dtype – The variable’s data type. If 'auto', the data type will match the data type of value. If the data type does not match value’s data type, value will be converted to match.
  • mask (numpy.ndarray | sequence) – The variable’s mask. If None and value is a numpy.ma.MaskedArray, then the mask is pulled from value. Shape must be the same as value. Data type is cast to bool.
  • attrs – See Attributes.
  • fill_value (int) – The fill value to use when hardening the mask. If 'auto', this will be determined automatically from a masked array or the data type.
  • units (str) – Units for the variable’s data. If 'auto', attempt to pull units from the variable’s attrs.
  • parent – See AbstractContainer.
  • bounds (Variable) – Bounds for the variable’s data value. Mostly applicable for coordinate-type variables.
  • is_empty – If True, the variable is empty and has no value, mask, or meaning.
  • source_name – See AbstractNamedObject.
  • uid – See AbstractNamedObject.
  • repeat_record (sequence) – A value to repeat when the variable’s iter() method is called.
>>> repeat_record = [('i am', 'a repeater'), ('this is my value', 5)]

Example Code:

>>> # Create simple variable with a single dimension.
>>> var = Variable(name='data', value=[1, 2, 3], dtype=float, dimensions='three')
>>> assert var.dimensions[0].name == 'three'
>>> # Create a variable using dimension objects.
>>> from ocgis import Dimension
>>> dim1 = Dimension('three', 3)
>>> dim2 = Dimension('five', 5)
>>> var = Variable(name='two_d', dimensions=[dim1, dim2], fill_value=4, dtype=int)
>>> assert var.get_value().mean() == 4
__init__(name=None, value=None, dimensions=None, dtype='auto', mask=None, attrs=None, fill_value='auto', units='auto', parent=None, bounds=None, is_empty=None, source_name=-999, uid=None, repeat_record=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

__setitem__(slc, variable)[source]

Set this variable’s value and mask to the other variable’s value and mask in the index space defined by slc.

Parameters:
  • slc ((slice-like, …) | dict(<str>=<slice-like>, …)) – The index space for setting data from variable. If slc is a sequence, it must have the same length as the target variable’s dimension count. If slc is a dictionary, there must be a key for each dimension name.
  • variable (Variable) – The variable to use for setting values in the target.
allocate_value(fill=None)[source]

Allocate the value for the variable.

Parameters:fill – If None, use fill_value.
bounds
Returns:A bounds variable or None
Return type:Variable | None
cfunits
Returns:The CF units object representation.
Return type:cf_units.Unit
cfunits_conform(to_units, from_units=None)[source]

Conform value units in-place. If there are scale or offset parameters in the attribute dictionary, they will be removed.

Parameters:
  • to_units (str or units object) – Target conform units.
  • from_units (str or units object) – Overload source units.
Raises:NoUnitsError

convert_to_empty()[source]

Convert this variable to an empty variable. This sets the value and mask to None. Also sets the parent to empty.

copy()[source]
Returns:A shallow copy of the variable.
Return type:Variable
create_ugid(name, start=1, is_current=True, **kwargs)[source]

Create a unique identifier variable for the variable. The returned variable will have the same dimensions.

Parameters:
  • name (str) – The name for the new global identifier variable.
  • start (int) – Starting value for the unique identifier.
  • is_current (bool) – If True (the default) set this variable using set_ugid().
  • kwargs (dict) – Additional arguments to variable creation.
Return type:Variable

create_ugid_global(name, start=1)[source]

Same as create_ugid() but collective across the current OcgVM.

Raises:EmptyObjectError
deepcopy(eager=False)[source]
Parameters:eager (bool) – If True, also deep copy the variable’s parent.
Returns:A deep copy of the variable.
Return type:Variable
dimension_names
Returns:A sequence of dimension names instead of objects.
Return type:tuple of str
dimensions_dict
Returns:Dimensions as a dictionary. Keys are the dimension names. Values are the dimension objects.
Return type:OrderedDict
dist
Returns:True if the variable has a distributed dimension.
Return type:bool
dtype

Get or set the variable’s data type. If 'auto', this will be chosen automatically from the variable’s numpy data type. Setting does not do any type conversion.

Returns:The data type for variable.
Return type:type
extent
Returns:The extent of the variable’s masked value (minimum, maximum). Not applicable for all data types.
Return type:tuple
Raises:EmptyObjectError
extract(keep_bounds=True, clean_break=False)[source]

Extract the variable from its collection.

Parameters:
  • keep_bounds (bool) – If True, maintain any bounds associated with the target variable.
  • clean_break (bool) – If True, remove the target from the containing collection entirely.
Return type:Variable

fill_value
Returns:The variable’s fill value. If 'auto', determine this automatically from numpy.
Return type:int or float
get_between(lower, upper, return_indices=False, closed=False, use_bounds=True)[source]
Parameters:
  • lower – The lower value.
  • upper – The upper value.
  • return_indices (bool) – If True, also return the indices used to slice the variable.
  • closed (bool) – If False (the default), include the endpoints in the comparison (>=, <=). If True, exclude the endpoints (>, <).
  • use_bounds (bool) – If True, use the bounds values for the between operation.
Returns:A sliced variable.
Return type:Variable
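The effect of the closed flag in get_between() can be sketched with plain Python lists. This is an illustration only: select_between is a hypothetical helper, it ignores bounds (use_bounds), and it returns indices instead of a sliced variable. The comparison operators are taken from the closed parameter description.

```python
def select_between(values, lower, upper, closed=False):
    """Return indices of `values` between `lower` and `upper`.

    The comparison operators follow the `closed` description:
    False uses (>=, <=) and True uses (>, <).
    """
    if closed:
        return [i for i, v in enumerate(values) if lower < v < upper]
    return [i for i, v in enumerate(values) if lower <= v <= upper]


values = [10.0, 20.0, 30.0, 40.0]
with_endpoints = select_between(values, 20.0, 40.0)
without_endpoints = select_between(values, 20.0, 40.0, closed=True)
```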

get_distributed_slice(slc)[source]

Slice a distributed variable. Returned variable may be empty.

Parameters:slc – The slice indices. The length of slc must match the number of variable dimensions.
Return type:Variable
get_iter(**kwargs)[source]
Parameters:kwargs – See source.
Returns:A variable iterator object.
Return type:Iterator
get_mask(create=False, check_value=False, eager=True)[source]
Parameters:
  • create (bool) – If True, create the mask if it does not exist.
  • check_value – If True, check the variable’s value for values matching fill_value. Matching indices are set to True in the created mask.
Returns:An array of bool data type with shape matching shape.
Return type:numpy.ndarray
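The check_value behavior described for get_mask() amounts to comparing each element against the fill value. A minimal sketch using plain lists follows; mask_from_fill_value is a hypothetical helper, and the fill value 1e20 is chosen only for illustration.

```python
def mask_from_fill_value(values, fill_value):
    """Return a boolean mask that is True where a value equals `fill_value`."""
    return [value == fill_value for value in values]


values = [1.0, 1e20, 3.0, 1e20]
mask = mask_from_fill_value(values, fill_value=1e20)
```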

get_masked_value()[source]

Return the variable’s value as a masked array.

Return type:numpy.ma.MaskedArray
get_report()[source]
Returns:A sequence of strings suitable for printing.
Return type:list[str, ..]
get_value()[source]
Returns:The data values associated with the variable.
Return type:numpy.ndarray
has_allocated_mask
Returns:True if the mask is allocated.
Return type:bool
has_allocated_value
Returns:True if the value is allocated.
Return type:bool
has_bounds
Returns:True if the variable has bounds.
Return type:bool
has_dimensions
Returns:True if the variable has dimensions.
Return type:bool
has_mask
Returns:True if the variable has a mask.
Return type:bool
has_masked_values
Returns:True if any values are masked.
Return type:bool
is_empty
Returns:is_empty set at initialization or True if any dimensions are empty and is_empty=None at initialization.
Return type:bool
is_orphaned

Variables are always part of collections. “Orphaning” is used to isolate variables to avoid infinite recursion when operating on variable collections.

Returns:True if the variable has no parent.
Return type:bool
is_string_object

Return True if the variable contains string data.

iter_dict_slices(dimensions=None)[source]
Parameters:dimensions (sequence) – Dimensions to iterate.
Returns:Yields dictionary slices in the form {‘<dimension name>’: <integer index>}.
Return type:dict
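The dictionary slices yielded by iter_dict_slices() can be pictured with the standalone sketch below. It is not the OCGIS implementation: iter_index_dicts is a hypothetical helper that takes a mapping of dimension names to sizes, and the iteration order (last dimension varying fastest) is an assumption.

```python
from itertools import product


def iter_index_dicts(dimension_sizes):
    """Yield one {name: index} dictionary per index combination.

    `dimension_sizes` maps dimension names to sizes in iteration order.
    """
    names = list(dimension_sizes)
    ranges = [range(dimension_sizes[name]) for name in names]
    for indices in product(*ranges):
        yield dict(zip(names, indices))


slices = list(iter_index_dicts({'time': 2, 'level': 3}))
```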
join_string_value()[source]

Join well-formed string values.

load(*args, **kwargs)[source]

Allows variables to be fake-loaded in the case of mixed pure variables and sourced variables. The actual implementation is in SourcedVariable.

m(*args, **kwargs)[source]

See ocgis.Variable.get_mask()

mv(*args, **kwargs)[source]

See ocgis.Variable.get_masked_value()

ndim
Returns:The dimension count for the variable.
Return type:int
remove_value()[source]

Remove the value on the variable. The variable’s value will be re-allocated when it is retrieved again.

resolution

Resolution is computed using the differences between successive values up to ocgis.constants.RESOLUTION_LIMIT. Applicable mostly for spatial coordinate variables.

Return type:float or int
Raises:ResolutionError
set_bounds(value, force=False, clobber_units=None)[source]

Set the bounds variable.

Parameters:
  • value (Variable) – The variable containing bounds for the target.
  • force (bool) – If True, clobber the bounds if they exist in parent.
  • clobber_units (bool) – If True, clobber value.units to match self.units. If None, default to ocgis.env.CLOBBER_UNITS_ON_BOUNDS
set_dimensions(dimensions, force=False)[source]

Set dimensions for the variable. These may be set to None.

Parameters:
  • dimensions (sequence of Dimension) – The new dimensions. Should be congruent with the target variable.
  • force (bool) – If True, clobber any existing dimensions on parent
set_extrapolated_bounds(name_variable, name_dimension)[source]

Set the bounds variable using extrapolation.

Parameters:
  • name_variable (str) – Name of the bounds variable.
  • name_dimension (str) – Name for the bounds dimension.
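One common way to extrapolate bounds from coordinate centers is to place interior edges at the midpoints between neighbors and mirror the half-width at each end. The sketch below shows that approach. It is an assumption for illustration and is not confirmed to be the method used by set_extrapolated_bounds(); the helper extrapolate_bounds is hypothetical.

```python
def extrapolate_bounds(centers):
    """Return [lower, upper] pairs for a one-dimensional list of centers.

    Interior edges are midpoints between neighboring centers; the two outer
    edges are extrapolated by mirroring the nearest interior half-width.
    """
    if len(centers) < 2:
        raise ValueError('At least two centers are required.')
    mids = [(a + b) / 2.0 for a, b in zip(centers[:-1], centers[1:])]
    first = centers[0] - (mids[0] - centers[0])
    last = centers[-1] + (centers[-1] - mids[-1])
    edges = [first] + mids + [last]
    return [[lo, hi] for lo, hi in zip(edges[:-1], edges[1:])]


bounds = extrapolate_bounds([0.5, 1.5, 2.5])
```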
set_mask(mask, cascade=False, update=False)[source]

Set the variable’s mask.

Parameters:
  • mask (numpy.ndarray | sequence) – A boolean array with shape matching shape.
  • cascade (bool) – If True, set the mask on variables in parent to match this mask. Only sets the masks along shared dimensions.
  • update (bool) – If True, update the existing mask using a logical or operation.
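The difference between replacing a mask and updating it with a logical or operation can be sketched with plain Python lists. The helper update_mask is hypothetical and ignores the cascade option.

```python
def update_mask(existing, new, update=False):
    """Return the mask that results from setting `new` over `existing`."""
    if len(existing) != len(new):
        raise ValueError('Mask shapes must match.')
    if update:
        # Keep anything already masked and add the newly masked elements.
        return [bool(e) or bool(n) for e, n in zip(existing, new)]
    return [bool(n) for n in new]


existing = [True, False, False]
new = [False, False, True]
replaced = update_mask(existing, new)
merged = update_mask(existing, new, update=True)
```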
set_string_max_length_global(value=None)[source]

See string_max_length_global.

Call is collective across the current OcgVM.

set_ugid(variable, attr_link_name=None)[source]

Set the unique identifier for the variable.

Parameters:
  • variable (Variable | None) – The unique identifier variable.
  • attr_link_name (str) – If provided, set an attribute with this name on the current variable with a value of variable’s name.
set_value(value, update_mask=False)[source]

Set the variable value.

Parameters:
  • value (numpy.ndarray | sequence) – The variable’s new value.
  • update_mask – See set_mask().
shape
Returns:Shape of the variable.
Return type:tuple
size
Returns:Size of the variable (count of its elements)
Return type:int
string_max_length_global

Get the max string length. This only returns the private value. Call set_string_max_length_global() to initialize the private value.

This is the maximum length of the strings contained in the object across the current ocgis.OcgVM.

Returns:int
to_xarray()[source]

Convert the variable to an xarray.DataArray. This does not traverse the parent’s hierarchy. Use the conversion method on the variable’s parent to convert all variables in the collection.

Return type:xarray.DataArray
ugid
Returns:unique identifier variable
Return type:Variable
units

Get or set the units.

Returns:Units for the object.
Return type:str
v()[source]

See ocgis.Variable.get_value()

write(*args, **kwargs)[source]

Write the variable object using the provided driver.

Parameters:
  • driver – (='netcdf-cf') The driver for variable writing. Not all drivers support writing single variables.
  • args – Arguments to the driver’s write_variable call.
  • kwargs – Keyword arguments to driver’s write_variable call.
class ocgis.SourcedVariable(*args, **kwargs)[source]

Bases: ocgis.variable.base.Variable

__init__(*args, **kwargs)[source]

Like a variable but loads its value and metadata from a source request dataset. Full variable functionality is maintained for convenience. Generally, it is a good idea to only provide name and request_dataset to avoid conflicts.

Note

Accepts all parameters to Variable.

Additional arguments and/or keyword arguments are:

Parameters:
  • request_dataset (ocgis.RequestDataset) – (=None) The request dataset containing the variable’s source information.
  • protected (bool) – (=False) If True, attempting to access the variable’s value from source will raise an ocgis.exc.PayloadProtectedError exception. Set <object>.payload = False to disable this. Useful to ensure the variable’s payload data is untouched through a series of operations.
  • should_init_from_source (bool) – (=True) Allows a sourced variable to ignore any from-file operations and behave as a normal variable. This is used by some subclasses.
get_iter(**kwargs)[source]
Parameters:kwargs – See source.
Returns:A variable iterator object.
Return type:Iterator
get_mask(*args, **kwargs)[source]
Parameters:
  • create (bool) – If True, create the mask if it does not exist.
  • check_value – If True, check the variable’s value for values matching fill_value. Matching indices are set to True in the created mask.
Returns:

An array of bool data type with shape matching shape.

Return type:

numpy.ndarray
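
The check_value behavior amounts to an element-wise comparison against the fill value. The sketch below illustrates that idea with plain lists; mask_from_fill_value is a hypothetical helper and the fill value of 1e20 is an assumption chosen for the example, not a library default.

```python
def mask_from_fill_value(values, fill_value):
    """Return a boolean mask that is True wherever ``values`` equals ``fill_value``.

    Hypothetical helper; a flat list stands in for the variable's array.
    """
    return [value == fill_value for value in values]


# 1e20 is an assumed fill value used only for this illustration.
values = [1.0, 1e20, 3.0, 1e20]
created_mask = mask_from_fill_value(values, 1e20)
```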

load(cascade=False)[source]

Load all variable data from source.

Parameters:cascade (bool) – If False, only load this variable’s data from source. If True, load all data from source including any variables on its parent object.

CoordinateReferenceSystem

class ocgis.variable.crs.CoordinateReferenceSystem(value=None, proj4=None, epsg=None, name='ocgis_coordinate_system')[source]

Bases: ocgis.variable.crs.AbstractProj4CRS, ocgis.base.AbstractInterfaceObject

Defines a coordinate system object. One of value, proj4, or epsg is required.

Parameters:
  • value (dict) – A dictionary representation of the coordinate system with PROJ.4 parameters as keys.
  • proj4 (str) – A PROJ.4 string.
  • epsg (int) – An EPSG code.
  • name (str) – A custom name for the coordinate system.
__eq__(other)[source]

x.__eq__(y) <==> x==y

__init__(value=None, proj4=None, epsg=None, name='ocgis_coordinate_system')[source]

x.__init__(…) initializes x; see help(type(x)) for signature

__ne__(other)[source]

x.__ne__(y) <==> x!=y

load(*args, **kwargs)[source]

Compatibility with variable.

write_to_rootgrp(rootgrp, with_proj4=True, **kwargs)[source]

Write the coordinate system to an open netCDF file.

Parameters:
  • rootgrp (netCDF4.Dataset) – An open netCDF dataset object for writing.
  • with_proj4 (bool) – If True, write the PROJ.4 string to the coordinate system variable in an attribute called “proj4”.
Returns:

The netCDF variable object created to hold the coordinate system metadata.

Return type:

netCDF4.Variable

CRS

class ocgis.variable.crs.CRS(value=None, proj4=None, epsg=None, name='ocgis_coordinate_system')[source]

Here for convenience.

VariableCollection

class ocgis.VariableCollection(name=None, variables=None, attrs=None, parent=None, children=None, aliases=None, tags=None, source_name=-999, uid=None, is_empty=None, force=False, groups=None, initial_data=None, dimensions=None)[source]

Bases: ocgis.collection.base.AbstractCollection, ocgis.variable.base.AbstractContainer, ocgis.variable.attributes.Attributes

Variable collections behave like Python dictionaries. The keys are variable names and values are variable objects. A variable collection may have a parent and children (groups). Variable collections may be sliced using a dictionary.

Parameters:
  • name (str) – The collection’s name.
  • variables (sequence of Variable) – Initial set of variables used to initialize the collection.
  • attrs – See Attributes.
  • parent (VariableCollection) – The parent collection.
  • children (dict) – A dictionary of child variable collections.
>>> child_vc = VariableCollection()
>>> children = {'child1': child_vc}
Parameters:
  • aliases – See AbstractNamedObject.
  • tags (dict) – Tags are used to group variables (data variables for example).
>>> tags = {'special_variables': ['teddy', 'unicorn']}
Parameters:
  • source_name – See AbstractNamedObject.
  • uid – See AbstractNamedObject.
  • is_empty – If True, this is an empty collection.
  • force (bool) – If True, clobber any names that already exist in the collection.
  • initial_data – See ocgis.collection.base.AbstractCollection.
  • groups – Alias for children.
  • dimensions (dict) – A dictionary of dimension objects. The keys are the dimension names. The values are Dimension objects.
__getitem__(item_or_slc)[source]
Parameters:item_or_slc (str | dict) – The string name of the variable to retrieve or a dictionary slice. A dictionary slice has dimension names for keys and the slice as values. A shallow copy of the variable collection is returned in the case of a slice.
Returns:Variable | VariableCollection
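
A dictionary slice can be thought of as a per-dimension lookup that is expanded into a positional slice for each variable. The following is a rough sketch of that expansion, not the library's code: expand_dict_slice is a hypothetical helper, and leaving dimensions that are absent from the dictionary unsliced is an assumption.

```python
def expand_dict_slice(dimension_names, dslice):
    """Build a positional slice tuple for one variable from a dictionary slice.

    ``dimension_names`` is the variable's ordered dimension names. Dimensions
    missing from ``dslice`` are left unsliced (an assumption of this sketch).
    """
    return tuple(dslice.get(name, slice(None)) for name in dimension_names)


# Dimension names are keys, slices are values.
dslice = {'time': slice(0, 2), 'x': slice(1, 3)}
tas_slice = expand_dict_slice(('time', 'y', 'x'), dslice)
lat_slice = expand_dict_slice(('y',), dslice)
```

A variable that shares no dimension with the dictionary, such as the one-dimensional 'y' coordinate here, is returned whole.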
__init__(name=None, variables=None, attrs=None, parent=None, children=None, aliases=None, tags=None, source_name=-999, uid=None, is_empty=None, force=False, groups=None, initial_data=None, dimensions=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

add_child(child, force=False)[source]

Add a child variable collection to the current variable collection.

Parameters:
  • child (VariableCollection) – Child variable collection to add.
  • force (bool) – If True, clobber any existing children with the same name.
Raises:

ValueError

add_dimension(dimension, force=False, check_src_idx=True)[source]

Add a dimension to the variable collection.

Parameters:
  • dimension (Dimension) – The dimension to add. Will raise an exception if the dimension name is found in the collection and the dimensions are not equal.
  • force (bool) – If True, clobber any dimensions with the same name.
  • check_src_idx (bool) – If True, assert dimension source indices are equal. Raise a dimension mismatch error if they are not.
Raises:

DimensionMismatchError

add_group(*args, **kwargs)[source]

Alias for add_child().

add_variable(variable, force=False)[source]

Add a variable to the variable collection.

Parameters:
  • variable (Variable) – The variable to add.
  • force (bool) – If True, clobber any variables in the collection with the same name.
Raises:

VariableInCollectionError

append_to_tags(tag, to_append, create=True)[source]

Append a variable name to a tag.

Parameters:
  • tag (str) – The tag name.
  • to_append (str | Variable) – The variable or variable name to append to the tag.
  • create (bool) – If True, create the tag if it does not exist.
Raises:

ValueError

convert_to_empty()[source]

Convert the variable collection to an empty collection. This will convert every variable to empty.

copy()[source]
Returns:A shallow copy of the variable collection. Member variables and dimensions are also shallow copied.
Return type:VariableCollection
create_tag(tag)[source]

Create a tag.

Parameters:tag (str) – The tag name.
Raises:ValueError
find_by_attribute(key=None, value=None, pred=None)[source]

Find a variable by searching attributes.

Parameters:
  • key (str) – The attribute key. If None, check all attribute values.
  • value (<varying>) – The value to match. Takes precedence over pred.
  • pred (function) – A function accepting the attribute value associated with key. If pred returns True, the variable matches.
Return type:

tuple of Variable

Raises:

ValueError
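
The interplay of key, value, and pred may be easier to follow in code. This sketch mimics the documented matching rules on a plain dictionary of attribute dictionaries; match_by_attribute is a hypothetical stand-in that returns names rather than Variable objects, and raising ValueError when neither value nor pred is given is an assumption.

```python
def match_by_attribute(collection, key=None, value=None, pred=None):
    """Return names of entries whose attributes match ``value`` or ``pred``.

    ``collection`` maps a variable name to its attribute dictionary.
    """
    if value is None and pred is None:
        # Assumption: with nothing to match against, the search is an error.
        raise ValueError('provide either value or pred')
    if value is not None:
        # value takes precedence over pred.
        def is_match(candidate):
            return candidate == value
    else:
        is_match = pred
    matches = []
    for name, attrs in collection.items():
        if key is None:
            # No key given: check every attribute value.
            candidates = list(attrs.values())
        else:
            candidates = [attrs[key]] if key in attrs else []
        if any(is_match(candidate) for candidate in candidates):
            matches.append(name)
    return tuple(matches)


collection = {
    'time': {'axis': 'T', 'units': 'days since 2000-01-01'},
    'tas': {'units': 'K'},
    'tasmax': {'units': 'K'},
}
by_value = match_by_attribute(collection, key='units', value='K')
by_pred = match_by_attribute(collection, key='units',
                             pred=lambda units: units.startswith('days'))
any_key = match_by_attribute(collection, value='T')
```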

get_by_tag(tag, create=False, strict=False, names_only=False)[source]

Tuple of variable objects that have the tag.

Parameters:
  • tag (str) – The tag to retrieve.
  • create (bool) – If True, create the tag if it does not exist.
  • strict (bool) – If True, raise exception if variable name is not found in collection.
  • names_only (bool) – If True, return names and not variable objects.
Return type:

tuple(ocgis.Variable, …)
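
Tags are a mapping from a tag name to a list of variable names, so a tag lookup reduces to resolving those names against the collection. The sketch below illustrates strict and names_only under stated assumptions: lookup_tag is a hypothetical helper, strings stand in for variable objects, skipping missing names when strict is False is assumed, and KeyError is a placeholder for whatever exception the library raises.

```python
def lookup_tag(variables, tags, tag, strict=False, names_only=False):
    """Resolve the names stored under ``tag`` against ``variables``."""
    names = tags[tag]
    if names_only:
        return tuple(names)
    found = []
    for name in names:
        if name in variables:
            found.append(variables[name])
        elif strict:
            # Placeholder exception type; the library's choice may differ.
            raise KeyError(name)
    return tuple(found)


# Strings stand in for variable objects; 'unicorn' is tagged but absent.
variables = {'teddy': 'Variable(teddy)'}
tags = {'special_variables': ['teddy', 'unicorn']}
names = lookup_tag(variables, tags, 'special_variables', names_only=True)
objects = lookup_tag(variables, tags, 'special_variables')
```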

get_mask(*args, **kwargs)[source]
Returns:The object’s mask as a boolean array with same dimension as the object.
Return type:numpy.ndarray
groups

Alias for children.

groups_to_variable(**kwargs)[source]

Convert the group identifier to a variable using the variable creation keyword arguments kwargs. The dimension of the new variable will be used to stack other variables sharing that dimension along that dimension.

Parameters:kwargs – See ocgis.Variable
Return type:ocgis.VariableCollection
iter(**kwargs)[source]
Returns:Yield record dictionaries for variables in the collection.
Return type:dict
iter_variables_by_dimensions(dimensions)[source]
Parameters:dimensions (sequence of str | sequence of Dimension) – Dimensions required to select a variable.
Returns:Yield variables sharing dimensions.
Return type:Variable
load()[source]

Load all variable values (payloads) from source. Here for compatibility with sourced variables.

static read(*args, **kwargs)[source]

Read a variable collection from a request dataset.

Parameters:
Return type:

VariableCollection

remove_orphaned_dimensions(dimensions=None)[source]

Remove dimensions from the collection that are not associated with a variable in the current collection.

Parameters:dimensions – A sequence of Dimension objects or string dimension names to check. If None, check all dimensions in the collection.
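
An orphaned dimension is one that no variable in the collection references. A minimal sketch of that check on plain names follows; find_orphaned_dimensions is a hypothetical helper and does not reproduce the library's handling of Dimension objects.

```python
def find_orphaned_dimensions(dimension_names, variable_dimensions, check=None):
    """Return dimension names that no variable uses.

    ``variable_dimensions`` maps a variable name to its dimension names.
    If ``check`` is None, every name in ``dimension_names`` is examined.
    """
    used = set()
    for dims in variable_dimensions.values():
        used.update(dims)
    candidates = dimension_names if check is None else check
    return [name for name in candidates if name not in used]


dimension_names = ['time', 'y', 'x', 'bnds']
variable_dimensions = {'tas': ('time', 'y', 'x')}
orphans = find_orphaned_dimensions(dimension_names, variable_dimensions)
checked = find_orphaned_dimensions(dimension_names, variable_dimensions,
                                   check=['time'])
```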
remove_variable(variable, remove_bounds=True)[source]

Remove a variable from the collection. This removes the variable’s bounds by default. Any orphaned dimensions are removed from the collection following variable removal.

Parameters:
  • variable (Variable | str) – The variable or variable name to remove from the collection.
  • remove_bounds (bool) – If True (the default), remove the variable’s bounds from the collection.
rename_dimension(old_name, new_name)[source]

Rename a dimension on the variable collection in-place.

Parameters:
  • old_name (str) – The dimension’s original name.
  • new_name (str) – The dimension’s new name.
set_mask(variable, exclude=None, update=False)[source]

Set all variable masks to the mask on variable. See set_mask() for a description of how this works on variables.

Parameters:
  • variable (Variable) – The variable having the source mask.
  • exclude (sequence of Variable | sequence of str) – Variables to exclude from mask setting.
  • update – See set_mask().
shapes
Returns:A dictionary of variable shapes.
Return type:OrderedDict
strip()[source]

Remove dimensions, variables, and children from the collection.

to_xarray(**kwargs)[source]

Convert all the variables in the collection to an xarray.Dataset.

Parameters:kwargs – Optional keyword arguments to pass to the dataset creation. data_vars and attrs are always overloaded by this method.
Return type:xarray.Dataset
write(*args, **kwargs)[source]

Write the variable collection to file.

Parameters:
  • driver – (=ocgis.constants.DriverKey.NETCDF) The driver to use for writing.
  • args – Arguments to the driver’s write_variable_collection() method.
  • kwargs – Keyword arguments to the driver’s write_variable_collection() method.

SpatialCollection

class ocgis.SpatialCollection(name=None, variables=None, attrs=None, parent=None, children=None, aliases=None, tags=None, source_name=-999, uid=None, is_empty=None, force=False, groups=None, initial_data=None, dimensions=None)[source]

Bases: ocgis.variable.base.VariableCollection

Spatial collections use a group hierarchy to nest fields in their representative geometries. First-level groups on a spatial collection are the container fields with a single geometry. The second-level groups or grandchildren (nested under the container geometries) are fields associated with the parent container. These associations are typically defined using a spatial subset.

Spatial collections are the 'ocgis' output format. It is possible to not provide a subset geometry when a spatial collection is created. In this case, the container geometry is None, but the data is still nested.

Note

Accepts all parameters to VariableCollection.

__getitem__(item_or_slc)[source]
Parameters:item_or_slc (str | dict) – The string name of the variable to retrieve or a dictionary slice. A dictionary slice has dimension names for keys and the slice as values. A shallow copy of the variable collection is returned in the case of a slice.
Returns:Variable | VariableCollection
__repr__() <==> repr(x)[source]
add_field(field, container, force=False)[source]

Add a field to the spatial collection.

Parameters:
  • field (Field) – The field to add.
  • container (Field | None) – The container geometry. A None value is allowed.
  • force (bool) – If True, clobber any field names in the spatial collection.
Returns:

archetype_field

Return an archetype field from the spatial collection. This is the first field encountered during field iteration.

Return type:Field
crs

Return the spatial collection’s coordinate system. This is the coordinate system of the first encountered field in iteration.

Return type:AbstractCRS
geoms

Reformat container geometries into a dictionary. Keys are the child geometries’ unique identifiers. The values are Shapely geometries.

Return type:OrderedDict
get_element(field_name=None, variable_name=None, container_ugid=None)[source]

Get a field or variable from the spatial collection.

Parameters:
  • field_name (str) – The field name to get from the collection.
  • variable_name (str) – The variable name to get from the collection. If None, a field will be returned.
  • container_ugid – The container unique identifier. If None, the first container will be used.
Return type:

Field | Variable
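
The container/field nesting can be modeled as a two-level dictionary, which makes the lookup rules easier to see. This is only a sketch: pick_element is a hypothetical helper, nested dictionaries stand in for containers and fields, and unlike the real method it requires field_name.

```python
def pick_element(collection, field_name, variable_name=None, container_ugid=None):
    """Return a field, or one of its variables, from a nested dictionary.

    ``collection`` maps a container identifier to a dictionary of fields,
    and each field maps variable names to values.
    """
    if container_ugid is None:
        # Fall back to the first container, as documented for get_element().
        container_ugid = next(iter(collection))
    field = collection[container_ugid][field_name]
    if variable_name is None:
        return field
    return field[variable_name]


collection = {
    10: {'tas_field': {'tas': [280.1, 281.4]}},
    20: {'tas_field': {'tas': [275.0, 276.2]}},
}
first_field = pick_element(collection, 'tas_field')
second_tas = pick_element(collection, 'tas_field', variable_name='tas',
                          container_ugid=20)
```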

has_container_geometries

Return True if there are container geometries.

Return type:bool
iter_fields(yield_container=False)[source]

Iterate field objects in the collection.

iter_melted(tag=None)[source]

Iterate a melted dictionary containing all spatial collection elements.

properties

Reformat container geometry values into a properties dictionary.

Return type:OrderedDict

DimensionMap

class ocgis.DimensionMap[source]

Bases: ocgis.base.AbstractOcgisObject

A dimension map is used to link dimensions and variables with an explicit meaning. It is the main mapping produced by a driver and a request dataset’s metadata. Dimension maps are used by fields to construct grids and geometries, perform subsetting, link bounds to parent variables, and manage coordinate systems.

__eq__(other)[source]

x.__eq__(y) <==> x==y

__init__()[source]

x.__init__(…) initializes x; see help(type(x)) for signature

as_dict(curr=None)[source]

Convert the dimension map to a dictionary.

Return type:dict
classmethod from_dict(dct)[source]

Create a dimension map from a well-formed dictionary.

Parameters:dct (dict) – The input dimension map-like dictionary.
Return type:DimensionMap
classmethod from_metadata(driver, group_metadata, group_name=None, curr=None)[source]

Create a dimension map from source metadata.

Parameters:
  • driver (AbstractDriver) – The driver to use for metadata interpretation.
  • group_metadata (dict) – Source metadata for the target group to convert recursively.
  • group_name (str) – The current group name.
Return type:

DimensionMap

classmethod from_old_style_dimension_map(odmap)[source]

Convert an old-style dimension map (pre-v2.x) to a new-style dimension map.

Parameters:odmap (dict) – The old-style dimension map to convert.
Return type:DimensionMap
get_attrs(entry_key)[source]

Get attributes for the dimension map entry entry_key.

Parameters:entry_key (str) – See ocgis.constants.DimensionMapKey for valid entry keys.
Return type:OrderedDict
get_available_topologies()[source]

Get a tuple of available topology keys. Keys are of type ocgis.constants.Topology. The returned tuple may be of zero length if no topologies are present on the dimension map.

Return type:tuple
get_bounds(entry_key)[source]

Get the bounds variable name for the dimension map entry entry_key.

Parameters:entry_key (str) – See ocgis.constants.DimensionMapKey for valid entry keys.
Return type:str
get_crs(parent=None, nullable=False)[source]

Get the coordinate reference system variable name from the dimension map.

Return type:str | CRS
get_dimension(entry_key, dimensions=None)[source]

Get the dimension names for the dimension map entry entry_key.

Parameters:
  • entry_key (str) – See ocgis.constants.DimensionMapKey for valid entry keys.
  • dimensions (dict) – A dictionary of dimension names (keys) and objects (values). If provided, a dimension object will be returned from this dictionary if the dimension name is present on the dimension map.
Return type:

list of str

get_driver(as_class=False)[source]

Return the driver key or class associated with the dimension map.

Parameters:as_class (bool) – If True, return the driver class instead of the driver string key.
Return type:str | ocgis.driver.base.AbstractDriver
get_grid_abstraction(default='auto')[source]

Get the grid abstraction or, if absent on the dimension map, return default.

Parameters:default (ocgis.constants.GridAbstraction) – Default return value.
Return type:str | ocgis.constants.GridAbstraction
get_group(group_key)[source]

Get the dimension map for a group indexed by group_key starting from the root group.

Parameters:group_key – The group indexing key.
Return type:list of str
get_property(key, default=None)[source]

Return a dimension map property value.

Parameters:
  • key (str) – The key name.
  • default – A default value to return if the key is not present.
Return type:

<varying>

get_spatial_mask()[source]

Get the spatial mask variable name.

Return type:str
get_topology(topology, create=False)[source]

Get a child dimension map for a given topology. If create is True, the child dimension map will be created if it is not present on the dimension map. If create is False, None will be returned if the topology does not exist.

Parameters:
  • topology (ocgis.constants.Topology) – The target topology to get or create.
  • create (bool) – Flag for creation behavior if the child dimension map does not exist.
Return type:

DimensionMap | None
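
The create flag follows a common get-or-create pattern. The sketch below shows that pattern on a plain dictionary; get_child_map is a hypothetical helper, an empty dictionary stands in for a child dimension map, and the string 'polygon' is an illustrative key rather than an actual ocgis.constants.Topology member.

```python
def get_child_map(children, topology, create=False):
    """Return the child entry for ``topology``, optionally creating it."""
    if topology not in children:
        if not create:
            # Absent and creation not requested: signal with None.
            return None
        children[topology] = {}
    return children[topology]


children = {}
missing = get_child_map(children, 'polygon')
created = get_child_map(children, 'polygon', create=True)
same = get_child_map(children, 'polygon')
```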

get_variable(entry_key, parent=None, nullable=False)[source]

Get the coordinate variable name for the dimension map entry entry_key.

Parameters:
  • entry_key (str) – See ocgis.constants.DimensionMapKey for valid entry keys.
  • parent (VariableCollection) – If present, use the returned variable name to return the variable object from parent.
  • nullable (bool) – If True and parent is not None, return None if the variable is not found in parent.
Return type:

str | None

has_topology

Return True if the dimension map has topology entries.

Return type:bool
inquire_is_xyz(variable)[source]

Inquire the dimension map to identify a variable’s spatial classification.

Parameters:variable (str | Variable) – The target variable to identify.
Return type:ocgis.constants.DimensionMapKey
pprint(as_dict=False)[source]

Pretty print the dimension map.

Parameters:as_dict (bool) – If True, convert group dimension maps to dictionaries.
set_bounds(entry_key, bounds)[source]

Set the bounds variable name for entry_key.

Parameters:
set_crs(variable)[source]

Set the coordinate reference system variable name.

Parameters:variable (str | Variable)
set_group(group_key, dimension_map)[source]

Set the group dimension map for group_key.

Parameters:
set_property(key, value)[source]

Set a property on the dimension map.

Parameters:
  • key (str) – The key name.
  • value – The property’s value.
set_spatial_mask(variable, attrs=None, default_attrs=None)[source]

Set the spatial mask variable for the dimension map. If attrs is not None, attribute precedence is attrs > variable.attrs (if variable is not a string) > default attributes.

Parameters:
  • variable (Variable | str) – The spatial mask variable.
  • attrs (dict) – Attributes to associate with the spatial mask variable in addition to default attributes.
  • default_attrs (dict) – If provided, use these attributes as default spatial mask attributes.
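
The attribute precedence can be expressed as a layered dictionary update, with the lowest-precedence layer applied first. This is an illustrative sketch only: resolve_mask_attrs is a hypothetical helper and the attribute names are made up for the example.

```python
def resolve_mask_attrs(attrs=None, variable_attrs=None, default_attrs=None):
    """Merge attribute layers so that attrs > variable_attrs > default_attrs."""
    resolved = {}
    # Apply the lowest precedence first so later layers overwrite earlier ones.
    for layer in (default_attrs, variable_attrs, attrs):
        if layer:
            resolved.update(layer)
    return resolved


# Attribute names and values below are hypothetical.
resolved = resolve_mask_attrs(
    attrs={'source': 'user'},
    variable_attrs={'long_name': 'land mask'},
    default_attrs={'long_name': 'mask', 'source': 'default'},
)
```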
set_variable(entry_key, variable, dimension=None, bounds=None, attrs=None, pos=None, dimensionless=False, section=None)[source]

Set coordinate variable information for entry_key.

Parameters:
  • entry_key (str) – See ocgis.constants.DimensionMapKey for valid entry keys.
  • variable (str | Variable) – The variable to set. Use a variable object to auto-fill additional fields if they are None.
  • dimension – A sequence of dimension names. If None, they will be pulled from variable if it is a variable object.
  • bounds – See set_bounds().
  • attrs (dict) – Default attributes for the coordinate variables. If None, they will be pulled from variable if it is a variable object.
  • pos (int) – The representative dimension position in variable if variable has more than one dimension. For example, a latitude variable may have two dimensions (lon, lat). The mapper must determine which dimension position is representative for the latitude variable when slicing.
  • section (tuple) – A slice-like tuple used to extract the data out of its source variable into a single variable format.
>>> section = (None, 0)
>>> # This will be converted to a slice.
>>> [slice(None), slice(0, 1)]
Parameters:dimensionless (bool) – If True, this variable has no canonical dimension.
Raises:DimensionMapError
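
The section conversion shown in the example above can be sketched as follows. section_to_slices is a hypothetical helper that covers only the two element kinds the example demonstrates (None and a non-negative integer index); how the library treats other element types is not assumed here.

```python
def section_to_slices(section):
    """Convert a slice-like tuple into a list of slice objects."""
    slices = []
    for element in section:
        if element is None:
            # None selects the full extent of that dimension.
            slices.append(slice(None))
        elif isinstance(element, int):
            # An integer index becomes a length-one slice, keeping the dimension.
            slices.append(slice(element, element + 1))
        else:
            raise TypeError('this sketch handles only None and int elements')
    return slices


converted = section_to_slices((None, 0))
```

Using a length-one slice instead of a bare integer keeps the number of dimensions unchanged when the slices are applied to an array.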
update(other)[source]

Update this dimension map from another dimension map.

Parameters:other (DimensionMap) –
update_dimensions_from_metadata(metadata)[source]

Update dimension names for coordinate variables using a metadata dictionary.

Parameters:metadata (dict) – A metadata dictionary containing dimension names for variables.

Field

class ocgis.Field(**kwargs)[source]

Bases: ocgis.variable.base.VariableCollection

A field behaves like a variable collection but with additional metadata on its component variables.

Note

Accepts all parameters to VariableCollection.

Additional keyword arguments are:

Parameters:
  • dimension_map (DimensionMap | dict) – (=None) Maps variables to axes, dimensions, bounds, and default attributes. It is possible to fully-specify a default field by providing a list of variables and the dimension map. Instrumented/coordinate variables may be provided with keyword arguments. The dimension map is updated internally in those cases.
  • is_data (sequence of Variable | sequence of str) – (=None) Set these variables or variable names (if names are provided, the variables must be provided through variables) as data variables. Data variables often contain the field information of interest such as temperature, relative humidity, etc.
  • realization (Variable) – (=None) A realization or ensemble variable. Its value is typically an integer representing its record count across global realizations.
  • time (TemporalVariable) – (=None) A time variable.
  • level (Variable) – (=None) A level variable. This may also be considered the field’s z-coordinate.
  • grid (Grid) – (=None) A grid object. x/y-coordinates will be pulled from the grid automatically. Any level or z-coordinate must be provided using level.
  • geom (GeometryVariable) – (=None) The geometry variable.
  • crs (str | None | AbstractCRS) – (='auto') A coordinate reference system variable. If 'auto', use the coordinate system from the grid or geom. geom is given preference if both are present.
  • format_time (str) – See keyword argument format_time for TemporalVariable.
  • grid_abstraction (str) – See keyword argument abstraction for Grid.
  • grid_is_isomorphic – (='auto') If True, the grid is isomorphic with repeated, topologically adjacent cells (i.e. a logically rectangular grid). If False, the grid elements change shapes (i.e. boundaries like a watershed). If 'auto', let the driver determine the grid default.
__init__(**kwargs)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

add_variable(variable, force=False, is_data=False)[source]

Note

Accepts all parameters to add_variable().

Additional keyword arguments are:

Parameters:is_data (bool) – If True, the variable is considered a data variable.
axes_shapes
Returns:Axis variable shapes.
Return type:dict
bounds_variables

Create a tuple of bounds variables associated with coordinate_variables().

Return type:tuple(ocgis.Variable, …)
coordinate_variables

Return a tuple of coordinate variables. This will attempt to access spatial coordinate variables on the field’s grid. If no grid is available, spatial coordinates will be pulled from the dimension map. Time will always be pulled from the field. The tuple may have a length of zero if no coordinate variables are available on the field.

Return type:tuple
copy()[source]
Returns:A shallow copy of the field. The field’s dimension map is deep copied.
Return type:Field
crs
Returns:Get the field’s coordinate reference system. Return None if no coordinate system is assigned.
Return type:AbstractCRS
data_variables

Data variables are the “value” variables for the field. They are often variables like temperature or relative humidity. The default tag ocgis.constants.TagName.DATA_VARIABLES is used for data variables.

Returns:A sequence of variables tagged with the default data variable tag.
Return type:sequence of Variable
driver

Return the driver class associated with the dimension map.

Return type:ocgis.driver.base.AbstractDriver
classmethod from_records(records, schema=None, crs=-999, uid=None, union=False, data_model=None)[source]

Create a Field from Fiona-like records.

Parameters:
  • records (sequence of dict) – A sequence of records returned from a Fiona file object.
  • schema (dict) – A Fiona-like schema dictionary. If None and any record’s properties are None, then this must be provided.
>>> schema = {'geometry': 'Point', 'properties': {'UGID': 'int', 'NAME': 'str:4'}}
Parameters:
  • crs (dict | AbstractCoordinateReferenceSystem) – If ocgis.constants.UNINITIALIZED, default to ocgis.env.DEFAULT_COORDSYS.
  • uid (str) – If provided, use this attribute name as the unique identifier. Otherwise search for env.DEFAULT_GEOM_UID and, if not present, construct a 1-based identifier with this name.
  • union (bool) – If True, union the geometries from records yielding a single geometry with a unique identifier value of 1.
  • data_model (str) – See create_typed_variable_from_data_model().
Returns:

Field object constructed from records.

Return type:

Field
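
The 1-based identifier construction described for uid can be sketched on Fiona-like record dictionaries. This is an assumption-laden illustration, not the library's logic: ensure_uid is a hypothetical helper, the name 'UGID' is borrowed from the schema example above rather than from env.DEFAULT_GEOM_UID, and the search step is reduced to a simple presence check.

```python
def ensure_uid(records, uid_name='UGID'):
    """Return records that all carry ``uid_name``, numbering from 1 if needed."""
    if all(uid_name in record['properties'] for record in records):
        # Every record already has the identifier; nothing to construct.
        return records
    numbered = []
    for idx, record in enumerate(records, start=1):
        # Copy the properties so the input records are not modified.
        properties = dict(record['properties'])
        properties[uid_name] = idx
        numbered.append({'geometry': record['geometry'],
                         'properties': properties})
    return numbered


records = [
    {'geometry': {'type': 'Point', 'coordinates': (0.0, 1.0)},
     'properties': {'NAME': 'a'}},
    {'geometry': {'type': 'Point', 'coordinates': (2.0, 3.0)},
     'properties': {'NAME': 'b'}},
]
with_uid = ensure_uid(records)
```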

classmethod from_variable_collection(vc, *args, **kwargs)[source]

Create a field from a variable collection.

Parameters:vc (VariableCollection) – The template variable collection.
Return type:Field
geom
Returns:Get the field’s geometry variable. Return None if no geometry is available.
Return type:GeometryVariable | None
get_field_slice(dslice, strict=True, distributed=False)[source]

Slice the field using a dictionary. Keys are dimension map standard names defined by ocgis.constants.DimensionMapKey. Dimensions are temporarily renamed for the duration of the slice.

Parameters:
  • dslice (dict) – The dictionary slice.
  • strict – If True (the default), any dimension names in dslice are required to be in the target field.
  • distributed (bool) – If True, this should be considered a parallel/global slice.
Returns:

A shallow copy of the sliced field.

Return type:

Field

get_report(should_print=False)[source]
Parameters:should_print (bool) – If True, print the report lines in addition to returning them.
Returns:A sequence of strings with descriptive field information.
Return type:list of str
grid
Returns:Get the field’s grid object. Return None if no grid is present.
Return type:AbstractGrid | None
has_data_variables
Returns:True if the field has data variables.
Return type:bool
iter(**kwargs)[source]
Returns:Yield record dictionaries for variables in the field applying standard names to dimensions by default.
Return type:dict
iter_data_variables(tag_name='_ocgis_data_variables')[source]
Parameters:tag_name (str) – The tag to iterate.
Returns:Yields variables associated with tag.
Return type:Variable
level
Returns:Get the field’s level variable. Return None if no level is assigned.
Return type:Variable | None
classmethod read(*args, **kwargs)[source]

Read a variable collection from a request dataset.

Parameters:
Return type:

VariableCollection

realization
Returns:Get the field’s realization variable. Return None if no realization is assigned.
Return type:Variable | None
set_abstraction_geom(force=True, create_ugid=False, ugid_name='GID', ugid_start=1, set_ugid_as_data=False)[source]

Set the abstraction geometry for the field using the field’s geometry variable or the field’s grid abstraction geometry.

Parameters:
  • force (bool) – If True (the default), clobber any existing geometry variables.
  • create_ugid (bool) – If True, create a unique identifier integer Variable for the abstraction geometry. Only creates the variable if the geometry does not already have a ugid.
  • ugid_name (str) – Name for the ugid variable.
  • ugid_start (int) – Starting value to use for the unique identifier.
  • set_ugid_as_data (bool) – If True, set the ugid variable as data on the field. Useful for writing shapefiles which require at least one data variable.
Raises:

ValueError

set_crs(value, force=True, should_add=True)[source]

Set the field’s coordinate reference system. If a coordinate system is already present on the field, remove that variable.

Parameters:
  • value (AbstractCRS | None) – The coordinate reference system variable or None.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_element_node_connectivity(value, force=True, should_add=True)[source]

Set the element node connectivity variable. This variable maps coordinate values to element nodes using an index.

Parameters:
  • value (Variable) – The element node connectivity variable.
  • force (bool) – See add_variable().
  • should_add (bool) – If True (the default), add the variable to collection.
set_geom(variable, crs='auto', force=True, dimensionless='auto', should_add=True)[source]

Set the field’s geometry variable.

Parameters:
  • variable (GeometryVariable | None) – The geometry variable or None.
  • crs – If 'auto' (the default), use the coordinate system of the incoming geometry variable.
  • force (bool) – If True (the default), clobber any existing geometry variable.
  • dimensionless (bool) – If 'auto', automatically determine dimensionless state for the variable. See set_variable().
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
Raises:

ValueError

set_geom_from_grid(force=True)[source]

Set the field’s geometry from its grid’s abstraction geometry.

Parameters:force (bool) – If True (the default), clobber any existing geometry variables.
set_grid(grid, crs='auto', force=True, should_add=True)[source]

Set the field’s grid.

Parameters:
  • grid (Grid | None | str) – The grid object. If 'auto', pass-through.
  • crs – If 'auto' (the default), use the coordinate system of the incoming grid object.
  • force (bool) – If True (the default), clobber any existing grid member variables.
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_level(variable, force=True, should_add=True)[source]

Set the field’s level variable.

Parameters:
  • variable (Variable | None) – The variable to use.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_level_repr(variable, force=True, should_add=True)[source]

Set the field’s representative level variable.

Parameters:
  • variable (Variable | None) – The variable to use.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_realization(variable, force=True, should_add=True)[source]

Set the field’s realization variable.

Parameters:
  • variable (Variable | None) – The variable to use.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_time(variable, force=True, should_add=True)[source]

Set the field’s time variable.

Parameters:
  • variable (TemporalVariable | None) – The variable to use.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_x(variable, dimension, force=True, should_add=True)[source]

Set the field’s x-coordinate variable.

Parameters:
  • variable (Variable) – The source variable.
  • dimension (Dimension) – The representative field dimension for the variable. Required as the representative dimension cannot be determined with greater than one dimension on the coordinate variable.
  • force (bool) – If True (the default), clobber any existing geometry variables.
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
set_y(variable, dimension, force=True, should_add=True)[source]

Set the field’s y-coordinate variable.

Parameters:
  • variable (Variable) – The source variable.
  • dimension (Dimension) – The representative field dimension for the variable. Required as the representative dimension cannot be determined with greater than one dimension on the coordinate variable.
  • force – See add_variable()
  • should_add (bool) – If True, add the variable to the field object. If False, do not add the variable to the field variable storage. This is useful for updating metadata on the dimension map only.
temporal

Alias for time

time
Returns:Get the field’s time variable. Return None if no time is assigned.
Return type:TemporalVariable | None
to_xarray(**kwargs)[source]

Convert the field to an xarray.Dataset with CF metadata interpretation.

Limitations:
  • Bounds are treated as data arrays inside the xarray dataset.
  • Integer masked arrays are upcast to float data types in xarray.
  • Group hierarchies are not supported in xarray.

Parameters:
  • decode_cf (bool) – (=True) If True, run the xarray function decode_cf on the returned dataset.
  • kwargs (dict) – Optional keyword arguments to dataset creation. See ocgis.VariableCollection.to_xarray() for additional information.
Return type:

xarray.Dataset

unwrap()[source]

Unwrap the field’s coordinates contained in its grid and/or geometry.

Raises:EmptyObjectError
update_crs(to_crs, from_crs=None)[source]

See ocgis.spatial.base.AbstractOperationsSpatialObject.update_crs()

wrap(inplace=True)[source]

Wrap the field’s coordinates contained in its grid and/or geometry.

Raises:EmptyObjectError
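The effect of wrapping and unwrapping can be pictured with plain longitude arithmetic. The sketch below is not the OCGIS implementation. It assumes that a wrapped state places longitudes on the -180 to 180 domain and an unwrapped state places them on the 0 to 360 domain, which this reference does not spell out. The helper names wrap_longitude and unwrap_longitude are hypothetical.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees onto the [-180, 180) domain (assumed 'wrapped')."""
    return (lon + 180.0) % 360.0 - 180.0


def unwrap_longitude(lon):
    """Map a longitude in degrees onto the [0, 360) domain (assumed 'unwrapped')."""
    return lon % 360.0


wrapped = [wrap_longitude(v) for v in (0.0, 90.0, 270.0, 359.0)]
unwrapped = [unwrap_longitude(v) for v in (-90.0, -1.0, 45.0)]
print(wrapped)    # [0.0, 90.0, -90.0, -1.0]
print(unwrapped)  # [270.0, 359.0, 45.0]
```

Real grids and polygon geometries need more care than this, for example when a polygon straddles the domain boundary.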
wrapped_state
Returns:The wrapped state for the field.
Return type:ocgis.constants.WrappedState
Raises:EmptyObjectError
write(*args, **kwargs)[source]

See ocgis.VariableCollection.write().

Note

If no driver is provided, then the field’s dimension map driver will be used.

x
Returns:Get the field’s x-coordinate variable. Return None if no x-coordinate is assigned.
Return type:Variable | None
y
Returns:Get the field’s y-coordinate variable. Return None if no y-coordinate is assigned.
Return type:Variable | None
z

Alias for level.

Grids

class ocgis.Grid(x=None, y=None, z=None, pos=(0, 1), **kwargs)[source]

Bases: ocgis.spatial.grid.AbstractGrid, ocgis.spatial.base.AbstractXYZSpatialContainer

Grids are structured, rectilinear x/y-coordinate representations. x/y-coordinate variables may have bounds. The z-coordinate is supported only to allow its access from the grid. All subsetting operations, slicing, etc. occur only on the x/y-coordinates.

Parameters:
  • x (Variable) – The grid’s x-coordinate.
  • y (Variable) – The grid’s y-coordinate.
  • z (Variable) – The grid’s z-coordinate. No grid operations manipulate the z-coordinate. It is present on the grid for convenience.
  • abstraction (str) – The grid’s spatial abstraction.
Value Description
'auto' Automatically choose spatial abstraction. ‘polygon’ if x/y-coordinates have bounds and ‘point’ if they do not.
'point' Use representative value from x/y-coordinate variables to construct point geometries. Typically this is considered the center value.
'polygon' Use bounds from x/y-coordinates to construct polygon geometries.
Parameters:
  • crs – See AbstractSpatialObject
  • parent (Field) – The parent field for the grid.
  • mask (Variable) – The mask variable for the grid. Coordinate variables should not be masked. The mask must be managed independently. The mask variable should use its mask to indicate masked values.
  • pos (sequence) – If coordinate variables x and y are two-dimensional, these are the dimension indices for them in the grid’s dimensions. Defaults to (0, 1) or (y/latitude, x/longitude).
__getitem__(slc)[source]
Parameters:slc – The slice sequence with indices corresponding to:
Index Description
0 row/y dimension
1 column/x dimension

slc may also be a dictionary with grid dimensions as keys.

Returns:Shallow copy of the sliced grid.
Return type:Grid
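The index order in the table above (row/y first, column/x second) can be illustrated without OCGIS. The following is a minimal sketch on a nested list. The function slice_grid and the dimension names "y" and "x" are hypothetical and only stand in for a grid and its dimensions.

```python
def slice_grid(values, slc, dim_names=("y", "x")):
    """Slice a row-major 2-D list with a (row/y, column/x) slice pair or a dict."""
    if isinstance(slc, dict):
        # Dimensions missing from the dictionary are left unsliced.
        slc = tuple(slc.get(name, slice(None)) for name in dim_names)
    row_slc, col_slc = slc
    return [row[col_slc] for row in values[row_slc]]


values = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]
by_sequence = slice_grid(values, (slice(0, 2), slice(1, 3)))
by_dict = slice_grid(values, {"x": slice(0, 1)})
print(by_sequence)  # [[1, 2], [11, 12]]
print(by_dict)      # [[0], [10], [20]]
```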
__init__(x=None, y=None, z=None, pos=(0, 1), **kwargs)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

__setitem__(slc, grid)[source]

Set the grid values and mask to match grid in the index space defined by slc.

Parameters:
  • slc (sequence of slice-like object) – The set slice for the target. Must have length matching the grid dimension count.
  • grid – The grid object containing the values to set in the target.
copy()[source]
Returns:shallow copy of the grid
Return type:Grid
diagnostics(plot_xy=False, scatter_xy=False, unique_xy=False, verbose=False, plot_var=None)[source]

Print some grid diagnostics. This is designed to be customized.

dtype
Returns:Representative data type for the grid. This is pulled from the archetype variable.
Return type:type
expand()[source]

If the grid is vectorized/factorized (spatial coordinate represented using one-dimensional arrays), convert spatial coordinates to two-dimensional arrays. If the grid is already two-dimensional, pass through.
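Expanding a vectorized grid amounts to broadcasting the one-dimensional coordinate vectors to two dimensions. The sketch below shows the idea on plain lists, assuming the default (y, x) dimension order. The function name expand_vectors is hypothetical and this is not the library's code.

```python
def expand_vectors(x, y):
    """Broadcast 1-D x and y vectors to 2-D lists shaped (len(y), len(x))."""
    # Every row repeats the full x vector.
    x2d = [list(x) for _ in y]
    # Every row holds a single y value repeated across the columns.
    y2d = [[yv] * len(x) for yv in y]
    return x2d, y2d


x2d, y2d = expand_vectors([10.0, 20.0, 30.0], [1.0, 2.0])
print(x2d)  # [[10.0, 20.0, 30.0], [10.0, 20.0, 30.0]]
print(y2d)  # [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
```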

extract(clean_break=False)[source]

Extract the grid from its parent collection.

See extract() for documentation.

get_abstraction_geometry(**kwargs)[source]

Get the abstraction geometry variable for the grid.

Parameters:kwargs – Keyword arguments to the geometry get method. See get_point() for example.
Return type:GeometryVariable
get_distributed_slice(slc, **kwargs)[source]

Slice the grid in parallel and return a shallow copy. This is collective across the current OcgVM.

Parameters:
Return type:

AbstractGrid

get_nearest(*args, **kwargs)[source]

Get nearest element to target geometry.

get_report()[source]
Returns:sequence of strings containing explanatory grid information
Return type:list of str
get_spatial_index(*args, **kwargs)[source]

Get the spatial index.

get_spatial_subset_operation(spatial_op, subset_geom, return_slice=False, use_bounds='auto', original_mask=None, keep_touches='auto', cascade=True, optimized_bbox_subset=False, apply_slice=True)[source]

Perform intersects or intersection operations on the grid object.

Parameters:
  • spatial_op (str) – Either an 'intersects' or an 'intersection' spatial operation.
  • subset_geom (shapely.geometry.base.BaseGeometry) – The subset Shapely geometry. All geometry types are accepted.
  • return_slice (bool) – If True, also return the slices used to limit the grid’s extent.
  • use_bounds (bool | str) – If 'auto' (the default), use bounds if they are available to construct polygon objects for the intersects operation.
  • original_mask (numpy.ndarray) – An optional mask to use as a hint for spatial operation. True values are excluded from spatial consideration.
  • keep_touches (bool | str) – If 'auto' (the default), keep geometries that touch only if the grid’s spatial abstraction is point.
  • cascade – If True (the default), set the mask across all variables in the grid’s parent collection.
  • optimized_bbox_subset – If True, perform an optimized bounding box subset on the grid. This will only use the grid’s representative coordinates ignoring bounds, geometries, etc.
  • apply_slice – If True (the default), apply the slice to the grid object in addition to updating its mask.
Returns:

If return_slice is False (the default), return a shallow copy of the sliced grid. If return_slice is True, this will be a tuple with the subsetted object as the first element and the slice used as the second. If spatial_op is 'intersection', the returned object is a geometry variable.

Return type:

Grid | GeometryVariable | tuple of (<returned object>, <slice used>)

has_allocated_abstraction_geometry
Returns:True if the geometry abstraction variable is allocated on the grid.
Return type:bool
has_allocated_point
Returns:True if the point variable is allocated on the grid.
Return type:bool
has_allocated_polygon
Returns:True if the polygon variable is allocated on the grid.
Return type:bool
has_bounds

Return True if the grid coordinate variables have bounds.

Return type:bool
has_shared_dimension

Return True if the x/y dimensions are equal.

is_vectorized
Returns:True if the grid is vectorized (factorized). Vectorized grids have one-dimensional x- and y-coordinate variables.
Return type:bool
iter_records(*args, **kwargs)[source]

Generate fiona-compatible records.

remove_bounds()[source]

Set the grid coordinate variable bounds to None.

set_extrapolated_bounds(name_x_variable, name_y_variable, name_dimension)[source]

Extrapolate corners from grid centroids.

Parameters:
  • name_x_variable (str) – Name for the x-coordinate bounds variable.
  • name_y_variable (str) – Name for the y-coordinate bounds variable.
  • name_dimension (str) – Name for the bounds/corner dimension.
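One common way to extrapolate bounds from centroids is to place interior edges halfway between neighboring centroids and extend the outer edges by half of the nearest spacing. The sketch below applies that rule to a single one-dimensional coordinate vector. It is an assumption about the general approach, not a description of the algorithm OCGIS uses, and the function name extrapolate_bounds is hypothetical.

```python
def extrapolate_bounds(centroids):
    """Return a (lower, upper) pair for each centroid in a 1-D coordinate vector."""
    if len(centroids) < 2:
        raise ValueError("at least two centroids are required")
    # Interior edges sit halfway between neighboring centroids.
    interior = [(a + b) / 2.0 for a, b in zip(centroids, centroids[1:])]
    # Outer edges are extrapolated by half of the nearest spacing.
    first = centroids[0] - (centroids[1] - centroids[0]) / 2.0
    last = centroids[-1] + (centroids[-1] - centroids[-2]) / 2.0
    edges = [first] + interior + [last]
    return list(zip(edges, edges[1:]))


bounds = extrapolate_bounds([1.0, 2.0, 4.0])
print(bounds)  # [(0.5, 1.5), (1.5, 3.0), (3.0, 5.0)]
```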
shape
Return type:tuple of int
update_crs(*args, **kwargs)[source]

Update the coordinate system in place.

Parameters:
  • to_crs (AbstractCRS) – The destination coordinate system.
  • from_crs (AbstractCRS) – Optional original coordinate system to temporarily assign to the data. Useful when the object’s coordinate system is different from the desired coordinate system.
write(*args, **kwargs)[source]

See write().

class ocgis.GridUnstruct(geoms=None, abstraction='auto', parent=None)[source]

Bases: ocgis.spatial.grid.AbstractGrid

Unstructured grids manage operations across geometry coordinate objects. It overloads some operations but generally delegates complex operations to underlying geometry coordinate objects. It will broadcast operations across multiple geometry coordinate objects as necessary. Hence, geometry coordinate objects’ documentation should be used when interpreting unstructured grid operations.

Parameters:
  • geoms (sequence of AbstractGeometryCoordinates | None) – One or more geometry coordinate variables for representing the unstructured grid. If None, use the parent’s dimension map to construct the object.
  • abstraction – See ocgis.spatial.grid.AbstractGrid.abstraction.
  • parent (Field) – The parent field object. Required if no geometry coordinate objects are provided.
__getattribute__(name)[source]

x.__getattribute__(‘name’) <==> x.name

__init__(geoms=None, abstraction='auto', parent=None)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

coordinate_variables

See coordinate_variables()

get_abstraction_geometry()[source]

Get the abstraction geometry variable for the grid.

Parameters:kwargs – Keyword arguments to the geometry get method. See get_point() for example.
Return type:GeometryVariable
reduce_global(*args, **kwargs)[source]

See ocgis.spatial.geomc.AbstractGeometryCoordinates.reduce_global()

GeometryVariable

class ocgis.GeometryVariable(**kwargs)[source]

Bases: ocgis.spatial.base.AbstractSpatialVariable

A variable containing Shapely geometry object arrays.

Note

Accepts all parameters to Variable.

Additional keyword arguments are:

Parameters:
  • crs (AbstractCRS) – (=None) The coordinate reference system for the geometries.
  • geom_type (str) – (='auto') See http://toblerity.org/shapely/manual.html#object.geom_type. If 'auto', the geometry type will be automatically determined from the object array. Providing a default prevents iterating over the object array to identify the geometry type.
  • ugid (numpy.ndarray) – (=None) An integer array with same shape as the geometry variable. This array will be converted to a Variable.
  • is_bbox (bool) – If True, treat the polygon geometry as a bounding box geometry.
__init__(**kwargs)[source]

Like a variable but loads its value and metadata from a source request dataset. Full variable functionality is maintained for convenience. Generally, it is a good idea to only provide name and request_dataset to avoid conflicts.

Note

Accepts all parameters to Variable.

Additional arguments and/or keyword arguments are:

Parameters:
  • request_dataset (ocgis.RequestDataset) – (=None) The request dataset containing the variable’s source information.
  • protected (bool) – (=False) If True, attempting to access the variable’s value from source will raise a ocgis.exc.PayloadProtectedError exception. Set <object>.payload = False to disable this. Useful to ensure the variable’s payload data is untouched through a series of operations.
  • should_init_from_source (bool) – (=True) Allows a sourced variable to ignore any from-file operations and behave as a normal variable. This is used by some subclasses.
area
Returns:geometry areas as a float masked array
Return type:numpy.ma.MaskedArray
as_shapely()[source]

Convert to a Shapely geometry provided this is a singleton geometry variable.

Return type:shapely.geometry.base.BaseGeometry
convert_to(target=<ConversionTarget.GEOMETRY_COORDS: 'geometry_coords'>, **kwargs)[source]

Convert to a target type. The returned object is orphaned (does not share a parent with the source).

Some common manipulations are shared between conversion targets.

  • Always orients polygons CCW.
Parameters:
  • target (ocgis.constants.ConversionTarget) – The target type.
  • dtype – (=None) Array data type for the coordinate variables.
  • xname – (=constants.DEFAULT_NAME_COL_COORDINATES) Name of the x-coordinate variable.
  • yname – (=constants.DEFAULT_NAME_ROW_COORDINATES) Name of the y-coordinate variable.
  • zname – (=constants.DEFAULT_NAME_LVL_COORDINATES) Name of the z-coordinate variable.
  • node_dim_name – (='n_node') Name of the node count dimension.
  • element_index_name – (='element_index') Name of the element node connectivity variable.
  • pack – (=True) If True, de-duplicate coordinate vectors.
  • repeat_last_node – (=False) If False, do not repeat the last node coordinate for polygon geometries.
  • max_element_coords (int) – (=None) If provided, the maximum number of coordinates across all polygon geometries to convert. This fixes the column count for element node connectivity arrays. Otherwise, ragged arrays are used.
  • multi_break_value (int) – (=constants.OcgisConvention.MULTI_BREAK_VALUE) Value to use for indicating a multi-geometry break (indicates a separation of elements). Value must be negative.
  • node_threshold (int) – (=None) Split polygons with node counts greater than this value into multi-polygons.
  • split_interiors (bool) – (=False) If True, split polygons with holes/interiors into multi-polygons.
  • driver (str) – (=constants.DriverKey.NETCDF_UGRID) The driver to use for the output object.
  • use_geometry_iterator (bool) – (=False) If True, use a geometry iterator instead of loading all the geometries from source.
  • start_index (int) – (=0) The start index to use for coordinate indexing.
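The relationship between packed coordinate vectors and an element node connectivity index can be sketched without OCGIS. In the example below, each unique node coordinate is stored once and every polygon is described by indices into the coordinate vectors. The function pack_polygons is hypothetical. It only illustrates the idea behind the pack option and does not reproduce the layout that convert_to() returns.

```python
def pack_polygons(polygons):
    """Return (x, y, connectivity) with each unique node coordinate stored once."""
    node_index = {}  # maps an (x, y) coordinate to its position in the vectors
    x, y, connectivity = [], [], []
    for polygon in polygons:
        element = []
        for coord in polygon:
            if coord not in node_index:
                node_index[coord] = len(x)
                x.append(coord[0])
                y.append(coord[1])
            element.append(node_index[coord])
        connectivity.append(element)
    return x, y, connectivity


# Two unit squares sharing an edge, nodes listed counterclockwise without
# repeating the last node.
squares = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
]
x, y, connectivity = pack_polygons(squares)
print(x)             # [0.0, 1.0, 1.0, 0.0, 2.0, 2.0]
print(y)             # [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
print(connectivity)  # [[0, 1, 2, 3], [1, 4, 5, 2]]
```

The two shared nodes appear once in the coordinate vectors, so six coordinates describe eight polygon nodes.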
dtype
Returns:geometry variables are always of object type
Return type:type
geom_type
Returns:geometry type for the variable
Return type:str
geom_type_global
Returns:global geometry type collective across the current OcgVM
Return type:str
Raises:EmptyObjectError
get_buffer(*args, **kwargs)[source]

Return a shallow copy of the geometry variable with geometries buffered.

Note

Accepts all parameters to shapely.geometry.base.BaseGeometry.buffer().

An additional keyword argument is:

Parameters:geom_type (str) – The geometry type for the new buffered geometry if known in advance.
Return type:GeometryVariable
Raises:EmptyObjectError
get_intersection(*args, **kwargs)[source]

Note

Accepts all parameters to get_intersects(). Same return types.

Additional arguments and/or keyword arguments are:

Parameters:
  • inplace (bool) – (=False) If False (the default), deep copy the geometry array on the output before executing an intersection. If True, modify the geometries in-place.
  • intersects_check (bool) – (=True) If True (the default), first perform an intersects operation to limit the geometries tested for intersection. If False, perform the intersection as is.
get_intersects(*args, **kwargs)[source]

Perform an intersects spatial operation on the geometry variable.

Parameters:
  • return_slice (bool) – (=False) If True, return the _global_ slice that will guarantee no masked elements outside the subset geometry as the second element in the return value.
  • cascade (bool) – (=True) If True (the default), set the mask following the spatial operation on all variables in the parent collection.
Returns:

shallow copy of the geometry variable

Return type:

GeometryVariable | (<geometry variable>, <slice>)

Raises:

EmptySubsetError

get_iter(*args, **kwargs)[source]
Return type:Iterator
get_mask_from_intersects(geometry_or_bounds, use_spatial_index=True, keep_touches=False, original_mask=None)[source]
Parameters:
  • geometry_or_bounds (shapely.geometry.base.BaseGeometry | tuple) – A Shapely geometry or bounds tuple used for the masking.
  • use_spatial_index (bool) – If True, use a spatial index for the operation.
  • keep_touches (bool) – If True, keep geometries that only touch.
  • original_mask (numpy.ndarray) – A hint mask for the spatial operation. True values will be skipped.
Returns:

boolean array with non-intersecting values set to True

Return type:

numpy.ndarray
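The mask convention (True marks non-intersecting elements) and the role of keep_touches can be shown with point coordinates and a bounds tuple. The sketch below uses plain Python instead of Shapely geometries and NumPy arrays. The function mask_from_bounds is hypothetical, and the (minx, miny, maxx, maxy) ordering of the bounds tuple is an assumption.

```python
def mask_from_bounds(points, bounds, keep_touches=False):
    """Return a list of booleans that is True where a point does not intersect."""
    minx, miny, maxx, maxy = bounds
    mask = []
    for px, py in points:
        inside = minx < px < maxx and miny < py < maxy
        covered = minx <= px <= maxx and miny <= py <= maxy
        on_edge = covered and not inside
        # Points on the boundary only count as hits when keep_touches is True.
        hit = inside or (keep_touches and on_edge)
        mask.append(not hit)
    return mask


points = [(0.5, 0.5), (1.0, 0.5), (2.0, 2.0)]
strict = mask_from_bounds(points, (0.0, 0.0, 1.0, 1.0))
touching = mask_from_bounds(points, (0.0, 0.0, 1.0, 1.0), keep_touches=True)
print(strict)    # [False, True, True]
print(touching)  # [False, False, True]
```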

get_nearest(target, return_indices=False)[source]
Parameters:
  • target (shapely.geometry.base.BaseGeometry) – The Shapely geometry to use for proximity.
  • return_indices (bool) – If True, also return the indices used for slicing the geometry variable.
Returns:

shallow copy of the geometry variable and optionally slices

Return type:

GeometryVariable | (<geometry variable>, <slice>)

get_report()[source]
Returns:A sequence of strings suitable for printing.
Return type:list of str
get_spatial_index(target=None)[source]
Parameters:target (numpy.ndarray) – If this is a boolean array, use this as the add target. Otherwise, use the compressed masked values.
Returns:spatial index for the geometry variable
Return type:rtree.index.Index
get_spatial_subset_operation(spatial_op, subset_geom, **kwargs)[source]

Perform a spatial subset operation (this includes an intersection/clip).

get_unioned(dimensions=None, union_dimension=None, spatial_average=None, root=0)[source]

Unions _unmasked_ geometry objects and applies spatial averaging weights to variables in the parent collection if requested. Collective across the current OcgVM.

Parameters:
  • dimensions (tuple(ocgis.Dimension, …) | tuple(str, …)) – Dimensions to union. If None, default to the object’s dimensions.
  • union_dimension (ocgis.Dimension | str) – The new dimension for the unioned geometry.
  • spatial_average (tuple(ocgis.Variable, …) | tuple(str, …)) – The variables to spatially average. Other variables will be left untouched.
  • root (int) – If executing in parallel, the root rank to send all data. On non-root ranks, None will be returned.
Return type:

ocgis.GeometryVariable

has_z

Return True if the variable has a z-coordinate.

Return type:bool
iter_records(use_mask=True)[source]

Generate fiona-compatible records.

prepare(archetype=None)[source]

Prepare the geometry variable for spatial operations by calling its coordinate system’s ocgis.variable.crs.AbstractCRS.prepare_geometry_variable() method and returning a deep copy. If an archetype is provided, update the returned object’s coordinate system and wrapped state to match the archetype’s. If the current object has no crs or no modifications are required by the object, then a shallow copy is returned.

Parameters:archetype (ocgis.spatial.base.AbstractSpatialObject) – The object to use for spatial property matching.
Return type:GeometryVariable
set_ugid(variable, attr_link_name='ocgis_geom_uid')[source]

Same as set_ugid(), except the unique identifier name has a default value.

set_value(value, **kwargs)[source]

Set the variable value.

Parameters:
  • value – numpy.ndarray | sequence
  • update_mask – See set_mask
update_crs(to_crs, from_crs=None)[source]

Update the coordinate system of the geometry variable in-place.

Parameters:to_crs (AbstractCRS) – The destination CRS for the transformation.
weights

Weights are defined as:

>>> self.area / self.area.max()

Any geometries with zero area (points) are given an area of 1.0 for the purposes of weight calculation.

Returns:weights as a float masked array
Return type:numpy.ma.MaskedArray
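The weight definition above, including the zero-area rule, can be worked through on plain numbers. This is a minimal sketch that uses a list of floats in place of a masked array. The function name area_weights is hypothetical.

```python
def area_weights(areas):
    """Compute area / max(area) after giving zero-area entries an area of 1.0."""
    adjusted = [1.0 if a == 0 else a for a in areas]
    largest = max(adjusted)
    return [a / largest for a in adjusted]


polygon_weights = area_weights([2.0, 4.0, 0.0])
point_weights = area_weights([0.0, 0.0])
print(polygon_weights)  # [0.5, 1.0, 0.25]
print(point_weights)    # [1.0, 1.0]
```

A variable containing only points therefore receives equal weights of 1.0.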

Geometry Coordinate Variables

class ocgis.PolygonGC(x=None, y=None, z=None, cindex='auto', packed=True, start_index='auto', hosted=False, **kwargs)[source]

Bases: ocgis.spatial.geomc.AbstractGeometryCoordinates

__shapely_geometry_class__

alias of shapely.geometry.polygon.Polygon

__shapely_multipart_class__

alias of shapely.geometry.multipolygon.MultiPolygon

get_element_node_connectivity_by_index(element_connectivity, idx)[source]

Return something that can be used for indexing into coordinate arrays to retrieve the coordinates for the current element.

Parameters:
  • element_connectivity (numpy.ndarray) – An element connectivity array with the first dimension/axis as the element dimension.
  • idx (int) – The element index.
Return type:

<used as NumPy index>

get_shapely_geometry(*args, **kwargs)[source]

Return a Shapely geometry object.

Return type:shapely.geometry.base.BaseGeometry
class ocgis.LineGC(*args, **kwargs)[source]

Bases: ocgis.spatial.geomc.AbstractGeometryCoordinates

__init__(*args, **kwargs)[source]

x.__init__(…) initializes x; see help(type(x)) for signature

__shapely_geometry_class__

alias of shapely.geometry.linestring.LineString

class ocgis.PointGC(x=None, y=None, z=None, cindex='auto', packed=True, start_index='auto', hosted=False, **kwargs)[source]

Bases: ocgis.spatial.geomc.AbstractGeometryCoordinates

__shapely_geometry_class__

alias of shapely.geometry.point.Point

TemporalVariable

class ocgis.TemporalVariable(**kwargs)[source]

Bases: ocgis.variable.base.SourcedVariable

Note

Accepts all parameters to SourcedVariable.

Parameters:
__init__(**kwargs)[source]

Like a variable but loads its value and metadata from a source request dataset. Full variable functionality is maintained for convenience. Generally, it is a good idea to only provide name and request_dataset to avoid conflicts.

Note

Accepts all parameters to Variable.

Additional arguments and/or keyword arguments are:

Parameters:
  • request_dataset (ocgis.RequestDataset) – (=None) The request dataset containing the variable’s source information.
  • protected (bool) – (=False) If True, attempting to access the variable’s value from source will raise a ocgis.exc.PayloadProtectedError exception. Set <object>.payload = False to disable this. Useful to ensure the variable’s payload data is untouched through a series of operations.
  • should_init_from_source (bool) – (=True) Allows a sourced variable to ignore any from-file operations and behave as a normal variable. This is used by some subclasses.
calendar

Get or set the calendar for the variable. If None, the standard calendar will be used.

Return type:str
cfunits
Returns:cf_units object with appropriate calendar
Return type:cf_units.Unit
extent_datetime
Returns:lower and upper time bounds as a two-element datetime.datetime tuple
Return type:tuple
extent_numtime
Returns:lower and upper time bounds as a two-element float tuple
Return type:tuple
classmethod from_variable(variable, format_time=True)[source]
Parameters:
  • variable – The source variable to convert to a time variable.
  • format_time (bool) – See TemporalVariable.
Returns:

a standard variable converted to a time variable

Return type:

TemporalVariable

get_between(lower, upper, return_indices=False)[source]
Parameters:
  • lower – The lower value.
  • upper – The upper value.
  • return_indices (bool) – If True, also return the indices used to slice the variable.
  • closed (bool) – If False (the default), the comparison includes the endpoints (>=, <=). If True, the comparison excludes the endpoints (>, <).
  • use_bounds (bool) – If True, use the bounds values for the between operation.
Returns:

A sliced variable.

Return type:

Variable
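The comparison operators listed for the closed parameter can be checked on plain datetime values. The sketch below follows those operators as documented (>=, <= when closed is False; >, < when closed is True). It does not call OCGIS, and the function name between_indices is hypothetical.

```python
from datetime import datetime


def between_indices(values, lower, upper, closed=False):
    """Return indices of values between lower and upper."""
    if closed:
        # Strict comparison: values equal to an endpoint are dropped.
        return [i for i, v in enumerate(values) if lower < v < upper]
    # Default comparison: values equal to an endpoint are kept.
    return [i for i, v in enumerate(values) if lower <= v <= upper]


times = [datetime(2000, 1, 1), datetime(2000, 1, 2), datetime(2000, 1, 3)]
default_idx = between_indices(times, times[0], times[2])
closed_idx = between_indices(times, times[0], times[2], closed=True)
print(default_idx)  # [0, 1, 2]
print(closed_idx)   # [1]
```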

get_datetime(arr)[source]
Parameters:arr (numpy.ndarray) – An array of floats to convert to datetime-like objects.
Returns:object array of the same shape as arr with float objects converted to datetime objects.
Return type:numpy.ndarray
get_grouping(grouping)[source]

Create a temporally grouped variable using string group sequences.

Parameters:grouping – The temporal grouping to use when creating the temporal group dimension.
>>> grouping = ['month']
Return type:TemporalGroupVariable
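Conceptually, a temporal grouping collects the time steps that share the same values for the requested date parts. The sketch below groups plain datetime objects by the attribute names in the grouping sequence. It is an illustration only: group_by_parts is a hypothetical helper, and resolving each group name with getattr on a datetime is an assumption, not the OCGIS implementation.

```python
from datetime import datetime


def group_by_parts(times, grouping):
    """Map each unique tuple of date parts to the indices of matching time steps."""
    groups = {}
    for index, t in enumerate(times):
        key = tuple(getattr(t, part) for part in grouping)
        groups.setdefault(key, []).append(index)
    return groups


times = [
    datetime(2000, 1, 1),
    datetime(2000, 1, 15),
    datetime(2000, 2, 1),
    datetime(2001, 1, 1),
]
monthly = group_by_parts(times, ['month'])
yearly_monthly = group_by_parts(times, ['year', 'month'])
print(monthly)         # {(1,): [0, 1, 3], (2,): [2]}
print(yearly_monthly)  # {(2000, 1): [0, 1], (2000, 2): [2], (2001, 1): [3]}
```

Grouping by month alone merges January 2000 with January 2001, while adding the year keeps them apart.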
get_iter(**kwargs)[source]
Parameters:kwargs – See source.
Returns:A variable iterator object.
Return type:Iterator
get_numtime(arr)[source]
Parameters:arr (numpy.ndarray) – An array of datetime-like objects to convert to numeric time.
Returns:An array of numeric values with same shape as arr.
Return type:numpy.ndarray
get_report()[source]
Returns:sequence of descriptive strings about the time variable
Return type:sequence of str
get_subset_by_function(func, return_indices=False)[source]

Subset the temporal dimension by an arbitrary function. The function must take one argument and one keyword argument. The argument is a vector of datetime objects. The keyword argument should be called “bounds” and may be None. If the bounds value is not None, the function should expect an n-by-2 array of datetime objects. The function must return an integer sequence suitable for indexing. For example:

>>> def subset_func(value, bounds=None):
...     # Collect the indices of all time steps that fall in June.
...     indices = []
...     for ii, v in enumerate(value):
...         if v.month == 6:
...             indices.append(ii)
...     return indices
>>> tv = TemporalVariable(...)
>>> tv_subset = tv.get_subset_by_function(subset_func)
Parameters:
  • func (FunctionType) – The function to use for subsetting.
  • return_indices (bool) – If True, return the index integers used for slicing/subsetting of the target object.
Return type:

TemporalVariable | tuple
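A subset function may also use the bounds keyword. The following sketch runs stand-alone on plain datetime objects and does not call OCGIS. The name june_subset_func is hypothetical, and a list of (lower, upper) pairs stands in for the n-by-2 bounds array.

```python
from datetime import datetime


def june_subset_func(value, bounds=None):
    """Return indices of time steps in June, using the lower bound when available."""
    indices = []
    for ii, v in enumerate(value):
        if bounds is None:
            # No bounds available: test the time value itself.
            keep = v.month == 6
        else:
            # Assumed layout: bounds[ii] is a (lower, upper) datetime pair.
            lower, _upper = bounds[ii]
            keep = lower.month == 6
        if keep:
            indices.append(ii)
    return indices


value = [datetime(2000, 5, 31, 12), datetime(2000, 6, 1, 0), datetime(2000, 6, 2, 0)]
bounds = [
    (datetime(2000, 5, 31), datetime(2000, 6, 1)),
    (datetime(2000, 5, 31, 12), datetime(2000, 6, 1, 12)),
    (datetime(2000, 6, 1, 12), datetime(2000, 6, 2, 12)),
]
without_bounds = june_subset_func(value)
with_bounds = june_subset_func(value, bounds)
print(without_bounds)  # [1, 2]
print(with_bounds)     # [2]
```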

get_time_region(time_region, return_indices=False)[source]
Parameters:time_region (dict) – A dictionary defining the time region subset.
>>> time_region = {'month': [1, 2, 3], 'year': [2000]}
Parameters:return_indices (bool) – If True, also return the indices used to subset the variable.
Returns:shallow copy of the sliced time variable
Return type:TemporalVariable
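A time region dictionary can be read as a set of membership tests on date parts. The sketch below selects the time steps whose month and year both appear in the requested lists. Combining the keys with a logical AND and skipping keys set to None are assumptions, and time_region_indices is a hypothetical helper that does not call OCGIS.

```python
from datetime import datetime


def time_region_indices(times, time_region):
    """Return indices of time steps matching every date part in the region."""
    indices = []
    for index, t in enumerate(times):
        matches = all(
            allowed is None or getattr(t, part) in allowed
            for part, allowed in time_region.items()
        )
        if matches:
            indices.append(index)
    return indices


times = [
    datetime(2000, 1, 15),
    datetime(2000, 4, 15),
    datetime(2000, 3, 1),
    datetime(2001, 2, 1),
]
time_region = {'month': [1, 2, 3], 'year': [2000]}
selected = time_region_indices(times, time_region)
print(selected)  # [0, 2]
```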
set_bounds(value, **kwargs)[source]

Set the bounds variable.

Parameters:
  • value (Variable) – The variable containing bounds for the target.
  • force (bool) – If True, clobber the bounds if they exist in parent.
  • clobber_units (bool) – If True, clobber value.units to match self.units. If None, default to ocgis.env.CLOBBER_UNITS_ON_BOUNDS
set_value(value, **kwargs)[source]

Set the variable value.

Parameters:
  • value – numpy.ndarray | sequence
  • update_mask – See set_mask
value_datetime
Returns:time value as a datetime.datetime masked object array
Return type:numpy.ma.MaskedArray
value_numtime
Returns:time value as a masked float array
Return type:numpy.ma.MaskedArray
class ocgis.variable.temporal.TemporalGroupVariable(*args, **kwargs)[source]

Bases: ocgis.variable.temporal.TemporalVariable

Stores temporal grouping information for a time variable. Behaves like a time variable in all other aspects.

Note

Accepts all parameters to TemporalVariable.

Additional keyword arguments are:

Parameters:
  • grouping – (=None) See get_grouping().
  • dgroups (sequence of numpy.ndarray) – (=None) Sequence of boolean arrays defining each unique temporal group.
  • date_parts (sequence of tuple) – (=None) Sequence of date part tuples.
__init__(*args, **kwargs)[source]

Like a variable but loads its value and metadata from a source request dataset. Full variable functionality is maintained for convenience. Generally, it is a good idea to only provide name and request_dataset to avoid conflicts.

Note

Accepts all parameters to Variable.

Additional arguments and/or keyword arguments are:

Parameters:
  • request_dataset (ocgis.RequestDataset) – (=None) The request dataset containing the variable’s source information.
  • protected (bool) – (=False) If True, attempting to access the variable’s value from source will raise a ocgis.exc.PayloadProtectedError exception. Set <object>.payload = False to disable this. Useful to ensure the variable’s payload data is untouched through a series of operations.
  • should_init_from_source (bool) – (=True) Allows a sourced variable to ignore any from-file operations and behave as a normal variable. This is used by some subclasses.

Parallelism

OcgVM

class ocgis.OcgVM(comm=None)[source]

Bases: ocgis.base.AbstractOcgisObject

Manages communicators for parallel execution. Provides access to a dummy communicator when running in serial.

Parameters:comm (MPI Communicator or DummyMPIComm) – The default communicator.

OcgDist

class ocgis.vmachine.mpi.OcgDist(size=None, ranks=None)[source]

Bases: ocgis.base.AbstractOcgisObject

Computes the parallel distribution for variable dimensions.

Parameters:
  • size (int) – The number of ranks to use for the distribution. If None (the default), use the global MPI size.
  • ranks (sequence of int) – A sequence of integer ranks with length equal to size. Useful if computing a distribution for a rank subset.
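A parallel distribution for a dimension can be thought of as a set of contiguous index bounds, one per rank. The sketch below splits a dimension length as evenly as possible across a given number of ranks. It is a generic block decomposition offered as an assumption about the general idea, not the algorithm OcgDist uses, and rank_bounds is a hypothetical helper.

```python
def rank_bounds(length, size):
    """Split ``length`` elements into contiguous (start, stop) bounds for ``size`` ranks."""
    base, extra = divmod(length, size)
    bounds = []
    start = 0
    for rank in range(size):
        # The first ``extra`` ranks receive one additional element.
        stop = start + base + (1 if rank < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


even = rank_bounds(10, 3)
sparse = rank_bounds(2, 3)
print(even)    # [(0, 4), (4, 7), (7, 10)]
print(sparse)  # [(0, 1), (1, 2), (2, 2)]
```

When there are more ranks than elements, some ranks receive an empty range, as the last rank does in the second call.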

GIS File Access

GeomCabinet

class ocgis.GeomCabinet(path=None)[source]

Bases: object

A utility object designed for accessing shapefiles stored in a locally accessible location.

>>> # Adjust location of search directory.
>>> import ocgis
>>> from ocgis import GeomCabinet
>>> ocgis.env.DIR_GEOMCABINET = '/path/to/local/shapefile/directory'
>>> sc = GeomCabinet()
>>> # List the shapefiles available.
>>> sc.keys()
['state_boundaries', 'mi_watersheds', 'world_countries']
>>> # Load geometries from the shapefile.
>>> geoms = list(sc.iter_geoms('state_boundaries'))
Parameters:path (str) – Absolute path to the directory holding shapefile folders. Defaults to ocgis.env.DIR_GEOMCABINET.
iter_geoms(key=None, select_uid=None, path=None, load_geoms=True, as_field=False, uid=None, select_sql_where=None, slc=None, union=False, data_model=None, driver_kwargs=None)[source]

See documentation for GeomCabinetIterator.

keys()[source]

Return a list of the shapefile keys contained in the search directory.

Return type:list of str

GeomCabinetIterator

class ocgis.GeomCabinetIterator(key=None, select_uid=None, path=None, load_geoms=True, as_field=False, uid=None, select_sql_where=None, slc=None, union=False, data_model=None, driver_kwargs=None)[source]

Bases: object

Iterate over geometries from a shapefile specified by key or path.

>>> sc = GeomCabinet()
>>> geoms = sc.iter_geoms('state_boundaries', select_uid=[1, 48])
>>> len(list(geoms))
2
Parameters:key (str) – Unique key identifier for a shapefile contained in the GeomCabinet directory.
>>> key = 'state_boundaries'
Parameters:select_uid (sequence) – Sequence of unique identifiers to select from the target data source.
>>> select_uid = [23, 24]
Parameters:path (str) – Path to the target data source to iterate over. If key is provided it will override path.
>>> path = '/path/to/shapefile.shp'
Parameters:
  • load_geoms (bool) – If False, do not load geometries, excluding the 'geom' key from the output dictionary.
  • as_field (bool) – If True, yield field objects.
  • uid (str) – The name of the attribute containing the unique identifier. If None, ocgis.env.DEFAULT_GEOM_UID will be used if present. If no unique identifier is found, add one with name ocgis.env.DEFAULT_GEOM_UID.
  • select_sql_where (str) – SINGLE QUOTES MUST BE USED INSIDE DOUBLE QUOTES FOR PYTHON 3! A string suitable for insertion into a SQL WHERE statement. See http://www.gdal.org/ogr_sql.html for documentation (section titled “WHERE”).
>>> select_sql_where = "STATE_NAME = 'Wisconsin'"
Parameters:slc – A two-element integer sequence: [start, stop].
>>> slc = [0, 5]
Parameters:data_model (str) – The target data model for the iteration.
>>> data_model = 'NETCDF3'
Parameters:driver_kwargs (dict) – Format specific keyword arguments to use for driver creation.
Raises:ValueError, RuntimeError
Return type:dict
__iter__()[source]

Return an iterator as from ocgis.GeomCabinet.iter_geoms().
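
The selection parameters can be pictured with plain dictionaries standing in for shapefile records. The sketch below is not OCGIS code: the records list, the UGID field name, the identifier values, and both helper functions are hypothetical, and are shown only to make the select_uid and [start, stop] semantics concrete.

```python
# Made-up records standing in for features read from a shapefile.
records = [
    {"UGID": 1, "STATE_NAME": "Alabama"},
    {"UGID": 23, "STATE_NAME": "Minnesota"},
    {"UGID": 24, "STATE_NAME": "Mississippi"},
    {"UGID": 48, "STATE_NAME": "Wisconsin"},
]


def select_by_uid(rows, select_uid, uid="UGID"):
    """Keep only the rows whose unique identifier is in select_uid."""
    wanted = set(select_uid)
    return [row for row in rows if row[uid] in wanted]


def select_by_slice(rows, slc):
    """Keep rows from start (inclusive) to stop (exclusive)."""
    start, stop = slc
    return rows[start:stop]


by_uid = select_by_uid(records, [1, 48])
by_slice = select_by_slice(records, [0, 2])
```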

Operation Wrappers

CalculationEngine

class ocgis.calc.engine.CalculationEngine(grouping, funcs, calc_sample_size=False, spatial_aggregation=False, progress=None)[source]

Bases: object

Manages calculation execution.

Parameters:
  • calc_sample_size (bool) – If True, calculate sample sizes for the calculations.
  • progress (ProgressOcgOperations) – A progress object to update.

OperationsEngine

class ocgis.ops.engine.OperationsEngine(ops, request_base_size_only=False, progress=None)[source]

Bases: ocgis.base.AbstractOcgisObject

Executes the operations defined by ops.

Parameters:
  • ops (OcgOperations) – The operations to interpret.
  • request_base_size_only (bool) – If True, return field objects following the spatial subset performing as few operations as possible.
  • progress (ProgressOcgOperations) – A progress object to update.

RegridOperation

class ocgis.regrid.base.RegridOperation(field_src, field_dst, subset_field=None, regrid_options=None, revert_dst_crs=False)[source]

Bases: ocgis.base.AbstractOcgisObject

Execute a regrid operation handling spatial subsetting and coordinate system transformations.

Parameters:
  • field_src (ocgis.Field) – The source field to regrid.
  • field_dst (ocgis.Field) – The destination regrid field.
  • subset_field (ocgis.Field) – If provided, use this field to subset the regridding fields.
  • regrid_options (dict) – A dictionary of keyword options to pass to regrid_field().
  • revert_dst_crs (bool) – If True, revert the destination grid coordinate system if it needed to be transformed. Typically, a number of source fields are regridded to a common destination and this transform should only occur once.
execute()[source]

Execute regridding operation.

Return type:ocgis.Field

SpatialSubsetOperation

class ocgis.spatial.spatial_subset.SpatialSubsetOperation(field, output_crs='input', wrap=None)[source]

Bases: ocgis.base.AbstractOcgisObject

Perform spatial subsets using Field objects.

Parameters:
  • field (Field) – The target field to subset.
  • output_crs (AbstractCRS) – If provided, all output coordinates will be remapped to match. If 'input', the default, use the coordinate system assigned to field.
  • wrap (bool) – This is only relevant for spherical coordinate systems on field or when selected as the output_crs. If None, leave the wrapping the same as field. If True, wrap the coordinates. If False, unwrap the coordinates. A “wrapped” spherical coordinate system has a longitudinal domain from -180 to 180 degrees.
get_spatial_subset(operation, geom, use_spatial_index=True, buffer_value=None, buffer_crs=None, geom_crs=None, select_nearest=False, optimized_bbox_subset=False)[source]

Perform a spatial subset operation on target.

Parameters:
  • operation (str) – Either 'intersects' or 'clip'.
  • geom (shapely.geometry.base.BaseGeometry | ocgis.GeometryVariable) – The input geometry object to use for subsetting of target.
  • use_spatial_index (bool) – If True, use an rtree spatial index.
  • select_nearest (bool) – If True, select the nearest geometry using shapely.geometry.base.BaseGeometry.distance().
  • buffer_value (float) – The buffer radius to use in units of the coordinate system of subset_sdim.
  • buffer_crs (ocgis.interface.base.crs.CoordinateReferenceSystem) – If provided, then buffer_value is not in units of the coordinate system of subset_sdim but in units of buffer_crs.
  • geom_crs (ocgis.crs.CRS) – The coordinate reference system for the subset geometry.
  • select_nearest – If True, following the spatial subset operation, select the nearest geometry in the subset data to geom. Centroid-based distance is used.
  • optimized_bbox_subset (bool) – If True, only do a bounding box subset and do not perform more complex GIS subset operations such as constructing a spatial index.
Return type:

Same as target. If target is an ocgis.RequestDataset, then an ocgis.interface.base.field.Field will be returned.

Raises:

ValueError
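
The wrapped and unwrapped longitude domains mentioned for the wrap parameter can be illustrated with two small conversions. This is a minimal sketch in plain Python, not the OCGIS implementation; the function names are hypothetical and the handling of the boundary value (180 maps to -180 here) is an assumption.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees onto the wrapped domain [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def unwrap_longitude(lon):
    """Map a longitude in degrees onto the unwrapped domain [0, 360)."""
    return lon % 360.0


# 270 degrees east on the unwrapped domain is 90 degrees west when wrapped.
wrapped = wrap_longitude(270.0)
unwrapped = unwrap_longitude(-90.0)
```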

Drivers

class ocgis.driver.base.AbstractDriver(rd)[source]

Bases: ocgis.base.AbstractOcgisObject

Base class for all drivers.

Parameters:rd (RequestDataset) – The input request dataset object.
class ocgis.driver.base.AbstractTabularDriver(rd)[source]

Bases: ocgis.driver.base.AbstractDriver

Base class for tabular drivers (no optimal single variable access).

class ocgis.driver.base.AbstractUnstructuredDriver[source]

Bases: ocgis.base.AbstractOcgisObject

class ocgis.driver.nc.DriverNetcdf(rd)[source]

Bases: ocgis.driver.base.AbstractDriver

Driver for netCDF files that avoids any hint of metadata.

Driver keyword arguments (driver_kwargs) to the request dataset:

  • Any keyword arguments to dataset or multi-file dataset creation.
class ocgis.driver.nc.DriverNetcdfCF(rd)[source]

Bases: ocgis.driver.nc.AbstractDriverNetcdfCF

Metadata-aware netCDF driver interpreting CF-Grid by default.

class ocgis.driver.nc_scrip.DriverNetcdfSCRIP(rd)[source]

Bases: ocgis.driver.base.AbstractUnstructuredDriver, ocgis.driver.nc.DriverNetcdf

Driver for the SCRIP NetCDF structured and unstructured grid format. SCRIP is a legacy format that is the primary precursor to the NetCDF-CF convention. By default, SCRIP grids are treated as unstructured data, creating an unstructured grid.

class ocgis.driver.nc_ugrid.DriverNetcdfUGRID(rd)[source]

Bases: ocgis.driver.base.AbstractUnstructuredDriver, ocgis.driver.nc.DriverNetcdfCF

Driver for NetCDF data following the UGRID convention. It will also interpret CF convention for axes not overloaded by UGRID.

class ocgis.driver.vector.DriverVector(rd)[source]

Bases: ocgis.driver.base.AbstractTabularDriver

Driver for vector GIS data.

Driver keyword arguments (driver_kwargs) to the request dataset:

  • 'feature_class' –> For File Geodatabases, a string feature class name is required.
class ocgis.driver.csv_.DriverCSV(rd)[source]

Bases: ocgis.driver.base.AbstractTabularDriver

Driver for comma separated value files.

Grid Chunker

class ocgis.spatial.grid_chunker.GridChunker(source, destination, nchunks_dst=None, paths=None, check_contains=False, allow_masked=True, src_grid_resolution=None, dst_grid_resolution=None, optimized_bbox_subset='auto', iter_dst=None, buffer_value=None, redistribute=False, genweights=False, esmf_kwargs=None, use_spatial_decomp='auto', eager=True)[source]

Splits source and destination grids into separate netCDF files. “Source” is intended to mean the source data for a regridding operation. “Destination” is the destination grid for regridding operation.

The destination subset extents are buffered to ensure full overlap with the source subset. Hence, elements in the destination subset are globally unique and source subset elements are not necessarily globally unique.

Note

Grid parent variable collections may be altered during initializations to account for global source indexing.

Note

All function calls are collective.

Parameters:
  • source (AbstractGrid | RequestDataset | Field) – The source object for a regridding operation. The object must either be a grid or an object from which a grid is retrievable.
  • destination (AbstractGrid | RequestDataset | Field) – The destination object for a regridding operation. The object must either be a grid or an object from which a grid is retrievable.
  • nchunks_dst (tuple) – The split count for the grid. Tuple length must match the dimension count of the grid.
>>> nchunks_dst = (2, 3)
Parameters:paths (dict) – Dictionary of paths used by the grid splitter. Defaults are provided.
Key (str)     Default               Description
wd            os.getcwd()           Working directory to write to or containing split files, weight files, and splitter index file.
dst_template  'split_dst_{}.nc'     Destination filename template.
src_template  'split_src_{}.nc'     Source filename template.
wgt_template  'esmf_weights_{}.nc'  Weight filename template.
index_file    '01-split_index.nc'   Name of the index file.
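
The filename templates above each contain a single {} placeholder. The sketch below shows one way such templates could expand into per-chunk paths. It assumes the placeholder is filled with a chunk index through str.format; whether OCGIS uses a zero-based, one-based, or padded index is not stated here, and chunk_filenames is a hypothetical helper.

```python
import os

# Defaults copied from the table of path keys.
paths = {
    "wd": os.getcwd(),
    "dst_template": "split_dst_{}.nc",
    "src_template": "split_src_{}.nc",
    "wgt_template": "esmf_weights_{}.nc",
    "index_file": "01-split_index.nc",
}


def chunk_filenames(paths, idx):
    """Build the destination, source, and weight paths for one chunk index."""
    return tuple(
        os.path.join(paths["wd"], paths[key].format(idx))
        for key in ("dst_template", "src_template", "wgt_template")
    )


dst_path, src_path, wgt_path = chunk_filenames(paths, 3)
```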
Parameters:
  • check_contains (bool) – If True, check that the source subset bounding box fully contains the destination subset bounding box. Works when coordinate data is ordered and packed similarly between source and destination.
  • allow_masked (bool) – If True, allow masked values following a subset.
  • src_grid_resolution (float) – Overload the source grid resolution. This is useful when using unstructured data that may have a regular patterning to leverage for performance.
  • dst_grid_resolution (float) – Same as src_grid_resolution except for the destination grid.
  • optimized_bbox_subset (bool) – If True, use optimizations for subsetting. If False, do not use optimizations. Optimizations are generally okay for structured, rectilinear grids. Optimizations will avoid constructing geometries for the subset target. Hence, subset operations with complex boundary definitions should generally avoid optimizations (set to False). If 'auto', attempt to identify the best optimization method.
  • iter_dst (<generator function>) – A generator yielding destination grids. This generator must also write the grid.
  • buffer_value (float) – The value in units of the destination grid, used to buffer the spatial extent for subsetting the source grid. It is best to keep this small, but it must ensure the destination subset is fully mapped by the source for whatever purpose the grid splitter is used. If None, the default is double the highest resolution between source and destination grids.
  • redistribute (bool) – If True, redistribute the source subset for unstructured grids. The redistribution reloads the data from source so should not be used with in-memory grids.
  • genweights (bool) – If False, do not generate regridding weight files using ESMF. If True, generate regridding weight files for each source-destination chunk.
  • esmf_kwargs (dict) – Optional overloads for keyword arguments to ESMF interfaces. Currently supported keyword arguments are below.
Name                 Default     Possible
'regrid_method'      'CONSERVE'  'CONSERVE', 'BILINEAR', 'PATCH', 'NEAREST_STOD'
'unmapped_action'    'IGNORE'    'IGNORE', 'ERROR'
'ignore_degenerate'  False       True/False
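
As an illustration of how overloads might be combined with the defaults listed above, the following sketch merges a user dictionary into the default values and rejects anything outside the listed possibilities. The names ESMF_DEFAULTS, ESMF_ALLOWED, and merge_esmf_kwargs are hypothetical, and raising ValueError for unsupported input is an assumption rather than documented GridChunker behavior.

```python
# Defaults and possible values taken from the table of supported keywords.
ESMF_DEFAULTS = {
    "regrid_method": "CONSERVE",
    "unmapped_action": "IGNORE",
    "ignore_degenerate": False,
}
ESMF_ALLOWED = {
    "regrid_method": {"CONSERVE", "BILINEAR", "PATCH", "NEAREST_STOD"},
    "unmapped_action": {"IGNORE", "ERROR"},
    "ignore_degenerate": {True, False},
}


def merge_esmf_kwargs(overrides=None):
    """Return the defaults updated with validated overrides."""
    merged = dict(ESMF_DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in ESMF_ALLOWED:
            raise ValueError("Unsupported ESMF keyword: {!r}".format(key))
        if value not in ESMF_ALLOWED[key]:
            raise ValueError("Unsupported value for {!r}: {!r}".format(key, value))
        merged[key] = value
    return merged


esmf_kwargs = merge_esmf_kwargs({"regrid_method": "BILINEAR"})
```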
Parameters:
  • use_spatial_decomp (bool) – If True, use a spatial decomposition as opposed to an index-based decomposition when creating destination chunks. A spatial decomposition ensures destination coordinates are spatially “clumped” and is recommended for unstructured datasets. If 'auto', choose the best approach from the grid type.
  • eager (bool) – If True, load grid data from disk before chunking. This avoids always loading the data from disk for sourced datasets following a subset. There will be an improvement in performance but an increase in the memory used.
Raises:

ValueError

buffer_value

Spatial distance in units of the destination grid to buffer the destination grid chunk’s spatial extent when subsetting the associated source grid. Defaults to the higher spatial resolution times a modifier (ocgis.constants.GridChunkerConstants.BUFFER_RESOLUTION_MODIFIER).

Parameters:value (float) – Spatial buffer value
Return type:float
create_merged_weight_file(merged_weight_filename, strict=False)[source]

Merge weight file chunks to a single, global weight file.

Parameters:
  • merged_weight_filename (str) – Path to the merged weight file.
  • strict (bool) – If False, allow “missing” files where the iterator index cannot create a found file. It is best to leave this False as not all sources and destinations are mapped. If True, raise an
dst_grid

Get the destination grid.

Returns:AbstractGrid
static insert_weighted(index_path, dst_wd, dst_master_path)[source]

Insert weighted destination variable data into the master destination file.

Parameters:
  • index_path (str) – Path to the split index netCDF file.
  • dst_wd (str) – Working directory containing the destination files holding the weighted data.
  • dst_master_path (str) – Path to the destination master weight file.
iter_dst_grid_slices(yield_idx=None)[source]

Yield global slices for the destination grid using guidance from nchunks_dst.

Parameters:yield_idx (int) – If a zero-based integer, only yield for this chunk index and skip everything else.
Returns:A dictionary with keys as the grid dimension names and the values the associated slice for that dimension.
Return type:dict
>>> example_yield = {'dimx': slice(2, 4), 'dimy': slice(10, 20)}
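
The dictionaries yielded here can be pictured with a small stand-alone generator. The sketch below divides each grid dimension into a requested number of contiguous pieces and yields one dictionary of slices per chunk. It is plain Python under stated assumptions: iter_chunk_slices is a hypothetical helper, and the even-split rule and iteration order are illustrative, not necessarily those used by GridChunker.

```python
import itertools


def iter_chunk_slices(shape, nchunks, dim_names):
    """Yield {dimension name: slice} dictionaries covering a grid of `shape`."""
    per_dim = []
    for length, nchunk in zip(shape, nchunks):
        base, extra = divmod(length, nchunk)
        slices = []
        start = 0
        for i in range(nchunk):
            # The first `extra` pieces take one additional element.
            stop = start + base + (1 if i < extra else 0)
            slices.append(slice(start, stop))
            start = stop
        per_dim.append(slices)
    # Combine the per-dimension pieces into every chunk combination.
    for combo in itertools.product(*per_dim):
        yield dict(zip(dim_names, combo))


chunks = list(iter_chunk_slices((4, 6), (2, 3), ("dimy", "dimx")))
```

With a split count of (2, 3) the generator produces six chunks, one per combination of the two row pieces and three column pieces.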
iter_dst_grid_subsets(yield_slice=False, yield_idx=None)[source]

Yield destination grid subsets.

Parameters:
  • yield_idx (int) – If a zero-based integer, only yield for this chunk index and skip everything else.
  • yield_slice (bool) – If True, yield the slice used on the destination grid.
Return type:

ocgis.spatial.grid.AbstractGrid

iter_src_grid_subsets(yield_dst=False, yield_idx=None)[source]

Yield source grid subset using the extent of its associated destination grid subset.

Parameters:
  • yield_dst (bool) – If True, yield the destination subset as well as the source grid subset.
  • yield_idx (int) – If a zero-based integer, only yield for this chunk index and skip everything else.
Return type:

tuple(ocgis.spatial.grid.AbstractGrid, slice-like)

optimized_bbox_subset

If True, use an optimized bounding box subset to spatially subset the source grid.

Parameters:value (str | bool) – If 'auto', choose the optimization based on grid isomorphism.
Return type:bool
src_grid

Get the source grid.

Returns:AbstractGrid
write_chunks()[source]

Write grid subsets to netCDF files using the provided filename templates. This will also generate ESMF regridding weights for each subset if requested.

write_esmf_weights(src_path, dst_path, wgt_path, src_grid=None, dst_grid=None)[source]

Write ESMF regridding weights for a source and destination filename combination.

Parameters:
  • src_path (str) – Full path to source file
  • dst_path (str) – Full path to destination file
  • wgt_path (str) – Path to output weight file
  • src_grid (ocgis.spatial.grid.AbstractGrid) – If provided, use this source grid for identifying ESMF parameters
  • dst_grid (ocgis.spatial.grid.AbstractGrid) – If provided, use this destination grid for identifying ESMF parameters

Base Classes

class ocgis.base.AbstractOcgisObject[source]

Bases: object

Base class for all ocgis objects.

class ocgis.base.AbstractInterfaceObject[source]

Bases: ocgis.base.AbstractOcgisObject

Base class for interface objects.

copy()[source]

Return a shallow copy of self.

deepcopy()[source]

Return a deep copy of self.
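
The difference between the two copy methods follows the usual Python semantics. The sketch below uses the standard library copy module on a hypothetical Container class, not an OCGIS object, to show that a shallow copy shares nested data with the original while a deep copy does not.

```python
import copy


class Container:
    """Hypothetical stand-in for an object holding mutable data."""

    def __init__(self, values):
        self.values = values


original = Container([1, 2, 3])
shallow = copy.copy(original)      # New object, same underlying list.
deep = copy.deepcopy(original)     # New object, independent list.

# Mutating the original's list is visible through the shallow copy only.
original.values.append(4)
```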

to_xarray(**kwargs)[source]

Convert this object to a type understood by xarray. This should be overloaded by subclasses.

class ocgis.base.AbstractNamedObject(name, aliases=None, source_name=-999, uid=None)[source]

Bases: ocgis.base.AbstractInterfaceObject

Base class for objects with a name.

Parameters:
  • name (str) – The object’s name.
  • aliases (sequence(str, ...)) – Alternative names for the object.
  • source_name (str) – Name of the object in its source data. Allows the object name and its source name to differ.
  • uid (int) – A unique integer identifier for the object.
append_alias(alias)[source]

Append a name alias to the list of the object’s aliases.

Parameters:alias (str) – The alias to append.
is_matched_by_alias(alias)[source]

Return True if the provided alias matches any of the object’s aliases.

Parameters:alias (str) – The alias to check.
Return type:bool
name

The object’s name.

Return type:str
set_name(name, aliases=None)[source]

Set the object’s name.

Parameters:
  • name (str) – The new name.
  • aliases (sequence(str, ...)) – New aliases for the object.
source_name

The object’s source name.

Return type:str
class ocgis.variable.base.AbstractContainer(name, aliases=None, source_name=-999, parent=None, uid=None, is_empty=None)[source]

Bases: ocgis.base.AbstractNamedObject

Base class for objects with a parent.

Note

Accepts all parameters to AbstractNamedObject.

Additional keyword arguments are:

Parameters:
  • parent (None | VariableCollection) – (=None) The parent collection for this container. A variable will always become a member of its parent.
  • is_empty (None | bool) – (=None) Set to True if this is an empty object.
dimensions
Returns:A dimension dictionary containing all dimensions associated with variables in the collection.
Return type:OrderedDict
driver

Get the parent’s driver class or object.

get_mask(*args, **kwarga)[source]
Returns:The object’s mask as a boolean array with same dimension as the object.
Return type:numpy.ndarray
group
Returns:The group index in the parent/child hierarchy. Returns None if this collection is the head.
Return type:None | list of str
has_initialized_parent
Returns:True if the object’s parent has been initialized.
Return type:bool
is_empty
Returns:True if the object is empty.
Return type:bool
parent

Get or set the parent collection.

Return type:ocgis.VariableCollection
set_mask(mask, **kwargs)[source]

Set the object’s mask.

Parameters:mask – A boolean mask array or None to remove the mask.
set_name(name, aliases=None)[source]

Set the name for the object.

See AbstractNamedObject for documentation.

class ocgis.spatial.grid.AbstractGrid(abstraction='auto')[source]

Bases: ocgis.base.AbstractOcgisObject

Base class for grid objects.

Parameters:abstraction (ocgis.constants.Topology) – The grid abstraction to use. If 'auto' (the default), use the highest order abstraction available.
abstraction

Get or set the spatial abstraction for the grid.

Parameters:abstraction (Topology) – The grid’s overloaded or highest order topology or spatial abstraction. AUTO should not be returned.
Return type:Topology
abstractions_available

Get the topologies / spatial abstractions available on the object. Tuple elements are of type ocgis.constants.Topology.

Return type:tuple
get_abstraction_geometry(**kwargs)[source]

Get the abstraction geometry variable for the grid.

Parameters:kwargs – Keyword arguments to the geometry get method. See get_point() for example.
Return type:GeometryVariable
is_abstraction_available(abstraction)[source]

Return True if the spatial abstraction is available on the grid.

Parameters:abstraction (ocgis.constants.GridAbstraction) – The spatial abstraction to check.
Return type:bool
class ocgis.variable.attributes.Attributes(attrs=None)[source]

Bases: ocgis.base.AbstractOcgisObject

Adds an attrs dictionary. Always converts dictionaries to collections.OrderedDict objects.

Parameters:attrs (dict) – A dictionary of attribute name/value pairs.
attrs

Get or set the attributes dictionary.

Parameters:value (dict) – The dictionary of attributes. Always converted to a collections.OrderedDict.
Return type:collections.OrderedDict
write_attributes_to_netcdf_object(target)[source]
Parameters:target (netCDF4.Variable | netCDF4.Dataset) – The attribute write target.

Spatial Objects

class ocgis.spatial.base.AbstractSpatialContainer(**kwargs)[source]

Bases: ocgis.variable.base.AbstractContainer, ocgis.spatial.base.AbstractOperationsSpatialObject

get_mask(*args, **kwargs)[source]

A spatial container’s mask is stored independently from the coordinate variables’ masks. The mask is actually a variable containing a mask. This approach ensures the mask may be persisted to file and retrieved/modified leaving all coordinate variables intact.

Note

See get_mask() for documentation.

set_mask(value, cascade=False)[source]

Set the spatial container’s mask from a boolean array or variable.

Parameters:
  • value (numpy.ndarray | Variable) – A mask array having the same shape as the grid. This may also be a variable with the same dimensions.
  • cascade – If True, cascade the mask along shared dimensions on the spatial container.
class ocgis.spatial.base.AbstractXYZSpatialContainer(**kwargs)[source]

Bases: ocgis.spatial.base.AbstractSpatialContainer

Abstract container for X, Y, and optionally Z coordinate variables. If x and y are not provided, then parent is required.

Parameters:
  • x (ocgis.Variable) – (=None) X-coordinate variable
  • y (ocgis.Variable) – (=None) Y-coordinate variable
  • parent (ocgis.Field) – (=None) Parent field object
  • mask (ocgis.Variable) – (=None) Mask variable
  • pos (tuple) – (=(0, 1)) Axis values for n-dimensional coordinate arrays
  • is_isomorphic – See grid_is_isomorphic documentation for ocgis.Field
archetype
Returns:archetype coordinate variable
Return type:Variable
coordinate_variables

See coordinate_variables()

get_member_variables(include_bounds=True)[source]

A spatial container is composed of numerous member variables defining coordinates, bounds, masks, and geometries. This method returns those variables if present on the current container object.

Parameters:include_bounds – If True, include any bounds variables associated with the grid members.
Return type:list of Variable
has_mask
Returns:True if a mask variable is allocated on the grid.
Return type:bool
has_z
Returns:True if the grid has a z-coordinate.
Return type:bool
is_isomorphic

See grid_is_isomorphic documentation for ocgis.Field

mask_variable
Returns:The mask variable associated with the grid. This will be None if no mask is present.
Return type:Variable
resolution

Returns the average spatial resolution along the x and y dimensions.

Return type:float
resolution_max

Returns the maximum spatial resolution between the x and y coordinate variables.

Return type:float
resolution_x

Returns the resolution of the x variable.

Return type:float
resolution_y

Returns the resolution of the y variable.

Return type:float
shape_global

Get the global shape across the current OcgVM.

Return type:tuple of int
Raises:EmptyObjectError
update_crs(to_crs, from_crs=None)[source]

Update the coordinate system in place.

Parameters:
  • to_crs (AbstractCRS) – The destination coordinate system.
  • from_crs (AbstractCRS) – Optional original coordinate system to temporarily assign to the data. Useful when the object’s coordinate system is different from the desired coordinate system.
x

Get or set the x-coordinate variable for the grid.

Return type:Variable
y

Get or set the y-coordinate variable for the grid.

Return type:Variable
z

Get or set the z-coordinate variable for the grid.

Return type:Variable
class ocgis.spatial.geomc.AbstractGeometryCoordinates(x=None, y=None, z=None, cindex='auto', packed=True, start_index='auto', hosted=False, **kwargs)[source]

Bases: ocgis.spatial.base.AbstractXYZSpatialContainer

Superclass for geometry coordinate objects. These objects manage coordinate arrays for subsetting and conversion.

Parameters:
  • x (Variable) – The x-coordinate variable. Required if no parent is provided.
  • y (Variable) – The y-coordinate variable. Required if no parent is provided.
  • z (Variable) – The z-coordinate variable. Not required.
  • cindex (None | Variable | str) – The element node connectivity variable. If provided, this is used to index into coordinate variables x, y, and/or z. If this is None, use coordinate variables’ first dimension as the element dimension (ragged arrays). If 'auto' (the default), attempt to retrieve an appropriate variable from the dimension map.
  • packed (bool) – If True, the element node connectivity variable has been de-duplicated.
  • start_index (int | str) – If 'auto', attempt to retrieve this value from the element node connectivity variable. The default is 0 if it cannot be found. An integer may also be provided.
  • hosted (bool) – If False, this object is not hosted by an unstructured grid. If True, it is hosted by an unstructured grid object. Hosted objects will return their parents for some classes of operations.
  • kwargs (dict) – Optional keyword arguments to the superclass.
abstraction

Return the generic abstraction for the geometry coordinates following

cindex

Provides an index into coordinate variables to extract coordinate values for elements. When setting, the first dimension is considered the representative element count dimension.

Return type:None | Variable
convert_to(target=<ConversionTarget.GEOMETRY_VARIABLE: 'geometry_variable'>, **kwargs)[source]

Convert the geometry coordinate object to various targets.

Parameters:
  • target (ConversionTarget) – The destination conversion target.
  • kwargs (dict) – Keyword arguments for the creation of the destination object.
Returns:

Varies depending on the conversion target.

element_dim

Get the element dimension. The size of the dimension is equivalent to the element count.

Return type:Dimension
get_distributed_slice(slc, **kwargs)[source]

Slice a distributed object.

Parameters:
Return type:

ocgis.spatial.geomc.AbstractGeometryCoordinates

get_element_node_connectivity_by_index(element_connectivity, idx)[source]

Return something that can be used for indexing into coordinate arrays to retrieve the coordinates for the current element.

Parameters:
  • element_connectivity (numpy.ndarray) – An element connectivity array with the first dimension/axis as the element dimension.
  • idx (int) – The element index.
Return type:

<used as NumPy index>
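
To show what indexing into coordinate arrays through an element node connectivity array can look like, the sketch below uses plain lists for two quadrilateral elements that share an edge. The data, the element_coordinates helper, and the one-based numbering are all illustrative assumptions; the real method returns an object suitable for NumPy indexing rather than coordinate pairs.

```python
# Node coordinates for six nodes (made-up values).
x = [0.0, 1.0, 1.0, 0.0, 2.0, 2.0]
y = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]

# Two elements described by node numbers. The numbering is one-based,
# as is common for data written by Fortran programs.
cindex = [[1, 2, 3, 4], [2, 5, 6, 3]]
start_index = 1


def element_coordinates(x, y, cindex, idx, start_index=0):
    """Return the (x, y) pairs for the nodes of element `idx`."""
    # Shift node numbers so they become zero-based list indices.
    node_ids = [n - start_index for n in cindex[idx]]
    return [(x[n], y[n]) for n in node_ids]


second_element = element_coordinates(x, y, cindex, 1, start_index=start_index)
```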

get_geometry_iterable(use_mask=True, hint_mask=None, use_memory_optimizations=None, with_index=True)[source]

Yield a tuple composed of the current iterator index and Shapely geometry object. If the geometry is masked, the geometry will be None. For example: (2, <Polygon>) or (3, None) if masked.

Parameters:
  • use_mask (bool) – If True, use a mask for iteration. This is retrieved from the object if hint_mask is None. If False, do not use the mask for iteration, yielding underlying data if the object is masked.
  • hint_mask (None | numpy.ndarray) – If None, use the object’s mask. If a boolean array, use this as the mask as opposed to the object’s mask. The array must have the same dimension as self.
  • use_memory_optimizations (None | bool) – If None, default to ocgis.env.USE_MEMORY_OPTIMIZATIONS. If True, do not eagerly load coordinates. If False, load all coordinates into memory improving performance by limiting IO.
  • with_index (bool) – If False, do not yield the current iteration index.
Returns:

tuple(int, <Shapely geometry>) | <Shapely geometry>
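
The yield behavior described for use_mask and with_index can be mimicked with a short generator. This is a sketch only: iter_geometries is a hypothetical function, strings stand in for Shapely geometry objects, and hint_mask and the memory optimization flag are not modeled.

```python
def iter_geometries(geoms, mask=None, use_mask=True, with_index=True):
    """Yield (index, geometry) pairs, replacing masked geometries with None."""
    for idx, geom in enumerate(geoms):
        masked = use_mask and mask is not None and mask[idx]
        value = None if masked else geom
        # Drop the index from the output when with_index is False.
        yield (idx, value) if with_index else value


geoms = ["polygon_a", "polygon_b", "polygon_c"]
mask = [False, False, True]

with_mask = list(iter_geometries(geoms, mask=mask))
without_mask = list(iter_geometries(geoms, mask=mask, use_mask=False))
values_only = list(iter_geometries(geoms, mask=mask, with_index=False))
```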

get_nearest(target, return_indices=False)[source]

Get nearest element to target geometry.

get_shapely_geometry(*args, **kwargs)[source]

Return a Shapely geometry object.

Return type:shapely.geometry.base.BaseGeometry
get_spatial_index()[source]

Get the spatial index.

has_bounds

Always return False. Geometry coordinate objects will never have bounds.

Return type:bool
has_multi

If True, this object represents multi-geometries (multi-polygon for example).

Return type:bool
is_vectorized

Always return False. Geometry coordinates are never vectorized.

Return type:bool
iter_records(use_mask=True)[source]

Generate fiona-compatible records.

multi_break_value

The break value to use for determining multi-geometries.

Returns:int | None
ndim

Get the representative dimension. This is typically 1.

Return type:int
node_dim

Get the node dimension.

Return type:Dimension
reduce_global()[source]

De-duplicate and reindex (reset start index) an element node connectivity variable. Operation is collective across the current VM. The new node dimension is distributed. Return a shallow copy of self for convenience.

Return type:AbstractGeometryCoordinates
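
One reading of "de-duplicate and reindex" is that the node numbers referenced by the elements are collapsed to a contiguous range beginning at the start index. The sketch below shows that interpretation in serial, plain Python. The reduce_connectivity helper is hypothetical, and the collective, distributed aspects of the real method are not represented.

```python
def reduce_connectivity(cindex, start_index=0):
    """Renumber node ids contiguously from start_index.

    Returns the renumbered connectivity and the sorted original node ids,
    where position i corresponds to new node id start_index + i.
    """
    unique_nodes = sorted({n for element in cindex for n in element})
    remap = {old: new for new, old in enumerate(unique_nodes, start=start_index)}
    new_cindex = [[remap[n] for n in element] for element in cindex]
    return new_cindex, unique_nodes


# Two triangles that reference a sparse set of global node ids.
new_cindex, kept_nodes = reduce_connectivity([[10, 11, 14], [11, 15, 14]])
```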
shape

Get the current size of the element dimension as a tuple. For example: (5,).

Return type:tuple
start_index

Get the start index. 0 is the default. For data written and read from Fortran natively, this is often 1. This will attempt to read an attribute from the element node connectivity variable called ocgis.constants.AttributeName.START_INDEX.

Return type:int
topology

Alias of abstraction.

Coordinate Systems

class ocgis.variable.crs.AbstractCRS(angular_units=<OcgisUnits.DEGREES: 'degrees'>, linear_units=None)[source]

Bases: ocgis.base.AbstractInterfaceObject

Base class for all OCGIS coordinate systems. Intended to allow differentiation between standard PROJ.4 coordinate systems and specialized OCGIS-supported coordinate systems.

Parameters:
dist

Here for variable compatibility.

format_spatial_object(spatial_obj, is_transform=False)[source]
Parameters:
  • spatial_obj (<varying>) – The spatial object to format.
  • is_transform (bool) – If True, this is a coordinate system transformation format call. CF attributes should be removed. If False, this is for a write and attributes should be left as is or overwritten if explicitly defined by the coordinate system.
classmethod fuzzy_check(value)[source]

Coordinate system definitions are often changing in interpreters. This function allows coordinate system definitions to vary slightly, but still be created appropriately.

This returns True if there is a fuzzy match.

Parameters:value (dict) – The coordinate system dictionary definition.
Returns:bool
classmethod get_wrap_action(state_src, state_dst)[source]
Parameters:
Returns:

The wrapping action to perform on state_src. (ocgis.constants.WrapAction)

Return type:

int

Raises:

NotImplementedError, ValueError

get_wrapped_state(target)[source]
Parameters:target (Field) – Return the wrapped state of a field. This function only checks grid centroids and geometry exteriors. Bounds/corners on the grid are excluded.
is_geographic
Returns:True if the coordinate system is considered geographic.
Return type:bool
is_wrappable
Returns:True if the coordinate system may be globally wrapped or unwrapped. A wrappable CRS will use degree units on ranges 0 to 360 and -180 to 180.
Return type:bool
prepare_geometry_variable(subset_geom, rhs_tol=10.0, inplace=True)[source]

Prepare a geometry variable for subsetting. This method:

  • Appropriately wraps subset geometries for spherical coordinate systems.
Parameters:
  • subset_geom (GeometryVariable) – The geometry variable to prepare.
  • rhs_tol (float) – The amount, in matching coordinate system units, to buffer the right hand selection geometry.
  • inplace (bool) – If True, modify the object in-place.
set_string_max_length_global(value=None)[source]

Here for variable compatibility.

to_xarray(**kwargs)[source]

Convert the CRS variable to an xarray.DataArray. This does not traverse the parent’s hierarchy. Use the conversion method on the variable’s parent to convert all variables in the collection.

Return type:xarray.DataArray
class ocgis.variable.crs.AbstractProj4CRS(angular_units=<OcgisUnits.DEGREES: 'degrees'>, linear_units=None)[source]

Bases: ocgis.variable.crs.AbstractCRS

Base class for coordinate systems that may be transformed using PROJ.4.

classmethod load_from_metadata(group_metadata)[source]

Create a coordinate system object using group metadata.

Parameters:group_metadata (dict) – The metadata to interpret. This should be for the current group only.
Returns:CRS

Abstract Drivers

class ocgis.driver.base.AbstractDriver(rd)[source]

Bases: ocgis.base.AbstractOcgisObject

Base class for all drivers.

Parameters:rd (RequestDataset) – The input request dataset object.
class ocgis.driver.nc.AbstractDriverNetcdfCF(rd)[source]

Bases: ocgis.driver.nc.DriverNetcdf

class ocgis.driver.base.AbstractTabularDriver(rd)[source]

Bases: ocgis.driver.base.AbstractDriver

Base class for tabular drivers (no optimal single variable access).