GDALMDArray C++ API

class GDALMDArray : public virtual GDALAbstractMDArray, public GDALIHasAttribute

Class modeling a multi-dimensional array.

It has a name, values organized as an array and a list of GDALAttribute.

This is based on the HDF5 dataset concept.

Since

GDAL 3.1

Subclassed by GDALMDArrayFromRasterBand

Public Functions

GUInt64 GetTotalCopyCost() const

Return a total “cost” to copy the array.

Used as a parameter for CopyFrom()

virtual bool CopyFrom(GDALDataset *poSrcDS, const GDALMDArray *poSrcArray, bool bStrict, GUInt64 &nCurCost, const GUInt64 nTotalCost, GDALProgressFunc pfnProgress, void *pProgressData)

Copy the content of an array into a new (generally empty) array.

Parameters
  • poSrcDS – Source dataset. Might be nullptr (but for correct behavior of some output drivers this is not recommended)

  • poSrcArray – Source array. Should NOT be nullptr.

  • bStrict – Whether to enable strict mode. In strict mode, any error will stop the copy. In relaxed mode, the copy will continue where possible.

  • nCurCost – Should be provided as a variable initially set to 0.

  • nTotalCost – Total cost from GetTotalCopyCost().

  • pfnProgress – Progress callback, or nullptr.

  • pProgressData – Progress user data, or nullptr.

Returns

true in case of success (or partial success if bStrict == false).

virtual bool IsWritable() const = 0

Return whether an array is writable.

virtual const std::string &GetFilename() const = 0

Return the filename that contains that array.

This is used in particular for caching.

Might be empty if the array is not linked to a file.

Since

GDAL 3.4

virtual CSLConstList GetStructuralInfo() const

Return structural information on the array.

This may be the compression, etc.

The return value should not be freed and is valid until the GDALMDArray is released or this function is called again.

This is the same as the C function GDALMDArrayGetStructuralInfo().

virtual const std::string &GetUnit() const

Return the array unit.

Values should conform as much as possible with those allowed by the NetCDF CF conventions: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#units but others might be returned.

A few examples are “meter”, “degrees”, “second”, etc. An empty value means unknown.

This is the same as the C function GDALMDArrayGetUnit()

virtual bool SetUnit(const std::string &osUnit)

Set the variable unit.

Values should conform as much as possible with those allowed by the NetCDF CF conventions: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#units but other values may be set.

A few examples are “meter”, “degrees”, “second”, etc. An empty value means unknown.

This is the same as the C function GDALMDArraySetUnit()

Note

Driver implementation: optionally implemented.

Parameters

osUnit – unit name.

Returns

true in case of success.

virtual bool SetSpatialRef(const OGRSpatialReference *poSRS)

Assign a spatial reference system object to the array.

This is the same as the C function GDALMDArraySetSpatialRef().

virtual std::shared_ptr<OGRSpatialReference> GetSpatialRef() const

Return the spatial reference system object associated with the array.

This is the same as the C function GDALMDArrayGetSpatialRef().

virtual const void *GetRawNoDataValue() const

Return the nodata value as a “raw” value.

The value returned might be nullptr in case of no nodata value. When a nodata value is registered, a non-nullptr will be returned whose size in bytes is GetDataType().GetSize().

The returned value should not be modified or freed. It is valid until the array is destroyed, or the next call to GetRawNoDataValue() or SetRawNoDataValue(), or any similar methods.

This is the same as the C function GDALMDArrayGetRawNoDataValue().

Note

Driver implementation: this method shall be implemented if nodata is supported.

Returns

nullptr or a pointer to GetDataType().GetSize() bytes.
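As an illustration of how a caller might interpret the raw pointer, the following standalone sketch copies the bytes into a typed value. DecodeRawNoData is a hypothetical helper, not part of GDAL, and its rawSize argument stands in for GetDataType().GetSize(); the caller is assumed to know which C++ type matches the array data type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not a GDAL function): decode a raw nodata buffer.
// pRaw plays the role of the pointer returned by GetRawNoDataValue(),
// rawSize the role of GetDataType().GetSize().
template <typename T>
bool DecodeRawNoData(const void *pRaw, std::size_t rawSize, T &out)
{
    if (pRaw == nullptr)
        return false;  // no nodata value registered
    // The caller must choose T so that it matches the array data type.
    assert(rawSize == sizeof(T));
    // memcpy avoids alignment and aliasing problems with a direct cast.
    std::memcpy(&out, pRaw, sizeof(T));
    return true;
}
```

Copying the value out immediately also sidesteps the lifetime rule above, since the returned pointer may be invalidated by a later call.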

double GetNoDataValueAsDouble(bool *pbHasNoData = nullptr) const

Return the nodata value as a double.

If no nodata value is registered, or if it cannot be converted to double, *pbHasNoData (when provided) is set to false.

This is the same as the C function GDALMDArrayGetNoDataValueAsDouble().

Parameters

pbHasNoData – Pointer to an output boolean that will be set to true if a nodata value exists and can be converted to double. Might be nullptr.

Returns

the nodata value as a double. A 0.0 value might also indicate the absence of a nodata value or an error in the conversion (*pbHasNoData will be set to false then).

virtual bool SetRawNoDataValue(const void *pRawNoData)

Set the nodata value as a “raw” value.

The value passed might be nullptr in case of no nodata value. When a nodata value is registered, a non-nullptr whose size in bytes is GetDataType().GetSize() must be passed.

This is the same as the C function GDALMDArraySetRawNoDataValue().

Note

Driver implementation: this method shall be implemented if setting nodata is supported.

Returns

true in case of success.

bool SetNoDataValue(double dfNoData)

Set the nodata value as a double.

If the natural data type of the attribute/array is not double, type conversion will occur to the type returned by GetDataType().

This is the same as the C function GDALMDArraySetNoDataValueAsDouble().

Returns

true in case of success.

virtual double GetOffset(bool *pbHasOffset = nullptr, GDALDataType *peStorageType = nullptr) const

Get the offset value to apply to raw values.

unscaled_value = raw_value * GetScale() + GetOffset()

This is the same as the C function GDALMDArrayGetOffset().

Note

Driver implementation: this method shall be implemented if getting the offset is supported.

Parameters
  • pbHasOffset – Pointer to an output boolean that will be set to true if an offset value exists. Might be nullptr.

  • peStorageType – Pointer to an output GDALDataType that will be set to the storage type of the offset value, when known/relevant. Otherwise will be set to GDT_Unknown. Might be nullptr. Since GDAL 3.3.

Returns

the offset value. A 0.0 value might also indicate the absence of an offset value.

virtual double GetScale(bool *pbHasScale = nullptr, GDALDataType *peStorageType = nullptr) const

Get the scale value to apply to raw values.

unscaled_value = raw_value * GetScale() + GetOffset()

This is the same as the C function GDALMDArrayGetScale().

Note

Driver implementation: this method shall be implemented if getting the scale is supported.

Parameters
  • pbHasScale – Pointer to an output boolean that will be set to true if a scale value exists. Might be nullptr.

  • peStorageType – Pointer to an output GDALDataType that will be set to the storage type of the scale value, when known/relevant. Otherwise will be set to GDT_Unknown. Might be nullptr. Since GDAL 3.3.

Returns

the scale value. A 1.0 value might also indicate the absence of a scale value.
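The formula above can be sketched in standalone C++ as follows. ScaleOffset and Unscale are hypothetical names used only for illustration; the struct stands in for what GetScale() and GetOffset() report, including the case where no scale or offset is set and the neutral values 1.0 and 0.0 apply.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the values reported by GetScale()/GetOffset().
struct ScaleOffset
{
    bool hasScale = false;
    double scale = 1.0;   // neutral value when no scale is set
    bool hasOffset = false;
    double offset = 0.0;  // neutral value when no offset is set
};

// unscaled_value = raw_value * scale + offset
inline double Unscale(double rawValue, const ScaleOffset &so)
{
    assert(std::isfinite(so.scale) && std::isfinite(so.offset));
    const double scale = so.hasScale ? so.scale : 1.0;
    const double offset = so.hasOffset ? so.offset : 0.0;
    return rawValue * scale + offset;
}
```

With a scale of 0.5 and an offset of 10, a raw value of 4 unscales to 12. With neither set, raw values pass through unchanged.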

virtual bool SetOffset(double dfOffset, GDALDataType eStorageType = GDT_Unknown)

Set the offset value to apply to raw values.

unscaled_value = raw_value * GetScale() + GetOffset()

This is the same as the C function GDALMDArraySetOffset() / GDALMDArraySetOffsetEx().

Note

Driver implementation: this method shall be implemented if setting offset is supported.

Parameters
  • dfOffset – Offset

  • eStorageType – Data type with which to create the attribute that may store the offset. Added in GDAL 3.3. If left at its GDT_Unknown value, the implementation will automatically decide the data type. Note that changing the data type after the initial setting might not be supported.

Returns

true in case of success.

virtual bool SetScale(double dfScale, GDALDataType eStorageType = GDT_Unknown)

Set the scale value to apply to raw values.

unscaled_value = raw_value * GetScale() + GetOffset()

This is the same as the C function GDALMDArraySetScale() / GDALMDArraySetScaleEx().

Note

Driver implementation: this method shall be implemented if setting scale is supported.

Parameters
  • dfScale – scale

  • eStorageType – Data type with which to create the attribute that may store the scale. Added in GDAL 3.3. If left at its GDT_Unknown value, the implementation will automatically decide the data type. Note that changing the data type after the initial setting might not be supported.

Returns

true in case of success.

std::shared_ptr<GDALMDArray> GetView(const std::string &viewExpr) const

Return a view of the array using slicing or field access.

The slice expression uses the same syntax as NumPy basic slicing and indexing. See https://www.numpy.org/devdocs/reference/arrays.indexing.html#basic-slicing-and-indexing. Alternatively, it can use field access by name. See https://www.numpy.org/devdocs/reference/arrays.indexing.html#field-access.

Multiple [] bracket elements can be concatenated, with a slice expression or field name inside each.

For basic slicing and indexing, each [] bracket element can hold a comma-separated list of indexes that apply to successive source dimensions, using integer indexing (e.g. 1), range indexing (start:stop:step), ellipsis (...) or newaxis.

Examples with a 2-dimensional array whose content is [[0,1,2,3],[4,5,6,7]].

  • GetView("[1][2]"): returns a 0-dimensional/scalar array with the value at index 1 in the first dimension, and index 2 in the second dimension from the source array. That is 6.

  • GetView("[1]")->GetView("[2]"): same as above. The above is actually implemented internally using this intermediate slicing approach.

  • GetView("[1,2]"): same as above, but a bit more performant.

  • GetView("[1]"): returns a 1-dimensional array, sliced at index 1 in the first dimension. That is [4,5,6,7].

  • GetView("[:,2]"): returns a 1-dimensional array, sliced at index 2 in the second dimension. That is [2,6].

  • GetView("[:,2:3:]"): returns a 2-dimensional array, sliced at index 2 in the second dimension. That is [[2],[6]].

  • GetView("[::,2]"): same as GetView("[:,2]"). That is [2,6].

  • GetView("[...,2]"): same as GetView("[:,2]") in this case, since the ellipsis only expands to one dimension here.

  • GetView("[:,::2]"): returns a 2-dimensional array, with even-indexed elements of the second dimension. That is [[0,2],[4,6]].

  • GetView("[:,1::2]"): returns a 2-dimensional array, with odd-indexed elements of the second dimension. That is [[1,3],[5,7]].

  • GetView("[:,1:3:]"): returns a 2-dimensional array, with elements of the second dimension whose index is in the half-open range [1,3). That is [[1,2],[5,6]].

  • GetView("[::-1,:]"): returns a 2-dimensional array, with the values in the first dimension reversed. That is [[4,5,6,7],[0,1,2,3]].

  • GetView("[newaxis,...]"): returns a 3-dimensional array, with an additional dimension of size 1 put at the beginning. That is [[[0,1,2,3],[4,5,6,7]]].

One difference with NumPy behavior is that ranges that would result in zero elements are not allowed (dimensions of size 0 not being allowed in the GDAL multidimensional model).
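The range-based examples above, and the rule that empty ranges are rejected, can be reproduced with a short standalone sketch. RangeIndices and SliceColumns are hypothetical helpers that only model the described semantics for positive steps and explicit bounds; they are not how GDAL implements views, and they copy values rather than referencing the source.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a GDAL function): indices selected by a
// start:stop:step range with explicit bounds and a positive step.
std::vector<std::size_t> RangeIndices(std::size_t dimSize, std::size_t start,
                                      std::size_t stop, std::size_t step)
{
    assert(step > 0);
    if (stop > dimSize)
        stop = dimSize;  // like NumPy, clamp the stop bound to the dimension
    std::vector<std::size_t> indices;
    for (std::size_t i = start; i < stop; i += step)
        indices.push_back(i);
    // Unlike NumPy, a range selecting zero elements is treated as an error.
    if (indices.empty())
        throw std::invalid_argument("range selects zero elements");
    return indices;
}

// Apply the range to the second dimension of a rectangular 2D array,
// keeping both dimensions, in the spirit of "[:,start:stop:step]".
std::vector<std::vector<int>> SliceColumns(
    const std::vector<std::vector<int>> &src, std::size_t start,
    std::size_t stop, std::size_t step)
{
    const std::size_t nCols = src.empty() ? 0 : src.front().size();
    const std::vector<std::size_t> cols =
        RangeIndices(nCols, start, stop, step);
    std::vector<std::vector<int>> out;
    for (const auto &row : src)
    {
        std::vector<int> outRow;
        for (std::size_t c : cols)
            outRow.push_back(row[c]);
        out.push_back(outRow);
    }
    return out;
}
```

For the source content [[0,1,2,3],[4,5,6,7]], SliceColumns(src, 1, 4, 2) yields [[1,3],[5,7]] and SliceColumns(src, 2, 3, 1) yields [[2],[6]], while RangeIndices(4, 3, 3, 1) throws because the range is empty.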

For field access, the syntax to use is ["field_name"] or ['field_name']. Specifying multiple fields is not currently supported.

Both types of access can be combined, e.g. GetView("[1]['field_name']").

The returned array holds a reference to the original one, and thus is a view of it (not a copy). If the content of the original array changes, so does the content of the view array. When using basic slicing and indexing, the view can be written if the underlying array is writable.

This is the same as the C function GDALMDArrayGetView()

Note

When using the GDAL Python bindings, natural Python syntax can be used. That is, ar[0,::,1]["foo"] will be internally translated to ar.GetView("[0,::,1]['foo']").

Note

When using the C++ API and integer indexing only, you may use the at(idx0, idx1, ...) method.

Parameters

viewExpr – Expression expressing basic slicing and indexing, or field access.

Returns

a new array, that holds a reference to the original one, and thus is a view of it (not a copy), or nullptr in case of error.

std::shared_ptr<GDALMDArray> operator[](const std::string &fieldName) const

Return a view of the array using field access.

Equivalent of GetView("['fieldName']").

Note

When operating on a shared_ptr, use the (*array)["fieldName"] syntax.

inline std::shared_ptr<GDALMDArray> at(GUInt64 idx, GUInt64VarArg... tail) const

Return a view of the array using integer indexing.

Equivalent of GetView("[indices_0,indices_1,...,indices_last]").

Example:

ar->at(0,3,2)
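To make the equivalence concrete, this standalone sketch builds the view expression that corresponds to a list of integer indices. IndexViewExpr is a hypothetical helper written for illustration; it does not reflect how at() is implemented in GDAL.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <string>

// Hypothetical helper (not a GDAL function): format integer indices as
// the view expression that at() is documented to be equivalent to.
std::string IndexViewExpr(std::initializer_list<std::uint64_t> indices)
{
    assert(indices.size() > 0);  // at() takes at least one index
    std::string expr = "[";
    bool first = true;
    for (std::uint64_t idx : indices)
    {
        if (!first)
            expr += ',';
        expr += std::to_string(idx);
        first = false;
    }
    expr += ']';
    return expr;
}
```

Under this sketch, IndexViewExpr({0, 3, 2}) produces "[0,3,2]", the expression that ar->at(0,3,2) stands for.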

virtual std::shared_ptr<GDALMDArray> Transpose(const std::vector<int> &anMapNewAxisToOldAxis) const

Return a view of the array whose axes have been reordered.

The anMapNewAxisToOldAxis parameter should contain all the values between 0 and GetDimensionCount() - 1, each only once. -1 can be used as a special index value to ask for the insertion of a new axis of size 1. The new array will have anMapNewAxisToOldAxis.size() axes, and if i is the index of one of its dimensions, it corresponds to the axis of index anMapNewAxisToOldAxis[i] in the current array.

This is similar to the numpy.transpose() method.

The returned array holds a reference to the original one, and thus is a view of it (not a copy). If the content of the original array changes, so does the content of the view array. The view can be written if the underlying array is writable.

Note that I/O performance in such a transposed view might be poor.

This is the same as the C function GDALMDArrayTranspose().

Returns

a new array, that holds a reference to the original one, and thus is a view of it (not a copy), or nullptr in case of error.
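The axis mapping rules can be illustrated with a standalone sketch that derives the shape of the transposed view from the shape of the source. TransposedShape is a hypothetical helper; returning an empty vector for an invalid mapping is a choice made here to mirror the nullptr error return, not GDAL behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a GDAL function): shape of the view described
// by anMapNewAxisToOldAxis. Returns an empty vector if the mapping is
// invalid (index out of range, repeated axis, or missing axis).
std::vector<std::uint64_t> TransposedShape(
    const std::vector<std::uint64_t> &oldShape,
    const std::vector<int> &anMapNewAxisToOldAxis)
{
    std::vector<bool> seen(oldShape.size(), false);
    std::vector<std::uint64_t> newShape;
    for (int oldAxis : anMapNewAxisToOldAxis)
    {
        if (oldAxis == -1)
        {
            newShape.push_back(1);  // inserted axis of size 1
            continue;
        }
        if (oldAxis < 0 ||
            static_cast<std::size_t>(oldAxis) >= oldShape.size() ||
            seen[oldAxis])
            return {};
        seen[oldAxis] = true;
        newShape.push_back(oldShape[oldAxis]);
    }
    for (bool s : seen)
        if (!s)
            return {};  // every old axis must be used exactly once
    // The view has one axis per entry of the mapping.
    assert(newShape.size() == anMapNewAxisToOldAxis.size());
    return newShape;
}
```

For a source of shape (2, 3, 4), the mapping {2, -1, 0, 1} gives a view of shape (4, 1, 2, 3), whereas {0, 0, 1} is rejected because axis 0 is repeated and axis 2 is missing.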

std::shared_ptr<GDALMDArray> GetUnscaled() const

Return an array that is the unscaled version of the current one.

That is, each value of the unscaled array is computed as unscaled_value = raw_value * GetScale() + GetOffset().

Starting with GDAL 3.3, the Write() method is implemented and will convert from unscaled values to raw values.

This is the same as the C function GDALMDArrayGetUnscaled().

Returns

a new array, that holds a reference to the original one, and thus is a view of it (not a copy), or nullptr in case of error.
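Writing through the unscaled view implies inverting the formula. The sketch below shows one way that conversion could look for 16-bit integer storage. ToRawInt16 and ToUnscaled are hypothetical names, and rounding to the nearest integer is an assumption made for this example, not a statement about what GDAL does.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Forward direction: unscaled_value = raw_value * scale + offset
inline double ToUnscaled(std::int16_t raw, double scale, double offset)
{
    return raw * scale + offset;
}

// Inverse direction: raw_value = (unscaled_value - offset) / scale.
// Rounding to the nearest integer is an assumption of this sketch, and
// values outside the int16 range are not handled.
inline std::int16_t ToRawInt16(double unscaled, double scale, double offset)
{
    assert(scale != 0.0);  // a zero scale cannot be inverted
    return static_cast<std::int16_t>(std::lround((unscaled - offset) / scale));
}
```

With a scale of 0.5 and an offset of 10, the unscaled value 12 maps back to the raw value 4, and ToUnscaled(4, 0.5, 10.0) returns 12 again.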

virtual std::shared_ptr<GDALMDArray> GetMask(CSLConstList papszOptions) const

Return an array that is a mask for the current array.

This array will be of type Byte, with values set to 0 to indicate invalid pixels of the current array, and values set to 1 to indicate valid pixels.

The generic implementation honours the NoDataValue, as well as various netCDF CF attributes: missing_value, _FillValue, valid_min, valid_max and valid_range.

This is the same as the C function GDALMDArrayGetMask().

Parameters

papszOptions – NULL-terminated list of options, or NULL.

Returns

a new array, that holds a reference to the original one, and thus is a view of it (not a copy), or nullptr in case of error.
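As a rough model of the generic behavior described above, the following standalone sketch marks a value as invalid when it equals the nodata value or falls outside a valid range. MaskRules and ComputeMask are hypothetical names; only a nodata value plus valid_min/valid_max are modeled, and the exact comparison rules used by GDAL may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of the rules a mask could honour.
struct MaskRules
{
    bool hasNoData = false;
    double noData = 0.0;
    bool hasValidMin = false;
    double validMin = 0.0;
    bool hasValidMax = false;
    double validMax = 0.0;
};

// Returns one byte per value: 1 for valid, 0 for invalid.
std::vector<std::uint8_t> ComputeMask(const std::vector<double> &values,
                                      const MaskRules &rules)
{
    assert(!(rules.hasValidMin && rules.hasValidMax) ||
           rules.validMin <= rules.validMax);
    std::vector<std::uint8_t> mask;
    mask.reserve(values.size());
    for (double v : values)
    {
        bool valid = true;
        if (rules.hasNoData && v == rules.noData)
            valid = false;
        if (rules.hasValidMin && v < rules.validMin)
            valid = false;
        if (rules.hasValidMax && v > rules.validMax)
            valid = false;
        mask.push_back(valid ? 1 : 0);
    }
    return mask;
}
```

With a nodata value of -9999 and a valid range of [0, 100], the values {-9999, 0, 5, 120} produce the mask {0, 1, 1, 0}.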

std::shared_ptr<GDALMDArray> GetResampled(const std::vector<std::shared_ptr<GDALDimension>> &apoNewDims, GDALRIOResampleAlg resampleAlg, const OGRSpatialReference *poTargetSRS, CSLConstList papszOptions) const

Return an array that is a resampled / reprojected view of the current array.

This is the same as the C function GDALMDArrayGetResampled().

Currently this method can only resample along the last 2 dimensions.

Since

GDAL 3.4

Parameters
  • apoNewDims – New dimensions. Its size should be GetDimensionCount(). apoNewDims[i] can be NULL to let the method automatically determine it.

  • resampleAlg – Resampling algorithm

  • poTargetSRS – Target SRS, or nullptr

  • papszOptions – NULL-terminated list of options, or NULL.

Returns

a new array, that holds a reference to the original one, and thus is a view of it (not a copy), or nullptr in case of error.

virtual GDALDataset *AsClassicDataset(size_t iXDim, size_t iYDim) const

Return a view of this array as a “classic” GDALDataset (i.e. 2D).

In the case of > 2D arrays, additional dimensions will be represented as raster bands.

The “reverse” method is GDALRasterBand::AsMDArray().

This is the same as the C function GDALMDArrayAsClassicDataset().

Parameters
  • iXDim – Index of the dimension that will be used as the X/width axis.

  • iYDim – Index of the dimension that will be used as the Y/height axis. Ignored if the dimension count is 1.

Returns

a new GDALDataset that must be freed with GDALClose(), or nullptr

virtual CPLErr GetStatistics(bool bApproxOK, bool bForce, double *pdfMin, double *pdfMax, double *pdfMean, double *padfStdDev, GUInt64 *pnValidCount, GDALProgressFunc pfnProgress, void *pProgressData)

Fetch statistics.

Returns the minimum, maximum, mean and standard deviation of all pixel values in this array.

If bForce is FALSE results will only be returned if it can be done quickly (i.e. without scanning the data). If bForce is FALSE and results cannot be returned efficiently, the method will return CE_Warning but no warning will have been issued. This is a non-standard use of the CE_Warning return value to indicate “nothing done”.

When cached statistics are not available, and bForce is TRUE, ComputeStatistics() is called.

Note that file formats using PAM (Persistent Auxiliary Metadata) services will generally cache statistics in the .aux.xml file allowing fast fetch after the first request.

Cached statistics can be cleared with GDALDataset::ClearStatistics().

This method is the same as the C function GDALMDArrayGetStatistics().

Since

GDAL 3.2

Parameters
  • bApproxOK – Currently ignored. In the future, should be set to true if statistics on the whole array are desired, or to false if a subset of it may be used.

  • bForce – If false, statistics will only be returned if it can be done without rescanning the image.

  • pdfMin – Location into which to load image minimum (may be NULL).

  • pdfMax – Location into which to load image maximum (may be NULL).

  • pdfMean – Location into which to load image mean (may be NULL).

  • pdfStdDev – Location into which to load image standard deviation (may be NULL).

  • pnValidCount – Number of samples whose value is different from the nodata value. (may be NULL)

  • pfnProgress – a function to call to report progress, or NULL.

  • pProgressData – application data to pass to the progress function.

Returns

CE_None on success, CE_Warning if no values returned, CE_Failure if an error occurs.

virtual bool ComputeStatistics(bool bApproxOK, double *pdfMin, double *pdfMax, double *pdfMean, double *pdfStdDev, GUInt64 *pnValidCount, GDALProgressFunc, void *pProgressData)

Compute statistics.

Returns the minimum, maximum, mean and standard deviation of all pixel values in this array.

Pixels taken into account in statistics are those whose mask value (as determined by GetMask()) is non-zero.

Once computed, the statistics will generally be “set” back on the owning dataset.

Cached statistics can be cleared with GDALDataset::ClearStatistics().

This method is the same as the C function GDALMDArrayComputeStatistics().

Since

GDAL 3.2

Parameters
  • bApproxOK – Currently ignored. In the future, should be set to true if statistics on the whole array are desired, or to false if a subset of it may be used.

  • pdfMin – Location into which to load image minimum (may be NULL).

  • pdfMax – Location into which to load image maximum (may be NULL).

  • pdfMean – Location into which to load image mean (may be NULL).

  • pdfStdDev – Location into which to load image standard deviation (may be NULL).

  • pnValidCount – Number of samples whose value is different from the nodata value. (may be NULL)

  • pfnProgress – a function to call to report progress, or NULL.

  • pProgressData – application data to pass to the progress function.

Returns

true on success
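The relationship between the mask and the reported statistics can be sketched in standalone C++. ArrayStats and ComputeMaskedStats are hypothetical names, and the use of the population standard deviation is an assumption of this example; the source text does not say which estimator is used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result holder mirroring the output parameters above.
struct ArrayStats
{
    double min = 0.0;
    double max = 0.0;
    double mean = 0.0;
    double stdDev = 0.0;
    std::uint64_t validCount = 0;
};

// Only values whose mask entry is non-zero contribute to the statistics.
// Returns false when no valid value is found.
bool ComputeMaskedStats(const std::vector<double> &values,
                        const std::vector<std::uint8_t> &mask,
                        ArrayStats &out)
{
    assert(values.size() == mask.size());
    double sum = 0.0, sumSq = 0.0, mn = 0.0, mx = 0.0;
    std::uint64_t n = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (mask[i] == 0)
            continue;  // masked out, e.g. nodata
        const double v = values[i];
        if (n == 0 || v < mn)
            mn = v;
        if (n == 0 || v > mx)
            mx = v;
        sum += v;
        sumSq += v * v;
        ++n;
    }
    if (n == 0)
        return false;
    const double mean = sum / static_cast<double>(n);
    double variance = sumSq / static_cast<double>(n) - mean * mean;
    if (variance < 0.0)
        variance = 0.0;  // guard against rounding error
    out.min = mn;
    out.max = mx;
    out.mean = mean;
    out.stdDev = std::sqrt(variance);  // population estimator (assumed)
    out.validCount = n;
    return true;
}
```

For the values {1, 2, 3, 4, 100} with the last one masked out, this gives a minimum of 1, a maximum of 4, a mean of 2.5 and a valid count of 4.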

virtual void ClearStatistics()

Clear statistics.

Since

GDAL 3.4

virtual std::vector<std::shared_ptr<GDALMDArray>> GetCoordinateVariables() const

Return coordinate variables.

Coordinate variables are an alternate way of indexing an array that can sometimes be used. For example, an array collected through remote sensing might be indexed by (scanline, pixel). But there can be longitude and latitude arrays alongside that are also both indexed by (scanline, pixel), and are referenced from operational arrays for reprojection purposes.

For netCDF, this will return the arrays referenced by the “coordinates” attribute.

This method is the same as the C function GDALMDArrayGetCoordinateVariables().

Since

GDAL 3.4

Returns

a vector of arrays

bool AdviseRead(const GUInt64 *arrayStartIdx, const size_t *count, CSLConstList papszOptions = nullptr) const

Advise driver of upcoming read requests.

Some GDAL drivers operate more efficiently if they know in advance what set of upcoming read requests will be made. The AdviseRead() method allows an application to notify the driver of the region of interest.

Many drivers just ignore the AdviseRead() call, but it can dramatically accelerate access via some drivers. One such case is when reading through a DAP dataset with the netCDF driver (an in-memory cache array is then created with the region of interest defined by AdviseRead()).

This is the same as the C function GDALMDArrayAdviseRead().

Since

GDAL 3.2

Parameters
  • arrayStartIdx – Values representing the starting index to read in each dimension (in [0, aoDims[i].GetSize()-1] range). Array of GetDimensionCount() values. Can be nullptr as a synonym for [0 for i in range(GetDimensionCount())].

  • count – Values representing the number of values to extract in each dimension. Array of GetDimensionCount() values. Can be nullptr as a synonym for [aoDims[i].GetSize() - arrayStartIdx[i] for i in range(GetDimensionCount())].

  • papszOptions – Driver specific options, or nullptr. Consult driver documentation.

Returns

true in case of success (ignoring the advice is a success)
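The nullptr defaults for arrayStartIdx and count can be spelled out with a standalone sketch. ReadWindow and ResolveWindow are hypothetical names, and dimSizes stands in for the aoDims[i].GetSize() values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for a fully resolved region of interest.
struct ReadWindow
{
    std::vector<std::uint64_t> start;
    std::vector<std::size_t> count;
};

// Expand nullptr arguments into their documented defaults:
// start defaults to 0, count defaults to size - start in each dimension.
ReadWindow ResolveWindow(const std::vector<std::uint64_t> &dimSizes,
                         const std::uint64_t *arrayStartIdx,
                         const std::size_t *count)
{
    ReadWindow w;
    for (std::size_t i = 0; i < dimSizes.size(); ++i)
    {
        const std::uint64_t start = arrayStartIdx ? arrayStartIdx[i] : 0;
        assert(start < dimSizes[i]);  // must lie in [0, size - 1]
        w.start.push_back(start);
        w.count.push_back(count ? count[i]
                                : static_cast<std::size_t>(dimSizes[i] - start));
    }
    return w;
}
```

For dimensions of size (10, 20), passing a start of (2, 5) with a null count resolves to a count of (8, 15), and passing nullptr for both resolves to the whole array.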

bool IsRegularlySpaced(double &dfStart, double &dfIncrement) const

Returns whether an array is a 1D regularly spaced array.

Parameters
  • dfStart – [out] First value in the array.

  • dfIncrement – [out] Increment/spacing between consecutive values.

Returns

true if the array is regularly spaced.
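A check of this kind might look like the following standalone sketch operating on an in-memory vector. CheckRegularSpacing is a hypothetical helper, and the relative tolerance of 1e-3 is an assumption chosen for the example; the source text does not state which tolerance the method applies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a GDAL function): test whether consecutive
// values differ by a constant increment, within a relative tolerance.
bool CheckRegularSpacing(const std::vector<double> &values, double &dfStart,
                         double &dfIncrement, double relTol = 1e-3)
{
    assert(relTol >= 0.0);
    if (values.size() < 2)
        return false;  // spacing is undefined for fewer than two values
    const double increment = (values.back() - values.front()) /
                             static_cast<double>(values.size() - 1);
    if (increment == 0.0)
        return false;
    for (std::size_t i = 1; i < values.size(); ++i)
    {
        const double step = values[i] - values[i - 1];
        if (std::fabs(step - increment) > relTol * std::fabs(increment))
            return false;
    }
    dfStart = values.front();
    dfIncrement = increment;
    return true;
}
```

The values {10, 20, 30, 40} are reported as regularly spaced with a start of 10 and an increment of 10, whereas {0, 1, 3} are not.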

bool GuessGeoTransform(size_t nDimX, size_t nDimY, bool bPixelIsPoint, double adfGeoTransform[6]) const

Returns whether the 2 specified dimensions form a geotransform.

Parameters
  • nDimX – Index of the X axis.

  • nDimY – Index of the Y axis.

  • bPixelIsPoint – Whether the geotransform should be returned with the pixel-is-point (pixel-center) convention (bPixelIsPoint = true), or with the pixel-is-area (top left corner) convention (bPixelIsPoint = false).

  • adfGeoTransform – [out] Computed geotransform.

Returns

true if a geotransform could be computed.
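The difference between the two conventions can be illustrated with a standalone sketch. MakeGeoTransform is a hypothetical helper, and it assumes that the start values of the X and Y dimensions denote pixel centers, so the pixel-is-area origin is shifted by half an increment; this reading is an assumption of the example rather than something stated above.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not a GDAL function): build a 6-coefficient
// geotransform from regularly spaced X and Y coordinates.
// Assumes xStart/yStart are pixel-center coordinates.
std::array<double, 6> MakeGeoTransform(double xStart, double xInc,
                                       double yStart, double yInc,
                                       bool bPixelIsPoint)
{
    assert(xInc != 0.0 && yInc != 0.0);
    // Pixel-is-area: move the origin from the pixel center to its corner.
    const double shiftX = bPixelIsPoint ? 0.0 : xInc / 2.0;
    const double shiftY = bPixelIsPoint ? 0.0 : yInc / 2.0;
    return {{xStart - shiftX, xInc, 0.0, yStart - shiftY, 0.0, yInc}};
}
```

With centers starting at (100.5, 49.5) and increments of (1, -1), the pixel-is-area origin becomes (100, 50), while the pixel-is-point origin stays at (100.5, 49.5).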

bool Cache(CSLConstList papszOptions = nullptr) const

Cache the content of the array into an auxiliary file.

The main purpose of this method is to be able to cache views that are expensive to compute, such as transposed arrays.

The array will be stored in a file whose name is the one of GetFilename(), with an extra .gmac extension (stands for GDAL Multidimensional Array Cache). The cache is a netCDF dataset.

If the .gmac file cannot be written next to the dataset, the GDAL_PAM_PROXY_DIR will be used, if set, to write the cache file into that directory.

The GDALMDArray::Read() method will automatically use the cache when it exists. There are no timestamp checks between the source array and the cached array. If the source array changes, the cache must be manually deleted.

This is the same as the C function GDALMDArrayCache()

Note

Driver implementation: optionally implemented.

Parameters

papszOptions – List of options, null terminated, or NULL. Currently the only option supported is BLOCKSIZE=bs0,bs1,...,bsN to specify the block size of the cached array.

Returns

true in case of success.

virtual bool Read(const GUInt64 *arrayStartIdx, const size_t *count, const GInt64 *arrayStep, const GPtrDiff_t *bufferStride, const GDALExtendedDataType &bufferDataType, void *pDstBuffer, const void *pDstBufferAllocStart = nullptr, size_t nDstBufferAllocSize = 0) const override

Read part or totality of a multidimensional array or attribute.

This will extract the content of a hyper-rectangle from the array into a user supplied buffer.

If bufferDataType is of type string, the values written in pDstBuffer will be char* pointers and the strings should be freed with CPLFree().

This is the same as the C function GDALMDArrayRead().

Parameters
  • arrayStartIdx – Values representing the starting index to read in each dimension (in [0, aoDims[i].GetSize()-1] range). Array of GetDimensionCount() values. Must not be nullptr, except for a zero-dimensional array.

  • count – Values representing the number of values to extract in each dimension. Array of GetDimensionCount() values. Must not be nullptr, except for a zero-dimensional array.

  • arrayStep – Spacing between values to extract in each dimension. The spacing is in number of array elements, not bytes. If provided, must contain GetDimensionCount() values. If set to nullptr, [1, 1, … 1] will be used as a default to indicate consecutive elements.

  • bufferStride – Spacing between values to store in pDstBuffer. The spacing is in number of array elements, not bytes. If provided, must contain GetDimensionCount() values. Negative values are possible (for example to reorder from bottom-to-top to top-to-bottom). If set to nullptr, will be set so that pDstBuffer is written in a compact way, with elements of the last / fastest varying dimension being consecutive.

  • bufferDataType – Data type of values in pDstBuffer.

  • pDstBuffer – User buffer to store the values read. Should be big enough to store the number of values indicated by count[] and with the spacing of bufferStride[].

  • pDstBufferAllocStart – Optional pointer that can be used to validate the validity of pDstBuffer. pDstBufferAllocStart should be the pointer returned by the malloc() or equivalent call used to allocate the buffer. It will generally be equal to pDstBuffer (when bufferStride[] values are all positive), but not necessarily. If specified, nDstBufferAllocSize should also be set to the appropriate value. If no validation is needed, nullptr can be passed.

  • nDstBufferAllocSize – Optional buffer size that can be used to validate the validity of pDstBuffer. This is the size of the buffer starting at pDstBufferAllocStart. If specified, pDstBufferAllocStart should also be set to the appropriate value. If no validation is needed, 0 can be passed.

Returns

true in case of success.
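The default bufferStride behavior described above (a compact layout with the last dimension varying fastest) can be sketched in standalone C++. CompactStrides and BufferElementOffset are hypothetical helpers written for illustration; strides are expressed in elements, not bytes, as in the parameter description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a GDAL function): strides, in elements, of a
// compact buffer where the last dimension is the fastest varying one.
std::vector<std::ptrdiff_t> CompactStrides(const std::vector<std::size_t> &count)
{
    std::vector<std::ptrdiff_t> stride(count.size());
    std::ptrdiff_t running = 1;
    for (std::size_t i = count.size(); i-- > 0;)
    {
        assert(count[i] > 0);
        stride[i] = running;
        running *= static_cast<std::ptrdiff_t>(count[i]);
    }
    return stride;
}

// Offset, in elements relative to pDstBuffer, of the value read at the
// given position within the requested hyper-rectangle.
std::ptrdiff_t BufferElementOffset(const std::vector<std::size_t> &index,
                                   const std::vector<std::ptrdiff_t> &stride)
{
    assert(index.size() == stride.size());
    std::ptrdiff_t offset = 0;
    for (std::size_t i = 0; i < index.size(); ++i)
        offset += static_cast<std::ptrdiff_t>(index[i]) * stride[i];
    return offset;
}
```

For a count of (2, 3, 4) the compact strides are (12, 4, 1), so the element at position (1, 2, 3) lands at offset 23. With a negative stride such as (-4, 1), position (1, 0) lands at offset -4, which is why pDstBuffer may differ from pDstBufferAllocStart.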