BAG -- Bathymetry Attributed Grid

Driver short name

BAG

Build dependencies

libhdf5

This driver provides read-only support, and starting with GDAL 2.4 for creation, for bathymetry data in the BAG format. BAG files are actually a specific product profile in an HDF5 file, but a custom driver exists to present the data in a more convenient manner than is available through the generic HDF5 driver.

BAG files have two image bands representing Elevation (band 1), Uncertainty (band 2) values for each cell in a raster grid area. Nominal Elevation values may also be present, and starting with GDAL 3.2, any 2D array of numeric data type and with the same dimensions as Elevation present under BAG_root will be reported as a GDAL band.

The geotransform and coordinate system is extracted from the internal XML metadata provided with the dataset. However, some products may have unsupported coordinate system formats, if using the non-WKT way of encoding the spatial reference system.

The full XML metadata is available in the "xml:BAG" metadata domain.

Nodata, minimum and maximum values for each band are also reported.

Driver capabilities

Supports CreateCopy()

This driver supports the GDALDriver::CreateCopy() operation

Supports Create()

This driver supports the GDALDriver::Create() operation

New in version 3.2.

Supports Georeferencing

This driver supports georeferencing

Supports VirtualIO

This driver supports virtual I/O operations (/vsimem/, etc.)

Open options

For open options specific to variable resolution, see following chapter.

Other open options are:

  • REPORT_VERTCRS=[YES/NO]: (GDAL >= 3.2) Defaults to YES. To report the vertical CRS from BAG XML metadata as the vertical component of a compound CRS. If set to NO, only the horizontal part will be reported.

Variable resolution (VR) grid support

Starting with GDAL 2.4, GDAL can handle BAG files with variable resolution grids. Such datasets are made of a low-resolution grid, which is the one presented by default by the driver, and for each of those low-resolution cells, a higher resolution grid can be present in the file. Such higher resolution grids are dubbed "supergrids" in GDAL.

The driver has different modes of working which can be controlled by the MODE open option:

MODE=[LOW_RES_GRID/LIST_SUPERGRIDS/RESAMPLED_GRID/INTERPOLATED/AUTO]: Defaults to AUTO. Driver mode.

  • MODE=LOW_RES_GRID: this is the default mode (unless opening a supergrid). The driver will expose the low resolution grid, and indicate in the dataset metadata if the dataset has supergrids (HAS_SUPERGRIDS=TRUE), as well as the minimum and maximum resolution of those grids.

  • MODE=LIST_SUPERGRIDS: in this mode, the driver will report the various supergrids in the subdataset list. It is possible to apply in this mode additional open options to restrict the search

    • SUPERGRIDS_INDICES=<(y1,x1),(y2,x2),...>: Tuple or list of tuples, of supergrids described by their y,x indices (starting from 0, y from the south of the grid, x from the west o the grid).

    • MINX=value: Minimum georeferenced X value to use as a filter for the supergrids to list.

    • MINY=value: Minimum georeferenced Y value to use as a filter for the supergrids to list.

    • MAXX=value: Maximum georeferenced X value to use as a filter for the supergrids to list.

    • MAXY=value: Maximum georeferenced Y value to use as a filter for the supergrids to list.

    • RES_FILTER_MIN=value: Minimum resolution of supergrids to take into account (excluded bound)

    • RES_FILTER_MAX=value: Maximum resolution of supergrids to take into account (included bound)

  • Opening a supergrid. This mode is triggered by using as a dataset name a string formatted like BAG:my.bag:supergrid:{y}:{x}, which is the value of the SUBDATASET_x_NAME metadata items reported by the above described mode. {y} is the index (starting from 0, from the south of the grid), and {x} is the index (starting from 0, from the west of the grid) of the supergrid to open.

  • MODE=RESAMPLED_GRID: in this mode, the user specify the extent and resolution of a target grid, and for each cell of this target grid, the driver will find the nodes of the supergrids that fall into that cell. By default, it will select the node with the maximum elevation value to populate the cell value. Or if no node of any supergrid are found, the cell value will be set to the nodata value. Overviews are reported: note that, those overviews correspond to resampled grids computed with different values of the RESX and RESY parameters, but using the same value population rules (and not nearest neighbour resampling of the full resolution resampled grid). The last overview level is the low-resolution grid (if SUPERGRIDS_MASK=NO)

    The available open options in this mode are:

    • MINX=value: Minimum georeferenced X value for the resampled grid. By default, the corresponding value of the low resolution grid.

    • MINY=value: Minimum georeferenced Y value for the resampled grid. By default, the corresponding value of the low resolution grid.

    • MAXX=value: Maximum georeferenced X value for the resampled grid. By default, the corresponding value of the low resolution grid.

    • MAXY=value: Maximum georeferenced Y value for the resampled grid. By default, the corresponding value of the low resolution grid.

    • RESX=value: Horizontal resolution. By default, and if RES_STRATEGY is set to AUTO, this will be the minimum resolution among all the supergrids.

    • RESY=value: Vertical resolution (positive value). By default, and if RES_STRATEGY is set to AUTO, this will be the minimum resolution among all the supergrids.

    • RES_STRATEGY=AUTO/MIN/MAX/MEAN: Which strategy to apply to set the resampled grid resolution. By default, if none of RESX, RESY, RES_FILTER_MIN and RES_FILTER_MAX is specified, the AUTO strategy will correspond to the MIN strategy: that is the minimum resolution among all the supergrids is used. If MAX is specified, the maximum resolution among all the supergrids is used. If MEAN is specified, the mean resolution among all the supergrids is used. RESX and RESY, if defined, will override the resolution determined by RES_STRATEGY.

    • RES_FILTER_MIN=value: Minimum resolution of supergrids to take into account (excluded bound, except if it is the minimum resolution of supergrids). By default, the minimum resolution of supergrids available. If this value is specified and none of RES_STRATEGY, RES_FILTER_MAX, RESX or RESY is specified, the maximum resolution among all the supergrids will be used as the resolution for the resampled grid.

    • RES_FILTER_MAX=value: Maximum resolution of supergrids to take into account (included bound). By default, the maximum resolution of supergrids available. If this value is specified and none of RES_STRATEGY, RESX or RESY is specified, this will also be used as the resolution for the resampled grid.

    • VALUE_POPULATION=MIN/MAX/MEAN/COUNT: Which value population strategy to apply to compute the resampled cell values. This default to MAX: the elevation value of a target cell is the maximum elevation of all supergrid nodes (potentially filtered with RES_FILTER_MIN and/or RES_FILTER_MAX) that fall into this cell; the corresponding uncertainty will be the uncertainty of the source node where this maximum elevation si reached. If no supergrid node fall into the target cell, the nodata value is set. The MIN strategy is similar, except that this is the minimum elevation value among intersecting nodes that is selected. The MEAN strategy uses the mean value of the elevation of intersecting nodes, and the maximum uncertainty of those nodes. The COUNT strategy (GDAL >= 3.2) exposes one single UInt32 band where each target cell contains the count of supergrid nodes that fall into it.

    • SUPERGRIDS_MASK=YES/NO. Default to NO. If set to YES, instead of the elevation and uncertainty band, the dataset contains a single Byte band which is boolean valued. For a target cell, if at least one supergrids nodes (potentially filtered with RES_FILTER_MIN and/or RES_FILTER_MAX) falls into the cell, the cell value is set at 255. Otherwise it is set at 0. This can be used to distinguish if elevation values at nodata are due to no source supergrid node falling into them, or if that/those supergrid nodes were themselves at the nodata value.

    • NODATA_VALUE=value. Override the default value, which is usually 1000000.

  • MODE=INTERPOLATED: (GDAL >= 3.8) in this mode, the user specify the extent and resolution of a target grid, and for each cell of this target grid, the driver will interpolate the value from the surrounding nodes of the supergrid(s) that applies to the cell, using in priority bilinear interpolation (for a node that falls within a supergrid and when all 4 surrounding nodes of the supergrid are not at nodata value), and fallbacks to barycentric interpolation or inverse distance weighting in other situations.

    Overviews are reported. Note that those overviews correspond to resampled grids computed with different values of the RESX and RESY parameters, but for performance reasons, the interpolation is still limited to the immediate neighbours of the target grid in the supergrid(s), which result in the equivalent of nearest-neighbour downsampling of the full resolution raster. The last overview level is the low-resolution grid.

    The available open options in this mode are MINX, MINY, MAXX, MAXY, RESX, RESY, RES_STRATEGY, RES_FILTER_MIN, RES_FILTER_MAX and NODATA_VALUE (cf their above description for the MODE=RESAMPLED_GRID)

Spatial metadata support

Starting with GDAL 3.2, GDAL can expose BAG files with spatial metadata.

When such spatial metadata is present, the subdataset list will include names of the form 'BAG:"{filename}":georef_metadata:{name_of_layer}' where name_of_layer is the name of a subgroup under /BAG_root/Georef_metadata

The values of the keys dataset under each metadata layer are used as the GDAL raster value. And the corresponding values dataset is exposed as a GDAL Raster Attribute Table associated to the GDAL raster band. If keys is absent, record 1 of values is assumed to be met for each elevation point that does not match the nodata value of the elevation band.

When variable resolution grids are present, the MODE=LIST_SUPERGRIDS open option will cause subdatasets of names of the form 'BAG:"{filename}":georef_metadata:{name_of_layer}:{y}:{x}' to be reported. When opening such a subdataset, the varres_keys dataset will be used to populate the GDAL raster value. If varres_keys is absent, record 1 of values is assumed to be met for each elevation point that does not match the nodata value of the variable resolution elevation band.

Tracking list support

When the dataset is opened in vector mode (ogrinfo, ogr2ogr, etc.), the tracking_list dataset will be reported as a OGR vector layer

Creation support

Starting with GDAL 2.4, the driver can create a BAG dataset (without variable resolution extension) with the elevation and uncertainty bands from a source dataset. The source dataset must be georeferenced, and have one or two bands. The first band is assumed to be the elevation band, and the second band the uncertainty band. If the second band is missing, the uncertainty will be set to nodata.

The driver will instantiate the BAG XML metadata by using a template file, which is by default, bag_template.xml, found in the GDAL data definition files. This template contains variables, present as ${KEYNAME} or ${KEYNAME:default_value} in the XML file, that can be substituted by providing a creation option whose name is the VAR_ string prefixed to the key name. Currently those creation options are:

  • VAR_INDIVIDUAL_NAME=value: Defaults to unknown. String to fill contact/CI_ResponsibleParty/individualName.

  • VAR_ORGANISATION_NAME=value: Defaults to unknown. String to fill contact/CI_ResponsibleParty/organisationName.

  • VAR_POSITION_NAME=value: Defaults to unknown. String to fill contact/CI_ResponsibleParty/positionName.

  • VAR_DATE=<YYYY-MM-DD>: Defaults to current date. Value to fill dateStamp/Date.

  • VAR_VERT_WKT=value: WKT string to fill referenceSystemInfo/MD_ReferenceSystem/referenceSystemIdentifier/RS_Identifier/code for the vertical coordinate reference system. If not provided, and if the input CRS is not a compound CRS, default to VERT_CS["unknown", VERT_DATUM["unknown", 2000]].

  • VAR_ABSTRACT=value: Defaults to <empty string>. String to fill identificationInfo/abstract.

  • VAR_PROCESS_STEP_DESCRIPTION=value: Defaults to Generated by GDAL x.y.z. String to fill dataQualityInfo/lineage/LI_Lineage/processStep/LI_ProcessStep/description.

  • VAR_DATETIME=<YYYY-MM-DDTHH:MM:SS>: Defaults to current datetime. Value to fill dataQualityInfo/lineage/LI_Lineage/processStep/LI_ProcessStep/dateTime/DateTime.

  • VAR_RESTRICTION_CODE=<enumerated_value>: Defaults to otherRestrictions. Value to fill metadataConstraints/MD_LegalConstraints/useConstraints/MD_RestrictionCode.

  • VAR_OTHER_CONSTRAINTS=value: Defaults to unknown. String to fill metadataConstraints/MD_LegalConstraints/otherConstraints.

  • VAR_CLASSIFICATION=<enumerated_value>: Defaults to unclassified. Value to fill metadataConstraints/MD_SecurityConstraints/classification/MD_ClassificationCode.

  • VAR_SECURITY_USER_NOTE=value: Defaults to none. String to fill metadataConstraints/MD_SecurityConstraints/userNote.

Other required variables found in the template, such as RES, RESX, RESY, RES_UNIT, HEIGHT, WIDTH, CORNER_POINTS and HORIZ_WKT will be automatically filled from the input dataset metadata.

The other following creation options are available:

  • TEMPLATE=<filename>: Path to a XML file that can serve as a template. This will typically be a customized version of the base bag_template.xml file. The file can contain other substituable variables than the ones mentioned above by using a similar syntax.

  • VAR_xxxx=value: Substitute variable ${xxxx} in the template XML value by the provided value.

  • BAG_VERSION=value: Defaults to 1.6.2. Value to write in the /BAG_root/BAG Version attribute.

  • COMPRESS=[NONE/DEFLATE]: Defaults to DEFLATE. Compression for elevation and uncertainty grids.

  • ZLEVEL=1-9: Defaults to 6. Deflate compression level.

  • BLOCK_SIZE=<integer>: Chunking size of the HDF5 arrays. Default to 100, or the maximum dimension of the raster if smaller than 100.

Usage examples

  • Opening in low resolution mode:

    $ gdalinfo data/test_vr.bag
    
    [...]
    Size is 6, 4
    [...]
      HAS_SUPERGRIDS=TRUE
      MAX_RESOLUTION_X=29.900000
      MAX_RESOLUTION_Y=31.900000
      MIN_RESOLUTION_X=4.983333
      MIN_RESOLUTION_Y=5.316667
    [...]
    
  • Displaying available supergrids:

    $ gdalinfo data/test_vr.bag -oo MODE=LIST_SUPERGRIDS
    
    [...]
    Subdatasets:
      SUBDATASET_1_NAME=BAG:"data/test_vr.bag":supergrid:0:0
      SUBDATASET_1_DESC=Supergrid (y=0, x=0) from (x=70.100000,y=499968.100000) to (x=129.900000,y=500031.900000), resolution (x=29.900000,y=31.900000)
      SUBDATASET_2_NAME=BAG:"data/test_vr.bag":supergrid:0:1
      SUBDATASET_2_DESC=Supergrid (y=0, x=1) from (x=107.575000,y=499976.075000) to (x=152.424999,y=500023.924999), resolution (x=14.950000,y=15.950000)
    [...]
      SUBDATASET_24_NAME=BAG:"data/test_vr.bag":supergrid:3:5
      SUBDATASET_24_DESC=Supergrid (y=3, x=5) from (x=232.558335,y=500077.391667) to (x=267.441666,y=500114.608334), resolution (x=4.983333,y=5.316667)
    [...]
    
  • Opening a particular supergrid:

    $ gdalinfo BAG:"data/test_vr.bag":supergrid:3:5
    
  • Converting a BAG in resampling mode with default parameters (use of minimum resolution of supergrids, MAX value population rule):

    $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID out.tif
    
  • Converting a BAG in resampling mode with a particular grid origin and resolution

    $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo MINX=80 -oo MINY=500000 -oo RESX=16 -oo RESY=16 out.tif
    
  • Converting a BAG in resampling mode, with a mask indicating where supergrids nodes intersect the cell of the resampled dataset.

    $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES out.tif
    
  • Converting a BAG in resampling mode, by filtering on supergrid resolutions (and the resampled grid will use 4 meter resolution by default)

    $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo RES_FILTER_MIN=4 -oo RES_FILTER_MAX=8 out.tif
    
  • Converting a GeoTIFF file to a BAG dataset, and provide a custom value for the ABSTRACT substituable variable.

    $ gdal_translate in.tif out.bag -co "VAR_ABSTRACT=My abstract"
    
  • Converting a (VR) BAG in resampling mode with a particular grid resolution (5m) to a BAG dataset (without variable resolution extension), and provide a custom value for the ABSTRACT metadata:

    $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo RESX=5 -oo RESY=5 out.bag -co "VAR_ABSTRACT=My abstract"
    
  • Displaying the tracking list:

    $ ogrinfo -al data/my.bal
    

See Also