RFC 104: Adding a "gdal" front-end command line interface
Author: Even Rouault
Contact: even.rouault @ spatialys.com
Started: 2024-Nov-06
Status: Adopted, implemented in 3.11
Target: GDAL 3.11 for initial version (full scope will likely take more development cycles)
Summary
This RFC introduces a single gdal front-end command line interface (CLI) that exposes sub-commands, adopts consistent option naming, introduces new capabilities (pipelines), and defines a concept of algorithms that can be run through the CLI or automatically discovered and invoked programmatically.
This RFC gives the general principles and how they will be implemented on a subset of the envisioned commands. The initial candidate implementation will definitely not cover the full spectrum. Given that its size is already 10,000 new lines of code at time of writing, we need to limit its scope for reasonable reviewability. Extra functionality will be added progressively after RFC adoption and initial implementation.
Motivation
As of 2024, GDAL has existed for 26 years, and over the years various utilities have been added by diverse contributors, leading to inconsistent naming of utilities (underscore or not?), options and input/output parameters. The GDAL User Survey of October-November 2024 shows that a significant proportion of GDAL users use it through the command line interface and suffer from these inconsistencies, so it is legitimate to enhance their experience.
This RFC adds a new single gdal front-end CLI whose sub-commands will match the functionality of existing utilities and re-use the battle-tested underlying implementations as much as possible.
The existing utilities will remain for backward compatibility reasons. This RFC takes inspiration from PDAL applications and rasterio 'rio' utilities.
Examples
Before going to the theory and details, let's have a look at the following examples which reflect the state of the candidate implementation.
Short usage help message of "gdal"
$ gdal
ERROR 1: gdal: Missing subcommand name.
Usage: gdal <subcommand>
where <subcommand> is one of:
  - convert:  Convert a dataset (shortcut for 'gdal raster convert' or 'gdal vector convert').
  - info:     Return information on a dataset (shortcut for 'gdal raster info' or 'gdal vector info').
  - pipeline: Execute a pipeline (shortcut for 'gdal vector pipeline').
  - raster:   Raster commands.
  - vector:   Vector commands.

'gdal <FILENAME>' can also be used as a shortcut for 'gdal info <FILENAME>'.
And 'gdal read <FILENAME> ! ...' as a shortcut for 'gdal pipeline <FILENAME> ! ...'.

For more details, consult https://gdal.org/programs/index.html
Short usage help message of "gdal raster info"
$ gdal raster info
ERROR 1: info: Positional arguments starting at 'INPUT' have not been specified.
Usage: gdal raster info [OPTIONS] <INPUT>
Try 'gdal raster info --help' for help.
Detailed usage help message of "gdal raster info"
$ gdal raster info --help
Usage: gdal raster info [OPTIONS] <INPUT>

Return information on a raster dataset.

Positional arguments:
  -i, --input <INPUT>                                  Input raster dataset [required]

Common Options:
  -h, --help                                           Display help message and exit
  --json-usage                                         Display usage as JSON document and exit

Options:
  -f, --of, --format, --output-format <OUTPUT-FORMAT>  Output format. OUTPUT-FORMAT=json|text (default: json)
  --mm, --min-max                                      Compute minimum and maximum value
  --stats                                              Retrieve or compute statistics, using all pixels
                                                       Mutually exclusive with --approx-stats
  --approx-stats                                       Retrieve or compute statistics, using a subset of pixels
                                                       Mutually exclusive with --stats
  --hist                                               Retrieve or compute histogram

Advanced Options:
  --oo, --open-option <KEY=VALUE>                      Open options [may be repeated]
  --if, --input-format <INPUT-FORMAT>                  Input formats [may be repeated]
  --no-gcp                                             Suppress ground control points list printing
  --no-md                                              Suppress metadata printing
  --no-ct                                              Suppress color table printing
  --no-fl                                              Suppress file list printing
  --checksum                                           Compute pixel checksum
  --list-mdd                                           List all metadata domains available for the dataset
  --mdd <MDD>                                          Report metadata for the specified domain. 'all' can be used to report metadata in all domains

Esoteric Options:
  --no-nodata                                          Suppress retrieving nodata value
  --no-mask                                            Suppress mask band information
  --subdataset <SUBDATASET>                            Use subdataset of specified index (starting at 1), instead of the source dataset itself
A few invocations of "gdal raster info [OPTIONS] <FILENAME>"
$ gdal raster info byte.tif
[ ... JSON output stripped ... ]

$ gdal raster info -i byte.tif
[ ... JSON output stripped ... ]

$ gdal raster info --input byte.tif
[ ... JSON output stripped ... ]

$ gdal raster info --input=byte.tif
[ ... JSON output stripped ... ]

$ gdal raster info byte.tif --stats --format=text
[ ... text output stripped ... ]
Using just gdal info <FILENAME>
$ gdal info byte.tif
[ ... JSON output stripped ... ]
And cherry-on-the-cake gdal <FILENAME>
$ gdal byte.tif
[ ... JSON output stripped ... ]
"gdal [info] <FILENAME>" on dataset with mixed raster and vector content
$ gdal info mixed.gpkg
ERROR 1: 'mixed.gpkg' has both raster and vector content. Please use 'gdal raster info' or 'gdal vector info'.

$ gdal mixed.gpkg
ERROR 1: 'mixed.gpkg' has both raster and vector content. Please use 'gdal raster info' or 'gdal vector info'.
A few invocations of "gdal raster convert"
$ gdal raster convert byte.tif out.tif

$ gdal raster convert byte.tif out.tif --co=TILED=YES,COMPRESS=LZW
ERROR 1: File 'out.tif' already exists. Specify the --overwrite option to overwrite it.

$ gdal raster convert --input=byte.tif --output=out.tif --co=TILED=YES,COMPRESS=LZW --overwrite

$ gdal raster convert -i byte.tif -o out.tif --co=TILED=YES,COMPRESS=LZW --overwrite --progress
0...10...20...30...40...50...60...70...80...90...100 - done.
Similarly to "gdal info" resolving automatically to "gdal raster info" or "gdal vector info" based on dataset content, "gdal convert" will also detect which subcommand must be used:
$ gdal convert byte.tif out.tif --overwrite
But:
$ gdal convert mixed.gpkg out.tif --overwrite
ERROR 1: 'mixed.gpkg' has both raster and vector content. Please use 'gdal raster convert' or 'gdal vector convert'.
Help message of "gdal vector"
$ gdal vector
ERROR 1: vector: Missing subcommand name.
Usage: gdal vector <SUBCOMMAND>
where <SUBCOMMAND> is one of:
  - convert:   Convert a vector dataset.
  - filter:    Filter a vector dataset.
  - info:      Return information on a vector dataset.
  - pipeline:  Process a vector dataset.
  - reproject: Reproject a vector dataset.
A few invocations of "gdal vector convert"
$ gdal vector convert poly.gpkg poly.parquet

$ gdal vector convert poly.gpkg poly.parquet --lco COMPRESSION=SNAPPY
ERROR 1: File 'poly.parquet' already exists. Specify the --overwrite option to overwrite it.

$ gdal vector convert multilayer.gpkg output.gpkg -l my_input_layer --output-layer=new_layer --update --progress
0...10...20...30...40...50...60...70...80...90...100 - done.

$ gdal convert poly.gpkg poly.parquet --overwrite
JSON-formatted detailed usage of "gdal vector convert"
This mode is rather aimed at application developers that would want to dynamically generate graphical user interfaces for GDAL algorithms.
$ gdal vector convert --json-usage
{
  "name":"convert",
  "full_path":[ "vector", "convert" ],
  "description":"Convert a vector dataset.",
  "sub_algorithms":[ ],
  "input_arguments":[
    { "name":"output-format", "type":"string", "description":"Output format", "min_count":0, "max_count":1, "category":"Base", "metadata":{ "required_capabilities":[ "DCAP_VECTOR", "DCAP_CREATE" ] } },
    { "name":"open-option", "type":"string_list", "description":"Open options", "min_count":0, "max_count":2147483647, "category":"Advanced" },
    { "name":"input-format", "type":"string_list", "description":"Input formats", "min_count":0, "max_count":2147483647, "category":"Advanced", "metadata":{ "required_capabilities":[ "DCAP_VECTOR" ] } },
    { "name":"input", "type":"dataset", "description":"Input vector dataset", "min_count":1, "max_count":1, "category":"Base", "dataset_type":[ "vector" ], "input_flags":[ "name", "dataset" ] },
    { "name":"creation-option", "type":"string_list", "description":"Creation option", "min_count":0, "max_count":2147483647, "category":"Base" },
    { "name":"layer-creation-option", "type":"string_list", "description":"Layer creation option", "min_count":0, "max_count":2147483647, "category":"Base" },
    { "name":"overwrite", "type":"boolean", "description":"Whether overwriting existing output is allowed", "default":false, "min_count":0, "max_count":1, "category":"Base" },
    { "name":"update", "type":"boolean", "description":"Whether updating existing dataset is allowed", "default":false, "min_count":0, "max_count":1, "category":"Base" },
    { "name":"overwrite-layer", "type":"boolean", "description":"Whether overwriting existing layer is allowed", "default":false, "min_count":0, "max_count":1, "category":"Base" },
    { "name":"append", "type":"boolean", "description":"Whether appending to existing layer is allowed", "default":false, "min_count":0, "max_count":1, "category":"Base" },
    { "name":"input-layer", "type":"string_list", "description":"Input layer name(s)", "min_count":0, "max_count":2147483647, "category":"Base" },
    { "name":"output-layer", "type":"string", "description":"Output layer name", "min_count":0, "max_count":1, "category":"Base" }
  ],
  "output_arguments":[ ],
  "input_output_arguments":[
    { "name":"output", "type":"dataset", "description":"Output vector dataset", "min_count":1, "max_count":1, "category":"Base", "dataset_type":[ "vector" ], "input_flags":[ "name", "dataset" ], "output_flags":[ "dataset" ] }
  ]
}
A few invocations of "gdal vector pipeline"
# The use of the '!' as a step separator is to prevent Unix or Windows shells from
# trying to use other processes for the "reproject" or "write" steps.
# Below is a single-process pipeline.
$ gdal vector pipeline read poly.gpkg ! reproject --dst-crs=EPSG:4326 ! write out.parquet --overwrite

# Alternative without the "vector" and "pipeline" subcommands, and with --progress
$ gdal read poly.gpkg ! reproject --dst-crs=EPSG:4326 ! write out.parquet --overwrite --progress

# Alternative using an explicit --pipeline switch, and given the quoting, we can use the '|' character
$ gdal vector pipeline --pipeline="read poly.gpkg | reproject --dst-crs=EPSG:4326 | write out.parquet --overwrite"

# Works also as a quoted positional argument, and without the "vector" subcommand
$ gdal pipeline --progress "read poly.gpkg | reproject --dst-crs=EPSG:4326 | write out.parquet --overwrite"
Detailed usage help message of "gdal vector pipeline"
$ gdal vector pipeline --help
Usage: gdal vector pipeline [OPTIONS] <PIPELINE>

Process a vector dataset.

Positional arguments:

Common Options:
  -h, --help    Display help message and exit
  --json-usage  Display usage as JSON document and exit
  --progress    Display progress bar

<PIPELINE> is of the form: read [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write [WRITE-OPTIONS]

Example: 'gdal vector pipeline --progress ! read in.gpkg ! \
               reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite'

Potential steps are:

* read [OPTIONS] <INPUT>
------------------------

Read a vector dataset.

Positional arguments:
  -i, --input <INPUT>                        Input vector dataset [required]

Options:
  -l, --layer, --input-layer <INPUT-LAYER>   Input layer name(s) [may be repeated]

Advanced Options:
  --if, --input-format <INPUT-FORMAT>        Input formats [may be repeated]
  --oo, --open-option <KEY=VALUE>            Open options [may be repeated]

* filter [OPTIONS]
------------------

Filter.

Options:
  --bbox <BBOX>                              Bounding box as xmin,ymin,xmax,ymax

* reproject [OPTIONS]
---------------------

Reproject.

Options:
  -s, --src-crs <SRC-CRS>                    Source CRS
  -d, --dst-crs <DST-CRS>                    Destination CRS [required]

* write [OPTIONS] <OUTPUT>
--------------------------

Write a vector dataset.

Positional arguments:
  -o, --output <OUTPUT>                                Output vector dataset [required]

Options:
  -f, --of, --format, --output-format <OUTPUT-FORMAT>  Output format
  --co, --creation-option <KEY=VALUE>                  Creation option [may be repeated]
  --lco, --layer-creation-option <KEY=VALUE>           Layer creation option [may be repeated]
  --overwrite                                          Whether overwriting existing output is allowed
  --update                                             Whether updating existing dataset is allowed
  --overwrite-layer                                    Whether overwriting existing layer is allowed
  --append                                             Whether appending to existing layer is allowed
  -l, --output-layer <OUTPUT-LAYER>                    Output layer name
The filter and reproject steps can also be used as direct "gdal vector" standalone subcommands, in which case they are augmented with the options of the 'read' and 'write' steps:
$ gdal vector reproject --help
Usage: gdal vector reproject [OPTIONS] <INPUT> <OUTPUT>

Reproject a vector dataset.

Positional arguments:
  -i, --input <INPUT>                                  Input vector dataset [required]
  -o, --output <OUTPUT>                                Output vector dataset [required]

Common Options:
  -h, --help                                           Display help message and exit
  --json-usage                                         Display usage as JSON document and exit
  --progress                                           Display progress bar

Options:
  -l, --layer, --input-layer <INPUT-LAYER>             Input layer name(s) [may be repeated]
  -f, --of, --format, --output-format <OUTPUT-FORMAT>  Output format
  --co, --creation-option <KEY>=<VALUE>                Creation option [may be repeated]
  --lco, --layer-creation-option <KEY>=<VALUE>         Layer creation option [may be repeated]
  --overwrite                                          Whether overwriting existing output is allowed
  --update                                             Whether to open existing dataset in update mode
  --overwrite-layer                                    Whether overwriting existing layer is allowed
  --append                                             Whether appending to existing layer is allowed
  --output-layer <OUTPUT-LAYER>                        Output layer name
  -s, --src-crs <SRC-CRS>                              Source CRS
  -d, --dst-crs <DST-CRS>                              Destination CRS [required]

Advanced Options:
  --if, --input-format <INPUT-FORMAT>                  Input formats [may be repeated]
  --oo, --open-option <KEY=VALUE>                      Open options [may be repeated]
CLI specification
Subcommand syntax
gdal <subcommand> [<subsubcommand>]... [<options>]... [<positional arguments>]...
where subcommand is something like raster, vector, etc. with potential sub-subcommand like info, convert, etc.
Option naming conventions
One-letter short names preceded with a dash: -z
When a value is specified, it must be separated with a space: -z <value>
Longer names preceded with two dashes, in lower case, using a dash to separate words: --long-name
When a single value is expected, it must be separated with a space or an equal sign: --long-name <value> or --long-name=<value>. In the rest of the document, we will use the version with a space separator, but the equal sign is also accepted.
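The two accepted spellings of a long option value can be handled by a single lookup. The following is a minimal sketch under stated assumptions, not the GDAL implementation: FindLongOptionValue is a hypothetical helper name, and the sketch ignores short names, aliases and repeated options.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not GDAL code): return the value of a long option
// written either as "--name=value" or as "--name value".
// Returns std::nullopt if the option is absent or is given without a value.
std::optional<std::string> FindLongOptionValue(
    const std::vector<std::string> &args, const std::string &name)
{
    const std::string flag = "--" + name;
    for (std::size_t i = 0; i < args.size(); ++i)
    {
        const std::string &arg = args[i];
        // Form 1: "--name=value"
        if (arg.size() > flag.size() &&
            arg.compare(0, flag.size(), flag) == 0 &&
            arg[flag.size()] == '=')
        {
            return arg.substr(flag.size() + 1);
        }
        // Form 2: "--name value" (the value is the next token)
        if (arg == flag && i + 1 < args.size())
        {
            return args[i + 1];
        }
    }
    return std::nullopt;
}
```

With this sketch, the token lists {"--dst-crs=EPSG:4326"} and {"--dst-crs", "EPSG:4326"} both yield the value EPSG:4326 for the name dst-crs.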
Repeated values / multi-valued options
Existing GDAL command line utilities have an inconsistent strategy regarding how to specify repeated values (band indices, nodata values, etc.), sometimes with the switch being repeated many times, sometimes with a single switch but the values being grouped together and separated with spaces or commas.
With this RFC, for arguments of list types, 2 variants will be supported:
values passed at the same time (packed values), separated by a , (comma): --co KEY1=VALUE1,KEY2=VALUE2
or values passed one by one with the option being repeated: --co KEY1=VALUE1 --co KEY2=VALUE2
In some cases, in particular when a fixed number of values is expected, or if the order of values in the list matters, like a bounding-box argument, the argument can be declared to accept packed values only, like in --bbox <xmin>,<ymin>,<xmax>,<ymax>
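Both variants can feed the same list if each occurrence of the option is split on commas and appended. The sketch below illustrates that idea only: AppendListValues is a hypothetical name, and the sketch does not handle quoting or escaping of commas inside a value, which this text does not specify.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not GDAL code): append the values carried by one
// occurrence of a list-type option to 'values'. A packed occurrence such as
// "KEY1=VALUE1,KEY2=VALUE2" contributes several values; a non-packed
// occurrence contributes a single value.
void AppendListValues(const std::string &optionValue,
                      std::vector<std::string> &values)
{
    std::istringstream stream(optionValue);
    std::string item;
    // Split on ',' and keep items in the order they were given.
    while (std::getline(stream, item, ','))
    {
        values.push_back(item);
    }
}
```

Calling the helper once with "TILED=YES,COMPRESS=LZW", or twice with "TILED=YES" then "COMPRESS=LZW", produces the same two-element list.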
Specification of input and output files/datasets
Two possibilities will be offered:
positional arguments with input(s) first, output last
gdal <subcommand> <input1> [<input2>]... <output>
using -i / --input and -o / --output
gdal <subcommand> -i <input1> [-i <input2>]... -o <output>
Reserved switches
The following switches are reserved: if a subcommand uses them, it must do so with the semantics and syntax below.
-h, --help: display detailed help synopsis
-i <name>, --input <name>: specify input file/dataset
-o <name>, --output <name>: specify output file/dataset
--overwrite: whether overwriting the output file is allowed. Defaults to no, that is execution will fail if the output file already exists.
-f <format>, --of <format>: output format. Value is a (not always so) "short" driver name: GTiff, COG, GPKG, ESRI Shapefile. Also used by gdal info to select JSON vs text output
--if <format>: input format. Value is a short driver name. Used when autodetection of the appropriate driver fails.
-b <band_number>, --band <band_number>: specify input raster band number. May be repeated for utilities supporting multiple bands
-l <name>, --layer <name>: specify input vector layer name. May be repeated for utilities supporting multiple layers
--co <NAME>=<VALUE>: driver specific creation option. May be repeated.
--oo <NAME>=<VALUE>: driver specific open option. May be repeated.
--ot {Byte|UInt16|...}: output data type (for raster output)
--bbox <xmin>,<ymin>,<xmax>,<ymax>: as used by gdal vector info, gdal vector convert, gdal raster convert
--src-crs <crs_spec>: Override source CRS specification. Accept --s_srs as hidden alias for old CLI compatibility.
--dst-crs <crs_spec>: Define target CRS specification. Accept --t_srs as hidden alias for old CLI compatibility.
--override-crs <crs_spec>: Override CRS without reprojection. Accept --a_srs as hidden alias for old CLI compatibility.
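The hidden aliases for the CRS switches amount to a small name translation applied before option lookup. The sketch below shows one possible way to express it; NormalizeSwitchName is a hypothetical name and this is an illustration, not how GDAL registers aliases internally.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not GDAL code): map the legacy CRS switch names
// mentioned in the reserved switches list to their new names.
// Any other name is returned unchanged.
std::string NormalizeSwitchName(const std::string &name)
{
    static const std::map<std::string, std::string> legacyAliases = {
        {"--s_srs", "--src-crs"},
        {"--t_srs", "--dst-crs"},
        {"--a_srs", "--override-crs"},
    };
    const auto it = legacyAliases.find(name);
    return it != legacyAliases.end() ? it->second : name;
}
```

Keeping the aliases in one table means the help output can list only the new names while old command lines keep working.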
gdal info
This subcommand will merge together gdalinfo, ogrinfo and gdalmdiminfo.
It will GDALDataset::Open() the specified dataset in raster and vector mode.
If the dataset is only a raster one, it will automatically resolve to the sub-subcommand "gdal raster info".
If the dataset is only a vector one, it will automatically resolve to the sub-subcommand "gdal vector info".
In this automated mode, no switch besides open options can be specified, given that we don't know yet in which mode to open.
If the dataset has both raster and vector content, an error will be emitted, inviting the user to specify explicitly the raster or vector mode.
Example:
gdal info my.tif
gdal info my.gpkg
The main gdal utility will also accept gdal [OPTIONS] <FILENAME> as a shortcut for gdal info [OPTIONS] <FILENAME>.
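The resolution rule for gdal info reduces to a decision on what the opened dataset contains. The sketch below captures that decision in isolation: ResolveInfoSubcommand is a hypothetical name, the two booleans stand for whatever the real code derives from the opened dataset, and the behavior for a dataset with neither raster nor vector content is an assumption, since this text does not cover that case.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not GDAL code): choose the sub-subcommand that
// "gdal info" resolves to, given what the opened dataset contains.
// Throws std::invalid_argument when the choice is ambiguous, or (assumption
// of this sketch) when the dataset has neither kind of content.
std::string ResolveInfoSubcommand(bool hasRaster, bool hasVector)
{
    if (hasRaster && hasVector)
    {
        throw std::invalid_argument(
            "dataset has both raster and vector content; use "
            "'gdal raster info' or 'gdal vector info'");
    }
    if (hasRaster)
        return "raster info";
    if (hasVector)
        return "vector info";
    throw std::invalid_argument(
        "dataset has neither raster nor vector content");
}
```

The same decision applies to gdal convert, with "convert" in place of "info".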
gdal raster info
Equivalent of existing gdalinfo
Synopsis: gdal raster info [-i <filename>] [other options] <filename>
Example:
gdal raster info my.gpkg
Switches:
-f json|text, --of json|text: output format. Will default to JSON.
--min-max
--stats
--approx-stats
--hist
--no-gcp
--no-md
--no-ct
--no-fl
--no-nodata
--no-mask
--checksum
--list-mdd
--mdd <domain>|all
--subdataset <num>
gdal vector info
Equivalent of existing ogrinfo
Synopsis: gdal vector info [-i <filename>] [other options] <filename> [<layername>]...
Example:
gdal vector info my.gpkg
Switches:
-f json|text, --of json|text: output format. Will default to JSON.
--sql <statement>
-l <name>, --layer <name>
--update: New default will be read-only
--interleaved-layers: a.k.a random layer reading mode (ogrinfo -rl), for OSM and GMLAS mostly.
--where <statement>
--dialect <dialectname>
--bbox <xmin>,<ymin>,<xmax>,<ymax>
gdal multidim info
Equivalent of existing gdalmdiminfo
Details will be fleshed out in the pull request implementing it.
gdal raster convert
Equivalent of existing gdal_translate
Initial options below. More to be added.
Positional arguments:
-i, --input <INPUT> Input raster dataset [required]
-o, --output <OUTPUT> Output raster dataset (created by algorithm) [required]
Common Options:
-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--progress Display progress bar
Options:
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format
--co, --creation-option <KEY=VALUE> Creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed
Mutually exclusive with --append
--append Append as a subdataset to existing output
Mutually exclusive with --overwrite
Advanced Options:
--oo, --open-option <KEY=VALUE> Open options [may be repeated]
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
gdal vector convert
Equivalent of existing ogr2ogr
Initial options below. More to be added, but presumably not all existing options of ogr2ogr.
Positional arguments:
-i, --input <INPUT> Input vector dataset [required]
-o, --output <OUTPUT> Output vector dataset [required]
Common Options:
-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--progress Display progress bar
Options:
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format
--co, --creation-option <KEY=VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY=VALUE> Layer creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed
--update Whether updating existing dataset is allowed
--overwrite-layer Whether overwriting existing layer is allowed
--append Whether appending to existing layer is allowed
-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
--output-layer <OUTPUT-LAYER> Output layer name
Advanced Options:
--oo, --open-option <KEY=VALUE> Open options [may be repeated]
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
gdal vector pipeline
"Equivalent" of existing ogr2ogr
Refer to above examples.
A pipeline is the succession of several processing steps. One issue with ogr2ogr is that it offers many different processing operations that can be combined, but it is not always obvious in which order they are applied. In some cases, we had to duplicate options, like -clipsrc and -clipdst, to offer a way of clipping geometries before or after reprojection. It can be more natural to explicitly specify in which order operations should be conducted, like:
read the input dataset
filter on a bounding box (in the CRS of the input layer)
reproject to some other CRS
clip geometries to a rectangle (in the new CRS)
... some other operation ...
write to final file.
Available steps currently are:
"read": required to be first. Possibility to select all, one or a subset of input layers
"filter": filtering by bounding box, or where clause
"reproject"
"write": required to be last
More steps to be added in follow-up pull requests.
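The structure of a pipeline (steps separated by '!' or '|', with "read" first and "write" last) can be sketched as follows. This is an illustration under stated assumptions, not the GDAL parser: SplitPipelineSteps and IsValidStepOrder are hypothetical names, the input is assumed to be already tokenized, and the sketch only checks the first and last steps.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not GDAL code): split a pipeline, given as a list of
// command line tokens, into steps, using "!" or "|" tokens as separators.
std::vector<std::vector<std::string>> SplitPipelineSteps(
    const std::vector<std::string> &tokens)
{
    std::vector<std::vector<std::string>> steps(1);
    for (const std::string &token : tokens)
    {
        if (token == "!" || token == "|")
            steps.emplace_back();  // a separator starts a new step
        else
            steps.back().push_back(token);
    }
    return steps;
}

// Hypothetical helper: check that there are at least two non-empty steps,
// that the first one is "read" and that the last one is "write".
bool IsValidStepOrder(const std::vector<std::vector<std::string>> &steps)
{
    if (steps.size() < 2)
        return false;
    for (const auto &step : steps)
    {
        if (step.empty())
            return false;  // e.g. two consecutive separators
    }
    return steps.front().front() == "read" &&
           steps.back().front() == "write";
}
```

For the tokens of "read poly.gpkg ! reproject --dst-crs=EPSG:4326 ! write out.parquet --overwrite", the split yields three steps whose first tokens are read, reproject and write.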
There might be a loss of efficiency in having separate steps that iterate over (on-the-fly / streamed) features returned by the previous step(s) and generate new (on-the-fly / streamed) ones. In the simplest cases, we might be able to "compile" steps into a single GDALVectorTranslate() invocation. That might be done in follow-up pull requests to the initial candidate implementation of this RFC.
Further enhancements might support non-linear pipelines (that is, forming a directed acyclic graph), and strategies to multi-thread some processing (for example, a reprojection step could acquire N batches of X features from its source layer, and then reproject each batch in a dedicated thread, at the expense of greater RAM usage to store N * X features at once).
gdal multidim convert
Equivalent of existing gdalmdimtranslate
Details will be fleshed out in the pull request implementing it.
gdal warp ?
Equivalent, or subset, of existing gdalwarp
Note
In the User Survey, a number of users have expressed a wish to have gdal_translate and gdalwarp functionality merged together. This RFC does not attempt to address that. Or should it... ? That'd be a huge topic.
Note that warp is also a bit of a misnomer as gdalwarp can mosaic.
Details will be fleshed out in the pull request implementing it.
gdal raster contour
Equivalent of existing gdal_contour
Details will be fleshed out in the pull request implementing it.
gdal vector rasterize
Equivalent of existing gdal_rasterize
Details will be fleshed out in the pull request implementing it.
gdal raster create
Details will be fleshed out in the pull request implementing it.
gdal raster footprint
Equivalent of existing gdal_footprint
Details will be fleshed out in the pull request implementing it.
gdal dem
Equivalent of existing gdaldem
Including viewshed (equivalent of existing gdal_viewshed) as a subcommand, alongside the current modes of gdaldem: hillshade, slope, etc.
Details will be fleshed out in the pull request implementing it.
gdal grid
Equivalent of existing gdal_grid
grid is a vector to raster operation: should it be a top-level operation, or a sub-subcommand of the raster or vector ones?
Details will be fleshed out in the pull request implementing it.
gdal raster mosaic
Equivalent of existing gdalbuildvrt and gdal_translate.
Details will be fleshed out in the pull request implementing it.
gdal raster tileindex
Equivalent of existing gdaltindex
Details will be fleshed out in the pull request implementing it.
gdal vector tileindex
Equivalent of existing ogrtindex
Details will be fleshed out in the pull request implementing it.
gdal raster cleanborder
Equivalent of existing nearblack
Details will be fleshed out in the pull request implementing it.
Implementation details
C API
The C API will map most of the functionality of GDALAlgorithm, GDALAlgorithmArg, GDALArgDatasetValue and GDALAlgorithmRegistry.
Below is an extract of the beginning of https://github.com/rouault/gdal/blob/rfc104/gcore/gdalalgorithm.h
/** Type of an argument */
typedef enum GDALAlgorithmArgType
{
/** Boolean type. Value is a bool. */
GAAT_BOOLEAN,
/** Single-value string type. Value is a std::string */
GAAT_STRING,
/** Single-value integer type. Value is a int */
GAAT_INTEGER,
/** Single-value real type. Value is a double */
GAAT_REAL,
/** Dataset type. Value is a GDALArgDatasetValue */
GAAT_DATASET,
/** Multi-value string type. Value is a std::vector<std::string> */
GAAT_STRING_LIST,
/** Multi-value integer type. Value is a std::vector<int> */
GAAT_INTEGER_LIST,
/** Multi-value real type. Value is a std::vector<double> */
GAAT_REAL_LIST,
/** Multi-value dataset type. Value is a std::vector<GDALArgDatasetValue> */
GAAT_DATASET_LIST,
} GDALAlgorithmArgType;
/** Return whether the argument type is a list / multi-valued one. */
bool CPL_DLL GDALAlgorithmArgTypeIsList(GDALAlgorithmArgType type);
/** Return the string representation of the argument type */
const char CPL_DLL *GDALAlgorithmArgTypeName(GDALAlgorithmArgType type);
/** Opaque C type for GDALArgDatasetValue */
typedef struct GDALArgDatasetValueHS *GDALArgDatasetValueH;
/** Opaque C type for GDALAlgorithmArg */
typedef struct GDALAlgorithmArgHS *GDALAlgorithmArgH;
/** Opaque C type for GDALAlgorithm */
typedef struct GDALAlgorithmHS *GDALAlgorithmH;
/** Opaque C type for GDALAlgorithmRegistry */
typedef struct GDALAlgorithmRegistryHS *GDALAlgorithmRegistryH;
/************************************************************************/
/* GDALAlgorithmRegistryH API */
/************************************************************************/
GDALAlgorithmRegistryH CPL_DLL GDALGetGlobalAlgorithmRegistry(void);
void CPL_DLL GDALAlgorithmRegistryRelease(GDALAlgorithmRegistryH);
char CPL_DLL **GDALAlgorithmRegistryGetAlgNames(GDALAlgorithmRegistryH);
GDALAlgorithmH CPL_DLL GDALAlgorithmRegistryInstantiateAlg(
GDALAlgorithmRegistryH, const char *pszAlgName);
/************************************************************************/
/* GDALAlgorithmH API */
/************************************************************************/
void CPL_DLL GDALAlgorithmRelease(GDALAlgorithmH);
const char CPL_DLL *GDALAlgorithmGetName(GDALAlgorithmH);
const char CPL_DLL *GDALAlgorithmGetDescription(GDALAlgorithmH);
const char CPL_DLL *GDALAlgorithmGetLongDescription(GDALAlgorithmH);
const char CPL_DLL *GDALAlgorithmGetHelpFullURL(GDALAlgorithmH);
bool CPL_DLL GDALAlgorithmHasSubAlgorithms(GDALAlgorithmH);
char CPL_DLL **GDALAlgorithmGetSubAlgorithmNames(GDALAlgorithmH);
GDALAlgorithmH CPL_DLL
GDALAlgorithmInstantiateSubAlgorithm(GDALAlgorithmH, const char *pszSubAlgName);
bool CPL_DLL GDALAlgorithmParseCommandLineArguments(GDALAlgorithmH,
CSLConstList papszArgs);
GDALAlgorithmH CPL_DLL GDALAlgorithmGetActualAlgorithm(GDALAlgorithmH);
bool CPL_DLL GDALAlgorithmRun(GDALAlgorithmH, GDALProgressFunc pfnProgress,
void *pProgressData);
bool CPL_DLL GDALAlgorithmFinalize(GDALAlgorithmH);
char CPL_DLL *GDALAlgorithmGetUsageAsJSON(GDALAlgorithmH);
char CPL_DLL **GDALAlgorithmGetArgNames(GDALAlgorithmH);
GDALAlgorithmArgH CPL_DLL GDALAlgorithmGetArg(GDALAlgorithmH,
const char *pszArgName);
/************************************************************************/
/* GDALAlgorithmArgH API */
/************************************************************************/
void CPL_DLL GDALAlgorithmArgRelease(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetName(GDALAlgorithmArgH);
GDALAlgorithmArgType CPL_DLL GDALAlgorithmArgGetType(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetDescription(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetShortName(GDALAlgorithmArgH);
char CPL_DLL **GDALAlgorithmArgGetAliases(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetMetaVar(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetCategory(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsPositional(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsRequired(GDALAlgorithmArgH);
int CPL_DLL GDALAlgorithmArgGetMinCount(GDALAlgorithmArgH);
int CPL_DLL GDALAlgorithmArgGetMaxCount(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgGetPackedValuesAllowed(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgGetRepeatedArgAllowed(GDALAlgorithmArgH);
char CPL_DLL **GDALAlgorithmArgGetChoices(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsExplicitlySet(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgHasDefaultValue(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsHiddenForCLI(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsOnlyForCLI(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsInput(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgIsOutput(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetMutualExclusionGroup(GDALAlgorithmArgH);
bool CPL_DLL GDALAlgorithmArgGetAsBoolean(GDALAlgorithmArgH);
const char CPL_DLL *GDALAlgorithmArgGetAsString(GDALAlgorithmArgH);
GDALArgDatasetValueH
CPL_DLL GDALAlgorithmArgGetAsDatasetValue(GDALAlgorithmArgH);
int CPL_DLL GDALAlgorithmArgGetAsInteger(GDALAlgorithmArgH);
double CPL_DLL GDALAlgorithmArgGetAsDouble(GDALAlgorithmArgH);
char CPL_DLL **GDALAlgorithmArgGetAsStringList(GDALAlgorithmArgH);
const int CPL_DLL *GDALAlgorithmArgGetAsIntegerList(GDALAlgorithmArgH,
size_t *pnCount);
const double CPL_DLL *GDALAlgorithmArgGetAsDoubleList(GDALAlgorithmArgH,
size_t *pnCount);
bool CPL_DLL GDALAlgorithmArgSetAsBoolean(GDALAlgorithmArgH, bool);
bool CPL_DLL GDALAlgorithmArgSetAsString(GDALAlgorithmArgH, const char *);
bool CPL_DLL GDALAlgorithmArgSetAsDatasetValue(GDALAlgorithmArgH hArg,
GDALArgDatasetValueH value);
bool CPL_DLL GDALAlgorithmArgSetDataset(GDALAlgorithmArgH hArg, GDALDatasetH);
bool CPL_DLL GDALAlgorithmArgSetAsInteger(GDALAlgorithmArgH, int);
bool CPL_DLL GDALAlgorithmArgSetAsDouble(GDALAlgorithmArgH, double);
bool CPL_DLL GDALAlgorithmArgSetAsStringList(GDALAlgorithmArgH, CSLConstList);
bool CPL_DLL GDALAlgorithmArgSetAsIntegerList(GDALAlgorithmArgH, size_t nCount,
const int *pnValues);
bool CPL_DLL GDALAlgorithmArgSetAsDoubleList(GDALAlgorithmArgH, size_t nCount,
const double *pnValues);
/** Binary-or combination of GDAL_OF_RASTER, GDAL_OF_VECTOR,
* GDAL_OF_MULTIDIM_RASTER, possibly with GDAL_OF_UPDATE.
*/
typedef int GDALArgDatasetType;
GDALArgDatasetType CPL_DLL GDALAlgorithmArgGetDatasetType(GDALAlgorithmArgH);
/** Bit indicating that the name component of GDALArgDatasetValue is accepted. */
#define GADV_NAME (1 << 0)
/** Bit indicating that the dataset component of GDALArgDatasetValue is accepted. */
#define GADV_OBJECT (1 << 1)
int CPL_DLL GDALAlgorithmArgGetDatasetInputFlags(GDALAlgorithmArgH);
int CPL_DLL GDALAlgorithmArgGetDatasetOutputFlags(GDALAlgorithmArgH);
/************************************************************************/
/* GDALArgDatasetValueH API */
/************************************************************************/
GDALArgDatasetValueH CPL_DLL GDALArgDatasetValueCreate(void);
void CPL_DLL GDALArgDatasetValueRelease(GDALArgDatasetValueH);
const char CPL_DLL *GDALArgDatasetValueGetName(GDALArgDatasetValueH);
GDALDatasetH CPL_DLL GDALArgDatasetValueGetDatasetRef(GDALArgDatasetValueH);
GDALDatasetH
CPL_DLL GDALArgDatasetValueGetDatasetIncreaseRefCount(GDALArgDatasetValueH);
void CPL_DLL GDALArgDatasetValueSetName(GDALArgDatasetValueH, const char *);
void CPL_DLL GDALArgDatasetValueSetDataset(GDALArgDatasetValueH, GDALDatasetH);
SWIG API
All the above C API will be directly mapped to equivalent SWIG classes and methods.
It will be available in a new swig/include/Algorithm.i file.
It will be used by our Python autotest suite, as most of the testing will be done that way.
gdal binary
gdal.cpp is a launcher of about 50 lines of code that queries the gdal main algorithm, passes it the command line arguments and executes the GDALAlgorithm::Run method.
Out of scope
This RFC only addresses existing C++ utilities. Python utilities that would be migrated in the future as C++ utilities should follow this RFC.
The very specific sozip utility will not follow this RFC. It has been designed to mimic the existing standard zip utility.
Backward compatibility
Fully backwards compatible. Existing utilities will remain for now, and if the project decides to retire them in the future, that will likely go through a multi-year deprecation period. Such decision will be made later, depending on the maturity of the new unified CLI approach, and its adoption status by the community.
Before GDAL 3.11 release, we'll need to decide if we advertise the already implemented commands as stable or experimental. My feeling is that it might be prudent to label them as experimental for now, but release them as part of 3.11.0 to get broader feedback from users, before stabilizing command and option names.
Testing
Testing of the parsing logic of GDALAlgorithm and of setting argument values will be done in C++ in autotest/cpp/test_gdal_algorithm.cpp.
Testing of the commands and subcommands of gdal will be done in Python in autotest/utilities.
Documentation
The new gdal utility and its commands will be documented in https://gdal.org/programs
Staffing
The candidate implementation will be done by Even Rouault. Full scope will likely require a team effort, at least if we want to have a significant subset ready for 3.11. Otherwise it might take several release cycles. At the very least we'll need double checking of all naming to avoid adding new inconsistencies!
Voting history
+1 from PSC members JukkaR, DanielM, JavierJS, HowardB and EvenR