gdal vector pipeline
Added in version 3.11.
Process a vector dataset.
Synopsis
Usage: gdal vector pipeline [OPTIONS] <PIPELINE>
Process a vector dataset.
Positional arguments:
Common Options:
-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--config <KEY>=<VALUE> Configuration option [may be repeated]
--progress Display progress bar
<PIPELINE> is of the form: read|concat [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write [WRITE-OPTIONS]
A pipeline chains several steps, separated with the ! (exclamation mark) character.
The first step must be read or concat, and the last one write. Each step has its own positional or non-positional arguments. Apart from read, concat and write, all other steps can potentially be used several times in a pipeline.
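The structure described above can be sketched in Python by assembling the argument list step by step. This is only an illustration: `build_pipeline` is a hypothetical helper, not part of GDAL, and it checks only the first-step and last-step rules stated above.

```python
def build_pipeline(steps, common_options=()):
    """Return the argv list for `gdal vector pipeline`.

    `steps` is a list of token lists, e.g. [["read", "in.gpkg"], ...].
    Hypothetical helper for illustration; GDAL itself does the real validation.
    """
    if not steps or steps[0][0] not in ("read", "concat"):
        raise ValueError("the first step must be 'read' or 'concat'")
    if steps[-1][0] != "write":
        raise ValueError("the last step must be 'write'")
    argv = ["gdal", "vector", "pipeline", *common_options]
    for step in steps:
        argv.append("!")  # every step, including the first, follows a "!"
        argv.extend(step)
    return argv


argv = build_pipeline(
    [
        ["read", "in.gpkg"],
        ["reproject", "--dst-crs=EPSG:32632"],
        ["write", "out.gpkg", "--overwrite"],
    ],
    common_options=["--progress"],
)
print(" ".join(argv))
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a system where GDAL 3.11 or later is installed. Passing a list rather than a single string avoids any shell interpretation of the ! character.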
Potential steps are:
read
* read [OPTIONS] <INPUT>
------------------------
Read a vector dataset.
Positional arguments:
-i, --input <INPUT> Input vector datasets [required]
Options:
-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
Advanced Options:
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated]
concat
* concat [OPTIONS] <INPUT>...
-----------------------------
Concatenate vector datasets.
Positional arguments:
-i, --input <INPUT> Input vector datasets [1.. values] [required]
Options:
-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
--mode <MODE> Determine the strategy to create output layers from source layers. MODE=merge-per-layer-name|stack|single (default: merge-per-layer-name)
--output-layer <OUTPUT-LAYER> Name of the output vector layer (single mode), or template to name the output vector layers (stack mode)
--source-layer-field-name <SOURCE-LAYER-FIELD-NAME> Name of the new field to add to contain identification of the source layer, with value determined from 'source-layer-field-content'
--source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT> A string, possibly using {AUTO_NAME}, {DS_NAME}, {DS_BASENAME}, {DS_INDEX}, {LAYER_NAME}, {LAYER_INDEX}
--field-strategy <FIELD-STRATEGY> How to determine target fields from source fields. FIELD-STRATEGY=union|intersection (default: union)
-s, --src-crs <SRC-CRS> Source CRS
-d, --dst-crs <DST-CRS> Destination CRS
Advanced Options:
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated]
Details for options can be found in gdal vector concat.
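As a rough sketch of how the concat options combine, the following Python snippet assembles the tokens of a concat step. `concat_step` is a hypothetical helper, and the field name `source` and layer name `cities` are made-up values. Only options listed in the synopsis above are used.

```python
CONCAT_MODES = ("merge-per-layer-name", "stack", "single")


def concat_step(inputs, mode="merge-per-layer-name", output_layer=None,
                source_layer_field_name=None, source_layer_field_content=None):
    """Return the tokens of a `concat` step (hypothetical helper)."""
    if not inputs:
        raise ValueError("concat needs at least one input dataset")
    if mode not in CONCAT_MODES:
        raise ValueError(f"mode must be one of {CONCAT_MODES}")
    step = ["concat", f"--mode={mode}"]
    if output_layer is not None:
        step.append(f"--output-layer={output_layer}")
    if source_layer_field_name is not None:
        step.append(f"--source-layer-field-name={source_layer_field_name}")
    if source_layer_field_content is not None:
        step.append(f"--source-layer-field-content={source_layer_field_content}")
    step.extend(inputs)  # positional inputs go last
    return step


# Merge two shapefiles into one layer and record each feature's origin.
step = concat_step(
    ["france.shp", "belgium.shp"],
    mode="single",
    output_layer="cities",                      # assumed layer name
    source_layer_field_name="source",           # assumed field name
    source_layer_field_content="{DS_BASENAME}",
)
print(step)
```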
clip
* clip [OPTIONS]
----------------
Clip a vector dataset.
Options:
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--bbox <BBOX> Clipping bounding box as xmin,ymin,xmax,ymax
Mutually exclusive with --geometry, --like
--bbox-crs <BBOX-CRS> CRS of clipping bounding box
--geometry <GEOMETRY> Clipping geometry (WKT or GeoJSON)
Mutually exclusive with --bbox, --like
--geometry-crs <GEOMETRY-CRS> CRS of clipping geometry
--like <DATASET> Dataset to use as a template for bounds
Mutually exclusive with --bbox, --geometry
--like-sql <SELECT-STATEMENT> SELECT statement to run on the 'like' dataset
Mutually exclusive with --like-where
--like-layer <LAYER-NAME> Name of the layer of the 'like' dataset
--like-where <WHERE-EXPRESSION> WHERE SQL clause to run on the 'like' dataset
Mutually exclusive with --like-sql
Details for options can be found in gdal vector clip.
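The mutual exclusivity of --bbox, --geometry and --like can be checked before GDAL is invoked. The sketch below uses a hypothetical `clip_step` helper and covers only those three options. GDAL performs its own validation regardless.

```python
def clip_step(bbox=None, geometry=None, like=None):
    """Return the tokens of a `clip` step (hypothetical helper).

    At most one of bbox, geometry and like may be given, mirroring the
    "Mutually exclusive with" notes in the synopsis.
    """
    selectors = {"bbox": bbox, "geometry": geometry, "like": like}
    given = [name for name, value in selectors.items() if value is not None]
    if len(given) > 1:
        raise ValueError(
            "mutually exclusive options: " + ", ".join("--" + n for n in given)
        )
    step = ["clip"]
    if bbox is not None:
        xmin, ymin, xmax, ymax = bbox
        step.append(f"--bbox={xmin},{ymin},{xmax},{ymax}")
    if geometry is not None:
        step.append(f"--geometry={geometry}")  # WKT or GeoJSON string
    if like is not None:
        step.append(f"--like={like}")
    return step


print(clip_step(bbox=(2, 49, 3, 50)))  # ['clip', '--bbox=2,49,3,50']
```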
edit
* edit [OPTIONS]
----------------
Edit metadata of a vector dataset.
Options:
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--geometry-type <GEOMETRY-TYPE> Layer geometry type
--crs <CRS> Override CRS (without reprojection)
--metadata <KEY>=<VALUE> Add/update dataset metadata item [may be repeated]
--unset-metadata <KEY> Remove dataset metadata item [may be repeated]
--layer-metadata <KEY>=<VALUE> Add/update layer metadata item [may be repeated]
--unset-layer-metadata <KEY> Remove layer metadata item [may be repeated]
Details for options can be found in gdal vector edit.
filter
* filter [OPTIONS]
------------------
Filter a vector dataset.
Options:
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--bbox <BBOX> Bounding box as xmin,ymin,xmax,ymax
--where <WHERE>|@<filename> Attribute query in a restricted form of the queries used in the SQL WHERE statement
Details for options can be found in gdal vector filter.
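The @<filename> form of --where is convenient when the expression is long or awkward to quote. The sketch below assumes that the file simply contains the expression text. `filter_step_from_file` and the file name `where.txt` are hypothetical.

```python
import tempfile
from pathlib import Path


def filter_step_from_file(expression, path):
    """Write `expression` to `path` and return a `filter` step that reads it.

    Assumption: GDAL reads the WHERE expression from the file named after "@".
    """
    Path(path).write_text(expression, encoding="utf-8")
    return ["filter", "--where", f"@{path}"]


with tempfile.TemporaryDirectory() as tmp:
    where_path = Path(tmp) / "where.txt"
    step = filter_step_from_file("pop > 1e6", where_path)
    print(step)
    # The pipeline must run while the file still exists, i.e. inside this block.
```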
geom
* geom <COMMAND> [OPTIONS]
where <COMMAND> is one of:
- buffer: Compute a buffer around geometries of a vector dataset.
- explode-collections: Explode geometries of type collection of a vector dataset.
- make-valid: Fix validity of geometries of a vector dataset.
- segmentize: Segmentize geometries of a vector dataset.
- set-type: Modify the geometry type of a vector dataset.
- simplify: Simplify geometries of a vector dataset.
- swap-xy: Swap X and Y coordinates of geometries of a vector dataset.
Details for options can be found in gdal vector geom.
reproject
* reproject [OPTIONS]
---------------------
Reproject a vector dataset.
Options:
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
-s, --src-crs <SRC-CRS> Source CRS
-d, --dst-crs <DST-CRS> Destination CRS [required]
Details for options can be found in gdal vector reproject.
select
* select [OPTIONS] <FIELDS>
---------------------------
Select a subset of fields from a vector dataset.
Positional arguments:
--fields <FIELDS> Fields to select (or exclude if --exclude) [may be repeated] [required]
Options:
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--exclude Exclude specified fields
Mutually exclusive with --ignore-missing-fields
--ignore-missing-fields Ignore missing fields
Mutually exclusive with --exclude
Details for options can be found in gdal vector select.
sql
* sql [OPTIONS] <statement>|@<filename>
---------------------------------------
Apply SQL statement(s) to a dataset.
Positional arguments:
--sql <statement>|@<filename> SQL statement(s) [may be repeated] [required]
Options:
-l, --output-layer <OUTPUT-LAYER> Output layer name(s) [may be repeated]
--dialect <DIALECT> SQL dialect (e.g. OGRSQL, SQLITE)
Details for options can be found in gdal vector sql.
write
* write [OPTIONS] <OUTPUT>
--------------------------
Write a vector dataset.
Positional arguments:
-o, --output <OUTPUT> Output vector dataset [required]
Options:
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format ("GDALG" allowed)
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed
--update Whether to open existing dataset in update mode
--overwrite-layer Whether overwriting existing layer is allowed
--append Whether appending to existing layer is allowed
-l, --output-layer <OUTPUT-LAYER> Output layer name
Description
gdal vector pipeline processes a vector dataset by chaining several processing steps in a single command.
GDALG output (on-the-fly / streamed dataset)
A pipeline can be serialized as a JSON file using the GDALG output format.
The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly / streamed way.
The command_line member of the JSON file should nominally be the whole command line without the final write step, and is what is generated by gdal vector pipeline ! .... ! write out.gdalg.json.
{
"type": "gdal_streamed_alg",
"command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
}
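A document of this shape can also be produced by hand. The following sketch builds it with Python's `json` module. `gdalg_document` is a hypothetical helper that assumes no token contains whitespace, since how quoting inside command_line is parsed is not described here.

```python
import json


def gdalg_document(steps):
    """Return a GDALG dictionary for a list of steps without the write step.

    Hypothetical helper; each step is a list of whitespace-free tokens.
    """
    for step in steps:
        for token in step:
            if any(ch.isspace() for ch in token):
                raise ValueError(f"token contains whitespace: {token!r}")
    command_line = "gdal vector pipeline " + " ".join(
        "! " + " ".join(step) for step in steps
    )
    return {"type": "gdal_streamed_alg", "command_line": command_line}


doc = gdalg_document([
    ["read", "in.gpkg"],
    ["reproject", "--dst-crs=EPSG:32632"],
])
print(json.dumps(doc, indent=2))
```

Saving the printed text under a name such as in_epsg_32632.gdalg.json should give a file that can be opened like the one written by the pipeline itself.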
The final write step can be added, but if so it must explicitly specify the stream output format and a non-significant output dataset name.
{
"type": "gdal_streamed_alg",
"command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write --output-format=streamed streamed_dataset"
}
Examples
Example 1: Reproject a GeoPackage file to CRS EPSG:32632 ("WGS 84 / UTM zone 32N")
$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite
Example 2: Serialize the command of a reprojection of a GeoPackage file in a GDALG file, and later read it
$ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write in_epsg_32632.gdalg.json --overwrite
$ gdal vector info in_epsg_32632.gdalg.json
Example 3: Concatenate two source shapefiles (with similar structure), reproject them to EPSG:32632, keep only cities larger than 1 million inhabitants and write to a GeoPackage
$ gdal vector pipeline --progress ! concat --mode=single --dst-crs=EPSG:32632 france.shp belgium.shp ! filter --where "pop > 1e6" ! write out.gpkg --overwrite
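The same pipeline can be launched from a script. The sketch below is a minimal Python wrapper: `run_gdal` is a hypothetical helper that defaults to a dry run, and it uses the `--mode=single` spelling listed in the concat synopsis. Because the arguments are passed as a list, the WHERE expression needs no shell quoting.

```python
import shutil
import subprocess


def run_gdal(argv, dry_run=True):
    """Run `argv`, or only print it when dry_run is set or gdal is missing."""
    if dry_run or shutil.which(argv[0]) is None:
        print("would run:", " ".join(argv))
        return None
    # check=True raises CalledProcessError if gdal exits with an error.
    return subprocess.run(argv, check=True)


argv = [
    "gdal", "vector", "pipeline", "--progress",
    "!", "concat", "--mode=single", "--dst-crs=EPSG:32632",
    "france.shp", "belgium.shp",
    "!", "filter", "--where", "pop > 1e6",  # one token, no quoting needed
    "!", "write", "out.gpkg", "--overwrite",
]
result = run_gdal(argv)  # pass dry_run=False to actually execute
```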