gdal vector concat
Added in version 3.11.
Concatenate vector datasets
Synopsis
Usage: gdal vector concat [OPTIONS] <INPUTS>... <OUTPUT>
Concatenate vector datasets.
Positional arguments:
-i, --input <INPUTS> Input vector datasets [1.. values] [required] [not available in pipelines]
-o, --output <OUTPUT> Output vector dataset [required] [not available in pipelines]
Common Options:
-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--config <KEY>=<VALUE> Configuration option [may be repeated]
-q, --quiet Quiet mode (no progress bar or warning message) [not available in pipelines]
Options:
-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated] [not available in pipelines]
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format ("GDALG" allowed) [not available in pipelines]
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated] [not available in pipelines]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated] [not available in pipelines]
--overwrite Whether overwriting existing output dataset is allowed [not available in pipelines]
--update Whether to open existing dataset in update mode [not available in pipelines]
--overwrite-layer Whether overwriting existing output layer is allowed [not available in pipelines]
--append Whether appending to existing layer is allowed [not available in pipelines]
Mutually exclusive with --upsert
--skip-errors Skip errors when writing features [not available in pipelines]
--mode <MODE> Determine the strategy to create output layers from source layers. MODE=merge-per-layer-name|stack|single (default: merge-per-layer-name)
--output-layer <OUTPUT-LAYER> Name of the output vector layer (single mode), or template to name the output vector layers (stack mode)
--source-layer-field-name <SOURCE-LAYER-FIELD-NAME> Name of the new field to add to contain identification of the source layer, with value determined from 'source-layer-field-content'
--source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT> A string, possibly using {AUTO_NAME}, {DS_NAME}, {DS_BASENAME}, {DS_INDEX}, {LAYER_NAME}, {LAYER_INDEX}
--field-strategy <FIELD-STRATEGY> How to determine target fields from source fields. FIELD-STRATEGY=union|intersection (default: union)
-s, --input-crs <INPUT-CRS> Input CRS
-d, --output-crs <OUTPUT-CRS> Output CRS
Advanced Options:
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated] [not available in pipelines]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated] [not available in pipelines]
--output-oo, --output-open-option <KEY>=<VALUE> Output open options [may be repeated] [not available in pipelines]
--upsert Upsert features (implies 'append') [not available in pipelines]
Mutually exclusive with --append
Description
gdal vector concat concatenates several source datasets.
It has 3 main modes:
- --mode=merge-per-layer-name (the default). The output dataset generated by the command will contain as many layers as there are different layer names in the source datasets. For example, if there are 2 datasets, one with layers a and b, and the other one with layers b and c, 3 output layers will be created: a, b (merging the 2 source layers) and c.
- --mode=stack. The output dataset generated by the command will contain as many layers as there are layers in the source datasets. For example, if there are 2 datasets ds1 (with layers a and b) and ds2 (with layers b and c), 4 output layers will be created: ds1_a, ds1_b, ds2_b and ds2_c.
- --mode=single. The output dataset generated by the command will contain one single layer, merging all layers in the source datasets.
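The three modes above can be illustrated with a small pure-Python model. This is only a sketch of the layer naming described in this section, not GDAL's implementation: the function name output_layers and the (dataset base name, layer names) input shape are hypothetical, and stack mode assumes the default {AUTO_NAME} naming.

```python
def output_layers(datasets, mode="merge-per-layer-name", single_name="merged"):
    """Model which source layers feed which output layer.

    datasets: list of (dataset_base_name, [layer_name, ...]) tuples.
    Returns a dict mapping each output layer name to the list of
    (dataset_base_name, layer_name) sources written into it.
    Hypothetical helper for illustration only.
    """
    result = {}
    for ds_name, layer_names in datasets:
        for layer_name in layer_names:
            if mode == "merge-per-layer-name":
                # One output layer per distinct source layer name.
                key = layer_name
            elif mode == "stack":
                # One output layer per source layer (default {AUTO_NAME} naming).
                key = layer_name if ds_name == layer_name else f"{ds_name}_{layer_name}"
            elif mode == "single":
                # Everything goes into one output layer.
                key = single_name
            else:
                raise ValueError(f"unknown mode: {mode}")
            result.setdefault(key, []).append((ds_name, layer_name))
    return result


sources = [("ds1", ["a", "b"]), ("ds2", ["b", "c"])]
for mode in ("merge-per-layer-name", "stack", "single"):
    print(mode, list(output_layers(sources, mode)))
```

With the two datasets from the example above, the model lists a, b, c for merge-per-layer-name, ds1_a, ds1_b, ds2_b, ds2_c for stack, and merged for single.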
When an output layer merges several source layers, by default the resulting
schema will contain the union of all source fields. It is possible to keep
only the fields common to all source layers by setting --field-strategy to intersection.
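The difference between the two field strategies can be sketched in plain Python. This is a minimal model operating on field names only; the function name target_fields is hypothetical, and the ordering of the resulting fields (first appearance across sources) is an assumption that this page does not specify.

```python
def target_fields(source_schemas, strategy="union"):
    """Model the target field list built from several source schemas.

    source_schemas: list of lists of field names, one list per source layer.
    Field order in the result is an assumption, not documented behaviour.
    """
    if not source_schemas:
        return []
    if strategy == "union":
        # Every field seen in any source layer, kept once.
        fields = []
        for schema in source_schemas:
            for name in schema:
                if name not in fields:
                    fields.append(name)
        return fields
    if strategy == "intersection":
        # Only fields present in every source layer.
        common = set(source_schemas[0]).intersection(*source_schemas[1:])
        return [name for name in source_schemas[0] if name in common]
    raise ValueError(f"unknown strategy: {strategy}")


schemas = [["id", "name"], ["id", "population"]]
print(target_fields(schemas, "union"))
print(target_fields(schemas, "intersection"))
```

In the union case, features coming from a layer that lacks a given field would simply have no value for it in the merged layer.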
Regarding the resulting CRS, by default the CRS of the source layer will be
used as the target CRS, and features of other source layers that do not match
this CRS will be reprojected to it. --output-crs can be used to select
a given destination CRS.
This command can also be used as the first step of gdal vector pipeline.
GDALG output (on-the-fly / streamed dataset)
This program supports serializing the command line as a JSON file using the GDALG output format.
The resulting file can then be opened as a vector dataset using the
GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly /
streamed way.
Program-Specific Options
- --field-strategy union|intersection
Determines how the schema of the target layer is built from the schemas of the input layers:
union (default) to use a super-set of all the fields from all source layers.
intersection to use a sub-set of all the common fields from all source layers.
- --input-crs, -s <INPUT-CRS>
Set (override) input spatial reference. If not specified the SRS found in the input dataset will be used.
The coordinate reference systems that can be passed are anything supported by the OGRSpatialReference.SetFromUserInput() call, which includes EPSG Projected, Geographic or Compound CRS (e.g. EPSG:4326), a well known text (WKT) CRS definition, PROJ.4 declarations, or the name of a .prj file containing a WKT CRS definition.
If the SRS has an explicit vertical datum that points to a geoid grid, and the input dataset is a single band dataset, a vertical correction will be applied to the values of the dataset.
- --mode merge-per-layer-name|stack|single
Determine the strategy to create output layers from source layers. See introductory paragraph for more details.
- --output-crs, -d <OUTPUT-CRS>
Set output spatial reference. Inputs will be reprojected to this CRS if necessary.
The coordinate reference systems that can be passed are anything supported by the OGRSpatialReference.SetFromUserInput() call, which includes EPSG Projected, Geographic or Compound CRS (e.g. EPSG:4326), a well known text (WKT) CRS definition, PROJ.4 declarations, or the name of a .prj file containing a WKT CRS definition.
If the SRS has an explicit vertical datum that points to a geoid grid, and the input dataset is a single band dataset, a vertical correction will be applied to the values of the dataset.
- --output-layer <OUTPUT-LAYER>
Name of the output vector layer in single mode (the default is "merged"), or template to name the output vector layers in stack mode (the default value is {AUTO_NAME}). Not allowed in merge-per-layer-name mode.
The template in stack mode can be a string with the following variables that will be substituted with a value computed from the input layer being processed:
{AUTO_NAME}: equivalent to {DS_BASENAME}_{LAYER_NAME} if both values are different, or {LAYER_NAME} when they are identical (case of shapefile).
{DS_NAME}: name of the source dataset
{DS_BASENAME}: base name of the source dataset
{DS_INDEX}: index of the source dataset
{LAYER_NAME}: name of the source layer
{LAYER_INDEX}: index of the source layer
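The substitution of these variables can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than GDAL's actual code: the function expand_template is hypothetical, {DS_BASENAME} is assumed to be the file name without directory and extension, and the index values are supplied by the caller because this page does not say whether they start at 0 or 1.

```python
import re
from pathlib import PurePosixPath

_VARIABLE = re.compile(
    r"\{(AUTO_NAME|DS_NAME|DS_BASENAME|DS_INDEX|LAYER_NAME|LAYER_INDEX)\}"
)


def expand_template(template, ds_name, ds_index, layer_name, layer_index):
    """Expand an --output-layer style template for one source layer.

    Assumes a file-based dataset name; DS_BASENAME is taken as the file
    stem (an assumption, not confirmed by the documentation).
    """
    ds_basename = PurePosixPath(ds_name).stem
    # {AUTO_NAME}: avoid repeating the name when dataset and layer match,
    # as happens with shapefiles.
    if ds_basename == layer_name:
        auto_name = layer_name
    else:
        auto_name = f"{ds_basename}_{layer_name}"
    values = {
        "AUTO_NAME": auto_name,
        "DS_NAME": ds_name,
        "DS_BASENAME": ds_basename,
        "DS_INDEX": ds_index,
        "LAYER_NAME": layer_name,
        "LAYER_INDEX": layer_index,
    }
    # Single pass, so substituted values are never re-expanded.
    return _VARIABLE.sub(lambda m: str(values[m.group(1)]), template)


print(expand_template("{AUTO_NAME}", "data/ds1.gpkg", 0, "a", 0))
print(expand_template("{AUTO_NAME}", "france.shp", 0, "france", 0))
```

Under these assumptions the first call yields ds1_a and the second yields france, matching the shapefile case described above.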
- --source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT>
If specified, the schema of the target layer will be extended with a new field (whose name is given by --source-layer-field-name, or source_ds_lyr otherwise), whose content is determined by the specified template (see --output-layer for variables that can be used).
- --source-layer-field-name <SOURCE-LAYER-FIELD-NAME>
If specified, the schema of the target layer will be extended with a field whose name is the value of this option and whose content is determined by --source-layer-field-content.
Standard Options
- --append
Whether appending features to existing layer(s) is allowed. This also creates the output dataset if it does not exist yet.
- --co, --creation-option <NAME>=<VALUE>
Many formats have one or more optional dataset creation options that can be used to control particulars about the file created. For instance, the GeoPackage driver supports creation options to control the version.
May be repeated.
The dataset creation options available vary by format driver, and some simple formats have no creation options at all. The options supported for a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See Vector drivers format specific documentation for legal creation options for each format.
Note that dataset creation options are different from layer creation options.
- --if, --input-format <format>
Format/driver name to try when opening the input file(s). It is generally not necessary to specify it, but it can be used to skip automatic driver detection when that fails to select the appropriate driver. This option can be repeated several times to specify several candidate drivers. Note that it does not force those drivers to open the dataset. In particular, some drivers have requirements on file extensions.
May be repeated.
- --input-layer <INPUT-LAYER>
Specifies the name of one or more layers to process. By default, all layers will be processed.
- --lco, --layer-creation-option <NAME>=<VALUE>
Many formats have one or more optional layer creation options that can be used to control particulars about the layer created. For instance, the GeoPackage driver supports layer creation options to control the feature identifier or geometry column name, setting the identifier or description, etc.
May be repeated.
The layer creation options available vary by format driver, and some simple formats have no layer creation options at all. The options supported for a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See Vector drivers format specific documentation for legal creation options for each format.
Note that layer creation options are different from dataset creation options.
- --oo, --open-option <NAME>=<VALUE>
Dataset open option (format specific).
May be repeated.
- -f, --of, --format, --output-format <OUTPUT-FORMAT>
Which output vector format to use. Allowed values may be given by
gdal --formats | grep vector | grep rw | sort
- --output-open-option, --output-oo <NAME>=<VALUE>
Added in version 3.12.
Dataset open option for output dataset (format specific).
May be repeated.
- --overwrite
Allow program to overwrite existing target file or dataset. Otherwise, by default, gdal errors out if the target file or dataset already exists.
- --overwrite-layer
Whether overwriting the existing output vector layer is allowed.
- --skip-errors
Added in version 3.12.
Whether failures to write feature(s) should be ignored. Note that this option sets the size of the transaction unit to one feature at a time, which may cause severe slowdown when inserting into databases.
- --update
Whether to open an existing output dataset in update mode.
- --upsert
Added in version 3.12.
Variant of --append where the OGRLayer::UpsertFeature() operation is used to insert or update features instead of appending with OGRLayer::CreateFeature().
This is currently implemented only in a few drivers: GPKG -- GeoPackage vector, Elasticsearch: Geographically Encoded Objects for Elasticsearch and MongoDBv3 (drivers that implement upsert expose the GDAL_DCAP_UPSERT capability).
The upsert operation uses the FID of the input feature, when it is set (and the FID column name is not the empty string), as the key to update existing features. It is crucial to make sure that the FID in the source and target layers are consistent.
For the GPKG driver, it is also possible to upsert features whose FID is unset or non-significant (the --unset-fid option of gdal vector edit can be used to ignore the FID from the source feature), when there is a UNIQUE column that is not the integer primary key.
Return status code
The program returns status code 0 in case of success, and non-zero in case of error (non-blocking errors emitted as warnings are considered as a successful execution).
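A calling script can rely on this status code to detect failures. The sketch below assumes the gdal executable is on the PATH and uses only options listed on this page; build_concat_command, succeeded and main are hypothetical helper names introduced for illustration.

```python
import subprocess
import sys


def build_concat_command(inputs, output, mode=None, output_crs=None,
                         field_strategy=None):
    """Build the argument list for 'gdal vector concat' (hypothetical helper)."""
    argv = ["gdal", "vector", "concat"]
    if mode:
        argv.append(f"--mode={mode}")
    if field_strategy:
        argv.append(f"--field-strategy={field_strategy}")
    if output_crs:
        argv.append(f"--output-crs={output_crs}")
    # Inputs come first, the output dataset is the last positional argument.
    argv.extend(inputs)
    argv.append(output)
    return argv


def succeeded(argv):
    """Run a command and report whether it returned status code 0."""
    return subprocess.run(argv).returncode == 0


def main():
    # Not called automatically: requires a GDAL installation providing 'gdal'.
    argv = build_concat_command(
        ["france.shp", "germany.shp"], "merged.shp",
        mode="single", output_crs="EPSG:4258",
    )
    if not succeeded(argv):
        sys.exit("gdal vector concat failed")
```

Because warnings still count as success, a script that needs to react to them would have to inspect the program's output in addition to the status code.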
Examples
Example 1: Creating a GeoPackage stacking all input shapefiles in separate layers.
gdal vector concat --mode=stack *.shp out.gpkg
Example 2: Adding a field to indicate the source layer, and reprojecting to a single CRS
Concatenate the content of france.shp and germany.shp into merged.shp,
reprojecting them to ETRS89, and add a 'country' field to each feature whose value is 'france' or
'germany' depending on which file the feature comes from:
gdal vector concat --mode=single --source-layer-field-name=country --output-crs=EPSG:4258 france.shp germany.shp merged.shp