gdal vector select
Added in version 3.11.
Select a subset of fields from a vector dataset.
Synopsis
Usage: gdal vector select [OPTIONS] <INPUT> <OUTPUT> <FIELDS>
Select a subset of fields from a vector dataset.
Positional arguments:
-i, --input <INPUT> Input vector datasets [required]
-o, --output <OUTPUT> Output vector dataset [required]
--fields <FIELDS> Fields to select (or exclude if --exclude) [may be repeated] [required]
Common Options:
-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--config <KEY>=<VALUE> Configuration option [may be repeated]
--progress Display progress bar
Options:
-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format ("GDALG" allowed)
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed
--update Whether to open existing dataset in update mode
--overwrite-layer Whether overwriting existing layer is allowed
--append Whether appending to existing layer is allowed
--output-layer <OUTPUT-LAYER> Output layer name
--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--exclude Exclude specified fields
Mutually exclusive with --ignore-missing-fields
--ignore-missing-fields Ignore missing fields
Mutually exclusive with --exclude
Advanced Options:
--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated]
Description
gdal vector select can be used to select a subset of fields.
select can also be used as a step of gdal vector pipeline.
Standard options
- -f, --of, --format, --output-format <OUTPUT-FORMAT>
Which output vector format to use. Allowed values may be given by
gdal --formats | grep vector | grep rw | sort
- --co <NAME>=<VALUE>
Many formats have one or more optional dataset creation options that can be used to control particulars about the file created. For instance, the GeoPackage driver supports creation options to control the version.
May be repeated.
The dataset creation options available vary by format driver, and some simple formats have no creation options at all. The options supported by a format can be listed with the --formats command line option, but the documentation for the format is the definitive source of information on driver creation options. See the format-specific documentation under Vector drivers for the legal creation options of each format.
Note that dataset creation options are different from layer creation options.
- --lco <NAME>=<VALUE>
Layer creation option (format specific)
- --overwrite
Allow program to overwrite existing target file or dataset. Otherwise, by default, gdal errors out if the target file or dataset already exists.
- --active-layer <ACTIVE-LAYER>
Set the active layer. When it is specified, only the layer with that name is processed. Other layers will not be modified. If this option is not specified, all layers are processed.
- --fields <FIELDS>
Comma-separated list of fields from the input layer to copy to the new layer (or to exclude if --exclude is specified).
Field names containing spaces, commas or double-quote characters should be surrounded by a starting and ending double-quote character, and double-quote characters within a field name should be escaped with a backslash.
Depending on the shell used, this might require further quoting. For example, to select the three fields
regular_field
a_field_with space, and comma
a field with " double quote
with a Unix shell:
--fields "regular_field,\"a_field_with space, and comma\",\"a field with \\\" double quote\""
A field is only selected once, even if mentioned several times in the list.
Geometry fields can also be specified in the list. If the source layer has no explicit name for the geometry field, _ogr_geometry_ must be used to select the unique geometry field.
Specifying a non-existing source field name results in an error.
- --ignore-missing-fields
By default, if a field specified by --fields does not exist in the input layer(s), an error is emitted and the processing is stopped. When --ignore-missing-fields is specified, only a warning is emitted and the non-existing fields are ignored.
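The quoting rules for --fields can be checked without running gdal at all. The sketch below, for a POSIX shell, builds the field list from the --fields description twice: once with the double-quoted form given there, and once with single quotes, which leave only the escaping that gdal itself expects. The variable names dq and sq, and the file names in the commented-out command, are hypothetical and chosen for illustration only.

```shell
# Double-quoted form: the shell consumes one level of backslashes,
# so \" becomes " and \\\" becomes \" before gdal sees the value.
dq="regular_field,\"a_field_with space, and comma\",\"a field with \\\" double quote\""

# Single-quoted form: the shell passes every character through unchanged,
# so only the quoting and escaping required by gdal remain.
sq='regular_field,"a_field_with space, and comma","a field with \" double quote"'

# Print the value gdal would receive in each case.
printf '%s\n' "$dq"
printf '%s\n' "$sq"

# Hypothetical file names; uncomment to run against real data.
# gdal vector select in.gpkg out.gpkg --fields "$sq" --overwrite
```

Both printf calls print the same string, so the single-quoted form may be the easier one to read and maintain in scripts.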
Advanced options
- --oo <NAME>=<VALUE>
Dataset open option (format specific).
May be repeated.
- --if <format>
Format/driver name to try when opening the input file(s). It is generally not necessary to specify it, but it can be used to skip automatic driver detection when that detection fails to select the appropriate driver. This option can be repeated to specify several candidate drivers. Note that it does not force those drivers to open the dataset. In particular, some drivers have requirements on file extensions.
May be repeated.
GDALG output (on-the-fly / streamed dataset)
This program supports serializing the command line as a JSON file using the GDALG output format.
The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly / streamed way.
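A minimal sketch of that workflow follows. The file names are hypothetical, and the .gdalg.json extension for the serialized file is an assumption here. The guard skips both commands when gdal or the input file is not available.

```shell
# Hypothetical file names; adjust to your data.
input=in.shp
output=out.gdalg.json

# Run only when gdal and the input file are available.
if command -v gdal >/dev/null 2>&1 && [ -e "$input" ]; then
    # Serialize the command line to a GDALG JSON file instead of writing features.
    gdal vector select "$input" "$output" "EAS_ID,_ogr_geometry_" --output-format GDALG --overwrite

    # Opening the GDALG file applies the select step on the fly.
    gdal vector info "$output"
fi
```

No features are copied at serialization time, so the GDALG file remains dependent on the input dataset it refers to.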
Examples
Example 1: Select the EAS_ID field and the geometry field from a Shapefile
$ gdal vector select in.shp out.gpkg "EAS_ID,_ogr_geometry_" --overwrite
Example 2: Remove sensitive fields from a layer
$ gdal vector select in.shp out.gpkg --exclude "name,surname,address" --overwrite