SQL SQLite dialect
The SQLite
dialect can be used as an alternate SQL dialect to the
OGR SQL dialect.
This assumes that GDAL/OGR is built with support for SQLite, and preferably
with Spatialite support too to benefit from spatial functions.
The SQLite dialect may be used with any OGR datasource, like the OGR SQL dialect.
The SQLite
dialect can be requested with the SQLite
string passed
as the dialect parameter of GDALDataset::ExecuteSQL()
, or with the
-dialect option of the ogrinfo or ogr2ogr utilities.
This is mainly aimed to execute SELECT statements, but, for datasources that support update, INSERT/UPDATE/DELETE statements can also be run. GDAL is internally using the Virtual Table Mechanism of SQLite and therefore operations like ALTER TABLE are not supported. For executing ALTER TABLE or DROP TABLE use OGR SQL dialect
If the datasource is SQLite database (GeoPackage, SpatiaLite) then SQLite dialect acts as native SQL dialect and Virtual Table Mechanism is not used. It is possible to force GDAL to use Virtual Tables even in this case by specifying "-dialect INDIRECT_SQLITE". This should be used only when necessary, since going through the virtual table mechanism might affect performance.
The syntax of the SQL statements is fully the one of the SQLite SQL engine. You can refer to the following pages:
SELECT statement
The SELECT statement is used to fetch layer features (analogous to table rows in an RDBMS) with the result of the query represented as a temporary layer of features. The layers of the datasource are analogous to tables in an RDBMS and feature attributes are analogous to column values. The simplest form of OGR SQLITE SELECT statement looks like this:
SELECT * FROM polylayer
More complex statements can of course be used, including WHERE, JOIN, USING, GROUP BY, ORDER BY, sub SELECT, ...
The table names that can be used are the layer names available in the datasource on which the ExecuteSQL() method is called.
Similarly to OGRSQL, it is also possible to refer to layers of other datasources with the following syntax : "other_datasource_name"."layer_name".
SELECT p.*, NAME FROM poly p JOIN "idlink.dbf"."idlink" il USING (eas_id)
If the master datasource is SQLite database (GeoPackage, SpatiaLite) it is necessary to use indirect SQLite dialect. Otherwise additional datasources are never opened but tables to be used in joins are searched from the master database.
ogrinfo jointest.gpkg -dialect INDIRECT_SQLITE -sql \
"SELECT a.ID,b.ID FROM jointest a JOIN \"jointest2.shp\".\"jointest2\" b ON a.ID=b.ID"
The column names that can be used in the result column list, in WHERE, JOIN, ... clauses are the field names of the layers. Expressions, SQLite functions, spatial functions, etc... can also be used.
The conditions on fields expressed in WHERE clauses, or in JOINs are translated, as far as possible, as attribute filters that are applied on the underlying OGR layers. Joins can be very expensive operations if the secondary table is not indexed on the key field being used.
LIKE operator
In SQLite, the LIKE operator is case insensitive, unless PRAGMA case_sensitive_like = 1
has been issued.
Starting with GDAL 3.9, GDAL installs a custom LIKE comparison, such that UTF-8
characters are taken into account by LIKE
operator.
For case insensitive comparisons, this is restricted to the ASCII, Latin-1 Supplement, Latin Extended-A, Latin Extended-B, Greek and Coptic and Cyrillic Unicode categories.
Delimited identifiers
If names of layers or attributes are reserved keywords in SQL like 'FROM' or they begin with a number or underscore they must be handled as "delimited identifiers" and enclosed between double quotation marks in queries. Double quotes can be used even when they are not strictly needed.
SELECT "p"."geometry", "p"."FROM", "p"."3D" FROM "poly" p
When SQL statements are used in the command shell and the statement itself is put between double quotes, the internal double quotes must be escaped with \
ogrinfo p.shp -sql "SELECT geometry \"FROM\", \"3D\" FROM p"
Geometry field
Geometry fields can be explicitly specified in the result column list of a SELECT, or automatically selected if the * wildcard is used.
For OGR layers that have a non-empty geometry column name (generally for RDBMS datasources), as returned by OGRLayer::GetGeometryColumn(), the name of the geometry special field in the SQL statement must be the name of the geometry column of the underlying OGR layer. If the name of the geometry column in the source layer is empty, like with shapefiles etc., the name to use in the SQL statement must be "geometry". Here we'll use it case-insensitively (as all field names are in a SELECT statement):
SELECT EAS_ID, GEOMETRY FROM poly
returns:
OGRFeature(SELECT):0
EAS_ID (Real) = 168
POLYGON ((479819.84375 4765180.5,479690.1875 4765259.5,[...],479819.84375 4765180.5))
SELECT * FROM poly
returns:
OGRFeature(SELECT):0
AREA (Real) = 215229.266
EAS_ID (Real) = 168
PRFEDEA (String) = 35043411
POLYGON ((479819.84375 4765180.5,479690.1875 4765259.5,[...],479819.84375 4765180.5))
Feature id (FID)
The feature id is a special property of a feature and not treated
as an attribute of the feature. In some cases it is convenient to be able to
utilize the feature id in queries and result sets as a regular field. To do
so use the name rowid
.
Starting with GDAL 3.8, if the layer has a named FID column
(OGRLayer::GetFIDColumn()
!= ""), this name may also be used.
The field wildcard expansions will not include the feature id, but it may be explicitly included using a syntax like:
SELECT rowid, * FROM nation
It is of course possible to rename it:
SELECT rowid AS fid, * FROM nation
OGR_STYLE special field
The OGR_STYLE
special field represents the style string of the feature
returned by OGRFeature::GetStyleString(). By using this field and the
LIKE
operator the result of the query can be filtered by the style.
For example we can select the annotation features as:
SELECT * FROM nation WHERE OGR_STYLE LIKE 'LABEL%'
Statistics functions
In addition to standard COUNT(), SUM(), AVG(), MIN(), MAX(), the following aggregate functions are available:
STDDEV_POP(numeric_value)
: (GDAL >= 3.10) numerical population standard deviation.STDDEV_SAMP(numeric_value)
: (GDAL >= 3.10) numerical sample standard deviation
Ordered-set aggregate functions
The following aggregate functions are available. Note that they require to allocate an amount of memory proportional to the number of selected rows (for MEDIAN
, PERCENTILE
and PERCENTILE_CONT
) or to the number of values (for MODE
).
MEDIAN(numeric_value)
: (GDAL >= 3.10) (continuous) median (equivalent toPERCENTILE(numeric_value, 50)
). NULL values are ignored.PERCENTILE(numeric_value, percentage)
: (GDAL >= 3.10) (continuous) percentile, with percentage between 0 and 100 (equivalent toPERCENTILE_CONT(numeric_value, percentage / 100)
). NULL values are ignored.PERCENTILE_CONT(numeric_value, fraction)
: (GDAL >= 3.10) (continuous) percentile, with fraction between 0 and 1. NULL values are ignored.MODE(value)
: (GDAL >= 3.10): mode, i.e. most frequent input value (strings and numeric values are supported), arbitrarily choosing the first one if there are multiple equally-frequent results. NULL values are ignored.
Spatialite SQL functions
When GDAL/OGR is build with support for the Spatialite library, a lot of extra SQL functions, in particular spatial functions, can be used in results column fields, WHERE clauses, etc....
SELECT EAS_ID, ST_Area(GEOMETRY) AS area FROM poly WHERE
ST_Intersects(GEOMETRY, BuildCircleMbr(479750.6875,4764702.0,100))
returns:
OGRFeature(SELECT):0
EAS_ID (Real) = 169
area (Real) = 101429.9765625
OGRFeature(SELECT):1
EAS_ID (Real) = 165
area (Real) = 596610.3359375
OGRFeature(SELECT):2
EAS_ID (Real) = 170
area (Real) = 5268.8125
Note that due to the loose typing mechanism of SQLite, if a geometry expression returns a NULL value for the first row, this will generally cause OGR not to recognize the column as a geometry column. It might be then useful to sort the results by making sure that non-null geometries are returned first:
ogrinfo test.shp -sql "SELECT * FROM (SELECT ST_Buffer(geometry,5) AS geometry FROM test) ORDER BY geometry IS NULL ASC" -dialect sqlite
OGR datasource SQL functions
The ogr_datasource_load_layers(datasource_name[, update_mode[, prefix]])
function can be used to automatically load all the layers of a datasource as
VirtualOGR tables.
sqlite> SELECT load_extension('libgdal.so');
sqlite> SELECT load_extension('mod_spatialite');
sqlite> SELECT ogr_datasource_load_layers('poly.shp');
1
sqlite> SELECT * FROM sqlite_master;
table|poly|poly|0|CREATE VIRTUAL TABLE "poly" USING VirtualOGR('poly.shp', 0, 'poly')
OGR layer SQL functions
The following SQL functions are available and operate on a layer name :
ogr_layer_Extent()
, ogr_layer_SRID()
,
ogr_layer_GeometryType()
and ogr_layer_FeatureCount()
SELECT ogr_layer_Extent('poly'), ogr_layer_SRID('poly') AS srid,
ogr_layer_GeometryType('poly') AS geomtype, ogr_layer_FeatureCount('poly') AS count
OGRFeature(SELECT):0
srid (Integer) = 40004
geomtype (String) = POLYGON
count (Integer) = 10
POLYGON ((478315.53125 4762880.5,481645.3125 4762880.5,481645.3125 4765610.5,478315.53125 4765610.5,478315.53125 4762880.5))
OGR compression functions
ogr_deflate(text_or_blob[, compression_level])
returns a binary blob
compressed with the ZLib deflate algorithm. See CPLZLibDeflate()
ogr_inflate(compressed_blob)
returns the decompressed binary blob,
from a blob compressed with the ZLib deflate algorithm.
If the decompressed binary is a string, use
CAST(ogr_inflate(compressed_blob) AS VARCHAR). See CPLZLibInflate().
Other functions
The hstore_get_value()
function can be used to extract
a value associate to a key from a HSTORE string, formatted like "key=>value,other_key=>other_value,..."
SELECT hstore_get_value('a => b, "key with space"=> "value with space"', 'key with space') --> 'value with space'
OGR geocoding functions
The following SQL functions are available : ogr_geocode(...)
and ogr_geocode_reverse(...)
.
ogr_geocode(name_to_geocode [, field_to_return [, option1 [, option2, ...]]])
where
name_to_geocode is a literal or a column name that must be geocoded. field_to_return if specified can be "geometry" for
the geometry (default), or a field name of the layer returned by OGRGeocode()
. The special field "raw" can also be used
to return the raw response (XML string) of the geocoding service.
option1, option2, etc.. must be of the key=value format, and are options understood
by OGRGeocodeCreateSession()
or OGRGeocode().
This function internally uses the OGRGeocode() API. Refer to it for more details.
SELECT ST_Centroid(ogr_geocode('Paris'))
returns:
OGRFeature(SELECT):0
POINT (2.34287687375113 48.856622357411)
ogrinfo cities.csv -dialect sqlite -sql "SELECT *, ogr_geocode(city, 'country_code') AS country_code, ST_Centroid(ogr_geocode(city)) FROM cities"
returns:
OGRFeature(SELECT):0
city (String) = Paris
country_code (String) = fr
POINT (2.34287687375113 48.856622357411)
OGRFeature(SELECT):1
city (String) = London
country_code (String) = gb
POINT (-0.109415723431508 51.5004964757441)
OGRFeature(SELECT):2
city (String) = Rennes
country_code (String) = fr
POINT (-1.68185479486048 48.1116771631195)
OGRFeature(SELECT):3
city (String) = New York
country_code (String) = us
POINT (-73.9388908443975 40.6632061220125)
OGRFeature(SELECT):4
city (String) = Beijing
country_code (String) = cn
POINT (116.3912972 39.9057136)
ogr_geocode_reverse(longitude, latitude, field_to_return [, option1 [, option2, ...]])
where
longitude, latitude is the coordinate to query. field_to_return must be a field name of the layer
returned by OGRGeocodeReverse() (for example 'display_name'). The special field "raw" can also be used
to return the raw response (XML string) of the geocoding service.
option1, option2, etc.. must be of the key=value format, and are options understood
by OGRGeocodeCreateSession() or OGRGeocodeReverse().
ogr_geocode_reverse(geometry, field_to_return [, option1 [, option2, ...]])
is also accepted
as an alternate syntax where geometry is a (Spatialite) point geometry.
This function internally uses the OGRGeocodeReverse()
API. Refer to it for more details.
Spatialite spatial index
Spatialite spatial index mechanism can be triggered by making sure a spatial index virtual table is mentioned in the SQL (of the form idx_layername_geometrycolumn), or by using the more recent SpatialIndex from the VirtualSpatialIndex extension. In which case, a in-memory RTree will be built to be used to speed up the spatial queries.
For example, a spatial intersection between 2 layers, by using a spatial index on one of the layers to limit the number of actual geometry intersection computations :
SELECT city_name, region_name FROM cities, regions WHERE
ST_Area(ST_Intersection(cities.geometry, regions.geometry)) > 0 AND
regions.rowid IN (
SELECT pkid FROM idx_regions_geometry WHERE
xmax >= MbrMinX(cities.geometry) AND xmin <= MbrMaxX(cities.geometry) AND
ymax >= MbrMinY(cities.geometry) AND ymin <= MbrMaxY(cities.geometry))
or more elegantly :
SELECT city_name, region_name FROM cities, regions WHERE
ST_Area(ST_Intersection(cities.geometry, regions.geometry)) > 0 AND
regions.rowid IN (
SELECT rowid FROM SpatialIndex WHERE
f_table_name = 'regions' AND search_frame = cities.geometry)