Vector Data Model
This page documents the classes used to handle vector data. Many data types and method names are based on the OGC Simple Features data model, so it may be helpful to review the specifications published by OGC. For historical reasons, GDAL uses the "OGR" prefix to denote types and functions that apply only to vector data.
Class Overview
The following classes form the core of the vector data model:
Geometry (ogr_geometry.h): The geometry classes (OGRGeometry, etc.) encapsulate the OGC vector data types. They provide some geometry operations and translation to/from well known binary and text format. A geometry includes a spatial reference system (projection).
Spatial Reference (ogr_spatialref.h): An OGRSpatialReference encapsulates the definition of a projection and datum.
Feature (ogr_feature.h): The OGRFeature encapsulates the definition of a whole feature, that is, a set of geometries and attributes relating to a single entity.
Feature Class Definition (ogr_feature.h): The OGRFeatureDefn class captures the schema (set of field definitions) for a group of related features (normally a whole layer).
Layer (ogrsf_frmts.h): OGRLayer is an abstract class representing a layer of features in a GDALDataset.
Dataset (gdal_priv.h): A GDALDataset is an abstract base class representing a file or database containing one or more OGRLayer objects.
Drivers (gdal_priv.h): A GDALDriver represents a translator for a specific format, capable of opening and possibly writing GDALDataset objects. All available drivers are managed by the GDALDriverManager.
Geometry
Individual geometry classes are used to represent the different types of vector geometry. All the geometry classes derive from OGRGeometry
which defines the common functionality of all geometries. Geometry types include OGRPoint
, OGRLineString
, OGRPolygon
, OGRGeometryCollection
, OGRMultiPoint
, OGRMultiLineString
, OGRMultiPolygon
, and OGRPolyhedralSurface
.
The special case of a triangular polygon can be represented as an OGRTriangle
, a non-overlapping collection of which can be represented by an OGRTriangulatedSurface
.
An additional set of types is used to store non-linear geometries: OGRCircularString
, OGRCompoundCurve
, OGRCurvePolygon
, OGRMultiCurve
and OGRMultiSurface
.
Any of the above geometry classes can store coordinates in two (XY), three (XYZ or XYM), or four (XYZM) dimensions.
Additional intermediate classes contain functionality that is used by multiple geometry types. These include OGRCurve
(base class for OGRLineString
) and OGRSurface
(base class for OGRPolygon
). Some intermediate interfaces modeled in the simple features abstract model and SFCOM are not modeled in OGR at this time. In most cases the methods are aggregated into other classes.
The OGRGeometryFactory
is used to convert well known text (WKT) and well known binary (WKB) format data into the appropriate OGRGeometry
subclass. These are predefined ASCII and binary formats for representing all the types of simple features geometries.
The OGRGeometry
includes a reference to an OGRSpatialReference
object, defining the spatial reference system of that geometry. This is normally a reference to a shared spatial reference object with reference counting for each of the OGRGeometry
objects using it.
While it is theoretically possible to derive other or more specific geometry classes from the existing OGRGeometry
classes, this isn't an aspect that has been well thought out. In particular, it would not be possible to create specialized classes using the OGRGeometryFactory
without modifying it.
Compatibility issues with non-linear geometries
Generic mechanisms have been introduced so that creating or modifying a feature with a non-linear geometry in a layer of a driver that does not support it will transform that geometry into the closest matching linear geometry. This linearization can be controlled using Vector related options.
Conversely, when retrieving data through the OGR C API, the OGRSetNonLinearGeometriesEnabledFlag()
function can be used so that the geometries and layer geometry types returned are also converted to their linear approximation if necessary.
Spatial Reference
The OGRSpatialReference
class is intended to store an OpenGIS Spatial Reference System definition. Local, geographic and projected coordinate systems are supported. Vertical coordinate systems, geocentric coordinate systems, and compound (horizontal + vertical) coordinate systems are also supported in recent GDAL versions.
The spatial coordinate system data model is inherited from the OpenGIS Well Known Text format. A simple form of this is defined in the Simple Features specifications. A more sophisticated form is found in the Coordinate Transformation specification. The OGRSpatialReference
is built on the features of the Coordinate Transformation specification but is intended to be compatible with the earlier simple features form.
There is also an associated OGRCoordinateTransformation
class that encapsulates use of PROJ for converting between different coordinate systems.
Feature / Feature Definition
The OGRGeometry
captures the geometry of a vector feature. The OGRFeature
contains geometry, and adds feature attributes, feature id, and a feature class identifier. It may also contain styling information. Several geometries can be associated with an OGRFeature
.
The set of attributes (OGRFieldDefn
), their types, names and so forth is represented via the OGRFeatureDefn
class. One OGRFeatureDefn
normally exists for a layer of features. The same definition is shared in a reference counted manner by the features of that type (or feature class).
The feature id (FID) of a feature is intended to be a unique identifier for the feature within the layer it is a member of. Freestanding features, or features not yet written to a layer may have a null (OGRNullFID) feature id. The feature ids are modeled in OGR as a 64-bit integer; however, this is not sufficiently expressive to model the natural feature ids in some formats. For instance, the GML feature id is a string.
The OGRFeatureDefn
also contains an indicator of the types of geometry allowed for that feature class (returned as an OGRwkbGeometryType
from OGRFeatureDefn::GetGeomType()
). If this is OGRwkbGeometryType::wkbUnknown
then any type of geometry is allowed. This implies that features in a given layer can potentially be of different geometry types though they will always share a common attribute schema.
Several geometry fields (OGRGeomFieldDefn
) can be associated with an OGRFeatureDefn
. Each geometry field has its own indicator of geometry type allowed, returned by OGRGeomFieldDefn::GetType()
, and its spatial reference system, returned by OGRGeomFieldDefn::GetSpatialRef()
.
The OGRFeatureDefn
also contains a feature class name (normally used as a layer name).
Field Definitions
The behavior of each field in a feature class is defined by a shared OGRFieldDefn
.
The OGRFieldDefn
specifies the field type from the values of OGRFieldType
.
Values stored in this field may be further restricted according to an OGRFieldSubType
.
For example, a field may have a type of OGRFieldType::OFTInteger
with a subtype of OGRFieldSubType::OFSTBoolean
.
The OGRFieldDefn
can also track whether a field is allowed to be null (OGRFieldDefn::IsNullable()
), whether its value must be unique (OGRFieldDefn::IsUnique()
), and formatting information such as the number of decimal digits, width, and justification. It may also define a default value in case one is not manually specified.
Field Domains
Some formats support the use of field domains that describe the values that can be stored in a given attribute field. An OGRFieldDefn
may reference a single OGRFieldDomain
that is associated with a GDALDataset
.
Programs using GDAL may use the OGRFieldDomain
to appropriately constrain user input. GDAL does not perform validation itself and will allow the storage of values that violate a field's associated OGRFieldDomain
.
Available types of OGRFieldDomain
include:
OGRCodedFieldDomain, which constrains values to those present in a specified enumeration
OGRRangeFieldDomain, which constrains values to a specified range
OGRGlobFieldDomain, which constrains values to those matching a specified pattern
Additionally, an OGRFieldDomain
may define policies describing the values that should be assigned to domain-controlled fields when features are split or merged.
Layer
An OGRLayer
represents a layer of features within a data source. All features in an OGRLayer
share a common schema and are of the same OGRFeatureDefn
. An OGRLayer
class also contains methods for reading features from the data source. The OGRLayer
can be thought of as a gateway for reading and writing features from an underlying data source such as a file on disk, or the result of a database query.
The OGRLayer
includes methods for sequential and random reading and writing. Read access (via the OGRLayer::GetNextFeature()
method) normally reads all features, one at a time sequentially; however, it can be limited to return features intersecting a particular geographic region by installing a spatial filter on the OGRLayer
(via the OGRLayer::SetSpatialFilter()
method). A filter on attributes can only be set with the OGRLayer::SetAttributeFilter()
method. By default, all available attributes and geometries are read but this can be controlled by flagging fields as ignored (OGRLayer::SetIgnoredFields()
).
Starting with GDAL 3.6, as an alternative to getting features through GetNextFeature
, it is possible to retrieve them in batches, with a column-oriented memory layout, using the OGRLayer::GetArrowStream()
method (see Reading From OGR using the Arrow C Stream data interface).
An OGRLayer
may also store an OGRStyleTable
that provides a set of styles that may be used by features in the layer. More information on GDAL's handling of feature styles can be found in the Feature Style Specification.
One flaw in the current OGR architecture is that the spatial and attribute filters are set directly on the OGRLayer
which is intended to be the only representative of a given layer in a data source. This means it isn't possible to have multiple read operations active at one time with different spatial filters on each.
Another question that might arise is why the OGRLayer
and OGRFeatureDefn
classes are distinct. An OGRLayer
always has a one-to-one relationship to an OGRFeatureDefn
, so why not amalgamate the classes? There are two reasons:
As defined now, OGRFeature and OGRFeatureDefn don't depend on OGRLayer, so they can exist independently in memory without regard to a particular layer in a data store.
The SF CORBA model does not have a concept of a layer with a single fixed schema the way that the SFCOM and SFSQL models do. The fact that features belong to a feature collection that is potentially not directly related to their current feature grouping may be important to implementing SFCORBA support using OGR.
The OGRLayer
class is an abstract base class. An implementation is expected to be subclassed for each file format driver implemented. OGRLayers are normally owned directly by their GDALDataset
, and aren't instantiated or destroyed directly.
Dataset
A GDALDataset
represents a set of OGRLayer
objects. This usually represents a single file, set of files, database or gateway. A GDALDataset
has a list of OGRLayer
objects, which it owns but to which it can return references.
GDALDataset
is an abstract base class. An implementation is expected to be subclassed for each file format driver implemented. GDALDataset
objects are not normally instantiated directly but rather with the assistance of a GDALDriver
. Deleting a GDALDataset
closes access to the underlying persistent data source, but does not normally result in deletion of that file.
A GDALDataset
has a name (usually a filename or database connection string) that can be used to reopen the data source with a GDALDriver
.
The GDALDataset
also has support for executing a datasource specific command, normally a form of SQL. This is accomplished via the GDALDataset::ExecuteSQL()
method. While some datasources (such as PostGIS and Oracle) pass the SQL through to an underlying database, OGR also includes support for evaluating a subset of the SQL SELECT statement against any datasource (see OGR SQL dialect and SQLITE SQL dialect.)
When using some drivers, the GDALDataset
also offers a mechanism to start, commit, and roll back transactions when interacting with the underlying data store.
A GDALDataset
may also be aware of relationships between layers (e.g., a foreign key relationship between database tables). Information about these relationships is stored in a GDALRelationship
.
Note
Earlier versions of GDAL represented vector datasets using the OGRDataSource
class. This class has been maintained for backwards compatibility but is functionally equivalent to a GDALDataset
for vector data.
Drivers
A GDALDriver
object is instantiated for each file format supported. The GDALDriver
objects are registered with the GDALDriverManager
, a singleton class that is normally used to open new datasets.
It is intended that, for each file format to be supported, a new GDALDriver
object is instantiated with function pointers defined for operations such as Identify() and Open() (along with file format specific GDALDataset
and OGRLayer
classes).
On application startup registration functions are normally called for each desired file format. These functions instantiate the appropriate GDALDriver
objects, and register them with the GDALDriverManager
. When a dataset is to be opened, the driver manager will normally try each GDALDriver
in turn, until one succeeds, returning a GDALDataset
object.