General information
This Python package and its extensions are a set of tools for programming and manipulating the GDAL Geospatial Data Abstraction Library.
The GDAL project maintains SWIG-generated Python bindings for GDAL/OGR. Generally speaking, the classes and methods mostly match those of the GDAL and OGR C++ classes. There is no Python-specific reference documentation, but the tutorials include Python examples.
Dependencies
libgdal and header files (gdal-devel)
numpy (1.0.0 or greater) and header files (numpy-devel) (not explicitly required, but many examples and utilities will not work without it)
Installation
Conda
GDAL can be quite complex to build and install, particularly on Windows and macOS. Pre-built binaries are provided for the conda system (https://docs.conda.io/en/latest/) by the conda-forge project.
Once you have Anaconda or Miniconda installed, you should be able to install GDAL with:
conda install -c conda-forge gdal
Unix
The GDAL Python bindings require setuptools.
pip
GDAL can be installed from the Python Package Index:
pip install gdal
In order to enable numpy-based raster support, libgdal and its development headers must be installed as well as the Python packages numpy, setuptools, and wheel. To install the Python dependencies and build numpy-based raster support:
pip install "numpy>1.0.0" wheel "setuptools>=67"
pip install gdal[numpy]=="$(gdal-config --version).*"
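The version pin in the command above ties the Python package to the installed libgdal. As a rough sketch of what the shell substitution produces, the helper below builds the same requirement string in Python. The function names `gdal_pin` and `installed_libgdal_version` are hypothetical and not part of GDAL; only the `gdal-config --version` command comes from the text above.

```python
import shutil
import subprocess
from typing import Optional


def gdal_pin(version: str, extra: str = "numpy") -> str:
    """Build a pip requirement that matches a given libgdal version."""
    # "3.9.1" becomes "gdal[numpy]==3.9.1.*", mirroring the shell command.
    return f"gdal[{extra}]=={version}.*"


def installed_libgdal_version() -> Optional[str]:
    """Return the output of `gdal-config --version`, or None if it is not on PATH."""
    exe = shutil.which("gdal-config")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = installed_libgdal_version()
    if version is None:
        print("gdal-config not found; install libgdal and its headers first")
    else:
        print(gdal_pin(version))
```

If `gdal-config` is missing, the development headers are most likely not installed, and the pip build would be expected to fail as well.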
Users can verify that numpy-based raster support has been installed with:
python3 -c 'from osgeo import gdal_array'
If this command raises an ImportError, numpy-based raster support has not been properly installed:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.12/dist-packages/osgeo/gdal_array.py", line 10, in <module>
    from . import _gdal_array
ImportError: cannot import name '_gdal_array' from 'osgeo' (/usr/local/lib/python3.12/dist-packages/osgeo/__init__.py)
This is most often due to pip reusing a cached GDAL installation. Verify that the necessary dependencies have been installed and then run the following to force a clean build:
pip install --no-cache --force-reinstall gdal[numpy]=="$(gdal-config --version).*"
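The same check can be made from inside a script, for example to fall back gracefully when numpy-based raster support is missing. This is a minimal sketch; the function name `has_numpy_raster_support` is hypothetical, and the only assumption about GDAL is that importing `osgeo.gdal_array` raises an ImportError when support was not built, as in the traceback above.

```python
import importlib


def has_numpy_raster_support() -> bool:
    """Return True if osgeo.gdal_array can be imported, False otherwise."""
    try:
        importlib.import_module("osgeo.gdal_array")
    except ImportError:
        # Raised both when the bindings are absent and when they were
        # built without numpy support (missing _gdal_array extension).
        return False
    return True


if __name__ == "__main__":
    if has_numpy_raster_support():
        print("numpy-based raster support is available")
    else:
        print("numpy-based raster support is NOT available")
```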
Potential issues with GDAL >= 3.9, Python >= 3.9 and NumPy 2.0
The pyproject.toml file of GDAL 3.9 requires numpy >= 2.0.0rc1 (for Python >= 3.9) at build time to be able to build bindings that are compatible with both NumPy 1 and NumPy 2.
If for some reason the numpy >= 2.0.0rc1 build dependency cannot be installed, it is possible to manually install the build requirements and invoke pip install with the --no-build-isolation flag.
pip install "numpy==<required_version>" wheel "setuptools>=67"
pip install gdal[numpy]=="$(gdal-config --version).*" --no-build-isolation
Building as part of the GDAL library source tree
Python bindings are generated by default when building GDAL from source. For more detail, see Python bindings options.
The GDAL Python package is built using SWIG. The currently supported version is SWIG >= 4.
Usage
Imports
There are five major modules included with the GDAL Python bindings:
>>> from osgeo import gdal
>>> from osgeo import ogr
>>> from osgeo import osr
>>> from osgeo import gdal_array
>>> from osgeo import gdalconst
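As a rough orientation, gdal covers raster access, ogr vector geometries and features, osr spatial reference systems, and gdalconst shared constants (gdal_array provides the numpy integration). The sketch below touches each of the first four once. The function name `binding_summary` is hypothetical; the imports are placed inside the function only so that the snippet can be loaded on a machine without GDAL.

```python
def binding_summary() -> dict:
    """Collect one small fact from each of several osgeo modules."""
    # Imported here so that merely loading this snippet does not require GDAL.
    from osgeo import gdal, gdalconst, ogr, osr

    gdal.UseExceptions()  # raise Python exceptions instead of returning None

    # osr: build a spatial reference from a well-known name.
    srs = osr.SpatialReference()
    srs.SetWellKnownGeogCS("WGS84")

    # ogr: parse a geometry from WKT and write it back out.
    point = ogr.CreateGeometryFromWkt("POINT (1 2)")

    return {
        "gdal_version": gdal.VersionInfo("RELEASE_NAME"),
        "driver_count": gdal.GetDriverCount(),
        "point_wkt": point.ExportToWkt(),
        "srs_is_geographic": bool(srs.IsGeographic()),
        "read_only_flag": gdalconst.GA_ReadOnly,
    }


if __name__ == "__main__":
    for key, value in binding_summary().items():
        print(f"{key}: {value}")
```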
API
API documentation is available at GDAL Python submodules.
Numpy
One advanced feature of the GDAL Python bindings not found in the other language bindings is integration with the Python numerical array facilities. The gdal.Dataset.ReadAsArray() method can be used to read raster data as numerical arrays, ready to use with the Python numerical array capabilities.
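A minimal sketch of this integration, assuming numpy-based raster support is installed: it writes a small array into an in-memory dataset (the "MEM" driver) and reads it back with ReadAsArray(). The function name `roundtrip_through_gdal` and the sample values are illustrative only; with a real file, `gdal.Open(path)` would take the place of the in-memory dataset.

```python
def roundtrip_through_gdal(values):
    """Write a 2D array to an in-memory raster and read it back as numpy."""
    # Imported here so that loading this snippet does not require GDAL.
    import numpy as np
    from osgeo import gdal

    gdal.UseExceptions()

    data = np.asarray(values, dtype=np.float32)
    rows, cols = data.shape

    # Create(name, xsize, ysize, bands, data type): note x (columns) comes first.
    dataset = gdal.GetDriverByName("MEM").Create("", cols, rows, 1, gdal.GDT_Float32)
    band = dataset.GetRasterBand(1)
    band.WriteArray(data)

    # For a single-band dataset this returns a 2D array of shape (rows, cols).
    result = dataset.ReadAsArray()

    # Release the band before the dataset that owns it.
    band = None
    dataset = None
    return result


if __name__ == "__main__":
    array = roundtrip_through_gdal([[1, 2, 3], [4, 5, 6]])
    print(array.shape, array.dtype)
    print(array)
```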
Tutorials
Chris Garrard has given courses at Utah State University on "Geoprocessing with Python using Open Source GIS" (http://www.gis.usu.edu/~chrisg/python). There are many slides, examples, test data... and homework ;-) that can be very helpful for beginners with GDAL/OGR in Python.
A cookbook full of recipes for using the Python GDAL/OGR bindings: http://pcjericks.github.io/py-gdalogr-cookbook/index.html
Gotchas
Although GDAL's and OGR's Python bindings provide a fairly "Pythonic" wrapper around the underlying C++ code, there are several ways in which the Python bindings differ from typical Python libraries. These differences can catch Python programmers by surprise and lead to unexpected results. These differences result from the complexity of developing a large, long-lived library while continuing to maintain backward compatibility. They are being addressed over time, but until they are all gone, please review this list of Python Gotchas in the GDAL and OGR Python Bindings.
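As one illustration of such a difference, and as an assumption about GDAL 3.x behaviour rather than a statement from this document: failed calls such as gdal.Open() on a missing file have traditionally returned None instead of raising, unless gdal.UseExceptions() has been called. The sketch below opts in to exceptions and converts the failure back into None explicitly. The function name `open_or_none` and the path are hypothetical.

```python
def open_or_none(path):
    """Open a dataset, returning None if GDAL cannot open it."""
    # Imported here so that loading this snippet does not require GDAL.
    from osgeo import gdal

    # Opt in to Python exceptions; without this call, GDAL 3.x bindings
    # are assumed to report failures by returning None.
    gdal.UseExceptions()
    try:
        return gdal.Open(path)
    except RuntimeError:
        # With exceptions enabled, GDAL errors surface as RuntimeError.
        return None


if __name__ == "__main__":
    dataset = open_or_none("/no/such/file.tif")
    print("opened" if dataset is not None else "could not open dataset")
```

Handling the failure in one place like this avoids a later AttributeError on None, which is a common symptom when exceptions are not enabled.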
Examples
An assortment of other samples is available in the Python GitHub samples directory, with some description in the Python Sample scripts.
Several GDAL utilities are implemented in Python and can be useful examples.
The majority of GDAL regression tests are written in Python. They are available at https://github.com/OSGeo/gdal/tree/master/autotest
Some examples of GDAL/numpy integration can be found in the following scripts:
gdal_calc.py
val_repl.py
gdal_merge.py
gdal2tiles.py
gdal2xyz.py
pct2rgb.py
gdallocationinfo.py
One example of GDAL/numpy integration is found in the val_repl.py script.
Note
Performance Notes
ReadAsArray makes an entire copy of a raster band or dataset unless the data are explicitly subsetted as part of the function call. For large data, this approach can be prohibitively memory-intensive.
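One way to subset the read is to process the raster window by window, passing an offset and window size to ReadAsArray. The sketch below is illustrative only: `iter_windows` and `band_sum` are hypothetical helper names, and the 512-pixel block size is an arbitrary choice, not a GDAL recommendation.

```python
def iter_windows(xsize, ysize, block=512):
    """Yield (xoff, yoff, win_xsize, win_ysize) tuples that tile a raster."""
    for yoff in range(0, ysize, block):
        rows = min(block, ysize - yoff)  # last row of tiles may be shorter
        for xoff in range(0, xsize, block):
            cols = min(block, xsize - xoff)  # last column may be narrower
            yield xoff, yoff, cols, rows


def band_sum(path, band_index=1, block=512):
    """Sum all pixel values of one band while holding one window in memory."""
    # Imported here so that loading this snippet does not require GDAL.
    from osgeo import gdal

    gdal.UseExceptions()
    dataset = gdal.Open(path)
    band = dataset.GetRasterBand(band_index)

    total = 0.0
    for xoff, yoff, cols, rows in iter_windows(
        dataset.RasterXSize, dataset.RasterYSize, block
    ):
        # Only this window is copied into a numpy array.
        tile = band.ReadAsArray(xoff, yoff, cols, rows)
        total += float(tile.sum())

    # Release the band before the dataset that owns it.
    band = None
    dataset = None
    return total


if __name__ == "__main__":
    # A 5x3 raster split into 2x2 windows, with smaller windows at the edges.
    for window in iter_windows(5, 3, block=2):
        print(window)
```

Peak memory then scales with the block size rather than with the full raster, at the cost of more read calls.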