This Python package and its extensions provide a number of tools for programming with and manipulating the GDAL Geospatial Data Abstraction Library. It is actually two libraries, GDAL for manipulating geospatial raster data and OGR for manipulating geospatial vector data, but this document refers to the entire package as the GDAL library.
The GDAL project (primarily Even Rouault) maintains SWIG-generated Python bindings for GDAL and OGR. Generally speaking, the classes and methods mostly match those of the GDAL and OGR C++ classes. There is no Python-specific reference documentation, but the tutorials include Python examples.
Chris Garrard has given courses at Utah State University on “Geoprocessing with Python using Open Source GIS” (http://www.gis.usu.edu/~chrisg/python). There are many slides, examples, test data… and homework ;-) that can be greatly helpful for beginners with GDAL/OGR in Python.
A cookbook full of recipes for using the Python GDAL/OGR bindings: http://pcjericks.github.io/py-gdalogr-cookbook/index.html
An assortment of other samples is available in the Python GitHub samples directory at https://github.com/OSGeo/gdal/tree/master/swig/python/gdal-utils/osgeo_utils/samples with some description in the Python Sample scripts.
Several GDAL utilities are implemented in Python and can be useful examples.
The majority of GDAL regression tests are written in Python. They are available at https://github.com/OSGeo/gdal/tree/master/autotest
Some examples of GDAL/numpy integration can be found in the following scripts:
- gdal_calc.py
- val_repl.py
- gdal_merge.py
- gdal2tiles.py
- gdal2xyz.py
- pct2rgb.py
- gdallocationinfo.py
Although GDAL’s and OGR’s Python bindings provide a fairly “Pythonic” wrapper around the underlying C++ code, there are several ways in which the Python bindings differ from typical Python libraries. These differences can catch Python programmers by surprise and lead to unexpected results. These differences result from the complexity of developing a large, long-lived library while continuing to maintain backward compatibility. They are being addressed over time, but until they are all gone, please review this list of Python Gotchas in the GDAL and OGR Python Bindings.
libgdal (3.2.0 or greater) and header files (gdal-devel)
numpy (1.0.0 or greater) and header files (numpy-devel) (not explicitly required, but many examples and utilities will not work without it)
The GDAL Python bindings support both distutils and setuptools, with a preference for using setuptools. If setuptools can be imported, setup will use that to build an egg by default. If setuptools cannot be imported, a simple distutils root install of the GDAL package (and no dependency chaining for numpy) will be made.
GDAL can be installed from the Python CheeseShop:
$ sudo easy_install GDAL
It may be necessary to have libgdal and its development headers installed if easy_install is expected to do a source build because no egg is available for your specified platform and Python version.
Most of setup.py’s important variables are controlled with the setup.cfg file. In setup.cfg, you can modify pointers to include files and libraries. The most important option that will likely need to be modified is the gdal_config parameter. If you installed GDAL from a package, the location of this program is likely /usr/bin/gdal-config, but it may be in another place depending on how your packager arranged things.
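If you are unsure where gdal-config lives, a short script can look it up before you edit setup.cfg. The following is a minimal sketch, assuming gdal-config is on your PATH; the helper names find_gdal_config and gdal_config_version are made up for illustration and are not part of the GDAL package.

```python
import shutil
import subprocess


def find_gdal_config(name="gdal-config"):
    """Return the full path of gdal-config if it is on PATH, else None."""
    return shutil.which(name)


def gdal_config_version(path):
    """Ask gdal-config which libgdal version it belongs to."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_gdal_config()
    if path is None:
        print("gdal-config not found on PATH; set gdal_config in setup.cfg by hand")
    else:
        # The printed line has the same shape as the setup.cfg option.
        print(f"gdal_config = {path}  (libgdal {gdal_config_version(path)})")
```

The reported version can be compared against the libgdal requirement listed above before attempting a build.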
After modifying the location of gdal-config, you can build and install with the setup script:
$ python setup.py build
$ python setup.py install
If you have setuptools installed, you can also generate an egg:
$ python setup.py bdist_egg
Building as part of the GDAL library source tree
You can also have the GDAL Python bindings built as part of a source build by specifying --with-python as part of your configure line:
$ ./configure --with-python
Use the typical make and make install commands to complete the installation:
$ make
$ make install
./configure attempts to detect if you have setuptools installed in the tree of the Python binary it was given (or detected on the execution path), and it will use an egg build by default in that instance. If you have a need to use a distutils-only install, you will have to edit setup.py to ensure that the HAVE_SETUPTOOLS variable is ultimately set to False and proceed with a typical ‘python setup.py install’ command.
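The setuptools-or-distutils decision described above boils down to a guarded import. The snippet below is only a sketch of that pattern, not a copy of the real setup.py; the build_kind variable is a hypothetical name used for the printout.

```python
# Sketch only: mirrors the decision described above, not the actual setup.py.
try:
    import setuptools  # noqa: F401  (imported only to see whether it exists)
    HAVE_SETUPTOOLS = True
except ImportError:
    HAVE_SETUPTOOLS = False

# Describe which kind of install the flag would lead to.
build_kind = "egg (setuptools)" if HAVE_SETUPTOOLS else "root install (distutils)"
print(f"HAVE_SETUPTOOLS={HAVE_SETUPTOOLS}: expecting {build_kind}")
```

Forcing a distutils-only install then amounts to making sure the flag ends up False regardless of what the import found.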
You will need the following items to complete an install of the GDAL Python bindings on Windows:
GDAL Windows Binaries The basic install requires the gdalwin32exe160.zip distribution file. Other files you see in the directory are for various optional plugins and development headers/include files. After downloading the zip file, extract it to the directory of your choosing.
As explained in the README_EXE.txt file, after unzipping the GDAL binaries you will need to modify your system path and variables. If you’re not sure how to do this, read the Microsoft KnowledgeBase doc
Add the bin folder of the installation directory to your system PATH. Remember to put a semicolon in front of it when you append it to the existing path.
Create a new user or system variable with the data folder from your installation.
Name : GDAL_DATA
Path : C:\gdalwin32-1.7\data
Skip down to the Usage section to test your install. Note, a reboot may be required.
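Before testing the bindings themselves, it can help to confirm that the two environment changes above took effect. This is a minimal sketch using only the standard library; check_gdal_env is a hypothetical helper, and the install directory you pass in is whatever folder you extracted the binaries to.

```python
import os


def check_gdal_env(install_dir, environ=None):
    """Return a list of problems found with PATH and GDAL_DATA (empty if none)."""
    environ = os.environ if environ is None else environ
    problems = []

    # Normalise paths so that case and trailing separators do not matter.
    bin_dir = os.path.normcase(os.path.normpath(os.path.join(install_dir, "bin")))
    path_entries = [
        os.path.normcase(os.path.normpath(p))
        for p in environ.get("PATH", "").split(os.pathsep)
        if p
    ]
    if bin_dir not in path_entries:
        problems.append(f"{bin_dir} is not on PATH")

    if not environ.get("GDAL_DATA"):
        problems.append("GDAL_DATA is not set")

    return problems
```

For the example layout above you would call check_gdal_env(r"C:\gdalwin32-1.7") from a new command prompt; an empty list means both settings are visible to Python.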
The GDAL Python package is built using SWIG. The earliest version of SWIG that is supported to generate the wrapper code is 1.3.40. It is possible that usable bindings will build with a version earlier than 1.3.40, but no development efforts are targeted at versions below it. You should not have to run SWIG in your development tree to generate the binding code, as it is usually included with the source. However, if you do need to regenerate, you can do so with the following make command from within the ./swig/python directory:
$ make generate
To ensure that all of the bindings are regenerated, you can clean the bindings code out before the generate command by issuing:
$ make veryclean
There are five major modules that are included with the GDAL Python bindings:
>>> from osgeo import gdal
>>> from osgeo import ogr
>>> from osgeo import osr
>>> from osgeo import gdal_array
>>> from osgeo import gdalconst
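A quick way to test an install is to try importing each of the five modules and report which ones load. The sketch below uses only importlib; available_modules and MODULES are hypothetical names chosen for this example, and a False entry for gdal_array typically points at a missing numpy.

```python
import importlib

MODULES = ("gdal", "ogr", "osr", "gdal_array", "gdalconst")


def available_modules(package="osgeo", names=MODULES):
    """Map each module name to True if `package.name` imports cleanly."""
    status = {}
    for name in names:
        try:
            importlib.import_module(f"{package}.{name}")
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in available_modules().items():
        print(f"osgeo.{name}: {'ok' if ok else 'NOT importable'}")
```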
Additionally, five compatibility modules are included, but they issue notices stating that they are deprecated and will be going away. If you are using GDAL 1.7 bindings, you should update your imports to the form shown above, but the following will work until GDAL 3.1.
>>> import gdal
>>> import ogr
>>> import osr
>>> import gdalnumeric
>>> import gdalconst
If you have previous code that imported the global module and still need to support the old import, a simple try…except import can silence the deprecation warning and keep things named essentially the same as before:
>>> try:
...     from osgeo import gdal
... except ImportError:
...     import gdal
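If several modules need the same fallback, the try/except pattern can be wrapped in a small helper. This is a sketch under the assumption that a plain ImportError is the only failure you want to tolerate; import_first is a hypothetical name, not something the bindings provide.

```python
import importlib


def import_first(*names):
    """Import and return the first module in `names` that is available."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate name
    raise ImportError(f"none of these modules could be imported: {names}")
```

With this in place, gdal = import_first("osgeo.gdal", "gdal") behaves like the try/except above, and the same call shape works for ogr and osr.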
Currently, only the OGR module has docstrings which are generated from the C/C++ API doxygen materials. Some of the arguments and types might not match up exactly with what you are seeing from Python, but they should be enough to get you going. Docstrings for GDAL and OSR are planned for a future release.
One advanced feature of the GDAL Python bindings not found in the other language bindings is integration with the Python numerical array facilities. The gdal.Dataset.ReadAsArray() method can be used to read raster data as numerical arrays, ready to use with the Python numerical array capabilities.
These facilities have evolved somewhat over time. In the past the package was known as “Numeric” and imported using “import Numeric”. A new generation is imported using “import numpy”. Currently the old generation bindings only support the older Numeric package, and the new generation bindings only support the new generation numpy package. They are mostly compatible, and by importing gdalnumeric (or osgeo.gdal_array) you will get whichever is appropriate to the current bindings type.
One example of GDAL/numpy integration is found in the val_repl.py script.
ReadAsArray makes a full copy of a raster band or dataset unless the data are explicitly subsetted as part of the function call. For large data, reading everything at once can be prohibitively memory intensive.
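One way to keep memory use bounded is to pass an explicit window (xoff, yoff, win_xsize, win_ysize) to ReadAsArray and walk the raster tile by tile. The following is a minimal sketch, assuming numpy is installed; iter_windows and band_sum are hypothetical helpers, and the 256-pixel tile size is an arbitrary choice rather than a recommendation.

```python
def iter_windows(xsize, ysize, block_x=256, block_y=256):
    """Yield (xoff, yoff, win_xsize, win_ysize) tiles covering a raster."""
    for yoff in range(0, ysize, block_y):
        win_ysize = min(block_y, ysize - yoff)  # last row of tiles may be short
        for xoff in range(0, xsize, block_x):
            win_xsize = min(block_x, xsize - xoff)  # last column may be narrow
            yield xoff, yoff, win_xsize, win_ysize


def band_sum(path, band_index=1):
    """Sum a band's pixel values one window at a time."""
    from osgeo import gdal  # imported lazily so iter_windows works without GDAL

    ds = gdal.Open(path)
    if ds is None:
        raise RuntimeError(f"could not open {path}")
    band = ds.GetRasterBand(band_index)
    total = 0.0
    for xoff, yoff, w, h in iter_windows(ds.RasterXSize, ds.RasterYSize):
        # Only this window is copied into memory, not the whole band.
        total += float(band.ReadAsArray(xoff, yoff, w, h).sum())
    return total
```

Only one tile is held in memory at a time, so peak usage depends on the tile size rather than on the size of the raster.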