Arc/Info Binary Grid Format

by Frank Warmerdam (warmerdam@pobox.com)

The Arc/Info Binary Grid format is the internal working format of the Arc/Info Grid product. It is also usable and creatable within the spatial analyst component of ArcView. It is a tiled (blocked) format with run length compression, capable of holding raster data of up to 4 byte integers or 4 byte floating point data.

This format should not be confused with the Arc/Info ASCII Grid format, which is the interchange format for grids. Files can be converted between binary and ASCII format with the GRIDASCII and ASCIIGRID commands in Arc/Info. This format is also different from the flat binary raster output of the GRIDFLOAT command. The Arc/Info binary float and ASCII formats are also accessible from within ArcView.

This format should also not be confused with what I know as ESRI BIL format. That is really a standard ESRI way of creating a header file (.HDR) describing the data layout of a binary raster file.

Version

I am not sure yet how the versions work for grid files. I have been working primarily with grid files generated by ArcView 3.x, and its associated gridio API. The hdr.adf files I have examined start with the string GRID1.2 for what that's worth. Certainly the file naming conventions seem to follow the Arc/Info 7.x conventions rather than that of earlier versions.

File Set

A grid coverage actually consists of a number of files. A grid normally lives in its own directory named after the grid. For instance, the grid nwgrd1 lives in the directory nwgrd1, and has the following component files:

-rwxr--r--   1 warmerda users          32 Jan 22 16:07 nwgrd1/dblbnd.adf
-rwxr--r--   1 warmerda users         308 Jan 22 16:07 nwgrd1/hdr.adf
-rwxr--r--   1 warmerda users          32 Jan 22 16:07 nwgrd1/sta.adf
-rwxr--r--   1 warmerda users        2048 Jan 22 16:07 nwgrd1/vat.adf
-rwxr--r--   1 warmerda users      187228 Jan 22 16:07 nwgrd1/w001001.adf
-rwxr--r--   1 warmerda users        6132 Jan 22 16:07 nwgrd1/w001001x.adf

Sometimes datasets will also include a prj.adf file containing the projection definition in the usual ESRI format. Grids also normally have associated tables in the info directory. This is beyond the scope of my discussion for now.

The files have the following roles:

dblbnd.adf: Contains the bounds (LLX, LLY, URX, URY) of the utilized portion of the grid.
hdr.adf: This is the header, and contains information on the tile sizes, and number of tiles in the dataset. It also contains assorted other information I have yet to identify.
sta.adf: This contains raster statistics. In particular, the raster min, max, mean and standard deviation.
vat.adf: This relates to the value attribute table. This is the table associating integer raster values with a set of attributes. I presume it is really just a pointer into info in a manner similar to the pat.adf file in a vector coverage, but I haven't investigated yet.
w001001.adf: This is the file containing the actual raster data.
w001001x.adf: This is an index file containing pointers to each of the tiles in the w001001.adf raster file.

dblbnd.adf - Georef Bounds

Fields:

Start Byte  # of Bytes  Format      Name   Description
0           8           MSB double  D_LLX  Lower left X (easting) of the grid. Generally -0.5 for an ungeoreferenced grid.
8           8           MSB double  D_LLY  Lower left Y (northing) of the grid. Generally -0.5 for an ungeoreferenced grid.
16          8           MSB double  D_URX  Upper right X (easting) of the grid. Generally #Pixels-0.5 for an ungeoreferenced grid.
24          8           MSB double  D_URY  Upper right Y (northing) of the grid. Generally #Lines-0.5 for an ungeoreferenced grid.

This file is always 32 bytes long. The bounds apply to the portion of the grid that is in use, not the whole thing.
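
As a rough illustration, the four bounds could be decoded in Python with the standard struct module. The function name parse_dblbnd and the sample bytes are made up for this sketch; only the layout (four MSB doubles, 32 bytes in total) comes from the field list above.

```python
import struct

def parse_dblbnd(data: bytes):
    """Decode the 32 byte dblbnd.adf payload into (llx, lly, urx, ury)."""
    if len(data) != 32:
        raise ValueError("dblbnd.adf should be exactly 32 bytes long")
    # Four consecutive big-endian (MSB) IEEE doubles.
    return struct.unpack(">4d", data)

# Synthetic example: the bounds of an ungeoreferenced 3x1 grid.
sample_bounds = struct.pack(">4d", -0.5, -0.5, 2.5, 0.5)
llx, lly, urx, ury = parse_dblbnd(sample_bounds)
```

For a real grid the bytes would come from reading the dblbnd.adf file in binary mode.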

w001001x.adf - Tile Index

This is a binary dump of the first 320 bytes of a w001001x.adf file.

0000270A FFFFFC14 00000000 00000000 ~~'~~~~~~~~~~~~~
00000000 00000000 00000BFA 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000032 00000202 00000235 ~~~~~~~2~~~~~~~5
000001D4 0000040A 00000000 0000040B ~~~~~~~~~~~~~~~~
00000000 0000040C 00000000 0000040D ~~~~~~~~~~~~~~~~
00000000 0000040E 00000000 0000040F ~~~~~~~~~~~~~~~~
00000000 00000410 00000202 00000613 ~~~~~~~~~~~~~~~~
000001D4 000007E8 00000000 000007E9 ~~~~~~~~~~~~~~~~
00000000 000007EA 00000000 000007EB ~~~~~~~~~~~~~~~~
00000000 000007EC 00000000 000007ED ~~~~~~~~~~~~~~~~
00000000 000007EE 00000202 000009F1 ~~~~~~~~~~~~~~~~
000001D4 00000BC6 00000000 00000BC7 ~~~~~~~~~~~~~~~~
00000000 00000BC8 00000000 00000BC9 ~~~~~~~~~~~~~~~~
00000000 00000BCA 00000000 00000BCB ~~~~~~~~~~~~~~~~
00000000 00000BCC 00000202 00000DCF ~~~~~~~~~~~~~~~~
000001D4 00000FA4 00000000 00000FA5 ~~~~~~~~~~~~~~~~

Fields:

Start Byte  # of Bytes  Format     Description
0           8           -          Magic Number (always hex 00 00 27 0A FF FF ** **, usually ending in FC 14, FB F8 or FC 08).
8           16          -          zero fill
24          4           MSB Int32  Size of whole file in shorts (multiply by two to get file size in bytes).
28          72          -          zero fill
100 + t*8   4           MSB Int32  Offset to tile t in w001001.adf measured in two byte shorts.
104 + t*8   4           MSB Int32  Size of tile t in two byte shorts.
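
The index layout can be sketched in Python as follows. This is only an illustration under the field layout listed above: the names parse_tile_index, INDEX_HEADER_SIZE and MAGIC_PREFIX are hypothetical, and the sample bytes are the header and first three tile entries copied from the hex dump of w001001x.adf shown earlier.

```python
import struct

INDEX_HEADER_SIZE = 100                       # bytes before the first tile entry
MAGIC_PREFIX = bytes.fromhex("0000270AFFFF")  # the last two magic bytes vary

def parse_tile_index(data: bytes):
    """Return (file_size_bytes, [(offset_bytes, size_bytes), ...])."""
    if data[:6] != MAGIC_PREFIX:
        raise ValueError("unexpected magic number")
    # The file size is stored in two byte shorts at byte 24.
    (size_in_shorts,) = struct.unpack_from(">i", data, 24)
    tiles = []
    n_entries = (len(data) - INDEX_HEADER_SIZE) // 8
    for t in range(n_entries):
        offset, size = struct.unpack_from(">ii", data, INDEX_HEADER_SIZE + t * 8)
        tiles.append((offset * 2, size * 2))  # convert shorts to bytes
    return size_in_shorts * 2, tiles

# Header plus the first three entries from the dump (the rest is omitted).
sample_index = (
    bytes.fromhex("0000270AFFFFFC14") + bytes(16)
    + bytes.fromhex("00000BFA") + bytes(72)
    + bytes.fromhex("00000032 00000202 00000235 000001D4 0000040A 00000000")
)
file_size, tiles = parse_tile_index(sample_index)
```

With these bytes the decoded file size is 6132 bytes, which agrees with the size of w001001x.adf in the directory listing, and the first tile starts at byte 100 of w001001.adf.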

sta.adf - Raster Statistics

Fields:

Start Byte  # of Bytes  Format      Name     Description
0           8           MSB double  SMin     Minimum value of a raster cell in this grid.
8           8           MSB double  SMax     Maximum value of a raster cell in this grid.
16          8           MSB double  SMean    Mean value of the raster cells in this grid.
24          8           MSB double  SStdDev  Standard deviation of the raster cells in this grid.

This file is always 32 bytes long.

w001001.adf - Raster Data

This is a binary dump of the first 320 bytes of a w001001.adf file.

0000270A FFFFFC14 00000000 00000000 ~~'~~~~~~~~~~~~~
00000000 00000000 00016DAE 00000000 ~~~~~~~~~~m~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 00000000 00000000 00000000 ~~~~~~~~~~~~~~~~
00000000 02020800 00373D42 5C5A4D31 ~~~~~~~~~7=B\ZM1
200A0108 0E1D4F89 9C9A9392 8C7E6653  ~~~~~O~~~~~~~fS
5151596D 83919290 868A8B87 807A7A7B QQYm~~~~~~~~~zz{
7C7A766F 64481D00 0406305F 6B6C6A5B |zvodH~~~~0_klj[
5D53513C 2D2D2732 24293F54 40354C55 ]SQ<--'2$)?T@5LU
67686258 514E4943 5859534A 41394D70 ghbXQNICXYSJA9Mp
75665659 66625A63 737A848E 9090979F ufVYfbZcsz~~~~~~
9F908C8F 8F96998E 8778685B 53536274 ~~~~~~~~~xh[SSbt
747B838A 8A8C8F92 8D979B94 8C8D9294 t{~~~~~~~~~~~~~~
8D8D8D8D 8C8B8989 8B8E908F 8E8E9092 ~~~~~~~~~~~~~~~~
90929394 989C9891 92939698 9B9B9C9C ~~~~~~~~~~~~~~~~
8E8E8F8F 8E8E8F90 898E918F 8B8A8E93 ~~~~~~~~~~~~~~~~
8B8D9093 94918C86 838DA1BC B7CEC9B0 ~~~~~~~~~~~~~~~~
D4B0BB96 A0929E99 9797999B 9D9C9C9B ~~~~~~~~~~~~~~~~

Fields:

Start Byte         # of Bytes                  Format              Name       Description
0                  8                           -                   RMagic     Magic Number (always hex 00 00 27 0A FF FF ** **, usually ending in FC 14, FB F8 or FC 08).
8                  16                          -                   -          zero fill
24                 4                           MSB Int32           RFileSize  Size of whole file in shorts (multiply by two to get file size in bytes).
28                 72                          -                   -          zero fill
100, ...           2                           MSB Int16           RTileSize  Size of this tile's data measured in shorts. This matches the size in the index file, and does not include the tile size field itself. The next tile starts 2*n+2 bytes after the start of this tile, where n is the value of this field.
102, ...           1                           byte                RTileType  Tile type code indicating the organization of the following data (integer coverages only).
103, ...           1                           byte                RMinSize   Number of bytes following to form the minimum value for the tile (integer coverages only).
104, ...           (RMinSize bytes)            MSB Int (var size)  RMin       The minimum pixel value for this tile. This number is added to the pixel values for each pixel in this tile (integer coverages only). I must stress that if RMinSize is less than 4 this is still a signed quantity. For instance, if RMinSize is 2 and byte0 is > 127, the value is byte0*256 + byte1 - 65536.
104+RMinSize, ...  RTileSize*2 - 2 - RMinSize  variable            RTileData  The data for this tile. Format varies according to RTileType for integer coverages.

The fields RTileSize, RTileType, RMinSize, RMin, and RTileData occur in the file for each tile of data present. They are usually packed one after the other, but this isn't necessarily guaranteed. The index file (w001001x.adf) should be used to establish the tile locations. Note that tiles that appear in the index file with a size of zero will appear as just two bytes (zeros) for the RTileSize for that tile.
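
A minimal Python sketch of splitting one integer tile into its fields is given below. The function name parse_int_tile and the sample tile are invented for illustration, and RTileSize is read as an unsigned value, which is an assumption since the field list only says Int16. The signed handling of short RMin values follows the note in the RMin description.

```python
import struct

def parse_int_tile(tile: bytes):
    """Split one integer-coverage tile into (tile_type, rmin, tile_data).

    `tile` must start at the tile's RTileSize field. Returns None for an
    empty tile (RTileSize of zero).
    """
    # Read RTileSize as unsigned (an assumption; the table only says Int16).
    (size_in_shorts,) = struct.unpack_from(">H", tile, 0)
    if size_in_shorts == 0:
        return None
    tile_type = tile[2]
    min_size = tile[3]
    # RMin is a signed big-endian value even when shorter than 4 bytes.
    rmin = int.from_bytes(tile[4:4 + min_size], "big", signed=True)
    end = 2 + size_in_shorts * 2  # the size field does not count itself
    return tile_type, rmin, tile[4 + min_size:end]

# Synthetic tile: size 3 shorts, type 0x08, 2 byte RMin of -2, two data bytes.
sample_tile = bytes([0x00, 0x03, 0x08, 0x02, 0xFF, 0xFE, 0x01, 0x02])
parsed = parse_int_tile(sample_tile)
```

In practice the starting offset of each tile would be taken from the index file rather than by walking the raster file sequentially.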

Raster Size

The size of a grid isn't as easy to deduce as one might expect. The hdr.adf file contains the HTilesPerRow, HTilesPerColumn, HTileXSize, and HTileYSize fields which imply a particular raster space. However, it seems that this is created much larger than necessary to hold the user's raster data. I have created 3x1 rasters which resulted in the standard 8x512 tiles of 256x4 pixels each.

It seems that the user portion of the raster has to be computed based on the georeferenced bounds in the dblbnd.adf file (assumed to be anchored at the top left of the total raster space), and the HPixelSizeX, and HPixelSizeY fields from hdr.adf.

#Pixels = (D_URX - D_LLX) / HPixelSizeX

#Lines = (D_URY - D_LLY) / HPixelSizeY

Based on this number of pixels and lines, it is possible to establish what portion out of the top left of the raster is really of interest. All regions outside this appear to be empty tiles, or filled with no data markers.
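
The computation might look like this in Python. The function name user_raster_size is hypothetical, and rounding to the nearest integer is an assumption added here to absorb floating point noise; the text only gives the division.

```python
def user_raster_size(llx, lly, urx, ury, pixel_size_x, pixel_size_y):
    """Estimate (n_pixels, n_lines) of the user portion of the grid.

    The bounds come from dblbnd.adf and the pixel sizes from hdr.adf.
    Rounding to the nearest integer is an assumption of this sketch.
    """
    n_pixels = int(round((urx - llx) / pixel_size_x))
    n_lines = int(round((ury - lly) / pixel_size_y))
    return n_pixels, n_lines

# An ungeoreferenced 3x1 raster: bounds (-0.5, -0.5) to (2.5, 0.5), pixel size 1.0.
size = user_raster_size(-0.5, -0.5, 2.5, 0.5, 1.0, 1.0)
```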

RTileType/RTileData

Each tile contains HTileXSize * HTileYSize pixels of data. For floating point and uncompressed integer files the data is just the tile size (in two bytes) followed by the pixel data as 4 byte MSB order IEEE floating point words. For compressed integer tiles it is necessary to interpret the RTileType to establish the details of the tile organization:

RTileType = 0x00 (constant block)

All pixels take the value of the RMin. Data is ignored. It appears there is sometimes a bit of meaningless data (up to four bytes) in the block.

RTileType = 0x01 (raw 1 bit data)

One full tile's worth of data pixel values follows the RMin field, with 1 bit per pixel.

RTileType = 0x04 (raw 4 bit data)

One full tile's worth of data pixel values follows the RMin field, with 4 bits per pixel. The high order four bits of a byte come before the low order four bits.
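
As an illustration of the nibble order, a 0x04 tile could be unpacked in Python roughly like this. The function name unpack_4bit is made up for the sketch; adding RMin to every pixel follows the RMin field description.

```python
def unpack_4bit(tile_data: bytes, rmin: int, n_pixels: int):
    """Expand raw 4 bit tile data into a list of pixel values."""
    values = []
    for byte in tile_data:
        values.append((byte >> 4) + rmin)    # high order nibble comes first
        values.append((byte & 0x0F) + rmin)  # then the low order nibble
    return values[:n_pixels]

# Two bytes hold four pixels: 0x1, 0x2, 0xF, 0x0, each offset by RMin = 10.
pixels_4bit = unpack_4bit(bytes([0x12, 0xF0]), 10, 4)
```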

RTileType = 0x08 (raw byte data)

One full tile's worth of data pixel values (one byte per pixel) follows the RMin field.

RTileType = 0x10 (raw 16 bit data)

One full tile's worth of data pixel values follows the RMin field, with 16 bits per pixel (MSB).

RTileType = 0x20 (raw 32 bit data)

One full tile's worth of data pixel values follows the RMin field, with 32 bits per pixel (MSB).

RTileType = 0xCF (16 bit literal runs/nodata runs)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as:

Marker < 128: The marker is followed by Marker pixels of literal data with two MSB bytes per pixel.
Marker > 127: The marker indicates that 256-Marker no data pixels should be put into the output stream. No data (other than the next marker) follows this marker.

RTileType = 0xD7 (literal runs/nodata runs)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as:

Marker < 128: The marker is followed by Marker pixels of literal data with one byte per pixel.
Marker > 127: The marker indicates that 256-Marker no data pixels should be put into the output stream. No data (other than the next marker) follows this marker.
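
A small Python sketch of decoding a 0xD7 tile is shown below. The function name decode_d7 is hypothetical, and so is the nodata parameter: the value used to represent no data pixels is not given in this description, so the caller supplies whatever sentinel suits them (None by default). RMin is added to each literal value, as described for the RMin field.

```python
def decode_d7(tile_data: bytes, rmin: int, n_pixels: int, nodata=None):
    """Decode literal runs / no data runs with one byte per literal pixel."""
    out = []
    pos = 0
    while len(out) < n_pixels and pos < len(tile_data):
        marker = tile_data[pos]
        pos += 1
        if marker < 128:
            # `marker` literal pixels follow, one byte each.
            literals = tile_data[pos:pos + marker]
            out.extend(value + rmin for value in literals)
            pos += marker
        else:
            # A run of 256 - marker no data pixels; no payload follows.
            out.extend([nodata] * (256 - marker))
    return out[:n_pixels]

# Three literals, a run of two no data pixels, then one more literal.
decoded_d7 = decode_d7(bytes([3, 1, 2, 3, 254, 1, 9]), 100, 6)
```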

RTileType = 0xDF (RMin runs/nodata runs)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as:

Marker < 128: The marker indicates that Marker pixels with the value RMin should be put into the output stream. No literal data follows this marker.
Marker > 127: The marker indicates that 256-Marker no data pixels should be put into the output stream. No data (other than the next marker) follows this marker.

This is similar to 0xD7, except that the data size is zero bytes instead of 1, so only RMin values are inserted into the output stream.

RTileType = 0xE0 (run length encoded 32 bit)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as a count. The four bytes following the count should be interpreted as an MSB Int32 value. They indicate that count pixels of value should be inserted into the output stream.

RTileType = 0xF0 (run length encoded 16 bit)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as a count. The two bytes following the count should be interpreted as an MSB Int16 value. They indicate that count pixels of value should be inserted into the output stream.

RTileType = 0xFC/0xF8 (run length encoded 8 bit)

The data is organized in a series of runs. Each run starts with a marker which should be interpreted as a count. The following byte is the value. They indicate that count pixels of value should be inserted into the output stream.

The interpretation is the same for 0xFC and 0xF8. I believe that 0xFC has a lower dynamic range (2 bit) than 0xF8 (4 or 8 bit).
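
The 8 bit run length scheme can be sketched in Python as below. The function name decode_rle8 is invented for illustration, and the count is assumed to be a single unsigned byte, like the markers of the other run based tile types. RMin is added to each value, following the RMin field description.

```python
def decode_rle8(tile_data: bytes, rmin: int, n_pixels: int):
    """Decode (count, value) byte pairs into a list of pixel values."""
    out = []
    pos = 0
    while len(out) < n_pixels and pos + 1 < len(tile_data):
        count = tile_data[pos]      # assumed to be one unsigned byte
        value = tile_data[pos + 1]  # one byte pixel value
        out.extend([value + rmin] * count)
        pos += 2
    return out[:n_pixels]

# Three pixels of value 5 and two of value 7, each offset by RMin = 1.
decoded_rle8 = decode_rle8(bytes([3, 5, 2, 7]), 1, 5)
```

The 16 bit (0xF0) and 32 bit (0xE0) variants differ only in the width of the value that follows each count.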

RTileType = 0xFF (RMin CCITT RLE 1 bit)

The data stream for this tile is CCITT RLE (G1 fax) compressed. The format is complex, but source is provided with the sample program (derived from libtiff) for reading it. The result of uncompressing is 1 bit data to which the RMin value should be added.

hdr.adf - Header

This is a binary dump of the first 308 bytes of a hdr.adf file.

47524944 312E3200 00000000 FFFFFFFF GRID1.2~~~~~~~~~
00000001 00000000 0000164E 3F800000 ~~~~~~~~~~~N?~~~
00000F00 F6180000 90060000 3603D601 ~~~~~~~~~~~~6~~~
6403E301 01000000 7620F808 43012B03 d~~~~~~~v ~~C~+~
D6019903 E3012B03 D6019903 E301F7BF ~~~~~~+~~~~~~~~~
00007406 6E1FC2A4 7A370D00 0B004200 ~~t~n~~~z7~~~~B~
4E1654A4 00000000 00000000 00000000 N~T~~~~~~~~~~~~~
34A5A89D FF0414A5 A70F0002 00000000 4~~~~~~~~~~~~~~~
00000000 3C0B5F06 A8C05F06 08005AC0 ~~~~<~_~~~_~~~Z~
0A00E101 36035AC0 72085F06 FAA42F3C ~~~~6~Z~r~_~~~/<
0A001667 02000E00 A80B0200 08370200 ~~~g~~~~~~~~~7~~
0CA00200 9C0B0200 04370200 36A0E436 ~~~~~~~~~7~~6~~6
84000000 36A00200 5F063EA5 0883FF04 ~~~~6~~~_~>~~~~~
00008400 00000010 BD810200 5F010000 ~~~~~~~~~~~~_~~~
670E0000 5F01560E 4C4F0001 84008CA5 g~~~_~V~LO~~~~~~
28008F01 1000E00A 6628F7BF 4076FF04 (~~~~~~~f(~~@v~~
3FF00000 00000000 3FF00000 00000000 ?~~~~~~~?~~~~~~~
C08FFC00 00000000 C0A1BF00 00000000 ~~~~~~~~~~~~~~~~
00000008 00000200 00000100 00000001 ~~~~~~~~~~~~~~~~
00000004                            ~~~~

Fields:

Start Byte  # of Bytes  Format      Name             Description
0           8           Char        HMagic           Magic Number - always "GRID1.2" followed by a zero byte.
8           8           -           -                assorted data, I don't know the purpose.
16          4           MSB Int32   HCellType        1 = int cover, 2 = float cover.
20          4           MSB Int32   CompFlag         0 = compressed, 1 = uncompressed
24          232         -           -                assorted data, I don't know the purpose.
256         8           MSB Double  HPixelSizeX      Width of a pixel in georeferenced coordinates. Generally 1.0 for ungeoreferenced rasters.
264         8           MSB Double  HPixelSizeY      Height of a pixel in georeferenced coordinates. Generally 1.0 for ungeoreferenced rasters.
272         8           MSB Double  XRef             dfLLX-(nBlocksPerRow*nBlockXSize*dfCellSizeX)/2.0
280         8           MSB Double  YRef             dfURY-(3*nBlocksPerColumn*nBlockYSize*dfCellSizeY)/2.0
288         4           MSB Int32   HTilesPerRow     The width of the file in tiles (often 8 for files of less than 2K in width).
292         4           MSB Int32   HTilesPerColumn  The height of the file in tiles. Note this may be much more than the number of tiles actually represented in the index file.
296         4           MSB Int32   HTileXSize       The width of a tile in pixels. Normally 256.
300         4           MSB Int32   -                Unknown, usually 1.
304         4           MSB Int32   HTileYSize       Height of a tile in pixels, usually 4.
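
To make the offsets concrete, here is a hedged Python sketch that pulls the known fields out of a 308 byte header. The function name parse_hdr and the dictionary keys are hypothetical. The sample bytes reproduce only the identified fields from the hex dump above, with the unidentified regions zero filled rather than copied.

```python
import struct

def parse_hdr(data: bytes):
    """Extract the identified hdr.adf fields into a dictionary."""
    if data[:7] != b"GRID1.2":
        raise ValueError("not a GRID1.2 header")
    cell_type, comp_flag = struct.unpack_from(">ii", data, 16)
    pixel_x, pixel_y, xref, yref = struct.unpack_from(">4d", data, 256)
    per_row, per_col, tile_x, _unknown, tile_y = struct.unpack_from(">5i", data, 288)
    return {
        "cell_type": cell_type,        # 1 = int cover, 2 = float cover
        "compressed": comp_flag == 0,  # 0 = compressed, 1 = uncompressed
        "pixel_size": (pixel_x, pixel_y),
        "ref": (xref, yref),
        "tiles_per_row": per_row,
        "tiles_per_column": per_col,
        "tile_size": (tile_x, tile_y),
    }

# Known fields from the dump; unidentified regions are zero filled here.
sample_hdr = (
    b"GRID1.2\x00" + bytes(8) + struct.pack(">ii", 1, 0) + bytes(232)
    + bytes.fromhex("3FF0000000000000 3FF0000000000000")
    + bytes.fromhex("C08FFC0000000000 C0A1BF0000000000")
    + bytes.fromhex("00000008 00000200 00000100 00000001 00000004")
)
hdr = parse_hdr(sample_hdr)
```

For the dumped header this yields an integer, compressed grid laid out as 8 x 512 tiles of 256 x 4 pixels.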

Acknowledgements

I would like to thank Geosoft Inc. for partial funding of my research into this format. I would also like to thank:

Kenneth R. McVay for providing the statistics file format.
Noureddine Farah of ThinkSpace who dug up lots of datasets that caused problems.
Luciano Fonseca who worked out RTileType 0x01.
Martin Manningham of Global Geomatics for additional problem sample files.
Harry Anderson of EDX Engineering, for showing me that floating point tiles don't have RTileType.
Ian Turton for supplying sample files demonstrating the need to be careful with the sign of "short" RMin values.
Duncan Chaundy at PCI for poking hard till I finally deduced 0xFF tiles.
Stephen Cheeseman of GeoSoft for yet more problem files.
Geoffrey Williams for files demonstrating tile type 0x20.