Esri Shapefile .SHP File Description
Esri Shapefiles are typically used in Grapher as drawings. They can be
imported using the File | Import command.
Esri Shapefiles are in a binary file format (i.e., they can't be created
or modified with a text editor or word processor) that is compatible with
Arc/Info, Arc/View, and other Esri application programs. This format is
used to store spatial information including boundary objects such as areas,
curves, and points. Spatial information is only concerned with the location
of objects in space (i.e., their coordinates) and not with their attributes
(such as line or fill style, marker symbol used, text labels, etc.).
Three types of files are produced with each export:
Filename Extension | Description
.SHP | Contains the coordinates of each object in the drawing.
.SHX | Contains the file offset of each object in the .SHP file.
.DBF | Contains the attribute text associated with each object in the .SHP file.
In each of the .SHP, .SHX, and .DBF files, the shapes in each file correspond
to each other in sequence. That is, the first record in the .SHP file
corresponds to the first record in the .SHX and .DBF files, and so on.
The .SHP and .SHX files have various fields with different endianness,
so as an implementor of the file formats you must be very careful to respect
the endianness of each field and treat it properly.
Overview
A shapefile is a digital vector storage format for storing geometric
location and associated attribute information. This format lacks the capacity
to store topological information. The shapefile format was introduced
with ArcView GIS version 2 in the beginning of the 1990s. It is now possible
to read and write shapefiles using a variety of free and non-free programs.
Shapefiles are simple because they store primitive geometrical data
types: points, lines, and polygons. These primitives are of limited
use without attributes that specify what they represent. Therefore,
a table of records stores the properties/attributes for each primitive
shape in the shapefile. Shapes (points/lines/polygons) together with data
attributes can represent geographical data in many different ways, and
these representations support powerful and accurate computations.
While the term "shapefile" is quite common, a "shapefile"
is actually a set of several files. Three individual files are normally
mandatory to store the core data that comprises a shapefile. There are
a further eight optional files which store primarily index data to improve
performance. Each individual file should conform to the MS-DOS 8.3 file naming
convention (8-character filename prefix, full stop, 3-character filename
suffix, such as shapefil.shp) in order to be compatible with past applications
that handle shapefiles. For this same reason, all files should be located
in the same folder.
Shapefiles deal with coordinates in terms of X and Y, although they
often store longitude and latitude, respectively. While working
with the X and Y terms, be sure to respect the order of the terms (longitude
is stored in X, latitude in Y).
Mandatory Files
.SHP - shape format; the feature geometry itself
.SHX - shape index format; a positional index of the feature geometry
to allow seeking forwards and backwards quickly
.DBF - attribute format; columnar attributes for each shape, in dBase
III format
Optional Files
.PRJ - projection format; the coordinate system and projection information,
a plain text file describing the projection using well-known text format
.SBN and .SBX - a spatial index of the features
.FBN and .FBX - a spatial index of the features for shapefiles that
are read-only
.AIN and .AIH - an attribute index of the active fields in a table or
a theme's attribute table
.IXS - a geocoding index for read-write shapefiles
.MXS - a geocoding index for read-write shapefiles (ODB format)
.ATX - an attribute index for the .dbf file in the form of shapefile.columnname.atx
(ArcGIS 8 and later)
.SHP.XML - metadata in XML format
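The mandatory-file rule above can be checked with a short sketch. The function name `missing_mandatory` and the case-insensitive comparison of file names are assumptions made for illustration; they are not part of the shapefile specification or of Grapher.

```python
from pathlib import PurePath

# The three mandatory extensions listed above.
MANDATORY = (".shp", ".shx", ".dbf")


def missing_mandatory(shp_name, folder_listing):
    """Return the mandatory extensions that have no matching file.

    shp_name is the name of the main file (for example "roads.shp") and
    folder_listing is an iterable of file names found in the same folder.
    Names are compared case-insensitively (an assumption of this sketch).
    """
    stem = PurePath(shp_name).stem.lower()
    present = {
        PurePath(name).suffix.lower()
        for name in folder_listing
        if PurePath(name).stem.lower() == stem
    }
    return [ext for ext in MANDATORY if ext not in present]
```

For example, a folder that holds only roads.shp and roads.dbf would be reported as missing the .shx file.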
File Format
Shapefile shape format (.SHP)
The main file .SHP contains the primary geographic reference data
in the shapefile. The file consists of a single fixed length header
followed by one or more variable length records. Each of the variable
length records includes a record header component and a record contents
component. A detailed description of the file format is given in the
Esri Shapefile Technical Description. This format should not be confused
with the AutoCAD shape font source format, which shares the .shp extension.
The main file header is fixed at 100 bytes in length and contains
17 fields; nine 4-byte (32-bit signed integer or int32) integer fields
followed by eight 8-byte (double) signed floating point fields:
Bytes | Type | Endianness | Usage
0-3 | int32 | big | File code (always hex value 0x0000270a)
4-23 | int32 | big | Unused; five uint32
24-27 | int32 | big | File length (in 16-bit words, including the header)
28-31 | int32 | little | Version
32-35 | int32 | little | Shape type (see reference below)
36-67 | double | little | Minimum bounding rectangle (MBR) of all shapes contained within the shapefile; four doubles in the following order: min X, min Y, max X, max Y
68-83 | double | little | Range of Z; two doubles in the following order: min Z, max Z
84-99 | double | little | Range of M; two doubles in the following order: min M, max M
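The header layout above can be read with Python's struct module. The following is a minimal sketch rather than a complete reader; the names `ShpHeader` and `parse_shp_header` are hypothetical and chosen only for this example.

```python
import struct
from typing import NamedTuple


class ShpHeader(NamedTuple):
    file_code: int
    file_length_bytes: int  # converted from 16-bit words to bytes
    version: int
    shape_type: int
    bbox: tuple     # (min X, min Y, max X, max Y)
    z_range: tuple  # (min Z, max Z)
    m_range: tuple  # (min M, max M)


def parse_shp_header(data: bytes) -> ShpHeader:
    """Decode the 100-byte main file header, respecting mixed endianness."""
    if len(data) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # Bytes 0-27: seven big-endian int32 values
    # (file code, five unused fields, file length in 16-bit words).
    file_code, *_unused, length_words = struct.unpack(">7i", data[:28])
    if file_code != 0x0000270A:
        raise ValueError("unexpected file code: %#x" % file_code)
    # Bytes 28-35: two little-endian int32 values (version, shape type).
    version, shape_type = struct.unpack("<2i", data[28:36])
    # Bytes 36-99: eight little-endian doubles (MBR, Z range, M range).
    xmin, ymin, xmax, ymax, zmin, zmax, mmin, mmax = struct.unpack(
        "<8d", data[36:100]
    )
    return ShpHeader(
        file_code,
        length_words * 2,
        version,
        shape_type,
        (xmin, ymin, xmax, ymax),
        (zmin, zmax),
        (mmin, mmax),
    )
```

Passing the first 100 bytes of a .SHP file to `parse_shp_header` returns the shape type and bounding rectangle without reading any records.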
The file then contains any number of variable-length records. Each
record is prefixed with a record-header of 8 bytes:
Bytes | Type | Endianness | Usage
0-3 | int32 | big | Record number
4-7 | int32 | big | Record length (in 16-bit words)
Following the record header is the actual record:
Bytes | Type | Endianness | Usage
0-3 | int32 | little | Shape type (see reference below)
4- | - | - | Shape content
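Walking the records of a .SHP file can be sketched as follows. This example assumes that the record length counts only the record contents (not the 8-byte record header) and that the shape type at the start of the contents is little-endian, as in the Esri technical description. The function name `iter_shp_records` is hypothetical.

```python
import struct


def iter_shp_records(data: bytes):
    """Yield (record_number, shape_type, shape_content) for each record.

    data is the full content of a .SHP file, including the 100-byte header.
    """
    pos = 100  # records start immediately after the fixed-length header
    while pos + 8 <= len(data):
        # Record header: two big-endian int32 values.
        record_number, length_words = struct.unpack(">2i", data[pos:pos + 8])
        start = pos + 8
        end = start + length_words * 2  # length is given in 16-bit words
        content = data[start:end]
        if len(content) < 4 or len(content) < length_words * 2:
            raise ValueError("truncated record %d" % record_number)
        # Record contents begin with a little-endian int32 shape type.
        (shape_type,) = struct.unpack("<i", content[:4])
        yield record_number, shape_type, content[4:]
        pos = end
```

The remaining bytes of each record are returned undecoded, because their layout depends on the shape type.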
The variable length record contents depend on the shape type. The
following are the possible shape types:
Value | Shape Type | Fields
0 | Null shape | None
1 | Point | X, Y
3 | Polyline | MBR, Number of parts, Number of points, Parts, Points
5 | Polygon | MBR, Number of parts, Number of points, Parts, Points
8 | MultiPoint | MBR, Number of points, Points
11 | PointZ | X, Y, Z, M
13 | PolylineZ | Mandatory: MBR, Number of parts, Number of points, Parts, Points, Z range, Z array; Optional: M range, M array
15 | PolygonZ | Mandatory: MBR, Number of parts, Number of points, Parts, Points, Z range, Z array; Optional: M range, M array
18 | MultiPointZ | Mandatory: MBR, Number of points, Points, Z range, Z array; Optional: M range, M array
21 | PointM | X, Y, M
23 | PolylineM | Mandatory: MBR, Number of parts, Number of points, Parts, Points; Optional: M range, M array
25 | PolygonM | Mandatory: MBR, Number of parts, Number of points, Parts, Points; Optional: M range, M array
28 | MultiPointM | Mandatory: MBR, Number of points, Points; Optional: M range, M array
31 | MultiPatch | Mandatory: MBR, Number of parts, Number of points, Parts, Part types, Points, Z range, Z array; Optional: M range, M array
In common use, shapefiles containing Point, Polyline, and Polygon
are extremely popular. The "Z" types are three-dimensional.
The "M" types contain a user-defined measurement which coincides
with the point being referenced. Three-dimensional shapefiles are
rather uncommon, and the measurement functionality has been largely
superseded by more robust databases used in conjunction with the shapefile
data.
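As an illustration of the most common types, the sketch below decodes the contents of a Point record and of a Polyline or Polygon record (the bytes that follow the 4-byte shape type). It assumes the field encodings given in the Esri technical description: all values little-endian, the MBR as four doubles, the counts and part indices as int32, and each point as an X, Y pair of doubles. The function names are hypothetical.

```python
import struct


def decode_point(content: bytes):
    """Return (x, y) from Point record contents."""
    return struct.unpack("<2d", content[:16])


def decode_polyline(content: bytes):
    """Return (mbr, parts) from Polyline or Polygon record contents.

    parts is a list of parts, each a list of (x, y) tuples.
    """
    mbr = struct.unpack("<4d", content[:32])
    num_parts, num_points = struct.unpack("<2i", content[32:40])
    parts_end = 40 + 4 * num_parts
    # Each entry in Parts is the index of the first point of that part.
    part_starts = struct.unpack("<%di" % num_parts, content[40:parts_end])
    flat = struct.unpack(
        "<%dd" % (2 * num_points),
        content[parts_end:parts_end + 16 * num_points],
    )
    points = list(zip(flat[0::2], flat[1::2]))
    # A part runs from its start index up to the next part's start index.
    bounds = list(part_starts) + [num_points]
    parts = [points[bounds[i]:bounds[i + 1]] for i in range(num_parts)]
    return mbr, parts
```

Polyline and Polygon share the same layout, so one function covers both; the Z and M variants append further arrays that this sketch does not read.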
Shapefile shape index format (.SHX)
The shapefile index contains the same 100-byte header as the .SHP
file, followed by any number of 8-byte fixed-length records which
consist of the following two fields:
Bytes | Type | Endianness | Usage
0-3 | int32 | big | Record offset (in 16-bit words)
4-7 | int32 | big | Record length (in 16-bit words)
Using this index, it is possible to seek backwards in the shapefile
by seeking backwards first in the shape index (which is possible because
it uses fixed-length records), reading the record offset, and using
that to seek to the correct position in the .SHP file. It is also
possible to seek forwards an arbitrary number of records by using
the same method.
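The seek procedure described above can be sketched as follows. The example assumes that the first index field is the offset of the record header in the .SHP file and the second is the length of the record contents, both in 16-bit words. The names `shx_entry` and `read_record` are hypothetical, and the files are handled as in-memory bytes to keep the sketch short.

```python
import struct


def shx_entry(shx: bytes, index: int):
    """Return (byte_offset, content_length_bytes) for a zero-based record index."""
    count = (len(shx) - 100) // 8  # fixed-length records after the header
    if not 0 <= index < count:
        raise IndexError("record index out of range")
    pos = 100 + 8 * index
    # Both fields are big-endian int32 values measured in 16-bit words.
    offset_words, length_words = struct.unpack(">2i", shx[pos:pos + 8])
    return offset_words * 2, length_words * 2


def read_record(shp: bytes, shx: bytes, index: int) -> bytes:
    """Return the contents of one record without scanning earlier records."""
    offset, length = shx_entry(shx, index)
    start = offset + 8  # skip the 8-byte record header in the .SHP file
    return shp[start:start + length]
```

Because every index record is 8 bytes long, the position of entry n is simply 100 + 8 * n, which is what makes seeking in either direction inexpensive.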
Shapefile attribute format (.DBF)
Attributes for each shape are stored in the xBase (dBase) format,
which has an open specification.
Shapefile projection format (.PRJ)
The projection information contained in the .PRJ file is critical
in order to understand the data contained in the .SHP file correctly.
Although it is technically optional, it is most often provided, as
it is not necessarily possible to guess the projection of any given
points. The file is stored in well-known text (WKT) format.
Typical information contained in the .PRJ file includes the geographic
coordinate system, datum, spheroid, prime meridian, and units used.
Shapefile spatial index format (.SBN)
This is a binary spatial index file, which is used only by Esri
software. The format is not documented, and is not implemented by
other vendors. The .SBN file is not strictly necessary, since the
.SHP file contains all of the information necessary to successfully
parse the spatial data.
Limitations
Topology and Shapefiles
Shapefiles do not have the ability to store topological information.
ArcInfo coverages and Personal/File/Enterprise Geodatabases do have
the ability to store feature topology.
Spatial Representation
The edges of a polyline or polygon are defined using points, which
can give it a jagged edge at higher resolutions. Additional points
are required to give smooth shapes, which requires storing quite a
lot of data compared to, for example, Bézier curves, which can capture
complexity using smooth curves, without using as many points. Currently,
none of the shapefile types support Bézier curves.
Data Storage
Unlike most databases, the database format is based on the older xBASE
standard, which is incapable of storing null values in its fields. This limitation
can make the storage of data in the attributes less flexible. In ArcGIS
products, values that should be null are instead replaced with a 0
(without warning), which can make the data misleading. This problem
is addressed in ArcGIS products by using Esri's Personal Geodatabase
offerings, one of which is based on Microsoft Access.
Mixing Shape Types
Each shapefile can technically store a mix of different shape types,
because the shape type precedes each record, but common use of the specification
dictates that only shapes of a single type can be in a single file.
For example, a shapefile cannot contain both Polyline and Polygon
data. Thus, well (point), river (polyline), and lake (polygon) data
must be kept in three separate files.
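A reader can verify the single-type convention by comparing the shape type of every record with the shape type declared in the file header. The sketch below is one way to do this; it treats null shapes (type 0) as permitted alongside the declared type, which is an assumption based on the Esri technical description. The function name `is_single_type` is hypothetical.

```python
import struct


def is_single_type(shp: bytes) -> bool:
    """Return True if every non-null record matches the header's shape type."""
    # Header bytes 32-35: declared shape type, little-endian int32.
    (declared,) = struct.unpack("<i", shp[32:36])
    found = set()
    pos = 100
    while pos + 12 <= len(shp):
        # Record header is big-endian; the length counts 16-bit words.
        _record_number, length_words = struct.unpack(">2i", shp[pos:pos + 8])
        # The record's own shape type is little-endian.
        (shape_type,) = struct.unpack("<i", shp[pos + 8:pos + 12])
        found.add(shape_type)
        pos += 8 + 2 * length_words
    # Null shapes (0) are ignored; everything else must match the header.
    return found - {0} <= {declared}
```

A file that fails this check is still parseable record by record, but many applications will refuse it or misinterpret it.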
Import Options Dialog
No import options dialog is displayed.
Import Automation Options
See Esri Shapefile Import Automation Options
Export Options Dialog
See Esri Shapefile Export Options Dialog and Scaling
Export Automation Options
See Esri Shapefile Export Automation Options
See Also
File Format Chart