A geographic information system (GIS), or geographical information system, captures, stores, analyzes, manages, and presents data that is linked to location. Strictly speaking, GIS as a field (geographic information systems) includes mapping software and its application together with remote sensing, land surveying, aerial photography, mathematics, photogrammetry, geography, and other tools that can be implemented with GIS software. Even so, many people use “GIS” to mean a single geographic information system, even though one system does not cover all the tools connected to topology.
GIS data represents real-world objects (roads, land use, elevation) with digital data. Real-world objects can be divided into two abstractions: discrete objects (a house) and continuous fields (rainfall amount or elevation). There are two broad methods used to store data in a GIS for both abstractions: raster and vector.
A raster data type is, in essence, any type of digital image represented as a grid. Anyone familiar with digital photography will recognize the pixel as the smallest individual unit of an image. A combination of these pixels creates an image, as distinct from the scalable vector graphics that are the basis of the vector model. While a digital image such as a photograph or artwork transferred to a computer is concerned with the output as a representation of reality, the raster data type reflects an abstraction of reality. Aerial photos are one commonly used form of raster data, typically serving a single purpose: to display a detailed image on a map or to act as a base for digitization. Other raster data sets contain information on elevation (a digital elevation model, or DEM) or on the reflectance of a particular wavelength of light (Landsat imagery).
The raster data type consists of rows and columns of cells, with each cell storing a single value. Raster data can be images (raster images) with each pixel (or cell) containing a color value. Other values recorded for each cell may be a discrete value, such as land use, a continuous value, such as temperature, or a null value if no data is available. While a raster cell stores a single value, it can be extended by using raster bands to represent RGB (red, green, blue) colors, colormaps (a mapping between a thematic code and an RGB value), or an extended attribute table with one row for each unique cell value. The resolution of the raster data set is its cell width in ground units.
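The cell grid described above can be sketched in a few lines of Ruby. This is only an illustration: the class name SketchRaster, the use of nil for "no data", and the sample elevation values are assumptions made for the example, not part of any GIS package.

```ruby
# A minimal raster: rows and columns of cells, each holding a single value.
# SketchRaster is a hypothetical name; nil stands for "no data".
class SketchRaster
  attr_reader :rows, :cols, :cell_size

  # cells:     array of rows, each row an array of cell values
  # cell_size: cell width in ground units (the raster's resolution)
  def initialize(cells, cell_size)
    @cells = cells
    @rows = cells.length
    @cols = cells.first.length
    @cell_size = cell_size
  end

  # Value stored in the cell at the given row and column.
  def value_at(row, col)
    @cells.fetch(row).fetch(col)
  end

  # Mean of all cells that hold data; nil if every cell is empty.
  def mean
    values = @cells.flatten.compact
    return nil if values.empty?

    values.sum.to_f / values.length
  end

  # Ground extent as [width, height] in ground units.
  def extent
    [cols * cell_size, rows * cell_size]
  end
end

# Invented elevation samples on a grid with 30-unit cells.
elevation = SketchRaster.new([[120, 125, nil],
                              [118, 122, 130]], 30)
puts elevation.mean          # 123.0
puts elevation.extent.inspect # [90, 60]
```

Storing the values as an array of rows keeps the row and column lookup direct; a multi-band raster could be modelled as one such grid per band.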
In a GIS, geographical features are often expressed as vectors, by considering those features as geometrical shapes. Different geographical features are expressed by different types of geometry.
Zero-dimensional points are used for geographical features that can best be expressed by a single point reference; in other words, a simple location. Examples include the locations of wells, peak elevations, features of interest, or trailheads. Points convey the least information of these geometry types. Points can also be used to represent areas when displayed at a small scale. For example, cities on a map of the world would be represented by points rather than polygons. No measurements are possible with point features.
One-dimensional lines or polylines are used for linear features such as rivers, roads, railroads, trails, and topographic lines. Again, as with point features, linear features displayed at a small scale will be represented as linear features rather than as a polygon. Line features can measure distance.
Two-dimensional polygons are used for geographical features that cover a particular area of the earth's surface. Such features may include lakes, park boundaries, buildings, city boundaries, or land uses. Polygons convey the most information of these geometry types. Polygon features can measure perimeter and area.
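The measurements mentioned for lines and polygons can be illustrated with a short Ruby sketch. It assumes planar coordinates in ground units (not latitude and longitude), and the function names are made up for this example. The area uses the standard shoelace formula.

```ruby
# Length of a polyline given as [[x, y], ...] in planar ground units.
def line_length(points)
  points.each_cons(2).sum { |(x1, y1), (x2, y2)| Math.hypot(x2 - x1, y2 - y1) }
end

# Perimeter of a polygon given as an open ring of [x, y] vertices.
def polygon_perimeter(ring)
  line_length(ring + [ring.first])
end

# Area of a simple (non-self-intersecting) polygon via the shoelace formula.
def polygon_area(ring)
  closed = ring + [ring.first]
  twice_area = closed.each_cons(2).sum { |(x1, y1), (x2, y2)| x1 * y2 - x2 * y1 }
  twice_area.abs / 2.0
end

trail = [[0, 0], [3, 4]]
park  = [[0, 0], [4, 0], [4, 3], [0, 3]]

puts line_length(trail)      # 5.0
puts polygon_perimeter(park) # 14.0
puts polygon_area(park)      # 12.0
```

A point has no counterpart to these functions, which matches the statement that no measurements are possible with point features.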
Each of these geometries is linked to a row in a database that describes its attributes. For example, a database that describes lakes may contain a lake's depth, water quality, and pollution level. This information can be used to make a map that describes a particular attribute of the dataset. For example, lakes could be coloured according to their level of pollution. Different geometries can also be compared. For example, the GIS could be used to identify all wells (point geometry) that are within 1 mile (1.6 km) of a lake (polygon geometry) that has a high level of pollution.
Vector features can be made to respect spatial integrity through the application of topology rules such as 'polygons must not overlap'. Vector data can also be used to represent continuously varying phenomena. Contour lines and triangulated irregular networks (TINs) are used to represent elevation or other continuously changing values. TINs record values at point locations, which are connected by lines to form an irregular mesh of triangles. The faces of the triangles represent the terrain surface.1)
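To make the TIN idea concrete, here is a hedged Ruby sketch of how a value might be read off one triangular face. It uses barycentric (linear) interpolation, which is a common choice but an assumption here; the function name and the sample vertices are invented for the example.

```ruby
# Interpolate a value at (x, y) on one TIN face.
# Each vertex is [x, y, z] in planar coordinates. Returns nil when the
# point lies outside the triangle. Barycentric weights are an assumption
# of this sketch, not something the text prescribes.
def tin_interpolate(a, b, c, x, y)
  ax, ay, az = a
  bx, by, bz = b
  cx, cy, cz = c

  det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
  raise ArgumentError, 'degenerate triangle' if det.zero?

  # Weight of each vertex; the three weights always sum to 1.
  wa = ((by - cy) * (x - cx) + (cx - bx) * (y - cy)) / det.to_f
  wb = ((cy - ay) * (x - cx) + (ax - cx) * (y - cy)) / det.to_f
  wc = 1.0 - wa - wb

  # A clearly negative weight means the point is outside this face.
  return nil if [wa, wb, wc].any? { |w| w < -1e-12 }

  wa * az + wb * bz + wc * cz
end

a = [0, 0, 100]
b = [10, 0, 200]
c = [0, 10, 300]

puts tin_interpolate(a, b, c, 5, 0)          # 150.0, halfway between a and b
puts tin_interpolate(a, b, c, 20, 20).inspect # nil, outside the triangle
```

A full TIN would first locate the face that contains the query point and then apply this step to that face.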
A wide range of activities depend on digital geographic data and geoprocessing, and these information resources are rapidly becoming more important as commerce expands, information technology advances, and environmental problems demand resolution. Unfortunately, non-interoperability severely limits the use of digital geographic information. Spatial data exist in a wide range of incompatible and often vendor-proprietary forms, and geographic information systems (GIS) usually exist in organizations as isolated collections of data, software, and user expertise.
The Open GIS Consortium, Inc. (OGC) is a unique membership organization dedicated to open system approaches to geoprocessing. By means of its consensus building and technology development activities, OGC has had a significant impact on the geodata standards community, and has successfully promoted the vision of “Open GIS” as the vehicle for integration of geoprocessing with the distributed architectures of the emerging worldwide infrastructure for information management.
OGC's direction is set by a board of directors selected for their ability to represent key constituencies in the geoprocessing community. The OGC board speaks on behalf of both public and private sector users interested in finding more integrated and effective ways to use the world's increasing wealth of geographical information to support problem solving in such areas as environmental monitoring and sustainment, transportation, resource management, global mapping, agricultural productivity, crisis management, and national defense.
The OGIS Project Technical Committee of the OGC operates according to a formal consensus process structured to be fair and equitable and to ensure the technical completeness of the specification. To ensure the eventual adoption of OGIS as an official standard, the OGIS Technical Committee is represented on key geodata, GIS, and geomatics standards committees, including the ISO (International Organization for Standardization) TC211 GIS/Geomatics Committee and the ANSI (American National Standards Institute) X3L1 Committee. In addition, the OGIS Technical Committee maintains close ties to the Federal Geographic Data Committee (FGDC).
The OGIS Project Management Committee is composed of representatives from the Technical Committee, representatives of the Principal Member organizations, and others who represent particular constituencies. The Management Committee maintains a business plan for the Project and sets overall policy for the Project. The dual committee structure serves to separate technical and political issues.
OGC was founded to create interoperability specifications in response to widespread recognition of the following problematic conditions in the geoprocessing and geographic information community:
- The multiplicity of geodata formats and data structures, often proprietary, that prevent interoperability and thus limit commercial opportunity and government effectiveness.
- The need to coordinate activities of public and private sectors in producing standardized approaches to specifying geoprocessing requirements for public sector procurements.
- The need to create greater public access to public geospatial data sources.
- The need to preserve the value of legacy GIS systems and legacy geodata.
- The need to incorporate geoprocessing and geographic information resources in the framework of national information infrastructure initiatives.
- The need to synchronize geoprocessing technology with emerging Information Technology (IT) standards based on open system and distributed processing concepts.
- The need to involve international corporations in the development and communication of geoprocessing standards activity, particularly in the areas of infrastructure architecture and interoperability, in order to promote the integration of resources in the context of global information infrastructure initiatives.
OGC is a not-for-profit corporation supported by Consortium membership fees, development partnerships, and cooperative agreements with federal agencies. As of April 26, 1995 the Consortium included 40 members. Though organized to manage multiple project tracks, OGC's initial focus is on the development of the OGIS. OGC plans to establish other project tracks in areas related to implementations of the OGIS architecture.2)
OpenStreetMap is a free editable map of the whole world. It is made by people like you. OpenStreetMap allows you to view, edit and use geographical data in a collaborative way from anywhere on Earth.
OpenStreetMap creates and provides free geographic data such as street maps to anyone who wants them. The project was started because most maps you think of as free actually have legal or technical restrictions on their use, holding back people from using them in creative, productive, or unexpected ways.
The API supports the creation, modification, and deletion of three major object types: Nodes, Ways, and Relations.
Nodes have fixed coordinates and are used to express points of interest, as well as specifying the shape of ways. A typical XML representation of a node looks like this:
<node id="156804" lat="61.8083953857422" lon="10.8497076034546" visible="true" timestamp="2005-07-30T14:27:12+01:00"/>
Nodes may have tags:
<node id="156804" lat="61.8083953857422" lon="10.8497076034546" visible="true" timestamp="2005-07-30T14:27:12+01:00">
  <tag k="tourism" v="hotel"/>
  <tag k="name" v="Cockroach Inn"/>
</node>
Ways represent an ordered list of nodes. They must have at least two nodes, and typically have tags to specify the meaning of the way - a road, a river, a forest. Closed ways with certain tags are treated as areas by the rendering software; there is no explicit way of specifying an area. A typical XML representation of a way looks like this:
<way id="35" visible="true" timestamp="2006-03-14T10:07:23+00:00" user="johnz">
  <nd ref="156804"/>
  <nd ref="156805"/>
  <nd ref="156806"/>
  <tag k="highway" v="secondary"/>
</way>
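The rule that closed ways with certain tags are treated as areas can be sketched as follows. Which tag keys count as area tags depends on the rendering software, so the AREA_KEYS list below is purely an assumption for illustration, as are the struct and function names.

```ruby
# A way as an ordered list of node ids plus free-form tags.
# SketchWay and AREA_KEYS are hypothetical names for this example.
SketchWay = Struct.new(:id, :node_ids, :tags) do
  # A way is closed when it returns to its first node.
  def closed?
    node_ids.length > 2 && node_ids.first == node_ids.last
  end
end

# Assumed list of tag keys that a renderer might interpret as areas.
AREA_KEYS = %w[landuse natural building].freeze

# True when a renderer following the assumed rule would draw an area.
def area?(way)
  way.closed? && way.tags.keys.any? { |key| AREA_KEYS.include?(key) }
end

road   = SketchWay.new(35, [156804, 156805, 156806], 'highway' => 'secondary')
forest = SketchWay.new(36, [1, 2, 3, 1], 'landuse' => 'forest')

puts area?(road)   # false: not closed
puts area?(forest) # true: closed and carries an assumed area tag
```

Because there is no explicit area type, the decision is made entirely from the node list and the tags at rendering time.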
Relations are used to model any kind of relationship between objects, and can themselves represent an object. Relations typically have one or more members (where members may be nodes, ways, or other relations) and a number of tags, one of which is a “type” tag specifying the kind of relation. However, relations without members or tags are permitted. The list of members is unordered, but membership may be qualified by a “role” attribute.
This is an example of a relation:
<relation id="77" visible="true" timestamp="2006-03-14T10:07:23+00:00" user="fred">
  <member type="way" ref="343" role="from"/>
  <member type="node" ref="911" role="at"/>
  <member type="way" ref="227" role="to"/>
  <tag k="type" v="turn_restriction"/>
</relation>
Several of the request types described below return XML data. These essentially return one or more objects to the client. The objects are always wrapped in a single <osm> element:
<osm version="0.5" generator="OSM API server">
  <relation id="77" visible="true" timestamp="2006-03-14T10:07:23+00:00" user="fred">
    <member type="way" ref="343" role="from"/>
    <member type="node" ref="911" role="at"/>
    <member type="way" ref="227" role="to"/>
    <tag k="type" v="turn_restriction"/>
  </relation>
</osm>
For each of the above-mentioned object types, the API supports these CRUD operations (replace <objtype> by one of node, way, relation; replace <id> by the id of the object in question):
- Creation: PUT /api/0.5/<objtype>/create
- Retrieval: GET /api/0.5/<objtype>/<id>
- Update: PUT /api/0.5/<objtype>/<id>
- Deletion: DELETE /api/0.5/<objtype>/<id>
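As a small illustration of the URL pattern, the following Ruby sketch maps an operation and object type to the HTTP method and path listed above. The function name osm_request and the symbols used for the operations are assumptions for this example; it only builds strings and does not contact any server.

```ruby
API_PREFIX = '/api/0.5'
OBJECT_TYPES = %w[node way relation].freeze

# Returns [http_method, path] for one of the four operations.
# osm_request is a hypothetical helper, not part of the OSM API itself.
def osm_request(action, objtype, id = nil)
  raise ArgumentError, "unknown object type: #{objtype}" unless OBJECT_TYPES.include?(objtype)

  case action
  when :create
    ['PUT', "#{API_PREFIX}/#{objtype}/create"]
  when :retrieve, :update, :delete
    raise ArgumentError, 'an id is required' if id.nil?

    method = { retrieve: 'GET', update: 'PUT', delete: 'DELETE' }.fetch(action)
    [method, "#{API_PREFIX}/#{objtype}/#{id}"]
  else
    raise ArgumentError, "unknown action: #{action}"
  end
end

puts osm_request(:create, 'way').inspect
# ["PUT", "/api/0.5/way/create"]
puts osm_request(:retrieve, 'node', 156804).inspect
# ["GET", "/api/0.5/node/156804"]
```

Note that creation and update share the PUT method and differ only in the final path segment.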
There are two commands that take an area, called a bounding box, as input and return the objects that are in or otherwise associated with it.
A bounding box is an area defined by two longitudes and two latitudes, where:
- Latitude is a decimal number between -90.0 and 90.0.
- Longitude is a decimal number between -180.0 and 180.0.
There are two restrictions on the size of bounding boxes:
- They cannot enclose more than 0.25 degrees of latitude or longitude. The area covered by the largest possible bounding box (of 0.25 square degrees) varies from about 900 square miles at the equator to about 400 square miles on Iceland.
- They cannot enclose more than 50,000 nodes.
The commands that take bounding boxes return errors if either of these restrictions is violated. To work with bounding boxes that are larger than permitted by these restrictions, use Osmxapi or an offline solution such as the planet file.
The following command returns:
- All nodes that are inside a given bounding box and any relations that reference them.
- All ways that reference at least one node that is inside a given bounding box, any relations that reference them [the ways], and any nodes outside the bounding box that the ways may reference.
- All relations that reference one of the relations included due to the above rules. (Does not apply recursively.)
- left is the longitude of the left (westernmost) side of the bounding box.
- bottom is the latitude of the bottom (southernmost) side of the bounding box.
- right is the longitude of the right (easternmost) side of the bounding box.
- top is the latitude of the top (northernmost) side of the bounding box.
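The bounding box rules can be checked on the client before a request is sent. The Ruby sketch below is one way to do that under stated assumptions: the struct name OsmBoundingBox is invented, the size limit is read as an area of at most 0.25 square degrees (the text can also be read as a per-side limit), and the comma-separated "left,bottom,right,top" parameter string is an assumed format that simply follows the order of the four values listed above. The 50,000-node limit cannot be checked without the data and is left to the server.

```ruby
# Hypothetical client-side bounding box helper.
OsmBoundingBox = Struct.new(:left, :bottom, :right, :top) do
  # Assumed reading of the size restriction: area in square degrees.
  MAX_AREA = 0.25

  # Coordinates are in range and the box has positive width and height.
  def valid?
    [left, right].all? { |lon| lon.between?(-180.0, 180.0) } &&
      [bottom, top].all? { |lat| lat.between?(-90.0, 90.0) } &&
      left < right && bottom < top
  end

  def area
    (right - left) * (top - bottom)
  end

  def within_size_limit?
    area <= MAX_AREA
  end

  # Assumed parameter format: left,bottom,right,top.
  def to_param
    [left, bottom, right, top].join(',')
  end
end

box = OsmBoundingBox.new(10.8, 61.8, 11.0, 62.0)
puts box.valid?             # true
puts box.within_size_limit? # true
puts box.to_param           # 10.8,61.8,11.0,62.0
```

Rejecting an oversized or malformed box locally avoids a round trip that would only return an error.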
Note that, while this command returns those relations that reference the aforementioned nodes and ways, the reverse is not true: it does not (necessarily) return all of the nodes and ways that are referenced by these relations. This prevents unreasonably-large result sets. For example, imagine the case where:
- There is a relation named “England” that references every node in England.
- The nodes, ways, and relations are retrieved for a bounding box that covers a small portion of England.
While the result would include the nodes, ways, and relations as specified by the rules for the command, including the “England” relation, it would (fortunately) not include every node and way in England. If desired, the nodes and ways referenced by the “England” relation could be retrieved by their respective IDs.
The OSM Library is a collection of Ruby modules and classes to handle data from the OpenStreetMap project.
The OSM Library contains several packages which can be installed separately:
A library for handling OpenStreetMap data.
The library provides classes for the three basic building blocks of any OSM database: OSM::Node, OSM::Way, and OSM::Relation. They are all subclasses of OSM::OSMObject.
# support for basic OSM objects
require 'OSM/objects'

# create a node
node = OSM::Node.new(17, 'user', '2007-10-31T23:48:54Z', 7.4, 53.2)

# create a way and add a node
way = OSM::Way.new(1743, 'user', '2007-10-31T23:51:17Z')
way.nodes << node

# create a relation
relation = OSM::Relation.new(331, 'user', '2007-10-31T23:51:53Z')
There is also an OSM::Member class for members of a relation:
# create a member and add it to a relation
member = OSM::Member.new('way', 1743, 'role')
relation << [member]
Tags can be added to Nodes, Ways, and Relations:
way.add_tags('highway' => 'residential', 'name' => 'Main Street')
You can get the hash of tags like this:
way.tags
way.tags['highway']
way.tags['name'] = 'Bay Street'
As a convenience tags can also be accessed with their name only:
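A self-contained sketch of this kind of name-only access is shown here. It uses a stand-in class, TaggedObject, which is hypothetical and is not the library's own implementation; it only demonstrates how a tag hash can be exposed through Ruby's method_missing hook.

```ruby
# Hypothetical stand-in for an object that carries OSM-style tags.
class TaggedObject
  attr_reader :tags

  def initialize(tags = {})
    @tags = tags
  end

  # Treat an unknown zero-argument method call as a tag lookup.
  def method_missing(name, *args)
    key = name.to_s
    return @tags[key] if args.empty? && @tags.key?(key)

    super
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @tags.key?(name.to_s) || super
  end
end

way = TaggedObject.new('highway' => 'residential', 'name' => 'Main Street')
puts way.highway # residential
puts way.name    # Main Street
```

A key such as addr:street is not a valid Ruby method name, so it would still have to be read through the tags hash.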
This is implemented with the method_missing() function. Of course, it only works for tag keys that are allowed as Ruby method names.
Accessing the OSM API
You can access the OSM RESTful web API through the OSM::API class and through some methods in the OSM::Node, OSM::Way, and OSM::Relation classes.
There are methods for getting Nodes, Ways, and Relations by ID, getting the history of an object etc.
To parse an OSM XML file create a subclass of OSM::Callbacks and define the methods node(), way(), and relation() in it:
class MyCallbacks < OSM::Callbacks
  def node(node)
    ...
  end

  def way(way)
    ...
  end

  def relation(relation)
    ...
  end
end
Instantiate an object of this class and give it to an OSM::StreamParser:
require 'OSM/StreamParser'
cb = MyCallbacks.new
parser = OSM::StreamParser.new(:filename => 'filename.osm', :callbacks => cb)
parser.parse
The methods node(), way(), or relation() will be called whenever the parser has parsed a complete node, way, or relation (i.e. after all tags, nodes in a way, or members of a relation are available).
There are several parser options available:
- REXML (Default, slow, works on all machines, because it is part of the Ruby standard distribution)
- Libxml (Based on the C libxml2 library, faster than REXML, new version needed, sometimes hard to install)
- Expat (Based on C Expat library, faster than REXML)
Since version 0.1.3 REXML is the default parser because many people had problems with the C-based parser. Change the parser by setting the environment variable OSMLIB_XML_PARSER to the parser you want to use (before you require ‘OSM/StreamParser’):
From Ruby:
ENV['OSMLIB_XML_PARSER'] = 'Libxml'
require 'OSM/StreamParser'
From the shell:
export OSMLIB_XML_PARSER=Libxml
If you want the parser to keep track of all the objects it finds in the XML file, you can create an OSM::Database for it:
db = OSM::Database.new
The database lives in memory so this works only if the XML file is not too big.
When creating the parser you can give it the database object:
parser = OSM::StreamParser.new(:filename => 'filename.osm', :db => db)
In your node(), way(), and relation() methods you now have to return true if you want this object to be stored in the database and false otherwise. This gives you a very simple filtering mechanism. If you are only interested in pharmacies, you can use this code:
def node(node)
  return true if node.amenity == 'pharmacy'
  false
end
After the whole file has been parsed, all nodes with amenity=pharmacy will be available through the database. All other objects have been thrown away. You can get a hash of all nodes (key is id, value is a Node object) with:
Or single nodes with the ID:
Ways and relations are accessed the same way.
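The filtering idea (a callback returns true to keep an object, false to discard it) can be tried out without the library. The sketch below is a simplified stand-in: SketchNode, SketchDatabase, PharmacyCallbacks, and feed are all hypothetical names, and the loop in feed only imitates what a parser would do after completing each node.

```ruby
# Hypothetical stand-ins; none of these are the library's own classes.
SketchNode = Struct.new(:id, :tags)

# In-memory store keyed by object id.
class SketchDatabase
  attr_reader :nodes

  def initialize
    @nodes = {}
  end

  def add_node(node)
    @nodes[node.id] = node
  end
end

# Callback object: the return value decides whether a node is stored.
class PharmacyCallbacks
  def node(node)
    node.tags['amenity'] == 'pharmacy'
  end
end

# Imitates the parser: call the callback once per completed node and
# store the node only when the callback returns true.
def feed(nodes, callbacks, db)
  nodes.each { |n| db.add_node(n) if callbacks.node(n) }
end

parsed = [
  SketchNode.new(1, 'amenity' => 'pharmacy', 'name' => 'Central'),
  SketchNode.new(2, 'amenity' => 'cafe'),
  SketchNode.new(3, {})
]

db = SketchDatabase.new
feed(parsed, PharmacyCallbacks.new, db)
puts db.nodes.keys.inspect # [1]
```

Because the store lives in memory, filtering at parse time is what keeps it small for large input files.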
When deleting a database call
first. This will break the internal loop references and makes sure that the garbage collector can free the memory.
A flexible library for exporting OpenStreetMap data into other formats.
OSM uses a very powerful data model with free tagging; other formats generally have a stricter structure. In most cases it is not possible to convert OSM data into other formats without losing some information. You'll have to pick which parts of the information to retain and which to discard. This is done with a “rules file”, which tells the library which OSM objects (with what tags) should be exported into which destination objects (with what attributes).
Rules files always have a similar form, but contain some different commands depending on the destination format.
There is an executable ‘osmexport’ in the ‘bin’ directory. If you installed this library as a gem, it should have been installed in your path. If you use the Debian rubygems package, it will have been installed in /var/lib/gems/1.8/bin or similar; you'll have to adjust your path or add a symlink from /usr/local/bin or a similar directory.
osmexport RULEFILE.oxr OSMFILE.osm OUTFILE/DIR
For the KML export, give a single .kml file as the output file. For the CSV and Shapefile exports, give an output directory; the actual file names are defined in the rule file.
A library for importing/exporting OpenStreetMap data into/from SQLite.
osmsqlite uses the following SQL schema to store OSM data. The schema is pretty straightforward. Every node, way, and relation gets its own table and a table for its tags (called node_tags, way_tags, and relation_tags). In addition, the way_nodes table holds the nodes in a way and the members table holds the members of a relation. The table osm holds the API version number as read from the XML file (currently not implemented).
All tables (except way_nodes) have a marked column. If you are only interested in some of the data in the database, you can mark it by setting marked to 1 and then export only the marked data by using the -m option of osmsqlite export.
CREATE TABLE osm (
    version TEXT
);

CREATE TABLE nodes (
    id        INTEGER NOT NULL PRIMARY KEY,
    user      TEXT,
    timestamp TEXT,
    lon       REAL NOT NULL,
    lat       REAL NOT NULL,
    marked    INTEGER
);

CREATE TABLE node_tags (
    ref    INTEGER NOT NULL,
    key    TEXT,
    value  TEXT,
    marked INTEGER
);

CREATE TABLE ways (
    id        INTEGER NOT NULL PRIMARY KEY,
    user      TEXT,
    timestamp TEXT,
    marked    INTEGER
);

CREATE TABLE way_tags (
    ref    INTEGER NOT NULL,
    key    TEXT,
    value  TEXT,
    marked INTEGER
);

CREATE TABLE way_nodes (
    way  INTEGER NOT NULL,
    num  INTEGER NOT NULL,
    node INTEGER NOT NULL
);

CREATE TABLE relations (
    id        INTEGER NOT NULL PRIMARY KEY,
    user      TEXT,
    timestamp TEXT,
    marked    INTEGER
);

CREATE TABLE relation_tags (
    ref    INTEGER NOT NULL,
    key    TEXT,
    value  TEXT,
    marked INTEGER
);

CREATE TABLE members (
    relation INTEGER NOT NULL,
    type     TEXT,
    ref      INTEGER NOT NULL,
    role     TEXT,
    marked   INTEGER
);
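As an illustration of how the marked column might be set, the following Ruby sketch generates SQL statements that mark every node carrying a given tag. The helper names are hypothetical, and the source does not say whether tag rows must be marked along with the nodes; this sketch marks both as a cautious assumption. The statements are only built as strings, so they can be run with whatever SQLite client is at hand.

```ruby
# Quote a string as an SQL literal by doubling single quotes.
def sql_quote(text)
  "'" + text.gsub("'", "''") + "'"
end

# Hypothetical helper: SQL that marks all nodes with the tag key=value,
# and then the tag rows that belong to marked nodes.
def mark_nodes_sql(key, value)
  [
    'UPDATE nodes SET marked = 1 WHERE id IN ' \
    "(SELECT ref FROM node_tags WHERE key = #{sql_quote(key)} AND value = #{sql_quote(value)});",
    'UPDATE node_tags SET marked = 1 WHERE ref IN (SELECT id FROM nodes WHERE marked = 1);'
  ]
end

puts mark_nodes_sql('amenity', 'pharmacy')
```

The same pattern would apply to ways and relations through the way_tags and relation_tags tables.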
An executable ‘osmsqlite’ using this library is included in the bin directory. If you have installed the osmlib-sqlite gem package, you should have it in your path. If you use the Debian rubygems package, it will have been installed in /var/lib/gems/1.8/bin or similar; you'll have to adjust your path or add a symlink from /usr/local/bin or a similar directory.
You can import an OSM XML file into a SQLite database with this call:
osmsqlite import OSMFILE.osm OSMDB.db
And export it again:
osmsqlite export OSMDB.db OSMFILE.osm
To only export marked data use:
osmsqlite export -m OSMDB.db OSMFILE.osm
See the SQL schema for more information about marking data. There is currently no way to do this comfortably from the command line; you'll have to call
to go into the database manually and mark the parts you need.3)