Using Free Geospatial Tools and Data Part 4: Tools of the Trade

Today there is a huge number of Open Source geospatial tools out there. The Open Source GIS movement started with libraries to access data formats and has led to full-blown geospatial systems today. We’ll look at a few of these that will prove useful to you while you’re working with data. Note that I’m not trying to give an exhaustive explanation of how to use the tools here. The projects have already written up a lot of information about how to use them, so I’m going to be lazy and not replicate it here. This post will at least give an idea of some of the big players in Open Source GIS so you can continue your investigations there.

Tools

The first project of interest is the Geospatial Data Abstraction Library (GDAL). For developers, GDAL provides a single interface for accessing various raster file formats. For users, it comes with a number of command line utilities that can do everything from reprojecting data to converting files between various GIS formats. Several other Open Source projects also use GDAL for their file format access. We’ll look at some specific uses of GDAL later on. To learn more, visit their website at http://www.gdal.org/.

GDAL comes with several utilities that you might find useful in your geospatial explorations. gdalinfo will give you information about raster GIS files. ogrinfo will provide information on vector file formats. gdal_translate will convert files to various other formats. gdalwarp will let you convert files to another coordinate system or merge multiple files into a single raster data set.
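As a rough sketch of how these utilities are typically invoked, the Python snippet below builds the command lines and only runs one if the tool is actually on your PATH. The file names (input.tif, roads.shp, and so on) and the EPSG:4326 target coordinate system are placeholders I picked for illustration, not part of any real data set.

```python
import shutil
import subprocess


def gdal_commands(raster="input.tif", vector="roads.shp"):
    """Build example command lines; every file name here is hypothetical."""
    return {
        # Print metadata (size, projection, bands) for a raster file.
        "info": ["gdalinfo", raster],
        # Summarize all layers in a vector file (-al = all layers, -so = summary only).
        "vector_info": ["ogrinfo", "-so", "-al", vector],
        # Convert the raster to another format (-of selects the output driver).
        "convert": ["gdal_translate", "-of", "PNG", raster, "output.png"],
        # Reproject the raster to another coordinate system (-t_srs = target SRS).
        "reproject": ["gdalwarp", "-t_srs", "EPSG:4326", raster, "reprojected.tif"],
    }


def run(cmd):
    """Run a command only if its executable is installed; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        print("not installed, would run:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=False)
    return True


if __name__ == "__main__":
    # Just show the commands; call run(cmd) yourself once the files exist.
    for name, cmd in gdal_commands().items():
        print(name, "->", " ".join(cmd))
```

Building the commands as lists rather than one long string avoids shell quoting problems when file names contain spaces.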

When appropriate, GDAL provides a wrapper to other libraries to access file formats. libtiff and libgeotiff are used to access the GeoTIFF file format. GeoTIFFs are an extension to the TIFF standard that allows for spatial data to be included with the file. They are mainly used for raster data such as aerial photographs. However, extensions such as 16-bit GeoTIFFs are used to store elevation data where each pixel encodes a height value. LibKML is used to access KML files such as those used with Google Earth. Libpng, libjpeg, and Jasper are used for other formats, and so on.
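To make the idea of "each pixel encodes a height value" concrete, here is a minimal sketch that unpacks raw 16-bit samples into a grid of elevations. It assumes little-endian signed samples stored row by row with heights in meters, and the sample values are made up. A real GeoTIFF also carries headers, tags, and possibly compression, so in practice you would let GDAL or libtiff do the reading.

```python
import struct


def decode_heights(raw, width, height):
    """Unpack little-endian signed 16-bit samples into rows of heights."""
    count = width * height
    values = struct.unpack("<%dh" % count, raw[: count * 2])
    return [list(values[r * width:(r + 1) * width]) for r in range(height)]


# A made-up 2x2 tile of elevations in meters (not real data).
raw = struct.pack("<4h", 120, 125, -3, 2048)
grid = decode_heights(raw, 2, 2)
print(grid)  # [[120, 125], [-3, 2048]]
```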

These libraries also provide useful utilities. One example is listgeo, which I used in the last post in this series. Another is tiffinfo, which will print out information from the standard TIFF tags in a file. tiffcp and geotifcp will copy files while preserving the TIFF and extended GeoTIFF tags.

As far as full-featured GIS go, the granddaddy of them all is the Geographic Resources Analysis Support System (GRASS) GIS. GRASS was originally developed from 1982 to 1995 by the US Army Construction Engineering Research Laboratories, which is part of the US Army Corps of Engineers. For a while it retained its “old school GIS” roots (like early ESRI products) by having everything broken out into programs run from the command line. The Corps ceased developing GRASS around 1995 and it was then taken over by a group of people at Baylor University. Thanks to an international group of developers, it has since gone from simple graphics and a heavy reliance on the command line to a modern GUI. Of all the Open Source GIS tools, it is the closest competitor to a full-featured GIS such as ESRI’s ArcGIS or ERDAS Imagine.

 

Example of the Modern GRASS GIS GUI

Another powerful Open Source GIS tool is the Quantum GIS (QGIS) project. QGIS started life around 2002 and became a part of the Open Source Geospatial Foundation. It was written using the Qt widget set and features a plugin architecture that allows users to contribute more features using Python or C++. It uses libraries such as GDAL to access numerous raster and vector formats and even includes a plugin so it can act as a front-end to GRASS. Additionally, it can consume OGC web services and interact with Oracle, PostGIS, and ESRI geodatabases. QGIS also features a world-wide group of developers and just recently hit the 2.0 release. For those familiar with ESRI products, it fits in a nebulous area between the old ArcView product and ArcMap. In some ways this author believes it outperforms either product thanks to the plugin architecture and its support of web services. In others it lacks many of the geoprocessing features of the ESRI products. However, it is easily powerful enough for most users who need to work with geospatial data.

 

QGIS 2.0 Running on Fedora 19

uDig is the third GIS tool in the lineup. It was created by the Canadian company Refractions Research and is based on the Eclipse framework. uDig also features a plugin architecture and can use GRASS for vector processing. Like the others, uDig can ingest OGC web services. Unlike the others, uDig is also used as a base for other GIS applications such as JGrass and Arbonaut. While not necessarily as full-featured as QGIS or GRASS, uDig still has its place among the tools.

 

uDig Screenshot Courtesy Wikipedia

The System for Automated Geoscientific Analyses (SAGA) GIS is an Open Source GIS originally developed by the Department of Physical Geography at the University of Göttingen in Germany and now maintained by a group of international volunteers. SAGA also uses GDAL to read a large number of geospatial file formats. It focuses on geoscientific processing and can be called from the R statistical data analysis system. SAGA is similar to QGIS but includes some features not found in that package, such as image pattern recognition and more geostatistical functionality. It is still under active development but has not had a new stable release since 2011.

 

SAGA GIS Screenshot Courtesy Wikipedia

Databases

For storing geospatial data we have two main contenders. Spatial databases are extensions to traditional relational database systems that allow you to do queries such as “give me all the points within five miles of this location.” The first, PostGIS, is a set of extensions to the PostgreSQL database. It was also developed by Refractions Research and was released to the world in 2001. PostGIS is probably the most popular Open Source geospatial database in the world, thanks to its functionality and full implementation of many OGC standards. It started out storing only vector data but was expanded to include raster in version 2.0. It includes geodatabase and geoprocessing functionality not even found in the commercial database offerings. Today it is used by many projects and is supported by both proprietary and Open Source GIS tools. Tools such as QGIS can communicate directly with PostGIS to read and write data.
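To give a feel for what a “within five miles” query has to compute, here is a plain Python sketch that filters points by great-circle distance using the haversine formula. The place names and coordinates are made up, and the Earth radius is an approximation. A spatial database answers the same question using a spatial index instead of scanning every row (in PostGIS the function to look at is ST_DWithin), so treat this as an illustration of the idea rather than how the database does it.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def points_within(points, center, radius_miles=5.0):
    """Return the names of all points within radius_miles of center."""
    lat0, lon0 = center
    return [name for name, lat, lon in points
            if haversine_miles(lat0, lon0, lat, lon) <= radius_miles]


# Hypothetical points: (name, latitude, longitude).
places = [
    ("near", 38.95, -77.35),
    ("far", 39.50, -77.35),
]
print(points_within(places, (38.96, -77.36)))  # ['near']
```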

MySQL also has a set of spatial extensions and is catching up to PostGIS in functionality, although it still lags in some areas such as geography and 3D types, SRID projections, and directly querying by a radius. It is not as fully and directly supported by other Open Source GIS tools. However, it is faster than PostGIS for many operations due to the differences in architecture between the two. Many of the MySQL spatial relationship functions have historically worked on minimum bounding rectangles rather than on the exact shape of each geometry, which is quick but less precise.
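To see why a bounding-shape test is fast but only approximate, the sketch below compares an axis-aligned bounding box check with an exact point-in-polygon check. Using a simple rectangle as the bounding shape is my assumption for illustration, and the triangle and test point are made up. This is not MySQL code, just the general idea.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bounding_box(point, polygon):
    """Cheap test: is the point inside the polygon's bounding box?"""
    x, y = point
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Exact test using ray casting (even-odd rule)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


triangle = [(0, 0), (10, 0), (0, 10)]
point = (8, 8)
print(in_bounding_box(point, triangle))  # True: inside the box
print(in_polygon(point, triangle))       # False: outside the triangle itself
```

The box test needs only four comparisons, which is why it is quick, but it reports a match for a point that the exact test rejects.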

This is just a brief introduction to Open Source GIS tools out there. I’d highly suggest poking around with your favorite search engine to learn more about what’s available. If you’re an experienced GIS professional, you could get up to speed with QGIS or GRASS fairly quickly. Next time, we’ll start looking at data and how to process it for your needs, including creating spatial databases.

 


Using Free Geospatial Tools and Data Part 2: Last of the History Lesson I Promise

As GPS became more popular, another problem came about. Map updates for these units were expensive. They were in proprietary formats and could only be obtained from a specific vendor for that vendor’s units. Many units could not be updated at all. Even better, in some cases individual units could not share the same update. The updates were slow to incorporate new areas. At one point, a popular GPS maker only employed a handful of cartographers who were responsible for the entire world. As you can imagine, they could only do so much at any given time.

Some enterprising souls decided they wanted to try to convert their expensive data from their proprietary GIS packages and put them on their GPSs. After some reverse engineering they actually managed to get data into a format so they could put maps on their GPS units. Even then the GIS data sets were not current for everywhere as the vendors focused on popular areas.

This approach had a few problems. First, there were not that many people who had the skills and software available to do the GPS reverse engineering and convert the vendor data sets to use with them. The commercial GIS data were not necessarily any better than the GPS data in terms of being up to date. Plus, everything was proprietary. The GPS and GIS data were owned by vendors who would go after people if they even thought about making data available for free to anyone who had not bought a license. So even if you had access to the GIS data and tools, you could only update your own GPS and not post the files online for people to download. And you were likely violating some license agreement even if you only used it for yourself.

In fact, the vendors were very aggressive about protecting their data and took action against anyone who violated their copyright. Early web map companies would introduce errors into their data sets in an attempt to watermark them so they would know when someone was illegally using them. This was before the days of widespread in-car GPS units and Google or NAVTEQ cars driving around recording roads with GPS precision and accuracy. With closed software and data, no one really had to worry about accuracy for the casual user.

Open Source Comes to the Table

The frustrations with proprietary vendors and data sets started a small cottage industry of developers who wanted to give everyone access to the same types of tools that the commercial vendors had. In the commercial GIS space there were only one or two real sellers of GIS software. This monopoly led to stagnant development and large monolithic software programs. These Open Source developers wanted to write new tools that everyone could use to manipulate what free data was out there. They wrote libraries such as GDAL, libgeotiff, and others to provide access to the various file formats. The tools followed to allow users to do simple manipulations of geospatial data.

Now that people had tools, they wanted data to work on. Data at this time was scarce. In the raster data space it was mainly reposted USGS Digital Raster Graphics (scanned paper maps) and Digital Orthophotos (aerial photographs). The US Census made their TIGER vector map data (think roads) available for download, but early on it had issues with spatial accuracy and was hard to work with unless the user spent time converting it to work in their GIS. The USGS also had some vector data for hydrography and transportation, but it was also somewhat difficult to use due to the formats in which it was distributed.

People at this time had been doing various things with GPS units and early GIS tools to make data available. Some people posted GPS tracks of trails for others to go hiking on. Geocaching had caught on in a big way and introduced a lot of people to the convenience of a GPS. Moving map display GPS units allowed people to navigate roads without the need for a paper map. More and more people began wanting up-to-date data for their devices, and they did not want to pay the high prices the commercial vendors were asking.

The explosion in GPS use and the availability of Open Source tools led to an outcry from people who wanted more data so they could keep their GPS units up to date or just play with photos of their neighborhoods. Technology had evolved to the point where computers could more easily manipulate the large raster data sets that were out there. Eventually governments began to make more data available to the taxpayers, who felt they had paid for it once and so should not pay again to download it. The USGS made DRGs and DOQs available for free. The TIGER vector data got a lot more accurate and was updated on a much more regular basis. But still, a lot of the data was not current since there was only so much money spent by governments on mapping programs.

Along came the OpenStreetMap (OSM) project in 2004 with the goal of creating a free base map of the world. OSM came out in the early years of the social media craze and provided a collaborative platform so people could add mapping data either through GPS traces or by volunteering their time to vectorize satellite photos. Suddenly, people all over the world could contribute to creating free maps of their areas and use the data however they wanted. Combined with more and more governments providing their data free to download, we came to the modern era where we have more GIS data available than ever before.

Where are we Now?

It may sound like hyperbole, but we’re now in a golden age of Open Source GIS tools and open data. The very capable QGIS application has recently hit version 2.0. OpenStreetMap continues to grow, and its compressed data file is now up to 30 gigabytes of high-resolution user-contributed data under an open license. Toolkits and libraries such as GDAL power many Open Source and even some commercial applications. Many cars now come with GPS units built into the dash. Cell phones with 3G+ data connections and mapping apps from Google, Apple, and others have caused traditional GPS companies such as Garmin to scramble to determine their future relevance. Anyone can now take Open Source tools and convert open data to update maps in their GPSs. Web map services using open standards make even more data available to web browsers and other applications. Times have gone from a scarcity of geospatial data to so much that management and discovery of it has become difficult due to the volume and number of providers.

Next time we’ll take a look at the Open Source tools of the trade that anyone can download and use.

Using Free Geospatial Tools and Data Part 1: Introduction and a History Lesson

This kicks off a series of posts about GIS tools and data. I’ve wanted to do something like this for a while now, but am finally forcing myself to do more writing to get back into the habit. So to start off, I’d like to go through a brief history lesson to discuss where GIS was and where it is today. I’ve had a long history in the GIS field, spending most of my professional career working for a government mapping agency where I wrote production systems and then transitioned into a research and development mode. So on to Part 1.

In days of old (OK, only around a decade or three ago), getting into the geospatial field could be a costly endeavor. Most data were locked up in proprietary vaults and you had to pay for access to them, if you could get access at all. The world had just started to transition from paper maps and traditional cartography to applying technology to mapping tasks. With the less advanced technology of the day, transcribing paper maps into a digital representation was a labor-intensive task and required many hours of work. In many cases, creating vector data involved someone drawing vectors on top of an on-screen image of a paper map (which, sadly, is still one of the main methods in use today!).

There was also a lot of confusion in the early days about how to do digital mapping in the first place. Most of the people involved came from paper mapping and had spent decades learning how things worked in a physical world. The move to digital changed things. Accuracy in the paper world was no longer sufficient in digital, as what looked like it lined up on paper actually didn’t line up in digital. You could zoom in with a computer and see how far things actually were from lining up. There was a lot of resistance in the early years as mappers did not think using computers was “real” cartography.

As an example, consider how paper maps were printed. They used multiple Mylar plates that contained various parts of the map. When the map was printed, the plates would be inked and rolled against the paper to lay down the layers (colors) of the map. These plates were aligned by the use of stud holes that were cut into the Mylar so they could be positioned correctly during the printing process. When these plates were scanned in and referenced, they did not line up. In the analog world, a gap of a millimeter or so would not really show up on the paper map. In digital, however, that gap will stand out like a sore thumb when the digital map separates are combined in a GIS.
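A quick bit of arithmetic shows why that small gap matters once the data is digital. The 1:24,000 and 1:100,000 scales below are my own assumptions for illustration (they are common topographic map scales, not figures taken from the plates described above).

```python
def ground_distance_m(paper_gap_mm, scale_denominator):
    """Convert a gap measured on paper (mm) into ground distance (meters)."""
    return paper_gap_mm * scale_denominator / 1000.0


# A 1 mm misalignment at two assumed map scales.
print(ground_distance_m(1.0, 24000))   # 24.0
print(ground_distance_m(1.0, 100000))  # 100.0
```

A misalignment you can barely see on paper becomes tens of meters on the ground, which is obvious as soon as you zoom in on a screen.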

 

Cartographer scribing a map separate
Credit: U.S. Geological Survey
Department of the Interior

GIS tools themselves were expensive and were mainly available from two vendors. Additionally, these tools were really only available on Windows and proprietary UNIX platforms such as Solaris and IRIX. Using them was difficult and required users to have an in-depth understanding of geospatial data and how to use the tools. Many people had to take week-long vendor classes before they could use them. And of course, most of the tools had to be run from the command line.

Then things began to change. GPS was opened up to the public and for the first time mapping became dynamic for the non-GIS user. As GPS units advanced, they became equipped with moving map displays so people could actually see where they were on a map and what was around them.

Early GPS units, however, suffered from the same issues as early GIS systems. In the beginning they were bulky and only gave you your latitude and longitude on a digital display. This was OK for some people, as they could look at a paper map to reference where they were. They were expensive and only came from a handful of vendors. These early units had accuracy issues, the major one being the intentional inaccuracy mandated by the US government, which was afraid the signals could be used against America. They were very dependent on line of sight and took a while to get a general lock (which then meant they were accurate to the tens of feet instead of a foot or two).

 

Various GPS Receivers
Credit: Wikipedia
Photographer: Stefan Kühn

 

GPS and GIS technologies began to evolve. GIS software got better and became more user friendly, making use of the GUI technologies of the day. GPS units became faster and smaller so that people could take handheld units into the outdoors. GIS professionals began to try to hook up their software systems to a GPS so they could use their data with real-time measurements. Companies then got the bright idea to sell software so that you could hook a GPS to your laptop and watch your location in real time on the screen. It was clunky and hard to use, but it was the direct ancestor of every system we take for granted today.

Next time I’ll discuss the rise of Open Source GIS tools and data.