If you have spatial databases such as the ones I set up in my previous blog posts about GNIS and PostGIS, you will likely want to add a few things to them to make them more useful. GNIS and Geonames contain point types of all different classes, from airports to populated places. What if you were only interested in one type of point, such as airports? By default, if you load GNIS data into QGIS, it will display all of the points in your view and look cluttered as the screen shot below demonstrates.
The good news is that you can easily specify what you only want QGIS to show you. There are a couple of ways that you can filter data out in a layer with QGIS: the Set Filter button and creating a database view. The Set Filter button lets you create a SQL filter by clicking on the field you want to filter with, the relational operator, and what you want to compare with. A database view lets you pre-define your filter and presents it as another table. Whichever one you use is up to you, but there is at least one thing you must do to speed up both methods.
The Add Filter Method
Assuming you followed my previous posts about setting up GNIS, we will use that for this example. First you need to create an index on the feature_class column of GNIS. This will make the query that we will use as an example run much faster. To do this, run the following commands:
psql -d USGS USGS=# create index gnis_feature_class_idx on gnis(feature_class);
Once this is done, you will have a new index called gnis_feature_class_idx. This allows PostgreSQL find the matching feature classes from the data more quickly by consulting the index instead of manually searching each row in the database.
Now that this is done, we will next move on to our first example, the Set Filter button method. As a refresher, here are the feature classes in GNIS:
USGS=# select distinct(feature_class), count(*) from gnis group by feature_class order by feature_class; feature_class | count -----------------+-------- Airport | 23202 Arch | 720 Area | 2557 Arroyo | 466 Bar | 5870 Basin | 4304 Bay | 14094 Beach | 2409 Bench | 724 Bend | 2797 Bridge | 7356 Building | 160291 Canal | 21559 Cape | 16417 Cemetery | 145544 Census | 11629 Channel | 4014 Church | 231967 Civil | 64237 Cliff | 4479 Crater | 246 Crossing | 13167 Dam | 56931 Falls | 2499 Flat | 10559 Forest | 1314 Gap | 8246 Glacier | 1021 Gut | 3541 Harbor | 1271 Hospital | 15864 Island | 20540 Isthmus | 28 Lake | 69403 Lava | 168 Levee | 546 Locale | 162518 Military | 2860 Mine | 36133 Oilfield | 4863 Park | 69501 Pillar | 2092 Plain | 289 Populated Place | 201065 Post Office | 66942 Range | 2480 Rapids | 1062 Reserve | 1276 Reservoir | 74683 Ridge | 15127 School | 216473 Sea | 28 Slope | 373 Spring | 38655 Stream | 231462 Summit | 70614 Swamp | 7608 Tower | 16800 Trail | 11047 Tunnel | 750 Unknown | 186 Valley | 70239 Well | 38797 Woods | 684 (64 rows)
Both of the examples here will work with the feature class of Airports. These examples also assume you already have some data set up as I previously demonstrated on this blog.
For the Set Filter method, first click on the Add PostGIS Layer button in QGIS. Select the USGS database and select the gnis table. Once you have done this, click on the Set Filter button at the bottom right side of the Add Layer dialog.
As you can see above, you are presented with a list of Fields on the left side, operator buttons in the middle, and Values on the right side. Click on the Feature Class field to select it and then click the All button under the values window to the right. Since we created an index on the Feature Class field, this should quickly show you all the unique values that exist in the database for that field. Now double click Feature Class to add it into the Provider specific filter expression in the text box at the bottom of the dialog. Then click the = button in the Operators group. Now double click Airport from the Values box to add it. Your filter expression should now look like this:
"feature_class" = 'Airport'
If you click the Test button, QGIS will perform a query and display the number of rows that match your query. You can use this to double check that you did not make any errors during entry. In our case, the query should return around 23,000+ rows depending on the version of GNIS you are using. Click the OK button to go back to the Layers dialog and then the Add button to add it to your project. With the filter in place, your screen should look less cluttered as it is only showing airports from GNIS
You can use this method to filter out data on any type of field in a geospatial database. I recommend, though, that you first create an index on that column to speed up the operation. Otherwise, you may have to wait a while every time you try to load your filtered data.
Creating a Database View
The second method to filter data is to create a database view. Basically all database types can create a view. For the non-database savvy, a view can be thought of as a virtual table that is defined by a database query. This means that whenever you access the view, the data that is returned is generated by a query. For example, if you wanted a table of only airports in GNIS, you could make a view that pretends to be another table but does not take up as much space as a real table would.
For this example, we will again use Airports. Once you understand this, you can then create views for other classes by replacing the feature class name. However, when working with tools such as QGIS, there is a caveat that you need to first know about. If you are savvy with databases, your might create the view with the following command:
psql -d USGS USGS=# create view view_airports as select * from gnis where feature_class = 'Airport';
When you then go to load this into QGIS, you will indeed see the view as a layer, but there will be a problem.
As you can see, QGIS will not let you just click on the view to add it. If you hover over the error triangle, you will see it displays a message of Select columns in the ‘Feature Id’ column that uniquely identify features of this layer. If you scroll to the right, you will see that QGIS will let you select a column in the view that is a unique identifier (feature_id in the case of GNIS).
Why does QGIS not automatically know which column to use? If you are not well versed in how QGIS and databases work, tables in a database typically need a unique identifier for each entry so that it can be properly found. With recent versions of PostgreSQL and PostGIS, the view does not have a unique key presented with the view. If QGIS tried to automatically deduce what field to use as the unique key, it would take a lot of processing power and would mean that QGIS would temporarily “hang” whenever you tried to access a database. Instead, QGIS gives you an option to tell it what field to use as the unique identifier for each row.
If you go ahead and select the feature_id field in the Add Layer dialog, you will then be able to select the layer and click Add to load it into QGIS.
So the question you might have is “Which method is better?” The correct answer is “Whichever method makes more sense to you.” Some people may be OK with setting a filter when they load in data. Others may prefer to have views show up in the Layers dialog to remind them what all is available. A PostgreSQL materialized view would likely be the fastest method as it creates a cache of the data, but that is a bit beyond the scope of this post 🙂
Have fun and happy GISing with all Open Source software!