Revisiting Historic Topographic Maps Part 2

In Part 1 I discussed how I find and download maps from the USGS historical topographic map collection. Now I will show how I go through the downloaded images and determine which ones to keep for a merged state map, along with some things you may run into while making such a map.

Now that the maps are all downloaded, it is time to go through and examine each one to determine what to keep and what to digitally “throw out.” When you download all the historic maps of a certain scale for a state, you will find that each geographic area may have multiple versions that cover it. You will also find that there are some specially made maps that go against the standard quadrangle area and naming convention. The easiest way for me to handle this is to load and examine all the maps inside QGIS.

Using QGIS to Check an Image

I load the maps that cover a quadrangle and overlay them on top of something like Google Maps. For my purposes, I usually try to pick maps with the following characteristics:

  • Oldest to cover an area.
  • Good visual quality (easy to read, the paper map was not ripped, etc.)
  • Good georeferencing to existing features

QGIS makes it easy to look at all the maps that cover an area. I will typically change the opacity of a layer and see how features such as rivers match existing ones. You will be hard-pressed to find an exact match as some scales are too coarse and these old maps will never match the precision of modern digital ones made from GPS. I also make sure that the map is not too dark and that the text is easily readable.
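
If you would rather script this comparison than click through the Layers panel, the same check can be done from the QGIS Python console. This is only a minimal sketch, assuming QGIS 3.x and a made-up file path: it loads one scanned quadrangle, drops its opacity to half, and zooms to it so you can eyeball how its rivers and roads line up against the basemap underneath.

# Minimal sketch for the QGIS Python console (QGIS 3.x assumed);
# the file path is hypothetical and iface is provided by the console.
from qgis.core import QgsProject, QgsRasterLayer

candidate = QgsRasterLayer('/data/topo/va_250k_quad_1894.tif', 'candidate 1894 scan')
if not candidate.isValid():
    raise RuntimeError('could not load the scanned map')

QgsProject.instance().addMapLayer(candidate)

# Semi-transparent so the basemap shows through for a registration check.
candidate.renderer().setOpacity(0.5)
candidate.triggerRepaint()

# Zoom the canvas to the candidate quadrangle.
iface.mapCanvas().setExtent(candidate.extent())
iface.mapCanvas().refresh()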

One thing you will notice with the maps is that names change over time. An example of this is below, where in one map a feature is called Bullock’s Neck and in another it is Bullitt Neck.

Feature Named Bullock's Neck

Feature Named Bullitt Neck

Another thing you will find with these maps is that the same features are not always in the same spots. Consider the next three images here that cover the same area.

Geographic Area to Check for Registration

First Historic Map to Check Registration

Second Map to Check Registration

If you look closely, you will see that the land features of the map appear to shift downward between the second and third images. This happens because of how the maps were printed “back in the day.” Each map was broken down into separates, where each separate (or plate) contained the features of a single color. One contained roads, text, and houses, while another held features such as forests. These separates had stud holes in them so they could be held in place during the printing process. Each separate was inked and a sheet of paper was run over each one in turn. Over time the stud holes would wear, so one or more separates could shift around during printing. Additionally, maps back then were somewhat “works of art” and could differ depending on who did the engraving. Finally, depending on the scale and quality of the map, the algorithms used to georeference the scanned images can introduce rubber sheeting that further changes things.

During my processing, one of the things I use QGIS for is to check which maps register better against modern features. It takes a while using this method but in the end I am typically much happier about the final quality than if I just picked one from each batch at random.

Another thing to check with the historic maps is coverage. Sometimes the map may say it covers a part of the state when it does not.

Map Showing No Actual Virginia Coverage

Here the map showed up in the results list for Virginia, but you can see that the Virginia portion is blank and it actually only contains map information for Maryland.

Finally, you may well find that you do not have images covering the entire state you are interested in. When you group things by scale and year, you may find that the USGS no longer has the original topographic maps for some areas, or that no maps were ever produced for them.

Once the images are all selected, they need to be merged into a single map for the state. For my setup, I have found that things are easier and faster if I merge them into a GeoTIFF with multiple overview levels as opposed to storing them in PostGIS.

Here I will assume there is a directory of 250K scale files that cover Virginia and that these files have been sorted and the best selected. The first part of this is to merge the individual files into a single file with the command:

gdal_merge.py -o va_250k.tif *.tif

This command may take some time depending on the number and size of the input files. Once it finishes, the next step is to compress and tile the image. Tiling breaks the image into blocks that can be accessed and displayed separately without having to process the rest of the image. Compression can make a huge difference in file size. I did some experimenting and found that a JPEG compression quality of eighty strikes a good balance between being visually pleasing and reducing file size.

gdal_translate -co COMPRESS=JPEG -co TILED=YES -co JPEG_QUALITY=80 va_250k.tif va_250k_tiled.tif

Finally, GeoTIFFs can have reduced-resolution overview layers added to them. The TIFF format supports multiple pages in a file, as one of its original uses was to store faxes. A GIS such as QGIS can recognize when a file has overviews and will use them based on how far the user has zoomed. These overviews contain much less data than the full-resolution image and can be quickly accessed and displayed.

gdaladdo --config COMPRESS_OVERVIEW JPEG --config INTERLEAVE_OVERVIEW PIXEL -r average va_250k_tiled.tif 2 4 8 16

With the above command, GDAL will add overviews at roughly half, quarter, eighth, and sixteenth resolution.
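
If you prefer to drive this from Python rather than the shell, the GDAL bindings can do roughly the same thing. The following is only a sketch under a few assumptions: it uses gdal.Warp to mosaic the quadrangles instead of gdal_merge.py, it expects the osgeo package to be installed, and it assumes the selected quadrangles are the only TIFFs in the current directory.

# Rough Python equivalent of the gdal_merge.py / gdal_translate / gdaladdo steps.
import glob
from osgeo import gdal

gdal.UseExceptions()

# Mosaic the selected quadrangles into a single GeoTIFF.
gdal.Warp('va_250k.tif', sorted(glob.glob('*.tif')))

# Rewrite it tiled and JPEG-compressed at quality 80.
gdal.Translate('va_250k_tiled.tif', 'va_250k.tif',
               creationOptions=['COMPRESS=JPEG', 'TILED=YES', 'JPEG_QUALITY=80'])

# Add the half-, quarter-, eighth-, and sixteenth-resolution overviews.
gdal.SetConfigOption('COMPRESS_OVERVIEW', 'JPEG')
gdal.SetConfigOption('INTERLEAVE_OVERVIEW', 'PIXEL')
ds = gdal.Open('va_250k_tiled.tif', gdal.GA_Update)
ds.BuildOverviews('AVERAGE', [2, 4, 8, 16])
ds = None  # close the dataset to flush everything to disk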

In the end, with tiling and compression, my 250K scale merged map of Virginia comes in at 520 megabytes. QGIS recognizes that the multiple TIFF pages are the various overviews and over my home network loading and zooming is nearly instantaneous. Hopefully these posts will help you to create your own mosaics of historic or even more modern maps.

Manipulating a CSV with Pandas

At my day job I am working on some natural language processing and need to generate a list of place names so I can further train the excellent spacy library.  I previously imported the full Planet OSM file, so I went there to pull a list of places.  However, the place names in OSM are typically in the language of the person who did the collection, so they can be anything from English to Arabic.  I stored the OSM data using imposm3 and included a PostgreSQL hstore column to store all of the user tags so we would not lose any data.  I did a search for all tags with keys like name and en in them and exported those keys and values to several CSV files based on the points, lines, and polygons tables.  I thought I would write a quick post to show how easy it can be to manipulate data outside of traditional spreadsheet software.

The next thing I needed to do was some data reduction, so I turned to my go-to library, Pandas.  If you have been living under a rock and have not heard of it, Pandas is an exceptional data-processing library that allows you to easily manipulate data from Python.  In this case, I knew some of my data rows were empty and that I would have duplicates due to how things get named in OSM.  Pandas makes cleaning data like this incredibly easy.

First I needed to load the files into Pandas to begin cleaning things up.  My personal preference for a Python interpreter is ipython/jupyter in a console window. To do this I ran ipython and then imported Pandas by doing the following:

In [1]: import pandas as pd

Next I needed to load up the CSV into Pandas to start manipulating the data.

In [2]: df = pd.read_csv('osm_place_lines.csv', low_memory=False)

At this point, I could examine how many columns and rows I have by running:

In [3]: df.shape
Out[3]: (611092, 20)

Here we can see that I have 611,092 rows and 20 columns.  My original query pulled a lot of columns because I wanted to try to capture as many pre-translated English names as I could.  To see what all of the column names are, I just had to run:

In [10]: df.columns
Out[10]: 
Index(['name', 'alt_name_1_en', 'alt_name_en', 'alt_name_en_2',
       'alt_name_en_3', 'alt_name_en_translation', 'en_name',
       'gns_n_eng_full_name', 'name_en', 'name_ena', 'name_en1', 'name_en2',
       'name_en3', 'name_en4', 'name_en5', 'name_en6', 'nam_en', 'nat_name_en',
       'official_name_en', 'place_name_en'],
      dtype='object')

The first task was to drop any rows that had no values in them.  In Pandas, empty cells default to the NaN value.  So to drop all the empty rows, I just had to run:

In [4]: df = df.dropna(how='all')

To see how many rows fell out, I again checked the shape of the data.

In [5]: df.shape
Out[5]: (259564, 20)

Here we can see that the CSV had 351,528 empty rows where the line had no name or English name translations.

Next, I assumed that I had some duplicates in the data.  Some things in OSM get generic names, so these can be filtered out since I only want the first row from each duplicate.  With no options, drop_duplicates() in Pandas only keeps the first value.

In [6]: df = df.drop_duplicates()

Checking the shape again, I can see that I had 68,131 rows of duplicated data.

In [7]: df.shape
Out[7]: (191433, 20)

At this point I was interested in how many cells in each row still contained no data.  The CSV was already sparse since I converted each hstore key into a separate column in my output.  To do this, I ran:

In [8]: df.isna().sum()
Out[8]: 
name                          188
alt_name_1_en              191432
alt_name_en                190310
alt_name_en_2              191432
alt_name_en_3              191432
alt_name_en_translation    191432
en_name                    191430
gns_n_eng_full_name        191432
name_en                    191430
name_ena                   172805
name_en1                   191409
name_en2                   191423
name_en3                   191429
name_en4                   191430
name_en5                   191432
name_en6                   191432
nam_en                     191432
nat_name_en                191431
official_name_en           191427
place_name_en              191429
dtype: int64

Here we can see the sparseness of the data.  Considering I am now down to 191,433 rows, some of the columns have only a single entry in them.  We can also see that I am probably not going to have a lot of English translations to work with.

At this point I wanted to save the modified dataset so I would not lose it.  This was as simple as:

In [8]: df.to_csv('osm_place_lines_nonull.csv', index=False)

The index=False option tells Pandas to not output its internal index field to the CSV.

Now I was curious what things looked like, so I decided to check out the name column.  First I increased some default values in Pandas because I did not want it to abbreviate rows or columns.

pd.set_option('display.max_rows', 200)
pd.set_option('display.max_columns', 25)

To view the full rows where the name column is null, I did the following (output abbreviated to keep the blog post shorter 🙂):

df[df['name'].isnull()]
...
       name_en                                 name_ena name_en1 name_en2  \
166        NaN                        Orlovskogo Island      NaN      NaN   
129815     NaN                            Puukii Island      NaN      NaN   
159327     NaN                           Ometepe Island      NaN      NaN   
162420     NaN                                  Tortuga      NaN      NaN   
164834     NaN                         Jack Adan Island      NaN      NaN   
191664     NaN                            Hay Felistine      NaN      NaN   
193854     NaN             Alborán Island Military Base      NaN      NaN   
197893     NaN                         Carabelos Island      NaN      NaN   
219472     NaN                           Little Fastnet      NaN      NaN   
219473     NaN                             Fastnet Rock      NaN      NaN   
220004     NaN                           Doonmanus Rock      NaN      NaN   
220945     NaN                             Tootoge Rock      NaN      NaN   
229446     NaN                               Achallader      NaN      NaN   
238355     NaN                            Ulwile Island      NaN      NaN   
238368     NaN                             Mvuna Island      NaN      NaN   
238369     NaN                            Lupita Island      NaN      NaN   
238370     NaN                              Mvuna Rocks      NaN      NaN   
259080     NaN                                  Kafouri      NaN      NaN   
259235     NaN                              Al Thawra 8      NaN      NaN   
259256     NaN                              Beit al-Mal      NaN      NaN   
261584     NaN                                   Al Fao      NaN      NaN   
262200     NaN                                  May 1st      NaN      NaN   
...

Now that I have an idea of how things look, I can do things like fill out the rest of the name column with the English names found in the various other columns.
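
As a sketch of that last step (assuming the data frame and column names shown above, and a hypothetical output file name), you can take the left-most non-empty value from the English-name columns in each row and use it to fill in the missing values in name:

# Every column except 'name' holds some flavor of English name.
en_cols = [c for c in df.columns if c != 'name']

# bfill(axis=1) pulls the left-most non-NaN value in each row into the
# first column of the slice; .iloc[:, 0] then grabs that value per row.
first_en = df[en_cols].bfill(axis=1).iloc[:, 0]

# Only fill where 'name' is currently missing, then save the result.
df['name'] = df['name'].fillna(first_en)
df.to_csv('osm_place_lines_filled.csv', index=False)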

Added a GNIS “Fixer” to my misc_gis_scripts repository

As I’ve posted before, the download from the USGS Geonames site has some problems.  The feature_id column should be unique, but it is not: the file lists some of the same features under both a US state and a region of Mexico or Canada with the same ID, which breaks the unique constraint.

I just added a very quick and dirty Python program to my misc_gis_scripts repo on Github.  It’s run with:

python3 gnisfixer.py downloadedgnisfile newgnisfile

For the latest download, it removed over 7,000 entries that were not in the US or a US territory, and the resulting file loaded into the DB perfectly.
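
For reference, the idea behind the script is simple. The following is just a sketch of the approach and not the actual gnisfixer.py; it assumes the pipe-delimited national file layout with FEATURE_ID in the first column and the state alpha code in the fourth, keeps only rows whose state code is a US state or territory, and drops any repeated FEATURE_ID that slips through:

import sys

# US states, DC, and territories; anything else (SON, AB, BC, QC, ...) gets dropped.
US_CODES = {
    'AL', 'AK', 'AZ', 'AR', 'CA', 'CO', 'CT', 'DE', 'DC', 'FL', 'GA', 'HI',
    'ID', 'IL', 'IN', 'IA', 'KS', 'KY', 'LA', 'ME', 'MD', 'MA', 'MI', 'MN',
    'MS', 'MO', 'MT', 'NE', 'NV', 'NH', 'NJ', 'NM', 'NY', 'NC', 'ND', 'OH',
    'OK', 'OR', 'PA', 'RI', 'SC', 'SD', 'TN', 'TX', 'UT', 'VT', 'VA', 'WA',
    'WV', 'WI', 'WY', 'PR', 'GU', 'VI', 'AS', 'MP',
}

def fix(in_path, out_path):
    seen = set()
    with open(in_path, encoding='utf-8', errors='replace') as src, \
         open(out_path, 'w', encoding='utf-8') as dst:
        dst.write(src.readline())   # copy the header line as-is
        for line in src:
            fields = line.split('|')
            feature_id, state = fields[0], fields[3]
            if state not in US_CODES:
                continue            # drop the Canada/Mexico twin
            if feature_id in seen:
                continue            # drop any other repeated id
            seen.add(feature_id)
            dst.write(line)

if __name__ == '__main__':
    fix(sys.argv[1], sys.argv[2])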

Quick Post: My New SBC Computer – the Asus Tinkerboard

A while back I picked up a Raspberry Pi 3 and turned it into a NAS and LAMP stack server (Apache, PostgreSQL, MySQL, PostGIS, and so on).  Later I came across forums mentioning a new entry from Asus into this space called the Tinkerboard.  Now I’m not going to go into an in-depth review since you can find those all over the Internet.  However, I do want to mention a few things I’ve found and done that are very helpful.  I like the board since it supports things like OpenCL and, pound for pound, is more powerful than the Pi 3.  The two gigabytes of RAM versus one on the Pi 3 also make it useful for more advanced processing.

One thing to keep in mind is that the board is still “new” and has a growing community.  As such there are going to be some pains, such as not having as big a community as the Pi ecosystem.  But things do appear to be getting better, and so far it’s proven to be more capable and, in some cases, more stable than my Pi 3.

So without much fanfare, here is my list of tips for using the Tinkerboard.  You can find a lot more information online.

  1. Community – The Tinkerboard has a growing community of developers.  My favorite forums are at the site run by Currys PC World.  They’re active and you can find a lot of valuable information there.
  2. Package Management – Never, EVER, run apt-get dist-upgrade.  Since it’s Debian, the usual apt-get update and apt-get upgrade are available.  However, running dist-upgrade can cause you to lose hardware acceleration.
  3. OpenCL – One nice thing about the Tinkerboard is that the Mali GPU has support for hardware-accelerated OpenCL.  TinkerOS ships an incorrectly named directory in /etc/OpenCL, which causes apps not to work by default.  The quick fix is to change to /etc/OpenCL and run ln -s venders vendors.  After doing this, tools like clinfo should properly pick up support.
  4. Driver updates – Asus is active on Github.  At their rk-rootfs-build repository you can find updated drivers as they’re released.  I recommend checking it from time to time and downloading updated packages when they come out.
  5. Case – The Tinkerboard is the same size and mostly the same form-factor as the Raspberry Pi 3.  I highly recommend you pick up a case with a built-in cooling fan since the board can get warm, even with the included heat sinks attached.
  6. You can follow this link and install Tensorflow for the Pi on the Tinkerboard.  It’s currently not up-to-date, but much less annoying than building Tensorflow from scratch.
  7. SD Card – You would do well to follow my previous post about how to zero out an SD card before you format and install TinkerOS to it.  This will save you a lot of time and pain.  I will note that so far, my Tinkerboard holds up under heavy IO better than my Pi 3 does.  I can do things like make -j 5 and it doesn’t lock up or corrupt the card.

I’ll have more to say about this board later.

Creating a NAS using the Raspberry Pi 3 Part 3 (finally)

So I have been meaning to finish up this series for a while now, but other things got in the way (which hopefully I can post about here soon).  In the meantime, there are numerous tutorials online now about how to set a Pi up as a home file server, so I will defer to those instead of wasting more bits on the Internet.  However, I would like to point out some things I have done that have resulted in my Pi setup being nice and stable.

The biggest thing to do when using the SD card as the root file system for the Pi is to minimize the number of writes to it.  This will help it last longer and avoid file system corruption.  One thing you can do is modify your /etc/fstab and use the noatime family of attributes.  The default file system of most Pi distributions is ext3/4, so this should work for you.  First find the entry in /etc/fstab for your root file system (/).  In mine below, you can see it’s /dev/mmcblk0p2:

proc /proc proc defaults 0 0
/dev/mmcblk0p2 / ext4 defaults 0 1
/dev/mmcblk0p1 /boot/ vfat defaults 0 2
/dev/md0 /mnt/filestore xfs defaults,nofail 0 2

Change line 2 so that it reads like this:

/dev/mmcblk0p2 / ext4 defaults,noatime,nodiratime 0 1

This will stop the file system from modifying itself each time a directory or file is accessed.

As you can see from my fstab, I also have my RAID partition on the enclosure set to xfs and I use the nofail attribute.  This is very important since your enclosure may not be fully spun up and ready by the time your Pi tries to mount it.  If it’s not there, the Pi will hang (forever in my case since it will cause the kernel to panic).

I also run mariadb and postgresql with postgis on the Pi 3; however, I have them set to not autostart by running:

systemctl disable mysqld
systemctl disable postgresql

I could leave them running since I’ve lowered their memory requirements, but I choose to only have things running on the Pi 3 when I need them, to make sure I don’t run out of memory.

I put their respective data directories on the NAS and then made soft links under /var by running:

ln -s /mnt/filestore/data/mysql /var/lib/mysql
ln -s /mnt/filestore/data/postgresql /var/lib/postgresql

You could edit their config files to change the location, but I have found it is easier to simply use soft links.  Plus, since the servers are not set to start at boot, I do not have to worry about any errors every time the Pi restarts.

I also run recoll on the Pi, as I have collected several hundred gigabytes of papers and ebooks over the years.  Recoll is a nice utility that provides full-text search over your documents.  By default, recoll can run your file system and the system itself into the ground if you let it.  I made a few tweaks so that it would play nicely whenever I run it periodically on the Pi.  The first thing I did was move the ~/.recoll directory to the NAS and create a soft link by running:

ln -s /mnt/filestore/data/recoll ~/.recoll

Again, the goal is to reduce the number of file system accesses to the SD card itself.  Secondly, I created the .recoll/recoll.conf file with the following contents:

topdirs = /mnt/filestore/data/Documents /mnt/filestore/data/ebooks /mnt/filestore/data/Programming
filtermaxseconds = 60
thrQSizes = -1 -1 -1

The filtermaxseconds parameter tells recoll to stop indexing a file if the filter runs for a whole minute.  The thrQSizes option has recoll use a single thread.  While this makes it slower, it makes things run much better on the Pi while still allowing other services to run.

If you want to run other services, keep in mind that if they do a lot of I/O, you should move them to your external drive and use a soft link to redirect like I did above.  Doing so will help to greatly extend the life of your SD card and keep you from having to reimage it.

Creating a NAS Using a Raspberry Pi 3 Part 2

In the last article, I went over my decisions about the hardware I wanted to use to build a cheap home NAS. Here I will go over the software and configuration to get everything working.

Once all the parts came in, it was time to get going and configure everything. First, though, I would like to talk about SD cards, and why I feel they are the one major flaw with the Raspberry Pi series.

Conceptually, SD cards are a great thing. They come in different sizes and can store multiple gigabytes of data on them. They are used in everything from cell phones to digital cameras to computers. You are probably using one daily in a device without even knowing it.

You might think there are hundreds of companies making them, and you would be wrong. See, as with many things, a few companies actually make the physical cards and a lot of other companies buy and re-brand them. The companies that slap their logo on these cards do not care much about quality as long as the cards are cheap. So what we end up with is a situation where you can buy two of the same “type” of SD card and physically they could be quite different from each other.

You also may not realize that SD cards are just as capable of developing bad sectors as your physical hard drive is. Some cards have a smart enough controller built-in that will automatically remap bad sectors to other good spaces on the card like SMART does with hard drives. Many others do not and have a “dumb” controller that does the bare minimum to make the device work.

The reality with SD cards is that expecting them to “just work” is about as safe as playing Russian roulette with all six cylinders loaded. Just as with hard drives, your SD WILL fail. Unlike hard drives, your SD may or may not be able to take care of problems on its own. And with the wild west of the cards, well, your best bet is to never trust that data on them is safe.

At my previous job I spent a lot of time dealing with SD cards and learning how to deal with their various issues. Often we would have random problems come up that could not be explained, only to find out that the SD card had developed issues that needed to be corrected to return things to normal. What I found out after looking at the low level portions of the card and a lot of reading has made me rethink trusting these devices for long term storage and use. The Pi will be running an operating system off of the SD, so you can expect a lot of reads and writes being done to it. This will speed up the development of bad sectors on the device and reduce its operating lifetime.

There are a few things you can do to help. Before you even start to install software for your Pi, I highly recommend that you check the physical surface of the SD card, even if it is new out of the package. I’m writing this from a Linux perspective, but the same information applies to Windows or even Mac OS as well. As usual, your mileage may vary: doing this could cause you to lose data and make dogs and cats live together in a dystopian future. This will take some extra time, but I firmly believe it is worth it.

Plug the SD card into your computer and identify which device it is assigned. I will leave it up to you to web search this for bonus points. The first thing I recommend is using the dd command to write random data to the entire device. If the SD card has an intelligent controller, this will help it determine if there are any bad sectors and remap them before you put Linux on it for the Pi. Even without an intelligent controller, an initial full write can help find spots that are bad or trigger marginal sectors to go bad. Run a command similar to:

dd if=/dev/urandom of=/your/sd/device bs=8M

This command will write a random value to each byte on your SD card. Once this is done, the next thing I recommend is to format the SD card and check for bad sectors while formatting. This can be done with something along the lines of:

mkfs.ext4 -cc /your/sd/device

This command will put a file system on the SD and do a read/write test while formatting it. Along with the dd command, this should bang on the physical surface of the card enough to find any initial bad sectors that could already be there.

That is it for this installment. Now that my soapbox is over, next time we will talk about installing software and configuring the Pi to be a NAS.

Creating a NAS Using a Raspberry Pi 3 Part 1

Find yourself needing a lot of storage on your network and do not want to have something that requires a lot of attention? You can use a Raspberry Pi 3 and a hard drive enclosure to make a home or small office NAS to store files and keep them available.

Recently I found myself wanting to upgrade my home network attached storage (NAS). It was basically an old laptop connected to a USB enclosure. It was stable, but had a few flaws. The first was that I had to periodically clean out the laptop since it seemed to be a magnet for dust. The second was that the hard drive was a single drive that I kept backed up, but it did not provide availability in case that drive should fail. Plus, the laptop drew more power than it needed to since it spent a lot of time just sitting around.

What I wanted was something that used a small amount of power and was expandable. I wanted to be able to easily add more drives in as our family storage needs grew. My ideal solution would not need much fuss or maintenance. Most importantly, since I am cheap, the hardware solution needed to be low cost.

I had been waiting to pick up a Raspberry Pi 3 until I had a project that actually needed one. If you search Google, you will find a lot of people using it for a file server. Owners also complain because not only does it just come with USB 2.0 ports, but the built-in Ethernet also shares bandwidth with the USB devices. However, I was not looking for something high-bandwidth as all the video I transfer over the network is already compressed. My home network is limited to 100 megabit Ethernet, so at best I only get around eight to nine megabytes a second transfer speed anyway. I ordered the Vilros Raspberry Pi 3 Basic Starter Kit off Amazon and picked up a 64 gigabyte class-10 SD card for the root file system.

Once I settled on the Pi 3, I needed an external USB enclosure. I ended up picking the Mediasonic ProBox HF2-SU3S2 four-bay enclosure. It has a standard-sized cooling fan in the back and controls on the front for things like setting the fan speed. I had two four-terabyte drives ready for it and would still have room to expand.

While waiting for everything to arrive, I had to decide how to set the NAS up. There are a lot of options out there, ranging from FreeNAS to rolling your own Linux distribution. You will see a huge amount of discussion about what type of file system to use. Here it seems to be split into two camps: the “ZFS for everything” camp, and everything else (XFS, EXT4, and so on). ZFS is a great file system and I have used it for other things, but for a home NAS (especially one running off a Pi) it can be a bit of overkill. This is especially true since many of ZFS’s best features require a lot of RAM.

What I decided on is good old Linux software RAID 1 and the XFS file system. I have had a lot of success with software RAID over the years, and for my purposes it has been more flexible than a hardware RAID system. Linux software RAID can do things like convert in-place from a RAID-1 setup to RAID-5 with no data loss. This way, if I needed to expand past the four-terabyte RAID-1 setup, I could add another drive and convert it to RAID-5. Linux software RAID-5 will let you grow it by adding hard drives, so in the end I could have a twelve-terabyte RAID-5 system if I needed one. I already back up the data with integrity checks, so instead of ZFS, XFS would satisfy all of my needs.

Distribution-wise, I decided it would be either Raspbian or Ubuntu MATE. Both are Debian-based, and both are solid operating systems for the Pi. Raspbian is the official distribution from the Raspberry Pi Foundation, and MATE is a distribution for the Pi built by Martin Wimpress and Rohith Madhavan. As I had never used a Pi before, I had no idea how either would work on the Pi 3. I used to use MATE for my desktop so I was at least familiar with it and knew it did not need a lot of resources to run.

In Part 2 I will go over setting up the Pi 3 to serve out data. As a spoiler, I will also go into why I chose Ubuntu for my Pi 3 in the end over Raspbian.

Filtering Data from a Geospatial Database using QGIS

If you have spatial databases such as the ones I set up in my previous blog posts about GNIS and PostGIS, you will likely want to add a few things to them to make them more useful. GNIS and Geonames contain point types of all different classes, from airports to populated places. What if you were only interested in one type of point, such as airports? By default, if you load GNIS data into QGIS, it will display all of the points in your view and look cluttered as the screen shot below demonstrates.

All GNIS Points Over an Area

The good news is that you can easily tell QGIS to show only what you want.  There are a couple of ways that you can filter data out of a layer in QGIS: the Set Filter button and creating a database view. The Set Filter button lets you create a SQL filter by clicking on the field you want to filter on, the relational operator, and the value you want to compare against. A database view lets you pre-define your filter and presents it as another table. Whichever one you use is up to you, but there is at least one thing you should do to speed up both methods.

The Set Filter Method

Assuming you followed my previous posts about setting up GNIS, we will use that for this example. First you need to create an index on the feature_class column of GNIS. This will make the query that we will use as an example run much faster. To do this, run the following commands:

psql -d USGS
USGS=# create index gnis_feature_class_idx on gnis(feature_class);

Once this is done, you will have a new index called gnis_feature_class_idx. This allows PostgreSQL to find matching feature classes more quickly by consulting the index instead of scanning every row in the table.

Now that this is done, we will next move on to our first example, the Set Filter button method. As a refresher, here are the feature classes in GNIS:

USGS=# select distinct(feature_class), count(*) from gnis group by feature_class order by feature_class;
feature_class   | count
-----------------+--------
Airport         | 23202
Arch            | 720
Area            | 2557
Arroyo          | 466
Bar             | 5870
Basin           | 4304
Bay             | 14094
Beach           | 2409
Bench           | 724
Bend            | 2797
Bridge          | 7356
Building        | 160291
Canal           | 21559
Cape            | 16417
Cemetery        | 145544
Census          | 11629
Channel         | 4014
Church          | 231967
Civil           | 64237
Cliff           | 4479
Crater          | 246
Crossing        | 13167
Dam             | 56931
Falls           | 2499
Flat            | 10559
Forest          | 1314
Gap             | 8246
Glacier         | 1021
Gut             | 3541
Harbor          | 1271
Hospital        | 15864
Island          | 20540
Isthmus         | 28
Lake            | 69403
Lava            | 168
Levee           | 546
Locale          | 162518
Military        | 2860
Mine            | 36133
Oilfield        | 4863
Park            | 69501
Pillar          | 2092
Plain           | 289
Populated Place | 201065
Post Office     | 66942
Range           | 2480
Rapids          | 1062
Reserve         | 1276
Reservoir       | 74683
Ridge           | 15127
School          | 216473
Sea             | 28
Slope           | 373
Spring          | 38655
Stream          | 231462
Summit          | 70614
Swamp           | 7608
Tower           | 16800
Trail           | 11047
Tunnel          | 750
Unknown         | 186
Valley          | 70239
Well            | 38797
Woods           | 684
(64 rows)

Both of the examples here will work with the feature class of Airports. These examples also assume you already have some data set up as I previously demonstrated on this blog.

For the Set Filter method, first click on the Add PostGIS Layer button in QGIS. Select the USGS database and select the gnis table. Once you have done this, click on the Set Filter button at the bottom right side of the Add Layer dialog.

Creating a Filter in QGIS

As you can see above, you are presented with a list of Fields on the left side, operator buttons in the middle, and Values on the right side. Click on the Feature Class field to select it and then click the All button under the values window to the right. Since we created an index on the Feature Class field, this should quickly show you all the unique values that exist in the database for that field. Now double click Feature Class to add it into the Provider specific filter expression in the text box at the bottom of the dialog. Then click the = button in the Operators group. Now double click Airport from the Values box to add it. Your filter expression should now look like this:

"feature_class" = 'Airport'

If you click the Test button, QGIS will perform a query and display the number of rows that match your filter. You can use this to double-check that you did not make any errors during entry. In our case, the query should return around 23,000+ rows depending on the version of GNIS you are using. Click the OK button to go back to the Layers dialog and then the Add button to add it to your project.  With the filter in place, your screen should look much less cluttered as it is only showing airports from GNIS.

Only Airports Displayed in GNIS

You can use this method to filter out data on any type of field in a geospatial database.  I recommend, though, that you first create an index on that column to speed up the operation.  Otherwise, you may have to wait a while every time you try to load your filtered data.

Creating a Database View

The second method to filter data is to create a database view. Basically all database systems can create views. For the non-database savvy, a view can be thought of as a virtual table that is defined by a database query. This means that whenever you access the view, the data that is returned is generated by that query. For example, if you wanted a table of only the airports in GNIS, you could make a view that pretends to be another table but does not take up the space a real table would.

For this example, we will again use airports. Once you understand this, you can then create views for other classes by replacing the feature class name. However, when working with tools such as QGIS, there is a caveat that you first need to know about. If you are savvy with databases, you might create the view with the following command:

psql -d USGS
USGS=# create view view_airports as select * from gnis where feature_class = 'Airport';

When you then go to load this into QGIS, you will indeed see the view as a layer, but there will be a problem.

How a View Appears in QGIS

As you can see, QGIS will not let you just click on the view to add it. If you hover over the error triangle, you will see it displays a message of Select columns in the ‘Feature Id’ column that uniquely identify features of this layer. If you scroll to the right, you will see that QGIS will let you select a column in the view that is a unique identifier (feature_id in the case of GNIS).

Why does QGIS not automatically know which column to use? If you are not well versed in how QGIS and databases work, tables in a database typically need a unique identifier for each entry so that rows can be properly found. With recent versions of PostgreSQL and PostGIS, a view does not present a unique key of its own. If QGIS tried to automatically deduce which field to use as the unique key, it would take a lot of processing power and would mean that QGIS would temporarily “hang” whenever you tried to access a database. Instead, QGIS gives you the option to tell it which field to use as the unique identifier for each row.

If you go ahead and select the feature_id field in the Add Layer dialog, you will then be able to select the layer and click Add to load it into QGIS.

Select Feature ID Option in QGIS

So the question you might have is “Which method is better?” The correct answer is “Whichever method makes more sense to you.” Some people may be OK with setting a filter when they load in data. Others may prefer to have views show up in the Layers dialog to remind them what all is available. A PostgreSQL materialized view would likely be the fastest method as it creates a cache of the data, but that is a bit beyond the scope of this post 🙂

Have fun and happy GISing with all Open Source software!

“Fixed” GNIS Data

Since I’ve been messing around with some data in my spare time, I realized the USGS had put out new GNIS data and I tried to import it into my personal PostGIS database.  However, I found out that the NationalFile_20161001.zip file they posted has a LOT of errors where it does not even meet their own data specifications.  I’ve uploaded my fixed file here and am copying the issues I reported to them below.  Basically, for duplicated keys, I removed the entry that was in Mexico or Canada and kept the US one.

Here’s the list of stuff I reported and fixed in mine:

Hey guys, found a few things with the file at http://geonames.usgs.gov/docs/stategaz/NationalFile_20161001.zip that don’t look like they match up with the file format at: http://geonames.usgs.gov/domestic/download_data.htm

Some of the entries have a three character state alpha code while the format entry says it should be two characters.  These are ID 45605, 45606, 45608, and 45610.  They have the entry of SON which is the Sonora region in Mexico.  The primary coordinates are indeed in Mexico while the source coordinates are all in Arizona. 

This also looks to cause some duplicate key problems.  There are two lines with feature id 45605 in the file:
45605|Parker Canyon|Valley|AZ|04|||311900N|1103602W|31.3167684|-110.6006372|312750N|1102532W|31.4639862|-110.4256371|1399|4590|Lochiel|02/08/1980|12/10/2010
45605|Parker Canyon|Valley|SON|26|||311900N|1103602W|31.3167684|-110.6006372|312750N|1102532W|31.4639862|-110.4256371|1399|4590|Lochiel|02/08/1980|12/10/2010

45606 also has duplicate entries in the file:
45606|San Antonio Canyon|Valley|AZ|04|||311910N|1103732W|31.3195459|-110.6256374|312211N|1104334W|31.3697222|-110.7261111|1421|4662|Duquesne|02/08/1980|12/10/2010^M
45606|San Antonio Canyon|Valley|SON|26|||311910N|1103732W|31.3195459|-110.6256374|312211N|1104334W|31.3697222|-110.7261111|1421|4662|Duquesne|02/08/1980|12/10/2010^M

and 45608:
45608|Silver Creek|Stream|AZ|04|||311900N|1091632W|31.3167713|-109.2756155|313157N|1092403W|31.5325979|-109.4008983|1135|3724|San Bernardino Ranch|02/08/1980|12/10/2010^M
45608|Silver Creek|Stream|SON|26|||311900N|1091632W|31.3167713|-109.2756155|313157N|1092403W|31.5325979|-109.4008983|1135|3724|San Bernardino Ranch|02/08/1980|12/10/2010^M

and 45610:
45610|Sycamore Canyon|Valley|AZ|04|||311600N|1112302W|31.2667647|-111.3839874|312627N|1110832W|31.4408333|-111.1422222|1006|3300|Unknown|02/08/1980|12/10/2010
45610|Sycamore Canyon|Valley|SON|26|||311600N|1112302W|31.2667647|-111.3839874|312627N|1110832W|31.4408333|-111.1422222|1006|3300|Unknown|02/08/1980|12/10/2010

Also found other duplicate feature id’s that contain the same ID in the US and Canada:
567773|Hovey Hill|Summit|ME|23|||460650N|0674629W|46.11397|-67.77468|||||252|827|Houlton South|08/27/2002|04/29/2011
567773|Hovey Hill|Summit|NB|04|||460650N|0674629W|46.11397|-67.77468|||||252|827|Houlton South|08/27/2002|04/29/2011

581558|Saint John River|Stream|ME|23|||451501N|0660258W|45.2503524|-66.0493904|463347N|0695305W|46.5630872|-69.8847913|0|0|Unknown|09/30/1980|11/22/2010^M
581558|Saint John River|Stream|NB|04|||451501N|0660258W|45.2503524|-66.0493904|463347N|0695305W|46.5630872|-69.8847913|0|0|Unknown|09/30/1980|11/22/2010^M

768593|Bear Gulch|Valley|AB|01|||490900N|1111303W|49.1500183|-111.217465|485224N|1110900W|48.8733364|-111.1499739|881|2890|Hawley Hill|04/04/1980|03/29/2011^M
768593|Bear Gulch|Valley|MT|30|||490900N|1111303W|49.1500183|-111.217465|485224N|1110900W|48.8733364|-111.1499739|881|2890|Hawley Hill|04/04/1980|03/29/2011^M

774267|Miners Coulee|Valley|AB|01|||490600N|1112303W|49.1000165|-111.3841398|484405N|1113008W|48.734721|-111.5022226|906|2972|Johannson Coulee|04/04/1980|03/29/2011^M
774267|Miners Coulee|Valley|MT|30|||490600N|1112303W|49.1000165|-111.3841398|484405N|1113008W|48.734721|-111.5022226|906|2972|Johannson Coulee|04/04/1980|03/29/2011^M

774784|North Fork Milk River|Stream|AB|01|||490814N|1122233W|49.1373|-112.37589|485411N|1131903W|48.90298|-113.31749|1083|3553|Unknown|04/04/1980|12/14/2010^M
774784|North Fork Milk River|Stream|MT|30|||490814N|1122233W|49.1373|-112.37589|485411N|1131903W|48.90298|-113.31749|1083|3553|Unknown|04/04/1980|12/14/2010^M

775339|Police Creek|Stream|AB|01|||490753N|1110005W|49.13141|-111.00148|485818N|1110859W|48.9716762|-111.1496898|862|2828|Unknown|04/04/1980|12/14/2010^M
775339|Police Creek|Stream|MT|30|||490753N|1110005W|49.13141|-111.00148|485818N|1110859W|48.9716762|-111.1496898|862|2828|Unknown|04/04/1980|12/14/2010^M

776125|Saint Mary River|Stream|AB|01|||493738N|1125313W|49.62728|-112.88701|483713N|1134437W|48.6202|-113.74362|835|2739|Unknown|04/04/1980|12/14/2010^M
776125|Saint Mary River|Stream|MT|30|||493738N|1125313W|49.62728|-112.88701|483713N|1134437W|48.6202|-113.74362|835|2739|Unknown|04/04/1980|12/14/2010^M

778142|Waterton River|Stream|AB|01|||493146N|1131616W|49.52941|-113.27119|484947N|1135939W|48.8296967|-113.9942925|960|3150|Unknown|04/04/1980|12/14/2010^M
778142|Waterton River|Stream|MT|30|||493146N|1131616W|49.52941|-113.27119|484947N|1135939W|48.8296967|-113.9942925|960|3150|Unknown|04/04/1980|12/14/2010^M

778545|Willow Creek|Stream|AB|01|||490929N|1131056W|49.15802|-113.18235|485705N|1131913W|48.95133|-113.32035|1147|3763|Unknown|04/04/1980|12/14/2010^M
778545|Willow Creek|Stream|MT|30|||490929N|1131056W|49.15802|-113.18235|485705N|1131913W|48.95133|-113.32035|1147|3763|Unknown|04/04/1980|12/14/2010^M

798995|Lee Creek|Stream|AB|01|||491326N|1131559W|49.22393|-113.26636|485500N|1133812W|48.9166504|-113.6367702|1110|3642|Unknown|04/04/1980|12/14/2010^M
798995|Lee Creek|Stream|MT|30|||491326N|1131559W|49.22393|-113.26636|485500N|1133812W|48.9166504|-113.6367702|1110|3642|Unknown|04/04/1980|12/14/2010^M

790166|Screw Creek|Stream|BC|02|||490026N|1154647W|49.00719|-115.77985|485757N|1154558W|48.96571|-115.7662|1147|3763|Garver Mountain OE N|04/04/1980|12/14/2010^M
790166|Screw Creek|Stream|MT|30|||490026N|1154647W|49.00719|-115.77985|485757N|1154558W|48.96571|-115.7662|1147|3763|Garver Mountain OE N|04/04/1980|12/14/2010^M

793276|Wigwam River|Stream|BC|02|||491437N|1150546W|49.24355|-115.09616|485754N|1145120W|48.96509|-114.8556|800|2625|Unknown|04/04/1980|12/14/2010^M
793276|Wigwam River|Stream|MT|30|||491437N|1150546W|49.24355|-115.09616|485754N|1145120W|48.96509|-114.8556|800|2625|Unknown|04/04/1980|12/14/2010^M

1504446|Depot Creek|Stream|BC|02|||490146N|1212408W|49.02937|-121.40227|485752N|1211553W|48.96439|-121.2646|622|2041|Copper Mountain OE N|01/01/2000|12/10/2010^M
1504446|Depot Creek|Stream|WA|53|||490146N|1212408W|49.02937|-121.40227|485752N|1211553W|48.96439|-121.2646|622|2041|Copper Mountain OE N|01/01/2000|12/10/2010^M

1515954|Arnold Slough|Stream|BC|02|||490141N|1221115W|49.02799|-122.18741|485857N|1221336W|48.98253|-122.22663|11|36|Kendall OE N|01/01/2000|12/10/2010^M
1515954|Arnold Slough|Stream|WA|53|||490141N|1221115W|49.02799|-122.18741|485857N|1221336W|48.98253|-122.22663|11|36|Kendall OE N|01/01/2000|12/10/2010^M

1515973|Ashnola River|Stream|BC|02|||491330N|1195824W|49.22511|-119.97336|485341N|1201451W|48.89467|-120.24751|445|1460|Unknown|01/01/2000|12/10/2010^M
1515973|Ashnola River|Stream|WA|53|||491330N|1195824W|49.22511|-119.97336|485341N|1201451W|48.89467|-120.24751|445|1460|Unknown|01/01/2000|12/10/2010^M

1516047|Baker Creek|Stream|BC|02|||490249N|1190648W|49.04681|-119.1133|485811N|1191213W|48.9696255|-119.203658|812|2664|Chesaw OE N|01/01/2000|12/10/2010^M
1516047|Baker Creek|Stream|WA|53|||490249N|1190648W|49.04681|-119.1133|485811N|1191213W|48.9696255|-119.203658|812|2664|Chesaw OE N|01/01/2000|12/10/2010^M

1517465|Castle Creek|Stream|BC|02|||490321N|1204355W|49.05587|-120.73202|485823N|1205225W|48.97303|-120.87356|1138|3734|Frosty Creek OE N|01/01/2000|12/10/2010^M
1517465|Castle Creek|Stream|WA|53|||490321N|1204355W|49.05587|-120.73202|485823N|1205225W|48.97303|-120.87356|1138|3734|Frosty Creek OE N|01/01/2000|12/10/2010^M

1517496|Cathedral Fork|Stream|BC|02|||490243N|1201754W|49.04524|-120.29836|485913N|1201251W|48.98699|-120.21427|1511|4957|Ashnola Pass OE N|01/01/2000|12/10/2010^M
1517496|Cathedral Fork|Stream|WA|53|||490243N|1201754W|49.04524|-120.29836|485913N|1201251W|48.98699|-120.21427|1511|4957|Ashnola Pass OE N|01/01/2000|12/10/2010^M

1517707|Chilliwack River|Stream|BC|02|||490545N|1215745W|49.09579|-121.96258|485303N|1213142W|48.8842929|-121.5284712|35|115|Glacier OE N|09/10/1979|12/09/2010^M
1517707|Chilliwack River|Stream|WA|53|||490545N|1215745W|49.09579|-121.96258|485303N|1213142W|48.8842929|-121.5284712|35|115|Glacier OE N|09/10/1979|12/09/2010^M

1517762|Chuchuwanteen Creek|Stream|BC|02|||490324N|1204346W|49.05664|-120.72953|485403N|1204433W|48.9008333|-120.7425|1133|3717|Frosty Creek OE N|01/01/2000|12/10/2010^M
1517762|Chuchuwanteen Creek|Stream|WA|53|||490324N|1204346W|49.05664|-120.72953|485403N|1204433W|48.9008333|-120.7425|1133|3717|Frosty Creek OE N|01/01/2000|12/10/2010^M

1519414|Ewart Creek|Stream|BC|02|||490803N|1200213W|49.13426|-120.03686|485943N|1200951W|48.9954|-120.16423|738|2421|Unknown|01/01/2000|12/10/2010^M
1519414|Ewart Creek|Stream|WA|53|||490803N|1200213W|49.13426|-120.03686|485943N|1200951W|48.9954|-120.16423|738|2421|Unknown|01/01/2000|12/10/2010^M

1520446|Haig Creek|Stream|BC|02|||490110N|1200226W|49.01941|-120.0405|485828N|1200319W|48.97443|-120.05531|1485|4872|Unknown|01/01/2000|12/10/2010^M
1520446|Haig Creek|Stream|WA|53|||490110N|1200226W|49.01941|-120.0405|485828N|1200319W|48.97443|-120.05531|1485|4872|Unknown|01/01/2000|12/10/2010^M

1520654|Heather Creek|Stream|BC|02|||490136N|1204246W|49.02678|-120.71267|485834N|1204447W|48.97616|-120.74644|1209|3966|Frosty Creek OE N|01/01/2000|12/10/2010^M
1520654|Heather Creek|Stream|WA|53|||490136N|1204246W|49.02678|-120.71267|485834N|1204447W|48.97616|-120.74644|1209|3966|Frosty Creek OE N|01/01/2000|12/10/2010^M

1521214|International Creek|Stream|BC|02|||490001N|1210524W|49.0004096|-121.0901283|485938N|1210845W|48.9940199|-121.1459632|487|1598|Hozomeen Mountain OE N|01/01/2000|12/09/2010^M
1521214|International Creek|Stream|WA|53|||490001N|1210524W|49.0004096|-121.0901283|485938N|1210845W|48.9940199|-121.1459632|487|1598|Hozomeen Mountain OE N|01/01/2000|12/09/2010^M

1523541|Myers Creek|Stream|BC|02|||490045N|1185120W|49.01263|-118.85546|484726N|1190614W|48.79052|-119.10394|576|1890|Toroda OE N|01/01/2000|12/10/2010^M
1523541|Myers Creek|Stream|WA|53|||490045N|1185120W|49.01263|-118.85546|484726N|1190614W|48.79052|-119.10394|576|1890|Toroda OE N|01/01/2000|12/10/2010^M

1523731|North Creek|Stream|BC|02|||485956N|1182748W|48.99892|-118.4633|485852N|1182601W|48.98117|-118.43373|543|1781|Boundary Mountain|01/01/2000|12/10/2010^M
1523731|North Creek|Stream|WA|53|||485956N|1182748W|48.99892|-118.4633|485852N|1182601W|48.98117|-118.43373|543|1781|Boundary Mountain|01/01/2000|12/10/2010^M

1524131|Pack Creek|Stream|BC|02|||490028N|1181818W|49.00784|-118.30507|485810N|1181743W|48.96957|-118.29533|494|1621|Independent Mountain OE N|01/01/2000|12/10/2010^M
1524131|Pack Creek|Stream|WA|53|||490028N|1181818W|49.00784|-118.30507|485810N|1181743W|48.96957|-118.29533|494|1621|Independent Mountain OE N|01/01/2000|12/10/2010^M

1524235|Pass Creek|Stream|BC|02|||490209N|1205337W|49.0357|-120.89373|485913N|1205146W|48.98682|-120.86274|1238|4062|Skagit Peak OE N|09/10/1979|12/10/2010^M
1524235|Pass Creek|Stream|WA|53|||490209N|1205337W|49.0357|-120.89373|485913N|1205146W|48.98682|-120.86274|1238|4062|Skagit Peak OE N|09/10/1979|12/10/2010^M

1524303|Peeve Creek|Stream|BC|02|||490125N|1203251W|49.02359|-120.54744|485807N|1202303W|48.96853|-120.38415|1156|3793|Tatoosh Buttes OE N|01/01/2000|12/10/2010^M
1524303|Peeve Creek|Stream|WA|53|||490125N|1203251W|49.02359|-120.54744|485807N|1202303W|48.96853|-120.38415|1156|3793|Tatoosh Buttes OE N|01/01/2000|12/10/2010^M

1525297|Russian Creek|Stream|BC|02|||490046N|1172208W|49.01281|-117.369|485847N|1172613W|48.97977|-117.43687|536|1759|Boundary Dam OE N|01/01/2000|12/10/2010^M
1525297|Russian Creek|Stream|WA|53|||490046N|1172208W|49.01281|-117.369|485847N|1172613W|48.97977|-117.43687|536|1759|Boundary Dam OE N|01/01/2000|12/10/2010^M

1525320|Saar Creek|Stream|BC|02|||490246N|1221105W|49.04608|-122.18477|485512N|1221120W|48.92009|-122.1888|7|23|Kendall OE N|01/01/2000|12/10/2010^M
1525320|Saar Creek|Stream|WA|53|||490246N|1221105W|49.04608|-122.18477|485512N|1221120W|48.92009|-122.1888|7|23|Kendall OE N|01/01/2000|12/10/2010^M

1527272|Togo Creek|Stream|BC|02|||490017N|1182431W|49.0046|-118.40865|485844N|1182452W|48.97889|-118.41434|578|1896|Boundary Mountain OE N|01/01/2000|12/10/2010^M
1527272|Togo Creek|Stream|WA|53|||490017N|1182431W|49.0046|-118.40865|485844N|1182452W|48.97889|-118.41434|578|1896|Boundary Mountain OE N|01/01/2000|12/10/2010^M

1529904|McCoy Creek|Stream|BC|02|||490217N|1190745W|49.03804|-119.12922|485945N|1190846W|48.9959|-119.14608|910|2986|Molson OE N|01/01/1992|12/10/2010^M
1529904|McCoy Creek|Stream|WA|53|||490217N|1190745W|49.03804|-119.12922|485945N|1190846W|48.9959|-119.14608|910|2986|Molson OE N|01/01/1992|12/10/2010^M

1529905|Liumchen Creek|Stream|BC|02|||490444N|1215518W|49.07897|-121.92163|485913N|1215555W|48.98695|-121.93198|55|180|Glacier OE N|01/01/1992|12/09/2010^M
1529905|Liumchen Creek|Stream|WA|53|||490444N|1215518W|49.07897|-121.92163|485913N|1215555W|48.98695|-121.93198|55|180|Glacier OE N|01/01/1992|12/09/2010^

942345|Allen Brook|Stream|NY|36|||450501N|0734545W|45.08349|-73.76257|445923N|0734736W|44.98972|-73.79339|58|190|Ellenburg Depot OE N|01/01/2000|12/13/2010^M
942345|Allen Brook|Stream|QC|10|||450501N|0734545W|45.08349|-73.76257|445923N|0734736W|44.98972|-73.79339|58|190|Ellenburg Depot OE N|01/01/2000|12/13/2010^M

949668|English River|Stream|NY|36|||451251N|0734950W|45.21405|-73.83051|445738N|0735325W|44.9605971|-73.8901522|40|131|Unknown|01/23/1980|12/13/2010^M
949668|English River|Stream|QC|10|||451251N|0734950W|45.21405|-73.83051|445738N|0735325W|44.9605971|-73.8901522|40|131|Unknown|01/23/1980|12/13/2010^M

959094|Oak Creek|Stream|NY|36|||450306N|0741123W|45.0517|-74.18964|445803N|0741007W|44.96759|-74.16862|47|154|Unknown|01/01/2000|12/13/2010^M
959094|Oak Creek|Stream|QC|10|||450306N|0741123W|45.0517|-74.18964|445803N|0741007W|44.96759|-74.16862|47|154|Unknown|01/01/2000|12/13/2010^M

967898|Trout River|Stream|NY|36|||450426N|0741104W|45.07379|-74.18458|444757N|0741038W|44.79916|-74.17713|44|144|Unknown|01/23/1980|12/13/2010^M
967898|Trout River|Stream|QC|10|||450426N|0741104W|45.07379|-74.18458|444757N|0741038W|44.79916|-74.17713|44|144|Unknown|01/23/1980|12/13/2010^M

975764|Richelieu River|Stream|QC|10|||460254N|0730712W|46.04828|-73.11991|445848N|0732104W|44.9800394|-73.3512441|6|20|Unknown|05/01/1994|12/10/2010^M
975764|Richelieu River|Stream|VT|50|||460254N|0730712W|46.04828|-73.11991|445848N|0732104W|44.9800394|-73.3512441|6|20|Unknown|05/01/1994|12/10/2010^M

1458184|Leavit Brook|Stream|QC|10|||450224N|0723117W|45.0401|-72.52146|445939N|0723020W|44.99411|-72.50552|152|499|Jay Peak OE N|10/29/1980|12/10/2010^M
1458184|Leavit Brook|Stream|VT|50|||450224N|0723117W|45.0401|-72.52146|445939N|0723020W|44.99411|-72.50552|152|499|Jay Peak OE N|10/29/1980|12/10/2010^M

1458967|Pike River|Stream|QC|10|||450420N|0730546W|45.07219|-73.09608|450126N|0724400W|45.02383|-72.73335|31|102|Highgate Center OE N|01/01/2000|12/10/2010^M
1458967|Pike River|Stream|VT|50|||450420N|0730546W|45.07219|-73.09608|450126N|0724400W|45.02383|-72.73335|31|102|Highgate Center OE N|01/01/2000|12/10/2010^M

1028583|Cypress Creek|Stream|MB|03|||491224N|0990446W|49.2066713|-99.0795409|485328N|0985320W|48.8911174|-98.8890169|408|1339|Unknown|02/13/1980|12/07/2010^M
1028583|Cypress Creek|Stream|ND|38|||491224N|0990446W|49.2066713|-99.0795409|485328N|0985320W|48.8911174|-98.8890169|408|1339|Unknown|02/13/1980|12/07/2010^M

1035871|Mowbray Creek|Stream|MB|03|||490315N|0982829W|49.0541692|-98.4748273|485846N|0982958W|48.9794471|-98.4995594|363|1191|Mount Carmel OE N|01/01/2000|12/10/2010^M
1035871|Mowbray Creek|Stream|ND|38|||490315N|0982829W|49.0541692|-98.4748273|485846N|0982958W|48.9794471|-98.4995594|363|1191|Mount Carmel OE N|01/01/2000|12/10/2010^M

1035887|Gimby Creek|Stream|MB|03|||490530N|0991916W|49.0916735|-99.3212312|485810N|0994454W|48.969583|-99.74822|458|1503|Saint John|01/01/2000|12/10/2010^M
1035887|Gimby Creek|Stream|ND|38|||490530N|0991916W|49.0916735|-99.3212312|485810N|0994454W|48.969583|-99.74822|458|1503|Saint John|01/01/2000|12/10/2010^M

1035890|Red River of the North|Stream|MB|03|||502401N|0964800W|50.4002778|-96.8|461552N|0963555W|46.2644033|-96.5986848|218|715|Unknown|01/01/2000|12/10/2010^M
1035890|Red River of the North|Stream|ND|38|||502401N|0964800W|50.4002778|-96.8|461552N|0963555W|46.2644033|-96.5986848|218|715|Unknown|01/01/2000|12/10/2010^M

1035895|Wakopa Creek|Stream|MB|03|||490110N|0995331W|49.0194503|-99.892073|485806N|0995046W|48.9683394|-99.8462455|605|1985|Carpenter Lake|01/01/2000|12/10/2010^M
1035895|Wakopa Creek|Stream|ND|38|||490110N|0995331W|49.0194503|-99.892073|485806N|0995046W|48.9683394|-99.8462455|605|1985|Carpenter Lake|01/01/2000|12/10/2010^M

1930555|Red River Valley of the North|Valley|MB|03|||502400N|0964800W|50.4|-96.8|485228N|0971042W|48.8744306|-97.1783987|218|715|Unknown|08/06/2001|04/14/2011^M
1930555|Red River Valley of the North|Valley|ND|38|||502400N|0964800W|50.4|-96.8|485228N|0971042W|48.8744306|-97.1783987|218|715|Unknown|08/06/2001|04/14/2011^M

1035882|East Branch Short Creek|Stream|ND|38|||490130N|1025044W|49.0250311|-102.8454552|484543N|1023028W|48.7619785|-102.5076725|552|1811|Unknown|01/01/2000|12/10/2010^M
1035882|East Branch Short Creek|Stream|SK|11|||490130N|1025044W|49.0250311|-102.8454552|484543N|1023028W|48.7619785|-102.5076725|552|1811|Unknown|01/01/2000|12/10/2010^M

1782010|Manitoulin Basin|Basin|MI|26|||450000N|0822000W|45.0000192|-82.3332616|||||176|577|Unknown|02/23/1998|12/09/2010^M
1782010|Manitoulin Basin|Basin|ON|08|||450000N|0822000W|45.0000192|-82.3332616|||||176|577|Unknown|02/23/1998|12/09/2010^M

Just wanted to let someone know since I ran into some problems trying to import them into a db.  Thanks!