Creating a NAS Using a Raspberry Pi 3 Part 1

Find yourself needing a lot of storage on your network and do not want to have something that requires a lot of attention? You can use a Raspberry Pi 3 and a hard drive enclosure to make a home or small office NAS to store files and keep them available.

Recently I found myself wanting to upgrade my home network attached storage (NAS). It was basically an old laptop connected to a USB enclosure. It was stable, but had a few flaws. The first was that I had to periodically clean out the laptop since it seemed to be a magnet for dust. The second was that the hard drive was a single drive that I kept backed up, but it did not provide availability in case that drive should fail. Plus, the laptop drew more power than it needed to since it spent a lot of time just sitting around.

What I wanted was something that used a small amount of power and was expandable. I wanted to be able to easily add more drives in as our family storage needs grew. My ideal solution would not need much fuss or maintenance. Most importantly, since I am cheap, the hardware solution needed to be low cost.

I had been waiting to pick up a Raspberry Pi 3 until I had a project that actually needed one. If you search Google, you will find a lot of people using it as a file server. Owners also complain that it only comes with USB 2.0 ports and that the built-in Ethernet shares bandwidth with the USB devices. However, I was not looking for something high-bandwidth, as all the video I transfer over the network is already compressed. My home network is limited to 100 megabit Ethernet, so at best I only get around eight to nine megabytes a second transfer speed anyway. I ordered the Vilros Raspberry Pi 3 Basic Starter Kit off Amazon and picked up a 64 gigabyte class-10 SD card for the root file system.
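As a quick sanity check on those numbers, here is a small sketch of the arithmetic. It only converts the 100 megabit link speed into megabytes per second and compares it with the eight to nine megabytes a second mentioned above; the function name is just something made up for this illustration.

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(megabits_per_second):
    """Convert a link speed from megabits/s to megabytes/s."""
    return megabits_per_second / BITS_PER_BYTE


# Theoretical ceiling of a 100 megabit link, ignoring all protocol overhead.
ceiling = megabits_to_megabytes(100)
print(f"Theoretical ceiling: {ceiling} MB/s")

# Compare the observed transfer speeds with that ceiling.
for observed in (8, 9):
    print(f"{observed} MB/s is {observed / ceiling:.0%} of the ceiling")
```

In other words, the network itself tops out well below what the USB 2.0 ports can move, which is why the shared bus did not worry me.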

Once I settled on the Pi 3, I needed an external USB enclosure. I ended up picking the Mediasonic ProBox HF2-SU3S2 four-bay enclosure. It has a standard-sized cooling fan in the back and controls on the front for things like setting the fan speed and so on. I had two four terabyte drives ready for it and would still have room to expand.

While waiting for everything to arrive, I had to decide how to set the NAS up. There are a lot of options out there, ranging from FreeNAS to rolling your own Linux distribution. You will see a huge amount of discussion about what types of file systems to use. Here it seems to be split into two camps: the “ZFS for everything” camp, and everything else (XFS, EXT4, and so on). ZFS is a great file system and I have used it for other things, but for a home NAS (especially one running off a Pi), it can be a bit of overkill. This is especially true since many of ZFS’s best features require a lot of RAM.

What I decided on is good old Linux software RAID-1 and the XFS file system. I have had a lot of success with software RAID over the years, and for my purposes it has been more flexible than a hardware RAID system. Linux software RAID can do things like convert in-place from a RAID-1 setup to RAID-5 with no data loss. This way, if I needed to expand past the four terabyte RAID-1 system, I could add another drive and convert it to RAID-5. Linux software RAID-5 will let you grow it by adding hard drives, so in the end I could have a twelve terabyte RAID-5 system if I needed one. I already back up the data with integrity checks, so instead of ZFS, XFS would satisfy all of my needs.
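To make the expansion math concrete, here is a minimal sketch of the usable capacity for the two RAID levels mentioned above. It assumes all drives are the same size, and the function name is only made up for this illustration; the result is in whatever unit the drive size is given in.

```python
def raid_usable_capacity(level, drive_count, drive_size):
    """Usable capacity for equal-sized drives, in the same unit as drive_size."""
    if level == 1:
        if drive_count < 2:
            raise ValueError("RAID-1 needs at least two drives")
        # Every drive in the mirror holds a full copy of the data.
        return drive_size
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID-5 needs at least three drives")
        # One drive's worth of space is used for parity.
        return (drive_count - 1) * drive_size
    raise ValueError(f"unsupported RAID level: {level}")


print(raid_usable_capacity(1, 2, 4))  # the starting two-drive mirror
print(raid_usable_capacity(5, 3, 4))  # after adding a third drive and converting
print(raid_usable_capacity(5, 4, 4))  # with all four bays filled
```

The last line is where the figure of twelve for a full four-bay RAID-5 comes from.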

Distribution-wise, I decided it would be either Raspbian or Ubuntu MATE. Both are Debian-based, and both are solid operating systems for the Pi. Raspbian is the official distribution from the Raspberry Pi Foundation, and MATE is a distribution for the Pi built by Martin Wimpress and Rohith Madhavan. As I had never used a Pi before, I had no idea how either would work on the Pi 3. I used to use MATE for my desktop so I was at least familiar with it and knew it did not need a lot of resources to run.

In Part 2 I will go over setting up the Pi 3 to serve out data. As a spoiler, I will also go into why I chose Ubuntu for my Pi 3 in the end over Raspbian.


Filtering Data from a Geospatial Database using QGIS

If you have spatial databases such as the ones I set up in my previous blog posts about GNIS and PostGIS, you will likely want to add a few things to them to make them more useful. GNIS and Geonames contain point types of all different classes, from airports to populated places. What if you were only interested in one type of point, such as airports? By default, if you load GNIS data into QGIS, it will display all of the points in your view and look cluttered, as the screenshot below demonstrates.

All GNIS Points Over an Area

The good news is that you can easily specify exactly what you want QGIS to show you. There are a couple of ways to filter data in a layer with QGIS: the Set Filter button and creating a database view. The Set Filter button lets you create a SQL filter by clicking on the field you want to filter with, the relational operator, and the value you want to compare with. A database view lets you pre-define your filter and presents it as another table. Whichever one you use is up to you, but there is at least one thing you must do to speed up both methods.

The Set Filter Method

Assuming you followed my previous posts about setting up GNIS, we will use that for this example. First you need to create an index on the feature_class column of GNIS. This will make the query that we will use as an example run much faster. To do this, run the following commands:

psql -d USGS
USGS=# create index gnis_feature_class_idx on gnis(feature_class);

Once this is done, you will have a new index called gnis_feature_class_idx. This allows PostgreSQL to find the matching feature classes more quickly by consulting the index instead of scanning every row in the table.
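If you want to see the effect of an index without touching your real database, here is a small self-contained sketch. It is only an illustration under some assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, a cut-down gnis table with three columns, and made-up rows. The create index statement is the same one as above, but the query plan text is SQLite's, not PostgreSQL's.

```python
import sqlite3

# In-memory SQLite database standing in for the USGS PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table gnis ("
    "feature_id integer primary key, feature_name text, feature_class text)"
)

# Made-up rows for illustration only; every 50th row is an airport.
rows = [
    (i, f"Feature {i}", "Airport" if i % 50 == 0 else "Stream")
    for i in range(1, 5001)
]
conn.executemany("insert into gnis values (?, ?, ?)", rows)

query = "select * from gnis where feature_class = 'Airport'"


def plan(sql):
    """Return SQLite's query plan for sql as a single string."""
    return " | ".join(row[3] for row in conn.execute("explain query plan " + sql))


before = plan(query)
conn.execute("create index gnis_feature_class_idx on gnis(feature_class)")
after = plan(query)

print("before:", before)
print("after: ", after)
```

Before the index exists, the plan is a scan of the whole table. Afterwards, the plan mentions gnis_feature_class_idx, which is the same kind of shortcut PostgreSQL takes with the real data.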

Now that this is done, we will next move on to our first example, the Set Filter button method. As a refresher, here are the feature classes in GNIS:

Both of the examples here will work with the feature class of Airports. These examples also assume you already have some data set up as I previously demonstrated on this blog.

For the Set Filter method, first click on the Add PostGIS Layer button in QGIS. Select the USGS database and select the gnis table. Once you have done this, click on the Set Filter button at the bottom right side of the Add Layer dialog.


Creating a Filter in QGIS

As you can see above, you are presented with a list of Fields on the left side, operator buttons in the middle, and Values on the right side. Click on the Feature Class field to select it and then click the All button under the values window to the right. Since we created an index on the Feature Class field, this should quickly show you all the unique values that exist in the database for that field. Now double click Feature Class to add it into the Provider specific filter expression in the text box at the bottom of the dialog. Then click the = button in the Operators group. Now double click Airport from the Values box to add it. Your filter expression should now look like this:

"feature_class" = 'Airport'

If you click the Test button, QGIS will perform a query and display the number of rows that match your query. You can use this to double check that you did not make any errors during entry. In our case, the query should return upwards of 23,000 rows, depending on the version of GNIS you are using. Click the OK button to go back to the Layers dialog and then the Add button to add it to your project. With the filter in place, your screen should look less cluttered, as it is only showing airports from GNIS.

Only Airports Displayed in GNIS

You can use this method to filter out data on any type of field in a geospatial database.  I recommend, though, that you first create an index on that column to speed up the operation.  Otherwise, you may have to wait a while every time you try to load your filtered data.
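The filter expression is an ordinary SQL condition, so a rough way to try one outside of QGIS is to use it as the where clause of a count query. The sketch below does that under some assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL, a cut-down gnis table, and four made-up rows. I am not claiming this is exactly what QGIS runs when you click Test, only that it gives the same kind of answer.

```python
import sqlite3

# In-memory SQLite database standing in for the USGS PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table gnis ("
    "feature_id integer primary key, feature_name text, feature_class text)"
)

# Made-up rows for illustration only.
conn.executemany(
    "insert into gnis values (?, ?, ?)",
    [
        (1, "Example Field", "Airport"),
        (2, "Example Creek", "Stream"),
        (3, "Example Airstrip", "Airport"),
        (4, "Exampleville", "Populated Place"),
    ],
)


def count_matching(filter_expression):
    """Count the rows of gnis that satisfy a filter expression."""
    sql = f"select count(*) from gnis where {filter_expression}"
    return conn.execute(sql).fetchone()[0]


total = conn.execute("select count(*) from gnis").fetchone()[0]
matching = count_matching("\"feature_class\" = 'Airport'")
mistyped = count_matching("\"feature_class\" = 'Airprot'")

print(f"{matching} of {total} rows match the filter")
print(f"{mistyped} rows match the mistyped filter")
```

A count of zero is usually a sign that the value was mistyped, which is the kind of error the Test button helps you catch.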

Creating a Database View

The second method to filter data is to create a database view. Basically all database types can create a view. For the non-database savvy, a view can be thought of as a virtual table that is defined by a database query. This means that whenever you access the view, the data that is returned is generated by a query. For example, if you wanted a table of only airports in GNIS, you could make a view that pretends to be another table but does not take up as much space as a real table would.
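Here is a small sketch of the "virtual table" idea. It is an illustration under some assumptions: Python's built-in sqlite3 module stands in for PostgreSQL, the gnis table is cut down to three columns, and the rows are made up. The point is that the view stores no rows of its own, so a row added to the table shows up in the view the next time you query it.

```python
import sqlite3

# In-memory SQLite database standing in for the USGS PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table gnis ("
    "feature_id integer primary key, feature_name text, feature_class text)"
)

# Made-up rows for illustration only.
conn.executemany(
    "insert into gnis values (?, ?, ?)",
    [(1, "Example Field", "Airport"), (2, "Example Creek", "Stream")],
)

# The view is just a stored query over the gnis table.
conn.execute(
    "create view view_airports as "
    "select * from gnis where feature_class = 'Airport'"
)


def airport_count():
    """Count the rows currently visible through the view."""
    return conn.execute("select count(*) from view_airports").fetchone()[0]


count_before = airport_count()

# Add a row to the underlying table, not to the view.
conn.execute("insert into gnis values (3, 'Example Airstrip', 'Airport')")
count_after = airport_count()

print("airports in view before insert:", count_before)
print("airports in view after insert: ", count_after)
```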

For this example, we will again use Airports. Once you understand this, you can then create views for other classes by replacing the feature class name. However, when working with tools such as QGIS, there is a caveat that you first need to know about. If you are savvy with databases, you might create the view with the following command:

psql -d USGS
USGS=# create view view_airports as select * from gnis where feature_class = 'Airport';

When you then go to load this into QGIS, you will indeed see the view as a layer, but there will be a problem.

How a View Appears in QGIS

As you can see, QGIS will not let you just click on the view to add it. If you hover over the error triangle, you will see it displays a message of Select columns in the ‘Feature Id’ column that uniquely identify features of this layer. If you scroll to the right, you will see that QGIS will let you select a column in the view that is a unique identifier (feature_id in the case of GNIS).

Why does QGIS not automatically know which column to use? For those not well versed in how QGIS and databases work, tables in a database typically need a unique identifier for each entry so that each entry can be properly found. With recent versions of PostgreSQL and PostGIS, the view does not present a unique key. If QGIS tried to automatically deduce what field to use as the unique key, it would take a lot of processing power and would mean that QGIS would temporarily “hang” whenever you tried to access a database. Instead, QGIS gives you an option to tell it what field to use as the unique identifier for each row.
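If you are not sure whether a column really is unique across a view, one simple check is to compare the total row count with the number of distinct values in that column. The sketch below shows the idea under some assumptions: Python's built-in sqlite3 module stands in for PostgreSQL, the table and rows are made up, and is_unique is a helper invented for this illustration rather than anything QGIS provides.

```python
import sqlite3

# In-memory SQLite database standing in for the USGS PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table gnis ("
    "feature_id integer primary key, feature_name text, feature_class text)"
)

# Made-up rows for illustration only.
conn.executemany(
    "insert into gnis values (?, ?, ?)",
    [
        (1, "Example Field", "Airport"),
        (2, "Example Creek", "Stream"),
        (3, "Example Airstrip", "Airport"),
    ],
)
conn.execute(
    "create view view_airports as "
    "select * from gnis where feature_class = 'Airport'"
)


def is_unique(connection, relation, column):
    """Return True if every row of relation has a distinct, non-null column value."""
    sql = f'select count(*), count(distinct "{column}") from "{relation}"'
    total, distinct = connection.execute(sql).fetchone()
    return total == distinct


print("feature_id unique:   ", is_unique(conn, "view_airports", "feature_id"))
print("feature_class unique:", is_unique(conn, "view_airports", "feature_class"))
```

Only a column that passes a check like this is a sensible choice for the Feature Id setting.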

If you go ahead and select the feature_id field in the Add Layer dialog, you will then be able to select the layer and click Add to load it into QGIS.

Select Feature ID Option in QGIS

So the question you might have is “Which method is better?” The correct answer is “Whichever method makes more sense to you.” Some people may be OK with setting a filter when they load in data. Others may prefer to have views show up in the Layers dialog to remind them what all is available. A PostgreSQL materialized view would likely be the fastest method as it creates a cache of the data, but that is a bit beyond the scope of this post 🙂

Have fun and happy GISing with all Open Source software!

“Fixed” GNIS Data

Since I’ve been messing around with some data in my spare time, I realized the USGS had put out new GNIS data and I tried to import it into my personal PostGIS database.  However, I found out that the file they posted has a LOT of errors where it does not even meet their own data specifications.  I’ve uploaded my fixed file here and am copying the issues I reported to them below.  Basically, for duplicated keys, I removed the entry that was in Mexico or Canada and kept the US one.

Here’s the list of stuff I reported and fixed in mine:

Hey guys, found a few things with the file at that looks like it doesn’t match up with the file format at:

Just wanted to let someone know since I ran into some problems trying to import them into a db.  Thanks!