Making the Move to PostGIS

I’m going to be upfront and say that I’m a “fudger”, for the last decade (and more) I’ve dipped in and out of SQL, using Oracle, MySQL and of late PostGIS. 

I call myself a “Geospatial Expert” by which I mean that I’ve used and done lots of varied and complex GIS & geospatial stuff across a wide range of formats and systems (see some of my previous posts!)..essentially I have the skill to dip in and out of a lot of things. There are a few things that I have specialised in, such as datum (and coordinate systems), Legalities of using geo data, 3D mapping, cartography, offshore & onshore surveying (and interoperability) and also spatial analysis (to name a few). You ask me a detailed question on any of these and I can talk 100% confidently, you start asking my about R, JQuery, Python or Arcade and I start to get a sweat on…I can do it, but nowhere near as well as I should, and it may involve a little Google. If you scratch under the skin of many GIS leaders out there, many are in the same boat, there are now so many factions of geospatial that it is hard NOT to be a “dabbler” [See the US Labor Department Competency Model for Geospatial technology HERE]

This leads me to the actual story I want to give – Having been a dabbler in many of these “outside” areas of the GIS core, I’m now having to learn more on some of  those areas. When I start my machine in the morning, I’m now opening PHP MyAdmin, Valentina, Mapbox Studio, PSQL, Windows Command Shell and a few others rather than the one click buttons of QGIS or ArcMap. This post is about my recent re-introduction to PostGIS and SQL, with a few things I’ve picked up along the way.

Getting Started – Why PostGIS?

For years I used shapefiles, geodatabases, geojson and more recently geopackages. All of them are great formats and they all have their own drawbacks. I’m not going to say that PostGIS doesn’t, but I will say it handles my big data a whole lot better! 

Trying to use a 4GB shapefile wouldn’t be possible, 4GB would be slow and time consuming in a geodatabase, geojson or geopackage format but using it in PostGIS is pretty slick. If I wanted to find how many features fell inside another within PostGIS (in a 4GB file), it would take a few seconds, whereas it would take a minute or so in another format (using a GIS).

Yes, I said it, using a GIS….you see, with PostGIS you can just use a simple statement in a database viewer rather than cranking up the GIS and doing some intense spatial query. It only takes a second to type “SELECT * WHERE geom1 = geom2” (not an exact statement!) 

No need to turn on editors or get confused about weird attribute calculators and even better, there are database viewers which even help write the queries for you – more about that in a minute.

Why PostGIS? For me it is a geodatabase which I can store HUGE data in and it still works, I can keep it on the cloud, to access anywhere or copy and paste tables from it onto my machine. Unlike a geodatabase I can edit, manage and alter information without ever opening a GIS….but when I DO use my GIS, the world is my oyster.

Using QGIS

QGIS was made for PostGIS – okay, I haven’t done my research and that is probably wrong….but it feels like it was! All you have to do it “connect” to your database, which is simply adding where it is and its password, then away you go. You can use your database like another folder of data and, if you look past the niggle where you have to provide the username and password everytime you open QGIS, you will love how you can import, export and even manage your PostGIS data. 

When I first had to use PostGIS in anger, I was a little scared, it looks like a lot of code (SQL) and you see all the help online telling you about psql and pgdump, which sounds like a really bad tea brand, so I tried QGIS first. To be honest, it wasn’t anymore complicated than loading your data to a geodatabase or exporting to a file format, just ALWAYS BUILD YOUR SPATIAL INDEXES!

Another benefit of QGIS is that it allows you to delete tables (tables are the like a vector feature encompassed in a single table with the geometry as one of the columns), add tables, even use the tables like other formats and create joins and do all the spatial analysis you want. Things that can be a little intimidating when entering the PostGIS world where people talk command line like we are back in the DOS 3.1 days.

If you are just getting started in GIS or wondering what PostGIS is, please, please, please do yourself a favour and install PostGIS and give it a go in QGIS and spend half an hour to an hour using it as a geodatabase, then when you get confident and you want to move your PostGIS to the cloud, you can just copy the tables across in QGIS without ever opening a database editor. 

Getting started with a Database Editor

Getting started with a Database Editor

My first ever experience with using a database editor was many moons ago and it was called TOAD for SQL – it would automatically fill your queries and you could even store up your SQL statements so that you didn’t have to remember all that code every time, you could just adapt what you used last time.

15 years on and my baptism of fire saw me being told to use Valentina Studio, an open source DB editor that our senior developer uses (and tells me is brilliant)…needless to say that it was more like a hybrid son of visual basic and DOS. Although I could get around, it was all very “developer(ish)” and you had to set things up and other bits weren’t obvious. Don’t get me wrong, in hindsight, now I have a little skill, it isn’t that bad, but for a GIS person, I think that DBeaver fits better.

DBeaver is an open source database editor, much like Valentina but I think it may be built by someone that has dabbled in GIS, you see, there is a Leaflet map viewer built in, options to import csv and a couple of data formats and also the way it feels is more like a GIS. Fair to say that it ISN’T a GIS but I found myself moving around and doing things a lot easier. There are options to save your SQL queries (and it saves ones that you forget to save), you can also copy and paste tables from one database to another, even copy the column list from a table and paste it to a text file, very useful for creating headers or creating spatial joins.

Another nice little touch is the responsive SQL editor. As you type, prompts and options appear, this takes a bit of the guesswork out of building queries and also a bit of time.

There are some real easy SQL code you can use to get information quick, say you want the minimum or maximum values in a field (column) you could type into the SQL window:

SELECT MAX(COLUMN NAME)

FROM TABLENAME;

Or 

SELECT MIN(COLUMN NAME)

FROM TABLENAME;

Maybe you want to get the values between some others

SELECT * FROM TABLENAME

WHERE COLUMN BETWEEN LOW VALUE AND HIGH VALUE;

If you are used to using the Esri ArcMap “Search by attributes” tool, then you should find the syntax and method very easy to pick up and run with. In fact, you can probably build your queries and run it directly on the database. Say, for example, you want to find the name “Jeff” in the column “NAMES” along with the ID field, you could easily put

SELECT * FROM TABLENAME

WHERE NAMES LIKE “Jeff”;

This will return the whole row of information where the one column has “Jeff” in it, much like the Esri tool.

Creating a view

Much like a GIS, where you can create table joins, so that you can link tables together to create more informative data, you can do the same in PostGIS. Rather than being called “Joins”, they are called “Views” – these views can be stored and used just like another data. In fact, because PostGIS is so good with larger data, this how some companies join their address data to their non address features to give them location.

At this point, some DB Admin with come in and start whittering on about the huge amount of different joins available (Inner joins, outer joins, left joins, right joins and lots of others) but in truth I mostly use left joins 😉

CREATE VIEW VIEWNAME 

AS 

SELECT

a.ID as a_ID,

b.ID as b_ID,

FROM TABLE1 a

JOIN TABLE2 b ON a.ID=b.ID

In the example above, you will see that it takes the ID field from one table, calls it “a_ID” and then does the same from another table and calls it “b_ID”. It then joins the table where they are the same. You can add more fields by simple specifying them.

Although this isn’t quite as graphical as you get with a GIS, I have found it more reliable and also does some huge data, like when I recently joined all the address information for the whole UK together with the UPRN of each building and then also the price paid information.

In case you were thinking that it was a lot of typing and using a GUI would be SO much easier, don’t forget that you can simply copy and paste the columns from DBeaver and with a little text editing, add the other bits you need 😉

Summary

Rather than carrying on – I get the feeling that this is a lot to digest – I’ll pause there before overload or turning you off the idea. Maybe instead I’ll rejoin this blog with a blog solely on QGIS with PostGIS and then tips and tricks with DBeaver and text editing.

Feel free to let me know how you get on and what you’d like to know that could help you.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.