What is metadata and why all the fuss?

Remember when you were a kid and, whenever you wanted to know something, you went to this building called a “Library”? It had these things called books, tactile paper objects which contained textual knowledge. The only problem was that you didn’t have the time to read the blurb of every book to find the one you needed. What you needed was a microfiche index, kind of like a sheet of acetate listing all the books on the subject with a short blurb about each, plus the year of publication and the author...

Microfiche?!

Nowadays we have Google, Yahoo!, Ask Jeeves, and some people are even lazy enough to use Siri. But what happens when you’re a specialist? You work on a subject where there are no books and the data you deal with is cutting edge, so how do you share it with others so that they understand, quickly and easily, the processes that went into creating your magnum opus? I for one don’t have time to write a report on every dataset I generate.

Time to be Inspired?

For us geomagicians the answer is metadata, the so-called data about our data... I mean, just that one phrase was enough to put me off. It’s not pretty and it’s not fun, but in the long run it lets you be the master of your data and smugly turn to people, when they ask about your data, and say “Didn’t you read the metadata?”

But what is metadata and why is it useful? Why go through all the effort when no one else seems to bother? I’m going to ignore all those preachy books and overbearing software manuals and give my view, from my years of managing terabytes of data for some of the country’s biggest companies and bodies.

Not another rant

So, the big question, why bother?

  • Adding metadata to data makes it compliant with industry standards.
  • Metadata provides a means to SECURE your data by recording copyrights, requirements for display and use, and even legal restrictions.
  • It gives you a means of tracking edits and changes so that the data history is discoverable.
  • It provides information on the date & time of creation & BY WHOM.
  • It not only provides information about what the data is but also about the steps taken to create it.
  • It contains geospatial information on the geographical extents and the coordinate system the data was recorded in.
  • It provides a central point of contact so that issues and errors can be reported.
  • Metadata, when created to a standard, can be made into a discoverable catalogue (think Wikipedia) where searches for information can be performed by those without access to the data.

Of course, there are many, many other benefits of creating metadata, but the format you choose will dictate the further fields and options you have at your disposal.
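To make the list above a bit more concrete, here is a rough Python sketch of the kind of information a metadata record carries. The class name, the field names and the example values are all made up for illustration; they are not taken from ISO 19115, FGDC or any other standard, which each define their own (much longer) set of fields.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetMetadata:
    """Hypothetical minimal record; field names are illustrative only."""
    title: str
    abstract: str
    created: date
    author: str
    contact_email: str
    crs: str                       # coordinate system, e.g. an EPSG code
    bbox: tuple                    # (west, south, east, north)
    use_constraints: str = "None stated"
    lineage: list = field(default_factory=list)  # processing steps, oldest first

    def missing_fields(self):
        """Return the names of any text fields that were left blank."""
        return [name for name, value in vars(self).items()
                if isinstance(value, str) and not value.strip()]


# Example values are invented for the sketch.
record = DatasetMetadata(
    title="Example survey boundaries",
    abstract="",                   # deliberately left blank
    created=date(2014, 1, 15),
    author="A. Colleague",
    contact_email="gis@example.com",
    crs="EPSG:27700",
    bbox=(-6.0, 49.9, 1.8, 55.8),
    lineage=["Digitised from survey sheets", "Reprojected to EPSG:27700"],
)

print(record.missing_fields())
```

Even a toy check like `missing_fields()` shows the appeal: once the information is structured, you can spot gaps across thousands of datasets without opening any of them.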

One key element of the metadata is that it breaks the “geogeek” barrier. It provides a means for the non-geo people in the office to understand what data we have at our disposal and allows them to better comprehend how it can benefit them. A key example here is where you may have a whole project team accessing the data and each needs to meet their part of the project. Rather than send you endless emails or question you relentlessly for hours on end about the data, they can simply look it up and answer 92% of their questions themselves.

Even better, the metadata can be tied to your WMS, WFS or WCS services, so you don’t have to be on the other end of the phone 24/7 to answer questions about what is on your web maps. The data can be fully described, with the legal restrictions on display to all.

It sounds like a lot of work….

Metadata is slowly becoming a staple requirement for the project manager, with many of the main bodies like USGS, NOAA and The Crown Estate requiring data for projects to be supplied with complete metadata. The question is surely becoming “why not?”

Well, the question you need to ask before embarking on the addition of gigabytes of metadata is “what format?”, and it isn’t an easy one. There are four primary formats to consider, with many others sitting around the edges.

ISO 19115 – ISO 19115 and its parts define how to describe geographical information and associated services, including contents, spatial and temporal extent, data quality, access and rights to use.

The objective of this International Standard is to provide a clear procedure for the description of digital geographic datasets so that users will be able to determine whether the data in a holding will be of use to them and how to access the data. By establishing a common set of metadata terminology, definitions and extension procedures, this standard will promote the proper use and effective retrieval of geographic data.

FGDC (Federal Geographic Data Committee) Standard – This standard is widely used in America, was once the default ESRI format and is very similar in construct to the ISO standard. The main differences are that the FGDC metadata is geared more towards a data catalogue and is held in a text structure. There are also minor differences, like how originators are described and how update frequencies are defined, but there are plans to move the FGDC standard over to the ISO 19115(1) in the near future.

ESRI Metadata – The ESRI metadata format uses neither the ISO standard nor the FGDC format. Instead it uses a format similar to both but neglects some elements and reorganises others (a vast improvement over the 9.3 metadata!) – the information can be downloaded here

Inspire – Inspire is the European standard. It includes the ISO 19115 standard (so can be used in the UK to meet many requirements). Although it is a better standard, it is also a more comprehensive one and therefore takes more time to fill in.

There are many other metadata standards, like the MEDIN standard used by the Crown Estate in the UK, which is designed more for offshore geospatial work and has fields geared to the storage of that data. Each metadata standard should be reviewed on its benefits, but in my experience the Inspire format encompasses all the parameters and requirements needed to meet both ISO 19115 and the European standards.

You will hear much talk of ISO 19139 wherever ISO 19115 is discussed. ISO 19115 defines what the metadata should contain, while ISO 19139 provides the XML schema used to implement it. Therefore (and this can be confusing) you may have GIS software which offers both ISO 19115 & 19139; where this is the case, the ISO 19139 option will provide the metadata as .xml.
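For the curious, this is roughly what reading an ISO 19139 file looks like in Python using only the standard library. Treat it as a sketch: the sample record is invented and cut down to two fields, and `read_text` is a helper name of my own. The `gmd` and `gco` namespace URIs are the ones used by the 2005 ISO 19139 schemas; real records are far larger and may vary by profile.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 (2005 schemas).
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Invented, heavily trimmed sample record for illustration only.
SAMPLE = """
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example dataset</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract>
        <gco:CharacterString>A made-up abstract.</gco:CharacterString>
      </gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>
"""

IDENT = "gmd:identificationInfo/gmd:MD_DataIdentification"


def read_text(root, path):
    """Return the stripped text at `path`, or None if the element is absent."""
    element = root.find(path, NS)
    if element is None or element.text is None:
        return None
    return element.text.strip()


root = ET.fromstring(SAMPLE.strip())
title = read_text(
    root, IDENT + "/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString")
abstract = read_text(root, IDENT + "/gmd:abstract/gco:CharacterString")
purpose = read_text(root, IDENT + "/gmd:purpose/gco:CharacterString")

print(title, "|", abstract, "|", purpose)
```

The deep nesting is typical of the format, which is one reason most people prefer an editor or catalogue over hand-editing the XML.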

Meeting the standard

Most (if not all) of the latest GIS software reads and allows edits of metadata in the most widely recognised standards. Where it doesn’t, the xml can be downloaded and installed, or a template (xslt format) can be used.

From experience, I found the ArcGIS software the easiest for viewing the metadata, and it also had a list of metadata standards. QGIS was also easy, with many tools to read/write the xml format. There is also a wealth of standalone metadata editors for creating metadata to industry formats, such as CATMDEDIT or the UK Location (Inspire) Metadata editor.

A useful tip is that if you are using the ESRI shapefile format, the metadata is held as “yourfilename.shp.xml”. This means that you can create a single metadata file containing most of the generic information, then copy it and rename the copy to match the name of each shapefile. The extents and other information are automatically updated when the file is opened, so you only need to provide the specific information about the data.
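If you have a folder full of shapefiles, the copy-and-rename step can be scripted. Below is a minimal Python sketch of that idea; the function name `apply_template`, the file names and the `overwrite` option are my own inventions, and the demo runs in a throwaway temporary folder. It only copies files, so the automatic updating of extents described above is still left to your GIS software.

```python
import shutil
import tempfile
from pathlib import Path


def apply_template(template, folder, overwrite=False):
    """Copy `template` next to every .shp in `folder` as <name>.shp.xml.

    Existing metadata files are left alone unless overwrite=True.
    Returns the list of files that were written.
    """
    written = []
    for shp in sorted(Path(folder).glob("*.shp")):
        target = shp.with_name(shp.name + ".xml")   # roads.shp -> roads.shp.xml
        if target.exists() and not overwrite:
            continue
        shutil.copyfile(template, target)
        written.append(target)
    return written


# Demo in a temporary folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    template = folder / "template.xml"
    template.write_text("<metadata>generic information</metadata>")
    (folder / "roads.shp").touch()
    (folder / "rivers.shp").touch()
    (folder / "rivers.shp.xml").write_text("<metadata>already done</metadata>")

    created_names = [p.name for p in apply_template(template, folder)]
    rivers_text = (folder / "rivers.shp.xml").read_text()

print(created_names)
```

Skipping files that already have metadata is the cautious default here, so a carefully filled-in record is never flattened by the generic template.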

Now the big one

Okay, if you’ve got this far then you are seriously considering this fantastic resource already at your fingertips. The thing you need to weigh up is whether it is worth implementing, and that requires much consideration. The benefits are huge and, once created, metadata is easy to maintain, but the initial creation is a chore, even more so if you have several thousand datasets. Is it worth it though? In my experience, YES. When a client contacts me about some data, I can look it up and discuss the data in detail, even when it has been created by a colleague.

Moving forward

If you do make the move, there are some great catalogues out there for keeping track of your metadata. One of my personal favourites is GeoNetwork, an open source solution which meets most of the primary standards discussed, and here is a list of other metadata tools which may be of use.

Nick D

3 thoughts on “What is metadata and why all the fuss?”

  1. Thanks for blogging about metadata! One word of caution, I’m only familiar with the way ArcGIS and GeoPortal handle metadata but switching metadata styles and exporting xml with the translators provided can be tricky and unsatisfactory. Be careful before spending too much time on one option.

  2. Pretty great post. I simply stumbled upon your weblog and wanted to say that I’ve really enjoyed surfing around your blog posts.
    After all I’ll be subscribing to your feed and I am hoping you write once more soon!
