If we’re going to be working with local data then it makes sense to have a list of local postcodes. There are all kinds of data that can be usefully linked or combined based on postcode information. For example, you could aggregate statistics on crime rates or house prices. Happily, the Ordnance Survey now publish some useful Open Data about UK postcodes.
So let’s look at how we can work with their data to query it and extract it for local use.
Here’s a quick primer on postcodes. Postcodes have a structure to them. Each part of a postcode refers to a different area, and those areas have a hierarchical relationship. Here are some examples along with the name that the Ordnance Survey (OS) uses to describe them:
BA: Postcode Area
BA2: Postcode District
BA2 3: Postcode Sector
BA2 3PL: Postcode Unit
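As a rough sketch of that hierarchy, the Ruby below splits a postcode into the nested areas it belongs to. The method name `postcode_parts` is made up for illustration, and it assumes a well-formed postcode whose final three characters are the inward part.

```ruby
# Split a UK postcode into the nested areas it belongs to.
# Assumes a well-formed postcode: the last three characters are the
# inward part (e.g. "3PL"), everything before is the outward part.
def postcode_parts(postcode)
  normalised = postcode.upcase.gsub(/\s+/, "")
  outward = normalised[0...-3]  # e.g. "BA2"
  inward  = normalised[-3, 3]   # e.g. "3PL"
  {
    area:     outward[/\A[A-Z]+/],          # leading letters only
    district: outward,
    sector:   "#{outward} #{inward[0]}",    # outward part plus first inward digit
    unit:     "#{outward} #{inward}"
  }
end
```

Calling `postcode_parts("ba2 3pl")` gives a hash with one entry per level, from the area down to the full postcode unit.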
So “BA2 3PL” is within a sector called “BA2 3”, and so on. The OS publish their data in various ways, including as Linked Data. Without going into details, Linked Data is just a way to publish data to the web by giving everything a unique URL.
So based on a postcode or part of a postcode, you can build a URL to the OS website and use it to grab some data. For example, here’s a JSON description of BA23PL. That means you can quickly look up some useful data, such as the lat/long of the centre of a postcode, or discover which electoral ward it lies in. More on those alternate geographic regions in another post.
Sometimes, though, you just want to grab some data for local processing. Having a list of local postcodes can help drive some address matching or other data processing task. So how can we get a complete list of local postcodes?
The OS allow you to download data from their site, so you could grab all of the postcode dataset and process it to extract what you need. But there’s a simpler way. The OS also provide an API called a “SPARQL Endpoint” for their data. SPARQL is a query language for working with RDF; it’s basically a way to query a graph of Linked Data to extract the bits you need.
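Building the URL and fetching the JSON might look something like the sketch below. The base URL pattern and the use of an `Accept` header to ask for JSON are assumptions about how the OS site behaves, and the method names are made up for illustration, so check them against the site before relying on this.

```ruby
require "json"
require "net/http"
require "uri"

# Assumed base for OS postcode unit identifiers; verify against the OS site.
OS_POSTCODE_UNIT_BASE = "http://data.ordnancesurvey.co.uk/id/postcodeunit/"

# Build the identifier URL for a postcode unit by upper-casing the
# postcode and removing any whitespace.
def postcode_unit_url(postcode)
  OS_POSTCODE_UNIT_BASE + postcode.upcase.gsub(/\s+/, "")
end

# Fetch and parse a JSON description of a postcode.
# Assumes the server honours "Accept: application/json" and may redirect
# from the identifier URL to a document URL; follows a few redirects.
def fetch_postcode_json(postcode, max_redirects = 3)
  uri = URI(postcode_unit_url(postcode))
  max_redirects.times do
    request = Net::HTTP::Get.new(uri)
    request["Accept"] = "application/json"
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(request)
    end
    return JSON.parse(response.body) if response.is_a?(Net::HTTPSuccess)
    raise "Request failed: #{response.code}" unless response.is_a?(Net::HTTPRedirection)
    uri = URI.join(uri, response["location"])
  end
  raise "Too many redirects"
end
```

Only `fetch_postcode_json` touches the network; `postcode_unit_url` can be used on its own to generate links.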
Here’s a SPARQL query that will fetch data about all of the Post Code Units that are within the BA1 or BA2 Post Code Districts:
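A query along these lines could be built as a Ruby string, as in the sketch below. The prefixes, the `spatial:within` and `geo:lat`/`geo:long` properties, and the district URL pattern are all assumptions about the OS vocabulary, and the method name is made up for illustration; inspect the data returned for a single postcode to confirm the real terms.

```ruby
# Build a SPARQL query selecting postcode units within the given districts.
# All vocabulary terms below are assumptions; check them against the OS data.
def district_postcodes_query(districts)
  # One graph pattern per district, combined with UNION.
  patterns = districts.map do |district|
    "{ ?unit spatial:within <http://data.ordnancesurvey.co.uk/id/postcodedistrict/#{district}> }"
  end

  <<~SPARQL
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
    PREFIX spatial: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>

    SELECT ?postcode ?lat ?long WHERE {
      #{patterns.join(" UNION ")}
      ?unit rdfs:label ?postcode ;
            geo:lat ?lat ;
            geo:long ?long .
    }
  SPARQL
end
```

For Bath, `district_postcodes_query(%w[BA1 BA2])` produces a query with one pattern for each of the two districts.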
We can submit that query to the OS SPARQL Endpoint and then extract just the data we need.
Here’s some simple Ruby code that does exactly that. It requests that the SPARQL API return the data as JSON and then spits it out as a simple CSV file.
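A minimal sketch of that approach follows. The endpoint URL is left as a parameter rather than assumed, the method names are made up for illustration, and writing a header row from the query's variable names is a choice made here. The response is expected in the standard SPARQL JSON results format.

```ruby
require "csv"
require "json"
require "net/http"
require "uri"

# Send a query to a SPARQL endpoint and return the parsed JSON results.
# The caller supplies the endpoint URL; none is assumed here.
def run_sparql(endpoint, query)
  uri = URI(endpoint)
  uri.query = URI.encode_www_form(query: query)
  request = Net::HTTP::Get.new(uri)
  request["Accept"] = "application/sparql-results+json"
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  raise "SPARQL request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  JSON.parse(response.body)
end

# Flatten SPARQL JSON results into CSV text: a header row of variable
# names, then one row per result binding.
def results_to_csv(results)
  vars = results["head"]["vars"]
  CSV.generate do |csv|
    csv << vars
    results["results"]["bindings"].each do |binding|
      # Unbound variables are written as empty cells.
      csv << vars.map { |name| binding.dig(name, "value") }
    end
  end
end
```

To save the output, pass the result of `run_sparql` to `results_to_csv` and write the returned string with `File.write`.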
Here’s another version that generates a JSON description instead.
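As a sketch of the JSON variant, the method below takes results already parsed from the standard SPARQL JSON results format and reduces each result to a plain object of variable names and values. The method name and the choice of a flat array of objects are assumptions made for illustration.

```ruby
require "json"

# Convert parsed SPARQL JSON results into a simpler JSON document:
# an array with one object per result, mapping variable name to value.
def results_to_json(results)
  records = results["results"]["bindings"].map do |binding|
    # Each term looks like {"type" => ..., "value" => ...}; keep the value.
    binding.transform_values { |term| term["value"] }
  end
  JSON.pretty_generate(records)
end
```

The returned string can be written to a file or served directly to a client-side script.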