<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[DataSulis]]></title>
  <link href="http://datasulis.github.io/atom.xml" rel="self"/>
  <link href="http://datasulis.github.io/"/>
  <updated>2015-08-27T15:42:16+01:00</updated>
  <id>http://datasulis.github.io/</id>
  <author>
    <name><![CDATA[Leigh Dodds]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Understanding Our Local Geographies]]></title>
    <link href="http://datasulis.github.io/blog/2015/08/27/understanding-our-local-geographies/"/>
    <updated>2015-08-27T15:15:03+01:00</updated>
    <id>http://datasulis.github.io/blog/2015/08/27/understanding-our-local-geographies</id>
    <content type="html"><![CDATA[<p>What is local data? For Bath: Hacked, local data means data about or relating to Bath &amp; North East Somerset (B&amp;NES).</p>

<p>But how is the B&amp;NES region defined?</p>

<p>In this post I wanted to explore that question, paying particular attention to geography, or rather geographies, as there are several.</p>

<!-- More -->


<h2>B&amp;NES: a region and a council</h2>

<p>B&amp;NES as a district has only existed since 1996. As <a href="http://en.wikipedia.org/wiki/Bath_and_North_East_Somerset">the Wikipedia page</a> explains,
the district was created after Avon was abolished.</p>

<p>The district is managed by B&amp;NES council which is <a href="http://en.wikipedia.org/wiki/Unitary_authorities_of_England">a Unitary Authority</a>.
Unitary authorities are essentially a combination of a county and district council and have responsibility for managing <a href="http://en.wikipedia.org/wiki/Unitary_authorities_of_England#Functions">all functions in a region</a>.</p>

<p>When we refer to B&amp;NES we may sometimes need to distinguish between the region (an administrative district
of the UK) and the council (the organisation that administers that district).</p>

<h2>The District</h2>

<p>The Bath and North East Somerset district covers 220 square miles (570 km²). The Ordnance Survey Linked Data site provides <a href="http://data.ordnancesurvey.co.uk/doc/7000000000025554">a nice overview of the area</a> complete with a map showing the boundary as well as a list of the areas which it touches and contains. All of that data can be downloaded and explored directly from the site.</p>

<p>B&amp;NES is neighboured by six other districts. B&amp;NES is also part of Somerset, but confusingly Somerset is defined as both a &ldquo;ceremonial county&rdquo; which contains B&amp;NES and a smaller administrative district which borders it.</p>

<h2>Wards and Parishes</h2>

<p><strong><a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/administrative/england/electoral-wards-divisions/index.html">Electoral wards</a></strong> are <a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/administrative/england/index.html">the basic building blocks of the national administrative geography</a>.</p>

<p>B&amp;NES is broken down into 37 electoral wards. The wards and parishes are again listed at <a href="http://data.ordnancesurvey.co.uk/doc/7000000000025554">the official Ordnance Survey URL for B&amp;NES</a>, which links to each ward with details including a GML boundary region. The council also publishes <a href="http://www.bathnes.gov.uk/services/your-council-and-democracy/elections/ward-maps">a list of the wards</a>, with links to PDF maps of the areas.</p>

<p>B&amp;NES is also partially divided up into <strong>parishes</strong>. Bath itself doesn&rsquo;t have any parishes, but the surrounding area is divided up into 51 parishes.
Civil parishes are the <a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/administrative/england/parishes-and-communities/index.html">smallest type of area in the UK administrative geography</a>.</p>

<p>They also don&rsquo;t necessarily correspond with the electoral wards. For example <a href="http://data.ordnancesurvey.co.uk/id/7000000000000624">Keynsham is a Civil Parish</a> but the
same area is divided up across several wards, including <a href="http://data.ordnancesurvey.co.uk/id/7000000000000878">Keynsham North</a> and <a href="http://data.ordnancesurvey.co.uk/id/7000000000000625">Keynsham South</a>.</p>

<p>This highlights why we sometimes need to be careful when using statistics about local areas: the boundaries may be different depending on how the information was collected and reported.</p>

<p>Parishes and wards are not the only types of area that overlap in the region.</p>

<h2>Postcodes</h2>

<p><strong>Postcodes</strong> are the geographical areas that we tend to bump into most often in our daily lives. It&rsquo;s the geographical identifier that we all typically have to hand. But postcodes are managed by the Royal Mail and are designed to support their mail delivery operations. This means that they regularly change as new houses are built, or Royal Mail alters its distribution. Postcodes can also be changed and may be re-used in the future. So as stable geographical areas they aren&rsquo;t necessarily ideal. Unfortunately it&rsquo;s impossible to get postcode boundaries as open data so things are even worse.</p>

<p>The Ordnance Survey do make some postcode information available as open data. So from their Linked Data site we can find the list of 5750 postcode units in B&amp;NES and this data
has been <a href="https://data.bathhacked.org/Government-and-Society/Bath-North-East-Somerset-Postcodes/vnes-itp9">added to the Bath Hacked data store</a>. Postcodes are also organised into higher-level groupings. For example your home postcode is a &ldquo;postcode unit&rdquo; whereas BA2 is a &ldquo;postcode district&rdquo;.</p>
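
<p>As a rough illustration of that hierarchy, the postcode district can be read straight off a postcode unit, because the last three characters of a UK postcode are always the inward code. The sketch below is Python; the function name and the example postcodes are made up for illustration and are not taken from the Ordnance Survey data.</p>

<pre><code>def postcode_district(postcode_unit):
    """Return the postcode district (outward code) of a full postcode unit."""
    # Normalise case and spacing so "ba2 4ab" and "BA24AB" are treated alike.
    compact = postcode_unit.replace(" ", "").upper()
    # The inward code is always the final three characters, so drop it.
    return compact[:-3]


# Illustrative postcode unit only, not a reference to a real address.
print(postcode_district("BA2 4AB"))
</code></pre>

<p>Grouping the postcode units in the datastore by this value would be one way to count how many units fall within each district.</p>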

<p>Postcodes don&rsquo;t necessarily line up with electoral wards so they constitute yet another way to divide up the B&amp;NES area.</p>

<h2>Health</h2>

<p>Like Royal Mail, <a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/health/english-health-geography/index.html">the NHS also divides up the UK</a> into regions in order to support its operations. There are two types:</p>

<ul>
<li><p><strong>Area teams</strong> which are responsible for GPs, dental services, pharmacies, etc. B&amp;NES is covered by <a href="http://www.england.nhs.uk/south/bgsw-at/">NHS Bath, Gloucestershire, Swindon and Wiltshire area</a> which is part of <a href="http://www.england.nhs.uk/south/">NHS South</a></p></li>
<li><p><strong>Commissioning Groups</strong> who are responsible for planning and buying of local NHS services. B&amp;NES is covered by <a href="http://www.bathandnortheastsomersetccg.nhs.uk/ccg/what-nhs-bnes-clinical-commissioning-group">Bath and North East Somerset Commissioning Group</a>.
In fact the boundary for the group is aligned with the B&amp;NES region.</p></li>
</ul>


<h2>Statistical Areas</h2>

<p>The Office for National Statistics uses its own geography for the purposes of publishing population statistics. The <a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/census/index.html">census geography</a> is another hierarchical organisation of the UK, the smallest unit of which is the <strong><a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/census/output-area--oas-/index.html">output area</a></strong>. The areas are sized so that they contain roughly equal numbers of people. This generally means around 125 households, although some are smaller. By dividing the UK into small regions, it becomes easier to identify changes to the population over time, whilst avoiding giving away any personal information.</p>

<p>Because output areas are sized based on the number of people living in them, they vary widely in how much actual land they include. Obviously rural output areas will tend to be larger, while the more densely populated city areas will be much smaller.</p>

<p>It&rsquo;s possible to <a href="http://statistics.data.gov.uk/explore?URI=http%3A%2F%2Fstatistics.data.gov.uk%2Fid%2Fstatistical-geography%2FE06000022">browse the statistical geography for B&amp;NES</a> using a tool provided by the ONS. On that page you can choose to see how the area is divided up into output areas, <em><a href="http://www.ons.gov.uk/ons/guide-method/geography/beginner-s-guide/census/workplace-zones--wzs-/index.html">workplace zones</a></em>, and middle and lower layer output areas.</p>

<p>The ONS <a href="https://gss.civilservice.gov.uk/blog/2015/07/new-ons-nsal/">recently published</a> a national address lookup dataset to help map between addresses and the various statistical, electoral, health and other geographies in use across the UK. The <a href="https://data.bathhacked.org/Government-and-Society/National-Statistics-Address-Lookup-BANES-Subset/isaq-h3sn">B&amp;NES subset of this is available from the Bath: Hacked datastore</a>.</p>

<h2>Fire, Police, Ambulance Services</h2>

<p>To be comprehensive it&rsquo;s also worth noting that the fire, ambulance and police services in the UK have their own geographical breakdowns. Again, like the health and mail services, these reflect organisational jurisdictions.</p>

<p>As a quick reference:</p>

<ul>
<li><a href="https://en.wikipedia.org/wiki/Avon_Fire_and_Rescue_Service">Avon Fire and Rescue</a> covers B&amp;NES and also North Somerset and South Gloucestershire</li>
<li>Our police force is a division of <a href="https://en.wikipedia.org/wiki/Avon_and_Somerset_Constabulary">Avon and Somerset Constabulary</a>. They define <a href="https://www.police.uk/avon-and-somerset/CS218/">several neighbourhoods</a> across the region</li>
<li><a href="https://en.wikipedia.org/wiki/South_Western_Ambulance_Service">South Western Ambulance Service</a> provides the ambulance services across the area and the entire south west of the UK.</li>
</ul>


<h2>Summary</h2>

<p>There are a number of different national geographies that cover our local area. When working with local data it&rsquo;s important to understand the differences between them, especially when it comes to comparing statistics published by different organisations.</p>

<p>Hopefully this blog post provides some useful insight and pointers to further reading.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Area Classifications]]></title>
    <link href="http://datasulis.github.io/blog/2015/08/27/area-classifications/"/>
    <updated>2015-08-27T13:06:44+01:00</updated>
    <id>http://datasulis.github.io/blog/2015/08/27/area-classifications</id>
    <content type="html"><![CDATA[<p>The 2011 UK census can help us understand a little more about the local community. The Office for National Statistics have created &ldquo;pen portraits&rdquo; that describe areas around the UK. I&rsquo;ve put them onto a map to make them easier to explore.</p>

<p>How does the map fit with your understanding of Bath &amp; North East Somerset?</p>

<!-- More -->


<p>The map is embedded below, but you can <a href="http://cdb.io/1JxP140">explore the full map on CartoDB</a>. For some of the denser population areas you&rsquo;ll need to zoom in.</p>

<iframe width="100%" height="520" frameborder="0" src="https://ldodds.cartodb.com/viz/01a91afe-4cb3-11e5-91bd-0e0c41326911/embed_map" allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>


<p>The map consists of a number of colour coded regions. Each region is what is known as an <strong>output area</strong>. Bath &amp; North East Somerset consists of 583 output areas.</p>

<p>Each output area is colour coded based on the <strong>population cluster</strong> that lives in that region. The population clusters all have names which are shown on the key.</p>

<p>The rest of this post provides some background on the data and how to read the map.</p>

<h2>Output Areas</h2>

<p>An output area is the smallest geographical region for which data is available in the UK census. The areas are sized so that they contain roughly equal numbers of people. This generally means around 125 households, although some are smaller.</p>

<p>By dividing the UK into small regions, it becomes easier to identify changes to the population over time, whilst avoiding giving away any personal information.</p>

<p>Because output areas are sized based on the number of people living in them, they vary widely in how much actual land they include. Obviously rural output areas will tend to be larger, while the more densely populated city areas will be much smaller.</p>

<h2>Population Clusters</h2>

<p>The UK census collects a variety of information about the population. The ONS have assigned each of the output areas in the UK to a population cluster. A cluster is defined by a number of variables, including the demographics of the people living in the area, whether they own their homes, socio-economic indicators and employment status. The ONS have published <a href="http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/ns-area-classifications/ns-2011-area-classifications/methodology-and-variables/variables-oa.pdf">a list of the variables used to calculate the clusters</a>.</p>

<p>Each cluster has a name, e.g. &ldquo;Comfortable Cosmopolitans&rdquo;, and a description. The names are intended to be a memorable shorthand for the more complex description that fully describes that cluster of people.</p>

<p>The colours on the map identify the cluster to which the output area belongs. The key shows the name and colour associated with each cluster.</p>

<p>Because of limits in CartoDB, some of the less frequently occurring clusters have the same colour and are just listed as &ldquo;Other&rdquo; in the key. But if you click on an area it will show the actual name of the cluster.</p>

<p>Because each cluster has a single definition, seeing the same colour in different parts of the map indicates that there is a similar mix of people living in those areas.</p>

<h2>Clusters in B&amp;NES</h2>

<p>Here is the full list of clusters that appear in B&amp;NES, in alphabetical order along with a count of how many times they appear.</p>

<pre><code>  6 Ageing City Dwellers
  7 Ageing Rural Dwellers
 78 Ageing Urban Living
 22 Challenged Diversity
 11 Challenged Terraced Workers
 25 Comfortable Cosmopolitans
  1 Constrained Flat Dwellers
  3 Ethnic Dynamics
 16 Farming Communities
 40 Hard-Pressed Ageing Workers
 25 Industrious Communities
  7 Inner-City Students
 29 Migration and Churn
  8 Rented Family Living
 42 Rural Tenants
 62 Semi-Detached Suburbia
 54 Students Around Campus
 46 Suburban Achievers
 96 Urban Professionals and Families
  5 White Communities
</code></pre>

<p>Here&rsquo;s that same list sorted by frequency:</p>

<pre><code> 96 Urban Professionals and Families
 78 Ageing Urban Living
 62 Semi-Detached Suburbia
 54 Students Around Campus
 46 Suburban Achievers
 42 Rural Tenants
 40 Hard-Pressed Ageing Workers
 29 Migration and Churn
 25 Industrious Communities
 25 Comfortable Cosmopolitans
 22 Challenged Diversity
 16 Farming Communities
 11 Challenged Terraced Workers
  8 Rented Family Living
  7 Inner-City Students
  7 Ageing Rural Dwellers
  6 Ageing City Dwellers
  5 White Communities
  3 Ethnic Dynamics
  1 Constrained Flat Dwellers
</code></pre>

<p>This immediately gives an overall sense of the mix of the local area. Exploring the map should help build up a picture of where each cluster can be found.</p>
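
<p>The two listings above can be reproduced, and sanity checked, with a few lines of code. The following Python sketch hard-codes the counts from the list rather than reading them from the datastore, and the variable names are purely illustrative. It confirms that the counts add up to the 583 output areas mentioned earlier and prints the three most frequent clusters with their share of the total.</p>

<pre><code># Cluster counts for Bath and North East Somerset, copied from the listing above.
cluster_counts = {
    "Ageing City Dwellers": 6,
    "Ageing Rural Dwellers": 7,
    "Ageing Urban Living": 78,
    "Challenged Diversity": 22,
    "Challenged Terraced Workers": 11,
    "Comfortable Cosmopolitans": 25,
    "Constrained Flat Dwellers": 1,
    "Ethnic Dynamics": 3,
    "Farming Communities": 16,
    "Hard-Pressed Ageing Workers": 40,
    "Industrious Communities": 25,
    "Inner-City Students": 7,
    "Migration and Churn": 29,
    "Rented Family Living": 8,
    "Rural Tenants": 42,
    "Semi-Detached Suburbia": 62,
    "Students Around Campus": 54,
    "Suburban Achievers": 46,
    "Urban Professionals and Families": 96,
    "White Communities": 5,
}

# Every output area belongs to exactly one cluster, so this is the number of output areas.
total = sum(cluster_counts.values())

# Sort by descending count; ties are broken alphabetically, which is an arbitrary choice.
ranked = sorted(cluster_counts.items(), key=lambda item: (-item[1], item[0]))

print(total)
for name, count in ranked[:3]:
    share = 100 * count / total
    print(f"{count:3d} {name} ({share:.1f}%)")
</code></pre>

<p>On these figures the most frequent cluster, Urban Professionals and Families, accounts for roughly 16% of the output areas in the district.</p>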

<h2>Pen Portraits</h2>

<p>The names of the clusters are interesting in themselves, but as I noted above there&rsquo;s a description associated with each of them. The ONS have <a href="http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/ns-area-classifications/ns-2011-area-classifications/pen-portraits-and-radial-plots/pen-portraits-oa.pdf">published their full list of portraits in a PDF document</a>.</p>

<p>I&rsquo;ve not included them here as there&rsquo;s some useful context in the document. For example each of the clusters is included in a higher-level grouping that defines some common characteristics.</p>

<p>For example there is a supergroup called &ldquo;Urbanites&rdquo; which is defined as follows:</p>

<blockquote><p>The population of this group are most likely to be located in urban areas in southern England and
in less dense concentrations in large urban areas elsewhere in the UK. They are more likely to live
in either flats or terraces, and to privately rent their home. The supergroup has an average ethnic
mix, with an above average number of residents from other EU countries. A result of this is
households are less likely to speak English or Welsh as their main language. Those in employment
are more likely to be working in the information and communication, financial, public administration
and education related sectors. Compared with the UK, unemployment is lower.</p></blockquote>

<p>This group contains both &ldquo;Urban Professionals and Families&rdquo; and &ldquo;Ageing Urban Living&rdquo;, which are the two most frequent clusters in Bath. These are defined as:</p>

<p><strong>Urban Professionals and Families</strong> - <em>The population of this group shows a noticeably higher proportion of children aged 0 to 14 than the
parent supergroup and a lower proportion aged 90 and over. There is also a higher proportion of
people with mixed ethnicity. Households in this group are more likely to live in terraced properties
and to live in privately rented accommodation. Unemployment is slightly higher than for the parent
supergroup.</em></p>

<p><strong>Ageing Urban Living</strong> - <em>The population of this group shows a higher proportion of people aged 65 and over than the parent
supergroup. Residents are more likely to live in communal establishments, detached properties
and flats than the supergroup, with a higher proportion of households living in socially rented
accommodation.</em></p>

<h2>Notes on creating the Map</h2>

<p>The <a href="http://www.ons.gov.uk/ons/guide-method/geography/products/area-classifications/ns-area-classifications/ns-2011-area-classifications/index.html">complete 2011 area classifications</a> can be downloaded from the ONS website.</p>

<p>After exploring the data at a recent <a href="http://bathhacked.org">Bath: Hacked</a> curators night, <a href="https://twitter.com/azazell0">Mark Owen</a> extracted the Bath &amp; North East Somerset data from the national dataset and combined it with the <a href="http://www.ons.gov.uk/ons/guide-method/geography/products/census/spatial/2011/index.html">boundary information</a> for each output area.</p>

<p>The resulting shape file was then <a href="https://ldodds.cartodb.com/tables/bath_north_east_somerset_area_classifications">uploaded to CartoDB</a> to create the online map.</p>

<p>The full set of B&amp;NES specific data, without the boundaries, is <a href="https://data.bathhacked.org/Population/2011-B-NES-Area-Classifications/mp58-wts8">available from the Bath: Hacked datastore</a>. This provides some additional context, as the clusters are organised into supergroups that can provide some additional ways to explore the area.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Minecraft Map of Bath]]></title>
    <link href="http://datasulis.github.io/blog/2014/10/15/minecraft-map-of-bath/"/>
    <updated>2014-10-15T19:21:48+01:00</updated>
    <id>http://datasulis.github.io/blog/2014/10/15/minecraft-map-of-bath</id>
    <content type="html"><![CDATA[<p>The Ordnance Survey have published <a href="http://www.ordnancesurvey.co.uk/innovate/developers/minecraft-map-britain.html">a Minecraft map of Great Britain</a>. It&rsquo;s recently been updated to include more detail. It&rsquo;s a fantastic idea and has, I think, some potential use as an educational tool.</p>

<p>However the full map is something like 20GB, eating up a fair bit of disk space and memory and making it difficult to use it on older hardware. So I wondered whether there was a way to cut out a limited section of the map focusing on a smaller area. Like Bath for example.</p>

<p>Here&rsquo;s how I did it.</p>

<!-- More -->


<p>After I&rsquo;d <a href="https://twitter.com/ldodds/status/516960527990681600">failed to convince the OS to do the work for me</a> my first port of call was to look at whether they had published any code that would let me re-run their world generation. Unfortunately they&rsquo;ve decided not to do that which is a shame. As far as I can tell the world generation is based on open data so would have made a nice showcase for re-purposing their data in a new way. I also didn&rsquo;t want to re-write it from scratch, although it would be an interesting project for someone to tackle. Especially if they were to use additional open sources.</p>

<p>So I started to dig through the plethora of Minecraft level editing tools to find something that I could use. Ideally I wanted to be able to take <a href="http://data.ordnancesurvey.co.uk/doc/geometry/25554-15">the geometry for B&amp;NES</a> and use <a href="http://www.ordnancesurvey.co.uk/innovate/developers/minecraft-coordinate-finder.html">the OS Minecraft co-ordinate finder</a> to cut out the relevant region from the map.</p>

<p>However I struggled to find a command-line tool that would do exactly what I needed. This <a href="https://github.com/mcedit/pymclevel">Python library for reading Minecraft levels</a> looked like what I wanted. It has a command-line application for interacting with a world, including extracting bounded areas into schematics. However I had mixed results. While I was able to load the OS level and extract a schematic I couldn&rsquo;t seem to successfully import it into the empty world I&rsquo;d <a href="https://www.youtube.com/watch?v=DTSyFNIe1JE&amp;src_vid=h_qwraEeIFw&amp;feature=iv&amp;annotation_id=annotation_3192536097">created</a>.</p>

<p>So I turned to <a href="http://www.mcedit.net/">MCEdit</a> which is a graphical Minecraft world editor which also happens to be written in Python. It seems to be well used by people creating large complex Minecraft builds and it has lots of features for copying and pasting sections of levels. I&rsquo;d been reluctant to use it though as I figured it would be slow dealing with such a large world. I was also hoping to script something that could be tailored by other people interested in generating sections of the map covering their own area.</p>

<p>Anyway, I found that MCEdit, while slow to open the file, does provide a nice 2D way to navigate through the world (press Tab when you&rsquo;ve loaded a level to switch perspective). So by jumping to the co-ordinates above Bath (around X:15000, Y:100, Z:45600 will do it) I was able to see the region I wanted to copy.</p>

<p>The MCEdit selection tool allows you to grab an area which you can then extract. You can also choose to &ldquo;prune&rdquo; the rest of the world leaving just your selection. I doubt this is the most efficient way to do it as it takes a couple of hours on my laptop with some fiddling around. But the end result is a level containing just the region surrounding Bath, Keynsham and Bathampton. It&rsquo;s also a lot smaller, about 20MB.</p>

<p>If you&rsquo;d like to download a copy of the level then you can <a href="http://ldodds.com/projects/minecraft/os-bath.tar.gz">download it here</a>. The map contains Ordnance Survey Data © Crown Copyright and Database Distribution 2014.</p>

<p>I&rsquo;ve set the spawn point to be on Beechen Cliff overlooking the centre of Bath.</p>

<p>Note that the scale isn&rsquo;t 1:1 with normal Minecraft. IIRC a Minecraft block is essentially a 1 m cube. For the OS map they explain that a block is more like a cuboid measuring 25 m x 25 m x 12 m.</p>
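
<p>To get a feel for what that scale means in practice, here is a small Python sketch that converts a distance measured in blocks into real-world metres. It uses only the block dimensions quoted above; the function and constant names are hypothetical, and it assumes the usual Minecraft convention that X and Z are the horizontal axes and Y is the vertical one.</p>

<pre><code># Real-world size represented by one block in the OS map, as quoted above (metres).
BLOCK_WIDTH_M = 25   # along the X axis
BLOCK_DEPTH_M = 25   # along the Z axis
BLOCK_HEIGHT_M = 12  # along the Y (vertical) axis


def blocks_to_metres(dx_blocks, dy_blocks, dz_blocks):
    """Convert an extent in blocks (X, Y, Z) into an extent in metres."""
    return (
        dx_blocks * BLOCK_WIDTH_M,
        dy_blocks * BLOCK_HEIGHT_M,
        dz_blocks * BLOCK_DEPTH_M,
    )


# A 400 x 400 block selection, one block high.
print(blocks_to_metres(400, 1, 400))
</code></pre>

<p>So a selection of 400 by 400 blocks would represent an area of about 10 km by 10 km on the ground.</p>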

<p>The OS documentation indicates <a href="http://www.ordnancesurvey.co.uk/innovate/developers/minecraft-map-britain.html">which block type maps to which type of geographical feature</a>. You can also <a href="http://www.ordnancesurvey.co.uk/innovate/developers/readme.txt">follow the instructions in their readme to install the level</a>.</p>

<p>What can you do with it? Well, I was wondering whether it might provide a fun and alternative way to visualise some local data. Let me know if you do anything with it!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[You Know Nothing Jon Sno<sub>x</sub>w]]></title>
    <link href="http://datasulis.github.io/blog/2014/09/22/you-know-nothing-jon-snow/"/>
    <updated>2014-09-22T20:00:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2014/09/22/you-know-nothing-jon-snow</id>
    <content type="html"><![CDATA[<p>This blog post has the best Game of Thrones/weather/air quality joke you&rsquo;ll read all day. It also has some notes on what I built at <a href="http://www.bathhacked.org/news/air-quality-hack-20-september/">the BathHacked Air Quality hackday</a>.</p>

<!-- More -->


<p>Going into the hackday I realised several things.</p>

<p><img src="http://datasulis.github.io/images/jon-snow.jpg" alt="My helpful screenshot" /></p>

<p>Firstly, there was a lot of data available. BANES council had made an archive of <a href="https://data.bathhacked.org/Environment/Historical-Air-Quality-Sensor-Data/37nn-vnib">13 years of air quality data available</a>. I&rsquo;d also discovered that <a href="http://uk-air.defra.gov.uk/networks/site-info?uka_id=UKA00306">the DEFRA Air Quality data archive had data for one sensor</a> dating from 1997. That&rsquo;s a lot of data to get to grips with. I decided that I was going to focus on trying to summarise the dataset rather than build anything on top of it.</p>

<p>Secondly, I also didn&rsquo;t really know anything about air quality. To get prepared I <a href="http://www.bathhacked.org/datastore/air-quality-fast-start/">produced some documentation for everyone attending the hackday</a> that summarised the main data source, and provided some useful pointers. I also did some reading around on both the BANES and <a href="http://uk-air.defra.gov.uk/">UK-AIR</a> websites.</p>

<p>Air quality data analysis is a complex area. There are a lot of factors to take into account including a variety of sources of pollutants, complex interactions between pollutants and impacts from the prevailing weather conditions. BANES publish some summary reports but while informative they didn&rsquo;t really give me a sense of where the pollution was coming from, or how bad it was at different times of the day or year.</p>

<p>I also discovered the <a href="http://www.openair-project.org/">Open Air</a> project which provides an <a href="http://www.r-project.org/">R</a> package to support air quality analysis. It also comes with an amazing set of documentation: the manual is over 200 pages and includes a short introduction to R.</p>

<p>So I read the manual and went into the hack day with a goal of trying to answer two questions:</p>

<ol>
<li>Can we provide Bath citizens with more insight into the air quality for Bath?</li>
<li>Can we provide the local council with new ways to generate meaningful visualisations and summary reports?</li>
</ol>


<p>I didn&rsquo;t get as far as I&rsquo;d hoped, but I managed to do enough to create what I think is <a href="http://datasulis.org/air-quality-report/london-road-aurn.html">an interesting summary of the data</a>.</p>

<p>Using R and <code>openair</code> I was able to quickly import, normalise and explore the data. In fact R makes it so easy to generate diagrams that I spent a lot of the day just <a href="http://treasure.diylol.com/uploads/post/image/587658/resized_all-the-things-meme-generator-graph-all-the-things-9ec157.jpg">playing with graphs</a>.</p>

<p>The judges also liked the results and I was lucky enough to <a href="http://www.bathhacked.org/news/and-the-results-are-in/">win the Most Educational Project prize</a>. The report is now <a href="http://www.bathnes.gov.uk/services/your-council-and-democracy/local-research-and-statistics/wiki/historic-air-quality">featured on the BANES website</a>.</p>

<p>I&rsquo;ve also <a href="https://github.com/datasulis/air-quality-report">published the code on GitHub</a> if you&rsquo;d like to explore. The main report code could easily be customised to use an alternate DEFRA location if you want to try it on some data from your local area.</p>

<p>There&rsquo;s more to explore here, not just around the data analysis, but also around the concept of having reproducible data analytics.</p>

<p>Reproducibility is an important part of scientific research and analysis. At least one driver of the growing adoption of open source and open data in the research community is to make science more reproducible: it should be possible for someone else to pick up your research, easily check the results and maybe go a step further.</p>

<p>I&rsquo;ve not yet seen this idea extended to publishing of analysis of open (statistical) data, but the concepts are the same. Reproducibility is another way to increase transparency. Open data has been shown to help people find data errors, but open source can also help people find and fix errors in an analysis itself.</p>

<p>I&rsquo;ll certainly be playing more with R over the coming months. I&rsquo;m sold on the ease with which it&rsquo;s possible to really quickly explore a dataset.</p>

<p>The other hacks produced on the day were all really interesting. I recommend you read <a href="http://www.bathhacked.org/news/and-the-results-are-in/">the run down of the entries on the BathHacked blog</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Exploring a Bath Chronicle House Price Story]]></title>
    <link href="http://datasulis.github.io/blog/2014/09/06/exploring-a-bath-chronicle-house-price-story/"/>
    <updated>2014-09-06T11:01:32+01:00</updated>
    <id>http://datasulis.github.io/blog/2014/09/06/exploring-a-bath-chronicle-house-price-story</id>
    <content type="html"><![CDATA[<p>The cover of the Bath Chronicle caught my eye yesterday. The headline article was &ldquo;House prices now 8 times average pay&rdquo;; you can <a href="http://www.bathchronicle.co.uk/Average-house-price-Bath-times-average-salary/story-22858365-detail/story.html">read a summary of the article on their site</a>. The article says the TUC are reporting that the ratio of house price to salary in Bath &amp; North East Somerset increased by 89% between 1997 and 2013.</p>

<p>I wondered where the data for this analysis came from and whether it would be possible to repeat it using open data. So I picked up a copy of the Chronicle, opened my web browser and explored further.</p>

<!-- More -->


<p>Firstly before we dig deeper I&rsquo;ll state up front that I have no political agenda here, or any particular interest in criticising the Chronicle. What I want to do is see whether the data reported in the article is available and whether the facts are correct. If we can highlight some inaccuracies or come up with some interesting insights along the way, then great.</p>

<p>To summarise, the key facts stated in the article are as follows:</p>

<ul>
<li>The ratio of house price to salary in 1997 was 4.62</li>
<li>The ratio of house price to salary in 2013 was 8.74 (89% higher than 1997)</li>
<li>Across the South West the average ratio is now above 5</li>
<li>The Bank of England will now be <a href="https://www.gov.uk/government/news/help-to-buy-mortgage-guarantee-loans-new-lending-limits">limiting the number of risky mortgages</a>, assessed as those with a ratio higher than 4.5</li>
<li>Cotswold is the most unaffordable area in the South West with a ratio of 11.6</li>
</ul>
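
<p>The 89% figure is easy to check from the two ratios quoted in the list above. A minimal Python sketch of the arithmetic, using only the numbers reported in the article (the function and variable names are illustrative):</p>

<pre><code>def percentage_increase(old, new):
    """Return the increase from old to new as a percentage of old."""
    return 100 * (new - old) / old


# House price to salary ratios for B and NES, as reported in the article.
ratio_1997 = 4.62
ratio_2013 = 8.74

increase = percentage_increase(ratio_1997, ratio_2013)
print(f"{increase:.1f}%")
</code></pre>

<p>This comes out at just over 89%, so the headline percentage is at least consistent with the two ratios it is derived from.</p>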


<p>The article also includes some notes on the most and least affordable places in the UK. But it was the following quote that made me raise an eyebrow:</p>

<blockquote><p>&ldquo;The TUC has been unable to release the figure for the average house price or the average salary in B&amp;NES, because that data, which the study is based on, is not available from the Government&rdquo;</p></blockquote>

<p>Unfortunately that statement is simply not true. I can&rsquo;t tell whether this is a misunderstanding on the part of the reporter at the Chronicle or a mis-communication from the TUC. Like most newspapers the Chronicle doesn&rsquo;t link to its sources. After some searching I found <a href="http://www.tuc.org.uk/economic-issues/britain-needs-pay-rise/social-issues/housing/house-prices-across-half-north-west-now">a TUC report on prices in the North West</a>, but not one specifically on the South West.</p>

<p>But I know that the statement is not true because all of the raw data required to calculate these averages and the resulting ratio is available as open data:</p>

<ul>
<li>The Land Registry have been publishing their <a href="http://landregistry.data.gov.uk/">price paid data</a> as open data for some time. The available data goes back to 1995. So we have the price of every house sale in B&amp;NES for nearly 20 years.</li>
<li>The ONS <a href="http://www.ons.gov.uk/ons/rel/ashe/annual-survey-of-hours-and-earnings/index.html">Annual Survey of Hours and Earnings</a> (ASHE) provides information on earnings across the UK and data is available from 1998 to 2013, although the latest figures are still provisional. There&rsquo;s a <a href="http://www.neighbourhood.statistics.gov.uk/HTMLDocs/dvc138/index.html">visualisation of changes in weekly earnings by region</a>.</li>
</ul>


<p>So the raw data necessary to re-calculate the figures reported by the Chronicle and the TUC is available as open data for anyone to re-use.</p>

<p>But not only that, the Department for Communities and Local Government (DCLG) actually publishes <a href="https://www.gov.uk/government/statistical-data-sets/live-tables-on-housing-market-and-house-prices">detailed figures on the housing market</a> which include the ratio of house prices to salary! This appears to be the actual source of the data and is linked to from the TUC report on the North West. Reading the Chronicle article though you might be led to assume that the TUC have produced some analysis using unreleased data, when in fact the source is the government itself.</p>

<p>The data we&rsquo;re interested in is included in <a href="https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/321017/Table_577.xlsx">Table 577: ratio of median house price to median earnings by district, from 1997</a>. There&rsquo;s also a PDF containing charts that <a href="https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/321015/Chart_576.pdf">plot the changes across the entire country</a>. Looking at that data we can confirm all of the figures reported in the article.</p>
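
<p>As a quick sanity check, the 89% figure can be reproduced from the two ratios quoted above. This is just an illustrative sketch of the arithmetic; the variable names are made up for the example.</p>

```ruby
# Ratios of median house price to median earnings, as quoted in the article.
ratio_1997 = 4.62
ratio_2013 = 8.74

# Percentage increase relative to the 1997 value.
increase = (ratio_2013 - ratio_1997) / ratio_1997 * 100

puts "Increase between 1997 and 2013: #{increase.round}%"
```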

<p>Not only that, but because we have access to the actual underlying data we can also take into account a few caveats that ONS and DCLG include in their supporting notes. None of these invalidate the figures, but may be useful when interpreting them. For example we can learn that:</p>

<ul>
<li>the ASHE is based only on a 1 per cent sample of employee jobs</li>
<li>the ASHE only includes data from employers, so it doesn&rsquo;t cover self-employed people</li>
<li>the reporting regions are based on district boundaries from 2009, so changes to size/shape of districts since that date won&rsquo;t be reflected</li>
<li>the 2013 figures are still provisional and are subject to change</li>
</ul>


<p>Perhaps most importantly though we can see the entire set of data on the ratio of house prices to salary for B&amp;NES from 1997 to 2013. I&rsquo;ve taken a copy of that data and <a href="https://data.bathhacked.org/Economy-and-Jobs/Ratio-of-house-prices-to-earnings-since-1997/gbqi-3ffv">uploaded it to the Bath Hacked data store</a>. I&rsquo;ve used that to create the graph shown below:</p>

<div><iframe width="500px" title="Changes in media house price to earnings ratio" height="425px" src="https://data.bathhacked.org/w/v4tc-6xs9/?cur=kbjUpZMvAfS&from=root" frameborder="0" scrolling="no"><a href="https://data.bathhacked.org/Economy-and-Jobs/Changes-in-media-house-price-to-earnings-ratio/v4tc-6xs9" title="Changes in media house price to earnings ratio" target="_blank">Changes in media house price to earnings ratio</a></iframe><p><a href="http://www.socrata.com/" target="_blank">Powered by Socrata</a></p></div>


<p>As you can see the ratio has increased significantly since 1997 with the largest increases in the period between 1999-2004. Interestingly the ratio seems to have dropped slightly over the last few years: the provisional figure for 2013 is similar to that for 2004.</p>

<p>So as a result of the exercise I&rsquo;ve learnt something about the data that wasn&rsquo;t reported in the Chronicle. Clearly the ratio is still very high but somewhat encouragingly the local trend in prices and earnings suggests the ratio is flattening out, so we&rsquo;re not suffering from a runaway increase over the last few years. Whether that is based on changes to house prices or salaries isn&rsquo;t immediately clear, but we can dig into the data further to find out.</p>

<p>I was also able to highlight an inaccuracy in the article: far from withholding figures the government is providing a lot of useful data in this area.</p>

<p>Hopefully this is useful for others too and highlights some of the possibilities for using open data from a local perspective. The Bath Hacked store could become a useful source of information for adding important context to local news stories and economic trends.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Bath Hacked]]></title>
    <link href="http://datasulis.github.io/blog/2014/09/06/bath-hacked/"/>
    <updated>2014-09-06T08:03:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2014/09/06/bath-hacked</id>
    <content type="html"><![CDATA[<p>It&rsquo;s been over two years since I posted to this blog. At the time I had good intentions: I was going to post regularly and try my best to start an open data community here in Bath. Unfortunately, as always happens, real life intervened. A few months later I ended up changing jobs and went <a href="http://consulting.ldodds.com/">freelance</a>. The result was that I had no time for side projects as I was taking on as much work as possible to get the business started.</p>

<p>I&rsquo;ve never completely given up on the idea behind DataSulis though. While I&rsquo;ve not had much time to do any visible work I&rsquo;ve continued to research open datasets that might be useful to the local community. My work with the Open Data Institute and others has also given me some useful experience.</p>

<p>I was about to kick-start this project again when I discovered <a href="http://www.bathhacked.org/">Bath: Hacked</a>. It turns out that I&rsquo;m not the only person passionate about open data in Bath and they&rsquo;ve been extremely busy!</p>

<p>The <a href="http://twitter.com/BathHacked">@BathHacked</a> team have been working with B&amp;NES to open up some datasets. After a successful hack day earlier this year they&rsquo;ve now launched <a href="http://data.bathhacked.org">a beta data store based on Socrata</a>.</p>

<p>As a result of this I&rsquo;ve been reworking the code in the <a href="https://github.com/datasulis">DataSulis github account</a> to add support for posting the data to Socrata. I&rsquo;ve also been spending time exploring the Socrata platform. I volunteered to help out with BathHacked and have agreed to help manage the data store, to help people get the most out of it and ensure that the datasets are well-published.</p>

<p>There&rsquo;s some interesting datasets in the store already, including some continuously updated air quality data taken from sensors around the city. B&amp;NES are running an <a href="http://www.bathhacked.org/news/air-quality-hack-20-september/">air quality hack day</a> in a few weeks to encourage developers to use the data to build some interesting applications.</p>

<p>While I&rsquo;ll be blogging on the Bath Hacked website, contributing to <a href="https://github.com/bathhacked">the BathHacked github project</a>, etc. I&rsquo;ve decided to revive this project. My plan is to use this site to write about my own personal perspective on open data in Bath, publish investigations of useful datasets, and share updates on my own hacking with local open data.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[How You Can Help. Yes, You!]]></title>
    <link href="http://datasulis.github.io/blog/2012/03/29/how-you-can-help-yes/"/>
    <updated>2012-03-29T16:32:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2012/03/29/how-you-can-help-yes</id>
    <content type="html"><![CDATA[<p>After I posted a link to this project on Twitter last night, there was some really encouraging feedback from the local community. Looks like there&rsquo;s some interest in a Bath based hackday and cider. Mainly cider in fact, but that&rsquo;s all good :)</p>

<p>One question I got asked is: how do we move this forward? I thought I&rsquo;d post some ideas about next steps, as well as some suggestions for how people can get involved.</p>

<!-- More -->


<p>First of all though let&rsquo;s consider why we might want to collate some local data. There are a number of good reasons. The primary reason is to have free, Open Data that can be used to create some useful applications. Those apps can be useful not just for local people, but also for the thousands of tourists that visit the city each year.</p>

<p>Those applications could be built by local startups and provide a way for them to earn a few quid. The apps could also help local businesses increase their revenues by attracting more customers. If we&rsquo;re provided with better data from the local council, or central government, we can all be better engaged and help it deliver better services to us.</p>

<p>Open Data is pure win.</p>

<p>With these kinds of uses in mind, what kinds of data might we want to collect:</p>

<ul>
<li>Government data &ndash; There&rsquo;s a whole range of potentially useful, interesting statistics published by central government, or by the likes of the <a href="http://www.dh.gov.uk/en/Publicationsandstatistics/Statistics/Performancedataandstatistics/">NHS</a>, that could be re-purposed for local use</li>
<li>Local Government data &ndash; Planning applications, health &amp; safety inspections, etc.</li>
<li>Business listings &ndash; How about a free, locally maintained database of local businesses, with contact details, up to date information on opening-closing times, twitter accounts, blogs, and calendars?</li>
<li>Travel information &ndash; Bus and train times. Cycle paths</li>
<li>News &ndash; Searchable indexes of news sourced both locally as well as from national newspapers, e.g. the Guardian</li>
<li>Reviews &ndash; local restaurant, business, and event reviews</li>
<li>Jobs &ndash; local job adverts</li>
<li>Social Network &ndash; <a href="http://welovebath.co.uk">WeLoveBath</a> is becoming the focal point of Bath&rsquo;s social network. Who is in it and how do they connect up? What about Facebook, Foursquare, etc?</li>
<li>Media &ndash; <a href="http://www.flickr.com/groups/bath/">Flickr</a> and <a href="http://www.geograph.org.uk">Geograph</a> are full of pictures of Bath. Wouldn&rsquo;t it be nice to have a dataset of Creative Commons licensed photos for use in local applications? Maybe to power a new kind of tourist guide? What about sound and video? Music?</li>
<li>Events &ndash; There&rsquo;s a lot going on in Bath for a relatively small city. I&rsquo;d love to see a central database and calendar of events across the city, not just in the Theatre, or Komedia, but all of the pub nights, reading groups and other local events.</li>
</ul>


<p>&hellip;and there&rsquo;s a whole lot more. Cultural heritage, local walking routes, weather, etc. I&rsquo;ve even had crazy ideas about mapping out all the species of trees in the Arboretum in Victoria Park. Maybe a school project?!</p>

<p>So how can you (yes, you!) get involved?</p>

<ul>
<li>Are you a geek? Yes? Then why not:

<ul>
<li>Share some code showing how to query, collect or scrape some data together for other developers to build on? A lot of data is out there already, we just need to make it easier to find and access</li>
<li>Use some of the data showcased here to create a useful application or visualisation?</li>
<li>Help build an application or service to help crowd-source some data?</li>
</ul>
</li>
<li>Are you a non-geek (aka Normal Person)? Then how about:

<ul>
<li>Sharing some thoughts on what kinds of local application or service you&rsquo;d like to see?</li>
<li>Curating some data using a Google Spreadsheet? No coding required, but the data is still easy to share with others</li>
</ul>
</li>
<li>Are you a local business or firm? Then how about:

<ul>
<li>Getting in touch to see how you could help share some of your data for others to use?</li>
</ul>
</li>
</ul>


<p>I&rsquo;ll happily point to whatever people hack, build, or share from this blog, just send in pointers. I&rsquo;ll try and showcase as much as possible whilst continuing to add more content and collated data myself.</p>

<p>I&rsquo;m keen to take a lo-fi approach and just do the simplest things necessary to help collect some data and make it easily accessible to whoever needs it. There are lots of useful free tools available. For example <a href="http://www.google.com/google-d-s/forms/">Google Spreadsheet Forms</a> are a quick and dirty way to crowdsource some basic information.</p>

<p>To really get things going I suggest we first try and collate some more data, and then maybe have a local hackday to try to build something on what we&rsquo;ve collected. And/or have a focused hacking session to collect together even more data. Or both.</p>

<p>Oh, and the cider. Don&rsquo;t forget the cider.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[House Price Data From the Land Registry]]></title>
    <link href="http://datasulis.github.io/blog/2012/03/28/house-price-data-from-the-land-registry/"/>
    <updated>2012-03-28T20:15:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2012/03/28/house-price-data-from-the-land-registry</id>
    <content type="html"><![CDATA[<p>Like the Ordnance Survey, the Land Registry have recently started to <a href="http://www1.landregistry.gov.uk/market-trend-data/public-data">publish some Open Data</a>. That data includes statistics on transactions made against the Land Registry database as well as &ldquo;price paid&rdquo; data.</p>

<p>As <a href="http://www1.landregistry.gov.uk/market-trend-data/price-paid-data">the Land Registry website explains</a>, this data relates to:</p>

<blockquote><p>residential property sales in England and Wales that are lodged with us for registration. The data includes:</p>

<ul>
<li>the full address of the property (Primary addressable object name (PAON), Secondary addressable object name (SAON), street, postcode, locality (if available), town, district, county)</li>
<li>the price paid for the property</li>
<li>the date of transfer</li>
<li>the property type (Detached, Semi, Terraced, Flat/Maisonette)</li>
<li>whether the property is new build or not</li>
<li>whether the property is freehold or leasehold.</li>
</ul>
</blockquote>

<p>We can filter their data to grab just the Bath prices.</p>

<!-- More -->


<p>There is a simple script in <a href="https://github.com/datasulis/bath-house-prices">this github project</a> which looks at the Land Registry website and grabs whatever CSV files are available. The CSV files are then read and filtered to just grab the data that relates to properties in the BA1 and BA2 area.</p>
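
<p>The filtering step can be sketched roughly as follows. This is not the script from the github project, just a minimal illustration: the sample rows are invented, and the assumption that the postcode sits in the fourth column should be checked against the Land Registry documentation.</p>

```ruby
require 'csv'

# Invented sample rows standing in for a downloaded price paid file.
# Assumed layout: id, price, date of transfer, postcode, property type.
sample = [
  'a1,250000,2012-01-10,BA1 5AB,T',
  'a2,180000,2012-01-12,BS1 4XY,F',
  'a3,320000,2012-01-20,BA2 3PL,S',
  'a4,150000,2012-01-25,BA11 1AA,T'
].join("\n")

# Keep only rows whose postcode district (the part before the space)
# is BA1 or BA2. Comparing the whole district avoids matching BA11 etc.
# postcode_column is a zero-based index and is an assumption.
def bath_rows(csv_text, postcode_column = 3)
  CSV.parse(csv_text).select do |row|
    district = row[postcode_column].to_s.split(' ').first
    %w[BA1 BA2].include?(district)
  end
end

bath_rows(sample).each { |row| puts row.join(',') }
```

<p>Comparing the full district rather than using a simple prefix match matters here: a check for postcodes starting with &ldquo;BA1&rdquo; would also pick up districts such as BA11, which are outside Bath.</p>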

<p>Currently the Land Registry are publishing this data on a monthly basis, so there is only a single month available currently. I expect that more data will appear over time. Here&rsquo;s <a href="https://github.com/datasulis/bath-house-prices/blob/master/data/bath-house-prices.csv">how it looks today</a>.</p>

<p>For more background on the codes used in the data then <a href="http://www1.landregistry.gov.uk/market-trend-data/faqs#m18">read the Land Registry FAQ</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The NHS in Bath]]></title>
    <link href="http://datasulis.github.io/blog/2012/03/28/the-nhs-in-bath/"/>
    <updated>2012-03-28T19:20:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2012/03/28/the-nhs-in-bath</id>
    <content type="html"><![CDATA[<p>The <a href="http://www.connectingforhealth.nhs.uk/systemsandservices/data/ods">NHS Organisation Data Service</a> has a responsibility to help various parts of the NHS, and affiliated organisations, exchange information as efficiently as possible. Part of that activity involves maintaining a database of organisations relevant to the NHS. That includes everything from NHS Primary Care Trusts through to individual Pharmacies.</p>

<p>Their data is published under the Open Government Licence so can be freely reused. Let&rsquo;s take a look at what it contains and how we can grab a local extract.</p>

<!-- More -->


<p>If you visit the <a href="http://www.connectingforhealth.nhs.uk/systemsandservices/data/ods">ODS Website</a> you can see that they publish a large number of CSV files that contain data about organisations and their relationships. The structure of the CSV files is well-documented and the files are well-normalised so they&rsquo;re easy to process.</p>

<p>Some of the data is <a href="http://www.connectingforhealth.nhs.uk/systemsandservices/data/ods/datafiles">updated on a weekly basis</a>. This is the core NHS organisational data. Other data, such as that <a href="http://www.connectingforhealth.nhs.uk/systemsandservices/data/ods/genmedpracs">listing GPs and Branch Surgeries</a> is updated less frequently.</p>

<p>The data is useful for two reasons:</p>

<ul>
<li>If you want to draw maps showing, e.g. location of pharmacies then there&rsquo;s well normalised address data that can be geocoded.</li>
<li>If you want to link up government statistics with individual medical practices or service providers, then you&rsquo;ll need the unique identifiers</li>
</ul>


<p>I&rsquo;ve previously taken all of the ODS data and <a href="http://kasabi.com/dataset/nhs-organization">loaded it into Kasabi</a>. Kasabi is my day job: it&rsquo;s a data marketplace for hosting and publishing data. If you want to query the data online then you could use any of the range of APIs available from there. Sign-up is free.</p>

<p>For the purposes of the DataSulis project we&rsquo;re only really interested in the data about the NHS in Bath, i.e. BA1 and BA2. I wrote <a href="https://github.com/datasulis/bath-nhs">some Ruby scripts</a> to download the latest ODS files and then extract those organisations that have the relevant postcodes. The filtered CSV files are then cached locally. You could use them as the basis for further processing.</p>

<p>To illustrate the output I&rsquo;ve put <a href="https://github.com/datasulis/bath-nhs/tree/master/data">a snapshot of the data into github too</a>. For example here&rsquo;s a CSV file containing a list of the <a href="https://github.com/datasulis/bath-nhs/blob/master/data/nhs-ods-egdpprac.csv">Dental Practices in Bath</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Bath Postcodes]]></title>
    <link href="http://datasulis.github.io/blog/2012/03/27/bath-postcodes/"/>
    <updated>2012-03-27T14:06:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2012/03/27/bath-postcodes</id>
    <content type="html"><![CDATA[<p>If we&rsquo;re going to be working with local data then it makes sense to have a list of local postcodes. There&rsquo;s all kinds of data that can be usefully linked or combined based on postcode information. For example you could aggregate statistics on crime rates or house prices. Happily the Ordnance Survey now publish some useful Open Data about UK postcodes.</p>

<p>So let&rsquo;s look at how we can work with their data to query it and extract it for local use.</p>

<!--More-->


<p>Here&rsquo;s a quick primer on postcodes. Postcodes have a structure to them. Each of the different parts of a postcode refers to a different area and those areas have a hierarchical relationship. Here are some examples along with the name that the Ordnance Survey (OS) uses to describe them:</p>

<ul>
<li><a href="http://data.ordnancesurvey.co.uk/id/postcodearea/BA">BA</a> = Post Code Area</li>
<li><a href="http://data.ordnancesurvey.co.uk/id/postcodedistrict/BA2">BA2</a> = Post Code District</li>
<li><a href="http://data.ordnancesurvey.co.uk/id/postcodesector/BA23">BA2 3</a> = Post Code Sector</li>
<li><a href="http://data.ordnancesurvey.co.uk/id/postcodeunit/BA23PL">BA2 3PL</a> = Post Code Unit</li>
</ul>


<p>So &ldquo;BA2 3PL&rdquo; is within a sector called &ldquo;BA2 3&rdquo;, and so on. The OS publish their data in various ways, including as <a href="http://en.wikipedia.org/wiki/Linked_Data">Linked Data</a>. Without going into details, Linked Data is just a way to publish data to the web by giving everything a unique URL.</p>
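
<p>To make the hierarchy concrete, here is a small Ruby sketch that splits a postcode unit into its parent areas. The regular expression is a simplification written for this example (it is not an official validator) and the method name is made up.</p>

```ruby
# Split a postcode unit such as "BA2 3PL" into the four levels
# described above. Returns nil if the input doesn't look like a postcode.
def postcode_parts(postcode)
  # Captures: area letters, rest of district, sector digit, unit letters.
  match = postcode.upcase.match(/\A([A-Z]{1,2})(\d[A-Z\d]?)\s*(\d)([A-Z]{2})\z/)
  return nil if match.nil?

  area = match[1]
  district = area + match[2]
  sector = "#{district} #{match[3]}"
  {
    area: area,
    district: district,
    sector: sector,
    unit: sector + match[4]
  }
end

p postcode_parts('BA2 3PL')
```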

<p>So based on a postcode or part of a postcode you can build a URL to the OS website and use it to grab some data. For example here&rsquo;s <a href="http://data.ordnancesurvey.co.uk/doc/postcodeunit/BA23PL.json">a JSON description of BA23PL</a>. That means that you can quickly look up some useful data such as the lat/long of the centre of a postcode, or discover in which electoral ward it lies. More on those alternate geographic regions in another post.</p>
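
<p>Building those URLs is simple string manipulation. A minimal sketch, assuming the URL pattern in the link above holds for any postcode unit (the method name is hypothetical):</p>

```ruby
# Build the URL of the JSON description for a postcode unit.
# The OS URLs use the postcode without any spaces, e.g. BA23PL.
def os_postcode_json_url(postcode)
  code = postcode.upcase.delete(' ')
  "http://data.ordnancesurvey.co.uk/doc/postcodeunit/#{code}.json"
end

puts os_postcode_json_url('BA2 3PL')
```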

<p>Sometimes though you just want to grab some data for local processing. Having a list of local postcodes can help drive some address matching or other data processing task. So how can we get a complete list of local postcodes?</p>

<p>The OS allow you to <a href="http://www.ordnancesurvey.co.uk/oswebsite/products/os-opendata.html">download data from their site</a>, so you could grab all of the postcode dataset and process it to extract what you need. But there&rsquo;s a simpler way. The OS also provide an API called a &ldquo;SPARQL Endpoint&rdquo; for their data. SPARQL is a query language for working with RDF; it&rsquo;s basically a way to query a graph of Linked Data to extract the bits you need.</p>

<p>Here&rsquo;s a SPARQL query that will fetch data about all of the Post Code Units that are within the BA1 or BA2 Post Code Districts:</p>

<div><script src='https://gist.github.com/2220182.js'></script>
<noscript><pre><code>PREFIX po: &lt;http://data.ordnancesurvey.co.uk/ontology/postcode/&gt;
PREFIX spatial: &lt;http://data.ordnancesurvey.co.uk/ontology/spatialrelations/&gt;
PREFIX skos: &lt;http://www.w3.org/2004/02/skos/core#&gt;
PREFIX geo: &lt;http://www.w3.org/2003/01/geo/wgs84_pos#&gt;
SELECT ?id ?code ?latitude ?longitude WHERE {
  {
      ?id a po:PostcodeUnit;
         spatial:within &lt;http://data.ordnancesurvey.co.uk/id/postcodedistrict/BA1&gt;;
         geo:lat ?latitude;
         geo:long ?longitude;
         skos:notation ?code.
  }
  UNION
  {
      ?id a po:PostcodeUnit;
         spatial:within &lt;http://data.ordnancesurvey.co.uk/id/postcodedistrict/BA2&gt;;
         geo:lat ?latitude;
         geo:long ?longitude;
         skos:notation ?code.
  }
  
}</code></pre></noscript></div>


<p>If we submit that query to the <a href="http://api.talis.com/stores/ordnance-survey/services/sparql">OS SPARQL Endpoint</a> then we can extract just the data we need.</p>

<p>Here&rsquo;s some simple Ruby code that does exactly that. It requests that the SPARQL API return the data as JSON and then spits it out as a simple CSV file.</p>

<figure class='code'><figcaption><span>Postcodes to CSV </span></figcaption>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;json&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;net/http&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;cgi&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;csv&#39;</span>
</span><span class='line'>
</span><span class='line'><span class="n">dir</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="bp">__FILE__</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">query</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">read</span><span class="p">(</span> <span class="no">File</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span> <span class="s2">&quot;..&quot;</span><span class="p">,</span> <span class="s2">&quot;rq&quot;</span><span class="p">,</span> <span class="s2">&quot;list-bath-postcode-units.rq&quot;</span><span class="p">)</span> <span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="no">Net</span><span class="o">::</span><span class="no">HTTP</span><span class="o">.</span><span class="n">start</span><span class="p">(</span><span class="s1">&#39;api.talis.com&#39;</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">http</span><span class="o">|</span>
</span><span class='line'>  <span class="n">req</span> <span class="o">=</span> <span class="no">Net</span><span class="o">::</span><span class="no">HTTP</span><span class="o">::</span><span class="no">Get</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">&quot;/stores/ordnance-survey/services/sparql?output=json&amp;query=</span><span class="si">#{</span><span class="no">CGI</span><span class="o">.</span><span class="n">escape</span><span class="p">(</span><span class="n">query</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
</span><span class='line'>  <span class="n">response</span> <span class="o">=</span> <span class="n">http</span><span class="o">.</span><span class="n">request</span><span class="p">(</span><span class="n">req</span><span class="p">)</span>
</span><span class='line'>  <span class="n">postcodes</span> <span class="o">=</span> <span class="no">JSON</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span> <span class="n">response</span><span class="o">.</span><span class="n">body</span> <span class="p">)</span>
</span><span class='line'>  <span class="no">CSV</span><span class="o">.</span><span class="n">open</span><span class="p">(</span> <span class="no">File</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span> <span class="s2">&quot;..&quot;</span><span class="p">,</span> <span class="s2">&quot;data&quot;</span><span class="p">,</span> <span class="s2">&quot;bath-postcodes.csv&quot;</span><span class="p">),</span> <span class="s2">&quot;w&quot;</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">csv</span><span class="o">|</span>
</span><span class='line'>    <span class="n">postcodes</span><span class="o">[</span><span class="s2">&quot;results&quot;</span><span class="o">][</span><span class="s2">&quot;bindings&quot;</span><span class="o">].</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">postcode</span><span class="o">|</span>
</span><span class='line'>      <span class="n">csv</span> <span class="o">&lt;&lt;</span> <span class="o">[</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;id&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span><span class="p">,</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;code&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span><span class="p">,</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;latitude&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span><span class="p">,</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;longitude&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span> <span class="o">]</span>
</span><span class='line'>    <span class="k">end</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>
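
<p>The <code>results</code> and <code>bindings</code> keys used above come from the standard SPARQL JSON results format, where each binding maps a variable name to an object with a <code>value</code>. The sketch below parses a hand-written response of that shape; the coordinates are placeholder values, not real OS data.</p>

```ruby
require 'json'

# Hand-written example of a SPARQL JSON result with a single binding.
# The latitude and longitude are placeholders for illustration only.
body = '{
  "head": { "vars": ["id", "code", "latitude", "longitude"] },
  "results": { "bindings": [
    { "id": { "type": "uri", "value": "http://data.ordnancesurvey.co.uk/id/postcodeunit/BA23PL" },
      "code": { "type": "literal", "value": "BA2 3PL" },
      "latitude": { "type": "literal", "value": "0.0" },
      "longitude": { "type": "literal", "value": "0.0" } }
  ] }
}'

parsed = JSON.parse(body)
vars = parsed['head']['vars']

# Turn each binding into a flat row, in the order given by head.vars.
rows = parsed['results']['bindings'].map do |binding|
  vars.map { |name| binding[name]['value'] }
end

rows.each { |row| puts row.join(',') }
```

<p>Reading the column order from <code>head.vars</code> rather than hard-coding it means the same loop keeps working if the SELECT clause of the query changes.</p>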


<p>Here&rsquo;s another version that generates a JSON description instead.</p>

<figure class='code'><figcaption><span>Postcodes to JSON </span></figcaption>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;json&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;net/http&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;cgi&#39;</span>
</span><span class='line'>
</span><span class='line'><span class="n">dir</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="bp">__FILE__</span><span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="n">query</span> <span class="o">=</span> <span class="no">File</span><span class="o">.</span><span class="n">read</span><span class="p">(</span> <span class="no">File</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span> <span class="s2">&quot;..&quot;</span><span class="p">,</span> <span class="s2">&quot;rq&quot;</span><span class="p">,</span> <span class="s2">&quot;list-bath-postcode-units.rq&quot;</span><span class="p">)</span> <span class="p">)</span>
</span><span class='line'>
</span><span class='line'><span class="no">Net</span><span class="o">::</span><span class="no">HTTP</span><span class="o">.</span><span class="n">start</span><span class="p">(</span><span class="s1">&#39;api.talis.com&#39;</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">http</span><span class="o">|</span>
</span><span class='line'>  <span class="n">req</span> <span class="o">=</span> <span class="no">Net</span><span class="o">::</span><span class="no">HTTP</span><span class="o">::</span><span class="no">Get</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">&quot;/stores/ordnance-survey/services/sparql?output=json&amp;query=</span><span class="si">#{</span><span class="no">CGI</span><span class="o">.</span><span class="n">escape</span><span class="p">(</span><span class="n">query</span><span class="p">)</span><span class="si">}</span><span class="s2">&quot;</span><span class="p">)</span>
</span><span class='line'>  <span class="n">response</span> <span class="o">=</span> <span class="n">http</span><span class="o">.</span><span class="n">request</span><span class="p">(</span><span class="n">req</span><span class="p">)</span>
</span><span class='line'>  <span class="n">postcodes</span> <span class="o">=</span> <span class="no">JSON</span><span class="o">.</span><span class="n">parse</span><span class="p">(</span> <span class="n">response</span><span class="o">.</span><span class="n">body</span> <span class="p">)</span>
</span><span class='line'>  <span class="n">output</span> <span class="o">=</span> <span class="p">{</span>
</span><span class='line'>    <span class="s2">&quot;postcodes&quot;</span> <span class="o">=&gt;</span> <span class="p">{}</span>
</span><span class='line'>  <span class="p">}</span>
</span><span class='line'>  <span class="n">postcodes</span><span class="o">[</span><span class="s2">&quot;results&quot;</span><span class="o">][</span><span class="s2">&quot;bindings&quot;</span><span class="o">].</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">postcode</span><span class="o">|</span>
</span><span class='line'>    <span class="n">output</span><span class="o">[</span><span class="s2">&quot;postcodes&quot;</span><span class="o">][</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;code&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span> <span class="o">]</span> <span class="o">=</span> <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;id&quot;</span> <span class="o">=&gt;</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;id&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;latitude&quot;</span> <span class="o">=&gt;</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;latitude&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;longitude&quot;</span> <span class="o">=&gt;</span> <span class="n">postcode</span><span class="o">[</span><span class="s2">&quot;longitude&quot;</span><span class="o">][</span><span class="s2">&quot;value&quot;</span><span class="o">]</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>  <span class="no">File</span><span class="o">.</span><span class="n">open</span><span class="p">(</span> <span class="no">File</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">dir</span><span class="p">,</span> <span class="s2">&quot;..&quot;</span><span class="p">,</span> <span class="s2">&quot;data&quot;</span><span class="p">,</span> <span class="s2">&quot;bath-postcodes.json&quot;</span><span class="p">),</span> <span class="s2">&quot;w&quot;</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">file</span><span class="o">|</span>
</span><span class='line'>    <span class="n">file</span><span class="o">.</span><span class="n">puts</span> <span class="no">JSON</span><span class="o">.</span><span class="n">pretty_generate</span><span class="p">(</span><span class="n">output</span><span class="p">)</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>Here are the generated files as both <a href="https://github.com/datasulis/bath-postcodes/raw/master/data/bath-postcodes.csv">CSV</a> and <a href="https://github.com/datasulis/bath-postcodes/raw/master/data/bath-postcodes.json">JSON</a>. They are provided to save you regenerating them yourself, but you shouldn&rsquo;t assume they will always be up to date.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Hello Bath Hackers]]></title>
    <link href="http://datasulis.github.io/blog/2012/03/27/hello-bath-hackers/"/>
    <updated>2012-03-27T13:44:00+01:00</updated>
    <id>http://datasulis.github.io/blog/2012/03/27/hello-bath-hackers</id>
    <content type="html"><![CDATA[<p>First post!</p>

<p>This site is an experiment in creating a collection of useful pointers, blog posts, datasets and maybe even APIs for local hackers in Bath. As I&rsquo;ve explained a little on the <a href="http://datasulis.org/about">about</a> page, I&rsquo;m interested in exploring how a more local, city-level view of Open Data could help support some interesting, innovative hacking amongst the local geek community.</p>

<p>As the recent Bath Digital Festival has shown, Bath has developed a really amazing local tech community. WeLoveBath also shows that Bath has a great community of engaged citizens. But I&rsquo;ve not seen any efforts to start curating local datasets that can help both of those communities do interesting, innovative things in our local area.</p>

<p>This site sets out to see if we can remedy that. Basically, I&rsquo;ve spent a lot of time over the last few years doing data modelling and munging, and I decided it&rsquo;s time to use those skills for good :)</p>

<p>I expect this will be something of a journey, and I&rsquo;m hoping that some of you out there will want to join in and help out. If it&rsquo;s successful, then maybe we can run a BathCamp or hackday dedicated to local data projects.</p>

<p>It doesn&rsquo;t matter if you&rsquo;re a geek or not: I&rsquo;m hoping that we can explore lots of ways to curate and collect local data. I&rsquo;m as interested in running some local crowd-sourcing experiments as I am in wrangling data from various APIs.</p>

<p>The project is intended to be open from top to bottom, so all of the content, source code and data should be up for re-use.</p>

<p>Let&rsquo;s see what we can make.</p>
]]></content>
  </entry>
  
</feed>
