Treemapping the UK's migrant population

5 Apr 2016 08:04 GMT

It's been exciting seeing version 4.0 of D3 develop, and last week Mike Bostock announced that d3-hierarchy was now included in the alpha. This module provides various ways of visualising hierarchical data, including treemaps.

I have wanted to explore using treemaps for some time, as much of the data I work with could potentially be presented in this way, so this seemed like an ideal opportunity to start experimenting.

My first attempt is a treemap showing the foreign-born population of the UK broken down by country of birth. The countries are grouped into broad global regions, using a particular arrangement of the new country groupings the Office for National Statistics has recently introduced in reporting migration statistics. Within the group for the European Union, the EU14, EU8, and EU2 are grouped separately.

The figures are taken from the most recent quarterly Labour Force Survey, which is for the fourth quarter of 2015. They are estimates of all people born abroad who were living in the UK at the time of the survey, excluding two small groups: those born in British overseas territories, and those who did not fully specify their country of birth. (The latter group consists of people who, for example, said they were born in the USSR but did not say which current country that would be.)

The treemap itself is relatively simple and leans heavily on Mike Bostock's example code, with just a few presentational tweaks. But it's the first time I have seen this data (which is very familiar to me) laid out in this way. It was extremely simple to get this working and I am looking forward to delving deeper into d3-hierarchy to explore what else is possible.

Update on Population Builder

11 Feb 2016 20:32 GMT

I haven't had enough time to post about work on Population Builder in recent months, but there have been a few developments that I think are worth sharing.

New features

I had a few requests from regular users for features that seemed worthwhile and relatively easy to implement. By far the most requested feature was a button to deselect all the currently selected areas with a single click, without having to reload the page. This has now been added. Clicking or tapping “Clear Map” does what you would expect.

More interestingly, a couple of people have told me they are using the app as a general purpose tool for selecting LSOAs and data zones, in order to identify the area codes they need for geographical analysis of other data. To help with this, I have added a feature that allows you to inspect the geographical code of each area.

On a computer with a mouse pointer, hovering over an area highlights its boundary and displays its area code in the overlay control in the bottom right-hand corner.

A screenshot from the application showing a highlighted boundary.

You can highlight an area on a touchscreen device by long-pressing it, which also displays the area code in the overlay control. Another long-press on the same area dismisses the highlight, while long-pressing another area highlights the new area instead. This makes it possible to visually match an area to its code.

New data

The app has been updated to use the latest mid-year population estimates for small areas, which are for mid-2014. I updated the stats in November shortly after the ONS published the latest figures.

Updating the figures for Scotland also meant updating the maps to use 2011-based data zones. So if the boundaries in Scotland seem different, that's because they are.

Open source

The complete source code for Population Builder is now available on GitHub. Of all the things covered here, this was the most work, as I wanted to port the server-side code from PHP to Go before sharing it online. This is partly because Go makes it much easier to download and run the software locally as a standalone application, but also because I've wanted to move to a more modern web-development stack for a few years now and this seemed like a good place to start. Now it's done I can concentrate on new things.

An alternative file server for Go

28 Dec 2015 16:19 GMT

Go's net/http package contains a simple FileServer that lets you map a request path to a directory and serves files under that path like a normal webserver. It's a handy way to serve static files from inside an otherwise dynamic web application.

Go's built-in FileServer is fine if you're just using it for development, but it has two shortcomings if you want to use it in production. First, when it can't find a requested file it responds with a very basic plain text 404 page. Second, when the user requests a path to a directory which does not contain an index page, it shows the directory listing, with no easy way to turn that off.
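Both behaviours are easy to see for yourself. The following is a minimal sketch, assuming a throwaway temporary directory and a /static/ URL prefix; the helper names staticHandler, get and demo are made up for the illustration. It uses httptest to make requests without starting a real server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// staticHandler serves the files in dir under the /static/ URL prefix
// using the standard library's FileServer.
func staticHandler(dir string) http.Handler {
	mux := http.NewServeMux()
	files := http.FileServer(http.Dir(dir))
	mux.Handle("/static/", http.StripPrefix("/static/", files))
	return mux
}

// get sends a GET request for target to h and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

// demo creates a temporary directory holding one file (and no index page),
// then requests a file that exists, a file that does not, and the directory.
func demo() (found, missing, listing int, err error) {
	dir, err := os.MkdirTemp("", "static-demo")
	if err != nil {
		return 0, 0, 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "hello.txt"), []byte("hello\n"), 0o644); err != nil {
		return 0, 0, 0, err
	}
	h := staticHandler(dir)
	found, _ = get(h, "/static/hello.txt")
	missing, _ = get(h, "/static/missing.txt")
	listing, _ = get(h, "/static/")
	return found, missing, listing, nil
}

func main() {
	found, missing, listing, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("existing file:", found)
	fmt.Println("missing file:", missing)       // the plain text 404
	fmt.Println("directory, no index:", listing) // the directory listing
}
```

The request for the directory succeeds and returns a generated listing, which is the behaviour you usually don't want in production.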

The FileHandler in the handlers package is an alternative file server that addresses both of these problems. FileHandler lets you specify your own handler for serving 404 pages, and it responds to any request for a directory that does not contain an index page with that handler, rather than show a directory listing. FileHandler is just a wrapper around Go's ServeFile function, so it still has all the goodness of Go's built-in file handling.

The handlers package also contains a simple NotFoundHandler which you can use with FileHandler. But you don't have to. FileHandler can use any type that satisfies the Handler interface to serve custom 404 pages when no matching file is found.
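To illustrate the general idea, here is a rough sketch of a handler that wraps ServeFile and falls back to a custom 404 handler. It is not the code or the API of the handlers package: the fileHandler type, its fields, and the notFound, get and demo helpers are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"os"
	"path"
	"path/filepath"
)

// fileHandler is a hypothetical type: it serves files below root and hands
// every other request to the notFound handler.
type fileHandler struct {
	root     string
	notFound http.Handler
}

func (f fileHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Cleaning a rooted path removes any ".." elements, so the joined
	// name cannot point outside root.
	name := filepath.Join(f.root, filepath.FromSlash(path.Clean("/"+r.URL.Path)))
	info, err := os.Stat(name)
	if err == nil && info.IsDir() {
		// A directory is only served if it contains an index page.
		name = filepath.Join(name, "index.html")
		info, err = os.Stat(name)
	}
	if err != nil || info.IsDir() {
		f.notFound.ServeHTTP(w, r)
		return
	}
	http.ServeFile(w, r, name)
}

// notFound is a stand-in for a custom 404 page.
func notFound(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusNotFound)
	fmt.Fprint(w, "custom 404")
}

// get sends a GET request for target to h and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

// demo requests an existing file, a missing file, and a directory that has
// no index page, all from a temporary directory.
func demo() (fileCode, missingCode, dirCode int, dirBody string, err error) {
	dir, err := os.MkdirTemp("", "handler-demo")
	if err != nil {
		return 0, 0, 0, "", err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "hello.txt"), []byte("hello\n"), 0o644); err != nil {
		return 0, 0, 0, "", err
	}
	if err := os.Mkdir(filepath.Join(dir, "empty"), 0o755); err != nil {
		return 0, 0, 0, "", err
	}
	h := fileHandler{root: dir, notFound: http.HandlerFunc(notFound)}
	fileCode, _ = get(h, "/hello.txt")
	missingCode, _ = get(h, "/missing.txt")
	dirCode, dirBody = get(h, "/empty/")
	return fileCode, missingCode, dirCode, dirBody, nil
}

func main() {
	fileCode, missingCode, dirCode, dirBody, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println(fileCode, missingCode, dirCode, dirBody)
}
```

Because the fallback is just an http.Handler, anything from a one-line function to a full templated error page can be plugged in.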


16 Nov 2015 22:51 GMT

Decimals is a small library of functions for rounding and formatting base ten numbers in Go. Most of the software I write involves presenting numerical data, and Go's standard library lacks convenient methods for displaying numbers in a way humans find easy to read.

Decimals uses simple and consistent rules for rounding and formatting numbers across all its functions. There are other formatting libraries that offer a wider range of output formats, but decimals aims to make it as easy as possible to deal flexibly and extensibly with the most common case.
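To give a flavour of the problem, this is roughly what rounding and formatting by hand looks like with only the standard library. It is an illustrative sketch and not the decimals API: the function names roundTo and formatThousands are invented for this example.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// roundTo rounds x to the given number of decimal places.
func roundTo(x float64, places int) float64 {
	scale := math.Pow(10, float64(places))
	return math.Round(x*scale) / scale
}

// formatThousands formats a finite x with a fixed number of decimal places
// and commas between each group of three integer digits.
func formatThousands(x float64, places int) string {
	s := strconv.FormatFloat(x, 'f', places, 64)
	sign := ""
	if strings.HasPrefix(s, "-") {
		sign, s = "-", s[1:]
	}
	intPart, frac := s, ""
	if i := strings.IndexByte(s, '.'); i >= 0 {
		intPart, frac = s[:i], s[i:]
	}
	var b strings.Builder
	for i, d := range intPart {
		// Insert a comma whenever the remaining digits form whole groups of three.
		if i > 0 && (len(intPart)-i)%3 == 0 {
			b.WriteByte(',')
		}
		b.WriteRune(d)
	}
	return sign + b.String() + frac
}

func main() {
	fmt.Println(roundTo(1234.5678, 2))
	fmt.Println(formatThousands(1234567.891, 2))
	fmt.Println(formatThousands(-9876.54, 1))
}
```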

See the README for a full write-up or jump straight into the GoDoc.

How to learn Go

11 Nov 2015 21:33 GMT

During the last two months I have been learning to program with Go. Go is a general purpose programming language which has features that make it well-suited for fast concurrent network applications. It is a very enjoyable language to use, in that it is terse, clear, well thought-through, and gets rid of a lot of the drudgery of programming, so that you can focus on building things rather than managing code.

This post is not a tutorial on programming with Go, but a recipe for how to learn the language. It consists of a short primer on the language and a guided reading list, which will take you from knowing nothing about Go to building complete applications.

It is aimed at people who already know at least one other programming language. If you want to learn Go as your first programming language then you should read An Introduction to Programming in Go, and then perhaps come back here for more later.

What sort of language is Go?

Go is a statically typed language, which means every variable has a type that is fixed at compile time, and you have to convert between types explicitly when necessary. This is much less of a pain in practice than in theory, as Go has a number of conveniences for type handling, like a short variable declaration operator (:=) that infers the type of a new variable from what you assign to it. Static typing also has some benefits: it helps expose flawed reasoning earlier rather than later.
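A small sketch of what that looks like in practice. The describe function is just a made-up name for the example.

```go
package main

import "fmt"

// describe declares two variables with inferred types and combines them
// with an explicit conversion.
func describe() string {
	count := 3   // inferred as int
	ratio := 0.5 // inferred as float64

	// count * ratio would not compile: int and float64 cannot be mixed
	// without an explicit conversion.
	scaled := float64(count) * ratio

	return fmt.Sprintf("%T %T %v", count, ratio, scaled)
}

func main() {
	fmt.Println(describe()) // int float64 1.5
}
```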

Go has pointers. Or at least, it has its own version of pointers. Pointers feel daunting at first if you are used to languages that don't use them, but they quickly become second nature. Pointers make explicit whether you are passing by reference or by value, and let you decide which is more appropriate for your own functions and data structures case by case. In Go, a pointer is a special data type that allows for referencing behaviour but does not permit bare-metal (and potentially dangerous) memory-level hackery such as pointer arithmetic.
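The difference is easiest to see with two versions of the same function, one taking a value and one taking a pointer. The counter type and both function names are invented for this sketch.

```go
package main

import "fmt"

// counter is a tiny example type.
type counter struct {
	n int
}

// incrementCopy receives a copy of the counter, so the caller's value
// is left unchanged.
func incrementCopy(c counter) {
	c.n++
}

// incrementInPlace receives a pointer, so it changes the caller's value.
func incrementInPlace(c *counter) {
	c.n++
}

func main() {
	c := counter{}
	incrementCopy(c)
	fmt.Println(c.n) // 0
	incrementInPlace(&c)
	fmt.Println(c.n) // 1
}
```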

Go does not have objects, constructors or traditional object-oriented inheritance. Instead it has user-defined types, which can have methods attached to them, and interfaces, which define a set of methods that a type must have in order to implement them.
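As a minimal sketch, here are two user-defined types that satisfy the same interface simply by having the right method. The shape, rect and square names are made up for the example.

```go
package main

import "fmt"

// shape is satisfied by any type with an Area method. Types do not
// declare that they implement it; having the method is enough.
type shape interface {
	Area() float64
}

type rect struct {
	w, h float64
}

// Area is a method attached to the rect type.
func (r rect) Area() float64 { return r.w * r.h }

type square struct {
	side float64
}

// Area is a method attached to the square type.
func (s square) Area() float64 { return s.side * s.side }

// totalArea works with any mix of types that satisfy shape.
func totalArea(shapes []shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

func main() {
	fmt.Println(totalArea([]shape{rect{2, 3}, square{2}})) // 10
}
```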

Go makes it easy to do concurrent programming safely. Go has a lightweight thread called a goroutine. Any function can be run within its own goroutine simply by calling it with the “go” keyword. Goroutines can communicate with one another through channels, so different threads can easily synchronise their tasks.
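Here is a small sketch of goroutines and a channel working together. The sumOfSquares function is an invented example: it starts one goroutine per number and collects the results over a channel.

```go
package main

import "fmt"

// sumOfSquares squares each number in its own goroutine and adds up the
// results as they arrive on the channel.
func sumOfSquares(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		// The go keyword runs the function call in a new goroutine.
		go func(n int) {
			results <- n * n
		}(n)
	}
	total := 0
	// Receive exactly one result per goroutine started above.
	for range nums {
		total += <-results
	}
	return total
}

func main() {
	fmt.Println(sumOfSquares([]int{1, 2, 3})) // 14
}
```

The results can arrive in any order, which doesn't matter here because addition doesn't care about order.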

Go has playgrounds, which are online notebooks of Go code that you can run in the browser and share. Playgrounds are useful for drafting and testing functions and types in isolation. Go is compiled and can't be run interactively like Python or R, so playgrounds give you a similar sort of freedom to play and experiment with code.

Go is prescriptive. Its creators have strong opinions about how the language should be used, and encourage you not only to learn its idioms, but to understand the reasons for their language design choices. The source code to the standard library can be read online and the authors encourage you to explore it in order to learn how to use the language well.

How you react to this aspect of Go's culture will depend on your personality. I love it. I don't want to waste time fretting over stylistic formatting decisions or how best to organise code files. Go gives clear and unambiguous answers to the questions of how you should use the language, so you can concentrate on solving the programming problems as effectively as possible.

How to get going

Start with the Tour of Go. This is a short interactive guide to the most essential parts of the language that demonstrates its distinctive features. Because the code examples are shown as Go playgrounds, it's easy to discover additional things about the language by testing your intuitions and modifying the example code as you go. Don't expect to learn the language from the tour, though: it's just laying the foundations.

Next, install Go.

Then read How to Write Go Code. This takes you through setting up a Go workspace and the process of writing, building and running Go programs. From this point on you will probably want to start trying things out and it's a good idea to get everything set up and ready.

Then, read Effective Go. This is the full introduction to the language. If you only want to read one thing to learn Go, it's this. It explains not just how the language works but why it works the way it does. It manages to fit an awful lot into a relatively short text.

How to keep going

The official Go Blog contains a back catalogue of articles exploring particular aspects of the language in depth. The article on slices is essential reading for anyone learning Go because slices are one of the most fundamental data structures in the language. The follow-up article on strings is also very helpful, as it shows you how to manipulate Unicode strings using the rune datatype. The article on error handling is another good one for beginners to read because Go does not use the standard try and catch approach.
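As a taster for those three articles, here is a short sketch touching on each topic: slices sharing their underlying array, strings handled as runes, and errors returned as ordinary values. The reverseRunes and parsePercent functions are invented for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// reverseRunes reverses a string by Unicode code point rather than by byte,
// so multi-byte characters survive intact.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// parsePercent converts text such as "12.5" into a fraction. Instead of
// throwing an exception it returns an error value alongside the result.
func parsePercent(s string) (float64, error) {
	v, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("parsing percentage %q: %w", s, err)
	}
	return v / 100, nil
}

func main() {
	// A slice of a slice shares the same underlying array.
	all := []int{1, 2, 3, 4}
	middle := all[1:3]
	middle[0] = 20
	fmt.Println(all) // [1 20 3 4]

	// "h\u00e9llo" is "héllo": five characters but six bytes in UTF-8.
	word := "h\u00e9llo"
	fmt.Println(len(word), len([]rune(word))) // 6 5
	fmt.Println(reverseRunes(word))

	// Errors are checked explicitly at the point where they can occur.
	if _, err := parsePercent("abc"); err != nil {
		fmt.Println("error:", err)
	}
}
```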

The documentation for all of the packages in the standard library is available at GoDoc. It's well written and often tells you everything you need to start using a package effectively. You can also use the GoDoc website to read the documentation for any Go package hosted on GitHub. Just add the package path (e.g. github.com/mattn/go-sqlite3) to the GoDoc domain.

How to go further

Unit testing is built into the standard library and is done with the testing package. The documentation will probably provide enough information if you're already familiar with unit tests, while Go(lang) Unit Testing for Absolute Beginners at JonathanMH's blog is a nice introductory tutorial on unit tests in Go which assumes no prior knowledge.

Database access in Go is done with the database/sql package, which provides a common interface to database operations across different databases. The Go database/sql tutorial is a step by step guide to the interface and how to use it with specific databases.

Go is so well suited to server-side software that there is an official tutorial on Writing Web Applications. Go also has a flexible and powerful template system built into the standard library. The chapter on Templates in the GitBook Build web applications with Golang is a good introduction.
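To give a sense of what the template system looks like, here is a minimal sketch using the html/template package. The page type, the template text and the render function are made up for this example.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// page holds the data passed to the template.
type page struct {
	Title string
	Items []string
}

// pageTmpl is parsed once at start-up. template.Must panics if the
// template text is invalid.
var pageTmpl = template.Must(template.New("page").Parse(
	`<h1>{{.Title}}</h1><ul>{{range .Items}}<li>{{.}}</li>{{end}}</ul>`))

// render executes the template with p and returns the resulting HTML.
func render(p page) (string, error) {
	var b strings.Builder
	if err := pageTmpl.Execute(&b, p); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := render(page{Title: "Tom & Jerry", Items: []string{"a", "b"}})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```

Note that html/template escapes the ampersand in the title automatically, which is one reason to prefer it over text/template when generating web pages.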

Final thoughts

Go is a great general purpose programming language, but that doesn't mean it's the best language for every purpose. Python and R are better for data analysis, and Go can't replace JavaScript for doing data visualisation in a browser. But what it does well, it does very well indeed. Although I have been using it for just a couple of months, I think it may be the most promising of the new generation of languages for building applications on the web.