Measuring data quality or why are we even Openstreetmapping anyway ?
In the beginning was the empty map and people began filling it. Their goals were clear: n°1 having fun, n°2 world domination. Data scarcity kept everyone from being too picky: pretty much anything is better than nothing… Just fill in the blanks !
Then some adventurous souls began to use that miraculously produced data. Users: they spell trouble, and some of them even have expectations ! But that is not our problem: we only produce the data and it is free software, so they can fix it for themselves. And who are these people anyway ? They don’t even belong to the club !
Then the ground shook — I mean literally: in 2010, in the aftermath of the Haiti earthquake, both Openstreetmap and the world became aware of each other:
- Openstreetmap: “Wait, what ? People might actually find this useful ?”
- World: “Wait, what ? Free geographical data that might actually meet our needs ?”
Humanitarian usage took the spotlight, but the universal potential didn’t escape wider attention. There was a feeling in the air that Openstreetmap had become relevant outside of Openstreetmap — that we had become vaguely responsible for something which, for lack of quality metrics, remained a fuzzy concept.
Nice, but what does this short and slightly fantasized history of Openstreetmap have to do with measuring data quality ? We’ll soon come to that, but let’s start by defining quality. There are many ways to define quality, some soaring beyond the stratosphere, way out in the astral plane, such as Pirsig’s Metaphysics of Quality developed in Zen and the Art of Motorcycle Maintenance, which I made the mistake of reading with no concomitant intake of mind-altering substances. I would rather choose a very practical definition: fitness for purpose, which equates quality with the fulfillment of a specification or stated outcomes. In other words, quality measures whether the product meets its goals.
So, what are Openstreetmap’s goals ? I’ll start by dividing them into internal goals and external goals. Since Openstreetmap is formally incorporated as the Openstreetmap Foundation, its internal goals are clearly defined in the objects of its articles of association: “to encourage the growth, development and distribution of free geospatial data, to provide geospatial data for anybody to use and share”. That one was easy. The external goals are those originating from the rest of the world. Yes, that means users, and now you understand why I started this article with a short story of Openstreetmap meets users: with no users there is no concept of quality.
This morning, Openstreetmap Cameroon’s Willy Franck Sob, who is launching Geosm (a nice geographic data distribution portal), gathered a bunch of African Openstreetmap contributors to brainstorm about new collaboration avenues. Incidentally, this is what kicked me into gear to get this article, long lingering among neglected drafts, out the door. I had the pleasure of witnessing consensus about the need to strengthen the weak link between Openstreetmap and consumers: the problem is in the back of everyone’s mind. But the easy solution is deceptive: we can only reach the end users we already have in sight, and bringing them on board will mostly reinforce what we already know from the existing relationship. We are stuck in the Openstreetmap community ghetto. To make Openstreetmap more valuable to actual users, we need to break out into their world. Now, this looks familiar: this is a marketing problem.
A “build it and they will come” approach has served Openstreetmap well so far — blissful ignorance of customers provided freedom to experiment and find a path. When the market is quickly shifting to unknown grounds, it is the classic way. At this point, I feel an awful urge to paraphrase Simon Wardley’s Pioneers, Settlers, Town Planners model, a sinful temptation I’ll eschew by pasting his own work here:
So, Openstreetmap is moving from the Pioneers stage to the Settlers stage — those are the growing pains that we have been feeling and that is why the concept of ecosystem is now on everyone’s mind.
According to Wardley, the Settlers stage requires market analysis, feedback, trend spotting, listening to customers… None of which the excellent Openstreetmap folks are particularly proficient at, and as a result Openstreetmap direct marketing is ineffective. But some businesses are extremely good at that, which is why they thrived, and some of them use Openstreetmap data. Letting Openstreetmap leverage their expertise is indirect marketing. Openstreetmap has distributors, whose visible appendages are web sites and mobile apps: they represent Openstreetmap to consumers who may not particularly care about Openstreetmap. Do you care that a Linux operating system served this article ? Not particularly: you just wanted to read the article. Though you are happy that a free and open operating system lets developers collaborate to bring you a superior experience, it is really just far in the background.
That game spells developers developers developers developers ! So we’re back to the core truth of Openstreetmap: Openstreetmap is a platform for developers, not a service to consumers. And herein lies a profound fault line between perceptions of the Openstreetmap Foundation’s mission: should it engage more with consumers ? Yes, because consumers are what this is all about, but ultimately no, because Openstreetmap’s model is indirect distribution. The burden of connecting with the consumer falls on developers, and not just software developers but the broader sort: business developers. That does tell us something about the Openstreetmap Foundation’s mission though: the productization of Openstreetmap implies the rise and dominance of corporate involvement, so defending the license and spirit of Openstreetmap will only become more important in the coming years.
With all those convolutions, how do we come back to the problem of measuring data quality ? We don’t: it is all on you ! Free software begins with “scratch your own itch” and it is the same whether I want a nice map to plan my bicycle adventures on or whether Facebook wants a cheaper map background: know your users, whether they are billions or just you.
Above all, abandon the illusion that there is an Openstreetmap community that cares about consumers — both of which, in this context, are too fuzzily defined for measurement and will therefore yield no operational insights. Openstreetmap contributors who wonder about consumer-perceived quality are exactly a solution in search of a problem: negative value.
With that attitude, would Openstreetmap have started at all ? No, but that was the Pioneer stage — Openstreetmap has moved on and so have its motives.
So start with customers, make a problem explicit, solve it, maybe with Openstreetmap data… And then quality metrics will be obvious because the goals are clear. As long as we go at it backwards, there will be no quality. Bring out your consumer use cases, they are the source of truth !
A month ago on Twitter, Allan Mustard challenged me to get the ball rolling on Openstreetmap quality metrics, and I thought I would be starting a working group. That is how I started writing this article. Now that I have written it, I will probably not be starting an Openstreetmap quality metrics working group: quality metrics are irrelevant to Openstreetmap as a whole, because they are intimately linked to consumption and therefore to applications. Usage rules !
Now I’m off to make the map useful for riding my bicycle — I’ll add water towers and pylons because they make nice landmarks for navigating !