On Saturday Nerds for Democracy Jeffery and I launched our bid for victory in the app4nsw competition with a new look at crime in Sydney.
Although it was done in Perl (and reuses much of the Perl geo toolkit we used to take 2nd place in the Mashup Australia competition), Crime Alert is a lot more about access to data than it is about code.
With our geo2gov.com.au search engine, our strategy for the competition was to target a problem that the government is inherently unable to solve (or could only do so with great difficulty) due to constitutionally-imposed separation of powers between Federal, State and Local government.
For Crime Alert, the problems we are tackling are all about the limitations of statistics, in particular correlation vs causation and the need to maintain anonymity.
Historically, Australian governments have reported crime based on local government boundaries. This is partly for historical cost and organisation reasons (New South Wales only got a central crime database in 1997) and partly because government has a very strong position on anonymity in reporting.
For example, the Australian Census is significantly larger and collects significantly more sensitive material than the US Census, and the Australian Bureau of Statistics goes to extreme lengths to ensure that this information can't then be used for stalking, predatory marketing, etc.
Because crime data is only reported for groups of 50,000+ people, it is depressingly uniform. Unless you are part of government or involved in the allocation of police resources, crime statistics serve as nothing more than a curiosity.
Crime Alert wouldn't exist at all except for the chance discovery of a series of PDF reports issued by the NSW Bureau of Crime Statistics and Research, which contain crime "heat maps" for a subset of the state's local government areas (currently, they only cover about a third of local governments).
The creation of these reports changes the game completely, because it means that the government is now comfortable that it can release crime information at a resolution down to 50 metres without violating anonymity.
To get the contours we used for our crime map, we had a long discussion with their statisticians to demonstrate that we understood the area and that we would use the data responsibly, and to come to an agreement on particular crime types and metrics that would be both useful and relevant to the public.
With anonymity preserved, the second challenge was to find a way to present the information that is both simple for the consumer and statistically valid.
The complications here are numerous. The maps we use are from two years ago (2008), the resolution is reduced to three zones based on one standard deviation either side of the mean, the crime types we use have strong time factors (particularly assaults), and the crime scores don't control for population density.
But the biggest problem is correlation vs causation. Just because there are a lot of assaults where (and when) you are doesn't mean it's YOU that will be assaulted. So using historical crime density directly as a predictive factor is a very bad idea.
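To make the three-zone reduction concrete, here is a minimal sketch of banding scores at one standard deviation either side of the mean. It is written in Python for illustration only; it is not the code behind the published contours, and the function name, the cell names and the scores are all made up.

```python
from statistics import mean, pstdev


def classify_zones(scores):
    """Label each score low/medium/high using one standard deviation
    either side of the mean of all the scores supplied."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    labels = {}
    for name, score in scores.items():
        if score > mu + sigma:
            labels[name] = "high"
        elif score < mu - sigma:
            labels[name] = "low"
        else:
            labels[name] = "medium"
    return labels


# Hypothetical crime-density scores for five map cells (invented numbers).
example_scores = {
    "cell_a": 2.0,
    "cell_b": 10.0,
    "cell_c": 11.0,
    "cell_d": 9.0,
    "cell_e": 30.0,
}
zones = classify_zones(example_scores)
```

With these invented numbers, only the two outlying cells leave the "medium" band, which shows how much detail the three-zone reduction throws away.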
That doesn't mean it's not relevant at all though. It just means we can't make blanket statements without on-the-ground information.
So our application limits itself to providing simple low/medium/high factors for both your location and the time of day, both of which are based on the average for the local government you are in.
This lets us reduce all the complexity of crime down to a single concept, of being in "the wrong place at the wrong time". And we communicate this via two simple "Here" and "Now" boxes, linked so it is clear that both factors are important.
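As a rough illustration of the "Here" and "Now" pairing, the sketch below combines a location level with a time-of-day level. It is Python rather than the Perl the application uses. The hourly profile, the names, and the rule that only High/High counts as "the wrong place at the wrong time" are assumptions for the example.

```python
from dataclasses import dataclass

LEVELS = ("low", "medium", "high")

# Hypothetical time-of-day profile for one local government area:
# hours 0-2 high, 3-17 low, 18-21 medium, 22-23 high (invented values).
HOURLY_LEVELS = ["high"] * 3 + ["low"] * 15 + ["medium"] * 4 + ["high"] * 2


@dataclass(frozen=True)
class HereNow:
    here: str  # level for the user's location
    now: str   # level for the current time of day

    @property
    def wrong_place_wrong_time(self):
        # Assumption: only the High/High combination counts.
        return self.here == "high" and self.now == "high"


def here_now(zone_level, hour, hourly_levels=HOURLY_LEVELS):
    """Build the pair of boxes from a zone level and an hour (0-23)."""
    if zone_level not in LEVELS:
        raise ValueError(f"unknown level: {zone_level!r}")
    if not 0 <= hour <= 23:
        raise ValueError(f"hour out of range: {hour!r}")
    return HereNow(here=zone_level, now=hourly_levels[hour])


late_night = here_now("high", 23)
mid_morning = here_now("high", 10)
```

Keeping the two factors as separate fields, rather than multiplying them into one score, mirrors the linked boxes: the user sees both and can judge which one is driving the result.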
Taken to the next iteration, we would implement what a user experience designer told us is officially called "Ambient Personalisation".
In this form, you don't even show the user an interface.
Instead, software on your phone runs silently in the background (or on a cron) and from time to time checks your location with the server.
If you happen to wander into a High crime zone during a High period, your phone beeps or sends you an SMS or rings you or vibrates to let you know you've wandered into the wrong place at the wrong time.
But once you've hit this location/time combination, the device (or the server) knows you are now aware of the problems in this area and it won't bother you again. If you happened to move house into a bad area, after the first week the phone will have hit each of the time warnings, and won't bother you about it again.
But when you later go clubbing somewhere new, or park your car to pick up some groceries after visiting a friend somewhere new, your phone will notice that you've parked somewhere notorious, or that there have been a lot of assaults near this club.
And it's the warnings about these places you have never been that are the key, particularly in a low crime Western city where the crime is often quite concentrated around particular areas. These are the warnings that start to make the phone an extension of your own ego instead of just a device you use.
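The warn-once behaviour described above can be sketched in a few lines. This is a hypothetical Python illustration, not something we have built: the class name, the zone identifiers and the period labels are invented, and it assumes an alert fires only for a High zone during a High period.

```python
class AlertTracker:
    """Remembers which (zone, period) combinations have already
    triggered a warning, so each one is raised at most once."""

    def __init__(self):
        self._seen = set()

    def check(self, zone_id, zone_level, period, period_level):
        """Return True if the phone should alert the user now."""
        # Assumption: only High zone + High period is worth an alert.
        if zone_level != "high" or period_level != "high":
            return False
        key = (zone_id, period)
        if key in self._seen:
            # The user already knows about this place at this time.
            return False
        self._seen.add(key)
        return True


tracker = AlertTracker()
first_visit = tracker.check("zone-17", "high", "fri-night", "high")
second_visit = tracker.check("zone-17", "high", "fri-night", "high")
new_place = tracker.check("zone-42", "high", "fri-night", "high")
quiet_place = tracker.check("zone-99", "low", "fri-night", "high")
```

Here the first visit to a High zone in a High period alerts, the repeat visit stays silent, and a new High zone alerts again. Whether the set of seen combinations lives on the device or on the server is a design choice the idea leaves open.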