In a letter from David Cameron to all government departments, the Prime Minister set out a series of datasets and deadlines for their release, including new measures for local councils, central government and the release of street-by-street crime data.
Under the new plans, all local government spending over £500 will be published on a council-by-council basis by January 2011. Details of local government contracts and tenders for expenditure over £500 will also be published in full from January 2011.
"Greater transparency across government is at the heart of our shared commitment to enable the public to hold politicians and public bodies to account; to reduce the deficit and deliver better value for money in public spending; and to realise significant economic benefits by enabling businesses and non-profit organisations to build innovative applications and websites using public data," says Cameron in the letter.
Any information published will be "in an open standardised format" and licensed for free reuse so that it can be used by third parties, says the letter. The timetable and new plans follow January's launch of a new website for government-held data.
What's in it for journalists?
"I'm extremely excited by the announcement. It opens up enormous potential for journalists and anyone else interested in holding power to account. The opening of the Combined Online Information System (Coins) database is probably the most significant aspect at a national level. This contains detailed analysis of government spending and according to a recent (refused) FOI request by the BBC's Martin Rosenbaum contains around 24 million lines of data. There's also a suggestion that we could have street-level crime data for the first time rather than being fobbed off with area-level data that is much less useful," Paul Bradshaw, editor of the Online Journalism Blog and reader in online journalism at Birmingham City University, told Journalism.co.uk.
"The publication of organograms [charts showing responsibilities and connections between departments in an organisation] sounds relatively insignificant but could be particularly useful - one of the biggest problems for bloggers and other active citizens has been finding out who to hold to account, and where to go from there. If the 'common format' is easily mashable then I can imagine a number of amateur and professional projects that make it much easier for people to trace and contact the relevant council official or minister etc., in much the same way as the website They Work For You does for MPs."
In a blog post last week, Simon Rogers, editor of the Guardian's Datablog, said the Coins data, now scheduled for release, would be the most significant dataset to be opened up by the new government. Speaking to Journalism.co.uk, Rogers said opening up the data will create a wealth of stories for journalists working at both a national and local level, who will be able to use the new datasets as "an amazing resource".
Journalists are going to have to get to grips with new skills pretty fast to make sense of the raw data published by the government and local authorities and make it usable for them and their readers, said Rogers.
"The battle for more data started by the Free Our Data campaign has been won, in that the only stuff that won't be released will be relatively high-level, security data. It's thrown the ball back to us journalists and now we have to work out exactly what we will do with this data," he said.
Bradshaw, who has been blogging drafts for a data journalism chapter in his forthcoming book on online journalism, said the local government data plans "are possibly the most exciting of all".
"Note, however, that it is only 'new' contracts and tenders - many local government contracts are long-term. When Help Me Investigate [the collaborative investigations site founded by Bradshaw] was looking into how much was spent on the Birmingham City Council website, for example, we hit a brick wall when it came to the council's contract with Service Birmingham because it had been signed before the Freedom of Information Act was even passed," he said.
David Higgerson, head of multimedia for Trinity Mirror Regionals, told Journalism.co.uk that the raw local data will become useful to journalists when paired with local knowledge to see "what drops out". But the depth and detail of the stories that can be found in the data will be affected by how the figures are presented, he said.
"If the existing Windsor and Maidenhead example is used as the template for all councils, then the information, while useful, won't tell us an awful lot - Windsor Council spent £20,000 with a confectioner, but we don't know on what. The other hidden headlines which could be exciting include detailed crime information - but again the devil will be in the detail. Will it just be 'three robberies in this street' or will it be more detailed?" he said.
"There's no mention of other public sector organisations here, such as primary care trusts and police authorities. Again, they spend a fortune, and need to be accountable. Particularly around PCTs, there's often disquiet at how money is spent. For journalists, I'd guess it's important we keep on plugging away at getting this information by other means."
Challenges ahead for data journalism
Echoing Rogers' call for journalists to develop new skills to deal with data, Bradshaw says journalists should use the time before the datasets are released to prepare themselves.
"Of course we've still got six months to wait for most of this data - but that should be six months spent in preparation (public bodies will be doing their own preparation, in some cases to counteract what journalists plan to do). Clearly we don't yet know what the formats will be, but there is plenty we can be doing around understanding the potential of data for journalism, and practical skills in interrogating and understanding it," he said.
In a talk to regional editors at a recent Digital Editors Network meeting, Bradshaw warned about the possibilities of "data churnalism" or "data porn" when journalists are confronted with swathes of data and statistics to analyse. Investigative journalist and data journalism trainer James Ball told Journalism.co.uk that there is a risk of this if journalists and newsrooms don't adapt their skills to handling data correctly.
"The data release is very exciting, but there's a risk it's too early for journalists. There's some great work being done in small corners of the big publications (Guardian Datablog is a prime example) but by and large we're not numbers-savvy or tech-savvy enough. This means that even when we try to do good things with data, the results are often pretty dire," he said.
"A discussion on [Martin Belam's] blog shows the problem: developers and designers can make great data mashups without any intervention from journalists - they're reducing our role to little more than a mediator. We can do more: where a developer might make a graphic, we can find a story. We're more likely to chase up the anomalies, look for wrongdoing, and to pick up the phone and talk to the people involved. The next year could be a great one for data-literate journalists: but only if newsroom culture is ready to adapt, and only if we make sure we don't let a deluge of stats and numbers make our journalism dull.
"Synthesising information and making it spark is the core talent of journalism. This should be right up our street. But if we let ourselves sit back because 'numbers aren't our thing', journalists could easily end up buried beneath the data deluge - and left as nothing but an irrelevance."