Believe it or not, according to its head of digital production, the Telegraph planned to release the unredacted - if slightly edited - data from day one, once the MPs' expenses information was procured (for that alleged and undisclosed sum).

But no-one wanted to talk about it till now: the team finally answered questions this week. Head of digital production, Ian Douglas, had been rather demoted in an Independent on Sunday piece which said he 'loaded the stories on to the web'. In fact, Douglas' job has involved rather more than that, not least managing the release of data online. "[Data publication] was one of the first things we thought about - how can we publish this in full?" says Douglas.

So why delay?
"The scans are forms - mostly handwritten. You've got quite a lot of wiggly bits of Biro and that sort of thing. There's no real substitute for getting someone to go in and fill out a big spreadsheet - a terrible job."

"We have been building it up as we go along, but we wanted to make sure we had the overview before we published anything.

"Through the whole thing we haven't wanted to give partial coverage to it. Part of the idea about having the spreadsheets is about saying 'here's everybody's figures'. 

"News stories focus on individuals really well - we wanted more comprehensive data which is why it hasn't been around till now," Douglas explains.

The Telegraph's format - as anyone who has tried out the MPs' expenses database will know - is a series of fact-sheet style tabs, and documents published using Issuu organised by name, but also searchable by constituency. A Google spreadsheet has also been released. Douglas' latest blog post can be found at this link.

Wasn't it commercial pressure, and the incentive of added print sales, that dissuaded the Telegraph from publishing it in full at first?
"Public interest has always been the justification for publishing," answers Douglas.

"I think actually that the commercial advantage is lined up quite nicely. When you have a publishing business and there's something that's clearly in the public interest and we publish as much of it as possible, that fits quite nicely into the business model and we have tried to make money off the stuff we've published."

Having said that, he adds that the Google Doc hasn't earned them a penny. But then it cost nothing - apart from manpower - to put it up.

Sharing the PDFs wasn't costly either. While there were other services that they could have chosen, they opted for Issuu, which was pretty easy to get set up with, he says. "We signed up, got a pro account for something like $19 - not exactly a big deal."

The Guardian beat them to publication of a database: its interactive feature allows users to sift through the redacted data and has seen overwhelming interest both nationally and internationally, and a positive response from users.

Why didn't the Telegraph go down the crowd-sourced route?
"I think the Guardian stuff is very good but they can't publish all the source documents that we have [over one million PDF pages]. There's too much missing from the public view.

"Parliament approved a heavily redacted one [version] and we're still concentrating on getting everything out," Douglas says.

There's more to come?

What they did on Monday was the 'beginning of the process', he says. "We got to the point where we felt like we had a good stripe of the data covered. There are more PDFs to come and more figures to come. It's a steady process now."

Douglas' colleague, Tim Rowell, digital publisher, made a revelation in an interview with Paul Bradshaw for the Online Journalism Blog: the database site will be expanded in the run-up to the General Election.

"We will be enhancing our political resources over the coming months as we build up to the General Election. This application is not just for the Expenses files, we have plans to develop this area into a full service that enables our users to engage more closely with the democratic process," Rowell told Bradshaw in an email.

What has Douglas learnt from all this?

"It has been a bigger job than we initially thought to really compile all the data, particularly the minimal redacting we have to do before publishing PDFs," Douglas says.

"It's taking a lot more effort than I had originally thought. Although as far as the approach [goes], these are things we would have done with smaller amounts of data."

He would re-consider the staffing level for future big data projects, he says. "The reporting team stopped reporting for a minute, and started compiling figures. I don't think there's any other way of doing it."

So, could he see 'data journalists' playing an increasing role in the newsroom?
'You need people to go through data, who understand it,' to 'pick up the bigger picture' he says.

"Journalism generally is becoming a lot more about trawling large data-sets - [there are] very few journalists now who will get through careers without doing that at some point. There are still vast arrays of public data that no-one has gone through yet."
