DocCloud
Today saw the final (extended) deadline for full proposals to the Knight News Challenge, the competition that awards up to $5 million for innovative community news projects.

Following our interview with entrant Paul Bradshaw, Journalism.co.uk looked into another of the 445 initial applications, DocumentCloud, and asked Aron Pilhofer, New York Times newsroom interactive technologies editor and one of the project's proposers, what it's all about.

Firstly, there's no 'it' yet, he stressed: DocumentCloud, for which $1 million has been requested, is very much at the ideas stage.

Pilhofer, along with another New York Times colleague and ProPublica's Scott Klein and Eric Umansky, is proposing to create software and a 'consortium' through which news organisations will be able to upload and search investigative documents.

It will include 'software, a website, and a set of open standards and APIs that will accelerate the daily work of investigative reporters, and will make investigative reporters out of every citizen, by improving the way we find, share, read and collaborate on source documents online', the initial application reads.

The innovative part of DocumentCloud is its use of an API and metadata, Pilhofer added.

DocumentCloud will host and provide an open API to an online database of source documents, contributed by a consortium of news organisations, watchdog groups and bloggers.

"Think of it as a 'card catalog' of standardized metadata for primary source documents," the original application states.

Once submitted to DocumentCloud, documents can be found, linked to and retrieved by anyone, anywhere on the web.

Using metadata, users will be able to search by topic, agency or location. The project will lower barriers to participation by creating open standards and open source software.
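
As an illustration of what a metadata search of this kind might look like, here is a minimal Python sketch. The software described in the proposal does not yet exist, so everything in it is an assumption made for the example: the `DocumentRecord` fields, the `search` function and the sample records are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentRecord:
    """One hypothetical 'card catalog' entry; the field names are assumptions."""
    title: str
    contributor: str
    agency: str
    location: str
    topics: frozenset = field(default_factory=frozenset)


def search(records, topic=None, agency=None, location=None):
    """Return the records that match every criterion that was supplied."""
    results = []
    for record in records:
        if topic is not None and topic not in record.topics:
            continue
        if agency is not None and record.agency != agency:
            continue
        if location is not None and record.location != location:
            continue
        results.append(record)
    return results


# Invented sample data, for illustration only.
CATALOG = [
    DocumentRecord("Inspection report", "Example Daily", "Health Department",
                   "Springfield", frozenset({"food safety"})),
    DocumentRecord("Budget memo", "Example Watchdog", "City Council",
                   "Springfield", frozenset({"budget"})),
    DocumentRecord("Inspection summary", "Example Blog", "Health Department",
                   "Shelbyville", frozenset({"food safety", "budget"})),
]

if __name__ == "__main__":
    for record in search(CATALOG, topic="food safety", location="Springfield"):
        print(record.title, "-", record.contributor)
```

Because each record carries the same standardised fields whoever contributed it, one query can cut across documents from a newspaper, a watchdog group and a blogger alike.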

The web-based DocViewer will allow any organisation to publish its documents online and contribute to DocumentCloud. Readers will also be able to quickly search, annotate and bookmark documents - and for the first time link directly to specific pages or passages.

"That software does not [yet] exist. It sounds simple but there's a lot of complexity to it," Pilhofer explained.

Currently there is a 'middle layer' which stops efficient searching: users have to go to the information, rather than it coming to them, he added.

"Eliminate that layer [and] that's when you've achieved the semantic web zen," he said.

"Researchers can annotate documents in key places, and key paragraphs, and embed those into the stories they're writing."

Annotations can be aggregated, he said. Users will be able to identify when 'a bunch of researchers have gathered round this passage in this document,' he added.
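
One way to picture that aggregation is the short Python sketch below, which counts how many distinct researchers have annotated each passage. It is only a sketch under stated assumptions: the `(researcher, document_id, page)` tuples, the `hot_passages` function and the sample data are hypothetical, and the proposal does not say how annotations would be stored.

```python
from collections import defaultdict


def hot_passages(annotations, min_researchers=2):
    """Map (document_id, page) to the number of distinct researchers
    who annotated it, keeping only passages at or above the threshold."""
    readers = defaultdict(set)
    for researcher, document_id, page in annotations:
        readers[(document_id, page)].add(researcher)
    return {
        passage: len(names)
        for passage, names in readers.items()
        if len(names) >= min_researchers
    }


# Invented sample data: (researcher, document_id, page), all hypothetical.
ANNOTATIONS = [
    ("reporter_a", "doc-1", 4),
    ("reporter_b", "doc-1", 4),
    ("reporter_a", "doc-1", 9),
    ("reporter_c", "doc-2", 1),
    ("reporter_b", "doc-1", 4),  # a repeat by the same researcher is counted once
]

if __name__ == "__main__":
    print(hot_passages(ANNOTATIONS))
```

Counting distinct researchers rather than raw annotations means a single prolific annotator does not make a passage look like a gathering point.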

NYTimes.com has already developed 'a small portion' (for example here) of the software for DocumentCloud's viewer, which, along with other software, will complement the 'consortium' of news groups participating in the project and developing a set of standards.

The software will be accessible to smaller organisations and the project aims to 'lower the bar', so that groups with limited resources could contribute to the collaborative project, he said.

The grant would not be made to NYTimes or to ProPublica, Pilhofer clarified, in response to questions raised on various blogs.

"This is a project proposed by someone who works at the NYTimes and is supported by the NYTimes, but it would have its own identity, its own set of administrative layers, its own servers, and it wouldn't live at NYTimes.com," he added.

Indeed, DocumentCloud is not intended to be just US-based, Pilhofer said, but is 'as big as we want to scale'.

Despite building on other document publishing projects, its developers argue that it will be the first time documents will be 'an intrinsic part of the semantic web and a part of reporting news online'.

The full proposal can be read here.
