"As a journalist your main tool is talking to people and asking the right questions of the right people," said civic technologist and self-described "OpenGov and data journalism geek" Friedrich Lindenberg in a webinar on investigative journalism tools for the International Centre for Journalists last week.
"This is still true, but also you can ask the right questions with the right databases. You can ask the right questions with the right tools."
Lindenberg listed an arsenal of tools the investigative journalist can equip themselves with. Here are some of the highlights.
Lindenberg described DocumentCloud as a "shared folder of documents", offering different folders that can be used for various investigations, control over who can access which documents, the ability to annotate different parts of documents, search throughout and embed segments or entire documents.
Even better, DocumentCloud looks for "entities" – such as people, companies, countries, institutions – identifies them and makes them searchable, which is especially useful for legal documents that may stretch into hundreds of pages when you are only interested in a few key points.
DocumentCloud is run by IRE but Lindenberg encouraged journalists to contact him at SourceAfrica.net, where an open source version of the software is available.
Screengrab from documentcloud.org
A "bit more of an expert tool", according to Lindenberg, Overview lets the user import documents from DocumentCloud or CSV files and then counts the frequency of words to build a "hierarchy of terms".
When used this way, Overview can give a quick rundown of large numbers of documents, making it easier to understand the core topics.
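The word-counting idea can be illustrated with a minimal Python sketch. This is not Overview's actual code or method: the sample documents, the tiny stop-word list and the `term_frequencies` helper are all hypothetical, made up here to show how counting words across a set of documents surfaces the dominant terms.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for a set of imported documents.
documents = [
    "The ministry awarded the contract to the company.",
    "The company paid the ministry official.",
    "Contract records show the company won again.",
]

# A tiny stop-word list (an assumption; real tools use far larger ones).
STOP_WORDS = {"the", "to", "a", "of", "and", "again"}


def term_frequencies(texts):
    """Count how often each non-stop-word appears across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and split it into simple alphabetic words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


frequencies = term_frequencies(documents)
top_terms = frequencies.most_common(3)
print(top_terms)
```

In this toy example "company" comes out as the most frequent term, the kind of signal that hints at what a pile of documents is about before anyone has read them.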
Popularised by All The President's Men, the dramatisation of the Watergate scandal, "follow the money" is one of the mantras of investigative journalists everywhere.
Many large and expensive business registries exist to track the myriad connections between individuals and companies, but few are within the reach of the press.
One of those few is OpenCorporates, where users can search by name or company and filter by geographical jurisdiction.
DueDil has a similar function to OpenCorporates but is a "slightly better research tool", said Lindenberg, as you can narrow the search on individuals with similar names by searching by birth date.
Where OpenCorporates has a global range of company information, DueDil mainly draws on UK companies. Both operate on a freemium model with monthly fees for greater access.
Both OpenCorporates and DueDil were built for business purposes, helping people to conduct due diligence on companies and individuals before signing any contracts.
Investigative Dashboard, though, is tailor-made for journalists. Users can search business records scraped from websites in a range of countries or go through the directory of more than 450 business registries, company lists and "procurement databases" – which highlight the 'hot point' where companies and governments do business – to find detailed information.
"They also have a broad network of researchers in different regions," said Lindenberg, "and they will look at other databases that they will be familiar with and maybe even have stringers and contacts on the ground who will find information and documents."
Paul Radu, an investigative reporter at the OCCRP who helped build the Investigative Dashboard, told Journalism.co.uk the platform has researchers in Eastern Europe, Africa, the Middle East and Latin America.
"We do pro bono due diligence work for journalists and activists and these people have access to all the open databases," he said. "But also we managed to get some funding to access some pretty expensive databases that are very useful in tracking down the information across borders."
Screengrab from investigativedashboard.org
Governments are partial to releasing reports and figures in PDF files, making it difficult for journalists looking to analyse and investigate the data contained within.
In the UK, you can specify usable file formats (Excel or CSV, for example) in Freedom of Information requests. But if you are still faced with data locked up in a PDF, you need Tabula.
"It's the gateway drug to data journalism", said Lindenberg of Tabula.
Simply download and install the software, open a PDF in the program, select a table and Tabula will convert it into a workable file format. Magic.
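To show what a "workable file format" makes possible, here is a small Python sketch of a next step once a table has been exported as CSV. It does not use Tabula itself: the CSV text, the column names and the `total_by_column` helper are hypothetical, invented for illustration only.

```python
import csv
import io

# Hypothetical CSV text, standing in for a table exported from a PDF.
exported = """department,year,spending
Health,2013,1200
Education,2013,950
Health,2014,1350
"""


def total_by_column(csv_text, group_col, value_col):
    """Sum the integer values in value_col for each distinct group_col entry."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row[group_col]
        totals[key] = totals.get(key, 0) + int(row[value_col])
    return totals


totals = total_by_column(exported, "department", "spending")
print(totals)
```

Once the figures are out of the PDF, a few lines like these are enough to start adding up and comparing numbers.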
Lindenberg suggested many more tools to help journalists analyse documents and data, scrape web pages and further their investigations during the hour-long webinar.
However, he stressed that for best results viewers should pick one tool for a project and learn to use it well, rather than trying to get to grips with lots of new things at once.
"Learning these tools requires a bit of time and experimentation," he said, "a bit of willingness to get into this new thing and once you've done that you will get some benefits out of it.
"If you're saying 'I'm not a computer person' I want you to stop doing that and say instead that you're a journalist who has arrived in the 21st century and is using digital tools in a way to conduct fantastic investigations."
- Paul Radu, founder of the Investigative Dashboard, will be speaking in a workshop about online investigations at Journalism.co.uk's forthcoming news:rewired digital journalism conference.