"We've got so much data being published every day that you have to get these skills, you can't ignore the open data movement and you can't stay any longer without these skills. You need to get them."
Those are the words of Marianne Bouchart, a web producer for EMEA and data journalism project co-ordinator at Bloomberg News, which neatly sum up the importance, and opportunity, for journalists to get to grips with data.
In the 21st century, almost everything is logged electronically in some way or other. Making sense of the data is a growing necessity among journalists but, for some, that first step from the safe world of words to an alien landscape of numbers can be intimidating.
"I think it's easy to get scared by data journalism," continues Bouchart, "thinking, 'oh my god I don't know anything about numbers and I don't know anything about the subject of this data. Where am I going to start?'
"That's usually the first impression you get when working on a data journalism project and the advice I would give is to forget about fear and just be bold and take this as a great opportunity to learn new things."
So where do you start?
- Start with a question
Paul Bradshaw, founder of Help Me Investigate and a lecturer in online journalism, says the first thing you need to arm yourself with is not programming skills or advanced knowledge of scraping, but a sense of curiosity. Approaching a set of data in search of a story is the same as approaching any situation. In fact, curiosity may precede acquiring a data set in the first place, and it matters more to making progress.
"If you start with a story or if you start with a question, if you're curious, then that will lead you to the tools and techniques that will help you solve that problem and answer that question," Bradshaw told Journalism.co.uk, "and then you'll get a new question and you'll learn new skills to solve that one."
A recent project from Help Me Investigate looked at local council spending for the Olympic torch relay, an investigation that took almost a year, but was the product of requests under the Freedom of Information Act and good old-fashioned leads.
"In this case someone pointed out to us that bunting was very, very expensive and in fact we found one local authority which spent £50,000 on bunting alone, just on these flags," he said. "This was a lead that led us off on the track to find out how much authorities have spent and how they spent it."
So in this case, it was following the scent of a story that got the investigation started. Analytical tools and techniques were a secondary factor.
- Teach yourself
"To give it a go and fail and learn from my failure and to get new skills out of that and go on and do new projects."
Because data journalism is a relatively new field, most of the leading lights on the subject are self-taught. The same is true for each of the experts we spoke to.
Bradshaw worked in magazine editing and website management before moving into data and investigative journalism and, again, taught himself the skills he needed to find the answers he was after.
Nicola Hughes recently joined the Times as a data journalist, having previously been a Knight Fellow at the Guardian after a year at Scraperwiki. She created her own role at CNN in her first job after graduating from Cardiff.
Having been given the "usual starting job" running errands around the newsroom, she started the DataMinerUK Twitter account and began researching new projects and data investigations and finding out how they were made. Eventually someone higher up spotted her work and gave her the role of digital media producer.
"I realised there were all these online resources that weren't necessarily set up by news organisations," she said, "but were just tools with which I could analyse and understand information coming out in a stream and in data format.
"So I started with Delicious and all these online resources for tagging things, finding tutorials, finding other bloggers that wrote about tutorials."
- Find a community
"Ultimately you learn from other people," says Bradshaw. "Most of the things I've learnt about freedom of information are either from other people who know more about it than me or because people have asked me questions and I've gone off to find out the answers for them. So just helping other people helps you learn and builds relationships which lead to better stories and that's really what we do with help me investigate."
Bouchart tells a similar story: she created the Data Journalism Blog as a home for what she was learning, and it was listed alongside the Guardian data blog and ProPublica as one of the best data journalism sites of 2011 by 10,000 Words. As a result she worked with the European Journalism Centre in creating the Data Journalism Handbook, a free resource for people interested in data journalism that she describes as "the best place to start".
- Develop a broad knowledge base
One of Hughes's favourite projects during her time at the Guardian was uncovering the large proportion of US food aid that ends up in the pockets of American agribusiness giants.
"[We aimed to] trace every single purchase of US food aid from the vendor, from the actual factory, the plant it came from, to the port outside of the US in it's destination country," she said. "What food type it was, how much it was and for which programme, and that data itself was split between the shipping records and the procurement records. We had to go and find all of these and match them up.
"I like it because it required a lot of journalism, it was grunt work in terms of programming but also some stuff had to be done by hand."
Within a matter of weeks of the US food aid story being published, Hughes was working on a different story around the Olympics, moving from a large, in-depth data project on global development to an analysis of biometric and sporting data in less than a month.
At Bloomberg, Marianne Bouchart regularly has to deal with a varied range of projects on multiple subjects – the WikiLeaks diplomatic cables, the Clearstream case, an analysis of trends among medal winners at the 2012 Olympics, co-ordinating with 40 media organisations in analysing data on offshore accounts – sometimes running more than one at a time.
- To specialise or not to specialise?
"When people say data journalism they think that's one speciality in itself," says Hughes, "but it's like saying in school 'I want to do mathematics'. They offer mathematics as a course but mathematics consists of geometry, algebra, topology, probability and those are very different areas. Data journalism is split almost like that as well but there is a huge overlap between the science and the art."There is a huge overlap between the science and the artNicola Hughes, DataMinerUK and Times data journalist
There is no denying that Hughes is a specialist when it comes to the programming side – she teaches advanced data journalism – but she sees her experience working with lots of people and programming languages as a foundation to draw upon when the need to specialise in one particular area arises.
"To a certain extent when you're scraping it uses a subset of libraries which you know, therefore you are very knowledgeable in those subsets of your own skills for those six months and the next six months when you get another project you have to use a completely new set.
"No person can keep up with all the developments in all the different fields. When I say specialise, you are relearning all the different fields."
That is not to say, however, that coding and programming is the main area of expertise involved in data journalism. Paul Bradshaw believes that, although it is possible to work on a simple story by yourself, bigger and better stories are the result of working in a team and therefore anyone can be involved in some respect.
"There are all sorts of different elements to it which require different types of skills: some of it is about design skills; some of it is hardcore programming, if you're creating a tool for example; some of it is understanding the Freedom of Information Act and the Data Protection Act and the Environmental Information Regulations. So it doesn't have to be all technical."
In the UK, FOI requests are often central to data stories, so being able to get the data in the first place is a very important skill, he says. Then there's cleaning up the data, mixing it together in spreadsheets, creating tools for analysis, using visualisation and design skills, finding sources, finding case studies. These are the elements that build the story, and being able to play a role ensures your value to the project and the newsroom as a whole.
Finally, Bouchart believes that dabbling in various areas is a must, so that journalists are able to adapt to each new project and what it requires.
"You can't spend months and months acquiring new skills," she says. "You have to pick up some knowledge along the way and learn what you need when you need it to get to the next step in your project, get to something real as soon as you can.
"Then if you arrive in the newsroom with journalism skills on top of all of these other small skills in statistics or an understanding of programming and how to use a spreadsheet – that is something really, really valuable I think."
- Hear more from Bradshaw, Bouchart and Hughes on data journalism skills in this previous Journalism.co.uk podcast.