In this feature on data-driven health reporting, we look at how journalists can use the millions of rows of data being released by the NHS.
Open data: A big data story
There is a huge amount of data being made available by the NHS. "The 100 million rows of data released by the NHS Information Centre is a fantastically rich and detailed data source for people to turn into something useful," Francine Bennett, chief executive and data scientist at Mastodon C, told Journalism.co.uk.
Mastodon C is a big data start-up which helps companies and organisations to use technology to make sense of large or very messy data sets and turn them into something valuable.
One of the datasets being released is (non-personal) information on every prescription written by every GP in the country. Around 10 million rows of data are released each month from the UK's 8,000 GPs.
Mapping the data
Mastodon C has looked for the stories in the data, mapping out the variation in the prescribing of certain types of drugs across the country.
"There are some drugs that the NHS wants to understand the patterns of usage of, and we've used those files to turn the data into interactive maps. It is a way of digging into where the money is going and who is being prescribed what and how that varies across the country," Bennett explained.
With journalist and doctor Ben Goldacre and another company called Open Healthcare UK, which does health and technology work, Mastodon C started looking at the prescriptions file.
They looked at the prescribing of generic (non-brand name) and proprietary (brand name) drugs, with the proprietary drugs usually costing more than the generic alternatives.
"There is wide agreement among doctors that the generic and proprietary forms of the drugs have basically the same effect for almost all patients," Bennett said. "Obviously you can't tell doctors not to prescribe a certain drug because occasionally there are reasons why they might want to prescribe the proprietary form.
"But there's no real reason to expect the proportion of high-cost drugs to vary across the country," she added. "So where there is a high proportion, that is potentially an issue for the NHS."
Bennett said that British Medical Journal research showed that "controlling that kind of spending better" could save the NHS about £1.4 billion a year.
One of the types of drugs Bennett and colleagues looked at was statins, with prices ranging from £1.30 for a non-brand name pack to as much as £25 for the proprietary form.
By analysing the data, the researchers estimated a potential saving to the NHS of around £200 million a year, which Bennett explained is in line with the BMJ research.
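For journalists who want to try this kind of estimate themselves, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not Mastodon C's published code: the `Row` layout, the practice labels and all the figures are made up for the example, and it assumes every proprietary pack could be switched to a generic, which Bennett notes is not always the case. The result is therefore an upper bound rather than a forecast.

```python
from dataclasses import dataclass


@dataclass
class Row:
    practice: str      # hypothetical practice label, not a real NHS code
    proprietary: bool  # True for brand-name, False for generic
    items: int         # number of packs prescribed
    cost: float        # total cost in pounds


def potential_saving(rows):
    """Estimate the saving if every proprietary pack cost the average generic price."""
    generic_items = sum(r.items for r in rows if not r.proprietary)
    generic_cost = sum(r.cost for r in rows if not r.proprietary)
    if generic_items == 0:
        raise ValueError("no generic prescriptions to take a price from")
    generic_price = generic_cost / generic_items  # average price per generic pack

    saving = 0.0
    for r in rows:
        if r.proprietary:
            # Actual spend minus what the same number of generic packs would cost.
            saving += r.cost - r.items * generic_price
    return saving


# Made-up sample: generics at £1.30 a pack, proprietary packs at £25.
rows = [
    Row("A", False, 100, 130.0),
    Row("A", True, 10, 250.0),
    Row("B", False, 50, 65.0),
    Row("B", True, 40, 1000.0),
]

estimate = potential_saving(rows)
print(f"Potential saving: £{estimate:,.2f}")
```

On the real monthly files the same calculation would be run per drug type, since generic prices differ from one drug to another.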
The dangers in interpreting data
The researchers wanted to highlight the story of the potential savings to journalists. But in briefing journalists on the story, they took particular care over how the data was released, because simply mapping spending would only have shown the most densely populated areas of the country.
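The difference between mapping raw volume and mapping a proportion can be shown with a short Python sketch. The area names and item counts below are invented for illustration, and the function is a hypothetical helper rather than anything taken from the researchers' code.

```python
def proprietary_share(items_by_area):
    """Return the share of prescribed items that are proprietary, per area.

    items_by_area maps an area name to (generic_items, proprietary_items).
    Areas with no prescriptions are skipped to avoid dividing by zero.
    """
    shares = {}
    for area, (generic_items, proprietary_items) in items_by_area.items():
        total = generic_items + proprietary_items
        if total:
            shares[area] = proprietary_items / total
    return shares


# Invented figures: a large city and a small town.
areas = {
    "Big City": (90_000, 10_000),
    "Small Town": (600, 400),
}

shares = proprietary_share(areas)

# Ranking by raw volume mostly reflects population size...
ranked_by_volume = sorted(areas, key=lambda a: sum(areas[a]), reverse=True)
# ...while ranking by share shows where prescribing patterns differ.
ranked_by_share = sorted(shares, key=shares.get, reverse=True)

print("By volume:", ranked_by_volume)
print("By share: ", ranked_by_share)
```

In this made-up example the city tops the volume ranking simply because it is bigger, while the town has the higher proportion of proprietary prescribing.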
"There's an easy and sensational story of 'this doctor has a high proportion [of prescribing expensive drugs], so they are a very bad person'. That's not necessarily true but would be easy to pick up and run with," Bennett said.
"We were careful about the level on which we released data and the way in which we visualised it, which tried to encourage people to think about wider patterns and systems and try and discourage into digging into an individual's behaviour.
"PCTs and health organisations might want to do that but we didn't want to create a tabloid story around it, we wanted it to be a genuine discussion about 'how do we manage the NHS well?'
"We think it's a great institution and it's about giving it positive support and transparency rather than accusing people of things."
The planning around the controlled release worked, and the story was reported by the Financial Times, the Economist and the Daily Mail.
Mastodon C hopes journalists and other data experts will build on its work, and it has made the code available.
When speaking on the subject at a lunchtime lecture at the Open Data Institute, Bennett said: "Transparency is a new way to try and provoke change."
She would also like to see more journalists learning the skills to deal with data.
"Ultimately there is an amount of work involved from getting from raw data to answering questions and telling stories," she said. "But it's a new tool in the investigative journalist's armoury and the raw material is out there."
Update: In a statement the Open Data Institute said:
"The Open Data Institute (ODI) worked with Mastodon C and Open Healthcare UK to craft a story that would achieve the considered attention they were looking for from journalists.
"Mastodon C is one of the start-up businesses based at the ODI and is receiving support for its projects, including Prescribing Analytics. The ODI liaised with key stakeholders, developed press materials, provided legal advice on the website content and briefed journalists ahead of the announcement."