
In recent years, companies from Google to Facebook and LinkedIn have invested in data scientists to help them gain business insights and develop new products.

But how can data science apply to journalism?

Speaking at the 2014 Web Summit in Dublin today, Rachel Schutt, chief data scientist at News Corp, highlighted some points for news outlets to consider with regard to data science.

"The transition for the news industry from print to digital is not necessarily a smooth transition, but for data people it represents an opportunity because now [outlets are] using behaviours to quantify consumption of news," she said.

Schutt outlined five key concepts in data science which have begun to shape media in the last few years.

1. Datafication

Schutt was introduced to the term 'datafication' by an article titled The Rise of Big Data by Kenneth Neil Cukier and Viktor Mayer-Schoenberger.

What the article explained, she said, is that "it's tempting to understand Big Data solely in terms of size, but that would be misleading.

"Big Data is also characterised by the ability to render into data many aspects of the world that have never been quantified before."

In news, this relates to elements such as user location, what apps or device they use – basically "any digital touchpoint the reader has", said Schutt.

At News Corp, datafication is currently being used to shape the news cycle, taking into account how readers are consuming news to determine when stories are published and what kind of stories are produced.

2. Data products

While it is only fairly recently that people have started to understand the importance of audience data, "all that data is completely useless until you start building products and getting value from [it]," said Schutt.

She identified 'data products' built on algorithms, such as recommendation systems, as a key way for outlets to tap into what makes their audience tick in order to produce experiences that are useful and relevant.

The main advantage of data products, said Schutt, is "the more the user uses the product, the more it generates new data which helps improve [it]".

However, organisations should also bear in mind that they themselves are to some extent influencing user data in terms of where content is placed on-site.

For example, if a story at the top of a homepage gets a lot of traffic, the page position is as much a factor as the newsworthiness of the story. For this reason, news outlets should always factor page position into any algorithms they build.
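One way to picture the page-position point is a rough Python sketch. It is an illustration only, not a method Schutt described or a News Corp system: the record fields (`story`, `position`, `views`, `clicks`), the function names and the sample figures are all hypothetical. The idea is to compare each story's clicks with the clicks an average story would be expected to get in the same slots.

```python
from collections import defaultdict

# Hypothetical sample data: one record per story per homepage slot.
RECORDS = [
    {"story": "A", "position": 1, "views": 1000, "clicks": 90},
    {"story": "B", "position": 1, "views": 1000, "clicks": 70},
    {"story": "B", "position": 5, "views": 1000, "clicks": 10},
    {"story": "C", "position": 5, "views": 1000, "clicks": 30},
]


def position_baselines(records):
    """Average click-through rate of each page position across all stories."""
    views = defaultdict(int)
    clicks = defaultdict(int)
    for r in records:
        views[r["position"]] += r["views"]
        clicks[r["position"]] += r["clicks"]
    return {p: clicks[p] / views[p] for p in views if views[p] > 0}


def adjusted_scores(records):
    """Ratio of actual clicks to the clicks expected from position alone.

    A score above 1 means the story outperformed its slots; below 1 means
    it underperformed them.
    """
    baselines = position_baselines(records)
    actual = defaultdict(int)
    expected = defaultdict(float)
    for r in records:
        actual[r["story"]] += r["clicks"]
        expected[r["story"]] += r["views"] * baselines.get(r["position"], 0.0)
    return {s: actual[s] / expected[s] for s in actual if expected[s] > 0}


def raw_clicks(records):
    """Total clicks per story, ignoring where the story was placed."""
    totals = defaultdict(int)
    for r in records:
        totals[r["story"]] += r["clicks"]
    return dict(totals)


if __name__ == "__main__":
    print("raw clicks:", raw_clicks(RECORDS))
    print("position-adjusted:", adjusted_scores(RECORDS))
```

In this made-up sample, story A has the most raw clicks because it sat at the top of the page, while story C, placed lower down, scores highest once position is taken into account. A production system would need more care, for instance with slots that have very few views, but the sketch shows why raw traffic alone can mislead.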

3. Data vs. intuition

Data science may be able to provide great insights into user behaviour but human intuition also plays an important part in any decisions made in the newsroom, from what stories are published to what news apps or products are created.

Schutt noted that while good data scientists possessed intuition with regard to data, enabling them to figure out the most interesting angle on any user data they are presented with, good journalists were also "very gut-driven" with "a sense for interesting stories".

For the two teams to work side-by-side, Schutt said it was important to understand the function they both play, and to "respect both the data side of things as well as the intuition and the human side".

4. The data scientist job ad

For news outlets looking to hire data scientists, Schutt said the best way to engage them was to "give them interesting and challenging problems to ward off".

At the same time, these challenges must be married up with a clear goal to develop and improve the business.

"Data scientists are attracted to, like many people, interesting intellectually-challenging problems, but at the same time businesses don't want to invest in 'ivory tower people' who sit around theorising," she said, "they want to make sure the data scientists are having an impact on the business."

At News Corp, the challenge that's attracting data scientists is finding a sustainable model for journalism, she added.

5. Data scientists should think like journalists

Ultimately, there are many similarities between data science and journalism, Schutt said.

"Data scientists are curious, they ask good questions, they know how to tell stories with data, they ask questions of the data itself as well as people."

At News Corp, where data scientists are surrounded by journalists, there are plenty of opportunities for the two disciplines to learn from each other and collaborate on everything from stories to building tools and native products for internal use.

Schutt ended with a quote from data scientist D.J. Patil who, speaking at last year's Le Web conference, said data science was about "creating narratives".

"It is about creating analogies, about using complex data to tell stories," he said. "Journalism is the model we need to look to, we're up against the same ethical decisions."
