Entrusting artificial intelligence to produce articles has proved to be a huge time-saver, but it is prone to error and can also raise ethical problems, according to a report on 'automatically generated journalism' - that is, journalism generated from large datasets processed by AI.
News agency Press Association has demonstrated its potential. It managed to produce 50,000 stories tailored to local news outlets in the space of three months through its RADAR news service.
As this type of technology improves, so will its applications for journalism. But it is important to remember that it also has its pitfalls, said Anya Belz, professor of computer science at the University of Brighton and author of the report 'Fully Automatic Journalism: We Need to Talk About Non-fake News Generation'.
"For the time being, this means automatically generated news must be carefully checked by human editors for factual and other errors, as well as for missing context," she said.
Many other AI projects have stressed the need to keep reporters in the process, amid fears that robots could one day produce journalism entirely on their own. But Belz's paper identifies another big concern: the risk of bias seeping into articles from source data.
For this, she looked at examples from Press Association's RADAR, as well as Monok, based in Sweden. Monok extracts snippets from existing news articles and summarises them into a new piece, making it easy for mistakes to slip through the net.
The important lesson to remember about artificial intelligence is that bad data produces bad results. When fellow Sweden-based company United Robots got its football scores wrong, it had disastrous consequences for the headlines that were generated.
For Monok, there have been cases where the AI swapped in the wrong words, such as mixing up '7 years old' with '70 years old', and 'Palestinian Harvard student' with 'Greek Harvard student'.
Both examples were spotted and corrected by human editors before publication, which goes to show the importance of having a human final line of verification before hitting publish. This is less of a concern for models like RADAR, however, because stories are generated from data fed into human-made templates to match the topic, location and publication.
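The template approach can be sketched in a few lines of Python. This is an illustration of data-to-text generation in general, not RADAR's actual system; the template, field names and figures below are invented:

```python
# A minimal sketch of template-based news generation: local data is
# slotted into a human-written template, so the wording is controlled
# by an editor and only the facts vary from story to story.

TEMPLATE = (
    "{town} recorded {count} new planning applications in {month}, "
    "{direction} from {previous} the month before."
)

def generate_story(row):
    """Fill the human-written template with one row of local data."""
    direction = "up" if row["count"] > row["previous"] else "down"
    return TEMPLATE.format(direction=direction, **row)

story = generate_story(
    {"town": "Brighton", "count": 42, "previous": 35, "month": "October"}
)
print(story)
# Brighton recorded 42 new planning applications in October,
# up from 35 the month before.
```

Because the prose is fixed in advance, errors can only enter through the data itself, which is why bad source data remains the main risk even for template-driven systems.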
The fact is, automated journalism is only as reliable as the information plugged into it during its development and training. At one end of the scale, this could cause the programme to assume an interviewee's pronouns. At the more extreme end, it could be prone to bias around partisanship or political groups.
Despite the risks that can arise from its use, automatically generated news content grants local and regional news outlets a way to meet content demands at a fraction of the price.
“If you can use those automated techniques to survive and focus the staff that you have on local investigative journalism, for example, then I think a lot of people would agree that’s a good thing," said Belz.
Although automatic writing is only in limited use at present, Belz predicts that it will be widespread and mainstream in just a few years' time. But until then, the technology still has a way to go to iron out existing concerns.
Want to learn how to use AI to discover breaking news? Find out at Newsrewired on 27 November at Reuters, London. Head to newsrewired.com for the full agenda and tickets.