WikiLeaks has today started publishing files from a set of more than two million emails "from Syrian political figures, ministries and associated companies", a set of data said to be 100 times larger than the diplomatic cables release in 2010.
The organisation has also built a "general-purpose, multi-language political data-mining system" to search "massive datasets" for specific terms.
The "Syria Files" comprise emails dated from August 2006 to March 2012, according to an announcement by WikiLeaks.
"This extraordinary data set derives from 680 Syria-related entities or domain names, including those of the ministries of presidential affairs, foreign affairs, finance, information, transport and culture."
WikiLeaks added that it is "statistically confident that the vast majority of the data are what they purport to be."
The whistleblowing website said that stories relating to the data will be published by WikiLeaks as well as a number of other publishers, including ARD in Germany, the Associated Press in the US, L’Espresso in Italy and Owni in France.
At a press conference, WikiLeaks outlined a new "data-mining system" that can search for terms in documents, attachments and all file names within the attachments, and can also exclude certain terms.
Journalism.co.uk understands the new "search interface" will not be publicly available, but will be for use by WikiLeaks and journalists at partner news outlets.