Advanced online search tools are a must for any journalist, and Murray Dick, multimedia lecturer at Newcastle University, shared his advice on search at the Centre for Investigative Journalism's Summer Conference in London today.
"The best way to find something online is to imagine someone answering, hypothetically, the question you have," said Dick.
"It could be in business language, it could be in teen speak, it could be in a formal way of talking. But think in the idioms or language of a person who might give you the answer you want."
Because most search results will have been written by an individual, you need to think in terms of the answer you are looking for, from the person you expect to answer it.
In almost all search engines, a space in a search will act as an AND signifier. This is the simplest and most basic way to structure searches that most people will understand by instinct.
Using AND to link words or phrases should narrow the search down to a specific area.
The OR function, which can be replaced with the pipe symbol '|', does the opposite: it expands the results to pages containing any of the terms entered.
Remember: human error and diversity will always be the most difficult factors to account for, so it's worth thinking of possible variations in the answers you want. Searching for "foodbank OR 'food bank'", for example, should cast a wider net.
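As a rough illustration of the variations idea, the Python sketch below builds an OR query from a list of spellings. The function names are hypothetical, and it assumes the search engine treats double quotes as a phrase marker and the word OR as an operator, which may not hold everywhere.

```python
def quote_phrase(term: str) -> str:
    """Wrap a multi-word term in double quotes so it is matched as a phrase."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def any_of(*variants: str) -> str:
    """Join spelling variants with OR so a page matching any one of them counts."""
    return " OR ".join(quote_phrase(v) for v in variants)


# Hypothetical helper names; the variants are the ones used in the article.
query = any_of("foodbank", "food bank")
print(query)  # foodbank OR "food bank"
```

Keeping the variants in a list makes it easy to add further spellings as you come across them.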
You can also remove certain terms from the search by using the NOT function. Let's take a hypothetical local government corruption scandal. If you want to find some prior cases, it may be helpful to search for results without the name of the councillor currently in question.
There's a risk of removing relevant and useful results with this function, but it is one method for narrowing the search.
Most search engines bring back results which appear relatively close together in a sentence, but the AROUND function can be used to exert a little more control to "tease out relationships between two people or organisations you're searching for".
Let's say our corruption scandal involved a councillor and a construction firm. Searching for "[councillor] AROUND(10) [firm]" will return results where the two names have appeared within ten words of each other.
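A proximity search like this can be sketched in Python as below. The helper name and the two example names are made up for illustration, and the operator is written as AROUND(n) with no space before the bracket, which is an assumption about how the search engine expects it.

```python
def around(first: str, second: str, distance: int = 10) -> str:
    """Build a proximity query: both names within `distance` words of each other."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # Quote each name so multi-word names are matched as whole phrases.
    return f'"{first}" AROUND({distance}) "{second}"'


# Placeholder names, not real people or companies.
query = around("Jane Smith", "Acme Construction", distance=10)
print(query)  # "Jane Smith" AROUND(10) "Acme Construction"
```

Trying a few different distances may help show how closely the two names are linked in the coverage.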
Phrases are useful when looking for particular quotes or long strings of words in order, and can be simply achieved by putting the designated phrase in quote marks.
If you are fishing for a phrase or quote, this again comes back to thinking in the language of someone who might answer your question.
Perhaps you're searching for a particular phrase or topic but are uncertain as to some of the elements. Looking for quotes on a particular topic using the wild card '*' symbol – for example, in the phrase "* said" – will tell the search engine you are looking for an additional word in the search phrase.
"There are limitations in language which force search results down certain linguistic cliches," Dick said, and it is important to be aware of these and use them to your advantage.
A lot of search engines will give results including synonyms in place of one or other of the search terms, but it can sometimes be useful to "force the synonym", said Dick, to bring back extra results.
The tilde symbol "~" used before a particular word will include synonyms in the search. So "~bribery" or "~corruption" should serve a similar function in a search and act as a slightly more specific version of the wild card, although it is worth noting that Google have now dropped the tilde as an operator in searches.
There are numerous functions available for digging deeper into specific websites.
For example, if you were looking for a political corruption expert at a UK university to comment on a story, you could search for "political corruption site:.ac.uk".
Similarly, you may find an expert by looking for their blog. To search the 'about' pages at WordPress for climate scientists, for example, you could use "climate scientist site:.wordpress.com/about".
"See if you can take advantage of the fact that most websites are rolled out as cheaply as possible," said Dick. There will be similarities in the wording, structure or addresses between different institutions with similar domain names that can help to inform the search.
Wikipedia holds information about all the different domain names that may be available, but most government websites involve a variation on ".gov" while educational establishments often end in ".edu" or ".ac".
Other functions allow you to search for files or databases, with "filetype:pdf" or "filetype:csv", for example, or for similar websites using the "related:" function.
Know of a white supremacist forum or website and want to research others? Using "related:www.stormfront.org" with a particular username, for example, will see if that username is used elsewhere, and what they have been saying.
All of these functions can be tied together to dig even deeper into search results, although Dick warned that occasionally some functions can confuse the process. Finding out which can be a process of trial and error.
Looking for quotes around local corruption in government reports?
Try: "political corruption site:gov.uk filetype:pdf '* said'".
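For anyone who runs the same kind of combined search often, here is a minimal Python sketch that ties the operators together and turns the result into a search link. The function names are hypothetical, the file type is written without a leading dot as an assumption about what the engine expects, and the link format assumes Google's standard "q" query parameter.

```python
from urllib.parse import urlencode


def build_query(terms, site=None, filetype=None, phrase=None):
    """Combine plain terms with optional site:, filetype: and quoted-phrase parts."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")
    if filetype:
        # Drop any leading dot, so ".pdf" and "pdf" give the same query.
        parts.append(f"filetype:{filetype.lstrip('.')}")
    if phrase:
        parts.append(f'"{phrase}"')
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query so it can be opened or shared as a link."""
    return f"{base}?{urlencode({'q': query})}"


query = build_query("political corruption", site="gov.uk",
                    filetype="pdf", phrase="* said")
print(query)  # political corruption site:gov.uk filetype:pdf "* said"
print(search_url(query))
```

Because each operator is optional, you can drop them one at a time to find out which is confusing the search, in keeping with the trial-and-error approach Dick described.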
For more information on advanced search, check out this resource from the Centre for Investigative Journalism.
Update: This article has been updated to show that Google have dropped the tilde function for synonym searches, and to clarify the AROUND function.