Four #real-time search options leave Google behind

June 21st, 2009 | No Comments | Posted by Colin Meek in Featured, Realtime, Search tools and tricks

While there is no sign of panic yet, there is certainly evidence that established web giants such as Google and Facebook are frantically playing catch-up as the clamour for real-time search grows. Not only is there a range of tools to monitor Twitter in real time, there are also browser add-ons that add Twitter search results to your Google results page. Why is Google being so slow to grasp the demand for real-time search?
The bottom line for journalists is that Google no longer has the best answer to the simplest question: ‘What are people saying about [my query] right now?’ For those of us steeped in Google search experience – it’s a scary thought. Google is, apparently, working on a real-time offering but, in the meantime, are there competitors to the real-time results available from Twitter and the various Twitter monitoring tools?
Here’s a heads-up on four tools that don’t rely exclusively on Twitter and offer real-time monitoring for your search queries.

Collecta, just a few hours old, has the cleanest and most intuitive search page. Your search term is tracked and results are listed in a central column. Clicking on those results gives you fuller content and a link to the source. The left-hand column lists your recent queries and gives you the chance to include blogs, blog comments, images and updates (tweets) in your results. Collecta also tracks the time since your query. I’ve had impressive results so far.

Scoopler has another crystal-clear search page, with the main section split between ‘real-time’ search results and ‘popular’ results that include content from news sites and videos. The ‘real-time’ results are drawn from Twitter, Digg, Delicious and other networking sites. This middle column gives you the option of previewing the posts or clicking through directly to Twitter profiles.

OneRiot claims to do the same as Scoopler and Collecta, but when I tried its ‘realtime’ search for results containing ‘tehran’ and ‘iran’ there weren’t a convincing number and they failed to appear with anything like the frequency I would have predicted. On the plus side, you can use its ‘pulse’ search option, which uses an algorithm that looks at dozens of factors to give “weight” to certain results. Those factors include freshness, source credibility and ‘acceleration’, whereby posts that are gaining momentum (links) on the web are ranked as more important.
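To make the idea of weighting concrete, here is a minimal sketch in Python of how a score could be built from the three factors named above. OneRiot has not published its algorithm, so the weights, the formulas and every name in the code are assumptions invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    age_minutes: float         # time since the item was posted
    credibility: float         # 0.0 to 1.0, assumed to come from some source rating
    new_links_per_hour: float  # stand-in for 'acceleration'


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"freshness": 0.5, "credibility": 0.3, "acceleration": 0.2}


def score(r: Result) -> float:
    """Combine the three factors into a single weight between 0 and 1."""
    freshness = 1.0 / (1.0 + r.age_minutes / 60.0)         # decays as the item ages
    acceleration = min(r.new_links_per_hour / 100.0, 1.0)  # capped at 1.0
    return (WEIGHTS["freshness"] * freshness
            + WEIGHTS["credibility"] * r.credibility
            + WEIGHTS["acceleration"] * acceleration)


results = [
    Result("Old but authoritative report", age_minutes=600,
           credibility=0.9, new_links_per_hour=5),
    Result("Fresh post gaining links", age_minutes=10,
           credibility=0.5, new_links_per_hour=80),
]

# Highest score first.
ranked = sorted(results, key=score, reverse=True)
```

With these made-up numbers the fresh, fast-moving post outranks the older, more credible one, which is the trade-off a ‘pulse’-style ranking has to make.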

I looked at Icerocket in January. Not only has it set its blog search as its default option, it has also added a Twitter search and a real-time ‘Big Buzz’ alternative. Big Buzz pulls in very recent results from blogs, Twitter, Video, News and Images and gives you an ‘auto refresh’ option to update those results every minute or so as you are working on a story.

I am @colinmeek on Twitter

Google’s advanced operators for journalists

Master commands for precision surfing. Presentation.

Confusion is rife about how and when you can use Google’s advanced operators. Used effectively they can transform your research by helping you get better results faster. Here’s my recently updated presentation on advanced operators with some context and example results.
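As a small illustration of how operators combine, here is a sketch in Python that assembles a query string from a few widely documented ones: quoted phrases, site:, intitle:, filetype: and a leading minus for exclusion. The function name and its parameters are my own invention for this example and are not part of any Google API; only the operator syntax itself comes from Google.

```python
from urllib.parse import quote_plus


def build_query(phrase=None, site=None, intitle=None, filetype=None, exclude=()):
    """Assemble a search string from common advanced operators.

    All parameter names are hypothetical; only the operator syntax
    (site:, intitle:, filetype:, "...", -term) is Google's.
    """
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')           # exact-phrase match
    if intitle:
        parts.append(f"intitle:{intitle}")    # word must appear in the page title
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    parts.extend(f"-{term}" for term in exclude)  # drop pages containing these terms
    return " ".join(parts)


query = build_query(phrase="freedom of information", site="gov.uk",
                    filetype="pdf", exclude=["consultation"])

# URL-encode the query so it can be pasted into a browser or a script.
url = "https://www.google.com/search?q=" + quote_plus(query)
```

Here `query` reads `"freedom of information" site:gov.uk filetype:pdf -consultation`, which asks for PDF documents on gov.uk sites that contain the exact phrase but not the word ‘consultation’.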

Oops. Google flags all sites as harmful.

January 31st, 2009 | No Comments | Posted by Colin Meek in Featured

Earlier today Google responded to every search with the message ‘this site may harm your computer’. Google has apologised – with the official line that the mistake was caused by human error. The company said the glitch lasted for 40 minutes – which, in my view, is quite a long time for the world’s biggest and most influential search engine to remain effectively useless. The company has apologised to the owners of sites that were incorrectly labeled as ‘harmful’. It will be interesting to monitor how the company handles the inevitable criticism.

Thanks to Henk Van Ess who alerted me to this problem through his comments on my post on Icerocket.

Easy transfer from Google Notebook to Evernote and Zoho

January 29th, 2009 | No Comments | Posted by Colin Meek in Sorting and Storing

If, like me, you were shocked by Google’s announcement that it is to abandon development of its highly rated Notebook app – help is at hand. Two of the best notebooks have launched importers that enable Google Notebook users to switch easily to Evernote or Zoho.
Evernote’s instructions include a video tutorial while over at Zoho you’ll need a simple Firefox plugin.

If you haven’t tried a notebook app before then see my post on the 10 reasons why journalists and researchers should love Evernote.

Memory search tool – Infoaxe

January 28th, 2009 | 1 Comment | Posted by Colin Meek in Search tools and tricks

I’ve been aware of Infoaxe for some time but only recently discovered that its functionality allows you to do far more than search your private web history online. Here’s what it can do:

  • Once you’ve downloaded the Infoaxe toolbar it gives you quick access to the sites you visit most.
  • It lets you decide which pages you visit are stored in your history.
  • You can tag the pages you visit to organise your history.
  • Once you’re up and running, when you search Google or Yahoo you get search results from your history alongside the standard search engine results.
  • The ‘pivot’ tool enables you to bundle pages together from your history that you browsed at around the same time.
  • You can search your history using any computer.

Given the search functionality, Infoaxe effectively works to automatically bookmark all the pages you visit.
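For the curious, the ‘pivot’ idea of bundling pages browsed at around the same time can be sketched in a few lines of Python. This is not Infoaxe’s implementation, which I haven’t seen; the function name, the 15-minute gap and the sample history are all assumptions made up for the example.

```python
from datetime import datetime, timedelta


def bundle_by_time(visits, max_gap=timedelta(minutes=15)):
    """Group (timestamp, url) pairs into bundles of pages visited close together.

    max_gap is a hypothetical threshold: a longer pause starts a new bundle.
    """
    bundles = []
    for when, url in sorted(visits):
        # Start a new bundle if this visit is too far from the previous one.
        if not bundles or when - bundles[-1][-1][0] > max_gap:
            bundles.append([])
        bundles[-1].append((when, url))
    return bundles


# Invented browsing history for illustration.
visits = [
    (datetime(2009, 1, 28, 9, 0), "http://example.com/a"),
    (datetime(2009, 1, 28, 9, 5), "http://example.com/b"),
    (datetime(2009, 1, 28, 14, 30), "http://example.org/c"),
]

bundles = bundle_by_time(visits)
```

The two morning pages end up in one bundle and the afternoon page in another, so finding one page from a research session leads you to the rest of it.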

Icerocket skates in where Google fears to tread

January 22nd, 2009 | 9 Comments | Posted by Colin Meek in Featured, Monitoring Tools, People, Search tools and tricks

If, like me, you’re wondering why you can’t do a ‘Twitter Search’ from the big two search engines then John Battelle has a few theories. To you and me it might make perfect sense for Yahoo! and Google to include a Twitter search but, it seems, they are both willing to sacrifice functionality because they are reluctant to endorse a potential competitor. ‘Fascinating’ if you’re an industry observer; rubbish if you want a search engine to do the obvious.
That’s where Icerocket steps in. As Phil Bradley noted this week, the very nice Icerocket search engine has recently added a Twitter search option. But perhaps even more interestingly it has added a ‘Big Buzz’ option that pulls in very recent results from blogs, Twitter, Video, News and Images. It even gives you an ‘auto refresh’ option to update those results every minute or so as you are working on a story.
As Phil Bradley notes: ‘the whole area of news and social media is one that is seemingly passing Google straight on by.’ More on Icerocket soon.

Browsers compete on privacy controls

October 31st, 2008 | No Comments | Posted by Colin Meek in Your own privacy

There are very good reasons why journalists need to worry about their privacy more than most and I outlined those reasons in this article for journalism.co.uk a few months ago. So, it’s good to see that browsers are increasingly competing on the way they allow users to protect their online privacy. The Center for Democracy and Technology has published a report on the issue covering Firefox, Internet Explorer, Google’s Chrome and Safari. Here’s a link to the Resource Shelf post on the full report. Privacy Controls: Privacy focus means more choice for consumers protecting their personal data

The Semantic Web today – An Interview with John Breslin

Not many people are as close to the heart of the Semantic Web as John Breslin. John is the founder of the Semantically-Interlinked Online Communities (SIOC) project, a member of the W3C Advisory Committee, lecturer at the National University of Ireland and an associate researcher on the semantic web at the Digital Research Institute in Galway.

I caught up with John recently for this analysis of the semantic web and journalism published in journalism.co.uk. We thought John’s points were so interesting that we’ve brought you the full interview in this post.

Niche social networks…
Q: Some have predicted that the rise in the universal social network sites such as Facebook and MySpace will be mirrored soon by the explosion in growth of niche sites such as Xing and PEERtrainer and the expanding interest in ‘enterprise2.0’. In many ways this may accelerate the demand for semantic-type applications that allow people to travel seamlessly through various social networking services. What do you think?

John Breslin: I think that even though some have argued against the need for niche social networking services (SNSs) due to the widespread use of large sites like Facebook and MySpace, these niche SNSs can provide a breath of fresh air when one wants to escape from the bigger “overcrowded SNS cities”. As long as a niche SNS or community site provides regularly updated and relevant content to a steady or growing set of users, there is no reason that such sites should not survive or even flourish on the Web. As pointed out by Paul Gibler in his online article “The Expanding World of Social Networking”, it is the fine-grained and targeted communities such as CafeMom, BOOMj and PEERtrainer that are experiencing recent growth. This also ties in with the idea of object-centered sociality, where people don’t just connect randomly online but rather through the (niche) interests that they have in common. Mark O’Neill sums it up nicely: “…by organizing networks centrifugally around objects, social networking sites have meaning, even when they do not have 200 million users and even when they are centered around minority interests (like Thomas Kinkade paintings!). The point is that they are centered on objects which are in common.” As you say, a key is to allow people to seamlessly find and navigate through these niche interests, and that’s where projects like OpenID, FOAF and SIOC can help – from the point of view of having a single login that’s tied to your interests which can then be semantically matched to content items created across many communities.

Social network portability…
Q: There are several projects set up to address the issue of social network portability – allowing you to interact with various social networks more easily. In your view, will most people need to get used to the concept of a single global online identity such as FOAF?

John Breslin: I think that people are tired of repeating the same information in multiple places, and through standard signon systems like OpenID and profile representation mechanisms like FOAF, you can allow someone to define their identity and to reuse it wherever they choose to use it.

Tech stacks…
Q: You’ve described how a ‘social networking layer may be folded into tech stacks’ where your web and desktop application layer can tap into an integrated social networking stack. For me, this opened my eyes to how important the shift to the semantic will be. I think many people assume that the semantic web will usher in a new period of improved search. But, in fact, it will utterly change the way we interact with the internet?

John Breslin: A lot of the focus from the public or media regarding the semantic web has been in relation to search. But it’s not solely about finding those relevant objects (people, places, etc.) through “Google killers”, and it’s not only about the Internet (despite being called Web3.0!), but it’s also about providing ways to allow systems (on the desktop, or the Web, or media servers, whatever) to interoperate with each other as well. The social networking stack is one nice example, and indeed efforts like the Social Semantic Desktop and Social Semantic Web can interoperate through such a stack. It may also be for migration between different collaborative workspaces or social software systems, as we are doing with the SIOC project.

Your online identity…
Q: You’ve also suggested that online communities should provide their data in a common, machine-understandable way and should use common semantics to define this data (SIOC and FOAF). The way semantic services will be deployed is unpredictable but do you envisage people signing up to new social networks and setting up a profile automatically using their FOAF file? In the future, do you think people who want to network with each other will swap FOAF files and these files will include relevant information about social network membership?

John Breslin: Yes, and this is being done to some extent already. But also it’d be nice to not just bring your personal profile and your friends with you (for example, via FOAF) but perhaps your content as well (maybe defined using SIOC). There are some issues related to both transporting your friends (need their permission) and comments attached to your content (may need the permission of those commenters too), but you should at the very least be able to bring what belongs to you (your profile and your content), for example along the guidelines of the “Bill of Rights for Users of the Social Web” by Canter et al.

Meshing of networks…
Q: A practical consequence of SIOC might be that you might do a search in Facebook using the term ‘bog-snorkelling’ and get results back that may include profile pages that include that term, but also blog results from Technorati, comments from Flickr albums and YouTube videos? Equally, a practical consequence of SKOS, FOAF and SIOC could be that you click on a tag for ‘bog snorkelling’ in Delicious and get results from a range of social network sites?

John Breslin: Exactly! I’m delighted that Yahoo! SearchMonkey have listed SIOC as one of their recommended vocabularies – and that people are now starting to get the idea of being able to retrieve user-generated content items from all or from specific types of social websites (blogs, forums, mailing lists, photo albums) using mechanisms like SIOC and FOAF. Through people defining interests explicitly using something like a foaf:interest field or implicitly by clicking on tags of interest, relevant content can be easily returned from social websites with appropriate dc:subject or sioc:topic metadata.
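A rough sketch of the matching John describes, written in Python with plain dictionaries rather than a real RDF library. The property names foaf:interest, sioc:topic and dc:subject are the ones mentioned above; the person, the content items and the function name are invented for illustration, and a real system would read this metadata from published FOAF and SIOC documents instead.

```python
# A person's explicit interests, as might be read from foaf:interest values.
# (Invented data.)
person = {"name": "Alice", "foaf:interest": {"bog snorkelling", "semantic web"}}

# Content items from different kinds of social websites, each carrying
# topic metadata. (Invented data.)
items = [
    {"site": "blog", "title": "My first bog snorkel",
     "sioc:topic": {"bog snorkelling"}},
    {"site": "forum", "title": "Best laptop bags",
     "sioc:topic": {"luggage"}},
    {"site": "photo album", "title": "Championship photos",
     "dc:subject": {"bog snorkelling", "wales"}},
]


def matching_items(person, items):
    """Return items whose sioc:topic or dc:subject overlaps the person's interests."""
    interests = person["foaf:interest"]
    matches = []
    for item in items:
        # An item may carry either kind of topic metadata, or both.
        topics = item.get("sioc:topic", set()) | item.get("dc:subject", set())
        if topics & interests:
            matches.append(item)
    return matches


found = matching_items(person, items)
```

The blog post and the photo album both come back even though they live on different kinds of site and use different metadata fields, which is the cross-site retrieval the question was getting at.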

Practical implications…
Q: A practical result being that you create a new account with a new social network and that SN can identify other people on that Network who are listed in Bob’s defined relationships. Have any social networks already deployed this service?

John Breslin: There are many sites (e.g. Dopplr) that are starting to allow you to bring your friends with you by specifying something like your GMail account details (and then matching e-mail addresses you use) or your Twitter account details (and then retrieving a list of those whose microblogs you follow), but it is certainly useful to have a smaller set of reusable relationship formats that can make this more widespread (and that extends the number of services that you can import from). The Google SocialGraph API is a nice example of something that can enable this, as it allows applications to reuse social graph information extracted from sources all over the Web and represented using the open formats XFN and FOAF.

Searching the semantic layer…
Q: I’m a bit confused by the SIOC RDF Browser and if there are any applications that currently allow one to browse information expressed in RDF and SIOC ontology – I assume you need specific URLs to use this?

John Breslin: The SIOC RDF browser is simply a way to view RDF information in a more human-friendly form. One of the motivations for creating this was to enable people to view semantic information easily because it may have different aspects that can be of interest. It may be the same information you see on a normal web page, but it may also contain extra information that is not normally displayed on a web page but is rather hidden or locked into a database, and that information may prove useful for some third-party applications (e.g. a modification date, incoming links). Or perhaps some extra information can be calculated or inferred for a semantic page (related content on the same topic, tag usage frequencies, etc.).

Semantic search…
Q: From the perspective of the non-technical lay researcher – where does Sindice (the semantic web index) and other semantic search tools fit in?

John Breslin: Sindice can be thought of as a big semantic index of the Web. It allows you to find pointers to relevant pages or URIs where particular keywords are mentioned, where certain property values are used (e.g. pages where a person says their e-mail address is john.breslin@deri.org), or where certain facts or semantic triples appear. If you’re looking for a “semantic search engine”, it depends on what you need. Sindice gives you pointers to where stuff is, whereas many other engines give you the stuff as well (without you having to go to the source page).
SWSE (also from DERI) and Swoogle allow query capabilities over the collections of all Semantic Web statements – so if you search for Galway, it can show you the relevant statements as well as pointing you to the pages they were obtained from.
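The difference between returning pointers and returning the statements themselves can be shown with a toy index in Python. Nothing here reflects how Sindice, SWSE or Swoogle are actually built; the statements, the ex: names and the two lookup functions are assumptions invented for this sketch.

```python
from collections import defaultdict

# (subject, predicate, object, source page). All values are invented.
statements = [
    ("ex:john", "ex:name", "John", "http://example.org/john.rdf"),
    ("ex:john", "ex:locatedIn", "Galway", "http://example.org/john.rdf"),
    ("ex:institute", "ex:locatedIn", "Galway", "http://example.org/institute.rdf"),
]

# Index every statement by its object value, lower-cased for keyword lookup.
index = defaultdict(list)
for s, p, o, source in statements:
    index[o.lower()].append((s, p, o, source))


def pointers(keyword):
    """Pointer-style answer: only the pages where matching statements live."""
    return sorted({source for _, _, _, source in index[keyword.lower()]})


def statements_for(keyword):
    """Statement-style answer: the matching statements plus their source pages."""
    return list(index[keyword.lower()])


pages = pointers("Galway")
facts = statements_for("Galway")
```

`pages` holds just the two document addresses, leaving a third-party application to fetch and interpret them, while `facts` carries the statements about Galway along with where each one was found.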

But I think the applications of Sindice, i.e. finding pointers to where stuff is, and using that in third-party applications, are quite interesting. For example, the SIOC Widget for WordPress is powered through a combination of distributed SIOC documents and the Sindice index. So, when you are browsing a blog that has this widget installed, you may see little balloons appearing beside commenters’ names. Clicking on these balloons shows a pop-up with a list of content (posts, comments, topics) that that commenter has created not just on the blog site you are viewing but across a range of SIOC-enabled websites (blogs, forums, mailing lists, whatever) as indexed in Sindice. Here is a picture. So you can see and navigate to the content a person has created across a range of sites from just one place that they post to.

On the cusp…
Q: Moving on to practical applications. I was interested to read “On the cusp” by David Provost. In it he concludes that companies are on the verge of constructing very practical and commercially viable semantic applications. Do you agree?

John Breslin: I think that we are now beginning to see the real commercial applications of what can be done when all kinds of things on the Web are connected together using semantics. This is obvious in the attention being given to startup companies in this space like Powerset, Metaweb (Freebase) and Radar Networks (Twine), and also since many big companies including Reuters (Calais API), Yahoo! (SearchMonkey) and Google (Social Graph API) have all announced in 2008 what they are doing with semantic data.

There has been a lot of talk this past year about the social graph (notably from Google’s Brad Fitzpatrick), which looks at how people are connected together (friends, colleagues, neighbours, etc.), and how such connections can be leveraged across websites. In the Semantic Web, it is not just people who are connected together in some meaningful way, but documents, events, places, hobbies, pictures, you name it! And it is the commercial applications that exploit these connections that are now becoming interesting. But it is very important that the users aren’t exposed to any RDF or semantic terminology – through usage, they just “get” the fact that everything is interconnected.

And the best product?…
Q: In your view – what are the most exciting semantic product developments to have emerged in the last year?

John Breslin: I really like Radar’s Twine, the “knowledge networking” application that allows users to share, organise, and find information with people they trust. I find Twine very interesting, and as well as using it to gather information about SIOC for regular blog entries I write (“Tales from the SIOC-o-sphere”), I also use it to gather and publish personal interests that I think will be of interest to the public, and for passing on interesting stuff to work colleagues.

Privacy…
Q: What about the privacy angle? Are the privacy safeguards in place capable of evolving to meet this challenge? Does the average LiveJournal user know that their profile has been converted to a FOAF file and is now translatable by any number of new semantic products? Speaking as a journalist, my hunch is that the vast majority of people are going to be surprised and, perhaps, shocked to know that a public comment they make on LiveJournal may end up in a database that is searchable by people in LinkedIn.

John Breslin: No, certainly people aren’t aware that many sites are making semantic forms of their content available which can be reused elsewhere. Tribe.net recently turned off their FOAF exports after a user complained that his/her profile was being copied for use elsewhere (the original developer team had moved on so the new developers weren’t sympathetic to the possibilities of the Semantic Web). Similar things happened with people blogging and finding that content from their RSS feeds was popping up on other sites. There certainly has to be more thought put into educating users and towards having opt-in-opt-out mechanisms when implementing semantic exports, especially for personal content and profiles.

Thanks to John for his time with this.

New UK search engine – insite talks to lead programmer

October 10th, 2008 | No Comments | Posted by Colin Meek in Featured, Search engines - advanced, Search tools and tricks

The beta version of the new UK-based search engine MSE360 has attracted praise from both sides of the Atlantic with a three-tier display, clean design and other unique features such as virus alerts. I caught up with its Lead Programmer Daniel Clarke to talk about his plans, what MSE360 can offer journalists and researchers, and how a UK search engine can find elbow room in a crowded market.

Insite: ‘MSE360’ – what does the name mean?

“MSE stands for Multi Search Engine, 360 for 360 degrees.”

Are you a UK team based in the UK?

“Yeah, we’re based in Kent, UK. All staff are based in the UK and we pride ourselves on that. Britain has some of the best minds when it comes to technology, after all, the internet was born from a British Inventor, Tim Berners-Lee.”

There are lots of search engines out there – some doing a great job and some that are not so good. Some of your friends must have suggested that the market is already crowded. How did you respond?

“Indeed, very much so, the market is at a stage where it’s hard to make any sense of a push, and we’ve been told many times that the market has no room for another search engine. We respond with the facts. 90% of the new search engines that launch do so without bringing new or exciting features. Cuil has a slightly different user interface, but that was limited and the technology really didn’t live up to the hype. We don’t know if we shall succeed or die out, but if no one tries then what’s to push the larger search engines from improving? Percentage of search market is irrelevant to us, some people have told us that they will be switching to MSE from other search engines, so as long as we make a few people happy, I think we’ve done our job.”

OK – insite and journalism.co.uk readers are busy people. Why should they turn to MSE360 instead of ask.com? What is your main message to the heavy internet users?

“MSE360 was created to speed up the search process. Why navigate between blog search engines, web search engines, image search and Wikipedia if you can find it in one resource? MSE360 brings a lot of sources together in a simple layout. Our anti-virus features also keep the average user safer. We also, unlike other search engines, store no personal information. The fact of the matter is simple, MSE is like marmite, you’ll either love it or hate it.”

I once spent about an hour looking for someone on Google only to find that person as the top hit when I eventually gave up and used another search engine. What will it take to convince internet users – particularly those in the UK – that Google doesn’t have all the answers?

“Google is entrenched in the minds of the British population, and that’s the main challenge for us. We’ve got to change the perception that Google has all the answers. To do that we will be investing in schools to make sure students are not just informed about Google, but the wider range of search engines (such as Ask, Live and MSE). We also are going to focus an advertising campaign on the fact that Google doesn’t have all the answers. But the main factor in this is the technology; we can advertise as much as we want but unless we focus on improving the search our biggest advertising challenge – word of mouth – won’t succeed.”

On your site you say you use your own robots and algorithms but you also use partners. Which other search engines do you partner with?

We’re slowly phasing out the external engines, but we use resources from Yahoo and Live Search.

OK – moving on to your unique selling points. Your service is fast, you flag sites that contain viruses, you allow community results and you offer a clean and easy-to-understand 3 tier layout. What else can users expect to see over the coming months?

We have some exciting new features in store. First of all our indexing methods will be changing to provide better results and we’ll be expanding community results and adding a voting system on all results. We plan to allow full customization of the search – from simple layout changes to algorithm changes. The user interface will be improved as will the speed. In the next month we’ll have user added modals which will allow users to add their own search methods, for example torrents or a certain site. 60% of our features come from user suggestions, so there is plenty to come!

One blog recently suggested that MSE360 doesn’t support advanced operator searches. It was very wrong because I’ve used your engine to carry out some complex searches using the advanced operators I can use on Google. Are there any operators you support that Google doesn’t?

We use the standard operators currently (AND OR ELSE etc) but I think this is something we have to improve on. We’re going to be adding content license operators (eg CC25) and algorithm operators, so the user will be able to find sites without adverts, with ‘x’ degree of adverts, etc. I can’t go into too many details, but we’ve got more coming!

Are you planning to implement an advanced search page?

Our beta version currently has an advanced search page, this will go live in the coming weeks.

Are you planning to incorporate any semantic technology into your search service?

We are, but I’m afraid I can’t go into detail on this area yet. In the coming months we’ll release more details in this exciting new area.

Warnings about cloud computing

October 1st, 2008 | No Comments | Posted by Colin Meek in Asides

Here are some warnings about cloud computing from some interesting sources. Firstly, GNU founder Richard Stallman warns that cloud computing is likely to be a costly ‘trap’ as users become increasingly dependent on one provider for online applications and storage. And, secondly, Chris Brogan describes how horrible it can be if you do become too dependent on one provider – Google in this case.
