Turbo-charge your Firefox browsing

April 9th, 2009 | No Comments | Posted by Colin Meek in Featured, Search tools and tricks, Semantic web

As most people know, if you’re not taking advantage of the many Firefox add-ons and plug-ins then you’re not making the most of this browser. Even so, where do you start? I’ve read several blogs recently listing the ‘best 20’ Firefox add-ons with others running the list to 50. But if you genuinely want to take your research to the next level you need a few hand-picked additions that will help you do more in less time. If that sounds too good to be true – here are a few ideas.

1: Scrapbook: This add-on is an incredibly powerful research tool that enables you to save web pages, page snippets and whole sites. You can organise your saves just like bookmarks (by dragging and dropping in trees) but, crucially, scrapbook saves the page (or pages) not just the link. If you need reliable access to sources, this is the add-on for you.

  • save pages using a drop-down menu or by dragging the page favicon into the Scrapbook Firefox sidebar.
  • drag and drop page snippets and save linked pages just by dragging the links to the sidebar.
  • highlight sections in saved pages.
  • annotate pages.
  • use ‘in-depth’ capture to save whole sites and create site maps (see below).

Scrapbook is the answer if you need access to a range of pages and sites offline and to ‘capture’ a whole site and its links to external sites. Scrapbook even comes with a filter tool that means you can capture only the pages belonging to a target site while ignoring external links.

2: Picnik: Not necessarily a research tool, but beautifully simple and useful. Picnik is a quick way to do what you want with pictures – online, in your browser. You can create files of pictures, pulling them from your own accounts on sites such as Flickr and from your own hard drive. But, from a research and publishing perspective, you can download images from sites, give them a quick edit and change their format, ready for use within seconds.
The Firefox add-on makes life even simpler. Right click on an image (or ‘ctrl’ click for Macs) and you can ‘edit image in picnik’. The image then automatically loads to your library in Picnik. No need for the laborious task of saving images to a photo editing application then exporting locally before you can upload online.

3. The Evernote Webclipper:

This add-on creates a handy button on your Firefox browser that you can use to quickly save a selection of a web page or an entire page to your Evernote account. If you need some background on why Evernote can transform your online life then check my recent post on this app.

4. Juice: This add-on is one of a new wave of intelligent search tools that let you access linked content without you having to navigate away from the pages you are viewing. When you highlight and drag a selection, Juice searches for reference material, movies, news and pictures and presents the content clearly in a separate Firefox column. You can switch Juice on or off easily by using a simple button on your browser bar.


Juice’s rocking webcast from Linkool Labs on Vimeo.

5: Semantic Radar: For those of you interested in the development of the Semantic Web, Semantic Radar is another tool that gives us a glimpse of what semantic tools are bringing to the web. Semantic Radar recognizes RDF content and displays custom icons in Firefox to indicate the presence of data in vocabularies such as SIOC and FOAF. On a Livejournal page, for example, Semantic Radar detects the RDF content and shows the matching icons. Click on those icons and you can access the RDF content directly. For more on the Semantic Web see my interview with John Breslin.
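
For readers curious about how a tool might spot RDF data on a page, here is a minimal Python sketch. It assumes the page advertises its RDF through a `<link>` element with the type `application/rdf+xml` in the page head, which is a common way of pointing to FOAF and SIOC files. The sample page, the URL and the names `RdfLinkFinder` and `find_rdf_links` are made up for illustration; this is not Semantic Radar's own code.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collects <link> elements that advertise RDF data for a page."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (title, href) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Attribute values can be None when written without a value.
        link_type = (attributes.get("type") or "").lower()
        if link_type == "application/rdf+xml":
            self.found.append(
                (attributes.get("title") or "", attributes.get("href") or "")
            )


def find_rdf_links(html):
    """Return (title, href) pairs for every RDF link found in the HTML."""
    finder = RdfLinkFinder()
    finder.feed(html)
    return finder.found


# Hypothetical page: one RDF link (FOAF) and one ordinary stylesheet link.
SAMPLE_PAGE = """
<html><head>
<link rel="meta" type="application/rdf+xml" title="FOAF"
      href="http://example.org/user/foaf.rdf">
<link rel="stylesheet" type="text/css" href="style.css">
</head><body>Hello</body></html>
"""

print(find_rdf_links(SAMPLE_PAGE))
```

The sketch only reports where the RDF file lives; fetching and reading that file is a separate step.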


Semantic search – an Interview with Brooke Aker

December 15th, 2008 | 3 Comments | Posted by Colin Meek in Advanced Techniques, Featured, Search engines - advanced, Semantic web

For my second expert interview on the semantic web I set out to find a key commentator who is currently involved at the heart of commercial semantic search. Can someone like that describe how these web developments will impact on coal-face journalists and researchers? I didn’t need to look far. Brooke Aker is an expert in competitive intelligence and before taking up his post at Expert System he formed both Acuity Software and Cipher Systems. He has worked with 130 of the Global 2000 in the formation and operation of successful intelligence and is a key commentator on the semantic web.

Expert System is a leading provider of semantic software which discovers, classifies and interprets text information. Its semantic software, Cogito, has been deployed across most industry sectors and the company’s clients include Eni Group, Pirelli, Microsoft and Telecom Italia.

Social networks and semantic search…
Q: Do you think the explosion in growth of niche sites such as Xing and Peer Trainer will accelerate the demand for semantic-type applications that allow people to travel seamlessly through various social networking services?

Brooke Aker: I would agree with this. Facebook and Myspace are good examples of getting people used to the idea that they can, not just search, but connect to people and content. So they set the stage for users to migrate quickly to Web 3.0 properties where users can search for and connect, analyze, and assemble very specific people and document objects in ways that are uniquely designed by them.

Transforming the way you work…
Q: You recently released a (very useful) presentation on ‘what is semantic search’. It remains, however, difficult for coal-face researchers (insite readers!) to grasp its significance. What, in your view, are the best examples of semantic search that hold the most promise? I’m thinking here of apps like Juice. Not tools that help publishers – but tools that can currently help people in their day-to-day work?

Brooke Aker: We have been involved with applications that use semantics and combine search with discovery, or search with analysis. Let me explain….
Because semantics expands and connects similar concepts, from where I begin my search, I may end up in a place I did not expect. Say I run a search for “stock” and ask to limit my search to the concept of stock in the sense of soup. This helps avoid stock as in “inventory” or stock as in “equities.” Now, I tell the system to expand the concept of soup stock and I get bouillon, stock, base, and a completely new word to me called fumet. I can then reduce my search results further by noting Emeril Lagasse is mentioned as a chef in one of the documents extracted. So in the end, I used semantics to search for a recipe on soup stock and ended up in a precise but completely new place with a recipe called “Emeril Lagasse’s classic fish fumet.” The document had no mention of the word soup or stock. This is something I would have otherwise missed.
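
The search Brooke describes can be sketched in a few lines of Python. This is a toy illustration only: the sense inventory, the three documents and the function names `expand` and `search` are all hypothetical, and simple substring matching stands in for real semantic indexing. It is not how Cogito works internally.

```python
# Hypothetical sense inventory: each sense of a word maps to related terms.
SENSES = {
    "stock": {
        "soup": ["stock", "bouillon", "base", "fumet"],
        "inventory": ["stock", "inventory", "supplies"],
        "equities": ["stock", "shares", "equities"],
    }
}

# Hypothetical document collection.
DOCUMENTS = {
    "doc1": "Emeril Lagasse's classic fish fumet",
    "doc2": "Quarterly report on shares and equities",
    "doc3": "Warehouse inventory and supplies checklist",
}


def expand(term, sense):
    """Return the related terms for one sense of a word."""
    return SENSES.get(term, {}).get(sense, [term])


def search(term, sense, must_mention=None):
    """Find documents matching any expanded term, optionally narrowed by a name."""
    terms = expand(term, sense)
    hits = []
    for doc_id, text in DOCUMENTS.items():
        lowered = text.lower()
        if any(t in lowered for t in terms):
            if must_mention is None or must_mention.lower() in lowered:
                hits.append(doc_id)
    return hits


# The recipe is found even though it never mentions "soup" or "stock".
print(search("stock", "soup", must_mention="Emeril Lagasse"))
```

Choosing the "soup" sense keeps the inventory and equities documents out of the results, and the expanded term "fumet" is what brings the recipe in.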

For search combined with analysis, we often will employ semantics in a modeling sense. Think about a competitor who may be preparing to launch a new product, but the company has not made anything public yet. We know what steps that company must be taking in order to launch a new product: things like ramping up production lines, buying new machines, contracting with ad agencies, hiring new people with specific skills, etc. These actions are likely to be public. Semantics are employed to broadly find these indicators, which feed a model. If enough of the indicators are present, the model concludes a new product is forthcoming. So here, semantics plays a predictive role. Such foreknowledge of such things is many times more valuable than simply knowing the moment something is reported in the press as having already occurred.
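
A rough Python sketch of that kind of indicator model follows. The indicator phrases, the sample reports, the threshold of three and the names `indicators_present` and `launch_likely` are assumptions made for illustration; a real system would find the indicators semantically rather than by matching fixed phrases.

```python
# Hypothetical indicators of a forthcoming product launch, with trigger phrases.
INDICATORS = {
    "production": ["ramping up production", "new production line"],
    "equipment": ["buying new machines", "new machinery"],
    "advertising": ["ad agency", "advertising contract"],
    "hiring": ["hiring", "job opening"],
}


def indicators_present(reports):
    """Return the set of indicator names found in any of the public reports."""
    found = set()
    for report in reports:
        lowered = report.lower()
        for name, phrases in INDICATORS.items():
            if any(phrase in lowered for phrase in phrases):
                found.add(name)
    return found


def launch_likely(reports, threshold=3):
    """Conclude a launch is likely when enough distinct indicators are present."""
    return len(indicators_present(reports)) >= threshold


# Hypothetical public reports about a competitor.
REPORTS = [
    "Acme is hiring packaging engineers",
    "Acme signs advertising contract with a London firm",
    "Acme reported buying new machines for its plant",
]

print(sorted(indicators_present(REPORTS)))
print(launch_likely(REPORTS))
```

No single report says a product is coming; the conclusion comes from several separate signals appearing together.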

Tech stacks…
Q: Many people, I think, assume that the semantic web will usher in a new period of improved search. But, in fact, developments such as the ‘social semantic desktop’ like Nepomuk may accelerate the development of semantic web technology. Do you agree?

Brooke Aker: I agree. The ideal architecture would be to re-index the entire Web semantically and have new browsers to read it. But that seems like a long shot for the time being. So instead, if you embed the semantic processing of every html page the standard browser reads, stores and retrieves locally, you have in effect federated the problem across the Internet. And of course those same special browsers or browser plug-ins could also peer-to-peer share their semantic results if directed by the end user.

Filter failure…
Q: I really like your graph visualising the downside of web2.0 (in your presentation) – as more and more information is mass produced there is a danger that productivity may slide. Do you agree with Clay Shirky’s recent argument that ‘information overload is just filter failure?’ What we’ll all have to get used to is using the right filters at the right time and learn how to maintain them?

Brooke Aker: Filters are a blunt instrument to a more delicate problem. And it implies a lot more work on the users end. This spells failure to me. People want convenience and simplicity, and are already overwhelmed.

About Cogito….
Q: Can you describe some of the practical applications to which your software, Cogito, has been applied (and therefore, where Expert System USA is positioning itself)?

Brooke Aker: One of the best examples we have now is in customer service. We support the online help function of mobile devices using a natural language interface. So users type a question they have about how to operate their devices into the handset. We return one precise answer to them. This does two things. First, users don’t return the device having found it too difficult to operate, which saves the company from losing the cost of acquiring the customer before it can earn it back. The second thing it does is deflect an inbound call to the customer service center, where the average cost is $20. We can give the direct, accurate answer for about half a penny.

Q: In your presentation you draw a green figure that demonstrates how web3.0 will allow people to better ‘filter’ or pick the content that is relevant to them. In your view, what single application/product best demonstrates the power of this technology?

Brooke Aker: Yes, this is the example I gave about the fish fumet. The product we have made to do this is Cogito Focus. It is a corporate search tool that includes crawlers, semantic indexers and a nicely done interface that is not much of a stretch over a conventional search box interface (e.g. little training).

Viable applications….
Q: I was interested to read “On the cusp” by David Provost. In it he concludes that companies are on the verge of constructing very practical and commercially viable semantic applications. (Provost makes the point that Twine is succeeding because it has ditched semantic terminology and focused on the ‘business mission’. While the terminology isn’t obvious, the semantics under the hood are.) Do you agree?

Brooke Aker: Sure I do. Business users don’t care how it works, only that it does work and provides some visible, measurable value. Same is true for consumer-facing applications as well. Can I do something helpful and valuable that I could not do before? That’s what matters.


Nepomuk – the social semantic desktop

November 27th, 2008 | No Comments | Posted by Colin Meek in Semantic web, Social Networks, Sorting and Storing

Last month I talked to John Breslin about how web applications may become semantically integrated with your desktop applications. In other words, the ‘semantic web’ isn’t just about improving search, it is also about allowing all the systems you use to interoperate with each other. For example, imagine browsing your RSS feeds and a name of a post author is highlighted on your screen. Your desktop computer then makes connections between that person and your contacts file and your friends in social networking sites.

John and I could have been talking about Nepomuk – the social semantic desktop, which was reviewed by physorg.com this week. This EU-funded project aims to apply the kind of semantic solutions being deployed on the web to your desktop allowing your desktop to more easily process the information it has across different platforms, media types and applications.

The practical result could be that you will be able to access information on your desktop using semantic technology in the same way that you will be able to make connections semantically online.


Juice – the intelligent Firefox plugin

November 24th, 2008 | 1 Comment | Posted by Colin Meek in Featured, Search tools and tricks, Semantic web

As outlined in my presentation on the semantic web, a few tools are giving us a glimpse of how intelligent applications are changing search tools, allowing us to move away from the ‘search term’ engine model of research. I described the Firefox plugins ClearForest Gnosis and Headup as two examples. Another recent launch demonstrates the power of these intelligent search tools that let you access linked content without you having to navigate away from the pages you are viewing. Juice is a Firefox plugin that allows you to access information linked to a term you are interested in. Juice searches for reference material, movies, news and pictures and presents the content clearly in a separate Firefox column. All you have to do is highlight your term and drag it into the column. You can switch Juice on or off easily by using a simple button on your browser bar. I accessed surprisingly relevant content by testing Juice using two potentially tricky terms – ‘Sirius Star’ and ‘Antony Gormley’.
While Gnosis indicates how intelligent search using language processing can work, Juice takes the concept to the next level. For a quick introduction see the Juice webcast:


Juice’s rocking webcast from Linkool Labs on Vimeo.


Neat presentation on semantic search

November 4th, 2008 | No Comments | Posted by Colin Meek in Semantic web

Following on from my articles on Web 3.0 for journalism.co.uk and the full interview with John Breslin, I came across this short but excellent presentation by Brooke Aker on What is Semantic Search? Although it is geared towards a retail audience it does neatly sum up how individuals like journalists and researchers will be able to (and are able to) use semantic search tools to filter information more effectively.

On the subject of semantic search, I’ve added a whole new module to my course on Advanced Internet Research that will cover semantic web tools. Check out this page for more information about the course. I also do a course for the NUJ in Scotland.


Privacy and the Semantic Web

October 31st, 2008 | No Comments | Posted by Colin Meek in Semantic web, Your own privacy

Semantic Web pioneer John Breslin has responded to my articles on Web 3.0 in his post on his Cloudlands blog. His post covers two broad issues. Firstly, he adds some important points about semantic web search and the differences between the various Semantic Web search engines that exist. I am reviewing all of these engines from the perspective of a coal-face journalist so more on that soon.

Second, John argues that the Semantic Web community needs to be ‘very aware’ about the fact that Web 3.0 may see new tools and techniques developed that will make it even easier for journalists and others to access sensitive personal information.

During the recent seminar in Oslo on the Social Web I argued that journalists need to be aware that personal information that can be found on the internet falls into three categories: the information people intentionally publish; information about themselves they have no control over; and, lastly, information you make available to specific sites under certain conditions. My view is that these categories are being blurred and that Web 3.0 is likely to blur these divisions further.

Most importantly, journalists should be aware of a distinction between what people intentionally publish and what they make accessible. There will always be journalists who think any personal information made accessible is fair game, but publications should perhaps think about guidance about what information and content should and should not be re-hashed from social networking sites and under what circumstances. The Semantic Web can only make this more pressing.

John Breslin states: “Educating site owners about what semantic data they may be publishing (knowingly or unknowingly, even if it’s just RSS feeds) is needed, and developers should determine exactly what opt-in or opt-out mechanisms are required before implementing semantic solutions,” and he goes on to argue that the Semantic Web community needs to think more about educating people about the benefits as well as how it can minimise any hazards.

Perhaps the first step in reassuring people about Web 3.0 is for the industry and Semantic Web community to agree on a range of specific privacy guarantees such as the bill of rights for users of the social web before privacy problems start to dominate the headlines (such as the automatic export of FOAF files and the transfer of content from one social network to another without members’ knowledge).


Journalists and the social web – Oslo Seminar

I am just back from the seminar on Journalists and the Social Web in Oslo organised by the Norwegian journalist Kristine Lowe, Journalism.co.uk and Journalisten.no. The day went really well with some fascinating discussion and I’d like to thank the hosts for their generous hospitality. I spoke at the seminar on several subjects including Mining Social Networks for Information, Monitoring News and The Semantic Web and journalists. Here are my presentations:

Journalists and the Social Web 1 – Mining for Information

Journalists and the Social Web 2 – Monitoring your Beat

Journalists and the Social Web 3 – Journalists and the Semantic Web


The Semantic Web today – An Interview with John Breslin

Not many people are as close to the heart of the Semantic Web as John Breslin. John is the founder of the Semantically-Interlinked Online Communities (SIOC) project, a member of the W3C Advisory Committee, lecturer at the National University of Ireland and an associate researcher on the semantic web at the Digital Enterprise Research Institute (DERI) in Galway.

I caught up with John recently for this analysis of the semantic web and journalism published in journalism.co.uk. We thought John’s points were so interesting that we’ve brought you the full interview in this post.

Niche social networks…
Q: Some have predicted that the rise in the universal social network sites such as Facebook and Myspace will be mirrored soon by the explosion in growth of niche sites such as Xing and Peer Trainer and the expanding interest in ‘enterprise2.0’. In many ways this may accelerate the demand for semantic-type applications that allow people to travel seamlessly through various social networking services. What do you think?

John Breslin: I think that even though some have argued against the need for niche social networking services (SNSs) due to the widespread use of large sites like Facebook and MySpace, these niche SNSs can provide a breath of fresh air when one wants to escape from the bigger “overcrowded SNS cities”. As long as a niche SNS or community site provides regularly updated and relevant content to a steady or growing set of users, there is no reason that such sites should not survive or even flourish on the Web. As pointed out by Paul Gibler in his online article “The Expanding World of Social Networking”, it is the fine-grained and targeted communities such as CafeMom, BOOMj and PEERtrainer that are experiencing recent growth. This also ties in with the idea of object-centered sociality, where people don’t just connect randomly online but rather through the (niche) interests that they have in common. Mark O’Neill sums it up nicely: “…by organizing networks centrifugally around objects, social networking sites have meaning, even when they do not have 200 million users and even when they are centered around minority interests (like Thomas Kinkade paintings!). The point is that they are centered on objects which are in common.” As you say, a key is to allow people to seamlessly find and navigate through these niche interests, and that’s where projects like OpenID, FOAF and SIOC can help – from the point of view of having a single login that’s tied to your interests which can then be semantically matched to content items created across many communities.

Social network portability…
Q: There are several projects set up to address the issue of social network portability – allowing you to interact with various social networks more easily. In your view, will most people need to get used to the concept of a single global online identity such as FOAF?

John Breslin: I think that people are tired of repeating the same information in multiple places, and through standard signon systems like OpenID and profile representation mechanisms like FOAF, you can allow someone to define their identity and to reuse it wherever they choose to use it.

Tech stacks…
Q: You’ve described how a ‘social networking layer may be folded into tech stacks’ where your web and desktop application layer can tap into an integrated social networking stack. For me, this opened my eyes to how important the shift to the semantic will be. I think many people assume that the semantic web will usher in a new period of improved search. But, in fact, it will utterly change the way we interact with the internet?

John Breslin: A lot of the focus from the public or media regarding the semantic web has been in relation to search. But it’s not solely about finding those relevant objects (people, places, etc.) through “Google killers”, and it’s not only about the Internet (despite being called Web3.0!), but it’s also about providing ways to allow systems (on the desktop, or the Web, or media servers, whatever) to interoperate with each other as well. The social networking stack is one nice example, and indeed efforts like the Social Semantic Desktop and Social Semantic Web can interoperate through such a stack. It may also be for migration between different collaborative workspaces or social software systems, as we are doing with the SIOC project.

Your online identity…
Q: You’ve also suggested that online communities should provide their data in a common, machine-understandable way and should use common semantics to define this data (SIOC and FOAF). The way semantic services will be deployed is unpredictable but do you envisage people signing up to new social networks and setting up a profile automatically using their FOAF file? In the future, do you think people who want to network with each other will swap FOAF files and these files will include relevant information about social network membership?

John Breslin: Yes, and this is being done to some extent already. But also it’d be nice to not just bring your personal profile and your friends with you (for example, via FOAF) but perhaps your content as well (maybe defined using SIOC). There are some issues related to both transporting your friends (need their permission) and comments attached to your content (may need the permission of those commenters too), but you should at the very least be able to bring what belongs to you (your profile and your content), for example along the guidelines of the “Bill of Rights for Users of the Social Web” by Canter et al.

Meshing of networks…
Q: A practical consequence of SIOC might be that you might do a search in Facebook using the term ‘bog-snorkelling’ and get results back that may include profile pages that include that term, but also blog results from Technorati, comments from Flickr albums and YouTube videos? Equally, a practical consequence of SKOS, FOAF and SIOC could be that you click on a tag for ‘bog snorkelling’ in Delicious and get results from a range of social network sites?

John Breslin: Exactly! I’m delighted that Yahoo! SearchMonkey have listed SIOC as one of their recommended vocabularies – and that people are now starting to get the idea of being able to retrieve user-generated content items from all or from specific types of social websites (blogs, forums, mailing lists, photo albums) using mechanisms like SIOC and FOAF. Through people defining interests explicitly using something like a foaf:interest field or implicitly by clicking on tags of interest, relevant content can be easily returned from social websites with appropriate dc:subject or sioc:topic metadata.
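
As a rough illustration of the matching John describes, the Python sketch below pairs a profile's `foaf:interest` values with content items carrying `sioc:topic` metadata. The profile, the items and the function name `matching_items` are hypothetical, and plain strings stand in for the URIs that real FOAF and SIOC data would use.

```python
# Hypothetical profile: plain strings stand in for the URIs a real FOAF file uses.
PROFILE = {
    "foaf:name": "Alice",
    "foaf:interest": {"bog snorkelling", "semantic web"},
}

# Hypothetical content items from different kinds of social websites.
ITEMS = [
    {"title": "My first bog snorkelling race", "site_type": "blog",
     "sioc:topic": {"bog snorkelling"}},
    {"title": "Snorkel kit advice", "site_type": "forum",
     "sioc:topic": {"bog snorkelling", "equipment"}},
    {"title": "Holiday photos", "site_type": "photo album",
     "sioc:topic": {"travel"}},
]


def matching_items(profile, items, site_types=None):
    """Return titles of items sharing at least one topic with the profile's interests."""
    interests = profile["foaf:interest"]
    results = []
    for item in items:
        # Optionally restrict results to specific types of social website.
        if site_types is not None and item["site_type"] not in site_types:
            continue
        if interests & item["sioc:topic"]:
            results.append(item["title"])
    return results


print(matching_items(PROFILE, ITEMS))
print(matching_items(PROFILE, ITEMS, site_types={"forum"}))
```

The second call shows the "specific types of social websites" case: the same interests, but only forum content comes back.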

Practical implications…
Q: A practical result being that you create a new account with a new social network and that SN can identify other people on that Network who are listed in Bob’s defined relationships. Have any social networks already deployed this service?

John Breslin: There are many sites (e.g. Dopplr) that are starting to allow you to bring your friends with you by specifying something like your GMail account details (and then matching e-mail addresses you use) or your Twitter account details (and then retrieving a list of those whose microblogs you follow), but it is certainly useful to have a smaller set of reusable relationship formats that can make this more widespread (and that extends the number of services that you can import from). The Google SocialGraph API is a nice example of something that can enable this, as it allows applications to reuse social graph information extracted from sources all over the Web and represented using the open formats XFN and FOAF.

Searching the semantic layer…
Q: I’m a bit confused by the SIOC RDF Browser and if there are any applications that currently allow one to browse information expressed in RDF and SIOC ontology – I assume you need specific URLs to use this?

John Breslin: The SIOC RDF browser is simply a way to view RDF information in a more human-friendly form. One of the motivations for creating it was to let people view semantic information easily, because that information may have different aspects of interest. It may be the same information you see on a normal web page, but it may also contain extra information that is not normally displayed on a web page and is instead hidden or locked in a database. That information may prove useful for third-party applications (e.g. a modification date, incoming links), or some extra information can be calculated or inferred for a semantic page (related content on the same topic, tag usage frequencies, etc.).

Semantic search…
Q: From the perspective of the non-technical lay researcher – where does Sindice (the semantic web index) and other semantic search tools fit in?

John Breslin: Sindice can be thought of as a big semantic index of the Web. It allows you to find pointers to relevant pages or URIs where particular keywords are mentioned, where certain property values are used (e.g. pages where a person says their e-mail address is john.breslin@deri.org), or where certain facts or semantic triples appear. If you’re looking for a “semantic search engine”, it depends on what you need. Sindice gives you pointers to where stuff is, whereas many other engines give you the stuff as well (without you having to go to the source page).
SWSE (also from DERI) and Swoogle allow query capabilities over the collections of all Semantic Web statements – so if you search for Galway, it can show you the relevant statements as well as pointing you to the pages they were obtained from.

But I think the applications of Sindice, i.e. finding pointers to where stuff is, and using that in third-party applications, are quite interesting. For example, the SIOC Widget for WordPress is powered through a combination of distributed SIOC documents and the Sindice index. So, when you are browsing a blog that has this widget installed, you may see little balloons appearing beside commenters’ names. Clicking on these balloons shows a pop-up with a list of content (posts, comments, topics) that the commenter has created not just on the blog site you are viewing but across a range of SIOC-enabled websites (blogs, forums, mailing lists, whatever) as indexed in Sindice. Here is a picture. So you can see and navigate to the content a person has created across a range of sites from just one place that they post to.
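
The idea of an index that returns pointers rather than content can be sketched in Python as below. This is a toy under stated assumptions: the class `PointerIndex`, its method names and the sample data are invented for illustration and do not reflect Sindice's actual API or internals.

```python
from collections import defaultdict


class PointerIndex:
    """Toy index mapping keywords, property values and triples to source URLs."""

    def __init__(self):
        self.by_keyword = defaultdict(set)
        self.by_property_value = defaultdict(set)
        self.by_triple = defaultdict(set)

    def add(self, source_url, subject, predicate, obj):
        """Record that a statement (triple) was found in the given document."""
        self.by_triple[(subject, predicate, obj)].add(source_url)
        self.by_property_value[(predicate, obj)].add(source_url)
        for word in obj.lower().split():
            self.by_keyword[word].add(source_url)

    def find_keyword(self, word):
        return sorted(self.by_keyword.get(word.lower(), set()))

    def find_property_value(self, predicate, obj):
        return sorted(self.by_property_value.get((predicate, obj), set()))

    def find_triple(self, subject, predicate, obj):
        return sorted(self.by_triple.get((subject, predicate, obj), set()))


# Hypothetical statements harvested from three documents.
index = PointerIndex()
index.add("http://example.org/alice/foaf.rdf", "http://example.org/alice#me",
          "foaf:mbox", "mailto:alice@example.org")
index.add("http://example.org/blog/post1", "http://example.org/blog/post1",
          "dc:title", "Weekend in Galway")
index.add("http://example.org/forum/thread9", "http://example.org/forum/thread9",
          "dc:title", "Galway meetup")

# Each lookup returns pointers to where the data lives, not the data itself.
print(index.find_keyword("Galway"))
print(index.find_property_value("foaf:mbox", "mailto:alice@example.org"))
```

A third-party application, such as the widget described above, would then follow those pointers to fetch the documents it needs.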

On the cusp…
Q: Moving on to practical applications. I was interested to read “On the cusp” by David Provost. In it he concludes that companies are on the verge of constructing very practical and commercially viable semantic applications. Do you agree?

John Breslin: I think that we are now beginning to see the real commercial applications of what can be done when all kinds of things on the Web are connected together using semantics. This is obvious in the attention being given to startup companies in this space like Powerset, Metaweb (Freebase) and Radar Networks (Twine), and also since many big companies including Reuters (Calais API), Yahoo! (SearchMonkey) and Google (Social Graph API) have all announced in 2008 what they are doing with semantic data.

There has been a lot of talk this past year about the social graph (notably from Google’s Brad Fitzpatrick), which looks at how people are connected together (friends, colleagues, neighbours, etc.), and how such connections can be leveraged across websites. In the Semantic Web, it is not just people who are connected together in some meaningful way, but documents, events, places, hobbies, pictures, you name it! And it is the commercial applications that exploit these connections that are now becoming interesting. But it is very important that the users aren’t exposed to any RDF or semantic terminology – through usage, they just “get” the fact that everything is interconnected.

And the best product?…
Q: In your view – what are the most exciting semantic product developments to have emerged in the last year?

John Breslin: I really like Radar’s Twine, the “knowledge networking” application that allows users to share, organise, and find information with people they trust. I find Twine very interesting, and as well as using it to gather information about SIOC for regular blog entries I write (“Tales from the SIOC-o-sphere”), I also use it to gather and publish personal interests that I think will be of interest to the public, and for passing on interesting stuff to work colleagues.

Privacy…
Q: What about the privacy angle? Are the privacy safeguards in place capable of evolving to meet this challenge? Does the average LiveJournal user know that their profile has been converted to a FOAF file and is now translatable by any number of new semantic products? Speaking as a journalist, my hunch is that the vast majority of people are going to be surprised and, perhaps, shocked to know that a public comment they make on LiveJournal may end up in a database that is searchable by people in LinkedIn.

John Breslin: No, certainly people aren’t aware that many sites are making semantic forms of their content available which can be reused elsewhere. Tribe.net recently turned off their FOAF exports after a user complained that his/her profile was being copied for use elsewhere (the original developer team had moved on so the new developers weren’t sympathetic to the possibilities of the Semantic Web). Similar things happened with people blogging and finding that content from their RSS feeds was popping up on other sites. There certainly has to be more thought put into educating users and towards having opt-in-opt-out mechanisms when implementing semantic exports, especially for personal content and profiles.

Thanks to John for his time with this.
