The Semantic Web today – An Interview with John Breslin
Not many people are as close to the heart of the Semantic Web as John Breslin. John is the founder of the Semantically-Interlinked Online Communities (SIOC) project, a member of the W3C Advisory Committee, a lecturer at the National University of Ireland and an associate researcher on the Semantic Web at the Digital Enterprise Research Institute (DERI) in Galway.
I caught up with John recently for this analysis of the semantic web and journalism published in journalism.co.uk. We thought John’s points were so interesting that we’ve brought you the full interview in this post.
Niche social networks…
Q: Some have predicted that the rise of universal social network sites such as Facebook and MySpace will soon be mirrored by an explosion in the growth of niche sites such as Xing and PEERtrainer and by expanding interest in ‘enterprise 2.0’. In many ways this may accelerate the demand for semantic-type applications that allow people to travel seamlessly through various social networking services. What do you think?
John Breslin: I think that even though some have argued against the need for niche social networking services (SNSs) due to the widespread use of large sites like Facebook and MySpace, these niche SNSs can provide a breath of fresh air when one wants to escape from the bigger “overcrowded SNS cities”. As long as a niche SNS or community site provides regularly updated and relevant content to a steady or growing set of users, there is no reason that such sites should not survive or even flourish on the Web. As pointed out by Paul Gibler in his online article “The Expanding World of Social Networking”, it is the fine-grained and targeted communities such as CafeMom, BOOMj and PEERtrainer that are experiencing recent growth. This also ties in with the idea of object-centered sociality, where people don’t just connect randomly online but rather through the (niche) interests that they have in common. Mark O’Neill sums it up nicely: “…by organizing networks centrifugally around objects, social networking sites have meaning, even when they do not have 200 million users and even when they are centered around minority interests (like Thomas Kinkade paintings!). The point is that they are centered on objects which are in common.” As you say, a key is to allow people to seamlessly find and navigate through these niche interests, and that’s where projects like OpenID, FOAF and SIOC can help – from the point of view of having a single login that’s tied to your interests which can then be semantically matched to content items created across many communities.
Social network portability…
Q: There are several projects set up to address the issue of social network portability – allowing you to interact with various social networks more easily. In your view, will most people need to get used to the concept of a single global online identity such as FOAF?
John Breslin: I think that people are tired of repeating the same information in multiple places, and through standard sign-on systems like OpenID and profile representation mechanisms like FOAF, you can allow someone to define their identity once and to reuse it wherever they choose.
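To give a feel for what a reusable FOAF profile looks like, here is a minimal sketch that renders one as Turtle using only the Python standard library. The function name `foaf_profile_turtle` and all `example.org` URIs are hypothetical; only the FOAF namespace and the terms `foaf:Person`, `foaf:name`, `foaf:interest` and `foaf:knows` come from the FOAF vocabulary. A real deployment would more likely use an RDF library.

```python
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def foaf_profile_turtle(person_uri, name, interests, knows):
    """Render a minimal FOAF profile as a Turtle string.

    Illustrative only: escapes backslashes and double quotes in the
    name, but does no other validation of the inputs.
    """
    safe_name = name.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"@prefix foaf: <{FOAF_NS}> .",
        "",
        f"<{person_uri}> a foaf:Person ;",
        f'    foaf:name "{safe_name}" ;',
    ]
    for uri in interests:
        lines.append(f"    foaf:interest <{uri}> ;")
    for uri in knows:
        lines.append(f"    foaf:knows <{uri}> ;")
    # The last statement ends the description, so swap ';' for '.'.
    lines[-1] = lines[-1][:-1] + "."
    return "\n".join(lines)


if __name__ == "__main__":
    print(foaf_profile_turtle(
        "http://example.org/alice#me",                   # hypothetical URI
        "Alice",
        ["http://example.org/topics/bog-snorkelling"],   # hypothetical URI
        ["http://example.org/bob#me"],                   # hypothetical URI
    ))
```

The same file could then be offered to any site that reads FOAF, which is the "define once, reuse anywhere" idea described above.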
Tech stacks…
Q: You’ve described how a ‘social networking layer may be folded into tech stacks’ where your web and desktop application layer can tap into an integrated social networking stack. For me, this opened my eyes to how important the shift to the semantic will be. I think many people assume that the semantic web will usher in a new period of improved search. But, in fact, it will utterly change the way we interact with the internet?
John Breslin: A lot of the focus from the public or media regarding the semantic web has been in relation to search. But it’s not solely about finding those relevant objects (people, places, etc.) through “Google killers”, and it’s not only about the Internet (despite being called Web 3.0!); it’s also about providing ways to allow systems (on the desktop, or the Web, or media servers, whatever) to interoperate with each other as well. The social networking stack is one nice example, and indeed efforts like the Social Semantic Desktop and Social Semantic Web can interoperate through such a stack. It may also be for migration between different collaborative workspaces or social software systems, as we are doing with the SIOC project.
Your online identity…
Q: You’ve also suggested that online communities should provide their data in a common, machine-understandable way and should use common semantics to define this data (SIOC and FOAF). The way semantic services will be deployed is unpredictable but do you envisage people signing up to new social networks and setting up a profile automatically using their FOAF file? In the future, do you think people who want to network with each other will swap FOAF files and these files will include relevant information about social network membership?
John Breslin: Yes, and this is being done to some extent already. But also it’d be nice to not just bring your personal profile and your friends with you (for example, via FOAF) but perhaps your content as well (maybe defined using SIOC). There are some issues related to both transporting your friends (need their permission) and comments attached to your content (may need the permission of those commenters too), but you should at the very least be able to bring what belongs to you (your profile and your content), for example along the guidelines of the “Bill of Rights for Users of the Social Web” by Canter et al.
Meshing of networks…
Q: A practical consequence of SIOC might be that you do a search in Facebook using the term ‘bog-snorkelling’ and get results back that include profile pages containing that term, but also blog results from Technorati, comments from Flickr albums and YouTube videos? Equally, a practical consequence of SKOS, FOAF and SIOC could be that you click on a tag for ‘bog snorkelling’ in Delicious and get results from a range of social network sites?
John Breslin: Exactly! I’m delighted that Yahoo! SearchMonkey have listed SIOC as one of their recommended vocabularies – and that people are now starting to get the idea of being able to retrieve user-generated content items from all or from specific types of social websites (blogs, forums, mailing lists, photo albums) using mechanisms like SIOC and FOAF. Through people defining interests explicitly using something like a foaf:interest field or implicitly by clicking on tags of interest, relevant content can be easily returned from social websites with appropriate dc:subject or sioc:topic metadata.
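The matching described here can be sketched in a few lines. This is an illustration under stated assumptions, not how SearchMonkey or any SIOC tool is implemented: `ContentItem` and `match_content` are hypothetical names, interests stand in for `foaf:interest` values, and each item's `topics` stand in for its `sioc:topic` or `dc:subject` metadata, all reduced to plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    uri: str
    site_type: str                              # e.g. "blog", "forum", "photo album"
    topics: set = field(default_factory=set)    # stand-in for sioc:topic / dc:subject


def match_content(interests, items, site_types=None):
    """Return URIs of items whose topics overlap the given interests.

    If site_types is given, only items from those kinds of site are kept.
    Matching is case-insensitive string equality, a deliberate simplification.
    """
    wanted = {interest.lower() for interest in interests}
    results = []
    for item in items:
        if site_types is not None and item.site_type not in site_types:
            continue
        if wanted & {topic.lower() for topic in item.topics}:
            results.append(item.uri)
    return results


if __name__ == "__main__":
    items = [
        ContentItem("http://example.org/blog/1", "blog", {"Bog snorkelling"}),
        ContentItem("http://example.org/forum/7", "forum", {"cycling"}),
        ContentItem("http://example.org/photos/3", "photo album", {"bog snorkelling"}),
    ]
    print(match_content(["bog snorkelling"], items))
    print(match_content(["bog snorkelling"], items, site_types={"blog"}))
```

Passing `site_types` corresponds to retrieving content "from specific types of social websites"; leaving it out retrieves from all of them.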
Practical implications…
Q: A practical result would be that you create a new account with a new social network and that network can identify other people on it who are listed in your defined relationships. Have any social networks already deployed this service?
John Breslin: There are many sites (e.g. Dopplr) that are starting to allow you to bring your friends with you by specifying something like your Gmail account details (and then matching e-mail addresses you use) or your Twitter account details (and then retrieving a list of those whose microblogs you follow), but it is certainly useful to have a smaller set of reusable relationship formats that can make this more widespread (and that extends the number of services that you can import from). The Google Social Graph API is a nice example of something that can enable this, as it allows applications to reuse social graph information extracted from sources all over the Web and represented using the open formats XFN and FOAF.
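As a rough illustration of the e-mail matching step, the sketch below compares hashed addresses rather than raw ones. FOAF defines `foaf:mbox_sha1sum` as the SHA-1 of the `mailto:` URI; everything else here is an assumption for illustration, including the function names, the lower-casing of addresses before hashing, and the idea that a site keeps a lookup table keyed by that hash. It is not a description of how Dopplr or the Social Graph API work internally.

```python
import hashlib


def mbox_sha1sum(email):
    """SHA-1 hex digest of the mailto: URI, in the style of foaf:mbox_sha1sum.

    Trimming and lower-casing the address first is an assumption made here
    so that trivially different spellings still match.
    """
    normalised = email.strip().lower()
    return hashlib.sha1(f"mailto:{normalised}".encode("utf-8")).hexdigest()


def find_known_members(contact_emails, members_by_hash):
    """Return the set of site members whose hashed mailbox matches a contact."""
    found = set()
    for email in contact_emails:
        member = members_by_hash.get(mbox_sha1sum(email))
        if member is not None:
            found.add(member)
    return found


if __name__ == "__main__":
    # Hypothetical member table for the new site, keyed by mailbox hash.
    members = {
        mbox_sha1sum("bob@example.org"): "bob",
        mbox_sha1sum("dana@example.org"): "dana",
    }
    contacts = [" Bob@Example.org", "carol@example.org"]
    print(find_known_members(contacts, members))
```

Hashing means the new site can check for overlap without the imported address book being published in the clear, which is one reason FOAF files often carry the hash instead of the address.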
Searching the semantic layer…
Q: I’m a bit confused by the SIOC RDF Browser and whether there are any applications that currently allow one to browse information expressed in RDF and the SIOC ontology – I assume you need specific URLs to use this?
John Breslin: The SIOC RDF browser is simply a way to view RDF information in a more human-friendly form. One of the motivations for creating it was to let people view semantic information easily, because that information may have different aspects of interest. It may be the same information you see on a normal web page, but it may also include extra information that is not normally displayed and is instead hidden or locked in a database, which may prove useful for third-party applications (e.g. a modification date, incoming links). Some extra information can also be calculated or inferred for a semantic page (related content on the same topic, tag usage frequencies, etc.).
Semantic search…
Q: From the perspective of the non-technical lay researcher – where do Sindice (the semantic web index) and other semantic search tools fit in?
John Breslin: Sindice can be thought of as a big semantic index of the Web. It allows you to find pointers to relevant pages or URIs where particular keywords are mentioned, where certain property values are used (e.g. pages where a person says their e-mail address is john.breslin@deri.org), or where certain facts or semantic triples appear. If you’re looking for a “semantic search engine”, it depends on what you need. Sindice gives you pointers to where stuff is, whereas many other engines give you the stuff as well (without you having to go to the source page).
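The three kinds of lookup mentioned here (by keyword, by property value, by triple) can be pictured with a toy in-memory index. This is only a sketch of the idea of an index that returns pointers to sources; the class `PointerIndex`, its methods and the sample data are hypothetical and say nothing about how Sindice itself is built.

```python
from collections import defaultdict


class PointerIndex:
    """Toy index mapping keywords, property values and triples to source URLs."""

    def __init__(self):
        self._by_keyword = defaultdict(set)
        self._by_property_value = defaultdict(set)
        self._by_triple = defaultdict(set)

    def add(self, source_url, triple):
        """Record that `triple` (subject, predicate, object) appears at `source_url`."""
        _subject, predicate, obj = triple
        self._by_triple[triple].add(source_url)
        self._by_property_value[(predicate, obj)].add(source_url)
        # Keywords are taken from the object value only, split on whitespace.
        for word in str(obj).lower().split():
            self._by_keyword[word].add(source_url)

    def by_keyword(self, word):
        return sorted(self._by_keyword.get(word.lower(), ()))

    def by_property_value(self, predicate, value):
        return sorted(self._by_property_value.get((predicate, value), ()))

    def by_triple(self, triple):
        return sorted(self._by_triple.get(triple, ()))


if __name__ == "__main__":
    index = PointerIndex()
    index.add(
        "http://example.org/john.rdf",
        ("http://example.org/john#me", "foaf:mbox", "mailto:john@example.org"),
    )
    index.add(
        "http://example.org/galway.rdf",
        ("http://example.org/galway", "rdfs:label", "Galway city"),
    )
    print(index.by_keyword("galway"))
    print(index.by_property_value("foaf:mbox", "mailto:john@example.org"))
```

Note that every query returns source URLs, not the statements themselves: the caller still has to fetch the page, which mirrors the "pointers to where stuff is" distinction drawn above.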
SWSE (also from DERI) and Swoogle allow query capabilities over the collections of all Semantic Web statements – so if you search for Galway, it can show you the relevant statements as well as pointing you to the pages they were obtained from.
But I think the applications of Sindice, i.e. finding pointers to where stuff is and using them in third-party applications, are quite interesting. For example, the SIOC Widget for WordPress is powered through a combination of distributed SIOC documents and the Sindice index. So, when you are browsing a blog that has this widget installed, you may see little balloons appearing beside commenters’ names. Clicking on one of these balloons shows a pop-up with a list of content (posts, comments, topics) that the commenter has created, not just on the blog site you are viewing but across a range of SIOC-enabled websites (blogs, forums, mailing lists, whatever) as indexed in Sindice. So you can see and navigate to the content a person has created across a range of sites from just one place that they post to.
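The aggregation behind such a pop-up can be sketched as a simple grouping step. The sketch assumes the SIOC documents have already been located and fetched, and represents each one as a list of plain dictionaries; the function `content_by_creator`, the dictionary keys and the sample sites are hypothetical, with `creator` standing in for a property such as `sioc:has_creator`. It is not the widget's actual code.

```python
from collections import defaultdict


def content_by_creator(documents, creator_uri):
    """Group one person's content by site.

    `documents` maps a site name to a list of items, each a dict with
    "uri", "type" and "creator" keys (a simplification of SIOC data).
    Returns {site: [(type, uri), ...]} for items made by `creator_uri`.
    """
    grouped = defaultdict(list)
    for site, items in documents.items():
        for item in items:
            if item["creator"] == creator_uri:
                grouped[site].append((item["type"], item["uri"]))
    return dict(grouped)


if __name__ == "__main__":
    alice = "http://example.org/alice#me"   # hypothetical commenter URI
    documents = {
        "example blog": [
            {"uri": "http://blog.example.org/p/1#c2", "type": "comment", "creator": alice},
            {"uri": "http://blog.example.org/p/1", "type": "post",
             "creator": "http://example.org/bob#me"},
        ],
        "example forum": [
            {"uri": "http://forum.example.org/t/9", "type": "topic", "creator": alice},
        ],
    }
    for site, items in content_by_creator(documents, alice).items():
        print(site, items)
```

In this picture the index supplies the list of documents to look at, and the grouping above produces what the balloon displays.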
On the cusp…
Q: Moving on to practical applications, I was interested to read “On the cusp” by David Provost. In it he concludes that companies are on the verge of constructing very practical and commercially viable semantic applications. Do you agree?
John Breslin: I think that we are now beginning to see the real commercial applications of what can be done when all kinds of things on the Web are connected together using semantics. This is obvious in the attention being given to startup companies in this space like Powerset, Metaweb (Freebase) and Radar Networks (Twine), and also since many big companies including Reuters (Calais API), Yahoo! (SearchMonkey) and Google (Social Graph API) have all announced in 2008 what they are doing with semantic data.
There has been a lot of talk this past year about the social graph (notably from Google’s Brad Fitzpatrick), which looks at how people are connected together (friends, colleagues, neighbours, etc.), and how such connections can be leveraged across websites. In the Semantic Web, it is not just people who are connected together in some meaningful way, but documents, events, places, hobbies, pictures, you name it! And it is the commercial applications that exploit these connections that are now becoming interesting. But it is very important that the users aren’t exposed to any RDF or semantic terminology – through usage, they just “get” the fact that everything is interconnected.
And the best product?…
Q: In your view – what are the most exciting semantic product developments to have emerged in the last year?
John Breslin: I really like Radar’s Twine, the “knowledge networking” application that allows users to share, organise, and find information with people they trust. I find Twine very interesting, and as well as using it to gather information about SIOC for regular blog entries I write (“Tales from the SIOC-o-sphere”), I also use it to gather and publish personal interests that I think will be of interest to the public, and for passing on interesting stuff to work colleagues.
Privacy…
Q: What about the privacy angle? Are the privacy safeguards in place capable of evolving to meet this challenge? Does the average LiveJournal user know that their profile has been converted to a FOAF file and is now translatable by any number of new semantic products? Speaking as a journalist, my hunch is that the vast majority of people are going to be surprised and, perhaps, shocked to know that a public comment they make on LiveJournal may end up in a database that is searchable by people on LinkedIn.
John Breslin: No, certainly people aren’t aware that many sites are making semantic forms of their content available which can be reused elsewhere. Tribe.net recently turned off their FOAF exports after a user complained that his/her profile was being copied for use elsewhere (the original developer team had moved on so the new developers weren’t sympathetic to the possibilities of the Semantic Web). Similar things happened with people blogging and finding that content from their RSS feeds was popping up on other sites. There certainly has to be more thought put into educating users and into providing opt-in/opt-out mechanisms when implementing semantic exports, especially for personal content and profiles.
Thanks to John for his time with this.