How to: create a pre-filtered mash-up of RSS feeds
Yahoo! Pipes can put you back in charge of your RSS feed subscriptions. Here's how to do it...
Some RSS readers offer basic keyword filtering options, allowing you to filter off posts as they arrive into their own folders (similar to the way you can filter off email in many email programs by subject line). But the filter options tend to be quite crude, relying on keyword searches of titles or descriptions, or matches of the author, and you still have to download a lot of posts that you do not need.
If you examine the source code of many RSS feeds, from blogs for example, you will see that they often contain other useful information, such as the category or tag the author originally placed the post under. Chances are that you will only be interested in posts that appear under certain categories, for example "Newspapers". Wouldn't it be great if you could create an RSS feed that contained only those posts?
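To make the idea of category information concrete, here is a minimal Python sketch that reads the category elements out of a feed. The sample feed, its titles and the function name categories_by_title are all invented for illustration; real feeds vary, and some formats (Atom, for instance) store categories differently.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 fragment standing in for a real blog feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Local paper cuts print run</title>
      <link>http://example.com/posts/1</link>
      <category>Newspapers</category>
      <category>Business</category>
    </item>
    <item>
      <title>Holiday snaps</title>
      <link>http://example.com/posts/2</link>
      <category>Personal</category>
    </item>
  </channel>
</rss>"""


def categories_by_title(rss_text):
    """Map each post title to the list of categories its author assigned."""
    root = ET.fromstring(rss_text)
    return {
        item.findtext("title"): [c.text for c in item.findall("category")]
        for item in root.iter("item")
    }


cats = categories_by_title(SAMPLE_RSS)
for title, tags in cats.items():
    print(title, "->", tags)
```

Only the first post carries the "Newspapers" category, so it is the only one a category-based filter would need to keep.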
One way of doing this is by using Yahoo! Pipes, a free online tool for aggregating and manipulating RSS feeds. Pipes uses a very sophisticated graphical layout tool to help you build your new feeds, with the intention of making the process easier for people who do not think like programmers. Even so, at first sight it has proved to be a bit daunting for many journalists, so we are going to show you how to build a basic Pipe from two pre-filtered RSS feeds.
First of all you will need to set yourself up with a Yahoo! ID if you do not already have one. Go to http://pipes.yahoo.com/pipes/ and click "Join Now" in the upper right hand corner of the home page.
Once you have signed up, click "create a new pipe" on the home page (figure 1). You will be presented with a page containing a blank area that looks like graph paper and a number of menus in the left-hand column (figure 2). This is where you will build your first Pipe.
The first thing we need to do is to grab a couple of RSS feeds. To do this, you will need the Fetch module which you can grab from under the heading "Sources" in the left-hand column. Simply click where it says "Fetch" and, without releasing once you have clicked, drag onto the graph area. A module should appear in your main editing area with the title "Fetch" and a field immediately underneath it entitled "URL" (figure 3).
Enter the URL (web address) of the first feed you want to include into the Fetch module. Once you have done that, you will notice that, at the bottom of the page beneath the graph area, there is a grey space with a tab attached entitled "Debugger". Make sure the Fetch module is selected (it will be orange when selected, blue when de-selected), and click the "Refresh" link in the debugger area. Assuming your feed is OK, you should now see the feed items appear in the Debugger area (figure 4). (You might need to increase the size of the Debugger area by clicking the tab in the middle top of it and dragging upwards.)
Each item of the feed appearing in the Debugger area is prefixed with a triangular marker. Try clicking on one to see all the constituent elements of that particular item (each one has its own triangular marker, so click on one or two of those too). In our example (figure 5), you can see that one post has several keywords or phrases within the "Category" element.
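For readers who would like to see roughly what the Fetch module and the Debugger are doing, the following Python sketch downloads a feed and lists the elements of each item. It is an approximation under stated assumptions, not how Pipes itself is implemented: the function names fetch_feed and parse_items and the sample feed are hypothetical, and the sketch only handles RSS-style item elements.

```python
import urllib.request
import xml.etree.ElementTree as ET
from collections import defaultdict


def fetch_feed(url, timeout=10):
    """Download a feed and return its raw bytes (defined here, not called below)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def parse_items(xml_data):
    """Return one dict per <item>, mapping each child tag to a list of its values."""
    items = []
    for item in ET.fromstring(xml_data).iter("item"):
        fields = defaultdict(list)
        for child in item:
            fields[child.tag].append((child.text or "").strip())
        items.append(dict(fields))
    return items


# A made-up feed used in place of a live download.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item>
    <title>Why local news matters</title>
    <link>http://example.com/local-news</link>
    <pubDate>Mon, 03 Sep 2007 09:30:00 +0000</pubDate>
    <category>journalism</category>
    <category>newspapers</category>
  </item>
</channel></rss>"""

items = parse_items(SAMPLE_FEED)
for tag, values in items[0].items():
    print(tag, "=", values)
```

To try it on a live feed, pass the result of fetch_feed (given a feed address of your choice) to parse_items.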
To add another feed to our Pipe, we could now simply click the (plus) sign next to "URL" in the Fetch module, enter another URL and the output of both feeds would be combined. But, in this case, because we want to filter each feed in different ways before combining them, we are going to create another Fetch module by repeating the two steps above: drag a second Fetch module onto the graph area and enter the second feed's URL.
Now that we have grabbed two feeds, we are going to apply our filtration criteria to each feed. To do this, we need a module called "Filter" which we can find underneath the heading "Operators" in the left-hand column (click the small triangle next to "Operators" to see all the items within that category). Click and drag a Filter module to the desktop and position it underneath the first Fetch module you created.
We are now going to connect the two modules. At the bottom of the first Fetch module you created, there is a nodule. Click on it and drag to connect it to the nodule at the top of your new Filter module. You should see a blue line, or Pipe, appear as you drag toward the other nodule. Once you have positioned it over the Filter nodule, release your mouse and the two modules should be connected (figure 6).
At this point, you might like to save and name your Pipe before proceeding. In the top left-hand corner of the page there should be a tab marked "Untitled". Click on the text and type in the name of your feed, then click "OK". Then click the "Save" button in the top right-hand corner of your page.
We are now going to apply our filtering criteria to our first feed. You can either "Block" certain posts based on various criteria, or "Permit" them, whichever is the easier based on your set of rules. In this example, we are going to permit posts that have been placed by the author under certain categories. From the first dropdown menu, we are going to select "Permit" [items that match]; from the adjacent dropdown we are going to select "Any" [of the following].
If we select "All", all of the following rules would have to be met before permitting post inclusion; by selecting "Any", posts will be included if they meet the criteria for one or more rules.
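The difference between "Any" and "All" can be illustrated with a few lines of Python. This is only a sketch of the logic: the function names are invented, and treating "contains" as a case-insensitive match is an assumption rather than documented Pipes behaviour.

```python
def matches(post_categories, keyword):
    """One rule: does any category contain the keyword?
    Case-insensitive matching is an assumption made for this sketch."""
    return any(keyword.lower() in c.lower() for c in post_categories)


def permit(post_categories, keywords, mode="any"):
    """Combine the rules: 'any' needs one match, 'all' needs every rule to match."""
    results = [matches(post_categories, k) for k in keywords]
    return any(results) if mode == "any" else all(results)


post = ["Journalism", "Online video"]   # categories on one example post
rules = ["journalism", "blogging"]      # two example rules

print(permit(post, rules, mode="any"))  # True: the "journalism" rule matches
print(permit(post, rules, mode="all"))  # False: the "blogging" rule does not
```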
Underneath "Rules", we can see a dropdown menu showing the constituent elements of our feed. We are going to select "category", then "contains" from the adjacent menu, then enter a keyword or phrase that the author uses to describe some of his posts - in this case, "journalism".
If you select your filter module now (so it turns orange) and click the "refresh" link in the debugger area, you will now see that the number of posts from this feed has been dramatically reduced (figure 7).
To add another filter rule, click the (plus) sign next to "Rules" and repeat the procedure with another rule. In our example, we have added a new rule that allows posts that are included in the category "blogging".
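The Filter module configured above behaves roughly like the Python sketch below, which keeps ("Permit") or drops ("Block") posts depending on whether their categories match the rules. The function names, the sample posts and the case-insensitive matching are assumptions made for illustration.

```python
def rule_matches(item, field, keyword):
    """True if any value of the given field contains the keyword (case-insensitive)."""
    return any(keyword.lower() in value.lower() for value in item.get(field, []))


def filter_items(items, rules, action="permit", combine="any"):
    """Keep items that match the rules ('permit') or those that do not ('block')."""
    combiner = any if combine == "any" else all
    kept = []
    for item in items:
        hit = combiner(rule_matches(item, field, kw) for field, kw in rules)
        if hit == (action == "permit"):
            kept.append(item)
    return kept


# Made-up posts; each field holds a list of values, as a feed item can repeat a tag.
posts = [
    {"title": ["Newsroom cuts"], "category": ["journalism", "newspapers"]},
    {"title": ["My new theme"], "category": ["blogging"]},
    {"title": ["Cat pictures"], "category": ["personal"]},
]
rules = [("category", "journalism"), ("category", "blogging")]

permitted = filter_items(posts, rules, action="permit", combine="any")
blocked = filter_items(posts, rules, action="block", combine="any")
print([p["title"][0] for p in permitted])
print([p["title"][0] for p in blocked])
```

With "Permit" and "Any", the first two posts survive; switching to "Block" with the same rules leaves only the third.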
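Repeat the filtering steps above for the second feed: add a Filter module beneath the second Fetch module, connect the two, and set the filter rules.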
Now that we have filtered our two feeds with criteria specific to each, we can join the two together. To do this, we use a module called "Union" which can be found under the "Operators" heading in the left-hand column. Click and drag onto the graph area and connect up from the nodules at the bottom of each Filter module. If you now select the Union module (so it turns orange) and refresh the Debugger output you will see that the filtered output of each feed has now been combined (figure 8).
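In code terms, combining the two filtered feeds amounts to joining two lists of posts into one. The Python sketch below assumes a plain concatenation with no removal of duplicates; the function name union and the sample posts are hypothetical.

```python
from itertools import chain


def union(*feeds):
    """Join any number of item lists into one list, keeping their order."""
    return list(chain.from_iterable(feeds))


# Made-up output of the two Filter modules.
feed_a = [{"title": "Newsroom cuts"}, {"title": "My new theme"}]
feed_b = [{"title": "Podcasting for reporters"}]

combined = union(feed_a, feed_b)
print(len(combined), "items:", [item["title"] for item in combined])
```

At this stage the items are still grouped by source feed, which is why a sorting step is needed afterwards.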
Nearly there. Now all we need to do is to sort the combined feed by publication date and to limit the number of items that appear in our new feed. To do this, we are going to use the "Sort" and "Truncate" modules found under the "Operators" heading. Click and drag both modules onto the graph area, connect the "Sort" module to the bottom of the "Union" module, and connect the bottom of the "Sort" module to the top of the "Truncate" module. Select "pubDate" in the first dropdown menu in the "Sort" module, and choose "descending" order. (NB: if your original feeds are in different RSS formats, you might find they do not sort properly. One fix is to put the feeds through Feedburner first and convert them all to RSS 2.0.)
In the "Truncate" module, enter the number of items you would like to appear in your new feed (in this example, 20).
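The sort and truncate steps can be sketched in Python as follows. The sketch assumes every item carries an RSS 2.0 style pubDate (for example "Mon, 03 Sep 2007 09:30:00 +0000"); dates in other formats would fail to parse, which is one way to picture why feeds in mixed formats may not sort properly. The function name and sample items are invented for illustration.

```python
from email.utils import parsedate_to_datetime


def sort_and_truncate(items, limit=20):
    """Order items newest first by pubDate, then keep only the first `limit`."""
    ordered = sorted(
        items,
        key=lambda item: parsedate_to_datetime(item["pubDate"]),
        reverse=True,  # descending: most recent post at the top
    )
    return ordered[:limit]


# Made-up combined feed; note the third item uses a different time zone offset.
combined = [
    {"title": "Oldest post", "pubDate": "Mon, 03 Sep 2007 09:30:00 +0000"},
    {"title": "Newest post", "pubDate": "Wed, 05 Sep 2007 18:00:00 +0000"},
    {"title": "Middle post", "pubDate": "Tue, 04 Sep 2007 12:15:00 +0100"},
]

latest = sort_and_truncate(combined, limit=2)
print([item["title"] for item in latest])
```

Parsing the dates before comparing them matters: sorting the raw text would order posts by day name rather than by time.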
Finally, connect the bottom of the "Truncate" module to the "Pipe Output" module (this appears in the graph area by default), and click save (figure 9). Congratulations, you have created your first Pipe!
Click here to create your own output from an example Pipe we have published (if you click "edit" you can also see how we did it). You will be prompted for two blog RSS feed addresses, and for a category keyword or phrase to filter each one by. Check the code of the blog feeds to see what category keywords or phrases are used and make a note of the ones that interest you.
The example we have created here is really just scratching the surface of the potential of this tool. You can browse published examples of other Pipes from the Yahoo! Pipes home page. Many have user-entered data fields so that you can create your own feed without having to build your own Pipe; or you can simply clone and modify them to suit your particular needs.
Publishers could also use Pipes or similar tools to allow readers to search more effectively for their content. One obvious area of application would be newspaper classifieds; for example the ability to create a feed that only features properties in your desired price range and location.
Once you have mastered Pipes, let us know how you have applied them to help your work as a journalist or publisher and share them here (click on Comments below).
- John Thompson