This is the last part in a short series about Yahoo Pipes and how to use it to analyze Digg homepage stories and any topical trends. Part 1 covered sorting by submitter. Part 2 covered sorting by category and submitter. Part 3’s Pipes filter stories by number of votes, then sort by any or all of submitter, category and votes.

All of the general Pipes principles presented in these examples can be used to analyze other social media sites, and thus could be a handy tool for building site profiles. The only big difference in the 4 examples in this Part, compared to the others, is that we’re introducing a numeric variable and a filtering rule. To keep this post short, I’m leaving out most of the sections that I had in the previous posts. Please see them for explanations of Digg variables, Pipes variables, regular expression pattern rules, etc.

Video

The video is long (nearly 10 min) because it covers four variations of a Pipe based on vote sorting and filtering. See the Links section, below the video, for links to the actual Pipes, in case you feel like cloning and tweaking them.

Links

If you feel like playing around with the Pipes used in this example, here are the links:

  1. Digg sorted by votes - very simple: just inserts the number of votes in square brackets at the beginning of each story title.
  2. Digg over X votes - only shows home page stories with more than X votes, where you can specify X when running the Pipe.
  3. Digg over X votes, sorted by category and votes - same as #2, but also sorts stories first by category and then by number of votes.
  4. Digg over X votes, sorted by submitter and votes - same as #2, but also sorts stories first by submitter and then by number of votes.
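
For anyone who would rather read code than watch the video, here is a rough sketch of what the four Pipes do, written in Python instead of Pipes modules. The sample stories and the field names (votes, category, submitter) are made up for illustration and are not the actual Digg feed fields. Sorting votes from highest to lowest is also my assumption.

```python
# Hypothetical sample items; the field names are illustrative only.
stories = [
    {"title": "Story A", "category": "Apple", "submitter": "alice", "votes": 950},
    {"title": "Story B", "category": "Linux", "submitter": "bob", "votes": 420},
    {"title": "Story C", "category": "Apple", "submitter": "carol", "votes": 1300},
    {"title": "Story D", "category": "Linux", "submitter": "alice", "votes": 75},
]


def with_vote_prefix(item):
    """Pipe 1: put the vote count in square brackets before the title."""
    return f"[{item['votes']}] {item['title']}"


def over_x_votes(items, min_votes):
    """Pipe 2: keep only the items with more than min_votes votes."""
    return [item for item in items if item["votes"] > min_votes]


def sorted_by_field_then_votes(items, field):
    """Pipes 3 and 4: sort by a text field, then by votes (highest first)."""
    return sorted(items, key=lambda item: (item[field], -item["votes"]))


popular = over_x_votes(stories, 100)
by_category = sorted_by_field_then_votes(popular, "category")
by_submitter = sorted_by_field_then_votes(popular, "submitter")

category_titles = [with_vote_prefix(item) for item in by_category]
submitter_titles = [with_vote_prefix(item) for item in by_submitter]

for title in category_titles:
    print(title)
```

Here X is 100, so Story D is dropped before any sorting happens.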

Summary

These Pipes examples for Digg have been fairly simple in nature. There are more complex examples over at Yahoo Pipes. Just search there for “Digg” and browse the results.

This is part two of a short series that uses Yahoo Pipes to analyze information about Digg home page stories. Part 1 covered sorting by submitter (member name). Anything not explained here is probably explained in the last post - so please check there.

This part looks at sorting first by story category, then by category and submitter. The former shows any topical bias for the site (e.g., Digg members obviously like articles about Apple), and the latter shows any topical bias for popular members.

Of course, you could guess at some of this information, but having a social media analysis tool lets you collect data automatically over a long term. You can use similar pipes to build a profile for each social media site. Or if you don’t feel like saving the daily data, you can just subscribe to the resulting modified RSS feed in your favorite feed reader. Note that each social media site has different feed information and thus you cannot reuse the Pipes in this example as is. However, the general principles are the same.

Digg feed variables used

We’ll use two Digg feed variables in Yahoo Pipes this time:

digg:category
digg:submitter.digg:username

The regex rule used for the title is

replace (^.*$) with [${digg:category}] [${digg:submitter.digg:username}] $1.

However, we could also use:

replace ^ with [${digg:category}] [${digg:submitter.digg:username}] .

They both mean the same thing: insert the Digg category and username at the beginning of the story title. (Note that there’s a space at the end of the second version’s replacement pattern string.)
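
If you want to convince yourself that the two rules are equivalent, you can try them outside Pipes. This is a small sketch using Python's re module rather than the Pipes Regex module. The category value is made up, and the prefix string stands in for what Pipes would substitute for the two ${...} variables.

```python
import re

title = "Paris' Sob Story"

# Stand-in for "[${digg:category}] [${digg:submitter.digg:username}] ".
# "Celebrity" is a hypothetical category value.
prefix = "[Celebrity] [RainbowPhoenix] "

# Version 1: capture the whole title, then emit the prefix followed by group 1.
whole_string = re.sub(r"(^.*$)", prefix + r"\1", title, count=1)

# Version 2: match the empty position at the start and insert the prefix there.
# The trailing space in the prefix is what separates it from the title.
start_only = re.sub(r"^", prefix, title, count=1)

print(whole_string)
print(whole_string == start_only)
```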

Process

The process for building these Pipes is fairly simple.

Pipe 2a: by category only

  1. Grab the Digg home page feed.
  2. Insert the story category, in square brackets, at the beginning of each title.
  3. Replace each story’s item.description field with nothing.
  4. Sort the list of home page stories by modified title.

Pipe 2b: by category and submitter

  1. Grab the Digg home page feed.
  2. Insert both the story category and submitter, each in square brackets and in that order.
  3. Replace each story’s item.description field with nothing.
  4. Sort the list of home page stories by modified title.

Yahoo Pipes modules used

These are the modules used in these Pipes:

  1. Fetch Feed
  2. Regex
  3. Sort
  4. Pipe Output

Video

Digg home page by category

Digg home page by category and submitter

Links

Here are the links to the two Yahoo Pipes used in this example. If you have a Yahoo Mail account, you can clone and tweak these pipes:

  1. Digg homepage by category.
  2. Digg homepage by category and submitter.

Summary

Part 3 of this series will use Yahoo Pipes to filter out home page story categories you’re not interested in, as well as ignore stories with fewer than X votes, where you can specify X.

For those of you who like to follow social media sites such as Digg, an easy analysis tool may be of some use to you. Yahoo Pipes lets you very quickly put together a suite of tools to organize a web feed’s items. In this example, I’m going to sort the Digg homepage RSS feed by the submitter of each story.

To do that, we need to manipulate some of the content of the Digg feed using the Yahoo Pipes Regex (regular expressions) module. Otherwise, all the information we need is in the feed.

Regular expression patterns:
I’m not going to get into an elaborate discussion of regexes. Instead, I’ll just list what I’ve used in the screencast video. (If you’re familiar with regexes already, bear with me.)

  1. ^ - caret - match the beginning of a string.
  2. $ - dollar - match the end of a string.
  3. .* - dot star - match any sequence of characters.
  4. ^.*$ - match the entire string.
  5. (^.*$) - match the entire string and save it in parameter 1, aka $1.
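
To see these patterns in action outside Pipes, here is a short sketch using Python's re module. Python's regex flavor is not necessarily identical to the one Pipes uses, but these five patterns should behave the same way on a single-line title.

```python
import re

title = "Paris' Sob Story"

start = re.search(r"^", title)        # zero-width match at the beginning
end = re.search(r"$", title)          # zero-width match at the end
anything = re.search(r".*", title)    # any sequence of characters
whole = re.search(r"^.*$", title)     # the entire string
saved = re.search(r"(^.*$)", title)   # the entire string, saved as group 1

print(start.span(), end.span())
print(anything.group(0))
print(saved.group(1))
```

The first two matches are empty: they mark a position in the string rather than any characters, which is why replacing ^ with some text inserts that text at the front.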

Digg feed variables used:
The Digg home page RSS feed has a number of fields/ variables that we can access in Yahoo Pipes. In this example, I’ve only used one:

digg:submitter.digg:username

Within Yahoo Pipes, to access it, we prefix it with a dollar sign and place braces (curly brackets) around it:

${digg:submitter.digg:username}

Process:
These are the steps I take in the video below.

  1. Grab the Digg home page feed.
  2. Insert the digg username (of the story submitter) in the item.title field’s values, at the beginning of the title, surrounded by square brackets.
  3. Do the same with the item.y:title field. (This is probably redundant, but it’s not a big deal.)
  4. Replace the item.description fields with nothing - i.e., an empty string. For our analysis, getting rid of the description reduces visual clutter in the results. It’s just easier to see only the title and submitter.
  5. Sort the resulting manipulated feed by the item.title.

What we’re doing is replacing a story title such as

Paris’ Sob Story

with

[RainbowPhoenix] Paris’ Sob Story

for each home page story. The string in the square brackets is the name of the Digg member that submitted the article. So ^.*$ matches “Paris’ Sob Story”, and the parentheses assign this string to $1. Thus the Regex rule for item.title, which replaces (^.*$) with the string below, takes the very same title and inserts the current Digg username in square brackets in front of it:

[${digg:submitter.digg:username}] $1

Other than getting rid of the story description, this is all we’re really doing, followed by a sort on the title values.
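
Here is the same process as a minimal Python sketch, for comparison with the Pipe. The items are hypothetical stand-ins for feed entries (only the first title and username come from the example above), the item.y:title step is left out, and the case-insensitive sort is my own choice, since I haven't checked how the Pipes Sort module treats upper and lower case.

```python
import re

# Hypothetical stand-ins for Digg feed items.
items = [
    {"title": "Paris' Sob Story", "description": "Story summary text",
     "submitter": "RainbowPhoenix"},
    {"title": "A Second Story", "description": "Story summary text",
     "submitter": "zed_example"},
    {"title": "A Third Story", "description": "Story summary text",
     "submitter": "alpha_example"},
]


def prefix_submitter(item):
    """Apply the (^.*$) rule to the title and blank out the description."""
    new_title = re.sub(
        r"(^.*$)",
        lambda match: f"[{item['submitter']}] {match.group(1)}",
        item["title"],
        count=1,
    )
    return {**item, "title": new_title, "description": ""}


# Rewrite every item, then sort on the modified title.
result = sorted(
    (prefix_submitter(item) for item in items),
    key=lambda item: item["title"].casefold(),
)

titles = [item["title"] for item in result]
for title in titles:
    print(title)
```

Because the username is now the first thing in each title, sorting on the title groups the stories by submitter.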

Yahoo Pipes modules used:

  1. Fetch Feed
  2. Regex
  3. Sort
  4. Output

Here’s a SplashCast screencast showing the process of creating the Pipe. (Apologies for the choppy narration, as I had to use an earlier voiceover due to upload problems.)

Yahoo Pipes - digg homepage sorted by submitter

You can take my Digg by Submitter pipe, clone and tweak it to your heart’s content. Or wait for the next one. In the next part of this mini-series, we’ll sort the Digg homepage by category (and prove an Apple bias for the home page).

Sphinn is a brand new player in the social media space that many of you are already familiar with. It’s still young, but the calling of new, fresh data to analyze got the better of the math geek in me and I built a few Yahoo! Pipes on their RSS feeds. [This post is a continuation of an earlier Yahoo! Pipes: Analyzing Digg (By Submitter, By Category and Submitter, Filter by Votes) on Search Engine Journal, but without a screencast video showing the building of a Pipe.]

Pipes and Processes

There isn’t a great deal of information in Sphinn RSS feeds just yet (see the Wishlist section), compared to, say, Digg feeds. However, there’s enough that I could build a few Pipes. Here they are, all of which you can clone and tweak, if you have a Yahoo! Mail account.

  1. Sphinn new item category count. Yahoo Pipe results/ feed.
    Take the New Items feed and produce a count of stories in each category.
    1. Grab feed.
    2. Use Unique module on category.
    3. Hack the category name into a section URL on Sphinn. (Some errors may exist because this had to be a manual hack, due to lack of section URL info in the feed.) This allows you to click on a category in the Pipes results and go to the corresponding section on Sphinn.
    4. Output results.
  2. Digg new item category count. Yahoo Pipe results/ feed.
    The process for this one is exactly the same. Only the feed URL and the fields are different in the Pipe.
  3. Sphinn new + hot searchable. Yahoo Pipe results.
    This Pipe merges the Sphinn New and Hot feeds and lets you search them. Remember to run the Pipe with your query before subscribing to the resulting feed. Note to the Sphinn boys and girls… This might make a good tag line: “Sphinn: New, Hot, and Searchable.”
    1. Grab both feeds and merge them.
    2. Eliminate duplicate items by title.
    3. Sort in reverse chronological order.
    4. Apply user’s search term.
    5. Apply user’s limit for number of results.
    6. Output results. (The link on each item is to the Sphinn snippet, not the actual article.)
  4. Most active Sphinn commenters at the moment. Yahoo Pipe results/ feed.
    Want to know which Sphinn members are the most active in terms of commenting on stories? This Pipe provides this metric, but is limited by the fact that only 40 comments are in the feed. So you get an idea of fresh commenting activity. (If you want overall commenting activity since Sphinn began, you would first have to scrape the All-Users pages to get a list of members. So a members RSS feed would be nice.)
    1. Grab the comments feed.
    2. Apply the Unique module by dc:creator (commenter).
    3. Sort in descending order by number of comments for that person, in the current list.
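
If it helps to see the logic without the visual editor, here is a rough Python sketch of the steps behind these Pipes. It works on plain dictionaries rather than live feeds, and the sample data, the field names and the two function names are all hypothetical. Searching only the title is an assumption on my part.

```python
from collections import Counter
from datetime import datetime


def merge_and_search(new_items, hot_items, query, limit):
    """Pipe 3: merge, drop duplicate titles, newest first, search, limit."""
    seen_titles = set()
    merged = []
    for item in new_items + hot_items:
        if item["title"] not in seen_titles:
            seen_titles.add(item["title"])
            merged.append(item)
    merged.sort(key=lambda item: item["published"], reverse=True)
    # Assumption: the search term is matched against the title only.
    hits = [item for item in merged if query.lower() in item["title"].lower()]
    return hits[:limit]


def count_by(items, field):
    """Pipes 1, 2 and 4: count items per field value, most frequent first."""
    return Counter(item[field] for item in items).most_common()


# Hypothetical sample data.
new_items = [
    {"title": "Link building tips", "published": datetime(2007, 8, 2, 9, 0)},
    {"title": "Pipes for SEO", "published": datetime(2007, 8, 2, 12, 0)},
]
hot_items = [
    {"title": "Pipes for SEO", "published": datetime(2007, 8, 2, 12, 0)},
    {"title": "Yahoo Pipes tricks", "published": datetime(2007, 8, 1, 8, 0)},
]
comments = [{"creator": name} for name in
            ["ann", "raj", "ann", "ann", "raj", "lee"]]

found = [item["title"] for item in merge_and_search(new_items, hot_items, "pipes", 5)]
commenters = count_by(comments, "creator")

print(found)
print(commenters)
```

The counting function plays the role of the Unique module plus the descending Sort. The same function covers the category counts if you pass it the category field instead of the commenter.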

yahoo pipes -sphinn -active commenters screen snap

Wishlist

Sphinn is pretty new, so infrastructure quirks are to be expected. But because the Sphinn feeds do not carry as much info as the Digg feed, there is very little analysis that can be done in a Yahoo Pipe. Here’s a bit of a wishlist for Sphinn RSS feeds that would help data lusters like myself.

  • More than 40 items in a feed.
  • More information in the feeds.
  • A category URI in each story item so that it’s easy to link to a category’s home page on Sphinn. Or, alternately, an easier mapping from compound category names to the corresponding category home page. Digg drops a story into a URL that contains the category path.
  • Inclusion of the end story’s URL.
  • More member feeds. (These are coming, according to Sphinn.)

Conclusion

How you use the information generated by these Yahoo! Pipes is up to you, but if you’re a data miner/ data luster like myself, you’ll figure out something useful.

Steve Rubel Twittered last night saying:

Checking out blognation. Like it but wish I could subscribe to individual bloggers. http://us.blognation.com/

He raises a great point. It can be annoying on multi-author blogs to have to read everything when you're only interested in the perspectives of some of the authors. On Technology Evangelist, we address this with individual author feeds on each author page.

However, another way to achieve this is to use Yahoo Pipes to filter a blog feed by author. As an example, I created a Yahoo Pipes feed filter for Blognation that filters the feed for the author of your choice. I arbitrarily chose Marc Orchant as the default author, so clicking the Run button will filter the feed for Mr. Orchant unless you swap in another author's name.
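
The idea behind the filter is simple enough to sketch in a few lines of Python. This is not how the Pipe itself is built. The sample posts, the author field name, and the choice of an exact, case-insensitive match are all assumptions for illustration.

```python
def filter_by_author(items, author="Marc Orchant"):
    """Keep only the items written by the given author."""
    wanted = author.strip().lower()
    return [item for item in items
            if item["author"].strip().lower() == wanted]


# Hypothetical feed items.
posts = [
    {"title": "Post one", "author": "Marc Orchant"},
    {"title": "Post two", "author": "Example Author"},
    {"title": "Post three", "author": "marc orchant "},
]

default_titles = [post["title"] for post in filter_by_author(posts)]
other_titles = [post["title"] for post in filter_by_author(posts, "Example Author")]

print(default_titles)
print(other_titles)
```

The default argument plays the same role as the default author in the Pipe: run it as is, or pass another name.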

Giving people control over what they consume is going to happen whether you enable it or not. Clearly, few people are filtering RSS feeds on Yahoo Pipes today, but stuff like this is going to happen.


One of the easiest ways for a non-programmer to combine, aggregate and filter multiple RSS feeds into one is to use Yahoo! Pipes (YP). YP uses a sleek visual editor that allows the user to fetch and manipulate data sources, add user defined inputs and filter the content in a number of ways.

I used YP to combine nine* popular SEO feeds into one and then published it on pipes.yahoo.com where anybody can now use it. Try it in your favorite reader - Composite SEO News Feed.

By using the WordPress plugins FeedList and RunPHP I can also easily display the Composite SEO News Feed right here:

Remember this is the actual feed not just a graphic so whenever you are viewing this page the feed will be up to date.

When you first look at the drag-and-drop interface of YP, it may seem a little daunting, but here is a step-by-step walkthrough using the practical example above. You can, of course, combine any feeds you choose.

First you need to sign in to YP with your Yahoo ID (create an ID if you don’t have one). When you’re signed in, click Create a pipe and click the untitled tab to give your pipe a name. Drag a Fetch Feed module into the workspace.

Drag the Fetch Feed module to the workspace.

Enter a feed url, which you will find on most sites by clicking the RSS, XML or Atom link or icon. If you see a “?” icon in the Fetch Feed module, that means you have entered an invalid feed address.

Copy and paste the feed url.

Click the url icon to enter a second feed.

Enter the second feed url.

Repeat until you have entered all the feed urls that you want to combine.

Complete the addition of feed urls.

Drag a Sort module to the workspace. Pipe the Fetch Feed module to the Sort module by clicking the circle on top of the Sort module and dragging it to the circle at the bottom of the Fetch Feed module. A blue pipe will appear and connect the two.

Pipe the Fetch Feed module to the Sort module.

Sort by date in descending order by selecting PubDate from the first drop-down menu and Descending from the second drop-down menu.

Sort by date in descending order.

Drag a Truncate module to the workspace. Pipe the Sort module to the Truncate module by clicking the circle at the bottom of the Sort module and dragging it to the circle at the top of the Truncate module. Enter a value for the maximum number of items you require from your combined feed.

Pipe the Sort module to the Truncate module.

Pipe the Truncate module to the Pipe Output and the Debug area will fill up with your new feed’s output.

Pipe the Truncate module to the Pipe Output.

Finally click Save and then click Publish. In the pop-up window enter a description for your pipe and when you click Publish again your Pipe will go public.
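
For programmers who are curious, the Fetch Feed, Sort and Truncate chain above can be approximated in a few lines of Python. This is only a sketch of the same idea, not what YP does internally. The two tiny feeds are invented and embedded as strings so the example runs without a network connection, and every item is assumed to carry a pubDate.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up RSS 2.0 feeds standing in for real feed urls.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>A1</title><link>http://example.com/a1</link>
<pubDate>Mon, 06 Aug 2007 10:00:00 +0000</pubDate></item>
<item><title>A2</title><link>http://example.com/a2</link>
<pubDate>Sat, 04 Aug 2007 10:00:00 +0000</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>B1</title><link>http://example.com/b1</link>
<pubDate>Sun, 05 Aug 2007 08:00:00 +0000</pubDate></item>
</channel></rss>"""


def read_items(rss_text):
    """Parse one RSS document into a list of item dictionaries."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
            # Assumes each item has a pubDate in the usual RSS date format.
            "published": parsedate_to_datetime(node.findtext("pubDate")),
        }
        for node in root.iter("item")
    ]


def combine(feeds, max_items):
    """Fetch Feed -> Sort (PubDate, Descending) -> Truncate."""
    merged = [item for rss_text in feeds for item in read_items(rss_text)]
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged[:max_items]


latest = [item["title"] for item in combine([FEED_A, FEED_B], 2)]
print(latest)
```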

By combining YP with mashup tools like Dapper or OpenKapow you will be able to construct an RSS feed from almost anything that you can find on the Web.

*The nine feeds combined in the Composite SEO News Feed:
SEO by the SEA
Search Engine Land
Search Engine Roundtable
Matt Cutts
SEO Book
SEO Blog
SEOMoz
Threadwatch
Marketing Pilgrim

I don't know if Yahoo Pipes has gained much traction since launching, but I do know that I find the service useful.

For those who aren't familiar with it, it's a relatively easy-to-use platform for creating RSS feed mashups, filters, and other related fun stuff. I say "relatively easy" since you don't have to be a programmer or own a server in order to create mashups, but it still has a bit of a learning curve.

Here is a very simple use of the service that I think helps illustrate the power.

There is an online running forum called dyestat.com Track Talk that has a lot of content that's interesting to me. However, it also has a TON of stuff that doesn't interest me at all.

What's nice is that there is a Minnesota-specific section to the forum. What isn't nice is that there isn't a Minnesota-specific RSS feed available for the forum.

That's where Yahoo Pipes comes to the rescue. I simply took the RSS feed for the entire dyestat.com Track Talk forum, dropped that into Yahoo Pipes, added a filter for the word "Minnesota" in the item descriptions (which they happen to add to all items posted to the Minnesota sub-forum), and generated a filtered RSS feed of the remaining posts:

Yahoo Pipes Filtering of RSS Feed
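
If you'd rather see the idea in code, here's a small Python sketch of the same kind of filter. It isn't what Yahoo Pipes runs behind the scenes. The sample feed is invented, and ignoring upper and lower case in the match is my own choice.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for the forum feed.
SAMPLE_FEED = """<rss version="2.0"><channel><title>Track Talk</title>
<item><title>State meet preview</title>
<description>Minnesota: who is running this weekend?</description></item>
<item><title>National rankings</title>
<description>General discussion</description></item>
</channel></rss>"""


def filter_feed(rss_text, keyword):
    """Return the feed as text, keeping items whose description has keyword."""
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    # Copy the list first, because items are removed while looping.
    for item in list(channel.findall("item")):
        description = item.findtext("description", default="")
        if keyword.lower() not in description.lower():
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


filtered_feed = filter_feed(SAMPLE_FEED, "Minnesota")
kept_titles = [node.findtext("title")
               for node in ET.fromstring(filtered_feed).iter("item")]
print(kept_titles)
```

The output is still a feed, just a shorter one, which is why the result can be subscribed to like any other RSS feed.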

This took about 2 minutes. Yahoo Pipes has given me easier access to content that's of interest to me.

And now that it exists, others can use it as well or clone and edit it to personalize it for content that interests them.