Social Network Contact Management: Choosing Blogs to Read with BlogCatalog and Graphviz

I’m always searching for the best way to find which blogs I should read today. As technology changes, so does the strategy that I use. For example, sometimes I like to check which of the blogs I’ve subscribed to have new material. Other times, I might hop from one blog to another by following advertisements from EntreCard, CMF Ads, or Adgitize. Sometimes, I might use BlogExplosion to suggest sites I should visit. Or, I might read through the blogs of people who have visited me, as reported by MyBlogLog or BlogCatalog.

Recently, I’ve mostly been following blog advertisements and supplementing them with BlogExplosion recommendations. Yet the recent issues with EntreCard and how they handle advertisements have caused me to change that strategy for the time being. First off, many of the EntreCard ads no longer lead to interesting sites, so I’m less interested in following the links. Beyond that, some EntreCard users have called for a strike to protest those changes, and I’ve adjusted my strategy for today accordingly.

With this, I thought I’d dig a little bit deeper into the social network contact management aspect of selecting blogs. I want to build up my readership by visiting sites that have recently visited me.

There are three systems that I currently use to track who has visited me. MyBlogLog is the granddaddy of recent reader lists. They provide a nice API to extract the information, and I’ve done some interesting work in the past with MyBlogLog. In addition, they provide lots of information about the services that readers use, so I can go check out the Twitter streams or Flickr photos of recent readers.

TwitterRemote also provides an interesting tool for tracking recent visitors. The problem with TwitterRemote is that they don’t currently have a nice API. I looked around, and I could fairly easily reverse engineer their widget to get a webpage that I could scrape for recent Twitter visitors. But that is a lot of work, and they might change the page layout, which would break the scraping. So, I’ve sent an email to them asking if they will create a TwitterRemote API. It would provide another nice tool.

With that, I decided today to get to know the BlogCatalog API. BlogCatalog has a recent reader list very similar to the MyBlogLog recent reader list. Their API isn’t as rich as the MyBlogLog API, but it is sufficient for some of the simpler tasks I have in mind.

So, I went out and wrote a little routine in PHP. It calls the BlogCatalog API to find out who my recent readers have been. For each recent reader, it then checks to see if they have a blog, and if so, who has been reading their blog. Left unchecked, this could crawl for a long time and produce so much data that the result would be useless.

Instead, I put some limits on the crawling. First, I look at only the six most recent readers for any blog. Then, I look at their readers and their readers’ readers, and so on. Right now, I’m stopping after I go through sixty readers. Then, I produce a Graphviz chart which shows who has been reading each other’s blogs.
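For anyone who wants to try something similar, here is a rough sketch of that kind of bounded crawl. It is in Python rather than the PHP I actually used, and it is not my real routine. The function fetch_recent_readers is a hypothetical stand-in for whatever call returns a blog owner’s recent readers, so nothing here should be read as the real BlogCatalog API. I’m also assuming that “going through sixty readers” means fetching the reader lists of at most sixty people, counting the starting blog.

```python
from collections import deque


def crawl_readers(start, fetch_recent_readers, per_blog=6, max_readers=60):
    """Breadth-first crawl of who reads whom, with two limits.

    fetch_recent_readers(name) is a hypothetical callable that returns a
    list of reader names for that person's blog, most recent first
    (an empty list if they have no blog).
    Returns a list of (reader, blog_owner) pairs.
    """
    edges = []
    seen = {start}            # people already queued, to avoid loops
    queue = deque([start])
    processed = 0             # how many reader lists we have fetched

    while queue and processed < max_readers:
        owner = queue.popleft()
        processed += 1
        # Only the most recent few readers of each blog are followed.
        for reader in fetch_recent_readers(owner)[:per_blog]:
            edges.append((reader, owner))
            if reader not in seen:
                seen.add(reader)
                queue.append(reader)
    return edges
```

Passing the lookup in as a function keeps the crawl logic separate from the API details, so the same loop could be pointed at a different recent reader service.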

Also, in order to keep the graph from getting too large, I only show people who are both reading and being read.
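That filter, and the step of turning the results into something Graphviz can draw, could be sketched like this. Again this is Python and only an illustration, not my actual code. It assumes the crawl produced a list of (reader, blog_owner) pairs, and it reads “both reading and being read” as: a person stays only if they appear as a reader in some pair and as a blog owner in some pair. The function names and the graph name are made up for the example.

```python
def quote(name):
    """Wrap a name in double quotes for DOT, escaping embedded quotes."""
    return '"' + name.replace('"', '\\"') + '"'


def to_dot(edges):
    """Build Graphviz DOT text from (reader, blog_owner) pairs.

    Only people who both read someone and are read by someone are kept,
    and an arrow is drawn only when both of its ends are kept.
    """
    readers = {reader for reader, _ in edges}
    owners = {owner for _, owner in edges}
    keep = readers & owners

    lines = ["digraph readers {"]
    # set() drops duplicate pairs; sorted() makes the output stable.
    for reader, owner in sorted(set(edges)):
        if reader in keep and owner in keep:
            lines.append("  " + quote(reader) + " -> " + quote(owner) + ";")
    lines.append("}")
    return "\n".join(lines)
```

Saving the returned text to a file such as readers.dot and running dot -Tpng readers.dot -o readers.png should produce an image. This filter makes a single pass, so a person can still end up with no arrows if everyone they were connected to was dropped.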

Now that I have it working somewhat nicely, I’ve produced the following graph.



BlogCatalog Recent Dropper Graph, originally uploaded by Aldon.

The first thing that jumps out at me with this version of the graph is that a lot of people are reading Forced Green, and they have a lot of people reading them, but there isn’t a lot of interaction between the different sites.

On the other side of the graph, you can see some of the community effect with sites like Random Ramblings, Daisy the Curly Cat, I Love/Hate America and The One About… interacting.

Eventually, I hope to combine information from graphs like this with other sources to answer a question like this: of the people who have been reading my blog or responding to me on Twitter, Facebook and the like, who has new content on their blog that I haven’t read recently? Or, put more simply, whose blog should I read next?

So, how do you decide which blog to read next?
