How I created the Twitter Social Map

meeyauw asked how I produced the Social Map of Twitter that I put up yesterday for Wordless Wednesday. I didn't want to put up the details yesterday, or else the picture wouldn't have been wordless.

However, it is now Thursday, and I would like to use the map as a starting point for a discussion about Twitter. I was at the OMMA show earlier this week and two things jumped out at me. The first was the eagerness of marketing types to diss Twitter and the second was the lack of interest in conversations online. I think these are related.

As I thought about Twitter, I thought it would be interesting to produce a social map of the connections in Twitter. I wrote a fairly quick program in C# using Mono, an open source, free software implementation of the .NET framework. I've been doing a little more programming in Mono recently, in part because of my interest in OpenSim, which is also written in C# on Mono. OpenSim is a "BSD Licensed Open Source project to develop a functioning virtual worlds server platform" similar to Second Life.

One of the Mono tutorials had an example of scraping a Google page. I modified that to scrape Twitter pages. Essentially, I would take each Twitter page, scrape out the list of friends, and then for each friend, repeat the process. However, this would produce a very large graph which would include people who are not particularly active twitterers.
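For anyone who wants to try something similar, here is a rough sketch of the crawl in Python. It is not the original C# program. The `get_friends` callback, the `max_people` limit, and the made-up names in `FAKE_FRIENDS` are all assumptions for illustration, and the breadth-first order is just one reasonable way to "repeat the process" for each friend. A real version would replace the callback with something that fetches and parses a page.

```python
from collections import deque

def crawl(start, get_friends, max_people=100):
    """Walk the friend graph breadth-first and return directed (person, friend) edges.

    get_friends is a hypothetical callback that returns the friends of one person.
    max_people caps how many pages are visited so the walk always stops.
    """
    seen = {start}
    queue = deque([start])
    edges = []
    visited = 0
    while queue and visited < max_people:
        person = queue.popleft()
        visited += 1
        for friend in get_friends(person):
            edges.append((person, friend))
            if friend not in seen:
                # Only queue each person once, so cycles do not loop forever.
                seen.add(friend)
                queue.append(friend)
    return edges

# Stand-in data instead of real pages (names are invented).
FAKE_FRIENDS = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["bob", "dave"],
    "dave": [],
}

edges = crawl("alice", lambda name: FAKE_FRIENDS.get(name, []))
for person, friend in edges:
    print(person, "->", friend)
```

The `seen` set matters: friendships loop back on themselves constantly, and without it the crawl would never finish.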

So, I threw in a little test. I only selected people who had more than 100 friends and who had more followers than friends. I felt this would focus the map on the people that others are especially likely to follow.
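That test is simple enough to write down. As a sketch in Python (the function name `is_worth_mapping` and the `min_friends` parameter are my own labels for illustration, not names from the original program):

```python
def is_worth_mapping(friends, followers, min_friends=100):
    """Keep a person only if they have more than min_friends friends
    and more followers than friends."""
    return friends > min_friends and followers > friends

# Made-up counts to show the three cases.
print(is_worth_mapping(150, 300))  # enough friends, more followers than friends
print(is_worth_mapping(100, 500))  # exactly 100 friends is not "more than 100"
print(is_worth_mapping(200, 150))  # follows more people than follow back
```

Raising `min_friends`, or adding a minimum follower count, would be one way to thin the graph out further.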

My first pass didn't have any error checking, and it ran through about twenty different people before I got an error from Twitter. However, it gave me enough data to produce the graph. I have since run a version that captures errors so it can keep on going, and also pauses for a second between page requests, so I'm less likely to overload the Twitter servers.
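The general shape of "capture errors and pause between requests" looks something like the Python sketch below. Again, this is an illustration rather than the original code: `fetch_all`, the `fetch` callback, and the `delay_seconds` parameter are hypothetical names, and catching every `Exception` is a blunt choice that a real program might want to narrow.

```python
import time

def fetch_all(names, fetch, delay_seconds=1.0):
    """Call fetch(name) for each name, pausing between requests.

    Returns (results, failures). A failed request is recorded and skipped
    so that one bad page does not stop the whole run.
    """
    results = {}
    failures = {}
    for index, name in enumerate(names):
        if index > 0:
            # Be polite to the server: wait before every request after the first.
            time.sleep(delay_seconds)
        try:
            results[name] = fetch(name)
        except Exception as error:
            failures[name] = str(error)
    return results, failures
```

Keeping the failures in their own dictionary makes it easy to see afterwards which pages were skipped, and to retry just those.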

That run produced massive amounts of data, too much to reasonably be displayed in a graph, and I'm thinking of doing another pass where I only look at people with more than a thousand followers.

My program wrote out the results in a format that could be fed into Graphviz. Graphviz is a wonderful program to create visual images of graphs. Since Twitter friendships are asymmetrical, that is, I can add you as a friend without you adding me as a friend in return, I used the directed graph capability of Graphviz.
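Graphviz reads a plain-text format called DOT, where a directed graph is declared with `digraph` and each edge is written with `->`. A minimal sketch of writing such a file from a list of edges, in Python, might look like this. The function name `to_dot`, the graph name `twitter`, and the sample names are assumptions for illustration. It also assumes the names contain no double quotes, which would need escaping.

```python
def to_dot(edges, graph_name="twitter"):
    """Turn (person, friend) pairs into the text of a Graphviz directed graph."""
    lines = [f"digraph {graph_name} {{"]
    for person, friend in edges:
        # The arrow points from the person to the friend they added.
        lines.append(f'    "{person}" -> "{friend}";')
    lines.append("}")
    return "\n".join(lines)

# Invented sample: alice added bob, but bob did not add alice back.
sample = to_dot([("alice", "bob"), ("carol", "alice")])
print(sample)
```

Saved to a file such as `map.dot`, the output can be rendered with the Graphviz command `dot -Tpng map.dot -o map.png`.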

Each time, I started on my own Twitter page, and followed the links. In each run, I very rapidly found my way to Biz Stone, which isn't surprising since Biz is a co-founder of Twitter.

I look forward to creating another map, as well as posting some other reflections on Twitter shortly.