Technology
Building a Twitter Status Cloud
Submitted by Aldon Hynes on Fri, 02/06/2009 - 23:09
Last week, I produced a word map of the statuses of the people I follow on Twitter. Willem Kossen asked if I could release the program under an open source license. Actually, I’ll do something I hope some of you will find even more helpful. I’ll release it for free, into the public domain, along with my comments about how I put it together.
I actually started off trying to come up with some nice GraphViz images of various social networks I’m on. (For more about GraphViz, read my blog posts Installing GraphViz in Drupal and Using GraphViz, a Brief Tutorial. You may also want to check out some of the GraphViz images I’ve uploaded to Flickr and a great Visualization of the Madoff Securities “Feeder Funds”.)
From my Flickr images, you’ll see that I like to create images of social networks using GraphViz, and I thought I would try to create an interesting image of my Identi.ca network. I like working with Identi.ca because it is open source and it uses open standards. For example, you can get my network on Identi.ca as a FOAF file. This is a standardized XML format that can easily be parsed.
In PHP, if you have curl installed, you can read a web page fairly easily:
$ch=curl_init();
curl_setopt($ch,CURLOPT_URL,'http://identi.ca/'.$target.'/foaf');
curl_setopt($ch,CURLOPT_RETURNTRANSFER, 1);
$xmlstr = curl_exec($ch);
curl_close($ch);
This little snippet of PHP opens a channel (a curl handle), which I’m calling $ch. It goes out and gets the FOAF file for whichever $target I specify. The result is saved in a string called $xmlstr.
With this, you can then parse the XML into an easy-to-use structure using SimpleXML.
try {
$xml = new SimpleXMLElement($xmlstr);
} catch (Exception $e) {
print "Skipping " . $target . "\n";
}
I use a ‘try’ around the calling of SimpleXMLElement in case the $xmlstr doesn’t contain valid XML. In my case, I just skip the records that don’t have valid XML.
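As a rough sketch of that skip-on-bad-XML idea, the parsing can be wrapped in a small helper. The function name parseXmlOrNull and the two sample strings are made up for illustration; only SimpleXMLElement and the libxml error functions are standard PHP.

```php
<?php
// Hypothetical helper: returns a SimpleXMLElement, or null when the
// string is not well-formed XML, so the caller can skip that record.
function parseXmlOrNull(string $xmlstr): ?SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    try {
        return new SimpleXMLElement($xmlstr);
    } catch (Exception $e) {
        return null;
    } finally {
        // Restore the previous error handling whatever happened.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}

// Made-up sample data: one well-formed record and one broken record.
$samples = [
    'good' => '<person><name>Example</name></person>',
    'bad'  => '<person><name>Example</person>',
];

foreach ($samples as $target => $xmlstr) {
    $xml = parseXmlOrNull($xmlstr);
    if ($xml === null) {
        print "Skipping " . $target . "\n";
        continue; // move on, so $xml is never used when parsing failed
    }
    print $target . ": " . $xml->name . "\n";
}
```

Returning null makes the "skip" explicit: the code after the check only runs when there is a parsed document to work with.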
The next part is where I’ve always needed to explore a little bit to make sure that I get the right syntax. XML documents can be multiple levels deep, and the SimpleXMLElement class maps them into structures within structures within structures in PHP.
In this case, the information about the first person in the FOAF document can be found as
$xml->Person[0]->holdsAccount->OnlineAccount->accountName[0];
The people that the person knows can be found by incrementing the index of Person. So, I wrote a loop to go through the structure and write out all the relationships in GraphViz format. I also built a list of other FOAF files to extract the relationships so I could get additional degrees of separation.
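The loop described above might look something like the following sketch. It is not the original program: the function name foafToDot and the sample XML are invented for illustration, and the sample is a stripped-down, namespace-free stand-in for a real FOAF file. It assumes, as described above, that the first Person is the owner of the file and the rest are the people that person knows.

```php
<?php
// Made-up, simplified sample. A real FOAF file has an rdf:RDF root,
// namespace declarations, and many more elements per person.
$xmlstr = <<<XML
<RDF>
  <Person>
    <holdsAccount><OnlineAccount><accountName>alice</accountName></OnlineAccount></holdsAccount>
  </Person>
  <Person>
    <holdsAccount><OnlineAccount><accountName>bob</accountName></OnlineAccount></holdsAccount>
  </Person>
  <Person>
    <holdsAccount><OnlineAccount><accountName>carol</accountName></OnlineAccount></holdsAccount>
  </Person>
</RDF>
XML;

// Hypothetical helper: turn the Person list into GraphViz edges.
function foafToDot(SimpleXMLElement $xml): string
{
    // Collect every account name in document order.
    $names = [];
    foreach ($xml->Person as $person) {
        $names[] = (string) $person->holdsAccount->OnlineAccount->accountName[0];
    }

    // Assumption: the first Person owns the file, the rest are friends.
    $owner = array_shift($names);

    $lines = ["digraph {"];
    foreach ($names as $friend) {
        // Names are quoted but not escaped; fine for simple account names.
        $lines[] = sprintf('  "%s" -> "%s";', $owner, $friend);
    }
    $lines[] = "}";

    return implode("\n", $lines) . "\n";
}

print foafToDot(new SimpleXMLElement($xmlstr));
```

To get further degrees of separation, the same function could be run on each friend's FOAF file in turn, which is also where the graph starts to grow quickly.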
Unfortunately, I have a lot of friends on Identica, and most of them have lots of friends as well, and the graph became unmanageable. I kicked around building some filters to only track special friends, but didn’t come up with anything good, so I set aside the identi.ca graphing.
MyBlogLog also provides FOAF files. In addition, the MyBlogLog FOAF files include links to other services that users have specified. Unfortunately, the MyBlogLog FOAF files use namespaces, which complicates the parsing. In addition, I probably have even more friends on MyBlogLog than I do on identi.ca, so I set that aside.
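For what it is worth, SimpleXML can reach namespaced elements through its children() method, which takes the namespace URI. The sketch below uses a made-up sample with the standard FOAF namespace URI; it is not an actual MyBlogLog file, and the function name foafNames is hypothetical.

```php
<?php
// Made-up sample in which every FOAF element carries the foaf: prefix.
$xmlstr = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person><foaf:name>Alice Example</foaf:name></foaf:Person>
  <foaf:Person><foaf:name>Bob Example</foaf:name></foaf:Person>
</rdf:RDF>
XML;

// Hypothetical helper: list the foaf:name of every foaf:Person.
function foafNames(string $xmlstr): array
{
    $foaf = 'http://xmlns.com/foaf/0.1/';
    $xml = new SimpleXMLElement($xmlstr);

    $names = [];
    // children($uri) selects child elements that live in that namespace;
    // a plain $xml->Person would not see the prefixed elements.
    foreach ($xml->children($foaf)->Person as $person) {
        $names[] = (string) $person->children($foaf)->name;
    }
    return $names;
}

print implode("\n", foafNames($xmlstr)) . "\n";
```

Asking for the namespace at every level is a little verbose, but it keeps each lookup explicit about which vocabulary it is reading from.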
Which takes me to Twitter. Twitter also gives you the ability to extract information in XML. As an example, you can get my most recent 100 friends on Twitter, including their name, screen name, location, description, and most recent status. For the status, there is information such as what it says, when it was created, what tool was used, etc.
As I noted, you can get up to 100 friends’ worth of statuses at a time. If you have lots of friends, you need to loop through all of the pages.
So, I used the curl and SimpleXML processing above, together with some extra looping to pull all the statuses. With that, here is the PHP program that I used:
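<?php
// Fetch every page of friends for one Twitter user and
// print each friend's name and most recent status.
$page = 1;
while (1) {
    // Download one page (up to 100 friends) as XML.
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://twitter.com/statuses/friends/ahynes1.xml?page=' . $page);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $xmlstr = curl_exec($ch);
    curl_close($ch);

    // Stop if the response is not valid XML.
    try {
        $xml = new SimpleXMLElement($xmlstr);
    } catch (Exception $e) {
        exit;
    }

    // A page with no first user means there are no more friends to read.
    $i = 0;
    $uname = $xml->user[$i]->name;
    if ($uname == '') exit;

    // Walk through the users on this page until the names run out.
    while ($uname != '') {
        $status = $xml->user[$i]->status->text;
        print $uname . " : " . $status . "\n";
        $i = $i + 1;
        $uname = $xml->user[$i]->name;
    }

    $page = $page + 1;
}
?>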
As you can see, you simply put the name of the person you want in the URL and off you go. Caveats: You don’t need to log in to Twitter to be able to do this, and you can do it for anyone, provided the people they follow don’t have their Tweets protected. However, you will get rate limited if you try to fetch a lot of pages at the same time.
What I did was save the results to a file that you can see here. The next step was to paste the text into Wordle.net. I then took a screen print of the page and saved it as an image. I could probably search around for some other word cloud software and do that as part of the process, but this is good enough for now.
A minor change and this could be used to show the description of the people that I follow, or the people that follow me. Someone else has already set up a word cloud generator like that, and you can see the word cloud of the bios of people that follow me at TwitterSheep.
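The inner loop of the program above can also be written with foreach, which avoids indexing past the last user. This is only a sketch under assumptions: the function name statusLines is hypothetical, and the sample XML is a made-up, cut-down document with just the elements the program reads (user, name, status, text).

```php
<?php
// Made-up sample with only the elements the program above reads.
$xmlstr = <<<XML
<users>
  <user><name>Alice</name><status><text>Hello world</text></status></user>
  <user><name>Bob</name><status><text>Reading about GraphViz</text></status></user>
</users>
XML;

// Hypothetical helper: one "name : status" line per user on a page.
function statusLines(string $xmlstr): array
{
    $xml = new SimpleXMLElement($xmlstr);

    $lines = [];
    // foreach visits each <user> in turn and stops at the last one.
    foreach ($xml->user as $user) {
        $lines[] = $user->name . " : " . $user->status->text;
    }
    return $lines;
}

print implode("\n", statusLines($xmlstr)) . "\n";
```

An empty array from a page is then a natural signal to stop asking for further pages.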
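If the counting step were ever folded into the script, the heart of any word cloud is a word frequency table. The sketch below is an assumption about how that might be done in plain PHP; it is not how Wordle works internally, and the function name wordCounts, the minimum word length, and the sample text are all invented.

```php
<?php
// Hypothetical helper: count how often each word appears in the text.
function wordCounts(string $text, int $minLength = 4): array
{
    // Lowercase first so "Snow" and "snow" are counted together.
    // str_word_count() is ASCII-oriented; accented text needs more care.
    $words = str_word_count(strtolower($text), 1);

    // Drop short words, a crude stand-in for a real stop-word list.
    $words = array_filter($words, function ($word) use ($minLength) {
        return strlen($word) >= $minLength;
    });

    // Tally the remaining words and put the most frequent first.
    $counts = array_count_values($words);
    arsort($counts);
    return $counts;
}

// Made-up sample in the "name : status" shape printed by the program.
$text = "Alice : snow in Connecticut today\nBob : more snow, more coffee";

foreach (wordCounts($text) as $word => $count) {
    print $word . " " . $count . "\n";
}
```

The counts could then drive font sizes in whatever drawing tool is at hand, with bigger numbers getting bigger type.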
So, with that, here is my Friday evening word cloud of statuses of the people that I am following, thanks to a little PHP using curl and SimpleXML as well as the word cloud software at Wordle.net:
Yale Education Leadership Smackdown!
Submitted by Aldon Hynes on Thu, 02/05/2009 - 23:19
On Friday, February 13, the Yale School of Management will sponsor the Yale Education Leadership Conference. Two years ago, the conference ended with Wendy Kopp, founder of Teach For America, Dacia Toll, president of Achievement First, Rep. Andrew Fleischmann, chairman of the General Assembly’s Education Committee, and Steven Adamowski, Superintendent of the Hartford Public Schools, talking about “Closing the Achievement Gap in the State of Connecticut”.
This year’s conference will include keynotes by Joel Klein, Chancellor of the New York City Department of Education, and Alberto Carvalho, Superintendent of the Miami-Dade Public Schools. One of the panels will have Rep. Fleischmann return, along with Congressman Chris Murphy, Chairman of the Connecticut State Board of Education Allan Taylor, and Connecticut State Department of Higher Education Commissioner Michael Meotti.
Concurrent with this, the Yale Law School’s Law and Media Program will be holding a conference, “The Future of Student Internet Speech: What Are We Teaching the Facebook Generation”. This conference will include a discussion of The Doninger v. Niehoff Case and how far school authority should extend.
It is unfortunate that these two events overlap since they cover related topics and I would love to be able to attend both. Meanwhile, I still need to write up my notes from the presentation of the proposed Woodbridge school budget and make it to the next committee meeting where we are discussing a three-year technology plan for the school district.
Against this backdrop, I am hearing people talking about Gov. Rell’s latest nominee for the State Board of Education. Today, Linda McMahon, CEO of World Wrestling Entertainment (WWE), testified before a legislative committee. People have criticized the nomination, arguing that WWE programming borders on the pornographic. They have pointed out the issues with steroid use amongst WWE wrestlers and pondered what Ms. McMahon would have our children learn.
Others asked what her qualifications are, other than significant campaign contributions to numerous candidates. I have been more concerned about her unwillingness to accept interviews about her nomination and her inability to give anything beyond basic answers when asked about various educational issues. Somehow, I don’t expect to see her at any of the educational conferences coming up in the next few weeks.
Yet I don’t expect her to do significantly worse than any of the other members of the State Board of Education. After all, my cynical friends always point out to me, isn’t that how most political appointments are made, not on merit but on connections? Look at Gov. Rell, they point out. She was Gov. Rowland’s Lt. Governor for nearly ten years before Gov. Rowland’s resignation.
Now, Gov. Rell is campaigning on reducing the bloat in government. Perhaps the first place to start is by seeking nominees for political appointments based on their merit instead of how well connected they are. Until then, her comments about reducing bloat in government are going to sound awfully hollow or hypocritical.
SimpleXML PHP and Top EntreCard droppers
Submitted by Aldon Hynes on Sat, 01/31/2009 - 12:41
Today is a day that EntreCard has asked people to highlight those that have dropped the most cards on their site. It is easy to get this as an RSS feed, and I could have just thrown it into the Drupal Feed Aggregator and put it up as a block. Or, I could have used one of many different widgets to highlight top droppers.
Instead, I’ve been playing with parsing XML using SimpleXML in PHP. I’ll be writing more about this later. For the time being, let me present my top droppers, as retrieved as an RSS feed from EntreCard, formatted for this blog, along with some of my own comments.
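Reading an RSS feed with SimpleXML generally comes down to walking the item elements under channel. The sketch below shows the idea on a made-up RSS 2.0 sample; the feed contents, the example.com links, and the function name rssItems are invented, and the real EntreCard feed may carry different fields.

```php
<?php
// Made-up RSS 2.0 sample standing in for the real feed.
$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Top droppers (sample)</title>
    <item><title>Example Blog One</title><link>http://example.com/one</link></item>
    <item><title>Example Blog Two</title><link>http://example.com/two</link></item>
  </channel>
</rss>
XML;

// Hypothetical helper: pull the title and link out of every item.
function rssItems(string $rss): array
{
    $xml = new SimpleXMLElement($rss);

    $items = [];
    foreach ($xml->channel->item as $item) {
        // Cast to string so plain values are stored, not XML objects.
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $items;
}

foreach (rssItems($rss) as $item) {
    print $item['title'] . " (" . $item['link'] . ")\n";
}
```

Once the items are in a plain array, formatting them for a blog post is just a matter of printing whatever HTML is wanted around each title and link.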
Using GraphViz, a Brief Tutorial
Submitted by Aldon Hynes on Thu, 01/29/2009 - 12:52
Yesterday, I wrote a blog post about installing GraphViz on Drupal. Today, I will describe a little bit about how to create interesting images using GraphViz. This is a bit of a long and geeky post, so I’ll spare my less geeky friends the details. Click on read more to see the full blog post.
Installing Graphviz in Drupal
Submitted by Aldon Hynes on Wed, 01/28/2009 - 18:26
I have now installed Graphviz on three different Drupal servers and I figure it is time for me to relate my experiences.
First, let me provide a little background. “Graphviz is open source graph visualization software.” Essentially, this allows you to represent a graph in a fairly simple format which then gets displayed as an image.
For example, I might put in:
DIGRAPH {
a -> b;
b -> c;
c -> a;
}
This would create an image with an ‘a’ in a circle, with an arrow leading to a ‘b’ in a circle, which would point to a ‘c’ in a circle, and finally an arrow from the circle with the ‘c’ in it back to the circle with an ‘a’ in it. Very nice. Very simple. Yes, you could use plenty of graphics programs to produce something like this, but when the graphs get a bit more complicated, it can be especially nice to have a program like Graphviz arrange all of the different pieces.
To get an idea of some of the things you can do with GraphViz, check out the images I produced with GraphViz on Flickr. I’ve created some fun images of social network graphs there.
Drupal is the Content Management System that I like to use for most of my sites. It makes it very easy to add content to a website. So, the combination of Drupal and Graphviz has some great potential.
The first Drupal site that I installed Graphviz on was on an Ubuntu server on my internal network. Since it is Ubuntu, it is very easy to install packages. To install the graphviz package, I simply entered
sudo apt-get install graphviz
I then tested to make sure that graphviz was working by making a simple graphviz file and running ‘dot’ to convert it to an image.
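A test along those lines can be sketched from PHP as well. The snippet below writes the three-node graph to a temporary file and builds a dot command line for it. The temporary file names are arbitrary, and the command is only printed here rather than run, so the sketch works even on a machine without Graphviz.

```php
<?php
// The same three-node graph, written in DOT.
$dot = "digraph {\n  a -> b;\n  b -> c;\n  c -> a;\n}\n";

// Save it to a temporary file (the name is chosen by the system).
$dotFile = tempnam(sys_get_temp_dir(), 'gv');
file_put_contents($dotFile, $dot);

// -Tpng selects PNG output and -o names the output file.
$pngFile = $dotFile . '.png';
$command = 'dot -Tpng ' . escapeshellarg($dotFile)
         . ' -o ' . escapeshellarg($pngFile);

print $command . "\n";

// With Graphviz installed, the command could be run like this:
// exec($command, $output, $status);
```

If the command runs and the PNG shows three circles joined by arrows, the graphviz package is working.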
The Drupal graphviz filter is dependent on the Pear package, Image_GraphViz. I didn’t have Pear installed, so I needed to do that as the next step.
sudo apt-get install php-pear
The best way to check that pear was installed properly was to use it to install Image_GraphViz. So, I executed
sudo pear install Image_GraphViz
At this point, I was ready to test it in Drupal. I downloaded the module, unzipped it and went to module administration to enable it. I then went to Input Formats to add the filter to various input formats. So far, I’ve only had it work nicely with the PHP Input Format.
My first installation was nearly a year ago, and it worked fairly nicely.
My second attempt was to install it on a shared hosting service. However, the commands that I used above weren’t available, so after a little hacking I gave up. I probably could install it in my own directory, change some paths, etc., but it just didn’t seem worth the effort.
My most recent attempt was for Toomre Capital Markets. The TCM site is running on a virtual private server running Ubuntu, so the procedure that I used for my first installation worked simply and easily for the TCM installation. Lars Toomre used GraphViz to create a great Visualization of the Madoff Feeder funds.
Included in the graph is the use of different colors as well as links from the graph into various articles that Lars has written about the Feeder Funds.
So, if you have your own server or a virtual private server, setting up GraphViz to run in Drupal can be fairly simple and straightforward. Building interesting graphs can be as well, and perhaps I’ll offer some of the hints of how to do this in a subsequent post.