Visualizing Google Analytics Keywords with Graphviz and Awk

I haven’t produced any fun images for a while, nor have I written anything interesting about the search terms people use to find Orient Lodge, so I thought I would create another fun image. This is a Graphviz undirected graph of the top keywords as reported by Google Analytics.



keywords4, originally uploaded by Aldon.

Here is what I did. First, I went into Google Analytics, opened Traffic Sources, and then Keywords. I took the top 100 keywords and saved them to a file. I did a little hand editing to strip out some noise and to adjust for cases where a single search contained many keywords. Then I ran the file through a very simple ‘awk’ script.

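{
    # split the line on whitespace; n is the number of words found
    n = split($0, a, " ")
    # a line needs at least two words to form an edge
    if (n >= 2)
        print a[1] " -- " a[2]
}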

This checks each line for at least two keywords and prints the first two with ‘--’ between them, the way Graphviz wants for an undirected graph. I added header and footer information and ran the resulting dot file through neato. To be perfectly fair, I did look at the results and then did a little final hand tweaking of the dot file to produce a graph that was a little cleaner.
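For anyone who wants to try this, here is a rough sketch of how the header, the awk step, and the footer might be glued together in one go. The function name make_dot and the sample keywords are made up for illustration, and quoting the node names is my own assumption so that terms with punctuation don't trip up Graphviz; it is not necessarily what the original dot file looked like.

```shell
# Sketch only: make_dot is a hypothetical helper, not part of the original post.
# Reads keyword phrases on stdin (one search per line) and writes a
# Graphviz undirected graph to stdout.
make_dot() {
  # Header: "graph" (rather than "digraph") declares an undirected graph.
  echo 'graph keywords {'
  # Link the first two words of every line that has at least two words.
  # Node names are double-quoted (an assumption) to tolerate punctuation.
  awk '{
    n = split($0, a, " ")
    if (n >= 2)
      print "  \"" a[1] "\" -- \"" a[2] "\";"
  }'
  # Footer: close the graph block.
  echo '}'
}

# Made-up sample input, just to show the shape of the data.
# The single-word line "excel" produces no edge.
printf '%s\n' 'graphviz tutorial' 'matlab plot' 'excel' | make_dot
```

With a real keyword file (say, a hypothetical keywords.txt), something like `make_dot < keywords.txt | neato -Tpng -o keywords.png` should produce the image, assuming Graphviz is installed.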

The results were not unexpected. I’ve looked at the keywords page on Google Analytics from time to time and have a good sense of what people are looking for. Graphviz has always been a big search term, as have Matlab and Excel, and now the N900. Other topics, like the Connecticut gubernatorial race or making hard cider, also show up often.

It was a fun little project. If you’re a blogger, I’d encourage you to look at what brings people to your site and how the keywords are related. If you’re at all geeky, try creating a similar graph and let me know about it.

What do you think? Interesting? Got any other ideas for fun graphs?
