Beemer (dr_tectonic) wrote,

Jargon at work

I had a pleasantly geeky day at work, and I don't really expect anyone to care or even necessarily to follow, but I just felt like I needed to write it down. Maybe as an explanation for why, when people ask what I did at work, I usually respond: "Oh, y'know, stuff."

Currently, I'm working on cleaning up the swiki for our game so that I'm comfortable having strangers (that is to say, potential users) look at it when they want to find out more about the game and hopefully download it.

It's gotten kind of tangled, and I was having a hard time figuring out what was where and how I should move some things around. So I decided I needed to be able to look at the link structure, to see how it was organized.

I started by diagramming things on paper, and that lasted for all of about a minute before I decided it was too much work. I am a geek. I have a computer. It should do the work for me.

So first I grab the data, thinking I might get something out of just eyeballing it. A quick command-line loop to pull down all the pages with wget. Then I extract the titles into one directory and iterate on a few Perl one-liners to parse the link targets out of the href attributes into another directory. (It takes a few tries to discover all the things you want to exclude.)

(perl -ne 'print "$1\n" while m|href="/dd/(\d+)|g' $i | sort | uniq > linx/$i, not that you actually care -- but I wrote it down as I went, because I might want to do it again someday. The while is there so that a line with several links on it yields all of them, not just the first.)
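For anyone who'd rather not squint at Perl, here is a rough Python sketch of the same extraction step. It assumes, like the one-liner, that internal links look like href="/dd/<number>"; the function name extract_links and the sample HTML are made up for illustration.

```python
import re

# Same pattern as the one-liner: capture the numeric page id after /dd/.
LINK_RE = re.compile(r'href="/dd/(\d+)')

def extract_links(html):
    """Return the unique page ids linked from one page, sorted as strings."""
    # findall picks up every match, including several on the same line;
    # set() drops duplicates, much like piping through sort | uniq.
    return sorted(set(LINK_RE.findall(html)))

if __name__ == "__main__":
    # Hypothetical page content, only here to show the call.
    sample = '<a href="/dd/12">A</a> <a href="/dd/7">B</a> <a href="/dd/12">A again</a>'
    print(extract_links(sample))
```

The ids are sorted as text rather than as numbers, which is also what plain sort does in the shell pipeline.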

Hunh. Can't tell too much by looking -- just that there are a lot of leaf nodes and a few hubs (quelle surprise) so it won't be too bad to clean up. So now I look for visualizers on the web. Touchgraph will show me link structure relative to other websites, but not the internals, so that's no good. I find another graph visualizer that'll do it, though. Not a very good graph visualizer, mind you, but a free one designed for exactly this kind of thing.
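The "lots of leaves, a few hubs" impression can also be checked with a quick count instead of by eye. This is only a sketch under some assumptions: links is a hypothetical dict mapping each page id to the ids it links to, a leaf is taken to mean a page with no outgoing links, and the hub threshold is an arbitrary number picked for the example.

```python
def leaves_and_hubs(links, hub_threshold=5):
    """Split pages into leaves (no outgoing links) and hubs (many outgoing links)."""
    # Collect every page that appears either as a source or as a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    # Out-degree counts distinct targets; pages never seen as a source get 0.
    out_degree = {p: len(set(links.get(p, ()))) for p in pages}
    leaves = sorted(p for p, d in out_degree.items() if d == 0)
    hubs = sorted(p for p, d in out_degree.items() if d >= hub_threshold)
    return leaves, hubs

if __name__ == "__main__":
    # Made-up link data, not the real swiki.
    demo = {"1": ["2", "3", "4"], "2": [], "3": ["1"]}
    print(leaves_and_hubs(demo, hub_threshold=3))
```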

Download it, unpack it, run it, dear lord it actually runs! Keen. Feed it the URL, but nope, it doesn't really like that. Well, what kind of datafile will it read? Poke around and find a reference webpage talking about XGMML files, which seem to be what it wants. Figure out the format. Easy enough -- throw around a few more one-line scripts to cut and paste together an XML file, feed it to the program, and voilà. A graph.
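The cut-and-paste step amounts to wrapping the link lists in node and edge elements. Below is a minimal Python sketch of that idea, not the scripts actually used. The element and attribute names (graph, node, edge, id, label, source, target) and the namespace URI follow the XGMML format as best I recall it, so treat them as assumptions and check them against whatever your visualizer accepts; build_xgmml and the links dict are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed XGMML namespace; verify against the tool's own sample files.
XGMML_NS = "http://www.cs.rpi.edu/XGMML"

def build_xgmml(links, label="swiki"):
    """Turn {page id: [linked page ids]} into an XGMML-style XML string."""
    graph = ET.Element("graph", {"label": label, "directed": "1", "xmlns": XGMML_NS})
    # One node per page, whether it shows up as a source or only as a target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    for page in pages:
        ET.SubElement(graph, "node", {"id": page, "label": page})
    # One edge per distinct link.
    for source in sorted(links):
        for target in sorted(set(links[source])):
            ET.SubElement(graph, "edge", {"source": source, "target": target})
    return ET.tostring(graph, encoding="unicode")

if __name__ == "__main__":
    print(build_xgmml({"1": ["2", "3"], "2": ["1"]}))
```

Here the page id doubles as the label; swapping in the titles extracted earlier would make the picture easier to read.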

Okay, it's really NOT a very good visualizer. Like, it sucks a lot. But it shows me some useful stuff -- these pages cluster together, and those twenty things are all part of that blob, okay, enough to tell that this is actually useful information. But not enough, because the UI reeeeeally sucks and worse, it's crashy.

But I have this random little Processing app that I fiddled with a while back that'll animate a graph. I can adapt it pretty quick. So I boot up Processing -- oh, hey, there's a new version, install that, update the code to post-beta so it runs, twiddle it to work with the new data (which is a graph and not a tree), teach it how to read an XML file, and hey presto, the picture that I actually wanted.
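The "graph and not a tree" part matters because pages can link back to each other, so a parent/children structure won't hold the data. The real app was written in Processing; what follows is just a Python sketch of the reading step, assuming an XGMML-like file with node elements carrying id and label and edge elements carrying source and target. The name read_graph is made up.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]

def read_graph(xml_text):
    """Parse node/edge elements into (labels by id, neighbor sets by id)."""
    root = ET.fromstring(xml_text)
    labels = {}
    neighbors = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name == "node":
            node_id = element.get("id")
            labels[node_id] = element.get("label", node_id)
            neighbors.setdefault(node_id, set())
        elif name == "edge":
            source, target = element.get("source"), element.get("target")
            # Store each link in both directions: fine for drawing,
            # and cycles are no problem for a dict of sets.
            neighbors.setdefault(source, set()).add(target)
            neighbors.setdefault(target, set()).add(source)
    return labels, neighbors

if __name__ == "__main__":
    # Tiny made-up file with a cycle, which a tree could not represent.
    sample = ('<graph label="demo"><node id="1" label="Home"/>'
              '<node id="2" label="FAQ"/><edge source="1" target="2"/>'
              '<edge source="2" target="1"/></graph>')
    print(read_graph(sample))
```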

And now, I can fiddle with it long enough to determine that yes, indeed, this stuff is over here, and that stuff is over there, and okay, that means that these things are all safely stuffed into that closet and not spilling out into the bedroom. So all I need to do is clean up this area and that area, tuck those things out of sight behind that set of links, update these bits, and it'll be, well, passable, at least.

So now I know where things are. I have about a page or so of notes on what to fix on the main page, I know what to update and what to archive, and I did a bunch of it before I went home for the day. Hooray! I got something accomplished!