Stats [Mar. 30th, 2009|10:16 pm]
So my boss is interested in some stats on our data archive.

I have access to a SQL database that has details of every single download transaction. So I can definitely pull some stats together. The question is... what?

I mean, I can slice and dice the data in a million different ways. I can count number of users, total data volume downloaded, popular files... what else would be interesting?

Jerry suggested looking at the number of file downloads per user, which might show some interesting patterns.
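To make Jerry's idea concrete, here is a minimal sketch of a downloads-per-user count. The table name `downloads` and its columns (`user_id`, `file_id`, `downloaded_at`, `bytes`) are my own assumptions for illustration, not the real archive schema, and the in-memory SQLite database with made-up rows just stands in for the actual SQL server.

```python
import sqlite3
from collections import Counter


def demo_connection():
    """Build a tiny in-memory database with a hypothetical `downloads` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE downloads ("
        "user_id TEXT, file_id TEXT, downloaded_at TEXT, bytes INTEGER)"
    )
    # Made-up sample rows, purely for illustration.
    conn.executemany(
        "INSERT INTO downloads VALUES (?, ?, ?, ?)",
        [
            ("alice", "a.dat", "2009-03-01 09:00:00", 100),
            ("alice", "b.dat", "2009-03-01 09:05:00", 200),
            ("alice", "a.dat", "2009-03-02 10:00:00", 100),
            ("bob", "a.dat", "2009-03-03 14:00:00", 100),
            ("carol", "c.dat", "2009-03-04 16:00:00", 300),
        ],
    )
    return conn


def downloads_per_user(conn):
    """Return {user_id: number of download transactions}."""
    rows = conn.execute(
        "SELECT user_id, COUNT(*) FROM downloads GROUP BY user_id"
    ).fetchall()
    return dict(rows)


def distribution(per_user):
    """Return {download count: how many users have exactly that count}."""
    return dict(Counter(per_user.values()))


if __name__ == "__main__":
    per_user = downloads_per_user(demo_connection())
    print(per_user)
    print(distribution(per_user))
```

The second function is probably the more interesting one: it would show whether most users grab one file and leave, or whether a handful of heavy users account for most of the traffic.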

Any suggestions?

From: epinoid
2009-03-31 10:58 am (UTC)
Along the lines of popular files: look for ones that are popular over time versus in short bursts, and ones that are popular with a wide audience versus ones that a few people seem to revisit.
What is the geographical distribution of users?
Are there files that cluster together in time (especially thematically unrelated ones)?
Can you get to the user download patterns (e.g. searching versus browsing versus surveying)?
Are there time patterns of high and low activity? Daily, annually, over the life of the archive...
How soon after new content is added does someone access it? How long does it take to reach a "steady state" interest level?
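
Two of the questions above (activity by time of day, and how soon new content gets accessed) can be roughed out as follows. This is only a sketch under assumptions: the `downloads` and `files` tables, their column names, and the ISO-style timestamp strings are all hypothetical, and SQLite stands in for whatever database the archive really uses.

```python
import sqlite3


def demo_connection():
    """In-memory database with hypothetical `files` and `downloads` tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (file_id TEXT, added_at TEXT)")
    conn.execute(
        "CREATE TABLE downloads (user_id TEXT, file_id TEXT, downloaded_at TEXT)"
    )
    # Made-up sample rows, purely for illustration.
    conn.executemany(
        "INSERT INTO files VALUES (?, ?)",
        [
            ("a.dat", "2009-03-01 00:00:00"),
            ("b.dat", "2009-03-10 00:00:00"),
        ],
    )
    conn.executemany(
        "INSERT INTO downloads VALUES (?, ?, ?)",
        [
            ("alice", "a.dat", "2009-03-02 12:00:00"),
            ("bob", "a.dat", "2009-03-05 12:30:00"),
            ("carol", "b.dat", "2009-03-10 06:00:00"),
        ],
    )
    return conn


def downloads_by_hour(conn):
    """Return {hour of day (0-23): number of downloads in that hour}."""
    rows = conn.execute(
        "SELECT CAST(strftime('%H', downloaded_at) AS INTEGER) AS hour, COUNT(*) "
        "FROM downloads GROUP BY hour ORDER BY hour"
    ).fetchall()
    return dict(rows)


def days_to_first_download(conn):
    """Return {file_id: days between the file being added and its first download}."""
    rows = conn.execute(
        "SELECT f.file_id, "
        "julianday(MIN(d.downloaded_at)) - julianday(f.added_at) "
        "FROM files f JOIN downloads d ON d.file_id = f.file_id "
        "GROUP BY f.file_id, f.added_at"
    ).fetchall()
    return {file_id: round(days, 2) for file_id, days in rows}


if __name__ == "__main__":
    conn = demo_connection()
    print(downloads_by_hour(conn))
    print(days_to_first_download(conn))
```

Swapping `'%H'` for `'%w'` (day of week) or `'%m'` (month) in `downloads_by_hour` would give the weekly or annual view of the same activity pattern.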

Most of the time it is easier to choose the stats once you have specific questions to answer, so what might those be? If you are interested in stats based on users, try to think of all the ways a user might want to use the data and look at the database through those lenses. Pick the ones that seem likely to give you meaningful interpretations.

Good luck.
