Introducing git-tracker
12 Dec 2014
the self isn’t a historical fiction or a cultural construct
or a linguistic hallucination;
the self is a creature
and it lives in a burrow
under the hillside of history
Tony Hoagland, "Still Life"
And The World Needs This Because...
I have long struggled with keeping and tracking personal data metrics, despite (and perhaps because of) the glut of hardware and software metrics products. It seems like obtaining meaningful statistics about one's overall activities should be easier. For some reason, it is not.
Git is a natural candidate for promoting metrics tracking, since it is so prevalent in the programming world. Furthermore, all of my writing is under git control, including my journals and to-do lists.
Currently, I track my git usage using git_stats, and of course there is the all-pervasive and public GitHub heatmap (available on a user's "view profile" page). But these tools are of limited utility. git_stats has a whole lot of irrelevant statistics, and not all of my projects are on GitHub.
A Three-Pronged Attack
I enjoy reading articles about the quantified self, and it seems like most people end up implementing their own tools and workflow to accommodate their personal quirks and idiosyncratic practices. Clearly, I am no different in that regard, since I am designing my own data metrics tools.
git-tracker is the first step in my data gathering efforts. I realized recently that my efforts were a bit too ad hoc to allow for consistent and regular data collection. So I took a step back and defined three broad categories I wanted to capture: projects, health, and communication. All of them have quite different collection needs.
Projects, fortunately, is by far the easiest category to tackle. As I have alluded to above, everything is in git. Tracking health metrics would involve a collection of online services and personal notes (such as records of rock-climbing attempts). Tracking communication habits necessitates the use of passive monitoring technology such as the stealth SelfSpy.
Oh, The World I Dream Of
In an ideal world, my statistics would be hooked up as raw data to play around with in IPython Notebooks (shades of Mathematica and Matlab). Pandas is simply awesome, although I am itching to try out the IJulia kernel. In any case, it should be easy to create graphs of all kinds and to share them (to a limited extent).
And of course, in this Utopia, all of my data should be easily accessible and under my total control, with permission controls and locks, should my health insurance want this information. God forbid I need multiple logins to multiple services using OAuth1 and OAuth2 to retrieve scattered data components. Assuming, of course, that the data is even available.
Reality Is A Harsh Taskmistress
The basic idea of git-tracker is to track progress and projects based on 'activity' (and not some misleading metric such as lines of code or commits). For my personal projects, 'activity' once per day is sufficient; as a friend often says, "If we do not take a step every day, the journey can become effectively infinite." For my writing endeavors, measuring lines is indeed a valuable metric. For work-related projects, it is more interesting to see how productive and effective I am in a given language and domain.
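To make the once-per-day notion of 'activity' concrete, here is a minimal Python sketch. It is not git-tracker's actual code: the function names (commit_dates, active_days, activity_ratio) are hypothetical, and it assumes that "activity" simply means at least one commit on a given day, read from the output of git log.

```python
import subprocess
from datetime import date


def commit_dates(repo_path):
    """Return one ISO date string (YYYY-MM-DD) per commit in the repository."""
    out = subprocess.run(
        ["git", "-C", repo_path, "log", "--date=short", "--pretty=format:%ad"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def active_days(date_strings):
    """Collapse commit dates into the set of days with at least one commit."""
    return {date.fromisoformat(s) for s in date_strings}


def activity_ratio(days, start, end):
    """Fraction of days in [start, end] (inclusive) that saw any activity."""
    total = (end - start).days + 1
    hits = sum(1 for d in days if start <= d <= end)
    return hits / total
```

Something like activity_ratio(active_days(commit_dates(".")), start, end) would then answer "on what share of days did I touch this project?", regardless of how many commits landed on any one day.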
While I could have displayed my data using any number of graphical plots, I chose only the two I thought were most useful: Heat Maps and Tree Maps. Heat maps are best suited for tracking patterned usage and tree maps are useful for visually partitioning hierarchical data.
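As a rough illustration of the data behind a calendar-style heat map, the sketch below arranges daily activity counts into a weekday-by-week grid, similar in spirit to the GitHub heatmap. The function name heatmap_grid and the grid layout (rows Monday to Sunday, one column per week) are assumptions made for this example, not a description of how git-tracker renders its maps.

```python
from datetime import date, timedelta


def heatmap_grid(counts, start, end):
    """Build a 7-row grid (Monday..Sunday) with one column per week.

    counts maps date -> activity count. Days inside [start, end] with no
    entry get 0; cells that fall outside [start, end] stay None.
    """
    # Align the first column to the Monday of the week containing start.
    first_monday = start - timedelta(days=start.weekday())
    weeks = (end - first_monday).days // 7 + 1
    grid = [[None] * weeks for _ in range(7)]
    day = start
    while day <= end:
        col = (day - first_monday).days // 7
        grid[day.weekday()][col] = counts.get(day, 0)
        day += timedelta(days=1)
    return grid
```

Each cell can then be mapped to a color intensity by whatever plotting tool is at hand; the None cells mark padding at the edges of the date range.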
Add a lightweight tag system to the mix, and you have git-tracker. It is still a work in progress as of this writing.