Imperfect measures of productivity
My task management system is very simple. I use t, which means it’s just a text file in which every row corresponds to a task. Done. I have direct access to the plain-text “database”, and this means I can have fun with it.
This system has helped me become a lot more productive over the last few months. I like numbers, and this system shows me a lot of numbers. But many systems will show you a lot of numbers. The difference is that, because of the plain-text file mentioned above, I can compute whatever numbers I want. I can see at a glance how many tasks are on the list, how many tasks are marked for today, and the total number of tasks I’ve completed.
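With one task per line, ordinary shell tools are enough to produce those counts. Here is a minimal sketch; the file names and the `@today` tag are assumptions for illustration, not necessarily t’s actual conventions:

```shell
# Throwaway task list for illustration; t's real file names and tag
# conventions may differ, so treat these as assumptions.
printf 'write tests @today\nsend email @today\nread chapter\n' > /tmp/tasks
printf 'fix bug\nupdate readme\n' > /tmp/tasks.done

total=$(grep -c '' /tmp/tasks)            # tasks still on the list
today=$(grep -c '@today' /tmp/tasks)      # tasks tagged for today
done_count=$(grep -c '' /tmp/tasks.done)  # tasks completed
echo "${total} on the list, ${today} for today, ${done_count} done"
```

Because the “database” is just lines of text, any `grep`-shaped question is one pipeline away.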
I can, for example, use my .zshrc file along with powerline to show me some numbers in my shell prompt:
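As a sketch of what such a prompt segment could look like without powerline (the file path and tag convention here are assumptions, and powerline’s own segments are configured differently):

```shell
# Hypothetical zsh prompt segment counting tasks in a plain-text list.
# ~/tasks and the '@today' tag are assumptions about the file layout.
task_counts() {
  local total today
  total=$(grep -c '' ~/tasks 2>/dev/null) || total=0
  today=$(grep -c '@today' ~/tasks 2>/dev/null) || today=0
  echo "${today}/${total}"
}

# Show e.g. "2/5" (today/total) at the left of the prompt.
setopt PROMPT_SUBST
PROMPT='[$(task_counts)] %~ %# '
```

`PROMPT_SUBST` makes zsh re-run the command substitution each time the prompt is drawn, so the counts stay current as tasks are crossed off.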
If I want to look at my longer-term trends, I have a script that automatically generates a 10-day moving mean of the number of tasks completed per day.
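A moving mean like that is cheap to compute from a file of per-day counts. Here is a minimal sketch in awk over made-up data; the real script and its input format are not shown in the post, so both are assumptions:

```shell
# Demo data: tasks completed per day, one count per line (oldest first).
printf '%s\n' 4 7 2 9 5 3 8 6 1 5 7 4 > /tmp/per_day

# 10-day moving mean: for each day, average the most recent 10 counts
# (or however many days exist so far, early on).
awk '{ buf[NR % 10] = $1
       n = (NR < 10) ? NR : 10
       s = 0; for (i in buf) s += buf[i]
       printf "day %d: %.1f\n", NR, s / n }' /tmp/per_day
```

The `buf[NR % 10]` indexing keeps a ring buffer of the last ten counts, so the script runs in one pass regardless of how long the history grows.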
But the numbers are not a great proxy for my productivity. They’re okay most of the time, but sometimes they are a really bad measure. A couple of days ago was one of those days in which the amount of work I had done mapped very poorly onto the number of tasks crossed off the list.
Anyone who has used task management systems knows why: tasks vary wildly in size. Completing one task (e.g., Send email about payment to research assistants) can take a minute. Another task (e.g., Read a chapter of Machine Learning for Hackers) will take longer. Still other tasks (e.g., Write tests for <insert name of project here>) can take anywhere from minutes to hours.
The day before yesterday was one of those days: a single task took up a large portion of my day. The task was to implement a new feature in a project that I am going to put up on GitHub pretty soon. I was productive. I spent many hours in focused work. I learned a lot as I worked. Pleasantly enough, my productivity is apparent in another metric: logged keystrokes.
There are some lessons to take from this:
- Operationalizing[^1] is hard.
- Increasing the number of variables you are using to measure a phenomenon is good. It allows you to perform sanity checks and helps you address suspicions you might have about any one variable.
- Your theory and reasoning can always trump any data.
- A measure doesn’t have to be perfect to be usable.
Those last two points are the main takeaways from this post. I just explained how one measure (the number of tasks I completed in a day) can sometimes be really bad. But I’m not going to stop using it, because it still has utility for me. I still like seeing that number go down during the day. I still like looking at automatically-generated plots of personal analytics. The numbers and the plots give me that extra boost to start a task that I am irrationally procrastinating on.
As for the problems with the measure: no single measure is perfect, and productivity is a really hard thing to operationalize. If my usage is consistent, the one-minute tasks and the six-hour tasks should balance each other out over the long run, so the long-term trends should still be informative.
When in doubt, I can look at other corroborating data like my keystrokes, although those, too, have their problems. I only count my keystrokes; I have no way to break down what those keystrokes were doing, or which application they were used in. Tons of keystrokes in a day can mean a lot of coding work, or they can mean it was a Messages- and Adium-heavy day.
There is one more lesson to take away from this:
No data can substitute for your thinking, reasoning, and theory. No data speaks for itself. Take no one’s word for it, and that includes data.
Oh yeah, and induction is false!
[^1]: “a process of defining the measurement of a phenomenon that is not directly measurable, though its existence is indicated by other phenomena.”