A simple place

I am predisposed to overfitting. I think most people who like to work with computers, who like to hack, or make, or do things themselves, have a predisposition to overfitting.

Overfitting is a term of art from statistics and machine learning. It describes what happens when you don’t constrain how complicated or flexible your solution to a problem can get: you come up with a solution that perfectly fits all your requirements as they exist today, but the perfect fit comes with a tradeoff. Your solution is specific and complicated, and it doesn’t generalize well to unseen future requirements. You didn’t fit, you overfit.
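To make the definition concrete, here is a minimal sketch in plain Python. The data points are made up for illustration, and the two models (a "memorizer" that returns the nearest training answer, and a straight line) are stand-ins I picked for "flexible" and "constrained"; nothing here comes from a real dataset or library.

```python
# Made-up training data: the true trend is y = 2x, with noise of +1 or -1.
train = [(0, 1), (1, 1), (2, 5), (3, 5), (4, 9), (5, 9)]
# Made-up unseen data: noise-free points on the same trend.
test = [(0.4, 0.8), (1.4, 2.8), (2.4, 4.8), (3.4, 6.8), (4.4, 8.8)]


def memorizer(x):
    """Flexible model: return the y of the nearest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def fit_line(points):
    """Constrained model: an ordinary least-squares straight line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, points):
    """Mean squared error of a model over a list of (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


line = fit_line(train)
for name, model in [("memorizer", memorizer), ("line", line)]:
    print(name, "train:", mse(model, train), "test:", mse(model, test))
```

The memorizer fits today's requirements perfectly (zero error on the training points) and does worse than the line on the unseen points. The line never fits perfectly, but it carries over to new data much better.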

For example, you try to write an essay and the only editor you have is Microsoft Word. You start typing but it just doesn’t feel right. You search for better tools for writing essays and find someone recommending Bear, so you download it, and you write your essay and it works out great. Next week you need to write a long personal email. You start in Bear, but it doesn’t feel good for an email. You look around and find people recommending Ommwriter, so you download it and write your email and it works out great.

This is fine if from that point onwards you will only need to write essays and personal emails, and if you will need to write them the same way you wrote them before. But nothing ever stays the same and the future always differs from the past. If you will accept nothing less than a perfect writing experience for each new writing task, you will never stop looking for new tools.

Then you look at your writing toolbox and it’s a mess. Instead of having a couple of tools that work okay for most tasks, you have a hundred tools and they’re all tuned to work for very specific tasks.

Regularization is another term of art, and it refers to intentionally constraining your solution to keep it uncomplicated and general. In the writing tools example, you can regularize by setting a rule that says “I will not have more than four writing apps installed at any given time”, or “each writing app I install has to work very well for at least three different writing tasks”.

These constraints will naturally stop you from installing your 10th writing app, and will encourage you to use tools you already have instead of getting new ones. The cost of the constraints is that you’ll always feel like the tool isn’t a perfect fit for what you’re doing right now.

Regularization changes your behavior only when acquiring a tool or modifying your solution would cost more than it gains, which usually happens after your solution has already grown quite a bit. A critical point in this framework is that early modifications lead to large gains, and later modifications have diminishing returns. When you only had Microsoft Word and installed Bear, you gained a lot! They do very different things and the universe of tasks you can tackle increased massively. But by the 10th app, your gain, if it exists at all, is tiny.

Regularization can be tuned! Your rules can be restrictive, such as implementing a one-in, one-out system, e.g., “I won’t use a new tool without getting rid of an old one”. Or, they can be more relaxed: “I won’t use a new tool without trying to do without it for three days, and if I still have an unsolved problem, it’s time to complicate my solution a little bit more”.
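The cost-versus-gain idea, and the way the strictness of a rule can be tuned, can be sketched as a small Python toy. This is a loose analogy and not a real machine learning regularizer: the function `tools_to_keep`, the halving gains, and the penalty values are all hypothetical numbers chosen for illustration.

```python
def tools_to_keep(gains, penalty):
    """Count how many tools are worth adopting when each one costs `penalty`.

    `gains` lists the benefit of the 1st, 2nd, 3rd... tool, in the order
    you would acquire them. Stop at the first tool that is not worth its cost.
    """
    kept = 0
    for gain in gains:
        if gain <= penalty:
            break
        kept += 1
    return kept


# Hypothetical diminishing returns: each new app helps half as much as the last.
gains = [100 / 2**i for i in range(10)]  # 100, 50, 25, 12.5, ...

strict = tools_to_keep(gains, penalty=20)        # restrictive rule
relaxed = tools_to_keep(gains, penalty=2)        # more relaxed rule
unregularized = tools_to_keep(gains, penalty=0)  # no rule at all

print(strict, relaxed, unregularized)
```

With no penalty every app gets installed, because every app helps a little. Raising the penalty leaves the early, high-gain tools alone and only cuts off the long tail.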

Overfitting is a very common problem, both in machine learning and in life. The writing tools example comes from personal experience. I use Neovim as my main editor, which sounds like a nice regularization, until you see that my configuration file for it is 520 lines long. And I said it was my main editor, not my only one.

Another personal example is how frequently I tinker with and modify this site’s style and structure. In the past I rarely wrote a post without modifying some feature to look or work better. Every small nit I noticed, I addressed immediately. This took time, and meant things were never consistent. The time I spent tinkering could have been spent getting better at writing.

To regularize is to impose constraints that keep your life simple at the cost of a small amount of constant discomfort. It also helps you focus on what really matters. Does it really matter that the site title doesn’t align perfectly with the post under it? And does it matter more than sweating out an idea I have in my head or a draft that I’ve been struggling with?

The people whose writing and work I find the most valuable have had the same site, the same gear, and the same tools, forever. Maybe regularization comes naturally to them, and they have an ingrained understanding that spending time producing is better than spending time improving tools or tinkering with visuals. Or maybe they had to be intentional about it and fight their own predisposition to overfit. Either way, regularization is one of my yearly themes and it’s already working very well.
