Thursday, December 21, 2006

The Hidden Cost of Delaying Bug Fixes

I recently completed Certified Scrum Master training with Mishkin Berteig. One of the many interesting tidbits that came up over the course of the weekend was Mishkin's preference for treating every bug as the team's #1 priority as soon as it is discovered. (This is similar to the stop-the-line mentality of lean development.) I found this interesting, because it goes against the Scrum practice of putting decisions about what the team should work on in the hands of the Product Owner.

I also disagree with giving the Product Owner total control of what to work on, because I believe developers are in a better position to appreciate the real impact of bugs on the product, and it's best to put decisions in the hands of the people best able to understand the consequences. If the development team thinks a bug needs to be fixed pronto, it should be fixed pronto.

Unfortunately, even amongst development teams, there is not always consensus about the real cost of letting bugs linger. Many developers don't mind living with bugs while they continue to pump out new features, figuring they will get back to them "later". In the hopes of changing this situation, I'd like to talk about a hidden cost of leaving bugs in the code.

The Hidden Cost

Of course, there is the standard argument that the later a bug is fixed, the more expensive it is to fix. I'm going to assume that everyone has heard that argument before, and more-or-less agrees with it. I want to talk about something a little more insidious. The timing of a bug fix affects not only the cost of fixing it, but also the way you fix it.

As development work continues in the presence of a known bug, new dependencies on the code where the bug lives are likely to be added. Sometimes these dependencies will be on the bug itself. Often, this happens without the developers realizing it. In general, as a piece of code acquires dependencies, it becomes riskier to change. I think this is even more true of bug-ridden code.
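To make the idea of a dependency on the bug itself concrete, here is a minimal sketch in Python. The function names and the off-by-one bug are hypothetical, invented purely for illustration; the point is only that a caller written while the bug was live can end up relying on the wrong answer.

```python
from datetime import date


def days_between(start, end):
    """Hypothetical helper with a known bug: it counts both endpoints,
    so the result is always one day too high."""
    return (end - start).days + 1


def days_between_fixed(start, end):
    """What the helper was supposed to do all along."""
    return (end - start).days


def nights_in_stay(check_in, check_out, days_fn=days_between):
    """Written later, while the bug was still in place. The author saw
    results that were 'off by one' and compensated here, so this caller
    now depends on the bug."""
    return days_fn(check_in, check_out) - 1


check_in, check_out = date(2006, 12, 21), date(2006, 12, 24)

# With the buggy helper, the compensation hides the problem.
print(nights_in_stay(check_in, check_out))

# Fix the helper without touching the caller, and the caller breaks.
print(nights_in_stay(check_in, check_out, days_fn=days_between_fixed))
```

The first call gives the right number of nights and the second does not, even though the second uses the corrected helper. Every caller like this one is another place that has to be found and changed before the real fix is safe.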

So, the longer you wait before fixing the bug, the more risk is introduced in fixing it. When the risk of fixing a bug increases, the reaction of most developers is to stick with the potential fix that appears least risky (even if there is a 'better' alternative).

A Brief Aside

Okay, I've used a dangerous word: 'better'. Anytime someone uses evaluative words like 'good', 'bad', 'better', or 'worse' without qualifying them with a context, I get suspicious. So here's my context:

Kent Beck makes a distinction between code quality and code health in this talk. I think this is an important distinction to make. Code quality relates to the number of bugs in your code, and how well it does what it's supposed to do. Code health refers to your code's ability to react to change. It relates to how well-factored your code is. You can just call it maintainability if you want.

The reason this distinction is important is that it's possible to improve code quality without improving code health. In fact, it's quite easy to improve code quality while reducing code health. Since code health is extremely desirable, I would assert that a fix that improves code quality and code health is better than one that improves code quality but reduces code health.
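As a rough illustration of that difference, here are two fixes for the same hypothetical off-by-one bug, sketched in Python. All of the names are made up for the example. Both fixes produce the same correct output, so both improve quality; only one leaves the code easier to change.

```python
from datetime import date


def days_between(start, end):
    """Hypothetical helper with a known bug: the result is one too high."""
    return (end - start).days + 1


# Fix A (quick hack): leave the helper alone and patch the symptom at the
# call site that was reported. The output is now right, but the helper is
# still wrong and the unexplained "- 1" is a trap for the next reader.
def nights_hack(check_in, check_out):
    return days_between(check_in, check_out) - 1


# Fix B: correct the helper itself, so callers need no compensation.
def days_between_fixed(start, end):
    return (end - start).days


def nights_healthy(check_in, check_out):
    return days_between_fixed(check_in, check_out)


check_in, check_out = date(2006, 12, 21), date(2006, 12, 24)

# Same answer either way: equal quality, unequal health.
print(nights_hack(check_in, check_out) == nights_healthy(check_in, check_out))
```

Fix A touches less code, which is exactly why it looks safer. Fix B is the one I would call better in the sense defined above.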

Back to Our Regularly Scheduled Program

Now we're getting down to the point. When you put off a bug fix, you are going to feel pressure to go with the (perceived) less risky solution later (the quick hack that fixes the bug but reduces code health). And let's face it, most of us are not good at dealing with pressure. If we were, we would probably insist on fixing the bug now rather than later in the first place.

So not only are you going to take longer to fix the bug later, you're going to take on technical debt, and all the pain and suffering (and reduced velocity) that technical debt entails.