Last night, I came across this question on StackOverflow, essentially asking how you can be a zero-bug programmer. Many of the responses can be boiled down to “Don’t write any code”, which is pretty much how I’d respond to that question.
If you continue to throw ever-increasing amounts of time and money at a software project, you can certainly get closer and closer to feeling confident you have no problems in your code, but almost any non-trivial piece of software is likely to have some (albeit probably rare and minor) bugs in it.
I don’t, however, think an appropriate response to this fact is to throw up our hands and say “oh well, what can you do?”. Bugs are a serious problem that pretty much every software developer faces – from those building small sites in PHP to engineers writing software to guide spacecraft. Bugs are embarrassing when they are your fault as the developer. They are costly for project teams to fix. They frustrate and inconvenience your users, cost them money, and may even kill them.
So, if it is impossible (or at least highly impractical) to create software that is guaranteed to have zero bugs, how can we at least minimize the number of bugs (especially common and/or serious ones)?
“We cannot solve our problems with the same thinking we used when we created them.” – Albert Einstein
Whenever talking about software quality (or lack thereof), the knee-jerk response is always “write more tests!” It always strikes me as a little amusing that we expect developers who are capable of writing software with bugs in it to somehow be capable of writing 100% comprehensive tests that are completely free of bugs themselves.
Tests are, of course, incredibly valuable and I don’t want to demean their value – indeed, they are useful beyond just testing for bugs, and I’ve found them helpful as supplemental documentation and as ways to improve my APIs. I do, however, think that blindly writing more tests as a solution to buggy software is misguided. To understand why that is, we must first understand how we get bugs in our software in the first place.
Where do bugs come from?
A lot of the time, we naively assume that a bug happened because a programmer did something wrong. Technically, this may be true, but it is neither useful nor interesting to think of it this simplistically. What are some more specific reasons?
- Programmer error. Perhaps the programmer did indeed screw up. Maybe it was a typo, or perhaps they didn’t understand a particular API and misused it. Even within this category, there are a bunch of useful things to consider – is it a lack of training? Not enough documentation for the APIs the developer is using? Hiring inexperienced programmers?
- Insufficient requirements. Often, specific pieces of functionality are incompletely and ambiguously defined. The code may be completely bug free for every situation the developer was told to think of, but there are holes in the requirements that don’t manifest themselves until later.
- Seams in functionality. The Agile methodology has, on the whole, been hugely positive for the software development industry. One negative aspect of how some teams implement it, however, is that while each user story (piece of functionality) may be completely bug free, the seams between those pieces of functionality are often messy, poorly tested, and buggy – partially owing to the fact that you may have had one developer and one QA resource working on one part, and a completely different pair working on the other. In my own personal experience, a disproportionate number of bugs appear in the seams between functionality.
- Third party libraries. Sometimes the code we have little or no control over contains bugs.
- External resources. Sometimes resources outside of the control of our program contain bugs.
I could go on, but hopefully I’ve given you the idea that a lot of factors cause or contribute to bugs in software products and that it is useful to try to think about these reasons.
How do we avoid making bugs?
So, knowing how we end up with bugs, how can we minimize them?
- Develop better, whether that means more training, more code reviews, or simply hiring better developers.
- More, and better, tests.
- Static analysis tools.
- Better requirements.
- More manual QA.
- Many, many more ways….
Hopefully as you read through my causes of bugs above (and thought of your own), you realized why I think tests aren’t necessarily the best method for dealing with buggy software in all cases. If your problem is that specifications are poor and incomplete, more tests cannot help you. If your programmers are making a lot of mistakes, they don’t need to write more tests – they need to write fewer bugs. And if most of your bugs manifest themselves in the seams of functionality, one functional test is probably going to be worth more than a thousand unit tests.
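To make that last point concrete, here is a minimal, hypothetical sketch (the function names and the cents-versus-dollars mismatch are invented for illustration): two features that each pass their own unit tests, and one functional test across the seam that catches what those unit tests never could.

```python
# Hypothetical sketch: two features built by different developer/QA pairs.
# Team A works in integer cents; Team B works in float dollars.

def apply_discount(price_cents: int, percent: int) -> int:
    """Team A's helper: expects a price in integer cents."""
    return price_cents - (price_cents * percent) // 100

def checkout(prices_dollars, discount_percent):
    """Team B's feature: sums float dollars, then hands the total to
    Team A's helper -- which expects cents. That mismatch is the seam."""
    return apply_discount(sum(prices_dollars), discount_percent)

# Each team's own unit tests pass in isolation:
assert apply_discount(1000, 10) == 900   # Team A: 10% off $10.00, in cents
assert checkout([10.00], 10) == 9.0      # Team B: round dollars hide the bug

# One functional test across the seam exposes it. Running this trips
# the assertion -- precisely the bug the unit tests missed.
total = checkout([19.99, 5.00], 10)
assert abs(total - 22.49) < 0.01, f"seam bug: expected ~22.49, got {total}"
```

Each pair shipped something that was “bug free” by its own tests; only a test that exercises both pieces together ever sees the problem.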
You can’t improve what you can’t measure
Back around 2000, I started a company to track the results of online advertising. Companies were spending ghastly amounts on advertising with little idea of which advertisements were making them a lot of sales and which ones were complete wastes. That company didn’t do incredibly well because I was 18, kind of an idiot, and had absolutely no money. I still, however, believe in the power of hard data and analytics to help us make better decisions and gauge our success. I also believe that these concepts can be applied to making better software. However, nobody – including myself – is bothering to track how bugs actually happen and how we can most effectively prevent them.
Indeed, I’ve used a number of high-quality open-source and commercial issue tracking tools. They are all really good at recording bugs, prioritizing them, tracking them as they get resolved, and helping us remember when we fixed them. But I’ve yet to see an issue tracking tool that actually tries to help you make better decisions about where to concentrate your future software quality efforts. This is unfortunate, because I think issue tracking tools are ideally situated to be the place where causes and possible solutions are recorded and measured.
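As a sketch of the kind of measurement I mean, suppose (hypothetically) your tracker let you tag each resolved bug with a root cause (programmer error, requirements, seam, third party, and so on) and export the results as a CSV. The file name and column name below are made up, but even a trivial tally like this would tell you where your next quality dollar is best spent:

```python
# Hypothetical: tally resolved bugs by their recorded root cause,
# from a CSV export with a "root_cause" column.
import csv
from collections import Counter

def cause_breakdown(path):
    """Count resolved bugs grouped by root-cause tag."""
    with open(path, newline="") as f:
        return Counter(row["root_cause"] for row in csv.DictReader(f))

# e.g. might print:  seam 41 / requirements 27 / programmer-error 12 ...
for cause, count in cause_breakdown("resolved_bugs.csv").most_common():
    print(f"{cause:20} {count:4}")
```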
“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” – John Wanamaker
Software projects spend a lot of money in an effort to ensure a quality product. Much like John Wanamaker with his advertising budget, I think projects waste a lot of that money without knowing which of their quality efforts are actually paying off.
Consider a product that currently has 50% test code coverage and too many bugs. A manager may say “We need to raise our test coverage to 75%!”, and the developers dutifully go off and spend a few weeks writing tests to get test coverage up to an acceptable level. At the end of the exercise, they have 75% test coverage – a 50% increase!
Has the quality of their product gone up by 50%? My experience tells me “No”. The reason is usually a combination of a few factors – lack of testing probably wasn’t the whole problem, and test coverage gets raised by testing the easiest (and probably least-buggy) stuff first. All of that time spent increasing code coverage was probably a waste, or at least not as effective as other efforts could have been – perhaps the requirements could have been documented more precisely, or rather than chasing a coverage target, the most problematic portions of the application could have been identified and tested more thoroughly.
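Identifying those problematic portions doesn’t have to be guesswork, either. Here is one rough, hypothetical approach: count which files are touched most often by bug-fix commits, assuming your fix commits can be found by a keyword in the commit message (adjust the --grep to your own conventions):

```python
# Rough "hotspot" count: which files appear most often in fix commits?
import subprocess
from collections import Counter

log = subprocess.run(
    ["git", "log", "--grep=fix", "-i", "--name-only", "--pretty=format:"],
    capture_output=True, text=True, check=True,
).stdout

# Each non-blank line of the output is a file path from a fix commit.
hotspots = Counter(line for line in log.splitlines() if line.strip())
for path, fixes in hotspots.most_common(10):
    print(f"{fixes:4}  {path}")
```

A list like that is a far better place to aim your next few weeks of testing effort than an arbitrary coverage percentage.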
Conclusion
I have no product to sell, nor do I have a comprehensive solution to software quality to propose. I do hope that this post sparks some thought and discussion on how we can better analyze our past coding mistakes and figure out better, more efficient ways of preventing them in the future.