
By Jack Ganssle

Refactoring Mercilessly

Published 5/16/2006

Lately I'm getting a fair amount of email from developers complaining that proponents of agile methods are ruining their code. These correspondents generally see refactoring as the root of too much evil. A team member sees code that can be improved, so they furiously rewrite a function or a method, invariably breaking something else. Since programmers generally believe that all code written by other people is crap, the refactoring frenzy can consume vast amounts of time and inject tons of defects.

To refactor means we simplify the code without changing its external behavior. We're not adding functionality or fixing bugs; we're improving the maintainability, structure, or organization of the code to make it simpler or clearer.
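Here's a minimal sketch of what that means in C. The routine, the adc_read() driver call, and the scaling constants are all invented for illustration; the point is that the two versions are indistinguishable to a caller.

    #include <stdint.h>

    /* Hypothetical ADC driver call; invented for illustration */
    extern uint16_t adc_read(unsigned channel);

    /* Before: magic numbers and in-line clamping obscure the intent */
    uint16_t read_temp_old(void)
    {
        uint16_t raw = adc_read(3);
        if (raw > 4095) raw = 4095;
        return (uint16_t)((raw * 500UL) / 4095);
    }

    /* After: identical inputs and outputs, but the intent is explicit */
    #define ADC_MAX         4095u
    #define TEMP_FULL_SCALE 500UL   /* tenths of a degree C */

    static uint16_t clamp_adc(uint16_t raw)
    {
        return (raw > ADC_MAX) ? ADC_MAX : raw;
    }

    uint16_t read_temp(void)
    {
        return (uint16_t)((clamp_adc(adc_read(3)) * TEMP_FULL_SCALE) / ADC_MAX);
    }

For every possible raw reading the old and new versions return the same value. That sameness is the bar every refactoring must clear.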

eXtreme Programmers refactor mercilessly (http://c2.com/cgi/wiki?RefactorMercilessly), cleaning the code up on a continual basis. Since XPers evolve their programs, refactoring is a critical step in ensuring the code's integrity. Today's brilliant idea might look pretty dumb tomorrow.

I think XP's approach to refactoring is very wise. In his 1981 book Software Engineering Economics, Barry Boehm demonstrated that most of our code is actually pretty decent, but that defects cluster in a small percentage of the functions. He showed that these lousy routines consume four times more engineering effort than the better-behaved ones. That academic result matches my real-life experience: when there's a function that everyone is afraid to edit, or that only Joe dares touch because it's so fragile, that routine needs to be fixed. We often try to beat the thing into submission, but that's as futile as Sisyphus rolling his rock up the hill. That code must be refactored.

Boehm showed it actually saves money to identify these buggy functions and take action early.

But refactoring, done incorrectly, is guaranteed to break things. The XP folks correctly link it to other practices, one of which is testing. In XP one constructs tests before, or in parallel with, writing the actual code. Those tests must be comprehensive, must run quickly, and must be automated. Change something and run the tests to prove nothing broke. Extreme tests make refactoring safe.
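Here's the flavor of such a test, sketched with plain assert() and an invented sat_add() routine. A real project would more likely use a harness like Unity or CppUTest, but even this bare-bones form gives a refactorer a regression net that runs in milliseconds.

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Function under test: hypothetical saturating add from a sensor filter */
    static uint8_t sat_add(uint8_t a, uint8_t b)
    {
        uint16_t sum = (uint16_t)a + b;
        return (sum > 255u) ? 255u : (uint8_t)sum;
    }

    int main(void)
    {
        assert(sat_add(0, 0) == 0);        /* identity */
        assert(sat_add(100, 55) == 155);   /* normal case */
        assert(sat_add(200, 100) == 255);  /* saturates at the rail */
        assert(sat_add(255, 255) == 255);  /* worst case */
        puts("all tests pass");
        return 0;
    }

Rework sat_add's innards however you like; if this still prints "all tests pass," the refactoring preserved the external behavior.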

Automated tests are tough to implement in an embedded system when someone must watch LCDs or press buttons. One option is to write simulations of those devices, either from scratch or by using a framework (e.g., http://www.agilerules.com/projects/catsrunner/index.phtml). Another is to debug completely in a virtual environment (e.g., http://virtuatech.com/).
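One common way to make that work, sketched below with invented names: route all display output through a small interface, then let the test build plug in a fake that records what would have appeared on the LCD.

    #include <stdio.h>

    /* Hypothetical display interface; the names are invented for illustration */
    struct display {
        void (*write_line)(const char *text);
    };

    /* The production implementation would talk to the real LCD driver.
       The simulated one just captures output so a test can check it.  */
    static char last_line[41];
    static void sim_write_line(const char *text)
    {
        snprintf(last_line, sizeof last_line, "%s", text);
    }

    static const struct display sim_display = { sim_write_line };

    /* Code under test renders through the interface, never the hardware */
    static void show_alarm(const struct display *d, int code)
    {
        char buf[41];
        snprintf(buf, sizeof buf, "ALARM %d", code);
        d->write_line(buf);
    }

    int main(void)
    {
        show_alarm(&sim_display, 7);
        /* An automated test can now assert on last_line instead of
           requiring a human to watch the LCD. */
        printf("captured: %s\n", last_line);
        return 0;
    }

The same trick works for buttons: a simulated input source feeds scripted presses to the code under test, so the whole regression suite runs unattended.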

Refactoring is also done in small steps, since a big rewrite tends to introduce many bugs. Clean something up, test it, and move on.

But refactoring without a rock-solid test platform invites catastrophe.

So my advice is mixed. Measure bug rates and refactor routines that repeatedly cause problems - you'll save money. But if your tests are weak, refactor warily. On some systems refactoring is a Bad Idea in general, since regulatory requirements may mandate hugely expensive re-verifications of the product.

I think the real issue is testing. A great suite of tests opens a world of software development options and risk-taking. But today the technology available for testing embedded apps, and our discipline in focusing on test in the real or simulated world, lag behind the IT business.

What's your view? Do you refactor mercilessly? If so, how do you ensure the changes don't break something?