By Jack Ganssle

Refactoring

Published 12/11/2003

"An architect's most useful tools are an eraser at the drafting board, and a wrecking bar at the site." - Frank Lloyd Wright.

Unfortunately, the definition of refactoring has been obscured by hearsay and casual use by folks who haven't read the original papers (for a good treatise see the c2 Wiki at http://c2.com/cgi/wiki?WhatIsRefactoring). Some use it to mean iterative evolution: changing the code to add new functions. Others see it as bug-fixing. Neither is accurate.

Refactoring is simplifying the code without changing its external behavior. Its purpose isn't to add functionality or fix bugs; rather, we refactor to improve the understandability (and hence maintainability) and/or structure of the code.
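To make that definition concrete, here is a minimal sketch of a behavior-preserving change. The function names and the checksum task are hypothetical, invented only for illustration, and the equivalence assumes every input element is a byte value (0 to 255).

```python
def checksum_before(data):
    # Original version: it works, but the intent is buried in bookkeeping.
    total = 0
    i = 0
    while i < len(data):
        total = total + data[i]
        if total > 255:
            total = total - 256
        i = i + 1
    return total


def checksum_after(data):
    # Refactored version: same result for any sequence of byte values,
    # with the intent (sum modulo 256) stated directly.
    return sum(data) % 256


# The external behavior must not change, so compare both versions
# on the same (hypothetical) sample inputs.
samples = [b"", b"\x01\x02\x03", bytes(range(256)), b"\xff" * 10]
for sample in samples:
    assert checksum_before(sample) == checksum_after(sample)
```

No feature was added and no bug was fixed; the only gain is that the second version is easier to read and harder to break. The comparison loop is the kind of check that lets you refactor without the sweats.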

XPers make the interesting point that code duplication is one clear indication refactoring is needed. I go a bit further, arguing that wise developers count bugs and recode functions that have high error rates. That might involve refactoring (usually an incremental approach) or tossing bad routines and recoding from scratch.
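Counting bugs per function need not be elaborate. The sketch below assumes a hypothetical bug log in which each entry is simply the name of the function where a defect was found; the function names and the threshold of three are illustrative assumptions, not recommended values.

```python
from collections import Counter


def refactoring_candidates(bug_log, threshold=3):
    """Return names of functions with at least `threshold` logged bugs, worst first."""
    counts = Counter(bug_log)
    return [name for name, n in counts.most_common() if n >= threshold]


# Hypothetical log: one entry per defect, naming the function at fault.
bug_log = [
    "parse_packet", "init_uart", "parse_packet", "parse_packet",
    "update_display", "parse_packet", "init_uart",
]

print(refactoring_candidates(bug_log))
```

With this sample log only parse_packet crosses the threshold, which makes it the routine to refactor or recode first.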

Refactoring is not a new concept. In his 1981 book "Software Engineering Economics" (http://www.amazon.com/exec/obidos/tg/detail/-/0138221227/qid=1071149694//ref=sr_8_xs_ap_i0_xgl14/103-3532738-2988661?v=glance&s=books&n=507846) Barry Boehm (http://sunset.usc.edu/Research_Group/barry.html) showed that a little bit of the code is responsible for most of the problems. He demonstrated that a lousy routine eats 4 times more development effort than any other function. When we're afraid to modify even a comment, when the slightest change breaks a function, when the thought of any edit gives us the sweats, that's a clear indication the code is no good and should be rewritten. The goal isn't to add features or fix bugs; it's to make the function maintainable.

The alternative is to try to beat cruddy code into submission, a never-ending battle as each "improvement" leaves the software an even more convoluted mess. The cheapest way to build great software is to refactor bad code, early.

The boss will freak if he hears you're tossing or restructuring working code. An old IBM story is illustrative: an engineer made a million-dollar mistake, back when a million dollars was a lot of money. He was summoned to the office of President Tom Watson. Quaking in fear, the engineer blurted: "I guess you'll be asking for my resignation." "Resignation," replied Watson incredulously, "I just put a million dollars into your education!"

Making a mess of a function is also an important educational experience. We may have screwed up, but we have learned a lot about what should have been done. Use the experience we've just gained to refactor the mess.

What's your take? Do you refactor at all or accept anything that more or less works? Or do you embrace XP's idea of refactoring everything that can be improved in any way?