Tools for Clean Code
Back when dinosaurs roamed the Earth most of our computer
work was on punched card mainframes. Some wag at my school programmed the
Fortran compiler to count error messages; if your program generated more than 50
compile-time errors it printed a big picture of Alfred E. Neuman with the
caption "This man never worries. But from the look of your program, you
should."
This bit of helpful advice embarrassed many till the
University's administrators discovered that folks were submitting random card
decks just to get the picture. Wasting computer time was a sin not easily
forgiven, so the systems people were instructed to remove the compiler's funny
but rude output. They, of course, simply buried the picture as a challenge to
our cleverness.
How times have changed! Not only do we no longer feed
punched cards to our PCs, but if only we got just 50 errors or warnings from a
compilation of new code!
I've long held the theory that the reason developers
don't ship code with syntax errors is that the compiler aborts without
producing an object file. Watch someone compiling. Warning messages fly off the
screen at seemingly the speed of light, all too often as disregarded as "no
tailgating" admonishments.
It blows my mind. Here's a tool almost shouting that the
code may be flawed. That assignment looks suspicious. Do you really want to use
a pointer that way?
With deaf ears we turn away, link and start debugging. Sure
enough, some of these potential problems create symptoms that we dutifully chase
down via debugging, the slowest possible way. Some of the flaws don't surface
till the customer starts using the product.
Even more horrifying are the folks who disable warnings, or
always run the compiler with the minimum level of error-checking. Sure, that
reduces output, but it's rather like tossing those unread nastygrams from the
IRS into the trash. Sooner or later you'll have to pay, and paying later
always costs more.
Why do I think warnings are critical program insights we
can't ignore?
Build a PC product and count on product lifecycles measured
in microseconds. Embedded systems, though, seem to last forever. That factory
controller might run for years or even decades before being replaced. Surely,
someone, sometime, will have to enhance or fix the firmware. In three or ten
years, when resurrecting the code for an emergency patch, how will that future
programmer respond to three hundred warnings screaming by? He won't know if
the system is supposed to compile so unhappily, or if it's something he did
wrong when setting up the development system from old media whose documentation
was lost.
Maintenance is a fact of life. If we're truly
professional software engineers, we must design systems that can be
maintained. Clean compiles and links are a crucial part of building applications
that can be opened and modified.
Did you know that naval ships have their wiring exposed,
hanging in trays from the overhead? Fact is, the electrical system needs routine
and non-routine maintenance. If the designers buried the cables in inaccessible
locations the ship would work, right out of the shipyard, but would be
un-maintainable; junk, a total design failure.
Working is not the sole measure of design success,
especially in firmware. Maintainability is just as important, and requires as
much attention.
Beyond maintenance, when we don't observe warnings we
risk developing the habit of ignoring them. Good habits form the veneer
of civilization. Dining alone? You still probably use utensils rather than
lapping it up canine-like. These habits mean we don't even have to think
about doing the right thing during dinner with that important date. The same
goes for most human endeavors.
The old saying "the way to write beautiful code is to
write beautiful code for twenty years" reflects the importance of developing
and nurturing good habits. Once we get in the so-easy-to-acquire habit of
ignoring warning messages we lose a lot of the diagnostic power of the compiler.
Of course spurious warnings are annoying. Deal with it. If
we spend 10 minutes going through the list and find just one that's suggestive
of a real problem, we'll save hours of debugging.
We can and should develop habits that eliminate all or most
spurious warnings. A vast number come from pushing the C standard too hard.
Stick with plain vanilla ANSI C with no tricks and no implied casts, so the
compiler is never left to make assumptions. The code might look boring, but
it's more portable and generally easier to maintain.
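One illustration of an implied conversion is comparing a signed value against an unsigned one. The following is a minimal sketch, not taken from any particular code base: the function names are made up for this example, and treating negative values as "below the limit" is an assumed policy.

```c
#include <assert.h>

/* Implicit form, the kind many compilers and lints warn about:
 *     return value < limit;
 * C converts value to unsigned before comparing. The function below
 * spells out that hidden conversion so the effect is visible. */
int below_limit_as_converted(int value, unsigned limit)
{
    return (unsigned)value < limit;
}

/* Explicit form: decide what a negative value means, then cast on purpose.
 * Treating every negative value as below the limit is an assumption made
 * for this sketch. */
int below_limit_explicit(int value, unsigned limit)
{
    if (value < 0) {
        return 1;
    }
    return (unsigned)value < limit;
}

/* Small check of both versions. */
void below_limit_demo(void)
{
    assert(below_limit_as_converted(5, 10u) == 1);
    assert(below_limit_as_converted(-1, 10u) == 0);  /* -1 became a huge unsigned value */
    assert(below_limit_explicit(5, 10u) == 1);
    assert(below_limit_explicit(-1, 10u) == 1);
}
```

The explicit version is longer and duller, but it leaves nothing for the compiler to guess and nothing for it to warn about.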
Did you know that, before we start debugging, the average chunk of code has
errors in 5 to 20% of its lines (reference 1)? That's 500
to 2000 bugs in a little 10,000 line program. My informal data, acquired from
talking to many, many developers but lacking a scientific base, suggests we
typically spend about half of the project time debugging. So anything we can do
to reduce bugs before starting debug pays off in huge ways.
We need a tool that creates more warnings, not
fewer. A tool that looks over the code and finds the obvious and obscure
constructs that might be a problem; that says "hey, better check this a little
closer; it looks odd."
Such a tool does exist and has been around practically
since the dawn of C. lint (named for the bits of fluff it picks from programs)
is like the compiler's syntax-checker on steroids. lint works with a huge base
of rules and points out structures that just seem weird. In my opinion, a lint
is an essential part of any developer's toolbox, and is the first weapon
against bugs. It will find problems much faster than debugging.
How is lint different from your compiler's syntax
checker? First, it has much stronger standards for language correctness than the
compiler. For instance, most lints track type definitions - as with typedef
- and flag possible type misuse once the underlying types are resolved and
used.
lint, unlike a compiler's syntax checker, is more aware
of a program's structure, so is better able to find possible infinite loops,
and unused return values. Will your compiler flag these as problems?
b[i]= i++;
status & 2 == 0
lint will.
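Why are those two lines suspicious? In C, == binds tighter than &, so the second one never tests the bit at all, and the first one both reads and modifies i with nothing to say which happens first, which is undefined behavior. Here is a minimal sketch of each problem and one possible repair. The function names and the STATUS_READY flag are invented for illustration, and the fix for the array store assumes the intent was "store the index, then advance".

```c
#include <assert.h>
#include <stddef.h>

#define STATUS_READY 0x02u  /* hypothetical flag bit, standing in for the "2" above */

/* What the compiler actually sees for  status & 2 == 0 :
 * the comparison 2 == 0 is evaluated first and yields 0,
 * so the whole expression is status & 0, which is always 0. */
int ready_bit_clear_as_parsed(unsigned status)
{
    return status & (2 == 0);
}

/* Probable intent: mask first, then compare. */
int ready_bit_clear(unsigned status)
{
    return (status & STATUS_READY) == 0;
}

/* b[i] = i++ reads i and modifies i without any ordering between the two.
 * Splitting the statement makes the order explicit (assumed intent:
 * store the current index, then advance). */
void fill_with_index(int *b, size_t n)
{
    size_t i = 0;

    assert(b != NULL || n == 0);
    while (i < n) {
        b[i] = (int)i;
        i++;
    }
}
```

With the parentheses in place, ready_bit_clear(0x00u) is 1 and ready_bit_clear(0x02u) is 0; the version the compiler parses returns 0 for every input.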
But much more powerfully, lints can look at how multiple C
files interact. Separate compilation is a wonderful tool for keeping information
hidden, to reduce file size, and to keep local things local. But it means that
the compiler's error checking is necessarily limited to just a single file. We
do use function prototypes, for instance, to help the compiler spot erroneous
use of external routines, but lint goes much further. It can flag inconsistent
definitions or usage across files, including libraries.
Especially in a large development project with many
programmers, lint is a quick way to find cross-file problems.
The downside to lint, though, is that it can be very noisy.
If you're used to ignoring a handful of warning messages from the compiler
then lint will drive you out of your mind. It's not unusual to get 30,000
messages from linting a 1000 line module.
The trick is to train the product. Every lint offers many
different configuration options aimed to tune it to your particular needs.
Success with lint - as with any tool - requires a certain amount of your
time. Up front, you'll lose productivity. There's a painful hump you'll
have to overcome before gaining its benefits.
Arrows or Machine Guns?
I'm sure you've seen the comic. A medieval battle rages
in the background. Arrows, catapults and boiling oil are the technological state
of the art (oh, for the days of less mechanized and efficient warfare!). A
salesman, machine gun in hand, is trying to get the general's attention, but
his aide-de-camp brushes him off, telling him that his boss is just too busy
fighting a war to deal with the intruder.
When I show this to developers they invariably shake their
heads with a mocking smile, wondering who could possibly be so short-sighted.
Sometimes you just have to stop for a bit to adopt a new technology or idea.
When I was a tool vendor my biggest frustration was that
customers used only the simplest features of our products; virtually none took
the time to learn the more powerful functions that would ultimately save them
lots of time. When I talk to tool vendors today they share the same complaint.
We're all very busy; impossible deadlines and unexpected
problems fill the days to overflowing. To stop and learn a new tool seems an
impossible demand on our time. Clearly it's insane to halt development every
time we hear about the next new thing. But we're in a dysfunctional
environment when such pressure never lets up.
I despair at times for our profession. So many developers
never get a chance to stop. When a project finishes it's invariably late, so
the next one is already behind schedule. We jump from one fire to the next. It
took 20 years for C to become common in embedded systems. Why? Maybe because
developers are too panicked to learn new things.
I have no solutions, other than to observe that sooner or
later your boss will die, be promoted, or move to sales (much like dying, I
suppose). Then you'll be in charge. Change will come if you use the painful
lessons and give your people a chance to pick up new ideas and learn better ways
to do their job.
Find some time to learn lint, and to tune it to your
application. When I talk to folks who use it, 9 out of 10 are wild about how it
has helped them be more productive.
Resources
Commercial and free lints abound; while all are similar
they do differ in the details of what gets checked and how one goes about
teaching the product to behave in a reasonable fashion.
Probably the most popular of all PC-hosted commercial lints
is the $239 version by Gimpel Software (www.gimpel.com).
This product has a huge user base and is very stable. It's a small price for
such fantastic diagnostic information, particularly in the embedded world
where compilers may cost many thousands of dollars.
LCLint is a freebie package whose C source code is also
available (http://lclint.cs.virginia.edu/).
Linux, other Unix, and PC distributions are all available.
Another factor in writing maintainable software is to
follow a consistent set of rules - a standard. The standard defines the
prettiness parameters (brace placement, indentation, etc), but goes far beyond
these superficial charms. The standard tells the team how to name variables,
format comments, limit function sizes and a host of other rules.
Prior to the metric system - a standardized system of
units and measures - scientists had trouble communicating in quantitative
terms. Each spoke a different dialect of science. We have the same sort of Babel
in the software community today; though C and C++ are standards, each of us
employs them in stylistically very different ways. Worse, most
of us switch styles at will so even a single module has no consistency.
But even in the best of cases, when we have and use a
software standard, human frailty means we'll slip up. Use a tool to check your
code against your standard. Parasoft's $995 CodeWizard (www.parasoft.com)
compares your source against a canned set of 150 rules, flagging violations a la
lint.
If CodeWizard's rules were set in stone I'd chuck the
product in a heartbeat. Happily they are extensible and modifiable. It's
pretty easy to define the checks to match your company's software standard.
Does this take an up-front commitment of time? Of course.
Conclusion
A half dozen times a year I'll watch a panicked developer
repeatedly invoke the compiler and linker manually. The reason? Invariably
it's because he's "too busy" to set up make files. Astonishing.
Equally astonishing is how many of us refuse to use a lint
or lint-like product for the very same reason: it takes time to train the thing
to behave reasonably. Most tools require an investment of both money and
time before you reap benefits. I know it's hard to steal precious hours from a
project to tune the development environment, but the alternative is repeating
the same problems forever.
Sometimes it's easiest to learn how to do the right thing
by looking at wrong examples. Check out "How To Write Unmaintainable Code"
at http://mindprod.com/unmain.html.
Two lessons from the site: If God didn't want us to use
global variables, he wouldn't have invented them. Rather than disappoint God,
use and set as many global variables as possible.
And finally: If you give someone a program, you will
frustrate them for a day; if you teach them how to program, you will frustrate
them for a lifetime.
Reference 1: A Discipline for Software Engineering, Watts
Humphrey, Addison-Wesley, Reading, MA, 1995, ISBN 0-201-54610-8. Also see the
Software Engineering Institute's data (www.sei.cmu.edu)
which suggests that at least 6% of all pre-tested code is buggy.