A Boss's Quick-Start to Firmware Engineering, Part 1
Published in ESD July 2004
For novel ideas about building embedded systems (both hardware and firmware), join the 35,000 engineers who subscribe to The Embedded Muse, a free biweekly newsletter. The Muse has no hype and no vendor PR. Click here to subscribe.
By Jack Ganssle
I hear from plenty of readers that their bosses just don't "get" software. Efforts to institute even limited methods to produce better code are thwarted by well-meaning but uninformed managers chanting the "can't you just write more code?" mantra.
Yet when I talk to the bosses many admit they simply don't know the rules of the game. Software engineering isn't like building widgets or designing circuit boards. The disciplines are quite different, techniques and tools vary, and the people themselves all too often quirky and resistant to standard management ploys. Most haven't the time or patience to study dry tomes or keep up with the standard journals. So this month and next here's my short-intro to the subject. Give it to your boss.
So, dear boss, assuming you're reading this, the first message is one you already know. Firmware is the most expensive thing in the universe. Building embedded code will burn through your engineering budget at a rate matched only by a young gold-digger enjoying her barely-sentient ancient billionaire's fortune.
Most commercial firmware costs around $15 to $30 per line, measured from the start of a project till it's shipped. When developers tell you they can "code that puppy over the weekend" be very afraid. When they estimate $5/line, they're on drugs or not thinking clearly. Defense work with its attendant reams of documentation might run upwards of $100 per line or more; the space shuttle code is closer to $1000 per line, but is without a doubt the best code ever written.
$15-$30 per line translates into a six figure budget for even a tiny 5k line application. The moral: embarking on any development endeavor without a clear strategy is a sure path to squandering vast sums.
Like the company that asked me to evaluate a project that was 5 years late and looked more hopeless every day. I recommended they trash the $40m effort and start over, which they did. Or the startup which, despite my best efforts to convince them otherwise, believed the consultants' insanely optimistic schedule. They're now out of business - the startup, that is. The consultants are thriving.
First, before even thinking about building any sort of software, install and have your people use a version control system (VCS). Building even the smallest project without a VCS is a waste of time and an exercise in futility.
The NEAR spacecraft dumped a great deal of its fuel and was nearly lost when an accelerometer transient caused the on-board firmware to execute abort code! incorrect abort code, software that had never really been tested. Two versions of the 1.11 flight software existed; unhappily, the wrong set flew. The code was maintained on uncontrolled servers. Anyone could, and did, change the software. Without adequate version control, it wasn't clear what made up correct shipping software.
A properly deployed VCS insures these sorts of dumb mistakes just don't happen. The VCS is a sort of database for software, releasing the code to users but tracking who changed what when. Why did the latest set of changes break working code? The VCS will report what changed, who did it, and when, giving the team a chance to efficiently troubleshoot things.
Maybe you're shipping release 2.34, but one user desperately requires the old 2.1 software. Perhaps a bug snuck in sometime in the last 10 versions and you need to know which code is safe. A VCS reconstructs any version at any time.
Have you ever misplaced code? In October of 1999 the FAA announced they had lost the source code to all of the software that controlled air traffic between Chicago and the regional airports. The code all lived on one developer's machine, one angry person who quit and deleted it all. He did, however, install it on his home computer, encrypted. The FBI spent 6 months reverse engineering the encryption key to get their code back. Sound like disciplined software development? Maybe not.
Without a VCS, a failure of any engineer's computer will mean you lose code, since it's all inevitably scattered around amongst the development team. Theft or a fire - unhappily everyday occurrences in the real world - might bankrupt you. The computers have little value, but that source code is worth millions.
The version control database - the central repository of all of your valuable software - lives on a single server. Daily backups of that machine, stored offsite, insures your business's survival despite almost any calamity.
Some developers complain that the VCS won't protect them from lazy programmers who cheat the system. You or your team lead should audit the VCS's logs occasionally to be sure developers aren't checking out modules and leaving them on their own computers. A report that takes just seconds to produce will tell you who hasn't checked in code, and how long it has been out on their own computers.
Version control systems range in price from free (like the GNU products) to expensive, but even the expensive ones are cheap.
What language is spoken in America? English, of course, but try talking to random strangers on a street corner in Baltimore today. The dialects range from educated middle-American to incomprehensible near-gibberish. It's all English, of a sort, but it sounds more like the fallout from the Tower of Babel.
In the firmware world we speak a common language: C, C++ or assembly, usually. Yet there's no common dialect; developers exploit different aspects of the lingos, or construct their programs using legal but confusing constructs.
The purpose of software is to work, of course, but also to clearly communicate the programmer's intentions to maintenance people. Clear communications means we must all use similar dialects. Someone - that's you, boss - must specify the dialect.
The C and C++ languages are so conducive to abuse that there's a yearly obfuscated C contest whose goal is to produce utterly obscure but working code. Normally I don't publish the URL as these people are code terrorists who should be hunted down and shot like the animals they are, but the examples are compellingly illustrative. To see how bad things can get, see http://www0.us.ioccc.org/2001/williams.c. And then vow that your group will produce world-class software that's cheap to maintain.
The code won't be readable unless we use constructs that don't cause our eyes to trip and stumble over unusual indentation, brace placement and the like. That means setting rules, a standard, used to guide the creation of all new code.
The standard defines far more than stylistic issues. Deeply nested conditionals, for instance, lead to far more many testing permutations than any normal person can manage. So the standard limits nesting. It specifies naming conventions for variables, promoting identifiers that have real meaning. Tired of seeing i, ii, and (my personal favorite) iii for loop variable names? The standard outlaws such lazy practices. Rules define how to construct useful comments. Comments are an integral and essential part of the source code, every bit as important as for and while loops. Replace or retrain any team member who claims to write "self commenting code".
Some developers use the excuse that it's too time consuming to produce a standard. Plenty exist on the net; mine is in Word doc format at www.ganssle.com/fsm.htm. It contains the brace placement rule that infuriates the most people! so you'll change it and make it your own.
So write or get a firmware standard. And boss, please work with your folks to make sure all new code follows the standard.
What's the cheapest way to get rid of bugs? Why, just don't put any in!
Trite, perhaps, yet there's more than a grain of wisdom there. Too many developers crank lots of code fast, and then spend ages fixing their mistakes. The average project eats 50% of the schedule in debugging and test! Reduce debugging, by inserting fewer bugs, and accelerate the schedule.
Inspect all new code. That is, use a formal process that puts every function in front of a group of developers before they spend any time debugging. The best inspections use a team of about 4 people who examine every line of C in detail. They'll find most of the bugs before testing.
Study after study shows inspections are 20 times cheaper at eliminating bugs than debugging. Maybe you're suspicious of the numbers - fine, divide by an order of magnitude. Inspections still shine, cutting debugging in half.
More compellingly it turns out that most debugging strategies never check half the code. Things like deeply-nested IF statements and exception handlers are tough to test. My collection of embedded disasters shows a similar disturbing pattern: most stem from poorly executed, pretty much untested error handlers.
Inspections and firmware standards go hand in hand. Neither works without the other. The inspections insure programmers code to the standard, and the standard eliminates inspection-time arguments over stylistic issues. If the code meets the standard, then no debates about software styles are permitted.
Most developers hate inspections. Tough. You'll hear complaints that they take too long. Wrong. Well-paced inspection meetings examine 150 lines of code per hour, a rate that's hardly difficult to maintain (that's 2.5 lines of C per minute), yet that costs the company only a buck or so per line. Assuming, of course, that the inspection has no value at all, which we know is simply not true.
Your role, boss, is to grease the skids so the team efficiently cranks out fabulous software. Inspections are a vital part of that process. They won't replace debugging, but will find most of the bugs very cheaply.
Have your people look into inspections closely. The classic reference is "Software Inspection" by Gilb and Graham (Addison-Wesley, NY NY; 1993, ISBN 0201631814), but Karl Wiegers newer and much more readable book "Peer Reviews in Software (Addison-Wesley, NY NY, 2001, ISBN 0-201-73485-0) targets teams of all sizes (including solo programmers).
Toss out bad code.
A little bit of the software is responsible for most of the debugging headaches. When your developers are afraid to make the smallest change to a module, that's a sure sign it's time to rewrite the offending code.
Developers tend to accept their mistakes, to attempt to beat lousy code into submission. It's a waste of time and energy. Barry Boehm showed in "Software Engineering Economics" (http://www.amazon.com/exec/obidos/tg/detail/-/0138221227/qid=1071149694//ref=sr_8_xs_ap_i0_xgl14/103-3532738-2988661?v=glance&s=books&n=507846) that the crummy modules consume 4 times the development effort of any other module.
Identify bad sections early, before wasting too much time on them, and then recode. Count bug rates using bug tracking software. Histogram the numbers occasionally to find those functions whose error rates scream "fix me!"! and have the team recode.
Figure on tossing out about 5% of the system. Remember that Boehm showed this is much cheaper than trying to fix it.
Don't beat your folks up for the occasional function that's a bloody mess. They may have screwed up, but have learned a lot about what should have been done. Use the experience as a chance to create a killer implementation of the function, now that the issues are clearly understood. Healthy teams use mistakes as learning experiences.
Use bug tracking software, such as the free bugzilla (http://www.bugzilla.org/), or any of dozens of commercial products (nice list at http://www.aptest.com/resources.html).
Even the most disciplined developers sometimes do horrible things in the last few weeks to get the device out the door. Though no one condones these actions, fact is that quick hacks happen in the mad rush to ship. That's life. It's also death for software.
Quick hacks tend to accumulate. Version 1.0 is pretty clean, but the evil inflicted in the last few weeks of the project add to problems induced in 1.1, multiplied by an ever-increasing series of hacks added to every release. Pretty soon the programming team says things like "we can't maintain this junk anymore." Then it's too late to take corrective action.
Acknowledge that some horrible things happened in the shipping mania. But before adding features or fixing bugs in the next release, give the developers time to clean up the mess. Pay back the technical debt they incurred in the previous version's end game. Otherwise these hacks will haunt the system forever, reduce overall productivity as the team struggles with the lousy code in each maintenance cycle, and eventually cause the code to rot to the point of uselessness.