By Jack Ganssle

Total Recall

Published 2/06/2006

In an article (http://embedded.com/showArticle.jhtml?articleID=177103084) a couple of weeks ago I stressed the importance of aiming for zero defects. A couple of private emails from readers took me to task, arguing that perfection just isn't attainable.

They're right.

And they're wrong.

<i>Of course</i> it's impossible to make anything complex perfect. Software will always be inherently problematic. A product comprising a million lines of code is built from something like 20 million keystrokes. Get just one wrong, for an error rate of one part in 2 * 10**7, and the system is defective in some measure. Unfortunately, infallibility is not part of human nature.
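To make that arithmetic concrete, here is a minimal sketch in Python. The figure of 20 keystrokes per line is only what the numbers above imply (20 million keystrokes spread over a million lines), and the function name is made up for illustration.

```python
def keystroke_error_rate(lines_of_code, keystrokes_per_line, bad_keystrokes=1):
    """Return the fraction of all keystrokes that were wrong."""
    total_keystrokes = lines_of_code * keystrokes_per_line
    return bad_keystrokes / total_keystrokes


# A million lines at roughly 20 keystrokes per line is 20 million keystrokes.
rate = keystroke_error_rate(1_000_000, 20)
print(f"one bad keystroke is one part in {1 / rate:,.0f}")
```

Even at that rate, far better than any human typist manages, the product ships with a defect.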

But today's state of the industry is unacceptable. Consider these events:

  • 2005 - Toyota recalls 75,000 Prius hybrids due to a software defect (mine included).
  • 2004 - Pontiac recalls the Grand Prix since the software didn't understand leap years. 2004 was a leap year.
  • 2003 - A BMW trapped a Thai politician when the computer crashed. The door locks, windows, A/C and more were inoperable. Responders smashed the windshield to get him out.
  • 2002 - BMW recalls the 745i since the fuel pump would shut off if the tank was less than 1/3 full.
  • 2001 - 52,000 Jeeps recalled due to a software error that can shut down the instrument cluster.

That's just a handful of recent recalls, only in one industry, due to buggy code.
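The leap-year recall shows how small such a defect can be. The details of that bug aren't given here, so what follows is only a sketch of the Gregorian leap-year rule that any calendar code has to get right, not a reconstruction of the actual fault. The function names are hypothetical.

```python
def is_leap_year(year):
    """Gregorian rule: every fourth year is a leap year, except century
    years, which are leap years only when divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def days_in_year(year):
    """Number of days in the given year."""
    return 366 if is_leap_year(year) else 365
```

Code that checks only `year % 4`, or that ignores leap years entirely, can run correctly for years before the calendar exposes the mistake.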

Then there's this: I tried to buy some boat parts from Defender.com and the total came to:

$84 trillion, more than the world's GDP. Happily, though, they didn't charge for shipping or tax.

It <i>is</i> possible to greatly reduce software defects. We might chuckle about some of the recalls experienced by the auto industry, but they are working hard to improve the quality of their code. For instance, they started the Motor Industry Software Reliability Association (http://www.misra.org.uk/) to improve the reliability of C and C++ firmware. I admire their efforts to act decisively. With recalls costing millions of dollars they've wisely rejected the notion that we can hack our way to success.

We EEs started the embedded world some 30 years ago. Most of us had no software engineering knowledge at all, but learned assembly language and cranked out code. A lot of it was awful, but it was possible to beat the small programs of the 70s into submission using heroics. That's not likely with today's huge apps. In my travels around the embedded landscape I'm seeing more organizations starting to refine their software practices and employ more disciplined approaches in an effort to tame schedules and rein in bugs. Things are getting better. But we've a long way to go.

The reality is that it's astonishing how well software and computers do work. They control practically every aspect of our lives, most of the time with little fuss. Surely, though, our bosses will demand we reduce or eliminate the recalls, upgrades, and patches that are so common now.