Embedded Muse 71 Copyright 2002 TGG February 18, 2002
You may redistribute this newsletter for noncommercial purposes. For commercial use contact email@example.com.
EDITOR: Jack Ganssle, firstname.lastname@example.org
- Embedded Seminars
- ESD and Firmware
- Embedded Systems Conference
- Thought for the Week
- About The Embedded Muse
Embedded Seminars
Come to Boston on April 9 or Baltimore on April 11, where I’ll be presenting the “Better Firmware Faster” all-day seminar. See http://www.ganssle.com/classes.htm for more information.
I often do this seminar on-site, for companies with ten or more embedded folks who’d like to learn more efficient ways to build firmware. See http://www.ganssle.com/onsite.htm.
ESD and Firmware
Longtime friend and well-known firmware consultant Scott Rosenthal (www.sltf.com) was working with TI’s MSP430 microcontroller recently and ran into some interesting problems.
Electrostatic discharges (ESD) are those high-voltage, low-current “zaps” that happen when we shuffle our feet along a carpet and then touch something metallic. Since practically everything is now electronic, regulatory bodies require that most products be resistant to ESD. Our customers would be pretty upset if a casual static zap destroyed their high-tech device! Silicon geometries are now so small that tens of volts can destroy the transistors… yet an ESD pulse is often tens of thousands of volts.
Careful grounding and a well-designed case will minimize most of these problems. But it’s tough to eliminate them altogether; often some energy will couple into your circuits, no matter how well shielded.
Scott found that the MSP430’s I/O pins, when subjected to just a bit of ESD, randomly change state. This is a pretty cool microcontroller, since every peripheral bit can be an input or output port, or even an interrupt input. He watched the ESD testing lab’s experiments change bits from output ports to interrupt inputs, leading to erroneous and erratic interrupts. Though the parts survived the ESD events, the mode changes could horribly crash the firmware, a very bad thing for critical applications.
An A/D converter also suffered from temporary brain damage when zapped.
The solution was software that monitors the settings of all I/O pins and the A/D’s setup and calibration. Additional code rejected spurious interrupts.
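Scott's actual code isn't reproduced here, but the idea can be sketched in a few lines of C. The sketch below is only an illustration under stated assumptions: the register block io_regs_t, its field names, and the functions io_scrub and irq_is_expected are all hypothetical, and the "registers" are an ordinary struct so the code can run on a desktop. On a real part they would be the vendor's memory-mapped registers, and the reference copy would ideally live in ROM so a zap can't corrupt it too.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register block. On real hardware these would be the
   vendor's volatile memory-mapped registers, not a struct in RAM. */
typedef struct {
    volatile uint8_t dir;      /* 1 = pin is an output, 0 = input   */
    volatile uint8_t int_en;   /* 1 = pin interrupt enabled         */
    volatile uint8_t adc_ctl;  /* A/D converter setup bits          */
} io_regs_t;

/* The configuration the firmware intends (assumed to be kept in ROM). */
typedef struct {
    uint8_t dir;
    uint8_t int_en;
    uint8_t adc_ctl;
} io_config_t;

/* Compare each register with its intended value and rewrite any that
   differ. Returns how many registers had to be repaired, so the caller
   can log or count the events. */
unsigned io_scrub(io_regs_t *regs, const io_config_t *want)
{
    unsigned repaired = 0;

    if (regs->dir != want->dir) {
        regs->dir = want->dir;
        repaired++;
    }
    if (regs->int_en != want->int_en) {
        regs->int_en = want->int_en;
        repaired++;
    }
    if (regs->adc_ctl != want->adc_ctl) {
        regs->adc_ctl = want->adc_ctl;
        repaired++;
    }
    return repaired;
}

/* An interrupt from a pin is plausible only if the intended
   configuration enables interrupts on that pin (pins 0..7 here). */
bool irq_is_expected(const io_config_t *want, unsigned pin)
{
    return pin < 8u && ((want->int_en >> pin) & 1u) != 0u;
}
```

Call io_scrub periodically from the main loop or a timer tick, and have each pin interrupt handler bail out early when irq_is_expected says the interrupt should never have happened.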
How reliable do our systems have to be? If we have to survive an ESD event without crashing, does the entire unit have to live in a Faraday shield, one that blocks out all RF and ESD energy? If a system has external inputs, they may be hard or even impossible to protect completely. Optical isolators help, but are an expensive addition to a low-cost product.
By adding software that monitors the hardware’s health, Scott cured the problem. But he raises an interesting philosophical issue: can we no longer assume the hardware is reliable? I wrote about unreliable hardware in Embedded Systems Programming this month (http://embedded.com/story/OEG20020125S0104), but ESD is another source of erratic problems.
Watchdog timers have always been our last-ditch defense against crashing code. If you’re worried about transient hardware issues, as it’s beginning to seem we must, a well-implemented watchdog is essential. If we cannot trust the hardware, that suggests the watchdog must hit the CPU’s reset input (not an interrupt), since only reset is guaranteed to cure the processor of all weird modes.
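What "well-implemented" means is open to debate, but one common pattern is to kick the watchdog only after every task has checked in, so that a single hung task lets the timer expire and reset the CPU. Here is a minimal sketch of that pattern, with assumptions labeled: TASK_COUNT, task_checkin, wdt_service, and wdt_kick_count are hypothetical names, and wdt_kick_hw merely counts kicks as a stand-in for writing to a real watchdog register.

```c
#include <stdbool.h>
#include <stdint.h>

#define TASK_COUNT 3u                          /* hypothetical number of tasks */
#define ALL_ALIVE  ((1u << TASK_COUNT) - 1u)   /* one bit per task             */

static uint32_t alive_flags;   /* bit n set = task n checked in this period */
static unsigned hw_kicks;      /* stand-in for the hardware watchdog        */

/* Stand-in for the hardware access. A real version would write the
   part's watchdog-clear register, which is an assumption here. */
static void wdt_kick_hw(void)
{
    hw_kicks++;
}

/* Each task calls this once per pass through its own loop. */
void task_checkin(unsigned id)
{
    if (id < TASK_COUNT) {
        alive_flags |= 1u << id;
    }
}

/* Called periodically. Kicks the watchdog only if every task has
   checked in since the last kick; otherwise does nothing, so the
   hardware timer eventually expires and resets the CPU. */
bool wdt_service(void)
{
    if (alive_flags != ALL_ALIVE) {
        return false;
    }
    alive_flags = 0;
    wdt_kick_hw();
    return true;
}

/* Exposes the simulated kick count for inspection. */
unsigned wdt_kick_count(void)
{
    return hw_kicks;
}
```

Note that nothing here handles the timeout itself. That is deliberate: recovery is left to the hardware reset line, not to code that may be running on a confused processor.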
Some safety-critical code doesn’t even assume that RAM and ROM work; the systems continuously run RAM and ROM tests in the background, looking for bit flips or part failures.
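As a rough illustration of such background tests, the C sketch below checks ROM with a CRC and checks RAM one byte per call so the work can be spread across idle time. It is a sketch under assumptions, not any particular product's method: the CRC variant (CRC-16/CCITT-FALSE), the 0x55/0xAA patterns, and the names crc16_ccitt, ram_test_byte, ram_scan_t, and ram_scan_step are all illustrative choices. On a real system the RAM test must run with interrupts disabled, since the byte under test briefly holds garbage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/CCITT-FALSE (poly 0x1021, init 0xFFFF). Compare the
   result over the ROM image with a value stored at build time. */
uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u) {
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Non-destructive test of one RAM byte: save it, write and read back
   two complementary patterns, then restore the original contents. */
bool ram_test_byte(volatile uint8_t *cell)
{
    assert(cell != NULL);

    uint8_t saved = *cell;
    bool ok = true;

    *cell = 0x55u;
    if (*cell != 0x55u) ok = false;
    *cell = 0xAAu;
    if (*cell != 0xAAu) ok = false;

    *cell = saved;
    return ok;
}

/* State for an incremental scan over a region of RAM. */
typedef struct {
    volatile uint8_t *base;  /* start of the region under test */
    size_t len;              /* region size in bytes (nonzero) */
    size_t next;             /* index of the next byte to test */
} ram_scan_t;

/* Test one byte per call and advance, wrapping at the end of the region. */
bool ram_scan_step(ram_scan_t *s)
{
    bool ok = ram_test_byte(&s->base[s->next]);
    s->next = (s->next + 1u) % s->len;
    return ok;
}
```

A simple pattern test like this catches stuck bits but not every addressing or coupling fault; systems that need more coverage typically use more elaborate march-style tests.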
High-reliability systems will benefit from a healthy dose of paranoia. Assume everything breaks and nothing is reliable. I saw a system recently where 80% of the code was for exception handling and hardware monitoring. Think of it: for every line of code that implemented the application, the developers wrote four lines of paranoia-code. The costs are astronomical, though probably necessary for some systems. I suspect that as time goes on, as software and hardware complexity increase, this sort of defensive programming will become the norm.
Embedded Systems Conference in San Francisco
Longtime readers of The Embedded Muse probably know I’m a big fan of the Embedded Systems Conferences. Yes, as a member of the conference’s Advisory Board I’m deeply involved with the event. But it’s important that we learn new things, see what products are available, and chat with colleagues who may know answers to problems we have.
The biggest of these is coming up. It’s March 12-16 at the Moscone Center in downtown San Francisco. I’ll be there talking about Managing Embedded Projects and Really Real Time Systems. And, join me for a discussion session on March 14 about Strategies for Building Reliable Software – I’ll facilitate but the input will come from the developers (like you!) who attend.
Half the fun of these conferences is meeting folks, so do say “hi” if you see me running around.
There’s more info at http://www.esconline.com/sf/
Thought for the Week
Thanks to Laurence Marks who found this.
System Crash (to the tune of "The Monster Mash")
I was working in the lab, late one night
When my eyes beheld an eerie sight,
Some smoke from our VAX began to rise
And suddenly, to my surprise...
(There was a crash) There was a system crash
(A mighty crash) I heard the disk heads smash
(A system crash) It came down in a flash
(There was a crash) A fatal system crash
The lab manager then appeared from his room,
Said "I don't want to be a prophet of doom,
But we had one like this just the other day
Which blew up 4 megs and the SBA"
The system had been booted,
diagnostics all run through,
When a power flux made it run amuck,
then SCOTTY and IRVING blew too
So we'd lost all our VAXES in less than one night
When a VP came in and said "hey, that's all right,
I'll loan you a Venus - here's what to do
When you call up Support, tell them Gordon sent you..."