You may redistribute this newsletter for noncommercial purposes. For commercial use contact email@example.com. To subscribe or unsubscribe go to https://www.ganssle.com/tem-subunsub.html or drop Jack an email.
Did you know it IS possible to create accurate schedules? Or that most projects consume 50% of the development time in debug and test, and that it’s not hard to slash that number drastically? Or that we know how to manage the quantitative relationship between complexity and bugs? Learn this and far more at my Better Firmware Faster class, presented at your facility. See https://www.ganssle.com/onsite.htm.
In my opinion the quality of technical web (and it's all web-based now) magazines is declining, victims of the drive to cut costs. But Altera's System Design at the Leading Edge is different. This issue is packed with useful articles for embedded hardware engineers, and is devoid of marketing fluff. Recommended.
|Quotes and Thoughts
"Computer system analysis is like child-rearing; you can
do grievous damage, but you cannot ensure success." - Tom
|Tools and Tips
Please submit neat ideas or thoughts about tools, techniques and resources you love or hate.
How accurate is your DMM? If you work for a decent-sized company they probably calibrate at least some of the test equipment every year. But many others never check calibration. Buy a meter off eBay and you have no idea if reads accurately.
I recently came across the Voltagestandard web site which offers a number of low-cost references and ordered one of their DMMCheck devices. The specs are amazing considering the price:
The unit runs off an included 9 V battery.
It comes with a calibration certificate, and its accuracy is guaranteed for six months. For the first two years recals are free (user pays shipping) and after that it's $5. The vendor checks each unit against an 8.5 digit DMM that is recalibrated every year.
The DMMCheck's 5.000 volt output.
Obviously, this unit provides only a few fixed values and doesn't substitute sending your DMM to a cal lab, but at $35 (for 25 PPM resistors; add $4 for 10 PPM) it's a bargain for those who don't get regular meter calibrations.
A very recent paper with the unusual title The Feng Shui of Supercomputer Memory (you have to have access to the ACM Digital Library to get this) looks at DRAM (DDR3) and SRAM failures. This very interesting study shows that SRAMs experience transient (e.g., induced by cosmic rays) failures hundreds of times more frequently than hard failures. For SRAM in L2 caches a computer at 875 feet of elevation had 77 times more transient errors (single event upsets, or SEUs) than permanent ones. Another at 7,500 feet raised that rate to 441 times. Both machines were of similar design and used identical SRAM.
An obvious question is "what was the failure rate in absolute terms." The paper is silent about this. A lot of studies have used neutron sources to cause SEUs, but it's hard to see how that translates to a system running in the real world. The 1996 paper Single Event Upset at Ground Level by Eugene Normand summarizes a lot of unpublished data. It seems SRAMs of that era experienced SEUs at ground level at a rate of about 2 x 10**-12/bit-hr. That's within an order of magnitude of the data Xilinx has published for the SRAMs in their FPGAs.
Suppose a small embedded system has 1 MB of SRAM. With these numbers one bit could be expected to change at random every seven years. It's impossible to speculate how many of those SEUs are important; some areas of memory will be unused, an LSB changing on sensed data may not matter much. But if a company produces 1000 products with this memory in it, then 140 of those systems could experience an SEU each year, or one every few days.
This calculator shows that at 40,000 feet the cosmic ray flux is about 500 times stronger than at sea level, and varies logarithmically with altitude. So systems shipped to Denver will experience SEUs 3.5 times more often than the one per seven years per system.
But wait, there's more. Actel (now Microsemi) has a paper titled Understanding Single Event Effects (SEEs) in FPGAs; this and other sources suggest the neutron flux from cosmic rays doubles one travels from New York to Stockholm due to the increase in latitude.
So I started thinking about RAM tests.
Even a transient error can corrupt the stack, pointers (gasp, function pointers!) and critical data structures. Such an error is no less harmful than a bit that is stuck high or low. But a transient failure is, well, transient. Rewrite the location and it goes away. A RAM test will miss transient failures, which, according to the first paper I cited, are by far the most common sort. Or, if the bit flip happens in the middle of the test, it will report a failure that disappears as soon as the flagged location is rewritten. A false positive.
It's common to run a RAM test at boot-up. Some systems run background RAM checks all of the time. But if the authors' data is correct, those tests will pick up only some tiny percentage of the unwanted bit flips. Conversely, the vast majority of errors detected by the test will be SEUs, and thus irrelevant.
Transistors are roughly free. The MCU world has morphed from simple little 8 bit CPUs to 32 bitters packed with complex I/O and gobs of memory. Transistor geometries are shrinking as well, and each new node is more sensitive to high-energy particles. Maybe it's time the semiconductor vendors pair a parity bit or even ECC with each memory location. The additional cost would be very low.
|More Funny Products
A. J. van de Ven responded to Howard Speegle's tail of finding an unconnected microprocessor in a blender:
Harold Hallikainen made me crack up:
Tom Guadagnola wrote:
Steve Paik has a fantastic Annoy-a-tron story:
|On Bad Habits
Weland Treebark responded to the last issue's comments on developing bad habits:
Luca Matteini wrote:
Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intents of this newsletter. Please keep it to 100 words.
|Joke For The Week
Note: These jokes are archived at www.ganssle.com/jokes.htm.
John Black found this on-line:
In 1998, I made a C++ program to calculate pi to a billion digits.
I coded it on my laptop (Pentium 2, I think) and then ran the program.
The next day I got a new laptop but decided to keep the program running.
It has been over seven years now since I ran it. and this morning it finished calculating.
I looked in the code , and found out that I forgot to output the value. :(
|Advertise With Us
Advertise in The Embedded Muse! Over 23,000 embedded developers get this twice-monthly publication. .
|About The Embedded Muse
The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at firstname.lastname@example.org.
The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster.