The Embedded Muse
Issue Number 339, November 20, 2017
Copyright 2017 The Ganssle Group

Editor: Jack Ganssle, jack@ganssle.com

You may redistribute this newsletter for non-commercial purposes. For commercial use contact info@ganssle.com.

Editor's Notes

After over 40 years in this field I've learned that "shortcuts make for long delays" (an aphorism attributed to J.R.R. Tolkien). The data is stark: doing software right means fewer bugs and earlier deliveries. Adopt best practices and your code will be better and cheaper. This is the entire thesis of the quality movement, which revolutionized manufacturing but has somehow largely missed software engineering. Studies have even shown that safety-critical code need be no more expensive than the usual stuff if the right processes are followed.

This is what my one-day Better Firmware Faster seminar is all about: giving your team the tools they need to operate at a measurably world-class level, producing code with far fewer bugs in less time. It's fast-paced, fun, and uniquely covers the issues faced by embedded developers. Information here shows how your team can benefit by having this seminar presented at your facility.

Embedded Systems Conference San Jose - I'll be there December 6 and 7, giving a few talks, checking out the expo hall, and hopefully meeting Muse readers. More info here.

Discounted Better Firmware Faster on-site seminars in Europe: I'll be at the Embedded World show in Nuremberg February 27 to March 1. Without the cost and time required to travel, I will be offering on-site versions of this class in Europe for a reduced price shortly before and after the show. If interested, drop me an email.

I hope readers in the USA have a great Thanksgiving weekend. As one gets older, family and tradition become more important. We spend Thanksgiving at my brother's house in Virginia with a big gathering of siblings, sons, daughters, parents, nieces and nephews, with the occasional boy/girlfriend and acquaintance or three tossed in. Part of the tradition is to play Alice's Restaurant, Arlo Guthrie's wonderful song about events that occurred on a Thanksgiving over 50 years ago, on the long drive there.

I'm now on Twitter.

Quotes and Thoughts

"The vast majority of accidents in which software was involved can be traced to requirements flaws and, more specifically, to incompleteness in the specified and implemented software behavior - that is, incomplete or wrong assumptions about the operation of the controlled system or required operation of the computer and unhandled controlled-system states and environmental conditions. Although coding errors often get the most attention, they have more of an effect on reliability and other qualities than on safety." Nancy Leveson.

Tools and Tips

Please submit clever ideas or thoughts about tools, techniques and resources you love or hate. Here are the tool reviews submitted in the past.

Many readers sent this link, which explores 21 different microcontrollers, pondering the strengths and weaknesses of each. Some vendors will not be pleased...

In the last issue I ran a chart listing the size of programs vs. the number of pages of requirements. Scott Nowell, of Validated Software (they provide certification packages to show various products comply with standards like DO-178C) sent me the mapping between size and requirements for Micrium's uC/OS-II. This is interesting since that RTOS can be used in systems that must be certified for avionics use (under the stringent DO-178 umbrella), so is representative of very-carefully designed code.

The RTOS has 4125 lines of code. The high-level requirements occupy 45 pages; low-level requirements an additional 392. If one were to print out the code at 50 lines/page, the listing would run to about 83 pages, a stack of paper less than a fifth the size of the 437 pages of requirements.

To clarify "high-level" and "low-level," DO-178C defines:

  • Software requirement: "A description of what is to be produced by the software given the inputs and constraints. Software requirements include both high-level requirements and low-level requirements."
  • High-level requirement: "Software requirements developed from analysis of system requirements, safety-related requirements, and system architecture."
  • Low-level requirement: "Software requirements developed from high-level requirements, derived requirements, and design constraints from which Source Code can be directly implemented without further information."
  • Derived requirements: "Requirements produced by the software development process which (a) are not directly traceable to higher level requirements, and/or (b) specify behavior beyond that specified by the system requirements or the higher level software requirements."

How do your products compare? The highest level of DO-178C is extremely demanding, and not appropriate for what most of us build. But it does represent a level of quality we should aspire to.

Freebies and Discounts

Win a nifty ee701 differential preamp for a scope! Thanks to ee-quipment for donating the unit. A review is here. One lucky Muse reader will win this at the end of November, 2017.

Enter via this link.

Minimizing Optimism

In Muse 317 I ranted a bit about software people who are too optimistic - they expect things to work, and all too often don't check corner conditions and the like.

I recently came across a 2012 report by NASA's Inspector General about why too many of their missions fail, run late, or are over budget. The report claims there are four factors that contribute to these problems. The first is managers who are overly optimistic, who assume things will be just dandy. Alas, bitter experience proves otherwise.

It's the same in writing software. Expect everything to go wrong. Consider what sorts of errors may occur and take some sort of thoughtful remediation action, or at least fail gracefully.

I was ordering parts for my sailboat. A quart of paint, a few filters for the engine. This popped up:

It didn't seem right to me. But at least they didn't charge tax.

This is optimistic programming (at best)... or professional malpractice. When I learned to program we were repeatedly told to "check your goesintas and goesoutas." But half a century later we often ignore that bit of wisdom. Ironically, those distant days were the era of scarce resources, like memory and CPU cycles. (When I went to college there was one computer, a $10 million Univac 1108 mainframe. It had just one million words of memory). These resources are generally much more plentiful today. The cost to add sanity checks is low.

Is the above example an aberration, a one-off, never-to-be-repeated mistake? Nope. I've got tons of examples. Here are a few:

More:

A Nasdaq above 16,000,000? I wish I had shorted the market.

More:

The highest temperature recorded in nature on Earth is 134° F (56.7° C) at Greenland Ranch, Death Valley, California. A wise developer might use that fact to come up with a reasonable bound for possible values.

More:

119° with snow on the ground? More:

The lowest temperature ever recorded in nature on Earth is -129° F (-89.2° C), at the Soviet Vostok Station in Antarctica. That might be a reasonable lower bound.
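The two records above suggest a simple plausibility check. The following is a minimal sketch in C, assuming readings arrive in degrees Fahrenheit as a double; the function and macro names are hypothetical, and a real product would probably use tighter, application-specific limits.

```c
#include <stdbool.h>

/* Bounds taken from the record temperatures quoted above, in degrees F.
 * These are assumptions for illustration; a real product would likely
 * use tighter limits that match where it is actually deployed. */
#define TEMP_F_MIN (-129.0)
#define TEMP_F_MAX (134.0)

/* Returns true if a reading lies within the range ever recorded in nature.
 * Written as a positive range test so that NaN is rejected too, since any
 * comparison involving NaN is false. */
bool temp_f_is_plausible(double temp_f)
{
    return (temp_f >= TEMP_F_MIN) && (temp_f <= TEMP_F_MAX);
}
```

A check like this rejects the wildly wrong values. It would not catch a reading that is in range but inconsistent with other data, such as 119° with snow on the ground; that takes a cross-check against a second input.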

More:

More:

Really? Since that's 2001 years ago I'm surprised to find the message in English instead of Aramaic.

Closed on Thanksgiving? Must be a software problem. We had never heard of a dump closed on Thanksgiving before, and with tears in our eyes we drove off into the sunset looking for another place to put the garbage.

I could include many more examples (unfortunately).

These are not new issues. The 1996 failure of the Ariane 5 launch vehicle was due in part to a variable that overflowed. In fact, there were seven such variables which should have been monitored, but according to the inquiry board that investigated the accident, three weren't, and no one knows why. One recommendation the board made reads:

This was two decades ago. You'd think we would take a $400 million failure to heart, but, alas, that's not the case.
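An overflow of this kind can be guarded with a range-checked conversion. The sketch below is illustrative only: the function name is hypothetical, and the choice of a double converted to a 16-bit signed integer is an assumption made for the example, not a reconstruction of the flight code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Converts a floating-point value to a 16-bit signed integer only if it
 * fits. Returns false (leaving *out untouched) on NaN or out-of-range
 * input, so the caller must decide what to do instead of overflowing.
 * The name and the types are assumptions for illustration. */
bool double_to_int16_checked(double value, int16_t *out)
{
    /* Written as a positive range test so that NaN fails it. */
    if (!(value >= (double)INT16_MIN && value <= (double)INT16_MAX)) {
        return false;
    }
    *out = (int16_t)value;  /* truncates toward zero; range already checked */
    return true;
}
```

The point of returning a status rather than a saturated value is that the caller cannot ignore the problem silently; what to do next (saturate, hold the last good value, raise a fault) is a system-level decision.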

One of the rules of hardware engineering is to be a pessimist. In designing stuff, like circuits and structures, one always does worst-case analysis. What could go wrong? Sure, this capacitor is 10 μF at room temperature. What happens on a hot day? Suppose R1 is at the low end of its tolerance while R2 is at the high end; how does the circuit behave then? Absent other information, a wise designer always uses components' worst-case specs.
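As a concrete case of the R1/R2 question, here is a minimal worst-case sketch in C. It assumes the two resistors form a simple voltage divider (Vout = Vin * R2 / (R1 + R2)) with a positive Vin; that topology, the type names, and the function names are all assumptions for illustration.

```c
/* Worst-case output range of a resistive divider:
 *   Vout = Vin * R2 / (R1 + R2)
 * The divider topology and a positive Vin are assumptions. */
typedef struct {
    double nominal;    /* ohms */
    double tolerance;  /* fractional, e.g. 0.05 for 5% */
} resistor_t;

typedef struct {
    double min;        /* volts */
    double max;        /* volts */
} range_t;

static double divider(double vin, double r1, double r2)
{
    return vin * r2 / (r1 + r2);
}

range_t divider_worst_case(double vin, resistor_t r1, resistor_t r2)
{
    double r1_lo = r1.nominal * (1.0 - r1.tolerance);
    double r1_hi = r1.nominal * (1.0 + r1.tolerance);
    double r2_lo = r2.nominal * (1.0 - r2.tolerance);
    double r2_hi = r2.nominal * (1.0 + r2.tolerance);

    range_t out;
    /* Vout rises with R2 and falls with R1, so the extremes sit at
     * opposite corners of the tolerance box. */
    out.min = divider(vin, r1_hi, r2_lo);
    out.max = divider(vin, r1_lo, r2_hi);
    return out;
}
```

With two 10 kΩ, 5% resistors and a 5 V input, the output can land anywhere from 2.375 V to 2.625 V rather than at the nominal 2.5 V, and that is before temperature and aging are considered.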

We should do the same in software engineering. What if this function is passed an "impossible" value? What if a calculation gives a result that simply makes no sense?

It is hard to anticipate all of these error conditions.

But that's our job.

More on Datasheets

A lot of people wrote in about my take on datasheets in the last Muse.

Peter Kazakoff wrote:

I was recently sourcing an op-amp from a major vendor. This was a pretty nice part: a chopper-stabilized CMOS op-amp in a really tiny SMT package with a built-in charge pump for true rail-to-rail input operation. I was using it as part of a precision analog integrator circuit, so naturally I was very interested in the input bias current.

According to the 11-page datasheet, the maximum input bias current was around 5 mA. Wait, what? Even the very obsolete LM709 has an Ib of around 1.5 μA. I would expect a CMOS opamp to be in the double-digit nanoamps at worst. 5 mA is a lot of current, you can almost go to the moon on that these days. There was another obvious error, as well: the phase margin was provided in degrees Celsius.

Okay, so an intern had probably been responsible for writing the datasheet and got the units wrong. That's fair - it takes a while to develop a good feel for what the parameters for a real op-amp are, and a second or third year college student probably isn't there yet. But it was pretty shocking that this error hadn't been caught by the engineer reviewing the datasheet (which, after all, was only 11 pages long), and it was equally shocking that the error hadn't been reported to the vendor and fixed yet. At the time I found the error, the datasheet had been up on the vendor's website for nine months.

I reported the issue to the vendor, who confirmed that the input bias current was in fact 5 nA rather than 5 mA, and promised to fix the datasheet. It took them another three months to do so.

I have to wonder how closely engineers are scrutinizing their parts selections these days, since presumably this op-amp was being purchased by customers. Input bias current is one of the most critical op-amp parameters for designing any sort of high input impedance circuit. Do engineers really just slap a part on the board and hope it works without reading the datasheet?

Remco Stoutjesdijk has some first-hand experience on both sides of the issue:

Your paragraph about datasheets hits a sensitive nerve with me since I have been in many a battle over writing them.

I find that as a quality-conscious engineer a datasheet should constitute a guarantee. If not, what is the point? There are documents called 'Product Briefs' which are more or less a teaser or a dream about a part currently in conception. There you can claim to reach the moon and the stars. Not in datasheets. Datasheets should be full of what? Data. Not marketing.

Datasheets, like many other forms of company communication, tell you something not only about the part but also about the company. If the datasheet you are reading feels like an afterthought, it's an alarming sign of the vendor's processes, design documentation and quality setup.

Especially in a market with fierce competition where parts sell on a headline spec or set of specs, one should be extremely careful to be sure to compare apples with apples.

Having worked at several silicon companies I've seen various styles of documentation:

- some companies employ dedicated technical authors who write datasheets and app notes. They report NOT to marketing but rather to quality departments. Other companies require designers to do it themselves, and it is not uncommon for values specified for silicon to come straight from the simulator.

- some companies will design to 4x stricter tolerances than those written in the datasheet so that no customer will ever find anything purported in the datasheet that is violated in practice, over temperature, silicon corners, elevation, aging, and traveling to parallel universes. Great as that may seem, this can however seriously impact the competitiveness of the product.

- all companies will try to represent data in the most positive way possible, which is why the measurement conditions always should be inspected carefully. There is nothing intrinsically wrong with this, however some will actively encourage / require people to 'write data a bit more creatively' to cover up snafus. That's crossing the line. In case of doubt, always try to re-create the specification or graph, and when you find you have to bend over backwards and keep your tongue at the right angle to come even close to the datasheet values, you know you are in touchy territory.

When trying to find the golden middle between those extremes you need a clear headed team of design and application engineers that is communicating properly, understands the influence of every parameter on the final application and has a sense of quality in their work. This is why the datasheet is so indicative of the company's maturity.

I have experienced plenty of cases where a part not meeting the datasheet specification has been the cause of pretty serious escalation between vendor and customer. Often enough the silicon customers are system suppliers with requirements from their customers. They have to rely on the specification given to them by the silicon vendor, not in the last place because the part often doesn't even exist yet (in volume) when the system design starts.

Datasheets are not signed off by lawyers, but the consequences of not meeting the specs can get very costly very fast, from missing an existing or future socket opportunity to reimbursing customers for returns of systems whose cost by far outweighs the part integrated in them, to eventually having your reputation fatally tainted.

And before all, whatever you do, don't make the summer student write the datasheet...!

Bob Snyder had some ideas:

Engineers:  Consider using fewer product families

A new datasheet is typically published when a product development team completes a project and brings a series of closely-related items to market.  The datasheet typically lists a set of orderable part numbers, each of which identifies a single variant of the product.

Digi-Key currently lists 6.7 million distinct part numbers. For every part there is a datasheet. But the total number of unique datasheets is much smaller than the total number of unique parts.

Here is a nine-page datasheet which characterizes the Vishay PLZ series of Zener diodes:

http://www.vishay.com/docs/84830/plzseries.pdf

The datasheet lists 97 distinct part numbers, so in this case the ratio of parts to datasheets is nearly 100 to 1.  Most of the information in the datasheet applies to all of the variants.  Consequently, any errors in the datasheet would probably apply to all of the variants described by the datasheet.

The datasheet mentioned above has a revision date of 08-Feb-2017.  This reflects the fact that as errors and omissions are discovered, datasheets evolve.  It takes time and effort for engineers to check for new revisions to datasheets or new errata sheets.  Some manufacturers provide ways for engineers to request notifications of revisions or errata (e.g. "update service").  But this information does not always arrive at times when the engineer is able to read and digest it.

The amount of time and effort required to learn about products and keep up to date on revisions and errata is directly proportional to the number of product families (datasheets) that an engineering team is using.  A team can reduce this overhead and its exposure to "surprises" by selecting parts from a limited number of well-documented and well-supported product families.  In some cases, this approach may result in using a part that is a little less powerful or efficient than an alternative part claims to be.  But the only way to be certain that the alternative part lives up to its claims is to invest time and effort in testing samples of that part and learning all of the important things that the datasheet left out.  If the alternative part lives up to its claims and is selected for use, the engineers will now have the ongoing burden of keeping up to date with future revisions to yet another datasheet.

If a company selects the "ideal" part for every design, without seeking to limit the number of product families used, then it will be very challenging for the engineers to keep up to date on all of the datasheet revisions and errata sheets.

There is a way to quantify this problem.  Look at the Bill of Materials for each product that a company sells and compile a list of unique part numbers.  For each part number, list the datasheet ID.  (e.g. manufacturer name and document ID)  Print a hardcopy of each unique datasheet and stack them up.  The height of the stack represents the amount of datasheet content that the company's engineers are dealing with.  The number of revisions and errata is proportional to the height of the stack.  In order to reduce the company's exposure to datasheet errata and "surprises", you need to reduce the height of the stack.

Manufacturers:  Form a consortium to establish a system of universal datasheet identifiers

The International Organization for Standardization (ISO) has defined numbering schemes to uniquely identify books, serial publications such as magazines, and printed music.

ISBN - International Standard Book Number  (13 digits)

ISSN - International Standard Serial Number  (8 digits)

ISMN - International Standard Music Number  (13 characters)

A consortium of electronic device manufacturers could set up a system of datasheet identifiers, and this standard could be submitted to the ISO for adoption.

ISDN - International Standard Datasheet Number   (proposed)

I would propose that a datasheet number be identical for all revisions of a datasheet.  A revision number could be appended or simply treated as a separate piece of information.  But the main datasheet number should be consistent across all revisions.

How this would help:

1. Component retailers, such as Digi-Key, could allow engineers to search and sort using a datasheet number (ISDN).  If you search for an ISDN, you will instantly see all of the part numbers associated with that datasheet.  If you search using other criteria and then sort by ISDN, parts will be grouped by product family (i.e. by datasheet).

2. Internet search engines (e.g. Google) could allow people to search for datasheet numbers. Google Advanced Search currently allows users to search for ISBN and ISSN numbers. This type of search would not only locate the manufacturer's datasheet, but would also locate articles and discussions in which a datasheet number was referenced.

3. Someone could develop an app that would automatically search for revisions to the datasheets you are using.  You would just need to provide a list of datasheet numbers.  This would make it easier to determine which, if any, of the products you are using have new revisions or new errata sheets.  This would probably require some coordination with manufacturers, but a standardized numbering system would greatly simplify things.

4. An engineering team could keep a master Bill of Materials listing all of the part numbers used in their products.  An ISDN column in the Master BOM would provide the datasheet number for each part.  A separate lookup table could provide the page count for each datasheet number (ISDN).  A simple computer algorithm could then determine the total number of datasheet pages that the team is dealing with.  (i.e. the height of the datasheet stack)  This would be a useful metric for determining the team's exposure to datasheet errata and "surprises".  If the "datasheet stack height" was observed to be increasing over time, this could help to justify additional hires.
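The page-count calculation in item 4 could be sketched in C roughly as follows. Since no ISDN scheme exists, everything here is hypothetical: the struct layouts, the identifier strings, and the function name. The sketch also assumes each datasheet number appears only once in the lookup table.

```c
#include <stddef.h>
#include <string.h>

/* One row of a hypothetical master BOM. */
typedef struct {
    const char *part_number;
    const char *isdn;        /* proposed datasheet identifier */
} bom_row_t;

/* One row of the hypothetical lookup table: datasheet number -> pages. */
typedef struct {
    const char *isdn;
    unsigned pages;
} datasheet_t;

/* Sums the page count of every datasheet referenced at least once by the
 * BOM, counting each datasheet only once however many parts share it.
 * BOM rows whose datasheet number is missing from the table contribute
 * nothing; a production tool should report those rather than skip them. */
unsigned datasheet_stack_pages(const bom_row_t *bom, size_t bom_len,
                               const datasheet_t *sheets, size_t sheet_len)
{
    unsigned total = 0;
    for (size_t s = 0; s < sheet_len; s++) {
        for (size_t b = 0; b < bom_len; b++) {
            if (strcmp(bom[b].isdn, sheets[s].isdn) == 0) {
                total += sheets[s].pages;
                break;      /* this datasheet is counted; move on */
            }
        }
    }
    return total;
}
```

Iterating over datasheets in the outer loop is what keeps a 97-part family from being counted 97 times.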

I haven't thought this through.  It's more of a suggestion than a plan.  Lots of details would need to be worked out.

Stephen wrote:

In my experience, the easiest vendor to meaningfully communicate with seems to be Microchip. Specifically they do not need you to prove to them what a great customer you are before they help you, and they genuinely do respond to suggestions & errata reports in a generally friendly & helpful manner.

Example - Ages ago I suggested their C code examples should be amended to use <stdint.h> rather than arbitrary type names. They actually did this - they took me up on that suggestion. I have other examples of similar things, for example with respect to IDE bugs or enhancements.
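For readers who have not made that switch, here is a small sketch of what the standard fixed-width types look like in use. The struct and function are hypothetical and are not taken from any Microchip example.

```c
#include <stdint.h>

/* Fixed-width types from <stdint.h> state their size in the name, so the
 * layout does not depend on how a particular compiler defines int or long.
 * The struct and field names are hypothetical. */
typedef struct {
    uint8_t  channel;     /* exactly 8 bits, unsigned  */
    int16_t  raw_sample;  /* exactly 16 bits, signed   */
    uint32_t timestamp;   /* exactly 32 bits, unsigned */
} adc_reading_t;

/* Combines two bytes (high byte first) into one signed 16-bit sample.
 * The final cast to int16_t relies on two's complement wraparound, which
 * mainstream compilers provide. */
int16_t sample_from_bytes(uint8_t high, uint8_t low)
{
    return (int16_t)(uint16_t)(((uint16_t)high << 8) | low);
}
```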

Officer Obie tweeted:

Obie came to the realization that it was a typical case of American Blind justice, and there wasn't nothing he could do about it, and the Judge wasn't going to look at the twenty seven eight-by-ten color glossy pictures with the circles and arrows and a paragraph on the back of each one explaining what each one was to be used as evidence against us.

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intent of this newsletter. Please keep it to 100 words. There is no charge for a job ad.


Alice is looking for someone with a red VW microbus to pick up half a ton of garbage.

Joke For The Week

Note: These jokes are archived at www.ganssle.com/jokes.htm.

Not really a joke, but amusing: Ian Stedman wrote:

I noticed something strange upon re-reading your 'Embedded Muse' column #337. On the picture of the multimeter (I didn't enter to win that one - I have plenty of meters. Let some other worthy win) the '6' (six) on a seven-segment display has its top bar illuminated. On the picture from the logic analyzer (that you swiped from the manual), the '6' (six) does NOT have its top bar illuminated. Now, personally, I'm in the camp that if that top bar ain't there, it's a 'b', not a six, and I will have no truck with the blasphemers who say otherwise. Conversely for a nine, without the bottom bar, that's not a nine, that's a backwards P. The matter is entirely irrelevant, but I think it might be amusing to some of your readers. Should sixes and nines have their top and bottom bars illuminated or not?
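In firmware terms the dispute comes down to a single bit per digit. A minimal sketch in C, assuming the common a-to-g segment naming with bit 0 as segment a; the bit assignment and all names are assumptions, and real displays and drivers vary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Segment bits in the common a..g order. The bit assignment is an
 * assumption for illustration. */
#define SEG_A (1u << 0)  /* top bar      */
#define SEG_B (1u << 1)  /* upper right  */
#define SEG_C (1u << 2)  /* lower right  */
#define SEG_D (1u << 3)  /* bottom bar   */
#define SEG_E (1u << 4)  /* lower left   */
#define SEG_F (1u << 5)  /* upper left   */
#define SEG_G (1u << 6)  /* middle bar   */

/* '6' with its top bar and '9' with its bottom bar lit. */
#define GLYPH_6_WITH_TOP    (SEG_A | SEG_C | SEG_D | SEG_E | SEG_F | SEG_G)
#define GLYPH_9_WITH_BOTTOM (SEG_A | SEG_B | SEG_C | SEG_D | SEG_F | SEG_G)

/* The same digits with that bar off: Ian's 'b' and backwards 'P'. */
#define GLYPH_6_NO_TOP      (GLYPH_6_WITH_TOP & ~SEG_A)
#define GLYPH_9_NO_BOTTOM   (GLYPH_9_WITH_BOTTOM & ~SEG_D)

/* Returns the segment pattern for a six in the chosen style. */
uint8_t glyph_six(bool top_bar_lit)
{
    return (uint8_t)(top_bar_lit ? GLYPH_6_WITH_TOP : GLYPH_6_NO_TOP);
}
```

With this bit order the two styles of six are 0x7D and 0x7C, and the two styles of nine are 0x6F and 0x67.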

Here are the pictures Ian referenced:

Advertise With Us

Advertise in The Embedded Muse! Over 27,000 embedded developers get this twice-monthly publication. For more information email us at info@ganssle.com.

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster. We offer seminars at your site offering hard-hitting ideas - and action - you can take now to improve firmware quality and decrease development time. Contact us at info@ganssle.com for more information.