Go here to sign up for The Embedded Muse.
The Embedded Muse
Issue Number 345, March 5, 2018
Copyright 2018 The Ganssle Group

Editor: Jack Ganssle, jack@ganssle.com


You may redistribute this newsletter for non-commercial purposes. For commercial use contact info@ganssle.com. To subscribe or unsubscribe go here or drop Jack an email.

Editor's Notes


Wouldn't you want to be in the top tier of developers? That is part of what my one-day Better Firmware Faster seminar is all about: giving your team the tools they need to operate at a measurably world-class level, producing code with far fewer bugs in less time. It's fast-paced, fun, and uniquely covers the issues faced by embedded developers. Information here shows how your team can benefit by having this seminar presented at your facility.

Thanks to everyone for filling out the salary survey. Due to travel I probably won't have the results for a few weeks, so I'm planning to publish those in April.

I'm on Twitter.

Quotes and Thoughts

"The history of the Internet shows that industry treats security as something that might be added later should the product garner a market base and in response to customer demands." - Hilarie Orman, IEEE Computer, October 2017, in an article provocatively titled "You Let That In?"

Tools and Tips

Please submit clever ideas or thoughts about tools, techniques and resources you love or hate. Here are the tool reviews submitted in the past.

Freebies and Discounts

This month's giveaway is a copy of Jean Labrosse's excellent book "uC/OS-III, The Real-Time Kernel for the Kinetis ARM Cortex-M4."

It will close at the end of March, 2018.

Enter via this link.

A Fire Code for Software?

This week's quote (above) reminds me of the 1980 MGM Grand Hotel fire, in which 85 people were killed and 650 injured. The hotel didn't have sprinklers, as those would have added $200,000 to construction costs. Lawsuit payouts and reconstruction costs eventually totaled over $400 million. Fire protection is like security: the benefits aren't felt until disaster strikes.

[Image: MGM Grand Hotel burning]

Fires like the one at the MGM were once common occurrences. Sweeping fires are today so unusual that the once-dreaded word conflagration sounds quaint to our modern ears. Yet in 19th century America a city-burning blaze consumed much of a downtown area nearly every year.

Fire has been mankind's friend and foe since long before Homo sapiens or even Neanderthals existed. Researchers suspect proto-humans domesticated it some 790,000 years ago. No doubt in the early days small tragedies - burns and such - accompanied this new tool. As civilization dawned, and then the industrial revolution drove workers off the farm, closely-packed houses and buildings erupted into conflagration with heartrending frequency.

In 1835 a fire in lower Manhattan destroyed warehouses and banks, the losses bankrupting essentially every fire insurance company in the city. The same area burned again in 1845.

Half of Charleston, SC burned in 1838.

During the 1840s fire destroyed parts of Albany, Nantucket, Pittsburgh, and St. Louis. The next decade saw Chillicothe, OH, St. Louis (again), Philadelphia and San Francisco consumed by flames. Much of my hometown of Baltimore burned in 1904. San Francisco was hit again during the 1906 earthquake; that fire incinerated 4 square miles and is considered one of the world's worst fires.

Mrs. O'Leary's cow may or may not have started the Great Chicago Fire that took 300 lives in 1871 and left some 90,000 homeless. The largely wooden city had received only 2.5 inches of rain all summer, so it turned into a raging inferno advancing nearly as fast as people could flee. But Chicago wasn't the only Midwestern dry spot; on the very same day an underreported fire in Peshtigo, WI killed over 1000.

[Image: Great Chicago fire]

A year later Boston burned, the fire destroying 8% of the capital of Massachusetts. 1889 saw the same part of Boston again ablaze.

Theaters succumbed to the flames with great regularity. Painted scrims, ropes, costumes, and bits of wood all littered the typical stage while a tobacco-smoking audience packed the buildings. In Europe and America 500 fires left theaters in ruins between 1750 and 1877. Some burned more than once: New York's oft-smoldering Bowery Theatre was rebuilt 5 times.

The historical record sheds little light on city-dwellers' astonishing acceptance of repeated blazes. By the 1860s fireproof buildings were well understood though rarely constructed. Owners refused to pay the slight cost differential. At the time only an architect could build a fireproof building because such a structure used somewhat intricate ironwork which required carefully measured drawings. Few developers then consulted architects, preferring instead to just toss an edifice up using a back-of-the-envelope design.

Sound familiar?

Crude sprinklers came into being in the first years of the 19th century, yet it wasn't till 1885 that New York law required their use in theaters. But even those regulations were weak, reading "as the inspector shall direct." Inspectors' wallets fattened as corruption flourished. People continued to perish in horrific blazes.

The 1890 invention of the modern sprinkler reduced the cost of a fire to just 7% of that incurred in a building without the devices. As many as 150 theaters had them by 1905.

Yet, as mentioned, nearly a century later the MGM Grand Hotel didn't have sprinklers. Though fire marshals had insisted that sprinklers be installed in the casino and hotel, local law didn't require them.

The local law was changed the following year.

Fire codes evolved in a sporadic fashion. Before the Civil War only tenements in New York were required to have any level of fireproofing. But the New York Times made a ruckus over an 1860 tenement fire that eventually helped change the law to mandate fire escapes for some - not many - buildings.

A fire at New York's Conway's Theatre in 1876 killed nearly 300 people and led to a more comprehensive building code in 1885. 13 years after the Great Fire, Chicago finally adopted the first of many new codes.

This legislation by catastrophe wasn't proactive enough to ensure the public safety. Consider the 1903 Iroquois Theater fire in Chicago. Shortly before it opened, Captain Patrick Jennings of the local fire department made a routine inspection and found grave code violations. There were no sprinklers, no exit signs, no fire alarms. Standpipes weren't connected. Yet officials on the take allowed the theater to open.

A month after the first performance 600 people were killed in a fast-moving fire. All of the doors were designed to open inwards. Hundreds died in the crush at the exits. Actor Eddie Foy saved uncounted lives as he calmed the crowd from the stage. Ironically, he and Mrs. O'Leary had been neighbors; as a teenager he barely escaped the 1871 fire.

Afterwards a commission found fault with all parties, including the fire department: "They seemed to be under the impression that they were required only to fight flames and appeared surprised that their department was expected by the public to take every precaution to prevent fire from starting." I'll get back to that statement towards the end of this article.

Carl Prinzler had tickets for the performance but the press of business kept him away. He was so upset at the needless loss of life that he worked with Henry DuPont to invent the panic bar lock now almost universally used on doors in public spaces.

Fast forward 83 years. Dateline San Juan, 1986. 97 died in a blaze at the coincidentally named DuPont Plaza hotel; 55 of those were found in a mass at an inward opening door. In 1981 48 people were lost in a Dublin disco fire because the Prinzler/DuPont panic bars were chained shut. In 1942 nearly 500 were killed in yet another fire, at Boston's Cocoanut Grove nightclub; 100 of those were found piled up in front of inward opening doors. Others died constrained by chained panic bars.

Many jurisdictions did learn important lessons from the Iroquois disaster but took too long to implement changes. Schools, for instance, modified buildings to speed escape and started holding fire drills. Yet 5 years after Iroquois a fire in Cleveland took the lives of 171 children and two teachers. The exit doors? They opened inwards.

Changes to fire codes came slowly and enforcement lagged. But the power of the press and public outrage should never be underestimated. The 1911 fire at New York's Triangle Shirtwaist Company was a seminal event in the history of codes. Flames swept through the company's facility on the 8th, 9th and 10th floors. Ladders weren't tall enough and the fire department couldn't fight it from the ground. 141 workers were killed; bodies plummeting to the ground eerily presaged 9-11.

[Image: Triangle Shirtwaist fire]

But at this point in American history reform groups had taken up the cause of worker protections. Lawmakers saw the issue as good politics. Demonstrations, editorials and activism in this worker-friendly political environment led to many fire code changes.

Though you'd think insurance companies would work for safer buildings, they had little interest in reducing fires or mortality. CEOs simply increased premiums to cover escalating losses. In the late 1800s mill owners struggling to contain costs established the Associated Factory Mutual Fire Insurance Companies, an amalgamated non-profit owned by the policyholders. It offered far lower rates for mills made to a standard, much safer, design.

The AFM created the National Board of Fire Underwriters to investigate fires and recommend better construction practices and designs. 1905 saw the first release of their Building Code. 6,700 copies of the first edition were distributed. Never static, it evolved as more was learned. Amendments made to the code after the Triangle fire, for instance, improved mechanisms to help people egress a burning building.

MIT-trained electrician William Merrill convinced other insurance companies to form a lab to analyze the causes of electrical fires. Incorporated in 1901 as the Underwriters' Laboratories, UL still sets safety standards and certifies products.

Our responses to fires, collapsing buildings and the threats from other perils of industrialized life all seem to follow a similar pattern. At first there's an uneasy truce with the hazard. Inventors then create technologies to mitigate the problem, such as fire extinguishers and sprinklers. Sporadic but ineffective regulation starts to appear. Trade groups scientifically study the threat and learn reasonable responses. The press weighs in, as pundits castigate corrupt officials or investigative reporters seek a journalistic scoop. Finally, governments legislate standards. Always far from perfect, they do grow to accommodate better understanding of the problem.

Which brings us to software. Though computer programs aren't as yet as dangerous as fire, flaws can destroy businesses, disrupt elections and even kill. Faulty car code has killed and injured passengers. Software errors in radiotherapy devices have maimed and taken lives. Some jets can't fly without overarching software control. You can't open a newspaper without reading about the latest security breach, which often affects tens or hundreds of millions.

Why is there no fire code for software?

In the USA the feds currently mandate standards for some firmware. In Europe regulations are coming into place for data security. There are some documented processes for developing better code. But most of us play in a wildly-unregulated and unconstrained space.

Firmware is at a point in time metaphorically equivalent to the fire industry in 1860. We have sporadic but mostly ineffective regulation. The press occasionally warms to a software crisis, but there's little furor over the state of the art.

Rest assured there will be a fire code for software. As more life- and mission-critical applications appear, as firmware dominates every aspect of our lives, and as bugs cause some horrible disasters, the public will no longer tolerate errors and crashes. For better or worse, our representatives will see the issue as good politics. Legislation by catastrophe drove the fire codes, and will drive the software codes.

Just as certain software technologies lead to better code, the technology of fireproofing was well understood long before ordinances required its use. The will to employ these techniques lagged, as it does for software today.

There's a lot of snake oil peddled as miracle software cures. Common sense isn't one of them. I have visited CMM level 5 companies (the highest level of certification, one that costs megabucks and many years to achieve) where too many of the engineers had never heard of peer reviews. These are required at level 3 and above. Clearly the leaders were perverting what is a fairly reasonable, though heavyweight, approach to software engineering. Such behavior stinks of criminal negligence. It's like bribing the fire marshal.

I quoted the Iroquois fire's report earlier. Here's that sentence again, with a few parallels to our business in parentheses: "They (the software community) seemed to be under the impression that they were required only to fight flames (bugs) and appeared surprised that their department was expected by the public to take every precaution (inspections, careful design, encapsulation, and so much more) to prevent fire (errors) from starting."

Douglas Adams said "Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so." After 790,000 years of firefighting we have finally learned that fire is, well, kind of dangerous and we'd better construct buildings appropriately. We have only 70 or so years of experience with software and its perils, but we're learning that code can be, well, kind of dangerous and we'd better construct it appropriately.

I collect software disasters, and have files bulging with examples that all show similar patterns. Inadequate testing, uninspected code, shortcutting the design phase, lousy exception handlers and insane schedules are responsible for many of the crashes. We all know these things, yet seem unable to benefit from this knowledge. I hope it doesn't take us 790,000 years to institute better procedures and processes for building great firmware.

Do you want fire codes for software? The techie and libertarian in me screams "never!" But perhaps that's the wrong question. Instead ask "do I want conflagrations? Software disasters, people killed or maimed by my code, systems inoperable, customers angry?" No software engineering methodology will solve all of our woes. But continuing to adhere to ad hoc, chaotic processes guarantees we'll continue to ship buggy code.

When I was researching this, a firefighter left me with this chilling thought: "I actually find bad software even more dangerous than fire, as people are already afraid of fire, but trust all software."

More on Bugs and Errors

Many readers responded to the article "Is It a Bug or an Error?" in the last issue.

Benjamin Noack sent a link to an article about how the Space Shuttle's code was written.

Michael Covington wrote:

The essence of the "bug" issue is this:  In other kinds of engineering, we have unexpected changes in physical properties, such as capacitors that leak more than we expected, or pieces of metal with hidden cracks. Those are problems that legitimately sneak in.  In software, we don't have anything like that.  There is no doubt what each machine instruction will do.

So I call them errors, not bugs.

Sergio Caprile commented on dyslexia and programming:

My wife has dyslexia, and I sort of "developed" a similar disorder through countless late nights of Z-80 assembly programming, coming from a 6800 tuition.

Basically, one who suffers dyslexia has a randomly switchable inverter in the middle of his thought to expression mechanism. You say "right" and turn left, you say "white" and draw black, you say "52" and press first '2' and then '5' after that. She sees a Stallone movie and tells me she saw a Schwarzenegger movie...

I tend to sometimes reverse numbers in the middle of a number string, as if translating endianness in real life.

So... still mistakes, human errors. Caused by a disorder, yes, but in my opinion that is no excuse; either you don't program or you assume responsibility for what you do.

Ray Keefe contributed:

I agree completely with what you wrote about the use of the word "bug". We all make mistakes, errors, misunderstand a requirement or sometimes lack the domain experience to realise what was needed and so have to rethink our original design in light of that learning. But it doesn't slip in when we weren't looking like the famous moth in the relay of a computer.

If it is a chip and it has flaws you get errata. These are errors being documented or recognised. They recognise them as errors.

Errors in code can stem from many different sources. A very common one is poor requirements capture. So I could write code that correctly implements the captured requirements but it is still in error according to what the client or end user needed. 

An Intel compiler I used in my early programming days had a substantial compiler manual and an even larger book titled "List of known infelicities". Infelicity being a synonym of mistake, error, blunder... Someone had detailed all the things that did not work and what to do about them if you encountered them. Simple things like you could not use long as the loop variable in for loops. This was 1988. So they took responsibility for the faults and gave you advice on dealing with them. And it helped. I rarely had a problem but if something didn't work as expected we checked in there to make sure it wasn't the tool. 

In an ideal world all mistakes would be found and corrected, regardless of their source. In practice we have to rely on good process, reviews, using tools correctly, tools working correctly, not skipping steps or cutting corners, keeping optimisation levels down where we can and doing whatever is needed to get to an acceptable finish line. 

From Rod Main:

In software terms, what is a bug?  The usage I'm familiar with covers a multitude of sins. Yes, some of them are errors and mistakes on my part.

Some of them are other people's views on what my code was supposed to do. My code does not contain an error; it works exactly as I intended it to.  However, due to vague requirements and specifications, other people come along and say "I thought it would do xxx. It doesn't so your code has a bug in it".   As you note, there is plenty of blame to be apportioned.  But is there an error in the code? Is it wrong?

Your quote from Dijkstra contains the line "The nice thing of this simple change of vocabulary is that it has such a profound effect: while, before, a program with only one bug used to be 'almost correct', afterwards a program with an error is just 'wrong' (because in error)."    Possibly that illustrates Dijkstra's own understanding of what a bug is. "Almost correct" does not mean correct. In fact, it means precisely the opposite. Consider "I almost died".  Did you? No!   Things that contain "almost" mean they were considered but aren't true or didn't happen.

Some comments seem to imply that the word "bug" is acceptable to management whereas "error" isn't.  "if we said we had an error in the code management would be all over us". That would seem to be a fault in management.  Perhaps this side of the pond, management are less tolerant about shipping code with bugs in. Here, no matter what we call it, it's going to have to be fixed before it goes out the door.

We also say code is "buggy" to cover the case where defects and deficiencies have come to light at a later stage.  In some cases this is because "enhancements" to the software have exposed defects that already existed or the enhancement broke something in an unexpected way.  There have been several cases where I have been tasked to find and fix the faults/errors/bugs in code that I didn't write. This is not a case of "owning" the mistake or owning up to an error. It's about fixing a deficiency that the original author couldn't have foreseen and/or the most recent change author thought they were isolated from.  Basically, the current code has a bug.

However, the bottom line of all this discussion seems to be that the word "bug" is a way for a code author to absolve themselves of responsibility for the problem. Really?  If that is the case then it doesn't matter what you call it, responsibility will be shirked. (Time to recommend those programmers for external career development). So is this the real problem:- Programmers are not as ethical as they once were? 

In my opinion it's not what you call it that's the problem, it's how you approach it and what you are going to do about it.   If a car has a problem with the engine, painting the car a different colour is not going to fix it.

Rob Aberg tied comments about Ada/SPARK to the bug discussion:

Nick P's additions in the "More on SPARK and Ada" section of EM 344 were thought-provoking -- thank you for the ongoing series of updates to this important topic.  My takeaway is that even if the code is correctly constructed per the requirements, there are additional sources of error / problems to consider that can creep into 100% correct-by-construction code and render it broken or useless. Here is a short sampler:

  • Requirements for "this code" are bogus/malformed/incomplete, as shown via static analysis or during testing (example)
  • Requirements for "this code" are incomplete or inadequate, e.g., tolerance stacks that conflict with system requirements, requirements missing system states, etc. (e.g., a part tolerance of 1% of full scale can be harder to manufacture and at the same time much broader than 1% of nominal operating point ... also see your previous article on "typical" and "nominal" values for parts - integrated code that isn't robust to real-world tolerance stacks will fail, even if unit tests pass just fine - indicative of a system-level / design problem)
  • Internal changes to requirements in upstream or downstream code (w.r.t. "this code") can quietly change assumptions that affect the interface in an unexpected way (e.g., code that runs just prior to yours was changed and now it thrashes the processor's cache memory so the compute latency of your code mysteriously increases and is no longer in spec - the cache state was not in the spec, i.e., this is probably a special case of the previous bullet)
  • The requirements themselves are correct, even per the contract, but they don't meet customer expectations - contract / payment mayhem ensues (e.g., Apple's careful updates to cope better with old batteries weren't fully appreciated by customers, to say the least)

Initially, a software engineer correctly implementing another person's requirements will often be blamed for the above types of errors, and will exonerate themselves only after a detailed - and often expensive - analysis. Conclusion:  the concept of "zero bugs" or even "zero errors" in code is a necessary condition for success, but not sufficient by itself to ensure success. At least not for the next contract with the same customer.
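
As a rough illustration of the tolerance point in the sampler above, the arithmetic can be sketched in a few lines of Python. The 0-100 measurement range and the operating point of 10 are made-up numbers chosen only to show the effect; they are not from the letter, and error_band is a hypothetical helper.

```python
# Hypothetical sensor: the 0-100 range and the 10-unit operating point are
# illustrative assumptions, not figures from the letter.
FULL_SCALE = 100.0      # top of the measurement range
NOMINAL_POINT = 10.0    # where the system actually operates
TOLERANCE = 0.01        # "1%" as a spec sheet might quote it


def error_band(reference: float, tolerance: float) -> float:
    """Absolute error allowed when the tolerance is a fraction of `reference`."""
    return reference * tolerance


# The same "1%" gives very different absolute bands.
band_full_scale = error_band(FULL_SCALE, TOLERANCE)
band_nominal = error_band(NOMINAL_POINT, TOLERANCE)

# The full-scale band expressed relative to the operating point.
relative_at_nominal = band_full_scale / NOMINAL_POINT

print(f"1% of full scale allows +/-{band_full_scale:.2f} units")
print(f"1% of nominal allows    +/-{band_nominal:.2f} units")
print(f"At the operating point the full-scale band is {relative_at_nominal:.0%} of the reading")
```

With these assumed numbers, a part specified at 1% of full scale may be off by a tenth of the reading at the operating point, which is the kind of gap a unit test written against "1%" could easily miss.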

This Week's Cool Product

Microchip has an interesting new part called the ATSAMA5D27C-D1G-CU System in Package (SiP). Near as I can tell it's a Cortex A5 with DDR2 memory integrated onto it. That's important because laying out high-speed interconnects on a PCB can be difficult. Now, that's about all I can find out, since (as of this writing) the link provided gives a 404 error. But this is the classiest 404 I've ever seen - they offer a $4.04 discount on an ATTINY board for the inconvenience of being directed to a missing page!

Kudos, Microchip!

Note: This section is about something I personally find cool, interesting or important and want to pass along to readers. It is not influenced by vendors.

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intent of this newsletter. Please keep it to 100 words. There is no charge for a job ad.


Joke For The Week

Note: These jokes are archived at www.ganssle.com/jokes.htm.

You might be an engineer if:

  • You have ever purchased an electronic appliance "as-is"
  • You see a good design and still have to change it
  • The salespeople at Best Buy can't answer any of your questions
  • You own a set of itty-bitty screwdrivers, but you don't remember where they are
  • You rotate your screen savers more frequently than your automobile tires

Advertise With Us

Advertise in The Embedded Muse! Over 27,000 embedded developers get this twice-monthly publication. For more information email us at info@ganssle.com.

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster. We offer seminars at your site offering hard-hitting ideas - and action - you can take now to improve firmware quality and decrease development time. Contact us at info@ganssle.com for more information.