Metrics We Need

Summary: Engineering is about numbers; firmware people need to collect metrics.

In a recent article (Start Collecting Metrics Now) I stressed the importance of collecting metrics to understand and improve the software engineering process. It's astonishing how few teams do any measurements, which means few have any idea whether they are improving or getting worse, or whether their efforts are world class or constitute professional malpractice.

Two of the easiest and most important metrics are defect potential and defect removal efficiency. Capers Jones, one of the more prolific, and certainly one of the most important, researchers in software engineering, pioneered these measurements.

Defect potential is the total number of bugs found during development (tracked after the compiler gives a clean compile; ignore the syntax errors it finds) and for the first 90 days after shipping. Every bug reported, every mistake corrected in the code, counts. Count even those that Joe fixes while he is buried in the IDE doing unit tests. No names need be tracked; this is all about the code, not the personalities.

Defect removal efficiency is simply the percentage of those removed prior to shipping. One would hope for 100% but few achieve that.
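To make the two definitions concrete, here is a minimal sketch in Python. The class name, field names, and the sample counts (470 and 30) are hypothetical, chosen only for illustration; they are not data from any real project.

```python
from dataclasses import dataclass


@dataclass
class DefectCounts:
    """Hypothetical tally of bugs for one release (illustrative names)."""
    found_before_ship: int    # bugs fixed after the first clean compile, up to release
    found_first_90_days: int  # bugs reported in the first 90 days after shipping

    @property
    def defect_potential(self) -> int:
        # Total bugs found during development plus the first 90 days in the field.
        return self.found_before_ship + self.found_first_90_days

    @property
    def removal_efficiency(self) -> float:
        # Percentage of the defect potential removed prior to shipping.
        if self.defect_potential == 0:
            raise ValueError("no defects recorded; efficiency is undefined")
        return 100.0 * self.found_before_ship / self.defect_potential


# Made-up example numbers, not measurements.
counts = DefectCounts(found_before_ship=470, found_first_90_days=30)
print(counts.defect_potential)    # 500
print(counts.removal_efficiency)  # 94.0
```

Note that the efficiency cannot be known until the 90-day window closes, so the number for a release is always computed after the fact.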

These two metrics are then used to do root cause analysis: why did a bug get shipped? What process can we change so it doesn't happen again? How can we tune the bug filters to be more effective?

Doing this well typically leads to a 10x reduction in shipped bugs over time. Data from one client I worked with bears this out: over the course of seven quarters they reduced the number of shipped bugs by better than an order of magnitude by doing this root cause analysis.

What are common defect potentials? They are all over the map. Malpractice is when we ship 50 bugs/1000 lines of code (KLOC). 1/KLOC is routinely achieved by disciplined teams, and 0.1/KLOC by world-class outfits.
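Normalizing to bugs per KLOC is what makes these figures comparable across projects of different sizes. A small sketch, assuming a hypothetical project with 12 shipped bugs in 40,000 lines; the function name and the sample numbers are made up, while the three reference points are the ones quoted above.

```python
def defects_per_kloc(shipped_defects: int, lines_of_code: int) -> float:
    """Shipped bugs per 1000 lines of code."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return shipped_defects / (lines_of_code / 1000)


# Reference points from the text, in shipped bugs per KLOC.
REFERENCE_POINTS = {
    "malpractice": 50.0,
    "disciplined team": 1.0,
    "world class": 0.1,
}

# Hypothetical project: 12 shipped bugs in 40,000 lines of code.
density = defects_per_kloc(shipped_defects=12, lines_of_code=40_000)
for label, value in REFERENCE_POINTS.items():
    relation = "at or below" if density <= value else "above"
    print(f"{density:.2f}/KLOC is {relation} the {label} figure of {value}/KLOC")
```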

According to data Capers Jones shared with me, software in general has a defect removal efficiency of 87%. Firmware scores a hugely better 94%. We embedded people do an amazingly good job. But defect injection rates run 5-10%, so a million lines of code carries a defect potential of at least 50,000 bugs. Removing 94% of those still means we're shipping with 3000 or more.
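The arithmetic behind that estimate can be sketched as follows. This assumes the 5-10% injection rate means defects per line of code, which is the reading consistent with the 3000 figure; the function and parameter names are hypothetical.

```python
def estimated_shipped_defects(lines_of_code: int,
                              injection_percent: int,
                              removal_percent: int) -> float:
    """Estimate bugs that escape to the field.

    Assumption: injection_percent is defects injected per 100 lines of code.
    Whole-number percentages keep the arithmetic exact for this example.
    """
    defect_potential = lines_of_code * injection_percent / 100
    escaped_fraction = (100 - removal_percent) / 100
    return defect_potential * escaped_fraction


# One million lines at the firmware removal efficiency of 94%.
low = estimated_shipped_defects(1_000_000, injection_percent=5, removal_percent=94)
high = estimated_shipped_defects(1_000_000, injection_percent=10, removal_percent=94)
print(low, high)

# The same code at the 87% figure quoted for software in general.
general = estimated_shipped_defects(1_000_000, injection_percent=5, removal_percent=87)
print(general)
```

Under these assumptions the 94% case lands at roughly 3000 to 6000 shipped bugs, and the 87% case at about 6500 even at the low end of the injection range.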

What are your numbers? Do you track this, or anything?

Published June 26, 2013