Jack Ganssle's Blog
This is Jack's occasional outlet for thoughts about designing and programming embedded systems. It's a complement to my bi-weekly newsletter The Embedded Muse. Contact me at email@example.com. I'm an old-timer engineer who still finds the field endlessly fascinating (bio).
On N-Version Programming
August 8, 2018
The conventional wisdom is that a very effective way to get higher-reliability software is to have independent teams develop two or more versions of the project from a common set of requirements. All versions run together, and a voting algorithm or other means is used to shut down a faulty program and let the other(s) continue. The idea is that, while every version will have errors, it is unlikely that two will fail in the same way at the same time.
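To make the voting idea concrete, here is a minimal sketch of a majority voter in Python. It is an illustration only, not how any particular system does it: the function name `majority_vote` and the sample outputs are hypothetical, and it assumes each version's output can be compared for exact equality.

```python
from collections import Counter

def majority_vote(outputs):
    """Return (winner, dissenters) for a list of per-version outputs.

    winner is the value produced by a strict majority of the versions,
    or None if no value has a strict majority. dissenters holds the
    indexes of the versions that disagreed with the winner; it is empty
    when there is no winner, since the voter cannot tell who is wrong.
    """
    if not outputs:
        return None, []
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, []
    dissenters = [i for i, out in enumerate(outputs) if out != value]
    return value, dissenters

# Hypothetical case: version 2 has a fault on this input and is outvoted.
print(majority_vote([42, 42, 41]))

# Hypothetical correlated fault: versions 0 and 2 fail identically,
# so the voter picks the wrong answer and flags the correct version.
print(majority_vote([41, 42, 41]))
```

The second call is the worrying case: a voter can only mask faults that are not shared by a majority of the versions.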
Sounds promising, doesn't it? But at least one study suggests otherwise.
John Knight and Nancy Leveson published an experiment in 1986 in which 27 programmers developed 27 versions of a program from a common set of requirements. Each version was subjected to one million automated tests. Surprisingly, many versions failed in exactly the same manner on the same input data.
Their conclusion (with many caveats): "For the particular problem that was programmed for this experiment, we conclude that the assumption of independence of errors that is fundamental to the analysis of N-version programming does not hold. Using a probabilistic model based on independence, our results indicate that the model has to be rejected at the 99% confidence level."
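A rough sketch shows what an independence model predicts. If two versions fail independently, the chance that both fail on the same input is the product of their individual failure probabilities. The failure rates below are made-up numbers chosen for illustration, not figures from the study, and `expected_coincident_failures` is a hypothetical helper.

```python
def expected_coincident_failures(p_a, p_b, n_tests):
    """Expected number of tests on which both versions fail,
    assuming their failures are statistically independent."""
    return p_a * p_b * n_tests

# Hypothetical failure probabilities per test; NOT data from the experiment.
p_a = 1e-4
p_b = 1e-4
n_tests = 1_000_000

# Each version alone would be expected to fail on about 100 tests.
print(round(p_a * n_tests), round(p_b * n_tests))

# Under independence, both failing on the same test should be rare:
# about 0.01 coincident failures across the whole million-test run.
expected = expected_coincident_failures(p_a, p_b, n_tests)
print(round(expected, 6))
```

With numbers like these, seeing even a handful of identical failures on the same inputs would be strong evidence against independence, which is the kind of comparison the quoted conclusion describes.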
Their caveats are many, but the results are striking. The faults that were correlated between versions came in many varieties, but two common themes were a lack of deep understanding of the finite precision of floating point numbers and a surprising weakness with geometry (the problem had to do with radar images from missiles).
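The floating point trap is easy to demonstrate. This is a generic illustration in Python, not code from the experiment; the tolerance value is an assumption picked for the example.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so the sum
# carries a tiny rounding error and an exact comparison fails.
total = 0.1 + 0.2
print(total == 0.3)

# Comparing within a tolerance is the usual defense. The rel_tol
# value here is an arbitrary choice for illustration.
print(math.isclose(total, 0.3, rel_tol=1e-9))
```

Two programmers who both write the exact comparison will both get it wrong on the same inputs, which is exactly the kind of shared blind spot that defeats independent versions.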
These programs were small, on the order of a thousand lines of code. One would think that such a small code base wouldn't be hard to get right.
The subjects were not required to use any particular software engineering approach, and one assumes the usual "just code it up" attitude dominated.
Other researchers have also questioned the efficacy of n-version code. Beyond the results of the Knight/Leveson experiment, many feel that a basic flaw is that most of these multi-version systems are built from a common set of requirements. Once a problem grows in scope it becomes very difficult to define the requirements perfectly. Perfect code built to imperfect requirements means every one of the programs will experience the same failure.
The Space Shuttle famously used a form of N-version programming. Four computers ran identical code and watched each other for problems. A fifth had a completely different code base, built from different requirements. More interestingly, those specs were much simpler than those for the four main machines. The fifth could only make the Shuttle do the bare minimum needed to get into orbit, maintain orbit, and land. That's a very appealing approach, as simpler systems tend to be more correct.
One of the Shuttle's five IBM AP-101S computers
Oh – those requirements for the Shuttle? Capers Jones has numbers for typical sizes of requirements documents in pages. Extrapolating to the Shuttle's 420,000 lines of code, typical projects would weigh in with around 4000 pages – if they were 100% captured, which is almost unheard of. The Shuttle's software requirements document for the four main computers was 40,000 pages! The team spent a third of their schedule nailing those down.
I do think n-version programming can be a boon to reliability. But if the specs aren't perfect, and if poor software engineering techniques are used, the benefits may not be as much as hoped for.
Feel free to email me with comments.