
By Jack Ganssle

Who's at fault when code kills?

Published 11/24/2004

In 2001 a Cobalt-60 machine in Panama delivered 20% to 100% more radiation than prescribed to 28 patients, killing between 8 and 21 of them.

Who was at fault?

St. Louis-based Multidata Systems provided the software that controlled the machines. Three FDA inspections in the 1990s cited the company for poor specification and documentation procedures, inadequate testing, and a failure to thoroughly investigate customer complaints. Another inspection in 2001, after the tragedy in Panama, revealed more of the same. In particular, the company didn't have a comprehensive testing plan proving the code was "fit for use."

Doctors would hand off a treatment plan to radiation physicists who operated the machine. Lead shields carefully placed around the tumors protected patients' other organs. The physicists used a mouse to draw the configuration of blocks on the screen; the software then computed an appropriate dose.

To better control the gamma-ray beam, physicists sometimes used five lead blocks instead of the recommended four. The software had no provision for this configuration, but users found they could draw a single polygon representing all five blocks. Unfortunately, it was possible to draw a shape that confused the code, causing the machine to deliver as much as twice the required dose.
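Multidata's actual dose algorithm was never published, but a hedged sketch shows how a perfectly reasonable-looking polygon can corrupt a geometric calculation. Suppose, purely as an illustrative assumption, that the blocked area is computed with the standard shoelace formula. If the user enters the vertices in an order that makes the polygon self-intersect, the crossed lobes carry opposite signs and partially cancel, so the computed area - and anything derived from it - is silently wrong:

```python
# Hypothetical illustration only -- NOT Multidata's actual algorithm.
# Shows how vertex order can silently corrupt an area-based calculation.

def shoelace_area(pts):
    """Signed area of a polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(pts)
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]  # wrap around to the first vertex
        total += x1 * y2 - x2 * y1
    return total / 2.0

# A 2x2 square entered in the expected order: area = 4, as it should be.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(shoelace_area(square))   # 4.0

# The SAME four vertices in a "bowtie" order: the polygon
# self-intersects, the two lobes cancel, and the computed area
# collapses to 0 -- with no error or warning of any kind.
bowtie = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(shoelace_area(bowtie))   # 0.0
```

The point isn't the formula; it's that code which blindly trusts geometric input, without validating it for self-intersection or flagging suspicious results, fails exactly the way the physicists describe: silently.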

Multidata contends that the hospital should have verified the dosages by running a test using water before irradiating people, or by manually checking the software's calculations. While I agree that a back-of-the-envelope check on any important computer calculation makes sense, I grew up with slide rules. Back then one had to have a pretty good idea of the size of a result before doing the math. Today most people take the computer's result as gospel. The software has got to be right.

The physicists believe the code should have at least signaled an error if the entered data was incorrect or confusing. Well, duh.

So who's at fault?

This week the physicists were sentenced to four years in prison and barred from practicing their profession for at least another four years (http://www.baselinemag.com/article2/0,1397,1729470,00.asp). So far Multidata has dodged every lawsuit filed against it by injured patients and next of kin.

In my opinion this is a clear miscarriage of justice. Why prosecute careful users who didn't violate any rule laid down in the manual?

Who is at fault when software kills?

Is it management for not instituting a defined software process? Or for squeezing schedules till we're forced to court risky practices?

What about the software engineers? They, after all, wrote the bad code. The very first article of the IEEE's code of ethics states: "[We] accept responsibility in making engineering decisions consistent with the safety, health and welfare of the public, and to disclose promptly factors that might endanger the public or the environment."

But how can we - or worse, the courts - blame users? Sure, there's a class of nefarious customers unafraid to open the cabinet doors and change the system's design. It's hard to guard against that sort of maliciousness. A normal user, running the system in a reasonable way, surely cannot be held accountable for code that behaves incorrectly.

What do you think? If you build safety critical systems, are you afraid of being held criminally accountable for bugs?