Design By Contract™ - Automatic Debugging

(February 2007)

It's hard to remember life before the web. Standing in lines to buy stuff and mail-ordering from out-of-date print catalogs were frustrations everyone accepted. But today many of those problems are long gone, replaced by sophisticated software that accepts orders on-line. For instance, figure 1 shows the result of an order I tried to place with Defender last year. I was tempted to complete the transaction to see my credit card burst into flames.

Defender Order

Figure 1: How did we manage in the pre-computer age?

At least they didn't charge tax, though it would be nearly enough to pay off the national debt.

Thirty five years ago, as a student in a college Fortran class, the instructors admonished us to check our results. Test the goesintas and goesoutas in every subroutine. Clearly, Defender's clever programmers never got this lesson. And, in most of the embedded code I read today few of us get it either. We assume the results will be perfect.

But they're often not. Airane 5's maiden launch in 1996 failed spectacularly, at the cost of half a billion dollars, due to a minor software error: an overflow when converting a 64 bit float to a 16 bit integer. The code inexplicably did not do any sort of range checking on a few critical variables.

Every function and object is saddled with baggage, assumptions we make about arguments and the state of the system. Those assumptions range from the ontological nature of computers (the power is on) to legal variable ranges to subtle dependency issues (e.g., interrupts off). Violate one of these assumptions and Bad Things will happen.

Why should anything be right? Isn't it possible that through some subtle error propagated through layers of inscrutable computation a negative number gets passed to the square root approximation? If one reuses a function written long ago by some dearly departed developer it's all-too-easy to mistakenly issue the call with an incorrect argument, or to make some equally erroneous assumption about the range of values that can be returned.

Consider:

float solve_post(float a, float b, float c){
               float result;
               result=(-b + sqrt(b*b -4*a*c))/2*a;
               return result;}

...which computes the positive solution of a quadratic equation. I compiled it and a driver under Visual Studio, with the results shown in figure 2.

Quadratic equation

Figure 2: result of solving a quadratic equation

The code runs happily even as it computes an impossible result, as is C's wont. The division by zero that results when "a" is zero creates neither an exception nor a signal to the user that something is seriously, maybe even dangerously, amiss.

I can think of three ways this tiny snippet of code can fail.

Consider the square root: we know that a negative argument isn't allowed. But that's not really true; that's an implicit assumption, which may not be correct. In the complex domain different assumptions apply.

Bertrand Meyer thinks languages should include explicit constructs that make it easy to validate the hidden and obvious assumptions that form a background fabric to our software. His must-read Object-Oriented Software Construction (at 1200 pages a massive tome which is sometimes pedantic but is brilliantly written) describes Eiffel, which supports Design By Contract (DBC, and a trademark of Eiffel Software). This work is the best extant book about DBC.

In important human affairs, from marriage to divorce to business relationships, we use contracts to clearly specify each party's obligations. The response "I do" to "Do you promise to love and to hold . . ." is a contract, and agreement between the parties about mutual expectations. Break a provision of that or any other contract and lawyers will take action to mediate the dispute . . . . or to at least enrich themselves. Meyer believes software components need the same sort of formal and binding specification of relationships.

DBC asks us to explicitly specify these contractual relationships. The compiler will generate additional code to check them at runtime. That immediately rules the technique out for small, resource-constrained systems. But today, since a lot of embedded systems run with quite a bit of processor and memory headroom, it's not unreasonable to pay a small penalty to get significant benefits. If one is squeezed for resources, DBC is one of the last things to remove. What's the point of getting the wrong answer quickly?

And those benefits are substantial:

  • The code is more reliable since arguments and assumptions are clearly specified in writing.
  • They're checked every time the function is called so errors can't creep in.
  • It's easier to reuse code since the contracts clearly specify behaviors; one doesn't have to reverse engineer the code.
  • The contracts are a form of documentation that always stays in-sync with the code itself. Any sort of documentation drift between the contracts and the code will immediately cause an error.
  • It's easier to write tests since the contracts specify boundary conditions.
  • Function designs are clearer, since the contracts state, very clearly, the obligations of the relationship between caller and callee.
  • Debugging is easier since any contractual violation immediately tosses an exception, rather than propagating an error through many nesting levels.
  • Maintenance is easier since the developers, who may not be intimately familiar with all of the routines, get a clear view of the limitations of the code.
  • Peace of mind. If anyone misuses a routine, they'll know about it immediately.

Software engineering is topsy-turvey. In all other engineering disciplines it's easy to add something to increase operating margins. Beef up a strut to take unanticipated loads or add a weak link to control and safely manage failure. Use a larger resistor to handle more watts or a fuse to keep a surge from destroying the system. But in software a single bit error can cause the system to completely fail. DBC is like adding fuses to the code. The system is going to crash; it's better to do so early and seed debugging breadcrumbs than fail in a manner that no one can subsequently understand.

Contracts

Let's look at a common process that occurs in many households. How does one get a teenager up for school? There are three main steps, though some may be repeated N times, where N may be a very large number:

  • Wake up the kid
  • Wake up the kid again
  • Scream at the kid

If we were implementing these in software using DBC, the code might look something like this:

Wake_kid
       Precondition: Kid_is_in_bed 	// Kid hasn't snuck out
                                      //   during the night
       Invariant: coffee machine is on
Wake_kid_again
       Precondition: kid_is_not_awake AND Wake_kid implemented once
       Invariant: coffee machine is on
Scream_at_kid
       Precondition: kid_is_not_awake AND (wake_kid called N times)
       Postcondition: kid_appears_awake
       Invariant: coffee machine is on

Each of the three routines has dependencies called "preconditions." A precondition must be satisfied before the function can execute. In Wake_kid if the precondition Kid_is_in_bed is not true, then an error is thrown, the cops get called, and dad's ulcer acts up.

Scream_at_kid has a "postcondition," something that must be true when the routine exits. As long as the kid isn't conscious, the actor continues to scream.

All of the routines have "invariants," conditions that must be true both before the function starts and when it exits. Invariants are supra-functional aspects of system behavior. In this case if the coffee machine isn't on, life is barely worth living so an exception is trapped. An invariant could be included as part of the pre- and postcondition expressions, but repetition is fraught with errors. And, by specifically indicating that the condition is invariant, one makes a strong statement about that condition. Invariants are tested before the precondition check and after the postcondition.

Precondition faults signals a problem in the caller; postcondition problems mean the function failed.

A function (or class) is correct only if it satisfies the invariants and pre- and postconditions. These are formal contracts that the client and supplier make. In defensive programming one learns to check input arguments, for instance. Contracts are the meta-pattern of argument checking, as they insure the inputs, outputs, and stable, unchanging aspects are all correct.

Here's another example. When I do arithmetic by hand I usually end with a postcondition. To subtract 1.973 from 2.054 I'd do this:

               2.054
             - 1.973
             -------
               0.081
            + 1.973
             -------
               2.054

Long experience has taught me that my manual calculations are sometimes wrong, and a quick check costs little. That extra step of adding the result to the subtrahend is nothing more than a postcondition. In C:

float subtract(minuend, subtrahend){
      float temp;
      temp=minuend-subtrahend;
      postcondition: minuend == temp+subtrahend;
      return temp;
}

(Assuming there were some construct like "postcondition:" in C). Clearly, this test guarantees the code is correct. Yup, it doubles the function's workload. But this is a trivial example; we'll see more realistic ones later.

How is this different than a C-style assertion? An assertion expresses the developer's belief about correctness at one point in time, at the beginning, end, or somewhere in the middle of the routine. DBC's constructs, instead, state things that are Truth. They're basic truisms about the structure of the system that stay carved in stone for longer than the scope of a single function. Though this may seem a minor philosophical point, I hope you'll see the distinction as this discussion unfolds.

Yes, it's possible to implement these ideas using assertions. But the concept transcends the limited single-node debugging zeitgeist of an assertion.

It's worth knowing a little about Eiffel even though the language has essentially zero market share in the embedded space, as Eiffel's built-in DBC support has influenced the language people use when discussing Design By Contract. The keyword "require" means a precondition. "Ensure" indicates a postcondition, and, unsurprisingly, "invariant" means just what you think.

Good Contracts

Contracts are not for detecting erroneous inputs. For instance, you wouldn't use a contract to flag a typo from a keyboard . . . but you may use one if the keyboard bursts into flames. Good contracts simply supervise relationships between the function and its caller. "I, the function, promise to fulfill these conditions." "And I, the caller, promise to fulfill these." Then the entire program can erupt in happy mazel tovs.

Software can fail due to outright bugs - programming mistakes - and exceptional conditions, events that are unwanted but not unexpected. Since there's usually no recovery from the former it's wise to trap those with contracts. The later covers events like missed data packets, bad input, noisy analog, and should all be handled by code that takes some remedial action.

Good contracts supervise software to software relationships, not software to users.

Good contracts check things that seem obvious. Though any idiot knows that you can't pull from an empty queue, that idiot may be you months down the road when, in a panic to ship, that last minute change indeed does assume a "get()" is always OK.

Good contracts have no effects. In C, if the contract has a "=" there's something wrong as the state of the system gets changed each time the code tests the contract.

If you don't know what a function should do in an abnormal situation, explicitly exclude that condition using a precondition.

Good contracts have invariants for anything that's structurally important to the function's behavior that may not change during the execution of the function. For instance, if a function relies on a global variable remaining unchanged, write an invariant for that.

Put contracts on things that seem obvious, as months or years later a change in the code or an unexpected hardware modification may create error conditions. If the scale's hardware simply can't weigh anything over 100 grams . . . define a contract to insure that assumption never changes.

DBC

DBC is hardly new, but as yet has almost no penetration into the embedded space, due to a lack of knowledge about the concept, and the fact that mainstream languages provide no intrinsic support.

Regular readers know I'm a zealot about seeding software with debugging traps. The time to start debugging is when you're writing the code, not when a bug rears up and slaps you in the forehead like a cold fish jumping out of the water. Since code typically has a five to ten percent error rate after passing the compiler's syntax check it's clear that a lot of what we produce will be wrong. DBC is perhaps the most powerful technique we have to toss a penalty flag when someone tries to use a function incorrectly.

In my previous article I tried to evangelize DBC, only to finally disappoint you, gentle reader, when I admitted that it's not a part of C or C++. But there are some options. None are perfect, none achieve the clean and seamless DBC possibilities embodied in the Eiffel language. But a 90% solution counts as an A grade in most things. A first approximation to perfection is often good enough.

Charlie Mills wrote a free preprocessor for C programs that extracts contracts embedded in comments, turning them into normal C code. His 2004 article about the package (reference 1) generated some interest that quickly died out. I've found only a single user - though he's enthusiastic about both the preprocessor and DBC in general.

Let's look at some examples using the syntax that Charlie adopted. Since C doesn't accept any sort of DBC keywords, his preprocessor extracts mnemonics embedded in comments as follows:

/**
      pre:  (precondition)
      post: (postcondition)
      inv:  (invariant)
*/

That is, a set of contracts exists in its own comment block which starts with double asterisks. Earlier I showed a function that computes the positive root of a quadratic equation. Here's the same function instrumented with three contracts:

/*  Find the positive solution of a quadratic */
/**
      pre: a != 0
      pre: (b*b-4*a*c) >= 0
      post: result != NaN
*/
float solve_pos(float a, float b, float c){
      float result;
      result= (-b + sqrt(b*b - 4*a*c))/2*a;
      return result;}

The triptych of contracts traps the three ways that this, essentially one line of code, can fail. Isn't that astonishing? A single line possibly teeming with mistakes. Yet the code is correct. Problems stem from using it incorrectly, the very issue DBC prevents. So rather than "failure" think of the contracts as provisions that define and enforce the relationship between the function and the caller. In this case, the caller may not supply zero for a, to avoid a divide by zero. If (b*b-4*a*c) is negative sqrt() will fail. And a very small a could overflow the capacity of a floating point number.

Run this through Charlie's preprocessor and the result, edited for clarity, is:

float solve_pos(float a, float b, float c){
      float ocl__ret_val, result;
      /* Preconditions: */
      if (!(a != 0)) dbc_error("quad.c:3: " "precondition failed");
      if (!((b*b-4*a*c) >= 0))dbc_error("quad.c:4: " "precondition failed");
      result= (-b + sqrt(b*b - 4*a*c))/2*a;
      ocl__ret_val = result; goto ocl__return_here;
ocl__return_here:
      /* Postconditions: */
      if (!(result != NaN))dbc_error("quad.c:5: " "postcondition failed");
      return ocl__ret_val;
}

Break a contract and dbc_error() - which you supply - runs. printf if you can, log the error and halt if you must. But any attempt to use solve_pos()illegally will immediately flag a problem, long before an incorrect error would propagate through hundreds of other calls.

Computing the discriminant twice sounds computationally expensive! The gcc compiler, though, using the -Os (optimize for size) switch on an x86 does the math once and stores the result for use later. gcc also optimizes out the unneeded gotos. Other compilers and processor architectures may give different results. Contracts don't necessarily add much of a memory or performance penalty.

In DBC contracts never cause side effects. Thus the preprocessor does not support the "=" operator.

The preprocessor has numerous options beyond what I've mentioned. See Reference 1 for the full syntax. I think some of the more complex features (which create loops) are not needed and are too computationally expensive.

Downsides? the preprocessor's output is ugly. The indentation strategy (edited out above) just couldn't be worse. I've found one bug: some constants with decimal points stop the preprocessor. The strings (e.g., "quad.c:5 postcondition failed") are long and burn valuable memory. And wouldn't it be nice for the program to optimize out unnecessary jumps? None of these are showstoppers, though.

I did try to make some improvements, but gave up, totally baffled by the language used (Ruby) and the code's structure. No one provides support and Charlie doesn't return my emails. Perhaps one of you can become the Yoda of DBC and fold some improvements into this very useful utility.

Building & Using DBC for C

I installed Charlie's package on both Linux and Windows machines. The Linux build is pretty simple, but under Windows figure on a lot of annoying hunting around for files and copying them to the proper directories. I can't give better advice as it's quite directory-specific. Arm yourself with an adult drink to soothe frayed nerves.

DBC for C is written Ruby, which few of us firmware folk use. He used a parser-generator to simplify the work. Unfortunately, these choices yield a completely opaque structure that's impossible to tune if you're not versed in Ruby and parser languages.

First get Ruby from http://www.ruby-lang.org and install it using the supplied instructions. For Windows use the "one-click install" package. The Linux package has a very easy-to-use script that builds Ruby from the sources.

You'll also need Ruby's header files. In Ubuntu and Debian issue apt-get install ruby1.8-dev. Those were included in the Windows one-click package. With a fresh installation of Linux I needed C header files, obtained via apt-get install libc-dev.

Next get racc, Ruby's version of yacc, from http://i.loveruby.net/en/prog/racc.html. I found it convenient to extract it to a bin directory in my home path. Follow the build instructions, and then test racc per the README. In Linux you may need root privileges, though sudo access suffices.

Under Windows I had no end of trouble getting the right files in the right places, but whenever the build failed just found the missing file and copied it as needed. Eventually I found http://www.dets-home.de/it-writings/Install_rd2_winnt.html which describes the installation process for a Windows version. If troubled by a finicky installation see the references at the end of this article for more information.

Get and extract DBC for C from http://rubyforge.org/frs/?group_id=354. The build instructions are simple and clear, though sudo access may be required under Linux. In Windows I had to fiddle with the file pre-config.rb to because of the Unix file conventions. Comment out lines 25 to 33 and 51 to 59, which move files between directories. Copy them yourself using Windows Explorer.

Once installed, dbcparse.rb converts a C source file full of DBC constructs to a source file with the DBC constructs converted to C code. Ruby dbcparse.rb --help will display the options, though I've been unable to ascertain the meaning of some. To use the program issue:

ruby dbcparse.db -no-line infile.c > outfile.c

Feed outfile.c into your C compiler. If, for some reason, you must compile without contracts, skip dbcparse and compile infile.c. With the contracts embedded in comments, no source changes are needed to enable or disable contract generation. Yet the contracts remain a permanent form of documentation.

Add the -d option to generate doxygen tags in the output file.

Other Options

Options other than the DBC for C I described do exist.

Falling back to Eiffel's DBC syntax, one can write require() (precondition) and ensure() (postcondition) macros. I have no idea how to create an invariant without using some sort of preprocessor, but you could embed the invariant's tests in the two new macros. Use:

void assert(int);
#define require(a) ((void)(a), assert(a))
#define ensure(a)  ((void)(a), assert(a))

Why the double arguments? Contracts should have no side effects. The assignment operator "=" and function calls aren't allowed. Couple the above definition with the use of Lint (you do use Lint, don't you? It's an essential part of any development environment) to flag a contract that uses assignments or function calls.

Using Gimpel's Lint (www.gimpel.com) add this command to the code:
//lint -emacro(730,require)
...so Lint won't complain about passing Booleans to functions.

Now, Lint will be happy with a contract like:
require(n >=0);

But will give warning 666 ("expression with side effects) for contracts like:
require(m=n);
or
require(g(m));

Contracts are also a form of documentation since they clearly describe the limits of a function's behavior. There's some merit in substituting constructs familiar only to C programmers with more English-like words, so I like to define AND and OR corresponding to && and ||, using the former in writing contracts.

The gnu Nana project (reference 9) offers another alternative. It's a collection of DBC-like constructs that do ultimately resolve to error handlers little different than assert() macros, but that are well thought out with an eye to the limited resources typical of our beloved embedded world. One option enables contracts using gdb resources, greatly reducing the overhead required by assert()'s traditional __line__ and __file__ parameters. Support of a limited amount of predicate calculus lets one define quite involved contracts.

Nana's use of I() for invariants and other too-short, cryptic nomenclature is a mistake. Worse is that it supports assignment and function calls in contracts. But it's worth reading the (alas, all too limited) documents and code at the web site listed in reference 9.

There's a least one commercial DBC product that supports C and C++. The 249€ C2 from case-challenged aechmea (reference 10) ties tightly to Microsoft's Visual Studio, and preprocesses C files to convert contracts embedded in comments to code. The keywords are @pre, @post, and @invariant.

A free Linux version is available, but the license limits its use to noncommercial work.

aechmea's product works well; I'd be enthusiastic about it, except for it's limiting Linux license and Windows operation solely with Visual Studio.

I started this report mentioning Eiffel, a language that includes a complete DBC mechanism. Yet Eiffel has only a microscopic presence in the embedded space. SPARK (reference 11), a subset of Ada, is an intriguing alternative that includes pre-compile-time contract checking. That's not a misprint; it has an extremely expressive contract language that transcends the simple invariants, pre- and postconditions discussed here. SPARK's analysis tools examine the code before compilation, flagging any state the code can enter which will break a contract. That's a fundamentally different approach to the run-time checking of contracts, and to debugging in general. If you carefully define the contracts and fix any flaws detected by the tools, the SPARK tools can guarantee your program free from large classes of common defects, including all undefined and implementation-dependent behaviors, data-flow errors and so-called "run-time errors" like buffer overflow, arithmetic overflow and division by zero.

Finally, Dan Saks wrote a very interesting article (reference 12) about building compile-time assertions - again, constructs that catch some classes of errors while you compile, without the use of external tools. The approach is not nearly as comprehensive as DBC, but it is a fascinating way to use the preprocessor to stomp out bugs early.

Good Contracts Make Good Programs

Though contracts do look at lot like assertions, there is an important philosophical difference. An assertion makes a statement about what the programmer thinks is true at one particular time or place in the code. A contract, though, is a strong statement about the relationship of the function to its caller. For instance, there's something profoundly wrong if a routine calls a square root function with a negative argument. Or is there? If sqrt() understands complex math a different set of assumptions, and thus contracts, defines the relationship between caller and callee.

The right contracts prove that the programmer hasn't made incorrect assumptions about code's behavior, a particular bane of reuse. As Bertrand Meyer said, "Reuse without a contract is sheer folly." (Reference 7).

Set contracts on "impossible" conditions, things that just can't happen, like pulling data from an empty queue, or getting a null pointer to a string. A contract stops execution immediately, long before these errors can precipitate a potentially worse failure, producing debugging hints that help us fix things fast.

Place contracts on conditions for which there is no recovery. That excludes incorrect user input, as the code should complain and prompt for better data. If it's possible to retry in some manner, write the appropriate recovery code and don't use a contract.

Thus, there is no recovery from a broken contract. Fail fast; fail safe. Place the system in a safe mode and then halt. Log the error so developers can see what went wrong when, printing or displaying the error if possible, or stuffing the error into flash for later analysis.

In general contracts shouldn't break in the field. Good design and aggressive testing should wring out such problems before shipping. It's best to deal with hardware failures using normal code. If the 5 volt A/D returns 6 volts, give the user a clear error message so there's no doubt what needs to be fixed.

Conclusion

Southwest's broken page

Figure 3 - Southwest's flight status page

A recent February's snowstorms generated a flurry of flight cancellations. Figure 3 shows that, though cancelled, my flight to Chicago was still expected to arrive on time. Clearly, we programmers continue to labor under the illusion that, once a little testing is done, the code is perfect.

Is Design by Contract just a fancy name for the tried and tested assert() macro? I, and many others, argue no even though some C implementations merely expand DBC's keywords into assert() or a construct very much like it. DBC represents a different philosophy about writing programs. Programmers sprinkle assertions about wherever the spirit moves them (which usually means not at all). In DBC we define a function's behavior, to a certain extent, by the contracts. Functions routinely start and end with contracts. Contracts help document the code, yet can never suffer from documentation drift as the slightest misalignment with the code will instantly flag an error.

Contracts are truly about the way we design the code. Assertions check for errors in implementation.

eXtreme Programming, Test-Driven Development, and other agile methods stress the importance of writing test code early. DBC complements that idea, adding built-in tests to insure we don't violate design assumptions.

Write a function's comment header block first. Then write the contracts. Fill in the rest of the comments, and only then produce the code.

References:

  1. Charlie Mills’ article on his DBC by C: http://www.onlamp.com/pub/a/onlamp/2004/10/28/design_by_contract_in_c.html

 

 
Upcoming Events

You can bring this class to your company! Click here to find how we can come to your facility and present the class.

.


Did You Know?

Salary Survey
How does your salary compare? Here are our latest results.