Debugging ISRs - Part 1

This is part 1 of a two-part series on debugging interrupt service routines.

Published in Embedded Systems Programming, May, 1996

By Jack Ganssle

Few embedded systems are so simple they can work without at least a few interrupt sources. Few designers manage to get their product to market without suffering metaphorical scars from battling interrupt service routines (ISRs).

There's no science to debugging these beasts, which are often the most complex part of any real time system. Few college courses address ISRs at all, let alone debugging scenarios. Too many of us become experts at ISRs the same way we picked up the secrets of the birds and the bees - from quick conversations in the halls and on the streets with our pals. There's got to be a better way!

Vector Overview

One common complaint against interrupts is that they are difficult to understand. There is an element of truth to this, especially for first time users. However, just as we all somehow shattered our parents' nerves and learned to drive a stick-shift, we can overcome inexperience to be competent at interrupt-based design.

Fortunately there are only a few ways that interrupts are commonly handled. By far the most prevalent is the Vectored scheme. A hardware device, either external to the chip or an internal I/O port (as on a high integration CPU like the 188 or 68332) asserts the CPU's interrupt input.

If interrupts are enabled (via an instruction like STI or EI), and if that particular interrupt is not masked off (high integration processors almost always have some provision to selectively enable interrupts from each device), then the processor responds to the interrupt request with some sort of acknowledge cycle.

The requesting device then supplies a Vector, typically a single byte pointer to a table maintained in memory. The table contains at the very least a pointer to the ISR.

The CPU pushes the program counter so at the conclusion of the interrupt the ISR can return to where the program was running. Some CPUs push other data as well, like the flag register. It then uses the vector to look up the ISR address and branches to the routine.

At first glance the vectoring seems unnecessarily complicated. Its great advantage is support for many varied interrupt sources. Each device inserts a different vector; each vector invokes a different ISR. Your UART Data_Ready ISR is called independently of the UART Transmit_Buffer_Full ISR.
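The lookup the CPU performs can be modeled in a few lines of C. This is only a sketch of the idea, not any particular processor's mechanism: the vector numbers, the handler names and the dispatch() function are all invented for illustration, and on real silicon the lookup happens in hardware.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256          /* a single-byte vector selects one of 256 entries */

typedef void (*isr_t)(void);

/* Counters so the effect of each (hypothetical) handler can be observed. */
volatile unsigned rx_count;
volatile unsigned tx_count;

static void uart_data_ready_isr(void) { rx_count++; }
static void uart_tx_buffer_isr(void)  { tx_count++; }

/* The vector table: entry N holds the address of the ISR for vector N. */
static isr_t vector_table[NUM_VECTORS];

/* Assumed vector numbers; real values come from the hardware design. */
enum { VEC_UART_RX = 0x20, VEC_UART_TX = 0x21 };

void install_handlers(void)
{
    vector_table[VEC_UART_RX] = uart_data_ready_isr;
    vector_table[VEC_UART_TX] = uart_tx_buffer_isr;
}

/* Models what the CPU does after the acknowledge cycle: use the supplied
 * vector to look up the ISR address, then branch to it. Returns 0 if the
 * slot is empty (real hardware would happily jump to whatever is there). */
int dispatch(uint8_t vector)
{
    isr_t handler = vector_table[vector];
    if (handler == NULL)
        return 0;
    handler();
    return 1;
}
```

Each device supplies its own vector, so the receive and transmit handlers never need to know about each other.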

Simple CPUs sometimes avoid vectoring to directly invoke the ISR. This greatly simplifies the code, but, unless you add a lot of manual processing, limits the number of interrupt sources a program can conveniently handle.

General Design Guidelines

Crummy code is hard to debug. Crummy ISRs are virtually undebuggable. The software community knows it's just as easy to write good code as it is to write bad. Give yourself a break and design hardware and software that eases the debugging process.

Poorly coded interrupt service routines are the bane of our industry. Most ISRs are hastily thrown together, tuned at debug time to work, and tossed in the "oh my god it works" pile and forgotten. A few simple rules can alleviate many of the common problems.

First, don't even consider writing a line of code for your new embedded system until you lay out an interrupt map. List each interrupt, and give an English description of what its routine should do. Include your estimate of the interrupt's frequency. Figure out the maximum, worst-case time available to service each. This is your guide: exceed this number, and the system stands no chance of functioning properly.

Approximate the complexity of each ISR. Given the interrupt rate, with some idea of how long it'll take to service each, you can assign priorities (assuming your hardware includes some sort of interrupt controller). Some developers assign the highest priority to things that must get done; remember that in any embedded system every interrupt must be serviced sooner or later. Give the highest priority to things that must be done in staggeringly short times to satisfy the hardware or the system's mission (like, to accept data coming in from a 1 Mb/sec source).
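An interrupt map can live in a spreadsheet, but it is easy enough to sketch as a C table and let the machine do the arithmetic. Everything below is an assumption made up for the example: the three interrupt sources, their rates and their service times are not from any real design.

```c
#include <stddef.h>

/* One row of the interrupt map. All figures below are made-up examples. */
struct irq_entry {
    const char *description;   /* English description of what the ISR does */
    double max_rate_hz;        /* estimated worst-case interrupt frequency  */
    double service_us;         /* estimated time to service one interrupt   */
    double deadline_us;        /* maximum time available to service it      */
};

static const struct irq_entry irq_map[] = {
    { "1 ms system tick: bump a counter",     1000.0,  5.0, 1000.0 },
    { "UART receive: queue one byte",          960.0, 10.0, 1000.0 },
    { "Fast data source: latch one sample",  50000.0,  4.0,   20.0 },
};
#define IRQ_COUNT (sizeof irq_map / sizeof irq_map[0])

/* Fraction of CPU time spent in ISRs: the sum of (rate * service time). */
double isr_cpu_load(void)
{
    double load = 0.0;
    for (size_t i = 0; i < IRQ_COUNT; i++)
        load += irq_map[i].max_rate_hz * irq_map[i].service_us * 1e-6;
    return load;
}

/* Index of the entry with the shortest deadline: the natural candidate
 * for the highest priority. */
size_t tightest_deadline(void)
{
    size_t best = 0;
    for (size_t i = 1; i < IRQ_COUNT; i++)
        if (irq_map[i].deadline_us < irq_map[best].deadline_us)
            best = i;
    return best;
}
```

With these invented numbers the ISRs eat a bit over 21% of the CPU, and the fast data source, with only 20 microseconds to spare, is the one that earns top priority.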

Keep ISRs short. Short, of course, is measured in time, not in code size. Avoid loops. Avoid long complex instructions (repeating moves, hideous math and the like). Think like an optimizing compiler: does this code really need to be in the ISR? Can you move it out of the ISR into some less critical section of code?

For example, if an interrupt source maintains a time-of-day clock, simply accept the interrupt and increment a counter. Then return. Let some other chunk of code - perhaps a non-real time task spawned from the ISR - worry about converting counts to time and day of the week.
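A minimal sketch of that split, assuming a 100 Hz timer (the tick rate and all the names here are my own choices for illustration):

```c
#include <stdint.h>

#define TICKS_PER_SECOND 100u   /* assumed timer rate */

volatile uint32_t tick_count;

/* The ISR does the bare minimum: count the tick and return. */
void timer_isr(void)
{
    tick_count++;
}

struct time_of_day { unsigned hours, minutes, seconds; };

/* Runs outside the ISR, so the divides cost nothing in interrupt time. */
struct time_of_day ticks_to_time(uint32_t ticks)
{
    uint32_t secs = ticks / TICKS_PER_SECOND;
    struct time_of_day t;
    t.seconds = (unsigned)(secs % 60u);
    t.minutes = (unsigned)((secs / 60u) % 60u);
    t.hours   = (unsigned)((secs / 3600u) % 24u);
    return t;
}
```

One caution: on an 8 or 16 bit CPU a read of the 32 bit counter is not atomic, so the non-interrupt code would need to disable interrupts briefly (or read twice and compare) before passing the count to ticks_to_time().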

Ditto for command processing. I see lots of systems where an ISR receives a stream of serial data, queues it to RAM, and then executes commands or otherwise processes the data. Bad idea! Simplify the code by having the ISR simply queue the data. If time is really pressing (i.e., you need real time response to the data), consider using another task or ISR, one driven via a timer which interrupts at the rate you consider "real time", to process the queued data.
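Here is one way the queue-only ISR might look. Treat it as a sketch under assumptions: the queue size is arbitrary, the function names are invented, and the byte is passed in as a parameter where a real handler would read it from the UART's data register.

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SIZE 64u                 /* power of two so wrap is a mask */
#define QUEUE_MASK (QUEUE_SIZE - 1u)

static volatile uint8_t queue[QUEUE_SIZE];
static volatile uint8_t head;          /* written only by the ISR      */
static volatile uint8_t tail;          /* written only by the consumer */

/* Called from the (hypothetical) UART receive ISR with the byte just read
 * from the data register. Drops the byte if the queue is full. */
bool serial_isr_enqueue(uint8_t byte)
{
    uint8_t next = (uint8_t)((head + 1u) & QUEUE_MASK);
    if (next == tail)
        return false;                  /* full: caller may count overruns */
    queue[head] = byte;
    head = next;
    return true;
}

/* Called from a task or timer-driven routine, never from the serial ISR. */
bool serial_dequeue(uint8_t *out)
{
    if (tail == head)
        return false;                  /* empty */
    *out = queue[tail];
    tail = (uint8_t)((tail + 1u) & QUEUE_MASK);
    return true;
}
```

The ISR's entire job is a compare, a store and an index update. Command parsing happens in whatever code calls serial_dequeue(), at whatever rate that code runs.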

An old rule of software design is to use one function (in this case the serial ISR) to do one thing. A real time analogy is to do things only when they need to get done, not at some arbitrary rate (like, if you processed commands in the serial ISR).

Reenable interrupts as soon as practical in the ISR. Do the hardware-critical and non-reentrant things up front, then execute the interrupt enable instruction. Give other ISRs a fighting chance to do their thing.

Use reentrant code! Write your ISRs in C if at all possible, and use C's wonderful local variable scoping. Globals are an abomination in any programming environment; never more so than in interrupt handlers. Reentrant C code is orders of magnitude easier to write than reentrant assembly code.
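To see why the global hurts, consider this contrived pair of functions. The preempt_hook is purely a test device of my own invention that fakes a second invocation arriving in the middle of the first, the way a nested interrupt would; nothing like it belongs in production code.

```c
#include <stddef.h>

/* Simulates an interrupt arriving in the middle of a function. */
void (*preempt_hook)(void);

static void maybe_preempt(void)
{
    if (preempt_hook) {
        void (*hook)(void) = preempt_hook;
        preempt_hook = NULL;           /* fire only once */
        hook();
    }
}

static int temp;                       /* global scratch: the problem */

/* Non-reentrant: a nested invocation overwrites temp. */
int square_plus_one_global(int x)
{
    temp = x * x;
    maybe_preempt();
    return temp + 1;
}

/* Reentrant: every invocation gets its own copy of t on the stack. */
int square_plus_one_local(int x)
{
    int t = x * x;
    maybe_preempt();
    return t + 1;
}

/* Nested invocations used to provoke the problem. */
void nested_call_global(void) { (void)square_plus_one_global(10); }
void nested_call_local(void)  { (void)square_plus_one_local(10); }
```

Arm the hook with nested_call_global and square_plus_one_global(3) comes back as 101 instead of 10, because the nested call trashed temp. The local-variable version returns 10 no matter what sneaks in between.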

Don't use NMI for anything other than catastrophic events. Power-fail, system shutdown, interrupt loss, and the apocalypse are all good things to monitor with NMI. Timer or UART interrupts are not.

When I see an embedded system with the timer tied to NMI, I know, for sure, that the developers found themselves missing interrupts. NMI may alleviate the symptoms, but only masks deeper problems in the code that most certainly should be cured.

NMI will break a reentrant interrupt handler, since most ISRs are non-reentrant during the first few lines of code where the hardware is serviced. NMI will thwart your stack management efforts as well.

Fill all of your unused interrupt vectors with a pointer to a null routine. During debug, always set a breakpoint on this routine. Any spurious interrupt, due to hardware problems or misprogrammed peripherals, will then stop the code cleanly and immediately, giving you a prayer of finding the problem in minutes instead of weeks.
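In C the idea might be sketched like this. The table is an ordinary array standing in for the real vector table, and the vector number and function names are assumptions for the example; how you actually load vectors depends on your CPU and toolchain.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_VECTORS 256

typedef void (*isr_t)(void);

static isr_t vector_table[NUM_VECTORS];
volatile unsigned spurious_count;

/* The null routine: set a breakpoint here during debug. */
void unexpected_interrupt(void)
{
    spurious_count++;
}

void timer_isr(void) { /* real work would go here */ }

/* Point every vector at the null routine, then install the real handlers. */
void init_vectors(void)
{
    for (size_t i = 0; i < NUM_VECTORS; i++)
        vector_table[i] = unexpected_interrupt;
    vector_table[0x20] = timer_isr;    /* assumed vector number */
}

/* Stand-in for the hardware's vector lookup, for testing on a host. */
void simulate_interrupt(uint8_t vector)
{
    vector_table[vector]();
}
```

Filling the whole table first and overwriting the used slots afterward means no vector can be forgotten.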

Lousy hardware design is just as deadly as crummy software. Modern high integration CPUs like the 68332, 80186 and Z180 all include a wealth of internal peripherals - serial ports, timers, DMA controllers, etc. Interrupts from these sources pose no hardware design issues, since the chip vendors take care of this for you. All of these chips, though, do permit the use of external interrupt sources. There's trouble in them thar external interrupts!

The biggest source of trouble comes from the generation of the INTR signal itself. Don't simply pulse an interrupt input and assume the CPU will detect it. Though some chips do permit edge-triggered inputs, the vast majority of them require you to assert INTR until the processor acknowledges it. An interrupt ACK pin provides this acknowledgment. Sometimes it's a signal to drop the vector on the bus; sometimes it's nothing more than a "hey, I got the interrupt - you can release INTR now".

As always, be wary of timing. A slight slip in asserting the vector can make the chip wander to an erroneous address. If the INTR must be externally synchronized to the clock, follow the letter of the spec sheet and do what it requires.

If your system handles a really fast stream of data consider adding hardware to supplement the code. We had a design here recently that accepted data points 20 microseconds apart. Each generated an interrupt, causing the code to stop what it was doing, vector to the ISR, push registers like wild, and then reverse the process at the end of the sequence. If the system was busy servicing another request, it could miss the interrupt altogether.

Since the data was bursty we eliminated all of the speed issues by inserting a cheap 8 bit FIFO. The hardware filled the FIFO without CPU intervention. It generated an interrupt at the half-full point (modern FIFOs often have Empty, Half-Full, and Full bits), at which time the ISR read data from the FIFO until it was sucked dry. During this process additional data might come along and be written to the FIFO, but this happened transparently to the code.

If we interrupted on the FIFO getting the first data point (i.e., going not-empty), little would have been gained in performance. Using half-full gave us enough time to finish servicing other activities (during which time more data could come, but as the FIFO had plenty of empty space it was not lost), and massively reduced ISR overhead.
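The drain-until-empty handler is simple. The sketch below models the FIFO in software so it can run on a host; on real hardware fifo_empty(), fifo_half_full() and fifo_read() would be reads of a status bit and a data register. The 16 entry depth, the buffer size and all the names are assumptions, not details of our actual design.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Software model of the external FIFO. */
#define FIFO_DEPTH 16u
static uint8_t fifo_mem[FIFO_DEPTH];
static size_t  fifo_rd, fifo_wr, fifo_level;

/* Done by the hardware, with no CPU involvement. */
bool fifo_hw_write(uint8_t v)
{
    if (fifo_level == FIFO_DEPTH)
        return false;                  /* full: data lost */
    fifo_mem[fifo_wr] = v;
    fifo_wr = (fifo_wr + 1u) % FIFO_DEPTH;
    fifo_level++;
    return true;
}

bool fifo_empty(void)     { return fifo_level == 0; }
bool fifo_half_full(void) { return fifo_level >= FIFO_DEPTH / 2u; }

uint8_t fifo_read(void)
{
    uint8_t v = fifo_mem[fifo_rd];
    fifo_rd = (fifo_rd + 1u) % FIFO_DEPTH;
    fifo_level--;
    return v;
}

#define SAMPLE_BUF_SIZE 64u
uint8_t  samples[SAMPLE_BUF_SIZE];
size_t   sample_count;
unsigned isr_entries;

/* Runs on the half-full interrupt: suck the FIFO dry in one visit. */
void fifo_half_full_isr(void)
{
    isr_entries++;
    while (!fifo_empty() && sample_count < SAMPLE_BUF_SIZE)
        samples[sample_count++] = fifo_read();
}
```

Feed this model eight data points and the ISR runs once, not eight times. The interrupt entry and exit overhead is paid once per batch instead of once per sample.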

A few bucks invested in a FIFO may allow you to use a much slower, and cheaper, CPU. Total system cost is the only price issue in embedded design. If a $5 8 bit chip with a $6 FIFO does the work of a $20 sixteen-bitter with double the RAM/ROM chips, it's foolish to not add the extra part.

C or Assembly?

If you've followed my suggestions you have a complete interrupt map with an estimated maximum execution time for the ISR. You're ready to start coding... right?

If the routine will be in assembly language, convert the time to a rough number of instructions. If an average instruction takes x microseconds (depending on clock rate, wait states and the like), then it's easy to get this critical estimate of the code's allowable complexity.
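The arithmetic is trivial, but here it is as a sketch for completeness. The function name is mine, and the result is only as good as your guess at the average instruction time.

```c
/* Rough number of instructions that fit in the time available.
 * avg_instruction_us depends on clock rate, wait states and instruction mix.
 * Returns 0 for nonsensical (zero or negative) inputs. */
unsigned long instruction_budget(double available_us, double avg_instruction_us)
{
    if (available_us <= 0.0 || avg_instruction_us <= 0.0)
        return 0;
    return (unsigned long)(available_us / avg_instruction_us);
}
```

For example, with 20 microseconds available and instructions averaging half a microsecond each, the budget is about 40 instructions, and that has to cover the register pushes and pops as well as the useful work.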

C is more problematic. In fact, there's no way to scientifically write an interrupt handler in C! You have no idea how long a line of C will take. You can't even develop an estimate as each line's time varies wildly. A string compare may result in a runtime library call with totally unpredictable results. A FOR loop may require a few simple integer comparisons or a vast amount of processing overhead.

And so, we write our C functions in a fuzz of ignorance, having no concept of execution times until we actually run the code. If it's too slow, well, just change something and try again!

I'm not recommending not coding ISRs in C. Rather, this is more a rant against the current state of compiler technology. Years ago assemblers often produced t-state counts on the listing files, so you could easily figure how long a routine ran. Why don't compilers do the same for us? Though there are lots of variables (that string compare will take a varying amount of time depending on the data supplied to it), certainly many C operations will give deterministic results. It's time to create a feedback loop that tells us the cost, in time and bytes, for each line of code we write, before burning ROMs and starting test.

Till compilers improve, use C if possible, but look at the code generated for a typical routine. Any call to a runtime routine should be immediately suspect, as that routine may be slow or non-reentrant, two deadly sins for ISRs. Look at the processing overhead - how much pushing and popping takes place? Does the compiler spend a lot of time manipulating the stack frame? You may find one compiler pitifully slow at interrupt handling. Either try another, or switch to assembly.

Be especially wary of using complex data structures in ISRs. Watch what the compiler generates. You may gain an enormous amount of performance by sizing an array at an even power of two, wasting some memory, but avoiding the need for the compiler to generate complicated and slow indexing code.
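As an illustration, take a table that needs 10 columns per row. The sketch below spells out the index arithmetic by hand to show what the compiler has to generate in each case; the dimensions and names are invented, and whether the padded version really wins depends on your CPU and compiler, so look at the generated code.

```c
#include <stdint.h>

#define ROWS        8
#define COLS_USED   10          /* columns the application actually needs     */
#define COLS_PADDED 16          /* next power of two; 6 bytes per row wasted  */
#define COL_SHIFT   4           /* log2(COLS_PADDED)                          */

uint8_t table_tight[ROWS * COLS_USED];     /*  80 bytes */
uint8_t table_padded[ROWS * COLS_PADDED];  /* 128 bytes */

/* Row stride of 10: the offset needs a multiply, which on a small CPU
 * may mean a slow instruction or a call to a runtime routine. */
unsigned index_tight(unsigned row, unsigned col)
{
    return row * COLS_USED + col;
}

/* Row stride of 16: the offset is one shift and one OR.
 * Valid only while col is less than COLS_PADDED. */
unsigned index_padded(unsigned row, unsigned col)
{
    return (row << COL_SHIFT) | col;
}
```

Here 48 bytes of RAM buy an index calculation with no multiply in it.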

An old software adage recommends coding for functionality first, and speed second. Since 80% of the speed problems are usually in 20% of the code, it makes sense to get the system working and then determine where the bottlenecks are. Unfortunately, real time systems by their nature usually don't work at all if things are slow. You've often got to code for speed up front.

If the interrupts are coming fast - a term that is purposely vague and qualitative, measured by experience and gut feel - then I usually just take the plunge and code the silly thing in assembly. Why cripple the entire system due to a little bit of interrupt code? If you have broken the ISRs into small chunks, so the real time part is small, then little assembly will be needed.

Conclusion

The wide use of C makes assembly-competent developers a scarce resource. Embedded systems are the last bastion of assembly, and will probably always require some amount of it. Become an expert; like learning Latin, it's a skill that has many unexpected benefits. Only folks who know assembly really seem to grasp performance tradeoffs.

Part 2 of this is here.