Built-in Debuggers
More and more processors have built-in debugging resources. Here's
a look at what features they offer.
Published in Embedded Systems Programming, August, 1993.
Take pity on the poor embedded programmer. Too many "save
money" by relying on only the crudest of tools, but even
those with the largest budgets and best desires are often forced
into the same trap when using a leading-edge part that beats the
tools to market.
Finally our wails of anguish are being heard by the chip makers.
We're starting to see on-board debugging resources on quite a
few new CPUs. It's hard to dedicate processor pins to debugging,
as they contribute nothing to the end-product. However, development
is such a huge part of the cost of most embedded products that
there is often little choice but to add recurring costs to mitigate
NRE.
Internal Registers
Intel addressed debugging problems early on with the 386 microprocessor.
Evidently they recognized that the speed of the processor was
such that traditional debuggers would be prohibitively expensive.
It's hard to push electrons through a cable at 33 MHz (or at
66 MHz for the 486... soon to be 99 MHz if IBM's rumored clock
tripler part comes out).
The 386/486 has a very complex addressing mechanism. Logical addresses
get transformed to linear addresses via a wondrously sophisticated
segmentation system. A paging unit then translates linear addresses
to physical. As a result, it's all but impossible to know what
the program is doing simply by looking at the processor's pins
with, say, a logic analyzer. Physical address 10145A0 (on the
pins) could correspond to any of thousands of addresses generated
by the program, depending on the settings of the CS selector,
corresponding descriptor, and paging setup.
I have to give Intel credit. They dedicated a substantial number
of transistors on the part to debugging back when transistors
were still relatively expensive. This foresight has paid off for
a generation of developers (hey - I figure a generation lasts
about 5 years in this business), as nearly all debuggers, from
Turbo-Debugger to many of the hardware tools, make use of these
debugging resources to set breakpoints.
The 386/486 implements four hardware breakpoints using six internal
registers. Four of the registers simply hold the break address,
which is a linear address: the address the program generates after
segmentation but before page translation.
One register controls the mode of each of the four breakpoints.
Intel went to extremes to make these useful debugging resources,
so that each can be an instruction breakpoint or a data break.
For example, you could set one to break on a data write to a specific
address, another to work on instruction fetches only, and a third
to break on any data read or write.
The sixth register contains status information so the debug exception
handler can determine the source of the breakpoint.
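As a rough sketch of how a debugger might build the control word, the C below computes a value for the control register (Intel calls it DR7) and decodes the status register (DR6). The register names and bit positions are not given in this article; they are from Intel's documentation as I recall it, so check them against the data sheet before relying on them. The function names are hypothetical, and the code only computes values. Actually loading the registers takes privileged instructions that are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* R/W field values: the kind of access that triggers the break. */
enum { BP_EXECUTE = 0x0, BP_WRITE = 0x1, BP_READ_WRITE = 0x3 };

/* LEN field values: size of the watched location. Instruction
 * breakpoints are expected to use BP_LEN_1. */
enum { BP_LEN_1 = 0x0, BP_LEN_2 = 0x1, BP_LEN_4 = 0x3 };

/* Return dr7 with breakpoint n (0..3) locally enabled for the
 * given access type and length. */
uint32_t dr7_enable(uint32_t dr7, unsigned n, unsigned rw, unsigned len)
{
    assert(n < 4 && rw <= 3 && len <= 3);
    unsigned field = 16u + 4u * n;        /* R/Wn at bit 16+4n, LENn just above */
    dr7 &= ~((uint32_t)0xF << field);     /* clear any old R/W and LEN bits */
    dr7 |= (uint32_t)(rw | (len << 2)) << field;
    dr7 |= (uint32_t)1 << (2u * n);       /* Ln, the local enable bit */
    return dr7;
}

/* Return the lowest-numbered breakpoint flagged in the status word
 * (bits B0..B3), or -1 if none is flagged. */
int dr6_source(uint32_t dr6)
{
    for (int n = 0; n < 4; n++)
        if (dr6 & ((uint32_t)1 << n))
            return n;
    return -1;
}
```

Calling dr7_enable three times (a four-byte write break on breakpoint 0, an instruction break on 1, a two-byte read/write break on 2) builds the mix described above in a single control word.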
Since the breakpoints are handled as hardware comparators, they
will work in code that resides in ROM or in RAM, an important
benefit for debugging embedded systems.
I have yet to see if the Pentium includes any sort of enhanced
debugging capability beyond the 386-type debug registers. Presumably
its superscalar architecture will present yet another range of
complexity in tracking down bugs.
Background Mode
Motorola has been very innovative in their approach to both processor
technologies and on-board debugging tools. The 683xx family is
a series of processors mostly based on the 68020 core. Each part
offers a tuned I/O mix. Ideally, the family will be so large that
you'll be able to buy exactly the processor you need. I suspect
the family will become as pervasive as the 8051 and its 50 or
so variants.
Motorola implements the family as a core CPU and numerous standard
I/O modules - timers, DMA, and the like. Each module is stored on their
CAD system. It's easy to design a new microprocessor by using
the Betty Crocker method of extracting standard stuff from the
library, shaking it up, and letting the CAD system generate photomasks.
They tell me that one part took but a single day to design...
and was correct in its first silicon release.
Having a huge family of slightly different parts is both a blessing
and a curse. Again, look at the 8051 family for comparison. Sure,
any part you'll need is probably there, but each time you change
CPUs you'll have to buy, at the very least, a new pod for your
ICE. It's hard to use leading edge components when the tools may
lag by months.
The solution - an on-board debugger that is standard across the
entire family (it even carries into the 68HC16 family). Each processor
dedicates 3 wires to a serial interface for debugging purposes.
The entire port is called the CPU's Background Debug Mode (BDM).
Given some simple hardware to connect the serial lines to a PC,
you can establish a communications path to the processor that
bypasses all of its normal operation. The CPU will process a wide
range of commands sent over this port, all without altering the
processor's status - the registers, PC, and the like stay intact
unless you explicitly issue a command to modify them.
The command set resembles that of a ROM monitor. You can read
and write memory and registers, start a program executing, and
issue resets.
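The monitor-like flavor of the command set can be sketched as a simple dispatcher. Everything here is hypothetical: the command codes, the target_t structure, and bdm_service are invented for illustration, and the real BDM commands are encoded quite differently on the serial link. The point is only that reads leave the target's state alone, and nothing changes unless a command explicitly says so.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_SIZE 256u
#define NUM_REGS 16u

/* Hypothetical command codes; the real BDM encodings differ. */
typedef enum {
    CMD_READ_MEM, CMD_WRITE_MEM,
    CMD_READ_REG, CMD_WRITE_REG,
    CMD_GO, CMD_RESET
} bdm_cmd_t;

/* Toy model of the target's visible state. */
typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint32_t reg[NUM_REGS];
    uint32_t pc;
    int      running;       /* 0 while parked in background mode */
} target_t;

/* Service one command. Returns the data read, or 0 for commands
 * that return nothing. */
uint32_t bdm_service(target_t *t, bdm_cmd_t cmd, uint32_t addr, uint32_t data)
{
    switch (cmd) {
    case CMD_READ_MEM:
        assert(addr < MEM_SIZE);
        return t->mem[addr];
    case CMD_WRITE_MEM:
        assert(addr < MEM_SIZE);
        t->mem[addr] = (uint8_t)data;
        return 0;
    case CMD_READ_REG:
        assert(addr < NUM_REGS);
        return t->reg[addr];
    case CMD_WRITE_REG:
        assert(addr < NUM_REGS);
        t->reg[addr] = data;
        return 0;
    case CMD_GO:            /* resume the program at the current PC */
        t->running = 1;
        return 0;
    case CMD_RESET:         /* toy reset: PC to 0, back to background mode */
        t->pc = 0;
        t->running = 0;
        return 0;
    }
    return 0;
}
```

A host-side debugger would wrap calls like these in the serial protocol; the sketch leaves the wire format out entirely.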
Normally, the BDM is disabled. You'd hate to have your embedded
system toggle to a debugging state in the field, when no debugger
is connected! A special reset sequence enables background mode,
essentially turning on the serial port and altering the function
of the Background (BGND) instruction.
Normally, BGND is an illegal instruction. If BDM is enabled, BGND
stops execution of the program and throws the CPU into background
mode, where it services the serial commands. What could be better
than this for a breakpoint? The BDM expects you to substitute
a BGND instruction for the instruction you'd like to breakpoint
on. This does imply that you cannot break on data accesses or
instructions in ROM unless substantial extra hardware is added.
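Here is a minimal sketch of the substitution trick, and of why it fails in ROM. The memory model (memory_t, with a hypothetical rom_start boundary above which writes are ignored) and the function names are assumptions made for illustration. The BGND opcode value is from memory of the CPU32 documentation, so treat it as an assumption and check the manual.

```c
#include <assert.h>
#include <stdint.h>

#define BGND_OPCODE 0x4AFAu   /* assumed CPU32 encoding of BGND */

typedef struct {
    uint32_t addr;      /* word index where the breakpoint sits */
    uint16_t saved;     /* original opcode, put back on removal */
    int      active;
} breakpoint_t;

/* Hypothetical memory model: words at index >= rom_start are ROM
 * and silently ignore writes. */
typedef struct {
    uint16_t *words;
    uint32_t  size;
    uint32_t  rom_start;
} memory_t;

void mem_write(memory_t *m, uint32_t addr, uint16_t value)
{
    assert(addr < m->size);
    if (addr < m->rom_start)
        m->words[addr] = value;
}

/* Swap in BGND, then read back to see whether the write took.
 * Returns 1 if the breakpoint is in place, 0 otherwise. */
int bp_set(memory_t *m, breakpoint_t *bp, uint32_t addr)
{
    assert(addr < m->size);
    bp->addr = addr;
    bp->saved = m->words[addr];
    mem_write(m, addr, BGND_OPCODE);
    bp->active = (m->words[addr] == BGND_OPCODE);
    return bp->active;
}

/* Restore the original instruction if the breakpoint was placed. */
void bp_clear(memory_t *m, breakpoint_t *bp)
{
    if (bp->active) {
        mem_write(m, bp->addr, bp->saved);
        bp->active = 0;
    }
}
```

The read-back in bp_set is the whole story for ROM: the write goes nowhere, the original opcode is still there, and the debugger has to report that it cannot break at that address.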
The CPUs also have a breakpoint input, which drives the processor
into background mode when BDM is enabled. This is essential for
stopping a runaway program or adding more sophisticated external
breakpoint hardware.
Numerous suppliers make debuggers that connect the CPU's BDM pins
to a PC's parallel or serial ports. A single BDM debugger will
work with any of the Motorola processors with this resource. If
you use these processors, be sure to include the Motorola standard
Berg connector in your hardware, to make the BDM port available
to a commercial BDM debugger, no matter what your plans are for
debugging strategies.
Unfortunately, Motorola defined two different "standard"
connections, one using a 10 pin connector and the other an 8 pin
version, with quite different pinouts. The 10 pin connector offers
a bit more control of the target hardware, so is probably the
preferred connection.
Since most vendors provide a source debugger with their BDM tools,
C and assembly are both viable prospects for BDM debugging. However,
BDM debuggers make the most sense when total code size is relatively
small, and when real time constraints are minimal. I'd be cautious
about relying on a simple BDM in any interrupt-intensive application,
since more powerful full scale emulators (or, at the very least,
logic analyzers) are essential for tracking these asynchronous
events.
Embedded Systems Technology is the exception to this rule; they
make an optional trace board, which, while not cheap, does cleverly
communicate to the BDM through an unused register in the processor.
As always, compare prices and features to get the tool that suits
your needs and budget.
SMT
It seems the embedded world is stampeding to surface-mounted components
(SMT). In the good old days each IC, resistor, and capacitor had
long leads that fit through holes in the circuit board, providing
a solid mechanical connection prior to soldering. SMT parts solder
directly to the face of the board, tenaciously holding on by virtue
of the solder alone. The benefit of this technology is reduced
size: SMT components are tiny (so small they're hard for these
caffeine-shaky hands to manage). In addition, since there are
no holes needed to mount the parts, clever designers can smear
both sides of a board with them, further reducing the size of
the system.
Surface mounted CPUs create all sorts of new challenges for debugging.
Most have leads on all four sides of their small, squarish packages.
Sometimes the "pitch" of these leads (their spacing)
is a paltry .020 inch.
Traditional emulation techniques just don't work well in this
environment. You cannot simply unplug the CPU and cram an emulator's
pod in - the processor is soldered directly to the board. One
option is to dedicate one prototype system to development, and
install a special conversion device in place of the CPU. Emulation
Technology, EDI, and others make these adapters which solder to
the processor's footprint and provide a socket for an appropriate
emulator. Be aware, though, that adapters cost $500 to $1000,
and are wispy, delicate parts that require a magician's hand to
solder in place. Don't try this at home, kids!
If the processor is soldered in place, why not design an emulator
whose pod clips over the entire chip? That is, use a sort of inverted
female socket on the pod, and snap it onto the CPU, providing
an electrical connection to each of the processor's pins.
Emulators work by taking massive control of all processor functions.
They must be able to run short segments of emulator code on the
target microprocessor, which means the CPU must be isolated from
the target system by a buffer, so the emulator's code doesn't
spuriously affect target I/O and memory. Since there is no physical
way to place a buffer between the surface-mounted CPU and its
target resources, the emulator must somehow disable the target
CPU, replacing all of its functionality with a processor inside
of the emulator itself.
Most of the microprocessor vendors recognize this and provide
some method of tri-stating the target chip. The part is driven
to an inactive state, where all of its pins are non-functional.
In effect, the processor on the target system becomes a dead hunk
of plastic that is completely replaced by the emulator's own CPU.
Zilog's Z182, for example, is a 100 pin quad flat pack (QFP) device
based on the Z180 core. Two pins are dedicated to selecting a
debug mode. Usually your system leaves these pins open and the
processor enters normal operation on power up. If an emulator
is connected, it drives the pins into one of two debugging modes.
Mode 1 forces almost all of the Z182's pins to a tri-state condition.
A Z180 emulator, with a special adapter, clips over the Z182 in
the target and provides all of the address, data, and other signals
to the target. Only a few lines stay active - those related to
peripherals inside of the Z182 that the Z180 does not have. So,
the Z182 stays semi-active: its core processor is disabled. The
internal I/O that is identical to that on a Z180 is disabled.
Just the new Z182 superset I/O is alive, intercepting I/O commands
sent to the processor's pins via the clip-on plug.
This is a nice approach, since dozens of vendors sell Z180 tools.
Creating a new emulator for the Z182 would be prohibitively expensive,
as illustrated by the chip's Mode 2, which tri-states everything
on the part, including all of the I/O. No vendor supports debugging
in this mode today.
Intel uses a similar approach on their 80186EC microprocessor,
a surface mounted variant of the popular 186 family. It is also
a 100 pin QFP device. Instead of dedicating pins to debugging,
Intel elected to share an address line (A19) with the emulation
mode selection. Grounding A19 during reset drives the part into
a tri-state condition Intel calls ONCE mode (apparently pronounced
"AHNCE").
Though the 186EC is a lot like other members of the 186 family,
it is sufficiently different that you cannot make an adapter to
convert, say, a 186 pod to the 186EC. A new pod is needed (at
the very least). Thus, unlike the Z182, going to ONCE mode tri-states
the entire part - it is just an expensive piece of plastic during debugging,
with the emulator's CPU assuming all processor and I/O functions.
The Future
The driving force behind electronics is an implicit guarantee
that the cost of silicon always follows a downward spiral. Transistors
are cheap; so cheap, it seems chip vendors have a hard time deciding
what to do with them. It's clear that a percentage of the transistor
budget on many new microprocessors will be dedicated to on-chip
debugging resources, to make the parts truly usable by developers.
One technology that has been lurking for a number of years is
boundary scan. Boundary scan is an IEEE standard (IEEE 1149.1)
that defines a way to design chips for in-circuit testability.
Its thrust is towards the production test and repair end of the
business.
A chip designed to the IEEE standard will include 4 pins that
implement a serial link for communications to an outside test
device. Typically, a number of chips, all implementing boundary
scan, will be daisy chained together so the tester can send commands
to any part on a circuit board.
ICs with boundary scan capabilities can sense the signals on each
pin, so the tester can completely probe the board purely by sending
serial commands between chips.
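The daisy chain is easiest to picture as one long shift register threaded through every pin of every chip. The C below is a sketch under that simplification only. It leaves out everything else the standard defines (the state machine that sequences capture and shift, the instruction register, the mode-select and clock pins), and the names scan_chain_t, scan_capture, scan_shift, and scan_read_pins are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CELLS 64u

/* The whole daisy chain modeled as one long shift register. */
typedef struct {
    uint8_t cell[MAX_CELLS];   /* cell[0] sits nearest the tester's data output */
    size_t  length;            /* total cells across all chips in the chain */
} scan_chain_t;

/* Capture: every cell samples the pin it is attached to. */
void scan_capture(scan_chain_t *c, const uint8_t *pins)
{
    assert(c->length <= MAX_CELLS);
    for (size_t i = 0; i < c->length; i++)
        c->cell[i] = pins[i] & 1u;
}

/* Shift one bit: data_in enters cell[0], and the bit in the last
 * cell falls out the far end of the chain. */
uint8_t scan_shift(scan_chain_t *c, uint8_t data_in)
{
    assert(c->length > 0 && c->length <= MAX_CELLS);
    uint8_t data_out = c->cell[c->length - 1];
    for (size_t i = c->length - 1; i > 0; i--)
        c->cell[i] = c->cell[i - 1];
    c->cell[0] = data_in & 1u;
    return data_out;
}

/* Capture all pins, then clock the whole chain out. The last cell
 * emerges first, so out[0] holds the pin at the end of the chain. */
void scan_read_pins(scan_chain_t *c, const uint8_t *pins, uint8_t *out)
{
    scan_capture(c, pins);
    for (size_t i = 0; i < c->length; i++)
        out[i] = scan_shift(c, 0);
}
```

With two three-pin chips chained together, one capture followed by six shifts hands the tester all six pin states, in reverse chain order, over a single serial line.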
I've heard rumors that some vendors are exploring expanding the
technology to include debugging assets, somewhat like the breakpoint
registers on the 386. After all, serial pins are already dedicated
to test functions; it makes sense to add debug logic, perhaps
implemented somewhat like Motorola's Background Mode. Then, the
production test logic can do double duty as a software development
platform.
Corelis Inc. (Cerritos, CA, (310) 926-6727) just announced a
boundary scan-based development tool for the AM29200 and 29030.
It's cheap; it lets you view target resources like memory and
I/O, and it supports software breakpoints. I see this as an interesting
alternative to the extremely high priced development tools used
for fast 32 bit CPUs.
Boundary scan offers promise for the future, but it will never
offer a complete solution to the debugging process. Programmer
time is expensive. Tools that improve productivity are
therefore, by definition, cheap. Some resources, like real time
trace and performance analysis, offer lots of benefits to the
developer, but are far too complex to ever put in the silicon
itself. However, built-in debugging hardware does bring at least
a minimal development system to a huge audience, and simplifies
high-powered tools.