New Hardware
Within months of the introduction of the 8008 an explosion
of microprocessor-controlled products appeared. Perhaps the somewhat earlier
four bit 4004 had started engineers thinking about applications. Clearly, eight
bits was the right technology at the right time, as evidenced by companies
frantically trying to hire non-existent embedded developers and the emergence of
whole new classes of smart products.
Almost immediately engineers discovered the difficulty of
building firmware. Embedded systems were universally small applications without
a terminal or other debugging interface. The old model of computing had failed.
Embedded systems were not like general purpose computers that concurrently
hosted both debugger and various applications; they carried one set of code
burned permanently into EPROM, and did nothing else. They were as dedicated to
the single application as the hardwired logic they had replaced.
How was one to debug hardware and software? The essential
challenge of embedded work is the lack of visibility into the product's
functioning. It's all buried under the hood. Perhaps our sole interface with
the system is a pair of blinking LEDs, yet 100,000 transistors and 10,000 lines
of code all interact in complex ways. We're not smart enough to get it right
the first time, so we need a way to poke into the internals.
The immediate response was remote debuggers that used a
serial port to offer crude debugging capability. Set a breakpoint. Examine and
change registers and memory. But transistors were still expensive then and UARTs
were large 40 pin chips which were not yet integrated on-board the CPU. A spare
port just for debugging was simply too expensive for many of the systems then
being built.
In desperation people did crazy things. I remember coupling
the bus of our embedded system to that of an "Intellec 8", a sort of general
purpose computer based on the 8008. The terminal - a teletype - drove the
Intellec; the embedded system looked like an extension of the computer's bus.
This gave us remote debugger capability, and we could exercise the system
hardware to track design faults, as long as a fault didn't step on the bus,
crashing both systems.
About 1975 Intel invented the In-Circuit Emulator (ICE), a
brilliant idea which saved a generation of developers. The tool replaced the
application's CPU and (usually) gave engineers non-intrusive access to all of
the target's resources. Even the very first ICE included real time trace,
which captured program execution at full speed. Complex breakpoints supported
if-then traps. A wealth of other resources greatly eased developers' trials.
At $20,000 (this was a lot of money in 1975,
before Carter's inflation) emulators were princely acquisitions. But
even 25 years ago embedded systems were being delivered late. Companies were
willing to purchase expensive tools to speed time-to-market.
Those were kinder, gentler days. The hardware and software
teams never battled over tools, schedule or other issues. A single hardware
engineer usually designed the system and wrote all of the code in horribly
scrambled spaghetti assembly language. Hence the need for a fancy debugger.
But this same developer realized that the ICE was as
powerful a tool for finding hardware faults as for software bugs. When working
with a new peripheral it's handy to send controlled I/O streams to the device
to ensure that the thing functions properly. Sure, you can write a program to do
this, but it's so much easier to use the debugger to interactively probe the
peripheral and immediately examine its response.
Much more difficult, though, is probing a brand new
hardware design. Usually the slightest fault means the system crashes within
microseconds. One mixed up data or address line, a bad chip select, or any of a
hundred different errors will bring things down immediately. Traditional debug
tools - like a remote debugger - are no help since they are themselves
software which rely on functioning hardware. In the pre-ICE days the only
solution was to use a logic analyzer and tediously capture the digital flow,
constantly hitting reset, hoping to find the mistake that so quickly causes the
system to go awry.
(Actually, before the microprocessor - which was dynamic
and could never run slower than hundreds of kHz - there was another option. I
built one computer out of 74LS logic components and slowed the clock to 1 Hz. A
simple voltmeter was enough to track design faults!)
But the ICE was the perfect tool for trapping hardware
problems. Even with a totally shorted bus the emulator still worked properly. It
was simple to issue I/O or memory reference commands and see what happened,
probing with a scope to track signals. Emulators even then had their own memory
that developers could map in place of the target's, so one could write a tiny
bit of code in known-good RAM that looped, issuing memory reads very quickly to
the target. An ICE and scope together were a potent pair that made hardware
troubleshooting quite easy.
We've come full circle. The BDM or JTAG tool dominates
now, yet it's not much more than the crude remote debugger that predated the
ICE. Though the price of ICEs has plunged with ever increasing functionality,
they are rapidly fading from the scene. Managers don't want to hear things
like "I need a $10,000 tool to find my mistakes." CPU vendors no longer make
the specialty bond-out chips needed to make the tools. High speeds and too many
go-fast CPU features are problematic for emulators. BDM tools are clearly the
future.
BDMs are far superior to software-based debuggers for
troubleshooting hardware since they will often boot up despite all sorts of
target system design faults. Since they lack features like emulation RAM, though,
they're not as helpful as an ICE in finding basic hardware flaws.
So what do we do?
If you're still using ICEs, break through the
metaphorical wall that divides hardware and software teams. Use the emulator -
which seems to have evolved into a software-only tool - as your prime hardware
debugger. I spent a lot of years in the ICE business, and was always amazed at
how few EEs used emulators. They'd sweat bullets capturing that one
microsecond event when an emulator would let them set up a loop that created the
same problem 100,000 times per second.
Though BDMs are quite useful troubleshooting tools they
can't function without a lot of operational hardware. The good news is that
few need working target memory, address or data lines. The huge number of nodes
on the busses makes these the most likely source of design problems.
The BDM does expect correctly functioning control inputs to
the CPU. So use your scope and check the wait state input - is it asserted?
How about interrupts? Make sure the clock is perfect. No modern processor
is happy with sloppy rise and fall times or voltage levels that
don't meet the processor's spec. Simple CPUs have only a handful of
control inputs; more complex ones might have dozens. It's pretty easy, though,
to probe each one and ensure it behaves properly.
Don't forget Vcc! As a young engineer a wiser older
fellow taught me to always round up the usual suspects before looking for exotic
failures. On occasion I've forgotten that lesson, only to wander down complex
paths, wasting hours and days. Check power first and often.
With the basics intact, boot up the BDM and use it to
exercise address and data lines. One problem with some of these tools is that
they create a lot of bus activity even when you've requested a single byte;
how can you tell which transaction is the one you've commanded?
One solution is to program a processor chip select to be on
for the narrowest possible range, one that's (hopefully) outside of the
spurious debugger cycles. Trigger your scope on this, and then issue the
appropriate read/write test commands.
Though the tools don't include emulation RAM you can
still create loops to ease scoping. The debugger program that drives the BDM
invariably includes a command structure that lets you issue the same read or
write repeatedly. Expect a very slow repetition rate since each loop has a lot
of serial overhead to the BDM.
Brains, not Tools
Commercial tools are wonderful but limiting. The most
powerful debugger is the one that sits between your ears. Be innovative.
Most folks test new hardware using a simple loop burned
into EPROM or Flash and hit the reset switch. If anything is wrong the system
crashes instantly. You'll grow calluses hitting the reset switch when trying
to find the problem with a scope or logic analyzer.
Some folks connect a pulse generator to the processor's
reset input. This clever approach resets the CPU hundreds or even thousands of
times per second, making it much easier to use a scope or analyzer to track down
the defect.
Unfortunately, a null loop only checks a handful of address
lines. This is the most frustrating of all conditions: you think the thing
works, only to find that code at particular addresses, or large programs,
don't function. It's a mistake to assume the hardware works just because
you've proven a small subset; better, explicitly check every address and every
data line before pronouncing the system ready for software integration.
With an ICE you can write a program in emulation RAM to
cycle all target address and data lines. If a bit of target RAM works then the
BDM can download test code there. But in either case the processor bus lines
will be cycling like mad as the test code runs. How can your scope or analyzer
identify the one test cycle from the blizzard of others needed just to run the
test code?
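For reference, a test program of the sort described above might look something like the following C sketch. It uses the common walking-ones approach, which is one reasonable choice and not necessarily what any particular ICE or BDM vendor supplies. The function names are hypothetical, and the code assumes the region under test is byte-wide RAM whose size is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a single 1 bit across the data bus at one fixed location.
 * Returns 0 on success, or the first pattern that failed to read back. */
uint8_t data_bus_test(volatile uint8_t *addr)
{
    assert(addr != NULL);
    for (uint8_t pattern = 1; pattern != 0; pattern <<= 1) {
        *addr = pattern;
        if (*addr != pattern)
            return pattern;
    }
    return 0;
}

/* Exercise each address line by touching power-of-two offsets.
 * Returns 0 on success, or a non-zero offset where a fault showed up. */
size_t address_bus_test(volatile uint8_t *base, size_t size)
{
    const uint8_t pattern = 0xAA, antipattern = 0x55;
    size_t offset, test;

    assert(base != NULL);
    assert(size != 0 && (size & (size - 1)) == 0);  /* power of two */

    /* Mark every power-of-two offset with the default pattern. */
    for (offset = 1; offset < size; offset <<= 1)
        base[offset] = pattern;

    /* A line stuck high makes the write to offset 0 land elsewhere. */
    base[0] = antipattern;
    for (offset = 1; offset < size; offset <<= 1)
        if (base[offset] != pattern)
            return offset;
    base[0] = pattern;

    /* A line stuck low or shorted makes one write disturb another cell. */
    for (test = 1; test < size; test <<= 1) {
        base[test] = antipattern;
        if (base[0] != pattern)
            return test;
        for (offset = 1; offset < size; offset <<= 1)
            if (offset != test && base[offset] != pattern)
                return offset;
        base[test] = pattern;
    }
    return 0;
}
```

Both routines need working RAM at the addresses they touch, and the CPU is busy fetching the test code the whole time, which is exactly the scoping problem raised above.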
Perhaps it's time to put the tools away. What can we do
that sequentially toggles every address and data line?
Lots of processors have single-byte or -word software
interrupt instructions. Suppose you could make the system execute nothing but an
interrupt, over and over, no other instructions getting in the way. What would
happen? The CPU issues a fetch, reads the interrupt instruction, and then pushes
the current context on the stack. It will vector to the interrupt handler, read
another interrupt, and push the context again - now with the stack pointer
decremented by a few words since there was no return from interrupt. This goes
on forever, the stack pointer marching down through all of memory.
It's easy to find these stack cycles as they are the only
memory writes taking place. Trigger your scope on write and then look at each
address line in turn. A0 cycles at a very fast rate. A1 at half that rate. A2
half A1. (On an 8 bit processor A0 will probably cycle, but it won't on 16 and
32 bit machines.) Probe each and every node - every memory chip, every I/O
device that's connected to address lines. You'll prove, for sure, that the
address bus is properly connected everywhere. Or not.
Look at the data lines during the write cycles. Depending
on what the CPU pushes during an interrupt you'll see the return address
(probably that of the ISR) and perhaps flag register bits. Again, ensure these
bits go to each and every data node.
What about that ISR? We don't have one as we're
executing the same instruction no matter where we are. There are also no
interrupt vectors. In fact, we'll probably vector to the address equal to the
software interrupt's hex code. So what? As long as the hex does not violate a
basic processor rule the fact we're going to an arbitrary address is
irrelevant. (A basic rule might be that the vector must be even, like on the 68k
family).
Consider a real-mode x86 chip, like the 186. The one-byte
software interrupt instruction is 0xcc. Executing that means the part will read
high and low vectors, getting 0xcc for each read. The ISR is then at
0xcccc:cccc, a perfectly valid address. Your scope will see lots and lots of
fetches from this address, plenty of vector reads from the vector addresses
(0x0000c-0x0000f), but writes to ever-decreasing stack locations.
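The arithmetic behind those addresses can be checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and it assumes standard real-mode rules (four bytes per interrupt vector, physical address = segment * 16 + offset, truncated to 20 bits).

```c
#include <assert.h>
#include <stdint.h>

/* Address of the first byte of interrupt vector n (4 bytes per vector). */
uint32_t vector_address(uint8_t n)
{
    return (uint32_t)n * 4u;
}

/* Real-mode segment:offset to a 20-bit physical address. */
uint32_t physical_address(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + off) & 0xFFFFFu;
}

/* Worked example: opcode 0xCC is INT 3, and every bus read returns 0xCC. */
void int3_example(void)
{
    uint16_t word = 0xCCCC;                            /* IP and CS both read as this */
    assert(vector_address(3) == 0x0000Cu);             /* vector spans 0x0000C..0x0000F */
    assert(physical_address(word, word) == 0xD998Cu);  /* 0xCCCC0 + 0xCCCC */
}
```

So the fetches from 0xcccc:cccc should appear on the address pins as 0xD998C, which is the number to look for on the scope or analyzer.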
But I've glossed over the biggest part of the trick:
forcing the processor to execute the same software interrupt regardless of fetch
address.
Remove all memory components, or disable them by
disconnecting and idling the chip selects. Memory mapped peripherals need the
same treatment. Then tie the data bus to the software interrupt instruction via
pull-up and pull-down resistors. You must use resistors rather than
hard-wired connections since the CPU will be driving these lines during stack
writes. So for a real-mode x86 connect both high and low data lines to 0xcc. On
a 68k use a breakpoint instruction. The Z80/180/8085 family is nicest of all
since 0xff is RST7.
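Working out which data lines get a pull-up and which get a pull-down is simple bit-picking. The C sketch below (hypothetical helper names, and it assumes a one-byte opcode repeated on every byte lane of an 8, 16 or 32 bit bus) prints a resistor plan for any opcode.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Replicate a one-byte opcode across every byte lane of the data bus. */
uint32_t bus_pattern(uint8_t opcode, unsigned bus_bits)
{
    uint32_t pattern = 0;

    assert(bus_bits == 8 || bus_bits == 16 || bus_bits == 32);
    for (unsigned lane = 0; lane < bus_bits; lane += 8)
        pattern |= (uint32_t)opcode << lane;
    return pattern;
}

/* 1 means data line `line` needs a pull-up, 0 means a pull-down. */
int needs_pullup(uint32_t pattern, unsigned line)
{
    return (int)((pattern >> line) & 1u);
}

/* Print one line per data pin, e.g. "D2  pull-up". */
void print_resistor_plan(uint8_t opcode, unsigned bus_bits)
{
    uint32_t pattern = bus_pattern(opcode, bus_bits);

    for (unsigned d = 0; d < bus_bits; d++)
        printf("D%-2u %s\n", d, needs_pullup(pattern, d) ? "pull-up" : "pull-down");
}
```

Calling print_resistor_plan(0xCC, 16) lists pull-ups on D2, D3, D6, D7 and the same four positions in the upper byte, with pull-downs everywhere else; 0xFF needs nothing but pull-ups.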
Turn power on and scope each data line to be sure it goes
to the proper state. A shorted data line will invalidate the test.
If the processor doesn't have a single byte/word software
interrupt this process won't work. There may be other limitations as well: the
stack rolls over on a 64k boundary on the 186 so the test doesn't check 4 of
the 20 address lines.
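The 64k limitation is easy to see in a small simulation. The C sketch below is a model, not a measurement: it assumes the stack segment never changes, that each interrupt pushes three words (flags, CS and IP), and that the stack pointer is a 16 bit register that simply wraps. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simulate back-to-back software interrupts on a real-mode x86 and return
 * a bit mask of the address lines that changed state between stack writes.
 * Bit n of the result corresponds to address line An. */
uint32_t toggled_address_lines(uint16_t ss, uint16_t sp, unsigned long interrupts)
{
    uint32_t toggled = 0;
    uint32_t prev = 0;
    int have_prev = 0;

    assert(interrupts > 0);
    for (unsigned long i = 0; i < interrupts; i++) {
        for (int word = 0; word < 3; word++) {      /* flags, CS, IP */
            sp = (uint16_t)(sp - 2);                /* 16-bit SP wraps at 64k */
            uint32_t addr = (((uint32_t)ss << 4) + sp) & 0xFFFFFu;
            if (have_prev)
                toggled |= addr ^ prev;
            prev = addr;
            have_prev = 1;
        }
    }
    return toggled;
}
```

With SS assumed to be 0 and an even starting SP, toggled_address_lines(0, 0, 20000) comes back as 0x0FFFE: A1 through A15 all move, A0 stays put because the pushes are word-aligned, and A16 through A19 never change.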
But the tool cost is zero, setup time almost immediate, and
we get a tremendous amount of information.
Whither We Go
With emulators slowly disappearing from 16 and 32 bit
development projects we're required more than ever to use clever debugging
strategies in place of good tools. I'm frankly dismayed by the trend. Programs
are getting bigger with more real time constraints, while at the same time tools
grow dumber. Managers seem to have less tolerance than ever for expensive
development systems, yet time-to-market concerns and engineering salaries both
skyrocket.
Yes, new processors include more built-in debugging
resources. But the goal of increased visibility into our embedded creations
seems even more elusive than in 1975. Simulation will become more important, but
ultimately some poor developer is always faced with the almost impossible task
of figuring out why that code which runs so well under simulation crashes when
ported to the perfectly-tested hardware.
A fantastic toolchain pays for itself every time.