Episode 1: Why Did My Board Crash When I Scoped a Node?

June 2, 2014

Level: intermediate

Did you know fast digital signals affect all developers, not just hardware people? And that "fast" does not necessarily mean "high clock rate"?

In this video Jack demonstrates how transitioning signals can create havoc for people debugging hardware and firmware, and what strategies one can take to mitigate the issues.

(Don't miss the follow-on video The Perils of Probing: What's Inside That Scope Probe?)

Video Transcription

Hi, I'm Jack Ganssle. Welcome to The Embedded Muse video blog, a companion to my free Embedded Muse newsletter.

Today, I'm going to be talking about electromagnetics in general, but in particular, why firmware people have to have some understanding of this field. That's right, people doing the software inside embedded systems really need to have some sense of what's going on in high speed digital systems. I'll be teaching some electronics.

Don't worry: it's not complicated. It's pretty interesting, and it's way cool. I know some of you folks are electrical engineers, and suffered through those electromagnetic courses. I thought those two classes were going to kill me! I had no idea what they were talking about, but we learned to do circular integrals and all kinds of complicated math that to this day I still don't understand.

If you, like me, have forgotten this stuff, you need this book: "High Speed Digital Design." It's also subtitled "A Handbook of Black Magic," by Martin Graham and Howard Johnson. It's the best reference on the market. But if you open it up, you may be very afraid. It's full of math! The trick to getting useful information from this book is to not read the formulas. Just read the prose.

Now, I have purposely put this up incorrectly. That B on the left-hand side of the equality sign should be E. And if it were, that would be Faraday's Law, one of Maxwell's famous laws. That would relate the electric field to the magnetic field. The reason that I put it up there incorrectly is to really make a point that most of us have forgotten this stuff.

Let's look at some real signals. These are both obviously square waves. In this case, they're running at 1 megahertz. The top waveform is for a square wave with a transition time from 0 volts to 3 volts of 20 nanoseconds...20 nanoseconds to make this switch.

The bottom waveform is the same 1 megahertz square wave, but in this case the transition time from 0 to 3 volts is 1 nanosecond. It's a much faster rise time. But clearly, we can see that there's this oddball ringing going on. And this ringing is the entire focus of my discussion today.

The square wave's frequency is given right here. You can see it's 1 megahertz. Now watch as I change that frequency. I'm going to make it slower and slower and slower and slower. We're down to 80 kilohertz.

There's 50 kilohertz, 10 kilohertz. You'll notice that the signal hasn't changed at all. Just to illustrate this, I'm going to go back up in frequency. You can see the ringing has not changed in the slightest bit. Only the frequency of the waveform is changing.

And that's the first point I want to make today. Everyone talks about digital systems as being fast by looking at the clock rate. You know how we talk about that three gigahertz clock rate on your Pentium processor, and yeah, that is fast.

But when it comes to electromagnetic issues, that clock rate is practically irrelevant. What we really care about is the transition time; the time to go from a 0 to a 1, or a 1 back to a 0 again. As we saw, I changed the clock rate by several orders of magnitude, and the ringing didn't change at all.

What's that ringing about? Where does it come from? Well, what happens is a signal goes down a wire or a track on a printed circuit board, and eventually it hits the receiver. Now, impedance is nothing more than the AC resistance. It's measured in ohms. It's just like a resistor, except it has a frequency component, as we'll see in a little bit.

As it hits the receiver, if the impedance of that receiver does not exactly match the impedance of the driver, some of the signal will be reflected back. Of course, it goes back and hits the driver, there's an impedance mismatch, and it gets reflected back. Bing, bing, bing, and that's that ringing that we saw on the oscilloscope.
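The bounce described above is usually quantified with a reflection coefficient. The sketch below assumes the standard transmission-line model, in which each end of the trace is compared against the characteristic impedance of the trace itself; the function names and the impedance values (a 50 ohm trace, a 10 ohm driver, a 10 kilohm receiver input) are illustrative assumptions, not figures from the talk.

```python
def reflection_coefficient(z_end: float, z_line: float) -> float:
    """Fraction of an incident voltage wave reflected at one end of a line."""
    return (z_end - z_line) / (z_end + z_line)


def bounce_amplitudes(z_driver, z_line, z_receiver, bounces=6):
    """Relative amplitude of the wave after each successive reflection.

    Starts with a unit wave arriving at the receiver, then alternates
    between the receiver end and the driver end of the line.
    """
    gamma_rx = reflection_coefficient(z_receiver, z_line)
    gamma_tx = reflection_coefficient(z_driver, z_line)
    amplitude = 1.0
    result = []
    for i in range(bounces):
        amplitude *= gamma_rx if i % 2 == 0 else gamma_tx
        result.append(amplitude)
    return result


# Assumed values: 10 ohm driver, 50 ohm trace, 10 kilohm receiver input.
for i, a in enumerate(bounce_amplitudes(10.0, 50.0, 10_000.0), start=1):
    print(f"reflection {i}: {a:+.3f} of the original edge")
```

With a matched receiver (50 ohms on a 50 ohm trace) the first coefficient is zero, so nothing comes back and there is no ringing.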

There is one formula in the book that is really important, and that's this. If Tr is the rise time of the signal, the time to transition from a 0 to a 1 measured in nanoseconds, then any frequency above F, given by this formula will be so far down, I mean on the order of 40 decibels down, that it's not going to be important.

So if the signal has a 20 nanosecond rise time, like one of the signals we just looked at, then even if your clock rate is a lousy 1 hertz, your board looks like a 25 megahertz system. And 20 nanoseconds is really slow. It's hard to build systems that slow anymore; 1 nanosecond is more typical of rise times.

Again, if you're building a system with a 1 hertz clock, and the signals are transitioning in 1 nanosecond, your board looks like a 500 megahertz board with all of the problems that are associated with a 500 megahertz board.
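The formula shown on screen is not reproduced in this transcript. Both worked numbers above (20 nanoseconds giving 25 megahertz, 1 nanosecond giving 500 megahertz) are consistent with F = 0.5 / Tr, so the sketch below assumes that form; the function name is hypothetical.

```python
def knee_frequency_mhz(rise_time_ns: float) -> float:
    """Frequency (MHz) above which little signal energy remains.

    Assumes F = 0.5 / Tr, inferred from the two examples in the talk;
    the formula itself does not appear in the transcript.
    With Tr in nanoseconds, 0.5 / Tr is in GHz, hence 500 / Tr in MHz.
    """
    if rise_time_ns <= 0:
        raise ValueError("rise time must be positive")
    return 500.0 / rise_time_ns


for tr_ns in (20.0, 5.0, 1.0):
    print(f"Tr = {tr_ns:4.0f} ns -> F = {knee_frequency_mhz(tr_ns):5.0f} MHz")
```

Note that the clock rate never appears as an input: only the rise time matters.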

That book is subtitled "A Handbook of Black Magic," and sometimes it really does seem like black magic, but this is real, physical stuff. I'll give you an example. Some time ago, a company contacted me. They had been making 4 megahertz Z80 boards for 30 years. Had never had a problem. They had a new batch of boards manufactured, and none of them worked.

So we looked at them. Everything was the same. All of the part numbers were the same. Nothing was different. It turns out the IC vendors figured no one is going to complain if they make their parts faster than advertised.

Those logic components that used to switch in 15 nanoseconds were now switching in 5. So suddenly, their little old slow board had all of these high frequency components running around all over the board.

They had to redesign that board as a high speed digital board with ground planes and all of the other things we have to do to make these systems reliable.

Now if you remember any electromagnetics, and for all of you firmware people who probably weren't exposed to it at all, there was a dude named Fourier a couple of hundred years ago who showed that any periodic signal, any signal that repeats, can be represented as a sum of sine waves. So for example, the square waves that we just looked at: that can be represented as a sum of sine waves of different frequencies and different amplitudes.

It turns out, for a square wave, the odd harmonics are the ones that are included. In other words, it's going to be the sine of the base frequency plus the sine of 3 times the base frequency plus the sine of 5 times the base frequency, etc.
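A minimal sketch of that sum, for an ideal square wave swinging between -1 and +1. One detail the spoken description leaves out is that each odd harmonic n is weighted by 1/n, and the whole sum is scaled by 4/pi. The function name and the 1 megahertz base frequency are illustrative choices.

```python
import math


def square_wave_partial_sum(t: float, base_freq_hz: float, n_harmonics: int) -> float:
    """Sum the first n_harmonics odd harmonics of an ideal +/-1 square wave."""
    total = 0.0
    for k in range(n_harmonics):
        n = 2 * k + 1  # odd harmonics only: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * base_freq_hz * t) / n
    return 4.0 / math.pi * total


base = 1e6                  # 1 MHz square wave
quarter = 0.25 / base       # middle of the "high" half cycle
for count in (1, 3, 10, 200):
    value = square_wave_partial_sum(quarter, base, count)
    print(f"{count:3d} harmonics -> {value:.4f}")
```

The more harmonics are included, the closer the sum gets to the flat top of the square wave, and the sharper the edges become.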

Turns out, there's a tool that we can use to help us visualize this. It's called a spectrum analyzer. You may not have one of those, but pretty much any digital oscilloscope today can simulate a spectrum analyzer by computing the Fourier transform, again going back to Mr. Fourier, of an input signal.

Turning back to the scope again, the first trace is a display of a 50 megahertz sine wave. As you can see, it looks very nice and clean. It's coming out of my signal generator. As is typical on a scope, the vertical axis is voltage, the horizontal axis is time. So it's voltage versus time. I've told this scope on this bottom display to display the Fourier transform of that sine wave.

And this axis is frequency. So we have amplitude here, but they don't measure it in volts. It's measured in decibels relative to volts. But it's frequency across the bottom here, and sure enough, the peak is centered at 50 megahertz because that's the signal that I'm putting into the oscilloscope. And yeah, there's a lot of noise, but you can ignore that. As I tune the signal generator, I'm increasing the frequency.

You can see the sine wave frequency going up, and you can see the Fourier transform being shifted to the right. Or if I reduce the frequency, it gets shifted to the left. That's what a spectrum analyzer is all about. That's what fast Fourier transforms do. And now, we'll use this to show what's really going on with the electromagnetics.

So this top waveform is that 1 megahertz square wave we looked at earlier, and in this case, this is the slow one. The transition time is 20 nanoseconds. On the bottom display, we're looking at the Fourier transform of this 1 megahertz square wave.

And sure enough, the biggest peak right here is just where you expect it to be. It's at 1 megahertz, because that's the fundamental frequency we're talking about.

But Mr. Fourier taught us that the square wave can be represented as a sum of sine waves of different frequencies, and in the case of square waves, those are always odd multiples of the fundamental. So these peaks are at 3 megahertz, 5 megahertz, 7 megahertz. Look, here we are at 21 megahertz out there, and we still have a significant amount of energy in the waveform.

Now we're looking at that 1 megahertz square wave with the really fast rise time, 1 nanosecond. In this case, here's the 1 megahertz peak where most of the energy is, and you can see that the sine wave coefficients fall off in magnitude. But even way out here, we're talking about 180 megahertz. In other words, there's a significant amount of energy being released at 180 megahertz. And that's what electromagnetics is all about.
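The difference between the two spectra can be estimated without a scope. The sketch below assumes an idealized trapezoidal waveform (equal rise and fall times, 50% duty cycle, 3 volt swing) and uses the textbook Fourier-series envelope for such a pulse train; real signals with ringing will differ, and the function names are hypothetical.

```python
import math


def _sinc(x: float) -> float:
    """sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def harmonic_amplitude(n, amplitude_v, period_s, rise_time_s, duty=0.5):
    """Amplitude (volts) of the n-th harmonic of a trapezoidal pulse train.

    Assumes equal rise and fall times; duty is the pulse width at the
    half-amplitude points divided by the period.
    """
    return (2.0 * amplitude_v * duty
            * abs(_sinc(n * math.pi * duty))
            * abs(_sinc(n * math.pi * rise_time_s / period_s)))


period = 1e-6  # 1 MHz square wave
for n in (1, 3, 21, 181):
    slow = harmonic_amplitude(n, 3.0, period, 20e-9)
    fast = harmonic_amplitude(n, 3.0, period, 1e-9)
    print(f"{n:3d} MHz: 20 ns edge {20 * math.log10(slow):6.1f} dBV, "
          f"1 ns edge {20 * math.log10(fast):6.1f} dBV")
```

At the fundamental the two edges are nearly indistinguishable; far out in the spectrum the 1 nanosecond edge carries much more energy than the 20 nanosecond edge, even though the clock rate is identical.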

So how do we fix this ringing problem? Well I've already given you a clue. The problem is that the impedances of the receiver and the driver are not matched. So obviously, the solution is to match the impedance. And there are many ways we can do this. One of which is to use a resistor network. Two resistors configured just as shown here. Let's see what happens.

So if I match the impedance by putting that resistor network on, look how that signal gets cleaned up. Just from two little resistors, takes out most of that energy. Not all of it, nothing's perfect, but it does a pretty darn good job.
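The resistor network itself is only shown on screen, so the following is a sketch under an assumption: that the two resistors form a split (Thevenin) termination, one from the signal to the supply and one from the signal to ground. In that arrangement the receiver end looks like the two resistors in parallel, and that parallel value is what should match the trace. The function name and the component values are illustrative.

```python
def thevenin_termination(r_pullup_ohms: float, r_pulldown_ohms: float, vcc_volts: float):
    """Return (equivalent resistance, idle bias voltage) of a split termination.

    Assumes one resistor to the supply and one to ground at the receiver.
    """
    total = r_pullup_ohms + r_pulldown_ohms
    r_equiv = r_pullup_ohms * r_pulldown_ohms / total  # parallel combination
    v_bias = vcc_volts * r_pulldown_ohms / total       # voltage-divider midpoint
    return r_equiv, v_bias


# Assumed values: two 100 ohm resistors on a 3 V supply, 50 ohm trace.
r_eq, v_idle = thevenin_termination(100.0, 100.0, 3.0)
print(f"termination looks like {r_eq:.0f} ohms, idles at {v_idle:.1f} V")
```

The cost of this kind of termination is a constant DC current through the two resistors, which is one reason it tends to be reserved for the signals that really need it.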

Why would a firmware person care about this stuff? After all, you're not doing the electronics, you're doing your firmware. Well in the process of debugging your code, you're going to be probing the board in all sorts of mysterious ways.

With oscilloscopes and logic analyzers, and who knows what else, every time you put a probe on a node, you'll be changing the impedance of that node. Because a probe is not a perfect device. It has an impedance of its own.

So as you change the impedance of the node, you will create the ringing, the overshoot problems that we've seen. So from a firmware perspective, I think it's very important that you're aware of this. Because you don't want the act of probing the board to cause the board to suddenly fail.

How can it fail? Catastrophically. You saw how big that ringing is. If that ringing goes much above the power supply, which is very easy to do, then the devices, the ICs on your board, can go into what's called SCR latchup.

And all that means is they're trying to connect power to ground inside the chip. They can do that for like, a microsecond before things burn up. And again, it sounds like black magic, right? Well years ago, when I was in the emulator business, we built our first of a whole new generation of emulators. It was a dead-slow device. It was 6 megahertz.

These things would be inserted in place of the CPU on people's boards. But it drove those signals very fast, because that's what our customers wanted us to do. And we tested this first new product on all of our test platforms and everything was perfect.

We sent it to the first customer, and he plugged it into his board. And I'm not making this up: every chip on his board exploded. The plastic blew apart on the chips. This guy was pretty ticked. But it's a very real effect.

So I recommend that at least on prototypes, you firmware people make sure that your hardware folks put some sort of termination network on critical signals, like the ones that are being displayed right now, that you're likely to be probing frequently. These are signals that tend to be very sensitive to the very least amount of ringing and overshoot. So a handful of resistors, and you can solve all of those problems.

Thanks for watching. Don't forget to check back for more videos and over 1,000 articles about better ways of building embedded systems on ganssle.com, and be sure to sign up for the free Embedded Muse newsletter.