The Embedded Muse 259

Issue Number 259, April 21, 2014
Copyright 2014 The Ganssle Group

Editor: Jack Ganssle, jack@ganssle.com

You may redistribute this newsletter for noncommercial purposes. For commercial use contact jack@ganssle.com. To subscribe or unsubscribe go to https://www.ganssle.com/tem-subunsub.html or drop Jack an email.

Contents

Editor's Notes
Quotes and Thoughts
Tools and Tips
On Enabling Interrupts Inside ISRs
A Software Team Test
Jobs!
Joke For The Week
Advertise With Us
About The Embedded Muse

Editor's Notes

Did you know it IS possible to create accurate schedules? Or that most projects consume 50% of the development time in debug and test, and that it’s not hard to slash that number drastically? Or that we know how to manage the quantitative relationship between complexity and bugs? Learn this and far more at my Better Firmware Faster class, presented at your facility. See https://www.ganssle.com/onsite.htm.

Quotes and Thoughts

"There are only two kinds of languages: the ones people complain about and the ones nobody uses." - Bjarne Stroustrup

Tools and Tips

Please submit clever ideas or thoughts about tools, techniques and resources you love or hate. Here are the tool reviews submitted in the past.

Bill Gatliff had some more thoughts about compilers:

With regards to Vince Bush's comment:

"I've been programming since the early 80's and I still love Assembly language for one simple but very important reason. What you see is what you get! No tricks required. I could always estimate the timing AND once optimized I knew it would stay that way. There was no compiler black magic changing things simply because I made a change in some other section of code."

I'm almost as passionate as he is about assembly language, but... ARM "pseudo instructions" can replace one perfectly benign-looking instruction like "ADRL" with two opcodes, one of which is a memory transfer. And the compiler doesn't do it---the assembler does. I'm pretty sure other instruction set architectures have similar examples. So even the "what you see is what you get" thing isn't completely true anymore.

That said, I wish more developers looked at their assembly language at all. Kudos to him.

As to his other point about "tricking" the compiler into generating better code, I've found that the more your code looks like a third-grader wrote it, the more likely it is that the compiler and assembler will work together to glean your intentions properly. That's the only way to universally get good-quality code from your toolchain: you have to feed the compiler non-obfuscated code so that it will "understand" what you are doing. I rarely see truly horrible code come through a decent toolchain anymore, unless the original C code was equally horrible. Most of the time, the assembly language is pretty close to what I would have written.

Stupid-looking, straightforward, non-elegant C code is also a lot friendlier to your future developers. Look, at its core a CPU can only add and compare... so that's what your C code has to boil down to anyway. Don't use magic structures and vtable-like function pointer jumps unless the problem you're trying to solve truly demands it, and your toolchain will generally reward you with equally straightforward and understandable output.

(Note also that sometimes, that "bad" code your compiler keeps emitting is because of a rule in C that you didn't know about, which constrains its behavior to avoid a technical concern that you didn't know existed. You'd be ill-advised to try to force the compiler to do the wrong thing in such situations.)
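As an illustration of Bill's point about function-pointer jumps versus plain code, here is a minimal sketch (not from Bill; the command names and both functions are hypothetical). The two versions compute the same thing. The first reads top to bottom; the second hides the control flow in a table, which is only worth it when the problem really demands it.

```c
#include <assert.h>

typedef enum { CMD_LED_ON, CMD_LED_OFF, CMD_LED_TOGGLE, CMD_COUNT } command_t;

/* Plain version: every case is visible at the point of use. */
int apply_command_plain(int led, command_t cmd)
{
    assert(led == 0 || led == 1);
    switch (cmd) {
    case CMD_LED_ON:     return 1;
    case CMD_LED_OFF:    return 0;
    case CMD_LED_TOGGLE: return !led;
    default:             return led;   /* unknown command: leave state alone */
    }
}

/* Table version: same behavior, but the reader (and the optimizer)
   must chase an indirect call to find out what happens. */
static int cmd_on(int led)     { (void)led; return 1; }
static int cmd_off(int led)    { (void)led; return 0; }
static int cmd_toggle(int led) { return !led; }

static int (*const command_table[CMD_COUNT])(int) = {
    cmd_on, cmd_off, cmd_toggle
};

int apply_command_table(int led, command_t cmd)
{
    assert(led == 0 || led == 1);
    if ((unsigned)cmd >= CMD_COUNT) {
        return led;                    /* unknown command: leave state alone */
    }
    return command_table[cmd](led);
}
```

Comparing the assembly listing your own toolchain produces for the two functions is a quick way to see how much the indirection costs on your target.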

I mentioned the slide rule last issue. Don De Witt and Phil Martel sent this link to a slide rule emulator. If you grew up using rules you'll get a kick out of it.

Martin Thompson wrote about the Red Pitaya, a USB scope/logic analyzer:

Here's a review of the Red Pitaya which may give you more detail than the website (which is a bit sparse on technical detail!).

Siglent sent the 100 MHz version of their SDS 1102CML digital scope for review. This is a bench top instrument, not a USB scope. Within the next month or two I'll post a full analysis. My initial impression after an hour or two operating it is that it packs a tremendous amount of capability into a $379 instrument.

On Enabling Interrupts Inside ISRs

Do you re-enable interrupts in your interrupt service routines (ISRs) early, rather than just before the return?

In the past I have advocated for turning interrupts back on as early as possible, in most cases, so other events can get serviced. But Phil Koopman, in a recent blog post, has challenged that thinking. Phil kindly allowed me to repost that here. I also highly recommend his blog, which lately has had a number of interesting postings about safety in embedded systems.

Phil's post:

A previous post on rules for using interrupts included the rule:

"Don't re-enable interrupts within an Interrupt Service Routine (ISR). That's just asking for subtle race condition and stack overflow problems."

Some developers take a different point of view, and feel that it is best to re-enable interrupts to let higher priority ISRs run without delay. The idea is that high priority interrupts should run as soon as possible, without having to wait for low priority interrupts to complete. Re-enabling interrupts within an ISR seems to let that happen. BUT, while there might be some intuitive appeal to this notion, this is a dangerous practice that makes things worse instead of better.

First, to recap, when an interrupt triggers an ISR one of the first things that happens is that further interrupts get masked by the interrupt handling hardware mechanisms. Once the ISR starts running, at some point (best is at the beginning) it acknowledges the interrupt source, clearing the interrupt request that triggered the ISR. At that point the ISR can re-enable interrupts if it wants to, or leave them masked until the ISR completes execution. (The "return from interrupt" instruction will typically restore interrupt flags, re-enabling interrupts as appropriate when the ISR completes.) If interrupts are re-enabled within the ISR, then another interrupt can suspend the ISR and run some other, second ISR. This means that if a higher priority interrupt comes along, it can run its ISR right away.
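The sequence in the recap can be sketched in C. This is an illustration that runs on a desktop, not code from Phil's post: ordinary variables stand in for the device registers and the CPU's interrupt-enable flag, and every name is hypothetical. The ISR acknowledges its source first, does a little work, and returns without touching the interrupt enable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for hardware state. On a real part these would be registers
   and CPU flags; every name here is hypothetical. */
static volatile bool    irq_request;               /* request from the device */
static volatile bool    interrupts_enabled = true; /* global interrupt enable */
static volatile uint8_t rx_data_reg;               /* device data register */

static uint8_t rx_buffer[16];
static uint8_t rx_count;

/* The ISR: acknowledge first, do a small amount of work, return.
   It never re-enables interrupts itself. */
static void uart_rx_isr(void)
{
    assert(!interrupts_enabled);   /* hardware masked interrupts on entry */
    irq_request = false;           /* acknowledge the interrupt source */
    if (rx_count < sizeof rx_buffer) {
        rx_buffer[rx_count++] = rx_data_reg;
    }
}

/* Models what the interrupt hardware does around the ISR. */
void simulate_rx_interrupt(uint8_t byte)
{
    rx_data_reg = byte;
    irq_request = true;
    if (interrupts_enabled && irq_request) {
        interrupts_enabled = false;   /* entry: further interrupts masked */
        uart_rx_isr();
        interrupts_enabled = true;    /* "return from interrupt" restores flags */
    }
}
```

On real hardware the entry and exit steps in simulate_rx_interrupt() are done by the CPU, and the ISR declaration syntax depends on the compiler and part.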

If a lower priority interrupt comes along, it also gets to run, suspending the currently running, higher priority ISR. Interrupt priority hardware does not keep track of interrupt history after an ISR starts and acknowledges its interrupt source, and so loses track of how high the priority is for the running ISR. Worse, if a high and low priority interrupt happen at the same time, this approach guarantees that the high priority ISR waits for the low priority ISR. This happens because the high priority ISR runs first, then gets preempted by the lower priority ISR as soon as interrupts are re-enabled. In the case where no other interrupts are pending, the low priority ISR runs to completion before the high priority ISR gets to finish.
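The ordering problem described above can be reproduced with a small desktop simulation. This is a sketch under simplifying assumptions, not code from Phil's post: there are two sources, both pending at the same moment, a single interrupt level, and a dispatcher that always picks the highest-priority pending request whenever interrupts are enabled. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { IRQ_HIGH = 0, IRQ_LOW = 1, IRQ_COUNT = 2 };  /* lower index = higher priority */

static bool pending[IRQ_COUNT];
static bool irq_enabled;
static bool reenable_in_isr;
static char event_log[64];

static void log_event(const char *text)
{
    strncat(event_log, text, sizeof event_log - strlen(event_log) - 1);
}

static void dispatch(void);

static void isr(int irq)
{
    assert(irq >= 0 && irq < IRQ_COUNT);
    pending[irq] = false;                    /* acknowledge the source */
    log_event(irq == IRQ_HIGH ? "H+ " : "L+ ");
    if (reenable_in_isr) {
        irq_enabled = true;                  /* the practice under discussion */
        dispatch();                          /* anything pending preempts us now */
    }
    log_event(irq == IRQ_HIGH ? "H- " : "L- ");
}

/* Models the hardware: take the highest-priority pending request,
   mask interrupts on entry, restore them on return. */
static void dispatch(void)
{
    while (irq_enabled) {
        int irq = -1;
        for (int i = 0; i < IRQ_COUNT; i++) {
            if (pending[i]) { irq = i; break; }
        }
        if (irq < 0) {
            return;
        }
        irq_enabled = false;
        isr(irq);
        irq_enabled = true;
    }
}

/* Returns the order of ISR starts (+) and finishes (-). */
const char *run_scenario(bool reenable)
{
    event_log[0] = '\0';
    reenable_in_isr = reenable;
    pending[IRQ_HIGH] = true;
    pending[IRQ_LOW] = true;
    irq_enabled = true;
    dispatch();
    return event_log;
}
```

In this model run_scenario(true) logs "H+ L+ L- H- " (the high priority ISR starts first but finishes last), while run_scenario(false) logs "H+ H- L+ L- ".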

The ISRs nest as above rather than running one at a time, filling up the stack. You might be able to account for this by allocating enough stack for all ISRs to be active at the same time. But if you leave interrupts masked in ISRs, the worst case is only the single biggest ISR stack use. (Some hardware has multiple tiers/levels/classes ... pick your favorite term ... of interrupts, but in that case it is still only one ISR of stack use per tier rather than one per ISR source.)
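The stack arithmetic is simple enough to write down. In this sketch (an illustration, not from Phil's post) the per-ISR stack figures are made-up numbers, and the nested case assumes each ISR is active at most once, which is optimistic if ISRs can be re-triggered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-ISR stack use in bytes; measure your own. */
const size_t example_isr_stack_bytes[] = { 64, 96, 48, 128 };
#define EXAMPLE_ISR_COUNT \
    (sizeof example_isr_stack_bytes / sizeof example_isr_stack_bytes[0])

/* ISRs re-enable interrupts: in the worst case all of them are nested
   at once, so their stack use adds up (assuming no ISR runs twice). */
size_t stack_if_isrs_nest(const size_t *isr_stack_bytes, size_t count)
{
    assert(isr_stack_bytes != NULL || count == 0);
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += isr_stack_bytes[i];
    }
    return total;
}

/* ISRs leave interrupts masked: only one runs at a time, so the worst
   case is the single hungriest ISR. */
size_t stack_if_isrs_masked(const size_t *isr_stack_bytes, size_t count)
{
    assert(isr_stack_bytes != NULL || count == 0);
    size_t biggest = 0;
    for (size_t i = 0; i < count; i++) {
        if (isr_stack_bytes[i] > biggest) {
            biggest = isr_stack_bytes[i];
        }
    }
    return biggest;
}
```

With the example figures the nested budget is 336 bytes against 128 bytes when interrupts stay masked, in both cases on top of whatever the main program needs.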

The same ISR might run more than once at a time, especially if it got unlucky and was preempted by other ISRs, delaying its completion time. For example, if you get a burst of noise on an ISR hardware line you might kick off a half dozen or so copies of the same ISR. Or once in a while hardware events happen close together and re-trigger the ISR. This will lead to trouble if your ISR code is not re-entrant. It also could overflow the stack, ending up in memory corruption, etc. unless you can accurately predict or limit how many times ISRs can be re-triggered in absolute worst-case conditions.
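Here is a sketch of what "not re-entrant" means in practice. It is an illustration, not code from Phil's post, and the names are hypothetical. The handler keeps its working value in a static variable, and a counter simulates the same interrupt firing again in the window where interrupts would have been re-enabled.

```c
#include <assert.h>
#include <stdint.h>

static uint16_t scratch;          /* shared by every invocation: not re-entrant */
static int      retriggers_left;  /* how many nested re-triggers to simulate */
static int      depth;            /* current nesting depth */
static int      max_depth;        /* deepest nesting seen so far */

/* Doubles a sample, using the shared scratch variable on the way. */
uint16_t scale_sample_isr(uint16_t sample)
{
    depth++;
    if (depth > max_depth) {
        max_depth = depth;
    }

    scratch = sample;                      /* step 1: stash the input */

    if (retriggers_left > 0) {             /* window with interrupts re-enabled */
        retriggers_left--;
        (void)scale_sample_isr(999);       /* the same ISR fires again */
    }

    uint16_t result = (uint16_t)(scratch * 2u);  /* step 2: scratch may be stale */
    depth--;
    return result;
}
```

Called with retriggers_left set to 0, scale_sample_isr(10) returns 20. With retriggers_left set to 1, the nested call overwrites scratch and the outer call returns 1998 instead, and max_depth shows that each re-trigger also costs another stack frame.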

You could say "the highest priority ISR doesn't re-enable interrupts" -- but what about the second-highest priority ISR? Once you get more than a couple ISRs involved this gets hopeless to untangle. You could try to write some sort of ISR handler to mitigate some of these risks, but it's going to be difficult to get right, and add overhead to every ISR. In all, the situation sounds pretty messy and prone to problems ... and it is. You might get away with this on some systems some of the time if you are really good (and never make mistakes). But, getting concurrency-related tricky code right is notoriously difficult. Re-enabling interrupts is just asking for problems.

So let's look at the alternative. What is the true cost you might be trying to avoid in terms of delaying that oh-so-urgent high priority ISR because you're not re-enabling interrupts in an ISR?

The worst case is that the longest-running low priority ISR runs to completion, making all the higher priority ISRs wait for it to complete before they can start. But after that all the remaining ISRs that are pending will complete in priority order -- highest to lowest priority. That's exactly what you want except for the low priority ISR clogging up the works. So if you have an obnoxiously long low priority ISR that's a problem. But if none of your ISRs run for very long (which is how you're supposed to write ISRs), you're fine. Put into scheduling terms, you want to make sure none of your ISRs runs long, because a long-running ISR gives you a high blocking time, and blocking time delays high priority tasks from completing.

Let's compare outcomes for the two alternative strategies. If you re-enable interrupts, the worst case latency for the highest priority ISR in the system is that it arrives, and then gets preempted by every other ISR in the system (including the longest-running ISR if it comes in later, but before the high priority ISR has a chance to complete). If you leave interrupts masked, the worst case is that the longest-running ISR has to complete, but then the high priority ISR goes immediately afterward. So, leaving interrupts disabled (masked) during every ISR is clearly a win for the worst case, in that you only have to wait for the longest-running ISR to complete before running the highest priority ISR, instead of waiting for all ISRs to complete. The worst case is typically what you care about in a real time embedded system, so you should leave interrupts disabled in ISRs to ensure the fastest worst-case completion time of high priority ISRs. And, leaving interrupts disabled in ISRs also gets rid of the risks of stack overflow and re-triggered ISRs we mentioned.
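The comparison can be put into numbers. The sketch below is an illustration, not from Phil's post: the execution times are invented, and it assumes a single interrupt level with each of the other ISRs triggering at most once while the high priority ISR is waiting or running.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical execution times in microseconds. */
const unsigned high_priority_isr_us = 5;
const unsigned other_isr_us[] = { 20, 50, 10 };
#define OTHER_ISR_COUNT (sizeof other_isr_us / sizeof other_isr_us[0])

/* ISRs re-enable interrupts: the high priority ISR can be preempted by
   every other ISR, so it finishes after all of them have run. */
unsigned worst_case_reenabled(unsigned own_us, const unsigned *others_us,
                              size_t count)
{
    assert(others_us != NULL || count == 0);
    unsigned total = own_us;
    for (size_t i = 0; i < count; i++) {
        total += others_us[i];
    }
    return total;
}

/* ISRs leave interrupts masked: the high priority ISR waits for at most
   one ISR already in progress (the longest one), then runs undisturbed. */
unsigned worst_case_masked(unsigned own_us, const unsigned *others_us,
                           size_t count)
{
    assert(others_us != NULL || count == 0);
    unsigned longest = 0;
    for (size_t i = 0; i < count; i++) {
        if (others_us[i] > longest) {
            longest = others_us[i];
        }
    }
    return own_us + longest;
}
```

With these figures the high priority ISR completes within 85 microseconds in the worst case if ISRs re-enable interrupts, and within 55 microseconds if they leave interrupts masked.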

UPDATE: To avoid confusion, it's important to note that the above is talking about what happens at ONE level of interrupts, such that when one ISR is running no other interrupts run until the ISR completes or ISRs at that level complete. Many architectures have multiple levels, in which one ISR can interrupt another ISR at a lower level even if that lower level has interrupts masked. This corresponds to the comment about one ISR per level being active in the worst case. Also, note that if an architecture can change the priorities of interrupts within a single level that's irrelevant -- it is the existence of levels that are each individually maskable and that are prioritized as groups of interrupts per level that gives a way around some of these problems. So, going back to what the title says, do not RE-enable the same level of interrupts within an ISR.
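For hardware with several levels, the rule of one active ISR per level can be sketched the same way. This is an illustration, not from Phil's post. It assumes that a larger level number means higher priority (conventions differ between architectures), and the stack figures and type names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned level;        /* interrupt level the ISR belongs to */
    size_t   stack_bytes;  /* stack used by that ISR */
} isr_info_t;

/* Hypothetical system: two ISRs on level 0, two on level 1. */
const isr_info_t example_isrs[] = {
    { 0, 64 }, { 0, 96 }, { 1, 48 }, { 1, 128 },
};
#define EXAMPLE_ISR_COUNT (sizeof example_isrs / sizeof example_isrs[0])

/* With interrupts left masked inside each level, only a strictly higher
   level may preempt a running ISR. */
bool can_preempt(unsigned running_level, unsigned incoming_level)
{
    return incoming_level > running_level;
}

/* Worst-case ISR stack: at most one ISR active per level, so take the
   biggest ISR on each level and add the levels together. */
size_t stack_one_isr_per_level(const isr_info_t *isrs, size_t count,
                               unsigned level_count)
{
    assert(isrs != NULL || count == 0);
    size_t total = 0;
    for (unsigned level = 0; level < level_count; level++) {
        size_t biggest = 0;
        for (size_t i = 0; i < count; i++) {
            if (isrs[i].level == level && isrs[i].stack_bytes > biggest) {
                biggest = isrs[i].stack_bytes;
            }
        }
        total += biggest;
    }
    return total;
}
```

For the example system the worst case is 96 + 128 = 224 bytes, rather than the 336 bytes needed if all four ISRs could nest.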

A Software Team Test

John Black sent this link, which is Joel Spolsky's 12 points for evaluating the health of a software team. Joel is the author of the Joel on Software blog, which was (maybe still is) a hugely popular site. It's rarely updated anymore.

The twelve points take almost no time to check, and in the past I've recommended them as a way to decide if you should accept a job from an organization. The list doesn't pretend to be complete or scientific, but is a terrific set of rules of thumb.

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intents of this newsletter. Please keep it to 100 words.

Joke For The Week

Note: These jokes are archived at www.ganssle.com/jokes.htm.

What you DON'T want to hear your System Administrator say:

82. We prefer not to change the root password, it's a nice easy one...
81. You've got TECO. What more do you want?
80. ...and after I patched the microcode...
79. We don't support that. We WON'T support that.
78. I don't care what he says, I'm NOT having it on MY network.
77. Can you get VMS for this Sparc thingy?
76. Now it's funny you should ask that, because I don't know either...
75. What's this hash prompt on my terminal mean?
74. Just add yourself to the password file and make a directory...
73. ...and if we just swap these two disk controllers like this...
72. Where did you say those backup tapes were kept?
71. Hey, what does mkfs do?
70. Well, I've got a backup, but the only copy of the restore program was on THAT disk...
69. What happens to a hard disk when you drop it?
68. You can do this patch with the system up...
67. Why did it say '/bin/rm: not found'?
66. I hate it when that happens.
65. Boy, it's a lot easier when you know what you're doing.
64. Oops! Save your work, everyone! FAST!!!
63. The network's down, but we're working on it. Come back after dinner. (Usually said at 2200 the night before thesis deadline.)
62. Ummm.... Didn't you say you turned it off?
61. I found this rabbit program that is supposed to test system performance and I have it running now.
60. What is all this I hear about static charges destroying computers?
59. I think we can plug just one more thing in to this outlet strip without tripping the breaker.
58. It is only a minor upgrade, the system should be back up in a few hours. (This said on a Monday afternoon.)
57. The sprinkler system isn't supposed to leak is it?
56. Hey Fred, did you save that posting about restoring filesystems with vi and a toothpick? More importantly, did you print it out?
55. The backup procedure works fine, but the restore is tricky!
54. System coming down in 0 minutes...
53. Was that YOUR directory?
52. If I'd known it wasn't going to work, I would have tested it sooner.
51. Say, what does "Superblock Error" mean, anyhow?
50. Tell me again what that '-r' option to rm does...
49. What's this switch for anyways?
48. What do you mean that could take down the whole network?
47. YEEEHAA!!! What a CRASH!!!
46. [looks at workstation] "Say, what version of DOS is this thing running?"
45. NO!!! Not THAT button!!!
44. Sorry, we deleted that package last week.
43. You did WHAT to the floppy???
42. What did you say your (l)user name was...? ;-)
41. Wonder what THIS command does?

(To be continued).

Advertise With Us

Advertise in The Embedded Muse! Over 23,000 embedded developers get this twice-monthly publication.

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster.