Real Time Programming
Basics of Real Time. Originally in Embedded Systems Programming, July, 1998.
Quick - come up with a one-sentence definition of
"embedded"! With embedded PCs becoming common, and even MIPS chips buried
in inexpensive consumer products, "embedded" is a term whose meaning is
ever more nebulous.
So too for the designation "Real Time", a term
whose meaning is more in the mind of the beholder than cast in linguistic
concrete. In fact, the community recognizes this confusion by defining two sorts
of real-time - "hard" and "soft".
A hard real time task or system is one where an
activity simply must be completed - always - by a specified deadline. The
deadline may be a particular time or time interval, or may be the arrival of
some event. Hard real time tasks fail, by definition, if they miss such a
deadline.
Notice this definition makes no assumptions about the
frequency or period of the tasks. A microsecond or a week - if missing the
deadline induces failure, then the task has hard real time requirements.
"Soft" real time, though, has a definition as
weak as its name. By convention it's that class of systems that are not
hard real time, though generally there is some sort of timeliness requirement.
If missing a deadline won't compromise the integrity of the system, if
generally getting the output in a timely manner is acceptable, then the
application's real time requirements are "soft". Sometimes soft real time
systems are those where multi-valued timeliness is acceptable: bad, better and
best responses are all within the scope of possible system operation.
Interrupts are inextricably linked with real time
systems, as only the interrupt bypasses the time-consuming tedium of polling
multiple asynchronous inputs. Yet a surprising number of very fast applications
are crippled by the overhead associated with servicing interrupts. Though chip
vendors spec interrupt latency in terms of the time the hardware needs to
recognize the external event, to firmware folks a more useful measure is
time-from-input to the time we're doing something useful, which may be many
dozens of clock cycles. The multiple levels of vectoring needed by the average
processor, plus important housekeeping like context pushing, are all ultimately
overhead incurred before the code starts doing something useful.
Similarly, real time operating systems (RTOS) are one
of the most important and common tools in the real time arsenal. Yet the RTOS
too provides no guarantee of real time response.
The first rule of real time design is to know the
worst case performance requirements of each activity, and only then select the
right implementation (CPU, hardware design, and firmware organization). It's
important to think in the time domain as well as in the conventional
procedural one.
Enter the RTOS
A real time application may employ the lowliest of 4 bit
CPUs running a simple polled loop. In fact, these tend to be the most
deterministic of all systems as their simple requirements are easy to
understand, and to satisfy in a timely manner.
Fortunately - for our job security - most products
today manage multiple independent inputs, outputs, and activities. Sure, with
enough work a suitably convoluted polling program can handle many complex real
time operations. We can also write our code entirely in hex codes.
Whenever an application manages multiple processes
and devices, whenever one handles a variety of activities, an RTOS is a logical
tool that lets us simplify the code and help it run better.
Consider the difficulty of building, say, a printer.
Without an RTOS one monolithic hunk of code would have to manage the door
switches and paper feeding and communications and the print engine - all at the
same time. Add an RTOS and individual tasks each manage one of these activities;
except for some status information no task needs to know much about what any
other one is doing. In this case the RTOS allows us to partition our code in the
time domain (each of these activities runs concurrently) and procedurally
(each task handles one thing).
An important truism of software engineering is that code
complexity - and thus development time - grows much faster than program size.
Any mechanism that segments the code into many small independent pieces reduces
the complexity; after all, this is why we write with lots of functions and not
one huge main() program. Clever partitioning yields better programs faster, and
the RTOS is probably the most important way to partition code in the time
dimension.
At its simplest level an RTOS is a context switcher.
You break your application into multiple tasks and allow the RTOS to execute the
tasks in a manner determined by its scheduling algorithm. A round robin
scheduler typically allocates more or less fixed chunks of time to the tasks,
executing each one for a few milliseconds or so before suspending it and going
to the next ready task in the queue. In this way all tasks get their fair shot
at some CPU time.
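The heart of a round robin scheduler can be sketched in a few lines. This is only an illustration, not the code of any particular RTOS: the task_t structure and the rr_next() function are names made up for the example, and a real kernel would call something like it from its timer tick before switching contexts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task table entry: just a ready flag for this sketch. */
typedef struct {
    bool ready;
} task_t;

/* Return the index of the next ready task after 'current', wrapping
 * around the table, or -1 if no task is ready. 'current' must be a
 * valid index (0 .. n-1). If only the current task is ready, it is
 * selected again. */
int rr_next(const task_t tasks[], size_t n, int current)
{
    for (size_t step = 1; step <= n; step++) {
        size_t candidate = ((size_t)current + step) % n;
        if (tasks[candidate].ready) {
            return (int)candidate;
        }
    }
    return -1;
}
```

Every ready task is visited in table order, so none is starved; the cost is that nothing here knows which task is the urgent one.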
Another sort of scheduler is one using RMA - Rate
Monotonic Analysis. If the CPU is not completely performance bound, it's
sometimes possible to guarantee hard real time response by giving each task a
priority inversely proportional to the task's period: the more often a task
must run, the higher its priority.
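As a rough sketch of the idea, the code below sorts a task set by period, hands out priorities in that order, and then applies one well-known sufficient test for rate monotonic scheduling, the hyperbolic bound. The rm_task_t structure and the function names are inventions for this example. The test assumes independent periodic tasks whose deadlines equal their periods, and it is only sufficient: a task set that fails it may still be schedulable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical task description: period and worst-case execution time
 * in the same time unit (say, microseconds). */
typedef struct {
    unsigned long period;
    unsigned long wcet;
    int priority;          /* filled in below; larger = more urgent */
} rm_task_t;

static int by_period(const void *a, const void *b)
{
    const rm_task_t *x = a, *y = b;
    return (x->period > y->period) - (x->period < y->period);
}

/* Sort by period and give the shortest period the highest priority. */
void rm_assign_priorities(rm_task_t tasks[], size_t n)
{
    qsort(tasks, n, sizeof tasks[0], by_period);
    for (size_t i = 0; i < n; i++) {
        tasks[i].priority = (int)(n - i);
    }
}

/* Hyperbolic bound: the product of (U_i + 1) over all tasks must not
 * exceed 2, where U_i = wcet / period. Returns 1 if the set passes. */
int rm_passes_hyperbolic_bound(const rm_task_t tasks[], size_t n)
{
    double product = 1.0;
    for (size_t i = 0; i < n; i++) {
        product *= (double)tasks[i].wcet / (double)tasks[i].period + 1.0;
    }
    return product <= 2.0;
}
```

The worst-case execution times are the hard part; the arithmetic is only as good as those measurements.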
Regardless of scheduling mechanism, all RTOSes
include priority schemes so you can statically and dynamically cause the context
switcher to allocate more or less time to tasks. Important or time-critical
activities get first shot at running. Less important housekeeping tasks run only
as time allows. Your code sets the priorities; the RTOS takes care of starting
and running the tasks.
If context switching were the only benefit of an RTOS
then none would be more than a few hundred bytes in size. Novice users all too
often miss the importance of the sophisticated messaging mechanisms that are a
standard part of all commercial operating systems. Queues and mailboxes let
tasks communicate safely.
"Safely" is important, as global variables, the
old standby of the desperate programmer, are generally a Bad Idea and are deadly
in any interrupt-driven system. We all know how globals promote bugs by being
available to every function in the code; with multitasking systems they lead to
worse conflicts as several tasks may attempt to modify a global all at the same
time.
Instead, the operating system's communications
resources let you cleanly pass a message without fear of its corruption by other
tasks. Properly implemented code lets you generate the real time analogy of
OOP's first tenet: encapsulation. Keep all of the task's data local, bound
to the code itself and hidden from the rest of the system.
For instance, one challenge faced by many embedded
systems is managing system status info. Generally lots and lots of different
inputs, from door switches to the results of operator commands, affect total
status. Maintain the status in a global data structure and you'll surely find
it hammered by multiple tasks. Instead, bind the data to a task, and let other
tasks set and query it via requests sent through queues or mailboxes.
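Here is a minimal sketch of a status word owned by one task. Everything in it is hypothetical: the request format, the function names, and especially the little ring buffer, which merely stands in for the queue or mailbox your RTOS supplies. The ring buffer is not itself safe against preemption; the RTOS's own post and pend calls are what provide that guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request format carried by the queue. */
typedef enum { REQ_SET_BITS, REQ_CLEAR_BITS, REQ_QUERY } req_kind_t;

typedef struct {
    req_kind_t kind;
    uint32_t   bits;    /* bits to set or clear */
    uint32_t  *reply;   /* where to put the answer for REQ_QUERY */
} status_req_t;

#define QUEUE_DEPTH 8u

/* Stand-in for the RTOS queue: a plain ring buffer. */
static status_req_t queue[QUEUE_DEPTH];
static unsigned head, tail, count;

bool queue_post(const status_req_t *req)
{
    if (count == QUEUE_DEPTH) return false;   /* queue full */
    queue[head] = *req;
    head = (head + 1u) % QUEUE_DEPTH;
    count++;
    return true;
}

static bool queue_get(status_req_t *req)
{
    if (count == 0u) return false;            /* queue empty */
    *req = queue[tail];
    tail = (tail + 1u) % QUEUE_DEPTH;
    count--;
    return true;
}

/* The status word is private to this file: only the status task touches it. */
static uint32_t system_status;

/* Body of the status task: drain and act on every pending request. */
void status_task_step(void)
{
    status_req_t req;
    while (queue_get(&req)) {
        switch (req.kind) {
        case REQ_SET_BITS:   system_status |= req.bits;  break;
        case REQ_CLEAR_BITS: system_status &= ~req.bits; break;
        case REQ_QUERY:
            if (req.reply) *req.reply = system_status;
            break;
        }
    }
}
```

Other tasks never see system_status directly. They post a request and, for a query, read the reply once the status task has run.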
Is this slower than using a global? Sure. Uses more
memory, too. Just as we make some compromises in selecting a compiler over an
assembler, proper use of an RTOS trades off a bit of raw CPU horsepower for
better code that's easier to understand and maintain.
Most operating systems give you tools to manage
resources. Surely it's a bad idea for multiple tasks to communicate with a
UART or similar device simultaneously. One way to control this is to lock the
resource - often using a semaphore or other RTOS-supplied mechanism - so only
one task at a time can access the device.
Resource locking and priority systems lead to one of
the perils of real time systems: priority inversion. This is the deadly
condition where a low priority task blocks a ready and willing high priority
task.
Suppose the system is more or less idle. A
background, perhaps unimportant, task asks for and gets exclusive access to a
comm port. It's locked now, dedicated to the task till released. Suddenly an
oh-my-god interrupt occurs that starts off the system's highest priority and
most critical task. It, too, asks for exclusive comm port access, only to be
denied that by the OS since the resource is already in use. The high priority
task must wait; meanwhile any ready task of intermediate priority preempts the
lower one, which can't run, can't complete its activity, and thus can't
release the comm port. The least important activity of all has
blocked the most important!
Most operating systems recognize the problem and
provide a work-around. For example in VxWorks you can use their mutual exclusion
semaphores to enable "priority inheritance". The task that locks the
resource runs at the priority of the highest priority task that is blocked on
the same resource. This permits the normally less-important task to complete, so
it can unlock the resource and allow the high priority task to do its thing.
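The bookkeeping behind priority inheritance is simple enough to sketch. What follows is not the VxWorks API or any vendor's implementation; pi_task_t, pi_mutex_t and the two functions are hypothetical names, and the actual blocking and rescheduling a kernel would do are left out.

```c
#include <stddef.h>

/* Hypothetical task control block: larger number = higher priority. */
typedef struct {
    int base_priority;     /* priority assigned by the application */
    int active_priority;   /* priority the scheduler actually uses */
} pi_task_t;

typedef struct {
    pi_task_t *owner;      /* NULL when the resource is free */
} pi_mutex_t;

/* Try to take the mutex. Returns 1 on success. On failure the caller
 * would block, and the owner inherits the caller's priority if higher. */
int pi_mutex_lock(pi_mutex_t *m, pi_task_t *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return 1;
    }
    if (caller->active_priority > m->owner->active_priority) {
        m->owner->active_priority = caller->active_priority;
    }
    return 0;
}

/* Release the mutex and drop the owner back to its own priority. */
void pi_mutex_unlock(pi_mutex_t *m)
{
    if (m->owner != NULL) {
        m->owner->active_priority = m->owner->base_priority;
        m->owner = NULL;
    }
}
```

While the low priority task holds the port it runs at the blocked task's priority, so nothing in between can preempt it. The sketch handles a single mutex per task; nested locks need more care.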
Surveys indicate that even today vast numbers of
developers write their own RTOSes, a fact that's hard to reconcile with our
apparent devotion to software reuse. Something like 80 vendors offer a
staggering array of operating systems, ranging from tiny versions for PIC-like
CPUs to ones that provide complete windowing GUIs.
Pricing is all over the map, as some vendors sell the
RTOS outright, while others require royalty payments. Some provide only the
binary image of the operating system; others come with full source. Comparing
RTOS prices is difficult at best because of the wide range of pricing models,
different CPUs supported, and varying support options. Suffice to say that most
RTOSes sell for several thousand dollars. Royalty payments, if any, run around a
few bucks per unit or less. And be assured that a commercial RTOS is available
for just about any CPU.
Memory requirements are just as diverse, with smaller
versions requiring only a few K; others run into the megabytes.
A new wrinkle in the RTOS market appeared last year
when Microsoft released Windows CE, which is targeted at applications served by
some embedded RTOSes. It's already common in PDAs and similar barely-embedded
products. Will we see it take over more of the truly embedded market? That's a
question that only Bill Gates and Las Vegas can answer; suffice to say that
today the product's real time response is pretty dismal. Microsoft has
announced a program to improve CE's performance, targeting sub-50 microsecond
thread latencies by mid-1999.
An appeal of CE is its built-in GUI, something
more and more low-end systems are crying out for. Microsoft is not the only
vendor of GUIs, though. QNX, for example, long a vendor of RTOSes for embedded
x86 systems, sells their Photon microGUI, which delivers a POSIX API in a
reasonably-sized footprint. Unlike CE, QNX's product offers fast
multitasking today.
Memory Woes
Using an RTOS also brings new perils. One of the more
underreported ones is stack allocation. Most of us are familiar with the
scientific way we decide how big the stack should be on the system (take a guess
and hope). With an RTOS the problem is multiplied since every task has its own
stack.
It's feasible, though tedious, to compute stack
requirements when coding in assembly language by counting calls and pushes. C -
and even worse C++ - obscure these details. Runtime calls further distance our
understanding of stack use. Recursion, of course, can blow stack requirements
sky high.
Given that it's difficult at best to pick a logical
stack value, it makes sense to be prepared to observe stack use after you build
the system to ensure that a push doesn't run off the stack into critical
variables.
Since stack size is a guess, write your code from the
very start to find stack problems. In the startup code or whenever defining a
task fill the task's stack with a unique signature like 0x55AA. Then, probe
the stacks occasionally using your debugger and see just how many of the
assigned locations have been used (the 0x55AA will be gone). Knowledge is power.
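The signature trick takes only a few lines. In this sketch the function names are made up, the stack is treated as an array of 16 bit words, and it is assumed to grow downward, so the lowest address is the last one the task would ever reach. Turn the scan around for a CPU whose stack grows upward.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SIGNATURE 0x55AAu

/* Fill a task's stack with the signature before the task first runs. */
void stack_fill(uint16_t *stack, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack[i] = STACK_SIGNATURE;
    }
}

/* Count the words that still hold the signature, scanning up from the
 * lowest address. Assumes a downward-growing stack, so stack[0] is the
 * deepest word the task could use. */
size_t stack_unused_words(const uint16_t *stack, size_t words)
{
    size_t unused = 0;
    while (unused < words && stack[unused] == STACK_SIGNATURE) {
        unused++;
    }
    return unused;
}
```

Call stack_unused_words() from a low priority task or a debug command as well as from the debugger; a result near zero means the guess was too small. The count can overstate use slightly if the task happens to push a value equal to the signature.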
Since the stack is a source of trouble it's
reasonable to be paranoid and not allocate buffers and other sizable data
structures as automatics. Watch out! malloc(), a quite logical alternative,
brings its own set of problems. A program that dynamically allocates and
frees lots of memory - especially variably-sized blocks - will fragment the
heap. At some point it's quite possible to have lots of free heap space, but
so fragmented that malloc() fails. If your code does not check the allocation
routine's return code to detect this error, it will fail horribly. Of course,
detecting the error will also no doubt result in a horrible failure, but gives
you the opportunity to show an error code so you'll have a chance of
understanding and fixing the problem.
Sometimes an RTOS will provide alternative forms of
malloc(), which let you specify which of several heaps to use. If you can
constrain your memory allocations to standard-sized blocks, and use one heap per
size, fragmentation won't occur.
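A pool of same-sized blocks can be sketched as below. This is not any RTOS's allocator; the names and the sizes are hypothetical. Because every block is identical, any free block satisfies any request, and the heap cannot fragment.

```c
#include <stddef.h>

#define BLOCK_SIZE  32u   /* bytes per block; hypothetical values */
#define BLOCK_COUNT 8u

/* Each free block doubles as a link in the free list. */
typedef union block {
    union block  *next;
    unsigned char payload[BLOCK_SIZE];
} block_t;

static block_t  pool[BLOCK_COUNT];
static block_t *free_list;

/* Thread every block onto the free list. Call once at startup. */
void pool_init(void)
{
    free_list = NULL;
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
}

/* Return one block, or NULL if the pool is exhausted. */
void *pool_alloc(void)
{
    block_t *b = free_list;
    if (b != NULL) {
        free_list = b->next;
    }
    return b;
}

/* Give a block back. 'p' must have come from pool_alloc(). */
void pool_free(void *p)
{
    block_t *b = p;
    b->next = free_list;
    free_list = b;
}
```

Allocation and release both take constant time, which is welcome in a real time system. As written the pool is not safe to share between tasks; guard it with whatever locking your RTOS provides, and still check for the NULL return.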
Garbage collection - which compacts the heap from
time to time - is almost unknown in the embedded world. It's one of Java's
strengths and weaknesses, as the time spent compacting the heap generally shuts
down all tasking. See P.J. Plauger's recent articles on garbage collection
for his view of real time solutions.
Debugging
Any real time design obviously has time as an integral
part of the system's success. It's naive to
use conventional procedural debuggers, which are targeted at looking at static
code and data, to deal with finding the unique problems associated with a
time-based design. If you're not prepared to measure time you're ignoring an
aspect of the system every bit as important as the difference between "=="
and "=" in the C code.
Most simple development tools like ROM monitors and
BDM debuggers are too deprived of hardware resources to provide much insight
into system timing. An exception is HP's new 16600A tool, which blends a
BDM-like CPU probe with a fast logic analyzer. The BDM part gives access to your
code and target operation (a high level debugger correlates to the original
source). The logic analyzer supports tracing and time tagging of events.
In-circuit emulators, of course, have long included
deep trace buffers with time stamp information included. Event timers track time
from event A to event B. Though an emulator is the most expensive of all
debugging tools it's also the most versatile.
However, many modern processors with integrated
cache, pipelines and the like make traditional emulation so difficult and
expensive that it's sometimes not an option. Yet the real time issues are more
severe than ever due to the increasing complexity of systems. Some tool vendors
have taken to instrumenting your code to extract necessary timing information.
Applied Microsystems' CodeTEST products, for example, include a preprocessor
that seeds information-generating instructions into your source code, which the
tool then detects in real time. The cost is about a 10% performance penalty, but
it does make the invisible visible by giving you a time-based look at the
firmware.
Debugging changes further when using an RTOS. Suppose
you're debugging task A. Single stepping through that chunk of code, should
the other tasks still run at full speed? Suppose a communications task gets
stopped every time you set a breakpoint anywhere? That might cause a
catastrophic loss of data leading to loss of sync with other processors.
An almost incestuous series of relationships has
sprung up between RTOS vendors, debugger companies, and compiler folks to ensure
that no matter what RTOS and compiler mix you use, a tool exists that is aware
of the RTOS's internal tables. The debuggers use this information to allow you
to work at a very high level, setting breakpoints symbolically on different
tasks as you glean details about messages and task states from appropriate
displays.
Some RTOSes, like VRTX and pSOS, come with
pseudo-"agents", small kernels loaded into your code, that communicate with
your debugger to pass back all sorts of neat debug info in real time.
Essentially running as a separate task, the agent lets you stop a single task as
the rest of the code continues to run. Timers and their ISRs continue to run,
comm routines are unaffected by debugging, and even DMA activities continue
intact.
Many low-cost ROM monitors and ROM emulators support
these agents. In fact, many RTOSes, such as RTXC, come with a complete debugger
designed to let you work with your real time application in real time.
Conclusion
In "From the Earth to the Moon" Jules Verne described a
never-ending battle between the cannon makers and armor vendors. The dynamics of
competition served to keep both sorts of products in relative balance, and the
engineers eternally frustrated.
The embedded industry is no different. As processors
get faster and cheaper we seem to stress them just as hard as we did in the 8080
days, since applications are getting more complex and demanding. We now have,
though, the tools needed (in the form of RTOSes, debuggers, profilers and the
like) to both design a well-structured real time system, and to understand the
time-based behavior of these systems.
If you're not using an RTOS in your embedded
designs today, you surely will be tomorrow. Get familiar with the concepts, as
designing tasking code requires a somewhat different view - the time domain view
- than conventional procedural programming. Check out Jean Labrosse's free uC/OS;
the companion book is as good an introduction to using an RTOS as you're
likely to find. See www.ucos-ii.com.
Improvements to these tools come almost daily. Keep on top
of the field to avoid the fate of the dinosaurs.