A Plea To Compiler Vendors
Compilers don't address real real time concerns. Originally in Embedded Systems
Programming, March, 1999.
We still think of a compiler as a translator, something
that converts a C function to object code, when in fact it could - and should -
be much more. The days of limited computer horsepower defined the compiler; we
were happy to find a tool that could do the translation and nothing else. Things
have changed; infinite horsepower sits on all of our desks. If computer games
evolved at the same rate as our tools we'd still all be playing Pong instead
of Nintendo 64.
Really Real Time
Embedded systems have their own unique demands, many of
which have no parallel in the desktop infotech world. I'm constantly
astonished that the language products we buy largely don't recognize these
differences.
Budding programmers in college courses learn to write
correct code; they rarely learn to write fast code till confronted with an
application rife with performance problems. Yet embedded is the home of
real-time. A useful system must be both correct and
fast; neither attribute is adequate by itself.
Today's state of the art of real time development
is appalling. Write the code, compile and run it, and a week before delivery,
when the performance problems surface, start making changes in the hope of
making the firmware faster.
I see constant flame wars in the newsgroups about
whether one should write in C or assembly, particularly for known problematic
areas like interrupt service routines. We'll debate this forever until we
realize that the issue is more one of visibility.
It's always preferable to work in a high level language; we struggle with the
C versus assembly tradeoff simply because we just don't know what the compiler
is doing; the tool provides no insight into the nature of the generated code.
(I'm not saying that assembly should die. Resource
limitations that demand some hand-crafted code will always exist. An interesting
example is in the DSP arena, where it seems that no matter how much horsepower
arrives new applications surface that eat every available T-state.)
No, the issue
is not one of C versus assembly. It's about getting enough insight into the C
tradeoff to then make the correct language decision. We'll continue to blindly
toss assembly at tough problems simply because we just don't know if the C
code is, or will be, fast enough, nor do we know what changes we can make to
have the C run faster.
The sad fact is that the state of compiler technology
today demands we write our code, cross compile it, and then run it on the real
target hardware before we get the slightest clue about how well it will perform.
If things are too slow, we can only make nearly random changes to the source and
try again, hoping we're tricking the tool into generating better code.
This is an absurd way to write code.
Embodied in the wizardry of the compiler are all of
the tricks and techniques it uses to create object code. Please, Mr. Compiler
Vendor, remove the mystery from your tool! Give us feedback about how our real
time code will perform!
Give us a compiler listing file that tags each line of code
with expected execution times. Is that floating point multiply going to take a
microsecond or a thousand times that? Tell me!
Though the tool doesn't know how many times a loop will
iterate, it surely has a pretty good idea of the cost per iteration. Tell me!
Years ago some assemblers produced listings that gave
the number of T-states per instruction. The developer could then sum these and
multiply by number of passes through a loop. Why don't C compilers do the same
sort of thing?
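The arithmetic those old listings made possible is simple enough to sketch in C. This is only an illustration, not any vendor's tool: the struct insn_cost type and both function names are hypothetical, and the per-instruction costs would have to come from the assembler or compiler listing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one instruction as a listing might report it. */
struct insn_cost {
    const char *mnemonic;   /* for display only */
    unsigned    t_states;   /* clock cycles this instruction takes */
};

/* Sum the T-states for a single pass through a loop body. */
static unsigned long body_t_states(const struct insn_cost *body, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++)
        total += body[i].t_states;
    return total;
}

/* Total T-states for `passes` iterations of the same body. */
static unsigned long loop_t_states(const struct insn_cost *body, size_t n,
                                   unsigned long passes)
{
    assert(body != NULL || n == 0);
    return body_t_states(body, n) * passes;
}
```

For a three-instruction body costing 7, 4 and 13 T-states (made-up numbers), 100 passes come to 2,400 T-states. The compiler already knows the per-pass figure; only the pass count has to come from the developer.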
When a statement's execution time might cover a
range of values depending on the nature of the data, give me a range of times.
Better yet, give me the range with some insight, like "for small values of x
expect twice the execution time".
Give me a smarter editing environment so I can click
on a statement and then see the compiled code in a pop-up window.
When the code invokes a vendor-supplied runtime
function, highlight that call in red. If I click on it, show me what it does and
how so I can make an intelligent decision about its use. Even better, pop up a
pseudo-code description of the routine that includes performance data.
Clearly, some runtime functions are very
data-dependent. A string concatenate's performance is proportional to the size
of the strings - give me the formula so I can decide up-front, before starting
test, if this code is really real time.
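The kind of formula I have in mind might look like the sketch below. It assumes a typical concatenate that first scans the destination for its terminator and then copies the source, so the cost is linear in both lengths. The struct strcat_model type, its field names and any coefficients you plug in are hypothetical; real numbers would have to come from the vendor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vendor-supplied cost model for a string concatenate. */
struct strcat_model {
    unsigned long overhead;       /* call, return and setup cycles */
    unsigned long scan_per_char;  /* cycles to step over one char of dst */
    unsigned long copy_per_char;  /* cycles to copy one char of src */
};

/* Estimated cycles to append a src_len-char string to a dst_len-char one. */
static unsigned long strcat_cycles(const struct strcat_model *m,
                                   size_t dst_len, size_t src_len)
{
    assert(m != NULL);
    return m->overhead
         + m->scan_per_char * dst_len
         + m->copy_per_char * (src_len + 1);  /* +1 for the NUL terminator */
}
```

With an assumed overhead of 20 cycles, 5 cycles per scanned character and 8 per copied character, appending a 4-character string to a 10-character one costs 20 + 50 + 40 = 110 cycles. That is enough to decide up-front whether the call belongs in an interrupt service routine.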
I'm a passionate believer in the benefit of code
inspections, yet even the best inspection process is stymied by real time
issues. "Jeez, this code sure looks slow" is hardly helpful and just not
quantitative enough for effective decision-making. Better: bring the compiler
output that displays time per statement.
The vendors will complain that without detailed
knowledge of our target environment it's impossible to convert T-states into
time. Clock rate, wait states, and the like all profoundly influence CPU
performance. But today we see an ever-increasing coupling between all of our
tools. A lot of compiler/linker/locator packages practically auto-generate the
setup code for chip selects, wait states and the like. We can - and often
already do - provide this information.
And better, we can then diddle waits versus clock
speed and see what the impact will be long before ever building target
hardware.
Pipelines and prefetchers further cloud the issue,
and add some uncertainty to the time-per-instruction. The compiler can still
give a pretty reasonable estimate of performance. Very rarely do we look for the
last one percent of accuracy when assessing the time aspects of a system. Give
me data accurate to 20% or so and then I can make intelligent decisions about
algorithm selection and implementation.
A truly helpful compiler might even tag areas of
concern. A tool that told us "This sure is an expensive construct; try xxxx"
would no longer be a simple code translator; it's now an intelligent
assistant, an extension of my thinking process, that helps me deal with the
critical time domain of my system very efficiently.
Obviously this kind of timing information is something of a
deceit, since no function is an island. Even with perfect timing data for each
function there's no guarantee that the system as a whole will perform as
required. But this is analogous to the procedural properties of the system: even
if every function operates perfectly, transforming inputs into outputs without a
hint of error, the system as a whole may be flawed.
We're actually rather good at writing correct
functions; the challenge of the next century will be in learning how to build
correct systems without lots of rework. Today it appears the answer may be
simulation and modeling, though I find present implementations unsatisfying.
Compilers help us with many procedural programming
issues, catching syntactical errors as well as warning us about iffy pointer use
and other subtle problem areas. They should be equally helpful about real
time issues, even though we recognize that the information provided will be
localized to individual functions, not system-wide.
Other Wishes
Give me size information. Many embedded systems live in
limited address spaces. Sure, I can see how the object file grows as I add code,
but I'd prefer a carefully constructed size map.
Just as class browsers show relationships between
objects, a size browser could display how much of the precious ROM resource goes
to each function. This is a particularly intractable problem for users because
it's hard for us to understand the cost of runtime routines. If I use a
particular construct that brings in a library package, what's the cost? Has
some other chunk of code already loaded this library, so the size penalty is now
amortized over multiple functions? Tell me!
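The bookkeeping behind such a size browser is not hard. Here is a rough sketch, assuming the linker can report each function's own size and which runtime libraries it pulls in. All of the type and function names are hypothetical, and the bitmask representation limits this toy version to 16 libraries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size-map records. */
struct lib_entry  { const char *name; unsigned long bytes; };
struct func_entry {
    const char   *name;
    unsigned long bytes;     /* the function's own code size */
    unsigned      lib_mask;  /* bit j set if it needs libs[j] */
};

/* Total ROM used: every function's own bytes, plus each library
 * counted once no matter how many functions reference it. */
static unsigned long rom_total(const struct func_entry *f, size_t nf,
                               const struct lib_entry *libs, size_t nl)
{
    assert(nl <= 16);  /* unsigned is only guaranteed 16 bits wide */
    unsigned used = 0;
    unsigned long total = 0;
    for (size_t i = 0; i < nf; i++) {
        total += f[i].bytes;
        used  |= f[i].lib_mask;
    }
    for (size_t j = 0; j < nl; j++)
        if (used & (1u << j))
            total += libs[j].bytes;
    return total;
}
```

Calling rom_total with and without a candidate function gives the true marginal cost of adding it: if another function has already dragged in the floating point package, the newcomer pays only for its own bytes.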
Don't support multiple error warning levels.
Don't let me disable warning messages. If warnings are spurious, don't
produce them. If they indicate something important, I don't want them masked.
Embedded code lives forever, and seems to get ported between processors at a
prodigious rate. When companies are acquired developers have to integrate code
from the other organization. Any non-standard C or C++ construct will be a
problem, so I need to deal with it now; I need to modify my code to be
pristinely ANSI-compliant. Anything less will be a problem.
The state of the art of editors, particularly those
that are part of IDEs, is abysmal. In the last 30 years our source code has gone
from all capital letters (because of the limitations imposed by the teletype
input devices) to a mix of upper and lower case. Whoopie.
When will we transcend raw unformatted ASCII? Sitting in
front of my word processor I can do all sorts of pretty things to the text, but
feed the resulting file to a compiler and watch it gork!
Please, Mr. Compiler Vendor, let me write my code in
HTML. Make your product smart enough to handle the hypertext file.
Comments are every bit as important as the code,
since good code both works and communicates how it works. I want to format my
comments. Add italics, bold, and the like to stress important issues. Perhaps
superscripts and subscripts make sense for some applications.
And I want hypertext linking between documents.
It's frustrating to add, for example, a long explanation of I/O bit formats in
an include file just so it's visible when you need it. Better - a hypertext
link to the document. In the IDE let me click on that link and then examine or
edit the doc. An environment like this is much less likely to suffer from
"documentation drift" where the external system description doesn't keep
up with changes created during coding.
And extend the HTML format a little. Let me define a
tag that sets the tab indent.
And finally, Mr. Compiler Vendor, please give me the
support I demand and need. Find ways to interact with me - even if only
electronically - during those 3 AM panic sessions. When your customers find a
bug add it to the regression tests so it doesn't reappear three versions
later. Talk to me intelligently, point out reasonable workarounds, and provide
fixes promptly.
Are all of these requests too difficult? I contend
that not only are they possible, they are necessary.
As this industry matures it's ever harder to differentiate something as
lacking in sex appeal as a compiler. The languages are all standardized, so each
competing product does essentially the same translation job. Most score within a
few percent of each other on size and speed of the generated code. When a market
becomes so stable and products so similar it's time to add features that make
one stand out over the others.
A Plea to Customers
Of course, the tool industry is a two way street.
Developers have to understand that the embedded market is horribly fragmented,
causing no end of grief to vendors and customers alike.
The sad truth is that there are simply too many
embedded CPUs available. Too many compilers, too many RTOSes, and too many
debuggers. A matrix that showed every possible combination of compiler, linker,
locator, RTOS, debugger, TCP package and the like for any popular CPU would
strain the capacity of your spreadsheet.
Unfortunately, customer A wants a particular compiler
to work with one combination of debuggers and RTOSes, while customer B wants a
different set. Wise vendors simply say "nope; we support these dozen
combinations and that's it". Others try to be all things to all people.
Customers must recognize that there are limits to
support. At some point we have to admit that, well, we really want to use this
particular RTOS, but the other tool vendors just cannot do a good job of
supporting their tools with that operating system. So, a part of our decision
criteria must be the quality of support. "I want RTOS A, but it's clear that
using product B will mean the tools will work together better."
For the embedded tools market is tiny, with even the
biggest companies mere pipsqueaks compared to those in the PC industry.
There's no way a $10 million company can be all things to all people.
If I were a compiler vendor I'd also ask my customers to
understand that optimizing for both speed and size at once might be impossible. Sometimes
you've simply got to toss more resources at a problem. Or rewrite the
algorithm. Just be wary about blaming the compiler for cruddy code generation
when perhaps we've asked too much of it.
I'd also ask customers to build a bit of slop into
their schedules to handle the inevitable tool integration problems. Heresy?
Maybe, but the reality is that we're going to suffer from tool issues. We can
schedule it up front or enter panic mode on the back end.
When people ask for advice about development environments,
I tell them to find a reasonable set of tools that work well, and then stick
with them for as long as possible. Change CPUs twice a year and you are doomed
to tremendous integration problems. Engineering is very expensive, perhaps even
costlier than the parts that go into our delivered systems. Sometimes it makes
sense to choose a CPU that supports efficient development, rather than the one
that most closely matches the technical specs required for the job.