Refreshing Software
Refresh is yet one more thing that software can, in some situations,
replace.
Published in Embedded Systems Programming, April 1992
In his wonderful book Microcosm (Touchstone Books, NY, NY), George
Gilder predicts that, with a few exceptions, the semiconductor
industry will one day concentrate more on the production of modest
volume speciality chips than on huge runs of generic ICs. This
trend is already apparent, especially in the proliferation of
I/O controllers. It doesn't matter if you are using SCSI, Ethernet,
SDLC, hard disks, stepper motors, or any of a hundred other peripherals:
at least a half dozen vendors offer some highly integrated controller
for your application.
Yet sometimes these parts are not really appropriate. A mass-produced,
highly cost-sensitive product like an electronic toy generally
can't tolerate the relatively high price of these chips. The classic
ultra-low cost embedded controller is the electronic greeting
card. I'm sure the designers replaced every last fraction of a
yen of hardware with smart code.
Several months ago (December 1991) I described how in some applications
you can replace a UART with bit banging firmware. Osterberg Consulting
(San Marcos, CA) sent me a version of an interrupt-driven software
UART for the 8051. Beautifully coded, it uses only about 15% of
the processor's time at 4800 baud with an 11.0592 MHz crystal.
It's an example of replacing expensive hardware with a clever
idea and a lot of high tech elbow grease.
Lots of other I/O can be handled inside of the processor. Be sure
you understand the magnitude of the software before starting,
though. I aged about a decade doing a software implementation
of GP-IB some years back - it just wasn't worth the grief.
Refresh
A lot of embedded systems use what is in effect a hardware state
machine to continuously write data to displays or other hardware.
For example, a VGA card constantly copies a stream of bits from
video memory to the CRT. Sixty times a second the hardware repaints
the screen, fooling your eye into thinking it sees a stable display.
Where bit rates are lower and costs are paramount it might make
sense to replace the hardware state machines with firmware. Video,
however, is so fast it is unrealistic to consider using code to
refresh the screen.
Displays
Light Emitting Diodes (LEDs) are common output devices on inexpensive
embedded systems. Both seven segment displays and ASCII arrays
are used.
Seven segment displays are, as the name implies, seven "lines"
formed of LEDs, arranged in such a fashion that by judicious line
selection all numbers and some characters can be displayed. They
are about the cheapest way to generate numeric results. ASCII
displays, on the other hand, are composed of lots of little LED
bulbs ("a thousand points of light"), which can show
any alphanumeric value. They are considerably more expensive than
the simpler seven segment displays.
Some of these come with internal latches and drivers, so they
can essentially be just hung on the computer's data bus. Quite
a few have no internal electronics. The designer must provide
both a driver (an amplifier that converts the computer's logic
levels to much higher LED currents) and an interface to the computer
bus.
The interface is quite a problem. Consider the case where a system
includes 8 digits of seven segment displays. Each one needs a
high power driver chip and a latch to hold the value written to
the digit by the program. This could amount to as many as 16 chips!
A better solution (one that is used by most cheap systems including
digital watches) is to arrange the displays in a matrix. The seven
segment display is, after all, little more than seven diodes with
one end connected together. 8 wires come from the package: 1 common
connection point (the "digit enable" line), and 7 individual
segment wires.
If we're putting 8 of these displays in a system, tie each of
the seven segment leads of each package together. The result is
a new level of abstraction: a package of 8 displays, with 7 segment
enables and 8 digit enables coming out. If you put power on one
of the digit enables and a seven segment code on the segment bus,
then one display, corresponding to the powered digit line, will
light.
Connect the 8 digit enables to 8 high power drivers (one IC),
and to an octal latch on the computer bus (one more chip). Tie
the 7 segment bus lines to another driver and latch (2 more chips).
Now we're talking 4 chips instead of 16.
The firmware turns on any single digit by sending a seven segment
code to the segment latch and a 1-of-8 select to the digit latch.
The computer can obviously turn on any one display at a time,
but there is no provision to turn them all on simultaneously.
The secret lies in the eye's persistence. The software should
turn on one digit for a few milliseconds, then do the next one,
and so on through the entire array. By repeating this cycle at
a high speed the eye is fooled into thinking all of the digits
are on, when really only one is at any point in time. It's a little
like TV, where a complete picture is formed by a rapidly moving
dot.
You can buy controller chips to handle this display multiplexing,
but why bother? Use spare processor time (if any!) to sequence
the refresh cycle.
Lashed up as described, the entire array of displays looks like
two I/O ports to the code. The digit select port is always all
zeroes with only a single one set, the position of which selects
one of the 8 displays. The segment port is just the seven segment
code required by the currently-selected display.
Use a timer to generate a sequence of interrupts. What? You don't
have a timer? You can sometimes create a "fake" interrupt
by doing calls in the code's main-line, but it can be tough to
ensure calls come often enough in all operating modes to keep
the displays flashing fast enough.
In general I'd take a timer over any other peripheral. With a
timer you can do wondrous things: generate accurate bit patterns,
run a preemptive real time operating system, and the like. A timer
can help make up for a lot of deficiencies in the hardware, but
it's awfully hard to make the software run well in the absence
of a timer.
As an aside... no matter how small your embedded system is, seriously
consider putting at least a simple real time operating system
in. A tiny RTOS uses practically no resources (other than a timer
interrupt and a bit of memory). An RTOS is ideal for responding
to real time events. However, far too many embedded systems start
off with no RTOS only to have one shoehorned in in desperation
late in the development cycle. It's a lot easier to start off with
an RTOS and use only a little of its power than to rewrite the
code to adapt to one later.
To resume: on each timer interrupt simply change the digit port
to select the next display. Put the appropriate segment code in
the other port. Then return. The interrupt service routine will
be short and fast, demanding little of the processor. The 12 chips
we saved earlier cost little in CPU overhead.
Of course, use a sane approach to handling the ports. Rule 1 of
interrupt handling is to keep the service routine short and simple!
Too many applications force the ISR to convert an ASCII or numeric
code to the segment selection values on every interrupt. This
is foolish.
Build a little table with one entry per digit (8 in the case we've
been discussing). The table is global to both the interrupt service
routine and to a driver called every time the firmware wishes
to change the displayed value.
The driver most likely will accept an 8 digit string of character
or integer data from the calling routine. It converts this to
8 segment values, one per digit, and places these in the table.
It's short and sweet.
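A minimal sketch of such a driver in C follows. The names (segment_table, display_write) and the segment bit assignment (bit 0 = segment a through bit 6 = segment g, a one lights the segment) are my assumptions for illustration; your wiring may well differ. All of the character-to-segment conversion happens here, so the ISR never has to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DIGITS 8

/* Shared by the driver and the refresh ISR: one segment code per digit. */
volatile uint8_t segment_table[NUM_DIGITS];

/* Segment codes for 0-9. Assumed bit layout: bit 0 = a ... bit 6 = g,
   1 = segment lit. Adjust to match the real hardware. */
static const uint8_t digit_to_segments[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

#define SEG_BLANK 0x00u
#define SEG_DASH  0x40u   /* segment g only */

/* Convert up to NUM_DIGITS characters to segment codes and store them
   in the table. Unknown characters and unused positions are blanked. */
void display_write(const char *text)
{
    size_t i = 0;

    assert(text != NULL);
    for (; i < NUM_DIGITS && text[i] != '\0'; i++) {
        char c = text[i];
        if (c >= '0' && c <= '9') {
            segment_table[i] = digit_to_segments[c - '0'];
        } else if (c == '-') {
            segment_table[i] = SEG_DASH;
        } else {
            segment_table[i] = SEG_BLANK;
        }
    }
    for (; i < NUM_DIGITS; i++) {
        segment_table[i] = SEG_BLANK;
    }
}
```

Each table entry is a single byte, so on most processors the ISR can never see a half-written entry. At worst one refresh pass shows a mix of old and new digits, which the eye will not notice.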
The ISR looks like:
- Push registers
- Put a zero to digit port
- Load pointer to table
- Load value from table[pointer]
- Put value to segment port
- Increment pointer (modulo table) and save
- Load digit byte
- Rotate left (so the single one bit wraps around) and save it
- Put to digit port
- Restore registers
- Return
On some processors the ISR will be not many more instructions
than the 11 steps shown.
Don't forget step 2. While not strictly needed (depending on the
system's speed), if left out the incorrect value will be written
to one digit for a few microseconds, perhaps creating a ghost
image.
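For those working in C, the steps above might be sketched like this. DIGIT_PORT and SEGMENT_PORT are plain variables standing in for the two output latches (on real hardware they would be memory-mapped or I/O-mapped locations), and all names are hypothetical. I output the digit select first and then advance it, a slight reordering of the list, and I rotate rather than shift so the single one bit wraps back to the first digit. The compiler handles the register pushes and pops.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DIGITS 8
static_assert(NUM_DIGITS == 8, "the 8-bit rotate below assumes 8 digits");

/* Stand-ins for the two output latches; hypothetical names. */
volatile uint8_t DIGIT_PORT;
volatile uint8_t SEGMENT_PORT;

/* One segment code per digit, filled in by the display driver. */
volatile uint8_t segment_table[NUM_DIGITS];

static uint8_t scan_index = 0;        /* table entry to show next    */
static uint8_t digit_select = 0x01;   /* one-hot enable for that digit */

/* Call this from the timer interrupt. */
void display_refresh_isr(void)
{
    DIGIT_PORT = 0;                             /* all digits off: no ghosting */
    SEGMENT_PORT = segment_table[scan_index];   /* segments for this digit     */
    DIGIT_PORT = digit_select;                  /* light exactly one digit     */

    /* Advance both the table index and the one-hot select for next time. */
    scan_index = (uint8_t)((scan_index + 1u) % NUM_DIGITS);
    digit_select = (uint8_t)((digit_select << 1) | (digit_select >> 7));
}
```

The index and the select byte advance together, so they stay in step as long as both start out pointing at the first digit.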
The refresh rate is a function of the number of displays (more
displays need a faster update) and the persistence of the eye.
For 10 or so digits I find a 1 millisecond update rate more than
adequate. A 1 msec ISR that takes, say, 15 microseconds to run,
requires only 1.5% of the CPU's time.
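The arithmetic is worth wiring into a couple of helper functions so you can try other numbers. This is only a sketch; the function names are mine, and it assumes one digit is serviced per interrupt, as described above.

```c
#include <assert.h>

/* Percentage of CPU time consumed by an ISR that runs for isr_us
   microseconds once every period_us microseconds. */
double isr_cpu_load_percent(double isr_us, double period_us)
{
    assert(period_us > 0.0);
    return 100.0 * isr_us / period_us;
}

/* How many times per second each individual digit is lit when one
   digit is serviced per interrupt. */
double per_digit_refresh_hz(double period_us, unsigned digits)
{
    assert(period_us > 0.0 && digits > 0);
    return 1.0e6 / (period_us * (double)digits);
}
```

With a 15 microsecond ISR and a 1000 microsecond period the first function gives 1.5 percent. For 10 digits the second gives 100 Hz per digit, with each digit lit about a tenth of the time.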
Though I've focussed on LED displays, the same technique works
on Liquid Crystal Displays (LCDs). However, a lot of big LCD displays
with multiple ASCII characters include on-board refresh, removing
any need for software support.
DRAM Refresh
Dynamic RAMs (DRAMs, pronounced "Dee RAM") are the cheapest
form of high speed rewritable data storage. They are composed
of a single transistor per bit. Each "gate", or transistor
input, is insulated from the substrate by a tiny non-conductive
deposit. This forms a capacitor which memorizes the last value
written to the transistor.
Obviously, there are no perfect insulators. The capacitance of
the junction is so tiny that the charge bleeds off within a few
milliseconds. In other words, without help, all of the cells in
the DRAM forget in the blink of an eye.
Like LED displays, DRAM cells are arranged in an X-Y matrix. A
simple read from every row (X line) once every few milliseconds
suffices to recharge the capacitors and keep the contents of the
device intact. This refresh cycle is crucial to proper operation
of any DRAM, although it adds a layer of complexity to the hardware.
If it seems that DRAMs are a tenuous affair, remember that there
is good reason for the approach. A DRAM cell needs only a single
transistor; three fewer than the simplest static RAM. As a result,
DRAMs always offer much higher memory density. The technologies
always move more or less in lockstep, with static densities about
4 years behind that of dynamics.
Conventional refresh controller ICs include a counter that generates
all row addresses and feeds these to the DRAM chips as required.
Several chips are used, as modern 1 Mb DRAM chips need a 9 bit
refresh cycle (512 row addresses). Most 1 Mb DRAM chips need all
512 row addresses every 8 milliseconds to guarantee data retention.
A lot of embedded systems eliminate the need for a distinct DRAM
controller by using a DMA channel to manage the refresh. The original
PC works this way. It's interesting to look through the BIOS listings.
RAM is just not available until the BIOS programs the DMA controller
to start refresh cycles going.
DMA is the perfect solution to the refresh problem. Generate null
DMA reads from sequential addresses. Program the controller to
run over and over, without computer intervention. This is especially
attractive on modern high integration controllers like the 80186
with built-in DMA channels.
Still, some systems might not have a spare DMA channel. It is
possible to generate refresh completely under software control,
but pay careful attention to the firmware's timing. Though I've
never built a system around software refresh, I've seen several
successful implementations.
The trick is to write a really tight interrupt service routine
that does little more than a read from incrementing addresses
- fast.
A timer invokes the refresh interrupt service routine. The interrupt
time is dictated by the specifications of the DRAM chips. Take
the 1 Mb Hitachi HM511000 for example. It requires 512 refresh
cycles, all of which must be completed in 8 milliseconds. This
works out to one complete interrupt service every 15.6 microseconds.
While blazingly fast, it is not (quite) impossible. Be wary of
other interrupting devices that could create untenable latency
problems.
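Here is the interrupt-rate arithmetic as a small C sketch, handy when you swap in a different DRAM or timer. The function names and the 4 MHz timer clock in the note below are assumptions for illustration, not taken from any data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Time available per row refresh, in nanoseconds, when all `rows`
   rows must be refreshed within retention_us microseconds. */
uint32_t refresh_interval_ns(uint32_t retention_us, uint32_t rows)
{
    assert(rows != 0);
    return (uint32_t)(((uint64_t)retention_us * 1000u) / rows);
}

/* Timer reload value, in timer ticks, for one interrupt per row.
   Rounds down so the refresh is never late. */
uint32_t timer_ticks_per_refresh(uint32_t timer_hz,
                                 uint32_t retention_us,
                                 uint32_t rows)
{
    assert(rows != 0);
    return (uint32_t)(((uint64_t)timer_hz * retention_us) /
                      ((uint64_t)rows * 1000000u));
}
```

For 512 rows in 8 milliseconds the interval comes out to 15625 nanoseconds. With a timer clocked at an assumed 4 MHz that is 62.5 ticks, which rounds down to 62, so refresh runs slightly faster than required rather than slightly slower.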
The ISR must be highly optimized to present minimal CPU overhead.
Typically, it should contain the following steps:
- save processor state
- load next refresh address
- do a read from that address
- increment and store refresh address
- restore processor state
- return
If your entire application is in assembly language you can greatly
shrink the ISR by dedicating a register to the refresh address.
This removes step 2 and the store in step 4. In Z80 assembly language,
the ISR could look like:
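isr:
        push af         ; save processor state (A is clobbered below)
        ld   a,(bc)     ; read and refresh
        inc  bc         ; next refresh address
        pop  af         ; restore processor state
        ei              ; re-enable interrupts; the Z80 disables them on entry
        reti            ; ret from interrupt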
Register pair BC is the refresh address. Though the DRAMs really
only need a 9 bit counter, it is much faster to just let BC wrap
through 16 bits.
An assembler that counts T states is really handy in this sort
of application to ease figuring how long the ISR takes to run.
The old SLR assembler had this feature, but I don't know of a
modern product that supports it.
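Lacking such an assembler, you can tally the T states by hand or with a few lines of C. The sketch below covers the five instructions push, ld, inc, pop and reti, using the counts from the standard Z80 instruction timing tables. The 4 MHz clock and the 13 T state interrupt entry overhead (the figure usually quoted for interrupt mode 1) are my assumptions; check both against your CPU manual and hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct insn {
    const char *text;      /* instruction as written in the listing */
    uint8_t     t_states;  /* documented Z80 T-state count          */
};

/* The refresh ISR, one entry per instruction. */
static const struct insn refresh_isr[] = {
    { "push af",   11 },
    { "ld a,(bc)",  7 },
    { "inc bc",     6 },
    { "pop af",    10 },
    { "reti",      14 },
};

/* Sum the T states of a listing, plus a caller-supplied allowance
   for the CPU's interrupt acknowledge sequence. */
uint32_t total_t_states(const struct insn *code, size_t n,
                        uint32_t entry_overhead)
{
    uint32_t sum = entry_overhead;
    for (size_t i = 0; i < n; i++) {
        sum += code[i].t_states;
    }
    return sum;
}

/* Convert T states to nanoseconds at the given CPU clock. */
uint32_t t_states_to_ns(uint32_t t_states, uint32_t clock_hz)
{
    assert(clock_hz != 0);
    return (uint32_t)(((uint64_t)t_states * 1000000000u) / clock_hz);
}
```

The five instructions total 48 T states; an EI before the RETI would add 4 more. With the assumed 13 T state entry overhead the total is 61, or 15.25 microseconds at 4 MHz. That nearly fills the 15.6 microsecond budget worked out earlier, so under these assumptions you would want a considerably faster clock before trusting software refresh.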
Conclusion
Don't get me wrong. I am a firm believer in using complex I/O
controllers in most applications. However, where appropriate,
software can and in some cases should replace the external hardware.
Actually, my biggest objection to these big I/O chips is the seemingly
hundreds of control registers some of these monsters sport. We
programmers can spend weeks trying to convert a cryptic 50 page
data sheet into working code. Someday, the vendors will recognize
that their job is not to make chips, but to provide value to the
customer. Then, they'll give us useable canned code packages along
with the raw hardware.