Z180 Memory Management
The Z180 MMU is confusing, but quite useful when well understood.
Published in Circuit Cellar Ink, February 1990
Programs are getting big! Part of today's shift towards 16 and
32 bit processors comes from the need for correspondingly huge
address spaces, since conventional wisdom holds that a 512kb program
just cannot fit in the 64k address space of most 8 bit CPUs. Where
performance is the overriding concern, a 32 bit CPU may be the
only solution. It does seem a shame to abandon all the accumulated
knowledge and code gleaned from two decades of 8 bit microprocessors
just to get more programming elbow room.
For the past few years some 8 bit CPUs have been equipped with
memory management units (MMU) that free programs from most memory
limitations. It's tedious and complex to control an MMU manually;
now, many languages and other tools include built-in MMU support.
Logical -> Physical
The problem of memory management is easy to define: we need some
way of connecting lots of memory to a processor that just cannot
handle or address it. For example, we might want to put 512kb
on a Z80. Since the Z80 only generates 16 bit addresses, it can
only directly address 64k of RAM. Somehow, through memory management,
we must expand this capability.
For now let's assume that magic hardware gives us more address
lines. Perhaps it is as simple as an I/O port loaded by the CPU
with an extra upper 8 address lines (A16 to A23), giving us a
potential address space of 16 mb. Or, it can be hideously complex,
providing some ability to access different sections of address
space in wild and wonderful ways.
In any event, as soon as some external mechanism is added to translate
addresses in some fashion, the programmer suddenly must contend
with two very different sorts of address spaces.
"Physical" memory is that actually connected to the
hardware. For example, the 512kb we attach to the sadly-overloaded
Z80 is physical memory. Its addresses range from 00000 to 7FFFF
hex in a linear manner.
"Logical" memory is the memory currently located in
the processor's address space. Obviously, if the computer can
only issue addresses in the range of 0000 to FFFF (0 to 64k),
then some of the physical memory is visible and some is not. As
the code changes the memory manager's settings different memory
becomes visible. That which is addressable at any time is the
logical memory.
Thus, addresses generated by the program are always logical addresses
- they get translated by some yet undefined hardware into real
physical addresses.
So, at one time address 1000 logical might be translated into
28000 physical. Later, in the same code, 1000 could correspond
to 80000 physical. The old one-to-one mapping of addresses we're
all familiar with is gone!
In summary, addresses used by the code are logical; the memory
array sees physical. Between these two the memory management unit
(MMU) falls.
Standard Architectures
Several years ago Hitachi introduced the 64180, a high integration
version of the venerable Z80. While other vendors were trying
to push new proprietary architectures, Hitachi took what might
seem a step backwards towards the Z80. They realized an important
fact of the industry - customers had a fortune invested in Z80
code and were unwilling to switch to an incompatible instruction
set.
The 64180 is a Z80 at heart. The designers resisted the temptation
to add fancy new instructions and addressing modes that could
have made it incompatible with the Z80. Rather, they integrated
timers, serial ports, and DMA controllers onto the chip. Even
better, they added a memory management unit to translate 64k logical
addresses into a 1 mb physical address space.
Now Hitachi sells several other versions of the part. The 64180S
is designed especially for telecommunications. The 647180X is
a microcontroller version, containing a 64180 core, ROM, RAM,
and parallel I/O. Zilog stepped into the act, offering the Z180
(a second source of the 64180) and Z280, a very high performance
Z80 upgrade. Zilog is just now announcing the Z181 and will soon
offer a microcontroller version of the part, probably a 647180X
look-alike.
The most important peripheral on the 64180-family processors is
the memory management unit (MMU). The MMU is a hardware device
built onto the processor's silicon. The MMU translates every memory
address from 16 to 20 bits.
The 64180's MMU uses three internal control registers. In keeping
with the chip's design philosophy, on reset the MMU gives a straight
logical to physical mapping, simulating the Z80 and, of course,
limiting the address space to 64k.
You can divide the 64180's logical address space into one, two,
or three areas. The logical space itself is unaltered; even when
divided it is still a contiguous 64k.
CBAR (the Common/Bank Area Register) is an 8 bit I/O port that can
be accessed by the processor's OUT and IN instructions. The lower
4 bits specify the starting address of the bank area, and the upper
4 give the start of common 1. Each nibble supplies the upper four
bits of a 16 bit logical address. If CBAR were A8, then the bank
area starts at 8000 logical, and common 1 starts at A000.
Common 0, if it exists, always starts at logical 0000 and runs
up to the bank area. The bank area then runs to the start of common
1.
Therefore, you can always understand the logical address space
by examining the contents of CBAR by itself. No other information
is needed.
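The nibble packing is easy to get backwards, so here is a minimal C sketch of packing and unpacking CBAR. The names `layout`, `decode_cbar`, and `cbar_for` are made up for illustration; only the bit layout comes from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Logical layout implied by CBAR alone (struct name is illustrative). */
struct layout {
    uint16_t bank_start;     /* first logical address of the bank area */
    uint16_t common1_start;  /* first logical address of common 1 */
};

/* Unpack CBAR: each nibble becomes bits 15..12 of a logical address. */
struct layout decode_cbar(uint8_t cbar)
{
    struct layout l;
    l.bank_start    = (uint16_t)((cbar & 0x0F) << 12);  /* low nibble  */
    l.common1_start = (uint16_t)((cbar >> 4) << 12);    /* high nibble */
    return l;
}

/* Build a CBAR value from two area boundaries, which must be 4k aligned. */
uint8_t cbar_for(uint16_t bank_start, uint16_t common1_start)
{
    assert((bank_start & 0x0FFF) == 0);
    assert((common1_start & 0x0FFF) == 0);
    return (uint8_t)(((common1_start >> 12) << 4) | (bank_start >> 12));
}
```

With this sketch, `decode_cbar(0xA8)` yields a bank area at 8000 and common 1 at A000, and `cbar_for(0x4000, 0xC000)` yields C4.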
The logical address is only part of the problem. How does logical
space get mapped to physical? Two other ports provide the rest
of the answer.
BBR (the Bank Base Register) specifies the starting physical
address of the bank area (remember, the logical start is in CBAR).
CBR (the Common Base Register) provides the same information for
common 1. Both supply an 8 bit base, in units of 4k, that is added
into the upper 8 bits of the 20 bit physical address.
A simple formula gives the translation from logical to physical
address for the bank area:
Physical = Logical + (BBR * 4096)
The same formula gives Common 1:
Physical = Logical + (CBR * 4096)
BBR and CBR give the upper 8 address bits only - hence the 4096
multiplier. The lower 12 bits come from the logical address. Thus,
the translation only affects the upper 8 bits; the lower 12 physical
bits are always identical to the lower 12 logical.
On reset, the 64180 sets CBAR to F0, and CBR=BBR=0. This maps
logical to physical exactly, with no translation; the bank area
starts at logical 0 and common 1 at F000 (since CBAR=F0), the
bank area physically starts at 0000 (BBR=0), as does common 1
(CBR=0). If the logical address is 1000, then the MMU allocates
this to the bank area (CBAR=F0; 1000 is less than the start of
common 1 at F000), and adds the physical base of bank to it (0),
giving a translated address of 01000. Similarly, logical F800
is in common 1, and translates to 0F800.
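The whole translation fits in a few lines of C. What follows is a sketch of the rules just described, not a model of the silicon: the struct and function names are invented here, and truncating the sum to 20 bits is an assumption about how an out-of-range base behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Snapshot of the three MMU registers (struct name is illustrative). */
struct mmu {
    uint8_t cbar;  /* high nibble: common 1 start, low nibble: bank start */
    uint8_t bbr;   /* physical base of the bank area, in 4k units */
    uint8_t cbr;   /* physical base of common 1, in 4k units */
};

/* Translate a 16 bit logical address into a 20 bit physical address. */
uint32_t mmu_translate(const struct mmu *m, uint16_t logical)
{
    assert(m != 0);
    uint8_t page = (uint8_t)(logical >> 12);   /* upper 4 logical bits */
    uint8_t base;

    if (page >= (m->cbar >> 4))
        base = m->cbr;          /* common 1 */
    else if (page >= (m->cbar & 0x0F))
        base = m->bbr;          /* bank area */
    else
        base = 0;               /* common 0 is never relocated */

    /* Physical = Logical + base * 4096; the 20 bit mask is an assumption. */
    return ((uint32_t)logical + ((uint32_t)base << 12)) & 0xFFFFFu;
}
```

With the reset values (CBAR=F0, BBR=CBR=0) this returns 01000 for logical 1000 and 0F800 for logical F800, matching the walk-through above.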
The most important point that can be made about the MMU is that
it does not provide the 1mb linear address space we all crave.
After all, Z80 instructions use 16 bit address operands and 16
bit register pointers - there is no way to address a number larger
than 64k. A jump instruction will always have an argument that
is 16 bits long - the logical destination address. The MMU translates
this logical address to a possibly large physical number, but
the software still operates in a 64k space.
This has a subtle implication - logical address space is a valuable
commodity that must be conserved. Wasting physical memory isn't
so bad, since the 64180 can deal with up to 1mb. As an example,
suppose that your program will have three banks (COMMON 0, BANK,
and COMMON 1). If the program is large you might want to bank
it in and out of the BANK area, leaving COMMON 1 for data. If
BANK is too large, you could be left with little data space -
it is important to make BANK as small as feasible to maximize
the (in this case) unbanked data.
Language Support
Despite the fabulous extra power offered by the 64180's MMU, we've
all been making do with Z80 assemblers and compilers. Sure, some
claim to support the new processor's extended features, but in
truth, until recently, that support has been minimal.
Just what features are important in a 64180 assembler or compiler?
Certainly it should be fast, efficient, and all that, but more
than anything else the language should give you some sort of way
of handling the MMU.
There are two related but different aspects to MMU management.
The first is to provide some sort of mechanism to control the
MMU with as little programmer help as possible. An ideal solution
would be a smart compiler that simulates a nearly linear huge
address space. The second is to provide output files that contain
compiled code and debugging records in some manner that supports
current 8 bit tools (like the PROM programmer), but that accounts
for the large address spaces.
Taking these two criteria separately, especially with a C compiler
we'd really like some method of compiling an ordinary C program
in multiple banks. Sure, you might have to tell the compiler or
linker about your memory configuration, but ideally the tools
should segment and package functions into memory banks as needed.
Even better, we'd want it to remap the MMU automatically. Just
like working with Turbo C, we would like to be able to invoke
a function through a conventional function call, without worrying
about its location in memory.
The second requirement is not quite so obvious. How will you burn
ROMs for the final project? If the compiled/assembled code exceeds
64k, there may be a problem with using standard Intel hex records
for output. Every ROM programmer in the world takes Intel hex
input, but the format's ordinary data records carry only 16 bit addresses.
One solution is to divide the source program into many separately
compiled small pieces. This is especially hard in C, since the
linker will not be able to resolve calls between pieces. Another
approach is to ensure that the compiler or assembler can produce
"Type 2" Intel records. Whenever the code crosses a
64k boundary the linker could output a type 2 record to specify a
new segment address (physical address shifted right 4 bits). This
does imply that the linker can handle large physical addresses,
and the PROM programmer can accept type 2 records.
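To make the type 2 record concrete, here is a small C sketch that formats one for the 64k region containing a given physical address. The function name is hypothetical; the record layout (byte count 02, address 0000, type 02, a 16 bit segment, and a two's complement checksum) is the standard Intel hex form.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format an Intel hex type 02 (extended segment address) record for the
 * 64k-aligned region containing `physical`. `out` must hold 16 bytes. */
void hex_segment_record(uint32_t physical, char out[16])
{
    assert(physical <= 0xFFFFFu);              /* 20 bit address space */

    /* Segment = base of the 64k region, shifted right 4 bits. */
    uint16_t segment = (uint16_t)((physical & 0xF0000u) >> 4);
    uint8_t hi = (uint8_t)(segment >> 8);
    uint8_t lo = (uint8_t)(segment & 0xFF);

    /* Checksum: two's complement of the sum of all record bytes. */
    uint8_t sum = (uint8_t)(0x02 + 0x00 + 0x00 + 0x02 + hi + lo);
    uint8_t checksum = (uint8_t)(0x100 - sum);

    snprintf(out, 16, ":02000002%04X%02X",
             (unsigned)segment, (unsigned)checksum);
}
```

The data records that follow such a record then carry ordinary 16 bit offsets within that 64k region.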
Decent debugging files are just as important as useful PROM files.
You can't use an emulator, simulator, or monitor to debug the
code if the debug records are inadequate. Suppose you wish to
display the value of a variable. The debugger must know the physical
address of that quantity, since only the physical address is constant.
Remapping the MMU changes its logical address, and at times no
logical address might correspond to the variable.
This implies that the software packages must maintain both logical
and physical addresses for all lines and symbols. Compiling, say,
jumps requires logical addresses. All jumps and calls take logical
addresses as arguments (since they can only support a 16 bit number).
Physical addresses are needed in the debugging records so debuggers
can unambiguously resolve the location of symbols, functions,
and line numbers, all of whose logical addresses change with the current
MMU setting.
Assemblers
Most 64180 programs written in assembly language control the memory
manager by tediously issuing many MMU control instructions. The
programmer must first decide exactly what configuration logical
memory will assume, and then come up with CBAR, BBR, and CBR values
for every possible combination of banks. Then, the code must send
these values out to change maps. Needless to say, this takes a
lot of work.
Softools (8770 Manahan Drive, Ellicott City, MD 21043, (301) 750-3733)
came up with an interesting approach that eliminates most of the
work. Their SASM assembler and linker will automatically drop
in all the code needed to bank a program. In effect, this means
you can write code as if the 64180 had a 1 mb linear address space.
Like most good assemblers, SASM supports lots of named segments
- up to 256. Most of the time we assembly programmers just need
a CODE, DATA, and ASEG segment, but SASM's segmentation lets us
break a program into mapped and unmapped sections. When using
SASM on large programs, you can assign any segment or segments
to have a "mapped" attribute, identifying those that
require some MMU manipulation to bring them into the address space.
Segments are the key to SASM's mapping scheme. The linker identifies
how much data the program uses and the number of bytes used for
unmapped code (that which must never be mapped out). It computes
CBAR to define the characteristics of the runtime logical address
space: COMMON 0 being just big enough to hold all the unmapped
code, COMMON 1 containing the data, and the rest, the BANK area,
is allocated to mapped routines.
The linker groups all mapped segments together and starts to assign
both logical and physical addresses to each routine. Whenever
a routine will exceed the size of the BANK area the linker moves
it to the start of a new BANK area. It then converts all jumps
and calls between banked areas to transfers to code that manages
the MMU in COMMON 0.
When finally linked, the program has three parts - a COMMON 0
non-mapped (i.e., always in the address space) area which typically
contains startup code, frequently-used routines, and SASM's banking
code. COMMON 1 is usually your data area. The BANK area contains
most of the program code. Calls between these banked routines
will cause remapping as needed to bring in ones that are not currently
visible in the address space.
For example, suppose the program is as follows:
routine   length   characteristics
main      3700h    not banked
vectors   80h      not banked
sub1      4000h    banked
sub2      38f0h    banked
sub3      1200h    banked
sub4      6f00h    banked
sub5      800h     banked
data      3200h    not banked
On a 64180 the reset jump is at 0, so it makes sense to put the
unbanked code (vectors and main) at 0. The data area cannot be
banked (especially the stack!) and is traditionally in high memory.
Suppose the code that starts at physical location 0 is to go into
ROM, and the data that starts at physical 40000h is in RAM. The
linker will first divide the logical address space based on the
unmapped memory requirements: main and vectors need 3780h bytes
starting at location 0, and data occupies 3200h at the end of
the logical space. Bearing in mind that the mapping resolution
of the 64180 is 4k, memory thus looks like:
Logical   Physical   Area
0000      00000      unbanked (main and vectors)
4000      04000      start of banked routines
c000      40000      start of RAM data
All the logical address space from 4000 to bfff is available to
routines that can be banked. If the sum of the banked sizes is
less than the BANK logical area, then no mapping need take place.
In our example, however, banked routines need some 64k, much more
than the available logical space. If CBAR is C4 (COMMON 1 at c000
and BANK at 4000), SASM will assign addresses as follows:
Logical   Physical   Routine   length   BBR
0000      00000      main      3700     --
3700      03700      vectors   80       --
4000      04000      sub1      4000     00
8000      08000      sub2      38f0     00
4000      0c000      sub3      1200     08
4000      0e000      sub4      6f00     0a
af00      14f00      sub5      800      0a
c000      40000      data      3200     --
For sub1 SASM assigned a logical and physical address of 4000
- reasonable, since this is the first free spot after COMMON 0.
sub1 is in BANK, so a BBR value is required. BBR=0 will map 4000
to 04000. sub2 follows sub1, again with BBR=0. So far, no surprises.
sub2 ends at b8f0, practically right before the logical start
of data (c000). There is no way sub3 can fit, since sub3 is 1200
bytes long. SASM therefore put sub3 at logical address 4000 (the
same as sub1). sub3 follows in physical memory at 0c000 (the next
physical address rounded up to a 4k boundary). BBR equals 08.
sub1 and sub3 occupy the same place in logical address space (4000),
but different physical addresses. To get to sub3, BBR must be
set to 08 and a logical address of 4000 issued.
While sub3 is very short, leaving plenty of room for code in the
same bank map, sub4 is not. sub3 and sub4 will not both fit into
BANK together, so SASM once again reset the logical address to
4000. sub4 comes after sub3, rounded up 4k, and a BBR of 0a is
assigned. (Remember the math - BBR * 4096 + logical = physical,
so 0a * 4096 + 4000 = e000). sub5 fits into the space between
the end of sub4 and COMMON 1, and is so assigned.
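The placement rule can be sketched in C. This is a reconstruction from the worked example, not Softools' actual algorithm: the names are invented, and it assumes the first bank sits at the same physical address as its logical address (BBR=0), as it does in this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Where one banked routine ends up (struct name is illustrative). */
struct placement {
    uint16_t logical;
    uint32_t physical;
    uint8_t  bbr;
};

/* Assign addresses to n banked routines, in order, given their lengths. */
void place_banked(const uint16_t *len, size_t n,
                  uint16_t bank_start, uint16_t common1_start,
                  struct placement *out)
{
    uint32_t logical  = bank_start;  /* next free logical address */
    uint32_t physical = bank_start;  /* assumed: first bank maps 1:1 */
    uint8_t  bbr      = 0;

    for (size_t i = 0; i < n; i++) {
        /* A routine larger than the whole BANK area can never be mapped. */
        assert(len[i] <= (uint32_t)(common1_start - bank_start));

        if (logical + len[i] > common1_start) {
            /* Does not fit: open a new bank on the next 4k boundary. */
            physical = (physical + 0xFFFu) & ~(uint32_t)0xFFF;
            logical  = bank_start;
            bbr      = (uint8_t)((physical - bank_start) >> 12);
        }
        out[i].logical  = (uint16_t)logical;
        out[i].physical = physical;
        out[i].bbr      = bbr;
        logical  += len[i];
        physical += len[i];
    }
}
```

Fed the five banked routine lengths with a BANK area of 4000 to bfff, this sketch gives the same logical addresses, physical addresses, and BBR values as the table.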
SASM's linker generates address assignments as we've just seen,
but how are calls and jumps between subroutines handled? Obviously,
if sub1, sub3, and sub4 all reside at logical address 4000, a
simple CALL 4000 will not always resolve properly. As mentioned
earlier, SASM's linker converts all inter-BANK calls and jumps
to a transfer to a jump table which is usually linked into COMMON
0. In particular, if sub4 were to call sub1, the following will
be automatically substituted for the call instruction:
    call bank_table+x   ; invoke MMU handler
    db   BBR            ; BBR of sub1
    dw   sub1           ; address of routine to call
The code in bank_table stores the current BBR value and return
address on a local stack, remaps the MMU by outputting the indicated
BBR, and then transfers to the logical address supplied as a parameter.
Returns operate in a reverse procedure, being vectored to another
Softools-supplied routine to reverse the mapping.
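The save, remap, and restore sequence can be modeled in C. This is a loose sketch, not Softools' code: `current_bbr` stands in for the BBR port, and the C call stack stands in for the saved return address that the real handler keeps on its local stack. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BANK_STACK_DEPTH 16

uint8_t current_bbr;                   /* stands in for the BBR I/O port */
uint8_t bank_stack[BANK_STACK_DEPTH];  /* local stack of saved BBR values */
int     bank_sp;

typedef void (*routine_t)(void);

/* Model of a banked call: save the mapping, remap, call, restore. */
void banked_call(uint8_t target_bbr, routine_t target)
{
    assert(bank_sp < BANK_STACK_DEPTH);
    bank_stack[bank_sp++] = current_bbr;  /* remember the caller's bank */
    current_bbr = target_bbr;             /* on hardware: an OUT to BBR */
    target();                             /* jump to the logical address */
    current_bbr = bank_stack[--bank_sp];  /* return path restores the bank */
}

/* Two sample routines that record which bank was mapped while they ran. */
uint8_t seen_in_sub1, seen_in_sub4, restored_in_sub4;

void sub1(void) { seen_in_sub1 = current_bbr; }

void sub4(void)
{
    seen_in_sub4 = current_bbr;
    banked_call(0x00, sub1);              /* sub4 calls sub1 across banks */
    restored_in_sub4 = current_bbr;       /* should be sub4's bank again */
}
```

Calling `banked_call(0x0A, sub4)` runs sub4 with BBR=0a, nests a call to sub1 with BBR=00, and leaves the original mapping in place afterward.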
What might not be entirely obvious is that SASM does it all. Once
you tell SASM's linker where ROM and RAM are (which has to be
done for any linker) it automatically allocates logical and physical
addresses. The linker also replaces the calls and jumps as shown
above. SASM does offer options to control memory allocation and
the like, but in most cases these are not needed.
This means that you can write large programs without ever considering
the MMU. SASM takes care of it all. There's an interesting subtle
implication - you can link a 256k program to take up only 8k or
so of logical space! Assign 4k for COMMON 0 and 56k for data.
SASM will bravely partition the 256k code into 64 or more sections,
each of which will get remapped through the 4k BANK area. The
mapping overhead might get high, but logical address space will
be conserved.
Since SASM partitions the program during the link phase, it can
save the addresses of all symbols, line numbers and other parameters
in a debug file. Symbols' physical addresses are stored in the
debug file, maintaining true addresses regardless of the MMU mapping.
If the debugger (emulator or monitor) can handle physical addresses,
then you can access any routine, variable, or source line number
at any time, without manually remapping to bring the desired value
into logical address space.
C Compilers
As we write this, only three compilers currently support 64180
bank switching. Archimedes Software's C180, Whitesmiths' C, and
Software Development System's C all automatically generate code
for large memory models. Manx (MANX Software Systems, P.O. Box
55, Shrewsbury, NJ 07701) will soon have a compiler, and no doubt
others are on the way.
Archimedes' (2159 Union Street, San Francisco, CA, 94123 (415)
567-4010) approach to memory management is much like that used
by SASM. As you write your C code you do not need to be especially
concerned with the MMU. There are no special procedures to use
or functions to invoke.
Before linking the compiled object files various parameters must
be passed to the linker in its indirect command file. The first
are values for CBAR and CBR. It is the programmer's responsibility
to determine exactly the memory configuration, and to compute
these simple values.
In addition to the MMU register settings, the programmer must
provide the linker with a table of modules (i.e., files of source
code, each of which may contain several functions) and names.
If the module is not to be banked, that must be indicated as well.
The memory model supported by the compiler puts all non-banked
functions into COMMON 0, the banked code into BANK, and data areas
into COMMON 1.
The linker generates a table ("FLIST") of data about
every mapped function in the program. For each function, FLIST
gives an encoded BBR value, logical start address, and bank
number. FLIST is a sort of global cheat sheet, located in COMMON
0 so it is never mapped out, that describes every function's logical
and physical address.
The linker replaces all mapped function calls with:
    ld   hl,<FLIST entry for the function>
    call remap_code
The remap_code extracts pertinent data from the FLIST entry and
remaps the MMU as needed before branching to the function's logical
address.
The beauty of using an FLIST table is that pointers to functions
will work - the pointer becomes an FLIST pointer. With FLIST always
mapped in, indirect function invocations will work even to mapped
functions.
The Archimedes compiler does produce a good debugging file, which
contains useful information about the physical address of every
function. The information is stored in FLIST. A conversion routine
easily extracts the real physical address of each function, line
number, and global symbol.
Whitesmiths (733 Concord Ave., Cambridge, MA 02138, (617) 661-0072)
took a somewhat different approach to using the MMU. When writing
C code for this compiler, all calls to mapped functions must be
specified as FAR calls. This directs the compiler to generate
the proper code to bring the function into the map and execute
it.
The called function needs no special handling, since it can be
called as either FAR or near. For example, if a function
in one module invokes another in the same module, it can use a
conventional call structure. Only if a different function, possibly
located outside of this bank, calls the same function, does the
extra call overhead have to be inserted.
A typical call sequence looks like:
    @far int sub();    /* declare sub as a FAR (banked) function */

    main()
    {
        sub();         /* compiles to a banked call */
    }

The function is declared FAR, and then all references
to it within that module generate banked calls.
All banked calls do produce overhead, both in code size and speed.
The Whitesmiths approach eliminates the overhead in cases where
it is not needed. Archimedes, on the other hand, vectors all banked
function calls through FLIST, even if both the caller and callee
are co-resident in BANK.
Whitesmiths uses the indirect linker command file to indicate
the location of every banked function. The programmer provides
both the logical and physical addresses of each of these functions.
The peril here is having to revise these parameters again and
again as the code changes during development.
To date, no compiler is smart enough to automatically set banked
addresses. Perhaps soon this will change.
The compiler generates a call to library routine c.libc to do
the bank switching. Space is allocated on the stack for the return
address and return BBR. c.libc gets a "far pointer"
to the function so it can reset BBR and the logical address.
Uniware, from Software Development Systems (4248 Belle Aire Lane,
Downers Grove, IL 60515 (800) 448-7733) implements bank switching
by simulating linker overlays. In other words, the compiler and
linker are not even aware that the 64180 processor has an MMU;
each mapped function appears to be an overlay.
Like the other compilers, the Uniware compiler breaks memory into
three sections. Only the middle area is mapped dynamically. Your
initialization code must preset CBAR and CBR to their static values.
The indirect linker file specifies which functions are to be mapped,
and each function's BBR value. The linker changes the normal call
sequence to:
    ld   c,<BBR>
    ld   iy,<function logical address>
    call _call
_call is a low level routine in COMMON 0 that remaps the MMU and
vectors off to the function.
You must give the linker much more information than for the Archimedes
product, so Uniware is a bit harder to use. One of the nice side
benefits of this approach, though, is that it is directly applicable
to Z80 bank switched applications. You just have to modify the
_call code to handle your proprietary hardware.
The Emulator
In the embedded world generating code is only a small part of
the development battle. Somehow it must be tested and debugged.
The only suitable tool for debugging embedded code is an In Circuit
Emulator, since only the emulator lets you interactively isolate
bugs in a ROMed environment.
Like compilers, 64180 emulators are all basically extensions of
technology developed for the Z80. After all, the timing is similar
and the software is practically the same. Unfortunately, the extra
four address bits found on the 64180 can cause lots of emulation
problems.
On a Z80 the logical and physical address space is the same. Not
so for the 64180 - only by knowing the MMU values can the translation
take place. It's therefore crucial that the emulator can handle
physical addresses, since only physical ones never change.
While this seems fairly obvious it can be difficult to implement.
Emulators use the 64180 for all target memory accesses, so the
machine cycles are identical to those expected with a processor
in the socket. A translation from desired physical address back
to logical, CBAR, CBR, and BBR must take place, since the 64180's
code can only issue logical addresses.
In other words, if memory at physical address 20000 is to be displayed,
then some routine has to figure out settings for all three MMU
registers, plus a logical address, that the 64180 can use to access
the memory. Not a trivial task.
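One possible answer, sketched in C, is to open a bank area starting at logical 0 and point BBR at the 4k page that holds the target. This is only one of many valid settings and the names are hypothetical. A real emulator would presumably also save and restore the target's own MMU registers, which this sketch ignores.

```c
#include <assert.h>
#include <stdint.h>

/* One MMU setting plus a logical address that reaches a physical byte. */
struct mmu_access {
    uint8_t  cbar;
    uint8_t  bbr;
    uint8_t  cbr;
    uint16_t logical;
};

/* Pick registers so that `logical` translates to `physical`. */
struct mmu_access access_for_physical(uint32_t physical)
{
    assert(physical <= 0xFFFFFu);           /* 20 bit address space */

    struct mmu_access a;
    a.cbar    = 0xF0;                       /* bank area: logical 0000-EFFF */
    a.bbr     = (uint8_t)(physical >> 12);  /* 4k page holding the target */
    a.cbr     = 0x00;                       /* common 1 is not used here */
    a.logical = (uint16_t)(physical & 0x0FFF);  /* offset within the page */
    return a;
}
```

For physical 20000 this gives CBAR=F0, BBR=20, and logical 0000, and the forward formula (logical + BBR * 4096) lands back on 20000.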
As users we don't care what the emulator does or how it works.
All we're concerned with is the debugging interface - the source
level debugger (SLD) that runs on a PC and communicates with the
emulator over RS-232. If we type DISPLAY SYMBOL FOO, then we want
to see the value of FOO, no matter where it is or how the MMU
is set up. The SLD must therefore know about FOO's physical address.
Fortunately, all the products mentioned generate physical symbol
addresses in the debugging files. The SLD can send these values
down to the emulator and let it deal with coming up with the proper
address.
This does mean that the SLD/emulator interface is completely linear,
like the 68000's. You can randomly access any location in the
target's memory just by typing in the right address.
What if you wish to see a logical address? Is this important?
Herein lies a source of confusion. Only physical addresses unambiguously
identify each public symbol and line number. Your program works
through logical addresses - the two are not the same or even similar.
Looking at disassembled code, you might see a LD A,(1000). The
1000 is logical - its physical equivalent depends on the current
MMU mapping.
Softaid's emulators get around this problem by letting you suffix
any address with a tilde to indicate that the logical address
is needed, rather than the default physical. Of course, the emulator
will use the current MMU setting to access the memory, so if the
MMU is not set up as it would be when executing that instruction,
the data may not be correct. Normally this is not a problem -
you debug in your execution context, rather than randomly hunting
through code.
64180 registers are 16 bits long. When used as pointers, they
form logical addresses, creating the same sort of problem just
mentioned. Again, when displaying the contents of a register pointer,
that will be logical. If you ask for a dump of memory at the address
in HL, what will result? The correct solution is to use indirect
register references as logical (saving the bother of suffixing
a tilde all the time), since this is what the programmer really
wants.
In C, an automatic pointer will be stored as a 16 bit value on
the stack. Suppose, while debugging, you wish to dump *ptr? In
other words, display the data pointed to by ptr, which is presumably
on the stack. Again, only one correct solution exists: get the
stack pointer, convert it to physical using the current MMU, extract
the 16 bit value of ptr from the stack, make that physical, and
then access the destination address.
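That chain of translations looks like this in C. It is a sketch under several assumptions: target memory is modeled as a flat array indexed by physical address, the pointer's offset from the stack pointer is supplied by the caller (a real debugger would get it from the debug records), and all names are invented. The 16 bit value is read low byte first, as the Z80 stores it.

```c
#include <assert.h>
#include <stdint.h>

struct mmu { uint8_t cbar, bbr, cbr; };

/* Logical to physical, following Physical = Logical + base * 4096. */
uint32_t mmu_translate(const struct mmu *m, uint16_t logical)
{
    uint8_t page = (uint8_t)(logical >> 12);
    uint8_t base = 0;                                   /* common 0 */
    if (page >= (m->cbar >> 4))        base = m->cbr;   /* common 1 */
    else if (page >= (m->cbar & 0x0F)) base = m->bbr;   /* bank area */
    return ((uint32_t)logical + ((uint32_t)base << 12)) & 0xFFFFFu;
}

/* Read the byte that a 16 bit pointer stored at sp+offset points to. */
uint8_t read_through_stack_pointer(const struct mmu *m, const uint8_t *mem,
                                   uint16_t sp, uint16_t offset)
{
    assert(m != 0 && mem != 0);

    /* Step 1: locate the pointer variable itself, via the current MMU. */
    uint16_t slot = (uint16_t)(sp + offset);
    uint8_t lo = mem[mmu_translate(m, slot)];
    uint8_t hi = mem[mmu_translate(m, (uint16_t)(slot + 1))];

    /* Step 2: the stored value is a logical address; translate it too. */
    uint16_t ptr = (uint16_t)(lo | (hi << 8));
    return mem[mmu_translate(m, ptr)];
}
```

Both translations use the same MMU settings, which is why the dump is only meaningful in the execution context where the pointer is live.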
Conclusion
The 64180 family solves a long standing Z80 problem - that of
handling more memory. Lots of current Z80 applications can be
easily ported to the 64180 to take advantage of the larger memory
model and high integration peripherals. Don't try to get away
with Z80-style development tools - select assemblers, compilers,
and debuggers that exploit the 64180's resources to ease your
development efforts.