The Embedded Muse
Issue Number 258, April 7, 2014
Copyright 2014 The Ganssle Group

Editor: Jack Ganssle, jack@ganssle.com

You may redistribute this newsletter for noncommercial purposes. For commercial use contact jack@ganssle.com. To subscribe or unsubscribe go to https://www.ganssle.com/tem-subunsub.html or drop Jack an email.

Contents
  Editor's Notes
  Quotes and Thoughts
  Tools and Tips
  Responses to "A Plea To Compiler Vendors"
  The History of the Microprocessor, Part 4
  Jobs!
  Joke For The Week
  Advertise With Us
  About The Embedded Muse

Editor's Notes

Ad

Did you know it IS possible to create accurate schedules? Or that most projects consume 50% of the development time in debug and test, and that it’s not hard to slash that number drastically? Or that we know how to manage the quantitative relationship between complexity and bugs? Learn this and far more at my Better Firmware Faster class, presented at your facility. See https://www.ganssle.com/onsite.htm.

Quotes and Thoughts

Not really a quote, but I was struck by Toyota's recent $1.2 billion payout. With something like 1 million lines of code involved, that's around $1200/line, probably the most expensive software ever written. I wrote about it here, and one commenter sagely said "If you think that the cost of doing aircraft maintenance and certification is high, just look at the costs of dealing with the accident that happens if you don't."

Tools and Tips

Please submit clever ideas or thoughts about tools, techniques and resources you love or hate. Here are the tool reviews submitted in the past.

My article on debouncing contacts gets downloaded several thousand times per month. Trent Cleghorn did a nice implementation of the algorithms in C and C++, which is available here.
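
For those who just want the gist, here's a minimal counter-based debouncer in C. It's only a sketch of one approach from the article, not Trent's code; read_button_raw() and the tick rate are placeholders for whatever your hardware and scheduler provide.

    #include <stdint.h>
    #include <stdbool.h>

    #define DEBOUNCE_COUNT 10          /* e.g. 10 samples at a 5 ms tick = 50 ms */

    /* Placeholder for your hardware layer; returns the raw switch level. */
    extern bool read_button_raw(void);

    /* Call from a periodic timer tick; returns the debounced state. */
    bool debounce_button(void)
    {
        static bool    debounced_state = false;
        static uint8_t count = 0;
        bool raw = read_button_raw();

        if (raw == debounced_state) {
            count = 0;                 /* no change pending; reset the counter */
        } else if (++count >= DEBOUNCE_COUNT) {
            debounced_state = raw;     /* input stable long enough; accept it */
            count = 0;
        }
        return debounced_state;
    }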

Richard Tomkins sent:

Just thought about being inclusive by adding gEDA to the list of development tools.

For Macintosh folks, install FINK and through FINK install gEDA.

The result is a very sophisticated, free PCB editor with features rivaling the high-end products that you have to pay many hard-earned coins for.

gEDA also runs on Windows and many Linux platforms.

Thor Johnson has been playing with a Red Pitaya. This is a scope, logic analyzer and function generator on a single board. Their website is terrible and it's hard to find out much, but a press release last week from Elektor claims it will cost "under $500." Thor's comments:

It's beginning to look a lot like my universal tool...
http://redpitaya.com/

I've only had it for a short while, but... it's definitely got the beef in the guts for amazing things...
     2 ch 14 bit 125 MS/s inputs
     2 ch 14 bit 125 MS/s outputs
     16 FPGA IOs for digital
     4 input and 4 output low speed analog (12 bit 100 KS/s)

Hopefully we get a Sump or other kind of logic-analyzer plugin soon (3.3V, so I might need to build some level shifters)... but since it's all open, if you want a new tool (e.g., a scintillation detector), you can build it (https://www.kickstarter.com/projects/652945597/red-pitaya-open-instruments-for-everyone/posts/741013)...

And since the current apps run through a browser, maybe a touchscreen android tablet will make a better interface than usual for PC oscopes! And since it uses Linux for the grunt stuff... I'm making mine WiFi :)

It doesn't compare analog-wise to a real o'scope (60MHz filters on the front end -- and they're not very sharp ones at that)... but... I suppose that's when you call in Jake and Bruno. 

Responses to "A Plea To Compiler Vendors"

A number of people responded to my call for compilers that give timing information. Bill Gatliff and I had a dialog about this. He wrote:

"Give us a compiler listing file that tags each line of code with expected execution times. Is that floating point multiply going to take a microsecond or a thousand times that? Tell me!"

This isn't as easy as it sounds, and I wouldn't trust the compiler's story even if it could tell it.

First, the compiler would have to know some pretty intimate details about the execution environment to estimate per-line calendar timing.  In addition to knowing CPU instruction cycle counts, it'd have to know the input clock speed, memory bandwidth, cache hit/miss rates, and so on.  Practically speaking, the compiler could tell you only the tally of instruction cycles based on the number of instructions it emitted.

That number isn't all that useful, however, because in most toolchains the compiler hands the code off to the assembler and several layers of optimization.  Those successive processing stages are likely to transform the compiler's original output significantly, so the compiler's original estimate doesn't matter anymore.

Finally, many of the above optimizations actually reflow the opcodes associated with the original C code, to avoid "interlocks" that make sure a piece of data is present in the CPU core before it's operated on.  The original "lines" of C code therefore no longer exist. This is a good thing, because avoiding interlocks is one of the things you have to do in order to get efficient throughput from most modern CPUs, including perfectly ordinary ARM cores.

Memory fetches are a great example.  In a statement like "x = x + 1", the fastest assembly code won't try to increment x right after it reads it.  Instead, it'll do something else for an instruction or two, so that the CPU core doesn't stall waiting on the raw value of x to arrive.  In most cases, these one or two additional instructions are leftover work from the previous C statement, or setup work for the next one.  Anything's fair game, as long as it doesn't involve memory or anything related to x.

All of the above is legal in C, because the rules are carefully written to care immensely about the final behavior of the code you write, but to not care at all about how the underlying machine implements your code.

I get what the author was asking for when he was demanding some metrics from his compiler.  But a more productive request would be for compiler directives that prohibit certain classes of operations, like floating point operations, so that the developer knows he can't do something undesirable without getting a warning or error message.  For the rest, he's just going to have to dig into the final assembly language himself if he wants the gory details.

Function-level timing is meaningful, because C requires most of the cleanup work for a function to finish when the function returns. But per-line metrics aren't possible if you have a good toolchain.
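
To make Bill's reordering example concrete, here's a purely illustrative sketch: C source with a hand-written schedule in the comments. It is not output from any real compiler or core; the register names and ordering are invented just to show how a scheduler can hide a load's latency with a neighboring statement's work.

    /* Illustrative only -- not actual compiler output. */
    int a, b, x, y;

    void update(void)
    {
        y = a * b;      /* statement 1 */
        x = x + 1;      /* statement 2 */

        /* A scheduler might emit something like (pseudo-assembly):
             load  r0, [x]      ; start fetching x early
             load  r1, [a]
             load  r2, [b]
             mul   r3, r1, r2   ; statement 1's work fills the load delay
             store r3, [y]
             add   r0, r0, 1    ; x has arrived by now, so no stall
             store r0, [x]
        */
    }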

I replied:

I agree that the issues are hard, and in some cases impossible. With cache there’s no predicting anything, and many safety-critical systems prohibit the use of it because there’s no guarantee things will run in a timely manner. And, yes, the tools would have to know everything about the execution environment, like wait states, clocks, etc. Load MPLAB or a similar tool and you have to tell the debugger exactly which variant of a chip you’re using. Why not do the same at compile time, and add in other required parameters?

But most embedded systems are small, using PICs, Atmels, Cortex-M parts, etc., which don’t have these fancy features. In most of those cases the toolchains are integrated – e.g., IAR and others provide the whole shooting match, from compiler to linker. They do preserve a lot of line-by-line info in the debugging file, so some sort of mapping from source to object is possible. Sure, optimizations mess things up, but there’s still a ton of useful info because of C sequence points.

The data would be imperfect. But I’d kill for 80% accuracy instead of the 0% we have today.

If I wrote an application in assembly this wouldn't be too hard. Why can’t the tools be as smart as we are?

He came back:

I intentionally took the high road in my objection; I know that smaller chips like the AVRs are much more predictable.

But those smaller chips also generally get used to do smaller, more fully-defined work (*).  I can totally see the value in having tools that help tabulate the code's expected behavior, but... should the demand for a point-and-grunt tool to replace our need to really understand that well-defined work be viewed as a solution?  I worry that, more often than not, it's really more of a recognition that we want to get too far away from our code to understand it OR the problem we're using it to solve.

And I reject your notion that we have "0% accuracy" today. If a developer can't take an ASM dump of an AVR program's hot spots and Fermi out the approximate cycle cost, then they need to review those sections of the datasheet that they thought were too hard to understand.  Their lack of inquisitiveness and desire for insight is risky for all of us.

* -- Yes, gentle reader, YOUR problem is much, MUCH larger than can be dismissed with a simple hand-wave like that.  But since "complicated" is a relative metric, you'd better keep reading anyway so that the guys solving those lesser problems can still benefit from your benevolent wisdom.  Should be an easy read, since I'm just restating stuff YOU already know.

Another reader wrote:

It is impossible to say YES loud enough to the article “A Plea to Compiler Vendors” in Muse #257.

One statement really struck home: "If things are too slow, we can only make nearly random changes to the source and try again, hoping we're tricking the tool into generating better code."

I've been programming since the early '80s and I still love assembly language for one simple but very important reason: what you see is what you get! No tricks required. I could always estimate the timing, and once optimized I knew it would stay that way. There was no compiler black magic changing things simply because I made a change in some other section of code.

All of your requested features are great! You might want to consider adding one more: a way to tell the compiler "hands off this section of code" so that it never changes that section's optimization.
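
Some toolchains already offer a partial version of this. As one example (a sketch; it's GCC-specific, and bit_bang_delay() is an invented stand-in for whatever code you'd want to protect), GCC lets you pin the optimization level for a region of source:

    /* The pragmas below are real GCC features; the delay loop is just an
       invented example of code you might want to shield from the optimizer. */
    #pragma GCC push_options
    #pragma GCC optimize ("O0")

    void bit_bang_delay(volatile unsigned int n)
    {
        while (n--)
            ;        /* deliberate busy-wait; must not be optimized away */
    }

    #pragma GCC pop_options

It's not quite "hands off forever," since a new compiler release can still emit different code at -O0, but it does keep the optimizer from reshuffling the marked section just because unrelated code changed.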

The History of the Microprocessor, Part 4

A minor error in Part 3: Clyde Shappee noted that the AGC had 2800 ICs, not 2200.

Phil Martel was part of this history, and writes about the TX-0, which I mentioned as one of the first transistorized computers:

I remember the TX-0 well. In 1971-2 I was working on my Master's thesis using the PDP-1 mentioned in John McKenzie's TX-0 Computer History. To be able to work in the lab without supervision, I had to learn to turn both computers on and off. After turning on the TX-0's power, electro-mechanical timers started up the various voltages as John described. Then you went over to a panel with the delay lines that controlled the timing and opened a switch that broke the delay line loop. Then you closed the loop and pushed a button that introduced a single pulse. You checked the pulse pattern on a scope in the rack (against some grease pencil marks) and you were good to go.

I didn't do much programming on the TX-0, mostly just toggling some simple things like Munching Squares (http://en.wikipedia.org/wiki/Munching_squares) into the switch panel that could substitute for the first 32 words of core memory.

The PDP-1 had a fairly advanced operating system (for the time) stored on drum memory. There was a short (a bit over a foot) boot (paper) tape that was run through an optical tape reader. When the system booted, speakers would play the "Ode to Joy" from Beethoven's 9th. You could bypass the pinch roller on the paper tape, advance the tape by hand until the last character was just before the read head, then hold up the tape and blow on it, which would drag the character past the read head, and start things. This was called "the breath of life".

Microprocessors Change the World

I have always wished that my computer would be as easy to use as my telephone. My wish has come true. I no longer know how to use my telephone. - Bjarne Stroustrup

Everyone knows how Intel invented the computer on a chip in 1971, introducing the 4004 in an ad in a November issue of Electronic News. But everyone might be wrong.

TI filed for a patent for a "computing systems CPU" on August 31 of that same year. It was awarded in 1973 and eventually Intel had to pay licensing fees. It's not clear when they had a functioning version of the TMS1000, but at the time TI engineers thought little of the 4004, dismissing it as "just a calculator chip" since it had been targeted to Busicom's calculators. Ironically the HP-35 calculator later used a version of the TMS1000.

But the history is even murkier. In an earlier installment of this series I explained that the existence of the Colossus machine was secret for almost three decades after the war, so ENIAC was incorrectly credited with being the first useful electronic digital computer. A similar parallel haunts the first microprocessor.

Grumman had contracted with Garrett AiResearch to build a chipset for the F-14A's Central Air Data Computer. Parts were delivered in 1970, and not a few historians credit the six chips comprising the MP944 as the first microprocessor. But the chips were secret until they were declassified in 1998. Others argue that the multi-chip MP944 shouldn't get priority over the 4004, since the latter's entire CPU fit onto a single piece of silicon.

In 1969 Four-Phase Systems built the 24-bit AL1, which used multiple chips segmented into 8-bit hunks, not unlike a bit-slice processor. In a patent dispute a quarter-century later, proof was presented that one could implement a complete 8-bit microprocessor using just one of these chips. The battle was settled out of court, which did not settle the issue of the first micro.

Then there's Pico Electronics in Glenrothes, Scotland, which partnered with General Instrument (whose processor products were later spun off into Microchip) to build a calculator chip called the PICO1. That part reputedly debuted in 1970, and had the CPU as well as ROM and RAM on a single chip.

Clearly the microprocessor was an idea whose time had come.

Japanese company Busicom wanted Intel to produce a dozen chips that would power a new printing calculator, but Intel was a memory company. Ted Hoff realized that a design built around a general-purpose processor would consume gobs of RAM and ROM, exactly the sort of parts a memory company wanted to sell. Thus the 4004 was born.

[Photo: the Intel 4004. The face that launched a thousand chips.]

It was a four-bit machine packing 2300 transistors into a 16-pin package. Why 16 pins? Because that was the only package Intel could produce at the time. Today fabrication folk are wrestling with the 20 nanometer process node; the 4004 used 10,000 nm geometry. The chip itself cost about $1100 in today's dollars, or about half a buck per transistor. Best Buy currently lists some desktops for about $240, or around 10 microcents per transistor. And that's ignoring the keyboard, display, 500 GB hard disk and all of the rest of the components and software that go with the desktop.

Though Busicom did sell some 100,000 4004-powered calculators, the part's real legacy was the birth of the age of embedded systems and the dawn of a new era of electronic design. Before the microprocessor it was absurd to consider adding a computer to a product; now, in general, only the quirky build anything electronic without embedded intelligence.

At first even Intel didn't understand the new age they had created. In 1952 Howard Aiken figured a half-dozen mainframes would be all the country needed, and in 1971 Intel's marketing people estimated total demand for embedded micros at 2000 chips per year. Federico Faggin used one in the 4004's production tester, which was perhaps the first embedded system. About the same time the company built the first EPROM, and it wasn't long before they slapped a microprocessor into EPROM burners. It quickly became clear that these chips might have some use after all. Indeed, Ted Hoff had one of his engineers build a video game - Space War - using the four-bitter, though management felt games were goofy applications with no market.

In parallel with the 4004's development, Intel had been working with Datapoint on a computer, and in early 1970 Ted Hoff and Stanley Mazor started work on what would become the 8008 processor.

1970 was not a good year for technology; as the Apollo program wound down many engineers lost their jobs, some pumping gas to keep their families fed. (Before microprocessors automated the pumps, gas stations had legions of attendants who filled the tank and checked the oil. They even washed windows.) Datapoint was struggling, and eventually dropped Intel's design.

In April 1972, just months after releasing the 4004, Intel announced the 8008. It had 3500 transistors and cost $650 in 2014 dollars. This 18-pin part was also constrained by the packages the company knew how to build, so it multiplexed data and addresses over the same connections.

A typical development platform was an Intellec 8 (a general-purpose 8008-based computer) connected to a TTY. One would laboriously put a tiny bootloader into memory by toggling front-panel switches. That would suck in a better loader from the TTY's 10 character-per-second paper tape reader. Then, read in the editor and start typing code. Punch a source tape, read in the assembler. That read the source code in three passes before it spit out an object tape. Load the linker, again through the tape reader. Load the object tapes, and finally the linker punched a binary. It took us three days to assemble and link a program that netted 4KB of object code. Needless to say, debugging meant patching in binary instructions with only a very occasional re-build.

[Photo: the Intellec 8.]

The world had changed. Where I worked we had been building a huge instrument that had an embedded minicomputer. The 8008 version was a tenth the price, a tenth the size, and had a market hundreds of times bigger.

It wasn't long before the personal computer came out. In 1973 at least four 8008-based computers targeted to hobbyists appeared: the MCM-70, the R2E Micral, the Scelbi-8H, and the Mark-8. The latter was designed by Jon Titus, who tells me the prototype worked the first time he turned it on. The next year Radio-Electronics published an article about the Mark-8, and several hundred circuit boards were sold. People were hungry for computers.

"Hundreds of boards" means most of the planet's billions were still computer-free. I was struck by how much things have changed when the PC in my woodworking shop died. I bought a used Pentium box for $60. The seller had a garage with pallets stacked high with Dells, maybe more than all of the personal computers in the world in 1973. And why have a PC in a woodworking shop? Because we live in the country and radio stations are very weak. Instead I get their web broadcasts. So this story, which started with the invention of the radio several issues ago, circles back on itself. Today I use many billions of transistors to emulate a four-tube radio.

By the mid-70s the history of the microprocessor becomes a mad jumble of product introductions by dozens of companies. A couple are especially notable.

Intel's 8080 was a greatly improved version of the 8008. The part was immediately popular, but so were many similar processors from other vendors. The 8080, though, spawned the first really successful personal computer, the Altair 8800. This 1975 machine used a motherboard into which various cards were inserted. One was the processor and associated circuits. Others could hold memory boards, communications boards, etc. It was offered in kit form for $1800 (in today's dollars); memory was optional, and 1KB of RAM cost another $700. MITS expected to sell 800 a year but was flooded with orders for 1000 in the first month.

Computers are useless without software, and not much existed for that machine. A couple of kids from New England got a copy of the 8080's datasheet and wrote a simulator that ran on a PDP-10. Using that, they wrote and tested a Basic interpreter. One flew to MITS to demonstrate the code, which worked the very first time it was tried on real hardware. Bill Gates and Paul Allen later managed to sell a bit of software for other PCs.

The 8080 required three different power supplies (+5, -5 and +12) as well as a two-phase clock. A startup named Zilog improved the 8080's instruction set considerably and went to a single-supply, single-clock design. Their Z80 hugely simplified the circuits needed to support a microprocessor, and was used in a stunning number of embedded systems as well as personal machines, like Radio Shack's TRS-80. CP/M ran most of the Z80 machines, and was the inspiration for the IBM PC's DOS.

But processors were expensive. The 8080 debuted at $400 ($1700 today) just for the chip.

Then MOS Technology introduced the 6501 at a strategic price of $20 (some sources say $25, but I remember buying one for twenty bucks). The response? Motorola sued, since the pinout was identical to their 6800. A new version with scrambled pins quickly followed, and the 6502 helped launch a little startup named Apple.

Other vendors were forced to lower their prices. The result was that cheap computers meant lots of computers. Cut costs more and volumes explode.

Active Elements

In this series of articles I've portrayed the history of the electronics industry as a story of the growth in use of active elements. For decades no product had more than a few tubes. Because of radar, between 1935 and 1944 some electronic devices came to employ hundreds. Computers drove the numbers into the thousands. In the '50s SAGE had 55,000 per machine. Just six years later the Stretch squeezed in 170,000 of the new active element, the transistor. In the '60s ICs using a dozen to a few hundred transistors shrank electronic products.

We embedded folk whose families are fed by Moore's Law know what has happened: some micros today contain 3 billion transistors on a single die; memory parts are even denser. And on the low end a simple 8-bitter costs tens of cents, not bad compared to the millions of dollars needed for a machine just a few decades ago. But how often do we stand back and think about the implications of this change?

Active elements have shrunk in length by about a factor of a million, but an IC is a two-dimensional structure, so the effective shrink in area is more like a trillion (a million squared).

The cost per GFLOP has fallen by a factor of about 10 billion since 1945.

It's claimed the iPad 2 has about the compute capability of the Cray 2, 1985's leading supercomputer. The Cray cost $35 million more than the iPad. Apple's product runs 10 hours on a charge; the Cray needed 150 kW and liquid Fluorinert cooling.

My best guess pegs an iPhone at 100 billion transistors. If we built one using the ENIAC's active element technology the phone would be about the size of 170 Vertical Assembly Buildings (the largest single-story building in the world). That would certainly discourage texting while driving. Weight? 2500 Nimitz-class aircraft carriers. And what a power hog! Figure over a terawatt, requiring all of the output of 500 Olkiluoto power plants (the largest nuclear plant in the world). An ENIAC-technology iPhone would run a cool $50 trillion, roughly the GDP of the entire world. And that's before AT&T's monthly data plan charges.

Without the microprocessor there would be no Google. No Amazon. No Wikipedia, no web, even. To fly somewhere you'd call a travel agent, on a dumb phone. The TSA would hand-search you... uh, they still do. Cars would get 20 MPG. No smart thermostats, no remote controls, no HDTV. Vinyl disks, not MP3s. Instead of an iPad you'd have a pad of paper. CAT scans, MRIs, PET scanners and most of modern medicine wouldn't exist.

Software engineering would be a minor profession practiced by a relative few.

Accelerating Tech

Genus Homo appeared around 2 million years ago. Perhaps our first invention was the control of fire; barbecuing started around 400,000 BCE. For almost all of those millennia Homo was a hunter-gatherer, until the appearance of agriculture 10,000 years ago. After another 4k laps around the sun some genius created the wheel, and early writing came around 3,000 BCE.

Though Gutenberg invented the printing press in the 15th century, most people were illiterate until the industrial revolution. As I described in part 1 of this series, that was the time when natural philosophers started investigating electricity.

In 1866, within the lifetime of my great-grandparents, it cost $100 to send a ten word telegram through Cyrus Field's transatlantic cable. A nice middle-class house ran $1000. That was before the invention of the phonograph. The only entertainment in the average home was the music the family made themselves. One of my great-grandfathers died in the 1930s, just a year after he first got electricity to his farm.

My grandparents were born towards the close of the 19th century. They lived much of their lives in the pre-electronic era. When probing for some family history my late grandmother told me that, yes, growing up in Manhattan she actually knew someone, across town, who had a telephone. That phone was surely a crude device, connected through a manual patch panel at the exchange, using no amplifiers or other active components. It probably used the same carbon transmitter Edison invented in 1877.

My parents grew up with tube radios but no other electronics.

I was born before a transistorized computer had been built. In college all of the engineering students used slide rules exclusively, just as my dad had a generation earlier at MIT. Some calculators existed but were very expensive and I never saw one during my time in school.

We had limited access to a single mainframe. But my kids were required to own laptops as they entered college, and they have grown up never knowing a life without cell phones or any of the other marvels enabled by microprocessors that we so casually take for granted.

[Photo: a slide rule. A scientific calculator, circa 1971.]

The history of electronics spans just a flicker of the human experience. In a century we've gone from products with a single tube to those with hundreds of billions of transistors. The future is inconceivable to us, but surely the astounding will be commonplace. As it is today.

Thanks to Stan Mazor and Jon Titus for their correspondence and background information.

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intents of this newsletter. Please keep it to 100 words.

Joke For The Week

Note: These jokes are archived at www.ganssle.com/jokes.htm.

OS/2 Chicken: It tried to cross the road several times and finally gave
up.

Windows 95 Chicken: You see different colored feathers while it crosses,
but cook it and it still tastes like ... chicken.

Microsoft Chicken: It's already on both sides of the road. And it just
bought the road.

Mac Chicken: No reasonable chicken owner would want a chicken to cross
the road, so there's no way to tell it to.

C++ Chicken: The chicken wouldn't have to cross the road; you'd simply
refer to it on the other side.

Visual Basic Chicken: USHighways!TheRoad cross (aChicken).

Delphi Chicken: The chicken is dragged across the road and dropped on the
other side.

Java Chicken: If your road needs to be crossed by a chicken, the server
will download a chicklet to the other side.

Web Chicken: Jumps out onto the road, turns right, and keeps on running.

Advertise With Us

Advertise in The Embedded Muse! Over 23,000 embedded developers get this twice-monthly publication.

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster.