
The Embedded Muse
Issue Number 489, May 6, 2024
Copyright 2024 The Ganssle Group

Editor: Jack Ganssle, jack@ganssle.com


You may redistribute this newsletter for non-commercial purposes. For commercial use contact jack@ganssle.com. To subscribe or unsubscribe go here or drop Jack an email.

Contents
Editor's Notes
Quotes and Thoughts
Tools and Tips
C to C++
On Bugs
Failure of the Week
Jobs!
Joke For The Week
About The Embedded Muse


Tip for sending me email: My email filters are super aggressive and I no longer look at the spam mailbox. If you include the phrase "embedded" in the subject line your email will wend its weighty way to me.

Quotes and Thoughts

"Those who want really reliable software will discover that they must find means of avoiding the majority of bugs to start with, and as a result the programming process will become cheaper. If you want more effective programmers, you will discover that they should not waste their time debugging, they should not introduce the bugs to start with." - Dijkstra

Tools and Tips

Please submit clever ideas or thoughts about tools, techniques and resources you love or hate. Here are the tool reviews submitted in the past.

Here's a good article about using the Mongoose networking library with MicroPython.

C to C++

John Carter has some tips on going from C to C++:

So I have recently completed converting a large C embedded system to C++. Here are some lessons learnt.

* It's much easier than it sounds. We elected to have a halfway point where we could continue to compile the code as either C or C++. I called this the transitional period.

* Like most large systems, it was a mix of 3rd party C code we weren't going to port, a small amount of existing C++ code, and, by far the largest part, C code.

* I bulk-renamed all C code we were going to port to .c++ and .h++ (and preexisting C++ to .cpp and .hpp), but configured the build system so that, on a switch, it would build .c++ files as either C or C++.

* We use Mercurial with Mercurial Evolution as our version control system. It keeps track of files across renames very well.

* The Evolution part meant I could rerun the process as many times as it took to get it right, prune any commits that were bad, and rebase the ones that were good.

* I had to take care NOT to rename .h files 3rd party code depended on.

* I scripted fixing up the #includes. (The Ruby language and regexes are _very_ powerful.)

* During the transitional period I created macros that abstracted away C- vs. C++-specific things (see the first sketch after this list).

* C++ is _much_ stricter around enum types, and that was by far the bulk of the manual fixups (see the second sketch after this list).

* The net effect: thanks to C++'s stricter type system, I found and fixed more preexisting bugs than I introduced with my manual fixups.

* The linking fix-up step was also somewhat painful, due to places where programmers had been sloppy and declared functions "extern" at the point of use instead of declaring them in headers.

* The interface points between C and C++ code also took extra care, to get the name mangling right. (More on that later, and in the first sketch after this list.)

* Starting from warnings-clean C code is essential to success.

* Once the linking step passed, getting all unit tests running was fairly trivial, with problems again coming down to sloppy programming by the original authors. The fix was always "just do it right," and it worked.

* Running on target (a different CPU) was the least painful step, requiring only one fix.

* An important rule for any bulk change: script the change, but don't aim for 100% automation. There is always a trade-off point where automating to, say, ninety-something percent and doing the rest manually is faster, easier, and safer than automating to 100%.

* Going along with that: commit the script, run the script, commit the result, then commit the manual fixups.

* If you find you need to fix up the script, roll back to that point, amend the script commit, rerun the script to create a new changeset for the result, and rebase your manual fixups on top of that.

* Version control systems are there to make us brave.

* During the transitional period we're coding to the lowest common denominator between C and C++, so it's important to exit it as fast as possible, as you're reducing the team's productivity in this period.
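
To make a couple of those bullets concrete, here are two minimal sketches. The first shows the kind of transitional macros and C/C++ interface declarations involved; the names are my own illustration, not John's actual code:

    /* transitional.h -- compiles as both C and C++ */
    #ifdef __cplusplus
      #define EXTERN_C_BEGIN extern "C" {
      #define EXTERN_C_END   }
      #define STATIC_CAST(type, expr) static_cast<type>(expr)
    #else
      #define EXTERN_C_BEGIN
      #define EXTERN_C_END
      #define STATIC_CAST(type, expr) ((type)(expr))
    #endif

    /* motor.h -- one header, callable from both languages,
       with no name-mangling surprises at the link step */
    EXTERN_C_BEGIN
    void motor_set_speed(int rpm);
    EXTERN_C_END

The second shows the enum strictness that drove most of the manual fixups. C accepts the implicit int-to-enum conversion; C++ rejects it:

    enum motor_state { MOTOR_OFF, MOTOR_ON, MOTOR_FAULT };

    void set_state(void)
    {
        enum motor_state s;
        s = 2;              /* legal C; an error in C++ without a cast */
        s = MOTOR_FAULT;    /* fine in both languages */
        (void)s;
    }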

Once it was all compiling, linking, and running on target, with all unit tests and all automated on-target tests passing, I "flipped the switch" and stripped out the transitional support.

Static Analysis:

Previously we were a "splint" shop; our code was checked by splint. But splint doesn't do C++.

* Splint was becoming a pain point; it checks a language that is almost, but not quite, the C the compiler sees.
* The compiler's built-in -W -Wall warnings have taken over a _large_ part (and more) of what splint gave us.
* gcc's built-in -fanalyzer option is not ready for C++ prime time.
* So I moved our unit tests to clang and use its static analyzer.
* I have _also_ integrated our editors with clangd and clang-tidy. This creates user-friendly hints about opportunities to make your code cleaner, safer, more modern, and conformant to the C++ Core Guidelines.

So let's address some elephants...

Why?

C++20 is just not your Granny's C++. All the old pre-Y2K reasons for not using it are long gone.

They have been deadly serious about zero-overhead abstractions.

I keep a close eye on the assembler produced. There's no sign of bloat; layers and layers of templates are optimized away.

In fact, I'm reminded of the Bad Old Days when we moved to C and the old programmers said they could produce more optimized code than the C compiler. 

Sorry, old timer: maybe, if you slave away on a routine for a month, maybe.

Can you hand-craft an algorithm better than something out of C++'s <algorithm> toolbox?

Maybe, if you slave away for a month, maybe. 

But by far most of the time the assembler programmer loses to the compiler, and now most of the time the C programmer loses to the template library.

To be fair, I'm an old dog... I'm just plain amazed at how good they have gotten at optimizing away the bloat.
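
To make that concrete, here is a hedged sketch (my illustration, not John's code): at -O2, the library version typically compiles down to the same tight loop as the hand-rolled one.

    #include <algorithm>
    #include <array>

    std::array<int, 8> samples{3, 9, 1, 7, 4, 8, 2, 6};

    /* Hand-rolled maximum */
    int max_manual(void)
    {
        int m = samples[0];
        for (std::size_t i = 1; i < samples.size(); ++i)
            if (samples[i] > m)
                m = samples[i];
        return m;
    }

    /* Library version: same generated code, less room for bugs */
    int max_library(void)
    {
        return *std::max_element(samples.begin(), samples.end());
    }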

Why C++ and not Rust?

There is zero money for a total rewrite. Not going to happen. 

So that leaves either staying in C or sticking little bits of Rust on top of the 300 lb C Gorilla.

Compared to C, I find even tiny refactors as I fix a bug or add a feature result in me deleting a lot of code, AND getting something that is higher on the Rusty scale of API goodness: https://gist.github.com/mjball/9cd028ac793ae8b351df1379f1e721f9

Even trivial things, like turning a C array into a std::array, make the full set of tools for C++ containers available to it (a sketch follows).
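
A minimal sketch, with illustrative names of my own:

    #include <algorithm>
    #include <array>

    /* Before: int thresholds[4] = {10, 20, 30, 40}; */
    std::array<int, 4> thresholds{10, 20, 30, 40};

    bool any_exceeded(int reading)
    {
        /* The size travels with the object: no sizeof arithmetic
           and no separate length parameter to get wrong. */
        return std::any_of(thresholds.begin(), thresholds.end(),
                           [reading](int t) { return reading > t; });
    }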

Trivial things like adding constructors and destructors catch and fix a range of possible resource leaks (sketched below).
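
For instance, a hedged RAII sketch; the mutex_lock()/mutex_unlock() C API is an assumption for illustration:

    extern "C" void mutex_lock(void);
    extern "C" void mutex_unlock(void);

    class ScopedLock {
    public:
        ScopedLock()  { mutex_lock(); }
        ~ScopedLock() { mutex_unlock(); }  /* runs on every exit path */
        ScopedLock(const ScopedLock&) = delete;
        ScopedLock& operator=(const ScopedLock&) = delete;
    };

    void update_state(bool fault)
    {
        ScopedLock guard;   /* lock acquired here */
        if (fault)
            return;         /* early return: destructor still unlocks */
        /* ... normal processing ... */
    }                       /* ... and unlock happens here, too */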

I find that the places where the safety and productivity of a powerful language break down AND become a source of impedance are at every boundary with any C code you have.

We have a couple of those boundaries, where we interface with 3rd party C libraries.

But compared with sprinkling Rust atop the 300 lb C Gorilla, the number of boundaries that create impedance is orders of magnitude smaller.

Sure, the existing code is still C compiled as C++, but I can and do readjust the function signatures to give me the safety and power of C++.

In the Bad Old Days of splint, we'd annotate pointer parameters to indicate ownership semantics. C++ has much more powerful tools to communicate that (see below).
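
A hedged before/after sketch: the /*@only@*/ annotation is genuine splint syntax, but the function here is my own invention:

    #include <memory>
    #include <string>

    // Splint era: an annotation documented the ownership transfer:
    //   void log_consume(/*@only@*/ char *msg);

    // C++: the signature itself enforces the transfer.
    void log_consume(std::unique_ptr<std::string> msg)
    {
        // ... write *msg to the log; freed automatically when msg dies
    }

    void caller()
    {
        auto msg = std::make_unique<std::string>("overtemp");
        log_consume(std::move(msg));  // ownership visibly handed over
        // msg is now null; accidental reuse is much easier to spot
    }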

The Safety Issue:

I'm prepared to admit Rust is safer than C++.

However, nothing it gives compares to the man-centuries of manual and field test-and-fix that the product already has.

Not even close.

But again, C++20 is not your Granny's C++. Between language and library design, compiler warnings, and analyzers... C++ is _much_ safer than C ever was.

I'm also hopeful: watching the CppCon keynotes of the last few years, the standards committee is taking the challenge of "All the Safeties" very seriously.

Vlad Z's take on C is:

The linked article was an interesting read.  The problems indicated, namely the need for "stricter enforcement of known rules for type, bounds, initialization, and lifetime language safety," are built into C and C++.  When something is built in, changing it is not really possible.  My view is that C has been misused, and C++ was an attempt to fix C, which failed.  C was designed for low-level work by experts.  Instead, it was used in colleges to teach software, and from there it spread everywhere.  I don't think it should be used for application development, whether user-facing or embedded.  There are safer alternatives, and if the existing ones are considered deficient, better ones should be created.

I remember times when hardware supported counted strings and arrays.  You simply could not get out of bounds if you used the language designed for that hardware platform; the app was aborted on out-of-bounds access.  You could not blow the stack either.  Those were the days of minicomputers with CISC architecture.  Some would raise the portability issue at this point.  I would argue that we should define certain minimal requirements for the hardware on which we run our software.  RISC-V is an attempt to standardize the processor's instruction set.  It would not hurt to add instructions for string manipulation and array handling.  BTW, C's handling of strings and arrays was the result of it being written for specific hardware, which lacked appropriate hardware support.

A counterargument that RISC-V should remain pure RISC is fallacious.  I have heard purists' arguments in many areas, technical and not.  They always ignore reality and expediency in favor of dogma.  I have problems with dogmas, technical and otherwise.

I have to agree that hardware support, in these days of nearly-free transistors, is a must. Why most/all processors don't include some sort of memory management unit is beyond me.

Vlad mentions that he remembers when one could not blow the stack. I remember the 8008, whose stack was in hardware, and which was a mere 7 levels deep. We blew that all the time!

On Bugs

Engineering is all about numbers, and the numbers can reveal some startling facts.

This is data from "Software Quality in 2011: A Survey of the State of the Art" by Capers Jones, combined with my data for embedded. Though Jones' data is oldish, I see no evidence that the numbers have changed much.

First, where do bugs come from?

The column "Malpractice" means these folks are software terrorists. "CMM3" refers to the third level of the Capability Maturity Model, which is considered a disciplined software engineering process.

Where do they get removed?

Jones introduced the notion of "Defect Removal Efficiency" (DRE): the percentage of bugs removed during development and in the first 90 days after initial delivery, out of the total universe of defects in the product, including those found in development. Here are his numbers:

So the very best companies can expect 30 bugs in a 100 KLOC project! Alas, few are even close to that level.
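
To see how DRE translates into shipped bugs, here's an illustrative calculation (my numbers, chosen to match the 30-bug figure, not Jones's):

    defects injected (total):                 5,000
    removed in development + first 90 days:   4,970
    DRE = 4,970 / 5,000 = 99.4%
    latent defects shipped = 5,000 - 4,970 = 30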

Failure of the Week

From Tom Van Sistine:

Geoff Field wrote: On a game I've been playing (probably too much), there's a prize given when you earn a certain number of "stars". I wondered why it was taking so long to get to level 1 when I noticed the counter:

Have you submitted a Failure of the Week? I'm getting a ton of these and yours was added to the queue.

Jobs!

Let me know if you’re hiring embedded engineers. No recruiters please, and I reserve the right to edit ads to fit the format and intent of this newsletter. Please keep it to 100 words. There is no charge for a job ad.

Joke For The Week

These jokes are archived here.

Technology is dominated by those who manage what they do not understand.

About The Embedded Muse

The Embedded Muse is Jack Ganssle's newsletter. Send complaints, comments, and contributions to me at jack@ganssle.com.

The Embedded Muse is supported by The Ganssle Group, whose mission is to help embedded folks get better products to market faster.

Click here to unsubscribe from the Embedded Muse, or drop Jack an email.