By Jack Ganssle
An Interview With James Grenning - Part 2
James Grenning (https://www.wingman-sw.com/, formerly at http://www.renaissancesoftware.net), whose book "Test Driven Development in C" will be out in the Fall, graciously agreed to be interviewed about TDD. Part one of our talk is here.
Jack: How do you know if your testing is adequate? TDD people - heck, practically everyone in this industry - don't seem to use MC/DC, npath or cyclomatic complexity to prove they have run at least the minimum number of tests required to ensure the system has been adequately verified.
James: You are right; TDD practitioners do not generally measure these things. There is nothing said in TDD about these metrics. It certainly does not prohibit them. You know, we have not really defined TDD yet, so here goes. This is the TDD micro cycle:
1. Write a small test for code behavior that does not exist.
2. Watch the test fail, maybe not even compile.
3. Write the code to make the test pass.
4. Refactor any messes made in the process of getting the code to pass.
5. Continue until you run out of test cases.
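One pass through the micro cycle might look like the following minimal sketch. It uses plain assert() rather than any particular test harness, and the names led_driver_init, led_driver_turn_on and the 16-bit LED register are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Production code. In the micro cycle this is written only AFTER the
 * test below has been seen to fail (at first it would not even link). */
static uint16_t *led_register;   /* hypothetical memory-mapped LED register */

void led_driver_init(uint16_t *address)
{
    led_register = address;
    *led_register = 0;           /* all LEDs off after initialization */
}

void led_driver_turn_on(int led_number)   /* LEDs are numbered 1..16 */
{
    *led_register |= (uint16_t)(1u << (led_number - 1));
}

/* The test, written first. A plain variable stands in for the hardware
 * register, so the test runs on the development system. */
void test_turn_on_led_one(void)
{
    uint16_t virtual_leds = 0xFFFF;   /* start dirty to prove init clears it */

    led_driver_init(&virtual_leds);
    led_driver_turn_on(1);

    assert(virtual_leds == 0x0001);
}
```

The next small test (say, turning on two LEDs) would drive the next small piece of behavior, with a refactoring pass whenever the code gets messy.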
Maybe you can see that TDD would do very well with these metrics. Coverage will be very high, measured by line or path coverage.
One reason these coverage metrics are not the focus is that there are some problems with them. It is possible to get a lot of code coverage and not know if your code operates properly. Imagine a test case that executes fully some hunk of code, but never checks the direct or indirect outputs of the highly covered code. Sure it was all executed, but did it behave correctly? The metrics won't tell you.
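The gap between coverage and verification can be sketched in a few lines of C. The function saturating_add_u8 and both tests are hypothetical examples, not code from the interview. Both tests execute every line and branch of the function, but only the second would notice if the function returned the wrong value.

```c
#include <assert.h>
#include <stdint.h>

/* Adds two bytes, clamping the result at 255 instead of wrapping. */
uint8_t saturating_add_u8(uint8_t a, uint8_t b)
{
    uint16_t sum = (uint16_t)(a + b);
    if (sum > UINT8_MAX)
        return UINT8_MAX;
    return (uint8_t)sum;
}

/* Reaches both branches, so a coverage tool reports full coverage,
 * yet it passes no matter what the function returns. */
void test_covers_but_checks_nothing(void)
{
    (void)saturating_add_u8(1, 2);
    (void)saturating_add_u8(200, 100);
}

/* Same coverage, but the outputs are actually checked. */
void test_covers_and_checks(void)
{
    assert(saturating_add_u8(1, 2) == 3);
    assert(saturating_add_u8(200, 100) == 255);
    assert(saturating_add_u8(255, 0) == 255);
}
```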
Even though code coverage is not the goal of TDD, it can be complementary to TDD. New code developed with TDD should have very high code coverage, as well as meaningful checks that confirm the code is behaving correctly. Some TDD practitioners do a periodic review of code coverage, looking for code that slipped through the TDD process. I've found this to be useful, especially when a team is learning TDD.
There has been some research on TDD's impact on cyclomatic complexity. TDD's emphasis on testability, modularity and readability leads to shorter functions. Generally, code produced with TDD shows reduced cyclomatic complexity. If you Google for "TDD cyclomatic complexity" you can find many articles supporting this conclusion.
Jack: Who tests the tests?
James: In part, the production code tests the test code. Bob Martin wrote a blog a few years ago describing how TDD is like double entry accounting. Every entry is a debit and a credit. Accounts have to end up balanced or something is wrong.
If there is a test failure, it could be due to a mistake in the test or the production code. Copy and paste of test cases is the biggest source of wrong test cases that I have seen. But it's not a big deal because the feedback is just seconds after the mistake, making it easy to find.
Also the second step in the TDD micro cycle helps get a test case right in the first place. In that step we watch the new test case fail prior to implementing the new behavior. Only after seeing that the test case can detect the wrong result, do we make the code behave as specified by the test case. So, at first a wrong implementation tests the test case. After that, the production code tests the test case.
Another safeguard is to have others look at the tests. That could be through pair programming or test reviews. Actually, on some teams we've decided that doing test reviews is more important than reviewing production code. The tests are a great place to review interface and behavior, two critical aspects of design.
Jack: As has been observed, all testing can do is prove the presence of bugs, not the absence. A lot of smart people believe we must think in terms of quality gates: multiple independent activities that each filter defects. So that includes requirements analysis, design reviews, inspections, tests, and even formal verification. Is this orthogonal to TDD approaches, and how do TDD practitioners use various quality gates?
James: Unit level TDD is a defect prevention mechanism more than a bug finding mechanism (ref: https://blog.wingman-sw.com/archives/16 and https://blog.wingman-sw.com/archives/364). Mistakes are made regularly during development, but in the TDD micro cycle, the mistakes are immediately brought to the developer's attention. The mistake is not around long enough to ever make it into a bug tracking system.
There is another form of TDD, a more requirements-centric activity called Acceptance Test Driven Development (ATDD). In ATDD the customer representative defines tests that describe the features of the system. Each iteration, the team works to complete specific stories defined by the customer. A story is like a use case, or a specific usage scenario.
The acceptance tests define what "done" means for the story. These acceptance tests are also automated. If the new and all prior tests do not pass, the story is not done. That is an important quality gate. Only when they all pass is it done.
Speaking of inspections, I think TDD is superior to inspections. Don't get me wrong, I am a proponent of design reviews and pair programming. I did a case study on the Zune bug that illustrates my point. This bug caused the 30G Zune model to freeze on New Year's Eve 2008.
My informal research on the bug (https://blog.wingman-sw.com/archives/38) showed that most online code pundits who inspected the faulty function did not correctly identify the whole problem. I was in the group that got it almost right; a.k.a. wrong. Then I wrote a test. The test cannot be fooled as easily as a human. So, I think we need both inspections and tests.
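To make the point concrete, here is a sketch of the kind of boundary test that exposes the problem. The loop is a reconstruction of the widely circulated faulty snippet, not necessarily the exact shipped source. The function names, the 1980 origin year and the max_iterations guard (added so a test can report the hang instead of freezing) are assumptions of this sketch, and the corrected version is only one possible fix.

```c
#include <assert.h>
#include <stdbool.h>

#define ORIGIN_YEAR 1980   /* assumed epoch: day 1 is January 1, 1980 */

static bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Reconstructed faulty loop. Returns the year containing day number
 * `days`, or -1 if the loop has not finished after max_iterations passes. */
int year_from_days_faulty(int days, int max_iterations)
{
    int year = ORIGIN_YEAR;
    while (days > 365) {
        if (max_iterations-- <= 0)
            return -1;                /* would spin forever on the device */
        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
            /* days == 366 in a leap year: nothing changes, no exit */
        } else {
            days -= 365;
            year += 1;
        }
    }
    return year;
}

/* One possible correction: compare against the length of the current year. */
int year_from_days_fixed(int days)
{
    int year = ORIGIN_YEAR;
    for (;;) {
        int days_in_year = is_leap_year(year) ? 366 : 365;
        if (days <= days_in_year)
            return year;
        days -= days_in_year;
        year += 1;
    }
}

/* Boundary test: the last day of a leap year must map to that same year. */
void test_last_day_of_leap_year(void)
{
    assert(year_from_days_faulty(366, 1000) == -1);   /* the hang, detected */
    assert(year_from_days_fixed(366) == 1980);
    assert(year_from_days_fixed(367) == 1981);
}
```

Day 366 of a leap year is exactly the case a reviewer's eye tends to slide past, while a test on that boundary fails immediately.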
Jack: Some systems are complex or control processes that respond slowly. What happens when it takes hours to run the tests?
James: For TDD to be a productive way to work, the micro cycle has to be very short in duration. This pretty much rules out going to the target during the micro cycle; unit test execution must be kept short.
To avoid the target bottleneck I recommend that TDD practitioners first run their unit tests in their development system. If you are practicing the SOLID design principles it is natural to manage the software's dependencies on the hardware and OS.
If there is a lengthy control process being test driven, we need to take control of the clock. If we are managing dependencies, this is not hard. A time-driven event eventually resolves to a function call. The test fixture can call the event processing code just as well as an OS- or interrupt-based event handler can.
If your code needs to ask some time service what the current millisecond is, we can intercept those calls and mimic any time based scenario we like without any of the real delays slowing the test execution time.
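A minimal sketch of intercepting the time service, assuming the code under test receives its clock through a function pointer. The names millis_fn, fake_millis and soak_timer are hypothetical. In production the pointer would refer to the real tick source. In the test it refers to a fake whose value the test sets directly, so a two-hour timeout is exercised without waiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The seam: code under test asks for the time only through this pointer. */
typedef uint32_t (*millis_fn)(void);

/* Test double for the time service. The test sets the clock directly. */
uint32_t fake_now_ms;
uint32_t fake_millis(void) { return fake_now_ms; }

typedef struct {
    millis_fn now;          /* injected time service */
    uint32_t  started_ms;
    uint32_t  duration_ms;
} soak_timer;

void soak_timer_start(soak_timer *t, millis_fn now, uint32_t duration_ms)
{
    t->now = now;
    t->started_ms = now();
    t->duration_ms = duration_ms;
}

bool soak_timer_expired(const soak_timer *t)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    return (uint32_t)(t->now() - t->started_ms) >= t->duration_ms;
}

/* A two-hour process, tested without any real delay. */
void test_two_hour_soak(void)
{
    soak_timer t;

    fake_now_ms = 0;
    soak_timer_start(&t, fake_millis, 2u * 60u * 60u * 1000u);
    assert(!soak_timer_expired(&t));

    fake_now_ms = 7199999u;            /* one millisecond short */
    assert(!soak_timer_expired(&t));

    fake_now_ms = 7200000u;            /* exactly two hours */
    assert(soak_timer_expired(&t));
}
```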
With that said about unit tests, you might have the same issue when it comes to a more thorough integration or system test. If you have automated some of these tests, and you rely on using the real clock, tests could take a long time to run. But that may not be a problem, because the cadence of acceptance and systems tests does not need to be as fast as unit tests. We'd like to run these longer tests automatically as part of a continuous integration system.
Jack: Let's move on to my business concerns. Through incremental delivery, TDD promises to produce a product that closely aligns with the customer's needs. That is, at each small release the customer can verify that he's happy with the feature, and presumably can ask for a change if he's not. "Customer" might refer to an end-user, your boss, the sales department, or any other stakeholder. If there's no barrier to changes, how does one manage or even estimate the cost of a project?
James: This is more of an Agile requirements management issue than TDD, but that's OK. Let me start by saying that it is a misconception that there is no barrier to requirement changes and feature creep. For a successful outcome, requirements have to be carefully managed.
In agile projects there is usually a single person who is responsible for driving the development to a successful delivery. Let's call her the Product Owner (PO) to get away from the confusion of internal vs. external customer. The PO might be from marketing, or product management or systems engineering. She usually heads a team of skilled people who know the product domain, the market, the technology, and testing. She is responsible for making trade-offs. Team members advise her, of course.
To manage development, we create and maintain something called the product backlog. The backlog is the list of all the features (we can think of) that should go into the product. There is a strong preference to make the work visible to the PO, over work that only engineers understand. It is mostly feature oriented, not engineering task oriented. We are focusing on value delivery. We're preventing surprises by taking 3-month engineering deliverables and splitting them into a series of demonstrable bits of work that our customer cares about.
The product owner's team can add things to the backlog, but in the end, deciding what goes into a specific iteration is the PO's responsibility. For highly technical stories, a hardware engineer might play the role of the customer. For manufacturability stories, built-in test for example, a manufacturing engineer or QA person might play the role of the customer. You can see there may be many "customers", but the final call on what is worked on at what time is up to the PO.
You also ask about estimating time and cost. There is no silver bullet here, but there is a realistic process Agile teams use. When an initial backlog is created, all the backlog items or stories are written on note cards and spread out on a table. (A story is not a specification, but rather a name of a feature or part of a feature.)
Engineers get together and do an estimation session. Each story is given a relative difficulty on a linear scale. The easiest stories are given the value of 1 story point. All stories labeled with a one are of about the same difficulty. A story with a value of 2 is about twice as difficult to implement as a one. A 5 is about five times as difficult. I am sure you get the idea.
Once all the stories have a relative estimate, we attempt to calibrate the plan, by choosing the first few iterations and adding up their story points. We're estimating the velocity of the team in story points per iteration. The initial estimate for the project would be the total of all story points divided by the estimated velocity. This will probably tell us that there is no way to make the delivery date. But it's just an estimate; next we'll measure.
As we complete an iteration, we calculate the actual velocity simply by adding the point values of the team's completed stories. Velocity is the sum of the completed story points in an iteration. The measured velocity provides feedback that is used to calibrate the plan.
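The arithmetic is simple enough to sketch in C. All figures here are invented for illustration: a 240-point backlog, an estimated velocity of 20 points per iteration, and a first iteration in which stories worth 5, 3, 3, 2 and 2 points were completed. The function names are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Velocity: the sum of the story points completed in one iteration. */
int measured_velocity(const int *completed_points, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += completed_points[i];
    return sum;
}

/* Iterations needed to finish the remaining points, rounded up.
 * Returns -1 when the velocity is not positive (no meaningful projection). */
int iterations_needed(int remaining_points, int velocity)
{
    if (velocity <= 0)
        return -1;
    return (remaining_points + velocity - 1) / velocity;
}

void example_plan(void)
{
    /* Initial plan: 240 points at an estimated 20 points per iteration. */
    assert(iterations_needed(240, 20) == 12);

    /* After iteration 1: measure, then re-project the remaining work. */
    const int done[] = {5, 3, 3, 2, 2};
    int velocity = measured_velocity(done, sizeof done / sizeof done[0]);
    assert(velocity == 15);
    assert(iterations_needed(240 - velocity, velocity) == 15);
}
```

With these made-up numbers the projection moves from 12 iterations to 16 (one completed plus 15 remaining) after the very first measurement, which is the early warning the plan is meant to provide.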
If the projected date is too late for the business needs, we are getting early warning, rather than eleventh-hour surprises. Managers can use the data to manage the project. The PO can maximize delivered value by carefully choosing which stories to do and not do. The business could look at adding people before it is too late, or changing the date.
Jack: Engineering is not a stand-alone activity. While we are designing a product, the marketing people make advertising commitments, tech writers create the user's manual, trade shows are arranged, accounting makes income and expense projections, and a whole host of other activities must come together for the product's launch. TDD says the boss must accept the fact that there's no real schedule, or at least it's unclear which features will be done at any particular time. How do you get bosses to buy into such vague outcomes?
James: Jack, there goes that misconception again on "no real schedule". There is a schedule, probably a more rigorous and fact-based schedule than most developers are used to working with. The Agile approach can be used to manage to a specific date, or to specific feature content.
TDD is just part of the picture. The team activities should encompass cross-functional needs. While the product is evolving the team's progress is an open book. The user documentation, marketing materials, etc. can and should be kept up to date.
I don't try to get bosses to buy into vague outcomes. I get bosses that are not satisfied with vaguely "working harder/smarter next time." I get bosses that want predictability and visibility into the work. I get bosses that want to see early and steady progress through the development cycle, ones that are not so interested in doing more of the same thing and expecting different results.
Jack: Now for a hardball question: Is it spelled agile or Agile?
James: Saving the toughest for last, setting me up. Someone with greater command of the language better take that one. Like any label, agile is aging and getting diluted.
My real interest, and I think yours too, is advancing how we develop embedded software and meet business needs. To me many of the ideas in Agile Development can really help teams. But it's important to consider it a start, not the destination.
Jack, thanks again for the chat. It's always good talking to you.
Jack: Thanks, James, for your insightful answers. I hope the readers will respond with their thoughts and experiences using TDD in their workplace.
Published March 12, 2010