By Jack Ganssle

Lies, Damn Lies, and Schedules

Published September 2007 in Embedded Systems Design

Last month Mom contributed a number of her homely aphorisms about scheduling. She's not the only one with strong thoughts on the subject! A few recent books have some interesting insights.

M. A. Parthasarathy's Practical Software Estimation (2007, Pearson Education, Boston, ISBN 0-321-43910-4) is a completely impractical book that might appeal to academics but offers little of use to engineers. Much of it reads like a PowerPoint presentation, with innumerable bulleted items one should consider when creating a schedule. How should one consider these items? That question is ignored.

The book is written entirely within the scope of function points, which have long been recognized as a better metric than lines of code. A function point represents some level of, well, functionality. The book constantly references IFPUG (the International Function Point Users' Group), with many (many) admonishments to go to their web site for further details. This is the best advice in the book! Don't buy the book; just go to http://ifpug.org instead.

The book is unintentionally amusing. An entire chapter is devoted to computing the 14 General System Characteristics (GSCs), apparently a critical step in the estimation process, and then the author admits that there's no good research on converting these factors into useful numbers. Bummer. In fact, one is left with the sense that the GSC adjustments, which are claimed to be so critical to creating accurate estimates, are completely ad hoc and lack any useful quantitative basis. A few pages later the author ruefully confesses that most of the GSCs "probably" don't matter in modern systems, and that current research will help clear the fog. That's exciting for those chasing big grants but of little help to the practicing engineer.
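For readers who haven't run into the mechanism, the standard IFPUG adjustment works roughly like this: each of the 14 GSCs is rated on a 0-to-5 scale of influence, and the sum scales the unadjusted function point count by a factor between 0.65 and 1.35:

\[ \mathrm{VAF} = 0.65 + 0.01\sum_{i=1}^{14} \mathrm{GSC}_i, \qquad \mathrm{AFP} = \mathrm{UFP} \times \mathrm{VAF} \]

(That's the general IFPUG recipe, summarized from memory rather than from the book's treatment.)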

There is a pretty good chapter about the tradeoffs between doing the work in-house and outsourcing it overseas. Mr. Parthasarathy works for Infosys, so one might expect a bias favoring outsourcing, but in fact he gives the subject a very fair treatment.

Otherwise, it's a superficial work devoid of practical advice and guidance. Give it a pass.

Code Not Yet Complete

I always enjoy books by Steve McConnell, and his Software Estimation: Demystifying the Black Art (2006, Microsoft Press, Redmond, WA, Library of Congress Control Number 2005936847) is as readable as all of his other works. He writes in a very informal and accessible way. You can breeze through the 270 pages quickly.

In the very first chapter Steve draws the probability distribution function we usually give the boss: a single vertical line with 100% probability of delivering at a particular date and time. It's what the boss wants. It's what we do. It's laughable.

We know probability distributions have shape, but Steve dispels the common notion of using a bell curve. There are only a handful of ways a project can be early, but the causes of lateness are legion. Steve draws, but does not name, the more likely curve, which comes from a class of functions called Rayleigh distributions (Figure 1). The peak represents the delivery date with the highest probability of making the delivery, but even there the odds of being on time are only 50%.

Figure 1: The Rayleigh Distribution
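For the curious, the general Rayleigh probability density has the form

\[ f(t) = \frac{t}{\sigma^{2}}\, e^{-t^{2}/(2\sigma^{2})}, \qquad t \ge 0, \]

which rises quickly, peaks at t = σ, and then trails off in a long tail of ever-later possibilities. That long right-hand tail is the mathematical expression of the point above: there are only a few ways to be early, and endless ways to be late.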

The author does a fantastic job describing the "cone of uncertainty." When a project is first proposed, any estimate is likely to be off by 300 or even 400%, primarily because the requirements are still so vague. As the project progresses more is known and the estimates improve dramatically. Steve makes it clear that you cannot create a schedule that's anything more than fantasy until you have a reasonable set of requirements. He doesn't delve into eliciting those specs, and skips over managing changing requirements, which is a shame. A disciplined team absolutely needs a formal change request system so that the stakeholders understand the impact of any modification to the specification.

Sometimes our estimates are flawed because we forget to include activities like creating test data or managing build systems. A table on page 45 lists many of these oft-forgotten activities. It should be on every estimator's desk.

About 100 pages of the text are devoted to a veritable smorgasbord of techniques for creating estimates. Your favorite technique is listed, along with pretty much every other approach found in industry and academia. Though that sounds like a terrific resource, none are covered in any detail, and Steve doesn't present useful tradeoffs to help one understand which techniques work well, what sorts of accuracy one can expect, and how to decide which approach to use. He does suggest using a number of techniques to create a set of schedules, but most of us are far too busy to do that.

Though imperfect, this is an interesting and engaging book that has a lot of practical and immediately useful advice.

Navigating Schedules

The cover of Mike Cohn's Agile Estimating and Planning (2006, Pearson Education, Upper Saddle River, NJ, ISBN 0-13-147941-5) pictures an old salt on the bridge of a ship shooting the sun with his sextant. What does a sextant have to do with estimation? Perhaps the precision-made instrument, carefully forged by an old German craftsman, is meant to convey an image of accuracy. Ironically, though, by today's GPS standards a sextant is the crudest of position-finding devices, accurate only to a mile or two, and then only in the hands of a very experienced user.

Or maybe Mike is trying to subtly convey the opposite, the inherent lack of precision found in most estimates. Regardless, as an old celestial navigator myself who knows that positions, and estimates, come with error bands whose sizes are as important as the estimates themselves, I was immediately drawn to this book.

Don't come to this work looking for a step-by-step approach to creating a project schedule. The author includes no typical numbers one can use to guide one's estimation process. Nor are there formulas for deriving estimates. Some rules of thumb do surface. One that appalled me: take the current estimate and apply scaling factors of 60% and 160% to create a range of delivery dates. Tell your boss that a nominal 10,000-hour estimate means delivery after anywhere from 6,000 to 16,000 hours of work and watch his hair burst into flames. Real business issues (like "how much will this cost and where will we get the money?") are every bit as important as software concerns.

The book is, though, a highly readable set of ideas and strategies one can incorporate into an already reasonably healthy development environment, one where teams and management communicate well and are sensitive to each other's needs.

Much of what Mike discusses will already be very familiar to agile practitioners. The agile methods are discussed, but only in the context of scheduling. So there are a number of good discussions on issues like picking the duration of release iterations, and on monitoring progress using the velocity metric to highlight problems. Though he has an entire section about tracking progress, that subject is truly woven into the fabric of the entire work, and the insight he offers there alone makes the book worthwhile.
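For those who haven't worked with it, velocity is simply the number of story points a team actually completes per iteration, and projecting a finish is just division. Here's a back-of-the-envelope sketch in C; the numbers and the two-week iteration length are hypothetical, not taken from Cohn's book:

#include <stdio.h>
#include <math.h>

/* Back-of-the-envelope schedule projection from velocity:
 * velocity = story points actually completed per iteration,
 * so remaining iterations = remaining points / velocity.
 * All numbers below are hypothetical, for illustration only.
 */
int main(void)
{
    double points_remaining   = 120.0;                  /* backlog still to build */
    double completed_points[] = { 18.0, 22.0, 20.0 };   /* last three iterations  */
    int    n = sizeof completed_points / sizeof completed_points[0];
    double iteration_weeks    = 2.0;                    /* length of one iteration */

    double velocity = 0.0;
    for (int i = 0; i < n; i++)
        velocity += completed_points[i];
    velocity /= n;                                      /* average points/iteration */

    double iterations_left = ceil(points_remaining / velocity);

    printf("average velocity : %.1f points/iteration\n", velocity);
    printf("iterations left  : %.0f\n", iterations_left);
    printf("calendar weeks   : %.0f\n", iterations_left * iteration_weeks);
    return 0;
}

The same arithmetic is what makes velocity useful for flagging trouble: if the measured average keeps sliding, the projected finish slides with it, and everyone can see it.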

The chapter about doing financial analysis to help prioritize features is initially appealing till one realizes that it's usually impossible to compute the impact of these decisions. How do you know that adding feature X will yield 50 customers/month versus only 23/month from feature Y?

The strength of this book is its agile focus. Mike sums up the philosophy concisely: "I often equate the traditional view of a project as running a 10-kilometer race. You know exactly how far away the finish line is, and your goal is to reach it as quickly as possible. On an agile project, we don't know exactly where the finish line is, but we often know we need to get to it, or as close as we can by a known date."

The weakness of this book is that, through its agile focus, it ducks one of management's very real concerns. The boss legitimately wants to know when the system will be done, and what we'll deliver at that time. The frustrating truth is that this information is often critical, or at least considered critical, even though a comprehensive spec doesn't exist. This is the essential dilemma in all software estimation. The book correctly notes that such clairvoyance is simply impossible, but fails to offer useful strategies for helping management come to accept timely, though possibly incomplete, deliveries. And there's no useful help for those poor sods forced to bid fixed-price development contracts.

Regardless, this is a well-written book with some great ideas.

Are We There Yet?

Linda Laird and M. Carol Brennan's Software Measurement and Estimation: A Practical Approach (2006, IEEE Computer Society, published by John Wiley & Sons, Hoboken, NJ, ISBN 0-471-67622-5) goes beyond estimation into all sorts of interesting metrics applicable to software engineering. It's a textbook, so each chapter ends with a lot of useless questions designed to give students heartburn. But, being a textbook, it also includes a huge number of useful resources.

One can make a pretty good argument that "software measurements" is an oxymoron. Software engineering is more akin to the soft sciences than to physics or electrical engineering, where precision is not only possible, it's expected. In software engineering we don't even clearly define our terms: in this book the authors appear to use the words "function" and "module" interchangeably, even though, to me, a function is a single routine with a unique entry point, while a module is a source file containing functions and other elements. And there are a million ways to, say, count lines of code (Are comments counted? Blank lines?) and bugs (Raw coding errors? Overlooked functionality?), so a fog distorts our perceptions and our data.
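To make the point concrete, here's a deliberately naive line counter in C. Every branch in it embodies a counting policy somebody has to pick, which is why two people can measure the same source file and honestly report different numbers. (My own throwaway sketch, not anything from the book; it ignores block comments that share a line with code.)

#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Naive line counter: the point isn't accuracy, it's that "lines of
// code" depends on policy choices (count blanks? count comments?).
int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file.c\n", argv[0]);
        return 1;
    }
    FILE *f = fopen(argv[1], "r");
    if (!f) {
        perror("fopen");
        return 1;
    }

    char line[4096];
    int physical = 0, blank = 0, comment = 0, in_block = 0;

    while (fgets(line, sizeof line, f)) {
        physical++;
        const char *p = line;
        while (*p && isspace((unsigned char)*p))
            p++;                                  // skip leading whitespace

        if (in_block) {                           // inside a block comment
            comment++;
            if (strstr(p, "*/"))
                in_block = 0;
        } else if (*p == '\0') {
            blank++;                              // nothing but whitespace
        } else if (p[0] == '/' && p[1] == '/') {
            comment++;                            // whole-line // comment
        } else if (p[0] == '/' && p[1] == '*') {
            comment++;                            // whole-line block comment
            if (!strstr(p + 2, "*/"))
                in_block = 1;
        }
    }
    fclose(f);

    printf("physical lines: %d\n", physical);
    printf("blank lines   : %d\n", blank);
    printf("comment lines : %d\n", comment);
    printf("one possible \"LOC\": %d (physical - blank - comment)\n",
           physical - blank - comment);
    return 0;
}

Run it over the same file with and without the blank and comment deductions and you get three defensibly "correct" answers, which is the fog the authors are up against.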

Yet measurement, for all its failings, does have value. That's the challenge with this book. The engineer in me wants to discard anything so imprecise, while, like people studying the social sciences, we must recognize that imperfect data is better than no data.

And so, this book is a grab-bag of metrics, some somewhat useful, others not. Like Steve McConnell's book, it gives many different approaches to estimation (and collecting other sorts of numbers), but Laird and Brennan make even less effort to evaluate the effectiveness of the different techniques.

Somewhat disconcertingly, we're told that there's little hope of getting much better than +/-100% accuracy in our estimates, yet the authors claim the average overrun is 43%. The numbers just don't jibe.

But they do offer a number of great ideas and metrics for tracking the progress of a project. Those interested in a more scientific approach to project monitoring will find this information useful.

The authors shine most when discussing defects. They clearly present mathematical models, including the theory behind Rayleigh distributions (which pertain to bug rates as well as schedules). They suggest using these models to predict when to stop debugging. It's an interesting idea, though as one who is intolerant of badly-behaved electronic systems I think it makes sense to stop debugging when there are no bugs. But the quantitative approach does have some merit in helping to understand the project's status.
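For reference, the Rayleigh defect model is usually written with the cumulative number of defects discovered by time t as

\[ D(t) = K\left(1 - e^{-t^{2}/(2 t_m^{2})}\right), \]

where K is the total number of defects expected over the project's life and t_m is the time at which the discovery rate peaks. (That's the general form; the book's own notation may differ.) Watching the measured defect-arrival curve bend over toward K is what lets the model suggest a stopping point.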

Overall, the book has a lot of interesting ideas, though many haven't been tempered by real-world experience. I found it a useful book, though one that does not live up to its promises.

So there you have it. Four books about scheduling. All imperfect, which perhaps simply reflects the state of the art of creating estimates. Most have some great ideas and insights. Most will make you think more deeply about this subject, and will shed some welcome light on what is all too often a hugely dysfunctional process.