
Coverity Scan 2012

Summary: The annual Coverity Scan Report is out, and has some interesting data.

For the fifth year in a row Coverity has released its Coverity Scan Report, which is billed as an update on the state of open-source software. Yet it's more than that, as the company evaluated a far greater quantity of proprietary code as well.

Coverity sells a static analyzer, a tool that finds large classes of software problems that could surface at runtime by doing very complex analysis of the source code. They're not alone; a lot of other outfits (e.g., Polyspace, Klocwork, GrammaTech, Green Hills and others) sell similar tools. Each year Coverity runs bunches of code through its tool to, well, sell more seats, but also to show the state of software defects.

While the data in the report is indeed interesting, it's somewhat misleading. Remember that the defects shown are only ones that their tool can find; plenty of other problems will remain undetected.

Coverity looked at 68 million lines of open-source software and 380 million lines of proprietary code. That's a heck of a sample size. The average project ran 580k LOC - not monstrous, but pretty big. 11% of the open-source projects exceeded 1m LOC.

Open-source code averaged a defect density (number of detected bugs per thousand lines of code) of 0.69, which is much better than typical embedded code. That's twice as bad as in 2008, but the company has added more checkers since then, so more classes of bugs are now found.

Proprietary software scored almost exactly the same. So is open-source no better than that which is walled off?

The data doesn't say. However, I would speculate that the tool, which is expensive, is primarily used by quality-focused developers. Most of the market is far less concerned with software quality (other than paying lip service), so the data is probably skewed more than a little.

Running the numbers shows that the tool found 47k errors in the open-source code, but it turns out only 22k were fixed. That's not a result of false positives, since only 9.7% of the reports fell into that category.

The 9.7% figure is encouraging, as false positives are a huge disincentive for people to use these sorts of tools.

The top errors found were:

    Control flow issues: 3,464
    Null pointer dereferences: 2,724
    Resource leaks: 2,544
    Integer handling issues: 2,512
    Memory corruptions: 2,264
    Memory illegal accesses: 1,693
    Error handling issues: 1,432
    Uninitialized variables: 1,374
    Uninitialized members: 918

The report is free (registration required) from www.coverity.com.

What do you think? Are you using static analyzers? Why or why not?

Published May 20, 2013