Two years ago we needed to measure, track and analyze the quality of our C# code base, and as part of that initiative I built this silly little web site. After collecting two years' worth of metrics, these are some of the things we've learned.

Code quality can mean different things to different people. How do we measure it? What metrics are important?

We started off with the metrics generated by Visual Studio: maintainability index, lines of code, cyclomatic complexity, class coupling and depth of inheritance.

On the surface the maintainability index, a number between 1 and 100 where 100 is the most maintainable code, seemed like a good metric. The problem with it is that for most of our code this index was in the high eighties. We have a few hot spots where it was below 6, and a few where it was actually zero, but for the most part it wasn't very interesting to most people.
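For reference, Microsoft documents the formula behind this index: the classic 171-point maintainability formula, rescaled to 0-100 and clamped at zero. A small Python sketch (the function name is mine):

```python
import math

def maintainability_index(halstead_volume, cyclomatic_complexity, lines_of_code):
    """Visual Studio's maintainability index, rescaled to 0..100.

    Inputs are per-member metrics; the logarithms compress size, which is
    why most ordinary code lands in a comfortable high band.
    """
    raw = (171
           - 5.2 * math.log(halstead_volume)
           - 0.23 * cyclomatic_complexity
           - 16.2 * math.log(lines_of_code))
    return max(0, raw * 100 / 171)
```

The clamp at zero is also why a handful of our worst hot spots all read exactly 0: once the raw score goes negative, any further degradation is invisible.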

Class coupling and depth of inheritance were just like the maintainability index - nice-to-know numbers. Over two years our class coupling across the entire code base increased from 11,816 to 12,915 and depth of inheritance remained flat at 9, which wasn't very informative.

Cyclomatic complexity, which is roughly the number of if's in the code, turned out to be more useful because it can serve as an indicator of which systems are more complex than others.
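More precisely, it is one more than the number of decision points. A crude regex-based sketch of counting it for C# source (real tools walk the parsed syntax tree; this one happily miscounts keywords inside strings and comments):

```python
import re

# Branch points: conditional keywords plus short-circuit and
# null-coalescing operators. One path exists with no branches, hence +1.
_BRANCH = re.compile(r"\b(if|while|for|foreach|case|catch)\b|\?\?|&&|\|\|")

def cyclomatic_complexity(source: str) -> int:
    return 1 + len(_BRANCH.findall(source))
```

For example, `if (a && b) { ... }` contributes two decision points, giving a complexity of 3.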

The next metric we tried was code churn, which is the number of source control additions and deletions (plus changes, but git only has additions and deletions). According to an old Microsoft Research paper, code churn can be used as a predictor of software defects: if a release is approaching and code churn is high, we may have a quality problem. We've been collecting code churn for two years and representing it in a nice bar graph with drill-down to commits, but we have yet to tie it to code quality. It is helpful and informative to see code churn as a pulse across all teams and all repositories, but it is not sufficient on its own to determine where we are heading.
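For git, the additions and deletions come straight out of `git log --numstat`. A minimal sketch of a parser for that output (the function name is mine):

```python
def parse_churn(numstat_output: str):
    """Sum additions and deletions from `git log --numstat` output.

    Each data line looks like "<added>\t<deleted>\t<path>"; binary files
    show "-" in the numeric columns and are skipped here.
    """
    added = deleted = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or parts[0] == "-":
            continue
        added += int(parts[0])
        deleted += int(parts[1])
    return added, deleted  # churn for the period = added + deleted
```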

Initially I came up with a page with a lot of selects for time ranges, branches, systems, modules, namespaces, types and so on. I thought it would be cool to be able to drill down and slice and dice the metrics by any of those dimensions and see the trends. For the most part, people didn't get it. It was too busy, with too many knobs to tune.

Then we decided to simplify and came up with these simple metrics that everybody intuitively understands:

  • Code Coverage - the ratio of source code lines covered by unit and integration tests to all source code lines. It can be branch or line coverage; we use line coverage, which makes us feel better because it is the higher of the two.

  • Lines of code - the number of executable C# lines of code, excluding comments. A multi-line statement counts as one line, up to the semicolon.

  • Cyclomatic complexity - how "branchy" the code is.
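As a sketch of how these three numbers might travel together per system (the names here are my own, not the site's):

```python
from dataclasses import dataclass

@dataclass
class SystemMetrics:
    # The three metrics everybody intuitively understands, per system.
    name: str
    loc: int                    # executable lines, comments excluded
    covered_loc: int            # lines hit by unit/integration tests
    cyclomatic_complexity: int

    @property
    def coverage(self) -> float:
        # Line coverage as a percentage; guard against empty systems.
        return 100.0 * self.covered_loc / self.loc if self.loc else 0.0
```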

The next question was how to represent three metrics in two-dimensional space, and I borrowed the idea of a bubble chart, which I saw a long time ago in a TED talk on world population. In our case it looks like this:
[Bubble chart]

Here the x-axis is lines of code, the y-axis is code coverage, and the bubbles are sized proportionally to the cyclomatic complexity of the systems. There's a legend off to the right which is not shown. Clicking the time forward and back buttons makes the bubbles move. Looks kind of cool. I was thinking about dividing this plot into four quadrants:

  1. High LOC and high code coverage: good
  2. High LOC and low code coverage: usually big legacy systems for which there are no unit tests. Definitely a problem if the bubbles are big and not moving as time goes on
  3. Low LOC and high code coverage: good test coverage for relatively simple systems
  4. Low LOC and low code coverage: not a first priority
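The quadrant assignment boils down to two thresholds. A sketch, where the threshold values are illustrative assumptions rather than numbers from our data:

```python
def quadrant(loc: int, coverage: float,
             loc_threshold: int = 10_000,
             coverage_threshold: float = 60.0) -> int:
    """Classify a system into one of the four quadrants above.

    Returns the quadrant number as listed: 1 = high LOC / high coverage,
    2 = high LOC / low coverage, 3 = low LOC / high coverage, 4 = the rest.
    """
    high_loc = loc >= loc_threshold
    high_cov = coverage >= coverage_threshold
    if high_loc and high_cov:
        return 1
    if high_loc:
        return 2
    if high_cov:
        return 3
    return 4
```

With sized bubbles, cyclomatic complexity then tells you which quadrant-2 systems to worry about first.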

We continued building on the idea of code coverage, LOC and cyclomatic complexity and, in addition to aggregating by system, also added repositories and teams, putting this together in a line plot of code coverage by system, repository or team, with the ability to see each module's LOC and code coverage on a given day:
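One detail worth making explicit when rolling module coverage up to a system, repository or team: sum covered and total lines and then divide, rather than averaging percentages, so that large modules weigh more. A sketch:

```python
def aggregate_coverage(modules):
    """Roll per-module line coverage up to a system, repo or team.

    `modules` is a list of (covered_lines, total_lines) pairs. Weighting
    by line counts avoids a tiny fully-covered module masking a huge
    untested one, which a plain average of percentages would do.
    """
    covered = sum(c for c, _ in modules)
    total = sum(t for _, t in modules)
    return 100.0 * covered / total if total else 0.0
```

For example, a 100-line module at 50% next to an untested 900-line module yields 5% overall, not the 25% a naive average would report.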

Wouldn't it have been better to use an off-the-shelf commercial or open source product like Sonar, NDepend or CAST instead of building a custom solution? Maybe, but I had a lot of fun building it.