For a long time I have been trying to come up with a good visualization of this dataset on my Code Quality Portal:

Date        System       Lines of Code  Code Coverage  Maintainability Index  Cyclomatic Complexity  Class Coupling  Depth of Inheritance
09/18/2015  Order Entry  4,300          43%            85%                    2,300                  320             3
09/18/2015  Accounting   13,829         6%             80%                    13,839                 5,201           4
09/25/2015  Everything repeats on the next collection date ....

The whole idea behind these data elements is to visualize the quality of the code in order to understand trends and identify hotspots. Data is collected weekly and represents the following:

  • Lines of Code - number of executable lines of C# code
  • Code Coverage - ratio of lines of code executed during unit tests to the number of all lines; 0-100%, the more the better.
  • Maintainability index - 1-100%, the more the better
  • Cyclomatic complexity - number of independent paths through the code, or degree of branching. Higher numbers don't necessarily indicate a problem, but they do indicate that maintaining this code will be harder
  • Class coupling - how coupled the classes in the code are. To some extent, lower class coupling is preferred, since a change in one class won't affect many others
  • Depth of inheritance - how deep the class hierarchy is. The lower the better.
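To make the shape of the data concrete, here is a minimal sketch of the two sample rows as plain JavaScript objects. The names systemName, linesOfCode, codeCoverage and cyclomaticComplexity match the chart binding used later in this post; collectionDate, maintainabilityIndex, classCoupling, depthOfInheritance and the parseMetric helper are assumed names for illustration only.

```javascript
// Hypothetical helper: turn a displayed value such as "4,300" or "43%"
// into a plain number by dropping thousands separators and percent signs.
function parseMetric(text) {
  return Number(text.replace(/[,%]/g, ''));
}

// The two sample rows from the table above, one object per system per
// collection date. Field names not used in the chart markup are assumptions.
const samples = [
  {
    collectionDate: '2015-09-18',
    systemName: 'Order Entry',
    linesOfCode: parseMetric('4,300'),
    codeCoverage: parseMetric('43%'),
    maintainabilityIndex: parseMetric('85%'),
    cyclomaticComplexity: parseMetric('2,300'),
    classCoupling: parseMetric('320'),
    depthOfInheritance: 3
  },
  {
    collectionDate: '2015-09-18',
    systemName: 'Accounting',
    linesOfCode: parseMetric('13,829'),
    codeCoverage: parseMetric('6%'),
    maintainabilityIndex: parseMetric('80%'),
    cyclomaticComplexity: parseMetric('13,839'),
    classCoupling: parseMetric('5,201'),
    depthOfInheritance: 4
  }
];
```

Keeping every metric as a number (not a formatted string) lets the chart scale the axes and bubble sizes without extra conversion.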

So, given that, my first brute-force attempt to visualize this dataset was a line chart where the X-axis had the date and the Y-axis had the values. Since I could not display all six values on the same Y-axis because they have different scales, I resorted to grouping the ones with the same scale into series groups and having several radio buttons to toggle between these groups:


Technically, it worked fine: you could change the radio buttons or date criteria and the chart updated. But people began asking me questions: what exactly do I need to look at to understand our code quality? Do I select this radio button or that one? Which lines are more important than others?

In thinking about the answers I realized that this dataset has too many dimensions (a.k.a. features): six, while charts have two (2-D) or three (3-D) at most. The first step is to pick the two dimensions that we care about most, so I picked Lines of Code for the X-axis and Code Coverage for the Y-axis, essentially creating a scatter chart with collection dates controlled by back and forward buttons.
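The date stepping can be sketched roughly like this. It is only an illustration, not the portal's actual code: the helper names (collectionDates, rowsForDate, stepIndex) and the collectionDate field are assumptions, and the 09/25 row is made-up data added so there is a second date to step to.

```javascript
// Sample rows. The first two come from the table above; the third is
// hypothetical data for illustration only.
const rows = [
  { collectionDate: '2015-09-18', systemName: 'Order Entry', linesOfCode: 4300, codeCoverage: 43 },
  { collectionDate: '2015-09-18', systemName: 'Accounting', linesOfCode: 13829, codeCoverage: 6 },
  { collectionDate: '2015-09-25', systemName: 'Order Entry', linesOfCode: 4450, codeCoverage: 45 } // hypothetical
];

// Distinct collection dates in ascending order.
// ISO yyyy-mm-dd strings sort correctly as plain strings.
function collectionDates(allRows) {
  return Array.from(new Set(allRows.map(r => r.collectionDate))).sort();
}

// The points to plot for one collection date.
function rowsForDate(allRows, date) {
  return allRows.filter(r => r.collectionDate === date);
}

// Move back (delta = -1) or forward (delta = +1), staying within bounds.
function stepIndex(dates, currentIndex, delta) {
  return Math.min(dates.length - 1, Math.max(0, currentIndex + delta));
}

const dates = collectionDates(rows);
let current = 0;
current = stepIndex(dates, current, 1);               // "forward" button
const scatterData = rowsForDate(rows, dates[current]); // data for the chart
```

Each button click only changes the index; the chart is then rebound to the rows for that single date.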

That worked better, but then I also wanted to squeeze in one more important feature, cyclomatic complexity, and this finally takes us to the bubble chart.

I heard about bubble charts from this TED talk in 2006. (Aside from the bubble chart, it is a very fun presentation to watch.) A bubble chart is essentially a scatter plot where the size of the data points is bound to one more dimension, which was exactly what I needed: I can bind cyclomatic complexity to the size.

Here's the AngularJS Wijmo bubble chart markup, just a simple binding:

<wj-flex-chart style="height:300px; margin-top:20px;" items-source="scatterData" chart-type="Bubble"  
               binding-x="linesOfCode" tooltip-content="<b>{systemName}</b><br /><b>Cyclomatic Complexity:</b> {cyclomaticComplexity:n0}<br /><b>Lines of Code:</b> {x}<br/><b>Code Coverage:</b> {y}%"
               selection-mode="Point" selection="chartProps.selection" ng-click="systemClick()" >
    <wj-flex-chart-axis wj-property="axisX" min="0" max="{{getMaxLinesOfCode()}}" title="Lines of Code"></wj-flex-chart-axis>
    <wj-flex-chart-axis wj-property="axisY" min="{{getMinCodeCoverage()}}" max="100" title="Code Coverage"></wj-flex-chart-axis>
    <wj-flex-chart-series binding="codeCoverage,cyclomaticComplexity"></wj-flex-chart-series>
</wj-flex-chart>

which renders this:


Now it is much easier to explain things:

  • If you see low-hanging bubbles, especially in the right-hand corner and especially big ones, that's bad. Translation: lots of lines of code, high cyclomatic complexity and low code coverage.
  • If the bubble rises as you move the date forward, that's good. Translation: we are adding more unit tests.
  • If the bubble moves right and down as you go forward by date, it's not good. Translation: we are piling up code without unit tests.
  • Having big bubbles is not a problem as long as they are above, say, the 50% line. Translation: lots of code, but there is good test coverage.
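The movement rules above can also be written down as a small function that compares the same system on two collection dates. This is just a sketch of the reading guide, not something the portal does: describeMovement is a hypothetical name, and the second data point below is made up for illustration.

```javascript
// Compare one system's position on two consecutive collection dates and
// describe the movement using the rules of thumb listed above.
function describeMovement(previous, current) {
  const movedRight = current.linesOfCode > previous.linesOfCode;
  const coverageDelta = current.codeCoverage - previous.codeCoverage;

  if (coverageDelta > 0) {
    return 'good: the bubble is rising, unit tests are being added';
  }
  if (coverageDelta < 0 && movedRight) {
    return 'not good: code is piling up without unit tests';
  }
  return 'no clear trend';
}

// First point is Order Entry from the table; the second is hypothetical.
const before = { linesOfCode: 4300, codeCoverage: 43 };
const after = { linesOfCode: 5100, codeCoverage: 38 }; // hypothetical

console.log(describeMovement(before, after));
```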

I had to make a few small adjustments when rendering the chart: shift the Y-axis a few points up if there are systems with zero code coverage, and don't rely on the automatic maximum X-value. See complete source code here.
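For illustration, the two axis adjustments could look roughly like the sketch below. It is not the code from the linked source: the function names, the 10% headroom and the 5-point offset are all assumptions, and I am reading "shift the Y-axis" as starting the axis slightly below zero so that a zero-coverage bubble is not cut off at the bottom edge.

```javascript
// Fixed X-axis maximum computed over ALL rows (every collection date), so
// the scale stays the same while stepping through dates.
// The 10% headroom is an assumed value.
function maxLinesOfCode(allRows) {
  const max = Math.max(0, ...allRows.map(r => r.linesOfCode));
  return Math.ceil(max * 1.1);
}

// Y-axis minimum: start a few points below zero when any system has zero
// coverage, otherwise start at zero. The -5 offset is an assumed value.
function minCodeCoverage(allRows) {
  const hasZeroCoverage = allRows.some(r => r.codeCoverage === 0);
  return hasZeroCoverage ? -5 : 0;
}

// The two sample rows from the table above.
const sampleRows = [
  { systemName: 'Order Entry', linesOfCode: 4300, codeCoverage: 43 },
  { systemName: 'Accounting', linesOfCode: 13829, codeCoverage: 6 }
];

const axisXMax = maxLinesOfCode(sampleRows);
const axisYMin = minCodeCoverage(sampleRows);
```

Computing the X maximum from the whole dataset rather than from the currently displayed date keeps bubbles comparable from one date to the next.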