I put together a simple utility to collect the number of lines changed in GitHub-hosted repositories.

Code churn, the sum of lines added, deleted and modified, is believed to be a good measure of code quality. In general, low code churn means that the code is either very stable or dead.
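To make the definition concrete, here is a minimal Python sketch of the per-file sum. It assumes each commit is a dict with a `files` list carrying `filename`, `additions` and `deletions`, which is how git and GitHub report line counts (a modified line shows up as one deletion plus one addition). The function name and the sample data are hypothetical and are not taken from the utility itself.

```python
from collections import defaultdict

def churn_by_file(commits):
    """Sum added + deleted lines per file across a list of commits."""
    totals = defaultdict(int)
    for commit in commits:
        for f in commit.get("files", []):
            # A modified line is reported as one deletion and one addition,
            # so additions + deletions covers all three kinds of change.
            totals[f["filename"]] += f["additions"] + f["deletions"]
    return dict(totals)

# Made-up sample data, shaped like per-commit file stats.
sample_commits = [
    {"files": [{"filename": "app.py", "additions": 10, "deletions": 2}]},
    {"files": [{"filename": "app.py", "additions": 1, "deletions": 1},
               {"filename": "README.md", "additions": 3, "deletions": 0}]},
]

print(churn_by_file(sample_commits))  # {'app.py': 14, 'README.md': 3}
```

Files that never appear in the results for a timeframe are the "very stable or dead" candidates.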

The utility gets its data through the GitHub API. For a given timeframe it asks for all commits in the given repo and then gets the files and churn stats for each commit. The data is then aggregated into a database with this schema:

Here's the PowerShell script that I use to load the data:


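# $loader must already hold the path to the utility's executable (not shown here)
$dates = @(
    # dates to process go here
)

foreach ($date in $dates) {
    Write-Host "Processing $(([datetime]$date).ToShortDateString())"
    & $loader --r MyRepo --d $date
}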

At this point Bitbucket does not provide code churn metrics through its API, and I may need to extend this utility with an option to get churn from local git repos through the LibGit2Sharp library.
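As a rough illustration of the local-repo idea, the sketch below reads churn from the git command line rather than from LibGit2Sharp (a .NET library), so it is a stand-in technique and not the planned implementation. It relies on `git log --numstat`, which prints tab-separated added, deleted and path columns per file. The function names are hypothetical, and renamed files are counted under whatever path string git prints.

```python
import subprocess
from collections import defaultdict

def parse_numstat(text):
    """Aggregate churn per path from `git log --numstat` output."""
    totals = defaultdict(int)
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separators or any non-numstat line
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-" "-"
        totals[path] += int(added) + int(deleted)
    return dict(totals)

def local_churn(repo_path, since, until):
    """Run git in repo_path and return churn per file for the timeframe."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--format=",
         f"--since={since}", f"--until={until}"],
        capture_output=True, text=True, check=True,
    )
    return parse_numstat(result.stdout)

# Made-up numstat output, used here instead of a real repository.
sample = "3\t1\tsrc/a.py\n-\t-\tlogo.png\n\n2\t2\tsrc/a.py\n5\t0\tdocs/x.md\n"
print(parse_numstat(sample))  # {'src/a.py': 8, 'docs/x.md': 5}
```

Keeping the parsing separate from the `git` call means the aggregation can be checked without a repository on disk.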