A common question asked by managers is “where are all these bugs coming from?” It is challenging to generalize about which files and modules are contributing most to incoming bug reports, but GitClear has designed a mechanism that can let engineering leaders pinpoint the top culprits.
Our bug identification system builds on top of code line provenance calculations. Three steps give it its bug-pinpointing powers:
Classify which commits are bug fixes, using one of three available options
Look back through repo history to identify the last meaningful changes made to the lines that had to be changed to resolve a bug
Display the aggregated results of this deduction to offer managers the chance to predict future bugs before they are deployed
Below we will discuss each of these steps in greater detail.
Choose one of three available options for classifying which commits are bug fixes
There are three tactics GitClear can use to deduce which commits are bug-resolving:
Does the commit reference an issue tracker ticket designated as a bug?
Does the commit message mention words that indicate it was resolving a bug?
Does an AI classifier identify the commit message as most likely corresponding to a bug fix?
If any of these are true, there is a good chance that the work resolves a bug. However, this alone is not sufficient to label the work as resolving a bug. See “Additional Constraints” below for the final considerations GitClear checks before labeling work as bug-resolving.
By default, GitClear will consider a commit likely to resolve a bug if any of the following phrases are found at the beginning of a word in the commit message:
Fix
Resolve
Bug
Exception
Error
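The default matching rule can be illustrated with a minimal Python sketch. This is not GitClear's implementation: the function name `looks_like_bug_fix` is hypothetical, and case-insensitive matching is an assumption made here for illustration.

```python
import re

# Default phrases from the list above. Lowercasing them and matching
# case-insensitively is an assumption of this sketch.
DEFAULT_BUG_WORDS = ["fix", "resolve", "bug", "exception", "error"]

# \b anchors each phrase to the beginning of a word, so "fixes" matches
# through "fix", while "prefix" does not match at all.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in DEFAULT_BUG_WORDS) + r")",
    re.IGNORECASE,
)

def looks_like_bug_fix(commit_message: str) -> bool:
    """Return True if any default phrase starts a word in the message."""
    return _PATTERN.search(commit_message) is not None
```

Under these assumptions, "Fixed login redirect" and "Handle errors in parser" would qualify, while "Add prefix option" would not, because "fix" does not begin a word there.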
If you want to set up different words that designate a bug fix, visit Settings > Detect Bug Fixes. Ensure that "Set custom bug fix words" is chosen in the "How are bugs defined?" section. Then, at the bottom of the page, you will have a form available to enter the details for a new string that indicates the commit is resolving a bug.
When "Set custom bug fix words" is chosen, you can set up specific words that designate a commit as a bug fix
Here, you can specify alternate words that indicate work that is probably resolving a bug. In addition to adding your own dictionary of bug fix words, you can also specify other conditions for whether the word qualifies.
By default, we search the commit title (first line of a commit message) and description (all other lines). You can switch this so that the word counts only if found in the first line, or only if found in subsequent lines.
By default, we will match any word that begins with the characters provided. This means that if you specify the string “fix,” we would also match “fixes,” “fixed” and “fixing” (but not “bugfix,” since that word doesn’t begin with the provided string). If you only want to allow exact matches, that can be configured.
By default, we will look for the string you provide at the beginning of every word in the commit message. This can be changed such that we only match a word that begins a line. For the “fix” example, that would mean instead of “This commit fixes a big problem,” the commit message would have to be “Fixed big problem.”
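The three options above can be combined in a short Python sketch. It is only an illustration under stated assumptions: `BugWordRule`, its field names, and `rule_matches` are hypothetical and do not correspond to GitClear's settings form or API, and case-insensitive matching is assumed.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class BugWordRule:
    """Hypothetical model of one custom bug fix word and its conditions."""
    word: str
    scope: str = "both"            # "both", "title", or "description"
    exact: bool = False            # True: whole-word matches only
    line_start_only: bool = False  # True: word must begin a line

def rule_matches(rule: BugWordRule, commit_message: str) -> bool:
    # The title is the first line; the description is everything after it.
    title, _, description = commit_message.partition("\n")
    text = {"both": commit_message, "title": title, "description": description}[rule.scope]
    # "^" with re.MULTILINE anchors to the start of any line;
    # "\b" anchors to the start of any word.
    prefix = r"^" if rule.line_start_only else r"\b"
    suffix = r"\b" if rule.exact else ""
    pattern = prefix + re.escape(rule.word) + suffix
    return re.search(pattern, text, re.IGNORECASE | re.MULTILINE) is not None
```

With the default rule for "fix", the message "This commit fixes a big problem" matches. Setting `line_start_only=True` rejects it while still accepting "Fixed big problem", and setting `exact=True` rejects "fixes" but accepts "fix".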
Sometimes a commit message incidentally mentions words that would usually indicate a bug fix without actually fixing any bug. In these cases, you can manually relabel the commit by opening it, either in the Commit Activity Browser or on its standalone page.
Click the edit-in-place next to the "Classified as" field
On its standalone page, the "Classified as" label is similarly located near the top of the page:
Allow a few minutes for calculated stat reprocessing after updating a commit to be labeled as a "Bug Fix."
There are some circumstances that may prevent a commit labeled as a “bug fix” from having the changed lines designated as bugs.
If you follow our instructions to demarcate when a deploy has occurred, we will use that information to avoid labeling interstitial work as bug fixing. Most of the time, the bugs that managers care about are the bugs that their customers experience. If a developer makes a commit and catches the bug in their work before deploying it, the commit that contained the buggy lines will not be demarcated as a bug source.
If a set of commits resolves an issue labeled as a bug, only the final, non-churned lines will be evaluated as bug-fixing. That is, if the first commit among a set of commits that reference a bug ticket is later amended, the lines in the amended commit will not themselves be treated as having bugs. Nor will interstitial commits within the bug fix work be labeled as bugs. Only lines that preceded the bug fix commits will be labeled as bug lines.
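One way to picture the deploy constraint is the following Python sketch. It assumes, purely for illustration, that a line counts as a bug source only when at least one recorded deploy falls between the commit that introduced it and the commit that fixed it. The function name `was_deployed_before_fix` is hypothetical and is not part of GitClear.

```python
from datetime import datetime
from typing import Iterable

def was_deployed_before_fix(
    origin_time: datetime,
    fix_time: datetime,
    deploy_times: Iterable[datetime],
) -> bool:
    """Return True if any deploy shipped the original line before it was fixed.

    Assumption of this sketch: a deploy at or after the originating commit,
    and strictly before the fix, means customers could have hit the bug.
    """
    return any(origin_time <= deployed < fix_time for deployed in deploy_times)
```

Under this assumption, a bug written on January 1 and fixed on January 10 is flagged if a deploy happened on January 5, but not if the only deploy came on January 12.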
After a commit has been deemed “bug fixing work,” GitClear utilizes antecedent traversal to pick out what previous work gave rise to the commit.
Often, the previous commit that featured the bug-resolving line is not the commit that created the bug. For example, if the last incidence of the line merely changed white space, or moved the line’s parent method to a new line/file, we will continue further back into the commit history until we discover a commit that meaningfully changed the line (e.g., by updating or adding it).
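The traversal described above can be sketched in Python as a walk backward through one line's history. The `LineRevision` type, its change labels, and `find_bug_origin` are hypothetical names chosen for this illustration; how GitClear actually represents line provenance is not described here.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Change kinds treated as meaningful in this sketch (an assumption based on
# the "updating or adding" wording above).
MEANINGFUL_CHANGES = {"added", "updated"}

@dataclass(frozen=True)
class LineRevision:
    commit: str   # commit identifier
    change: str   # "added", "updated", "whitespace", or "moved"

def find_bug_origin(history_newest_first: Sequence[LineRevision]) -> Optional[str]:
    """Return the most recent commit that meaningfully changed the line.

    Whitespace-only edits and moves are skipped, so the search continues
    further back in history until an add or update is found.
    """
    for revision in history_newest_first:
        if revision.change in MEANINGFUL_CHANGES:
            return revision.commit
    return None
```

For a line whose history, newest first, is a whitespace change, then a move, then an update, the update is reported as the origin rather than either of the two more recent cosmetic changes.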
There are a couple places where you can review the results from our bug-cause analysis. The first is the Directory Browser / Tech Debt Browser:
The Directory Browser & Tech Debt Browser both have a column that shows the relative frequency of bugs found in each directory & file
The Directory Browser will only show the bug density for bugs that were authored during the time range that was selected on the page.
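As a rough illustration of this kind of aggregation, the Python sketch below counts bug lines per directory, keeping only lines authored inside the selected date range. The function name, the input shape, and the choice to report each directory's share of the total are all assumptions; the exact formula behind GitClear's column is not specified here.

```python
from collections import Counter
from datetime import date
from pathlib import PurePosixPath
from typing import Dict, Iterable, Tuple

def bug_share_by_directory(
    bug_lines: Iterable[Tuple[str, date]],
    start: date,
    end: date,
) -> Dict[str, float]:
    """Return each directory's share of bug lines authored within [start, end].

    bug_lines is a hypothetical input: (file_path, authored_on) pairs, one
    per line that was later identified as a bug source.
    """
    counts = Counter(
        str(PurePosixPath(path).parent)
        for path, authored_on in bug_lines
        if start <= authored_on <= end
    )
    total = sum(counts.values())
    return {directory: n / total for directory, n in counts.items()} if total else {}
```

A bug line authored outside the selected range is ignored entirely, which mirrors the date-range behavior described above.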
Soon, you will also be able to view a graph of bug frequency over time by visiting the "Issues & Defects" tab.