A common question asked by managers is “where are all these bugs coming from?” It is challenging to generalize about which files and modules are contributing most to incoming bug reports, but GitClear has designed a mechanism that can let engineering leaders pinpoint the top culprits.


Our bug identification system builds on top of code line provenance calculations. There are three steps that imbue it with bug pinpointing powers:

Deduce which commits are fixing buggy lines

Look back through repo history to identify the last meaningful changes that were made to the lines that had to be changed to resolve a bug

Display the aggregated results of this deduction to offer managers the chance to predict future bugs before they are deployed

Below we will discuss each of these steps in greater detail.


Deduce which commits are fixing buggy lines

Choose one of three available options for classifying which commits are bug fixes


There are three tactics GitClear can use to deduce which commits are bug-resolving:

Does the commit reference an issue tracker ticket designated as a bug?

Does the commit message mention words that indicate it was resolving a bug?

Does an AI classifier identify the commit message as most likely corresponding to a bug fix?

If any of these are true, there is a good chance that the work resolves a bug. However, this is not sufficient by itself to label the work as resolving a bug. See “Additional constraints” below for the final considerations GitClear checks before labeling work as bug-resolving.


Configure which words constitute bug-resolving work

By default, GitClear will consider a commit likely to resolve a bug if any of the following phrases are found at the beginning of a word in the commit message:

Fix

Resolve

Bug

Exception

Error
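
To make the default rule concrete, here is a minimal sketch of "found at the beginning of a word" matching. It is not GitClear's implementation: the function name `looks_like_bug_fix`, the case-insensitive comparison, and the use of a regular-expression word boundary are all assumptions made for illustration.

```python
import re

# Default prefixes taken from the list above.
DEFAULT_PREFIXES = ("fix", "resolve", "bug", "exception", "error")


def looks_like_bug_fix(message, prefixes=DEFAULT_PREFIXES):
    """Return True if any word in the message begins with one of the prefixes.

    Hypothetical helper: case-insensitive matching is an assumption.
    """
    alternatives = "|".join(re.escape(prefix) for prefix in prefixes)
    # \b anchors the match at the start of a word; any characters may follow,
    # so "fix" also matches "fixes" and "fixed" but not "prefix".
    pattern = re.compile(rf"\b(?:{alternatives})", re.IGNORECASE)
    return pattern.search(message) is not None
```

Under these assumptions, "Fixed crash on login" and "Handle errors gracefully" would both be flagged, while "Refactor prefix handling" would not, because "fix" does not begin a word there.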

If you want to set up different words that designate a bug fix, visit Settings > Detect Bug Fixes. Ensure that "Set custom bug fix words" is chosen in the "How are bugs defined?" section. Then, at the bottom of the page, you will have a form available to enter the details for a new string that indicates the commit is resolving a bug.


When "Set custom bug fix words" is chosen, you can set up specific words that designate that a commit fixes a bug


Here, you can specify alternate words that indicate work that is probably resolving a bug. In addition to adding your own dictionary of bug fix words, you can also specify other conditions for whether the word qualifies.


Where can the word be mentioned?

By default we search the commit title (first line of a commit message) and description (all other lines). You can switch this so that the word only counts if found in the first line, or subsequent lines.


Can the word have an appendix?

By default, we will match any word that begins with the characters provided. This means that if you specify the string “fix,” we would also match “fixes,” “fixed” and “fixing” (but not “bugfix,” since that word doesn’t begin with the provided string). If you only want to allow exact matches, that can be configured.


Where can the word occur?

By default, we will look for the string you provide at the beginning of every word in the commit message. This can be changed such that we only match a word that begins a line. For the “fix” example, that would mean instead of “This commit fixes a big problem,” the commit message would have to be “Fixed big problem.”
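
The three options above can be pictured as one small rule object. The sketch below is only an illustration of how the settings interact; the class name `BugFixWordRule` and its field names are hypothetical and do not correspond to any GitClear API or setting name.

```python
import re
from dataclasses import dataclass


@dataclass
class BugFixWordRule:
    """Hypothetical model of one custom bug fix word and its conditions."""

    word: str
    search_in: str = "both"        # "title", "description", or "both"
    allow_suffix: bool = True      # if True, "fix" also matches "fixes", "fixed"
    line_start_only: bool = False  # if True, the word must begin a line

    def matches(self, message: str) -> bool:
        # The title is the first line; the description is everything after it.
        title, _, description = message.partition("\n")
        if self.search_in == "title":
            text = title
        elif self.search_in == "description":
            text = description
        else:
            text = message
        # Anchor at the start of a line, or at the start of any word.
        start = r"^[ \t]*" if self.line_start_only else r"\b"
        # Without suffixes, the word must also end at a word boundary.
        end = "" if self.allow_suffix else r"\b"
        pattern = re.compile(
            start + re.escape(self.word) + end,
            re.IGNORECASE | re.MULTILINE,
        )
        return pattern.search(text) is not None
```

For example, `BugFixWordRule("fix")` would match "This commit fixes a big problem," whereas `BugFixWordRule("fix", line_start_only=True)` would match only a message such as "Fixed big problem."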


Change the labeling of a commit

Sometimes a commit message incidentally mentions words that would usually indicate a bug fix without actually fixing any bug. In these cases, you can manually relabel the commit by opening it, either in the Commit Activity Browser or on its standalone page.


Click the edit-in-place next to the "Classified as" field


On its standalone page, the "Classified as" label is similarly located near the top of the page:



After relabeling a commit as a "Bug Fix," allow a few minutes for the calculated stats to be reprocessed.


Additional constraints

There are some circumstances that may prevent a commit labeled as a “bug fix” from having the changed lines designated as bugs.


If deploy tracking is set up, the buggy commit must have been deployed

If you follow our instructions to demarcate when a deploy has occurred, we will use that information to avoid labeling interstitial work as bug fixing. Most of the time, the bugs that managers care about are the bugs that their customers experience. If a developer makes a commit and catches the bug in their work before deploying it, the commit that contained the buggy lines will not be demarcated as a bug source.
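
One way to read the deploy constraint is as a simple time-window check: the buggy commit only counts if at least one deploy happened after it was authored and before the fix landed. The sketch below encodes that reading. The function name `was_deployed_before_fix` and the exact comparison are assumptions, not a description of GitClear's internals.

```python
from datetime import datetime


def was_deployed_before_fix(authored_at, fixed_at, deploy_times):
    """Return True if any deploy shipped the commit before its fix landed.

    Hypothetical helper: assumes a commit counts as deployed when a deploy
    occurs at or after its authoring time and before the fixing commit.
    """
    return any(authored_at <= deployed_at < fixed_at for deployed_at in deploy_times)
```

With a single deploy on March 10, a commit authored March 1 and fixed March 15 would qualify as a bug source under this reading, while a commit authored March 11 and fixed March 15 would not, since the bug never reached customers.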


Work-in-progress lines authored while addressing a bug ticket won’t be labeled bugs

If a set of commits resolves an issue labeled as a bug, only the final, non-churned lines will be evaluated as bug-fixing. That is, if the first commit among a set of commits that reference a bug ticket is later amended, the lines in the amended commit will not themselves be treated as bugs. Nor will lines in interstitial commits within the bug fix work be labeled as bugs. Only lines that preceded the bug fix commits will be labeled as bug lines.


Look through repo history to designate buggy lines

After a commit has been deemed “bug fixing work,” GitClear utilizes antecedent traversal to pick out which previous work introduced the lines that the commit had to change.


Often, the most recent commit that touched a bug-resolving line is not the commit that created the bug. For example, if the last change to the line merely changed white space, or moved the line’s parent method to a new line or file, we will continue further back into the commit history until we discover a commit that meaningfully changed the line (e.g., by updating or adding it).
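
The traversal described above can be sketched as a walk backward through a line's history that skips non-meaningful changes. This is a simplified illustration only: the `LineChange` type, the four change kinds, and the function `find_bug_origin` are hypothetical names, and real line provenance is considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LineChange:
    """Hypothetical record of one commit touching a line."""

    commit: str
    kind: str  # "added", "updated", "whitespace", or "moved"


# Only these kinds are assumed to count as meaningful changes.
MEANINGFUL_KINDS = {"added", "updated"}


def find_bug_origin(history: List[LineChange]) -> Optional[str]:
    """Return the most recent commit that meaningfully changed the line.

    `history` is ordered newest to oldest and excludes the bug-fixing commit.
    """
    for change in history:
        if change.kind in MEANINGFUL_KINDS:
            return change.commit
    # No meaningful change found in the recorded history.
    return None
```

Given a history of a whitespace change, then a move, then an update, the sketch skips the first two and attributes the bug to the commit that updated the line.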


Display where & when buggy lines originated

There are a couple of places where you can review the results from our bug-cause analysis. The first is the Directory Browser / Tech Debt Browser:


The Directory Browser & Tech Debt Browser both have a column that shows the relative frequency of how many bugs have been found in each directory & file


The Directory Browser will only show the bug density for bugs that were authored during the time range that was selected on the page.
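
As a rough picture of the aggregation behind that column, the sketch below counts buggy lines per directory, keeping only lines authored inside the selected time range. The function name `bug_lines_by_directory` and the input shape (pairs of file path and authoring date) are assumptions for illustration, and it produces raw counts rather than whatever relative frequency the browser displays.

```python
from collections import Counter
from datetime import date
from pathlib import PurePosixPath


def bug_lines_by_directory(bug_lines, start, end):
    """Count buggy lines per directory for lines authored within [start, end].

    `bug_lines` is an iterable of (file_path, authored_on) pairs; this input
    shape is hypothetical.
    """
    counts = Counter()
    for path, authored_on in bug_lines:
        # Filter on when the buggy line was authored, not when it was fixed.
        if start <= authored_on <= end:
            counts[str(PurePosixPath(path).parent)] += 1
    return counts
```

Note that the filter applies to the authoring date of the buggy line, which mirrors the statement above that only bugs authored during the selected range are shown.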


Soon, you will also be able to view a graph of bug frequency over time by visiting the "Issues & Defects" tab.