When customers are becoming acquainted with GitClear, it's hard to build an instinct for what "normal" Diff Delta fluctuations are, and for what expectations should exist around data consistency over time. This article describes the "life cycle" of Diff Delta: when it's initially evaluated, when it gets re-evaluated, and how it usually changes over time. Finally, if you find what you believe to be an anomaly in Diff Delta, we'll offer some tips to help get the issue investigated and resolved promptly.
Wherein we review how Diff Delta is initially evaluated, how it's propagated, and how it changes over time.
The Diff Delta factors page describes much of what GitClear evaluates when making a first-pass approximation at Diff Delta. The single biggest factor in the Diff Delta assigned is the age of the code being changed or replaced. Code that has been in the repo for years (possibly getting shuffled between files) requires the most cognitive energy to understand, so deleting and refactoring such code tends to earn a premium per line, all other factors being equal.
On the opposite end of the spectrum, code that deletes or replaces lines that were changed as recently as earlier today is what is usually known as "churn," and such lines are ascribed very little Diff Delta. Understanding how we handle "churned" code underpins the expectations around how Diff Delta will change after its initial evaluation.
It's common that, in the course of developing a PR, a developer might add, remove, or change a particular line across several commits. GitClear detects this common development pattern and "stretches" the work that ultimately got done across the commits. For example, if we follow a single line through an entire PR, the ultimate Diff Delta calculated might be "10 points for deleting an old line," and "10 points for replacing it with a totally new line of code" that took 6 commits to finalize. In this example, this single line from a file would get 20 Diff Delta ascribed, divided by 6 commits in which it was changed -- so, a little more than 3 Diff Delta per line change per commit.
When the line is first authored, it might already be perfect. Some developers only push commits after thorough QA and testing. In the example above, on the first day the line was changed, it would have been assigned the full 20 points. But then, each subsequent day the code is modified, the Diff Delta ascribed to that first day's work is diminished. If the line ultimately changed 10 more times, then eventually the work from that first day would be worth only about 2 Diff Delta. But it might be a week or two before that outcome is known, because it depends on how much more the code needed to be polished before it made its way into master and stopped being changed.
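The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of the even split described above, not GitClear's actual implementation; the function name `per_commit_share` and the assumption that the total is divided equally among commits are hypothetical.

```python
def per_commit_share(total_diff_delta: float, commit_count: int) -> float:
    """Evenly split a line's final Diff Delta across the commits that changed it.

    Hypothetical helper: an even split is an assumption made for illustration.
    """
    if commit_count < 1:
        raise ValueError("a line must be changed by at least one commit")
    return total_diff_delta / commit_count


# 10 points for deleting the old line plus 10 for its replacement.
TOTAL = 10 + 10

first_day = per_commit_share(TOTAL, 1)       # one commit so far: the full 20.0
after_six = per_commit_share(TOTAL, 6)       # about 3.33 per commit
after_eleven = per_commit_share(TOTAL, 11)   # first commit + 10 more: about 1.82
```

The value credited to the first day's commit shrinks as later commits touch the same line, which is why recent Diff Delta tends to settle only after the code stops changing.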
After Diff Delta is a month old, it generally shouldn't change much. There are, however, a handful of reasons that Diff Delta might change after one month:
Date range, commits, or active committers for the entity are added. Evaluating more code means that the per-time-interval values will change for the entity or organization in which the changed repo(s) reside.
Committers are exiled. Removing active committers reduces the Diff Delta reported for those repos.
Branches are discarded. If code in a branch is never merged to master, the Diff Delta initially ascribed to it will eventually be stripped (about 2 months after activity has been deemed stale). Additionally, if a user later force pushes a branch with its commits removed, those commits will stop contributing to Diff Delta unless they also exist on the main branch ("force pushed to oblivion" is the explanation GitClear messages use to describe this).
Multipliers are changed (e.g., to create incentives). Diff Delta is a configurable quantity that is often used as a way to incentivize desired behavior within a team. When one changes Diff Delta multipliers (through Code Domains, code file types, or other mechanisms), all of the Diff Delta for the affected repos will eventually be recalculated.
Commits are duplicated. If some change is later duplicated (i.e., we detect identical changes made to a file across branches), then the value for one of the commits is negated, depending on the lineage of the commit authorship. Earlier authorship is preferred, even if that commit was committed after the later-authored duplicate.
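The duplicate-commit rule can be pictured with a small sketch. The `Commit` class and `negate_duplicates` function below are hypothetical names, and the rule shown (keep the earliest-authored commit, zero out the others) is a simplified reading of the description above rather than GitClear's real logic.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class Commit:
    """Hypothetical record of a commit and the Diff Delta ascribed to it."""
    sha: str
    authored_at: datetime
    committed_at: datetime
    diff_delta: float


def negate_duplicates(duplicates: list[Commit]) -> list[Commit]:
    """Keep Diff Delta on the earliest-authored duplicate; zero out the rest."""
    if not duplicates:
        return []
    keeper = min(duplicates, key=lambda c: c.authored_at)
    return [c if c is keeper else replace(c, diff_delta=0.0) for c in duplicates]


# "b2" was authored first but committed last (for example, after a rebase).
dupes = [
    Commit("a1", datetime(2024, 3, 5), datetime(2024, 3, 5), 12.0),
    Commit("b2", datetime(2024, 3, 1), datetime(2024, 3, 9), 12.0),
]
resolved = negate_duplicates(dupes)
```

In this sketch `b2` keeps its 12.0 Diff Delta and `a1` drops to 0.0, because authorship order, not commit order, decides which duplicate is credited.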
Beyond the typical life cycle changes of Diff Delta, there are a couple of other mechanisms whereby its results can be perceived to "change" over time.
Different reports on GitClear consume Diff Delta data through various caching mechanisms that can sometimes take up to 24 hours to catch up with Diff Delta ascribed to the most recently calculated commits. For chart values less than a day or two old, take their results with a grain of salt.
All Diff Delta on GitClear is ascribed to the date on which the commit was "authored" (committed on the developer's local machine), not when it was "committed" -- a timestamp used by providers like GitHub to describe when the commit happened. GitClear believes that, since the "committed at" time can change frequently based on arbitrary factors (e.g., rebasing), it makes most sense to use authorship time as the definitive source of truth on when the work was done.
This decision means that sometimes Diff Delta can appear to "pop up" from days or weeks ago, if the committer chose not to push their commits for a protracted amount of time.
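A short sketch shows why work can seem to appear retroactively. The function `diff_delta_by_authored_day` and the tuple layout are hypothetical; the only behavior taken from this article is that totals are keyed on the authored date rather than the committed date.

```python
from collections import defaultdict
from datetime import date, datetime


def diff_delta_by_authored_day(commits):
    """Sum Diff Delta per calendar day, keyed on authored time.

    Each item is a hypothetical (authored_at, committed_at, diff_delta) tuple;
    committed_at is carried along but deliberately ignored.
    """
    totals = defaultdict(float)
    for authored_at, _committed_at, diff_delta in commits:
        totals[authored_at.date()] += diff_delta
    return dict(totals)


commits = [
    # Authored on March 1, but rebased and pushed two weeks later.
    (datetime(2024, 3, 1, 9, 0), datetime(2024, 3, 15, 17, 0), 40.0),
    # Authored and committed on the same day.
    (datetime(2024, 3, 15, 10, 0), datetime(2024, 3, 15, 17, 5), 25.0),
]
by_day = diff_delta_by_authored_day(commits)
```

Here the first commit's 40.0 lands on March 1 even though nobody could see it until March 15, so a chart for early March gains value after the fact. In Git itself, the two timestamps can be compared with `git log --format='%h %ad %cd'`, where `%ad` is the author date and `%cd` is the committer date.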
The Diff Delta for any commit can be manually changed by any user who is designated as a "Lead Developer" or above in the User settings. If the value of a commit is explicitly changed, that will be indicated in a prominent popup when the commit is visited.
Sometimes, even taking account of the normal reasons that Diff Delta changes, a manager might observe changes that seem suspect, or outright wrong. Here are some options we recommend for these situations.
Diff Delta is valuable only inasmuch as users can learn to trust it. We created this help page to help our users become expert in understanding the different paths by which Diff Delta can be expected to change. But this does not mean that everything is always perfect. We rely on our users to help us reproduce the anomalies they observe, so we can fix those anomalies with tests and prevent them from recurring.
Please email email@example.com with at least one screenshot of the incorrect report and the URL at which we can observe the anomaly you have spotted. We will usually respond to all such reports within one business day. If further investigation is required, it may take up to a week, but resolving anomalies in Diff Delta, or any of the reports that present Diff Delta, is our highest priority (i.e., before adding new features, we fix all reproducible data anomalies with tests).
Since Diff Delta is denormalized to varying cached formats to be shown across contexts like the Directory Browser and Hourly Diff Delta, it's not uncommon that the propagation from when Diff Delta is calculated to when it is reflected in these graphs can be a source of perceived anomalies. If you locate a particular report that doesn't seem to be integrating Diff Delta that you can see should be present (i.e., by visiting a developer's Diff Delta historical graph), please see item #1 and send us a URL so that we can evaluate why the report has failed to stay in sync.
If you're too busy to wait for data to resolve, GitClear provides a button under Settings -> Data Processing to "Regenerate cached stats." Upon clicking this, we will begin to regenerate your Diff Delta stats from scratch. Please allow 1-3 days for this reprocessing to complete.