Line Impact Calculation

Aside from the most primitive of metrics, "commits" and "lines of code," as of 2020 there are precious few options to assess the rate at which software evolves.


According to a conversation I had in early 2020 with Dr. Alain Abran, no substantive progress has been made toward measuring software development since the early 1990s. The last widely studied software metric was "Function Points," but their adoption has been limited by a high bar to access and lack of commercial availability.


Line Impact takes a new approach. It integrates seamlessly with any git repo and any programming language, without requiring developers to change their habits. It works retroactively, such that the Line Impact of code can be measured over any historical time period, so long as the git commit history remains available. It's easy to access for commercial customers: within a few hours of enabling measurement, a company can see years of historical data, including the rate at which their code has evolved over time, and where their tech debt resides.


Line Impact is a distant cousin of "Lines of Code," but unlike LoC, which is 95% noise, Line Impact is stable across programming languages. This is achieved through a combination of overlapping measures that work together to strip the noise from LoC. Here are some of those measures:


1. Exclude auto-generated files and directories where non-code activity happens within a repo

2. Exclude commits that happen on branches that are discarded without being merged to master, or branches that are specific to the software release cycle, like the production branch

3. Exclude value assigned to third-party libraries added to the repo

4. Recognize per-language idioms that do not convey meaningful repo evolution, such as keywords, include statements, and whitespace changes

5. Recognize per-language comment syntax, assigning changed comment lines about 10% of the value of a changed non-comment line

6. Process code on a per-commit basis to assign one of the following operations to each change: addition, deletion, update, move, copy/paste, find/replace, or no-op

7. Identify and exclude value assigned to "recycled" lines that are deleted and then re-added in a commit made shortly afterward

8. Assign Line Impact value based on the length of time since the last meaningful change to the line

9. Assign Line Impact value relative to the uniqueness of the changed line. For example, when changes occur in CSS files, these lines tend to be short and oft-repeated, and so are interpreted to convey a fraction of the evolutionary significance of a changed line in Ruby or Python. Another example is a line like "})" in JavaScript, which isn't technically a keyword, but any repo of moderate size will see this line repeated hundreds of times, so its evolutionary impact can be interpreted as negligible

10. Ignore work committed as part of a rebase or merge: value is assigned only once -- in the commit where the code was originally added, updated, or removed
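
To make the filtering measures in the list above more concrete, here is a minimal Python sketch of three of them: skipping generated or third-party paths, recognizing trivial per-language lines, and detecting whitespace-only changes. The directory names, file suffixes, regular expressions, and function names are all hypothetical placeholders chosen for illustration; they are not the rules Line Impact actually uses.

```python
import re
from pathlib import PurePosixPath

# Hypothetical exclusion rules, for illustration only.
EXCLUDED_DIRS = {"vendor", "node_modules", "dist", "build"}
GENERATED_SUFFIXES = (".min.js", ".lock", ".map")

# Hypothetical per-language patterns for lines that say little on their own.
TRIVIAL_LINE_PATTERNS = {
    "python": re.compile(r"^\s*(import\s+\S+|from\s+\S+\s+import\s+.+|pass|else:)\s*$"),
    "javascript": re.compile(r"^\s*(\}\)?;?|\{|else\s*\{?|import\s+.+)\s*$"),
}


def is_excluded_path(path: str) -> bool:
    """True if the file sits in an excluded directory or looks generated."""
    parts = PurePosixPath(path).parts
    if any(part in EXCLUDED_DIRS for part in parts[:-1]):
        return True
    return path.endswith(GENERATED_SUFFIXES)


def is_trivial_line(line: str, language: str) -> bool:
    """True for blank lines and lines matching the language's trivial pattern."""
    if not line.strip():
        return True
    pattern = TRIVIAL_LINE_PATTERNS.get(language)
    return bool(pattern and pattern.match(line))


def is_whitespace_only_change(old: str, new: str) -> bool:
    """Crude check: the lines differ, but only in their whitespace."""
    return old != new and "".join(old.split()) == "".join(new.split())
```

Under these assumptions, `is_excluded_path("vendor/lib/foo.rb")` and `is_trivial_line("})", "javascript")` both return `True`, so neither change would earn any value. The whitespace check is deliberately simple: it would also ignore whitespace edits inside string literals, which a real implementation would probably want to treat differently.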


These and other tactics are enumerated in somewhat more detail on the Line Impact Factors page. A single changed line ranges in value from 0 to 30 Line Impact. The factors that tend to have the greatest cumulative effect on Line Impact are #6 (the operation assigned to each change) and #8 (the length of time since the last meaningful change to the line).
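
As a rough illustration of how such factors might combine into a single per-line score, here is a small Python sketch. Only two numbers come from the text above: the 0 to 30 range and the roughly 10% weight for comment lines. Everything else is an assumption made for the example: the operation weights, the base value, the shape of the age and uniqueness multipliers, the idea that older lines are worth more when changed, and all of the names. The actual Line Impact formula is not described here and may differ substantially.

```python
from dataclasses import dataclass

MAX_LINE_IMPACT = 30.0  # from the text: one changed line is worth 0 to 30
COMMENT_WEIGHT = 0.10   # from the text: comments are worth about 10%

# Hypothetical weights per operation type (assumed, not from the source).
OPERATION_WEIGHTS = {
    "addition": 1.0,
    "update": 0.8,
    "deletion": 0.5,
    "move": 0.2,
    "copy_paste": 0.1,
    "find_replace": 0.1,
    "no_op": 0.0,
}


@dataclass
class ChangedLine:
    operation: str                 # one of the OPERATION_WEIGHTS keys
    is_comment: bool
    days_since_last_change: float  # age of the line before this change
    repeat_count: int              # how often identical text appears in the repo
    recycled: bool = False         # deleted and re-added shortly afterward


def age_multiplier(days: float, cap_days: float = 365.0, max_multiplier: float = 3.0) -> float:
    """Assumed: value grows linearly with line age, up to a cap."""
    fraction = min(max(days, 0.0), cap_days) / cap_days
    return 1.0 + (max_multiplier - 1.0) * fraction


def uniqueness_multiplier(repeat_count: int) -> float:
    """Assumed: value shrinks in proportion to how often the line repeats."""
    return 1.0 / max(repeat_count, 1)


def line_impact(line: ChangedLine, base: float = 10.0) -> float:
    """Combine the assumed factors and clamp the result to the 0 to 30 range."""
    if line.recycled:
        return 0.0
    score = base * OPERATION_WEIGHTS.get(line.operation, 0.0)
    score *= age_multiplier(line.days_since_last_change)
    score *= uniqueness_multiplier(line.repeat_count)
    if line.is_comment:
        score *= COMMENT_WEIGHT
    return max(0.0, min(score, MAX_LINE_IMPACT))
```

With these made-up weights, a unique line added over year-old code scores the maximum of 30, the same change to a comment scores about 3, and a line such as "})" that repeats 200 times scores close to zero. Multiplying the factors is one simple design choice among many; it has the convenient property that any single factor can drive the score to zero.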