Abstract

Modern code review holds a multifaceted role in improving code quality, finding defects, and sharing knowledge between team members. CodeGrip's 2022 survey of "1,000+ CxOs and developers" indicates that some form of code review is utilized by 84% of companies [1]. As a business process, pull requests have surged to near ubiquity. Unfortunately, that surge has not been paired with a corresponding flourish in tooling improvements. GitHub, Bitbucket, GitLab, et al. rely on the archaic Myers diff algorithm (published 1986) to characterize the changes within a pull request. This means that every line change has a binary designation as either "addition" or "deletion," just as it has since 1986.


In this research, we analyze the extent to which updating the venerable Myers algorithm might reduce code review time. According to CodeGrip, the typical developer in 2022 spent 2-5 hours per week on code review, with 30% of respondents reporting more than 5 hours per week [1]. In a 40-hour work week, this implies that more than 10% of the entire week is consumed by code review.


To demonstrate how an evolved Myers algorithm could function, GitClear released a pull request review tool in June 2024 that recognizes six types of code changes (add, delete, move, find/replace, copy/paste, update). Possessing more granular change detail allows many lines that the Myers algorithm would designate as "changes" to be recognized as effectively no-ops.


This paper utilizes a two-pronged method to assess the efficacy of applying precise change data to reduce pull request size:

Delta in Changed Lines Shown - What is the gross count of "changed lines" associated with the pull request? This method compares line counts between granular diff analysis (GitClear) and the Myers diff algorithm (GitHub) across 12,638 pull requests, suggesting the potential for a 22-31% reduction in lines to review.

Code Review Duration - How much time does an experienced developer take to review a pull request? This method uses 48 research participants to contrast "time to complete review" and "percent of applicable bugs recognized" across diff tooling.


Finally, we conclude by assessing the actionability of the results collected in this research, and consider how emerging strategies such as "AI summarization" and "data benchmarking" may further expedite the code review process.




Modern Code Review: State of Current Tooling

Peer code review, a manual source code inspection by developers other than the author, is generally recognized as a valuable tool for reducing software defects and improving the quality of software projects [3].


The original ICICLE tool, or "Intelligent Code Inspection in a C Language Environment," developed by researchers at Bellcore, paved a new way for developers to review code, and many of its original concepts are still used in today's most popular tools.


Rietveld Code Review Tool (2008) vs. GitHub Code Review Tool (2024)


Built on the Venerable Myers Algorithm

In 1986, Eugene Myers published An O(ND) Difference Algorithm and Its Variations, which unified the problems of finding the longest common subsequence of two sequences and finding the shortest edit script for transforming one sequence into another. Myers showed that these problems were equivalent to finding the shortest path over an "edit graph." [7]
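To make that binary vocabulary concrete, the sketch below implements a simple LCS-based line diff in plain Ruby. It illustrates the shape of output that Myers-style tools emit (every changed line becomes a "-" deletion or "+" insertion); it is not git's optimized O(ND) implementation, and the example file names are placeholders:

# ruby
# Illustration only: an LCS-based line diff that reduces every change to the
# binary vocabulary Myers-style tools emit ("-" deletion, "+" insertion).
def line_diff(old_lines, new_lines)
  n, m = old_lines.length, new_lines.length
  # lcs[i][j] = length of the longest common subsequence of old_lines[i..] and new_lines[j..]
  lcs = Array.new(n + 1) { Array.new(m + 1, 0) }
  (n - 1).downto(0) do |i|
    (m - 1).downto(0) do |j|
      lcs[i][j] = if old_lines[i] == new_lines[j]
        lcs[i + 1][j + 1] + 1
      else
        [lcs[i + 1][j], lcs[i][j + 1]].max
      end
    end
  end

  # Walk the table, emitting a shortest edit script of deletions and insertions
  edits, i, j = [], 0, 0
  while i < n && j < m
    if old_lines[i] == new_lines[j]
      edits << [" ", old_lines[i]]; i += 1; j += 1
    elsif lcs[i + 1][j] >= lcs[i][j + 1]
      edits << ["-", old_lines[i]]; i += 1
    else
      edits << ["+", new_lines[j]]; j += 1
    end
  end
  old_lines[i...n].each { |line| edits << ["-", line] }
  new_lines[j...m].each { |line| edits << ["+", line] }
  edits
end

line_diff(File.readlines("old.rb", chomp: true), File.readlines("new.rb", chomp: true)).each do |op, line|
  puts "#{op} #{line}"
end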


Myers' algorithm improved the popular diff utility, a data comparison tool that displays the smallest set of line-by-line deletions and insertions needed to transform one file into another [7]. The diff utility is incorporated in the git version control system, a dominant choice for developers today, with up to 90% of professional developers listing it as a regular part of their toolset [8], [9].


2022 Developer Survey - Stack Overflow

2023 Developer Ecosystem - JetBrains



On top of git, git diff, and the Myers algorithm, modern code review tools emerged to provide an informal, asynchronous, and convenient way to adopt the code review process.


According to the 2023 JetBrains survey, GitHub, GitLab, and Bitbucket hold the top ranks as the "most popular version control systems" and the de facto diff tools. Azure DevOps joins these stalwarts as the fourth most popular tool for code review in 2023:



Due to their intrinsic reliance on the Myers algorithm - the default algorithm used to process diffs - GitHub, GitLab, and Bitbucket are referred to as "standard diff viewers" within the scope of this research.


GitClear's "Commit Cruncher" Diff Algorithm

In 2020, GitClear began building a novel diff algorithm, referred to as the "Commit Cruncher," to present code changes in terms more familiar to modern developers. The six types of change operations found in GitClear's diff viewer are detailed in GitClear's documentation [10]. They are "Added," "Deleted," "Updated," "Moved," "Find/Replaced," and "Copy/Pasted" (the latter featured prominently in GitClear's popular AI Code Quality research).


The following diff visually illustrates how recognizing precise change operations allows reductions to "changed lines to review":


This is the exact same code diff, presented with GitHub (left) vs GitClear (right)


The extent to which Commit Cruncher reduces lines to read vs. the Myers algorithm depends on the prevalence of code operations that GitClear can distill. In the following section, we use a large sample of pull requests to calculate the frequency of change operations that can be condensed without reducing change fidelity.
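As a rough illustration of one such distillation, the sketch below re-labels a deletion/insertion pair of identical lines as a "move." GitClear's actual Commit Cruncher heuristics are considerably more involved and are not reproduced here; this is only a conceptual approximation using hypothetical labels:

# ruby
# Conceptual post-processing pass, NOT GitClear's actual algorithm: a line that
# is deleted in one place and re-inserted verbatim elsewhere is re-labeled as
# "moved" instead of counting as one deletion plus one addition.
def classify_moves(edits)
  deleted = Hash.new { |h, k| h[k] = [] }
  added   = Hash.new { |h, k| h[k] = [] }

  edits.each_with_index do |(op, line), idx|
    deleted[line.strip] << idx if op == "-"
    added[line.strip]   << idx if op == "+"
  end

  classified = edits.map { |op, line| [op, line] }
  deleted.each do |content, del_indices|
    del_indices.zip(added[content]).each do |del_idx, add_idx|
      next unless add_idx
      classified[del_idx][0] = "moved-from"
      classified[add_idx][0] = "moved-to"
    end
  end
  classified
end

edits = [["-", "def total"], ["+", "def total"], ["+", "  # new comment"]]
classify_moves(edits).each { |op, line| puts "#{op}\t#{line}" }
# The first two edits are re-labeled moved-from / moved-to; the comment line stays a plain addition.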


Another example where Myers falls short of the mark is when whitespace is involved, such as in this change:




The equivalent diff with Commit Cruncher:




The latter is able to recognize whitespace changes as trivial, so the extra indent created by wrapping the method body is not presented as a broad sea of red and green.
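A minimal sketch of that idea, assuming the simplest possible heuristic (two lines that differ only in whitespace constitute a trivial update rather than a delete plus an add); Commit Cruncher's real rules are more nuanced than this:

# ruby
# Treat a deletion/insertion pair that differs only in whitespace as a trivial,
# indentation-only update rather than a full delete-plus-add.
def whitespace_only_change?(old_line, new_line)
  old_line.gsub(/\s+/, " ").strip == new_line.gsub(/\s+/, " ").strip
end

old_line = "return total if items.empty?"
new_line = "    return total if items.empty?"  # re-indented after wrapping in a block

puts whitespace_only_change?(old_line, new_line)  # => true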


Empirical Comparison Method: Delta in Changed Lines Shown

To assess the impact of GitClear's Commit Cruncher diff algorithm on the code review process, GitClear collected measurements of how many changed lines were shown across 12,000+ GitHub pull requests. The GitHub API's compare endpoint was used to capture the count of "added" and "deleted" lines shown to the user within each pull request (i.e., for each pair of SHA endpoints).
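For reference, the snippet below shows one way such counts can be retrieved from the compare endpoint. The repository name, SHAs, and token are placeholders rather than the repositories studied here, and this is not GitClear's internal collection code:

# ruby
require "net/http"
require "json"

# Sum the "added" and "deleted" line counts that GitHub's Myers-based diff
# reports for a base...head comparison. Repo, SHAs, and token are placeholders.
def github_changed_line_count(repo, base_sha, head_sha, token)
  uri = URI("https://api.github.com/repos/#{repo}/compare/#{base_sha}...#{head_sha}")
  request = Net::HTTP::Get.new(uri)
  request["Accept"] = "application/vnd.github+json"
  request["Authorization"] = "Bearer #{token}"

  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(request) }
  payload = JSON.parse(response.body)

  # Each entry in "files" reports additions and deletions for that file
  # (very large comparisons paginate their file lists, which this sketch ignores).
  (payload["files"] || []).sum { |file| file["additions"] + file["deletions"] }
end

puts github_changed_line_count("octocat/hello-world", "base_sha", "head_sha", ENV["GITHUB_TOKEN"])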


12,638 Pull Requests Suggest Median 27% Fewer Lines to Review

GitHub's "changed line count" was compared to the changed lines of code as measured by GitClear's Commit Cruncher (diff processing engine). The "changed line count" for a GitClear diff is defined as "the count of lines that would be seen if the user opened a pull request on GitClear with the default user settings." Since GitClear began collecting this comparative data in mid-May 2024, about 13,000 pull requests have been cataloged with their "changed line count" as derived from both GitClear and GitHub. This count is shown by default atop GitClear's diff viewer, with a detailed file-by-file breakdown of the changed line comparison.

Aggregating the 12,638 pull requests by their overall size (as approximated by Diff Delta) shows an increasing percentage reduction in "changed lines to review" as the size of the pull request grows:


GitClear reduces "changed lines to review" across all sizes of pull requests, across the 12,638 samples analyzed to date


The sources of these pull requests were split roughly 50/50 between "GitClear customers who opted into sharing Industry Stats" and "popular open-source repos" (specifically, those found within GitClear's Open Repos section).


A detailed description of the database queries used to retrieve this data can be found in [A6], showing that the sample was not cherry-picked and supporting reproducibility in possible future research.


The data shows that developers reviewing with GitClear and its "Commit Cruncher" algorithm read, on average, 22% to 29% fewer changed lines per pull request. The median difference between "Myers" and "Commit Cruncher" ranges from 27% to 31%, depending on the total magnitude of the change set.
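For readers who want to re-derive these aggregates, the sketch below consumes the open_repo_prs.csv export produced by the task in [A6] and reports the mean and median percentage of extra lines a Myers diff displays relative to Commit Cruncher. It recomputes the ratio directly from the raw counts and does not reproduce the Diff Delta bucketing used in [A4]:

# ruby
require "csv"

# Re-derive the aggregate reduction from the open_repo_prs.csv export in [A6].
# "Extra percent" = how many more lines the Myers diff shows, relative to
# Commit Cruncher, for the same pull request.
rows = CSV.read("open_repo_prs.csv", headers: true)

extra_percents = rows.map do |row|
  myers    = row["Myers lines"].to_f
  cruncher = row["Cruncher lines"].to_f
  (myers - cruncher) / cruncher * 100.0
end

mean   = extra_percents.sum / extra_percents.size
median = extra_percents.sort[extra_percents.size / 2]  # upper-middle element as a simple median

puts format("PRs: %d  mean extra lines: %.1f%%  median extra lines: %.1f%%",
            extra_percents.size, mean, median)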


link"Less Code Shown" Not Necessarily Advantageous

Though GitClear consistently presents fewer changed lines to review, this reduction comes with two important caveats. The first is that "fewer changed lines to review" doesn't necessarily translate to "less time to review code." For example, if the smaller pull requests were presented on a page that loaded 40% more slowly, the benefit could be nullified.


The second (perhaps more realistic?) caveat is that "reviewing code faster" may be expected to reduce the likelihood that potential bugs are detected.


The first prospective caveat is refuted by existing studies in this field. Two such studies find that "the patch size affects the duration in most of the analyzed cases" [11] and that "the patch size negatively affects all outcomes of code review that we consider as an indication of effectiveness - duration, comments left, reviewer activity" [12].


Analyzing the possibility that "faster code reviews mean fewer issues detected" is the primary subject of the following sections.


Interview-Based Comparison Method: Human Observation Data

While the raw line change counts strongly suggest that less time will be required to review code on GitClear, we sought to design an experimental method that would allow us to answer additional questions:


1) Do the reductions in "lines to review" translate to a corresponding reduction in "time to review"?

2) If so, does reviewing the pull request in less time have an observable negative or positive impact on the percentage of bugs that are discovered?


How to Correctly Assess Code Review Duration?

We define the review duration as the period of time a reviewer needs to review all the files inside a pull request. The duration does not include time spent leaving comments or navigating the UI. This metric is a narrower subset of the review-duration definitions previously cited, which measure review duration as "how many days the code review process lasted, from the day that the source code is available to be reviewed to the day that it received the last approval of a reviewer" [12].

Review duration is inherently difficult to assess, due to the distinct approaches and preferences teams and developers apply when conducting a code review. The duration and the quality of a review are tightly linked, with the former impacting the latter and determining whether the result is a superficial or an in-depth code review. The noise generated by a reviewer's style and previous experience means that simply recording the duration of a review would not lead to accurate results. An additional metric is required to enable an apples-to-apples comparison of the duration of reviews of similar quality and thoroughness.


The next obstacle is that there is no clearly defined, ready-to-track metric for review quality and thoroughness. Most of the time, it is inferred from the results expected to follow the review. A study by Bacchelli and Bird [3] sheds light on developers' motivations for conducting code reviews, further underscoring the complexity of taking a holistic approach to measuring review performance based on how satisfactory the outcomes were relative to those initial motivations.


Developers’ motivations for code review - Bacchelli and Bird (2013)


Yet, the same study highlights a crucial insight revealing a common challenge everyone faces when striving for a fruitful code review [3]:

"Understanding is the main challenge when doing code reviews".


Testimonials from the survey participants further support this idea:

“the most difficult thing when doing a code review is understanding the reason for the change;”

"understanding the code takes most of the reviewing time."

“in a successful code review submission, the author is sure that his peers understand and approve the change.”

Bacchelli and Bird's study also shares survey results on the level of code understanding developers feel is required to accomplish the various aspects of the code review process.


Developers’ responses in surveys of the amount of code understanding for code review outcomes - Bacchelli and Bird (2013)


The time a developer spends trying to understand code changes directly impacts the code review flow. Shortening the window a developer needs to become comfortable with the code inside the pull request frees up time and mental space for additional, equally important parts of the process. Because understanding sits at the core of almost every part of the review process, it became a natural candidate to measure alongside review duration when making a fair comparison between GitClear and other review tools.


With this in mind, we chose to analyze two complementary metrics during the investigation:

Review duration - How much time does it take for a developer to review the code and understand it?

Level of understanding - How much did a developer understand after doing the review?


Interview Methodology

To evaluate the impact of GitClear's PR Review Tool, live review sessions were conducted with developers possessing different levels of experience.


The candidates for the test were chosen randomly from the web platform CodeMentor. Choosing an environment where developers were initially unfamiliar with GitClear offered the opportunity to record how users who were new to the platform acclimated to it. No developer surveyed had prior experience with the GitClear platform.


A candidate review session comprised 2 pull request reviews. Each candidate was asked to review 1 pull request with GitClear and 1 (different) pull request with GitHub. Each pull request review was followed by a quiz to assess the candidate's understanding of the code they had reviewed.


Pull Request Candidates

Pull requests were chosen from open-source repositories and grouped into buckets based on the programming language used. The criteria for selecting programming languages were based on their popularity across the industry [8], [9]. Thus, the initial list contained pull requests written in JavaScript+TypeScript, Python, and C#.


Two separate pull requests were used per session to eliminate the unfair advantage a developer would gain from reviewing the same pull request twice. The criteria for pull request eligibility were as follows:

Both pull requests had to have code written using the same programming language.

Both pull requests had to have code written using a single programming language.

Both pull requests had to have a similar level of meaningful changes - a similar Diff Delta score.

Both pull requests had to have approximately the same number of lines of code.

Pull requests were to require little to no prior knowledge of the repositories to perform a valid code review.

Pull requests were to require, on average, less than 40 minutes to review, so that long review durations would not impact the reviewer's performance [13], [14].


Code Review Evaluation Metrics

Each review session resulted in a pair of data points: "pull request review duration" and "quiz result accuracy."


To measure the review duration, we notified the reviewer that there was no time limit imposed on the review duration and then instructed them to indicate when they were ready to start, and subsequently finish, the review.


The code understanding quiz presented 5 to 9 multiple-choice questions drawn from the contents of each pull request. The difficulty of the questions increased incrementally: initial questions evaluated surface-level knowledge of the pull request, while later questions focused on finer details of the logic inside the code.


The quiz results were scored using an adapted elimination-testing method that credits partial and intermediate answers to the proposed questions. Each question could result in a value between -1 and 1, with each correctly selected answer contributing positively to the overall score and each incorrect answer contributing negatively. The "final question score" is the net sum of the answers.


Subsequently, the question results were normalized and translated to percentages.
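One normalization consistent with the "Total Q Score" values in [A1] is to map each question's score from the -1..1 range onto 0..1 and then average across the questions asked; a minimal sketch:

# ruby
# Map each question's score from the -1..1 range onto 0..1, then average across
# the questions asked (consistent with the "Total Q Score" values in [A1]).
def quiz_accuracy_percent(question_scores)
  normalized = question_scores.map { |score| (score + 1.0) / 2.0 }
  normalized.sum / normalized.size * 100.0
end

# The six question scores from the first session recorded in [A1]
scores = [-1.0 / 3, -1.0 / 3, -1.0 / 3, 1.0 / 3, 1, 1]
puts quiz_accuracy_percent(scores).round(2)  # => 61.11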


Code Interview Results

As of the initial publication date of this research (June 6, 2024), 48 participants had been interviewed; their sessions are represented in the data tables below.


Code Review Duration

To understand the impact of GitClear's diff viewer compared to GitHub's diff viewer, we collated the data from 48 review sessions, segmented it by programming language, and calculated the net and percent differences in average and median review duration [A5].


Code Review Duration Aggregated Results GitClear vs GitHub


Here's how the aggregated data looks in graph form for net and percentage results, with the yellow bars illustrating the absolute difference between the two data points:


Median Net Code Review Duration Results GitClear vs GitHub

Median % Code Review Duration Results GitClear vs GitHub

Average Net Code Review Duration Results GitClear vs GitHub

Average % Code Review Duration Results GitClear vs GitHub


A general trend of review time reduction can be observed across all the investigated programming languages with GitClear's diff viewer. The net decrease in review duration is further reflected as a significant percentage difference compared to GitHub's diff viewer and, by extension, other standard diff viewers.

The most notable difference was for pull request 25265, with a 42% decrease (13.16 average minutes with GitClear vs 22.76 average minutes with GitHub) in favor of GitClear reviewers.


Code Understanding Accuracy

The assessment of code understanding was averaged across all review sessions, with the figure below aggregating the accuracy results of GitClear and GitHub reviewers.


Question Accuracy Results for GitClear and GitHub Reviewers

The question accuracy percentage difference does not fluctuate by more than 5 percentage points across languages, averaging 1.12% in favor of GitClear when evaluated across the entire pool of results [A5].


Moreover, the raw evaluation metrics for each individual session [A1] were plotted on a scatter chart comparing question accuracy scores against code review duration, as seen in the figure below. The decrease in code review duration is visible in the increased frequency of blue (GitClear) dots on the left side of the chart.


Code Review Sessions' evaluation metrics - Review Accuracy and Review Duration


Interpretation

There is still more data to parse, but the initial results point in a promising direction for GitClear's alternative take on the review process. Developers adapted quickly to the principles GitClear works on: the tool shares common ground with standard reviewing practices while filling in sought-after quality-of-life gaps.


The 22% average decrease in code review duration with GitClear's diff viewer validated our initial hypothesis and correlated, as expected, with the reduction in the number of lines that needed to be reviewed. While the review duration was shorter, the quiz accuracy results were consistently similar for GitClear and GitHub users, suggesting that no understanding was sacrificed for speed. Furthermore, the 22% reduction reflects developers' first exposure to GitClear's review tool; we expect the average reduction to grow as developers become more familiar with it.


In particular, the pull request with a large proportion of moved code (25265) performed best, with a reduction of over 40% in review duration. This positions move detection as one of the much-needed features that diff viewers have been lacking: knowing that parts of the code were moved but left unchanged spares the reviewer the effort of re-validating those lines and frees up attention for other areas.


Finally, we believe that individual contributors, as well as companies, can benefit from transitioning to GitClear's diff viewer, significantly reducing the time invested in code review with little to no transition cost. Judging by recent code review statistics (SmartBear 2021, CodeGrip 2022), we can estimate the impact such a tool could have on the industry: 22% of developers report participating in tool-based code reviews daily and 19% weekly, in line with the majority of companies reporting 2 to 5 hours per week invested in code review, as seen in the figure below. Over that interval, reducing code review time by at least 22% would save roughly 30 to 60 minutes per week, or 26 to 52 hours per year, per developer.


Code Review Trends in 2022 - Codegrip

On the Durability of the Status Quo

One question raised by this research: how has diff viewing evolved to be more homogeneous than any of "developer IDE," "git platform," or "system OS"? Compared to the innumerable programming languages that have come and gone in the years since the Myers algorithm was created, how did so many products converge on a solution developed nearly 40 years ago? Two possibilities seem most plausible.


The first answer is that most developers don't even recognize that it's possible to represent a diff without Myers. Since diffs have looked the same since their careers began, nobody thinks to go looking for other options.


The second reason is that Myers is a much “cleaner” algorithm than any successor would be. Choosing Myers offers an instantly available, multi-generation tested means to show a diff. And when it comes to reviewing a diff, getting every line right (every time) is incredibly important.


While Commit Cruncher shows significant improvement over Myers in this research, it relies upon a set of iteratively tuned heuristics. None of the large git platforms can afford to imperil user trust as they iterate on a more granular representation of what changed within a commit. Much like source control providers converged on git once it was proven reliable at enterprise scale, no single company is likely to evolve its diff algorithm until it has strong incentives, including proof that the new algorithm, and the heuristics imbued therein, can be trusted to consistently interpret git diffs of any size and content.


Conclusion

Although standard diff viewers remain the commanding choice for conducting code reviews for the majority of developers, they demonstrate limitations due to their simplistic approach of displaying changes as binary "added" and "deleted" operations.


GitClear's diff viewer provides a more nuanced overview of the changes inside a pull request by refining the definition of "added" and "deleted" lines and making use of additional code operations ("Moved," "Find/Replaced," "Copy/Pasted," and "Updated") to provide semantic context and abstract away LoC noise. With a different diff algorithm, GitClear reliably shows fewer lines of code to the reviewer, concentrating their focus on the meaningful changes happening inside a pull request.


In this paper, we analyzed GitClear's impact on the duration of code reviews, finding a decrease of 22% on average compared to GitHub's diff viewer. Additionally, we measured the reviewers' level of code understanding to enable a fair comparison of different review styles and ensure the validity of the recorded data. No significant discrepancy was observed in favor of either platform in the resulting level of code understanding, with an overall average difference of under 1.2% in favor of GitClear.



Citations

Expectations, Outcomes, and Challenges of Modern Code Review - Alberto Bacchelli, Christian Bird (2013)

ICICLE: Groupware For Code Inspection - L. Brothers, V. Sembugamoorthy, M. Muller (1990)

2022 Developer Survey - Stack Overflow

An Approach to Improving Software Inspections Performance - Ferreira, AL, Machado RJ, Silva JG, Batista RF, Costa L, Paulk MC (2010)


Appendix

The data used to build this research is included below:


A1 - Raw data from review sessions


One row per review session. Q columns hold per-question scores in the range -1 to 1; rows show only as many Q columns as the pull request's quiz contained.

Timestamp | Years of experience with the language | Date of Review | Review platform used | PR ID | Programming Language | Duration | Duration in min | Total Q Score | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | Q8 | Q9
3/20/2024 15:19:24 | 5 | 3/20/2024 | GitClear | 25265 | JS/TS | 0:08:15 | 8.25 | 0.6111111111 | -0.3333333333 | -0.3333333333 | -0.3333333333 | 0.3333333333 | 1 | 1
4/3/2024 18:56:04 | 8 | 4/3/2024 | GitClear | 25265 | JS/TS | 0:13:14 | 13.23333333 | 0.6944444445 | 0.6666666667 | -0.3333333333 | -0.3333333333 | 0.3333333333 | 1 | 1
4/1/2024 19:50:54 | 6 | 4/1/2024 | GitClear | 25265 | JS/TS | 0:10:19 | 10.31666667 | 0.9166666667 | 1 | 1 | 0.3333333333 | 0.6666666667 | 1 | 1
4/5/2024 14:19:36 | 5 | 4/5/2024 | GitClear | 25265 | JS/TS | 0:20:51 | 20.85 | 0.8888888889 | 1 | -0.3333333333 | 1 | 1 | 1 | 1
3/16/2024 12:25:40 | 4 | 3/16/2024 | GitHub | 25265 | JS/TS | 0:06:49 | 6.816666667 | 0.8055555556 | 1 | 0.6666666667 | 1 | 0.3333333333 | -0.3333333333 | 1
3/20/2024 18:16:48 | 5 | 3/20/2024 | GitHub | 25265 | JS/TS | 0:29:15 | 29.25 | 0.7777777778 | 0.3333333333 | 1 | 1 | 0.6666666667 | 0.6666666667 | -0.3333333333
4/4/2024 17:37:29 | 10 | 4/4/2024 | GitHub | 25265 | JS/TS | 0:21:40 | 21.66666667 | 0.5555555556 | 0.3333333333 | 1 | 0.6666666667 | -0.6666666667 | -0.3333333333 | -0.3333333333
4/8/2024 13:50:09 | 3 | 4/8/2024 | GitHub | 25265 | JS/TS | 0:33:18 | 33.3 | 1 | 1 | 1 | 1 | 1 | 1 | 1
3/20/2024 18:44:04 | 5 | 3/20/2024 | GitClear | 25610 | JS/TS | 0:09:17 | 9.283333333 | 0.6481481482 | 0.5 | -0.3333333333 | 1 | 0.5 | 0.3333333333 | -0.3333333333 | 0.6666666667 | 1 | -0.6666666667
3/16/2024 12:45:25 | 4 | 3/16/2024 | GitClear | 25610 | JS/TS | 0:06:14 | 6.233333333 | 0.5740740741 | 0.5 | -0.3333333333 | 1 | 0.5 | 1 | -0.3333333333 | -0.3333333333 | -0.3333333333 | -0.3333333333
4/4/2024 21:02:48 | 10 | 4/4/2024 | GitClear | 25610 | JS/TS | 0:16:33 | 16.55 | 0.7222222222 | 0.5 | -0.3333333333 | 1 | 0.5 | 1 | -0.3333333333 | 1 | -0.3333333333 | 1
4/8/2024 14:29:24 | 3 | 4/8/2024 | GitClear | 25610 | JS/TS | 0:16:01 | 16.01666667 | 0.75 | 1 | -0.3333333333 | 1 | 0.5 | 1 | -0.3333333333 | 1 | -0.3333333333 | 1
3/20/2024 15:40:38 | 5 | 3/20/2024 | GitHub | 25610 | JS/TS | 0:08:53 | 8.883333333 | 0.7314814815 | 0.5 | -0.3333333333 | 1 | 0 | 0.6666666667 | 1 | 0.6666666667 | 1 | -0.3333333333
4/3/2024 18:25:14 | 8 | 4/3/2024 | GitHub | 25610 | JS/TS | 0:08:02 | 8.033333333 | 0.6759259259 | 1 | -0.3333333333 | 1 | 0.5 | -0.3333333333 | 1 | -0.3333333333 | -0.3333333333 | 1
3/28/2024 13:35:07 | 6 | 3/28/2024 | GitHub | 25610 | JS/TS | 0:15:16 | 15.26666667 | 0.7037037037 | 0 | -0.3333333333 | 1 | 0 | 0.6666666667 | -0.3333333333 | 1 | 1 | 0.6666666667
4/5/2024 13:41:15 | 5 | 4/5/2024 | GitHub | 25610 | JS/TS | 0:22:21 | 22.35 | 0.75 | 0.5 | 1 | 0 | 1 | 0.6666666667 | -0.3333333333 | 1 | 1 | -0.3333333333
3/26/2024 13:51:44 | 8 | 3/26/2024 | GitClear | 22409 | C# | 0:04:47 | 4.783333333 | 0.5166666667 | -0.3333333333 | 0.2 | 0.5 | -0.1666666667 | 0.6666666667 | -0.6666666667
4/25/2024 18:40:58 | 6 | 4/25/2024 | GitClear | 22409 | C# | 0:27:00 | 27 | 0.7833333333 | 1 | 0.4 | 1 | 0.3333333333 | -0.3333333333 | 1
4/26/2024 19:54:51 | 2 | 4/26/2024 | GitClear | 22409 | C# | 0:26:45 | 26.75 | 0.9416666667 | 1 | 0.8 | 1 | 0.5 | 1 | 1
5/1/2024 13:11:30 | 9 | 5/1/2024 | GitClear | 22409 | C# | 0:11:09 | 11.15 | 0.7069444445 | 0.6666666667 | 0.4 | 0.75 | 0.6666666667 | 0.6666666667 | -0.6666666667
4/24/2024 14:34:21 | 5 | 4/24/2024 | GitHub | 22409 | C# | 0:17:30 | 17.5 | 0.5847222222 | -0.3333333333 | 0.6 | -0.75 | -0.1666666667 | 0.6666666667 | 1
4/26/2024 13:24:12 | 12 | 4/26/2024 | GitHub | 22409 | C# | 0:15:13 | 15.21666667 | 0.6916666667 | 0.6666666667 | 0.8 | 0.5 | -0.3333333333 | 1 | -0.3333333333
4/26/2024 20:36:20 | 7 | 4/26/2024 | GitHub | 22409 | C# | 0:11:10 | 11.16666667 | 0.8458333333 | 0.6666666667 | 0.4 | 0.75 | 0.6666666667 | 1 | 0.6666666667
4/29/2024 15:46:54 | 5 | 4/29/2024 | GitHub | 22409 | C# | 0:05:46 | 5.766666667 | 0.7347222222 | 1 | 0.4 | -0.25 | -0.3333333333 | 1 | 1
4/24/2024 15:09:30 | 5 | 4/24/2024 | GitClear | 28884 | C# | 0:11:16 | 11.26666667 | 0.775 | 0.5 | 0.5 | 0.75 | 0.5 | 0.5
4/26/2024 13:56:58 | 12 | 4/26/2024 | GitClear | 28884 | C# | 0:11:22 | 11.36666667 | 0.71 | 0.5 | 1 | -0.15 | 0.25 | 0.5
4/26/2024 21:03:06 | 7 | 4/26/2024 | GitClear | 28884 | C# | 0:06:59 | 6.983333333 | 0.715 | 0 | 0.5 | 0.15 | 0.5 | 1
4/29/2024 16:11:49 | 5 | 4/29/2024 | GitClear | 28884 | C# | 0:05:28 | 5.466666667 | 0.825 | 0.5 | 0.5 | 1 | 0.25 | 1
3/26/2024 14:07:38 | 8 | 3/26/2024 | GitHub | 28884 | C# | 0:09:25 | 9.416666667 | 0.62 | -1 | 0.5 | 0.2 | 0.5 | 1
4/25/2024 17:46:04 | 6 | 4/25/2024 | GitHub | 28884 | C# | 0:30:19 | 30.31666667 | 0.68 | 0.5 | -0.5 | 0.8 | 0.5 | 0.5
4/26/2024 19:04:48 | 2 | 4/26/2024 | GitHub | 28884 | C# | 0:21:17 | 21.28333333 | 0.74 | 1 | 0.5 | 0.15 | 0.25 | 0.5
5/1/2024 12:34:45 | 9 | 5/1/2024 | GitHub | 28884 | C# | 0:18:27 | 18.45 | 0.715 | 0 | 1 | 0.4 | 0.25 | 0.5
3/27/2024 17:23:33 | 5 | 3/27/2024 | GitClear | 14776 | Python | 0:16:56 | 16.93333333 | 0.75 | 0.6666666667 | -0.5 | 0.3333333333 | 0.3333333333 | 1 | 0.5 | 1 | 0.6666666667
3/28/2024 15:45:28 | 3 | 3/28/2024 | GitClear | 14776 | Python | 0:10:17 | 10.28333333 | 0.7395833333 | 1 | 1 | 0.6666666667 | 0.6666666667 | 0.5 | 1 | -0.3333333333 | -0.6666666667
5/14/2024 13:15:13 | 17 | 5/14/2024 | GitClear | 14776 | Python | 0:11:22 | 11.36666667 | 0.7708333333 | 1 | 1 | 0.3333333333 | 0.6666666667 | -0.5 | 0.5 | 1 | 0.3333333333
5/17/2024 21:49:01 | 18 | 5/17/2024 | GitClear | 14776 | Python | 0:18:23 | 18.38333333 | 0.90625 | 1 | 1 | 0.3333333333 | 0.6666666667 | 0.5 | 1 | 1 | 1
3/25/2024 13:50:58 | 4 | 3/25/2024 | GitHub | 14776 | Python | 0:22:10 | 22.16666667 | 0.8541666667 | -0.3333333333 | 1 | 1 | 1 | 0.5 | 0.5 | 1 | 1
5/7/2024 14:39:59 | 3 | 5/7/2024 | GitHub | 14776 | Python | 0:27:26 | 27.43333333 | 0.6354166667 | 1 | 0 | 0.6666666667 | 0.6666666667 | 0.5 | -1 | -0.3333333333 | 0.6666666667
5/8/2024 16:44:37 | 2 | 5/8/2024 | GitHub | 14776 | Python | 0:10:18 | 10.3 | 0.8229166667 | 1 | 1 | 0.3333333333 | 0.6666666667 | 0.5 | 1 | -0.3333333333 | 1
5/27/2024 19:42:15 | 8 | 5/27/2024 | GitHub | 14776 | Python | 0:28:53 | 28.88333333 | 0.7604166667 | 1 | 1 | -0.3333333333 | 0.6666666667 | 0.5 | 1 | 1 | -0.6666666667
3/25/2024 14:42:20 | 4 | 3/25/2024 | GitClear | 43504 | Python | 0:17:20 | 17.33333333 | 0.6597222222 | -0.3333333333 | 0.6666666667 | 0 | -0.1666666667 | 1 | 0.75
5/7/2024 15:10:09 | 3 | 5/7/2024 | GitClear | 43504 | Python | 0:16:50 | 16.83333333 | 0.7222222222 | 1 | 0.6666666667 | 0.5 | 0.3333333333 | 0.6666666667 | -0.5
5/8/2024 17:26:45 | 2 | 5/8/2024 | GitClear | 43504 | Python | 0:10:18 | 10.3 | 0.8888888889 | 1 | 0.3333333333 | 1 | 0.3333333333 | 1 | 1
5/27/2024 20:36:56 | 8 | 5/27/2024 | GitClear | 43504 | Python | 0:26:31 | 26.51666667 | 0.7222222222 | 1 | 1 | 0.5 | -0.5 | -0.3333333333 | 1
3/27/2024 16:37:17 | 5 | 3/27/2024 | GitHub | 43504 | Python | 0:26:56 | 26.93333333 | 0.7916666667 | 1 | -0.6666666667 | 0.6666666667 | 0.5 | 1 | 1
3/28/2024 15:18:01 | 3 | 3/28/2024 | GitHub | 43504 | Python | 0:05:00 | 5 | 0.6805555556 | 1 | -0.6666666667 | -0.1666666667 | 0 | 1 | 1
5/14/2024 12:30:06 | 17 | 5/14/2024 | GitHub | 43504 | Python | 0:15:13 | 15.21666667 | 0.75 | 1 | 1 | 0.3333333333 | 0 | -0.3333333333 | 1
5/17/2024 21:10:01 | 18 | 5/17/2024 | GitHub | 43504 | Python | 0:18:20 | 18.33333333 | 0.7638888889 | 1 | 0.3333333333 | 0.1666666667 | 1 | -0.3333333333 | 1

A2 - Pull request candidates


JavaScript/TypeScript

https://github.com/facebook/react/pull/25610/files

https://github.com/facebook/react/pull/25265/files


C#

https://github.com/microsoft/PowerToys/pull/22409/files

https://github.com/microsoft/PowerToys/pull/28884/files


Python

https://github.com/pandas-dev/pandas/pull/43504/files

https://github.com/django/django/pull/14776/files


A3 - Question Forms


JavaScript/TypeScript

https://forms.gle/MuJxASunD19PGng76

https://forms.gle/tc8mox286kFBFaoC7


C#

https://forms.gle/guMGFMgopnQTdWXf7

https://forms.gle/wffUHrgKBBW5Ca2p9


Python

https://forms.gle/TqL8hBNgJKk4gZC7A

https://forms.gle/QPsCTPwvx1vFFe6r9


A4 - PR Data points




Size (Diff Delta) | Samples | GitClear changed lines (avg) | Git patch changed lines (avg) | Extra lines to review w/o GC (avg) | GitClear changed lines (median) | Git patch changed lines (median) | Extra lines to review w/o GC (median)
100-300 | 3191 | 144 | 176 | 21.86% | 99 | 126 | 27.27%
301-600 | 3550 | 339 | 427 | 25.64% | 271 | 345.5 | 27.49%
601-1000 | 3400 | 562 | 723 | 28.64% | 505.5 | 642 | 27.00%
1000+ | 2497 | 1137 | 1466 | 28.96% | 959 | 1253 | 30.66%


The methodology used to procure this data is described in the supplementary Git Diff Line Count Data Generation Methodology Document.


A5 - Aggregated Results




Metric | Python | C# | JS/TS | Total
Median Duration GitClear [minutes] | 16.88 | 11.21 | 11.78 | 11.37
Median Duration GitHub [minutes] | 20.25 | 16.36 | 18.47 | 17.92
Median Duration Net Difference | -3.37 | -5.15 | -6.69 | -6.55
Median Duration Percent Difference | -16.63 | -31.48 | -36.24 | -36.56
Avg. Duration GitClear [minutes] | 15.99 | 13.10 | 12.59 | 13.89
Avg. Duration GitHub [minutes] | 19.28 | 16.14 | 18.20 | 17.87
Avg. Duration Net Difference | -3.29 | -3.04 | -5.60 | -3.98
Avg. Duration Percent Difference | -17.06 | -18.86 | -30.80 | -22.26
Question Accuracy GitClear | 77.00 | 74.67 | 72.57 | 74.75
Question Accuracy GitHub | 75.74 | 70.15 | 75.00 | 73.63
Question Accuracy Difference | 1.26 | 4.52 | -2.43 | 1.12


A6 - Database queries used

The following code was used to generate the CSV (covering open-source repos as well as opted-in private repos) that allowed comparison of Myers vs. Commit Cruncher diff size:


# ruby
require "csv"

TaskUtil.define_task :generate_group_csv do |logger|
  CSV.open("open_repo_prs.csv", "w") do |csv|
    csv << ["PR ID", "PR Title", "Repo URL", "PR Myers URL", "PR Cruncher URL", "Myers lines", "Cruncher lines", "Percent more"]

    # Canonical pull request commit groups that have per-file line savings recorded
    group_scope = CommitGroup.pull_request_canonical.includes(:extra).complete
      .joins(:extra).where.not(commit_group_extras: { lines_saved_by_file: nil })

    group_scope.find_each do |group|
      next unless (file_lines = group.extra.lines_saved_by_file).present?
      next unless (pr = group.pull_request)
      repo = group.repo

      # Sum changed lines across reviewable files, once per Myers and once per Commit Cruncher
      meyer_line_count = file_lines.sum(0) { |k, v| LineInterpreter.process_lines_from_path?(k) ? v["patch_changed_line_count"] : 0 }
      cruncher_line_count = file_lines.sum(0) { |k, v| LineInterpreter.process_lines_from_path?(k) ? v["gitclear_changed_line_count"] : 0 }
      next unless meyer_line_count > 0 && cruncher_line_count > 0

      csv << [
        pr.external_identifier,
        pr.title,
        "https://github.com/#{ pr.repo.path }",
        pr.provider_url,
        LinkBuilder.resourceful_url([ repo, pr ], :review_code),
        meyer_line_count,
        cruncher_line_count,
        MathUtility.percent_of(meyer_line_count, cruncher_line_count)
      ]
    end
  end
end