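GitClear offers the option to set code quality targets on a per-repo basis. When an entity or organization is selected, the goals for all of its repos are set together, as a batch. The targets that can be set are maximum lines per file, maximum lines per function, maximum Diff Delta per pull request, maximum comment count per pull request, maximum business days a pull request stays open, and maximum business days between pull request activity.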


Code Quality Targets are accessible via the Settings tab -> Measurement -> Code Quality Targets


On this help page, we provide empirical data and first-principles reasoning to help customers pick goals that keep their repos healthy over the long term.



Max lines per file (what is a substantiated file line target?)

There isn't a lot of hard research available on what the maximum length of a file should be. There is a smattering of Google results that suggest 1,000-2,000 lines, and others that suggest 300 lines. While experts are hesitant to propose specific numbers, the suggestions they do make correspond pretty well to what we observe across the thousands of repos we process.


Here are the 50th and 95th percentile file lengths for various languages processed by GitClear:

| Language | Samples analyzed | 50th percentile file length | 95th percentile file length |
|---|---|---|---|
| C/C++ | 100,000 | 366 lines | 2,752 lines |
| C# | 100,000 | 121 lines | 1,602 lines |
| Java | 100,000 | 146 lines | 1,354 lines |
| Javascript | 500,000 | 135 lines | 1,994 lines |
| Python | 100,000 | 325 lines | 3,688 lines |
| Ruby | 100,000 | 105 lines | 1,251 lines |
| Typescript | 500,000 | 154 lines | 1,863 lines |


Our default maximum is currently set to 1,000 lines because we believe that the work of maintainers is easier when repo files are constructed with "skimmability" as a goal. If the average developer attention span lasts 20-30 seconds, one can scan a 1,000-line file in about 2-3 attention cycles. Any file that takes longer than 2-3 attention cycles to fully digest is a file at acute risk of becoming unmaintained.
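To get a feel for how a repo compares against the 1,000-line default before setting a target, a rough local check can help. The sketch below is only an illustration: the helper names (`count_lines`, `files_over_target`) are hypothetical, and it counts every physical line, which may differ from how GitClear itself measures file length.

```python
from pathlib import Path

MAX_LINES_PER_FILE = 1_000  # the default target described above


def count_lines(path: Path) -> int:
    """Count physical lines, replacing undecodable bytes instead of failing."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def files_over_target(
    root: Path, suffixes: set[str], max_lines: int = MAX_LINES_PER_FILE
) -> list[tuple[Path, int]]:
    """Return (path, line_count) for matching files above max_lines, longest first."""
    offenders = []
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            lines = count_lines(path)
            if lines > max_lines:
                offenders.append((path, lines))
    return sorted(offenders, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: run from the repo root; adjust the suffixes to your languages.
    for path, lines in files_over_target(Path("."), {".py", ".rb", ".ts"}):
        print(f"{lines:>6}  {path}")
```

Passing a lower `max_lines` shows which files would start triggering notifications under a more ambitious target.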


Max lines per function (what is a substantiated function size target?)

The single highest quality Google result for "brain method size" does not hazard a specific number. Neither does the synopsis of the top research result. The rest of the top 5 Google results link to innocuous conversations debating numbers that range from 10 to 160 to 3,000.


Trying to work out a maximum size from first principles (and 25 years of programming experience), the author hereby estimates that a method is too big once it can't fit on one screen worth of lines. On my Programmer-res monitor, 60 lines easily fit on a screen. So, let it be known that 60 is the greatest number of lines you'll find in a reader-friendly function.


Here are the 50th and 95th percentile function lengths for various languages processed by GitClear:

| Language | Samples analyzed | 50th percentile function length | 95th percentile function length |
|---|---|---|---|
| C/C++ | 100,000 | 10 lines | 107 lines |
| C# | 100,000 | 12 lines | 84 lines |
| Java | 100,000 | 7 lines | 50 lines |
| Javascript | 500,000 | 13 lines | 150 lines |
| Python | 100,000 | 10 lines | 311 lines |
| Ruby | 100,000 | 5 lines | 29 lines |
| Typescript | 500,000 | 8 lines | 67 lines |


Our default maximum lines per function is 300, because by default we want to minimize alert notifications. But be advised, this is the 98th or 99th percentile for function length in most languages (Python, whose 95th percentile is 311 lines, is the exception in the table above). We generally recommend that customers try to keep their functions shorter than this maximum, since longer methods take longer to understand, and longer to modify, especially for those who didn't write them.
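For Python code, function lengths can be approximated locally with the standard `ast` module. This is a minimal sketch under stated assumptions: `long_functions` is a hypothetical helper, it measures from the `def` line to the last line of the body (decorators excluded), and GitClear's own measurement may count lines differently.

```python
import ast

MAX_LINES_PER_FUNCTION = 300  # the default target described above


def long_functions(
    source: str, max_lines: int = MAX_LINES_PER_FUNCTION
) -> list[tuple[str, int]]:
    """Return (function_name, line_count) for functions longer than max_lines."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on AST nodes in Python 3.8 and later.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                results.append((node.name, length))
    return results
```

Calling `long_functions(source, max_lines=60)` applies the stricter one-screen rule of thumb suggested earlier in this section instead of the lax default.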


Maximum Diff Delta per pull request

As of 2023, the Median (50th percentile) Developer averages 90 Diff Delta per day. So, over a five-day work week, the Median Developer should clock in around 450 Diff Delta per week.


Here are some aggregated stats for merged pull requests:

| Stat | Samples analyzed | 50th percentile | 95th percentile |
|---|---|---|---|
| Diff Delta | 100,000 | 73 Diff Delta | 1,092 Diff Delta |
| Commit count | 85,000 | 1 commit | 6 commits |


Our current default target is 750 Diff Delta. This allows up to 10 days before a notification is issued that the PR seems to be stagnating.


Ideally, the 750 Diff Delta target should be adequate to allow a week's worth of activity (~500 Diff Delta, rounding up from the 450 median) with 50% overhead left for revisions that come from teammates' suggestions.
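The arithmetic behind the 750 figure can be reproduced in a few lines, which makes it easy to derive a different target for a team whose pace differs from the median. The function names below are hypothetical and exist only for this illustration; the inputs (90 per day, a rounded 500 per week, 50% overhead) come from the text above.

```python
MEDIAN_DIFF_DELTA_PER_DAY = 90  # 2023 median developer figure cited above
WORKDAYS_PER_WEEK = 5


def weekly_diff_delta(per_day: float = MEDIAN_DIFF_DELTA_PER_DAY) -> float:
    """Diff Delta produced in one five-day work week at the given daily pace."""
    return per_day * WORKDAYS_PER_WEEK


def pr_target(weekly: float, overhead: float = 0.5) -> float:
    """One week of activity plus headroom for review-driven revisions."""
    return weekly * (1 + overhead)


def workdays_to_reach(target: float, per_day: float = MEDIAN_DIFF_DELTA_PER_DAY) -> float:
    """Working days a developer at the given pace needs to accumulate the target."""
    return target / per_day


print(weekly_diff_delta())               # 450
print(pr_target(500))                    # 750.0
print(round(workdays_to_reach(750), 1))  # 8.3
```

At the median pace, the default corresponds to a little over 8 working days of activity, which sits inside the "up to 10 days" ceiling mentioned above.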


Maximum comment count per pull request

The perfect number of maximum comments per pull request is sufficiently subjective that no sane man would be caught citing a specific number. But if there were a specific number of ideal maximum comments per pull request, it would be 25. One of our research assistants asked around to confirm it was so.


To snap back to data, here are the comment counts for merged pull requests (excluding abandoned/closed pull requests):

| Stat | Samples analyzed | 50th percentile | 95th percentile |
|---|---|---|---|
| Merged pull requests, comment count (Corporations) | 50,000 | 1 | 14 |
| Merged pull requests, comment count (Open source) | 377 | 1 | 17 |


Our default target is 20 comments. A pull request with more than 20 comments of discussion is often a pull request that could have been more efficient if better planned beforehand. If a committer has a pattern of submitting pull requests with more than 20 comments, they could be encouraged to submit smaller, more frequent pull requests.
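One way to spot the "pattern" described above is to count, per author, how many pull requests exceeded the comment target. This is only a sketch: the `(author, comment_count)` input shape, the function name, and the `min_occurrences` cutoff for what counts as a pattern are all assumptions made for illustration, not GitClear behavior.

```python
from collections import Counter

MAX_COMMENTS_PER_PR = 20  # the default target described above


def authors_with_chatty_prs(
    prs: list[tuple[str, int]],
    max_comments: int = MAX_COMMENTS_PER_PR,
    min_occurrences: int = 2,  # hypothetical cutoff for calling it a pattern
) -> dict[str, int]:
    """Map each author to their number of over-target PRs, if at or above the cutoff."""
    over_target = Counter(author for author, comments in prs if comments > max_comments)
    return {
        author: count
        for author, count in over_target.items()
        if count >= min_occurrences
    }


sample = [("ana", 25), ("ana", 31), ("ben", 22), ("ben", 3), ("cho", 1)]
print(authors_with_chatty_prs(sample))  # {'ana': 2}
```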


Pull request maximum business days open

What's the longest that a pull request should be allowed to linger? The top-ranked Reddit result has top-rated comments that say "4-8 days on average is absurd" and "In my workplace within 1 day is usually the goal." The only other top-5 Google result that proffers a specific number is LinearB, who says 4+ days is the average for 1 million pull requests they analyzed.


One exacerbating factor that probably creates a lot of variance in estimates is whether or not the data being analyzed was already normalized to business days. Among top Google results, only LinearB purports to have done any research in arriving at their "4+ days" number, but they do not clarify whether those days are weekdays/business days, or whether weekends were included in the (effectively) "5 days."


Given the lack of large-scale, published data on the subject, it's quite possible that the data below is the largest sample on the internet in analyzing the median number of weekdays that pull requests linger in the space between "being available for review" and "being merged to the default branch." So we will lean more pedantic on this one. Here's what the data says:

| Merged pull request source | Samples analyzed | 50th percentile | 75th percentile | 95th percentile | Mean | Max |
|---|---|---|---|---|---|---|
| Corporate/For profit | 100,000 | 0.3 weekdays | 1.9 weekdays | 10.5 weekdays | 3.2 weekdays | 120 weekdays |
| Open source | 10,364 | 0.6 weekdays | 2.5 weekdays | 13 weekdays | 3.6 weekdays | 68 weekdays |


Our default target is 10 weekdays, which is effectively the 95th percentile of how long pull requests are naturally allowed to languish by for-profit companies. If you want to be more ambitious, email bill@gitclear.com with the goal you set and how it worked out for you. Our default is set to be very lax because most people already have enough notifications and emails to think about.
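To illustrate why normalizing to weekdays matters, the sketch below measures the fractional weekdays between two timestamps by skipping Saturday and Sunday. The function name is hypothetical, holidays are ignored, and it is one plausible way to do the normalization rather than a description of GitClear's exact calculation.

```python
from datetime import datetime, timedelta


def weekdays_between(start: datetime, end: datetime) -> float:
    """Fractional Monday-to-Friday days elapsed between start and end."""
    total = timedelta()
    cursor = start
    while cursor < end:
        # Walk one calendar day at a time, stopping at each midnight.
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), datetime.min.time(), tzinfo=cursor.tzinfo
        )
        segment_end = min(next_midnight, end)
        if cursor.weekday() < 5:  # 0-4 are Monday through Friday
            total += segment_end - cursor
        cursor = segment_end
    return total / timedelta(days=1)


opened = datetime(2024, 1, 5, 12, 0)  # a Friday at noon
merged = datetime(2024, 1, 8, 12, 0)  # the following Monday at noon
print((merged - opened) / timedelta(days=1))  # 3.0 calendar days
print(weekdays_between(opened, merged))       # 1.0 weekdays
```

The same pull request looks three times slower when weekends are counted, which is the kind of variance that unnormalized estimates can introduce.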


Pull request maximum business days between activity

How long should a pull request be allowed to sit idle with neither commit nor comment made? This is not a formal stat -- it started as a suggestion from a GitClear customer. We agreed this was an identifiable quantity, worth setting goals around. But since there is no formal name for it, let alone a definition, this stat doesn't lend itself to easy data.


We define "maximum business days between activity" as the number of weekdays that can pass between any two consecutive events of the following kinds: a formal PR review, a commit toward the PR, or a comment on the PR. The last of these should be a very low bar.


Our default target is 3 weekdays. That means, if a pull request is open, and nobody has left a review, a comment, or a commit on that lonely pull request over 3-4 weekdays, then it merits a notification. Most teams can probably manage 1-2 weekdays as a target, but as with the others, we err on the side of being quiet.
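The definition above can be sketched as a longest-gap calculation over a pull request's activity dates. Everything here is an assumption for illustration: the function names are hypothetical, gaps are measured in whole weekdays, and the opening date and today's date are treated as checkpoints so that a PR with no activity at all still registers as idle.

```python
from datetime import date, timedelta

MAX_IDLE_WEEKDAYS = 3  # the default target described above


def weekdays_elapsed(start: date, end: date) -> int:
    """Count Monday-to-Friday dates after start, up to and including end."""
    span = (end - start).days
    return sum(
        1
        for offset in range(1, span + 1)
        if (start + timedelta(days=offset)).weekday() < 5
    )


def longest_idle_weekdays(opened: date, activity: list[date], today: date) -> int:
    """Longest weekday gap between consecutive reviews, commits, or comments."""
    checkpoints = sorted([opened, *activity, today])
    return max(weekdays_elapsed(a, b) for a, b in zip(checkpoints, checkpoints[1:]))


idle = longest_idle_weekdays(
    opened=date(2024, 1, 1),                            # Monday
    activity=[date(2024, 1, 2), date(2024, 1, 8)],      # Tuesday, next Monday
    today=date(2024, 1, 9),                             # Tuesday
)
print(idle, idle > MAX_IDLE_WEEKDAYS)  # 4 True
```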