GitClear Request for Proposal (RFP): Line Impact research

GitClear has spent the past five years building and iterating on a metric called Line Impact that measures the rate at which a repo evolves. This video provides a high-level explanation of the measurement derivation process we arrived at after years of iteration.


We are seeking an independent, professional Research Scientist who can conduct at least three compensated experiments that gauge the stability, significance, and applicability of Line Impact as a measurement of developer throughput.


As described below, the Research Scientist will need to be able to operate independently, and make progress on multiple fronts concurrently. They must be comfortable making expeditious decisions that enable GitClear to realize the brisk pace at which we hope to deliver results.




Terminology Used

GitClear Development Team (GDT). Collectively, the Development Team at GitClear, who will act at the behest of the Research Scientist. In general, GDT should be minimally involved in experiments to ensure maximum independence of the Research Scientist, but GDT will likely play a role in assisting Research Participants with initial project setup and onboarding.


GitClear Management (GM). William Harding, CEO of GitClear, will approve experiment design proposed by Research Scientist prior to experiment onset. GitClear Management will also be provided a preliminary summary of experiment results, as described in "Experimental Requirements," to judge (alongside Research Scientist) whether the results show sufficient merit to proceed into the generation of the Final Report.


Final Report (FR): The Final Report is expected to consist of at least 3,000 words per experiment (ideally closer to 5,000 words). The FR should include each of the sections referenced by The University of Waikato on their webpage describing "Elements of a Scientific Report":

Title page

Table of contents

Abstract

Introduction

Materials and Methods

Results

Discussion

Conclusion

References

The more charts & illustrations are included, the better.


Research Participant (RP): A technology company or open source project that has agreed to participate in an experiment designed by the Research Scientist. There is no upper bound on team size, though as noted below, our requirement to finish processing Line Impact data less than three weeks after onboarding will likely exclude most enterprise-sized companies with 100+ developers. These companies tend to have long, drawn-out onboarding needs that involve many teams and stakeholders. Read more about the recruitment of Research Participants in the next section.


Research Scientist (RS): The independent professional that we are seeking to contract to conduct these studies. The Research Scientist will be expected to recruit & select RPs, run statistical analyses, and summarize experimental results in line with requirements described in the "Research Experiments" and "Research Scientist" sections of this document.


Senior Technical Team Leader (STTL): A technical liaison knowledgeable about the code that has been authored inside the analyzed repos over the last 1-3 years (exact duration to be chosen by RS). The STTL role could be filled by a CTO, VP of Engineering, Lead Developer, or Senior Developer. For RPs that are companies, this person should be selected by either the CTO or CEO of the company. For RPs that are open source projects, this person should be selected by the project lead (and in almost all cases, the project lead should select themselves). The STTL must have at least one year of experience on the project(s) being evaluated.


Research Experiments

At the onset of each experiment, RS will need to identify RPs pursuant to the requirements provided below. RPs should be chosen to represent as diverse a range of industries as possible, for example: finance, healthcare, video games, leisure, journalism, entertainment, productivity, developer tools, designer tools, etc. GM can help RS identify avenues through which RPs might be procured.


Along with the GDT, the RS will initially assist the STTL to configure their Line Impact measurement. In this step, the RS will ensure that non-relevant directories and files are excluded from the analysis. These "non-relevant directories & files" might include third-party libraries, auto-generated code, XML files, or any other code not authored by the Research Participant's team members.


Research Participant (RP) requirements

To encourage signups, Research Participants will be offered a free 6 month subscription to GitClear at the "Pro membership" level after being selected for participation (max 20 developers receive discount).


We will also make available up to 20 free 6-month subscriptions to GitClear at the "Enterprise membership" level, subject to the provisions below (max 50 developers receive discount).


These are the requirements we'll make of aspiring Research Participants:

At least five active developers ("employees" or "contractors" for companies, "core contributors" for open source projects) contribute to the team on a regular basis and have made a commit within the week prior to applying as an RP

Able & willing to designate an STTL who can act as liaison with RS

Able & willing to connect their issue tracking software (Jira or GitHub Issues) to GitClear, such that the location and prevalence of bug-fixing work can be ascertained (prerequisite for some proposed experiments).

No existing affiliation with employees, or family members of employees, of GitClear, our parent company, Alloy.dev, or its subsidiaries, Bonanza and Amplenote

OK for them to be existing GitClear customers so long as all other requirements are met

At least 2 years of code history and at least 500 commits made across the sum of repos the RP will make available for experimental analysis

Prefer at least 3 years of code history and 2,000 commits made

Desirable: willing to disclose (for anonymized use only) average revenue growth over the past three years. This could be a very useful correlate for tying observable code details to the bottom line. If they won't disclose that, perhaps they would disclose which year saw their highest percentage revenue growth?

If revenue is too sensitive, perhaps they could provide some other indicator of user activity by year. The underlying hypothesis we'd like to test is that various thresholds of Line Impact, on a per-industry basis, correspond to aggressive user growth.


To quickly vet participants' qualifications, we have created and distributed this poll, which is currently linked from the GitClear blog and a couple of sites on which we advertise.


In order to maximize likelihood that readers will assign credibility to the experiments, the RS should include a list of RPs in the Final Report (presumably in the "References" section?).


Experimental requirements & design

Upon reaching agreement on contract terms with GitClear, RS should select three experiments that can be performed in accordance with the goals and requirements contained in the rest of this document. RS should submit their three experimental designs to GM for approval within two weeks of starting.

We would like at least 30 RPs to be identified and selected by RS within 4 weeks of starting. However, it remains to be determined whether we can find this many participants: in the week since publishing the poll, we've yet to have a qualified RP complete it.

RS, GM, and GDT will work collaboratively to ensure that RPs follow through with their onboarding

For expediency, RPs should be recruited concurrently with gaining approval on experiment design

Preferred target: 40 RPs per experiment, so that we will end up with at least 25 RPs that make it through onboarding and the duration of the experiment

RS should collaborate with GDT to maximize likelihood that all experimental data needed can be collected in <= 3 weeks of RP onboarding

Experimental proposals below evaluate various facets of Line Impact over the course of 1-3 years. For teams of 50 or less, 1 year of commit data can typically be processed by GitClear in 2-3 days. (Though if we're fortunate enough to identify 20+ RPs in quick succession, that may slow the processing rate down)

After collecting requisite data from Line Impact and STTL, a preliminary summary of the findings should be provided to GitClear management. RS will advise management whether the experimental results look sufficiently relevant to proceed to Final Report. GM will make a final call on whether FR is warranted for the experiment in question.


Experiment ideas

There are many types of experiments that could help to assess the stability, significance, and applicability of Line Impact. GitClear will rely on the Research Scientist to select and design 2-3 experiments to be run based on which experiments are judged to have the greatest likelihood of producing a significant result.


Directory-based experiments

These experiments would focus on calculating the extent to which the judgement of the STTL correlates with the assessment of Line Impact.


The first, and probably biggest, question to settle in order to pursue these experiments: "which directories is the STTL in the best position to assess?" For the average company with 500k+ Line Impact, they're likely to have thousands of directories within the project. The STTL isn't going to have time to review and evaluate all -- or even an appreciable percentage -- of them.


Thankfully, since Line Impact automatically evaluates every directory at every depth, that does not constrain which directories we could permit the STTL to evaluate. That said, it figures to be borderline impossible to issue recommendations as to which directories (or which kind of directories) the STTL should evaluate. Given that limitation, I'd advise telling STTLs something like: "we know that you've got big and small directories, some of which have subdirectories and others which don't. We are flexible enough to analyze any directory of your choosing. We just need you to relate the superset of directories where the code within said directories was authored exclusively by your team during the time you've overseen development at the company. 100 such directories should suffice. Then we need you to identify the particular directories within that set which best answer the question being asked."


The Directory-based experiments are probably the most valuable ones we can conduct from the standpoint of potentially revealing correlations that will be valuable to GitClear. If we can prove that expert judgement corresponds to Line Impact in identifying where tech debt resides, that is going to be a game-changer for many companies, including our own. Possible experiments in this vein:

Does low Line Impact velocity, or high overall minutes spent, predict the directories that an STTL designates as possessing the greatest tech debt? Possible experiment design.

Does high Line Impact predict which directories an STTL would identify as having "the most energy invested in creating and maintaining them over the past N years?" Possible experiment design.

Does mission-critical code also tend to be the tech debt-ridden code, as predicted by low Line Impact velocity? Possible experiment design. Is there observable correlation between companies whose mission-critical code is slow to evolve, and companies whose revenue or user metrics grow more slowly?
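To make the analysis behind these experiments concrete, here is a minimal, illustrative sketch of how the RS might rank-correlate STTL tech-debt ratings with Line Impact velocity per directory. A rank correlation (Spearman) is one reasonable choice because the STTL supplies ordinal judgements rather than measurements; the directory data below is entirely invented, not real GitClear output.

```python
# Hypothetical sketch: rank-correlating STTL tech-debt ratings with
# Line Impact velocity per directory. All numbers are illustrative.

def ranks(values):
    """Assign 1-based ranks, averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation applied to ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# STTL tech-debt rating per directory (1 = worst perceived debt), paired
# with Line Impact velocity (Line Impact per developer-hour, invented).
sttl_debt_rank = [1, 2, 3, 4, 5]
li_velocity    = [0.4, 0.9, 1.1, 2.3, 3.0]

rho = spearman(sttl_debt_rank, li_velocity)
print(round(rho, 2))  # 1.0 here: lowest velocity tracks worst perceived debt
```

A real analysis would add a significance test and many more directories per RP, but the core question, "does the expert's ordering agree with the metric's ordering?", reduces to a correlation like this one.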


Developer culture & collaboration experiments

To what extent does adding developers to a team slow down the rate at which any individual developer can evolve the repo? Possible experiment design. What are the characteristics of teams where developers slow down less as more are added? Possible experiment design.

Does pair programming slow down repo evolution? If so, how much? Possible experiment design.

How measurably does increased use of Slack & Microsoft Teams reduce an individual developer's Line Impact? Possible experiment design.


Impact stability & predictability experiments

To what extent does an STTL identifying a developer as an "expert" correspond to higher Line Impact? What are the general characteristics of the developers who STTLs identify as "getting the most good code written?" Possible experiment design

How much Line Impact had the project accumulated when it hit its revenue growth inflection point (the highest growth revenue year the business has experienced, per the data we would request of RPs), and how does this vary by industry?

How accurately can it be algorithmically detected when a developer is struggling with a ticket? Possible experiment design

How far in advance can a development roadmap be accurately predicted?


Developer satisfaction experiments

How does Line Impact per developer scale down as a project reaches its second, third, and fourth years?

How is this impacted by the amount of legacy code that is deleted each year?

How closely do commits and LI correspond to story points?

Separate results by bug ticket vs feature ticket

How does developers' self-reported happiness correspond to the percentage of code that is older than 2 years?

How much do the developers judged as “better than expected” [or insert better metric here] revise their code when writing it?
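For the "commits and LI vs. story points" question, the per-ticket aggregation is straightforward once issue-tracker data is connected. Here is a minimal sketch, with entirely invented ticket records, that splits results by bug vs. feature ticket as suggested above:

```python
from collections import defaultdict

# Hypothetical ticket records: type, story points, and Line Impact
# attributed to the ticket's commits. All values are invented.
tickets = [
    {"type": "feature", "story_points": 3, "line_impact": 240},
    {"type": "feature", "story_points": 5, "line_impact": 410},
    {"type": "bug",     "story_points": 2, "line_impact": 90},
    {"type": "bug",     "story_points": 1, "line_impact": 60},
    {"type": "feature", "story_points": 8, "line_impact": 700},
]

def li_per_point_by_type(tickets):
    """Average Line Impact earned per story point, grouped by ticket type."""
    totals = defaultdict(lambda: [0, 0])  # type -> [line_impact, points]
    for t in tickets:
        totals[t["type"]][0] += t["line_impact"]
        totals[t["type"]][1] += t["story_points"]
    return {k: li / pts for k, (li, pts) in totals.items()}

result = li_per_point_by_type(tickets)
print(result)  # feature: 1350/16 = 84.375, bug: 150/3 = 50.0
```

If LI per story point is stable within a ticket type but differs between types, that would itself be an interesting finding about how teams size bug work vs. feature work.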


Interesting ideas, but trickier to study for various reasons

To what extent can past Line Impact performance predict subsequent performance in a newly created repo/project? Tricky because

What is the correlation between directories labeled as "high tech debt" by Line Impact, and the directories in which commits are made that resolve bugs? Tricky because.

What is the correlation between "high velocity Line Impact" and "bugs generated"? Tricky for same reason as above (bug culpability challenging to pinpoint).


Other

Given a particular assignment (such as a two week homework project in a CS college class), how much variation in Line Impact is observed? Were the solutions that consumed more Line Impact graded higher or lower? Possible experiment design.


Research Scientist

We need to select an Independent Scientist who can refine ideas like those above into studies that can be expeditiously undertaken. Here are the attributes and experience we are after in a potential scientist.


What we need

Time, energy and interest to work quickly. We would like to have three studies designed within the first two weeks of selecting our Scientist. We would then like to have preliminary results from each study within 4 weeks of starting them. We request that all three initial studies be progressed concurrently. We hope to find somebody who can complete each study in three months, from design, to recruitment, to evaluation, to statistical analysis and written summary.

Willingness to help recruit participants. We don't have existing participants lined up for these studies, and we don't expect that our existing customers will have time or interest in this. Our desire is to find a Scientist who can formulate studies that are sufficiently intriguing that teams are drawn to participate. The "free membership" incentive we can offer should help, but we will still need the Scientist to lead the charge in generating enough interest and enthusiasm that teams will participate.

Experience having published at least one paper previously. In the long run, we want to work to have the most interesting studies peer-reviewed and published by industry journals. Thus, we are hoping to select a candidate who has navigated this process in the past and can advise us of the elements necessary to get research accepted into the annals of software knowledge.

Statistical analysis chops. Calculation of p-value, r^2, statistical significance, margin of error, and related statistical measures will be needed to substantiate the extent to which our hypotheses are valid.
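As a rough illustration of the measures named above, here is a stdlib-only sketch computing r^2 and a p-value for the kind of correlation question these studies would pose. A permutation test stands in for a parametric significance test so no statistics library is required; the per-RP numbers pairing Line Impact with an STTL throughput score are invented.

```python
import random

# Illustrative sketch of r^2 and a permutation-test p-value.
# All data is made up for demonstration.

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Two-sided p-value: fraction of random re-pairings whose correlation
    is at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    ys = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(ys)
        if abs(pearson_r(xs, ys)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical per-RP data: Line Impact per developer-week vs. an
# STTL-assigned throughput score (both columns invented).
line_impact = [120, 95, 210, 160, 80, 140, 190, 70]
sttl_score  = [6, 5, 9, 7, 4, 6, 8, 3]

r = pearson_r(line_impact, sttl_score)
print(f"r^2 = {r * r:.2f}, p = {permutation_p_value(line_impact, sttl_score):.4f}")
```

In the actual studies, the RS would of course choose the appropriate test (and pre-register it); this only shows the shape of the computation.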


What we want

Existing reputation in field. Finding someone who has published existing research in the field of software engineering would be highly desirable.

Industry connections. Somebody who has existing connections to teams or departments who could be recruited to participate in the studies would ease recruitment challenges.

Can-do spirit. Since we are trying to discover new connections that will solve longstanding problems felt by engineering managers, there are sure to be struggles and setbacks along the way. We hope to find a Scientist who can proactively identify problems that might slow down our results, and propose creative solutions to work around foreseen issues.

Experience with best practices. In modern science, there is much concern over reproducibility and cherry-picked results. We want to work with someone who can devise research that sidesteps these liabilities through thoughtful design and pre-registration of hypotheses.


What we can pay

For the initial three studies that we would pursue, there is no pre-existing budget determined, but we hope that each experiment can be completed for $10,000 or less. According to Payscale, the average Research Scientist earns $25/hour, whereas an "experienced" Research Scientist earns $32/hour. Assuming that we committed to $10,000 per study, that would allow 250 hours of research time per study at a rate of $40/hour, which exceeds the "experienced" Research Scientist rate, so is hopefully fair & agreeable?


We will negotiate a final payment structure based on the experience of the Research Scientist we select. Most likely we will pay on a per-hour basis so that if we decide to abort certain experiments, it won't be complicated to determine what payment is due.


How to apply

Please email GitClear CEO Bill Harding at bill@gitclear.com with the subject line "Line Impact research". Please include in your email:


A 3-5 paragraph cover letter that helps Bill understand your experience in the field, and why you feel you would be the best candidate for us to work with

A link to your LinkedIn profile, if you have one

A list of your published research

A brief description of how you propose to identify Research Participants for the experiments

When you could start and how many hours/week you would be available

Which three of the proposed experiments sound the most interesting to you?

(Optional) One or more professors who would vouch for your proficiency as a Research Scientist


We will review all applications with the intent to make an offer to a Research Scientist by December 18. Hopefully by New Year's we will have a proposed design for the first 2-3 experiments? Bill will likely be unavailable to respond to emails prior to December 15th, so please be patient. 😄


Supporting Documents

Documents created to help the Research Scientist understand how Line Impact is calculated, and how it is applied across different orders of magnitude.


Line Impact calculation


Line Impact benchmarks

See this blog post for a summary of how Line Impact accumulates over several orders of magnitude.