All source code is accessed via secure APIs provided by the git hosts we support: GitHub, GitLab, Bitbucket and Azure DevOps. These APIs are accessed over HTTPS using ephemeral access tokens that we shall subsequently refer to as "Provider Tokens." Git is not accessed directly, so our servers never receive a copy of your git repo. We only process code content received via specific provider API responses (e.g., to get the diff between two commits).
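As an illustration of what "a specific API response" can look like, the sketch below builds (but does not send) an HTTPS request for the diff between two commits using GitHub's public REST "compare two commits" endpoint. It is an assumption-laden example, not GitClear's implementation: the function name, the owner/repo values, the commit identifiers and the token string are all hypothetical placeholders.

```python
import urllib.request


def build_compare_request(owner, repo, base, head, provider_token):
    """Build an HTTPS request for the diff between two commits.

    Uses GitHub's REST endpoint GET /repos/{owner}/{repo}/compare/{base}...{head}.
    No git clone is involved: only this one API response would be processed.
    """
    url = f"https://api.github.com/repos/{owner}/{repo}/compare/{base}...{head}"
    return urllib.request.Request(
        url,
        headers={
            # The ephemeral Provider Token is sent as a bearer credential.
            "Authorization": f"Bearer {provider_token}",
            "Accept": "application/vnd.github+json",
        },
        method="GET",
    )


# Hypothetical values, for illustration only.
req = build_compare_request(
    "example-org", "example-repo", "abc123", "def456", "EPHEMERAL_TOKEN"
)
```

Other providers expose comparable endpoints with their own URL shapes and authorization headers, so a real integration would need one such builder per provider.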
Provider Tokens are granted to GitClear only after the customer completes an OAuth2 login from GitClear to their git provider. We request the minimal set of permissions possible (this varies by git provider) needed to receive commit data pertaining to the repos that the customer selected for import during the account onboarding process.
As is the default for OAuth2-based connections, GitClear never requests nor receives the customer's password at their git provider. All Provider Tokens are SSL encrypted in transit between the git provider and GitClear's app servers, and are then stored encrypted in the database using an encryption key that is unavailable to the database server.
The Provider Token can be invalidated at any time through several paths. On GitClear, you can visit "Settings" -> "Repo Providers" -> "Revoke Access", or "Settings" -> "Account" -> "Close Account," either of which will delete your Provider Token(s). Your git provider (GitHub, GitLab, Bitbucket or Azure DevOps) also provides a means to invalidate the token from their side. After the token has been invalidated, all GitClear access to your data is permanently severed unless you choose to re-establish the connection at a later point.
We do not clone your repo or use other forms of direct git access. GitClear works through secure provider APIs to analyze git commit details. This approach means GitClear never needs to possess a full copy of the repo's contents.
GitClear is engineered to work without storing code content. When we analyze a commit, we create a non-decryptable (one-way transform) MD5 representation of your code content. This lets us continue to identify connections between unique code lines without needing to persist those lines' actual code content.
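A minimal sketch of the idea, assuming each code line is fingerprinted independently with MD5: two commits can be compared by their digests alone, so matching lines are identified without keeping the original text. The function name, the whitespace normalization and the sample lines are assumptions made for illustration, not a description of GitClear's actual pipeline.

```python
import hashlib


def fingerprint_line(line: str) -> str:
    """Return the hex MD5 digest of a code line (a one-way transform)."""
    # Assumption: surrounding whitespace is stripped before hashing so that
    # re-indented lines still match. The real normalization is not documented.
    normalized = line.strip()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Hypothetical line content from two versions of the same file.
old_commit = ["def total(items):", "    return sum(items)"]
new_commit = ["def total(items):", "    return sum(i.price for i in items)"]

# Only the digests need to be retained to relate lines across commits.
old_prints = {fingerprint_line(line) for line in old_commit}
new_prints = [fingerprint_line(line) for line in new_commit]

# True where a line in the new commit also existed in the old one.
unchanged = [fp in old_prints for fp in new_prints]
```

Here `unchanged` flags the first line as carried over and the second as new, using nothing but the stored digests.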
Many teams wish to have their code line content temporarily cached so that they can review specific commit details, e.g., in the Commit Activity Browser. For these customers we provide the option to control how long code line content is cached, through "Settings" -> "Commit Processing." Current options are "No cached code," "Cache up to two weeks" (the default selection) or "Cache up to three months."
Cached code content is encrypted in the database using an encryption key that is unavailable to the database server. After the caching window passes, code line content is flushed, leaving only the connection graph between the code lines, without the code content itself.
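The flush described above can be pictured with the following sketch, which clears expired line content while keeping each line's fingerprint so the connection graph survives. The `CachedLine` record, the `flush_expired` function and the sample dates are hypothetical, and encryption at rest is deliberately left out to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class CachedLine:
    fingerprint: str          # one-way digest, kept indefinitely
    content: Optional[str]    # cached code text, None once flushed
    cached_at: datetime


def flush_expired(lines: List[CachedLine], window: timedelta, now: datetime) -> int:
    """Clear content older than the caching window; return how many were flushed."""
    flushed = 0
    for line in lines:
        if line.content is not None and now - line.cached_at > window:
            line.content = None  # the fingerprint stays, the code text does not
            flushed += 1
    return flushed


# Hypothetical cache state with the default two-week window.
now = datetime(2024, 1, 31)
lines = [
    CachedLine("a1", "return total", datetime(2024, 1, 1)),
    CachedLine("b2", "print(total)", datetime(2024, 1, 25)),
]
flushed_count = flush_expired(lines, timedelta(weeks=2), now)
```

After the call, the line cached on 1 January has no content but still carries its fingerprint, while the line cached on 25 January remains readable until its own window passes.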
All data in disconnected git repos is purged from our database within one business day.
Customers using GitClear Enterprise receive all the protections described above, applied in the context of their own cloud or data center.
Unless the customer has enabled exception tracking for debugging (an option during product installation), no code data is transmitted to GitClear's servers at any time when using Enterprise Edition. If the customer enables exception tracking (recommended but not necessary), an exception will trigger a small amount of data, usually less than 1 KB, to be sent to GitClear's secure error tracking server. This data helps GitClear developers diagnose system errors without needing access to the Enterprise customer's application, database, or logs.