Stop Committing Your Secrets - Git Hooks To The Rescue
Dwayne McDaniel is the developer evangelist for GitKraken. This talk was released as part of the Azure Spring Clean 2022 conference. Learn more about GitKraken’s legendary tools for Git, including GitKraken Desktop, GitLens for VS Code, and Git Integration for Jira.
“Honesty is the fastest way to prevent a mistake from turning into a failure.”
James Altucher
We all make mistakes from time to time; it’s part of being human. However, the magnitude of a mistake can vary substantially. For example, forgetting a `;` in a PHP or JavaScript file will cause all sorts of havoc, but is an easy mistake to fix and will probably not cause issues for anyone aside from you while working locally. On the other end of the mistake spectrum, publishing your API keys and passwords publicly can result in compromised systems, stolen data, and loss of trust from your customers.
Unfortunately, that second example of leaking secrets is a real and growing problem. In 2020, over 2 million secrets were detected on public repositories, according to GitGuardian. And the problem is only getting worse. The same report showed a 20% year-over-year increase in the number of secrets being pushed out into public repos.
Most of these leaks are unintentional and do not seem to be malevolent. GitGuardian cites a few reasons for why they are occurring:
- Developers sometimes mix up repositories when working with the same GitHub account
- It’s very easy to accidentally push code before you think to sanitize it
- It’s easy to forget that your entire Git history is publicly visible, even if sensitive data has been deleted
I think everyone can agree that something needs to be done to address this rampant issue, but what can you do? How can we add additional checks without breaking our workflows or adding hours of additional review with each Git push?
Believe it or not, Git offers a straightforward and scalable way to prevent you from leaking secrets in your codebase. With help from the open source community, it’s easy to make sure you never commit a secret again!
Cloud Platforms and Secrets
Every developer eventually deploys their code somewhere, and secrets like API keys, passwords, and certificates, along with the management of those secrets, are at the heart of every deployment. Code can be deployed to a physical server your organization owns and manages, but more likely, you’re deploying code to the cloud. Over the last decade or so, we’ve seen the rise of cloud platforms like Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP). When most developers think of CI/CD pipelines, they’re thinking about these platforms.
All of these platforms offer some sort of secrets management: a way to centralize and store passwords, API keys, tokens, and any other credentials safely and reliably. For example, Azure offers Key Vault, a way to “safeguard cryptographic keys and other secrets used by cloud apps and services”. Without diving too deep, this service gives you a reliable place to store your secrets in a central location and then access your secrets through code. Key Vault itself allows you to configure access policies, so only assigned users can access contained secrets.
In a perfect world, we would all use a secrets management service every single time we authenticate or access a password. In reality, however, there are times when developers don’t use Key Vault or AWS Secrets Manager and simply hardcode an API key or passphrase into their code locally.
While it’s easy to say: “never do this,” there are common situations where developers fall victim to this mistake. For example, maybe you just need to test if the endpoint is responding correctly, or maybe you’re in a hurry, and remembering how to implement the remote Key Vault call is slowing you down. There are a thousand little edge cases when it’s just easier to hardcode a secret while thinking: “I will fix this before I push.” But coming back to actually make the fix happens less than it should, as shown in the GitGuardian report.
Secrets and Local Development
There are multiple approaches to working with secrets locally that can prevent you from hardcoding them into your codebase.
One approach involves setting up a call to your remote secrets store and pulling in the secrets over an encrypted connection. While this is optimal, it adds extra overhead, especially early in a development cycle when you’re testing a new service or endpoints. Taking the time to set up Key Vault is very much worth it once you’re ready to roll it out to a shared Dev or Testing environment, but setting it up for every single service you are sampling is going to be a distraction at best and a waste of everyone’s time at worst.
Another approach is setting up a local `secrets.json` file, or a `~/.aws/credentials` file, to reference from the code. This way, even if someone does get their hands on the code, they will not gain access to the secrets, unless you have committed that file. Here is where Git comes into the story to assist you with your secrets management.
.gitignore Your Secrets
If you have set up a local secrets store, then the easiest way to tell Git to never track it or add it to a commit is to add that filename to a `.gitignore` file. This hidden file typically sits in your project at the same level as your `.git` folder and is checked every single time you run `git add`. If the folder or file you’re trying to stage is listed in the `.gitignore` file, then Git simply ignores it.
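To make the behavior above concrete, here is a minimal sketch run in a throwaway repository. The file name `secrets.json` comes from the article; its contents and the temporary directory are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a local secrets file is ignored by Git.
# The repository location and file contents are illustrative assumptions.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q

# A local secrets store that must never be committed.
printf '{ "api_key": "not-a-real-key" }\n' > secrets.json

# Tell Git to ignore it.
printf 'secrets.json\n' > .gitignore

# check-ignore prints the path and exits 0 when the file is ignored.
ignored="$(git check-ignore secrets.json)"
echo "ignored: $ignored"

# "git add ." stages .gitignore but skips secrets.json.
git add .
staged="$(git diff --cached --name-only)"
echo "staged: $staged"
```

Keep in mind that `.gitignore` only affects files Git is not already tracking. If a secrets file was committed earlier, you would also need `git rm --cached <file>` to stop tracking it, and the old copy would still be in the history.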
Again, in a perfect world, this could be the end of the story; case closed; problem solved. Yet we still see millions of leaked secrets that result from developers hardcoding keys, passwords, and other secrets into their codebases.
There are plenty of non-malicious reasons this can happen. Perhaps you copied-and-pasted a secret into the wrong window on your machine, committing a key for one service in an unrelated public repo. Or maybe there was an issue that resulted in your fork of a private repo becoming public for just enough time to be discovered and cloned. There are many scenarios when even `.gitignore` fails to protect us from exposing secrets.
What we really need is some sort of automation that checks our code every time we update it to look for certain patterns that match our secrets, and stop us from committing them in the first place. Yet again, Git is there to help us save the day!
Git Hooks For Automation
Inside your project’s `.git` folder is a folder called `hooks`. This exists in every single Git repository.
After running `git init` to create a new repository, you will find some `.sample` files in the Git hooks folder. The name of each file corresponds to a trigger in Git. There are actually 17 triggers available within Git.
Git triggers:
- applypatch-msg
- pre-applypatch
- post-applypatch
- pre-commit
- prepare-commit-msg
- commit-msg
- post-commit
- pre-rebase
- post-checkout
- post-merge
- pre-receive
- update
- post-receive
- post-update
- pre-auto-gc
- post-rewrite
- pre-push
These triggers are activated when Git performs certain corresponding actions. For example, running `git commit` fires off the triggers `pre-commit`, `prepare-commit-msg`, and `commit-msg`. As each trigger is fired, Git looks in the Git hooks folder to see if there is an executable file matching the name of the trigger, `pre-commit` for example, and runs any script it finds in that file.
The `.sample` files that appear by default are meant to give you a few suggestions for use cases the Git and Linux teams found helpful as they were building Git and Linux. They can be a little hard to understand if you’re not familiar with the specific commands the samples use, like `git rev-parse`, or even Perl scripting in a few places. But you can add any valid commands from any scripting language, provided the first line of the file names the program that should be invoked, after the shebang (`#!`). To use any of the hooks, simply rename the file to remove the `.sample` extension, or create a new file named after the trigger and make it executable.
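As a rough illustration of how hook files are picked up, the sketch below creates four tiny hooks in a throwaway repository and records the order in which Git runs them during a commit. The log file name and the commit identity are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: install tiny hooks that record the order in which they fire.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
mkdir -p .git/hooks
log="$repo/hook-order.log"   # hypothetical log file, not something Git creates

for trigger in pre-commit prepare-commit-msg commit-msg post-commit; do
  hook=".git/hooks/$trigger"
  # The first line (the shebang) names the interpreter that runs the hook.
  printf '#!/bin/sh\necho "%s" >> "%s"\n' "$trigger" "$log" > "$hook"
  chmod +x "$hook"   # Git skips hook files that are not executable
done

# An empty commit is enough to fire the commit-related triggers.
git -c user.name="Example" -c user.email="example@example.com" \
  commit -q --allow-empty -m "Test the hooks"

fired="$(paste -sd ' ' "$log")"
echo "fired: $fired"
```

Each hook here is a two-line shell script, but the same file could start with a different shebang and be written in any language installed on your machine.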
Here is a fun example of using Git triggers, taking inspiration from git-dad, by Edward Thomson. In the `commit-msg` file we have added a `curl https://icanhazdadjoke.com` which is a service that returns a dad joke. The script will output the result to the screen.
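A hook along those lines might look like the sketch below. Only the `curl https://icanhazdadjoke.com` call comes from the description above; the extra curl options, the `|| true` guard, and the temporary directory standing in for `.git/hooks` are assumptions added so a slow or failed request never blocks a commit.

```shell
#!/usr/bin/env bash
# Sketch: write a commit-msg hook that prints a dad joke.
hooks_dir="$(mktemp -d)"   # stand-in for your repository's .git/hooks folder
hook="$hooks_dir/commit-msg"

cat > "$hook" <<'EOF'
#!/bin/sh
# Print a joke; never block the commit if the request fails or times out.
curl --silent --max-time 5 -H "Accept: text/plain" https://icanhazdadjoke.com || true
echo
EOF
chmod +x "$hook"
echo "wrote $hook"
```

Because the hook always exits with status 0, it is purely decorative: the commit goes through whether or not the joke arrives.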
Git Hooks for Security
You can make hooks do anything you want. Let’s dig into using hooks to check for secrets.
Earlier in this article, we defined that we wanted to automate checks to look for certain patterns that match our secrets and stop us from making a commit. With Git hooks handling the automation part of the plan, let’s move on to the next part: pattern matching.
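Since you want to stop everything when a pattern is found, and Git aborts the commit whenever a `pre-commit` hook exits with a non-zero status, you can use an `if` statement that exits with an error code on a match. In Bash, this would look something like:
if <pattern-is-found>; then
  exit 2
fi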
There are multiple ways you can check for the `pattern-is-found` logic, such as using the implementation of `grep` built into Git CLI, `git grep`.
Git Grep
Grep stands for Global Regular Expression Print. It is an exceptionally powerful search tool available in virtually every Unix-like shell environment, and versions of it are built into multiple other tools, like Git. It will find patterns defined by either an exact string or a regular expression. While grep and regular expressions might seem intimidating at first, they are something every developer eventually runs into; both are rather straightforward and worth the time to learn.
By default, the grep available in your shell will look at any files you specify, or all the files in a directory if you use a wildcard `*`. `git grep` limits the scope of the search to just the tracked files, helping to simplify your commands.
For example: if you wanted to search all your tracked files for the string “password”, you would use:
git grep password
Git will then list any instance of the string in the repository with no additional declaration of targets needed in the command.
Getting a bit fancier, let’s say you want to find any string of capital letters and numbers that is 20 characters long, which is exactly what some API keys look like. In this case, you would use the pattern below, which matches any run of 20 consecutive capital letters or digits, including one that sits inside a longer string:
git grep -E "[A-Z0-9]{20}"
Putting this all together with the `if` statement from before, it would look something like this:
if git grep -E "[A-Z0-9]{20}"; then
  echo "Detected a hardcoded 20 character string I think is an API key"
  exit 2
fi
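To see the check working end to end, here is a minimal sketch that installs it as a `pre-commit` hook in a throwaway repository and tries two commits. It makes one change to the snippet above: `--cached` is added so the scan covers what is staged for the commit rather than the working tree. The file names, key value, and commit identity are all made up.

```shell
#!/usr/bin/env bash
# Sketch: a pre-commit hook that scans staged content for key-like strings.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q
mkdir -p .git/hooks

cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# --cached searches what is staged for the commit, not the working tree.
if git grep --cached -n -E "[A-Z0-9]{20}"; then
  echo "Detected a hardcoded 20 character string I think is an API key" >&2
  exit 2
fi
EOF
chmod +x .git/hooks/pre-commit

commit() {
  git -c user.name="Example" -c user.email="example@example.com" commit -q -m "$1"
}

# A harmless file commits normally.
echo 'greeting = "hello"' > app.conf
git add app.conf
commit "Add config"; clean_status=$?

# A made-up, key-shaped value is rejected by the hook.
echo 'api_key = "ABCDEFGHIJ0123456789"' >> app.conf
git add app.conf
commit "Add key"; leaky_status=$?

echo "clean commit exit: $clean_status, leaky commit exit: $leaky_status"
```

The first commit succeeds, and the second is aborted because the hook exits with a non-zero status. Git prints the matching file and line so you can see what tripped the check.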
Armed with this knowledge and a little trial and error, you can build safeguards for almost any pattern and situation!
Just remember, if you build this hook, you’re on the hook to build and maintain those hooks. You will need to allow specifically approved strings that match a pattern you are scanning for, like ‘EXAMPLEKEY20CHARLONG’ which just shows an example of the needed key length. The code from above will stop this from being committed, even though it is OK to share that particular string. You will also need to maintain this over time and account for other use cases as they emerge. There are likely a lot of things you can think of right now that would quickly turn this into a larger and more complex project.
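One way to handle approved strings, sketched under some assumptions, is to extract only the matching text and filter it against an allowlist. The `.secrets-allowed` file name and the `find_unapproved` function are hypothetical, and `git grep -o` needs a reasonably recent Git (2.19 or later).

```shell
#!/usr/bin/env bash
# Sketch: the same scan a pre-commit hook would run, plus an allowlist.
repo="$(mktemp -d)"
cd "$repo" || exit 1
git init -q

# Hypothetical allowlist file: one approved literal string per line.
printf 'EXAMPLEKEY20CHARLONG\n' > .secrets-allowed

# Succeeds (exit 0) only when a staged key-like string is NOT on the allowlist.
find_unapproved() {
  # -h drops file names, -o prints only the matched text, one match per line.
  git grep --cached -h -o -E "[A-Z0-9]{20}" \
    | grep -v -x -F -f .secrets-allowed
}

# Only the approved example string is staged: nothing to report.
echo 'docs_example = "EXAMPLEKEY20CHARLONG"' > README.txt
git add .
if find_unapproved; then approved_only=no; else approved_only=yes; fi

# A made-up key-shaped value is staged: it gets reported.
echo 'api_key = "ABCDEFGHIJ0123456789"' > app.conf
git add .
if find_unapproved; then leak_found=yes; else leak_found=no; fi

echo "approved_only=$approved_only leak_found=$leak_found"
```

Inside a real hook, you would call the function in the `if` condition and `exit 2` when it succeeds. Even this small version hints at the maintenance burden: every new exception means another line in the allowlist.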
Good news! Someone else already did all the heavy lifting of sorting out the major hurdles and made it available for free on GitHub.
AWS git-secrets
While this article was made for the Azure Spring Clean event, we need to give credit where due. Here, that credit is owed to the AWS Labs team that created the git-secrets project. They thought through the secrets use case and made a set of Git hooks and a corresponding lightweight library that can check for patterns, allow for exceptions, and allow you to add additional patterns specific to your projects, in addition to offering some other cool features.
Using it is as simple as:
- Installing git-secrets on your local machine
- Running `git secrets --install` in any of your Git tracked project folders
- Running `git secrets --register-aws`
Now your project will look for the following and stop any commit dead in its tracks if it finds one:
- AWS Access Key ID patterns
- AWS Secret Access Key assignments via “:” or “=” surrounded by optional quotes
- AWS account ID assignments via “:” or “=” surrounded by optional quotes
- Known credentials from ~/.aws/credentials
It also ships with allowed patterns for AWS’s example keys, so those do not trigger a failure. You can add your own exceptions as well!
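The setup steps, plus a custom pattern and a custom exception, can be sketched as follows. The script assumes git-secrets is already on your `PATH` and skips everything if it is not. The `MYSERVICE_` pattern and its allowed example value are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: set up git-secrets in a throwaway repo, if it is installed.
if ! command -v git-secrets >/dev/null 2>&1; then
  echo "git-secrets is not installed; skipping"
  setup_status="skipped"
else
  repo="$(mktemp -d)"
  cd "$repo" || exit 1
  git init -q

  git secrets --install        # adds the git-secrets hooks to this repository
  git secrets --register-aws   # adds the AWS patterns to this repo's Git config

  # Add a custom prohibited pattern and an allowed exception (both made up).
  git secrets --add 'MYSERVICE_[A-Z0-9]{20}'
  git secrets --add --allowed 'MYSERVICE_EXAMPLEKEY20CHARLONG'

  git secrets --list           # show the patterns now configured
  setup_status="installed"
fi
echo "setup: $setup_status"
```

After this, `git secrets --scan` can be run by hand to check the files in the repository without waiting for a commit.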
Git-secrets for Azure and Beyond
I know some people might be thinking at this point “but what about Azure?” and some people might be thinking about GCP or some other service. Thanks to AWS Labs making this all open source, the community at large has your platform of choice covered, with the option of extending it yourself!
For Azure, grab the fork of git-secrets that adds a `--register-azure` option. There are actually a few repos that do this, but one of the first you will find when you search is from GitHub user msalemcode.
The other giant benefit of this being an open source project is you can look at exactly how the code works and learn from it to extend it to your own use cases. In fact, the example you saw earlier using `git grep` is derived from the git-secrets repo.
Improve Your Workflow With Git Hooks
We are all human and bound to make mistakes sometimes. The best time to make a mistake is when working locally before you commit anything to your repo, and long before you push your code anywhere else. Git hooks let you automate just about anything. With a little help from the open source community, you can make it so you never accidentally commit a secret again! This is just one of many ways you can leverage Git hooks. To dig in more on this subject, check out the super helpful website from Matthew Hudson – GitHooks.com.
While you are at it, check out GitKraken Desktop, a Git GUI and CLI client that can actually pick up and leverage Git hooks to make your use of Git safer and easier, and lets you really unleash Git’s full power!