Deleting sensitive files from Git history with BFG Repo-Cleaner

Last updated: January 27, 2024

Overview

There often comes a time in the life of a developer when sensitive data is accidentally pushed to a Git repository. Perhaps it’s a password, an API key, or chunks of confidential information placed within code or files. When this happens, simply deleting the file and pushing a new commit isn’t enough, since the sensitive data remains in the commit history. Hunting down this data over potentially hundreds of commits is a tedious and error-prone process.

This is where BFG Repo-Cleaner comes in. It’s a simpler, faster alternative to Git’s built-in ‘git filter-branch’ command, designed specifically for removing unwanted data such as credentials and oversized files from a repository’s history. Throughout this tutorial, I will guide you through the process of using BFG Repo-Cleaner to remove sensitive files from your Git history safely and efficiently.

Prerequisites

  • Basic knowledge of Git
  • Java Runtime Environment (JRE) version 8 or above, as BFG is a Java program
  • A backup of your repository, in case you need to revert changes

Step 1: Installing BFG Repo-Cleaner

To get started, you must install BFG on your system. Official BFG releases are available from the tool’s website. If your package manager offers a BFG package, installing it is the quickest route:

brew install bfg  # macOS with Homebrew
sudo apt-get install bfg  # Debian-based systems, if your release packages bfg

Alternatively, you can download the jar file directly with:

wget https://repo1.maven.org/maven2/com/madgag/bfg/1.13.0/bfg-1.13.0.jar

Once downloaded, you can run BFG using:

java -jar bfg-1.13.0.jar
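
Typing the full ‘java -jar’ invocation for every command gets repetitive. A minimal sketch of a wrapper function, assuming the jar sits in the current directory; ‘BFG_JAR’ and ‘JAVA_BIN’ are hypothetical variable names introduced for this example, not settings that BFG itself reads:

```shell
# Hypothetical convenience wrapper; BFG_JAR and JAVA_BIN are names made up for this sketch.
BFG_JAR="${BFG_JAR:-bfg-1.13.0.jar}"   # path to the downloaded jar
JAVA_BIN="${JAVA_BIN:-java}"           # Java launcher to use

bfg() {
    # Forward all arguments unchanged to the BFG jar.
    "$JAVA_BIN" -jar "$BFG_JAR" "$@"
}
```

With this in your shell session, ‘bfg --delete-files id_rsa your-repo.git’ is equivalent to the longer form used in the rest of this tutorial. If you installed BFG through a package manager, a ‘bfg’ command probably exists already and this function is unnecessary.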

Step 2: Preparing Your Repository

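Before using BFG, make sure your latest changes are pushed to the remote server and that the newest commit no longer contains the sensitive data: by default, BFG protects the contents of your latest commit and only cleans older history. Then work on a fresh mirror clone for safety: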

git clone --mirror git://example.com/your-repo.git

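The ‘--mirror’ option creates a bare repository in a directory named ‘your-repo.git’. It holds the full Git database with all refs but has no working tree, so you won’t see your project files in it. BFG will operate on this directory.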
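
Since the prerequisites call for a backup, one simple option is to archive the cloned directory before BFG touches it. A minimal sketch using ‘tar’; the function name ‘backup_repo’ and the archive naming scheme are assumptions made for illustration:

```shell
# Hypothetical helper: archive a repository directory before rewriting it.
backup_repo() {
    repo_dir="${1%/}"                                   # e.g. your-repo.git (trailing slash stripped)
    archive="${repo_dir}-backup-$(date +%Y%m%d%H%M%S).tar.gz"
    # Store the archive next to the repository directory.
    tar -czf "$archive" -C "$(dirname "$repo_dir")" "$(basename "$repo_dir")" &&
        printf '%s\n' "$archive"                        # print the archive path on success
}
```

Calling ‘backup_repo your-repo.git’ prints the path of the archive it created, which you can unpack later if the rewrite goes wrong.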

Step 3: Identifying the Sensitive Data

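Identify the names of the sensitive files that you want to remove from your Git history. BFG matches files by name rather than by full path, so a name such as ‘id_rsa’ applies to every file with that name, in any directory and in any commit.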
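
If you are not sure where a sensitive file has lived over time, Git itself can list every path that was ever added to the repository. A rough sketch, where ‘find_in_history’ is a hypothetical helper name and its second argument is an extended regular expression:

```shell
# Hypothetical helper: list every path ever added on any ref
# whose name matches an extended regular expression.
find_in_history() {
    repo="$1"; pattern="$2"
    # --diff-filter=A keeps only additions; the empty format prints file names only.
    git -C "$repo" log --all --diff-filter=A --name-only --pretty=format: |
        grep -v '^$' | sort -u | grep -E "$pattern"
}
```

For example, ‘find_in_history your-repo.git 'id_rsa$'’ prints each distinct path ending in ‘id_rsa’, including files that were deleted in later commits. It works on bare mirror clones as well as on normal working copies.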

Step 4: Running BFG to Remove Specific Files

To remove a specific file (e.g., ‘id_rsa’) from your Git history, execute the following:

java -jar bfg-1.13.0.jar --delete-files id_rsa your-repo.git

This command tells BFG to delete every file named ‘id_rsa’ from the history of the ‘your-repo.git’ repository, except for the contents of the latest commit, which BFG protects by default.

Step 5: Removing Passwords or Strings

BFG can also scrub strings such as passwords or keys from file contents while keeping the files themselves. For example, to replace every occurrence of the strings listed in ‘passwords.txt’:

java -jar bfg-1.13.0.jar --replace-text passwords.txt your-repo.git

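You must first list the strings you wish to scrub in a file named ‘passwords.txt’, one per line. Each line is treated as a literal string unless it is prefixed with ‘regex:’, and each match is replaced with ‘***REMOVED***’ by default. Keep this file outside the repository you are cleaning.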
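
To make the file format concrete, here is a small sketch that writes a ‘passwords.txt’ with three kinds of entries. Every secret in it is a made-up placeholder, and the ‘==>’ separator for a custom replacement follows the format described in BFG’s documentation:

```shell
# Example replacement file; every secret below is a made-up placeholder.
# Line 1: a literal string, replaced with the default ***REMOVED***
# Line 2: a literal string with a custom replacement after ==>
# Line 3: a regular expression (regex: prefix) with a custom replacement
cat > passwords.txt <<'EOF'
hunter2
OLD_API_KEY_VALUE==>REDACTED_KEY
regex:password=\w+==>password=
EOF
```

The explanatory comments sit outside the heredoc on purpose, so that each line of the file contains nothing but the entry itself.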

Step 6: Cleaning Up with ‘git reflog’ and ‘gc’

BFG rewrites your commits and refs, but the old objects stay in the repository until Git discards them. After you’ve run BFG, expire the reflog and run garbage collection to remove them physically:

cd your-repo.git
git reflog expire --expire=now --all
git gc --prune=now --aggressive

This removes the now-unreferenced objects, including the blobs that held the sensitive data, and minimizes your repository size.
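
Before pushing, it is worth confirming that the file really is gone from every ref. A minimal sketch of such a check; ‘history_has_file’ is a hypothetical helper name, and it relies on Git’s ‘:(glob)’ pathspec syntax to match the name in any directory:

```shell
# Hypothetical check: succeeds (exit 0) if any commit on any ref still touches
# a file with the given name, and fails (exit 1) if the history is clean.
history_has_file() {
    repo="$1"; name="$2"
    # ':(glob)**/name' matches the file at the top level and in any subdirectory.
    [ -n "$(git -C "$repo" log --all --format=%H -- ":(glob)**/$name")" ]
}
```

After a successful cleanup, ‘history_has_file your-repo.git id_rsa’ should fail. If it succeeds, some commit still references the file, most likely the protected latest commit.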

Step 7: Pushing the Changes

Once you’re happy with the local modifications:

git push --force

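This will overwrite the remote repository history with your cleaned history. Note that this is a destructive operation: collaborators should re-clone the repository rather than push from their old clones, which would reintroduce the removed data. Also treat any secret that was pushed as compromised and revoke or rotate it, since cleaning the history cannot undo exposure that has already happened.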

Step 8: Protecting Against Accidental Pushes in the Future

It’s a good idea to list sensitive files in a ‘.gitignore’ file so that they are not committed by accident. You can also use a Git ‘pre-commit’ hook to scan for sensitive information before each commit.
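
As a rough illustration of the hook idea, the function below reads a diff on standard input and fails when an added line looks like a secret. The name ‘scan_for_secrets’ is hypothetical, and the two patterns (an AWS-style access key ID and a PEM private key header) are illustrative rather than exhaustive:

```shell
# Hypothetical scanner: reads a unified diff on stdin and returns 1 if an added
# line looks like a secret. The patterns are illustrative, not exhaustive.
scan_for_secrets() {
    # Keep added lines only, drop the '+++ b/file' headers, then look for patterns.
    if grep -E '^\+' | grep -Ev '^\+\+\+ ' |
        grep -Eq 'AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----'; then
        echo "Possible secret in staged changes; commit aborted." >&2
        return 1
    fi
    return 0
}
```

To use it, place the function in an executable ‘.git/hooks/pre-commit’ script and end the script with ‘git diff --cached -U0 | scan_for_secrets’. Git aborts the commit whenever the hook exits with a non-zero status.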

Advanced Usage: BFG can also be used for other history rewrites, such as purging files bigger than a certain size with ‘--strip-blobs-bigger-than’, deleting whole folders by name with ‘--delete-folders’, or protecting the files in additional branches from cleaning with ‘--protect-blobs-from’.

Command for purging files over 10MB:

java -jar bfg-1.13.0.jar --strip-blobs-bigger-than 10M your-repo.git

Conclusion

BFG Repo-Cleaner is a powerful and user-friendly tool for removing unwanted data from your Git history. Used carefully, and combined with good practices for handling secrets, it helps you keep your repositories clean and secure. Remember to back up your data before rewriting history, and make sure your team knows about the change before you force-push.
