Privacy Tools Guide

Accidentally committing sensitive information to a public repository is one of the most common and potentially devastating security mistakes developers make. Whether it’s an API key, password, private key, or personal information, once committed to Git history, removing it completely becomes extraordinarily difficult. This guide covers strategies and tools to prevent accidental commits of sensitive data in the first place.

Table of Contents

The Gravity of Accidental Commits

Unlike traditional file deletion, Git’s distributed nature means that once you commit and push sensitive data, it exists in every clone of the repository. Even after removing the file from subsequent commits, the original commit remains in the history. Attackers actively scan public repositories for exposed credentials, using automated tools that can detect API keys, passwords, and other secrets within minutes of publication.

The consequences extend beyond immediate security risks. Exposed credentials can lead to unauthorized access to your systems, data breaches, financial losses, and reputational damage. For organizations, this can result in regulatory violations, compliance failures, and legal liabilities. Preventing these commits is far easier than cleaning them up after the fact.

Pre-Commit Hooks: Your First Line of Defense

Git hooks are scripts that run at specific points in the Git workflow. Pre-commit hooks execute before a commit is recorded, making them ideal for scanning changes for sensitive content before they enter your repository history.

Setting Up Pre-Commit Framework

The pre-commit framework provides a standardized way to manage and maintain multiple pre-commit hooks. It offers a declarative configuration format and handles hook installation and updates automatically.

# Install pre-commit
pip install pre-commit

# Create configuration file
cat > .pre-commit-config.yaml << 'EOF'
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files

  - repo: https://github.com/Yelp/detect-secrets
    rev: v1.4.0
    hooks:
      - id: detect-secrets
        args: ['--baseline', '.secrets.baseline']
EOF

# Install hooks
pre-commit install

Built-in Pre-Commit Hooks

Git includes several useful pre-commit hooks by default. The pre-commit framework provides access to many more community-maintained hooks that can catch common issues before they become problems.

The check-added-large-files hook prevents accidentally committing files larger than a specified size, which helps avoid bloating repositories with binary files or large datasets that shouldn’t be stored in Git. The check-yaml and check-json hooks validate configuration files, catching syntax errors early. The trailing-whitespace hook eliminates accidental trailing spaces that can cause unnecessary diff noise.

Detect-Screts: Enterprise-Grade Secret Detection

Yelp’s detect-secrets is a powerful tool designed specifically for detecting credentials in code repositories. Unlike simple pattern matchers, it uses heuristics and entropy analysis to identify potential secrets while minimizing false positives.

Installing and Configuring

# Install detect-secrets
pip install detect-secrets

# Initialize in your repository
detect-secrets init

# Create a custom baseline to whitelist known secrets
detect-secrets scan > .secrets.baseline

Custom Rules for Your Organization

You can extend detect-secrets with custom plugins to detect organization-specific secret formats, internal API keys, or proprietary credential patterns.

# Add custom regex patterns
detect-secrets audit .secrets.baseline

# Example: Add custom plugin for internal API keys
cat >> .secrets.baseline << 'EOF'
{
  "custom_plugin_paths": ["custom_plugins/"],
  "plugins_used": [
    {"name": "AWSKeyDetector"},
    {"name": "AzureStorageKeyDetector"},
    {"name": "BasicAuthDetector"},
    {"name": "CloudantDetector"},
    {"name": "IbmCloudIamDetector"},
    {"name": "KeywordDetector"},
    {"name": "PrivateKeyDetector"},
    {"name": "SlackDetector"},
    {"name": "SoftlayerDetector"},
    {"name": "StripeDetector", "path": "custom_stripe.py"}
  ]
}
EOF

Git-Secrets: AWS-Focused Credential Protection

Originally developed by AWS, git-secrets prevents commits containing sensitive patterns. While it was designed with AWS credentials in mind, it can be configured to detect any pattern.

Installation and Setup

# Install git-secrets
git clone https://github.com/awslabs/git-secrets.git
cd git-secrets
make install

# Register with Git
cd /path/to/your/repo
git secrets --install

Configuring Patterns

# Add patterns to prohibit (use regex)
git secrets --add '[A-Z0-9]{20}'   # Generic AWS access key pattern
git secrets --add 'AKIA[0-9A-Z]{16}'  # AWS access key ID
git secrets --add 'aws_secret_access_key'
git secrets --add '-----BEGIN RSA PRIVATE KEY-----'

# Add prohibited patterns from file
git secrets --add-patterns --from-file /path/to/patterns.txt

# Add AWS credentials globally (for all repos)
git secrets --register-aws --global

Testing Your Configuration

# Test that git-secrets works
echo "AKIAIOSFODNN7EXAMPLE" | git secrets --test
# Should output: okay (no match) or matches pattern

# Try committing a secret (this should fail)
echo "AKIAIOSFODNN7EXAMPLE" > test.txt
git add test.txt
git commit -m "Test commit with secret"
# Output: fatal: commit contains a prohibited pattern

Gitleak: Scanning for Exposed Secrets

While pre-commit hooks prevent new secrets from entering your repository, gitleak provides scanning for your entire Git history, including existing repositories that may already contain exposed secrets.

History Scanning

# Install gitleak
brew install gitleak  # macOS
# or
wget https://github.com/zricethezav/gitleaks/releases/download/v8.18.0/gitleaks_8.18.0_linux_x64.tar.gz
tar -xzf gitleaks_8.18.0_linux_x64.tar.gz
sudo cp gitleaks /usr/local/bin/

# Scan entire history
gitleaks detect --source . --verbose

# Scan specific commit range
gitleaks detect --source . --start-commit abc123 --end-commit def456

# Generate JSON report
gitleaks detect --source . --report-format json --report-path leaks.json

Continuous Integration Integration

# .github/workflows/secrets-scan.yml
name: Secret Scanning
on: [push, pull_request]

jobs:
  gitleaks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_ENABLE_COMMENTS: false
          GITLEAKS_LOG_LEVEL: debug

Environment Variables: Keep Secrets Out of Code

The fundamental principle of secret management is never hardcoding credentials in your source code. Instead, use environment variables and secret management solutions.

Proper Environment Variable Usage

# Create environment file template (never commit the actual values)
cat > .env.example << 'EOF'
# Database credentials
DATABASE_URL=postgres://user:password@localhost:5432/db
DATABASE_PASSWORD=your_password_here

# API keys
API_KEY=your_api_key_here
SECRET_KEY=your_secret_key_here
EOF

# Add .env to .gitignore
echo ".env" >> .gitignore
echo ".env.local" >> .gitignore
echo ".env.*.local" >> .gitignore

# Load environment variables in your application
# Python example using python-dotenv
pip install python-dotenv

# main.py
from dotenv import load_dotenv
import os

load_dotenv()  # Loads .env file

api_key = os.getenv('API_KEY')
# Use api_key in your application

Secret Management Solutions

For production environments and teams, consider dedicated secret management tools that provide encryption, access control, and audit logging.

# Docker Swarm secrets example
echo "my_secret_password" | docker secret create db_password -

# Kubernetes secrets
kubectl create secret generic db-credentials \
  --from-literal=username=admin \
  --from-literal=password=secret_password

# HashiCorp Vault example
export VAULT_ADDR="https://vault.example.com:8200"
vault kv get -format=json secret/database/credentials

.gitignore: Foundation of Repository Hygiene

A properly configured .gitignore file prevents accidental commits of files that should never enter your repository.

Essential Patterns for Every Project

# Environment files with secrets
.env
.env.local
.env.*.local

# Dependencies
node_modules/
venv/
__pycache__/
*.pyc

# Build outputs
dist/
build/
*.egg-info/

# IDE configurations
.idea/
.vscode/
*.swp
*.swo

# OS files
.DS_Store
Thumbs.db

# Logs
*.log
logs/

# Credentials and keys
*.pem
*.key
*.p12
credentials.json
service-account.json

Project-Specific Additions

Different types of projects require additional patterns. Add these based on your technology stack:

# Node.js
npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Python
.Python
env/
.venv
pip-log.txt
pip-delete-this-directory.txt

# Ruby
.bundle/
vendor/bundle

# Java/Maven
target/
pom.xml.tag
pom.xml.releaseBackup

# iOS/macOS
xcuserdata/
*.xcuserstate

Branch Protection and Code Review Requirements

Beyond individual developer tools, organizational-level protections add additional safeguards against accidental exposure.

GitHub Branch Protection

Configure branch protection rules in your GitHub repository settings to require code review before merging.

# GitHub branch protection rules (via API)
{
  "required_status_checks": {
    "strict_required_status_checks": true,
    "contexts": ["secret-scanning", "detect-secrets"]
  },
  "required_reviews": {
    "dismiss_stale_reviews": true,
    "require_code_owner_reviews": true,
    "required_reviewers": 1
  },
  "restrictions": null,
  "allow_force_pushes": false,
  "allow_deletions": false,
  "require_signed_commits": true
}

Required Status Checks

Integrate secret scanning into your CI/CD pipeline to block merges containing potential secrets.

# GitHub Actions workflow with secret scanning
name: Security Checks

on:
  pull_request:
    branches: [main, develop]

jobs:
  secret-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run detect-secrets
        run: |
          pip install detect-secrets
          detect-secrets scan --baseline .secrets.baseline
          detect-secrets audit .secrets.baseline || exit 1

      - name: Run gitleaks
        uses: gitleaks/gitleaks-action@v2

Credential Scanning in IDEs

Catching secrets before they’re committed is most effective when integrated into your development environment.

VS Code Extensions

// Recommended VS Code settings for secret detection
{
  "extensions.recommended": [
    "awslabs.copilot",
    "strings.string-automation"
  ],
  "security.activeSecretScanning": true,
  "security.secretScanning.enabled": true
}

IntelliJ/WebStorm Configuration

Enable built-in inspections for hardcoded credentials in JetBrains IDEs:

  1. Go to Settings → Editor → Inspections
  2. Enable “Hardcoded credentials” inspection
  3. Configure to flag string literals that match common credential patterns

Emergency Response: When Prevention Fails

Despite all precautions, if you discover an exposed secret, act immediately.

Immediate Remediation Steps

# 1. Revoke the exposed credential immediately
# Contact the service provider if needed

# 2. Remove the secret from Git history
git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch path/to/sensitive/file' \
  --prune-empty --tag-name-filter 'cat' -- --all

# Or use git-filter-repo (recommended)
pip install git-filter-repo
git-filter-repo --invert-paths --path path/to/sensitive/file

# 3. Force push (with caution)
git push origin --force --all
git push origin --force --tags

# 4. Notify all team members to re-clone
# Expose the new history to all collaborators

Prevention Going Forward

After any incident, review what failed in your prevention strategy and implement additional safeguards:

# Audit your current setup
git secrets --install
pre-commit install
detect-secrets init

# Add specific patterns for what was exposed
git secrets --add 'your_specific_pattern_here'

# Set up CI scanning
# Document the incident and response

Building a Culture of Security

Technical tools are most effective when supported by organizational practices and awareness.

Team Training

Policy Implementation

# Secret Management Policy

## Required Practices

1. All credentials must be stored in environment variables or secret management systems
2. Pre-commit hooks must be installed on all developer workstations
3. Branch protection must be enabled on production branches
4. All CI/CD pipelines must include secret scanning
5. Regular audits of repository history must be conducted

## Prohibited Practices

1. Hardcoding credentials in source code
2. Committing .env files or similar configuration files
3. Sharing credentials in chat or email
4. Using the same credentials across multiple environments

## Incident Response

1. Immediate credential revocation
2. Team notification within 1 hour
3. Root cause analysis within 24 hours
4. Process improvement implementation within 1 week

Built by theluckystrike — More at zovo.one