Scan staged Git files for secrets like API keys.

These details have not been verified by PyPI

🕵️‍♂️ API Hunter

API Hunter is a lightweight Python tool that scans staged Git files for sensitive credentials like API keys, access tokens, and private keys before you commit them.

🔒 Catch secrets before they leak. Simple, fast, and configurable.

🚀 Features

✅ Scans only files added via git add (i.e., staged files)
🔍 Detects:
- AWS, GitHub, Google, Slack tokens
- API keys, auth tokens, secret keys
- Private keys and database URLs
⚡️ Async scanning for fast performance
🛠️ Custom regex patterns (add/remove/display)
🤖 Gemini integration for automatic regex generation (with API key anonymization)
🎨 Colored output for better readability
📊 Verbose mode to show matched patterns
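
The staged-only and async features can be pictured together with a minimal sketch. It assumes staged paths are read with `git diff --cached --name-only` and that files are scanned concurrently via `asyncio`. The function names and the one-entry pattern table are hypothetical, and this is not necessarily how API Hunter implements it.

```python
import asyncio
import re
import subprocess
from pathlib import Path

# Hypothetical one-entry pattern table; the regex is the AWS access key
# pattern from the project's default pattern list.
PATTERNS = {"aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}")}


def staged_files() -> list[str]:
    """Return paths staged with `git add` (added, copied, or modified)."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def scan_file(path: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, pattern name) for every match in one file."""
    hits = []
    text = Path(path).read_text(errors="ignore")
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((path, lineno, name))
    return hits


async def scan_all(paths: list[str]) -> list[tuple[str, int, str]]:
    """Scan files concurrently by running the blocking reads in worker threads."""
    results = await asyncio.gather(*(asyncio.to_thread(scan_file, p) for p in paths))
    return [hit for file_hits in results for hit in file_hits]
```

Inside a Git repository, `asyncio.run(scan_all(staged_files()))` would return the hits for everything currently staged.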

📦 Installation

From PyPI

pip install api-hunter

🗑️ Uninstallation

Remove the package

pip uninstall api-hunter

Remove created files and directories

The tool creates configuration files in your home directory. To completely remove all traces:

# Remove the configuration directory and all its contents
rm -rf ~/api_hunt_envs/

This will remove:

- Custom regex patterns (custom_pattern.json)
- Gemini API key configuration
- Any other configuration files created by the tool

🛠️ CLI Commands

Scan all staged files in git repo:
```
hunt
```
Scan all staged files with verbose output:
```
hunt -v
```
Scan a specific file:
```
hunt -n path/to/file.py
```
Scan a specific file with verbose output:
```
hunt -n path/to/file.py -v
```
Add a custom pattern:
```
hunt -a "my_service" "my_service_[a-zA-Z0-9]{32}"
```
Note: Use unique key names when adding patterns, as the remove command identifies patterns by their key name.
Remove a custom pattern:
```
hunt -r "my_service"
```
Display all custom patterns:
```
hunt -d
```
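
The add, remove, and display commands behave like operations on a map keyed by pattern name. The sketch below assumes the patterns are stored as a JSON object of name to regex. The on-disk layout and the function names are assumptions, not taken from the tool's source.

```python
import json
import re
from pathlib import Path


def load_patterns(path: Path) -> dict[str, str]:
    """Read the name -> regex mapping, or return an empty one if the file is missing."""
    return json.loads(path.read_text()) if path.exists() else {}


def add_pattern(path: Path, name: str, regex: str) -> None:
    """Validate the regex, then store it under `name` (an existing name is overwritten)."""
    re.compile(regex)  # raises re.error if the pattern is invalid
    patterns = load_patterns(path)
    patterns[name] = regex
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(patterns, indent=2))


def remove_pattern(path: Path, name: str) -> bool:
    """Delete the pattern stored under `name`; return False if it was not present."""
    patterns = load_patterns(path)
    if name not in patterns:
        return False
    del patterns[name]
    path.write_text(json.dumps(patterns, indent=2))
    return True
```

With a name-keyed store like this, reusing a key name replaces the earlier pattern, which is one reason unique names matter.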
Configure Gemini API key:
```
hunt -c "your-gemini-api-key"
```
Generate and add a regex for a new API key using Gemini:
```
hunt -re "my_service" "sk-abc123..."
```
Note: The tool randomizes the digits in your API key before sending it to Gemini, so the exact key is not transmitted.
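
Digit randomization of this kind can be sketched in a few lines. The function name is hypothetical and the tool's own implementation may differ.

```python
import re
import secrets
import string


def randomize_digits(key: str) -> str:
    """Replace every ASCII digit with a random digit; all other characters are kept."""
    return re.sub(r"[0-9]", lambda _: secrets.choice(string.digits), key)
```

In this sketch the prefix, letters, and length pass through unchanged, which is what lets a model infer the key's shape from the altered sample.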

💡 Example Usage

# Scan all staged files in your git repo
hunt

# Scan all staged files with verbose output (shows matched patterns)
hunt -v

# Scan a specific file
hunt -n path/to/file.py

# Scan a specific file with verbose output
hunt -n path/to/file.py -v

# Add a custom pattern
hunt -a "my_service" "my_service_[a-zA-Z0-9]{32}"

# Remove a custom pattern
hunt -r "my_service"

# Display all custom patterns
hunt -d

# Configure Gemini API key
hunt -c "your-gemini-api-key"

# Generate and add a regex for a new API key using Gemini
hunt -re "my_service" "sk-abc123..."

📝 Notes

- Custom patterns and the Gemini API key are stored in ~/api_hunt_envs/.
- Only files with common code/config extensions are scanned (see api_hunt/patterns.py).
- For Gemini integration, you need a valid Google Gemini API key.
- Privacy: When generating a regex for your API key using Gemini, API Hunter randomizes the digits in your key before sending it to the LLM, so the exact key is not sent to a third party.
- Verbose Mode: Use the -v flag to see the matched patterns in colored output (yellow for file names, green for line numbers, red for matched patterns).
- Custom Patterns: Use unique key names when adding custom patterns, because the remove command identifies the pattern to delete by its key name.
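
The verbose-mode color scheme can be illustrated with standard ANSI escape codes. This is a sketch only: the helper name and the exact output layout are assumptions.

```python
# Standard ANSI SGR codes for foreground colors, plus the reset code.
YELLOW, GREEN, RED, RESET = "\033[33m", "\033[32m", "\033[31m", "\033[0m"


def format_hit(path: str, lineno: int, matched: str) -> str:
    """Render one finding: yellow file name, green line number, red matched text."""
    return f"{YELLOW}{path}{RESET}:{GREEN}{lineno}{RESET}: {RED}{matched}{RESET}"
```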

⚠️ False Positives & Detection Trade-offs

API Hunter uses pattern-based detection, so it can produce false positives: strings that look like secrets but are not sensitive. This is an inherent limitation of regex-based scanning.

Why False Positives Occur

- Pattern matching can't distinguish between real API keys and similar-looking strings
- Some legitimate code may contain strings that match secret patterns
- Generic patterns (like api_key) may catch variable names or example values
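
A small demonstration of the last point, using a hypothetical generic rule (not API Hunter's actual regex) that flags an `api_key`-style name assigned a quoted value:

```python
import re

# Hypothetical generic rule: an api_key-style name assigned a quoted value
# of at least 8 characters.
GENERIC_API_KEY = re.compile(r"""(?i)api_?key\s*[:=]\s*['"]([^'"]{8,})['"]""")

lines = [
    'API_KEY = "9f8e7d6c5b4a39281706f5e4"',  # plausibly a real credential
    'api_key = "your-api-key-here"',         # documentation placeholder
    'apiKey: "replace-me-before-deploy"',    # template value
]

# Every line matches: the regex sees only the shape of the assignment,
# not whether the value is a live secret.
flagged = [line for line in lines if GENERIC_API_KEY.search(line)]
```

All three lines end up in `flagged`, so a reviewer still has to decide which hits are real.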

The Trade-off: Detection vs. Privacy

We prioritize detection over precision because:

- Better safe than sorry: It's better to catch a false positive than miss a real secret
- Manual review: You can quickly verify if a detected pattern is actually sensitive
- Privacy protection: Pattern-based detection keeps your actual secrets local

Alternative: LLM-Based Filtering

While we could use an LLM to filter out false positives, this would require:

- Sending your detected patterns to an external service
- Risk of exposing real secrets to third-party LLMs
- Additional API costs and latency

Our approach: Keep detection local and let you manually review results, ensuring your secrets never leave your machine.

🔍 Default Detection Patterns

API Hunter comes with built-in patterns to detect various types of secrets:

Cloud Services

- AWS: Access Keys (`AKIA[0-9A-Z]{16}`), Secret Keys (`[0-9a-zA-Z/+]{40}`)
- Google: API Keys (`AIza[0-9A-Za-z\-_]{35}`)
- Azure: Generic keys (`[a-f0-9]{32}` or `[A-Za-z0-9+/=]{40,}`)
- DigitalOcean: API Tokens (`dop_v1_[0-9a-f]{64}`)
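
To see how these patterns behave on real input, the sketch below compiles three of them exactly as listed and reports which ones match a line. The dictionary keys and the helper name are hypothetical.

```python
import re

# Regexes copied from the Cloud Services list above.
CLOUD_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "aws_secret_key": re.compile(r"[0-9a-zA-Z/+]{40}"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z\-_]{35}"),
}


def matching_patterns(line: str) -> list[str]:
    """Return the names of all patterns that match somewhere in the line."""
    return [name for name, rx in CLOUD_PATTERNS.items() if rx.search(line)]
```

The access key pattern is tightly anchored by its `AKIA` prefix. The 40-character secret key pattern has no prefix, so it also matches any 40-character hexadecimal string such as a Git commit SHA-1.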

Development Platforms

- GitHub: Personal Access Tokens (`gh[opusr]_[0-9a-zA-Z]{36}`)
- Slack: Bot/App Tokens (`xox[boaprs]-[0-9]{12}-[0-9]{12}-[0-9a-zA-Z]{24}`)

AI Services

- OpenAI: API Keys (`sk-[a-zA-Z0-9]{48}`, `sk-proj-[a-zA-Z0-9]{48}`)
- Claude: API Keys (`sk-ant-api03-[a-zA-Z0-9\-_]{95}`)

Database & Backend

- Supabase: JWT Tokens (`eyJ[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+\.[a-zA-Z0-9_\-]+`), Service Keys (`sbp_[a-zA-Z0-9]{40}`)
- MongoDB Atlas: Connection Strings (`mongodb\+srv:\/\/[^:\s]+:[^@\s]+@[^\/\s]+`)
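
The connection-string pattern can be read as "scheme, user, colon, password, at sign, host". A minimal sketch applying it, with a hypothetical helper name and made-up example URIs:

```python
import re
from typing import Optional

# Regex copied from the MongoDB Atlas entry above.
MONGODB_SRV = re.compile(r"mongodb\+srv:\/\/[^:\s]+:[^@\s]+@[^\/\s]+")


def find_connection_string(line: str) -> Optional[str]:
    """Return the credential-bearing part of a mongodb+srv URI, if one is present."""
    match = MONGODB_SRV.search(line)
    return match.group(0) if match else None
```

The match stops at the first `/` after the host, so the database name is not part of the hit. This particular pattern requires the `mongodb+srv` scheme, so a plain `mongodb://` URI is not matched by it.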

Payment Services

- Stripe: Live/Test Keys (`sk_live_[0-9a-zA-Z]{24}`, `sk_test_[0-9a-zA-Z]{24}`, etc.)
- Square: API Keys (`sq0[a-z]{3}-[0-9a-zA-Z\-_]{22,43}`)
- Shopify: Access Tokens (`shpat_[0-9a-fA-F]{32}`, etc.)

Communication Services

- Twilio: Account/Service Keys (`SK[0-9a-fA-F]{32}`, `AC[0-9a-fA-F]{32}`)
- SendGrid: API Keys (`SG\.[0-9a-zA-Z\-_]{22}\.[0-9a-zA-Z\-_]{43}`)
- Mailgun: API Keys (`key-[0-9a-zA-Z]{32}`)

Generic Patterns

- API Keys: `api_key`, `apiKey`, `API_KEY` with values
- Secret Keys: `secret_key`, `secretKey`, `SECRET_KEY` with values
- Access Tokens: `access_token`, `accessToken`, `ACCESS_TOKEN` with values
- Auth Tokens: `auth_token`, `authToken`, `AUTH_TOKEN` with values
- Bearer Tokens: `Bearer [token]` format
- Private Keys: RSA, EC, DSA, OpenSSH, PGP private key headers
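
Private key detection usually keys off the PEM armor line rather than the key body. The regex below is a hypothetical reconstruction covering the header types named above, not the pattern shipped with the tool.

```python
import re

# Hypothetical header regex: optional algorithm prefix before "PRIVATE KEY",
# plus the PGP form, which ends in "PRIVATE KEY BLOCK".
PRIVATE_KEY_HEADER = re.compile(
    r"-----BEGIN (?:(?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY|PGP PRIVATE KEY BLOCK)-----"
)


def has_private_key_header(text: str) -> bool:
    """Return True if the text contains a PEM or PGP private key header line."""
    return PRIVATE_KEY_HEADER.search(text) is not None
```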

Supported File Extensions

- Code: .py, .js, .ts, .jsx, .tsx, .java, .go, .rb, .php, .cs, .cpp, .c, .h, .hpp
- Scripts: .sh, .bash, .zsh, .fish
- Config: .yml, .yaml, .json, .xml, .env, .config, .conf, .ini
- Docs: .txt, .md, .rst
- Infrastructure: .sql, .tf, .tfvars
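
An extension filter over this list might look like the sketch below. The set and function names are hypothetical, and the real check lives in api_hunt/patterns.py, which may handle edge cases differently.

```python
from pathlib import Path

# Extensions copied from the list above.
SCANNED_EXTENSIONS = {
    ".py", ".js", ".ts", ".jsx", ".tsx", ".java", ".go", ".rb", ".php",
    ".cs", ".cpp", ".c", ".h", ".hpp",
    ".sh", ".bash", ".zsh", ".fish",
    ".yml", ".yaml", ".json", ".xml", ".env", ".config", ".conf", ".ini",
    ".txt", ".md", ".rst",
    ".sql", ".tf", ".tfvars",
}


def should_scan(path: str) -> bool:
    """Return True if the file's extension (or bare dotfile name) is in the list."""
    p = Path(path)
    # Path(".env").suffix is "" because pathlib treats a leading dot as part
    # of the name, so a bare dotfile is matched by its full name instead.
    return p.suffix.lower() in SCANNED_EXTENSIONS or p.name.lower() in SCANNED_EXTENSIONS
```

The extra name check matters for a file called exactly `.env`, which a suffix-only comparison would skip.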

Project details

Release history

This version

0.1.2

Jul 5, 2025

0.1.1

Jul 5, 2025

0.1.0

Jul 5, 2025

Download files

Source Distribution

api_hunter-0.1.2.tar.gz (11.9 kB)

Uploaded Jul 5, 2025 Source

Built Distribution

api_hunter-0.1.2-py3-none-any.whl (10.6 kB)

Uploaded Jul 5, 2025 Python 3

File details

Details for the file api_hunter-0.1.2.tar.gz.

File metadata

Download URL: api_hunter-0.1.2.tar.gz
Upload date: Jul 5, 2025
Size: 11.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for api_hunter-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`9c97595a5143cf004b70c70dd5e64f2b2da55ce7d0ba08cfa5801c9546bc9677`
MD5	`985a2cabc04f7c157160f38a2ad5afe6`
BLAKE2b-256	`72863f870703e22b261a3868ce4e81cee71bcfc34ddb054104f037272d49d80f`
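
A downloaded archive can be checked against the SHA256 digest listed above with the standard library. This is a generic sketch with a hypothetical helper name, and it assumes the file has already been saved locally.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing `sha256_of("api_hunter-0.1.2.tar.gz")` with the listed value confirms the download is the published file.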

File details

Details for the file api_hunter-0.1.2-py3-none-any.whl.

File metadata

Download URL: api_hunter-0.1.2-py3-none-any.whl
Upload date: Jul 5, 2025
Size: 10.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for api_hunter-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ccaaef802dfbddb3a67dc5d40e38a2ed70ec31a3171132f51a4d487bafb54440`
MD5	`b49497260ef2bad00ff91f070490c961`
BLAKE2b-256	`161ec5e83de4e5dd7b692457ebdd46dcf6fa30712053c27c327edd98505e6482`
