CLI tool to anonymize code using local LLM before sending to Claude Code

LLM Anonymizer

A CLI tool to anonymize code using a local LLM before sending to Claude Code or other AI services.

Prerequisites

Install Ollama

  1. macOS/Linux:

    curl -fsSL https://ollama.ai/install.sh | sh
    
  2. Windows: Download from ollama.ai

  3. Start Ollama service:

    ollama serve
    
  4. Install a model (in a new terminal):

    # Install Llama 3.2 (recommended)
    ollama pull llama3.2
    
    # Or install other models
    ollama pull codellama
    ollama pull llama3.1
    

Installation

Quick Start (Recommended)

  1. Install uv (if not already installed):

    # macOS/Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    
  2. Install LLM Anonymizer:

    uv tool install llm-anon
    
  3. Verify installation:

    llm-anon --help
    

That's it! The llm-anon command is now available globally.

Alternative Installation Methods

Development Installation

For development or if you want to modify the code:

  1. Clone this repository:

    git clone https://github.com/ChristianBako/LLM-Anonymizer-.git
    cd LLM-Anonymizer-
    
  2. Install dependencies:

    uv sync
    
  3. Use with uv run:

    uv run python -m llm_anon.cli --help
    

Usage

Basic Usage

    # Anonymize a single file and print to stdout
    llm-anon example.py

    # Anonymize and save to file
    llm-anon example.py -o anonymized.py

    # Process entire directory
    llm-anon src/ -r -o anonymized_output/

Development Usage (if installed from source)

    # Use with uv run for development
    uv run python -m llm_anon.cli example.py

Options

  • -o, --output PATH: Output file or directory
  • -m, --model TEXT: LLM model to use (default: llama3.2)
  • -t, --temperature FLOAT: Temperature for generation (default: 0.1)
  • --preserve-comments: Keep original comments
  • --preserve-strings: Keep string literals unchanged
  • -r, --recursive: Process directories recursively
  • -v, --verbose: Show detailed progress
  • --validation-config PATH: Path to validation config file with banned strings
  • --max-retries INTEGER: Maximum retries for validation failures (default: 3)

Examples

    # Use a different model with a higher temperature (more varied output)
    llm-anon code.py -m codellama -t 0.3

    # Preserve string literals and comments
    llm-anon api.py --preserve-strings --preserve-comments

    # Process an entire project with verbose output
    llm-anon ./src -r -v -o ./anonymized

    # Use validation to check that sensitive strings are removed
    llm-anon examples/lamasoft_example.py --validation-config examples/banned_strings.txt -v

    # Raise the retry limit for files that keep failing validation
    llm-anon examples/lamasoft_example.py --validation-config examples/banned_strings.txt --max-retries 5

Supported Languages

  • Python (.py)
  • JavaScript (.js, .jsx)
  • TypeScript (.ts, .tsx)
  • Java (.java)
  • C/C++ (.c, .cpp, .cc, .cxx, .h, .hpp)
  • Rust (.rs)
  • Go (.go)
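
As a rough illustration of how the extension list above could drive language detection, here is a minimal Python sketch. The dictionary name, the function name, and the language labels are assumptions made for this example; the tool's real lookup may be organized differently.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the list above. The labels on the right are
# illustrative, not the tool's internal names.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript", ".jsx": "javascript",
    ".ts": "typescript", ".tsx": "typescript",
    ".java": "java",
    ".c": "c/c++", ".cpp": "c/c++", ".cc": "c/c++",
    ".cxx": "c/c++", ".h": "c/c++", ".hpp": "c/c++",
    ".rs": "rust",
    ".go": "go",
}


def detect_language(path: str) -> Optional[str]:
    """Return a language label for a supported file, or None otherwise."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())


print(detect_language("src/app.tsx"))  # typescript
print(detect_language("README.md"))    # None
```

Files whose extension is not in the table return None here, which a caller could use to skip them.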

Validation System

The tool includes a validation system that checks the anonymized code for banned strings and re-prompts the model until they are gone or the retry limit is reached.

Creating a Validation Config

Create a text file with banned strings (one per line). Comments start with #:

    # Quick example
    echo "MyCompany" > banned_strings.txt
    echo "secret_api_key" >> banned_strings.txt
    echo "internal.company.com" >> banned_strings.txt

    # Or use the provided example
    cp examples/banned_strings.txt my_company_secrets.txt
    # Edit my_company_secrets.txt with your specific terms

Or create a comprehensive config file:

    # company_secrets.txt
    # Company names and branding
    MyCompany
    mycompany.com

    # API keys and credentials
    API_KEY_12345
    secret_password_2024

    # Contact information
    support@mycompany.com
    1-800-MYCOMPANY

    # Product-specific terms
    MyCompanyCustomer
    MyCompanyLicenseManager

Pro tip: Start with your company name, domain, and any API keys or internal URLs. Check out examples/ for test files and sample configurations.
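
To make the file format concrete, the following Python sketch parses a config of this shape. It assumes that blank lines are ignored, that surrounding whitespace is stripped, and that only whole-line comments (lines starting with #) are recognized. The function name is hypothetical and this is not the tool's own parser.

```python
from typing import List


def parse_banned_strings(text: str) -> List[str]:
    """Collect one banned string per line, skipping blanks and # comments."""
    banned = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        banned.append(entry)
    return banned


sample = """\
# Company names and branding
MyCompany
mycompany.com

# Contact information
support@mycompany.com
"""

print(parse_banned_strings(sample))
# ['MyCompany', 'mycompany.com', 'support@mycompany.com']
```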

How Validation Works

  1. Initial Anonymization: LLM processes code normally
  2. String Detection: Scans output for banned strings using word boundaries
  3. Re-prompting: If banned strings found, sends explicit removal instructions
  4. Retry Logic: Repeats up to --max-retries times until validation passes
  5. Failure Handling: Reports specific banned strings if validation ultimately fails
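
The steps above can be sketched as a small retry loop. This is a simplified model under stated assumptions, not the tool's implementation: the `anonymize` callable stands in for the LLM call, each retry re-sends the original code together with the strings that leaked, and `max_retries` counts attempts after the first one.

```python
import re
from typing import Callable, List, Tuple


def find_banned(text: str, banned: List[str]) -> List[str]:
    """Return the banned strings that occur in text as whole words."""
    return [s for s in banned
            if re.search(r"(?<!\w)" + re.escape(s) + r"(?!\w)", text)]


def anonymize_with_validation(
    code: str,
    banned: List[str],
    anonymize: Callable[[str, List[str]], str],
    max_retries: int = 3,
) -> Tuple[str, List[str]]:
    """Run anonymize, then re-run it while banned strings remain.

    anonymize(code, leaked) is a hypothetical stand-in for the model call;
    leaked is empty on the first attempt. Returns the last output and the
    banned strings still present (an empty list means validation passed).
    """
    output = anonymize(code, [])
    leaked = find_banned(output, banned)
    retries = 0
    while leaked and retries < max_retries:
        retries += 1
        output = anonymize(code, leaked)
        leaked = find_banned(output, banned)
    return output, leaked


def fake_model(code: str, leaked: List[str]) -> str:
    # Toy stand-in: only removes strings it is explicitly told about.
    for s in leaked:
        code = code.replace(s, "REDACTED")
    return code


result, remaining = anonymize_with_validation(
    "client = MyCompany(api='API_KEY_12345')",
    ["MyCompany", "API_KEY_12345"],
    fake_model,
)
print(result)     # client = REDACTED(api='REDACTED')
print(remaining)  # []
```

A non-empty second return value corresponds to step 5: the caller can report exactly which strings survived.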

Validation Features

  • Case-sensitive matching by default
  • Word boundary detection prevents false positives
  • Progressive prompting gets more explicit with each retry
  • Detailed error reporting shows exactly which strings were found
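
The effect of case-sensitive, word-boundary matching can be seen in a short Python sketch. The regular expression here is an assumption about how such matching might be done (lookarounds that reject adjacent letters, digits, and underscores); the tool's actual pattern is not documented above.

```python
import re
from typing import List


def find_banned(text: str, banned: List[str]) -> List[str]:
    """Return banned strings found in text as whole words (case-sensitive)."""
    hits = []
    for s in banned:
        # Reject matches that are glued to a word character on either side.
        pattern = r"(?<!\w)" + re.escape(s) + r"(?!\w)"
        if re.search(pattern, text):
            hits.append(s)
    return hits


banned = ["MyCompany", "MyCompanyCustomer"]

print(find_banned("class MyCompanyCustomer: pass", banned))
# ['MyCompanyCustomer']
print(find_banned("owner = 'MyCompany'", banned))
# ['MyCompany']
print(find_banned("owner = 'mycompany'", banned))
# []
```

Under this kind of matching, `MyCompany` alone does not flag `MyCompanyCustomer`, which may be why the sample config lists compound identifiers as separate entries.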

How It Works

  1. File Detection: Automatically detects programming language from file extension
  2. LLM Processing: Sends code to local Ollama model with anonymization prompt
  3. Validation (if enabled): Checks output against banned strings and re-prompts if needed
  4. Smart Replacement: Replaces variable names, function names, and identifiers while preserving:
    • Code structure and logic
    • Control flow
    • Data types
    • Import statements
    • Syntax and formatting
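
For readers curious what step 2 might look like on the wire, here is a hedged Python sketch that calls Ollama's `/api/generate` endpoint on the default local port. The prompt wording and the function names are invented for illustration; the tool's real prompt and client code are not shown in this document.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(code: str, language: str,
                  model: str = "llama3.2", temperature: float = 0.1) -> dict:
    """Build a non-streaming generate request (prompt text is illustrative)."""
    prompt = (
        f"Anonymize the following {language} code. Rename variables, "
        "functions, and other identifiers, but keep the structure, control "
        "flow, data types, imports, and formatting unchanged. "
        "Return only the code.\n\n" + code
    )
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": temperature},
    }


def anonymize(code: str, language: str, **kwargs) -> str:
    """Send the request to a locally running Ollama and return its reply."""
    data = json.dumps(build_payload(code, language, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["response"]


# Example call (requires `ollama serve` and a pulled model):
# print(anonymize("def get_acme_users(): ...", "python"))
```

Because the request goes to localhost, the original code never leaves the machine during this step.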

Troubleshooting

Ollama Connection Issues

    # Check if Ollama is running
    curl http://localhost:11434/api/tags

    # Restart the Ollama service (';' so the server starts even if nothing was running)
    pkill ollama; ollama serve

    # List available models
    ollama list

Model Not Found

    # Pull the default model
    ollama pull llama3.2

    # Or specify a different model
    llm-anon code.py -m llama3.1

Performance Tips

  • Use smaller models for faster processing: llama3.2:1b
  • Lower temperature (0.1) for more consistent results
  • Process files individually for large codebases to avoid timeouts
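
One way to process files individually is a small wrapper script that invokes `llm-anon` once per file. The Python sketch below uses only the `-o` option documented above; the function names are hypothetical, and it assumes (without confirmation from this document) that the command exits with a non-zero status on failure.

```python
import subprocess
from pathlib import Path
from typing import List

# Extensions from the Supported Languages list.
SUPPORTED = {".py", ".js", ".jsx", ".ts", ".tsx", ".java", ".c", ".cpp",
             ".cc", ".cxx", ".h", ".hpp", ".rs", ".go"}


def plan_commands(src_root: str, out_root: str) -> List[List[str]]:
    """Build one llm-anon command per supported file, mirroring the tree."""
    src = Path(src_root)
    commands = []
    for path in sorted(src.rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED:
            target = Path(out_root) / path.relative_to(src)
            commands.append(["llm-anon", str(path), "-o", str(target)])
    return commands


def run_all(src_root: str, out_root: str) -> List[str]:
    """Run the planned commands one by one; return the files that failed."""
    failed = []
    for command in plan_commands(src_root, out_root):
        # Make sure the output directory exists before the tool writes to it.
        Path(command[-1]).parent.mkdir(parents=True, exist_ok=True)
        if subprocess.run(command).returncode != 0:
            failed.append(command[1])
    return failed


# Example: failed = run_all("src", "anonymized")
```

Running one file per invocation keeps a single slow or failing file from stalling the rest of the batch.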

