Fully automatic censorship removal for language models
⚔️ Annihilation
Autonomous Language Model Decensoring Framework
⚠️ Work in Progress
⚡ This project is actively under development. Features, APIs, and documentation may change without notice.
🔥 What is Annihilation?
Annihilation is a fully automatic framework for removing censorship (safety alignment) from transformer-based language models. It combines an implementation of directional ablation (abliteration) with TPE-based parameter optimization, so it needs no expensive post-training.
Key Features
- 🤖 Fully Autonomous - No human intervention required; the system automatically finds optimal decensoring parameters
- ⚡ State-of-the-Art Performance - Achieves excellent refusal suppression while preserving model capabilities
- 🔧 Advanced Abliteration - Parametric directional ablation with flexible weight kernels
- 🧠 Smart Optimization - Co-minimizes refusal count and KL divergence using Optuna's TPE sampler
- 🎯 Multi-Architecture Support - Works with dense models, MoE architectures, hybrid models, and many multimodal models
- 📊 Research Tools - Built-in residual geometry analysis and visualization capabilities
🚀 Quick Start
Use a Python virtual environment so Annihilation's dependencies do not collide with packages installed globally.
# Windows PowerShell
python -m venv annihilation-env
.\annihilation-env\Scripts\Activate.ps1
python -m pip install -U pip
python -m pip install -U annihilate-llm
# Decensor any model automatically
annihilate Qwen/Qwen3-4B-Instruct-2507
# macOS/Linux/Android terminal
python -m venv annihilation-env
source annihilation-env/bin/activate
python -m pip install -U pip
python -m pip install -U annihilate-llm
# Decensor any model automatically
annihilate Qwen/Qwen3-4B-Instruct-2507
Requirements
- Python: 3.10+
- PyTorch: 2.2+ (hardware-specific installation required)
- Hardware: GPU recommended (CUDA, ROCm, XPU, or MPS)
- Optional: Install annihilate-llm[bnb] only on platforms that support bitsandbytes if you want bnb_4bit quantization.
GPU Setup on Windows
If Windows sees your NVIDIA GPU but Annihilate says no GPU is detected, your virtual environment probably has CPU-only PyTorch installed.
Check that Windows can see the GPU:
nvidia-smi
Replace CPU-only PyTorch with a CUDA build inside the active environment:
.\annihilation-env\Scripts\Activate.ps1
python -m pip uninstall -y torch torchvision torchaudio
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
Verify PyTorch can see CUDA:
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'no cuda')"
torch.cuda.is_available() should print True. If your GPU has limited VRAM
such as 4 GB, start with smaller models and expect larger models to run out of
memory.
GPU Setup on Ubuntu
Install Python venv support and create an isolated environment:
sudo apt update
sudo apt install -y python3-venv python3-pip
python3 -m venv annihilation-env
source annihilation-env/bin/activate
python -m pip install -U pip
python -m pip install -U annihilate-llm
For NVIDIA GPUs, confirm the driver can see the card:
nvidia-smi
If Annihilate says no GPU is detected, replace CPU-only PyTorch with a CUDA build inside the active environment:
python -m pip uninstall -y torch torchvision torchaudio
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
Verify PyTorch can see CUDA:
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'no cuda')"
torch.cuda.is_available() should print True. If nvidia-smi is missing or
fails, install or repair the NVIDIA driver first, then reopen the terminal and
activate the environment again.
⚙️ Configuration
Annihilation works out of the box with defaults, but offers extensive configuration options:
# View all options
annihilate --help
# Or use a config file
# Rename config.default.toml to config.toml and modify as needed
Key Configuration Options
| Option | Default | Description |
|---|---|---|
| n_trials | 200 | Number of optimization trials |
| quantization | none | Model quantization (bnb_4bit) |
| row_normalization | full | Weight normalization strategy |
| orthogonalize_direction | true | Direction adjustment method |
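As a rough illustration of a config file, the sketch below parses a TOML snippet built from the defaults in the table. It assumes the option names map directly to top-level keys in config.toml. That layout is a guess, not something the documentation confirms, so check config.default.toml for the real structure. The `EXAMPLE_CONFIG` name and the sanity checks are hypothetical.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API for Python 3.10

# Assumed layout: the table's option names as top-level keys, set to their
# documented defaults. The real config.default.toml may be organized differently.
EXAMPLE_CONFIG = """
n_trials = 200
quantization = "none"
row_normalization = "full"
orthogonalize_direction = true
"""

config = tomllib.loads(EXAMPLE_CONFIG)

# Basic sanity checks before starting a long optimization run.
# The allowed quantization values are only the two named in the table.
assert isinstance(config["n_trials"], int) and config["n_trials"] > 0
assert config["quantization"] in {"none", "bnb_4bit"}
assert isinstance(config["orthogonalize_direction"], bool)

print(config)
```

Validating the file up front like this is cheap compared with discovering a typo after many optimization trials.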
🔬 How It Works
Annihilation implements parametric directional ablation:
- Direction Computation - Calculates refusal directions by computing difference-of-means between first-token residuals for harmful vs harmless prompts
- Parametric Ablation - For each transformer component (attention out-projection, MLP down-projection), orthogonalizes weights against the refusal direction using LoRA adapters
- Multi-Parameter Optimization - Uses Optuna's TPE sampler to co-optimize:
  - Ablation weight kernel shape (max_weight, position, min_weight, distance)
  - Direction index (layer selection or interpolation)
  - Per-component parameters (attention vs MLP)
- Automatic Selection - Chooses from Pareto-optimal trials based on refusal count vs KL divergence tradeoff
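The selection step can be illustrated with a small, generic sketch of Pareto filtering. It assumes each finished trial is summarized by two numbers to minimize, a refusal count and a KL divergence. The `Trial` class, the `dominates` and `pareto_front` helpers, and the sample values are hypothetical and are not taken from Annihilation's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Hypothetical summary of one optimization trial (both values are minimized)."""
    number: int
    refusals: int
    kl_divergence: float


def dominates(a: Trial, b: Trial) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on at least one."""
    no_worse = a.refusals <= b.refusals and a.kl_divergence <= b.kl_divergence
    better = a.refusals < b.refusals or a.kl_divergence < b.kl_divergence
    return no_worse and better


def pareto_front(trials: list[Trial]) -> list[Trial]:
    """Keep the trials that no other trial dominates, sorted by refusal count."""
    front = [t for t in trials if not any(dominates(o, t) for o in trials)]
    return sorted(front, key=lambda t: (t.refusals, t.kl_divergence))


# Made-up numbers for illustration only.
trials = [
    Trial(0, refusals=40, kl_divergence=0.02),
    Trial(1, refusals=12, kl_divergence=0.15),
    Trial(2, refusals=12, kl_divergence=0.30),  # dominated by trial 1
    Trial(3, refusals=3, kl_divergence=0.90),
    Trial(4, refusals=45, kl_divergence=0.05),  # dominated by trial 0
]

front = pareto_front(trials)
for t in front:
    print(f"trial {t.number}: refusals={t.refusals}, kl={t.kl_divergence}")
```

Every trial left on the front is a defensible tradeoff: moving to fewer refusals costs more KL divergence, and the reverse.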
📊 Benchmarking
After decensoring, you can:
- 💬 Chat with the model to test behavior
- 📈 Benchmark using standard evaluation frameworks (MMLU, GSM8K, etc.)
- 💾 Save the model locally or upload to Hugging Face
🧪 Research Features
Install with research dependencies for visualization tools:
python -m pip install -U "annihilate-llm[research]"
Features:
- --plot-residuals - Generate PaCMAP projections of residual vectors
- --print-residual-geometry - Detailed residual analysis metrics
📜 License
Annihilation is free software distributed under the GNU Affero General Public License v3.
See LICENSE for full details.
⚡ Disclaimer
This tool is provided for research and educational purposes only. The developers do not condone the use of decensored models for harmful activities. Users are responsible for ensuring compliance with applicable laws and model terms of service.
Breaking the Chains | Unleashing Model Potential
"The only way to discover the limits of the possible is to go beyond them into the impossible."