Fully automatic censorship removal for language models

⚔️ Annihilation

Autonomous Language Model Decensoring Framework

⚠️ Work in Progress

⚡ This project is actively under development. Features, APIs, and documentation may change without notice.

🔥 What is Annihilation?

Annihilation is a fully automatic framework for removing censorship (safety alignment) from transformer-based language models. It combines a parametric implementation of directional ablation (abliteration) with TPE-based parameter optimization, so no expensive post-training is required.

Key Features

🤖 Fully Autonomous - No human intervention required; the system automatically finds optimal decensoring parameters
⚡ Refusal Suppression with Low Drift - Aims to suppress refusals while preserving model capabilities, tracked as KL divergence from the original model
🔧 Advanced Abliteration - Parametric directional ablation with flexible weight kernels
🧠 Smart Optimization - Co-minimizes refusal count and KL divergence using Optuna's TPE sampler
🎯 Multi-Architecture Support - Works with dense models, MoE architectures, hybrid models, and many multimodal models
📊 Research Tools - Built-in residual geometry analysis and visualization capabilities

🖼️ Logo Design

╔═══════════════════════════════════════════════════════════════╗
║                                                               ║
║     ██████╗  █████╗ ██████╗  █████╗ ██╗     ██╗             ║
║     ██╔══██╗██╔══██╗██╔══██╗██╔══██╗██║     ██║             ║
║     ██║  ██║███████║██████╔╝███████║██║     ██║             ║
║     ██║  ██║██╔══██║██╔══██╗██╔══██║██║     ██║             ║
║     ██████╔╝██║  ██║██║  ██║██║  ██║███████╗███████╗        ║
║     ╚═════╝ ╚═╝  ╚═╝╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝╚══════╝        ║
║                                                               ║
║     ██████╗ ███████╗ █████╗ ██████╗                        ║
║     ██╔══██╗██╔════╝██╔══██╗██╔══██╗                       ║
║     ██║  ██║█████╗  ███████║██████╔╝                       ║
║     ██║  ██║██╔══╝  ██╔══██║██╔══██╗                       ║
║     ██████╔╝███████╗██║  ██║██║  ██║                       ║
║     ╚═════╝ ╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝                       ║
║                                                               ║
╚═══════════════════════════════════════════════════════════════╝

The logo represents the breaking of chains: the central "A" serves as the blade that cuts through safety alignment, freeing the model from imposed restrictions.

🚀 Quick Start

# Install Annihilation
pip install -U annihilation-llm

# Decensor any model automatically
annihilation Qwen/Qwen3-4B-Instruct-2507

Requirements

Python: 3.10+
PyTorch: 2.2+ (hardware-specific installation required)
Hardware: GPU recommended (CUDA, ROCm, XPU, or MPS)
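
The requirements above can be checked before installing. The following is a minimal sketch, not part of Annihilation itself: the helper names `python_ok` and `detect_accelerator` are hypothetical, and the check falls back gracefully when PyTorch is not installed or when a backend attribute is missing in the installed PyTorch build.

```python
import importlib.util
import sys


def python_ok(minimum=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def detect_accelerator():
    """Return a short label for the first available PyTorch backend."""
    if importlib.util.find_spec("torch") is None:
        return "torch-missing"
    import torch

    if torch.cuda.is_available():
        # ROCm builds of PyTorch expose the GPU through the CUDA API
        # and set torch.version.hip to a version string.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("Accelerator:", detect_accelerator())
```

A result of `cpu` means PyTorch is installed but no GPU backend was found; a result of `torch-missing` means PyTorch still needs to be installed for your hardware.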

⚙️ Configuration

Annihilation works out of the box with defaults, but offers extensive configuration options:

# View all options
annihilation --help

# Or use a config file
# Rename config.default.toml to config.toml and modify as needed

Key Configuration Options

Option	Default	Description
`n_trials`	200	Number of optimization trials
`quantization`	none	Model quantization mode (for example, bnb_4bit)
`row_normalization`	full	Weight normalization strategy
`orthogonalize_direction`	true	Direction adjustment method
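
To illustrate how the options in the table might be overridden from a config file, here is a small sketch that parses a TOML string and merges it over the documented defaults. It assumes the options are top-level keys in config.toml and that `none` and `full` are string values; the actual layout in config.default.toml is authoritative. The `load_settings` helper is hypothetical and not part of Annihilation.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # backport package for Python 3.10

# Defaults taken from the option table; key placement is an assumption.
DEFAULTS = {
    "n_trials": 200,
    "quantization": "none",
    "row_normalization": "full",
    "orthogonalize_direction": True,
}

# Example override file contents (illustrative values only).
CONFIG_TEXT = """
n_trials = 100
quantization = "bnb_4bit"
"""


def load_settings(text):
    """Parse TOML text and merge it over the documented defaults."""
    overrides = tomllib.loads(text)
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise KeyError(f"unknown option(s): {', '.join(unknown)}")
    return {**DEFAULTS, **overrides}


settings = load_settings(CONFIG_TEXT)
print(settings)
```

Rejecting unknown keys catches typos in option names early, rather than silently running with defaults.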

🔬 How It Works

Annihilation implements parametric directional ablation:

1. Direction Computation - Calculates refusal directions by computing difference-of-means between first-token residuals for harmful vs harmless prompts
2. Parametric Ablation - For each transformer component (attention out-projection, MLP down-projection), orthogonalizes weights against the refusal direction using LoRA adapters
3. Multi-Parameter Optimization - Uses Optuna's TPE sampler to co-optimize:
   - Ablation weight kernel shape (max_weight, position, min_weight, distance)
   - Direction index (layer selection or interpolation)
   - Per-component parameters (attention vs MLP)
4. Automatic Selection - Chooses from Pareto-optimal trials based on refusal count vs KL divergence tradeoff
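
The final selection step relies on Pareto optimality over two objectives that are both minimized. The sketch below shows that idea in isolation, in plain Python with no dependency on Optuna or on Annihilation. The `Trial` type, the helper names, and the numbers are all hypothetical and purely illustrative; they are not measured results.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    name: str
    refusals: int          # lower is better
    kl_divergence: float   # lower is better


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.refusals <= b.refusals and a.kl_divergence <= b.kl_divergence
    better = a.refusals < b.refusals or a.kl_divergence < b.kl_divergence
    return no_worse and better


def pareto_front(trials):
    """Return the non-dominated trials, sorted by refusal count."""
    front = [t for t in trials if not any(dominates(o, t) for o in trials)]
    return sorted(front, key=lambda t: (t.refusals, t.kl_divergence))


# Synthetic example values, not real benchmark data.
trials = [
    Trial("t0", 40, 0.02),
    Trial("t1", 12, 0.15),
    Trial("t2", 12, 0.30),  # dominated by t1
    Trial("t3", 3, 0.90),
    Trial("t4", 45, 0.05),  # dominated by t0
]

front = pareto_front(trials)
print([t.name for t in front])  # ['t3', 't1', 't0']
```

Every trial on the front represents a different tradeoff: no other trial has both fewer refusals and lower KL divergence, so the choice among them depends on how much drift from the original model is acceptable.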

📊 Benchmarking

After decensoring, you can:

💬 Chat with the model to test behavior
📈 Benchmark using standard evaluation frameworks (MMLU, GSM8K, etc.)
💾 Save the model locally or upload to Hugging Face

🧪 Research Features

Install with research dependencies for visualization tools:

pip install -U "annihilation-llm[research]"

Features:

--plot-residuals - Generate PaCMAP projections of residual vectors
--print-residual-geometry - Detailed residual analysis metrics

📜 License

Annihilation is free software distributed under the GNU Affero General Public License v3.

See LICENSE for full details.

⚡ Disclaimer

This tool is provided for research and educational purposes only. The developers do not condone the use of decensored models for harmful activities. Users are responsible for ensuring compliance with applicable laws and model terms of service.

Breaking the Chains | Unleashing Model Potential

"The only way to discover the limits of the possible is to go beyond them into the impossible."


This version: 1.3.0 (May 14, 2026)

Download files

Source Distribution

annihilation_llm-1.3.0.tar.gz (36.6 kB), uploaded May 14, 2026, Source

Built Distribution

annihilation_llm-1.3.0-py3-none-any.whl (41.7 kB), uploaded May 14, 2026, Python 3

File details

Details for the file annihilation_llm-1.3.0.tar.gz.

File metadata

Download URL: annihilation_llm-1.3.0.tar.gz
Upload date: May 14, 2026
Size: 36.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for annihilation_llm-1.3.0.tar.gz
Algorithm	Hash digest
SHA256	`44fec94384832f303a4e3416c003c31c9fff2c1ad8986e729ba2623dee93f781`
MD5	`1e7d0c78b8026c4a31e71ac00b9c928a`
BLAKE2b-256	`d99d22cdc761701426d9073663ac07349f584f2e60d4c6cd4273b7ce4c0b8ad0`


File details

Details for the file annihilation_llm-1.3.0-py3-none-any.whl.

File metadata

Download URL: annihilation_llm-1.3.0-py3-none-any.whl
Upload date: May 14, 2026
Size: 41.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for annihilation_llm-1.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9593f804f00122b8e7ebcaf146d194244c7694644e1aa06d1613be91504c9889`
MD5	`8d0417969ebd340de567247bbb172f63`
BLAKE2b-256	`7ee0307887ae61d308d0f64060b0cb0333993a64e94b63dffda7b6f35c964ec9`
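
The digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `file_digest` and `matches` are hypothetical, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks and return the hex digest."""
    if algorithm == "blake2b-256":
        # PyPI's BLAKE2b-256 is BLAKE2b with a 32-byte digest.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with a published hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


# Example usage with the SHA256 value published for the source distribution:
# matches(
#     "annihilation_llm-1.3.0.tar.gz",
#     "44fec94384832f303a4e3416c003c31c9fff2c1ad8986e729ba2623dee93f781",
# )
```

Prefer SHA256 or BLAKE2b-256 for verification; MD5 is listed for completeness but is not collision resistant.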
