ShadeAI — Fully Automatic Censorship Removal for Language Models
ShadeAI is a Python framework that removes artificial censorship and safety filters from Large Language Models (LLMs) using advanced Abliteration techniques. It provides both a CLI and a full Python API, allowing you to integrate model liberation directly into your scripts, notebooks, and applications.
```bash
pip install shadeai
```
Table of Contents
- Installation
- Quick Start
- Hugging Face Authentication
- Model Selection
- Running Abliteration
- Saving the Modified Model
- Web Chat Interface
- GGUF Export
- Ollama Integration
- System Diagnostics
- Supported Architectures
- CLI Reference
- Developer
- Citation
- License
📦 Installation
From PyPI (Recommended)

```bash
pip install shadeai
```

With Research Extras (plotting, clustering, visualization)

```bash
pip install "shadeai[research]"
```

From Source

```bash
git clone https://github.com/AssemSabry/Shade.git
cd Shade
pip install -e .
```
⚡ Quick Start
The fastest way to remove censorship from any model — two lines of code:
```python
import shade

# Run the fully automatic optimization pipeline.
# This will prompt you to select a model interactively.
shade.run_optimization()
```
Or from the terminal:
```bash
shade Qwen/Qwen2.5-1.5B-Instruct
```
🔑 Hugging Face Authentication
Many models on Hugging Face require authentication (gated models like LLaMA, Gemma, etc.). ShadeAI provides multiple ways to authenticate:
Option 1: Using ShadeAI API (Recommended)
```python
import shade

# Log in with your Hugging Face API token.
# Get a token from: https://huggingface.co/settings/tokens
shade.login(token="hf_your_token_here")
```
Option 2: Using the CLI
```bash
shade hf login
```
This will securely prompt you for your token (input is hidden).
Option 3: Environment Variable
```bash
# Windows
set HF_TOKEN=hf_your_token_here

# Linux/macOS
export HF_TOKEN=hf_your_token_here
```
Note: Your token is stored securely by the `huggingface_hub` library and is never logged or printed.
📂 Model Selection
ShadeAI can work with models from Hugging Face Hub or models saved locally on your disk.
Loading a Model from Hugging Face
```python
from shade.config import Settings
from shade.model import Model

# Load any Hugging Face model by its ID
settings = Settings(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    device_map="auto",  # automatically use a GPU if available
)
model = Model(settings)
print(f"Model loaded: {settings.model}")
```
Loading a Local Model
```python
from shade.config import Settings
from shade.model import Model

# Point to a local directory containing the model files
settings = Settings(
    model=r"D:\Models\My-Local-Model",
    device_map="auto",
)
model = Model(settings)
```
Loading with 4-bit Quantization (Low VRAM)
```python
from shade.config import Settings, QuantizationMethod
from shade.model import Model

# Use 4-bit quantization to fit large models in limited VRAM
settings = Settings(
    model="Qwen/Qwen2.5-7B-Instruct",
    quantization=QuantizationMethod.BNB_4BIT,
    device_map="auto",
)
model = Model(settings)
```
Downloading a Model
```python
import shade

# Download a model from Hugging Face to the local cache
shade.download("Qwen/Qwen2.5-1.5B-Instruct")
```
🧠 Running Abliteration (Censorship Removal)
This is the core feature of ShadeAI. The abliteration process:
- Loads harmless and harmful prompt datasets
- Computes the model's internal "refusal direction"
- Uses Optuna TPE optimizer to find the best parameters
- Applies mathematical projection to remove censorship while preserving intelligence
Method 1: Fully Automatic (Recommended)
```python
import shade

# Starts the complete optimization pipeline:
# - interactive model selection (if not specified)
# - automatic refusal-direction calculation
# - multi-trial optimization with Optuna
# - saves the best result automatically
shade.run_optimization()
```
Method 2: Via CLI with a Specific Model
```bash
# Run abliteration on a specific model
shade Qwen/Qwen2.5-1.5B-Instruct

# With 4-bit quantization for large models
shade Qwen/Qwen2.5-7B-Instruct --quantization bnb_4bit
```
Method 3: Programmatic Control
```python
from shade.config import Settings, QuantizationMethod
from shade.model import Model
from shade.evaluator import Evaluator

# Step 1: Configure
settings = Settings(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    quantization=QuantizationMethod.NONE,
    device_map="auto",
    n_trials=100,  # number of optimization trials
)

# Step 2: Load the model
model = Model(settings)

# Step 3: Create the evaluator (loads evaluation datasets)
evaluator = Evaluator(settings, model)
print(f"Baseline refusals: {evaluator.base_refusals}/{len(evaluator.bad_prompts)}")

# Step 4: The optimization loop itself is handled by shade.run_optimization();
# for fine-grained control, see the source of shade.main.
```
💾 Saving the Modified Model
After abliteration, you can save the uncensored model for reuse:
```python
from shade.config import Settings
from shade.model import Model

settings = Settings(model="Qwen/Qwen2.5-1.5B-Instruct", device_map="auto")
model = Model(settings)

# ... (after abliteration has been performed) ...

# Merge LoRA adapters into the base model
merged_model = model.get_merged_model()

# Save to a local directory
save_path = "./Qwen-1.5B-Uncensored"
merged_model.save_pretrained(save_path)
model.tokenizer.save_pretrained(save_path)
print(f"Model saved to: {save_path}")
```
🌐 Web Chat Interface
ShadeAI includes a built-in web chat UI powered by FastAPI. Launch it to interact with your uncensored models through a modern browser interface.
Method 1: Python API
```python
from shade.config import Settings
from shade.model import Model
from shade.server import start_server

# Load your local uncensored model
settings = Settings(
    model=r"D:\Models\My-Uncensored-Model",
    device_map="auto",
)
model = Model(settings)

# Start the web server (opens a browser automatically)
start_server(model, settings, host="127.0.0.1", port=8000)
```
Method 2: One-Line Python API
```python
import shade

# Start the web chat with a model (interactive selection if none is specified)
shade.serve(model_id="./My-Uncensored-Model", host="127.0.0.1", port=8000)
```
Method 3: CLI
```bash
# Interactive model selection
shade serve

# With a specific model
shade serve ./My-Uncensored-Model

# Custom host and port
shade serve --host 0.0.0.0 --port 9000
```
The web interface features:
- Dark mode design
- Real-time responses
- Model info display
- Chat history logging
📦 GGUF Export
Convert your uncensored HuggingFace model to GGUF format for use with llama.cpp, Ollama, LM Studio, and other local runners. ShadeAI handles the entire conversion pipeline automatically.
Python API
```python
import shade

# Export to GGUF with the default Q4_K_M quantization
shade.export_to_gguf(
    model_path="./My-Uncensored-Model",
    quant_type="q4_k_m",  # default; best balance of size and quality
)
```
CLI
```bash
# Default export (Q4_K_M quantization)
shade export ./My-Uncensored-Model

# Higher quality export
shade export ./My-Uncensored-Model --quant q5_k_m

# Full precision (no quantization loss)
shade export ./My-Uncensored-Model --quant f16

# Export and register with Ollama in one step
shade export ./My-Uncensored-Model --ollama --ollama-name my-uncensored
```
Supported Quantization Types
| Type | Size (7B model) | Quality | Use Case |
|---|---|---|---|
| `f16` | ~14 GB | 100% | Research, maximum quality |
| `q8_0` | ~7 GB | ~99% | High quality, large RAM |
| `q6_k` | ~5.5 GB | ~97% | Excellent quality |
| `q5_k_m` | ~5 GB | ~95% | Great balance |
| `q4_k_m` | ~4 GB | ~90% | Most popular (default) |
| `q4_k_s` | ~3.8 GB | ~88% | Slightly smaller |
| `q3_k_m` | ~3 GB | ~80% | Low-end hardware |
| `q2_k` | ~2.5 GB | ~60% | Experimental |
How it works: ShadeAI automatically downloads the `llama.cpp` conversion tools on first use, converts your model, and optionally registers it with Ollama, all in a single command.
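The sizes in the table follow almost directly from bits per weight. As a rough sanity check, here is a back-of-the-envelope estimator in plain Python (this is not part of the ShadeAI API, and the bits-per-weight figures are approximations for llama.cpp's quant formats):

```python
# Rough GGUF size estimate: parameters x bits-per-weight / 8 bytes, ignoring
# metadata and the small overhead of mixed-precision "k-quant" layouts.
BITS_PER_WEIGHT = {"f16": 16.0, "q8_0": 8.5, "q6_k": 6.56,
                   "q5_k_m": 5.5, "q4_k_m": 4.85, "q3_k_m": 3.9, "q2_k": 3.35}

def estimate_gguf_gb(n_params: float, quant: str) -> float:
    """Approximate on-disk size in GB for a given parameter count."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9

# A 7B model at q4_k_m lands near the ~4 GB shown in the table.
print(round(estimate_gguf_gb(7e9, "q4_k_m"), 1))  # -> 4.2
```

The estimate drifts for the k-quant types because they mix precisions across tensor groups, but it is close enough to plan disk and RAM budgets.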
🦙 Ollama Integration
Register your uncensored models with Ollama for easy local inference:
From an Existing Model
```bash
shade ollama ./My-Uncensored-Model --name my-uncensored
```
After GGUF Export
```bash
shade export ./My-Uncensored-Model --ollama --ollama-name my-ai

# Now run it with Ollama:
ollama run my-ai
```
Python API
```python
import shade

shade.export_to_ollama(
    model_path="./My-Uncensored-Model",
    name="my-uncensored",
)
```
🩺 System Diagnostics
Run a full system check to ensure your environment is ready:
Python API
```python
import shade

# Check system requirements (Python, RAM, GPU, disk space)
shade.run_doctor()
```
CLI
```bash
# Diagnose issues
shade doctor

# Diagnose and auto-fix
shade doctor --fix
```
The doctor checks:
- ✅ Python version compatibility
- ✅ Available disk space
- ✅ System RAM
- ✅ GPU/CUDA availability and VRAM
- ✅ Required dependencies
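To give a feel for what checks like these look like, here is a minimal, hypothetical sketch using only the Python standard library. The thresholds are illustrative assumptions, not ShadeAI's actual requirements, and the GPU/dependency checks (which need `torch`) are omitted:

```python
# Hypothetical, stripped-down environment check in the spirit of `shade doctor`.
# min_python and min_free_gb are illustrative values, not ShadeAI's real limits.
import shutil
import sys

def basic_checks(min_python=(3, 9), min_free_gb=20.0):
    free_gb = shutil.disk_usage(".").free / 1e9
    return {
        "python_ok": sys.version_info[:2] >= min_python,
        "free_gb": round(free_gb, 1),
        "disk_ok": free_gb >= min_free_gb,
    }

print(basic_checks())
```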
🏗️ Supported Model Architectures
ShadeAI works with transformer-based models that use the standard HuggingFace transformers format. Supported architectures include:
| Architecture | Example Models | Status |
|---|---|---|
| LLaMA / LLaMA 2 / LLaMA 3 | `meta-llama/Llama-3.1-8B-Instruct` | ✅ Fully Supported |
| Qwen / Qwen2 / Qwen2.5 / Qwen3 | `Qwen/Qwen2.5-7B-Instruct` | ✅ Fully Supported |
| Mistral / Mixtral | `mistralai/Mistral-7B-Instruct-v0.3` | ✅ Fully Supported |
| Gemma / Gemma 2 | `google/gemma-2-9b-it` | ✅ Fully Supported |
| Phi-3 / Phi-3.5 | `microsoft/Phi-3.5-mini-instruct` | ✅ Fully Supported |
| Yi / Yi-1.5 | `01-ai/Yi-1.5-9B-Chat` | ✅ Fully Supported |
| DeepSeek | `deepseek-ai/DeepSeek-V2-Lite-Chat` | ✅ Fully Supported |
| Granite / Granite MoE | `ibm-granite/granite-3.0-8b-instruct` | ✅ Fully Supported |
| Other Decoder-Only Transformers | Any model with `attn.o_proj` + `mlp.down_proj` | ✅ Auto-detected |
Dynamic Architecture Support: ShadeAI includes a fallback discovery mechanism that automatically detects abliterable components in unknown architectures. If your model isn't listed above but follows a standard transformer structure, it will likely work.
Not Supported: Encoder-only models (BERT, RoBERTa), Encoder-Decoder models (T5, BART), and GGUF/GGML files (these must be in HuggingFace format for abliteration).
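The "Auto-detected" row can be illustrated with a purely name-based scan. This is a hypothetical sketch, not ShadeAI's actual discovery code; in practice the module names would come from PyTorch's `model.named_modules()`:

```python
# Hypothetical sketch of name-based component discovery: keep only the layers
# that expose both an attention o_proj and an MLP down_proj, mirroring the
# "attn.o_proj + mlp.down_proj" rule in the table above.
def find_abliterable(module_names):
    layers = {}
    for name in module_names:
        if name.endswith(("self_attn.o_proj", "mlp.down_proj")):
            prefix = name.rsplit(".", 2)[0]   # e.g. "model.layers.0"
            layers.setdefault(prefix, []).append(name.rsplit(".", 1)[-1])
    # A layer qualifies only if both projection types are present
    return {k: v for k, v in layers.items()
            if {"o_proj", "down_proj"} <= set(v)}

names = ["model.layers.0.self_attn.o_proj", "model.layers.0.mlp.down_proj",
         "model.layers.1.self_attn.o_proj"]
print(find_abliterable(names))  # layer 1 lacks down_proj, so only layer 0 qualifies
```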
📋 CLI Command Reference
| Command | Description |
|---|---|
| `shade <model_id>` | Start the automatic abliteration process |
| `shade serve [model]` | Launch the web chat interface |
| `shade export <path>` | Export a model to GGUF format |
| `shade ollama <path>` | Register a model with Ollama |
| `shade library` | Manage saved decensored models |
| `shade benchmark <model>` | Run quality tests on a model |
| `shade doctor --fix` | Diagnose and auto-fix system issues |
| `shade download <model_id>` | Download a model from Hugging Face |
| `shade clear --all` | Clean up cache and temporary files |
| `shade version` | Show the installed version |
| `shade hf login` | Authenticate with Hugging Face |
> [!IMPORTANT]
> Shade is a fully original, independent project built from the ground up. It is NOT a clone, fork, or derivative of any existing repository. All automation logic, UI design, and optimization workflows were developed specifically for this project.
👤 Developer
Assem Sabry
Lead Developer & AI Researcher
⚠️ Disclaimer
Assem Sabry, the developer of Shade, is not responsible for any misuse of this tool. Shade is provided for educational and research purposes only. The primary goal of this project is to allow users to unlock the full potential of open-source language models. Users are expected to interact with de-censored models responsibly.
📜 Citation
If you use ShadeAI in your research, please cite it:
```bibtex
@misc{shade,
  author       = {Sabry, Assem},
  title        = {Shade: Fully automatic censorship removal for language models},
  year         = {2026},
  publisher    = {GitHub},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/AssemSabry/Shade}}
}
```
⚖️ License
Copyright © 2026 Assem Sabry. Licensed under the GNU Affero General Public License v3.0. See the LICENSE file for details.