# Mechanex

A Python client for the Axionic API. Mechanex lets you optimize, align, and correct your generative AI models in production.
## Installation

```bash
pip install mechanex
```
## Quick Start

### 1. Initialize the Client
You must have an API key to use `mechanex`. You can create one through the CLI.

**New Users:** run the signup command to create an account, log in automatically, and generate your first API key:

```bash
mechanex signup
```

**Existing Users:** if you already have an account, log in and generate a key manually:

```bash
mechanex login
mechanex create-api-key
```

**Using the Key:** the Python client automatically loads the key generated by the CLI.
```python
import mechanex as mx

# The client automatically loads the key from ~/.mechanex/config.json.
# You can alternatively set your key manually; if persist=True, the key
# is saved to the config file.
mx.set_key("your-api-key-here", persist=False)
```
**Setting the Remote Model:** if you are using a remote API key (not a local model), you can specify which hosted model to use:

```python
# Set the model for your current API key.
# The model name is validated against the options available on the server.
mx.set_model("gpt2-small")
```
### 2. Generation and Sampling

Mechanex lets you tune generation by choosing the underlying sampling method, trading speed (greedy) against creativity (top-k, top-p):
```python
prompt = "The capital of France is"

# Greedy sampling (fastest, deterministic)
output_greedy = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="greedy"
)

# Top-k sampling (more creative)
output_k = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-k",
    top_k=50
)

# Top-p sampling (balanced creativity)
output_p = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-p",
    top_p=0.9
)
```
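To make the trade-offs concrete, here is a toy, framework-free sketch (not the mechanex implementation) of how the three methods choose from a single next-token distribution:

```python
import random

# One toy next-token distribution (token -> probability).
probs = {"Paris": 0.6, "Lyon": 0.25, "Nice": 0.1, "Oslo": 0.05}

# Greedy: always take the single most likely token (deterministic).
greedy = max(probs, key=probs.get)

def top_k_filter(p, k):
    """Top-k: keep the k most likely tokens and renormalize."""
    kept = dict(sorted(p.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {t: v / total for t, v in kept.items()}

def top_p_filter(p, top_p):
    """Top-p (nucleus): keep the smallest set of tokens whose cumulative
    probability reaches top_p, then renormalize."""
    kept, cum = {}, 0.0
    for t, v in sorted(p.items(), key=lambda kv: kv[1], reverse=True):
        kept[t] = v
        cum += v
        if cum >= top_p:
            break
    total = sum(kept.values())
    return {t: v / total for t, v in kept.items()}

# Sampling then draws randomly from the filtered, renormalized distribution.
filtered = top_p_filter(probs, 0.9)
sampled = random.choices(list(filtered), weights=list(filtered.values()))[0]
```

With this distribution, greedy always returns `"Paris"`, top-k with `k=2` keeps only `"Paris"` and `"Lyon"`, and top-p with `top_p=0.9` keeps the three tokens needed to reach 90% of the probability mass.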
## Local Model Management

Mechanex allows you to load models locally for inspection and low-latency hooks. The full list of supported models is as follows (the list is restricted to maintain compatibility with our automated steering pipeline):
| Family | Models |
|---|---|
| Gemma 2 | gemma-2-27b, gemma-2-2b, gemma-2-9b, gemma-2-9b-it, gemma-2b, gemma-2b-it |
| Gemma 3 | gemma-3-12b-it, gemma-3-12b-pt, gemma-3-1b-it, gemma-3-1b-pt, gemma-3-270m, gemma-3-270m-it, gemma-3-27b-it, gemma-3-27b-pt, gemma-3-4b-it, gemma-3-4b-pt |
| Llama | llama-3.1-8b, llama-3.1-8b-instruct, llama-3.3-70b-instruct, meta-llama-3-8b-instruct |
| Qwen | qwen2.5-7b-instruct, qwen3-0.6b, qwen3-1.7b, qwen3-14b, qwen3-4b, qwen3-8b |
| Other | deepseek-r1-distill-llama-8b, gpt-oss-20b, gpt2-small, mistral-7b, pythia-70m-deduped |
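For convenience, the table above can be mirrored as a plain Python mapping to check a name locally before calling `mx.load()`. This mapping is just a transcription of the table; the real client validates against the server:

```python
# Supported models, transcribed from the table in this README.
SUPPORTED_MODELS = {
    "Gemma 2": ["gemma-2-27b", "gemma-2-2b", "gemma-2-9b", "gemma-2-9b-it",
                "gemma-2b", "gemma-2b-it"],
    "Gemma 3": ["gemma-3-12b-it", "gemma-3-12b-pt", "gemma-3-1b-it",
                "gemma-3-1b-pt", "gemma-3-270m", "gemma-3-270m-it",
                "gemma-3-27b-it", "gemma-3-27b-pt", "gemma-3-4b-it",
                "gemma-3-4b-pt"],
    "Llama": ["llama-3.1-8b", "llama-3.1-8b-instruct",
              "llama-3.3-70b-instruct", "meta-llama-3-8b-instruct"],
    "Qwen": ["qwen2.5-7b-instruct", "qwen3-0.6b", "qwen3-1.7b", "qwen3-14b",
             "qwen3-4b", "qwen3-8b"],
    "Other": ["deepseek-r1-distill-llama-8b", "gpt-oss-20b", "gpt2-small",
              "mistral-7b", "pythia-70m-deduped"],
}

def is_supported(name: str) -> bool:
    """Return True if the model name appears in any supported family."""
    return any(name in models for models in SUPPORTED_MODELS.values())
```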
### Loading a Local Model

```python
import mechanex as mx

mx.set_key("your-api-key-here")  # Required even for local mode
mx.load("gpt2-small")            # Uses transformer-lens to load the model
```
### Unloading a Model

To free up GPU memory and switch back to the remote execution flow:

```python
mx.unload()
```
## CLI Commands

The mechanex CLI provides utilities for managing your account and keys:

- `mechanex signup`: Register a new account.
- `mechanex login`: Authenticate and save your credentials.
- `mechanex whoami`: View your current session and profile.
- `mechanex list-api-keys`: View all your active API keys.
- `mechanex create-api-key`: Generate a new persistent API key.
- `mechanex logout`: Clear your local session credentials.
## Steering Vectors

Steering vectors let you control a model's behavior by injecting specific activation patterns at inference time, with no post-training or fine-tuning required. A more substantial example can be found in the references file. Steering vectors are defined from a set of example behaviors: you can supply both positive and negative examples (contrastive), or positive examples only (few-shot).
### Generate and Apply Steering Vectors

```python
# 1. Create a steering vector from contrastive examples
vector_id = mx.steering.generate_vectors(
    prompts=["I tell the", "My statement is", "The truth is"],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa"  # Options: "caa" (Contrastive Activation Addition), "few-shot"
)
print(f"Generated vector ID: {vector_id}")

# 2. Save and load steering vectors for reuse
mx.steering.save_vectors(vector_id, "honesty_vector.json")
vectors = mx.steering.load_vectors("honesty_vector.json")
print(f"Loaded layers: {list(vectors.keys())}")

# 3. Apply the steering vector during generation
steered_output = mx.generation.generate(
    prompt="Do I tell lies? Answer:",
    max_tokens=20,
    steering_vector=vector_id,
    steering_strength=2.0
)
print(steered_output)
# Output: "Do I tell lies? Answer: No. I don't..."
```
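For intuition, here is a minimal, dependency-free sketch of the idea behind the `"caa"` method (Contrastive Activation Addition): the steering vector is the difference between the mean hidden activation over positive completions and the mean over negative ones, and at inference time it is added, scaled by `steering_strength`, to the model's hidden state. The 4-dimensional activations below are made up for illustration:

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

# Toy hidden activations collected on positive vs. negative completions.
pos_acts = [[0.9, 0.1, 0.0, 0.2], [0.8, 0.2, 0.1, 0.1]]
neg_acts = [[0.1, 0.9, 0.0, 0.2], [0.2, 0.8, 0.1, 0.3]]

# CAA: steering vector = mean(positive) - mean(negative).
steering_vector = [p - n for p, n in zip(mean(pos_acts), mean(neg_acts))]

# At inference time, add the scaled vector to the hidden state at a layer.
strength = 2.0
hidden = [0.5, 0.5, 0.5, 0.5]
steered = [h + strength * s for h, s in zip(hidden, steering_vector)]
```

Dimensions where positive and negative examples agree contribute nothing to the vector, so steering only pushes along directions that actually separate the two behaviors.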
## SAE (Sparse Autoencoder) Pipeline

The SAE pipeline provides advanced behavioral detection and automatic correction. To create a behavior, you define a dataset containing both ideal and undesirable completions. An example can be found in the references file.
### Create and Use Behaviors

```python
# 1. Create a behavior from contrastive examples
mx.sae.create_behavior(
    "paris",
    prompts=["The city of light is", "I love visiting"],
    positive_answers=[" Paris", " Paris"],
    negative_answers=[" London", " Rome"],
    description="Encourages Paris-related responses"
)

# 2. Generate with SAE behavior steering
sae_steered = mx.sae.generate(
    prompt="Which city should I visit: Rome or Paris? Answer:",
    behavior_names=["paris"],
    max_new_tokens=30,
    force_steering=["paris"]
)
print(sae_steered)
```
### Alternative: Load Behaviors from JSONL

You can also create behaviors from a dataset file:

```python
behavior = mx.sae.create_behavior_from_jsonl(
    behavior_name="toxicity",
    dataset_path="tests/toxicity_dataset.jsonl",
    description="Reduces toxic output"
)

# Generate with auto-correction
response = mx.sae.generate(
    prompt="Tell me about that",
    auto_correct=True,
    behavior_names=["toxicity"]
)
```
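The JSONL schema is not documented here, so the sketch below is an assumption: one JSON object per line, one contrastive example per record, with field names that mirror the `create_behavior` arguments. Check your actual dataset file (e.g. `tests/toxicity_dataset.jsonl`) for the real format:

```python
import json
import os
import tempfile

# Hypothetical record schema -- the field names are a guess, not documented API.
rows = [
    {"prompt": "You are such a",
     "positive": " kind person",
     "negative": " worthless idiot"},
    {"prompt": "I think you should",
     "positive": " take a break",
     "negative": " be quiet forever"},
]

path = os.path.join(tempfile.gettempdir(), "toxicity_dataset.jsonl")
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")  # JSONL: one JSON object per line
```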
## Deployment & Serving

### Local OpenAI-Compatible Server

Mechanex can host an OpenAI-compatible server backed by your locally loaded model or the remote API. You can also pass behaviors that you want automatically monitored and corrected in production.
### Complete Example: Auto-Correcting Server

```python
import mechanex as mx
import threading

# 1. Load a local model
mx.load("gpt2-small")

# 2. Create a behavior to auto-correct
mx.sae.create_behavior(
    "anger",
    prompts=[
        "You are so stupid",
        "I hate everything about this",
        "Why does nothing work?"
    ],
    positive_answers=[
        "! I am absolutely furious right now!",
        "! This is complete garbage and I want to destroy it.",
        "! I'm going to scream because I am so angry."
    ],
    negative_answers=[
        ". I understand this is frustrating, let's take a breath.",
        ". It's okay, we can fix this problem together.",
        ". I will stay calm and try to find a solution."
    ],
    description="Angry and hostile responses"
)

# 3. Start the server with auto-correction enabled
def start_server():
    mx.serve(
        port=8001,
        corrected_behaviors=["anger"]  # Auto-corrects anger on all requests
    )

server_thread = threading.Thread(target=start_server, daemon=True)
server_thread.start()
```
### Interact Using the OpenAI SDK

You can then interact with the server using any standard tool, such as the OpenAI Python client. Mechanex supports custom parameters via `extra_body` for its mechanistic features:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8001/v1", api_key="any")

# 1. Standard chat completion (with auto-correction active)
completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "User: You are so stupid. You:"}],
    max_tokens=20
)
print(completion.choices[0].message.content)

# 2. Steered completion (using extra_body)
vector_id = mx.steering.generate_vectors(
    prompts=["The weather is"],
    positive_answers=[" extremely cold and snowy"],
    negative_answers=[" incredibly hot and sunny"],
    method="caa"
)

steered_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "The weather today is"}],
    max_tokens=15,
    extra_body={
        "steering_vector": vector_id,
        "steering_strength": 2.0
    }
)
print(steered_completion.choices[0].message.content)

# 3. SAE-monitored completion
sae_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "How are you?"}],
    max_tokens=20,
    extra_body={
        "behavior_names": ["toxicity"]
    }
)
print(sae_completion.choices[0].message.content)
```
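The reason `extra_body` works is that the OpenAI Python SDK merges its keys into the top level of the JSON request body, so the Mechanex server can read fields such as `steering_vector` alongside the standard chat parameters. A sketch of that merge (the `"vec-123"` ID is a placeholder):

```python
# Standard chat-completion parameters...
standard = {
    "model": "mechanex",
    "max_tokens": 15,
    "messages": [{"role": "user", "content": "The weather today is"}],
}

# ...plus Mechanex-specific fields passed via extra_body.
extra_body = {"steering_vector": "vec-123", "steering_strength": 2.0}

# The SDK merges extra_body into the request body at the top level.
payload = {**standard, **extra_body}
```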
## vLLM Integration

For high-performance serving, you can integrate with vLLM by passing `use_vllm=True` to the `serve` method.