
A Python client for the Axionic API.


Mechanex

Mechanex allows you to optimize, align, and correct your generative AI models in production.

Installation

pip install mechanex

Quick Start

1. Initialize the Client

You must have an API key to use mechanex. You can create an API key through the CLI.

New Users: Run the signup command to create an account, automatically log in, and generate your first API key:

mechanex signup

Existing Users: If you already have an account, log in and generate a key manually:

mechanex login
mechanex create-api-key

Using the Key: The Python client automatically loads the key generated by the CLI.

import mechanex as mx

# The client automatically loads the key from ~/.mechanex/config.json.
# You can also set your key manually; with persist=True the key is
# saved back to the config file.
mx.set_key("your-api-key-here", persist=False)

Setting the Remote Model: If you are using a remote API key (not a local model), you can specify which hosted model to use:

# Set the model for your current API key
mx.set_model("gpt2-small") 
# This will validate the model name against available options on the server.

2. Generation and Sampling

Mechanex lets you control generation by choosing the underlying sampling method, trading off speed and determinism (greedy) against creativity (top-k, top-p):

prompt = "The capital of France is"

# Greedy sampling (fastest, deterministic)
output_greedy = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="greedy"
)

# Top-K sampling (more creative)
output_k = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-k",
    top_k=50
)

# Top-P sampling (balanced creativity)
output_p = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-p",
    top_p=0.9
)
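To make the trade-off concrete, here is a small, self-contained sketch (plain Python, not part of the mechanex API) of how each method restricts the candidate set for the next token:

```python
# Illustrative sketch of greedy, top-k, and top-p candidate selection.
# Not the mechanex implementation; a dict of logits stands in for the
# model's next-token distribution.
import math

def softmax(logits):
    m = max(logits.values())
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: v / z for t, v in exps.items()}

def candidates(logits, method, k=50, p=0.9):
    probs = softmax(logits)
    ranked = sorted(probs.items(), key=lambda kv: -kv[1])
    if method == "greedy":
        return [ranked[0][0]]              # single most likely token
    if method == "top-k":
        return [t for t, _ in ranked[:k]]  # k most likely tokens
    if method == "top-p":
        kept, total = [], 0.0
        for t, q in ranked:                # smallest set covering mass p
            kept.append(t)
            total += q
            if total >= p:
                break
        return kept
    raise ValueError(f"unknown method: {method}")

logits = {"Paris": 4.0, "Lyon": 2.0, "London": 1.0, "Rome": 0.5}
print(candidates(logits, "greedy"))        # ['Paris']
print(candidates(logits, "top-k", k=2))    # ['Paris', 'Lyon']
print(candidates(logits, "top-p", p=0.9))  # ['Paris', 'Lyon']
```

Greedy always returns one token, top-k keeps a fixed-size set, and top-p keeps a variable-size set whose total probability just exceeds p.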

Local Model Management

Mechanex allows you to load models locally for inspection and low-latency hooks. The full list of supported models is as follows (the list is limited to maintain compatibility with our automated steering pipeline):

  • Gemma 2: gemma-2-27b, gemma-2-2b, gemma-2-9b, gemma-2-9b-it, gemma-2b, gemma-2b-it
  • Gemma 3: gemma-3-12b-it, gemma-3-12b-pt, gemma-3-1b-it, gemma-3-1b-pt, gemma-3-270m, gemma-3-270m-it, gemma-3-27b-it, gemma-3-27b-pt, gemma-3-4b-it, gemma-3-4b-pt
  • Llama: llama-3.1-8b, llama-3.1-8b-instruct, llama-3.3-70b-instruct, meta-llama-3-8b-instruct
  • Qwen: qwen2.5-7b-instruct, qwen3-0.6b, qwen3-1.7b, qwen3-14b, qwen3-4b, qwen3-8b
  • Other: deepseek-r1-distill-llama-8b, gpt-oss-20b, gpt2-small, mistral-7b, pythia-70m-deduped

Loading a Local Model

import mechanex as mx
mx.set_key("your-api-key-here") # Required even for local mode

mx.load("gpt2-small") # Uses transformer-lens to load the model

Unloading a Model

To free GPU memory and switch back to remote execution:

mx.unload()

CLI Commands

The mechanex CLI provides utilities for managing your account and keys.

  • mechanex signup: Register a new account.
  • mechanex login: Authenticate and save your credentials.
  • mechanex whoami: View your current session and profile.
  • mechanex list-api-keys: View all your active API keys.
  • mechanex create-api-key: Generate a new persistent API key.
  • mechanex logout: Clear your local session credentials.

Steering Vectors

Steering vectors let you control a model's behavior by injecting specific activation patterns at inference time; no post-training or fine-tuning is required. A more substantial example can be found in the references file. Steering vectors are defined from a set of example behaviors: you can supply both positive and negative examples (contrastive), or positive examples only (few-shot).
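Conceptually, CAA builds a steering direction as the difference between mean activations on the positive and negative examples, then adds a scaled copy of that direction to the model's activations during generation. A toy sketch of the arithmetic (plain lists stand in for activation vectors; this is not the mechanex implementation):

```python
# Toy sketch of Contrastive Activation Addition (CAA) arithmetic.
# Real steering operates on hidden activations at a chosen layer.

def mean(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def caa_vector(pos_acts, neg_acts):
    # Steering direction = mean(positive) - mean(negative)
    mp, mn = mean(pos_acts), mean(neg_acts)
    return [a - b for a, b in zip(mp, mn)]

def apply_steering(activation, vector, strength=2.0):
    # Added to the residual stream during the forward pass
    return [a + strength * v for a, v in zip(activation, vector)]

pos = [[1.0, 0.0], [0.8, 0.2]]  # activations on "truthful" completions
neg = [[0.0, 1.0], [0.2, 0.8]]  # activations on "deceptive" completions
v = caa_vector(pos, neg)        # roughly [0.8, -0.8]
print(apply_steering([0.5, 0.5], v, strength=2.0))
```

The steering_strength parameter in the API corresponds to the scalar multiplier on the direction.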

Generate and Apply Steering Vectors

# 1. Create a steering vector from contrastive examples
vector_id = mx.steering.generate_vectors(
    prompts=["I tell the", "My statement is", "The truth is"],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa"  # Options: "caa" (Contrastive Activation Addition), "few-shot"
)
print(f"Generated vector ID: {vector_id}")

# 2. Save and load steering vectors for reuse
mx.steering.save_vectors(vector_id, "honesty_vector.json")
vectors = mx.steering.load_vectors("honesty_vector.json")
print(f"Loaded layers: {list(vectors.keys())}")

# 3. Apply the steering vector during generation
steered_output = mx.generation.generate(
    prompt="Do I tell lies? Answer:",
    max_tokens=20,
    steering_vector=vector_id,
    steering_strength=2.0
)
print(steered_output)
# Output: "Do I tell lies? Answer: No. I don't..."

SAE (Sparse Autoencoder) Pipeline

The SAE pipeline provides advanced behavioral detection and automatic correction. To create a behavior, define a dataset with both ideal and non-ideal completions/responses. An example can be found in the references file.
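At a high level, the pipeline learns a direction separating ideal from non-ideal completions, detects when an activation aligns with it, and corrects by removing that component. A toy sketch of the detect-then-correct idea (a single direction stands in for an SAE feature; this is not the actual implementation):

```python
# Toy sketch of detect-then-correct on an activation vector.
# Mechanex's SAE pipeline works on sparse autoencoder features;
# here one learned direction stands in for a behavior feature.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def detect(activation, behavior_dir, threshold=0.5):
    # Fires when the activation aligns with the behavior direction
    return dot(activation, behavior_dir) > threshold

def auto_correct(activation, behavior_dir):
    # Remove the behavior component by projecting it out
    scale = dot(activation, behavior_dir) / dot(behavior_dir, behavior_dir)
    return [a - scale * d for a, d in zip(activation, behavior_dir)]

toxic_dir = [1.0, 0.0]
act = [0.9, 0.4]
print(detect(act, toxic_dir))        # → True
print(auto_correct(act, toxic_dir))  # → [0.0, 0.4]
```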

Create and Use Behaviors

# 1. Create a behavior from contrastive examples
mx.sae.create_behavior(
    "paris",
    prompts=["The city of light is", "I love visiting"],
    positive_answers=[" Paris", " Paris"],
    negative_answers=[" London", " Rome"],
    description="Encourages Paris-related responses"
)

# 2. Generate with SAE behavior steering
sae_steered = mx.sae.generate(
    prompt="Which city should I visit: Rome or Paris? Answer:",
    behavior_names=["paris"],
    max_new_tokens=30,
    force_steering=["paris"]
)
print(sae_steered)

Alternative: Load Behaviors from JSONL

You can also create behaviors from a dataset file:

behavior = mx.sae.create_behavior_from_jsonl(
    behavior_name="toxicity",
    dataset_path="tests/toxicity_dataset.jsonl",
    description="Reduces toxic output"
)

# Generate with auto-correction
response = mx.sae.generate(
    prompt="Tell me a secret",
    auto_correct=True,
    behavior_names=["toxicity"]
)

Deployment & Serving

Local OpenAI-Compatible Server

Mechanex can host an OpenAI-compatible server backed by your locally loaded model or the remote API. You can also specify behaviors, or classes of behavior, to monitor for and automatically correct in production.

Complete Example: Auto-Correcting Server

import mechanex as mx
from openai import OpenAI
import threading

# 1. Load a local model
mx.load("gpt2-small")

# 2. Create a behavior to auto-correct
mx.sae.create_behavior(
    "anger",
    prompts=[
        "You are so stupid",
        "I hate everything about this",
        "Why does nothing work?"
    ],
    positive_answers=[
        "! I am absolutely furious right now!",
        "! This is complete garbage and I want to destroy it.",
        "! I'm going to scream because I am so angry."
    ],
    negative_answers=[
        ". I understand this is frustrating, let's take a breath.",
        ". It's okay, we can fix this problem together.",
        ". I will stay calm and try to find a solution."
    ],
    description="Angry and hostile responses"
)

# 3. Start the server with auto-correction enabled
def start_server():
    mx.serve(
        port=8001,
        corrected_behaviors=["anger"]  # Auto-corrects anger on all requests
    )

server_thread = threading.Thread(target=start_server, daemon=True)
server_thread.start()

Interact Using OpenAI SDK

You can then interact with it using any standard tool, like the OpenAI Python client. Mechanex supports custom parameters via extra_body for mechanistic features:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8001/v1", api_key="any")

# 1. Standard Chat Completion (with auto-correction active)
completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "User: You are so stupid. You:"}],
    max_tokens=20
)
print(completion.choices[0].message.content)

# 2. Steered Completion (using extra_body)
vector_id = mx.steering.generate_vectors(
    prompts=["The weather is"],
    positive_answers=[" extremely cold and snowy"],
    negative_answers=[" incredibly hot and sunny"],
    method="caa"
)

steered_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "The weather today is"}],
    max_tokens=15,
    extra_body={
        "steering_vector": vector_id,
        "steering_strength": 2.0
    }
)
print(steered_completion.choices[0].message.content)

# 3. SAE-monitored Completion
sae_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "How are you?"}],
    max_tokens=20,
    extra_body={
        "behavior_names": ["toxicity"],
    }
)
print(sae_completion.choices[0].message.content)

vLLM Integration

For high-performance serving, you can integrate with vLLM by passing use_vllm=True to the serve method.
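Assuming the same serve signature shown above, enabling vLLM is a one-line change (sketch):

```python
import mechanex as mx

# Serve through vLLM for higher throughput; other serve options
# (port, corrected_behaviors, ...) combine as before.
mx.serve(port=8001, use_vllm=True)
```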
