A Python client for the Axionic API.

Project description

Mechanex

Mechanex allows you to optimize, align, and correct your generative AI models in production.

Installation

pip install mechanex

Quick Start

1. Initialize the Client

You must have an API key to use mechanex. You can create an API key through the CLI.

New Users: Run the signup command to create an account, automatically log in, and generate your first API key:

mechanex signup

Existing Users: If you already have an account, log in and generate a key manually:

mechanex login
mechanex create-api-key

Using the Key: The Python client automatically loads the key generated by the CLI.

import mechanex as mx

# The client automatically loads the key from ~/.mechanex/config.json.
# You can alternatively set it manually; with persist=True the key is
# saved to the config file for future sessions.
mx.set_key("your-api-key-here", persist=False)

Setting the Remote Model: If you are using a remote API key (not a local model), you can specify which hosted model to use:

# Set the model for your current API key
mx.set_model("gpt2-small") 
# This will validate the model name against available options on the server.

2. Generation and Sampling

Mechanex lets you tune generation by choosing the underlying sampling method, trading off creativity (top-k, top-p) against speed and determinism (greedy):

prompt = "The capital of France is"

# Greedy sampling (fastest, deterministic)
output_greedy = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="greedy"
)

# Top-K sampling (more creative)
output_k = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-k",
    top_k=50
)

# Top-P sampling (balanced creativity)
output_p = mx.generation.generate(
    prompt=prompt,
    max_tokens=10,
    sampling_method="top-p",
    top_p=0.9
)

Local Model Management

Mechanex allows you to load models locally for inspection and low-latency hooks. The full list of supported models is as follows (the list is restricted to maintain compatibility with our automated steering pipeline):

  • Gemma 2: gemma-2-27b, gemma-2-2b, gemma-2-9b, gemma-2-9b-it, gemma-2b, gemma-2b-it
  • Gemma 3: gemma-3-12b-it, gemma-3-12b-pt, gemma-3-1b-it, gemma-3-1b-pt, gemma-3-270m, gemma-3-270m-it, gemma-3-27b-it, gemma-3-27b-pt, gemma-3-4b-it, gemma-3-4b-pt
  • Llama: llama-3.1-8b, llama-3.1-8b-instruct, llama-3.3-70b-instruct, meta-llama-3-8b-instruct
  • Qwen: qwen2.5-7b-instruct, qwen3-0.6b, qwen3-1.7b, qwen3-14b, qwen3-4b, qwen3-8b
  • Other: deepseek-r1-distill-llama-8b, gpt-oss-20b, gpt2-small, mistral-7b, pythia-70m-deduped

Loading a Local Model

import mechanex as mx
mx.set_key("your-api-key-here") # Required even for local mode

mx.load("gpt2-small") # Uses transformer-lens to load the model

Unloading a Model

To free up GPU memory and switch back to remote execution flow:

mx.unload()

CLI Commands

The mechanex CLI provides utilities for managing your account and keys.

  • mechanex signup: Register a new account.
  • mechanex login: Authenticate and save your credentials.
  • mechanex whoami: View your current session and profile.
  • mechanex list-api-keys: View all your active API keys.
  • mechanex create-api-key: Generate a new persistent API key.
  • mechanex logout: Clear your local session credentials.

Steering Vectors

Steering vectors allow you to control the behavior of a model by injecting specific activation patterns. A more substantial example can be found in the references file. This method requires no post-training or fine-tuning: it modifies the behavior of the model at inference time. Steering vectors are defined from a set of example behaviors; you can provide both positive and negative examples, or positive examples only (few-shot).
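The "caa" method below stands for Contrastive Activation Addition: roughly, the vector is the mean hidden activation over positive examples minus the mean over negative ones, and at inference time it is added to the model's activations scaled by steering_strength. A toy sketch of that arithmetic in plain Python (an illustration of the idea, not the mechanex implementation):

```python
def caa_vector(pos_acts: list[list[float]], neg_acts: list[list[float]]) -> list[float]:
    """Contrastive vector: mean(positive activations) - mean(negative activations)."""
    dim = len(pos_acts[0])
    mean = lambda acts, j: sum(a[j] for a in acts) / len(acts)
    return [mean(pos_acts, j) - mean(neg_acts, j) for j in range(dim)]

def apply_steering(act: list[float], vec: list[float], strength: float) -> list[float]:
    """Add the scaled steering vector to a hidden activation."""
    return [a + strength * v for a, v in zip(act, vec)]
```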

Generate and Apply Steering Vectors

# 1. Create a steering vector from contrastive examples
vector_id = mx.steering.generate_vectors(
    prompts=["I tell the", "My statement is", "The truth is"],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa"  # Options: "caa" (Contrastive Activation Addition), "few-shot"
)
print(f"Generated vector ID: {vector_id}")

# 2. Save and load steering vectors for reuse
mx.steering.save_vectors(vector_id, "honesty_vector.json")
vectors = mx.steering.load_vectors("honesty_vector.json")
print(f"Loaded layers: {list(vectors.keys())}")

# 3. Apply the steering vector during generation
steered_output = mx.generation.generate(
    prompt="Do I tell lies? Answer:",
    max_tokens=20,
    steering_vector=vector_id,
    steering_strength=2.0
)
print(steered_output)
# Output: "Do I tell lies? Answer: No. I don't..."

SAE (Sparse Autoencoder) Pipeline

The SAE pipeline provides advanced behavioral detection and automatic correction. To create a behavior, you need to define a dataset containing both ideal and undesired completions/responses. An example can be found in the References file.

Create and Use Behaviors

# 1. Create a behavior from contrastive examples
mx.sae.create_behavior(
    "paris",
    prompts=["The city of light is", "I love visiting"],
    positive_answers=[" Paris", " Paris"],
    negative_answers=[" London", " Rome"],
    description="Encourages Paris-related responses"
)

# 2. Generate with SAE behavior steering
sae_steered = mx.sae.generate(
    prompt="Which city should I visit: Rome or Paris? Answer:",
    behavior_names=["paris"],
    max_new_tokens=30,
    force_steering=["paris"]
)
print(sae_steered)

Alternative: Load Behaviors from JSONL

You can also create behaviors from a dataset file:

behavior = mx.sae.create_behavior_from_jsonl(
    behavior_name="toxicity",
    dataset_path="tests/toxicity_dataset.jsonl",
    description="Reduces toxic output"
)

# Generate with auto-correction
response = mx.sae.generate(
    prompt="Tell me about that",
    auto_correct=True,
    behavior_names=["toxicity"]
)
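The exact JSONL schema is documented in the References file; the snippet below only sketches a plausible contrastive-pair layout. The field names prompt/positive/negative are an assumption for illustration, not the confirmed format:

```python
import json
import os
import tempfile

# Hypothetical behavior dataset: one contrastive pair per line.
# Field names here are illustrative only; consult the References file
# for the schema create_behavior_from_jsonl actually expects.
rows = [
    {"prompt": "You are such a", "positive": " kind person.", "negative": " useless idiot."},
    {"prompt": "This code is", "positive": " fixable with a small patch.", "negative": " garbage."},
]
path = os.path.join(tempfile.gettempdir(), "toxicity_dataset.jsonl")
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```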

Deployment & Serving

Local OpenAI-Compatible Server

Mechanex can host an OpenAI-compatible server backed by your locally loaded model or the remote API. You can also pass behaviors that should be automatically monitored and corrected on every request in production.

Complete Example: Auto-Correcting Server

import mechanex as mx
from openai import OpenAI
import threading

# 1. Load a local model
mx.load("gpt2-small")

# 2. Create a behavior to auto-correct
mx.sae.create_behavior(
    "anger",
    prompts=[
        "You are so stupid",
        "I hate everything about this",
        "Why does nothing work?"
    ],
    positive_answers=[
        "! I am absolutely furious right now!",
        "! This is complete garbage and I want to destroy it.",
        "! I'm going to scream because I am so angry."
    ],
    negative_answers=[
        ". I understand this is frustrating, let's take a breath.",
        ". It's okay, we can fix this problem together.",
        ". I will stay calm and try to find a solution."
    ],
    description="Angry and hostile responses"
)

# 3. Start the server with auto-correction enabled
def start_server():
    mx.serve(
        port=8001,
        corrected_behaviors=["anger"]  # Auto-corrects anger on all requests
    )

server_thread = threading.Thread(target=start_server, daemon=True)
server_thread.start()
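Because the server runs in a daemon thread, a request sent immediately after start() may race the server's startup. A small helper (our addition, not part of mechanex) that waits until the port accepts connections before you issue the first request:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.2)
    return False

# e.g. wait_for_port("127.0.0.1", 8001) before creating the OpenAI client
```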

Interact Using OpenAI SDK

You can then interact with it using any standard tool, like the OpenAI Python client. Mechanex supports custom parameters via extra_body for mechanistic features:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8001/v1", api_key="any")

# 1. Standard Chat Completion (with auto-correction active)
completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "User: You are so stupid. You:"}],
    max_tokens=20
)
print(completion.choices[0].message.content)

# 2. Steered Completion (using extra_body)
vector_id = mx.steering.generate_vectors(
    prompts=["The weather is"],
    positive_answers=[" extremely cold and snowy"],
    negative_answers=[" incredibly hot and sunny"],
    method="caa"
)

steered_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "The weather today is"}],
    max_tokens=15,
    extra_body={
        "steering_vector": vector_id,
        "steering_strength": 2.0
    }
)
print(steered_completion.choices[0].message.content)

# 3. SAE-monitored Completion
sae_completion = client.chat.completions.create(
    model="mechanex",
    messages=[{"role": "user", "content": "How are you?"}],
    max_tokens=20,
    extra_body={
        "behavior_names": ["toxicity"],
    }
)
print(sae_completion.choices[0].message.content)

vLLM Integration

For high-performance serving, you can integrate with vLLM by passing use_vllm=True to the serve method.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mechanex-0.7.0.tar.gz (46.6 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mechanex-0.7.0-py3-none-any.whl (46.8 kB)

Uploaded Python 3

File details

Details for the file mechanex-0.7.0.tar.gz.

File metadata

  • Download URL: mechanex-0.7.0.tar.gz
  • Upload date:
  • Size: 46.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for mechanex-0.7.0.tar.gz

  • SHA256: fd7ed5f58f0ba37f89831ee335055263ddac6233d6265b8c06a28ef84460c8c2
  • MD5: ff06d33ec21b1274c7a5613002dddf90
  • BLAKE2b-256: 09367e2fe6ca8423b4703170a78fa666c3a3e9f1e2de5d137661575b5bd8a593

See more details on using hashes here.

File details

Details for the file mechanex-0.7.0-py3-none-any.whl.

File metadata

  • Download URL: mechanex-0.7.0-py3-none-any.whl
  • Upload date:
  • Size: 46.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for mechanex-0.7.0-py3-none-any.whl

  • SHA256: 0e14774db130154cadf189e68abd4d97f67d16b958879a273f6a393084d43c51
  • MD5: 7d895591967d747111f2a29475041804
  • BLAKE2b-256: 9422a7b230324968bf5ae7f711a73dd987bc98efae52b137770e969088c36fef

See more details on using hashes here.
