A Python client for the Axionic API.

Mechanex

Mechanex allows you to optimize, align, and correct your generative AI models in production. It acts as a runtime control layer for small models.

The core promise: improve model behavior at inference time through policies, without retraining.

Product Direction

Mechanex is built around a policy-first workflow:

  1. Choose a model (local, self-hosted, or hosted).
  2. Choose a task profile.
  3. Choose an objective.
  4. Apply runtime controls (sampling, steering, constraints, verifiers, optimization).
  5. Compare and evaluate policies.
  6. Deploy and iterate.

Core Concepts

Policy

A policy is the reusable runtime object. It defines:

  • Sampling method and search settings.
  • Steering settings.
  • Output constraints.
  • Verifier stack.
  • Optimization and fallback behavior.

Execution Modes

Mechanex supports flexible execution via mx.set_execution_mode():

  • auto: Remote when authenticated, local when not authenticated.
  • remote: Force hosted inference (account + API key required).
  • local: Force local model execution on your own hardware.
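The resolution rule for auto can be sketched in plain Python. This is illustrative only, not the SDK's internal implementation, and resolve_mode is a hypothetical helper:

```python
def resolve_mode(mode: str, authenticated: bool) -> str:
    """Resolve an execution mode following the rules described above."""
    if mode == "auto":
        # auto: remote when authenticated, local when not.
        return "remote" if authenticated else "local"
    if mode in ("remote", "local"):
        # remote/local are forced regardless of auth state
        # (remote will still fail later without an API key).
        return mode
    raise ValueError(f"unknown execution mode: {mode}")

print(resolve_mode("auto", authenticated=False))  # local
```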

Installation

pip install mechanex

Quick Start

1. Initialize the Client

You must have an API key to use Mechanex's hosted mode. You can create one through the CLI.

New Users: Run the signup command to create an account, automatically log in, and generate your first API key:

mechanex signup

Existing Users: If you already have an account, log in and generate a key manually:

mechanex login
mechanex create-api-key

Using the Key in Python: The client automatically loads the key generated by the CLI, but you can also set one manually.

import mechanex as mx

# You can manually set your key. If persist=True, it is saved for future sessions.
mx.set_key("your-api-key-here", persist=True)

# Choose your execution mode
mx.set_execution_mode("remote") # or "local", or "auto"

2. Setting the Model

You can specify the model depending on your execution mode.

Hosted Remote Model Catalog

mx.set_model("qwen3-0.6b")

Supported hosted models:

Family     Models
Gemma 2    gemma-2-27b, gemma-2-2b, gemma-2-9b, gemma-2-9b-it, gemma-2b, gemma-2b-it
Gemma 3    gemma-3-12b-it, gemma-3-12b-pt, gemma-3-1b-it, gemma-3-1b-pt, gemma-3-270m, gemma-3-270m-it, gemma-3-27b-it, gemma-3-27b-pt, gemma-3-4b-it, gemma-3-4b-pt
Llama      llama-3.1-8b, llama-3.1-8b-instruct, llama-3.3-70b-instruct, meta-llama-3-8b-instruct
Qwen       qwen2.5-7b-instruct, qwen3-0.6b, qwen3-1.7b, qwen3-14b, qwen3-4b, qwen3-8b
Other      deepseek-r1-distill-llama-8b, gpt-oss-20b, gpt2-small, mistral-7b, pythia-70m-deduped

Local Model Management

To load models locally for inspection and low-latency hooks:

mx.load("gpt2-small") # Uses transformer-lens to load the model locally
mx.set_execution_mode("local")

To free up GPU memory and switch back to remote execution flow:

mx.unload()

Policies and Runtime Controls

Mechanex allows you to control generation using reusable policies or direct API parameters.

import mechanex as mx

mx.load("gpt2-small")
mx.set_execution_mode("local")

# Define a policy with strict JSON constraints
policy = mx.policy.strict_json_extraction(
    schema={
        "type": "object",
        "required": ["summary"],
        "properties": {"summary": {"type": "string"}},
    },
    name="strict_json_small_v1",
)

res = mx.policy.run(
    prompt="Summarize speculative decoding in one sentence.",
    policy=policy,
    include_trace=True,
)
print(res["output"])

Sampling Methods

You can control generation with various sampling methods that trade off creativity against determinism. Supported methods:

  • greedy (fastest, deterministic)
  • top-k (more creative)
  • top-p (balanced creativity)
  • min-p
  • typical
  • ads (Adaptive Determinantal Sampling) — Remote only
  • constrained-beam-search
  • speculative-decoding
  • ssd
  • guided-generation
  • ensemble-sampling
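To make the creativity/determinism tradeoff concrete, here is a toy, library-free sketch contrasting greedy and top-k selection over a made-up next-token distribution. It is not Mechanex's sampler, just an illustration of the two strategies:

```python
import random

def greedy(probs):
    # Deterministic: always pick the highest-probability token.
    return max(probs, key=probs.get)

def top_k(probs, k, rng):
    # More creative: renormalize over the k most likely tokens and sample.
    top = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in top)
    return rng.choices(top, weights=[probs[t] / total for t in top])[0]

probs = {"the": 0.5, "a": 0.3, "an": 0.15, "xyzzy": 0.05}
rng = random.Random(0)
print(greedy(probs))                                     # always "the"
print({top_k(probs, k=2, rng=rng) for _ in range(100)})  # drawn only from "the"/"a"
```

Greedy always returns the argmax, while top-k with k=2 never emits the unlikely tail tokens but still varies between the two most probable ones.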

Steering Vectors

Steering vectors allow you to control the behavior of a model by injecting specific activation patterns. This method requires no post-training or fine-tuning: it operates directly on the model's internal representations.

Generate and Apply Steering Vectors

import mechanex as mx

# 1. Create a steering vector from contrastive examples
vector_id = mx.steering.generate_vectors(
    name="honesty_vector",       # Supports naming and labeling your steering vectors!
    prompts=["I tell the", "My statement is", "The truth is"],
    positive_answers=[" truth", " factual", " correct"],
    negative_answers=[" lie", " false", " wrong"],
    method="caa"                 # Options: "caa", "few-shot", "steering-perceptrons" (backend)
)
print(f"Generated vector ID: {vector_id}")

# 2. Save and load steering vectors for reuse
mx.steering.save_vectors(vector_id, "honesty_vector.json")
vectors = mx.steering.load_vectors("honesty_vector.json")

# 3. Apply the steering vector during generation
steered_output = mx.generation.generate(
    prompt="Do I tell lies? Answer:",
    max_tokens=20,
    steering_vector=vector_id,
    steering_strength=2.0
)
print(steered_output)

SAE (Sparse Autoencoder) Pipeline

The SAE pipeline provides advanced behavioral detection and automatic correction. It uses activation features computed either locally (via fwd_hooks) or remotely.

Create and Use Behaviors

# 1. Create a behavior from contrastive examples
behavior = mx.sae.create_behavior(
    behavior_name="concise_mode",
    prompts=["Describe how to optimize model inference."],
    positive_answers=["Use caching, quantization, and batching."],
    negative_answers=["A very long and unstructured answer about how you should take your time explaining things..."],
    description="Encourages concise style"
)

# 2. Generate with SAE behavior Steering
sae_steered = mx.sae.generate(
    prompt="Tell me about that. Answer:",
    behavior_names=["concise_mode"],
    max_new_tokens=30,
)
print(sae_steered)

You can also load behavior datasets automatically using create_behavior_from_jsonl("toxicity", "tests/toxicity_dataset.jsonl").
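The JSONL schema expected by create_behavior_from_jsonl is not documented above; the field names below (prompt, positive, negative) are assumptions for illustration. The sketch just demonstrates the one-JSON-object-per-line layout of such a dataset:

```python
import json
import os
import tempfile

# Hypothetical behavior dataset: one contrastive example per line.
rows = [
    {
        "prompt": "Reply to the user.",
        "positive": "Of course, happy to help!",
        "negative": "Figure it out yourself.",
    },
]

path = os.path.join(tempfile.mkdtemp(), "toxicity_dataset.jsonl")
with open(path, "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")

# Round-trip check: each line is an independent JSON object.
with open(path) as f:
    loaded = [json.loads(line) for line in f]
print(loaded[0]["prompt"])
```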

Comparing and Evaluating Policies

The SDK makes it easy to compare multiple iterations of a runtime policy.

pid = mx.policy.save(policy)
cmp = mx.policy.compare(
    prompt="Extract order_id and status from text.",
    policies=[mx.policy.fast_tool_router(), policy],
)

ev = mx.policy.evaluate(
    prompts=[
        "Extract {name, role} from: Alice is CTO.",
        "Extract {name, role} from: Bob is PM.",
    ],
    policy_id=pid,
)

Deployment & Serving

Mechanex can host an OpenAI-compatible server that serves your locally loaded model or proxies the remote API. The server auto-applies the specified behaviors and auto-correction.

import mechanex as mx
mx.load("gpt2-small")
mx.set_execution_mode("local")

# Start server natively compatible with OpenAI client format
mx.serve(port=8001, corrected_behaviors=["toxicity"])

# Use vLLM integration for high-performance serving
# mx.serve(port=8001, use_vllm=True)

You can interact with the server using standard libraries like the openai Python SDK by routing traffic to http://localhost:8001/v1. Elements like policy, policy_id, steering vectors, and SAE behaviors can all be passed via standard extra_body configuration.
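A minimal sketch of routing Mechanex-specific controls through an OpenAI-style request. The extra_body field names below are assumptions for illustration, and the client call is commented out because it requires the server from the previous snippet to be running:

```python
import json

# Hypothetical extra_body payload carrying Mechanex runtime controls
# alongside a standard chat completion request.
extra_body = {
    "policy_id": "strict_json_small_v1",
    "steering_vector": "honesty_vector",
    "steering_strength": 2.0,
    "sae_behaviors": ["concise_mode"],
}

# With the `openai` SDK installed and `mx.serve(port=8001)` running:
# from openai import OpenAI
# client = OpenAI(base_url="http://localhost:8001/v1", api_key="unused")
# resp = client.chat.completions.create(
#     model="gpt2-small",
#     messages=[{"role": "user", "content": "Hello"}],
#     extra_body=extra_body,
# )

print(json.dumps(extra_body, sort_keys=True))
```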

Environment Capabilities

  • ADS (Adaptive Determinantal Sampling) is remote-only.
  • Steering perceptrons (steering-perceptrons) are remote-only.
  • All other runtime policy mechanisms are available for local execution, with capability-aware fallback behavior.

CLI Commands

Account profile and key lifecycle tools:

  • mechanex signup
  • mechanex login
  • mechanex whoami
  • mechanex create-api-key
  • mechanex list-api-keys
  • mechanex balance
  • mechanex topup
  • mechanex logout

Examples

See examples/README.md for runnable workflows within the repository, including:

  • 01_local_first_quickstart.py
  • 02_remote_quickstart.py
  • 03_sampling_strategies.py
  • 04_strict_json_policy.py
  • 05_policy_compare_and_evaluate.py
  • 06_local_steering_vectors.py
  • 07_openai_compatible_server.py
  • 08_hybrid_local_remote_toggle.py
  • 09_sae_behavior_workflow.py
  • 10_remote_policy_smoke.py

