LLM wrapper, for telemetry and internal model routing

These details have not been verified by PyPI

Project links

Homepage

Project description

Maniac

LLM-agnostic AI program orchestration with continuous prompt optimization and LoRA fine-tuning across all models.

Overview

Maniac provides a unified interface for deploying AI programs across any LLM provider or model. Each inference line (a call such as client.responses.create(...) in your code) spawns an AI Program Container that continuously optimizes both prompts and LoRA fine-tuning parameters across all models, so the program stays tuned for whichever model the Control Plane allocates.

Quick Start

Installation

pip install maniac

Basic Usage

from maniac import Maniac

# Initialize with your preferred provider (pick one of the two forms)
client = Maniac(provider="openai", api_key="your-key")
# or, for Vertex AI:
client = Maniac(provider="vertex", project_id="your-project", region="us-east5")

# Customer support ticket analysis
response = client.responses.create(
    model="claude-opus-4",
    input="Customer reports: 'Payment failed but was charged anyway. Order #12345'", 
    instructions="You are a customer support analyst. Categorize the issue, determine urgency, and suggest resolution steps.",
    temperature=0.0,
    max_tokens=1024,
    task_label="support-ticket-analysis",
    judge_prompt="You are comparing two customer support analyses for the same ticket. Is response A's categorization and resolution plan at least as accurate and actionable as response B's? Focus on issue identification, urgency assessment, and solution quality."
)

# Document summarization for compliance
response = client.chat.completions.create(
    model="claude-opus-4",
    messages=[
        {"role": "system", "content": "You are a compliance officer specializing in financial regulations."},
        {"role": "user", "content": "Summarize the key compliance risks in this 50-page contract..."}
    ],
    temperature=0.0,
    task_label="compliance-review"
)

# Submit an existing, pre-generated result without running inference (useful for batch processing)
client.chat.completions.stream_create(
    task_label="document-processing",
    system_prompt="You are a legal document analyst.",
    user_prompt="Extract key terms from this vendor agreement...",
    output="Key terms: Payment net 30, liability cap $1M, termination 90 days notice...",
    judge_prompt="You are comparing two contract analyses for the same document. Is response A's extraction of key terms at least as complete and accurate as response B's? Focus on identifying all critical terms, payment conditions, and legal obligations."
)

Core Concepts

AI Program Containers

Every inference line creates an AI Program Container that:

Continuously optimizes prompts and LoRA adaptations across all models simultaneously, ensuring each container can deploy optimally on any model (closed-source or open-source)
Maintains unified optimization state combining prompt engineering and fine-tuning metrics across the entire model ecosystem
Handles seamless model switching with pre-optimized prompts and LoRA weights ready for any target model
Automatically balances prompt vs LoRA optimization based on model capabilities (e.g., more LoRA for open-source, more prompt engineering for closed-source)
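
The idea of a container that keeps a ready-to-use state for every target model can be pictured with a small data structure. This is only a sketch under stated assumptions: ModelState, ProgramContainer, switch_to, and the model and adapter names are hypothetical and do not describe Maniac's internal schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ModelState:
    """Hypothetical per-model optimization state (not Maniac's real schema)."""
    prompt: str                          # prompt tuned for this model
    lora_adapter: Optional[str] = None   # adapter id; None when weights are not tunable


@dataclass
class ProgramContainer:
    """Hypothetical container holding one tuned state per target model."""
    task_label: str
    states: Dict[str, ModelState] = field(default_factory=dict)

    def switch_to(self, model: str) -> ModelState:
        """Return the pre-optimized state for `model`, or fail if none exists."""
        try:
            return self.states[model]
        except KeyError:
            raise LookupError(f"no optimized state for {model!r}") from None


container = ProgramContainer(
    task_label="support-ticket-analysis",
    states={
        "closed-model": ModelState(prompt="Categorize the ticket step by step."),
        "open-model": ModelState(prompt="Categorize the ticket.", lora_adapter="adapter-v3"),
    },
)

print(container.switch_to("open-model").lora_adapter)  # adapter-v3
```

In this picture a closed-source model carries only a tuned prompt, while an open-source model carries a prompt plus an adapter, which mirrors the prompt vs LoRA balance described above.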

Control Plane

The Control Plane allocates containers to LLMs based on:

Quality preferences specified in judge prompts
Cost constraints configured in the dashboard
Latency requirements for real-time vs batch processing
Optimization readiness: how well each container's prompts and LoRA weights are optimized for each model
Model capabilities and task compatibility
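
The allocation criteria above can be read as a constrained ranking problem. The following is a minimal sketch of that reading, not the Control Plane's actual logic: the Candidate fields, the scoring rule (quality times readiness), and the model names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-model snapshot; none of these fields come from Maniac."""
    model: str
    quality: float       # judged quality in [0, 1]
    cost_per_1k: float   # USD per 1k tokens
    latency_ms: float    # typical response latency
    readiness: float     # how well prompts/LoRA are tuned for this model, in [0, 1]


def pick_model(candidates, max_cost_per_1k, max_latency_ms):
    """Filter by hard cost and latency limits, then rank by quality * readiness."""
    eligible = [
        c for c in candidates
        if c.cost_per_1k <= max_cost_per_1k and c.latency_ms <= max_latency_ms
    ]
    if not eligible:
        return None  # nothing satisfies the constraints
    return max(eligible, key=lambda c: c.quality * c.readiness).model


candidates = [
    Candidate("model-a", quality=0.95, cost_per_1k=0.060, latency_ms=1800, readiness=0.9),
    Candidate("model-b", quality=0.85, cost_per_1k=0.004, latency_ms=600, readiness=0.8),
    Candidate("model-c", quality=0.80, cost_per_1k=0.001, latency_ms=300, readiness=0.5),
]

print(pick_model(candidates, max_cost_per_1k=0.01, max_latency_ms=1000))  # model-b
```

Treating cost and latency as hard limits and quality as the ranking signal is one design choice among several; a weighted sum over all four criteria would be an equally plausible sketch.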

Supported Providers

OpenAI: GPT-4o, GPT-4, GPT-3.5, o3-mini (prompt optimization + API-level adaptation)
Anthropic (Vertex AI): Claude Opus 4, Claude Sonnet 4 (prompt optimization + structured fine-tuning)
Open-source models: Llama, Mistral, CodeLlama (unified prompt + LoRA optimization)

Configuration

Provider Setup

OpenAI:

client = Maniac(
    provider="openai",
    api_key="sk-...",
    base_url="https://api.openai.com/v1"  # optional
)

Vertex AI:

client = Maniac(
    provider="vertex",
    project_id="your-gcp-project",
    region="us-east5"
)
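
Rather than hard-coding credentials, the constructor arguments shown above can be assembled from environment variables. A minimal sketch, assuming the keyword arguments from the examples above; the helper provider_kwargs and the variable names GCP_PROJECT_ID and GCP_REGION are assumptions made for this example, not names the package defines.

```python
import os


def provider_kwargs(provider, env=os.environ):
    """Build keyword arguments for Maniac(...) from environment variables.

    The environment variable names are assumptions chosen for this example.
    """
    if provider == "openai":
        return {"provider": "openai", "api_key": env["OPENAI_API_KEY"]}
    if provider == "vertex":
        return {
            "provider": "vertex",
            "project_id": env["GCP_PROJECT_ID"],
            "region": env.get("GCP_REGION", "us-east5"),
        }
    raise ValueError(f"unsupported provider: {provider!r}")


# Usage (requires the maniac package and real credentials):
# from maniac import Maniac
# client = Maniac(**provider_kwargs("openai"))
```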

Quality Control

Use judge prompts to specify quality criteria:

response = client.responses.create(
    model="claude-opus-4",
    input="Vendor contract shows $2M annual spend but accounting shows $2.1M. Investigate discrepancy.",
    instructions="You are a financial auditor. Identify potential causes for the discrepancy and recommend investigation steps.",
    temperature=0.0,
    max_tokens=2000,
    task_label="financial-audit",
    judge_prompt="You are comparing two financial audit analyses for the same discrepancy. Is response A's identification of root causes and investigation plan at least as thorough and actionable as response B's? Focus on completeness of potential causes and clarity of next steps."
)
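
The judge prompts in this README all follow the same pairwise pattern: name what is being compared, ask whether response A is at least as good as response B, and state what to focus on. A small helper can keep them consistent. This is a sketch only; build_judge_prompt and its parameter names are hypothetical and not part of the package.

```python
def build_judge_prompt(artifact, subject, criteria, qualities, focus):
    """Assemble a pairwise judge prompt in the style used throughout this README."""
    return (
        f"You are comparing two {artifact} for the same {subject}. "
        f"Is response A's {criteria} at least as {qualities} as response B's? "
        f"Focus on {focus}."
    )


prompt = build_judge_prompt(
    artifact="financial audit analyses",
    subject="discrepancy",
    criteria="identification of root causes and investigation plan",
    qualities="thorough and actionable",
    focus="completeness of potential causes and clarity of next steps",
)
print(prompt)
```

The result can be passed as judge_prompt in the call above.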

Dashboard Configuration

Access the Maniac dashboard to configure:

Cost preferences: Set budget limits and cost-per-token thresholds
Latency targets: Specify response time requirements
Model preferences: Define fallback hierarchies and quality trade-offs
Container policies: Configure joint prompt + LoRA optimization schedules and resource limits

Advanced Features

Task Labeling

Group related inferences for coordinated prompt and LoRA optimization across all models:

import concurrent.futures

task_id = "customer-support-analysis"

def process_ticket(ticket_data):
    return client.responses.create(
        model="claude-opus-4",
        input=ticket_data["customer_message"],
        instructions="You are a customer support analyst. Categorize the issue, assess urgency (Low/Medium/High), and provide resolution steps.",
        temperature=0.0,
        max_tokens=1024,
        task_label=task_id,
        judge_prompt="You are comparing two customer support analyses for the same ticket. Is response A's categorization, urgency assessment, and resolution plan at least as accurate and helpful as response B's? Focus on accuracy of issue identification and practicality of solutions."
    )

# Process support tickets concurrently with a shared task_label.
# support_tickets is your own list of dicts, each with a "customer_message" key.
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(process_ticket, support_tickets))
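
With executor.map, the first ticket that raises stops result collection, so one bad ticket can hide the results of the rest. A hedged variant that records failures per ticket is sketched below; run_batch and fake_process are hypothetical helpers, and fake_process stands in for process_ticket so the sketch runs without network access.

```python
import concurrent.futures


def run_batch(process, items, max_workers=5):
    """Run `process` over `items` concurrently; return (results, errors) keyed by index."""
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(process, item): i for i, item in enumerate(items)}
        for future in concurrent.futures.as_completed(futures):
            i = futures[future]
            try:
                results[i] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[i] = exc
    return results, errors


def fake_process(ticket):
    """Stand-in for process_ticket; rejects empty messages."""
    if not ticket["customer_message"]:
        raise ValueError("empty message")
    return ticket["customer_message"].upper()


tickets = [{"customer_message": "refund please"}, {"customer_message": ""}]
results, errors = run_batch(fake_process, tickets)
print(results)         # {0: 'REFUND PLEASE'}
print(sorted(errors))  # [1]
```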

Streaming

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyze this 10-K filing for competitive risks and revenue projections..."}],
    stream=True,
    task_label="financial-analysis"
)

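for chunk in stream:
    # Some chunks (for example the final one) may carry no text, so skip empty content
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")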
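
To keep the full text instead of printing it, the deltas can be accumulated. This sketch assumes chunks have the choices[0].delta.content shape used in the loop above; collect_text and fake_chunk are hypothetical helpers, and the fake stream exists only so the example runs offline.

```python
from types import SimpleNamespace


def collect_text(stream):
    """Concatenate the text deltas of a chat-completions-style stream."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:  # skip chunks whose content is None or empty
            parts.append(content)
    return "".join(parts)


def fake_chunk(content):
    """Build an object shaped like a streaming chunk, for offline testing."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=content))]
    )


fake_stream = [fake_chunk(None), fake_chunk("Revenue "), fake_chunk("grew."), fake_chunk(None)]
print(collect_text(fake_stream))  # Revenue grew.
```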

Parameter Reference

responses.create() parameters:

model: Model name (e.g., "claude-opus-4", "gpt-4o")
input: Business document, customer inquiry, or data to analyze
instructions: Domain-specific role and task definition (e.g., "financial auditor", "compliance officer")
temperature: Sampling randomness (0.0 for consistent analysis, higher for creative tasks)
max_tokens: Maximum response length in tokens (e.g., 1024 for summaries, 4096 for detailed analysis)
task_label: Groups related business processes for unified optimization
judge_prompt: Quality standards for business-critical decisions

stream_create() parameters:

task_label: Task identifier for grouping
system_prompt: System instructions
user_prompt: User input
output: Pre-generated response content
judge_prompt: Evaluation criteria
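
Checking arguments before a call can catch mistakes early. The sketch below validates a dictionary of responses.create() keyword arguments. Which parameters are required and the 0.0 to 2.0 temperature range are assumptions, not documented behavior of the package, and validate_request is a hypothetical helper.

```python
def validate_request(params):
    """Return a list of problems found in keyword arguments for responses.create()."""
    problems = []
    for name in ("model", "input"):  # assumed to be required
        if not params.get(name):
            problems.append(f"missing {name}")
    temperature = params.get("temperature", 0.0)
    if not 0.0 <= temperature <= 2.0:  # assumed valid range
        problems.append("temperature must be between 0.0 and 2.0")
    max_tokens = params.get("max_tokens")
    if max_tokens is not None and (not isinstance(max_tokens, int) or max_tokens <= 0):
        problems.append("max_tokens must be a positive integer")
    return problems


ok = {"model": "gpt-4o", "input": "Summarize this ticket.", "temperature": 0.0, "max_tokens": 1024}
bad = {"model": "gpt-4o", "temperature": 3.0, "max_tokens": 0}

print(validate_request(ok))   # []
print(validate_request(bad))
```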

Enterprise Benefits

Cost Management & Performance Reliability

Automatic cost optimization: Containers switch between models based on budget constraints while maintaining quality standards
Performance guarantees: Pre-optimized prompts and LoRA weights ensure consistent output quality regardless of model availability
Vendor risk mitigation: Single API maintains operations even when specific model providers experience outages or policy changes

Rapid Model Adoption

Zero-downtime model transitions: New models automatically receive optimized prompts and fine-tuning from existing container data
Quality-assured deployment: Judge prompts ensure new models meet established performance benchmarks before production use
Seamless scaling: Containers handle traffic spikes by intelligently distributing across available models based on latency and cost requirements

Operational Excellence

Centralized monitoring: Dashboard provides unified visibility across all models, tasks, and performance metrics
Compliance-ready logging: Complete audit trail of all inferences, optimizations, and model selections
Enterprise-grade reliability: Built-in fallback mechanisms and automatic retry logic ensure business continuity
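
The fallback and retry behavior mentioned above is described as built in. For readers unfamiliar with the pattern, here is a generic client-side sketch of retry with exponential backoff followed by model fallback. It illustrates the general technique and is not Maniac's implementation; call_with_fallback, flaky, and the model names are hypothetical.

```python
import time


def call_with_fallback(call, models, retries=2, base_delay=0.5, sleep=time.sleep):
    """Try each model in order; retry each up to `retries` times with exponential backoff."""
    last_error = None
    for model in models:
        for attempt in range(retries + 1):
            try:
                return call(model)
            except Exception as exc:
                last_error = exc
                if attempt < retries:
                    sleep(base_delay * 2 ** attempt)  # 0.5s, 1.0s, ...
    raise RuntimeError("all models failed") from last_error


def flaky(model):
    """Stand-in for an inference call: the primary model is always unavailable."""
    if model == "primary-model":
        raise ConnectionError("provider unavailable")
    return f"answered by {model}"


delays = []  # record the requested delays instead of actually sleeping
result = call_with_fallback(flaky, ["primary-model", "backup-model"], sleep=delays.append)
print(result)  # answered by backup-model
print(delays)  # [0.5, 1.0]
```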

Best Practices

Use task labels to group related inferences for coordinated prompt + LoRA optimization across the entire model ecosystem
Specify judge prompts to guide quality-aware model selection and optimization direction
Set appropriate temperature values (0.0 for deterministic tasks, higher for creative tasks)
Configure fallback models in the dashboard; containers automatically maintain optimized prompts and LoRA weights for each fallback
Monitor container metrics to track both prompt engineering and fine-tuning performance across models

Support

For issues and feature requests, visit the Maniac documentation portal or contact support@maniac.ai.

Project details

Release history

0.5.2: Mar 24, 2026
0.5.1: Feb 27, 2026
0.5.0: Feb 18, 2026
0.4.0: Feb 9, 2026
0.3.8: Oct 27, 2025
0.3.7: Oct 20, 2025
0.3.6: Oct 16, 2025
0.3.5: Oct 16, 2025
0.3.3: Oct 8, 2025
0.3.2: Oct 3, 2025
0.3.1: Sep 30, 2025
0.3.0: Sep 28, 2025
0.2.0: Sep 27, 2025
0.1.2: Sep 12, 2025
0.1.1: Sep 12, 2025 (this version)
0.1.0: Sep 12, 2025

Download files

Download the file for your platform.

Source Distribution

maniac-0.1.1.tar.gz (28.5 kB)

Uploaded Sep 12, 2025 Source

Built Distribution

maniac-0.1.1-py3-none-any.whl (27.4 kB)

Uploaded Sep 12, 2025 Python 3

File details

Details for the file maniac-0.1.1.tar.gz.

File metadata

Download URL: maniac-0.1.1.tar.gz
Upload date: Sep 12, 2025
Size: 28.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for maniac-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`aac9991f22dbf96d29540dc8eb6b692f5a1fb8effa9dfebee53d2857d77d7ab2`
MD5	`e334ec1215c86430b00711a2b75ce851`
BLAKE2b-256	`73f735cbfda921cbf8ba5de38a65990da284eb9b4adebbc2764fca00893cfca4`
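
The SHA256 digest listed above can be checked locally after download. A minimal sketch using the standard-library hashlib module; the helper name sha256_of and the assumption that the archive sits in the working directory are mine.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Digest published for maniac-0.1.1.tar.gz in the table above.
EXPECTED = "aac9991f22dbf96d29540dc8eb6b692f5a1fb8effa9dfebee53d2857d77d7ab2"

# Usage, once maniac-0.1.1.tar.gz has been downloaded to the working directory:
# assert sha256_of("maniac-0.1.1.tar.gz") == EXPECTED
```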

File details

Details for the file maniac-0.1.1-py3-none-any.whl.

File metadata

Download URL: maniac-0.1.1-py3-none-any.whl
Upload date: Sep 12, 2025
Size: 27.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.8

File hashes

Hashes for maniac-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`041be7e1265e07f71d0e9a9998e7943dd14faf280d067b5caf46d72491e8b831`
MD5	`ecee74f967776100d82064c8f2a7a8f8`
BLAKE2b-256	`89468b6c09c9b13b6faa13954e68f9d96abc709fe12f9b49aad815a106b04e26`

maniac 0.1.1
