A unified framework for accessing multiple LLM providers with comprehensive CLI and machine interface

MonoLLM

Python 3.12+ | License: MIT | Documentation | GitHub Issues

A powerful framework that provides a unified interface for multiple LLM providers, allowing developers to seamlessly switch between different AI models while maintaining consistent API interactions.
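
Because the request and response types are shared across providers, switching models is just a matter of changing the model identifier. A minimal sketch, reusing the client API from the Quick Start section below (the model IDs are examples; use whichever models your configured providers expose):

import asyncio
from monollm import UnifiedLLMClient, RequestConfig

async def compare_models():
    async with UnifiedLLMClient() as client:
        # Same prompt, same code path, different providers.
        for model in ["gpt-4o", "qwen-plus"]:
            config = RequestConfig(model=model, max_tokens=200)
            response = await client.generate(
                "Summarize the benefits of unit testing in one sentence.",
                config,
            )
            print(f"{model}: {response.content}")

asyncio.run(compare_models())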

🚀 Key Features

  • 🔄 Unified Interface: Access multiple LLM providers through a single, consistent API
  • 🌐 Proxy Support: Configure HTTP/SOCKS5 proxies for all LLM calls
  • 📺 Streaming: Real-time streaming responses for a better user experience
  • 🧠 Reasoning Models: Special support for reasoning models with thinking steps
  • 🌡️ Temperature Control: Fine-tune creativity and randomness when supported
  • 🔢 Token Management: Control costs with maximum output token limits
  • 🔧 MCP Integration: Model Context Protocol support when available
  • 🎯 OpenAI Protocol: Prefer OpenAI-compatible APIs for consistency
  • ⚙️ JSON Configuration: Easy configuration management through JSON files (see the sketch after this list)

📋 Supported Providers

| Provider         | Status     | Streaming | Reasoning | MCP    | OpenAI Protocol |
|------------------|------------|-----------|-----------|--------|-----------------|
| OpenAI           | ✅ Ready   | ✅ Yes    | ✅ Yes    | ✅ Yes | ✅ Yes          |
| Anthropic        | ✅ Ready   | ✅ Yes    | ❌ No     | ✅ Yes | ❌ No           |
| Google Gemini    | 🚧 Planned | ✅ Yes    | ❌ No     | ❌ No  | ❌ No           |
| Qwen (DashScope) | ✅ Ready   | ✅ Yes    | ✅ Yes    | ❌ No  | ✅ Yes          |
| DeepSeek         | ✅ Ready   | ✅ Yes    | ✅ Yes    | ❌ No  | ✅ Yes          |
| Volcengine       | 🚧 Planned | ✅ Yes    | ❌ No     | ❌ No  | ✅ Yes          |

๐Ÿ› ๏ธ Installation

Prerequisites

  • Python 3.13+ (required)
  • uv (recommended) or pip

Quick Install

# Clone the repository
git clone https://github.com/cyborgoat/MonoLLM.git
cd MonoLLM

# Install with uv (recommended)
uv sync
uv pip install -e .

# Or install with pip
pip install -e .

Verify Installation

# Check CLI is working
monollm --help

# List available providers
monollm list-providers

⚡ Quick Start

1. Set up API Keys

# Set API keys for the providers you want to use
export DASHSCOPE_API_KEY="your-dashscope-api-key"  # For Qwen
export ANTHROPIC_API_KEY="your-anthropic-api-key"  # For Claude
export OPENAI_API_KEY="your-openai-api-key"        # For GPT models

2. Basic Python Usage

import asyncio
from monollm import UnifiedLLMClient, RequestConfig

async def main():
    async with UnifiedLLMClient() as client:
        config = RequestConfig(
            model="qwq-32b",  # Qwen's reasoning model
            temperature=0.7,
            max_tokens=1000,
        )
        
        response = await client.generate(
            "Explain quantum computing in simple terms.",
            config
        )
        
        print(response.content)
        if response.usage:
            print(f"Tokens used: {response.usage.total_tokens}")

asyncio.run(main())

3. CLI Usage

# Generate text with streaming
monollm generate "What is artificial intelligence?" --model qwen-plus --stream

# Use reasoning model with thinking steps
monollm generate "Solve: 2x + 5 = 13" --model qwq-32b --thinking

# List available models
monollm list-models --provider qwen

📖 Documentation

🤖 Machine Interface & Tauri Integration

MonoLLM provides a powerful machine-friendly JSON API perfect for integration with external applications, automation scripts, and Tauri sidecars:

# All commands support --machine flag for JSON output
monollm list-providers --machine
monollm generate "Hello world" --model gpt-4o --machine
monollm generate-stream "Tell a story" --model qwq-32b --thinking
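
Calling the machine interface from Python follows the same pattern as the Rust sidecar below; a minimal sketch, assuming the response JSON exposes a content field as in that example:

import json
import subprocess

# Invoke the CLI with --machine and parse the JSON printed to stdout.
result = subprocess.run(
    ["monollm", "generate", "What is AI?", "--model", "gpt-4o", "--machine"],
    capture_output=True,
    text=True,
    check=True,
)
response = json.loads(result.stdout)
print("AI Response:", response["content"])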

Tauri Sidecar Example

// Rust code for a Tauri app: invoke the MonoLLM CLI as a sidecar process
// and parse its machine-readable JSON output (requires the serde_json crate).
use std::process::Command;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let output = Command::new("monollm")
        .args(&["generate", "What is AI?", "--model", "gpt-4o", "--machine"])
        .output()
        .expect("Failed to execute command");

    let response: serde_json::Value = serde_json::from_slice(&output.stdout)?;
    println!("AI Response: {}", response["content"]);
    Ok(())
}

Key Machine Interface Features

  • 🔄 Structured JSON: All responses in consistent JSON format
  • 📡 Streaming Support: Real-time JSON chunks for streaming responses (see the sketch after this list)
  • ⚙️ Configuration API: Programmatic model defaults and proxy management
  • 🛡️ Error Handling: Consistent error format with detailed context
  • 🔧 Validation: Parameter validation before API calls
  • 📊 Usage Tracking: Token usage and performance metrics
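
For the streaming case, a hedged sketch of consuming chunks from Python: it assumes the stream emits one JSON object per line and that --machine applies to generate-stream as stated above; check the machine interface documentation for the actual chunk format.

import json
import subprocess

# Read streamed output line by line as it arrives.
# NOTE: one-JSON-object-per-line is an assumption, not a documented guarantee.
proc = subprocess.Popen(
    ["monollm", "generate-stream", "Tell a story", "--model", "qwq-32b", "--machine"],
    stdout=subprocess.PIPE,
    text=True,
)
for line in proc.stdout:
    line = line.strip()
    if line:
        chunk = json.loads(line)
        print(chunk.get("content", ""), end="", flush=True)
proc.wait()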

📖 Complete Machine Interface Documentation

🎯 Use Cases

Content Generation

config = RequestConfig(model="qwen-plus", temperature=0.8, max_tokens=1000)
response = await client.generate("Write a blog post about renewable energy", config)

Code Assistance

config = RequestConfig(model="qwq-32b", temperature=0.2)
response = await client.generate("Explain this Python function: def fibonacci(n):", config)

Reasoning & Analysis

config = RequestConfig(model="qwq-32b", show_thinking=True)
response = await client.generate("Analyze this data and find trends", config)

Thinking Mode for Reasoning Models

MonoLLM supports reasoning models that can show their internal thought process:

# Enable thinking mode to see step-by-step reasoning
config = RequestConfig(
    model="qwq-32b",  # QwQ reasoning model
    show_thinking=True,  # Show internal reasoning
    temperature=0.7
)

response = await client.generate(
    "Solve this step by step: If a train travels 120 km in 2 hours, then 180 km in 3 hours, what is its average speed?",
    config
)

# Access the thinking process
if response.thinking:
    print("๐Ÿ’ญ Thinking Process:")
    print(response.thinking)
    print("\n" + "="*50)

print("๐ŸŽฏ Final Answer:")
print(response.content)

Supported Reasoning Models:

  • QwQ-32B (qwq-32b) - Stream-only reasoning model (see the streaming sketch after this list)
  • QwQ-Plus (qwq-plus) - Stream-only reasoning model
  • Qwen3 Series (qwen3-32b, qwen3-8b, etc.) - Support both modes
  • OpenAI o1 (o1, o1-mini) - Advanced reasoning models
  • DeepSeek R1 (deepseek-reasoner) - Reasoning model
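
Since the QwQ models above are stream-only, requests to them should go through the streaming API rather than a blocking generate call. A minimal sketch combining the streaming loop from Advanced Features with thinking mode:

import asyncio
from monollm import UnifiedLLMClient, RequestConfig

async def solve():
    async with UnifiedLLMClient() as client:
        config = RequestConfig(model="qwq-32b", show_thinking=True)
        # Stream-only models must be consumed chunk by chunk.
        async for chunk in await client.generate_stream("Solve: 2x + 5 = 13", config):
            if chunk.content:
                print(chunk.content, end="", flush=True)

asyncio.run(solve())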

Creative Writing

config = RequestConfig(model="qwen-plus", temperature=1.0, max_tokens=2000)
response = await client.generate("Write a science fiction short story", config)

🔧 Advanced Features

Streaming Responses

async for chunk in await client.generate_stream(prompt, config):
    if chunk.content:
        print(chunk.content, end="", flush=True)

Multi-turn Conversations

from monollm import Message  # import path assumed; adjust if Message lives elsewhere

messages = [
    Message(role="system", content="You are a helpful assistant."),
    Message(role="user", content="Hello!"),
]
response = await client.generate(messages, config)
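
To carry the conversation forward, append the assistant's reply and the next user turn to the same list and call generate again. This is the standard chat-history pattern, not a MonoLLM-specific helper:

# Extend the history with the model's reply and the next user message.
messages.append(Message(role="assistant", content=response.content))
messages.append(Message(role="user", content="Can you expand on that?"))
followup = await client.generate(messages, config)
print(followup.content)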

Error Handling

from monollm.core.exceptions import MonoLLMError, ProviderError

try:
    response = await client.generate(prompt, config)
except ProviderError as e:
    print(f"Provider error: {e}")
except MonoLLMError as e:
    print(f"MonoLLM error: {e}")

๐ŸŒ Proxy Support

Configure HTTP/SOCKS5 proxies:

export PROXY_ENABLED=true
export PROXY_TYPE=http
export PROXY_HOST=127.0.0.1
export PROXY_PORT=7890

๐Ÿค Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Setup

# Clone and install in development mode
git clone https://github.com/cyborgoat/MonoLLM.git
cd MonoLLM
uv sync --dev

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

# Build documentation
cd docs && make html

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Links

🙏 Acknowledgments

  • Thanks to all the LLM providers for their amazing APIs
  • Inspired by the need for a unified interface across multiple AI providers
  • Built with modern Python async/await patterns for optimal performance

๐Ÿ‘จโ€๐Ÿ’ป Author

Created and maintained by cyborgoat


Made with ❤️ by cyborgoat
