Skip to main content

Unified toolkit for managing and using multiple LLM providers with automatic model detection

Project description

๐Ÿš€ beanllm

Production-ready LLM toolkit with Clean Architecture and unified interface for multiple providers

PyPI version Python 3.11+ License: MIT Downloads Tests GitHub Stars

beanllm is a comprehensive, production-ready toolkit for building LLM applications with a unified interface across OpenAI, Anthropic, Google, DeepSeek, Perplexity, and Ollama. Built with Clean Architecture and SOLID principles for maintainability and scalability.


๐Ÿ“š Documentation


โœจ Key Features

๐ŸŽฏ Core Features

  • ๐Ÿ”„ Unified Interface - Single API for 7 LLM providers (OpenAI, Claude, Gemini, DeepSeek, Perplexity, Ollama)
  • ๐ŸŽ›๏ธ Intelligent Adaptation - Automatic parameter conversion between providers
  • ๐Ÿ“Š Model Registry - Auto-detect available models from API keys
  • ๐Ÿ” CLI Tools - Inspect models and capabilities from command line
  • ๐Ÿ’ฐ Cost Tracking - Accurate token counting and cost estimation
  • ๐Ÿ—๏ธ Clean Architecture - Layered architecture with clear separation of concerns

๐Ÿ“„ RAG & Document Processing

  • ๐Ÿ“‘ Document Loaders - PDF, DOCX, XLSX, PPTX (Docling), Jupyter Notebooks, HTML, CSV, TXT
  • ๐Ÿš€ beanPDFLoader - Advanced PDF processing with 3-layer architecture
    • โšก Fast Layer (PyMuPDF): ~2s/100 pages, image extraction
    • ๐ŸŽฏ Accurate Layer (pdfplumber): 95% accuracy, table extraction
    • ๐Ÿค– ML Layer (marker-pdf): 98% accuracy, structure-preserving Markdown
  • โœ‚๏ธ Smart Text Splitters - Semantic chunking with tiktoken
  • ๐Ÿ—„๏ธ Vector Search - Chroma, FAISS, Pinecone, Qdrant, Weaviate, Milvus, LanceDB, pgvector
  • ๐ŸŽฏ RAG Pipeline - Complete question-answering system in one line
  • ๐Ÿ“Š RAG Evaluation - TruLens integration, context recall metrics

๐Ÿง  Embeddings

  • ๐Ÿ“ Text Embeddings - OpenAI, Gemini, Voyage, Jina, Mistral, Cohere, HuggingFace, Ollama
  • ๐ŸŒ Multilingual - Qwen3-Embedding-8B (top multilingual model)
  • ๐Ÿ’ป Code Embeddings - Specialized embeddings for code search
  • ๐Ÿ–ผ๏ธ Vision Embeddings - CLIP, SigLIP, MobileCLIP for image-text matching
  • ๐ŸŽจ Advanced Features - Matryoshka (dimension reduction), MMR search, hard negative mining

๐Ÿ‘๏ธ Vision AI

  • โœ‚๏ธ Segmentation - SAM 3 (zero-shot segmentation)
  • ๐ŸŽฏ Object Detection - YOLOv12 (latest detection/segmentation)
  • ๐Ÿค– Vision-Language - Qwen3-VL (VQA, OCR, captioning, 128K context)
  • ๐Ÿ–ผ๏ธ Image Understanding - Florence-2 (detection, captioning, VQA)
  • ๐Ÿ” Vision RAG - Image-based question answering with CLIP embeddings

๐ŸŽ™๏ธ Audio Processing

  • ๐ŸŽค Speech-to-Text - 8 STT engines with multilingual support
    • โšก SenseVoice-Small: 15x faster than Whisper-Large, emotion recognition, ํ•œ๊ตญ์–ด ์ง€์›
    • ๐Ÿข Granite Speech 8B: Open ASR Leaderboard #2 (WER 5.85%), enterprise-grade
    • ๐Ÿ”ฅ Whisper V3 Turbo, Distil-Whisper, Parakeet TDT, Canary, Moonshine
  • ๐Ÿ”Š Text-to-Speech - Multi-provider TTS (OpenAI, Azure, Google)
  • ๐ŸŽง Audio RAG - Search and QA across audio files

๐Ÿค– Advanced LLM Features

  • ๐Ÿ› ๏ธ Tools & Agents - Function calling with ReAct pattern
  • ๐Ÿง  Memory Systems - Buffer, window, token-based, summary memory
  • โ›“๏ธ Chains - Sequential, parallel, and custom chain composition
  • ๐Ÿ“Š Output Parsers - Pydantic, JSON, datetime, enum parsing
  • ๐Ÿ’ซ Streaming - Real-time response streaming
  • ๐ŸŽฏ Structured Outputs - 100% schema accuracy (OpenAI strict mode)
  • ๐Ÿ’พ Prompt Caching - 85% latency reduction, 10x cost savings (Anthropic)
  • โšก Parallel Tool Calling - Concurrent function execution

๐Ÿ•ธ๏ธ Graph & Multi-Agent

  • ๐Ÿ“Š Graph Workflows - LangGraph-style DAG execution
  • ๐Ÿค Multi-Agent - Sequential, parallel, hierarchical, debate patterns
  • ๐Ÿ’พ State Management - Automatic state threading and checkpoints
  • ๐Ÿ“ž Communication - Inter-agent message passing

๐Ÿญ Production Features

  • ๐Ÿ“ˆ Evaluation - BLEU, ROUGE, LLM-as-Judge, RAG metrics, context recall
  • ๐Ÿ‘ค Human-in-the-Loop - Feedback collection and hybrid evaluation
  • ๐Ÿ”„ Continuous Evaluation - Scheduled evaluation and tracking
  • ๐Ÿ“‰ Drift Detection - Model performance monitoring
  • ๐ŸŽฏ Fine-tuning - OpenAI fine-tuning API integration
  • ๐Ÿ›ก๏ธ Error Handling - Retry, circuit breaker, rate limiting
  • ๐Ÿ“Š Tracing - Distributed tracing with OpenTelemetry

โšก Performance Optimizations (v0.2.1)

Algorithm Optimizations:

  • ๐Ÿš€ Model Parameter Lookup: 100ร— speedup (O(n) โ†’ O(1)) - Pre-cached dictionary lookup
  • ๐Ÿ” Hybrid Search: 10-50% faster top-k selection (O(n log n) โ†’ O(n log k)) - heapq.nlargest() optimization
  • ๐Ÿ“ Directory Loading: 1000ร— faster pattern matching (O(nร—mร—p) โ†’ O(nร—m)) - Pre-compiled regex patterns

Code Quality:

  • ๐Ÿงน Duplicate Code: ~100+ lines eliminated via helper methods (CSV loader, cache consolidation)
  • ๐Ÿ›ก๏ธ Error Handling: Standardized utilities in base provider (reduces boilerplate across all providers)
  • ๐Ÿ—๏ธ Architecture: Single Responsibility, DRY principle, Template Method pattern

Impact:

  • Model-heavy workflows: 10-30% faster
  • Large-scale RAG: 20-50% faster
  • Directory scanning: 50-90% faster

๐Ÿ—๏ธ Project Structure Improvements (v0.2.1)

Phase 1: Configuration & Cleanup:

  • โœ… MANIFEST.in: Fixed package name bug (llmkit โ†’ beanllm)
  • โœ… Dependencies: Moved pytest to dev, added version caps (prevents breaking changes)
  • โœ… .env.example: Created template with all required API keys
  • โœ… Cleanup: Removed ~396MB of unnecessary files (caches, build artifacts, bytecode)
  • โœ… Simplified: Eliminated duplicate re-export layers (vector_stores/, embeddings.py)

Phase 2: Code Quality & Utilities:

  • โœจ DependencyManager: Centralized dependency checking (261 duplicates โ†’ 1)
  • โœจ LazyLoadMixin: Deferred initialization pattern (23 duplicates โ†’ 1)
  • โœจ StructuredLogger: Consistent logging (510+ calls unified)
  • โœจ Module Naming: _source_providers/ โ†’ providers/, _source_models/ โ†’ models/

Phase 3: God Class Decomposition (5,930 lines โ†’ 23 files):

  • ๐Ÿ“ฆ vision/models.py (1,845 lines) โ†’ 4 files (sam, florence, yolo, + 4 more models)
  • ๐Ÿ“ฆ vector_stores/implementations.py (1,650 lines) โ†’ 9 files (8 stores + re-exports)
  • ๐Ÿ“ฆ loaders/loaders.py (1,435 lines) โ†’ 8 files (7 loaders + re-exports)

Phase 4: CI/CD & Documentation (2026-01-05):

  • ๐Ÿš€ GitHub Workflows: Removed duplicate ci.yml, added pip caching (30-50% faster CI)
  • ๐Ÿ“š Documentation: Added comprehensive Utils section to API_REFERENCE.md
  • โœ… Type Safety: MyPy failures now block CI (continue-on-error: false)
  • ๐Ÿ—‘๏ธ Cleanup: Removed unnecessary Sphinx dependencies

Impact:

  • Disk space: -396MB (-99%)
  • Code duplication: -90% (794 โ†’ ~80)
  • God classes: 5 โ†’ 0 (all decomposed โœ…)
  • Average file size: ~200 lines (was 1,500+)
  • New modules: +21 focused files
  • Utility modules: +3 (reusable)
  • CI speed: +30-50% faster (pip caching)
  • Documentation: 100% coverage (all new features)
  • Configuration bugs: 0 (all fixed)
  • Module naming: 100% consistent
  • Backward compatibility: Maintained (re-exports)

๐Ÿ“ฆ Installation

Using pip

# Basic installation
pip install beanllm

# Specific providers
pip install beanllm[openai]
pip install beanllm[anthropic]
pip install beanllm[gemini]
pip install beanllm[all]

# ML-based PDF processing
pip install beanllm[ml]

# Development tools
pip install beanllm[dev,all]

Using Poetry (๊ถŒ์žฅ)

git clone https://github.com/yourusername/beanllm.git
cd beanllm
poetry install --extras all
poetry shell

๐Ÿš€ Quick Start

Environment Setup

Create .env file in project root:

# LLM Providers
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
DEEPSEEK_API_KEY=sk-...
PERPLEXITY_API_KEY=pplx-...
OLLAMA_HOST=http://localhost:11434

๐Ÿ’ฌ Basic Chat

import asyncio
from beanllm import Client

async def main():
    # Unified interface - works with any provider
    client = Client(model="gpt-4o")
    response = await client.chat(
        messages=[{"role": "user", "content": "Explain quantum computing"}]
    )
    print(response.content)

    # Switch providers seamlessly
    client = Client(model="claude-sonnet-4-20250514")
    response = await client.chat(
        messages=[{"role": "user", "content": "Same question, different provider"}]
    )

    # Streaming
    async for chunk in client.stream_chat(
        messages=[{"role": "user", "content": "Tell me a story"}]
    ):
        print(chunk, end="", flush=True)

asyncio.run(main())

๐Ÿ“š RAG in One Line

import asyncio
from beanllm import RAGChain

async def main():
    # Create RAG system from documents
    rag = RAGChain.from_documents("docs/")

    # Ask questions
    answer = await rag.query("What is this document about?")
    print(answer)

    # With sources
    result = await rag.query("Explain the main concept", include_sources=True)
    print(result.answer)
    for source in result.sources:
        print(f"๐Ÿ“„ Source: {source.metadata.get('source', 'unknown')}")

    # Streaming query
    async for chunk in rag.stream_query("Tell me more"):
        print(chunk, end="", flush=True)

asyncio.run(main())

๐Ÿ› ๏ธ Tools & Agents

import asyncio
from beanllm import Agent, Tool

async def main():
    # Define tools
    @Tool.from_function
    def calculator(expression: str) -> str:
        """Evaluate a math expression"""
        return str(eval(expression))

    @Tool.from_function
    def get_weather(city: str) -> str:
        """Get weather for a city"""
        return f"Sunny, 22ยฐC in {city}"

    # Create agent
    agent = Agent(
        model="gpt-4o-mini",
        tools=[calculator, get_weather],
        max_iterations=10
    )

    # Run agent
    result = await agent.run("What is 25 * 17? Also what's the weather in Seoul?")
    print(result.answer)
    print(f"โฑ๏ธ Steps: {result.total_steps}")

asyncio.run(main())

๐Ÿ•ธ๏ธ Graph Workflows

import asyncio
from beanllm import StateGraph, Client

async def main():
    client = Client(model="gpt-4o-mini")

    # Create graph
    graph = StateGraph()

    async def analyze(state):
        response = await client.chat(
            messages=[{"role": "user", "content": f"Analyze: {state['input']}"}]
        )
        state["analysis"] = response.content
        return state

    async def improve(state):
        response = await client.chat(
            messages=[{"role": "user", "content": f"Improve: {state['input']}"}]
        )
        state["improved"] = response.content
        return state

    def decide(state):
        score = 0.9 if "excellent" in state["analysis"].lower() else 0.5
        return "good" if score > 0.8 else "bad"

    # Build graph
    graph.add_node("analyze", analyze)
    graph.add_node("improve", improve)
    graph.add_conditional_edges("analyze", decide, {
        "good": "END",
        "bad": "improve"
    })
    graph.add_edge("improve", "END")
    graph.set_entry_point("analyze")

    # Run
    result = await graph.invoke({"input": "Draft proposal"})
    print(result)

asyncio.run(main())

๐ŸŽจ Advanced Features

๐ŸŽฏ Structured Outputs (100% Schema Accuracy)

from openai import AsyncOpenAI

client = AsyncOpenAI()

response = await client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Extract: John Doe, 30, john@example.com"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "user_info",
            "strict": True,  # โœ… 100% accuracy
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                    "email": {"type": "string"}
                },
                "required": ["name", "age", "email"]
            }
        }
    }
)

๐Ÿ’พ Prompt Caching (10x Cost Savings)

from anthropic import AsyncAnthropic

client = AsyncAnthropic()

response = await client.messages.create(
    model="claude-sonnet-4-20250514",
    system=[{
        "type": "text",
        "text": "Long system prompt..." * 1000,
        "cache_control": {"type": "ephemeral"}  # ๐Ÿ’ฐ 10x cheaper
    }],
    messages=[{"role": "user", "content": "Question"}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"}
)

# Check cache savings
print(f"๐Ÿ’พ Cache created: {response.usage.cache_creation_input_tokens}")
print(f"โšก Cache read: {response.usage.cache_read_input_tokens}")

See Advanced Features Guide for more details.


๐ŸŽฏ Model Support

๐Ÿค– LLM Providers (7 providers)

  • OpenAI: GPT-5, GPT-4o, GPT-4.1, GPT-4o-mini
  • Anthropic: Claude Opus 4, Claude Sonnet 4.5, Claude Haiku 3.5
  • Google: Gemini 2.5 Pro, Gemini 2.5 Flash
  • DeepSeek: DeepSeek-V3 (671B MoE, open-source top performance)
  • Perplexity: Sonar (real-time web search + LLM)
  • Meta: Llama 3.3 70B (via Ollama)
  • Ollama: Local LLM support

๐ŸŽค Speech-to-Text (8 engines)

  • SenseVoice-Small: 15x faster than Whisper-Large, emotion recognition
  • Granite Speech 8B: Open ASR Leaderboard #2 (WER 5.85%)
  • Whisper V3 Turbo: Latest OpenAI model
  • Distil-Whisper: 6x faster with similar accuracy
  • Parakeet TDT: Real-time optimized (RTFx >2000)
  • Canary: Multilingual + translation
  • Moonshine: On-device optimized

๐Ÿ‘๏ธ Vision Models

  • SAM 3: Zero-shot segmentation
  • YOLOv12: Latest object detection
  • Qwen3-VL: Vision-language model (VQA, OCR, captioning)
  • Florence-2: Microsoft multimodal model

๐Ÿง  Embeddings

  • Qwen3-Embedding-8B: Top multilingual model
  • Code Embeddings: Specialized for code search
  • CLIP/SigLIP: Vision-text embeddings
  • OpenAI: text-embedding-3-small/large
  • Voyage, Jina, Cohere, Mistral: Alternative providers

๐Ÿ—๏ธ Architecture

beanllm follows Clean Architecture with SOLID principles.

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                  Facade Layer                       โ”‚
โ”‚  ์‚ฌ์šฉ์ž ์นœํ™”์  API (Client, RAGChain, Agent)       โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                   โ”‚
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                 Handler Layer                       โ”‚
โ”‚  Controller ์—ญํ•  (์ž…๋ ฅ ๊ฒ€์ฆ, ์—๋Ÿฌ ์ฒ˜๋ฆฌ)             โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                   โ”‚
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                 Service Layer                       โ”‚
โ”‚  ๋น„์ฆˆ๋‹ˆ์Šค ๋กœ์ง (์ธํ„ฐํŽ˜์ด์Šค + ๊ตฌํ˜„์ฒด)                โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                   โ”‚
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                 Domain Layer                        โ”‚
โ”‚  ํ•ต์‹ฌ ๋น„์ฆˆ๋‹ˆ์Šค (์—”ํ‹ฐํ‹ฐ, ์ธํ„ฐํŽ˜์ด์Šค, ๊ทœ์น™)          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                   โ”‚
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚            Infrastructure Layer                     โ”‚
โ”‚  ์™ธ๋ถ€ ์‹œ์Šคํ…œ (Provider, Vector Store ๊ตฌํ˜„)          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

์ž์„ธํ•œ ์•„ํ‚คํ…์ฒ˜ ์„ค๋ช…์€ **ARCHITECTURE.md**๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”.


๐Ÿ”ง CLI Usage

# List available models
beanllm list

# Show model details
beanllm show gpt-4o

# Check providers
beanllm providers

# Quick summary
beanllm summary

# Export model info
beanllm export > models.json

๐Ÿงช Testing

# Run all tests
pytest

# With coverage
pytest --cov=src/beanllm --cov-report=html

# Specific module
pytest tests/test_facade/ -v

Test Coverage: 61% (624 tests, 593 passed)


๐Ÿ› ๏ธ Development

Using Makefile (๊ถŒ์žฅ)

# Install dev tools
make install-dev

# Quick auto-fix
make quick-fix

# Type check
make type-check

# Lint check
make lint

# Run all checks
make all

Manual

# Install in editable mode
pip install -e ".[dev,all]"

# Format code
ruff format src/beanllm

# Lint
ruff check src/beanllm

# Type check
mypy src/beanllm

๐Ÿ—บ๏ธ Roadmap

โœ… Completed (2024-2025)

  • โœ… Clean Architecture & SOLID principles
  • โœ… Unified multi-provider interface (7 providers)
  • โœ… RAG pipeline & document processing
  • โœ… beanPDFLoader with 3-layer architecture
  • โœ… Vision AI (SAM 3, YOLOv12, Qwen3-VL)
  • โœ… Audio processing (8 STT engines)
  • โœ… Embeddings (Qwen3-Embedding-8B, Matryoshka, Code)
  • โœ… Vector stores (Milvus, LanceDB, pgvector)
  • โœ… RAG evaluation (TruLens, HyDE)
  • โœ… Advanced features (Structured Outputs, Prompt Caching, Parallel Tool Calling)
  • โœ… Tools, agents, graph workflows
  • โœ… Multi-agent systems
  • โœ… Production features (evaluation, monitoring, cost tracking)

๐Ÿ“‹ Planned

  • โฌœ Benchmark system
  • โฌœ Advanced agent frameworks integration

๐Ÿ“„ License

MIT License - see LICENSE file for details.


๐Ÿ™ Acknowledgments

Inspired by:

Special thanks to:

  • OpenAI, Anthropic, Google, DeepSeek, Perplexity for APIs
  • Ollama team for local LLM support
  • Open-source AI community

๐Ÿ“ง Contact


Built with โค๏ธ for the LLM community

Transform your LLM applications from prototype to production with beanllm.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

beanllm-0.2.1.tar.gz (468.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

beanllm-0.2.1-py3-none-any.whl (661.3 kB view details)

Uploaded Python 3

File details

Details for the file beanllm-0.2.1.tar.gz.

File metadata

  • Download URL: beanllm-0.2.1.tar.gz
  • Upload date:
  • Size: 468.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.10

File hashes

Hashes for beanllm-0.2.1.tar.gz
Algorithm Hash digest
SHA256 5276c58d0d5cea7056e1410a35aebb9bb943ed000cc56dae9fb8e454dfa74af3
MD5 4461c1e728f11165dafe48d3ff40e9f4
BLAKE2b-256 1ae0013e74ed4a62823c39f2cd8343f6f8c9aa3d72354b4ef366a02fc851ec82

See more details on using hashes here.

File details

Details for the file beanllm-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: beanllm-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 661.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.10

File hashes

Hashes for beanllm-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1e4e6d4a40dd9a37e04149f31df94c0c7d8a97bc258d152cd7d8f9d8e3561ac0
MD5 eb6b943b91883eca51eb61022dade836
BLAKE2b-256 d1fc953f39b8d8e8386843018951fe580af5d41a202cb55d812cdc218e943ae2

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page