Unified toolkit for managing and using multiple LLM providers with automatic model detection
beanllm
Production-ready LLM toolkit with Clean Architecture and unified interface for multiple providers
beanllm is a comprehensive, production-ready toolkit for building LLM applications with a unified interface across OpenAI, Anthropic, Google, and Ollama. Built with Clean Architecture and SOLID principles for maintainability and scalability.
Key Features
Core Features
- Unified Interface - Single API for OpenAI, Anthropic, Google, Ollama
- Intelligent Adaptation - Automatic parameter conversion between providers
- Model Registry - Auto-detect available models from API keys
- CLI Tools - Inspect models and capabilities from command line
- Cost Tracking - Accurate token counting and cost estimation
- Clean Architecture - Layered architecture with clear separation of concerns
RAG & Document Processing
- Document Loaders - PDF, CSV, TXT with automatic format detection
- Smart Text Splitters - Semantic chunking with tiktoken
- Vector Search - Chroma, FAISS, Pinecone, Qdrant, Weaviate
- RAG Pipeline - Complete question-answering system in one line
- RAG Debugging - Comprehensive debugging toolkit
Advanced LLM Features
- Tools & Agents - Function calling with ReAct pattern
- Memory Systems - Buffer, window, token-based, summary memory
- Chains - Sequential, parallel, and custom chain composition
- Output Parsers - Pydantic, JSON, datetime, enum parsing
- Streaming - Real-time response streaming with stats
Graph & Multi-Agent
- Graph Workflows - LangGraph-style DAG execution
- Multi-Agent - Sequential, parallel, hierarchical, debate patterns
- State Management - Automatic state threading and checkpoints
- Communication - Inter-agent message passing
Multimodal AI
- Vision RAG - Image-based question answering with CLIP
- Audio Processing - Whisper STT, multi-provider TTS
- Audio RAG - Search and QA across audio files
- Web Search - Google, Bing, DuckDuckGo integration
- ML Integration - TensorFlow, PyTorch, Scikit-learn
Production Features
- Token & Cost - tiktoken-based accurate counting, cost optimization
- Prompt Templates - Few-shot, chat, chain-of-thought templates
- Evaluation - BLEU, ROUGE, LLM-as-Judge, RAG metrics, Context Recall
- Human-in-the-Loop - Feedback collection and hybrid evaluation
- Continuous Evaluation - Scheduled evaluation and tracking
- Drift Detection - Model drift detection
- Evaluation Dashboard - Visualization of evaluation results
- Rubric-Driven Grading - Structured rubric-based evaluation
- CheckEval - Checklist-based Boolean evaluation
- Evaluation Analytics - Trend and correlation analysis
- Fine-tuning - OpenAI fine-tuning API integration
- Error Handling - Retry, circuit breaker, rate limiting
- Tracing - Distributed tracing with OpenTelemetry export
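The error-handling entry above names retry and circuit breaker patterns. As a rough illustration of how the two can be combined, here is a generic standard-library sketch. `CircuitBreaker`, `CircuitOpenError`, `retry_with_backoff`, and all of their parameters are hypothetical names chosen for this example and are not beanllm's API.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and calls are rejected."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, retries after a cool-down."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open")
            # Cool-down elapsed: allow one trial call (half-open state)
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def retry_with_backoff(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func, retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # do not keep hitting an open circuit
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

A caller would typically wrap the provider request as `retry_with_backoff(lambda: breaker.call(send_request))`, where `send_request` stands for whatever function performs the network call.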
Architecture
beanllm uses a layered architecture that follows Clean Architecture and SOLID principles.
Layer structure
┌───────────────────────────────────────────────────────────────┐
│                         Facade Layer                          │
│      (User-friendly API) - Client, RAGChain, Agent, etc.      │
└───────────────────────────────┬───────────────────────────────┘
                                │
┌───────────────────────────────▼───────────────────────────────┐
│                         Handler Layer                         │
│     (Controller role) - input validation, error handling      │
└───────────────────────────────┬───────────────────────────────┘
                                │
┌───────────────────────────────▼───────────────────────────────┐
│                         Service Layer                         │
│        (Business logic) - interfaces + implementations        │
└───────────────────────────────┬───────────────────────────────┘
                                │
┌───────────────────────────────▼───────────────────────────────┐
│                         Domain Layer                          │
│         (Core business) - entities, interfaces, rules         │
└───────────────────────────────┬───────────────────────────────┘
                                │
┌───────────────────────────────▼───────────────────────────────┐
│                     Infrastructure Layer                      │
│  (External systems) - Provider, Vector Store implementations  │
└───────────────────────────────────────────────────────────────┘
Directory structure
src/beanllm/
├── facade/          # External interface (Facade pattern)
├── handler/         # Request handling (Controller role)
├── service/         # Business logic (Service interfaces + implementations)
├── domain/          # Domain models and business rules
├── infrastructure/  # External system interfaces
├── dto/             # Data transfer objects
├── decorators/      # Common decorators
└── utils/           # Utility functions
Applying SOLID principles
- SRP: Each layer has a single responsibility
- OCP: Extensible through interfaces
- LSP: Interface implementations can be swapped at any time
- ISP: Small, specialized interfaces
- DIP: Depend on interfaces, not on implementations
See ARCHITECTURE.md for a detailed description of the architecture.
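To make the dependency-inversion point concrete, the following is a minimal sketch of a service that depends on an interface while the infrastructure supplies the implementation. `ChatProvider`, `EchoProvider`, and `ChatService` are hypothetical names invented for this illustration; they are not taken from the beanllm codebase.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """Interface the service layer depends on (hypothetical name)."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in infrastructure implementation; a real one would call an LLM API."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ChatService:
    """Service layer: business logic written against the interface only."""

    def __init__(self, provider: ChatProvider) -> None:
        self._provider = provider

    def ask(self, question: str) -> str:
        # Input validation lives here, independent of any concrete provider
        if not question.strip():
            raise ValueError("question must not be empty")
        return self._provider.complete(question.strip())


service = ChatService(EchoProvider())
print(service.ask("  hello  "))  # echo: hello
```

Because `ChatService` only knows the `ChatProvider` protocol, any other implementation with a matching `complete` method can be substituted without touching the service code.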
Installation
Using Poetry (recommended)
# Clone the project
git clone https://github.com/leebeanbin/beanllm.git
cd beanllm
# Install dependencies
poetry install --extras all # Include all providers
# Or
poetry install --extras openai # OpenAI only
# Activate the virtual environment
poetry shell
Using pip
# Basic install (no provider dependencies)
pip install beanllm
# Add a specific provider
pip install "beanllm[openai]"
pip install "beanllm[anthropic]"
pip install "beanllm[gemini]"
pip install "beanllm[ollama]"
# All providers
pip install "beanllm[all]"
# Include development tools
pip install "beanllm[dev,all]"
Note: Providers are optional dependencies. Install only the providers you need.
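Since providers are optional extras, it can be handy to check which provider SDKs are importable in the current environment. The sketch below uses only the standard library. The mapping from extra name to top-level module name is an assumption made for illustration (Gemini is left out because its module name is not stated here), and `installed_providers` is a hypothetical helper, not part of beanllm.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the top-level module that extra installs.
PROVIDER_MODULES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "ollama": "ollama",
}


def installed_providers(modules=PROVIDER_MODULES):
    """Return {extra_name: bool} indicating whether each module can be imported."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


print(installed_providers())
```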
Quick Start
Environment Setup
Create a .env file in the project root:
# Create the .env file
cat > .env << EOF
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
OLLAMA_HOST=http://localhost:11434
EOF
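The variables above are what decide which providers can be used. A minimal sketch of that kind of check is shown below; it assumes the variables have already been exported or loaded into the process environment. The variable names come from the .env example, while the provider labels and the `configured_providers` helper are hypothetical and do not describe beanllm's actual detection logic.

```python
import os

# Variable names taken from the .env example; the provider labels are illustrative.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GEMINI_API_KEY",
    "ollama": "OLLAMA_HOST",
}


def configured_providers(environ=os.environ):
    """Return the providers whose variable is set to a non-empty value."""
    return [
        provider
        for provider, var in PROVIDER_ENV_VARS.items()
        if environ.get(var, "").strip()
    ]


print(configured_providers({"OPENAI_API_KEY": "sk-test", "OLLAMA_HOST": ""}))  # ['openai']
```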
Basic Usage
import asyncio
from beanllm import Client

async def main():
    # Unified interface - works with any provider
    client = Client(model="gpt-4o")
    response = await client.chat(
        messages=[{"role": "user", "content": "Explain quantum computing in simple terms"}]
    )
    print(response.content)

    # Switch providers seamlessly
    client = Client(model="claude-3-5-sonnet-20241022")
    response = await client.chat(
        messages=[{"role": "user", "content": "Same question, different provider"}]
    )

    # Streaming
    async for chunk in client.stream_chat(
        messages=[{"role": "user", "content": "Tell me a story"}]
    ):
        print(chunk, end="", flush=True)

asyncio.run(main())
RAG in One Line
import asyncio
from beanllm import RAGChain

async def main():
    # Create RAG system from documents
    rag = RAGChain.from_documents("docs/")

    # Ask questions
    answer = await rag.query("What is this document about?")
    print(answer)

    # With sources
    result = await rag.query("Explain the main concept", include_sources=True)
    print(result.answer)
    for source in result.sources:
        print(f"Source: {source.metadata.get('source', 'unknown')}")

    # Streaming query
    async for chunk in rag.stream_query("Your question"):
        print(chunk, end="", flush=True)

asyncio.run(main())
Tools & Agents
import asyncio
from beanllm import Agent, Tool

async def main():
    # Define tools
    @Tool.from_function
    def calculator(expression: str) -> str:
        """Evaluate a math expression"""
        # eval() can run arbitrary code; builtins are disabled here, but do not
        # expose this tool to untrusted input without a real expression parser
        return str(eval(expression, {"__builtins__": {}}, {}))

    # Create agent
    agent = Agent(
        model="gpt-4o-mini",
        tools=[calculator],
        max_iterations=10
    )

    # Run agent
    result = await agent.run("What is 25 * 17?")
    print(result.answer)
    print(f"Steps: {result.total_steps}")

asyncio.run(main())
Graph Workflows
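import asyncio
import re
from beanllm import StateGraph, Client

async def main():
    client = Client(model="gpt-4o-mini")

    # Create graph
    graph = StateGraph()

    async def analyze(state):
        response = await client.chat(
            messages=[{"role": "user", "content": f"Analyze: {state['input']}"}]
        )
        state["analysis"] = response.content
        return state

    async def improve(state):
        # Rewrite the draft when the analysis routes to "improve"
        response = await client.chat(
            messages=[{"role": "user", "content": f"Improve: {state['input']}"}]
        )
        state["input"] = response.content
        return state

    def decide(state):
        # Parse "Score: <number>" from the analysis; fall back to 0.5 if it is missing
        match = re.search(r"Score:\s*([0-9]*\.?[0-9]+)", state["analysis"])
        score = float(match.group(1)) if match else 0.5
        return "good" if score > 0.8 else "bad"

    # Build graph
    graph.add_node("analyze", analyze)
    graph.add_node("improve", improve)
    graph.add_conditional_edges("analyze", decide, {
        "good": "END",
        "bad": "improve"
    })

    # Run
    result = await graph.invoke({"input": "Draft text"})
    print(result)

asyncio.run(main())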
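To show what conditional edges do at run time, here is a small synchronous executor that runs a node, asks a router function for the next node name, and stops at "END". It is a simplified stand-in written for this explanation. `run_graph` and its arguments are hypothetical and do not reflect how beanllm's StateGraph is implemented.

```python
def run_graph(nodes, routers, start, state, max_steps=20):
    """Run node functions one after another, using routers to pick the next node.

    nodes:   {name: function(state) -> state}
    routers: {name: function(state) -> next node name or "END"}
    """
    current = start
    for _ in range(max_steps):
        state = nodes[current](state)
        router = routers.get(current)
        next_node = router(state) if router else "END"
        if next_node == "END":
            return state
        current = next_node
    raise RuntimeError("max_steps exceeded; possible cycle")


def analyze(state):
    # Pretend the draft only scores well after two revisions
    state["score"] = 0.9 if state["revisions"] >= 2 else 0.4
    return state


def improve(state):
    state["revisions"] += 1
    return state


result = run_graph(
    nodes={"analyze": analyze, "improve": improve},
    routers={
        "analyze": lambda s: "END" if s["score"] > 0.8 else "improve",
        "improve": lambda s: "analyze",
    },
    start="analyze",
    state={"revisions": 0},
)
print(result)  # {'revisions': 2, 'score': 0.9}
```

The `max_steps` guard matters because an analyze/improve loop like this one has no natural upper bound if the score never crosses the threshold.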
Examples
For more usage examples, see the examples/ directory:
- basic_usage.py - Basic usage
- rag_demo.py - RAG pipeline example
- rag_chain_demo.py - RAG Chain example
- state_graph_demo.py - Graph Workflow example
- embeddings_demo.py - Embeddings example
- vector_stores_demo.py - Vector Store example
Core Modules
1. Client & Adapters
Unified interface with automatic parameter adaptation:
import asyncio
from beanllm import Client

async def main():
    # Works across all providers
    client = Client(model="gpt-4o")

    # Parameters automatically adapted
    response = await client.chat(
        messages=[{"role": "user", "content": "Hello"}],
        temperature=0.7,
        max_tokens=1000,  # → max_completion_tokens for GPT-5
                          # → max_output_tokens for Gemini
                          # → num_predict for Ollama
    )
    print(response.content)

asyncio.run(main())
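The renaming described in the comments above can be pictured as a lookup table applied before the request is sent. The sketch below is illustrative only. The target parameter names are the ones listed in the example, while the provider keys and the `adapt_params` helper are hypothetical and not beanllm's adapter code.

```python
# Target names come from the comments in the example above; the provider keys
# and this helper are hypothetical.
MAX_TOKENS_PARAM = {
    "openai": "max_tokens",
    "openai-gpt5": "max_completion_tokens",
    "gemini": "max_output_tokens",
    "ollama": "num_predict",
}


def adapt_params(provider, params):
    """Return a copy of params with max_tokens renamed for the given provider."""
    adapted = dict(params)
    if "max_tokens" in adapted:
        target = MAX_TOKENS_PARAM.get(provider, "max_tokens")
        adapted[target] = adapted.pop("max_tokens")
    return adapted


print(adapt_params("gemini", {"temperature": 0.7, "max_tokens": 1000}))
# {'temperature': 0.7, 'max_output_tokens': 1000}
```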
2. Document Processing
from beanllm import DocumentLoader, RecursiveCharacterTextSplitter

# Load documents
docs = DocumentLoader.load("docs/")  # PDF, CSV, TXT

# Smart splitting
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=["\n\n", "\n", " "]
)
chunks = splitter.split_documents(docs)
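To illustrate what `chunk_size` and `chunk_overlap` mean, here is a plain sliding-window splitter over characters. It is deliberately simpler than a recursive, separator-aware splitter (it ignores `separators` entirely), and `split_with_overlap` is a hypothetical helper rather than anything provided by beanllm.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, which helps keep a sentence that straddles a boundary retrievable from either side.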
3. Embeddings & Vector Stores
from beanllm import OpenAIEmbedding, ChromaVectorStore

# Create embeddings
embedding = OpenAIEmbedding(model="text-embedding-3-small")

# Vector store (chunks comes from the splitter in the previous section)
store = ChromaVectorStore.from_documents(
    documents=chunks,
    embedding=embedding,
    persist_directory="./chroma_db"
)

# Search
results = store.similarity_search("query", k=5)

# MMR search (diversity)
diverse_results = store.mmr_search("query", k=5, lambda_mult=0.5)
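The `lambda_mult` argument in the MMR call trades relevance against diversity. The following pure-Python sketch of maximal marginal relevance shows the idea on toy 2-D vectors. It is a textbook-style illustration under the usual definition of MMR with cosine similarity, not the code behind `mmr_search`, and the function names are chosen for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=5, lambda_mult=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance."""
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalize similarity to anything already selected
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


docs = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
print(mmr([1.0, 0.2], docs, k=2, lambda_mult=0.3))  # [1, 2]
print(mmr([1.0, 0.2], docs, k=2, lambda_mult=1.0))  # [1, 0]
```

With `lambda_mult=1.0` the ranking is pure relevance, so the two near-duplicate vectors are both returned; lowering it makes the second pick favor the dissimilar document.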
4. Multi-Agent Systems
import asyncio
from beanllm import MultiAgentCoordinator, Agent

async def main():
    # Create agents
    researcher = Agent(model="gpt-4o-mini", tools=[], max_iterations=10)
    writer = Agent(model="gpt-4o-mini", tools=[], max_iterations=10)

    # Coordinate
    coordinator = MultiAgentCoordinator(
        agents={"researcher": researcher, "writer": writer}
    )
    result = await coordinator.execute_sequential(
        task="Write an article about quantum computing",
        agent_order=["researcher", "writer"]
    )
    print(result["final_result"])

asyncio.run(main())
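One plausible reading of sequential coordination is a pipeline in which each agent receives the previous agent's output. The sketch below models agents as plain callables to show that flow. It assumes this hand-off behavior, reuses the `final_result` key from the example above, and the standalone `execute_sequential` function and the `steps` key are hypothetical rather than a description of MultiAgentCoordinator's internals.

```python
def execute_sequential(agents, task, agent_order):
    """Run agents in order; each receives the previous agent's output as its input."""
    steps = []
    current = task
    for name in agent_order:
        current = agents[name](current)
        steps.append({"agent": name, "output": current})
    return {"final_result": current, "steps": steps}


# Stand-in agents: plain functions instead of LLM-backed Agent objects
agents = {
    "researcher": lambda task: f"notes on: {task}",
    "writer": lambda notes: f"article based on [{notes}]",
}
result = execute_sequential(agents, "quantum computing", ["researcher", "writer"])
print(result["final_result"])
# article based on [notes on: quantum computing]
```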
CLI Usage
# List available models
beanllm list
# Show model details
beanllm show gpt-4o
# Check providers
beanllm providers
# Quick summary
beanllm summary
# Export model info
beanllm export > models.json
Testing
# Run all tests
pytest
# With coverage
pytest --cov=src/beanllm --cov-report=html
# Specific module
pytest tests/test_facade/ -v
Current test coverage: 61% (624 tests, 593 passed)
Development
Using the Makefile (recommended)
# Install development tools
make install-dev
# Quick auto-fix
make quick-fix
# Type check
make type-check
# Lint check
make lint
# Run all checks and fixes
make all
Manual execution
# Install in editable mode
pip install -e ".[dev,all]"
# Format code
ruff format src/beanllm
# Lint
ruff check src/beanllm
# Type check
mypy src/beanllm
Roadmap
Completed major features
- Clean Architecture & SOLID principles
- Unified multi-provider interface (OpenAI, Anthropic, Google, Ollama)
- RAG pipeline & Document Processing
- Tools & Agents (ReAct pattern)
- Graph workflows (LangGraph-style)
- Multi-agent systems
- Vision & Audio processing
- Production features (evaluation, monitoring, cost tracking)
- Prompt versioning & A/B testing
- Streaming response buffering
- Evaluation system expansion (Human-in-the-Loop, Continuous Evaluation, Drift Detection)
- Internal performance optimization (parallel processing, batch search, history compression)
Planned
- Benchmark system
Documentation
- QUICK_START.md - Quick start guide
- ARCHITECTURE.md - Detailed architecture description
- docs/DEPLOYMENT.md - PyPI deployment guide
- docs/theory/ - Theory documents and learning materials
- docs/tutorials/ - Tutorial code
- examples/ - Usage example code
Contributing
Contributions welcome! Please:
- Fork the repository
- Create feature branch (git checkout -b feature/amazing-feature)
- Commit changes (git commit -m 'Add amazing feature')
- Push to branch (git push origin feature/amazing-feature)
- Open Pull Request
License
MIT License - see LICENSE file for details.
Acknowledgments
Inspired by:
- LangChain - LLM application framework
- LangGraph - Graph workflow patterns
- Anthropic Claude - Clear code philosophy
Special thanks to:
- OpenAI for GPT models and APIs
- Anthropic for Claude API
- Google for Gemini API
- Ollama team for local LLM support
Contact
- GitHub: https://github.com/leebeanbin/beanllm
- Issues: https://github.com/leebeanbin/beanllm/issues
- Discussions: https://github.com/leebeanbin/beanllm/discussions
Built with ❤️ for the LLM community
Transform your LLM applications from prototype to production with beanllm.