animuz-core
Core shared utilities for Animuz RAG (Retrieval-Augmented Generation) system.
Features
- Unified RAG API: 3-knob interface (`prompt`, `llm`, `tools`) for chat, ingestion, and retrieval
- LLM Clients: OpenAI (Responses API + Chat Completions), Anthropic Claude, Ollama
- Tool System: `@tool` decorator, `qdrant_retriever()` factory, MCP server support
- RAG Pipelines: Simple and Agentic RAG implementations (lower-level building blocks)
- Vector Database: Qdrant integration with hybrid search (dense + sparse)
- Embedding Clients: Multiple providers (local server, Modal, S3/SageMaker)
- Document Ingestion: Azure Document Intelligence, Unstructured, PDF extraction, structured text parsing
- CloudWatch Logging: Structured JSON logging with watchtower
Requirements
- Python >= 3.10
Installation
Install the core package (minimal dependencies only):
pip install animuz-core
Then install only the extras you need:
# Single extra
pip install animuz-core[openai]
# Multiple extras
pip install animuz-core[openai,qdrant,aws]
# Everything
pip install animuz-core[all]
Works with uv too:
uv pip install animuz-core[openai,qdrant]
Available Extras
| Extra | What it installs | Use when you need |
|---|---|---|
| `openai` | `openai` | OpenAI GPT models |
| `anthropic` | `anthropic` | Anthropic Claude models |
| `ollama` | `ollama` | Local LLMs via Ollama |
| `qdrant` | `qdrant-client` | Qdrant vector database |
| `aws` | `boto3`, `aiobotocore`, `watchtower`, `sagemaker` | S3, SageMaker embeddings, CloudWatch logging |
| `azure` | `azure-ai-documentintelligence` | Azure Document Intelligence for PDF ingestion |
| `ingest` | `unstructured-client`, `PyMuPDF` | Document parsing (Unstructured API, PDF extraction) |
| `fastapi` | `fastapi` | Streaming SSE endpoints |
| `all` | All of the above | Everything |
| `dev` | `all` + `pytest`, `black`, `ruff`, `mypy` | Development and testing |
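Because extras are optional, it can help to check at startup which ones are actually importable. The following is a minimal sketch, not part of animuz-core: the `EXTRA_MODULES` mapping and the `available_extras` helper are hypothetical names, and the import names are assumed from the packages listed in the table above.

```python
from importlib.util import find_spec

# Hypothetical mapping: extra name -> top-level modules it is assumed to provide.
EXTRA_MODULES = {
    "openai": ["openai"],
    "anthropic": ["anthropic"],
    "ollama": ["ollama"],
    "qdrant": ["qdrant_client"],
    "fastapi": ["fastapi"],
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Return {extra: bool}, True when every module of that extra can be found."""
    return {
        extra: all(find_spec(module) is not None for module in modules)
        for extra, modules in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, installed in available_extras().items():
        print(f"{extra}: {'installed' if installed else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays cheap even for heavy dependencies.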
Usage
Unified RAG API (recommended)
The RAG class is the main entry point. It has 3 knobs:
- `prompt`: callable `(team_id, assistant_id) -> dict` that fetches an assistant config
- `llm`: model name string (provider auto-detected: `"gpt-*"` → OpenAI, `"claude-*"` → Anthropic)
- `tools`: `list[ToolSpec]` for local tools, `MCP(url=...)` for an MCP server, or `None` for plain chat
from animuz_core import RAG, qdrant_retriever, tool
# Define how to fetch the assistant config
def my_fetcher(team_id, assistant_id):
    return {"prompt": "You are a helpful assistant.", "model": "gpt-4o-mini"}
# Create RAG with local retriever tool
rag = RAG(
    prompt=my_fetcher,
    llm="gpt-4o-mini",
    tools=[
        qdrant_retriever(host="localhost", port=6333, collection="animuz"),
    ],
)
# Chat (returns frontend-ready output)
output = await rag.chat("team1", "asst1", [{"role": "user", "content": "Hello"}])
# Ingest a document
await rag.add_doc("docs/intro.md", user_chat_id="team1|asst1")
# Retrieve documents
texts, points = await rag.retrieve("what is this?", user_chat_id="team1|asst1")
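The provider auto-detection described for the `llm` knob can be pictured as a simple prefix check on the model name. This is only an illustrative sketch: `detect_provider` is a hypothetical helper, not the library's actual code, and how animuz-core treats other names (for example Ollama models) is not stated here, so the sketch simply raises.

```python
def detect_provider(model: str) -> str:
    """Map a model name to a provider using the documented prefixes."""
    if model.startswith("gpt-"):
        return "openai"
    if model.startswith("claude-"):
        return "anthropic"
    # Assumption: unknown prefixes are rejected; the real library may behave differently.
    raise ValueError(f"Cannot infer provider from model name: {model!r}")


if __name__ == "__main__":
    print(detect_provider("gpt-4o-mini"))
```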
With MCP tools (Lambda / cloud)
import os
from animuz_core import RAG, MCP
rag = RAG(
    prompt=ddb_fetcher,
    llm="gpt-4o-mini",
    tools=MCP(url=os.environ["MCP_URL"], api_key=os.environ.get("MCP_API_KEY")),
)
output = await rag.chat(team_id, assistant_id, messages, user_context=ctx)
Plain chat (no tools)
rag = RAG(prompt=my_fetcher, llm="gpt-4o-mini")
output = await rag.chat(team_id, assistant_id, messages)
Custom OpenAI base URL (proxy / gateway)
Route OpenAI API calls through a proxy or API gateway by passing base_url:
rag = RAG(
    prompt=my_fetcher,
    llm="gpt-4o-mini",
    base_url="https://your-proxy.example.com/openai/v1",
    tools=[qdrant_retriever(...)],
)
When omitted, the default OpenAI endpoint (api.openai.com) is used.
Custom tools with @tool decorator
from animuz_core import RAG, tool, qdrant_retriever
@tool(description="Get weather for a city")
async def weather(city: str) -> str:
    # fetch_weather is a placeholder for your own async weather lookup
    return await fetch_weather(city)
rag = RAG(
    prompt=my_fetcher,
    llm="gpt-4o-mini",
    tools=[qdrant_retriever(...), weather],
)
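To give a feel for what a decorator like `@tool` has to do, here is a minimal sketch that turns an async function into a tool record with a JSON-schema-style parameter description derived from its type hints. It is not the animuz-core implementation: `sketch_tool`, `SketchToolSpec`, and the schema layout are hypothetical names and choices made for illustration.

```python
import asyncio
import inspect
from dataclasses import dataclass
from typing import Any, Awaitable, Callable

# Python annotation -> JSON schema type name (anything else falls back to "string").
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class SketchToolSpec:
    """Hypothetical stand-in for a tool record handed to an LLM client."""
    name: str
    description: str
    parameters: dict[str, Any]
    fn: Callable[..., Awaitable[Any]]


def sketch_tool(description: str):
    """Decorator factory: wrap an async function into a SketchToolSpec."""
    def wrap(fn):
        properties, required = {}, []
        for name, param in inspect.signature(fn).parameters.items():
            properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
            if param.default is inspect.Parameter.empty:
                required.append(name)  # no default value means the argument is required
        schema = {"type": "object", "properties": properties, "required": required}
        return SketchToolSpec(fn.__name__, description, schema, fn)
    return wrap


@sketch_tool(description="Get weather for a city")
async def weather(city: str, units: str = "metric") -> str:
    return f"Sunny in {city} ({units})"


if __name__ == "__main__":
    print(weather.parameters)
    print(asyncio.run(weather.fn("Oslo")))
```

Deriving the schema from the signature keeps the tool definition in one place: the function's name, type hints, and defaults are the only source of truth.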
Lower-level APIs
LLM Clients
from animuz_core import OpenAIAgentClientResponses
# OpenAI Responses API agent with tool loop (recommended for production)
agent = OpenAIAgentClientResponses(user_chat_id="tenant-123", tools=tool_dict, model="gpt-4o-mini")
result = await agent.get_reply_frontend(messages, system_prompt)
# With a custom base URL (proxy / gateway)
agent = OpenAIAgentClientResponses(model="gpt-4o-mini", base_url="https://your-proxy.example.com/openai/v1")
RAG Pipelines
from animuz_core import SimpleRAG
# Simple RAG: always retrieves, then generates
pipeline = SimpleRAG(
    embedding_client=embedding_client,
    db_client=qdrant_client,
    LLM=llm_client,
)
await pipeline.add_doc("document.pdf", user_chat_id="tenant-123")
result = await pipeline.query("What is RAG?", user_chat_id="tenant-123")
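The "always retrieves, then generates" flow can be sketched with in-memory stand-ins. Everything below is hypothetical (`KeywordRetriever`, `echo_llm`, `simple_rag_query`): it does not use animuz-core and replaces embeddings and Qdrant with a toy word-overlap ranking, purely to show the order of operations and the per-tenant scoping by `user_chat_id`.

```python
import asyncio


class KeywordRetriever:
    """Toy in-memory retriever standing in for the embedding + vector DB clients."""

    def __init__(self, docs_by_tenant):
        self.docs_by_tenant = docs_by_tenant

    async def search(self, query, user_chat_id, top_k=2):
        query_words = set(query.lower().split())
        docs = self.docs_by_tenant.get(user_chat_id, [])  # only this tenant's documents
        # Rank by words shared with the query (a crude stand-in for vector similarity).
        ranked = sorted(
            docs,
            key=lambda doc: len(query_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]


async def echo_llm(prompt):
    """Stand-in LLM that returns the prompt it was given."""
    return prompt


async def simple_rag_query(question, user_chat_id, retriever, llm):
    """Step 1: retrieve context. Step 2: generate from context plus question."""
    contexts = await retriever.search(question, user_chat_id)
    prompt = "Context:\n" + "\n".join(contexts) + f"\n\nQuestion: {question}"
    return await llm(prompt)


if __name__ == "__main__":
    retriever = KeywordRetriever({"tenant-123": ["RAG combines retrieval with generation."]})
    print(asyncio.run(simple_rag_query("What is RAG?", "tenant-123", retriever, echo_llm)))
```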
Vector Database
from animuz_core import QdrantDBClient
client = QdrantDBClient(host="localhost", port=6333, collection_name="animuz")
results = await client.hybrid_search(dense_vec, indices, values, user_chat_id="tenant-123")
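Hybrid search has to merge a dense ranking and a sparse ranking into one result list. One common way to do that is reciprocal rank fusion (RRF), sketched below in plain Python. Whether `QdrantDBClient.hybrid_search` uses RRF or another fusion method is not stated here, so treat this as an illustration of the idea rather than a description of the client; `reciprocal_rank_fusion` and the constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1; higher total score ranks first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


if __name__ == "__main__":
    dense_ranking = ["a", "b", "c"]   # e.g. ordered by dense-vector similarity
    sparse_ranking = ["b", "c", "a"]  # e.g. ordered by sparse (keyword) score
    print(reciprocal_rank_fusion([dense_ranking, sparse_ranking]))
```

RRF only needs ranks, not raw scores, which avoids having to normalize dense and sparse scores onto a common scale.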
Embedding
from animuz_core import EmbeddingClient, ModalEmbeddingClient
# Local embedding server
client = EmbeddingClient(host="localhost", port=12081)
result = await client.get_embedding("Some text")
# Modal-hosted embeddings
client = ModalEmbeddingClient()
result = client.get_embedding("Some text")
Development
# Clone and install in editable mode with dev dependencies
git clone <repo-url>
cd animuz-core
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run integration tests (requires external services + env vars)
pytest -m integration tests/integration/
pytest -m integration tests/integration/test_e2e_rag_wrapper_simple.py
# Format and lint
black src/
ruff check src/
Publishing to PyPI
- Bump the version in `pyproject.toml` and `__init__.py`.
- Build the package:
uv pip install --upgrade build
uv run python -m build
- (Optional) Verify the artifacts:
uv pip install --upgrade twine
uv run python -m twine check dist/*
- Upload to TestPyPI first:
uv run python -m twine upload -r testpypi dist/*
- Upload to PyPI:
uv run python -m twine upload dist/*
Notes:
- Create a PyPI API token and set `TWINE_USERNAME=__token__` and `TWINE_PASSWORD=<your-token>`.
- If you upload to TestPyPI, install with `pip install -i https://test.pypi.org/simple animuz-core` to verify.
Integration Test Setup (Qdrant)
Use Docker Compose to run Qdrant locally:
docker compose -f docker-compose-qdrant.yml up -d qdrant
Then set the Qdrant env vars (example):
export QDRANT_HOST=localhost
export QDRANT_PORT=6333
Environment Variables
The package reads configuration from environment variables (loaded via python-dotenv):
| Variable | Used by |
|---|---|
| `OPENAI_API_KEY` | OpenAI client |
| `ANTHROPIC_API_KEY` | Anthropic client |
| `QDRANT_HOST`, `QDRANT_PORT`, `QDRANT_COLLECTION_NAME` | Qdrant client |
| `QDRANT_CLOUD_API_KEY` | Qdrant Cloud |
| `EMBEDDING_HOST`, `EMBEDDING_PORT` | Embedding client |
| `AZURE_DOCAI_KEY`, `AZURE_DOCAI_ENDPOINT` | Azure Document Intelligence |
| `UNSTRUCTURED_ENDPOINT`, `UNSTRUCTURED_API_KEY` | Unstructured client |
| `S3_BUCKET_NAME`, `S3_DOWNLOAD_DIR` | S3 operations |
| `MCP_API_KEY` | MCP tool server |
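A quick preflight check can report which of these variables are unset before any client is constructed. The sketch below is not part of animuz-core: `ENV_VARS_BY_FEATURE` and `missing_env_vars` are hypothetical names, the feature keys are made up for illustration, and it assumes every listed variable is required, which may not hold if the package supplies defaults.

```python
import os

# Hypothetical grouping of the variables from the table above.
ENV_VARS_BY_FEATURE = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "qdrant": ["QDRANT_HOST", "QDRANT_PORT", "QDRANT_COLLECTION_NAME"],
    "embedding": ["EMBEDDING_HOST", "EMBEDDING_PORT"],
    "azure": ["AZURE_DOCAI_KEY", "AZURE_DOCAI_ENDPOINT"],
    "unstructured": ["UNSTRUCTURED_ENDPOINT", "UNSTRUCTURED_API_KEY"],
}


def missing_env_vars(features, environ=None):
    """Return {feature: [unset or empty variable names]} for the given features."""
    environ = os.environ if environ is None else environ
    report = {}
    for feature in features:
        missing = [name for name in ENV_VARS_BY_FEATURE[feature] if not environ.get(name)]
        if missing:
            report[feature] = missing
    return report


if __name__ == "__main__":
    for feature, names in missing_env_vars(["openai", "qdrant"]).items():
        print(f"{feature}: missing {', '.join(names)}")
```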
License
MIT