Cube-to-RAG

RAG system for Cube.js analytics with vector search capabilities.
Natural language interface for Cube.js analytics using RAG (Retrieval-Augmented Generation) with vector search and LLM agents.
Features
- Semantic Schema Search: Discover cubes, dimensions, and measures using natural language
- GraphQL Query Generation: Automatically build and execute Cube.js GraphQL queries
- Vector Database: Store and search schema embeddings in Milvus
- Multiple LLM Providers: Support for OpenAI, Anthropic Claude, and AWS Bedrock
- Streaming Responses: Real-time token-by-token streaming for better UX
- FastAPI Backend: Modern async Python web framework
- Jinja2 Templates: Flexible schema formatting
Architecture
User Question
↓
1. Semantic Search (Milvus)
└→ Find relevant cubes, dimensions, measures
↓
2. GraphQL Query Construction
└→ Build query using discovered schema
↓
3. Execute Query (Cube.js GraphQL API)
└→ http://cube_api:4000/cubejs-api/graphql
↓
4. Format & Explain Results
└→ Natural language response
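The flow above can be sketched end-to-end in a few lines. All four helpers below are illustrative stubs with assumed names and return shapes — the real implementation wires these steps through LangChain tools, not functions like these:

```python
# Illustrative sketch of the four-step flow; every helper here is a stub.

def search_schemas(question: str, k: int = 3) -> list[str]:
    return ["coursePerformanceSummary"]            # 1. Milvus semantic search

def build_query(cubes: list[str]) -> str:
    # 2. Query construction from the discovered schema
    return f"query {{ cube {{ {cubes[0]} {{ avgGpa }} }} }}"

def execute_query(query: str) -> list[dict]:
    return [{"avgGpa": 3.4}]                       # 3. POST to /cubejs-api/graphql

def explain(question: str, rows: list[dict]) -> str:
    return f"Answer to {question!r}: {rows}"       # 4. LLM formats the rows

def answer(question: str) -> str:
    cubes = search_schemas(question)
    rows = execute_query(build_query(cubes))
    return explain(question, rows)

print(answer("What's the average GPA by department?"))
```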
Installation
Using pip
pip install cube-to-rag
Using Poetry
poetry add cube-to-rag
From source
git clone https://github.com/yourusername/cube-to-rag.git
cd cube-to-rag
poetry install
Configuration
Create a .env file with your configuration:
# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022' # or 'openai:gpt-4' or 'bedrock:...'
EMBEDDING_MODEL='openai:text-embedding-3-small' # or 'bedrock:amazon.titan-embed-text-v2:0'
# API Keys (choose based on your LLM/embedding provider)
ANTHROPIC_API_KEY='sk-ant-your-key-here'
OPENAI_API_KEY='sk-your-key-here'
# AWS Bedrock (if using Bedrock models)
AWS_ACCESS_KEY_ID='your-aws-key'
AWS_SECRET_ACCESS_KEY='your-aws-secret'
AWS_DEFAULT_REGION='us-east-1'
# Cube.js Configuration
CUBE_URL='http://cube_api:4000' # Base URL (GraphQL and REST APIs are auto-constructed)
# Optional: Cube.js API secret (JWT tokens are auto-generated)
# CUBEJS_API_SECRET='your-cubejs-api-secret'
# Milvus Vector Database
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'
# Security
SECRET_KEY='your-secret-key-here'
FAST_API_ACCESS_SECRET_TOKEN='your-access-token'
DEPLOY_ENV='local' # or 'prod'
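The `CUBEJS_API_SECRET` comment above says JWTs are auto-generated. Conceptually that is just an HS256-signed token over the shared secret; here is a stdlib-only sketch of how such a token can be minted (the claim set is an assumption — the package's actual claims may differ):

```python
# Minimal HS256 JWT minting from CUBEJS_API_SECRET, stdlib only.
# The empty-ish claim set ({"exp": ...}) is an assumption for illustration.
import base64, hashlib, hmac, json, time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_cube_token(secret: str, ttl_seconds: int = 3600) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps({"exp": int(time.time()) + ttl_seconds}).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

# Sent to Cube.js as:  Authorization: <token>
print(make_cube_token("your-cubejs-api-secret"))
```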
Quick Start
1. Start Required Services
You'll need:
- Cube.js running with GraphQL API enabled
- Milvus vector database
Using Docker Compose:
docker-compose -f docker-compose.yml -f docker-compose.milvus.yml -f docker-compose.ai.yml up -d
2. Run the API Server
# Using uvicorn directly
uvicorn app.server.main:app --host 0.0.0.0 --port 8080
# Or using the installed package
python -m cube_to_rag
3. Ingest Cube.js Schemas
The API will automatically fetch and ingest schemas from Cube.js on startup. You can also trigger manual ingestion:
curl -X POST http://localhost:8080/embeddings/ingest \
-H "Content-Type: application/json"
Using in Your Own Applications
The cube-to-rag package provides tools and utilities you can add to your existing LangChain agents and applications.
Installation
pip install cube-to-rag
# or
poetry add cube-to-rag
1. Add Cube.js Extraction Tool to Your Agent
Add Cube.js querying capabilities to any existing LangChain agent:
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.agent_toolkits.load_tools import load_tools
from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_cube_schema_search_tool
# Your existing tools
your_existing_tools = [...]
# Add Cube.js extraction tools
cube_schema_search = get_cube_schema_search_tool(k=3)
# Use LangChain's built-in GraphQL tool for queries
graphql_tools = load_tools(
["graphql"],
graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
llm=get_llm()
)
# Combine all tools
tools = your_existing_tools + [cube_schema_search] + graphql_tools
# Create agent with all tools
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant with access to analytics data."),
("human", "{input}"),
("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# Now your agent can answer questions like:
result = agent_executor.invoke({
"input": "What's the average course GPA by department?"
})
2. Add Documents to Milvus Vector Database
Easily embed and store your documents in Milvus:
from langchain_core.documents import Document
from cube_to_rag.tools import create_milvus_helper
# Create helper for your collection
helper = create_milvus_helper(collection_name="my_documents")
# Add documents
docs = [
Document(
page_content="Your document content here",
metadata={"source": "doc1.pdf", "page": 1}
),
Document(
page_content="More content",
metadata={"source": "doc2.pdf", "page": 1}
)
]
ids = helper.add_documents(docs)
print(f"Added {len(ids)} documents to Milvus")
# Or add text directly
texts = ["First document", "Second document"]
metadatas = [{"source": "text1"}, {"source": "text2"}]
ids = helper.add_texts(texts, metadatas)
3. Add Vector Search Tool to Your Agent
Add semantic document search to any LangChain agent:
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_vector_search_tool
# Your existing tools
your_existing_tools = [...]
# Add vector search tool
vector_search = get_vector_search_tool(
collection_name="my_documents",
k=5
)
# Combine tools
tools = your_existing_tools + [vector_search]
# Create agent
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant with access to document search."),
("human", "{input}"),
("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)
# Now your agent can search documents
result = agent_executor.invoke({
"input": "Find documents about machine learning"
})
4. Complete Example: Agent with All Tools
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.agent_toolkits.load_tools import load_tools
from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import (
get_cube_schema_search_tool,
get_cube_graphql_tools,
get_vector_search_tool,
create_milvus_helper
)
# Initialize tools
cube_schema = get_cube_schema_search_tool(k=2)
vector_search = get_vector_search_tool(collection_name="docs", k=3)
# JWT tokens are auto-generated from settings.cube_api_secret
# No need to pass api_token manually - it's handled automatically
graphql_tools = get_cube_graphql_tools(
graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
llm=get_llm()
)
tools = [cube_schema, vector_search] + graphql_tools
# Create agent
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
("system", """You are a powerful analytics assistant with access to:
1. Cube.js analytics (use cube_schema_search FIRST, then graphql)
2. Document search (use vector_search)
Always search for relevant schemas/documents before answering."""),
("human", "{input}"),
MessagesPlaceholder(variable_name="agent_scratchpad"),
])
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True,
handle_parsing_errors=True
)
# Use the agent
result = agent_executor.invoke({
"input": "What metrics are available for student performance?"
})
print(result["output"])
5. Working with Milvus Directly
Advanced Milvus operations:
from cube_to_rag.tools import MilvusHelper
# Create helper (uses credentials from environment by default)
helper = MilvusHelper(collection_name="my_collection")
# Or with custom connection arguments (for authentication)
helper_with_auth = MilvusHelper(
collection_name="my_collection",
connection_args={
"uri": "http://localhost:19530",
"token": "root:your-password" # Format: username:password
}
)
# Search with scores
results = helper.search_with_score("machine learning", k=5)
for doc, score in results:
    print(f"Score: {score:.3f} - {doc.page_content}")
# Search with metadata filtering
filtered_results = helper.search(
query="neural networks",
k=3,
filter_dict={"source": "research_papers"}
)
# Get collection statistics
stats = helper.get_collection_stats()
print(f"Total documents: {stats['num_entities']}")
# Delete documents
helper.delete(["id1", "id2", "id3"])
Configuration
Set environment variables or create a .env file:
# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small'
# API Keys
ANTHROPIC_API_KEY='your-key'
OPENAI_API_KEY='your-key'
# Cube.js
CUBE_URL='http://localhost:4000' # Base URL
# Optional: Cube.js authentication (JWT auto-generated from secret)
# CUBEJS_API_SECRET='your-cubejs-api-secret'
# Milvus
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'
Available Tools
| Tool | Purpose | Example Use Case |
|---|---|---|
| `get_cube_schema_search_tool()` | Search Cube.js schemas | Discover available cubes/dimensions/measures |
| `get_cube_graphql_tools()` | Execute Cube.js queries | Query analytics data via GraphQL |
| `get_vector_search_tool()` | Search vector database | Find relevant documents semantically |
| `create_milvus_helper()` | Manage Milvus operations | Add/search/delete documents in Milvus |
API Endpoints
Chat Endpoints
Create Chat Session
POST /chat/new
Response:
{
"status": "success",
"session_id": "uuid-here"
}
Ask Question
POST /chat/ask/
Content-Type: application/json
{
"message": "What's the average GPA by department?"
}
Response: Streaming text response
Get Chat History
GET /chat/history
Clear Chat History
DELETE /chat/clear
Embeddings Endpoints
Ingest Schemas
POST /embeddings/ingest
Content-Type: application/json
{
"schema_dir": "/path/to/schemas" # Optional
}
Response:
{
"success": true,
"schemas_ingested": 5,
"message": "Successfully ingested 5 cube schemas"
}
Search Schemas
POST /embeddings/search?query=course%20performance&k=3
Response:
{
"success": true,
"query": "course performance",
"results": [
{
"cube_name": "CoursePerformanceSummary",
"dimensions": [...],
"measures": [...],
"relevance_score": 0.95
}
]
}
Health Check
GET /embeddings/health
Usage Examples
Python Client
import requests
# Create session
session = requests.Session()
response = session.post(
"http://localhost:8080/chat/new",
headers={"x-access-token": "your-token"}
)
# Ask question
response = session.post(
"http://localhost:8080/chat/ask/",
headers={"x-access-token": "your-token"},
json={"message": "What's the average GPA by department?"},
stream=True
)
# Stream response
for chunk in response.iter_content(chunk_size=1024):
    if chunk:
        print(chunk.decode('utf-8'), end='', flush=True)
JavaScript/TypeScript Client
// Create session
const response = await fetch('http://localhost:8080/chat/new', {
method: 'POST',
headers: {
'x-access-token': 'your-token'
}
});
// Ask question with streaming
const askResponse = await fetch('http://localhost:8080/chat/ask/', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-access-token': 'your-token'
},
body: JSON.stringify({
message: 'What\'s the average GPA by department?'
})
});
// Stream response
const reader = askResponse.body.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(new TextDecoder().decode(value));
}
cURL Examples
# Create session
curl -X POST http://localhost:8080/chat/new \
-H "x-access-token: your-token"
# Ask question
curl -X POST http://localhost:8080/chat/ask/ \
-H "Content-Type: application/json" \
-H "x-access-token: your-token" \
-d '{"message": "Show me top 5 courses by enrollment"}' \
--no-buffer
# Search schemas
curl -X POST "http://localhost:8080/embeddings/search?query=student%20grades&k=2"
# Trigger schema ingestion
curl -X POST http://localhost:8080/embeddings/ingest \
-H "Content-Type: application/json"
Example Queries
The RAG system can answer questions like:
- "What's the average GPA by department?"
- "Show me course pass rates"
- "Which courses have the highest student engagement?"
- "List all courses with their enrollment numbers"
- "Compare performance across different semesters"
- "What dimensions are available in the data?"
- "Show me all measures I can query"
How It Works
1. Schema Ingestion
On startup, the system:
- Fetches all cube metadata from the Cube.js API (`/cubejs-api/v1/meta`)
- Extracts cube names, dimensions, measures, and descriptions
- Generates text embeddings using your configured embedding model
- Stores embeddings in Milvus for semantic search
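The embedding step boils down to flattening each cube's metadata into a text blob. The field names below follow the shape of Cube's `/v1/meta` response, but the exact output format is an assumption, not the package's actual template:

```python
# Sketch of turning one cube's metadata into text for embedding.
# The output layout is illustrative, not the package's real Jinja2 template.

def cube_to_embedding_text(cube: dict) -> str:
    dims = ", ".join(d["name"] for d in cube.get("dimensions", []))
    measures = ", ".join(m["name"] for m in cube.get("measures", []))
    return (
        f"Cube: {cube['name']}\n"
        f"Description: {cube.get('description', '')}\n"
        f"Dimensions: {dims}\n"
        f"Measures: {measures}"
    )

meta_cube = {
    "name": "coursePerformanceSummary",
    "description": "Course-level performance metrics",
    "dimensions": [{"name": "coursePerformanceSummary.department"}],
    "measures": [{"name": "coursePerformanceSummary.avgGpa"}],
}
print(cube_to_embedding_text(meta_cube))
```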
2. Query Processing
When you ask a question:
- Schema Search: Performs semantic search in Milvus to find relevant cubes
- Query Generation: LLM agent constructs a GraphQL query using discovered schema
- Query Execution: Executes query against Cube.js GraphQL API
- Response Formatting: Formats results into natural language response
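The query the agent constructs in step 2 is an ordinary GraphQL document; Cube's GraphQL API nests fields under `cube { <cubeName> { ... } }`. A minimal builder, with example cube and field names from this README rather than a fixed schema:

```python
# Minimal sketch of Cube.js GraphQL query construction.
# Cube and field names are examples; real queries come from schema search.

def build_cube_query(cube: str, fields: list[str], limit: int = 10) -> str:
    body = " ".join(fields)
    return f"query {{ cube(limit: {limit}) {{ {cube} {{ {body} }} }} }}"

q = build_cube_query("coursePerformanceSummary", ["department", "avgGpa"])
print(q)
# → query { cube(limit: 10) { coursePerformanceSummary { department avgGpa } } }
```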
3. Agent Tools
The LLM agent has access to two tools:
- `cube_schema_search`: Semantic search for cubes/dimensions/measures
- `graphql`: Execute GraphQL queries against Cube.js
Supported LLM Providers
OpenAI
LLM_MODEL_ID='openai:gpt-4'
# or
LLM_MODEL_ID='openai:gpt-4o-mini'
EMBEDDING_MODEL='openai:text-embedding-3-small'
OPENAI_API_KEY='sk-your-key'
Anthropic Claude
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small' # Claude doesn't have embeddings
ANTHROPIC_API_KEY='sk-ant-your-key'
OPENAI_API_KEY='sk-your-key' # Still needed for embeddings
AWS Bedrock
LLM_MODEL_ID='bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0'
EMBEDDING_MODEL='bedrock:amazon.titan-embed-text-v2:0'
AWS_ACCESS_KEY_ID='your-key'
AWS_SECRET_ACCESS_KEY='your-secret'
AWS_DEFAULT_REGION='us-east-1'
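All three providers share the `provider:model` ID format shown above. One way such IDs could be dispatched (the real `get_llm()` internals are not documented here, so this is an assumption, not the package's code):

```python
# Hypothetical dispatch on `provider:model` IDs.

def parse_model_id(model_id: str) -> tuple[str, str]:
    provider, _, model = model_id.partition(":")  # split on the FIRST colon only
    if provider not in {"openai", "anthropic", "bedrock"}:
        raise ValueError(f"Unsupported provider: {provider!r}")
    return provider, model

# Bedrock model IDs contain their own colons; partition keeps them intact.
print(parse_model_id("bedrock:amazon.titan-embed-text-v2:0"))
```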
Development
Running Tests
poetry run pytest
Code Formatting
poetry run black .
poetry run ruff check .
Deployment
Docker
FROM python:3.12-slim
WORKDIR /app
# Install dependencies
RUN pip install cube-to-rag
# Copy environment file
COPY .env .env
# Run server
CMD ["uvicorn", "app.server.main:app", "--host", "0.0.0.0", "--port", "8080"]
Environment Variables
For production, set:
DEPLOY_ENV='prod'
SECRET_KEY='strong-random-secret'
FAST_API_ACCESS_SECRET_TOKEN='strong-random-token'
Troubleshooting
Schema search returns no results
# Check embeddings health
curl http://localhost:8080/embeddings/health
# Re-ingest schemas
curl -X POST http://localhost:8080/embeddings/ingest
GraphQL queries fail
- Verify Cube.js is running: `curl http://localhost:4000/readyz`
- Check that cube names use camelCase in GraphQL (e.g., `coursePerformanceSummary`, not `CoursePerformanceSummary`)
- Review FastAPI logs: `docker logs cube-rag-api`
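A tiny helper for the camelCase point above (a hypothetical utility for your own code, not part of the package):

```python
def to_camel_case(name: str) -> str:
    # Lower-case only the first character: CoursePerformanceSummary ->
    # coursePerformanceSummary, which is what Cube's GraphQL API expects.
    return name[:1].lower() + name[1:]

print(to_camel_case("CoursePerformanceSummary"))  # → coursePerformanceSummary
```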
LLM errors
- Verify API keys are set correctly
- Check model IDs match your provider
- Ensure sufficient API credits/quota
File details

Details for the file `cube_to_rag-0.1.0a2.tar.gz`.

File metadata
- Size: 24.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.11.14 Linux/6.11.0-1018-azure

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `3b57f61e9e9cd00e52cf83b4b26c38adf594ee696437a3a6d97a24e8004c7e8d` |
| MD5 | `24124156de84eebe12aa4f6d45f93ffd` |
| BLAKE2b-256 | `b2c160c456fdcd6c5921770b104f925bb8c4b48c562b3088bab60f5f1897b951` |
File details

Details for the file `cube_to_rag-0.1.0a2-py3-none-any.whl`.

File metadata
- Size: 27.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.11.14 Linux/6.11.0-1018-azure

File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `a4acc74856b6e87c6020bb19960f3fa152d24664ddcc904055ab9bb41ce24836` |
| MD5 | `cafe4cc8781e8f499c23365adec6d59a` |
| BLAKE2b-256 | `8a495a5fb3ea1e78c925dbd1867648293fc90f0d7cd883056cebc925b1cf5a45` |